It is commonly said that the software industry has increasing returns, whereas most other industries have diminishing returns. Since the Web sits in the overlap between software and the real world, should we expect it to have increasing or decreasing returns? The evidence suggests that Web use follows a Zipf distribution (power law). Even though we don't know for sure whether this conclusion will continue to hold in the future, it is interesting to consider how a Zipf distribution would characterize Web use around the end of the Year 2000. The figure shows the distribution of website popularity by the end of Year 2000 under certain assumptions (continued growth leading to 100 million sites, 500 million users, and twenty daily pageviews per user).

 

[Figure: Predicted usage numbers for websites by the end of the Year 2000, plotted on double-logarithmic scales. The x-axis shows sites ranked by popularity (#1 is the most heavily used site); the y-axis shows the number of pageviews per year for each site. Both axes have logarithmic scales.]

 

It is pretty clear that the largest websites will be much larger than the smaller websites. In fact, the model predicts that the largest website will run at a rate of about 200 billion pageviews per year by the end of the Year 2000. This seems realistic enough given that Yahoo is already running at a rate of 11 billion pageviews per year and is growing by 400 percent per year.
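
The 200-billion figure can be roughly checked with a few lines of arithmetic. The sketch below assumes a pure Zipf law with exponent 1 (pageviews at rank r proportional to 1/r), which the article does not state explicitly; the site, user, and pageview counts are the scenario's own assumptions, and the function names are purely illustrative.

```python
import math

# Assumptions taken from the article's scenario for the end of 2000.
SITES = 100_000_000
USERS = 500_000_000
PAGEVIEWS_PER_USER_PER_DAY = 20

# Assumption NOT from the article: a pure Zipf law with exponent 1,
# i.e. pageviews at rank r are proportional to 1/r.
EULER_GAMMA = 0.5772156649015329


def harmonic(n: int) -> float:
    """Approximate H_n = 1 + 1/2 + ... + 1/n (accurate for large n)."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n)


def pageviews_at_rank(rank: int) -> float:
    """Yearly pageviews for the site at the given popularity rank."""
    total_per_year = USERS * PAGEVIEWS_PER_USER_PER_DAY * 365
    # Normalizing by H_SITES makes the per-site figures sum to the total.
    return total_per_year / (harmonic(SITES) * rank)


top_site = pageviews_at_rank(1)
print(f"Top site: about {top_site / 1e9:.0f} billion pageviews per year")
```

Under these assumptions the top site lands a little above 190 billion pageviews per year, which is in line with the "about 200 billion" estimate. A different exponent would shift the result considerably.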

Why are the largest sites going to be so large? One reason is the use of hypertext, which will guide additional traffic to sites that already have traffic. For example, when Slate magazine wants to refer to a current news story, they can link to the version of the story on MSNBC instead of, say, CNN's site. Similarly, if Expedia wants to refer to a good film about a certain travel destination, they can link to the coverage of that film on Cinemania. And, of course, all four services can link to Encarta for background information. Thus, any time Slate, MSNBC, Expedia, Cinemania, or Encarta attracts a new user, that user may well also visit some of the other services. As another example, when a user gets a stock quote from Yahoo, that user is likely to follow links to Yahoo's listing of relevant press releases and maybe also to the Motley Fool.

In other words, hypertext will tend to guide a disproportionate share of traffic to the largest sites and thus support an increasing returns model. Also, most of the cost of running a website is independent of the number of users: content creation is an up-front investment that is independent of usage for traditional content, even though the cost of other material such as moderated discussion groups does grow with usage. Server and ISP costs do grow with usage, though usually more slowly than the number of users since there is some economy of scale in running a larger operation. All these considerations tend to favor the larger sites.

However, I am optimistic about the future of smaller sites. The reason is the added value possible for a smaller and more focused site. A huge site will necessarily have to be rather generic, even though it can use customization features to partly tailor the content to individual users. Therefore, the value of each pageview on the largest sites will tend to be very low and mainly derived from mass-media type advertising. In the long term, I expect mass Web advertising to generate maybe one cent per pageview.

In contrast, smaller sites can provide specialized content that is of higher value to specific, narrow groups of users. As an analogy, consider the book industry: even though it does have best-sellers, a very large number of books are published every year because different readers want different books for different purposes. Hypertext can allow smaller sites to provide a high level of service by linking to larger sites that provide that service in return for the traffic generated by the links. For example, I can allow you to buy the latest Web strategy book from Amazon.com even though my site is too small to have its own mail order fulfillment.

By providing narrowcasted content, smaller sites can derive more value per pageview, since users should be willing to pay a small microtransaction fee for information that is more useful to them. The following table shows what happens if medium-large sites can charge 3 cents per pageview and smaller sites can charge ten cents per pageview. I have also indicated the value if even smaller sites can derive a dollar of value from each pageview: I don't expect users to pay a dollar per page, but strategic use of the Web should easily result in values of a dollar or more per page used to sell products or to reengineer business practices.

 

| Cohort of Websites | Value per Pageview | Combined Annual Revenues for sites in this cohort |
|---|---|---|
| Top 10,000 sites | 1 cent | $19 billion |
| Next million sites | 3 cents | $27 billion |
| Next ten million sites (numbered 2-11 million from the top) | 10 cents | $234 billion |
| Next ten million sites (numbered 12-21 million from the top) | $1 | $1 trillion |
| Last 79 million sites | 0 | 0 |
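
The revenue column is, in principle, the number of pageviews in a cohort multiplied by the value per pageview. A minimal sketch of that arithmetic follows, assuming an exponent-1 Zipf law and cohort boundaries at ranks 10,000 and 1,010,000, neither of which the article spells out. Under those assumptions the first two rows come out close to the table's figures; the remaining rows depend on details the article does not give, so they are not attempted here.

```python
import math

# Scenario from the article: 100 million sites, 500 million users,
# twenty pageviews per user per day.
SITES = 100_000_000
TOTAL_PAGEVIEWS_PER_YEAR = 500_000_000 * 20 * 365

# Assumption NOT from the article: pageviews at rank r scale as 1/r.
EULER_GAMMA = 0.5772156649015329


def harmonic(n: int) -> float:
    """Approximate H_n = 1 + 1/2 + ... + 1/n; defined as 0 for n = 0."""
    if n == 0:
        return 0.0
    return math.log(n) + EULER_GAMMA + 1 / (2 * n)


def cohort_revenue(first_rank: int, last_rank: int,
                   dollars_per_pageview: float) -> float:
    """Yearly revenue for sites ranked first_rank..last_rank (inclusive)."""
    # Fraction of all pageviews that falls on this range of ranks.
    share = (harmonic(last_rank) - harmonic(first_rank - 1)) / harmonic(SITES)
    return TOTAL_PAGEVIEWS_PER_YEAR * share * dollars_per_pageview


# Hypothetical rank boundaries for the first two cohorts in the table.
top_10k = cohort_revenue(1, 10_000, 0.01)
next_million = cohort_revenue(10_001, 1_010_000, 0.03)

print(f"Top 10,000 sites:   ${top_10k / 1e9:.1f} billion")
print(f"Next million sites: ${next_million / 1e9:.1f} billion")
```

The same function can be called with other rank ranges and per-pageview values to explore how sensitive the totals are to where the cohort boundaries are drawn.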

 

It is clear from the table that even though bigger is better in the sense that the largest sites will have the largest revenues, the Web economy as a whole will be dominated by the smaller sites if they can find ways of delivering specialized value. The table assigns zero revenues to the vast majority of low-end sites since they will typically be individual homepages or educational sites. Even though the owners of these tiny sites may not get paid for doing their sites, this does not mean that the sites have zero value. For example, I will certainly check out a student's homepage before hiring an intern, so if the site helps get that person a job, it could be worth quite a lot. This more informal and personal value will be hard to quantify but should be an interesting exercise for researchers in the budding field of knowledge economics.

2003 Update: Web Still Follows Zipf

A large body of data collected in 2003 continues to show that the Web follows a Zipf distribution (power law).

This finding holds even though the Web was 35 times larger in 2003 than when I published my original analysis in April 1997. The distribution also holds when one limits the analysis to a specific genre of websites such as the currently (2003) popular weblogs.

See also my follow-up column, Diversity is Power for Specialized Sites.

2005 Update

As of early 2005, the Web only had 58 million sites, not the 100 million I used in my analysis in this article. The growth rate slowed down considerably after the dot-com bubble burst. However, this fact doesn't change the essence of my argument nor does it change my conclusion. We probably have to wait until 2010 to get 100 million sites, and by then the number of users may be a billion, and not the 500 million I used in my analysis. The dollar values might thus be twice as big, but we have to wait a few more years to attain them.

Even today (2005), it's true that the top 10,000 sites generate less than 10% of the cumulative value of the sites ranked 10,001 to 11,999,999.

Recent discussions of the economics of Web use don't consider the possibility that smaller sites may be more targeted and thus provide more value per use than bigger sites. I still think this is the case.