
SETH FINKELSTEIN

Google, Links, and Popularity versus Authority

Suppose one wished to search through the data available on the Internet to find some information. Often, a user searches for Web pages associated with particular keywords. However, the number of Web pages available is enormous. Whether millions or billions, the number of items that could potentially be read vastly exceeds any human capacity to examine them. This fundamental mathematical fact creates an opportunity for a solution through automated assistance: that is, a search engine.

A search engine typically contains an index of some portion of all available existing pages and a means of returning an ordered subset of the available pages in response to a user query. Given that users likely wish to examine as few results as possible, the ordering of the results in response to the user query has become a subject of intense interest. The number of pages that merely contain the desired keywords could still be many thousands, but the user may start to lose patience when examining more than a few results.

Thus, primitive implementations that return all pages containing certain keywords, ordered perhaps by the age of the page or by when the page was added to the search engine's database, work poorly in terms of returning results that are significant to the user. A major advance in the quality of results was the PageRank algorithm of the Google system.

Academic citation literature has been applied to the web, largely by counting citations or backlinks to a given page. This gives some approximation of a page's importance or quality. PageRank extends this idea by not counting links from all pages equally, and by normalizing by the number of links on a page. This innovation proved to be extremely successful. By taking into account the link structure among a network of pages and employing a measurement derived from it, Google used the structure of links in part to impose a structure of relevancy. However, this practice of using links as a metric for meaning has proved to have many complicated social effects.1

In sociological terms, it was insightful of the Google creators to realize that a popular answer would be a popular answer; that is, if someone were to search for the term widget, an answer likely to seem to fit the searcher's needs would be a popular page in some sense. For example, a frequently referenced (linked) page likely had some aspect that many people found appealing or attractive. So when that page was returned to a searcher as a result, it would likely have a similarly appealing or attractive aspect for that searcher.

A very naive initial concept of the functioning of PageRank in a search would include the following steps (a minimal code sketch follows the list):

1. Select all pages containing the target term.

2. Order this subset by the size of their PageRank.

3. Return the top results of this ordered subset.
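To make the model concrete, the following is a minimal sketch of those three steps, assuming a precomputed inverted index (mapping each term to the pages that contain it) and a table of PageRank scores; the names and data structures are illustrative, not Google's actual implementation.

```python
# A minimal sketch (not Google's code): `inverted_index` maps a term to the set
# of pages containing it; `pagerank` maps each page to its precomputed score.
def naive_search(term, inverted_index, pagerank, top_n=10):
    candidates = inverted_index.get(term, set())           # 1. pages containing the term
    ranked = sorted(candidates,
                    key=lambda page: pagerank.get(page, 0.0),
                    reverse=True)                          # 2. order by PageRank alone
    return ranked[:top_n]                                  # 3. return only the top results
```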

Some reflection would quickly show this model to be untenable. For example, the page that happened to possess the highest PageRank would then appear as the first result for a search on every word it contained, the page with the second-highest PageRank would dominate another set of words, and so on. Obviously, these results might not be very meaningful responses to the search words. Additional criteria for ranking pages against search terms must therefore be introduced, to prevent a small number of pages from dominating the results. Such criteria can include a large number of occurrences of the search term; use of the search term in emphasized or special contexts; or, crucially, hyperlinks from other pages that use the search term in the anchor text of the hyperlink.

The anchor text criterion is particularly powerful. If many people, or a few prominent people, refer (link) to a page with the desired term, that page is likely to be a good result to return for that term. So a somewhat more refined search algorithm would include the following steps (again sketched in code after the list):

1. Select all pages containing the target term or that have the target term in the anchor text of links to the page.

2. Calculate the number of links to the page containing the target term and the number of times the term appears on the page, as well as the PageRank.

3. Order the results by a weighted combination of the preceding factors.
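A sketch of this refined ordering might look as follows; the weights and data structures (counts of on-page occurrences and of inbound links whose anchor text contains the term) are illustrative assumptions, since the real factors and their weights are far more numerous and not public.

```python
# Hypothetical sketch of the refined ordering. `term_counts[(page, term)]` is the
# number of times `term` appears on `page`; `anchor_counts[(page, term)]` is the
# number of inbound links to `page` whose anchor text contains `term`.
def refined_search(term, inverted_index, anchor_counts, term_counts, pagerank,
                   w_freq=0.5, w_anchor=2.0, w_rank=1.0, top_n=10):
    # 1. Candidates either contain the term or are linked to with it as anchor text.
    candidates = set(inverted_index.get(term, set()))
    candidates |= {page for (page, t) in anchor_counts if t == term}

    # 2.-3. Score each candidate by a weighted combination of the factors.
    def score(page):
        freq = term_counts.get((page, term), 0)
        anchors = anchor_counts.get((page, term), 0)
        return w_freq * freq + w_anchor * anchors + w_rank * pagerank.get(page, 0.0)

    return sorted(candidates, key=score, reverse=True)[:top_n]
```

Note how the choice of weights is itself a judgment call: shifting weight toward anchor text, for instance, shifts power toward those who write the links.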

As the algorithm becomes more and more elaborate, the addition of an increasing number of factors can create many unintended consequences. As the various ranking aspects interact with each other, several small factors can combine to be equivalent to a large amount of another factor; or, inversely, a very high score on one particular basis may overwhelm negligible amounts of every other score. Crucially, none of these quantitative criteria conveys any sense of quality, of whether the page might be considered good or bad from a perspective based on truth or merit (in an academic sense). While syntactic analysis of page elements (determining how many keywords are present, where they are, and whether they have any special attributes) is easy, semantic analysis (determining what the elements mean) is hard. The result can be a confusion of quantitative with qualitative value, or popularity with authority.

Both the nature of the page-ranking activity and its uses underscore the importance of seeing search results as a value-laden process with serious social implications. The following pages will elaborate this idea by exploring three propositions. First, searching is not a democratic activity. Second, searching inherently raises the question of whether, when searching, we want to see society as we are or as we should be. Third, the current norms of searching, based on popularity, are not an appropriate model for civil society.

PageRank and Democracy

It's common to think about the technical examination of a network structure in terms of a political system imposing social structure. The analysis of relevancy in terms of popularity lends itself to an easy analogy with voting and democracy. But an analysis of the fundamental driver of Google's approach, the PageRank algorithm, reveals the problems with this analogy.

PageRank relies on the uniquely democratic nature of the web by using its vast link structure as an indicator of an individual page's value. In essence, Google interprets a link from page A to page B as a vote, by page A, for page B. But, Google looks at more than the sheer volume of votes, or links a page receives; it also analyzes the page that casts the vote. Votes cast by pages that are themselves “important” weigh more heavily and help to make other pages “important.”2

Someone might simplistically think that a democratic practice implies that one link is one vote and might then mentally equate that idea with a concept of everyone having equal power. But the ranking algorithms are rarely simple direct democracy. They're akin to “shareholder democracy” as practiced in corporations: that is, each person doesn't have a single vote; rather, individual voting power varies by orders of magnitude (for corporations, this depends on how many shares a shareholder owns). The votes are more like weighted contributions from blocks or interest groups, not equal individual contributions. One link is not one vote; it has influence proportional to the relative power (in terms of popularity) of the voter. Because blocks of common interests, or social factions, can affect the results of a search to a degree depending on their relative weight in the network, the results of the algorithmic calculation by a search engine come to reflect political struggles in society.
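The weighting is visible in the standard formulation of the algorithm itself. The following is a simplified PageRank iteration, not Google's production system (which uses many additional factors); it shows that a page's “vote” is split among its outbound links and scaled by the page's own rank.

```python
# Simplified PageRank (damping factor 0.85; dangling pages and other refinements
# ignored). A link is not "one vote": its contribution is the linking page's own
# rank divided by the number of links that page casts.
def pagerank(links, iterations=50, d=0.85):
    """links: dict mapping each page to the list of pages it links to."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - d) / n for p in pages}
        for page, targets in links.items():
            if targets:
                share = rank[page] / len(targets)   # the voter's rank, split over its links
                for target in targets:
                    new_rank[target] += d * share   # weighted contribution, not one vote each
        rank = new_rank
    return rank
```

Two pages each receiving a single inbound link can thus end up with very different ranks, depending entirely on who is doing the linking.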

A Proxy for Societal Importance

The outcomes of these political struggles over searching can be quite real. A high ranking is the end result of a complex algorithm, yet it is often taken as a proxy for societal importance. Inversely, a low ranking can doom a source to marginalization. One response to this concern may be that searching is necessary because of the “information overload” in contemporary society.

While information overload may be a modern cliché, there has always been too much information, ever since the days of cavemen grunting around the campfire, when more occurred at a tribal council than could be effectively retold to an absent hunter. The need to summarize events—to present important (according to some definition) information in a short, accessible form—is hardly new. Many issues surrounding search engines can in fact be framed as instances of long-standing journalistic problems. The universe of available information needs to undergo a winnowing process that can be described as selection, sorting, and spinning, according to the following model:

1. Selection: Which items are important?

2. Sorting: In what order should the items be presented?

3. Spinning: How should one view the items in context?

Compare this description to the colloquial summary of journalism as determining the who, what, when, where, why, and how of an event. For both journalism and search engines, crucial decisions are made as to what results to present to the end user, from among an overwhelming set of possibilities. And both have a concept of objectivity in theory but also inescapable problems with values that enter the decision-making process.

Consider the following passage, where a journalist outlines his education in the algorithm used for determining newsworthiness of traffic accidents, in the era before civil rights reforms took hold.

The unwritten guidelines for reporting fatal automobile accidents were more complicated, the rough rule of thumb being: No n[-]gg[-]rs after 11 p.m. on weekdays, 9 p.m. on Saturdays (as the Sunday paper went to press early). Fatal highway accidents were reported without regard to the color of the deceased until these home edition deadlines: To get a late story in the final editions required making changes, and by tradition only white traffic deaths were considered worth submitting. The exception to this rule was in the area of quantity: If two black persons died in a late evening auto crash, that event had a fair chance of making the news columns. Three dead was considered a safe number by everyone, except those reporters who were known to be viciously anti-Negro. Most of us, of course, considered ourselves neutral or objective in that regard. Yet none of us questioned the professional proposition that the loss of a white life had more news value than the loss of a black life.3

The journalist describes determining newsworthiness by weighing various factors, such as number, time, and (crucially) social influence. All these factors might be calculated in a “neutral or objective” manner (by asking what time an accident was, how many people are dead, and what their color was). But by taking into account the relative weight (like PageRank) of race, social judgments are incorporated in the results of “news value.”

Censorship and Search Links

Sometimes authorities don't want links to be made or, at least, to be visible. Perhaps contrary to a naive impression, there are specific cases where the results of a search are affected by government prohibition; that is, search results that might otherwise be shown are deliberately excluded. The suppression may be local to a country or global to all Google results.4

Search engines do not simply present a raw dump of a database query to the user's screen. The retrieval of the data is just one step. There is much postprocessing afterward, in terms of presentation and customization.

When Google “removes” material, often it is still in the Google index itself. But the postprocessing has removed it from any results shown to the user. This system can be applied, for quality reasons, to remove sites that “spam” the search engine. And that is, by volume, certainly the overwhelming application of the mechanism. But it can also be directed against sites that have been prohibited for government-based reasons.
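The mechanics can be pictured as a filter applied between retrieval and display. The sketch below is hypothetical; the function names and the idea of a single merged blacklist are assumptions for illustration only.

```python
# Hypothetical post-processing step: a "removed" site may still match the query
# in the underlying index, but it is filtered out before results reach the user.
def postprocess(raw_results, blacklist):
    return [result for result in raw_results if result not in blacklist]

# shown_to_user = postprocess(index_lookup(query), spam_blacklist | legal_blacklist)
```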

One very simplistic model of links in the world is that all nodes are ideally visible to all other nodes. But search engines act as sources or portals for a set of links, so a site removed from search results is effectively hidden from most users even though it remains on the Web. That makes suppression attractive to those who want material to disappear, and suppressing sites in search results will be an ongoing battle.

As We Are or As We Should Be?

Some of the debate over search results echoes ancient descriptive versus prescriptive philosophical conflicts. Should the world be presented as it is (at least as created through the particular search algorithm) or as it should be? The two case studies that follow highlight how Google's approach to the world raises this issue to sometimes emotional heights.

Case Study—Chester's Guide to Molesting Google

What if the terms sometimes used to find an innocuous site are also linked to a site that seems to be associated with child predators? Such a situation led to “moral panic” and a newspaper censorship campaign to have a site removed from both its host and the Google search index. The uproar turned out to originate from a single page of text of “sick humor.”

An article headlined “Sick Website Taken Down” in the U.K. Chester Chronicle reported: “People power and The Chronicle have won the fight to get a sickening paedophile site—in the name of Chester—removed from the web.”5 Almost every fact in that article was wrong: the targeted site was not a pedophile site; the Google search index is not the Web. But the confusions of the people involved in this campaign (which ranged up to U.K. members of Parliament) are revealing. The article related:

Councillors and readers were disgusted earlier this month when we told how a disturbing site could be accessed after innocently typing “Chester Guide” into the popular search engine run by Google.

This week, the US firm agreed to remove the site, entitled “Chester's guide to picking up little girls,” after receiving complaints from our readers.

The move also comes after Cheshire Constabulary's paedophile unit alerted the Internet Watch Foundation....

However, they urged objectors to bombard Google and the Internet service provider Marhost.com with complaints.

A driver of the controversy was apparently that the same words that would naturally be used to find material about the town of Chester were also featured on a page of extremely tasteless material. Thus some sort of association or connection was implied. Of course, extreme bad taste is not illegal. Contrary to the inflammatory description, all that was being returned was a page of very low humor. Bizarre tastelessness makes the rounds of the Internet every day and even has a genre of books devoted to it (e.g., Truly Tasteless Jokes). But contrast these statements from the same article:

Google's international public relations manager, Debbie Frost, said... :

“When an illegal site is discovered, search engines like Google will remove such sites from their indices in order to abide by the law.”

“After our investigation, we have determined that the site in question is illegal and therefore it will be removed from our index.”

... John Price, leader of Chester City Council, was furious when we informed him of the site's existence.

This week, he said: “It's great news the site has been removed. Good riddance to bad rubbish. However, we must now be vigilant and make sure it does not come back.”

Chester MP Christine Russell was also outraged and immediately agreed to demand a change in the law to make such sickening sites illegal.

Crucially, no judicial process seems to have been applied in Google's determination. There was certainly no judicial avenue of appeal, no public evidence record to examine. One might argue that there was little value to the page that was removed from the index, but the implications of such a removal can be troubling.

Case Study—Jew Watch

While Chester's problem of a popular link that yielded unfortunate search results may sound unique, it is not. One of the most well-known examples of complex issues of unintended consequences and social dilemmas is the high ranking of an anti-Semitic Web site, Jew Watch, for Google searches on the keyword “Jew.” The Web site describes itself as “keeping a close watch on Jewish communities, organizations, monopoly, banking, and media control worldwide.” The front page contains such categories as “Jewish-Zionist-Soviet Anti-American Spies,” “Jewish Communist Rulers & Killers,” and “Jewish Terrorists.” It is unarguably a site devoted to anti-Semitic “hate speech.” However, such material, though repulsive, is completely protected under the U.S. Constitution's First Amendment, though other countries may consider it illegal.

For a long time, this objectionable site was the first result in a Google search for the keyword “Jew.” As reported by ZDNet:

The dispute began... when Steven Weinstock, a New York real estate investor and former yeshiva student, did a Google search on “Jew.”... Weinstock has launched an online petition, asking Google to remove the site from its index.6

After the controversy had been in the news for some time, Google posted an explanation of the search result.

A site's ranking in Google's search results is automatically determined by computer algorithms using thousands of factors to calculate a page's relevance to a given query. Sometimes subtleties of language cause anomalies to appear that cannot be predicted. A search for “Jew” brings up one such unexpected result.7

The explanation was in part aimed at defusing charges that Google was anti-Semitic and had deliberately placed a hate site in a high search ranking. Such a charge is completely unfounded. But the problem is more closely outlined by the Anti-Defamation League's analysis: “The longevity of ownership, the way articles are posted to it, the links to and from the site, and the structure of the site itself all increase the ranking of ‘Jewwatch’ within the Google formula.”8 While Google did not in any way promote the hate site, there is more to the ranking than “subtleties of language.” The Google system was, in effect, used by the site to promote itself.

Another site, Remove Jew Watch (www.removejewwatch.com), was set up to launch a petition to “get Google.com to remove Jewwatch.com from their search engine.” Other people tried to have different sites rank higher for the keyword “Jew.” But Jonathan Bernstein, regional director of the Anti-Defamation League, noted that “one can stumble across plenty of Holocaust denial Web sites by simply typing ‘Holocaust’ into Google.” He added: “Some responsibility for this needs to rest on our own shoulders and not just a company like Google. We have to prepare our kids for things they come across on the Internet. This is part of the nature of an Internet world. The disadvantage is we see more of it and our kids see more of it. The advantage is, we see more of it, so we're able to respond to it.... I'm not sure what people would want to see happen. You couldn't really ask Google not to list it.”9

It might be noted, however, that Google will place sites on certain blacklists if they are illegal. A search for the keyword “Jew” in some country-specific Google versions (in Germany and France) shows Jew Watch removed from Google.10 And in at least one situation (the “Chester's guide” case mentioned previously), Google has blacklisted a site that was not illegal. But that way lies madness, and Google has sound reasons to duck the issues as much as it can. The problem will not disappear, and there will be constant pressure from various groups.

Ironically, all the controversy probably raised the rank and relevance of the Jew Watch site within Google's algorithms, at least temporarily. Most important, people who made hyperlinks to the site for the purposes of reference added to the number of links to the site on the Web, which could have contributed to raising its search ranking. For a while, the site lost its service provider and, since it was not available, dropped in ranking; but then it rose back up (around April 22, 2004). Eventually, the Wikipedia entry for the word “Jew” took over the top position for a search on that word, and attention to this case subsided. But as hate groups realize the power that comes from prominent placement in searches, the topic will certainly be revisited. As an ironic aside, during the height of the controversy, one neo-Nazi was apparently jealous of all the attention received by a like-minded rival, so he tried to generate a campaign to ban his own site, presumably so publicity and anticensorship sentiment would give that site similar prominence. The campaign failed, but it illustrates the extremes of convoluted political maneuvering that can be found in the topic.

To some extent, the high position of the Jew Watch site in search results for the keyword “Jew” can represent a kind of plurality dominance over diluted opposition. If one were to ask what the most prominent associations with the word “Jew” are, anti-Semitism would sadly have to be significant. And it would by no means need anything near a majority share to be returned as a first result. If, hypothetically, anti-Semitism were the association 19 percent of the time, while nine slightly different positive associations split the remaining 81 percent at 9 percent each, then being the largest single identifiable block could give the anti-Semitic association a ranking of “most popular” in some algorithmic sense. This is the popularity versus authority conflict all over again. A site that has a plurality of weighted link votes need not be accurate or even inoffensive to the population outside that group.
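The arithmetic of that hypothetical is simple but worth making explicit; the shares below are the ones assumed in the text, not measured data.

```python
# Hypothetical shares from the text: one 19 percent block versus nine positive
# associations at 9 percent each (9 x 9 = 81 percent of the total).
shares = {"anti-Semitic site": 0.19}
shares.update({"positive association %d" % i: 0.09 for i in range(1, 10)})

winner = max(shares, key=shares.get)
print(winner, shares[winner])   # the 19 percent plurality tops a divided 81 percent majority
```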

Moreover, if a goal is to return relevant results, anti-Semites also use search engines, and a hate site counts as a correct result to them. In a sense, Google argued that it was performing a descriptive function in reflecting relative prominence for a search term, against the tangle that would develop if it was prescriptive in its results. But a contrary point of view is that an algorithm that gives high ratings to hate sites is by definition flawed in some way and should not be justified merely by the fact of being an algorithm. At least, if the choice is made that a dominant plurality result is correct, even if it is sometimes offensive, it should be recognized that there are significant social implications of such a choice.

Intentionally or unintentionally, the Jew Watch site had done search engine optimization for the keyword “Jew.” In extreme forms, an optimization activity turns into “Google spamming,” where search engine spammers try to get irrelevant pages to rank highly in order to obtain profit from ad clicks. The activity can reach a point of doing significant damage to search results, and it has generated some drastic countermeasures, where harsh antispam actions cause problems with legitimate sites. But significant self-promotion can be done short of spamming, and search engine optimization is merely puffery, not fraud.

A different form of linking to game Google is a practice known as “Google bombing” (defined at Wordspy.com as “setting up a large number of Web pages with links that point to a specific Web site so that the site will appear near the top of a Google search when users enter the link text”). Technically, this manipulates Google search results by hyping the ranking factor associated with the words used to link to a site—for example, using the phrase “miserable failure” to link to a biography of President George W. Bush or connecting the phrase “out-of-touch executives” to Google corporate information. From a Web site's standpoint, Google bombing is the mirror image of search engine optimization, where a site seeks to rank highly for desired keywords.

Search engine optimization for political ends is a largely unresearched area. Google bombing is now a crude process, done for laughs. In the future, it might well involve much more serious political dirty tricks. Indeed, political campaigning is at heart a process of manipulating information, and as search engines become more important as sources of information, we can expect more and varied creative attempts at their manipulation.

PageRank Selling and Commodification of Social Relations

The factors that Google uses to rank pages have long been a target for financial ends. Once any sort of value is created by a link, there's an immediate thought that a market can be created to monetize that value. While many people think of linking as a purely social relationship, it's quite possible to have such expressions of social interconnection be subverted for commercial purposes.

But search engines cannot simply let the market decide the value of a link. That would eventually produce pages of results that are nothing but advertisements, which would then drive users away from the search engine. Those would not be popular results—advertisements tend to be unpopular (even if they are sometimes effective in generating business). Moreover, paying for links on the Web usually competes with the search engine's own paid advertising program. So a search company has an incentive to disallow outright sales of links, while marketers have an incentive to attempt to buy as much influence as possible.

A crude way to do such buying of links would be to seek out high-ranking pages and offer payment for placement. But such pages are relatively easy to monitor, and internal ranking penalties can be applied if a site owner is found to be participating in such practices. More sophisticated schemes are being refined by companies that offer independent Web writers (bloggers) small amounts of money to write about products on the writer's own Web site. These arrangements are commonly discussed in terms of traditional journalistic ethics regarding sponsorship or disclosure. The idea is that if the writer discloses that the article is a paid placement, the reader can then apply the appropriate adjustment to the credibility of the content.

However, such a traditional framework misses an important aspect of the exchange. In the case of PageRank selling, the sources of the ranking will not be evident. It won't matter what the writer says about the product or what the reader thinks in terms of trusting the article, as the ranking algorithm will see only the link itself. If the accumulated purchasing of links eventually results in a high ranking, that process will be virtually invisible to the searcher. In a way, this is a disintermediation of the elite influencers—commodifying their social capital—and a reintermediation of that influencing process with an agency specializing in the task. Instead of courting a relatively few A-list writers who are highly valued for their ability to have their choice of topics echoed by many other writers, the lesser writers can be purchased directly (and perhaps more simply and cheaply).

Even for prominent writers who would decline an explicit pay-for-placement deal, the many ways linkage can be purchased (literally or metaphorically) lead to controversy over proper behavior. For example, one company, FON, set off a round of discussion by having many advisory board members who were also widely read Web writers.11 But the tiny company also got publicity from another source: influential commentators on the Internet who write blogs—including some who may be compensated in the future for advising FON about its business. Though an appropriate journalistic disclosure was made almost everywhere in this case, the aspect in which the social was intermingling with the commercial remained unsettling. A focus on disclaimers often assumes a certain background in separation and avoidance of conflict of interest and is insufficient when those strictures are no longer in place. While blurring the lines between business and friendship is not at all a new problem, the shifting systems of attention sorting and seeking are now bringing these issues to notice in new contexts.

To put it simply, there's an old joke that runs as follows:

BILLIONAIRE TO WOMAN: Would you have sex with me for a million dollars?

WOMAN: Well... yes.

BILLIONAIRE: Would you have sex with me for ten dollars?

WOMAN: What kind of a girl do you think I am?

BILLIONAIRE: We've already determined that. Now we're just arguing over the price.

Two factors make up the humor in this joke: commerce itself and amount. The obvious aspect of the joke is that there are two categories of interactions, commercial and social, between which there is not supposed to be any overlap, regardless of the dollar amount at stake. A less-often-remarked aspect is that there is indeed a “class” division between high-priced commercial and low-priced commercial.

Future controversies may present a real-life version of that joke that might go roughly as follows:

COMPANY TO BLOGGER: Would you write about me for advisory board membership?

BLOGGER: Well... yes.

COMPANY: Would you write about me for ten dollars?

BLOGGER: What kind of a flack do you think I am?

COMPANY: We've already determined that. Now we're just arguing over the price.

Is a few dollars the same as an advisory board membership? No—there's a class division, in that an advisory board membership is high-class and expensive, while a few dollars is tawdry and cheap. But there's also a problem when executive “escorts” criticize street prostitutes.

The Nofollow Attribute

There's a public relations saying (attributed to many people) that goes, “I don't care what the newspapers say about me as long as they spell my name right.” The concept is that any mention, positive or negative, is helpful in terms of recognition. Links exhibit a somewhat similar phenomenon: any link, even one originating from a page making negative statements about a site, can help build that site's search ranking. This is a particularly pernicious issue in the case of hate sites (as discussed earlier), as any publicity for the sites tends to generate more links to them, even if the publicity is negative. A link, by itself, cannot distinguish fame from infamy.

One attempt to address this dilemma has been the introduction of a special attribute, nofollow, to try to distinguish the purely referential aspect of a link from any implied popularity or importance of the site that has been referenced.12 In announcing the attribute, Google explained:

If you're a blogger (or a blog reader), you're painfully familiar with people who try to raise their own Web sites' search engine rankings by submitting linked blog comments like “Visit my discount pharmaceuticals site.” This is called comment spam. We researchers don't like it either, and we've been testing a new tag that blocks it. From now on, when Google sees the attribute (rel=nofollow) on hyperlinks, those links won't get any credit when we rank Web sites in our search results. This isn't a negative vote for the site where the comment was posted; it's just a way to make sure that spammers get no benefit from abusing public areas like blog comments, trackbacks, and referrer lists.
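In markup, the attribute is simply added to the anchor tag. The sketch below shows how a crawler might honor it when tallying link “votes”; the parsing uses Python's standard library, while the vote-counting logic is an illustrative assumption rather than Google's actual handling.

```python
# Illustrative only: count ordinary links as ranking "votes" but give no credit
# to links marked rel="nofollow".
from html.parser import HTMLParser

class LinkCounter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.counted, self.ignored = [], []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if "nofollow" in (attrs.get("rel") or "").split():
            self.ignored.append(href)    # still a reference, but no ranking credit
        else:
            self.counted.append(href)    # counts toward the target's popularity

parser = LinkCounter()
parser.feed('<a href="http://spam.example/" rel="nofollow">comment spam</a> '
            '<a href="http://cited.example/">editorial link</a>')
print("counted:", parser.counted, "ignored:", parser.ignored)
```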

The results of this attribute have been mixed. It certainly has prevented many blog owners who have open comment areas from inadvertently adding to spam pollution. But even if some link spammers have been discouraged, more than enough remain undeterred so that the problem of spammers is still overwhelming. While many blogs have automatically implemented the nofollow attribute on all links in their public areas, a large number of spammers will apparently spam anyway—finding it more efficient to be indiscriminate, perhaps, or in hopes of benefiting somehow in any case.

Businesses That Mine Data for Popularity: Not a Model for Civil Society

From a political standpoint, one might hope that the use of the nofollow attribute regarding hate sites would lower their rankings as people who mention them unfavorably discourage linking. But the use of this attribute in linking requires both knowledge of its existence and some sophisticated knowledge of how to code a link (as opposed to using a simple interface). So while this way of separating meanings is helpful overall, it is complicated enough to carry out so that the problem is not substantially addressed in practice.

Moving from the specifics of the nofollow attribute to the more general impact of links on people's consciousness, it should be clear by now that Google-like approaches to searching, which base rankings on the popularity of links, tend not to question the society's basic hierarchy. One initial simplistic way of thinking about link networks is to somehow lump all nodes together, as if there were no other structure for determining who received links. But since many links are made by people, all the prejudices and biases that affect who someone networks with personally or professionally can affect who they network with in terms of hypertext linkage. One writer described this (often gender-based) cliquishness in the following manner:

Point of fact, if you follow the thread of this discussion, you would see something like Dave linking to Cory who then links to Scoble who links to Dave who links to Tim who links to Steve who then links to Dave who links to Doc who follows through with a link to Dan, and so on. If you throw in the fact that the Google Guys are, well, guys, then we start to see a pattern here: men have a real thing for the hypertext link....

[Later] When we women ask the power-linkers why they don't link to us more, what we're talking about is communication, and wanting a fair shot of being heard; but what the guys hear is a woman asking for a little link love. Hey lady, do you have what it takes? More important, are you willing to give what it takes?

Groupies and blogging babes, only, need apply.13

Recall that popularity can be confused with authority and that a link from a popular site carries more weight to a search engine. The self-reinforcing nature of references within a small group can then be a very powerful tool for excluding those outside the inner circle. Instead of democracy, there's effectively oligarchy.

The best way, by far, to get a link from an A-List blogger is to provide a link to the A-List blogger. As the blogosphere has become more rigidly hierarchical, not by design but as a natural consequence of hyperlinking patterns, filtering algorithms, aggregation engines, and subscription and syndication technologies, not to mention human nature, it has turned into a grand system of patronage operated—with the best of intentions, mind you—by a tiny, self-perpetuating elite. A blog-peasant, one of the Great Unread, comes to the wall of the castle to offer a tribute to a royal, and the royal drops a couple of coins of attention into the peasant's little purse. The peasant is happy, and the royal's hold over his position in the castle is a little bit stronger.14

In fact, rather than subvert hierarchy, it's much more likely that hyperlinks (and associated popularity algorithms) reflect existing hierarchies.15 This is true for a very deep reason—if an information-searching system continually returned results that were disturbing or upsetting, there would be strong pressure to regard that system as incorrect and change it or to defect to a different provider. As can be seen in some of the discussions earlier in this essay, even isolated anomalous results can draw angry reactions. Subversive results would not be acceptable.

The positive results from data-mining links for popularity are certainly impressive but have also inspired flights of punditry that project a type of divinity or mystification into the technology. New York Times columnist Thomas Friedman wrote an op-ed column entitled “Is Google God?” where he quoted a Wi-Fi company vice president as saying:

If I can operate Google, I can find anything. And with wireless, it means I will be able to find anything, anywhere, anytime. Which is why I say that Google, combined with Wi-Fi, is a little bit like God. God is wireless, God is everywhere and God sees and knows everything. Throughout history, people connected to God without wires. Now, for many questions in the world, you ask Google, and increasingly, you can do it without wires, too.16

However, in contrast to the utopianism, there is much research to show that the mundane world is very much the same as it ever was. Hindman and his colleagues note: “It is clear that in some ways the Web functions quite similarly to traditional media. Yes, almost anyone can put up a political Web site. But our research suggests that this is usually the online equivalent of hosting a talk show on public access television at 3:30 in the morning.”17

Link popularity is itself no solution to problems in governance. Determining what opinions are popular is usually one of the least complicated political tasks. But what if the results are hateful or are manufactured by an organized lobbying campaign? How much weight should be given to strong minority views in opposition to the majority? These questions, which determine the character of a society, are not answered by merely listing the popular opinions and options. Moreover, some of the lessons learned from such businesses are arguably exactly the wrong lessons needed for a pluralistic democracy, where you cannot simply ban the minority that isn't profitable. Unfortunately and maybe self-provingly, that is not a popular position.


NOTES

1. S. Brin and L. Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine,” WWW7 / Computer Networks 30, nos. 1–7 (1998): 107–17.

2. Google Inc., “Our Search: Google Technology,” 2007, http://www.google.com/intl/en/technology/.

3. W. Hinckle, If You Have a Lemon, Make Lemonade (New York: Putnam, 1974).

4. B. Edelman and J. Zittrain, “Localized Google Search Result Exclusions,” 2002, http://cyber.law.harvard.edu/filtering/google/; Seth Finkelstein, “Google Censorship—How It Works,” http://sethf.com/anticensorware/general/google-censorship.php; BBC News, “Google Censors Itself for China,” http://news.bbc.co.uk/1/hi/technology/4645596.stm.

5. “Sick Website Taken Down,” Chester Chronicle, February 21, 2003, http://iccheshireonline.icnetwork.co.uk/0100news/chesterchronicle/page.cfm?objectid=12663897&method=full&siteid=50020.

6. D. Becker, “Google Caught in Anti-Semitism Flap,” 2004, http://zdnet.com.com/2100-1104-5186012.html.

7. Google Inc., “An Explanation of Our Search Results,” 2004, http://www.google.com/explanation.html.

8. Anti-Defamation League, “Google Search Ranking of Hate Sites Not Intentional,” 2004, http://www.adl.org/rumors/google_search_rumors.asp.

9. J. Eskenazi, “No. 1 Google Result for ‘Jew’ Is Fanatical Hate Site—for Now,” 2004, http://www.jewishsf.com/content/2-0/module/displaystory/story_id/21783/format/html/displaystory.html.

10. B. Edelman and J. Zittrain, “Localized Google Search Result Exclusions,” 2002, http://cyber.law.harvard.edu/filtering/google/.

11. R. Buckman, “Blog Buzz on High-Tech Start-ups Causes Some Static,” 2006, http://online.wsj.com/public/article/SB113945389770169170-0DZ4wQffelheiC5fe4GISe73UwQ_20070209.html.

12. Google Inc., “Preventing Comment Spam,” 2005, http://googleblog.blogspot.com/2005/01/preventing-comment-spam.html.

13. S. Powers, “Guys Don't Link,” 2005, http://weblog.burningbird.net/2005/03/07/guys-dont-link/.

14. N. Carr, “The Great Unread,” 2006, http://www.roughtype.com/archives/2006/08/the_great_unrea.php.

15. J. Garfunkel, “The New Gatekeepers,” 2005, http://civilities.net/TheNewGatekeepers.

16. T. Friedman, “Is Google God?” 2003, http://www.cnn.com/2003/US/06/29/nyt.friedman/.

17. M. Hindman, K. Tsioutsiouliklis, and J. Johnson, “‘Googlearchy’: How a Few Heavily-Linked Sites Dominate Politics on the Web,” 2003, http://www.cs.princeton.edu/~kt/mpsa03.pdf.
