Tuesday, March 09, 2004

Part II: In the beginning -- before Google -- a darkness was upon the land.

This is a continuation of an article from the Washington Post about Google, information, and what the future may hold.
I thought it was interesting enough that I wanted to post it in its entirety on my blog.


Search For Tomorrow
We Wanted Answers, And Google Really Clicked. What's Next?

By Joel Achenbach
Washington Post Staff Writer
Sunday, February 15, 2004; Page D01

( . . . continued from Part I . . . )

Advanced Search

In the early days of search engines, finding information was like fishing in a canal: You might hook something good, but you were just as likely to reel in an old tin can or a rubber boot. Now you often find exactly what you want.

One reason Google works so well today is that there's so much for its robotic crawlers to explore. Google initially searched about 20 million Web pages; the company's home page now boasts that it searches 3,307,998,701 pages.

"In 1996, if you tried to Google someone, if Google existed, it wouldn't have been a very satisfying experience," says Seth Godin, author of a number of best-selling e-books. "We hit a critical mass of really valuable stuff that was online, I think, about 2000."

The expansion of the information universe makes the navigational tool all the more valuable. And yet the search function at first seemed to be an unglamorous computer application. The pioneering search engine companies, including Yahoo!, Excite, AltaVista and Lycos, wanted to transform themselves into something snazzier, a "portal," the full gee-whiz Internet Century home page that would offer the user a link to everything between here and Neptune, plus plane tickets.

But the history of computer technology is full of companies that failed to see the potential glory right in front of them. In the early 1980s, IBM thought that the "operating system" within the computer wasn't nearly as important as the hardware, the box itself. And then Microsoft, which benefited from that oversight, became so focused on software programs that it was slow to capitalize on the Internet revolution, leaving Netscape to create the first commercial Web browser. And then almost everyone underestimated Search.

Not Google. When the company debuted in September 1998, it looked like a throwback. This wasn't a portal. The home page showed mostly white space, anchored by a little rectangle, a box, perfectly blank. Fill in the blank and get results. This was plain ol' boring Search, without news headlines, plane tickets, e-mail or any other bells and whistles.

But what results! Google has farms of computers working in parallel. You can put in a couple of words and -- gzzzzt! -- get 600,000-plus results within some preposterously brief amount of time. (Google brags about it: "Search took 0.17 seconds." Showoffs!)

Google, the creation of Stanford graduate students Sergey Brin and Larry Page, is like many other search engines in its basic operation. It has powerful software programs that automatically "crawl" the Web, clicking on every possible link, scouting the terrain. What has made Google special is that, in assessing the quality of sites, it takes note of how many other pages link to any given page. This is an old idea from academia, called citation analysis. If many Web sites link to a particular page, the page rises in Google's vaunted "page rank" and is more likely to be on the first page of the search results.
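As a rough illustration of the citation-analysis idea described above, here is a toy Python sketch of the iterative link-counting computation behind PageRank. The tiny four-page "web," the damping factor and the iteration count are illustrative assumptions, not Google's actual data or parameters.

    # Minimal sketch of the "citation analysis" idea behind PageRank:
    # each page's score is spread across the pages it links to, and the
    # process is repeated until the scores settle.

    def pagerank(links, damping=0.85, iterations=50):
        pages = list(links)
        rank = {p: 1.0 / len(pages) for p in pages}
        for _ in range(iterations):
            new_rank = {p: (1 - damping) / len(pages) for p in pages}
            for page, outlinks in links.items():
                if not outlinks:
                    # A page with no outgoing links shares its rank evenly.
                    for p in pages:
                        new_rank[p] += damping * rank[page] / len(pages)
                else:
                    for target in outlinks:
                        new_rank[target] += damping * rank[page] / len(outlinks)
            rank = new_rank
        return rank

    # A made-up web of four pages: many links into "home" push it up the ranking.
    toy_web = {
        "home":    ["about"],
        "about":   ["home"],
        "blog":    ["home", "about"],
        "contact": ["home"],
    }
    print(sorted(pagerank(toy_web).items(), key=lambda kv: -kv[1]))

Running the sketch shows the heavily cited "home" page floating to the top of the list, which is the whole point: popularity among other pages, not just the words on the page, decides the order of results.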

"You're getting the advantage of the group mind," says Paul Saffo, a research director at the Institute for the Future.

This is a key concept: As the Web has grown, it has developed a kind of embedded wisdom. Obviously the Web isn't a conscious entity, but neither is it a completely random pile of stuff. The way one part links to another reflects the preferences of Web users -- and Google tapped into that. Google, in detecting patterns on the Web, harvested meaning from all that madness.

This points the way to one of the next big leaps for search engines: finding meaning in the way a single person searches the Web. In other words, the search engines will study the user's queries and Web habits and, over time, personalize all future searches. Right now, Google and the other search engines don't really know their users.

For example, Saffo isn't really interested in the stuff that most people look for when they do a Web search. He's one of the premier futurists of Silicon Valley and fondly recalls the days, back in the 1980s and early 1990s, the pre-Web era, when the Internet was the preserve of the technological elite who posted their brilliant thoughts on electronic bulletin boards. Now, everyone from about third grade up has an e-mail address and loiters around the Web as though it's the corner 7-Eleven. The results of a Web search reflect the tastes of a broad swath of ordinary Americans who in some cases are still wearing short pants.

"The more people get on the Web, the more the Web becomes the vaster wasteland that is the successor to the vast wasteland of television. I don't care what the majority of people are looking at, because the majority of people are really boring," Saffo says.

He needs a better search engine. He needs one that knows that he's a big-brain tech guru and not an eighth-grader with a paper due.

"The field is called user modeling," says Dan Gruhl of IBM. "It's all about computers watching interactions with people to try to understand their interests and something about them."

Imagine a version of Google that's got a bit of TiVo in it: It doesn't require you to pose a query. It already knows! It's one step ahead of you. It has learned your habits and thought processes and interests. It's your secretary, your colleague, your counselor, your own graduate student doing research for which you'll get all the credit.

To put it in computer terminology, it is your intelligent agent.


Calling Agent 001101

No one knows how the intelligent agents of the future might really work, and once you venture more than a few months out you're already into some seriously fuzzy territory. But you might imagine that this intelligent agent could gradually take on so many characteristics of your mind that it becomes something of a digital doppelganger, your shadow self.

To borrow and slightly distort something from "Star Trek," it's like your personal digital Borg, having absorbed your thoughts and melded them with an existing software program.

Perhaps this digital self could become a commodity, something marketable. Imagine that you have to write a paper for a class about the future of search engines. You don't want to use your own lame, broken-down, distracted, gummed-up-with-stupid-stuff virtual secretary to do your research. You want to download Bill Gates's intelligent agent, or Paul Saffo's, or Sergey Brin's, to help you ask smarter questions and find the best answers.

There are primitive intelligent agents already. Amazon.com makes book recommendations based on your previous purchases and the judgments of others who have liked the same books you've liked. But this form of collaborative filtering is still fairly crude.
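A bare-bones sketch of that collaborative-filtering idea: recommend items bought by people whose purchase histories overlap with yours. The tiny purchase history below is made up for illustration and has nothing to do with Amazon's real system.

    # Crude collaborative filtering: rank candidate books by how many
    # purchases the people who bought them share with the target user.

    purchases = {
        "alice": {"Permission Marketing", "The Tipping Point"},
        "bob":   {"The Tipping Point", "The Innovator's Dilemma"},
        "carol": {"Permission Marketing", "Linked"},
    }

    def recommend(user, history=purchases):
        mine = history[user]
        scores = {}
        for other, theirs in history.items():
            if other == user:
                continue
            overlap = len(mine & theirs)      # similarity = shared purchases
            for book in theirs - mine:        # candidates the user doesn't own yet
                scores[book] = scores.get(book, 0) + overlap
        return sorted(scores, key=scores.get, reverse=True)

    print(recommend("alice"))  # candidate books, ranked by shared purchases

The crudeness the article mentions is visible even here: the method knows nothing about what the books are about, only who else happened to buy them.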

Microsoft senior researcher Eric Horvitz describes a variety of new and future technologies in which software is more active, more of an entity, no longer just some inert codes waiting for the user to issue a command. For example, there's a program he already uses called IQ, for "implicit query."

"As you're working, we continue to formulate queries in the background, that the user doesn't even know about. They're happening very quietly," Horvitz says.

But Horvitz is keenly aware that people don't want a program that's too pushy, that's constantly interrupting. Humans have limited powers of attention. Software, says Horvitz, "needs to be endowed with the kind of common courtesies we'd expect from a well-mannered colleague."

And lurking over the future of such programs is the dilemma of privacy. There's valuable information in the way people use the Web, but they may not want others, or even a machine, to pay close attention to every place they venture. How do you create an intelligent agent that knows when to look away? How do you avoid what Horvitz calls the "monster possibilities"?

What everyone wants is a reasonable, discreet intelligent agent, like an English butler: one that can get things accomplished and take the extra steps even without being prompted.

"I don't think anyone wants a search engine," says Seth Godin. "I think people want a find engine."

Find, and do. Solve problems. Make it so.

"I often use the analogy of Web agents being like travel agents," says James Hendler, a computer science professor at the University of Maryland. "When I go to my travel agent and say where I want to go, they don't usually just say, 'Yes, you can get there.' They give me some options of different ways to get there. They think about some things I might have forgotten. Do I need a car, do I need a hotel reservation? And then they go do it for me."

Computers as a general rule do only what they're told to do. They don't have artificial intelligence in the classic sense. They have no common sense. IBM's Gruhl, the chief architect of a new product called WebFountain, points out that no computer has ever learned what any 2-year-old human knows.

A computer, he says, can become easily confused by the sentence "Tommy hit a boy with a broken leg." The computer doesn't understand that a broken leg is not going to be an instrument used in an attack. "Common sense, how the world works, even something like irony, are very difficult for computers to understand," says Gruhl.

(continued)
© 2004 The Washington Post Company
