Wednesday, July 30, 2008

Cuil - The Next "Google Killer"?

With the recent press about Cuil, the latest "Google-Killer Search Engine", it seems that we've forgotten the lessons from the late 90s. Cuil's claim to fame appears to be:


It was founded by Ex-Googlers
They claim to have a larger web-page index than Google
The first point is somewhat interesting, but not exactly a path to success. As for the second point, I'd like to say: (1) How do you know that? (2) What does that mean? (3) So?

How Do You Know That?

Google doesn't release the size of its index.

What Does That Mean?

How did they count the size of Google's index? If two URLs have identical content, are those the same page? What if the content is merely very similar? And if the only difference is that Google isn't indexing the duplicate pages (or, say, the spammy pages), does it matter that Cuil's index is bigger?
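To make the counting question concrete, here is a rough Python sketch showing how the same small crawl can have three different "index sizes" depending on what counts as a page. Everything in it is an assumption for illustration: the example URLs, the 3-word shingles, and the 0.5 similarity threshold are made up, and this is not how Cuil or Google actually count.

```python
import hashlib


def shingles(text, k=3):
    """Return the set of k-word sequences in the text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 means identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def count_by_url(pages):
    """Every distinct URL counts as a page."""
    return len(pages)


def count_by_exact_content(pages):
    """URLs with byte-identical content count once."""
    digests = {hashlib.sha256(text.encode("utf-8")).hexdigest()
               for text in pages.values()}
    return len(digests)


def count_by_near_duplicates(pages, threshold=0.5):
    """Pages whose shingle sets are similar enough count once.

    Greedy and quadratic: fine for a toy example, not for a real index.
    """
    representatives = []
    for text in pages.values():
        candidate = shingles(text)
        if not any(jaccard(candidate, seen) >= threshold
                   for seen in representatives):
            representatives.append(candidate)
    return len(representatives)


# Hypothetical four-URL "crawl".
pages = {
    "http://example.com/a": "the quick brown fox jumps over the lazy dog",
    "http://example.com/a?ref=1": "the quick brown fox jumps over the lazy dog",
    "http://example.com/b": "the quick brown fox jumps over the lazy cat",
    "http://example.com/c": "an entirely different page about search engines",
}

print(count_by_url(pages))              # 4
print(count_by_exact_content(pages))    # 3
print(count_by_near_duplicates(pages))  # 2
```

The same four URLs give an index of 4, 3, or 2 pages, so a "bigger index" claim says little without the counting rule behind it.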

So?

Bigger isn't better. I thought we'd learned that back in the late 90s. For most queries, it doesn't matter if the search engine returns 30 results or 1000. You generally don't go past the 3rd page. What really matters is the ranking of the pages. If the page you wanted is on the 15th page, it might as well not be there at all.
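One rough way to put a number on "ranking matters more than size" is the reciprocal rank of the page you wanted: 1 if it is the first result, 1/2 if it is second, and so on. The sketch below is only an illustration. The result lists are invented, and the 10-results-per-page layout is my assumption, not something either engine guarantees.

```python
def reciprocal_rank(results, wanted):
    """Return 1/position of the wanted result, or 0.0 if it is absent."""
    for position, url in enumerate(results, start=1):
        if url == wanted:
            return 1.0 / position
    return 0.0


def page_number(position, per_page=10):
    """Which results page a 1-based position falls on (assumes 10 per page)."""
    return (position - 1) // per_page + 1


# Hypothetical engine A: only 30 results, but the wanted page is first.
small_results = ["wanted"] + ["other-%d" % i for i in range(29)]

# Hypothetical engine B: 1000 results, wanted page at position 142 (page 15).
big_results = (["other-%d" % i for i in range(141)]
               + ["wanted"]
               + ["other-%d" % i for i in range(141, 999)])

print(len(small_results), reciprocal_rank(small_results, "wanted"))
print(len(big_results), reciprocal_rank(big_results, "wanted"))
print(page_number(big_results.index("wanted") + 1))
```

By this measure the 30-result engine scores 1.0 and the 1000-result engine scores about 0.007, which is barely better than not returning the page at all.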

How Cuil Actually Stacks Up:

Interface:

Pros: Slick and pretty. The content drill down is nice - although it doesn't always display relevant things. I also like having the page numbers locked at the bottom so that I don't have to scroll.
Cons: Ranking of results is unclear. There are three columns and the rows don't line up with each other. When I'm trying to actually find a good page, I'm not sure where to read.

Speed, Reliability, Performance

Pros: Speedy
Cons: Searches frequently fail. I got "no results" when I tried searching for "Google Talk". I tried the same search a second time and it worked.

Search Result Quality

Selection Criteria for Sample Queries: All queries were selected from my Google Web History, and were queries in which I was attempting to answer a question.

Query #1 (an error I am getting with Google App Engine): error 403 cpu quota exceeded
Cuil: No Results
Yahoo: #1 Result is Google App Engine article about it
Google: #1 Result is a Google Group question about this. #3 (or #5) is the Google App Engine article
Winner: Yahoo, with Google as a close second.
Answer: Common Error. Try using python's profiling.

Query #2: send pdf to kindle
Cuil: Shows articles mentioning that you can do this, but not telling me how.
Yahoo: #1 Result is a discussion about it.
Google: #1 Result is a link to Amazon explaining how to do this.
Winner: Google
Answer: Your Kindle has an email address that you email the pdf to.

Query #3: 99 luftballoons translation
Cuil: #1 Result is a translation
Yahoo: #1 Result is someone asking for a translation
Google: #1 Result is a translation
Winner: Cuil & Google (tie).
Answer: It's about war. And red balloons. :-)

Query #4: "imagine no religion" billboard seattle
Cuil: No results
Yahoo: #1 Result is blog post mentioning it. #2 Result is press release about it.
Google: #1 Result is press release about it. #2 Result is blog post mentioning it.
Winner: Google, with Yahoo as a close second
Answer: This billboard was put up by the Freedom From Religion Foundation.

Query #5: percent female math majors in US
Cuil: No results
Yahoo: #1 is a seemingly-relevant but dead link. #2 also seems relevant, but not a direct answer. #3 is about carbon monoxide levels at death. Hmm...
Google: #1 is a related article that contains an answer to the question. #2 is a very relevant study, and the summary (which is as far as I read) indirectly answers the question. #3 is about a particular school's gender ratio.
Winner: Google.
Answer: 48% of math majors in the US are female.

Bonus Query: cuil
Cuil: Nothing even remotely related to the search engine.
Yahoo: #1 result is the search engine.
Google: #1 result is the search engine.
Winner: Google and Yahoo. Poor Cuil...
Answer: Google and Yahoo both know what Cuil is (as well as what each other are). Cuil, sadly, does not.

Conclusions

While Cuil may claim to have a larger search index, the number of "no result" searches certainly suggests lesser web coverage. The flashy interface is mostly just that - flashy. It's pretty, but the three-column layout leaves your eyes wandering all over the page, unsure of which result is meant to be the most relevant. A more cynical person might even suggest that the three-column layout helps mask the fact that Cuil may not know an appropriate ranking.

If you want to get real traction as yet-another-search-engine, you'd better attack a different market from Google (or Baidu in China, or Yahoo in Japan, etc) or you'd better be substantially better than Google. Just being better isn't good enough, and Cuil has a long way to go even on that end.
