Authors
Brian Amento, Loren Terveen, Will Hill
Publication date
2000/7/1
Book
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Pages
296-303
Description
For many topics, the World Wide Web contains hundreds or thousands of relevant documents of widely varying quality. Users face a daunting challenge in identifying a small subset of documents worthy of their attention.
Link analysis algorithms have received much interest recently, in large part for their potential to identify high-quality items. We report here on an experimental evaluation of this potential.
We evaluated a number of link-based and content-based algorithms using a dataset of web documents rated for quality by human topic experts. Link-based metrics did a good job of picking out high-quality items. Precision at 5 is about 0.75, and precision at 10 is about 0.55; this is in a dataset where 0.32 of all documents were of high quality. Surprisingly, a simple content-based metric performed nearly as well: ranking documents by the total number of pages on their containing site.
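The precision figures in the description can be read as follows: precision at k is the fraction of the top k ranked documents that experts rated high quality, and the share of high-quality documents in the whole dataset (0.32 in the paper) is the precision a random ordering would be expected to reach. The sketch below illustrates that computation for a ranking by site size. It is a minimal illustration, not the authors' code: the function name `precision_at_k`, the document identifiers, the page counts, and the quality labels are all hypothetical toy values.

```python
def precision_at_k(ranked_ids, high_quality_ids, k):
    """Return the fraction of the top-k ranked documents that are high quality.

    Divides by k even if fewer than k documents are available, which is one
    common convention; this choice is an assumption, not taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in high_quality_ids)
    return hits / k


# Hypothetical toy data: number of pages on each document's containing site.
site_page_counts = {"a": 120, "b": 15, "c": 300, "d": 4, "e": 80, "f": 42}

# Hypothetical expert judgments: the set of documents rated high quality.
high_quality = {"a", "c", "f"}

# Content-based ranking in the spirit of the description: larger sites first.
ranked = sorted(site_page_counts, key=site_page_counts.get, reverse=True)

p_at_3 = precision_at_k(ranked, high_quality, 3)

# Expected precision of a random ordering: the share of high-quality documents.
baseline = len(high_quality) / len(site_page_counts)

print(ranked, p_at_3, baseline)
```

Comparing `p_at_3` with `baseline` mirrors the comparison in the description, where precision at 5 of about 0.75 is set against a high-quality share of 0.32.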
Total citations
[Chart: citations per year, 2000 to 2024]
Scholar articles
B Amento, L Terveen, W Hill - Proceedings of the 23rd annual international ACM …, 2000