| Date | Topic covered | Assignments |
| 14-Jan | Class syllabus | |
| | Introduction to information retrieval | Wikipedia - Information Retrieval |
| | Amount of information | http://www.sims.berkeley.edu/research/projects/how-much-info-2003/ |
| 21-Jan | No Class | |
| 28-Jan | Concept of a document | (all) Greengrass: 2.1.1-2.1.4, 2.1.6 |
| | Specialty search engines | (all) van Rijsbergen: Ch 7 (up to Swets model), http://www.dcs.gla.ac.uk/Keith/Chapter.7/Ch.7.html |
| | Retrieval evaluation | (graduate students required) Manning, Raghavan, Schutze: Ch 8 |
| | Group assignments | (all) "Vertical Search White Paper," Slack Barsinger, http://clgiles.ist.psu.edu/IST441/materials/papers/ |
| | Exercise 1 | |
| 4-Feb | Query models | (all) Greengrass: 3 & 4 |
| | Properties of text | (all) Manning, Raghavan, Schutze: Ch 1, Ch 2.1-2.2, Ch 5.1, Ch 20.1-20.2 |
| | Crawling - robots.txt | (all) Robots.txt exclusion principle, http://www.robotstxt.org/ |
| 11-Feb | Classic information retrieval - vector models | (all) Greengrass: 6.1-6.4 |
| | Similarity ranking | (all) Manning, Raghavan, Schutze: Ch 6 |
| | Exercise 2 | |
| | Exercise 1 due | |
| 18-Feb | Indexing | (all) Manning, Raghavan, Schutze: Ch 1, Ch 4.1-4.2 |
| | | (graduate students required) Manning, Raghavan, Schutze: Ch 4 |
| 25-Feb | Web search basics | (all) Manning, Raghavan, Schutze: Ch 19.1-19.5 |
| | Specialty search engine proposal presentations | (graduate students required) Manning, Raghavan, Schutze: entire Ch 19 |
| | Exercise 2 due | |
| | Exercise 3 | |
| 3-Mar | Search engines | (all) The Anatomy of a Large-Scale Hypertextual Web Search Engine, S. Brin, L. Page, http://clgiles.ist.psu.edu/IST441/materials/papers/ |
| | Link analysis - Google | (all) http://www.webworkshop.net/pagerank.html |
| | Specialty search engine proposal presentations | (graduate students required) Manning, Raghavan, Schutze: Ch 21 |
| 10-Mar | No Class - Spring break | |
| 17-Mar | Social Networks | (all) Social Network Analysis, http://en.wikipedia.org/wiki/Social_network |
| | Exercise 3 due | (all) Chs 1-5, http://www.faculty.ucr.edu/~hanneman/nettext/ |
| | Exercise 4 | (all) Reread Ch 19.2.1 Manning, Raghavan, Schutze |
| | | (all) Social Networking Services, http://en.wikipedia.org/wiki/Social_networking |
| 24-Mar | XML | (all) "Introduction to XML," Read all in XML BASIC: http://www.w3schools.com/xml/default.asp |
| | Issues in advanced search | (all) TREC 2008, http://trec.nist.gov/call08.html |
| | Specialty search engine proposal updates | |
| 31-Mar | Review for exam | |
| | Exercise 4 due | |
| 7-Apr | Exam | |
| 14-Apr | Specialty search engine presentations - groups | 2, 13, 7, 9, 5 |
| 21-Apr | Specialty search engine presentations - groups | 10, 1, 8, 12, 3 |
| 28-Apr | Specialty search engine presentations - groups | 14, 11, 4, 6, 15 |