Date Topic covered Assignments
23-Aug Introduction to class (all) Information Retrieval,
Introduction to information retrieval (all) Search Engine,
Introduction to search engines (all) List of search engines,
How Google works?
Fill out student information form
30-Aug Class cancelled
6-Sep Complexity and scalability of search (Big O) (all) Scalability,
How much information (all) Big O,
Exercise 1 (all) Unstructured data,
13-Sep Concept of a document (all) Greengrass: 2.1.1-2.1.4, 2.1.6
Retrieval evaluation (all) van Rijsbergen: Ch 7 (up to Swets model)
Enterprise, specialty (vertical) search (graduate students required) Manning, Raghavan, Schutze: Ch 8
Teams assigned (all)
20-Sep Robots.txt (all) Web Crawler,
Web crawling (all) Robots.txt exclusion principle,
Specialty search engines (all) Scrapy -
Scrapy introduction (all) Manning, Raghavan, Schutze: Ch 20.1-20.2
Linux exercises (all)
Exercise 1 due (all) "Vertical Search White Paper" Slack Barsinger,
Exercise 2 (all) Search Engine Technology,
27-Sep Properties of text (all) Manning, Raghavan, Schutze: Ch 1, Ch 2.1-2.2, Ch 5.1
4-Oct Classic information retrieval - vector models (all) Greengrass: 6.1-6.4
Similarity ranking (all) Manning, Raghavan, Schutze: Ch 6
Query models (all) Greengrass: 3 & 4
Exercise 2 due
Exercise 3
Graduate student project presentation
11-Oct Specialty search engine updates (all) Manning, Raghavan, Schutze: Ch 1, Ch 4.1-4.2
Indexing (graduate students required) Manning, Raghavan, Schutze: Ch 4
Google Custom Search Engine (all) Manning, Raghavan, Schutze: Ch 7.1
(Programmable Search Engine) (graduate students required) Manning, Raghavan, Schutze: Ch 7
Exercise 4
18-Oct Web search basics (all) Manning, Raghavan, Schutze: Ch 19.1-19.5
Elasticsearch introduction (graduate students required) Manning, Raghavan, Schutze: entire Ch 19
Exercise 3 due (all) World Wide Web
25-Oct Search engines (all) The Anatomy of a Large-Scale Hypertextual Web Search Engine, S. Brin, L. Page,
Link analysis - Google (all)
Exercise 4 due (graduate students required) Manning, Raghavan, Schutze: Ch 21
Exercise 5
Specialty Google Programmable Search Engine presentations
1-Nov XML and the semantic web (all) "Introduction to XML," Read all in XML BASIC:
Issues in advanced search (all) "Semantic Web Tutorial"
Customizing Elasticsearch (all) Metadata,
Ranking by Transformers (all) Web2.0,
8-Nov Review for exam Old exams available online
Team status reports of project
Exercise 5 due
Exercise solution set available
15-Nov Exam
22-Nov Thanksgiving break
29-Nov Work on specialty search engines
Project and search engine updates
6-Dec Specialty search engine project presentations
9-Dec Last day of classes
Peer evaluations due
12-Dec 2 PDFs of Search Engine Project Reports due at 8 AM
Please submit a PDF via Canvas and 2 hard copies of the PDF to Westgate E350