Date Topic covered Assignments
24-Aug Introduction to class (all) Information Retrieval,
Introduction to information retrieval (all) Search Engine,
Introduction to search engines (all) List of search engines,
How Google works?
Fill out student information form
31-Aug Complexity and scalability of search (Big O) (all) Scalability,
How much information (all) Big O,
Exercise 1 (all) Unstructured data,
Teams assigned (all)
7-Sep Concept of a document (all) Greengrass: 2.1.1-2.1.4, 2.1.6
Retrieval evaluation (all) van Rijsbergen: Ch 7, (up to Swets model)
Enterprise, specialty (vertical) search (graduate students required) Manning, Raghavan, Schutze: Ch 8
14-Sep Robots.txt (all) Web Crawler,
Web crawling (all) Robots.txt exclusion principle,
Specialty search engines (all) Scrapy -
Scrapy introduction (all) Manning, Raghavan, Schutze: Ch 20.1-20.2
Linux exercises (all)
Exercise 1 due (all) "Vertical Search White Paper" Slack Barsinger,
Exercise 2 (all) Search Engine Technology,
21-Sep Properties of text (all) Manning, Raghavan, Schutze: Ch 1, Ch 2.1-2.2, Ch 5.1
28-Sep Classic information retrieval - vector models (all) Greengrass: 6.1-6.4
Similarity ranking (all) Manning, Raghavan, Schutze: Ch 6
Query models (all) Greengrass: 3 & 4
Exercise 2 due
Exercise 3
Graduate student project presentation
5-Oct Specialty search engine updates (all) Manning, Raghavan, Schutze: Ch 1, Ch 4.1-4.2
Indexing (graduate students required) Manning, Raghavan, Schutze: Ch 4
Google Custom Search Engine (all) Manning, Raghavan, Schutze: Ch 7.1
(Programmable Search Engine) (graduate students required) Manning, Raghavan, Schutze: Ch 7
Exercise 4
12-Oct Web search basics (all) Manning, Raghavan, Schutze: Ch 19.1-19.5
Elasticsearch introduction (graduate students required) Manning, Raghavan, Schutze: entire Ch 19
Exercise 3 due (all) World Wide Web
19-Oct Search engines (all) The Anatomy of a Large-Scale Hypertextual Web Search Engine, S. Brin, L. Page,
Link analysis - Google (all)
Exercise 4 due (graduate students required) Manning, Raghavan, Schutze: Ch 21
Exercise 5
Specialty Google Custom Search engine presentations
Graduate student project updates
26-Oct XML and the semantic web (all) "Introduction to XML," Read all in XML BASIC:
Issues in advanced search (all) "Semantic Web Tutorial"
Customizing Elasticsearch (all) Metadata,
(all) Web 2.0,
2-Nov Review for exam Old exams available online
Team status reports of project
Exercise 5 due
Exercise solution set available
9-Nov Exam
16-Nov Work on specialty search engines
Project and search engine updates
23-Nov Thanksgiving break
30-Nov Specialty search engine project presentations
Teams 4, 3, 2, 1
7-Dec Specialty search engine project presentations
Graduate student teams
10-Dec Last day of classes
Peer evaluations due
13-Dec 2 PDFs of Search Engine Project Reports due at 8 AM
Please submit a PDF via Canvas and 2 hard copies of the PDF to Westgate E350