IST 441 Information Retrieval and Search
This course can be counted for the IST 402 requirement.
Time and Place: Spring, 2017. 4-7, Monday, 210 IST Bldg.
Office Hours: Dr. Lee Giles, IST 311A, Tuesday,
311A IST Bldg, 3-4 or TBA.
TA: Sagnik Choudhury,
IST 309, TBA.
This is a three-hour course for juniors, seniors and graduate
students that meets once a week. The course will cover:
organization, representation, and access to information;
categorization, indexing, and content analysis; data structures for
unstructured data; design and maintenance of such databases,
indexing and indexes, retrieval and classification schemes; use of
codes, formats, and standards; analysis, construction and evaluation
of search and navigation techniques; and search engines and how they
relate to the above.
This course is intended to prepare students to understand, design,
develop and use information retrieval and search systems.
IST students should have taken IST 210 and IST 240. IST 220
and IST 230 are also useful. Other students should consult with the
instructor.
Schedule (syllabus): This schedule is
subject to change. Please check it on a regular basis for
assignments. The reading list is here; most classes will have online
handouts. It is the student's responsibility to download that
Course Materials and References: Course materials can
be found here. There will also be links on the schedule.
- The project is a group activity unless approved by the instructor.
Policy: All exercise solution sets must be submitted as hard copies and
are due at the start of class on the due date. Starting right after
the required submission date, 1/3 of the grade will be deducted
for each day late until no credit remains.
- All exercise assignments unless stated are individual
For more information on any of the above, please contact Lee Giles.
Texts and Readings: No text is
required. Online papers and chapters and selections from online
books will be used.
The reading list is on the schedule above. We will use
chapters and sections from
There are many other useful texts
on both search and information retrieval. A good selection, but a
bit outdated, can be found in the resources
section of the first book.
One that is very good and downloadable is
Search Engines: Information Retrieval in Practice by W.
Bruce Croft, Donald Metzler, and Trevor Strohman.
A very mathematical treatment
can be found in
Google's PageRank and Beyond by Amy Langville and Carl Meyer.
Popular but less technical books
that you may find useful and very informative, though a bit
dated:
John Battelle, The Search: How Google and Its
Rivals Rewrote the Rules of Business and Transformed Our Culture,
Ian Witten, Marco Gori, Teresa Numerico, Web Dragons: Inside the Myths of Search Engine
Technology, Morgan Kaufmann, 2006.
We will be using the popular open
source enterprise search platform, Solr/Lucene, which is
based on the even more popular Lucene.
All email to the instructor and TA about this class should contain
"IST441" in the subject line. For example, the subject line
might read "IST441: Question about ....". Email without this
information might be deleted by spam filters or placed in a folder
to be read at a later date. Email with the appropriate
identifier will usually be read within 24 hours of being received.
Reuse: Materials from this
course can be publicly reused in other courses.