IST 402 – Emerging Technologies
Information Retrieval and Search Engines
Professor: Dr. C. Lee Giles
TA: Seyda Ertekin
Time and Place: 4:40-5:45 Monday, Wednesday, 205 IST Bldg.
Office hours: Instructor:
TA:
Course Overview
This is a three-hour course that meets twice a week. The course will
cover the organization and representation of, and access to, information;
categorization, indexing, and content analysis; data structures for
unstructured data; the design and maintenance of such databases; indexing and
indexes; retrieval and classification schemes; the use of codes, formats, and
standards; and the analysis, construction, and evaluation of search and
navigation techniques. It also covers search engines and how they relate to
the above.
This is an introductory
course for IST students covering the practices, issues, and theoretical
foundations of organizing and analyzing information and information content for
the purpose of providing intellectual access to textual and non-textual
information resources. This course will introduce students to the principles of
information storage and retrieval systems and databases. Students will learn
how effective information search and retrieval is interrelated with the
organization and description of information to be retrieved. Students will also
learn to use a set of tools and procedures for organizing information, will
become familiar with the techniques involved in conducting effective searches
of print and online information resources, and will build a search engine.
Course
This course is intended to prepare students to design, develop, and use information systems. We will explore the practices, issues, and theoretical foundations of organizing and analyzing information and information content for the purpose of providing intellectual access to textual and non-textual information resources. This course will introduce students to the principles of information storage and retrieval systems and databases. Students will learn how effective information search and retrieval is interrelated with the organization and description of the information to be retrieved. Students will also learn to use a set of tools and procedures for organizing information, and will become familiar with the techniques involved in conducting effective searches of print and online information resources. The course also introduces the major types of information retrieval systems and search engines, the different theoretical foundations underlying these systems, and the methods and measures that can be used to evaluate them.
These topics will be examined
through readings, discussion, hands-on experience using and constructing
various information retrieval systems, and through exercises designed to help
explore the capabilities and utility of different retrieval systems.
Course Prerequisites:
Students must have taken IST 210 and IST 220. Having taken IST 230 and
IST 240 is recommended.
Grading:
30 points |
|
30 points |
|
25 points |
|
Exercises (to be done as individuals) | 15 points |
Late Policy:
Starting immediately after the required submission date of any exercise or
project, 25% of the grade will be deducted for each day the work is late,
until no credit remains.
Initial groups for the project and research presentation can be found here.
Text: Robert R. Korfhage, Information Storage and Retrieval, Wiley, 1997.
Material may be drawn from other texts.
Other useful texts:
C. J. van Rijsbergen, Information Retrieval, Butterworths, 1975. (Downloadable.)
Richard K. Belew, Finding Out About: A Cognitive Perspective on Search Engine Technology and WWW.
Ricardo A. Baeza-Yates, Berthier Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.
David A. Grossman, Ophir Frieder, Information Retrieval: Algorithms and Heuristics, Kluwer, 1998.
Schedule:
Course Materials and References: