IST 402 – Emerging Technologies

 

Information Retrieval and Search Engines

 

Professor:  Dr. C. Lee Giles

 

TA: Seyda Ertekin

 

 

Time and Place: 4:40-5:45, Monday and Wednesday, 205 IST Bldg.

 

Office hours:  Instructor: 1:30-2:30 Tuesdays at 311A IST Bldg, immediately after class, or by appointment.

                          TA: 2:30-4:30 Tuesdays at 310 IST Bldg, immediately after class, or by appointment.

 

Course Overview 

 

This is a three-hour course that meets twice a week. The course will cover: organization, representation, and access to information; categorization, indexing, and content analysis; data structures for unstructured data; design and maintenance of such databases, indexing and indexes, retrieval and classification schemes; use of codes, formats, and standards; analysis, construction, and evaluation of search and navigation techniques; and search engines and how they relate to the above.

 

This is an introductory course for IST students covering the practices, issues, and theoretical foundations of organizing and analyzing information and information content for the purpose of providing intellectual access to textual and non-textual information resources. This course will introduce students to the principles of information storage and retrieval systems and databases. Students will learn how effective information search and retrieval is interrelated with the organization and description of the information to be retrieved. Students will also learn to use a set of tools and procedures for organizing information, will become familiar with the techniques involved in conducting effective searches of print and online information resources, and will build a search engine.

 

Course Mission Statement 

 

This course is intended to prepare students to design, develop, and use information systems. We will explore the practices, issues, and theoretical foundations of organizing and analyzing information and information content for the purpose of providing intellectual access to textual and non-textual information resources. This course will introduce students to the principles of information storage and retrieval systems and databases. Students will learn how effective information search and retrieval is interrelated with the organization and description of the information to be retrieved. Students will also learn to use a set of tools and procedures for organizing information, and will become familiar with the techniques involved in conducting effective searches of print and online information resources. The course also introduces the major types of information retrieval systems and search engines, the different theoretical foundations underlying these systems, and the methods and measures that can be used to evaluate them.

 

These topics will be examined through readings, discussion, hands-on experience using and constructing various information retrieval systems, and through exercises designed to help explore the capabilities and utility of different retrieval systems.

 

Course Prerequisites:

 

Students must have taken IST 210 and IST 220. IST 230 and IST 240 are recommended.

 

Grading:

 

Midterm: 30 points

Project & Report: 30 points

Research Presentation: 25 points

Exercises (to be done as individuals): 15 points

 

Late Policy: Starting immediately after the required submission date of any exercise or project, 25% of the grade will be deducted for each day late, until no credit remains.

 

Initial groups for the project and research presentation can be found here.

 

Text:  Robert R. Korfhage, Information Storage and Retrieval, Wiley, 1997.  Material may be drawn from other texts.

 

Other useful texts:

C. J. van Rijsbergen, Information Retrieval, Butterworths, 1975. (Available for download.)

Richard K. Belew, Finding Out About: A Cognitive Perspective on Search Engine Technology and WWW, Cambridge, 2000.

Ricardo A. Baeza-Yates, Berthier Ribeiro-Neto, Modern Information Retrieval, ACM Press, 1999.

David A. Grossman, Ophir Frieder, Information Retrieval: Algorithms and Heuristics, Kluwer, 1998.

 

 

Schedule:   This schedule is subject to change. Please check it on a regular basis for assignments. Some classes will have online handouts. It is the student’s responsibility to download that material.

 

Course Materials and References: Here is a link to some of our course materials. There could also be links under the schedule.

 

Acknowledgements!