
Prof. Dr. Felix Naumann
Hasso-Plattner-Institut
für Softwaresystemtechnik
Prof.-Dr.-Helmert-Str. 2-3
D-14482 Potsdam, Germany
Search Engines
Description
Search engines permeate every facet of our online lives and many facets of our offline lives. This lecture introduces the basic architectures and technologies of search engines, both on the Web and on other collections of digital artifacts. Topics covered include:
- Search Engine Architectures
- Crawling
- Text Processing
- Ranking Indexes
- Search Queries
- Information Retrieval Methods
- Search Engine Evaluation
Feedback
- If you have any ideas for future exercises or comments on the exercise/lecture, please don't hesitate to contact us or use this web form.
Schedule
- Tuesdays, 9:15 a.m., HS3
- Thursdays, 9:15 a.m., HS3
The lectures are given in English and are available as tele-task recordings for logged-in students.
| Day | Date | Topic | Slides |
|---|---|---|---|
| Tue | 12.04.2011 | Introduction | |
| Thu | 14.04.2011 | Architecture | |
| Tue | 19.04.2011 | Exercise 1: Nutch and Googlewhacking | pdf, Nutch |
| Thu | 21.04.2011 | Crawling | |
| Tue | 26.04.2011 | Crawling | |
| Thu | 28.04.2011 | Exercise 2: Crawling Journal Club | |
| Tue | 03.05.2011 | Text processing | |
| Thu | 05.05.2011 | Exercise 3: Fingerprints & Zipf | pdf, Texts, Duplicates |
| Tue | 10.05.2011 | Text processing | |
| Thu | 12.05.2011 | Indexing | |
| Tue | 17.05.2011 | Indexing | |
| Thu | 19.05.2011 | Exercise 4: Text Processing | |
| Tue | 24.05.2011 | Indexing | |
| Thu | 26.05.2011 | Querying | |
| Tue | 31.05.2011 | Cancelled | |
| Thu | 02.06.2011 | Ascension Day (public holiday) | |
| Tue | 07.06.2011 | Querying | |
| Thu | 09.06.2011 | Exercise 5: Querying | pdf, MovieSearch |
| Tue | 14.06.2011 | Querying | |
| Thu | 16.06.2011 | Cancelled: Students are encouraged to attend FutureSOC Symposium | |
| Tue | 21.06.2011 | Retrieval Models | |
| Thu | 23.06.2011 | Retrieval Models | |
| Tue | 28.06.2011 | Moved to 06.07.2011 | |
| Thu | 30.06.2011 | Moved to 13.07.2011 | |
| Tue | 05.07.2011 | Exercise 6: Retrieval Models | |
| Wed | 06.07.2011, 17:00 | Retrieval Models | |
| Thu | 07.07.2011 | Retrieval Models | |
| Tue | 12.07.2011, 10:00 | Exercise 7 | |
| Wed | 13.07.2011, 17:00 | Evaluation | |
| Thu | 14.07.2011 | Question Answering: Saeedeh Momtazi | |
| Tue | 19.07.2011 | Social Search | |
| Thu | 21.07.2011 | Outlook | |
Literature
- Search Engines: Information Retrieval in Practice by Bruce Croft, Donald Metzler, Trevor Strohman
- Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan, Hinrich Schütze
Examination
The written exam will take place on 27.07.2011 (Wednesday) from 10:00 to 12:00 in HS 1.


