A subject in the Department of Computer Science and Software Engineering at the University of Melbourne (Semester 2, 2006)
2006-11-13
On cheese and !eggs
In our last lecture we enjoyed some demonstrations of your search engines, including James' fancy boolean queries and David's cute web interface. As we discovered, web search engines are quite complex! They are also a great source of practical problems for a subject on algorithms and data structures, because they involve amortised searching, sorting, string matching, directed graphs, and more. Come back and do more computer science to learn how to make these algorithms run faster and make better use of computer hardware, distributed computation, and automatic language analysis...
Answers to tutorial exercises from chapters 7 and 8
7.1: 4: yes; 7.2: 3a: 996, 3b: 4980, 3c: 996, 6: by the entry t[c] in the shift table for the text's character c aligned against the last character of the pattern; 7.3: 1b: 3, 1c: 10/6, 2b: 6 (for K=19), 2c: 14/6;
8.1: 2b: yes.
This University "expects its graduates to be educated and well-informed, able to contribute effectively to their communities wherever in the world they choose to live and work." Melbourne graduates are expected to have a rich set of qualities and skills.
Search engines provide efficient keyword lookup over large collections of web pages. For the second project you will implement a simple search engine, analyze the efficiency of its algorithms, and write up your work in the form of a product sheet. The project specification was handed out in today's lecture, and is also available from the CSSE Helpdesk and from the online handouts page (linked from the sidebar).
In yesterday's discussion of hash tables I asked how many cells do we need to randomly occupy in an array of size 1,000,000 before there is a 95% chance of collision. It turns out to be about 2440; i.e. the array only has to be 0.244% full! (See p18 of this week's slides, available online or via the CSSE Helpdesk, and the program circulated on the discussion list).