NUS Module Review: CS3245 Information Retrieval

This is not a very popular module, and most people have probably never heard of information retrieval (IR) as a topic within computer science. But come to think of it, we use information retrieval every day when we search on Google or on our own computers. So it is really nice to understand some of the concepts and algorithms behind the searching process.

The module’s syllabus can be found here. The entire information retrieval process is roughly broken down into 3 phases: indexing, ranking and evaluation. There are several overall approaches, such as boolean retrieval and the vector space model. Each approach handles these 3 phases differently, and each has many variations suited to different situations and needs (different algorithms to index documents, different ranking schemes for documents). So there is a lot of content, and when it is all mixed together it can get confusing. Set aside plenty of time to really understand what is going on and to appreciate how the different algorithms work together.
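
To give a feel for the indexing and ranking phases, here is a rough sketch of my own (not taken from the lecture notes or the assignments). It builds a tiny inverted index and scores documents with one common tf-idf weighting. The function names `build_index` and `rank`, the whitespace tokenisation and the toy documents are all assumptions for illustration, and it skips the length normalisation that a full vector space model would use.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Indexing phase: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Naive tokenisation (lowercase + whitespace split), assumed for simplicity.
        for term, tf in Counter(text.lower().split()).items():
            index[term][doc_id] = tf
    return index


def rank(query, index, num_docs):
    """Ranking phase: sum a log-weighted tf-idf score over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue  # term appears in no document
        idf = math.log10(num_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += (1 + math.log10(tf)) * idf
    # Highest score first; ties broken by smaller doc ID.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy collection, made up for this example.
docs = {
    1: "information retrieval with search engines",
    2: "search engines rank documents",
    3: "cooking recipes for dinner",
}
index = build_index(docs)
results = rank("information retrieval", index, len(docs))
```

Swapping the scoring line for a different weighting scheme is exactly the kind of variation that makes the module feel so broad.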

The programming assignments mostly involve implementing “search engines” in Python. For each major approach to IR, there’s one corresponding programming assignment so you can apply what you just learnt. Some assignments are easy, as you just need to follow the algorithm provided in the lecture notes. But others can prove difficult, for example when you have to implement a complicated boolean operator algorithm.
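
For a flavour of what a boolean operator involves, below is a minimal sketch of the textbook linear merge for AND and OR over sorted postings lists. This is not the actual assignment code: the names `intersect` and `union` are my own, and I am assuming each postings list is a sorted Python list of document IDs with no duplicates.

```python
def intersect(p1, p2):
    """Boolean AND: doc IDs that appear in both sorted postings lists."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # advance the pointer at the smaller doc ID
        else:
            j += 1
    return result


def union(p1, p2):
    """Boolean OR: doc IDs that appear in either sorted postings list."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            result.append(p1[i])
            i += 1
        else:
            result.append(p2[j])
            j += 1
    # One list is exhausted; the rest of the other is already sorted.
    result.extend(p1[i:])
    result.extend(p2[j:])
    return result


both = intersect([1, 2, 4, 8], [2, 3, 8, 9])
either = union([1, 2, 4, 8], [2, 3, 8, 9])
```

Each operator on its own is simple. The difficulty comes from combining them, for instance handling NOT and nested brackets in a query efficiently.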

The exam is tricky. Typically about half of it is computational questions and half is essay questions. For computational questions, you just need to follow the standard algorithm (maybe with slight modifications here and there). But the essay questions usually ask for something not directly related to the content taught in class and need you to justify your answer. They are somewhat like the “extension questions” in JC: you need a basic understanding of the topic, but the knowledge you learnt from the textbook is not enough to answer the question, so you have to think based on your own judgement and general knowledge.

