NUS Statistics Module Review: ST5213 Categorical Data Analysis 2


– This module is essentially about regression analysis for categorical data. It is a totally different beast from ST3131 Regression Analysis. The theory and techniques were unfamiliar to me, yet interesting. Link functions and logit models are covered first, followed by binomial responses, contingency tables and multinomial logit models.

– Prof Tan is a good lecturer. Her notes are self-contained, so no reading of external material is necessary. She explains concepts in an easy-to-understand manner, with suitable annotations on her slides, and she is approachable for questions. She also kindly provides all the R code she uses during lectures so we can play with it at home.

– There are no actual tutorials for postgraduate modules, but Prof Tan gave us 10 tutorials to practise the concepts taught in lectures. She typically ends lectures at 9pm and spends about 30 minutes explaining the tutorial questions. Due to poor attendance, Prof Tan decided to record the tutorials instead. This shows how hardworking she is: she recorded herself explaining the solutions so that we could listen to them in our free time.

– The 2 homework assignments were not easy for me. I found the questions ambiguous, and my answers did not meet Prof Tan’s expectations. My only grouse with this module is the huge wall of text in the homework questions. Otherwise, Prof Tan grades all the homework herself, and very diligently. Instead of simply crossing out a wrong answer, she takes the time to write comments highlighting where you made the mistake. I don’t recall any other prof or grader who does that at university level for Math/Stats modules.

– The final exam was extremely tedious, and I could not finish it. It consisted of 5 questions. 4 of them involved a mixture of reading R output, some calculations and interpretation of coefficients. The last was a theoretical question that asked us to prove a confidence interval result, along with one other result. Time management is key: write as fast as possible and be extremely familiar with reading R output. The tutorial questions are of limited use here, as the exam questions are at a different level altogether.
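For anyone wondering what "interpretation of coefficients" looks like in practice, here is a minimal sketch in Python of the kind of hand calculation involved. It is not taken from the module or the exam. The function name and the numbers are made up for illustration, and it assumes a logit model where you read an estimate and its standard error off the coefficient table of R's `summary()` output, then turn them into an odds ratio with an approximate 95% Wald confidence interval.

```python
import math


def odds_ratio_with_ci(estimate, std_error, z=1.96):
    """Convert a logit coefficient and its standard error into an odds ratio
    with an approximate 95% Wald confidence interval (z = 1.96)."""
    # The interval is built on the log-odds scale first...
    lower = estimate - z * std_error
    upper = estimate + z * std_error
    # ...then exponentiated to get back to the odds-ratio scale.
    return math.exp(estimate), (math.exp(lower), math.exp(upper))


# Hypothetical numbers, as if read from the "Estimate" and "Std. Error"
# columns of a logistic regression coefficient table in R.
odds_ratio, (ci_low, ci_high) = odds_ratio_with_ci(estimate=0.693, std_error=0.25)
print(f"odds ratio = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

With these made-up numbers the estimate is close to log 2, so a one-unit increase in the predictor roughly doubles the estimated odds of the response. Being able to do this quickly from a printed table is the sort of fluency the exam seems to reward.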


– This module is interesting and a great introduction to categorical data. Still, since my interest in statistics is pretty limited, I would not have taken it if there had been other suitable level 5000 math modules with no timetable clashes. For students with an interest in statistics, this is a good module to take.

