In this era of information, vast amounts of new data are produced every day from various fields including scientific research, healthcare, industry and service processes. Using data effectively and extracting meaningful insights from data can significantly improve efficiencies, cut costs and add more value to organizations. This course aims to provide you with an understanding of basic techniques for data analysis, machine learning and dimension reduction for big data, and expose you to hands-on computational tools that are fundamental for data science. Besides supervised and unsupervised learning, another fundamental technology of Artificial Intelligence, reinforcement learning, will also be introduced, including Markov decision processes and Q-learning. Suitable for students from any background, this course will show you how you could apply various methods to data examples and case studies from both research and industrial sources in the Singapore context.
-
Taken in: AY 19/20 Sem 1
Grade: A-
Assignment 1 (10%): 10/10
Assignment 2 (10%): 9.5/10
Assignment 3 (10%): 10/10
Project (20%): 15.5/20
Finals (50%)
This is a new GER-Core module introduced for our batch. The programming language we have to learn is R, using software called RStudio. The module is run such that we have to watch online lectures before attending the lab/tut each week (there is no deadline for this, so if you choose not to watch them and leave everything to right before finals no one can stop you, but I would not recommend that!).
Lab notes and lab tasks are uploaded before each lab/tut. The lab tasks are very similar to what is covered in the lab notes, so I would suggest that you read through and understand everything in the lab notes before class, note down any questions you have and clarify them during the lab itself. Your doubts may also be cleared during the 1st hour of the lab, when the TA goes through the lab notes. But the TA goes through them very quickly, so if you get lost at the beginning you will have no idea what is going on and will be unable to attempt the lab tasks in the 2nd hour. Or, if you already understand everything in the lab notes, you can choose not to listen to the TA during the 1st hour and go straight into attempting the lab tasks (like what I did), because I had a lecture right after the lab/tut, which meant I had to leave early and usually did not have the full hour to attempt the lab tasks.
I was lucky to get a TA who really knows her stuff and teaches in a way that I could understand, and I really enjoyed and looked forward to each lab/tut. She also sends us the lab demo she uses to teach the lab notes. The TA from the other lab/tut timing does not do that 🙃
The 3 assignments are really doable, and many people even asked the TA how to do them. You can also refer to the lab notes, google for answers and discuss with your friends, so there is nothing to worry about. Since this module was only introduced for our batch, we had no idea about this and studied super hard for the first assignment 🙃 what a waste of time and energy hahah.
The project can be done in a group of up to 6 people. I did mine in a group of 3 (same members as for the PS0001 module in yr1). If you have been keeping up with the lab/tut each week and understand the content, this project will be a piece of cake. We had to choose a dataset and write a 3-page report answering a study aim/hypothesis that we came up with ourselves. Tip: start early, have questions ready to clarify during the lab/tut AND, if necessary, schedule a consultation with the prof/TA. Our group wanted to consult the prof but I guess she was too busy and did not reply to our email, so we met with our super nice TA instead. Also, choose a dataset that you can understand and that lets you come up with relevant study questions using as many of the methods taught in this module as possible. We do not know how the cohort did, but after asking a few of our friends, I think 15.5 is considered not bad already.
The finals comprised 15 MCQs and 4 short answer questions with a few subparts each. You are allowed to bring in a double-sided A4 helpsheet. Very little R code was tested; the questions were more about what the output of each method means, what each method aims to achieve, when to use which method, etc. I would not say it is difficult: there are some tricky questions, but overall it is quite easy. Compared to PS0001, this module is much, much more manageable. Another note: if you do not understand the statistical equations in the lecture slides it does not matter, but the equations in your lab notes are important. Make sure you understand them and what each component means.
This review was reposted with the kind permission of Awesome NTU CBC Student. Originally published at https://awesomentucbcstudent.blogspot.com/2019/12/ay1920-y2s1-review.html
July 18, 2021