ICS77B / Math77B: Collaborative Filtering
CLOSED : 2013 OFFERING
Lecture: Tues/Thurs 11-12:30pm, Roland Hall 421 (PRISM Lab)
Lab: Tues 2-3:30pm, Roland Hall 421 (PRISM Lab)
Additional lab hours: Fridays 2:30-4pm, PRISM Lab
Instructor: Prof. Alex Ihler (firstname.lastname@example.org), Office Bren Hall 4066
Teaching Assistant: Sholeh Forouzan (email@example.com)
Many companies collect data at an unprecedented scale. Online stores such as Amazon record the click patterns and purchases of people navigating their webpages, credit score companies such as Experian and banks record clients' financial histories, Netflix records people's interests in movies, and so on.
A new field known as "collaborative filtering" is emerging, in which this type of data is used to predict quantities of interest: What is the next book a customer will buy? Will this person repay their loan? Which movies will this customer be interested in next?
As evidence of the prominence of this problem in industry, Netflix announced a challenge in 2006 in which anyone who could improve the accuracy of its customer recommendation system by more than 10% would receive $1,000,000.
This course will be based around several real-world collaborative filtering data sets. Students will study the theoretical aspects of machine learning, clustering, matrix factorizations, and statistical estimation in order to approach the problem of collaborative filtering and recommendation.
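To make the matrix factorization idea concrete, here is a minimal sketch, not the course's reference method. It assumes each observed rating is approximated by the dot product of a small user vector and a small item vector, fitted by stochastic gradient descent. The function names (`factorize`, `predict`, `rmse`), the toy ratings, and all hyperparameter values are illustrative assumptions. Fit quality is measured by root-mean-square error (RMSE), the metric the Netflix challenge used.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.05, reg=0.02,
              epochs=500, seed=0):
    """Fit user factors P and item factors Q to (user, item, rating) triples.

    Hypothetical helper: plain SGD on squared error with L2 regularization.
    """
    rng = random.Random(seed)
    # Small random initial factors; the range is an arbitrary choice.
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Error of the current prediction for this observed rating.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for k in range(rank):
                pu, qi = P[u][k], Q[i][k]
                # Gradient step on both factors, using the old values.
                P[u][k] += lr * (err * qi - reg * pu)
                Q[i][k] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted rating of item i by user u: dot product of their factors."""
    return sum(pu * qi for pu, qi in zip(P[u], Q[i]))

def rmse(P, Q, ratings):
    """Root-mean-square error of the predictions over the given ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5

# Made-up toy data: (user, item, rating on a 1-5 scale); most pairs unobserved.
toy_ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 1),
    (2, 2, 5), (2, 3, 4), (2, 0, 1),
    (3, 2, 4), (3, 3, 5), (3, 1, 1),
]

P, Q = factorize(toy_ratings, n_users=4, n_items=4)
print("training RMSE:", rmse(P, Q, toy_ratings))
print("predicted rating, user 0 on unseen item 3:", predict(P, Q, 0, 3))
```

The call to `predict` on a pair that never appears in `toy_ratings` is the recommendation step: the fitted factors fill in the missing entries of the user-by-item matrix. On real data, the error should be measured on held-out ratings rather than on the training set as done here.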
Note that this class is highly interactive. You set the pace that is right for you! There is no fixed agenda and no exams. We want you to get an appreciation for research, and research can only be learned by doing it yourself. This class will also give you access to (funded) summer research projects.
Kaggle class competitions
We will have two data sets for testing:
More information to come here.
Work and Grading
The class will consist of
(These may be subject to change as the course proceeds.)
For the week of May 21st, we will form small groups (2-3 students), and each group will choose one of the following papers to present in class. Please let me know your group and paper selection by Thursday, 5/9. Presentations will be 5-7 minutes each, plus 2-3 minutes for questions, and should cover the key ideas and results/conclusions of the paper. (You do not need to present the details of derivations, etc.) I suggest using PowerPoint slides, but a whiteboard presentation is also OK. Scores will be based on correctness, organization, professionalism, and clarity.
Please discuss the paper within your group and with the instructor and TA early (no later than Tuesday, 5/14) so that we can help you get on track and correct any issues.
Other reading (not for presentation):