CS178: Machine Learning and Data Mining
Lecture: Tues/Thurs 2pm-3:30pm, DBH 1100
Discussion: Thurs 7pm-7:50pm, BS3 1200
Instructor: Prof. Alex Ihler (firstname.lastname@example.org), Office Bren Hall 4066
Teaching Assistant: Qi Lou (email@example.com)
Course Notes in development
Introduction to machine learning and data mining
How can a machine learn from experience, to become better at a given task? How can we automatically extract knowledge or make sense of massive quantities of data? These are the fundamental questions of machine learning. Machine learning and data mining algorithms use techniques from statistics, optimization, and computer science to create automated systems which can sift through large volumes of data at high speed to make predictions or decisions without human intervention.
Machine learning as a field is now incredibly pervasive, with applications from the web (search, advertisements, and suggestions) to national security, from analyzing biochemical interactions to traffic and emissions to astrophysics. Perhaps most famously, the $1M Netflix prize stirred up interest in learning algorithms among professionals, students, and hobbyists alike.
This class will familiarize you with a broad cross-section of models and algorithms for machine learning, and prepare you for research or industry application of machine learning techniques.
We will assume basic familiarity with the concepts of probability and linear algebra. Some programming will be required; we will primarily use Python, using the libraries "numpy" and "matplotlib", as well as course code.
Textbook and Reading
There is no required textbook for the class. However, useful books on the subject for supplementary reading include Murphy, "Machine Learning: A Probabilistic Perspective"; Duda, Hart & Stork, "Pattern Classification"; and Hastie, Tibshirani & Friedman, "The Elements of Statistical Learning".
I use Piazza to manage student discussions and questions. Our class link is: http://piazza.com/uci/winter2016/cs178.
This year, we will be using Python for most of the programming in the course. I strongly suggest the "full SciPy stack", which includes NumPy, Matplotlib, SciPy, and the IPython notebook for interactive work and visualization; see http://www.scipy.org/install.html for installation information.
Here is a simple introduction to numpy and plotting for the course; you can also find complete documentation for these libraries, as well as many more tutorial guides, online.
I usually use Python 2.7 by default, but try to program in a Python 3 compatible way; if you find that parts of the code do not work with more recent versions of Python, please let me know the issue and I will try to fix it.
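As a rough illustration of what "Python 3 compatible" code can look like, here is a minimal sketch that should behave the same under Python 2.7 and Python 3. It is not taken from the course code; the function names mean and describe are hypothetical and chosen only for this example. It relies on the standard __future__ imports so that print is a function and / performs true division in both versions.

```python
# Make Python 2.7 behave like Python 3 for printing and division.
from __future__ import division, print_function


def mean(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    values = list(values)  # works for lists, tuples, and range objects alike
    return sum(values) / len(values)  # true division in both 2.7 and 3


def describe(values):
    """Return a short text summary (hypothetical helper for illustration)."""
    return "mean of {} values: {:.2f}".format(len(values), mean(values))


print(describe([1, 2, 4]))
```

Without the division import, mean([1, 2]) would evaluate to 1 under Python 2.7 rather than 1.5, which is one of the more common sources of version-dependent bugs in numerical code.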
Syllabus (subject to change)
You may find the slides from last year helpful; they are similar from year to year.