Machine Learning for Data Science and Analytics

Learn the principles of machine learning and the importance of algorithms.

Machine learning is a growing field with applications in web search, ad placement, credit scoring, stock trading, and many other areas.

This data science course is an introduction to machine learning and algorithms. You will develop a basic understanding of the principles of machine learning and derive practical solutions using predictive analytics. We will also examine why algorithms play an essential role in Big Data analysis.

  • Institution: ColumbiaX
  • Subject: Computer Science
  • Level: Introductory
  • Prerequisites: High school math and some exposure to computer programming.
  • Language: English
  • Video Transcript: English

  • What machine learning is and how it is related to statistics and data analysis
  • How machine learning uses computer algorithms to search for patterns in data
  • How to use data patterns to make decisions and predictions with real-world examples from healthcare involving genomics and preterm birth
  • How to uncover hidden themes in large collections of documents using topic modeling
  • How to prepare data, deal with missing data and create custom data analysis solutions for different industries
  • Basic and frequently used algorithmic techniques including sorting, searching, greedy algorithms and dynamic programming

Week 1 – Introduction to Data Science

Week 2 – Statistical Thinking

  • Examples of Statistical Thinking
  • Numerical Data, Summary Statistics
  • From Population to Sampled Data
  • Different Types of Biases
  • Introduction to Probability
  • Introduction to Statistical Inference

Week 3 – Statistical Thinking 2

  • Association and Dependence
  • Association and Causation
  • Conditional Probability and Bayes' Rule
  • Simpson's Paradox, Confounding
  • Introduction to Linear Regression
  • Special Regression Models

Week 4 – Exploratory Data Analysis and Visualization

  • Goals of statistical graphics and data visualization
  • Graphs of Data
  • Graphs of Fitted Models
  • Graphs to Check Fitted Models
  • What makes a good graph?
  • Principles of graphics

Week 5 – Introduction to Bayesian Modeling

  • Bayesian inference: combining models and data in a forecasting problem
  • Bayesian hierarchical modeling for studying public opinion
  • Bayesian modeling for Big Data


Ansaf Salleb-Aouissi
Department of Computer Science at Columbia University


Cliff Stein
Professor of IEOR and of Computer Science at Columbia University


David Blei
Professor of Computer Science and Statistics at Columbia University


Itsik Pe'er
Associate Professor of Computer Science at Columbia University


Mihalis Yannakakis
Professor of Computer Science at Columbia University


Peter Orbanz
Assistant Professor of Statistics at Columbia University