Introduction to Machine Learning and Machine Learning Systems

Raffi Khatchadourian (based on material from Christian Kaestner and Eunsuk Kang)

November 26, 2025

Disclaimer

New part of the course.

What is Machine Learning (ML)?

AI vs. ML
  • ML is a subfield of Artificial Intelligence (AI).
  • ML is concerned with the design and development of algorithms that allow computers to learn from data and, based on what they learn, make predictions or decisions or generate “new” data [1].
  • This differs from traditional programming, where “hard-and-fast” rules are explicitly coded by programmers.
    • Conditional statements, loops, functions, etc.
  • In ML, the system learns patterns from data and uses these patterns to make predictions or decisions on new, unseen data or generate “new” data.
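
To make the contrast concrete, here is a minimal sketch of the same task done both ways. The data, the function names, and the "average per group" learner are all assumptions chosen for illustration; real ML systems use far richer models.

```python
from collections import defaultdict
from statistics import mean

# Traditional programming: a programmer writes the rule explicitly.
def prep_minutes_rule(has_special_request: bool) -> float:
    return 15.0 if has_special_request else 10.0

# Machine learning (in its simplest form): the rule is derived from data.
# Hypothetical past orders: (has_special_request, observed preparation minutes).
past_orders = [(False, 9.0), (False, 11.0), (True, 14.0), (True, 18.0)]

def learn_prep_minutes(orders):
    """Learn the average preparation time for each group of orders."""
    groups = defaultdict(list)
    for has_special_request, minutes in orders:
        groups[has_special_request].append(minutes)
    averages = {flag: mean(values) for flag, values in groups.items()}
    # The returned function is the "model": it predicts for new, unseen orders.
    return lambda has_special_request: averages[has_special_request]

prep_minutes_learned = learn_prep_minutes(past_orders)
print(prep_minutes_rule(True))     # value chosen by the programmer
print(prep_minutes_learned(True))  # value derived from the observations
```

If the observations change, the learned function changes with them, while the hand-written rule stays the same until someone edits the code.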

Foundation Models

AI vs. ML
  • Foundation models are large-scale machine learning models trained on vast amounts of data.
  • They are contained within “Deep Learning” in the Venn diagram above.
  • They are often associated with “generative AI” and include large language models (LLMs) such as GPT-4, PaLM, and LLaMA.
  • Foundation models can perform a wide range of tasks, such as text generation, image generation, and more, often with minimal fine-tuning for specific applications.

Types of Machine Learning (I)

Types of Machine Learning (II)

Applications of Machine Learning

Case Study: Food Delivery Service

Predicting Delivery Time

How Does ML Work?

Typical Machine Learning Pipeline

ML Pipeline

ML Tasks by Phase

Before Deployment

  1. Obtain labeled data.
  2. Identify and extract features.
  3. Split data into training and evaluation sets.
  4. Learn model from training data.
  5. Evaluate model on evaluation data.
  6. Repeat, revising features.
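
The steps above can be sketched end to end in a few lines. Everything here is an assumption made for illustration: the rows are invented, the field names are hypothetical, and the "minutes per item" model is deliberately naive.

```python
import random
from statistics import mean

# Step 1: labeled data (hypothetical): inputs plus the known outcome.
labeled = [
    {"items": 1, "special": False, "prep_minutes": 8},
    {"items": 2, "special": False, "prep_minutes": 11},
    {"items": 2, "special": True, "prep_minutes": 14},
    {"items": 3, "special": False, "prep_minutes": 15},
    {"items": 3, "special": True, "prep_minutes": 19},
    {"items": 4, "special": False, "prep_minutes": 21},
    {"items": 4, "special": True, "prep_minutes": 24},
    {"items": 5, "special": True, "prep_minutes": 30},
]

# Step 2: turn a raw row into features.
def extract_features(row):
    return (row["items"], int(row["special"]))

# Step 3: split into training and evaluation sets (no row is in both).
def split(rows, eval_fraction=0.25, seed=0):
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]

# Step 4: learn a model from the training data only.
def learn(train_rows):
    per_item = mean(r["prep_minutes"] / extract_features(r)[0] for r in train_rows)
    return lambda features: per_item * features[0]

# Step 5: evaluate on data the model has not seen.
def mean_absolute_error(model, rows):
    return mean(abs(model(extract_features(r)) - r["prep_minutes"]) for r in rows)

train_rows, eval_rows = split(labeled)
model = learn(train_rows)
print(mean_absolute_error(model, eval_rows))
# Step 6: if the error is too high, revise extract_features/learn and repeat.
```

Note that this model ignores the `special` feature entirely; using it would be one natural revision in step 6.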

After Deployment

  1. Evaluate model on production data; monitor.
  2. Select production data for retraining.
  3. Update model regularly.
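
One possible way to monitor a deployed model is to compare its predictions with the outcomes that are eventually observed in production. The sketch below assumes a hypothetical error threshold of 5 minutes and made-up numbers; how and when to retrain is a design decision that depends on the system.

```python
from statistics import mean

def needs_retraining(predicted, actual, max_mean_abs_error=5.0):
    """Return True when the average error on production data exceeds a threshold."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    error = mean(abs(p - a) for p, a in zip(predicted, actual))
    return error > max_mean_abs_error

# Hypothetical delivery times in minutes: what the model said vs. what happened.
last_week_predicted = [20, 25, 30, 22]
last_week_actual = [21, 24, 33, 22]
this_week_predicted = [20, 25, 30, 22]
this_week_actual = [29, 35, 38, 30]

print(needs_retraining(last_week_predicted, last_week_actual))  # small errors
print(needs_retraining(this_week_predicted, this_week_actual))  # large errors
```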

Design Decisions in ML-based Systems

Example Data

RestaurantID  Order                    OrderTime  ReadyTime  PickupTime
5             5A;3;10;11C;C:No onion   18:11      18:23      18:31
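
A record like the one above has to be parsed before it can be used. The following sketch assumes, without confirmation from the slides, that the Order field is a semicolon-separated list of item codes and that entries starting with `C:` are free-text comments. The `OrderRecord` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time

@dataclass
class OrderRecord:
    restaurant_id: int
    items: list      # item codes, e.g. "5A"
    comments: list   # free-text special requests
    order_time: time
    ready_time: time
    pickup_time: time

def parse_time(text):
    # Times in the example use a 24-hour HH:MM format.
    return datetime.strptime(text, "%H:%M").time()

def parse_record(restaurant_id, order, order_time, ready_time, pickup_time):
    items, comments = [], []
    for part in order.split(";"):
        if part.startswith("C:"):  # assumed marker for a comment
            comments.append(part[2:].strip())
        else:
            items.append(part)
    return OrderRecord(
        int(restaurant_id), items, comments,
        parse_time(order_time), parse_time(ready_time), parse_time(pickup_time),
    )

record = parse_record("5", "5A;3;10;11C;C:No onion", "18:11", "18:23", "18:31")
print(record.items)
print(record.comments)
```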

Data Processing

Data Cleaning

Feature Engineering

QUESTION: What features would you use for delivery prediction?

Features

Possible Features for Delivery Prediction

  1. Order time, day of week.
  2. Average number of orders in that hour.
  3. Order size.
  4. Special requests.
  5. Order items.
  6. Preparation time.
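
Several of these features can be derived directly from a raw order record. The sketch below is one possible extraction, under these assumptions: the Order field is semicolon-separated with `C:` marking comments, the record carries a hypothetical `OrderDate` field (the example data shows only times), and days are numbered with Monday as 0. The average number of orders per hour is omitted because it requires historical data.

```python
from datetime import date, datetime

def minutes_between(start, end):
    """Minutes from start to end, both HH:MM on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

def extract_features(row):
    parts = row["Order"].split(";")
    items = [p for p in parts if not p.startswith("C:")]
    comments = [p for p in parts if p.startswith("C:")]
    return {
        "OrderHour": int(row["OrderTime"].split(":")[0]),
        "DayOfWeek": row["OrderDate"].weekday(),  # Monday == 0 (assumed convention)
        "OrderSize": len(items),
        "SpecialRequest": len(comments) > 0,
        "Order3": "3" in items,  # does the order contain item 3?
        "PreparationTime": minutes_between(row["OrderTime"], row["ReadyTime"]),
    }

row = {
    "RestaurantID": 5,
    "Order": "5A;3;10;11C;C:No onion",
    "OrderDate": date(2025, 11, 26),  # hypothetical; not part of the example data
    "OrderTime": "18:11",
    "ReadyTime": "18:23",
    "PickupTime": "18:31",
}
features = extract_features(row)
print(features)
```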

Learning

Build a predictor that best describes an outcome for the observed features.

RestaurantID  Order3  SpecialRequest  DayOfWeek  PreparationTime
5             yes     yes             2          12
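
As one illustration of "building a predictor", the sketch below fits a straight line by least squares using a single feature, PreparationTime, to predict delivery time. The training numbers are invented, and a one-feature linear model is an assumption chosen for brevity, not necessarily the technique used in the course.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Hypothetical observations: preparation time and total delivery time (minutes).
preparation_minutes = [8, 10, 12, 14, 16]
delivery_minutes = [22, 27, 29, 35, 37]

slope, intercept = fit_line(preparation_minutes, delivery_minutes)

def predict_delivery_minutes(preparation_time):
    return slope * preparation_time + intercept

print(predict_delivery_minutes(12))
```

The fitted line does not pass through every observation; it is the line that minimizes the squared error over all of them.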

Evaluation

Dataset partitioning.

Evaluation Methods

Precision and Recall

Precision and recall.
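
For a binary prediction such as "will this order be late?", precision is the fraction of predicted positives that are correct, and recall is the fraction of actual positives that are found. A minimal sketch with made-up predictions follows; returning 0.0 when a denominator is zero is a convention chosen here, not a universal rule.

```python
def precision_recall(predicted, actual):
    """Compute (precision, recall) for two equal-length lists of booleans."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(1 for p, a in pairs if p and a)
    false_pos = sum(1 for p, a in pairs if p and not a)
    false_neg = sum(1 for p, a in pairs if not p and a)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall

# Hypothetical results for eight orders.
predicted_late = [True, True, False, True, False, False, False, False]
actually_late = [True, False, False, True, True, True, False, False]

precision, recall = precision_recall(predicted_late, actually_late)
print(precision, recall)
```

Here the model is right about two of its three "late" predictions but finds only two of the four orders that were actually late.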

Underfitting vs. Overfitting

Balancing Underfitting and Overfitting

Underfitting Example

Text                          Genre
When the earth stood …        Science fiction
Two households, both alike…   Romance
To Sherlock Holmes she…       Adventure

Overfitting Example

Text                          Genre
When the earth stood …        Science fiction
Two households, both alike…   Romance
To Sherlock Holmes she…       Adventure
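
The two failure modes can be caricatured with two toy classifiers over the texts above. This is a hedged sketch, not the models from the slides: the "underfit" model ignores its input, the "overfit" model memorizes the training texts verbatim, and the unseen examples are hypothetical placeholders.

```python
# Training examples, copied as-is (including the truncation) from the table.
training = {
    "When the earth stood …": "Science fiction",
    "Two households, both alike…": "Romance",
    "To Sherlock Holmes she…": "Adventure",
}

def underfit_model(text):
    """Too simple: ignores the text and always predicts the same genre."""
    return "Science fiction"

def overfit_model(text):
    """Too specific: memorizes the training texts and knows nothing else."""
    return training.get(text, "unknown")

def accuracy(model, examples):
    correct = sum(1 for text, genre in examples.items() if model(text) == genre)
    return correct / len(examples)

# Hypothetical unseen examples, invented for this sketch.
unseen = {
    "Hypothetical unseen text about robots": "Science fiction",
    "Hypothetical unseen text about a detective": "Adventure",
}

print(accuracy(underfit_model, training), accuracy(underfit_model, unseen))
print(accuracy(overfit_model, training), accuracy(overfit_model, unseen))
```

The memorizing model is perfect on the training data and useless on new data, while the constant model is mediocre on both. A well-balanced model sits between these extremes.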

Learning and Evaluating in Production


[1] Whether “new” data is actually generated or illegally copied (sometimes verbatim) is currently a controversial topic in the AI/ML community. The problem lies in the “explainability” of ML models, especially large language models (LLMs) and generative AI models.