- Course formalities
- Machine learning
- Course objectives and outline

- Lectures, P1: Ludwig Krippahl
- P2 - P3: Joaquim Ferreira da Silva
- P4 - P5: Francisco Azevedo

- Slides, notes, lecture videos

- Lectures: 2 x 1h per week, ~70% exposition, ~30% discussion
- Tutorials: 1 x 2h per week
- Questions about exercises and assignments
- This year will be a bit crowded. We'll make up for it with some lecture slots

- Theoretical: 2 written tests or final exam.
- One handwritten A4 sheet for notes
- Exam scored in two independent parts
- Practical: 2 assignments, groups of 2 students in the same class.
- Groups formed by October 7
- All submissions to praticasice@gmail.com, sent from your official FCT address
**NOTE: automated processing only**

- Required: minimum of 9.5 in each component.
- Final grade: simple average of the two components.
- If you already have frequency (practical component approval) from 2015/16 - 18, do not enroll in practical classes

- Python 3.6, Spyder IDE
- Several libraries needed: NumPy, Matplotlib, Scikit-Learn, ...
- Simple installation: Anaconda, https://www.anaconda.com/download

- **Lecture notes, available on web site**
- **Bishop, Pattern Recognition and ML, 2006**
- Alpaydin, Introduction to ML (2nd ed.), 2010
- Marsland, Machine Learning, 2009
- Mitchell, Machine Learning, 1997

- "Field of study that gives computers the ability to learn without being explicitly programmed"

(Samuel, 1959)

- "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E"

(Mitchell, 1997)

- A task that the system must perform.
- A measure of its performance
- The data used to improve its performance.
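
The three ingredients above can be illustrated with a toy learner. This is a minimal sketch, not part of the course material: the task (deciding whether an integer in 0..99 is "positive"), the hidden rule `x >= 50`, the midpoint-threshold learner and all function names are assumptions made for illustration. It shows accuracy (P) on the task (T) improving as more labelled examples (E) are seen.

```python
def true_label(x):
    """Hidden rule used only to label data (an assumption for this sketch)."""
    return x >= 50

def learn_threshold(examples):
    """Place the threshold midway between the largest negative
    and the smallest positive labelled example."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2

def accuracy(threshold, test_points):
    """Performance measure P: fraction of test points classified correctly."""
    correct = sum((x >= threshold) == true_label(x) for x in test_points)
    return correct / len(test_points)

# Experience E: labelled examples, seen in this order.
experience = [(x, true_label(x)) for x in [10, 60, 40, 55, 48, 51]]
test_points = range(100)

# Performance after seeing the first 2, 4 and 6 examples.
accuracies = [accuracy(learn_threshold(experience[:n]), test_points)
              for n in (2, 4, 6)]
print(accuracies)  # [0.85, 0.98, 1.0]
```

With this particular data the accuracy grows with every batch of examples; in general, more experience tends to help, but improvement at every step is not guaranteed.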

- Task: identify requests for flight information and tickets.
- Performance measure: correctly identified expressions.
- Data: Annotated voice records.
- Source: Erdogan, Using semantic analysis to improve speech recognition performance

- Example of the result:

```
please book me on
flight twenty one
i would like to fly
from philadelphia
to dallas
could you please list the flights
from boston to denver
on july twenty eighth
```

- Predicting prices (Regression)
- Classifying spam emails (Classification)
- Products purchased together (Association Rules)
- Grouping similar images (Clustering)
- Distributions in diagnosis (Density Estimation)

- Computer Science
- Statistics and Probability
- Mathematics
- Neuroscience
- Philosophy

- Example: identify handwritten digits

- Train with labelled data

- Find a function for classifying

- Large volumes of data
- Google searches
- Facebook relations graph
- Credit card fraud

- Need to respond to changing conditions
- Personalization (e.g. Facebook feed)
- Spam filtering (email and comments)

- ML is good for tasks we do not have a recipe for...
- ... if we have the right data.

- Hypothesis class: the set of possible hypotheses
- We need to assume something about the solution

- Example: we want to separate red from blue

- Hypothesis class: the set of all hypotheses

- Model: represents the set of hypotheses (with parameters)

- Hypothesis: one element of the hypothesis class
- One instance of the model (e.g. obtained by instantiating its parameters)
- $\theta_1=0 \qquad y \leq 0$
- $\theta_1=\theta_2= -1 \qquad (x + 1)^2 + (y + 1)^2 \leq 1$
- Goal: find the best hypothesis from the best hypothesis class.
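
The distinction between model and hypothesis can be sketched in code. The sketch below assumes the model is the family of unit circles centred at $(\theta_1, \theta_2)$, which matches the second example above; the function names, the labelled points and the candidate centres are hypothetical. Each call to `circle_model` instantiates the parameters and returns one hypothesis, and a very crude search picks the candidate that best fits the data.

```python
def circle_model(theta_1, theta_2):
    """Model: unit circles centred at (theta_1, theta_2).
    Fixing the parameters returns one hypothesis (a classifier)."""
    def hypothesis(point):
        x, y = point
        return (x - theta_1) ** 2 + (y - theta_2) ** 2 <= 1
    return hypothesis

def accuracy(hypothesis, data):
    """Fraction of labelled points the hypothesis classifies correctly."""
    return sum(hypothesis(p) == label for p, label in data) / len(data)

# One hypothesis: theta_1 = theta_2 = -1, i.e. (x + 1)^2 + (y + 1)^2 <= 1
h = circle_model(-1.0, -1.0)

# Hypothetical labelled points: True means "inside the region".
data = [((-1.0, -1.0), True), ((-0.5, -1.2), True),
        ((1.0, 1.0), False), ((0.8, -1.0), False)]

# Crude search over a few candidate parameter values.
candidates = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]
best_theta = max(candidates, key=lambda t: accuracy(circle_model(*t), data))
print(best_theta)  # (-1.0, -1.0)
```

A real learning algorithm would search the parameter space more cleverly than by trying three candidates, but the roles are the same: the model fixes what can be represented, and training selects one hypothesis from it.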

- We are biased by what we assume from the start: the hypothesis class.
- But we must assume something.
- We cannot proceed without a hypothesis class.
- There is no learning without inductive bias.
- Without inductive bias it is not possible to extrapolate from known data to unknown events (will gravity still work tomorrow?)
- Since we want to infer something outside known data we must assume some constraints
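
A small sketch of why the assumptions matter: the two learners below both reproduce the training data exactly, yet they disagree outside it. The data and both learners (a least-squares straight line and a nearest-neighbour rule) are illustrative choices, not taken from the course material.

```python
train = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]

def fit_line(data):
    """Inductive bias 1: the answer is a straight line (least squares)."""
    n = len(data)
    mean_x = sum(x for x, _ in data) / n
    mean_y = sum(y for _, y in data) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in data)
             / sum((x - mean_x) ** 2 for x, _ in data))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest(data):
    """Inductive bias 2: the answer equals the value of the closest known point."""
    return lambda x: min(data, key=lambda pair: abs(pair[0] - x))[1]

line = fit_line(train)
nearest = fit_nearest(train)

# Both hypotheses agree with every training example...
print(all(abs(line(x) - y) < 1e-9 and nearest(x) == y for x, y in train))  # True

# ...but extrapolate differently to an unseen input.
print(line(10.0), nearest(10.0))  # 20.0 6.0
```

The data alone cannot say which prediction at `x = 10` is right; only the assumption built into each hypothesis class decides it.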

- Unsupervised learning

- All data is unlabelled
- Find some structure in data

- Group searches with features from image and HTML
- Cai et al, Clustering of WWW Image Search Results, 2004

- Allows us to obtain new features from the data
- Can be used as a step in broader learning tasks
- (preprocessing, visualization, deep learning)
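
As an illustration of finding structure in unlabelled data, here is a minimal k-means clustering sketch in plain Python. The points and the choice of initial centroids (the first and last point) are assumptions made for this example; in practice a library implementation such as the one in Scikit-Learn would normally be used.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))

def assign(points, centroids):
    """Index of the closest centroid for each point."""
    return [min(range(len(centroids)),
                key=lambda i: squared_distance(p, centroids[i]))
            for p in points]

def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        labels = assign(points, centroids)
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(centroid(members) if members else old)
        centroids = new_centroids
    return centroids, assign(points, centroids)

# Unlabelled data: no class information is given.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (2.0, 2.0),
          (8.0, 8.0), (8.0, 9.0), (9.0, 8.0), (9.0, 9.0)]

centroids, labels = k_means(points, centroids=[points[0], points[-1]])
print(centroids)  # [(1.5, 1.5), (8.5, 8.5)]
print(labels)     # [0, 0, 0, 0, 1, 1, 1, 1]
```

The cluster index assigned to each point is an example of a new feature obtained from the data, which can then feed a later learning step.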

- Unsupervised learning
- Supervised learning

- All data is labelled
- Predict value correctly

- Continuous values: Regression
- Discrete classes: Classification
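
The difference between the two supervised settings can be sketched with one simple method, k-nearest neighbours, used both ways. Everything here is illustrative: the feature meanings (area, number of suspicious words), the data and the function names are assumptions. Regression averages the neighbours' continuous values; classification takes a majority vote over their discrete labels.

```python
from collections import Counter

def nearest_neighbours(train, x, k):
    """The k labelled examples whose feature is closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]

def knn_regress(train, x, k=3):
    """Continuous target: average the neighbours' values."""
    neighbours = nearest_neighbours(train, x, k)
    return sum(y for _, y in neighbours) / len(neighbours)

def knn_classify(train, x, k=3):
    """Discrete target: majority vote among the neighbours' labels."""
    neighbours = nearest_neighbours(train, x, k)
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Hypothetical regression data: (area, price)
prices = [(50.0, 100.0), (60.0, 120.0), (70.0, 140.0),
          (100.0, 200.0), (120.0, 240.0)]
print(knn_regress(prices, 65.0))  # 120.0

# Hypothetical classification data: (number of suspicious words, class)
emails = [(1.0, "ham"), (2.0, "ham"), (3.0, "ham"),
          (8.0, "spam"), (9.0, "spam"), (10.0, "spam")]
print(knn_classify(emails, 7.5))  # spam
print(knn_classify(emails, 2.5))  # ham
```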

- Example: face identification
- Valenti et al, Machine Learning Techniques for Face Analysis, 2008

- Unsupervised learning
- Supervised learning
- Reinforcement learning
- Optimize some output
- But no direct feedback for each case

- Example: learn to play a game
- Must learn to predict cost and benefit of each move.
- But can only know final result at the end of the game.
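
A minimal sketch of learning from delayed feedback, using tabular Q-learning on a made-up game. The game (a corridor with positions 0 to 4, starting at 2, winning at 4 and losing at 0), the rewards and all parameter values are assumptions for illustration. The learner only receives a non-zero reward when the game ends, yet it learns a value for each intermediate move.

```python
import random

ACTIONS = {"left": -1, "right": +1}
LOSE, START, WIN = 0, 2, 4

def step(state, action):
    """Move one position; reward is only given when the game ends."""
    next_state = state + ACTIONS[action]
    if next_state == WIN:
        return next_state, 1.0, True
    if next_state == LOSE:
        return next_state, -1.0, True
    return next_state, 0.0, False

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    q = {(s, a): 0.0 for s in range(LOSE, WIN + 1) for a in ACTIONS}
    for _ in range(episodes):
        state, done = START, False
        while not done:
            # Explore with random moves; Q-learning still learns
            # the values of the greedy policy (off-policy learning).
            action = rng.choice(list(ACTIONS))
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q

q = q_learning()
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in (1, 2, 3)}
print(policy)  # {1: 'right', 2: 'right', 3: 'right'}
```

The estimated value of moving right shrinks with distance from the winning position, which is how the final result is propagated back to earlier moves.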

- Robotics: locomotion, manipulation
- Control of autonomous vehicles
- Operations research: pricing, marketing, routing
- Games

- Unsupervised learning
- Supervised learning
- Reinforcement learning
- Semi-supervised learning
- Some data labelled, most unlabelled
- Mixes the two approaches
- Structure of unlabelled data helps choose hypothesis
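
One way to see how unlabelled data can help is a simple self-training loop with a nearest-neighbour rule, sketched below. This is an illustrative choice among many semi-supervised methods, and the data and function names are made up. At each round, the unlabelled point closest to an already labelled point receives that point's label, so labels spread along dense regions of the data.

```python
def nearest_label(x, labelled):
    """Label of the closest labelled point, and the distance to it."""
    point, label = min(labelled, key=lambda pair: abs(pair[0] - x))
    return label, abs(point - x)

def self_training(labelled, unlabelled):
    """Repeatedly label the unlabelled point we are most confident about
    (the one closest to any labelled point) and add it to the labelled set."""
    labelled = list(labelled)
    remaining = list(unlabelled)
    while remaining:
        x = min(remaining, key=lambda u: nearest_label(u, labelled)[1])
        label, _ = nearest_label(x, labelled)
        labelled.append((x, label))
        remaining.remove(x)
    return labelled

# Only two labelled points; most data is unlabelled.
labelled = [(0.0, "red"), (6.0, "blue")]
unlabelled = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 5.5, 6.5, 7.0]

# Using the labelled points alone, 3.5 is closer to the blue example.
supervised_only = nearest_label(3.5, labelled)[0]
print(supervised_only)  # blue

# Using the structure of the unlabelled data, 3.5 joins the red group,
# because it lies at the end of a dense chain that starts at the red example.
result = dict(self_training(labelled, unlabelled))
print(result[3.5])  # red
```

The gap between 3.5 and 5.5 in the unlabelled data is what changes the chosen hypothesis: the boundary moves into the sparse region instead of sitting midway between the two labelled points.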

- Supervised learning
- Unsupervised learning

- Understand the foundations of ML problems and solutions
- Experience with useful ML techniques and applications
- Learn to understand the literature
- Learn to understand the mathematical formulations

- Introduction and Supervised Learning
- Regression, Classification
- Learning Theory
- Unsupervised Learning

- Learn from data, without explicit rules
- Hypothesis class, model and hypothesis
- Inductive bias and learning
- Supervised, unsupervised, reinforcement, semi-supervised

- Alpaydin, Chapter 1
- Mitchell, Chapter 1
- Marsland, Sections 1.1 through 1.4.