Fundamental concepts in AI/ML

Date

Wednesday, January 28, 2026

Links of interest

Notes

In class, we discussed:

Kinds of ML:
- Unsupervised learning: Using unlabeled datasets. Here, we are looking for patterns or structure within the data.
- Supervised learning: Using labeled datasets (i.e., “training data”). With supervised learning, we build a model from the training data that predicts labels for new, unseen inputs.
- Semi-supervised learning: Using datasets in which only some of the data are labeled. Active learning is a form of semi-supervised learning in which your application interactively asks for labels.
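- Reinforcement learning: Developing and refining a model through feedback (rewards or penalties) on its actions, which arrives as the model interacts with its environment.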
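To make the difference between the first two kinds concrete, here is a minimal sketch on made-up 1-D data. The function names (`two_means`, `fit_centroids`, `predict`) and the `"low"`/`"high"` labels are invented for illustration; this is one simple way to show the idea, not a method covered in class.

```python
# Toy 1-D data: two loose groups of measurements (made-up values).
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]

def two_means(xs, iterations=10):
    """Unsupervised: split unlabeled points into two groups (1-D k-means, k=2)."""
    centers = [min(xs), max(xs)]  # simple initial guess
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Recompute each center as the mean of its group (keep old center if empty).
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit_centroids(xs, labels):
    """Supervised: use the labels to compute one centroid per class."""
    by_class = {}
    for x, label in zip(xs, labels):
        by_class.setdefault(label, []).append(x)
    return {label: sum(v) / len(v) for label, v in by_class.items()}

def predict(centroids, x):
    """Assign x to the class whose centroid is closest."""
    return min(centroids, key=lambda label: abs(x - centroids[label]))

cluster_centers = two_means(points)            # no labels needed
labels = ["low", "low", "low", "high", "high", "high"]
centroids = fit_centroids(points, labels)      # labels required
prediction = predict(centroids, 4.5)
```

Both approaches find the same two group centers here, but only the supervised version can attach a meaningful name to each group, because the names come from the labels.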
Elements common to AI/ML applications, including:
- Features (the input variables that describe each data point).
- Data to learn on (labeled or unlabeled).
- A model with parameters, often called weights. The number and arrangement (shape) of these weights can vary greatly between models.
- A way to evaluate performance (in supervised learning, by applying the model to validation and test subsets held out from the labeled data).
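The elements in this list can be seen together in a small sketch. Everything here is an assumption made for illustration: the data are synthetic (each label is exactly twice its feature), the 60/20/20 split fractions are arbitrary, and `split_dataset` and `fit_weight` are hypothetical helper names.

```python
import random

def split_dataset(examples, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle labeled examples and split them into train/validation/test subsets."""
    shuffled = examples[:]                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed for a reproducible split
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

def fit_weight(examples):
    """Model y = w * x with a single parameter w, fitted by least squares."""
    return sum(x * y for x, y in examples) / sum(x * x for x, _ in examples)

# Data to learn on: (feature, label) pairs where label = 2 * feature.
data = [(x, 2 * x) for x in range(10)]
train, val, test = split_dataset(data)

w = fit_weight(train)  # the model's one weight, learned from the training subset
# Evaluation: mean absolute error on the held-out test subset.
test_error = sum(abs(w * x - y) for x, y in test) / len(test)
```

In practice the validation subset would be used to compare candidate models or settings before the test subset is touched; it is unused here because there is only one model.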
Three sorts of problems you are likely to address with AI/ML in your career as a scientist:
1. Clustering: An unsupervised learning approach where data points are grouped by similarity, without predefined labels.
2. Classification: Finding a function, $f$, which maps inputs, $X$, to discrete outputs, $y$, or \emphasis{classes}. When you have two target classes (e.g., Yes and No), you have a \emphasis{binary classification} problem. When you have more than two classes, you have a \emphasis{multiclass classification} problem.
3. Regression: Finding a function, $f$, which maps inputs, $X$, to a continuous output, $y$.
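As a rough illustration of items 2 and 3, the sketch below fits one function $f$ with a continuous output and one with a discrete output. The data are made up, and the choice of a least-squares line and a one-nearest-neighbor rule is an assumption for the example; many other models would do.

```python
def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def classify_1nn(train_xs, train_labels, x):
    """Classification: return the label of the nearest training input."""
    nearest = min(range(len(train_xs)), key=lambda i: abs(train_xs[i] - x))
    return train_labels[nearest]

xs = [0.0, 1.0, 2.0, 3.0]

# Regression: the targets are real numbers, so f returns a continuous value.
slope, intercept = fit_line(xs, [1.0, 3.0, 5.0, 7.0])
predicted_value = slope * 1.5 + intercept

# Binary classification: the targets are one of two classes, Yes or No.
label = classify_1nn(xs, ["No", "No", "Yes", "Yes"], 2.6)
```

The inputs are the same in both cases; what makes one problem regression and the other classification is the type of output the function has to produce.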
Bias (the difference between the expected prediction and the true value of some quantity) and variance (how much a model’s predictions change as different training data are used), two quantities that are used to evaluate model performance.
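Bias and variance can be estimated empirically by refitting a model on many different training sets. The sketch below assumes a deliberately simple setup that was not part of the class discussion: the "model" is a constant prediction equal to the sample mean multiplied by a hypothetical `shrink` factor, the true value is fixed at 10, and variance is measured as the spread of the predictions across training sets.

```python
import random
import statistics

TRUE_MEAN = 10.0  # the "true value" we try to predict (chosen for illustration)

def simulate(shrink, n_datasets=2000, n_points=5, seed=0):
    """Fit a constant model on many training sets; return (bias, variance)."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_datasets):
        # Draw a fresh training set of noisy measurements around the true value.
        sample = [rng.gauss(TRUE_MEAN, 2.0) for _ in range(n_points)]
        predictions.append(shrink * statistics.mean(sample))
    bias = statistics.mean(predictions) - TRUE_MEAN   # average error
    variance = statistics.pvariance(predictions)      # spread across training sets
    return bias, variance

bias_plain, var_plain = simulate(shrink=1.0)    # plain sample mean
bias_shrunk, var_shrunk = simulate(shrink=0.5)  # mean pulled halfway toward zero
```

The shrunken model varies less from one training set to the next but is consistently far from the true value, while the plain mean is close on average but varies more. This is a small instance of the trade-off between the two quantities.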
Definitions for loss and cost:
Loss tells us how far one prediction is from some target value; cost aggregates (typically averages) the loss across a dataset.
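A minimal sketch of the distinction, assuming squared error as the loss and its mean as the cost (one common pairing among many). The function names and the numbers are made up for the example.

```python
def squared_error_loss(prediction, target):
    """Loss: how far a single prediction is from its target."""
    return (prediction - target) ** 2

def mean_squared_error_cost(predictions, targets):
    """Cost: the average of the per-example losses over a dataset."""
    losses = [squared_error_loss(p, t) for p, t in zip(predictions, targets)]
    return sum(losses) / len(losses)

predictions = [2.0, 4.0, 7.0]
targets = [2.0, 5.0, 5.0]

single_loss = squared_error_loss(predictions[2], targets[2])  # one prediction
cost = mean_squared_error_cost(predictions, targets)          # whole dataset
```

Here the three losses are 0, 1, and 4, so `single_loss` is 4.0 and `cost` is their mean, 5/3.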