This post skips the usual introduction to what ML is and instead summarizes some basic details about it. It is based on Aurélien Géron's book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow.

Types of ML

ML systems can be classified along several criteria:

Whether or not they are trained with human supervision
- Supervised
- Unsupervised
- Semi-supervised
- Reinforcement Learning
Whether or not they can learn incrementally on the fly
- Online learning
- Batch learning
Whether they work by simply comparing new data points to known data points, or instead by detecting patterns in the training data and building a predictive model, much like scientists do
- Instance-based learning
- Model-based learning

Common Supervised Learning Algorithms

k-Nearest Neighbors
Linear Regression
Logistic Regression
Support Vector Machines (SVMs)
Decision Trees and Random Forests
Neural networks

Common Unsupervised Learning Algorithms

Clustering
- K-Means
- DBSCAN
- Hierarchical Cluster Analysis (HCA)
Anomaly detection and novelty detection
- One-class SVM
- Isolation Forest
Visualization and dimensionality reduction
- Principal Component Analysis (PCA)
- Kernel PCA
- Locally Linear Embedding (LLE)
- t-Distributed Stochastic Neighbor Embedding (t-SNE)
Association rule learning
- Apriori
- Eclat

Most semi-supervised learning algorithms are combinations of unsupervised and supervised algorithms.

For example, deep belief networks (DBNs) are based on unsupervised components called restricted Boltzmann machines (RBMs) stacked on top of one another. RBMs are trained sequentially in an unsupervised manner, and then the whole system is fine-tuned using supervised learning techniques.

Reinforcement Learning Algorithms

This post does not list specific algorithms for this category. A reinforcement learning system acts, receives rewards or penalties for its actions, and learns over many trials which actions earn the most reward.

Programs that play chess or Go are one example.
Programs that control robots are another.
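
The reward-and-penalty loop can be illustrated with a toy example. The snippet below is a minimal epsilon-greedy sketch, not something taken from the book: the three actions, their fixed rewards in `REWARDS`, and the names `run_agent`, `epsilon`, and `steps` are all made up for illustration. The agent tries actions, receives a reward or a penalty, and keeps a running average of what each action has earned so far.

```python
import random

# Hypothetical environment: each action always returns the same reward.
# Negative values play the role of penalties.
REWARDS = [-1.0, 1.0, -0.5]

def run_agent(steps=200, epsilon=0.1, seed=0):
    """Learn the value of each action by trial and error (epsilon-greedy sketch)."""
    rng = random.Random(seed)
    estimates = [0.0] * len(REWARDS)  # current guess of each action's value
    counts = [0] * len(REWARDS)       # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: pick a random action.
            action = rng.randrange(len(REWARDS))
        else:
            # Exploit: pick the action with the best estimate so far.
            action = max(range(len(REWARDS)), key=lambda a: estimates[a])
        reward = REWARDS[action]
        counts[action] += 1
        # Incremental average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates

estimates = run_agent()
best_action = max(range(len(estimates)), key=lambda a: estimates[a])
print(best_action, estimates)
```

Real reinforcement learning problems also involve states and delayed rewards, which this sketch leaves out on purpose.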

Batch and Online Learning algorithms

As the names suggest, batch algorithms have to be trained offline on all the available data at once, while online algorithms can learn incrementally, on the fly, as new data arrives.
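
To make the difference concrete, here is a small sketch that fits the same model both ways. It assumes a deliberately tiny model, a line through the origin `y = w * x`, and made-up data; `fit_batch`, `OnlineSlope`, and `learn_one` are illustrative names, not a library API. The batch version needs the full dataset up front, while the online version updates its estimate one example at a time.

```python
def fit_batch(xs, ys):
    """Batch learning: needs the whole dataset at once."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

class OnlineSlope:
    """Online learning: updates the same least-squares slope one example at a time."""

    def __init__(self):
        self.sum_xy = 0.0
        self.sum_xx = 0.0
        self.w = 0.0

    def learn_one(self, x, y):
        # Only two running sums are stored, never the past examples.
        self.sum_xy += x * y
        self.sum_xx += x * x
        if self.sum_xx > 0:
            self.w = self.sum_xy / self.sum_xx

# Made-up data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

batch_w = fit_batch(xs, ys)

online = OnlineSlope()
for x, y in zip(xs, ys):
    online.learn_one(x, y)  # the model is usable after every single update

print(batch_w, online.w)
```

After seeing all four examples the two slopes agree. The practical difference is that the online version could keep learning from a fifth example without revisiting the first four.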

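Instance-based vs Model-based algorithms

The distinction is in how the algorithm generalizes to new inputs. An instance-based algorithm, such as k-Nearest Neighbors, keeps the training examples and predicts by comparing a new input to the most similar ones. A model-based algorithm, such as Linear Regression, fits a model to the training data (a line or a plane, for example) and then predicts from that model.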
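
The contrast can be sketched in a few lines of plain Python. This is an illustrative example on made-up data, not code from the book: `knn_predict` and `fit_line` are hypothetical helper names. The first keeps the training points and averages the targets of the closest ones, the second reduces the training data to a slope and an intercept.

```python
def knn_predict(xs, ys, x_new, k=2):
    """Instance-based: keep the training data and average the k closest targets."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - x_new))[:k]
    return sum(y for _, y in nearest) / k

def fit_line(xs, ys):
    """Model-based: reduce the training data to two parameters, slope and intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

# Made-up training data lying exactly on y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]

slope, intercept = fit_line(xs, ys)
knn_at_10 = knn_predict(xs, ys, 10.0, k=2)  # averages the targets at x=5 and x=4
line_at_10 = slope * 10.0 + intercept       # extrapolates along the fitted line

print(knn_at_10, line_at_10)
```

For an input far outside the training range, the instance-based prediction stays close to the nearest known targets, while the fitted line extrapolates.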

Challenges in ML

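Bad data: too little training data, nonrepresentative or poor-quality data, irrelevant features
Bad model: overfitting or underfitting the training data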

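What to do about bad data or a bad model?

Feature selection: keep only the most useful features
Feature extraction: combine existing features into a more useful one
Regularization: constrain the model to make it simpler and reduce overfitting
Hyperparameter tuning: adjust the settings of the learning algorithm itself, such as the amount of regularization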
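
As a small illustration of the last two items, the sketch below fits a line through the origin with an L2 (ridge-style) penalty. It is an assumption-laden toy example: the data is made up, `fit_ridge_slope` is a hypothetical helper, and `alpha` is the hyperparameter that sets how strongly the weight is pulled toward zero.

```python
def fit_ridge_slope(xs, ys, alpha):
    """Slope w of y = w * x that minimizes squared error plus alpha * w**2."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

# Made-up data, roughly y = 2x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

w_plain = fit_ridge_slope(xs, ys, alpha=0.0)   # no regularization
w_reg = fit_ridge_slope(xs, ys, alpha=10.0)    # weight is shrunk toward zero

print(w_plain, w_reg)
```

A larger `alpha` gives a simpler, more constrained model. Choosing its value is a matter of hyperparameter tuning, not something the training step learns on its own.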

Data Load

The function below downloads a dataset archive from a URL and extracts it into a local directory.

import os
import tarfile
import urllib.request  # the request submodule must be imported explicitly

DOWNLOAD_ROOT = "https://raw.githubusercontent.com/ageron/handson-ml2/master/"
HOUSING_PATH = os.path.join("datasets", "housing")
HOUSING_URL = DOWNLOAD_ROOT + "datasets/housing/housing.tgz"

def fetch_housing_data(housing_url=HOUSING_URL, housing_path=HOUSING_PATH):
    # Create the target directory, download the archive, then extract it there.
    os.makedirs(housing_path, exist_ok=True)
    tgz_path = os.path.join(housing_path, "housing.tgz")
    urllib.request.urlretrieve(housing_url, tgz_path)
    with tarfile.open(tgz_path) as housing_tgz:
        housing_tgz.extractall(path=housing_path)
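
Once the archive has been extracted, the data still has to be read into memory. The sketch below uses only the standard library. `load_csv_rows` is a hypothetical helper, and the commented-out call assumes the archive contains a file named `housing.csv` in the extraction directory.

```python
import csv
import os

def load_csv_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by the header row."""
    with open(csv_path, newline="") as f:
        return list(csv.DictReader(f))

# Example (assumes the archive was extracted to datasets/housing
# and contains a file named housing.csv):
# rows = load_csv_rows(os.path.join("datasets", "housing", "housing.csv"))
# print(len(rows), rows[0])
```

All values come back as strings here, so numeric columns would still need converting before any training.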

See the next few posts on the end-to-end ML process for more information.