Fundamentals

Machine learning is the development of algorithms and models that let computers learn to make predictions or decisions from data rather than from explicitly programmed rules. In essence, a machine learning system uses data to improve its performance on a specific task over time. The fundamental idea is to give computers the ability to learn from experience, much as humans do, but faster and from far larger volumes of data.
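To make "improving with data rather than explicit rules" concrete, here is a minimal sketch in plain Python. It assumes a toy dataset generated by the rule y = 2x + 1 and a simple gradient-descent loop; the data, the function names, and the learning rate are illustrative choices, not part of any particular library.

```python
# Toy data produced by the rule y = 2x + 1. The training code below is never
# told this rule; it only sees the (x, y) pairs.
data = [(x, 2 * x + 1) for x in range(-5, 6)]


def mean_squared_error(w, b):
    """Average squared gap between the model's guess w*x + b and the true y."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(steps, learning_rate=0.02):
    """Fit y = w*x + b by gradient descent, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# More passes over the data give a lower error on the task.
for steps in (0, 10, 500):
    w, b = train(steps)
    print(steps, round(w, 3), round(b, 3), mean_squared_error(w, b))
```

The only thing the program is given is examples and a way to measure error; the relationship between x and y is recovered from the data.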

Machine learning can be categorized into different types based on the learning process:

  • Supervised Learning: In supervised learning, the model learns from labeled examples and then predicts a label (classification) or a numeric value (regression) for new inputs.

  • Unsupervised Learning: Unsupervised learning involves learning patterns and structures in data without explicit labels. Common techniques include clustering and dimensionality reduction.

  • Reinforcement Learning: Reinforcement learning is used in scenarios where an agent interacts with an environment and learns by receiving rewards or penalties for its actions.
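Of the three types above, reinforcement learning is the least intuitive to picture, so a small sketch may help. It assumes a two-armed bandit as the environment and an epsilon-greedy agent; the reward probabilities, the `run_bandit` name, and all parameters are hypothetical values chosen for illustration.

```python
import random


def run_bandit(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a two-armed bandit (illustrative setup)."""
    rng = random.Random(seed)
    reward_probability = [0.2, 0.8]  # environment's secret; the agent never reads it
    value_estimate = [0.0, 0.0]      # agent's running estimate of each action's reward
    pulls = [0, 0]                   # how often each action was taken

    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore: try a random action
        else:
            # exploit: take the action that currently looks best
            action = max(range(2), key=lambda a: value_estimate[a])

        # The environment answers with a reward (1) or nothing (0).
        reward = 1.0 if rng.random() < reward_probability[action] else 0.0

        # Incremental average: nudge the estimate toward the observed reward.
        pulls[action] += 1
        value_estimate[action] += (reward - value_estimate[action]) / pulls[action]

    return value_estimate, pulls


estimates, pulls = run_bandit()
print("estimated rewards:", estimates)
print("times each action was chosen:", pulls)
```

The agent is never shown a correct answer. It only observes rewards for the actions it takes, and it gradually favors the action that pays off more often.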

Supervised vs Unsupervised Learning

The comparison below summarizes the major differences between the two approaches:

| Aspect | Supervised Learning | Unsupervised Learning |
|---|---|---|
| Definition | Trains a model on labeled data, where both the input features and the corresponding output labels are provided. | Trains a model on unlabeled data, where only input features are provided and the model must find patterns or structures within the data. |
| Goal | Predict or classify new data points based on past examples. | Discover underlying structures such as clusters or patterns. |
| Output | A predicted value or class label for each input sample. | Cluster assignments, patterns, or a lower-dimensional representation, with no labels attached. |
| Examples | Linear regression, decision trees, random forests, support vector machines (SVM). | K-means clustering, DBSCAN, principal component analysis (PCA). |
| Data Dependency | Requires labeled data, which can be costly and time-consuming to collect. | Works with raw, unannotated data, making it more flexible. |
| Accuracy | Can be measured directly against held-out labels; well-tuned models often reach high accuracy. | Harder to quantify, as there are no labels to validate against. |
| Complexity | Can be computationally intensive, especially with large datasets or complex models. | Often simpler in terms of model complexity, but can still be computationally expensive. |
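The contrast in the comparison above can be sketched by running both approaches on the same points. This is a plain-Python illustration under stated assumptions: the six points, the "small"/"large" labels, the nearest-centroid classifier, and the simplified k-means with a naive, deterministic initialization are all hypothetical choices made for the example, not a reference implementation.

```python
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9), (5.0, 5.1), (5.2, 4.9), (4.8, 5.3)]
labels = ["small", "small", "small", "large", "large", "large"]


def mean(pts):
    """Coordinate-wise mean of a list of points."""
    return tuple(sum(coords) / len(pts) for coords in zip(*pts))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Supervised: a nearest-centroid classifier that uses the labels.
def fit_nearest_centroid(points, labels):
    centroids = {}
    for label in set(labels):
        members = [p for p, l in zip(points, labels) if l == label]
        centroids[label] = mean(members)
    return centroids


def predict(centroids, point):
    return min(centroids, key=lambda label: squared_distance(centroids[label], point))


# Unsupervised: simplified k-means that never sees the labels.
def k_means(points, k=2, iterations=10):
    # Naive initialization (an assumption for determinism): evenly spaced points.
    centroids = [points[i * len(points) // k] for i in range(k)]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: squared_distance(centroids[c], p))
            for p in points
        ]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = mean(members)
    return assignments, centroids


model = fit_nearest_centroid(points, labels)
print("supervised prediction:", predict(model, (1.1, 1.0)))

assignments, _ = k_means(points)
print("unsupervised cluster ids:", assignments)
```

The supervised model returns a named label because names were supplied during training. The unsupervised run recovers the same two groups, but only as anonymous cluster ids; deciding what each cluster means is left to the person reading the result.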
