Preprocessing
Preprocessing is a critical step in machine learning, and the specific considerations vary with the dataset, the problem, and the algorithm you plan to use. The table below summarizes the key preprocessing considerations for supervised and unsupervised learning:
| Consideration | Supervised Learning | Unsupervised Learning |
| --- | --- | --- |
| Handling missing values | Yes | Yes |
| Data normalization/scaling | Yes | Yes |
| Feature selection | Yes | No |
| Handling imbalanced datasets | Yes | N/A |
| Data transformation | Yes | Yes |
| Outlier detection and handling | Yes | Yes |
| Data splitting | Yes | N/A |
| Dimensionality reduction | Optional | Yes |
| Data clustering | N/A | Yes |
| Data visualization | Optional | Yes |
Treat the table as a general guide: which steps apply still depends on your dataset and task, as well as on whether the learning is supervised or unsupervised.
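As a rough illustration of three of the considerations listed above (data splitting, handling missing values, and scaling), here is a minimal sketch for the supervised case. It assumes scikit-learn and NumPy are available. The toy arrays `X` and `y`, the median imputation strategy, and the 75/25 split are assumptions made for the example, not recommendations from this page.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 8 samples, 2 features, with two missing values.
X = np.array([
    [1.0, 200.0],
    [2.0, np.nan],
    [3.0, 180.0],
    [np.nan, 220.0],
    [5.0, 210.0],
    [6.0, 190.0],
    [7.0, 230.0],
    [8.0, 205.0],
])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Data splitting: hold out a test set before fitting any preprocessing step.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Missing values and scaling, chained so both are fitted together.
preprocess = Pipeline([
    ("impute", SimpleImputer(strategy="median")),  # fill NaN with the column median
    ("scale", StandardScaler()),                   # zero mean, unit variance per column
])

# Fit on the training data only, then reuse the fitted statistics on the test set.
X_train_prepared = preprocess.fit_transform(X_train)
X_test_prepared = preprocess.transform(X_test)
```

Fitting the pipeline on the training split alone keeps test-set statistics from leaking into the imputation and scaling parameters. In an unsupervised setting there is typically no held-out split, so the same pipeline could simply be fitted on the whole dataset before clustering or dimensionality reduction.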