Essential Machine Learning Algorithms and Metrics
Evaluation Metrics for ML Models
Accuracy: The ratio of correctly predicted instances to the total number of instances.
Precision: Correct positive predictions divided by total predicted positives.
Recall: Correct positive predictions divided by actual positives.
F1 Score: The harmonic mean of precision and recall.
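The four metrics above can be computed directly from the counts of true/false positives and negatives. The following is a minimal sketch for the binary case; the function name `classification_metrics` and the convention of returning 0.0 when a denominator is zero are assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    # Assumption: undefined ratios (zero denominator) are reported as 0.0.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only (not from any real dataset).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```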
K-Nearest Neighbors (KNN) Algorithm
A classification algorithm that finds the 'k' training examples closest to a data point and assigns the point the majority class among them.
Strengths: Simple to understand, effective for smaller datasets.
Weaknesses: Sensitive to irrelevant features and the scale of the data.
Applications: Image recognition, recommendation systems.
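As a rough illustration of the KNN idea, the sketch below classifies a query point by majority vote among its k nearest training points. Euclidean distance and the helper name `knn_predict` are assumptions; other distance measures are equally valid.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest neighbours."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [(math.dist(point, query), label)
                 for point, label in zip(train_points, train_labels)]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data for illustration: two well-separated clusters.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]
print(knn_predict(points, labels, (1.5, 1.5)))
```

Because the vote depends entirely on distances, a feature measured on a much larger scale than the others would dominate the result, which is why the data is usually rescaled first.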
Ensemble Learning Techniques
Combines multiple models to improve predictive performance.
Methods:
- Bagging (e.g., Random Forests)
- Boosting (e.g., AdaBoost)
Advantages: Typically improves accuracy; bagging mainly reduces variance (overfitting), while boosting mainly reduces bias.
Use Cases: Forecasting, fraud detection.
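To make bagging concrete, here is a minimal sketch that trains several very simple models on bootstrap samples (sampling with replacement) and combines them by majority vote. The base learner is a one-feature threshold rule chosen for brevity, not what a Random Forest actually uses; all function names are hypothetical.

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Pick the threshold t minimising errors for the rule: predict 1 if x > t."""
    # Include one candidate below every sample value so "always 1" is possible.
    candidates = [min(xs) - 1] + sorted(set(xs))
    best_t, best_errors = None, len(xs) + 1
    for t in candidates:
        errors = sum(1 for x, y in zip(xs, ys) if (1 if x > t else 0) != y)
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap sample and return their thresholds."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def bagging_predict(thresholds, x):
    """Combine the stumps by majority vote."""
    votes = Counter(1 if x > t else 0 for t in thresholds)
    return votes.most_common(1)[0][0]


# Toy one-dimensional data for illustration.
xs = [1, 2, 3, 4, 6, 7, 8, 9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
print(bagging_predict(models, 1), bagging_predict(models, 9))
```

Boosting differs in that models are trained sequentially, each one focusing on the examples the previous ones got wrong, rather than independently on resampled data.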
Decision Tree Model
A model that recursively splits data on feature values, forming branches that each end in a decision.
Key Features: Easy to visualize and highly interpretable.
Limitations: Prone to overfitting.
Examples: Classifying customer behavior, credit scoring.
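A tree needs some rule for choosing where to split. One common choice, assumed here because the notes do not name a criterion, is Gini impurity: pick the threshold that leaves the purest groups on each side. The sketch below finds the best single split on one numeric feature; the function names and the sample values are illustrative only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 0.0 for a pure group, higher for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best rule `value <= threshold`."""
    best_threshold, best_score = None, float("inf")
    # The largest value is skipped: splitting there leaves the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= t]
        right = [lab for v, lab in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best_score:
            best_threshold, best_score = t, score
    return best_threshold, best_score


# Hypothetical feature values and outcomes, made up for the example.
values = [20, 25, 30, 45, 50, 60]
outcomes = ["no", "no", "no", "yes", "yes", "yes"]
print(best_split(values, outcomes))
```

A full tree repeats this search inside each resulting group; stopping early or limiting depth is the usual guard against the overfitting noted above.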
Linear Regression
Predicts a target variable by fitting a linear equation to observed data.
Formula: Y = β0 + β1X1 + ... + βnXn
Use Cases: Sales forecasting, risk assessment.
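For the single-feature case (Y = β0 + β1X1), the least-squares coefficients have a closed form. The sketch below assumes ordinary least squares as the fitting method, which the notes do not state explicitly, and uses made-up data.

```python
def fit_simple_linear_regression(xs, ys):
    """Ordinary least squares for one feature; returns (beta0, beta1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); intercept follows from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Illustrative data lying exactly on the line y = 1 + 2x.
beta0, beta1 = fit_simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(beta0, beta1)
```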
Logistic Regression
Used for binary classification; predicts the probability of an event occurring.
Formula: P = 1 / (1 + e^(-Z)), where Z = β0 + β1X1 + ... + βnXn is the linear combination of inputs.
Applications: Credit scoring, marketing response prediction.
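The formula can be evaluated in a few lines. This sketch only covers prediction with already-known coefficients, not training; the weights in the example are hypothetical, and the two-branch form of the sigmoid is an implementation choice to avoid overflow for very negative Z.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, intercept):
    """P(event) for one example: sigmoid of the linear combination Z."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical coefficients, chosen only to illustrate the calculation.
p = predict_probability([2.0, 1.0], weights=[0.5, -1.0], intercept=0.0)
print(p)
```

A probability above a chosen cutoff (0.5 is a common default) is then mapped to the positive class.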
Bayesian Networks
Directed graphical models that represent probabilistic (conditional dependence) relationships among variables.
Key Features: Incorporates prior knowledge and evidence.
Applications: Diagnosing diseases, risk management.
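The smallest possible network, Disease → Test, already shows how prior knowledge and evidence combine. The sketch below applies Bayes' rule to that two-node case; the probabilities are invented for illustration and do not describe any real test.

```python
def posterior_disease_given_positive(prior, sensitivity, false_positive_rate):
    """P(disease | positive test) for a two-node network Disease -> Test."""
    # Evidence term: total probability of seeing a positive test.
    p_positive = (sensitivity * prior
                  + false_positive_rate * (1.0 - prior))
    return sensitivity * prior / p_positive


# Made-up numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
posterior = posterior_disease_given_positive(
    prior=0.01, sensitivity=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

Larger networks chain the same calculation across many variables, which is why dedicated inference algorithms are normally used instead of hand-written formulas.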
Support Vector Machines (SVM)
A supervised learning model that finds the hyperplane separating the classes with the largest margin.
Key Features: Effective in high-dimensional spaces; margin maximization helps limit overfitting.
Use Cases: Text categorization, image classification.
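For a linear SVM, the hyperplane is described by a weight vector w and an offset b, and a point is classified by the sign of w·x + b. The sketch below shows that decision function together with the hinge loss, the penalty commonly used in soft-margin SVM training; the hinge loss is not mentioned in these notes and is added as context, and the weights are hypothetical rather than learned.

```python
def decision_value(weights, bias, x):
    """Signed score w.x + b; its sign gives the predicted class (+1 or -1)."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def hinge_loss(weights, bias, x, y):
    """Zero when the point is on the correct side with margin >= 1."""
    return max(0.0, 1.0 - y * decision_value(weights, bias, x))


# Hypothetical hyperplane x1 - x2 = 0, for illustration only.
w, b = [1.0, -1.0], 0.0
print(hinge_loss(w, b, [3.0, 1.0], +1))   # correct side, outside the margin
print(hinge_loss(w, b, [0.5, 0.0], +1))   # correct side, inside the margin
print(hinge_loss(w, b, [1.0, 2.0], +1))   # wrong side of the hyperplane
```

Training searches for the w and b that keep this loss small while making the margin as wide as possible.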
Key Machine Learning Terminology
Overfitting: Occurs when the model learns noise in the training data instead of general patterns, so it performs well on the training set but poorly on new data.
Training Set: The dataset used to train the model.
Test Set: The held-out dataset, not used during training, used to evaluate the model's performance.
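A simple way to obtain the two sets is to shuffle the data once and cut it at a fixed fraction. The sketch below is one such approach; the function name `split_train_test`, the 25% default, and the fixed seed are assumptions made for reproducibility.

```python
import random


def split_train_test(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of `rows` and return (train, test) with no overlap."""
    rng = random.Random(seed)
    shuffled = list(rows)          # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Illustrative data: twenty placeholder row ids.
train, test = split_train_test(list(range(20)))
print(len(train), len(test))
```

A large gap between training-set and test-set scores is the usual practical sign of overfitting.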