Machine Learning Algorithms: Concepts and Criteria

Decision Trees

Key Characteristics

  • Advantages:
    • Easy to implement
    • Can extract rules
    • Non-incremental (the tree is built from the whole training set in a single greedy pass, without backtracking, which makes the method efficient and practical)
  • Disadvantages:
    • The target function must have discrete output values
    • Primarily for classification problems

Stop Criteria

  • All examples belong to the same class
  • All examples have the same values for every attribute
  • The gain on each split is insignificant
  • The number of samples has reached a certain limit

Overfitting Problem

If the number of nodes is too large, decisions are made based on very small partitions of the samples, which reduces generalization ability.

Formulas

Entropy (Ent):

Ent(S) = -(p+ * log2(p+)) - (p- * log2(p-))

Where:

  • S is the set of examples for that node.
  • p+ is the probability of positive outcomes.
  • p- is the probability of negative outcomes.

Information (Info) for Attribute A:

Info(Attribute A) = Σ (P(vi) * Ent(vi))

Where:

  • P(vi) is the probability of value i (number of examples with value i / total number of examples).
  • Ent(vi) is the entropy of the subset of examples that take value i for attribute A.

Information Gain (Gain):

Gain(S, A) = Ent(S) - Info(A)
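
These three formulas can be combined into a short sketch. The example below is only an illustration, assuming a dataset represented as a list of (attributes, label) pairs where attributes is a dictionary; the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(examples):
    """Ent(S) for a list of (attributes, label) pairs."""
    total = len(examples)
    counts = Counter(label for _, label in examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info(examples, attribute):
    """Info(A): weighted entropy of the subsets induced by attribute A."""
    total = len(examples)
    subsets = {}
    for attrs, label in examples:
        subsets.setdefault(attrs[attribute], []).append((attrs, label))
    return sum(len(sub) / total * entropy(sub) for sub in subsets.values())

def gain(examples, attribute):
    """Gain(S, A) = Ent(S) - Info(A)."""
    return entropy(examples) - info(examples, attribute)
```

The attribute with the largest gain is the one chosen in step 1 of the algorithm below.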

Algorithm Steps

  1. Choose the best attribute to split the examples.
  2. Expand the tree by creating a new branch for each value of the chosen attribute.
  3. Pass the examples to each node according to the attribute's value.
  4. Repeat for each leaf node until a stop criterion is reached:
    1. If all examples belong to the same class, assign the node to that class.
    2. If not, repeat steps 1 through 4.

Recursive Function: generateTree(Examples)

  1. If the examples meet a stop criterion, return a leaf node.
  2. If not, choose the best attribute to split examples and create an attribute node.
  3. For each value i of the chosen attribute, create a subtree: subtree = generateTree(examples_subset_i).
  4. Return the attribute node with the created subtrees as its descendants.
  5. End.
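
A minimal sketch of this recursion, assuming the gain helper and Counter import from the sketch above and a plain nested-dictionary representation of the tree (the data layout is an assumption, not part of the original notes):

```python
def generate_tree(examples, attributes):
    """Recursive tree construction following the steps above (a sketch)."""
    labels = {label for _, label in examples}
    # Stop criteria: all examples share one class, or no attributes remain.
    if len(labels) == 1 or not attributes:
        return Counter(label for _, label in examples).most_common(1)[0][0]
    # Step 2: choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(examples, a))
    node = {"attribute": best, "branches": {}}
    remaining = [a for a in attributes if a != best]
    # Step 3: one branch per value, passing the matching examples down.
    for value in {attrs[best] for attrs, _ in examples}:
        subset = [(attrs, label) for attrs, label in examples if attrs[best] == value]
        node["branches"][value] = generate_tree(subset, remaining)
    return node
```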

Neural Networks

Key Features (Advantages)

  • Ability to adapt and learn
  • Capacity to generalize
  • Ability to classify
  • Used mainly for classification, categorization, and optimization problems
  • Rapid and simple deployment

Perceptron

Formulas

Error:

Error = (Desired Output - Network Output)

Weight Change (Delta Rule):

Δwij = η * xi * Error

Updated Weight:

wij(t + 1) = wij(t) + Δwij
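
A sketch of a single training step built from these formulas; the step activation function and the treatment of the bias are assumptions, since the notes do not specify them:

```python
def perceptron_step(weights, bias, x, desired, eta=0.1):
    """One delta-rule update: w(t+1) = w(t) + eta * x_i * error."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    output = 1 if net >= 0 else 0     # step activation (assumed)
    error = desired - output          # Error = desired output - network output
    new_weights = [w + eta * xi * error for w, xi in zip(weights, x)]
    new_bias = bias + eta * error     # bias updated like a weight on a constant input of 1
    return new_weights, new_bias
```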

Multi-Layer Perceptron (MLP)

Formulas

Sum of Squared Errors (SSE):

Error = 1/2 * Σ (Desired Output - Network Output)^2

Weight Change (Backpropagation):

Δwij = η * xi * δ

Where δ (delta error) is calculated as:

  • If it's an output node: δ = (Desired Output - Network Output) * f'(net_input)
  • If it's a hidden node: δ = (Σ (δ_next_layer * w_to_next_layer)) * f'(net_input)
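
The two cases can be written as small helpers. The sketch below assumes a sigmoid activation, so that f'(net) = f(net) * (1 - f(net)); the notes do not name a specific activation function.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def delta_output(desired, output, net):
    """Delta of an output node: (desired - output) * f'(net)."""
    f = sigmoid(net)
    return (desired - output) * f * (1 - f)

def delta_hidden(deltas_next, weights_to_next, net):
    """Delta of a hidden node: (sum of delta_k * w_k over the next layer) * f'(net)."""
    f = sigmoid(net)
    return sum(d * w for d, w in zip(deltas_next, weights_to_next)) * f * (1 - f)
```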

Stop Criteria

  • Maximum number of iterations reached
  • Error in training falls below a minimum threshold
  • Error increases for k consecutive times in the validation phase
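
The third criterion is the usual early-stopping rule. A sketch of the check, where k and the per-epoch validation error history are illustrative assumptions:

```python
def should_stop(validation_errors, k=5):
    """Stop when the validation error has increased k epochs in a row."""
    if len(validation_errors) <= k:
        return False
    recent = validation_errors[-(k + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```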

Genetic Algorithms

Normalization of the Input Vector

X = min + (max - min) * (decimal_value / (2^number_of_bits - 1))
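
A sketch of this decoding, assuming the chromosome is a bit string with the most significant bit first; the names are illustrative:

```python
def decode(bits, low, high):
    """Map a bit string onto [low, high] using the formula above."""
    decimal_value = int(bits, 2)        # binary chromosome -> integer
    max_value = 2 ** len(bits) - 1      # 2^number_of_bits - 1
    return low + (high - low) * (decimal_value / max_value)

# Example: decode("1010", 0.0, 1.0) = 10 / 15 ≈ 0.667
```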

Stop Criteria

  • No significant improvement in fitness
  • Optimal solution found (if known)
  • Loss of diversity in the population
  • Maximum number of generations reached
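
These criteria could be checked together as in the sketch below; the maximized fitness, the patience window, and the diversity check are illustrative assumptions:

```python
def ga_should_stop(generation, best_fitness_history, population,
                   max_generations=200, patience=20):
    """Return True when any of the stop criteria above is met (a sketch)."""
    if generation >= max_generations:
        return True
    # No significant improvement in the best fitness for `patience` generations.
    if (len(best_fitness_history) > patience
            and best_fitness_history[-1] <= best_fitness_history[-patience - 1]):
        return True
    # Loss of diversity: (almost) all individuals have become identical.
    if len({tuple(individual) for individual in population}) <= 1:
        return True
    return False
```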
