Machine Learning Algorithms: Concepts and Criteria

Decision Trees

Key Characteristics

  • Advantages:
    • Easy to implement
    • Can extract rules
    • Non-incremental (the tree is built from the whole training set in a single greedy pass, without backtracking, which makes the method efficient and practical)
  • Disadvantages:
    • The target function must have discrete output values
    • Primarily for classification problems

Stop Criteria

  • All examples belong to the same class
  • All examples have the same values for every attribute
  • The gain on each split is insignificant
  • The number of samples has reached a certain limit

Overfitting Problem

If the number of nodes is too large, decisions are made based on very small partitions of the samples, which reduces generalization ability.

Formulas

Entropy (Ent):

Ent(S) = -(p+ * log2(p+)) - (p- * log2(p-))

Where:

  • S is the set of examples for that node.
  • p+ is the probability of positive outcomes.
  • p- is the probability of negative outcomes.

Information (Info) for Attribute A:

Info(Attribute A) = Σ (P(vi) * Ent(vi))

Where:

  • P(vi) is the probability of value i (number of examples with value i / total number of examples).
  • Ent(vi) is the entropy of the subset of examples that take value i for attribute A.

Information Gain (Gain):

Gain(S, A) = Ent(S) - Info(A)
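
These three formulas can be combined into a short sketch. The example below is only an illustration, assuming a dataset represented as a list of (attributes, label) pairs where attributes is a dictionary; the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(examples):
    """Ent(S) for a list of (attributes, label) pairs."""
    total = len(examples)
    counts = Counter(label for _, label in examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info(examples, attribute):
    """Info(A): weighted entropy of the subsets induced by attribute A."""
    total = len(examples)
    subsets = {}
    for attrs, label in examples:
        subsets.setdefault(attrs[attribute], []).append((attrs, label))
    return sum(len(sub) / total * entropy(sub) for sub in subsets.values())

def gain(examples, attribute):
    """Gain(S, A) = Ent(S) - Info(A)."""
    return entropy(examples) - info(examples, attribute)
```

The attribute with the largest gain is the one chosen in step 1 of the algorithm below.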

Algorithm Steps

  1. Choose the best attribute to split the examples.
  2. Expand the tree by creating a new branch for each value of the chosen attribute.
  3. Pass the examples to each node according to the attribute's value.
  4. Repeat for each leaf node until a stop criterion is reached:
    1. If all examples belong to the same class, assign the node to that class.
    2. If not, repeat steps 1 through 4.

Recursive Function: generateTree(Examples)

  1. If the examples meet a stop criterion, return a leaf node.
  2. If not, choose the best attribute to split examples and create an attribute node.
  3. For each value i of the chosen attribute, create a subtree: subtree = generateTree(examples_subset_i).
  4. Return the attribute node with the created subtrees as its descendants.
  5. End.
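
A minimal sketch of this recursion, assuming the gain helper and Counter import from the sketch above and a plain nested-dictionary representation of the tree (the data layout is an assumption, not part of the original notes):

```python
def generate_tree(examples, attributes):
    """Recursive tree construction following the steps above (a sketch)."""
    labels = {label for _, label in examples}
    # Stop criteria: all examples share one class, or no attributes remain.
    if len(labels) == 1 or not attributes:
        return Counter(label for _, label in examples).most_common(1)[0][0]
    # Step 2: choose the attribute with the highest information gain.
    best = max(attributes, key=lambda a: gain(examples, a))
    node = {"attribute": best, "branches": {}}
    remaining = [a for a in attributes if a != best]
    # Step 3: one branch per value, passing the matching examples down.
    for value in {attrs[best] for attrs, _ in examples}:
        subset = [(attrs, label) for attrs, label in examples if attrs[best] == value]
        node["branches"][value] = generate_tree(subset, remaining)
    return node
```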

Neural Networks

Key Features (Advantages)

  • Ability to adapt and learn
  • Capacity to generalize
  • Ability to classify
  • Used mainly for classification, categorization, and optimization problems
  • Rapid and simple deployment

Perceptron

Formulas

Error:

Error = (Desired Output - Network Output)

Weight Change (Delta Rule):

Δwij = η * xi * Error

Updated Weight:

wij(t + 1) = wij(t) + Δwij
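
A sketch of a single training step built from these formulas; the step activation function and the treatment of the bias are assumptions, since the notes do not specify them:

```python
def perceptron_step(weights, bias, x, desired, eta=0.1):
    """One delta-rule update: w(t+1) = w(t) + eta * x_i * error."""
    net = sum(w * xi for w, xi in zip(weights, x)) + bias
    output = 1 if net >= 0 else 0     # step activation (assumed)
    error = desired - output          # Error = desired output - network output
    new_weights = [w + eta * xi * error for w, xi in zip(weights, x)]
    new_bias = bias + eta * error     # bias updated like a weight on a constant input of 1
    return new_weights, new_bias
```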

Multi-Layer Perceptron (MLP)

Formulas

Sum of Squared Errors (SSE):

Error = 1/2 * Σ (Desired Output - Network Output)^2

Weight Change (Backpropagation):

Δwij = η * xi * δ

Where δ (delta error) is calculated as:

  • If it's an output node: δ = (Desired Output - Network Output) * f'(net_input)
  • If it's a hidden node: δ = (Σ (δ_next_layer * w_to_next_layer)) * f'(net_input)
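
The two cases can be written as small helpers. The sketch below assumes a sigmoid activation, so that f'(net) = f(net) * (1 - f(net)); the notes do not name a specific activation function.

```python
import math

def sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def delta_output(desired, output, net):
    """Delta of an output node: (desired - output) * f'(net)."""
    f = sigmoid(net)
    return (desired - output) * f * (1 - f)

def delta_hidden(deltas_next, weights_to_next, net):
    """Delta of a hidden node: (sum of delta_k * w_k over the next layer) * f'(net)."""
    f = sigmoid(net)
    return sum(d * w for d, w in zip(deltas_next, weights_to_next)) * f * (1 - f)
```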

Stop Criteria

  • Maximum number of iterations reached
  • Error in training falls below a minimum threshold
  • Error increases for k consecutive times in the validation phase
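
The third criterion is the usual early-stopping rule. A sketch of the check, where k and the per-epoch validation error history are illustrative assumptions:

```python
def should_stop(validation_errors, k=5):
    """Stop when the validation error has increased k epochs in a row."""
    if len(validation_errors) <= k:
        return False
    recent = validation_errors[-(k + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```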

Genetic Algorithms

Normalization of the Input Vector

X = min + (max - min) * (decimal_value / (2^number_of_bits - 1))
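
A sketch of this decoding, assuming the chromosome is a bit string with the most significant bit first; the names are illustrative:

```python
def decode(bits, low, high):
    """Map a bit string onto [low, high] using the formula above."""
    decimal_value = int(bits, 2)        # binary chromosome -> integer
    max_value = 2 ** len(bits) - 1      # 2^number_of_bits - 1
    return low + (high - low) * (decimal_value / max_value)

# Example: decode("1010", 0.0, 1.0) = 10 / 15 ≈ 0.667
```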

Stop Criteria

  • No significant improvement in fitness
  • Optimal solution found (if known)
  • Loss of diversity in the population
  • Maximum number of generations reached
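
These criteria could be checked together as in the sketch below; the maximized fitness, the patience window, and the diversity check are illustrative assumptions:

```python
def ga_should_stop(generation, best_fitness_history, population,
                   max_generations=200, patience=20):
    """Return True when any of the stop criteria above is met (a sketch)."""
    if generation >= max_generations:
        return True
    # No significant improvement in the best fitness for `patience` generations.
    if (len(best_fitness_history) > patience
            and best_fitness_history[-1] <= best_fitness_history[-patience - 1]):
        return True
    # Loss of diversity: (almost) all individuals have become identical.
    if len({tuple(individual) for individual in population}) <= 1:
        return True
    return False
```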
