Artificial Intelligence Fundamentals: Search, Logic, and MDPs

Posted by Anonymous and classified in Mathematics


Agent Environments and PEAS

PEAS: Performance, Environment, Actuators, Sensors

Environment Properties

  • Fully vs. Partially Observable: Full information or hidden information?
  • Deterministic vs. Stochastic: Outcome is known or probability is known (if probability is unknown, it is Nondeterministic).
  • Episodic vs. Sequential: Independent actions vs. sequential (dependent) actions.
  • Static vs. Dynamic: No environment change vs. environment change over time.
  • Semidynamic: The environment does not change with time, but the agent's performance score does.
  • Discrete vs. Continuous: Distinct chunks (e.g., Chess) or smooth/measurable values (e.g., Driving).
  • Known vs. Unknown: Are the rules of the world given or must they be learned?
  • Single vs. Multi-agent: One agent or multiple agents.

Chess Example (with a clock): Fully observable, Multi-agent, Deterministic, Sequential, Semidynamic, Discrete.


Search Algorithms

State space graph: Each state appears exactly once (no revisits), but the full graph is usually too large to hold in memory.

Search tree: States may be revisited along different paths (repeated structure), but only the explored portion needs to be built, so it can be constructed on demand.

Manhattan Distance

%IMAGE_1%

If a tile is at (0,0) but belongs at (2,2), its Manhattan distance is |2-0| + |2-0| = 4.
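
The computation above can be written as a one-line helper (function name hypothetical):

```python
def manhattan(pos, goal):
    """Manhattan distance: sum of absolute coordinate differences."""
    (x1, y1), (x2, y2) = pos, goal
    return abs(x2 - x1) + abs(y2 - y1)

print(manhattan((0, 0), (2, 2)))   # 4
```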

Search Strategies

  • Uninformed search: Uses only information gathered so far (e.g., the path cost g(n)); no estimate of distance to the goal.
  • Informed search: Also uses a heuristic h(n) that estimates the remaining cost to the goal.
  • BFS: FIFO Queue, optimal (if cost=1), high memory (O(b^d)).
  • DFS: LIFO Stack, not optimal, low memory (O(bm)).
  • IDDFS: DFS with an increasing depth limit. Optimal (if cost=1), low memory (O(bd)).
  • UCS: g(n) (path cost), Priority Queue, optimal (equivalent to A* with h=0).
  • Greedy: h(n) (heuristic), fast but not optimal.
  • A*: f(n) = g(n) + h(n), optimal if h(n) is admissible/consistent.
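
The strategies above differ mainly in how the frontier is ordered. A minimal A* sketch over a toy graph (graph, step costs, and heuristic values are made up for illustration; the heuristic is admissible):

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A* search: expand nodes in order of f(n) = g(n) + h(n)."""
    frontier = [(h(start), 0, start, [start])]      # (f, g, node, path)
    best_g = {start: 0}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nxt, step_cost in neighbors(node):
            g2 = g + step_cost
            if g2 < best_g.get(nxt, float("inf")):  # found a cheaper route
                best_g[nxt] = g2
                heapq.heappush(frontier, (g2 + h(nxt), g2, nxt, path + [nxt]))
    return None, float("inf")

# Hypothetical graph: adjacency lists with step costs, plus heuristic values.
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)], "B": [("G", 1)]}
h = {"S": 2, "A": 2, "B": 1, "G": 0}
path, cost = a_star("S", "G", lambda n: graph.get(n, []), lambda n: h[n])
print(path, cost)   # ['S', 'A', 'B', 'G'] 4
```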

Heuristic Properties

Admissibility (tree search): 0 ≤ h(n) ≤ h*(n), where h*(n) is the true cost to the nearest goal.

Consistency (graph search): h(n) ≤ c(n, a, n') + h(n') for every successor n', i.e., h(n) − h(n') ≤ c(n, a, n').


Uncertainty and Probabilistic Inference

Prior: P(a)

Posterior: P(a|b) = P(a∩b) / P(b). (If P(b) is unknown, sum over all values of the other variable, e.g., P(b) = Σ_a P(a∩b).)

Bayes' Rule: P(a|b) = P(b|a) · P(a) / P(b)
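
A quick numeric check of Bayes' rule, with hypothetical prior and likelihood values (a classic disease-test setup):

```python
# Hypothetical numbers: a = has disease, b = positive test.
p_a = 0.01                # prior P(a)
p_b_given_a = 0.90        # likelihood P(b|a)
p_b_given_not_a = 0.05    # false-positive rate P(b|¬a)

# P(b) is unknown, so sum over both values of a.
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
# Bayes' rule: P(a|b) = P(b|a) P(a) / P(b).
p_a_given_b = p_b_given_a * p_a / p_b
print(round(p_a_given_b, 3))   # 0.154
```

Despite the accurate test, the posterior stays low because the prior is small.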

  • Absolute independence: P(a|b) = P(a), P(a∩b) = P(a) * P(b).
  • Conditional independence: A and B are independent given C: P(a∩b | c) = P(a|c) · P(b|c).
  • Bayes net diagram: Directed Acyclic Graph (DAG). Nodes = variables, Edges = dependencies.

Exact Inference (Alarm Domain)

Summing out: P(x, e) = Σ_y P(x, e, y)

x = query event, e = observed evidence, y = hidden variables. To calculate P(x|e), find P(x|e) = α · P(x, e) = α · Σ_y P(x, e, y), where α normalizes the resulting distribution.
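
Inference by enumeration can be sketched directly from this formula, assuming a small hand-made joint distribution over three Boolean variables (all numbers hypothetical):

```python
# Hypothetical joint distribution P(X, E, Y); the eight entries sum to 1.
joint = {
    (True, True, True): 0.04,   (True, True, False): 0.06,
    (True, False, True): 0.10,  (True, False, False): 0.20,
    (False, True, True): 0.08,  (False, True, False): 0.12,
    (False, False, True): 0.15, (False, False, False): 0.25,
}

def query(joint, e_value):
    """P(X | e) = alpha * sum_y P(X, e, y): sum out Y, normalize over X."""
    unnorm = {x: sum(joint[(x, e_value, y)] for y in (True, False))
              for x in (True, False)}
    alpha = 1 / sum(unnorm.values())
    return {x: alpha * p for x, p in unnorm.items()}

dist = query(joint, True)
print(round(dist[True], 3))   # 0.333
```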


Adversarial Search

Alpha-Beta Pruning: If α ≥ β, then prune.
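
A minimal sketch of minimax with alpha-beta pruning, using a nested-list toy game tree (leaves are terminal utilities; all names are illustrative):

```python
def alphabeta(node, depth, alpha, beta, maximizing, children, value):
    """Minimax with alpha-beta pruning: stop exploring once alpha >= beta."""
    kids = children(node)
    if depth == 0 or not kids:
        return value(node)
    if maximizing:
        best = float("-inf")
        for child in kids:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, value))
            alpha = max(alpha, best)
            if alpha >= beta:          # prune remaining siblings
                break
        return best
    else:
        best = float("inf")
        for child in kids:
            best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                       True, children, value))
            beta = min(beta, best)
            if alpha >= beta:          # prune remaining siblings
                break
        return best

# Toy tree: lists are internal nodes, integers are leaf utilities.
tree = [[3, 5], [2, 9]]
children = lambda n: n if isinstance(n, list) else []
value = lambda n: n
print(alphabeta(tree, 4, float("-inf"), float("inf"), True, children, value))  # 3
```

Here the leaf 9 is never evaluated: after seeing 2 in the second min-node, α = 3 ≥ β = 2 triggers the prune.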


Constraint Satisfaction Problems (CSP)

Variables, Domains, Constraints.

  • Map coloring: X=Regions, D={Red, Green, Blue}, C=neighbors must differ in color.
  • Sudoku: X=Cells, D={1..9}, C=All different in row, column, and block.
  • N-Queens: X=Queens (columns), D={Rows 1..N}, C=No two queens share row/diagonal.
  • Cryptarithmetic (TWO+TWO=FOUR): X={T,W,O,F,U,R, C1, C2, C3}, D={0..9}, C=All different + Math constraints.

Column constraints (with carries C1–C3): O + O = R + 10·C1; C1 + W + W = U + 10·C2; C2 + T + T = O + 10·C3; C3 = F.

Backtracking and Filtering

Backtracking DFS: Select unassigned variable → Try value → Check constraints → Recurse. If conflict/empty domain, backtrack (undo assignment).
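
The loop above can be sketched on the map-coloring CSP (region names and colors are illustrative):

```python
def backtrack(assignment, variables, domains, neighbors):
    """Backtracking DFS: select variable, try value, check, recurse, undo."""
    if len(assignment) == len(variables):
        return dict(assignment)
    var = next(v for v in variables if v not in assignment)
    for value in domains[var]:
        # Constraint check: adjacent regions must differ in color.
        if all(assignment.get(n) != value for n in neighbors[var]):
            assignment[var] = value
            result = backtrack(assignment, variables, domains, neighbors)
            if result is not None:
                return result
            del assignment[var]        # conflict downstream: undo (backtrack)
    return None

# Illustrative three-region map where every region borders the others.
variables = ["WA", "NT", "SA"]
neighbors = {"WA": ["NT", "SA"], "NT": ["WA", "SA"], "SA": ["WA", "NT"]}
domains = {v: ["Red", "Green", "Blue"] for v in variables}
solution = backtrack({}, variables, domains, neighbors)
print(solution)   # {'WA': 'Red', 'NT': 'Green', 'SA': 'Blue'}
```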

Forward Checking: When variable X is assigned, delete inconsistent values from neighbors' domains immediately.

Arc Consistency (AC-3): Checks arcs X → Y. An arc is consistent if for every value x in X's domain there exists some value y in Y's domain that satisfies the constraint; values of X with no such support are removed.
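
A sketch of the AC-3 loop on a hypothetical binary constraint X < Y (variable names and domains are illustrative):

```python
from collections import deque

def revise(domains, x, y, allowed):
    """Delete values of x that have no supporting value in y's domain."""
    removed = False
    for vx in list(domains[x]):
        if not any(allowed(x, y, vx, vy) for vy in domains[y]):
            domains[x].discard(vx)
            removed = True
    return removed

def ac3(domains, arcs, allowed):
    """AC-3: repeatedly revise arcs until every arc X -> Y is consistent."""
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        if revise(domains, x, y, allowed):
            if not domains[x]:
                return False                       # empty domain: no solution
            queue.extend((z, w) for z, w in arcs   # re-check arcs into x
                         if w == x and z != y)
    return True

# Hypothetical constraint X < Y over domains {1, 2, 3}.
domains = {"X": {1, 2, 3}, "Y": {1, 2, 3}}
arcs = [("X", "Y"), ("Y", "X")]
allowed = lambda u, v, a, b: a < b if u == "X" else b < a
ac3(domains, arcs, allowed)
print(sorted(domains["X"]), sorted(domains["Y"]))   # [1, 2] [2, 3]
```

X loses 3 (no larger value remains for Y) and Y loses 1 (no smaller value remains for X).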

CSP Heuristics

  • MRV (Minimum Remaining Values): Pick variable with fewest legal values left (Fail-fast).
  • LCV (Least Constraining Value): Pick value that rules out the fewest choices for neighbors (Fail-last).

Markov Decision Processes (MDP)

Components

  • States: e.g., cool, hot, overheated.
  • Actions: e.g., fast, slow.
  • Discount factor: Gamma (γ).
  • Transition Function T(s,a,s'): Probability of reaching s' from s via action a. P(s' | s, a).
  • Reward Function: R(s,a,s').

Utilities over time: U([r_0, r_1, r_2, …]) = Σ_t γ^t · r_t (when γ = 1, no discount: the plain sum of rewards).
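
The discounted utility is a one-liner (function name hypothetical):

```python
def utility(rewards, gamma=1.0):
    """U([r_0, r_1, ...]) = sum_t gamma**t * r_t; gamma=1 gives the plain sum."""
    return sum(gamma ** t * r for t, r in enumerate(rewards))

print(utility([1, 2, 3]))        # 6.0
print(utility([1, 2, 3], 0.5))   # 1 + 1.0 + 0.75 = 2.75
```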

Policy (π): A mapping from each state to an action. Optimal Policy (π*): The policy that maximizes expected discounted reward.

Value and Policy Iteration

Bellman Equations:

  • V*(s) = max_a Q*(s, a)
  • Q*(s, a) = Σ_s' T(s, a, s') [ R(s, a, s') + γ V*(s') ]
  • Value iteration update: V_{k+1}(s) = max_a Σ_s' T(s, a, s') [ R(s, a, s') + γ V_k(s') ]
  • Policy extraction: π*(s) = argmax_a Σ_s' T(s, a, s') [ R(s, a, s') + γ V*(s') ]
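
Value iteration can be sketched on a small MDP using the cool/hot/overheated states listed above (transition probabilities and rewards are hypothetical):

```python
# Hypothetical racing MDP: T[(s, a)] = list of (next_state, prob, reward).
T = {
    ("cool", "slow"): [("cool", 1.0, 1)],
    ("cool", "fast"): [("cool", 0.5, 2), ("hot", 0.5, 2)],
    ("hot", "slow"):  [("cool", 0.5, 1), ("hot", 0.5, 1)],
    ("hot", "fast"):  [("overheated", 1.0, -10)],
}
states = ["cool", "hot", "overheated"]   # overheated is terminal

def value_iteration(states, T, gamma=0.9, iters=100):
    """Apply V_{k+1}(s) = max_a sum_{s'} T(s,a,s')[R(s,a,s') + gamma V_k(s')]."""
    V = {s: 0.0 for s in states}
    for _ in range(iters):
        V = {s: max((sum(p * (r + gamma * V[s2]) for s2, p, r in T[(s, a)])
                     for a in ("slow", "fast") if (s, a) in T),
                    default=0.0)         # terminal states keep value 0
             for s in states}
    return V

V = value_iteration(states, T)
print(V["cool"] > V["hot"])   # True: staying cool is worth more
```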
