Compiler Design: SDTS, LR Parsing, and Code Optimization
Syntax-Directed Translation Schemes (SDTS)
A possible Syntax-Directed Translation Scheme (SDTS) uses the attribute val to store the value of each non-terminal.
- E → E1 + T { E.val = E1.val + T.val }
- E → T { E.val = T.val }
- T → T1 * F { T.val = T1.val * F.val }
- T → F { T.val = F.val }
- F → num { F.val = num.value }
Bottom-Up Evaluation of 3 + 2 * 4
Evaluation using SDTS (bottom-up):
- F → num(4): F.val = 4
- F → num(2): F.val = 2
- F → num(3): F.val = 3
- T → F (for num(2)): T.val = F.val = 2
- T → T * F: T.val = T.val (from num(2)) * F.val (from num(4)) = 2 * 4 = 8
- T → F (for num(3)): T.val = F.val = 3
- E → T (for num(3)): E.val = T.val = 3
- E → E + T: E.val = E.val (from num(3)) + T.val (from 2 * 4) = 3 + 8 = 11
Therefore, the result of the computation for 3 + 2 * 4 using the given SDTS is 11.
Three-Address Code and Basic Blocks
The following is the Three-Address Code for the loop:
- prod = 0
- i = 1
- L1: t1 = a[i] // Access element a[i]
- t2 = b[i] // Access element b[i]
- t3 = t1 * t2 // Multiply the elements
- prod = prod + t3 // Add to prod
- i = i + 1 // Increment i
- if i <= 20 goto L1 // Loop condition
- L2: // End of the loop
Identifying Leaders and Basic Blocks
Finding Leaders:
- (1) prod = 0 is a leader (first statement).
- (3) L1: t1 = a[i] is a leader (target of the goto in statement 8).
- (9) L2: is a leader (follows the goto in statement 8).
Based on the leaders, the basic blocks are:
- Block B1: Statements (1) and (2)
- Block B2: Statements (3) through (8)
- Block B3: Statement (9)
Flow Graph Representation
The flow graph structure is as follows:
- B1 points to B2.
- B2 loops back to B2 (if i <= 20 is true).
- B2 points to B3 (if i <= 20 is false).
LR(1) Parsing and Grammar Augmentation
To analyze the grammar, we first augment it:
- S' → S
- S → CC
- C → cC
- C → d
FIRST and FOLLOW Sets
- FIRST(C) = {c, d}
- FIRST(S) = {c, d}
- FIRST(S') = {c, d}
- FOLLOW(S') = {$}
- FOLLOW(S) = {$}
- FOLLOW(C) = {c, d, $}
LR(1) Items and State Transitions
- State I1: GOTO(I0, S) = {[S' → S·, $]}
- State I2: GOTO(I0, C) = {[S → C·C, $], [C → ·cC, $], [C → ·d, $]}
- State I3: GOTO(I0, c) = {[C → c·C, c/d], [C → ·cC, c/d], [C → ·d, c/d]}
- State I4: GOTO(I0, d) = {[C → d·, c/d]}
- State I5: GOTO(I2, C) = {[S → CC·, $]}
- State I6: GOTO(I2, c) = {[C → c·C, $], [C → ·cC, $], [C → ·d, $]}
- State I7: GOTO(I2, d) = {[C → d·, $]}
- State I8: GOTO(I3, C) = {[C → cC·, c/d]}
- State I11: GOTO(I6, C) = {[C → cC·, $]}
Note: State I9 is equivalent to I3, State I10 is equivalent to I4, State I12 is equivalent to I6, and State I13 is equivalent to I7.
LR(1) Parsing Table
Operator Precedence and Function Tables
Consider the grammar: S → (L) | a and L → S | S, L.
Operator Precedence Relation (OPR)
Operator Function Table (OFT)
A possible OFT for the precedence relations:
| Terminal | f | g |
|---|---|---|
| ( | 2 | 1 |
| ) | 4 | 3 |
| , | 4 | 3 |
| a | 4 | 3 |
Advantages and Disadvantages of OFT
Advantages:
- Space Efficiency: Requires less storage than a full OPR table.
- Simplicity: Numerical comparisons are faster than table lookups.
Disadvantages:
- Not Always Applicable: Cannot be constructed for all operator precedence grammars.
- Potential Loss of Information: Mapping relations to functions may obscure specific precedence details.
Conflicts in LR Parsers
There are two primary types of conflicts in LR parsers:
- Shift/Reduce Conflict: The parser cannot decide whether to shift the next symbol or reduce the current stack top.
- Reduce/Reduce Conflict: The parser has multiple productions available for reduction.
SLR(1) Conflict Analysis
In State I1 for a specific grammar, we find:
- Item R → L·: On input '=', reduce by R → L (since '=' is in FOLLOW(R)).
- Item S → L·=R: On input '=', shift.
This results in a Shift/Reduce conflict on the '=' input in State I1.
Symbol Table Management
The symbol table stores essential information about program identifiers (variables, functions, etc.). It is used for:
- Storage: Details like name, type, scope, and memory location.
- Lookup: Quick retrieval of identifier data.
- Semantic Checks: Type checking and scope validation.
- Code Generation: Providing details for machine code production.
Common Data Structures:
- Hash Table: Offers fast average lookup (O(1)).
- Balanced Binary Search Tree: Provides guaranteed O(log n) lookup and ordered storage.
Intermediate Code and DAG Representation
Three-Address Code and Leaders
Leaders: i = 1, L1, L2, L3, L4, L5, L6, L7, L8, L9, L10
Directed Acyclic Graph (DAG)
The Role of a Compiler in Language Processing
A compiler is the core translator in a language processing system. It typically operates after the preprocessor and before the assembler. It converts high-level source code into low-level assembly or machine code.
Example Walkthrough
For the expression: Count = i + j * k; (where Count, i, j are Float and k is Int):
- Lexical Analysis: Identifies tokens like Count, =, i, +, j, *, k, and ;.
- Syntax Analysis: Creates an Abstract Syntax Tree (AST) respecting operator precedence (multiplication before addition).
- Semantic Analysis: Checks types and performs implicit conversion of k to float for the multiplication.
- Intermediate Code Generation:
- temp1 = float(k)
- temp2 = j * temp1
- Count = i + temp2
English with a size of 232.62 KB