Compiler Design: SDTS, LR Parsing, and Code Optimization

Posted by Anonymous and classified in Computers

Written on January 27, 2026 in English with a size of 232.62 KB

Syntax-Directed Translation Schemes (SDTS)

A possible Syntax-Directed Translation Scheme (SDTS) uses the attribute val to store the value of each non-terminal.

E → E₁ + T { E.val = E₁.val + T.val }
E → T { E.val = T.val }
T → T₁ * F { T.val = T₁.val * F.val }
T → F { T.val = F.val }
F → num { F.val = num.value }

Bottom-Up Evaluation of 3 + 2 * 4

Evaluation using SDTS (bottom-up):

F → num(4): F.val = 4
F → num(2): F.val = 2
F → num(3): F.val = 3
T → F (for num(2)): T.val = F.val = 2
T → T * F: T.val = T.val (from num(2)) * F.val (from num(4)) = 2 * 4 = 8
T → F (for num(3)): T.val = F.val = 3
E → T (for num(3)): E.val = T.val = 3
E → E + T: E.val = E.val (from num(3)) + T.val (from 2 * 4) = 3 + 8 = 11

Therefore, the result of the computation for 3 + 2 * 4 using the given SDTS is 11.

Three-Address Code and Basic Blocks

The following is the Three-Address Code for the loop:

prod = 0
i = 1
L1: t1 = a[i] // Access element a[i]
t2 = b[i] // Access element b[i]
t3 = t1 * t2 // Multiply the elements
prod = prod + t3 // Add to prod
i = i + 1 // Increment i
if i <= 20 goto L1 // Loop condition
L2: // End of the loop

Identifying Leaders and Basic Blocks

Finding Leaders:

(1) prod = 0 is a leader (first statement).
(3) L1: t1 = a[i] is a leader (target of the goto in statement 8).
(9) L2: is a leader (follows the goto in statement 8).

Based on the leaders, the basic blocks are:

Block B1: Statements (1) and (2)
Block B2: Statements (3) through (8)
Block B3: Statement (9)

Flow Graph Representation

The flow graph structure is as follows:

B1 points to B2.
B2 loops back to B2 (if i <= 20 is true).
B2 points to B3 (if i <= 20 is false).

LRtVGdxZQQQgj5GqdCGF2BEUR1szoE0vkYQggh5Fv8MMJICCGEuAIKIyGEEGKDwkgIIYTYoDASQgghNiiMhBBCiA0KIyGEEGKDwkgIIYTYoDASQgghNiiMhBBCiA0KIyGEEGKDwkgIIYTYoDASQgghNiiMhBBCiA0KIyGEEGKDwkgIIYTY+P8BSaC7mRyU+ZUAAAAASUVORK5CYII=

LR(1) Parsing and Grammar Augmentation

To analyze the grammar, we first augment it:

S' → S
S → CC
C → cC
C → d

FIRST and FOLLOW Sets

FIRST(C) = {c, d}
FIRST(S) = {c, d}
FIRST(S') = {c, d}
FOLLOW(S') = {$}
FOLLOW(S) = {$}
FOLLOW(C) = {c, d, $}

LR(1) Items and State Transitions

State I1: GOTO(I0, S) = {[S' → S·, $]}
State I2: GOTO(I0, C) = {[S → C·C, $], [C → ·cC, $], [C → ·d, $]}
State I3: GOTO(I0, c) = {[C → c·C, c/d], [C → ·cC, c/d], [C → ·d, c/d]}
State I4: GOTO(I0, d) = {[C → d·, c/d]}
State I5: GOTO(I2, C) = {[S → CC·, $]}
State I6: GOTO(I2, c) = {[C → c·C, $], [C → ·cC, $], [C → ·d, $]}
State I7: GOTO(I2, d) = {[C → d·, $]}
State I8: GOTO(I3, C) = {[C → cC·, c/d]}
State I11: GOTO(I6, C) = {[C → cC·, $]}

Note: State I9 is equivalent to I3, State I10 is equivalent to I4, State I12 is equivalent to I6, and State I13 is equivalent to I7.

LR(1) Parsing Table

AVP5khSnGp32AAAAAElFTkSuQmCC

Operator Precedence and Function Tables

Consider the grammar: S → (L) | a and L → S | S, L.

Operator Precedence Relation (OPR)

8DtWxeopuQydkAAAAASUVORK5CYII=

AS+cw3NpWUKDAAAAAElFTkSuQmCC

h9qsALqtYeB9AAAAABJRU5ErkJggg==

Operator Function Table (OFT)

A possible OFT for the precedence relations:

Terminal	f	g
(	2	1
)	4	3
,	4	3
a	4	3

Advantages and Disadvantages of OFT

Advantages:

Space Efficiency: Requires less storage than a full OPR table.
Simplicity: Numerical comparisons are faster than table lookups.

Disadvantages:

Not Always Applicable: Cannot be constructed for all operator precedence grammars.
Potential Loss of Information: Mapping relations to functions may obscure specific precedence details.

Conflicts in LR Parsers

There are two primary types of conflicts in LR parsers:

Shift/Reduce Conflict: The parser cannot decide whether to shift the next symbol or reduce the current stack top.
Reduce/Reduce Conflict: The parser has multiple productions available for reduction.

SLR(1) Conflict Analysis

In State I1 for a specific grammar, we find:

Item R → L·: On input '=', reduce by R → L (since '=' is in FOLLOW(R)).
Item S → L·=R: On input '=', shift.

This results in a Shift/Reduce conflict on the '=' input in State I1.

Symbol Table Management

The symbol table stores essential information about program identifiers (variables, functions, etc.). It is used for:

Storage: Details like name, type, scope, and memory location.
Lookup: Quick retrieval of identifier data.
Semantic Checks: Type checking and scope validation.
Code Generation: Providing details for machine code production.

Common Data Structures:

Hash Table: Offers fast average lookup (O(1)).
Balanced Binary Search Tree: Provides guaranteed O(log n) lookup and ordered storage.

Intermediate Code and DAG Representation

Three-Address Code and Leaders

z+uhObC0HIwNAAAAABJRU5ErkJggg==

Leaders: i = 1, L1, L2, L3, L4, L5, L6, L7, L8, L9, L10

Directed Acyclic Graph (DAG)

wOO1pPCTX5pxwAAAABJRU5ErkJggg==

The Role of a Compiler in Language Processing

A compiler is the core translator in a language processing system. It typically operates after the preprocessor and before the assembler. It converts high-level source code into low-level assembly or machine code.

Example Walkthrough

For the expression: Count = i + j * k; (where Count, i, j are Float and k is Int):

Lexical Analysis: Identifies tokens like Count, =, i, +, j, *, k, and ;.
Syntax Analysis: Creates an Abstract Syntax Tree (AST) respecting operator precedence (multiplication before addition).
Semantic Analysis: Checks types and performs implicit conversion of k to float for the multiplication.
Intermediate Code Generation:
- temp1 = float(k)
- temp2 = j * temp1
- Count = i + temp2

Related entries:

Tags: