Program to Calculate First and Follow Sets of Given Grammar

Last Updated : 29 Nov, 2024

The First and Follow sets are important in syntax analysis, mainly in parsing. Both sets are used to build predictive parsers and are integral to determining how a given grammar can be parsed effectively.

The First Set for a non-terminal symbol represents all possible terminals that can appear at the beginning of any string derived from that non-terminal.

The Follow set of a non-terminal contains the terminals that can appear immediately after that non-terminal in some derivation of the grammar.

Together, the First and Follow sets let a parser decide which production rule to apply at each step of parsing.

What is the First Set?

The First set of a non-terminal in a grammar tells us all the possible terminals that can appear at the beginning of any string derived from that non-terminal. It gives us a way to know what symbol might come first if we start with a particular non-terminal. For example, if we have a rule like A -> aB | ε, then the First set of A includes the terminal a, as well as ε, which represents the possibility of deriving an empty string. Calculating the First set helps us predict which rules to apply during parsing by knowing which symbols can appear first in a derivation.
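The idea can be sketched in a few lines of Python. This is a minimal sketch, not the article's implementation: it uses fixed-point iteration (repeat until no set grows) rather than recursion, and the rule B -> b is a hypothetical addition so that the example grammar is complete. The names first_sets and EPSILON are also assumptions made for illustration.

```python
EPSILON = "ε"

def first_sets(grammar):
    """Compute First sets by repeating the rules until no set grows.

    grammar maps each non-terminal to a list of alternatives; each
    alternative is a list of symbols, and [] stands for ε.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                before = len(first[nt])
                nullable = True  # True while every symbol seen so far can vanish
                for sym in alt:
                    if sym in grammar:          # non-terminal
                        first[nt] |= first[sym] - {EPSILON}
                        if EPSILON not in first[sym]:
                            nullable = False
                            break
                    else:                       # terminal: it starts the string
                        first[nt].add(sym)
                        nullable = False
                        break
                if nullable:                    # the whole alternative can be empty
                    first[nt].add(EPSILON)
                if len(first[nt]) != before:
                    changed = True
    return first

grammar = {
    "A": [["a", "B"], []],   # A -> aB | ε
    "B": [["b"]],            # B -> b (hypothetical, added for the example)
}
first = first_sets(grammar)
print(first["A"], first["B"])
```

Here First(A) comes out as the terminal a together with ε, as described above.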

What is a Follow Set?

The Follow set of a non-terminal contains all terminals that can immediately follow it in any derivation of the grammar. This means that if a non-terminal appears in the middle of a sentence, the Follow set tells us what symbol might come right after it. For example, in a rule like S -> AB, the Follow set of A includes every terminal that can begin B and, if B can derive the empty string, everything in the Follow set of S as well. Follow sets are especially useful for parsers when deciding which rule to apply next, particularly for LL(1) parsers, where we only look one symbol ahead.
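To illustrate how a one-symbol lookahead uses these sets, here is a small sketch for a single non-terminal R with the rule R -> +TR | #, where # stands for the empty string. The First and Follow values are hard-coded rather than computed, and the names ALTERNATIVES, FOLLOW_R and choose are hypothetical.

```python
EPS = "#"

# First set of each alternative of R, and Follow(R), hard-coded for this sketch.
ALTERNATIVES = {"+TR": {"+"}, "#": {EPS}}
FOLLOW_R = {"$", ")"}

def choose(lookahead):
    """Return the alternative of R to expand for this lookahead, or None."""
    for alt, first in ALTERNATIVES.items():
        if lookahead in first - {EPS}:
            return alt          # the lookahead can start this alternative
        if EPS in first and lookahead in FOLLOW_R:
            return alt          # the alternative vanishes and the lookahead follows R
    return None                 # no alternative fits: a syntax error

print(choose("+"), choose(")"), choose("*"))
```

The empty alternative is chosen only when the lookahead is in Follow(R), which is why the Follow set is needed in addition to the First set.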

Example:

Input :
E -> TR
R -> +T R | #
T -> F Y
Y -> *F Y | #
F -> (E) | i


Output :
First(E)= { (, i, }
First(R)= { +, #, }
First(T)= { (, i, }
First(Y)= { *, #, }
First(F)= { (, i, }

-----------------------------------------------

Follow(E) = { $, ), }
Follow(R) = { $, ), }
Follow(T) = { +, $, ), }
Follow(Y) = { +, $, ), }
Follow(F) = { *, +, $, ), }

The functions follow and followfirst are both involved in the calculation of the Follow set of a given Non-Terminal. The Follow set of the start symbol always contains "$". The calculation of Follow then falls under three broad cases:

If a Non-Terminal on the R.H.S. of any production is followed immediately by a Terminal, then that Terminal is included in the Follow set of the Non-Terminal.
If a Non-Terminal on the R.H.S. of any production is followed immediately by another Non-Terminal, then the First set of that second Non-Terminal is included in the Follow set of the original Non-Terminal. If that First set contains epsilon, i.e. "#", move on to the next symbol in the production and repeat.
Note: "#" is never included in the Follow set of any Non-Terminal.
If the end of a production is reached while calculating Follow, then the Follow set of the Non-Terminal also includes the Follow set of the Non-Terminal on the L.H.S. of that production. This can easily be implemented by recursion.
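The three cases above can be sketched as follows. This is a minimal sketch under stated assumptions, not the article's implementation: it iterates until no set grows instead of recursing, it stores each production body as a string of single-character symbols, and the names GRAMMAR, first_of, compute_first and compute_follow are hypothetical. The grammar is the one from the example, with "#" as epsilon and "$" as the end marker.

```python
EPS, END = "#", "$"

GRAMMAR = {                      # the first key is the start symbol
    "E": ["TR"],
    "R": ["+TR", "#"],
    "T": ["FY"],
    "Y": ["*FY", "#"],
    "F": ["(E)", "i"],
}

def first_of(symbols, first):
    """First set of a symbol string; contains '#' if the whole string can vanish."""
    out = set()
    for sym in symbols:
        if sym == EPS:
            continue
        if sym not in GRAMMAR:   # terminal: nothing after it matters
            out.add(sym)
            return out
        out |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return out
    out.add(EPS)                 # every symbol (if any) could vanish
    return out

def compute_first():
    first = {nt: set() for nt in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nt, alts in GRAMMAR.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(first):
    follow = {nt: set() for nt in GRAMMAR}
    follow[next(iter(GRAMMAR))].add(END)     # start symbol always gets '$'
    changed = True
    while changed:
        changed = False
        for lhs, alts in GRAMMAR.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in GRAMMAR:
                        continue
                    rest = first_of(alt[i + 1:], first)
                    new = rest - {EPS}       # cases 1 and 2; '#' is never added
                    if EPS in rest:          # case 3: the rest can vanish
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

first = compute_first()
follow = compute_follow(first)
for nt in GRAMMAR:
    print(f"Follow({nt}) = {sorted(follow[nt])}")
```

For this grammar the computed sets agree with the First and Follow sets listed in the example output. Iterating to a fixed point avoids the need to order the recursive calls, at the cost of a few extra passes over the productions.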

Assumptions

Epsilon is represented by '#'.
Productions are of the form A=B, where 'A' is a single Non-Terminal and 'B' can be any combination of Terminals and Non-Terminals.
L.H.S. of the first production rule is the start symbol.
Grammar is not left recursive.
Each production of a non-terminal is entered on a different line.
Only Upper Case letters are Non-Terminals and everything else is a terminal.
Do not use '!' or '$' as they are reserved for special purposes.
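As an illustration of these assumptions, the following sketch reads productions in the A=B form and rejects input that breaks the rules. The function names parse_productions and is_nonterminal are hypothetical, and the check for left recursion is left out.

```python
def is_nonterminal(symbol):
    """Only upper-case letters are Non-Terminals; everything else is a terminal."""
    return symbol.isupper()

def parse_productions(lines):
    """Turn lines such as 'E=TR' into (start_symbol, grammar dict)."""
    grammar = {}
    for raw in lines:
        line = raw.replace(" ", "")
        if not line:
            continue                          # skip blank lines
        lhs, sep, rhs = line.partition("=")
        if sep != "=" or len(lhs) != 1 or not is_nonterminal(lhs) or not rhs:
            raise ValueError(f"malformed production: {raw!r}")
        if "!" in rhs or "$" in rhs:
            raise ValueError(f"reserved symbol in production: {raw!r}")
        grammar.setdefault(lhs, []).append(rhs)
    if not grammar:
        raise ValueError("no productions given")
    start = next(iter(grammar))               # L.H.S. of the first production
    return start, grammar

start, grammar = parse_productions(["E=TR", "R=+TR", "R=#"])
print(start, grammar)
```

Each alternative of a Non-Terminal is given on its own line, so the two lines for R end up as two entries in the same list.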

Below is the implementation:

Output:

First(X) = { q, n, o, p, #, m, }
First(T) = { q, #, }
First(S) = { p, #, }
First(R) = { o, p, q, #, }
-----------------------------------------------
Follow(X) = { $, }
Follow(T) = { n, m, }
Follow(S) = { $, q, m, }
Follow(R) = { m, }

Conclusion

Understanding First and Follow sets is crucial for building efficient parsers in compiler design. These sets let us determine which production rules apply during parsing by identifying the potential starting and following symbols for each non-terminal. By calculating First and Follow sets, we ensure accurate syntax analysis, helping to create parsers that are both predictive and unambiguous, which is essential for building reliable compilers.

URL: https://www.geeksforgeeks.org/compiler-design/program-calculate-first-follow-sets-given-grammar/