To work with formal languages and string patterns, it is essential to understand regular expressions, regular grammar, and regular languages. These concepts form the foundation of automata theory, compiler design, and text processing.
Regular expressions are symbolic notations used to define search patterns in strings. They describe regular languages and are commonly used in tasks such as validation, searching, and parsing.
A regular expression represents a regular language if it follows these rules:
| Description | Regular Expression | Regular Languages |
|---|---|---|
| Set of vowels | `(a \| e \| i \| o \| u)` | {a, e, i, o, u} |
| 'a' followed by 0 or more 'b' | ab* | {a, ab, abb, abbb, abbbb, ...} |
| Any number of vowels followed by any number of consonants | [aeiou]*[bcdfghjklmnpqrstvwxyz]* | {ε, a, aou, aiou, b, abcd, ...} (ε represents empty string) |
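The table rows can be tried out directly. The sketch below uses Python's standard `re` module; the helper name `in_language` and the sample words are illustrative assumptions, not part of the original text. `fullmatch` is used because membership in a language concerns the whole string, not a substring.

```python
import re

# 'a' followed by zero or more 'b' (the ab* row of the table).
A_THEN_BS = re.compile(r"ab*")
# Any number of vowels followed by any number of consonants.
VOWELS_THEN_CONSONANTS = re.compile(r"[aeiou]*[bcdfghjklmnpqrstvwxyz]*")


def in_language(pattern: re.Pattern, text: str) -> bool:
    """Return True when the entire string belongs to the pattern's language."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for word in ["a", "abbb", "ba", ""]:
        print(repr(word), in_language(A_THEN_BS, word))
    for word in ["", "aou", "abcd", "ba"]:
        print(repr(word), in_language(VOWELS_THEN_CONSONANTS, word))
```

Note that the empty string belongs to the third language but not to the second, because `ab*` always requires the leading `a`.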
A regular grammar is a formal grammar that generates regular languages. It consists of:
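- Terminals: the symbols that make up the strings (e.g., a, b).
- Non-terminals: the variables used while deriving strings (e.g., S, A).
- Production rules: in a right-linear grammar, every rule has the form A -> aB or A -> a.

Example of a right-linear grammar:

S -> aS | bS | ε

In a left-linear grammar, every rule instead has the form A -> Ba or A -> a.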
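To get a feel for how productions generate strings, the following minimal sketch enumerates every string up to a given length that a right-linear grammar derives. The dictionary encoding of the grammar and the function name `generate` are assumptions made for this illustration; only the grammar S -> aS | bS | ε comes from the text.

```python
from collections import deque

# Right-linear grammar S -> aS | bS | ε.
# Each production is (terminal string, next non-terminal or None).
GRAMMAR = {"S": [("a", "S"), ("b", "S"), ("", None)]}


def generate(grammar, start, max_len):
    """Return every string of length <= max_len derivable from `start`."""
    results = set()
    seen = {("", start)}          # sentential forms already queued
    queue = deque(seen)
    while queue:
        prefix, nonterminal = queue.popleft()
        for terminals, nxt in grammar[nonterminal]:
            word = prefix + terminals
            if len(word) > max_len:
                continue          # prune: too long to be of interest
            if nxt is None:
                results.add(word)  # no non-terminal left: derivation is done
            elif (word, nxt) not in seen:
                seen.add((word, nxt))
                queue.append((word, nxt))
    return sorted(results, key=lambda w: (len(w), w))


if __name__ == "__main__":
    print(generate(GRAMMAR, "S", 2))
    # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Because a right-linear sentential form is always a terminal prefix followed by at most one non-terminal, a breadth-first search over (prefix, non-terminal) pairs is enough.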
Regular languages are the class of languages that can be represented using finite automata, regular expressions, or regular grammar. These languages have predictable patterns and are computationally efficient to recognize.
1. Closure Properties
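Regular languages are closed under union, concatenation, and Kleene star, as well as intersection and complement: applying any of these operations to regular languages yields another regular language.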
2. Finite Automata
Every regular language is recognized by some finite automaton, and DFAs and NFAs recognize exactly the same class of languages.
3. Decision Problems
Membership, emptiness, and equivalence are all decidable for regular languages.
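The finite-automaton and decision-problem properties can be sketched with a small DFA simulator. The dictionary layout, the state names `q0` and `q1`, and the convention that a missing transition means an implicit dead state are all assumptions of this example; the language recognized is that of `ab*`.

```python
# DFA for the language of ab*: q0 is the start state, q1 is accepting.
DFA = {
    "start": "q0",
    "accepting": {"q1"},
    "delta": {
        ("q0", "a"): "q1",
        ("q1", "b"): "q1",
    },
}


def accepts(dfa, word):
    """Membership test: run the DFA once over the input."""
    state = dfa["start"]
    for symbol in word:
        state = dfa["delta"].get((state, symbol))
        if state is None:  # missing transition = implicit dead state
            return False
    return state in dfa["accepting"]


def is_empty(dfa):
    """Emptiness test: True when no accepting state is reachable."""
    reachable, stack = {dfa["start"]}, [dfa["start"]]
    while stack:
        current = stack.pop()
        for (source, _symbol), target in dfa["delta"].items():
            if source == current and target not in reachable:
                reachable.add(target)
                stack.append(target)
    return reachable.isdisjoint(dfa["accepting"])


if __name__ == "__main__":
    print(accepts(DFA, "abb"), accepts(DFA, "ba"), is_empty(DFA))
    # True False False
```

Membership takes one pass over the input, and emptiness reduces to graph reachability, which is why both problems are cheap for regular languages.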
Note: Two regular expressions are equivalent if they generate the same language. For example, (a+b*)* and (a+b)* generate the same language: every string generated by (a+b*)* is also generated by (a+b)*, and vice versa.
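A quick way to gain confidence in such an equivalence is to compare both expressions on every string up to some length. The sketch below does this with Python's `re` module, where union is written `|` instead of `+`. The function names and the length bound are assumptions for illustration, and a bounded search is only evidence, not a proof; an exact procedure would compare the minimized DFAs of the two expressions.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for length in range(max_len + 1):
        for letters in product(alphabet, repeat=length):
            yield "".join(letters)


def first_difference(regex_a, regex_b, alphabet, max_len):
    """Return the first string on which the two regexes disagree, else None."""
    pattern_a, pattern_b = re.compile(regex_a), re.compile(regex_b)
    for word in words(alphabet, max_len):
        in_a = pattern_a.fullmatch(word) is not None
        in_b = pattern_b.fullmatch(word) is not None
        if in_a != in_b:
            return word
    return None


if __name__ == "__main__":
    # (a+b*)* versus (a+b)*: no disagreement up to length 6.
    print(first_difference(r"(?:a|b*)*", r"(?:a|b)*", "ab", 6))  # None
    # a*b* versus (a+b)*: these differ, for example on "ba".
    print(first_difference(r"a*b*", r"(?:a|b)*", "ab", 6))       # ba
```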
| Aspect | Regular Expressions | Regular Grammar | Regular Languages |
|---|---|---|---|
| Definition | Pattern representation of strings | Rule-based generation of strings | Language class described by regex and grammar |
| Representation | Symbols and operators | Terminals, non-terminals, production rules | Finite automata, regex, or grammar |
| Use Case | Text processing, validation | Syntax generation for compilers | Language recognition |
Which one of the following languages over the alphabet {0,1} is described by the regular expression?
`(0+1)*0(0+1)*0(0+1)*`
(A) The set of all strings containing the substring 00.
(B) The set of all strings containing at most two 0s.
(C) The set of all strings containing at least two 0s.
(D) The set of all strings that begin and end with either 0 or 1.
Solution:
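- Option A: Every string must contain the substring 00. However, the string 10101 is part of the language but does not contain 00. So, Option A is incorrect.
- Option B: Strings may contain at most two 0s. But the string 00000 is part of the language, which violates this condition. So, Option B is incorrect.
- Option C: Strings must contain at least two 0s. The regular expression fixes two 0s and allows any binary string before, between, and after them, so it generates exactly these strings. Hence, Option C is correct.
- Option D: Strings must begin and end with either 0 or 1. Every non-empty binary string does so, including strings such as 1 and 11 that are not part of the language. So, Option D is incorrect.

Correct Answer: (C)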
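The counterexamples used to rule out the wrong options can be checked mechanically. This is a small sketch using Python's `re` module with the expression (0+1)\*0(0+1)\*0(0+1)\* rewritten in Python syntax; the names `PATTERN`, `in_language`, and `COUNTEREXAMPLES`, and the choice of the string 1 for Option D, are assumptions of this example.

```python
import re

# (0+1)*0(0+1)*0(0+1)* in Python syntax, where union (+) is written |.
PATTERN = re.compile(r"(?:0|1)*0(?:0|1)*0(?:0|1)*")


def in_language(word):
    """Return True when the whole string matches the expression."""
    return PATTERN.fullmatch(word) is not None


# One witness string per rejected option.
COUNTEREXAMPLES = {
    "A": "10101",  # in the language, yet contains no substring 00
    "B": "00000",  # in the language, yet has more than two 0s
    "D": "1",      # begins and ends with 1, yet is not in the language
}

if __name__ == "__main__":
    for option, word in COUNTEREXAMPLES.items():
        print(option, word, in_language(word), word.count("0"))
```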
Which of the following languages is generated by the given grammar?
S -> aS | bS | ε
(A) {aⁿbᵐ | n,m ≥ 0}
(B) {w ∈ {a,b}* | w has an equal number of a's and b's}
(C) {aⁿ | n ≥ 0} ∪ {bⁿ | n ≥ 0} ∪ {aⁿbⁿ | n ≥ 0}
(D) {a,b}*
Solution:
- Option A: This describes n a's followed by m b's. However, the grammar can produce strings like ba (S -> bS -> baS -> ba), which violates the pattern. So, Option A is incorrect.
- Option B: This requires an equal number of a's and b's. But the string b (S -> bS -> b) does not satisfy this condition. So, Option B is incorrect.
- Option C: This allows only a's, only b's, or n a's followed by n b's. However, the string ba does not fit this pattern. So, Option C is incorrect.
- Option D: This is the set of all strings of a's and b's in any order. The grammar can generate all such strings, including ba. Hence, Option D is correct.

Correct Answer: (D)
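The derivations quoted in the solution follow a fixed pattern: each symbol of the target string is produced by one application of S -> aS or S -> bS, and a final S -> ε removes the non-terminal. The function below, whose name `derivation` is an assumption of this sketch, lists the sentential forms for any string over {a, b}.

```python
def derivation(word):
    """Return the sentential forms deriving `word` in S -> aS | bS | ε."""
    if any(symbol not in "ab" for symbol in word):
        raise ValueError("the grammar only generates strings over {a, b}")
    steps = ["S"]
    for i in range(1, len(word) + 1):
        # Apply S -> aS or S -> bS to emit the i-th symbol.
        steps.append(word[:i] + "S")
    # Apply S -> ε to finish; show ε explicitly for the empty string.
    steps.append(word if word else "ε")
    return steps


if __name__ == "__main__":
    print(" -> ".join(derivation("ba")))  # S -> bS -> baS -> ba
```

Since such a derivation exists for every string over {a, b}, the grammar generates all of {a,b}\*.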
The regular expression 0*(10*)* denotes the same set as:
(A) `(1*0)*1*`
(B) `0 + (0 + 10)*`
(C) `(0 + 1)*10(0 + 1)*`
(D) None of these
Solution:
Two regular expressions are equivalent if the languages they generate are the same.
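- Option A: This generates every string over {0, 1}, exactly as 0*(10*)* does, making them equivalent. So, Option A is correct.
- Option B: This cannot generate the string 1, which 0*(10*)* can generate. So, Option B is incorrect.
- Option C: Every string it generates has 10 as a substring, but a string generated by 0*(10*)* may or may not have 10 as a substring. So, Option C is incorrect.

Correct Answer: (A)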
The regular expression for the language over the alphabet {a, b} in which no two a's are adjacent is:
(A) `(b + ab)* + (b + ab)*a`
(B) `a(b + ba)* + (b + ba)*`
(C) Both (A) and (B)
(D) None of the above
Solution:
The language can be expressed as:
L = {ε, a, b, bb, ab, aba, ba, bab, baba, abab, ...}
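- Option A: This uses ab as the building block, so every a inside the starred part is followed by a b. (b + ab)* covers the empty string and strings ending with b, while (b + ab)*a covers strings ending with a.
- Option B: This uses ba as the building block, so every a inside the starred part is preceded by a b. (b + ba)* covers the empty string and strings starting with b, while a(b + ba)* covers strings starting with a.

Both expressions describe the language.

Correct Answer: (C)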
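Both options can be compared against the plain-language definition, "no two a's are adjacent", on all short strings. The sketch below writes the two expressions in Python `re` syntax (union as `|`); the names `OPTION_A`, `OPTION_B`, and `disagreements`, and the length bound of 8, are assumptions chosen for this illustration, and the bounded search supports the answer without proving it.

```python
import re
from itertools import product

# (b + ab)* + (b + ab)*a and a(b + ba)* + (b + ba)* in Python syntax.
OPTION_A = re.compile(r"(?:b|ab)*|(?:b|ab)*a")
OPTION_B = re.compile(r"a(?:b|ba)*|(?:b|ba)*")


def disagreements(pattern, max_len):
    """List strings where the regex and the 'no aa' rule disagree."""
    bad = []
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            word = "".join(letters)
            matches = pattern.fullmatch(word) is not None
            no_double_a = "aa" not in word
            if matches != no_double_a:
                bad.append(word)
    return bad


if __name__ == "__main__":
    print(disagreements(OPTION_A, 8))  # []
    print(disagreements(OPTION_B, 8))  # []
```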