Regular expressions, often called regex or regexp, are a powerful tool used to search, match, and manipulate text. They are essentially patterns made up of characters and symbols that allow you to define a search pattern for text.
In this article, we will look at the basic properties of regular expressions, how they work, and how they help in real-world applications.
A regular expression is a way of representing a regular language; regular expressions give the algebraic description of regular languages. They define exactly the same languages that the various forms of finite automata can describe. Regular expressions also offer something that finite automata do not: a declarative way to express the strings that we want to accept. They serve as input for many systems and are used for string matching in many languages (Java, Python, etc.).
Example: Lexical-analyzer generators, such as Lex or Flex.
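As a rough illustration of how a lexical analyzer can be driven by regular expressions, the sketch below builds a tiny tokenizer with Python's `re` module. The token names and patterns in `TOKEN_SPEC` and the `tokenize` helper are hypothetical, chosen only for illustration; they are not taken from Lex or Flex.

```python
import re

# Hypothetical token classes for illustration; not from any real Lex/Flex spec.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),
]

# One combined pattern: a union of named groups, one per token class.
MASTER = re.compile("|".join(f"(?P<{name}>{regex})" for name, regex in TOKEN_SPEC))


def tokenize(text):
    """Split text into (token_class, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if m.lastgroup != "SKIP":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("x = 42 + y1"))
# [('IDENT', 'x'), ('OP', '='), ('NUMBER', '42'), ('OP', '+'), ('IDENT', 'y1')]
```

Each token class is described declaratively by a pattern, which is the same idea a generator such as Lex or Flex applies at a larger scale.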
The widely used operators in regular expressions are Kleene closure(∗), concatenation(.), and Union(+).
Example: let ∑ = {a, b}, and let r be a regular expression over these symbols. Each expression r denotes a regular set (a language):
| Regular expression | Regular set |
|---|---|
| ∅ | { } |
| ε | {ε} |
| a* | {ε, a, aa, aaa, ...} |
| a + b | {a, b} |
| a.b | {ab} |
| a* + ba | {ε, a, aa, aaa, ..., ba} |
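The table rows can be reproduced with Python's `re` module, as a minimal sketch. Note the change of notation: in Python, union is written `|` and concatenation is written by juxtaposition, while `+` means "one or more". The dictionary `TEXTBOOK_TO_PYTHON` and the helper `language_up_to` are names invented for this example.

```python
import re
from itertools import product

# Textbook notation on the left, an equivalent Python `re` pattern on the right.
TEXTBOOK_TO_PYTHON = {
    "a*": r"a*",
    "a+b": r"a|b",
    "a.b": r"ab",
    "a*+ba": r"a*|ba",
}


def language_up_to(pattern, alphabet="ab", max_len=3):
    """List every string over `alphabet` of length <= max_len that fully matches."""
    compiled = re.compile(pattern)
    matches = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if compiled.fullmatch(s):
                matches.append(s)
    return matches


for textbook, pattern in TEXTBOOK_TO_PYTHON.items():
    print(textbook, "->", language_up_to(pattern))
# a* -> ['', 'a', 'aa', 'aaa']
# a+b -> ['a', 'b']
# a.b -> ['ab']
# a*+ba -> ['', 'a', 'aa', 'ba', 'aaa']
```

The empty string `''` in the output plays the role of the empty-string symbol in the table. Infinite sets such as the language of a* are only listed up to `max_len`.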
The union of two regular languages L1 and L2, written L1 ∪ L2, is also regular. It is the set of strings that are in L1, in L2, or in both.
Example:
L1 = (1+0).(1+0) = {00, 10, 11, 01} and
L2 = {ε, 100}
then L1 ∪ L2 = {ε, 00, 10, 11, 01, 100}.
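For finite languages, the union example above can be checked directly with Python sets. This is only a sketch: the empty string `""` stands in for the empty-string symbol, and the variable names are chosen for this example.

```python
from itertools import product

# L1 = (1+0).(1+0): every string of length two over {0, 1}.
L1 = {"".join(p) for p in product("10", repeat=2)}
# L2 contains the empty string and "100".
L2 = {"", "100"}

union = L1 | L2
print(sorted(union, key=lambda s: (len(s), s)))
# ['', '00', '01', '10', '11', '100']
```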
The concatenation of two regular languages L1 and L2, written L1.L2, is also regular. It is the set of strings formed by taking any string in L1 and concatenating it with any string in L2.
Example:
L1 = { 0,1 } and L2 = { 00, 11} then L1.L2 = {000, 011, 100, 111}.
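The same concatenation can be sketched with a set comprehension over finite languages. The helper name `concat` is an assumption made for this example.

```python
def concat(A, B):
    """Concatenation of two finite languages: every x from A followed by every y from B."""
    return {x + y for x in A for y in B}


L1 = {"0", "1"}
L2 = {"00", "11"}

print(sorted(concat(L1, L2)))
# ['000', '011', '100', '111']
```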
If L1 is a regular language, then its Kleene closure L1* is also regular. It is the set of strings formed by concatenating zero or more strings taken from L1, where the same string may be used any number of times.
Example:
If L1 = {0, 1}, then L1* = {ε, 0, 1, 00, 01, 10, 11, ...}, that is, all possible strings over the symbols 0 and 1, including the empty string.
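Because a Kleene closure is usually infinite, a program can only enumerate it up to some length bound. The sketch below does that for a finite language; `kleene_star` and its `max_len` parameter are hypothetical names introduced for illustration.

```python
def kleene_star(L, max_len):
    """Return all strings of L* whose length is at most max_len."""
    result = {""}      # zero repetitions always gives the empty string
    frontier = {""}    # strings discovered in the previous round
    while frontier:
        # Append one more string from L to each newly found string.
        extended = {w + x for w in frontier for x in L if len(w + x) <= max_len}
        frontier = extended - result
        result |= frontier
    return result


print(sorted(kleene_star({"0", "1"}, 2), key=lambda s: (len(s), s)))
# ['', '0', '1', '00', '01', '10', '11']
```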
Kleene closure is a unary operator, while union (+) and concatenation (.) are binary operators.
If r1 and r2 are regular expressions (RE), then r1*, r1 + r2 and r1.r2 are also regular expressions.
If r1, r2, r3 are RE, then
i.) r1+ (r2+r3) = (r1+r2) +r3
ii.) r1.(r2.r3) = (r1.r2).r3
The associativity property does not apply to Kleene closure (*) because it is a unary operator.
In the case of the union operator,
r + ∅ = ∅ + r = r,
Therefore, ∅ is the identity element for the union operator.
In the case of the concatenation operator:
r.ε = ε.r = r
Therefore, ε is the identity element for the concatenation operator (.).
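The associativity and identity properties can be spot-checked on small finite languages, treating union as set union and concatenation as pairwise string joining. This is a sanity check on sample data, not a proof; the sample languages and the `concat` helper are assumptions made for this example.

```python
def concat(A, B):
    """Concatenation of two finite languages."""
    return {x + y for x in A for y in B}


# Three small sample languages (arbitrary, for illustration only).
R1, R2, R3 = {"a", "ab"}, {"b", ""}, {"ba"}

# Associativity of union and of concatenation.
assert (R1 | R2) | R3 == R1 | (R2 | R3)
assert concat(concat(R1, R2), R3) == concat(R1, concat(R2, R3))

EMPTY = set()     # the empty language: no strings at all
EPSILON = {""}    # the language containing only the empty string

# The empty language is the identity element for union.
assert R1 | EMPTY == EMPTY | R1 == R1
# The language containing only the empty string is the identity for concatenation.
assert concat(R1, EPSILON) == concat(EPSILON, R1) == R1
# The empty language is not an identity for concatenation: it wipes out the result.
assert concat(R1, EMPTY) == EMPTY
```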
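If r1, r2 are RE, then r1 + r2 = r2 + r1, so union is commutative. In general r1.r2 ≠ r2.r1, so concatenation is not commutative.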
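If r1, r2, r3 are regular expressions, then concatenation distributes over union: (r1 + r2).r3 = r1.r3 + r2.r3 and r1.(r2 + r3) = r1.r2 + r1.r3.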
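Two properties commonly stated for pairs and triples of regular expressions are commutativity of union (concatenation is not commutative in general) and distributivity of concatenation over union. The sketch below spot-checks them on small finite languages; the sample sets and the `concat` helper are chosen only for illustration.

```python
def concat(A, B):
    """Concatenation of two finite languages."""
    return {x + y for x in A for y in B}


# Sample languages (arbitrary, for illustration only).
R1, R2, R3 = {"a"}, {"b", "bb"}, {"c", ""}

# Union is commutative.
assert R1 | R2 == R2 | R1

# Concatenation is not commutative in general: order of the operands matters.
print(sorted(concat(R1, R2)))  # ['ab', 'abb']
print(sorted(concat(R2, R1)))  # ['ba', 'bba']
assert concat(R1, R2) != concat(R2, R1)

# Concatenation distributes over union from the right and from the left.
assert concat(R1 | R2, R3) == concat(R1, R3) | concat(R2, R3)
assert concat(R1, R2 | R3) == concat(R1, R2) | concat(R1, R3)
```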
There are many identities for regular expressions. Let p, q and r be regular expressions.
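As an illustration, the sketch below checks a handful of well-known identities on sample languages, comparing both sides for all strings up to a fixed length. The identities chosen here are standard textbook ones and are not necessarily the list this article had in mind; the helpers `concat` and `star`, the bound `N`, and the sample sets `P`, `Q`, `R` are assumptions made for this example.

```python
def concat(A, B, max_len):
    """Concatenation of two languages, keeping only strings of length <= max_len."""
    return {x + y for x in A for y in B if len(x + y) <= max_len}


def star(L, max_len):
    """Kleene closure of L, restricted to strings of length <= max_len."""
    result = {""}
    frontier = {""}
    while frontier:
        frontier = concat(frontier, L, max_len) - result
        result |= frontier
    return result


N = 4                                   # compare languages up to this length
P, Q, R = {"a"}, {"b"}, {"ab", "b"}     # sample languages for p, q and r
EMPTY, EPSILON = set(), {""}

assert star(EMPTY, N) == EPSILON                         # closure of the empty language
assert star(EPSILON, N) == EPSILON                       # closure of the empty string
assert R | R == R                                        # r + r = r
assert concat(EMPTY, R, N) == EMPTY                      # the empty language annihilates
assert star(star(R, N), N) == star(R, N)                 # (r*)* = r*
assert concat(star(R, N), star(R, N), N) == star(R, N)   # r*.r* = r*
# (p + q)* = (p*.q*)*
assert star(P | Q, N) == star(concat(star(P, N), star(Q, N), N), N)
```

Agreement up to length `N` is evidence, not a proof; a real equivalence check would compare the minimal automata of the two expressions.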
In conclusion, regular expressions are a versatile and powerful tool for working with text. They allow you to search, match, and manipulate patterns efficiently, making them invaluable in tasks like data validation, text searching, and automated editing. Mastering regular expressions can greatly enhance your efficiency and problem-solving capabilities. The flexibility and power they offer make them an essential skill in many fields.