TutorChase logo
Login
AQA A-Level Computer Science

13.2.5 Regular language

Regular languages are a fundamental class of formal languages that can be described using regular expressions or recognised by finite state machines. They form the basis for language processing and are widely used in computer science.

What is a regular language?

A regular language is a formal language that can be defined in either of two equivalent ways:

  • It can be accepted by a finite state machine (FSM).

  • It can be described using a regular expression.

This means that if a language can be represented by a regular expression, then there exists a finite state machine that will accept exactly the same strings that the regular expression defines—and vice versa.

Regular languages are used in areas such as text processing, programming language design, lexical analysis, network protocols, and input validation. They are also foundational for understanding more complex language classes.

Regular languages are defined over an alphabet, usually represented by the Greek letter Σ (sigma), which is simply a set of symbols such as {0,1} or {a,b}.

Two definitions of regular languages

Recognised by a finite state machine

Take your grades to the next level!

UPGRADING TO PREMIUM UNLOCKS
AI Tutor
AI-powered study assistant
instant feedback and guidance
Predicted Papers
Examiner-style predicted papers
based on recent exam trends
Practice Questions
All exam practice questions
by topic for each subject
Study Notes
All detailed revision notes
written by expert teachers
Cheat Sheets
Quick revision summaries
perfect for last-minute review
Past Papers
Complete collection
of practice and past exam papers
Email
Password
Confirm Password
Already have an account?

Practice Questions

FAQ

Yes, a language can still be regular even if it contains infinitely many strings. Regularity is not determined by the size of the language but by the structure and recognisability of its strings. If the language has a pattern that can be expressed using a regular expression or accepted by a finite state machine (FSM), then it is regular. FSMs are capable of handling infinite languages as long as the language can be described with a finite number of rules and transitions. For example, the language consisting of all strings over {0,1} that end with a ‘1’ is infinite, yet it is regular because an FSM can accept it using a simple structure with a few states. The key is that regular languages do not need to “remember” any unbounded information about input strings—so infinite repetition of a valid pattern is perfectly acceptable in regular languages.

Regular languages handle optional patterns or characters using constructs in regular expressions and equivalent FSM structures. In regular expressions, the ? operator allows a character or subpattern to be optional, meaning it can appear zero or one time. For example, the expression a?b matches both ab and b. FSMs handle this by introducing transitions that either accept or bypass certain inputs. An FSM would use branches in its transition diagram where one path reads the optional input and the other skips it, allowing both scenarios to be accepted. These optional patterns do not require memory, so they fit within the limitations of FSMs. This is commonly used in languages where certain characters may or may not appear depending on context. Because the presence or absence of the optional component can be captured in finite transitions, this kind of variability is well within the power of regular languages and easily implemented in both FSMs and regular expressions.

Regular languages are unable to represent nested or recursive patterns because they lack the ability to store or compare arbitrary levels of structure or depth. Nesting typically requires a machine to remember what has come before and match it with what follows—such as matched pairs of brackets or repeated mirrored patterns like palindromes. Finite state machines, which define regular languages, do not have any memory mechanism beyond their finite set of states. This means they cannot count or track how many levels deep a nested structure is, nor can they reverse or match earlier parts of a string. In contrast, pushdown automata, which recognise context-free languages, use a stack to keep track of these deeper relationships. A classic example is matching opening and closing parentheses: the machine must remember every opening parenthesis and match it with the correct closing one. Since FSMs can’t do this, regular languages are inherently unsuitable for any task requiring recursion or nesting.

Regular languages can handle patterns where string lengths are divisible by a fixed number, but they cannot recognise strings of prime length. When it comes to divisibility, FSMs can be designed to keep track of the number of input symbols modulo a fixed integer. For example, an FSM can recognise binary strings whose lengths are divisible by 3 by cycling through three states (mod 0, mod 1, mod 2) and accepting only in the state corresponding to mod 0. However, recognising strings of prime length is a different matter—it requires determining whether the length of a string corresponds to a prime number, which is a property that cannot be captured by any fixed number of states. There is no finite pattern or cycle that can identify all primes and only primes. Since finite state machines cannot count or perform arbitrary arithmetic checks on input lengths beyond fixed periodicity, this task is outside the scope of regular languages.

The size of the alphabet used in a language does not determine whether the language is regular. Regularity depends on the structure of the strings in the language and whether that structure can be described by a regular expression or accepted by an FSM. Whether the alphabet is binary, such as {0,1}, or contains many symbols, such as {a, b, c, d, e}, the key question is whether the pattern of acceptable strings requires counting, matching, recursion, or nesting. An FSM can be constructed for any finite alphabet, but the number of transitions needed may increase with alphabet size because each state must define transitions for all symbols in the alphabet. This can make the FSM more complex, but not fundamentally more powerful or less capable. Therefore, a regular language can be defined over any finite alphabet, small or large, as long as the rules governing string formation stay within the limitations of FSMs and regular expressions.

Hire a tutor

Please fill out the form and we'll find a tutor for you.

1/2
Your details
Alternatively contact us via
WhatsApp, Phone Call, or Email