AQA A-Level Computer Science

13.2.3 Regular expressions – syntax and usage

Regular expressions offer a concise and powerful way to define patterns within strings, often used in string matching, validation, and parsing tasks.

What is a regular expression?

A regular expression, or regex, is a sequence of characters that defines a search pattern. This pattern represents a set of strings, rather than listing each string explicitly. Regular expressions are used to describe and match strings that conform to specific formats.

They are particularly useful in situations involving:

  • Input validation (e.g. checking if an email is valid)

  • Text searching and filtering (e.g. finding all words ending in "ing")

  • Syntax checking (e.g. determining if a binary string ends in '01')

  • Lexical analysis in compilers and interpreters

Regular expressions describe a class of languages known as regular languages. These are the simplest class of formal languages, and they are exactly the languages that can be recognised by finite state machines (FSMs).

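For instance, the regular expression a*b represents the set of strings made up of zero or more a characters followed by a single b, such as b, ab and aaab.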
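As an illustration, a*b can be tried out in Python. This is a minimal sketch, assuming Python's built-in re module; the helper name in_language is hypothetical and chosen only for this example. It uses re.fullmatch so that the whole string has to fit the pattern, which mirrors the formal-language meaning of a regular expression.

```python
import re

# The pattern from the text: zero or more 'a' characters followed by one 'b'.
A_STAR_B = re.compile(r"a*b")


def in_language(s: str) -> bool:
    """Return True if the whole string s is described by a*b."""
    # fullmatch succeeds only if the entire string fits the pattern.
    return A_STAR_B.fullmatch(s) is not None


# Strings made only of a's followed by exactly one b are accepted;
# anything else (no b, b in the wrong place, two b's) is rejected.
for candidate in ["b", "ab", "aaab", "a", "ba", "abb"]:
    print(candidate, in_language(candidate))
```

The first three candidates are accepted and the last three are rejected, because they either lack the final b or have characters after it.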



FAQ

Yes, regular expressions can be designed to check if a string contains a specific sequence at any position. To achieve this, the pattern must allow for arbitrary characters both before and after the target sequence. For instance, to check if the sequence 101 occurs anywhere in a binary string, the regular expression (0|1)*101(0|1)* would be used. Here, (0|1)* allows for any combination of binary digits, including the empty string, on either side of the 101. This pattern does not require the sequence to be at the start or end of the string. It simply ensures that 101 is present somewhere within the string, surrounded by any other valid binary input. This is a flexible and commonly used approach in pattern matching tasks where the sequence of interest could be embedded in a variety of contexts. It's essential to understand this technique for building general search patterns.
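The pattern (0|1)*101(0|1)* can be checked with a short Python sketch. It assumes the built-in re module; the function name contains_101 is hypothetical. Using re.fullmatch means the surrounding (0|1)* parts also confirm that every character is a binary digit.

```python
import re

# Any binary digits, then the target sequence 101, then any binary digits.
CONTAINS_101 = re.compile(r"(0|1)*101(0|1)*")


def contains_101(bits: str) -> bool:
    """Return True if bits is a binary string with 101 somewhere inside it."""
    # fullmatch forces the whole string to fit the pattern, so a
    # non-binary character anywhere causes the check to fail.
    return CONTAINS_101.fullmatch(bits) is not None


print(contains_101("0010100"))  # 101 is embedded in the middle
print(contains_101("101"))      # both (0|1)* parts match the empty string
print(contains_101("1001"))     # no 101 anywhere
print(contains_101("21012"))    # contains 101 but is not a binary string
```

In everyday code, re.search("101", bits) would find the sequence more simply, but it would not reject characters other than 0 and 1.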

Character classes in regular expressions are used to match any one character from a defined set. They are enclosed in square brackets [] and allow you to specify multiple options for a single character position. For example, [aeiou] matches any lowercase vowel, while [0-9] matches any single digit. If a character in the string matches any character inside the brackets, the class is satisfied. Character classes are useful when a specific set of values is valid for a position. For instance, [01] matches a single binary digit, and [A-Za-z] matches any letter regardless of case. A caret ^ at the beginning of a character class negates it, so [^a] matches any character except a. Character classes are particularly powerful when dealing with inputs that require restricted character options and are often clearer and more compact than writing out full alternations like (a|e|i|o|u).
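The character classes mentioned above can be explored with a few calls to Python's built-in re module. This is only a sketch; the variable names and sample strings are made up for illustration.

```python
import re

# [aeiou] matches one lowercase vowel; findall collects every match.
vowels = re.findall(r"[aeiou]", "regular")

# [^a] is a negated class: any single character except 'a'.
not_a = re.findall(r"[^a]", "banana")

# [01] matches one binary digit; + repeats the class one or more times.
is_binary = re.fullmatch(r"[01]+", "0110") is not None
has_other_digit = re.fullmatch(r"[01]+", "0120") is not None

# [A-Za-z] matches exactly one letter of either case.
is_letter = re.fullmatch(r"[A-Za-z]", "Q") is not None

# A class and the equivalent alternation accept the same single characters.
same_result = all(
    (re.fullmatch(r"[aeiou]", ch) is not None)
    == (re.fullmatch(r"(a|e|i|o|u)", ch) is not None)
    for ch in "abcdefghijklmnopqrstuvwxyz"
)

print(vowels, not_a, is_binary, has_other_digit, is_letter, same_result)
```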

Grouping with parentheses () and character classes with square brackets [] serve very different purposes in regular expressions, even though both involve grouping symbols. Parentheses are used to group entire sequences or sub-patterns and allow the application of repetition operators (*, +, ?) to the entire group. For example, (ab)* matches zero or more repetitions of the string ab. This allows you to apply pattern logic to substrings rather than individual characters. Square brackets, on the other hand, define a character class, which matches any one character from a specific set. For instance, [abc] matches either a, b, or c, but not ab or bc. You cannot use repetition on individual items within a character class as you would with grouped patterns. Misunderstanding this difference can lead to incorrect pattern behaviour, especially when handling complex strings that depend on both specific sequences and sets of valid characters.
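The difference between (ab)* and a character class can be seen side by side in a small Python sketch, assuming the built-in re module. The helper name whole_match is hypothetical.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """Return True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# Grouping: * repeats the two-character sequence "ab" as a unit.
print(whole_match(r"(ab)*", "abab"))  # two repetitions of ab
print(whole_match(r"(ab)*", ""))      # zero repetitions is allowed
print(whole_match(r"(ab)*", "aba"))   # trailing a is not a full "ab"

# Character class: [abc] stands for exactly one character.
print(whole_match(r"[abc]", "b"))     # one character from the set
print(whole_match(r"[abc]", "ab"))    # two characters, so no match

# Repetition can follow a class, but it repeats the whole class:
# each position independently may be a or b, in any order.
print(whole_match(r"[ab]*", "aba"))
```

Note how "aba" is rejected by (ab)* but accepted by [ab]*: the first repeats a fixed sequence, the second repeats a free choice of single characters.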

To make a multi-character sequence optional in a regular expression, you must use parentheses to group the sequence, followed by a ? operator. The ? only applies to the immediately preceding character or group, so without grouping, it only makes the last character optional. For example, to make abc optional, you must write (abc)?. This entire group can now appear zero or one time in the input string. If you just write abc?, only the character c is optional, and the pattern matches either ab or abc. Grouping ensures that the repetition or optionality applies to the full pattern rather than just its last component. This is especially useful when dealing with patterns where certain prefixes, suffixes, or segments of data may or may not be present, but must be treated as a unit for the pattern to make logical sense.
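A short Python sketch makes the contrast between (abc)? and abc? concrete. It assumes the built-in re module, and the helper name accepts is hypothetical.

```python
import re


def accepts(pattern: str, text: str) -> bool:
    """Return True if the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# (abc)? makes the whole sequence optional: either "" or "abc".
grouped = {s: accepts(r"(abc)?", s) for s in ["", "ab", "abc"]}

# abc? makes only the final c optional: either "ab" or "abc".
ungrouped = {s: accepts(r"abc?", s) for s in ["", "ab", "abc"]}

print(grouped)
print(ungrouped)
```

The grouped pattern accepts the empty string and abc but not ab, while the ungrouped pattern accepts ab and abc but not the empty string.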

Regular expressions can match an exact count of characters or sub-patterns when that count is small and fixed, by writing out each occurrence explicitly. To match a binary string that contains exactly two 1s, the pattern must allow two 1s and any number of 0s, but no other 1s. One method uses only concatenation and the * operator: 0*10*10* matches any number of 0s, then a 1, then any number of 0s, then a second 1, then any number of 0s. This allows the two 1s to be positioned anywhere, surrounded by 0s. Provided the pattern has to describe the whole string, it cannot match strings with more than two 1s, because every character other than the two explicit 1s must be a 0. In practical tools that search for a pattern inside a longer string, anchors such as ^ and $ are needed to enforce this whole-string requirement. A regular expression does not count beyond what it explicitly spells out, so it works well with small, known fixed counts but cannot express general counting, such as requiring equal numbers of 0s and 1s.
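The "exactly two 1s" pattern can be tested in Python, which also shows why matching the whole string matters. This is a sketch assuming the built-in re module; the function name has_exactly_two_ones is hypothetical.

```python
import re

# 0s, a 1, 0s, a second 1, 0s: two explicit 1s and nothing else but 0s.
TWO_ONES = re.compile(r"0*10*10*")


def has_exactly_two_ones(bits: str) -> bool:
    """Return True if bits is made of 0s and exactly two 1s."""
    # fullmatch requires the pattern to account for every character.
    return TWO_ONES.fullmatch(bits) is not None


for candidate in ["11", "0101", "100010", "1", "000", "10101"]:
    print(candidate, has_exactly_two_ones(candidate))

# Pitfall: an unanchored search only needs part of the string to fit,
# so it reports a match inside "10101" even though it has three 1s.
unanchored_hit = TWO_ONES.search("10101") is not None
print(unanchored_hit)
```

The unanchored search succeeds on 10101 because the substring 101 fits the pattern, which is why whole-string matching (or the anchors ^ and $) is needed when the count has to be exact.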
