Fixed-point binary representation allows computers to handle fractional numbers efficiently by fixing the position of the binary point, offering a balance between precision and simplicity.
What is fixed-point binary?
Fixed-point binary is a method of representing fractional numbers using binary digits where the position of the binary point is fixed and predetermined. This fixed location distinguishes it from other formats like floating-point, which allow the binary point to move based on an exponent.
Fixed-point binary is especially useful in systems that:
Operate under tight memory or hardware constraints (e.g., embedded systems).
Require fast and efficient numerical calculations.
Lack hardware support for floating-point arithmetic.
Require predictable behaviour in arithmetic operations.
By choosing a fixed location for the binary point, programmers can simplify binary arithmetic and ensure consistent handling of numbers.
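To make the idea concrete, here is a minimal sketch of encoding and decoding an unsigned value. It assumes an 8-bit format with 4 integer bits and 4 fractional bits; the names `FRAC_BITS`, `to_fixed` and `from_fixed` are illustrative and not part of any standard library.

```python
FRAC_BITS = 4            # assumed format: 4 integer bits, 4 fractional bits
SCALE = 1 << FRAC_BITS   # 2**4 = 16, the weight that moves the binary point


def to_fixed(value: float) -> int:
    """Encode a non-negative number as an unsigned 8-bit pattern (4.4 format)."""
    raw = round(value * SCALE)
    if not 0 <= raw <= 0xFF:
        raise OverflowError(f"{value} does not fit in an unsigned 4.4 format")
    return raw


def from_fixed(raw: int) -> float:
    """Decode an unsigned 4.4 bit pattern back to its numeric value."""
    return raw / SCALE


raw = to_fixed(5.75)
print(f"{raw:08b}")     # 01011100, read as 0101.1100
print(from_fixed(raw))  # 5.75
```

The stored value is just an integer; the binary point exists only in how the program agrees to interpret it.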
Comparison with floating-point representation
Floating-point representation, as used in most general-purpose computers, stores numbers with an exponent that shifts the binary point dynamically. This enables a much wider range of values but requires more complex hardware and computational resources.
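Fixed-point, by contrast, stores no exponent: every bit contributes directly to the value, split between integer and fractional parts. This keeps arithmetic simple and predictable, at the cost of a much narrower range.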
FAQ
The binary point is fixed in fixed-point representation to reduce complexity in both hardware and software. Because the position of the binary point is known in advance, addition and subtraction work exactly like integer arithmetic, with no exponent to decode and no operands to align before the calculation. This greatly simplifies the design of arithmetic circuits, especially in embedded systems or digital signal processing where resources are limited. The fixed position means every bit’s contribution to the number is predetermined, allowing for predictable behaviour, which is crucial in real-time systems. Unlike floating-point, where part of the bit pattern is reserved for encoding the exponent, fixed-point dedicates all bits to the number itself (split between integer and fractional parts). This makes operations faster and conserves processing power. The trade-off is a loss of dynamic range, but for many applications, especially those with known input limits, fixed-point is efficient and reliable without the need for additional metadata like an exponent.
Signed fixed-point numbers use the same principle as signed integers: in two’s complement representation, the most significant bit (MSB) acts as the sign bit. In an 8-bit signed fixed-point format, an MSB of 1 means the number is negative; an MSB of 0 means it is zero or positive. The binary point remains in a fixed position, such as after the fourth bit in a 4.4 format. For example, the binary 10010000 in two’s complement with 4 integer and 4 fractional bits represents a negative value. To determine its decimal value, invert the bits (01101111), add one (01110000), interpret the magnitude as usual (0111.0000 = 7.0) and apply a negative sign, giving -7.0. This approach allows subtraction and negative numbers to be handled using the same binary arithmetic rules. The main drawback is that reserving the MSB for the sign halves the range of positive values (a signed 4.4 format covers -8.0 to +7.9375, where an unsigned one covers 0 to 15.9375), but the system benefits from simple, uniform logic for both addition and subtraction.
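The invert-and-add-one procedure can be sketched as follows. This assumes the signed 8-bit 4.4 format used in the answer above; `decode_signed_q4_4` is a hypothetical helper name chosen for illustration.

```python
def decode_signed_q4_4(bits: str) -> float:
    """Decode an 8-bit two's complement pattern with 4 fractional bits."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    raw = int(bits, 2)
    if raw & 0x80:                       # MSB set: the value is negative
        magnitude = (~raw & 0xFF) + 1    # invert the bits, then add one
        return -(magnitude / 16)         # divide by 2**4 to place the binary point
    return raw / 16


print(decode_signed_q4_4("10010000"))  # -7.0
print(decode_signed_q4_4("01011100"))  # 5.75
print(decode_signed_q4_4("11111111"))  # -0.0625
```

The last line shows the smallest negative step: all ones is one fractional unit (1/16) below zero.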
When a result exceeds the range of representable values or cannot be precisely expressed using the available bits, two issues can occur: overflow and quantisation error. Overflow happens when the result is too large (or too negative) to fit within the allocated integer bits. For instance, adding two large numbers may cause the integer part to exceed its 4-bit limit, resulting in wrap-around to an incorrect value or, on systems that saturate, clamping to the largest representable value. Quantisation error occurs when a decimal value has no exact binary representation, like 0.1 in decimal, which becomes the recurring binary fraction 0.000110011... In fixed-point, this fraction must be truncated or rounded to fit the number of fractional bits: with 4 fractional bits, 0.1 is stored as 0.0001 (0.0625) if truncated or 0.0010 (0.125) if rounded to nearest. This introduces small inaccuracies in the result, which may accumulate over multiple calculations. Some systems apply rounding strategies (e.g., round to nearest, floor, ceiling) to minimise the impact, but fixed-point will always have precision limits due to its finite bit width.
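Both effects can be reproduced with plain integers. The sketch below assumes a signed 8-bit 4.4 format (raw range -128 to 127); `wrap` and `saturate` are illustrative helpers that model two common overflow policies, not the behaviour of any particular processor.

```python
FRAC_BITS = 4
SCALE = 1 << FRAC_BITS        # 16
RAW_MIN, RAW_MAX = -128, 127  # signed 8-bit raw range, i.e. -8.0 to +7.9375


def wrap(raw: int) -> int:
    """Model two's complement wrap-around to 8 bits."""
    return ((raw + 128) & 0xFF) - 128


def saturate(raw: int) -> int:
    """Model saturation: clamp to the nearest representable raw value."""
    return max(RAW_MIN, min(RAW_MAX, raw))


# Overflow: 6.0 + 5.0 = 11.0, which is beyond the maximum of 7.9375.
total = 6 * SCALE + 5 * SCALE
print(wrap(total) / SCALE)      # -5.0
print(saturate(total) / SCALE)  # 7.9375

# Quantisation: 0.1 * 16 = 1.6, so it falls between two raw values.
print(int(0.1 * SCALE) / SCALE)    # 0.0625 (truncated)
print(round(0.1 * SCALE) / SCALE)  # 0.125  (rounded to nearest)
```

Wrap-around silently produces a value with the wrong sign, which is why saturation is often preferred where a clipped result is less harmful than a wildly wrong one.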
To maintain correct placement of the binary point during multiplication or division, programmers or hardware must explicitly adjust the result by shifting bits. Multiplying two fixed-point numbers produces a result whose number of fractional bits is the sum of the operands’ fractional bits (double, when both use the same format), so the result must be right-shifted by the number of fractional bits to restore the original format. For example, if two 8-bit numbers with 4 fractional bits are multiplied, the raw 16-bit result will have 8 fractional bits; a right shift of 4 bits is needed to bring it back to 4. Similarly, for division, the dividend is first left-shifted by the number of fractional bits to preserve precision before dividing by the divisor. These adjustments prevent distortion of the numerical value. Failure to correctly shift the result will lead to an incorrect interpretation of the number. Some fixed-point libraries automate this, but low-level systems or custom implementations require careful bit manipulation to maintain numerical integrity.
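The two adjustments can be sketched as below, assuming both operands use 4 fractional bits and are non-negative. `fx_mul` and `fx_div` are hypothetical names, and the sketch omits overflow checks for brevity.

```python
FRAC_BITS = 4
SCALE = 1 << FRAC_BITS  # 16


def fx_mul(a: int, b: int) -> int:
    """Multiply two raw values; the product has 8 fractional bits, so shift 4 back out."""
    return (a * b) >> FRAC_BITS


def fx_div(a: int, b: int) -> int:
    """Divide two raw values; pre-shift the dividend so the quotient keeps 4 fractional bits."""
    return (a << FRAC_BITS) // b


a = int(2.5 * SCALE)  # raw 40
b = int(1.5 * SCALE)  # raw 24

print(fx_mul(a, b) / SCALE)  # 3.75
print(fx_div(a, b) / SCALE)  # 1.625 (exact answer is 1.666..., the rest is quantisation error)
```

Without the pre-shift, `a // b` would give raw 1, which decodes to 0.0625 instead of a value near 1.67.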
Managing rounding errors in fixed-point systems involves selecting suitable rounding techniques and carefully designing the fixed-point format to balance range and precision. Common rounding strategies include:
Truncation: Simply cutting off excess bits. Fast but least accurate; in two’s complement it always moves the value towards negative infinity, so errors share a consistent downward bias.
Round to nearest: Adds 1 to the last retained bit if the first discarded bit is 1. More accurate and commonly used.
Round up (ceiling): Rounds towards positive infinity, often used when underestimation is unacceptable.
Round down (floor): Rounds towards negative infinity, useful for certain control systems. For two’s complement values this gives the same result as truncation.
Another important strategy is guard bits: temporary extra bits used during calculations to store intermediate precision, which are later rounded properly. Additionally, scaling factors can be applied to ensure the value is expressed in a more easily representable form before conversion. For example, storing temperature as tenths of a degree allows 23.5°C to be stored as 235. Finally, choosing the correct number of fractional bits for the application helps reduce cumulative errors. Regular testing and validation of algorithms can help detect unacceptable rounding behaviour.
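The strategies above can be compared on a single value. This is a sketch under stated assumptions: raw values are Python integers, `drop` is the number of low-order bits being discarded (at least 1), and the function names are illustrative.

```python
def round_floor(raw: int, drop: int) -> int:
    """Discard bits; Python's >> floors, matching two's complement truncation."""
    return raw >> drop


def round_nearest(raw: int, drop: int) -> int:
    """Add half of the last retained bit's weight, then discard bits."""
    return (raw + (1 << (drop - 1))) >> drop


def round_ceiling(raw: int, drop: int) -> int:
    """Round towards positive infinity by flooring the negated value."""
    return -((-raw) >> drop)


# 0.6875 * 0.6875 with 4 fractional bits each: raw 11 * 11 = 121 (8 fractional bits).
product = 11 * 11
print(round_floor(product, 4) / 16)    # 0.4375
print(round_nearest(product, 4) / 16)  # 0.5
print(round_ceiling(product, 4) / 16)  # 0.5   (exact product is 0.47265625)

# Guard bits: sum four products at full precision, then round once.
early = sum(round_floor(product, 4) for _ in range(4))  # rounds four times
late = round_nearest(4 * product, 4)                    # keeps 4 guard bits until the end
print(early / 16, late / 16)  # 1.75 1.875  (exact sum is 1.890625)

# Scaling: store 23.5 degrees as an integer count of tenths.
print(round(23.5 * 10))  # 235
```

Rounding once at the end leaves an error of about 0.016, compared with about 0.14 when each product is truncated before summing.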
