Attributes are the named columns in a table, each representing a property of an entity. They help define the structure and meaning of the data stored.
What is an attribute?
In relational database design, an attribute is defined as a named column within a table. It represents a specific property or characteristic of an entity, where the entity is the subject of the table.
For instance, if the table represents a Student entity, then common attributes might include:
StudentID
FirstName
LastName
DateOfBirth
EmailAddress
Each attribute describes a distinct piece of information about the student. In this way, attributes form the blueprint of the data the table will contain. Each table in a relational database is composed of several attributes, and every attribute defines a domain of possible values.
Difference between entities and attributes
It is essential to distinguish clearly between entities and attributes:
An entity refers to a real-world object or concept (e.g. Student, Book, Order) that is represented by a table.
An attribute refers to a property or characteristic of that entity (e.g. Name, Title, DateOfBirth).
If we compare it to a sentence, the entity is the noun (e.g. Student), while the attributes are the adjectives that describe it (e.g. the student’s name, age, or ID).
Attributes and records
Practice Questions
FAQ
Yes, two different entities can have attributes with the same name, and in many cases, it is both valid and beneficial. For instance, both a Customer and a Supplier table might include an attribute called EmailAddress. Using consistent attribute names for similar properties across tables promotes clarity, simplifies query writing, and helps maintain uniformity in database design. However, it becomes important to fully qualify attribute names when writing queries involving multiple tables, especially during joins. For example, specifying Customer.EmailAddress and Supplier.EmailAddress avoids ambiguity. While shared names are acceptable, designers must be cautious not to use the same attribute name for unrelated or differently purposed fields, as that can lead to confusion. Consistency should always serve clarity and function. When names are reused, they should represent the same kind of data across all tables. If the meaning diverges, unique and descriptive names should be chosen to preserve semantic integrity.
Surrogate attributes are artificially created identifiers used exclusively within the database to uniquely identify records. Unlike regular attributes, which describe real-world properties of entities (e.g. FirstName, EmailAddress), surrogate attributes serve no meaning outside the database context. A common example is an auto-incrementing CustomerID or OrderID that holds no business significance but ensures uniqueness for each record. Surrogate attributes are particularly useful when no single natural attribute (or combination) can reliably serve as a primary key or when natural keys are subject to change. In contrast, regular attributes store data that is meaningful and necessary to users or applications. Surrogates simplify joins, improve indexing, and support performance optimisation, but should not replace meaningful data fields. It’s important not to rely on surrogate attributes for any business logic or validation. Instead, they are intended strictly for internal database operation. Their consistent use is a common practice in scalable and well-normalised database design.
Atomicity refers to the requirement that each attribute should hold a single, indivisible value in any given record. An atomic attribute cannot be logically split into smaller meaningful components. For example, storing a full name as FullName = "Anna Thompson" is less atomic than storing FirstName = "Anna" and LastName = "Thompson" in separate attributes. Atomicity is important because it enables more effective data manipulation, searching, sorting, and filtering. It also supports normalisation, particularly First Normal Form (1NF), which requires all attributes to be atomic. If an attribute contains multiple values (e.g. PhoneNumbers = "0123, 0456"), it becomes difficult to search or manipulate each value independently. This violates the principles of good relational design and can introduce redundancy and inconsistency. Designing attributes to be atomic ensures data remains well-structured, accessible, and scalable, especially as database complexity grows. It is a foundational principle of relational database systems and essential for integrity and performance.
Technically, the order in which attributes are defined in a relational database table has no impact on how data is stored or processed. The relational model treats data as unordered; it identifies data based on attribute names rather than their positions. However, from a practical and user experience perspective, attribute ordering can play a significant role. Placing the most commonly used attributes near the beginning of a table or interface can enhance readability and usability. It also makes forms, reports, and queries easier to manage. Logical grouping of related attributes, such as placing address fields together (Street, City, Postcode), supports clarity and maintainability. Additionally, consistent ordering across similar tables helps users and developers navigate schemas more easily. While databases do not rely on attribute order functionally, a well-considered layout improves design quality and user understanding. Therefore, thoughtful attribute ordering is encouraged, especially in environments where humans interact with the schema frequently.
Default values are predefined values automatically assigned to an attribute when a new record is created, and no explicit value is provided. For example, in an Orders table, the OrderStatus attribute might have a default value of “Pending”. This feature ensures consistency and reduces the chance of NULL values where they are undesirable. Defaults are particularly useful for attributes where a common initial value makes sense across most records, such as CreatedDate being set to the current date, or a SubscriptionActive flag set to True. Using defaults simplifies data entry, improves data completeness, and supports business logic. However, care should be taken to ensure that default values are genuinely appropriate for most use cases. Assigning defaults where variability is high can result in misleading or incorrect data. It’s also crucial to distinguish between a default value and a genuinely missing (NULL) value. Thoughtful use of defaults enhances data quality and streamlines input workflows.
