Lesson 3: Database design

Database design

Learning objectives

Data modeling and database design

Database mindset

Naming convention

Consistent, Descriptive Names Reduce Mistakes

Plural Form for Table Names (e.g., products)

Singular Form for Column Names (e.g., name_last, email)

Alphabetical Order for Link Tables (e.g., posts_users)

Long, descriptive names are better than short, cryptic ones. Use aliases.

Use consistent naming for an id column and its references as table_id.

Table design

Normalization

Denormalization

Data relationships

One-to-One or Zero-to-One
Stored generally in the same table
May be split for performance

One-to-Many or Zero-to-Many
Many in separate table referenced by one primary key (i.e., a foreign key)

Many-to-Many
Several tables with one operating as a link
Linked tables often have a composite primary key

Third normal form (3NF)

Remove horizontal redundancies

First normal form

No single column with more than a single item

No two columns with the same information

Each row must be unique

First normal form

Use a primary key: Natural or surrogate like AUTO_INCREMENT

Remove vertical redundancy

Second normal form

Same value should not repeat across rows

Data type can play a role (e.g., ENUM)

Remove vertical redundancy

Third normal form

Columns not dependent on primary key are removed

Character set and collation

Character set may be global or for schema, table or column

Multi-byte character sets increase disk storage and working memory requirements

Collations affect string comparison (character order)

Collations can be changed for a query:

SELECT * FROM table1 ORDER BY col1
COLLATE latin1_german2_ci;

Indexing concept

Poor Indexing #1 Reason For Poor Performance

Like most RDBMS, MariaDB resolves queries faster with indexes

Without Indexes, MariaDB does slow Full Table Scans

Indexes can be for a Column or Multiple Columns (i.e., Composites)

In InnoDB, the primary key is added to the end of all indexes on disk

Avoid Indexing Excessively or Arbitrarily

Regularly Remove Unused or Redundant Indexes

Avoid foreign keys due to performance overhead

Index best practices

Every table should have a Primary Key

Foreign Key should have an index as it helps in the query joins

Create an index on the columns that are frequently used in the WHERE clause of the queries

Consider ORDER BY clause for possible composite Indexes

INDEX on an EXPRESSION, also known as FUNCTION based indexes

EXPLAIN / ANALYZE and PROFILING queries are the most important tools available when looking for possible Index candidates

File storage

Store Files in BLOB Useful

Store Files in BLOB has Problems

Can Minimize by putting Files in Separate Table (1:1 relationship)

Store Files in File System and File Name and Path in Table

Database design summary

Adjust schema as needs change

Consider data growth (a large dataset makes migrations slower and more difficult)

Don’t optimize prematurely (makes future redesigns more difficult)

Consider early on the limitations of the system

Minimize table size on disk and in memory

Archive table data if possible and appropriate

Remove duplicate or unused indexes

Use appropriate data types — smaller is better

Consider sharding large tables across multiple servers

Use auto generated columns when possible

Lesson summary

Lab exercises


For academic and non-commercial usage, licensed under CC BY-NC-SA 4.0 by MariaDB plc.