Lesson 2: Schema objects
Schema objects
Learning objectives
- Gain an understanding of schema objects available within MariaDB
- Learn how to use and create databases and tables, including temporal data tables, and flexible and invisible columns
- Discuss data types and built-in functions, and their use cases
- Learn how to create and use views, triggers, events, and sequences
Databases, tables and default schemas
Case sensitivity
Depending on Operating System, File System and lower_case_table_names
Set before starting a project (strongly recommended)!
Usually case sensitive by default on Linux, not so on Windows and MacOS
Adopt a convention, such as always creating and referring to databases and tables using lowercase names
Databases
Database aka Schema
- Highest Level Object
- Corresponds to directory within data directory
- All other objects reside within a database except user accounts, roles and plugins
CREATE DATABASE world;
Tables
Stores rows of structured data organised by typed columns
Each corresponds to a metadata file (.frm
) and data file(s), dependent on storage engine
(i.e. InnoDB tablespaces)
Qualified by a database name
(i.e. database.table
)
Columns
Set of typed data values
Qualified by a table name
(i.e. table.column)Stored together in each row (MariaDB Server) within pages, within tablespaces or data files
Virtual generated columns are not stored at allIndex
Column attributes
Columns have strict type definitions
Can specify DEFAULT value for column
Only Primary Key can be automatically incremented
CREATE TABLE people
(id INT AUTO_INCREMENT KEY,
name VARCHAR(20) DEFAULT 'unknown');
Columns can be NULL, unless defined NOT NULL
- NULL means “No Value”, “Not Applicable”, or “Unknown”
- Use NULL when value is not an empty string
- NOT NULL
- Reduces storage in some Engines
- Can also reduce execution time because there are more CPU cycles used to first check for NULL
Indexes
- Exact copy of selected columns followed by primary key (e.g., name,address,ID)
- For long columns the index might not be an “exact copy”, but instead it might contain a prefix of the column
- Fast lookup of data within a table, without having to scan all columns for each row
- For InnoDB, primary key is silently appended to end of secondary indexes, unless specified elsewhere within index
- Attribute of a table
Constraints
- Types: Primary Key, Foreign Key, Unique and Check
- Support and implementation differs per storage engine
- Allows data within column(s) to be limited or constrained to a set of values
- Usually defined with corresponding index
- Attribute of a table
- Has potential for contention at scale
Views
- Virtual table defined by a SQL
SELECT
query - Evaluated at each access
- No materialized views
- Treated as a table for many purposes, shares namespace with tables
- Updateable in certain cases (
WITH CHECK OPTION
) - Qualified by database (
database.view
)
Stored routines
- Types: Functions, Triggers, Events, and Stored Procedures
- Has input and output parameters
- Reusable SQL Code
- SQL/PSM (default)
- PL/SQL (
sql_mode=ORACLE
)
- Can allow cursor loops
- Runtime script, not compiled or binary
- Lack of good debugging, can be hard to profile
- Can be bad for Statement based binary logging
- Stored routine uses the privileges of the user that defined it (user can be changed with the DEFINER clause)
- If SQL SECURITY INVOKER is set then the privileges of the user executing the stored routine are used
- Qualified by database (
database.my_stored_procedure
)
Default schemas
information_schema
- The encyclopedia of your database
- Contains information on databases, tables, columns, procedures, indexes, statistics, partitions and views
- Plugins can install additional
information_schema
tables
mysql
Stores information on:
- Server configuration
- Grants and timezones
- User accounts
- Roles
- Stored procedure definitions
- Stored function definitions
- Event definitions
- Plugins installed by
INSTALL SONAME
/INSTALL PLUGIN
performance_schema
- Metrics and performance data
- Will be covered in Tuning & optimization Module
Creating your first table
CREATE TABLE city (
ID INT(11) NOT NULL AUTO_INCREMENT,
Name CHAR(35) NOT NULL DEFAULT '',
CountryCode CHAR(3) NOT NULL DEFAULT '',
District CHAR(20) NOT NULL DEFAULT '',
Population INT(11) NOT NULL DEFAULT '0',
PRIMARY KEY (ID),
KEY CountryCode (CountryCode),
CONSTRAINT FOREIGN KEY (CountryCode)
REFERENCES country (Code)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
- Name and type of object being created
- Column Definitions
- Define Primary Key, Secondary Indexes and Constraints
- Define Engine, Partitions, Character Set and other attributes
Altering tables
ALTER TABLE
is used to change a table’s schema
ADD COLUMN to Add a Column
DROP COLUMN to Drop a Column — Deletes Data
CHANGE COLUMN and MODIFY COLUMN to Alter a Column
ALTER TABLE table1
ADD COLUMN col5 CHAR(8),
DROP COLUMN col3,
CHANGE COLUMN col4 col6 DATE,
MODIFY COLUMN col8 VARCHAR(10);
Basic Syntax Example for ALTER TABLE
Statement
Temporal tables
Type | Tracks | Sample Use Cases |
---|---|---|
System-Versioned | Change history | Audit, forensics, IoT temperature tracking |
Application-Time Period | Time-limited values | Sales offers, subscriptions |
Bitemporal | Time-limited values with history | Schedules, decision support models |
Temporal tables
System-Versioned Example
CREATE TABLE accounts (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255),
amount INT
) WITH SYSTEM VERSIONING;
Application-Time Period Example
CREATE TABLE coupons (
id INT UNSIGNED,
date_start DATE,
date_end DATE,
PERIOD FOR
valid_period(date_start, date_end)
);
Bitemporal Example
CREATE TABLE coupons_new (
id INT UNSIGNED,
name VARCHAR(255),
date_start DATE,
date_end DATE,
PERIOD FOR
valid_period(date_start, date_end)
) WITH SYSTEM VERSIONING;
Data types and built-in functions
Data types
Types: Binary, Numeric, String, Temporal, and User Defined
Use the most suitable data type to store all possible, required values
Will truncate silently and rounds (sql_mode
)
MariaDB [(none)]> help INT;
Name: 'INT'
Description: INT[(M)] [UNSIGNED] [ZEROFILL]
A normal-size integer. The signed range is -2147483648 to 2147483647. The unsigned range is 0 to 4294967295.
URL: https://mariadb.com/kb/en/mariadb/data-types-numeric-data-types/
Numeric data types
FLOAT
andDOUBLE
are approximate types- Uses 4 and 8 bytes IEEE storage format
DECIMAL (m, d)
maximum total number of digits, number of digits after decimal point- An Exact Value type, up to 65 digits precision, 4 bytes storage for each multiple of nine digits
NUMERIC
is a synonym forDECIMAL
REAL
is a synonym forDOUBLE
- Unless in
REAL_AS_FLOAT
SQL mode
- Unless in
Auto increment
- Use
LAST_INSERT_ID()
to get value generated for client connection SERIAL
is a synonym forBIGINT UNSIGNED NOT NULL AUTO_INCREMENT UNIQUE
- In Aria the counter can be set back manually or if counter value wraps
- InnoDB
- Prepares
AUTO_INCREMENT
counters when the MariaDB Server starts - Single mutex on a table, behavior changed with
innodb_autoinc_lock_mode
- Prepares
0 | default, Traditional
- Holds table-level lock for all
INSERTs
until end of statement
1 Consecutive
- Holds table-level lock for all bulk
INSERTs
(such asLOAD DATA
orINSERT ... SELECT
) until end of statement - For simple
INSERTs
, no table-level lock held
2 Interleaved
- No table-level locks are held ever
- Fastest and most scalable
- Not safe for statement-based replication
- Generated IDs are not always consecutive
Numeric data types
- Use
UNSIGNED
when appropriate INT(n)
specifies display precision, not storage precision- Size and precision is storage engine dependent
- Define handling of Out-of-Range Values with
sql_mode
- Default Mode: Values are Truncated Silently
- Strict Mode: Errors are Generated
BIGINT
can enumerate more than all the ants on Earth and shouldn’t be your default choiceTINYINT(1)
is used forBOOLEAN
values and is aliased by theBOOLEAN
type
String data types
All String Data Types Have A Character Set
CHAR(n)
- Number of characters, not bytes, wide- Always stores n characters
- Automatically pads with spaces for shorter strings
VARCHAR(n)
- Variable length up to maximum n characters- Changes to
CHAR
in Implicit Temporary Tables and mysqld internal buffers - 256 characters and longer treated as
TEXT
- For InnoDB, this maximum will depend on the row format
- Changes to
TEXT
- Large text object- Not supported by the
MEMORY
Storage Engine - MariaDB uses
ARIA
for implicit on-disk temporary tables
- Not supported by the
CHAR
VARCHAR
TINYTEXT
TEXT
MEDIUMTEXT
LONGTEXT
String data types
Character set may be global or for schema, table, or column
CHAR
VARCHAR
TINYTEXT
TEXT
MEDIUMTEXT
LONGTEXT
Multi-byte character sets increase disk storage and working memory requirements
UTF-8
requires 3 or 4 bytes per character
Collations affect string comparison (character order)
Collations can be changed for a query
Binary data types
BINARY
,VARBINARY
andBLOBs
can contain data with bytes from the whole range from 0 - 255- Uses a special character set and collation called “binary”
- Blobs are often used to store files in a database
- Files on disk are often faster
- But no referential integrity is guaranteed
- Blobs are included in transactions, replication, and backups
- Blobs inflate
mysqld
memory usage - Modern InnoDB has some improvements in storage and lookup of blobs
Temporal data types
DATE
— from 1000-01-01 to 9999-12-31- YYYY-MM-DD
TIME
[()]- from -838:59:59 to 838:59:59
DATETIME
[()]- Same ranges as
DATE
andTIME
above - YYYY-MM-DD HH:mm:ss
- Same ranges as
TIMESTAMP
— Unix timestamp, in seconds from 1970-01-01- Many Apps Store
UNIX_TIMESTAMP()
values in unsigned integer field
- Many Apps Store
YEAR
— Accepts YYYY
SELECT CURTIME(4);
| CURTIME(4) |
|------------|
| 05:33:09.1061 |
Special data types
ENUM
is an enumerated list of string values- Uses a 2-byte integer index
CREATE TABLE country ( Continent ENUM('Asia','Europe','N America', 'Africa','Oceania','Antarctica','S America') );
SET
is a specified list of string values- Can hold multiple specified values
CREATE TABLE countrylanguage ( CountryCode CHAR(3), Language SET('English','French','Mandarin') ); INSERT INTO countrylanguage VALUES ('CHN','Mandarin'), ('CAN','English,French');
Built-in functions
Types: String, Date and Time, Aggregate, Numeric, Control Flow
Secondary functions such as Bit Functions and Operators, Encryption, Hashing and Compression, and Information Functions
Special Functions such as Dynamic Columns, Geographic, JSON, Spider and Window Functions
Manipulating date & time
Functions for date and time manipulation
ADDDATE()
ADDTIME()
CONVERT_TZ()
CURDATE()
CURTIME()
DATE()
DATE_ADD()
DATE_FORMAT()
DATE_SUB()
DATEDIFF()
DAYNAME()
DAYOFMONTH()
DAYOFWEEK()
DAYOFYEAR()
EXTRACT()
FROM_DAYS()
FROM_UNIXTIME()
GET_FORMAT()
HOUR()
LAST_DAY()
MAKEDATE()
MAKETIME()
MICROSECOND()
MINUTE()
MONTH()
MONTHNAME()
NOW()
PERIOD_ADD()
PERIOD_DIFF()
QUARTER()
SEC_TO_TIME()
SECOND()
STR_TO_DATE()
SUBDATE()
SUBTIME()
SYSDATE()
TIME()
TIME_FORMAT()
TIME_TO_SEC()
TIMEDIFF()
TIMESTAMP()
TIMESTAMPADD()
TIMESTAMPDIFF()
TO_DAYS()
UNIX_TIMESTAMP()
UTC_DATE()
UTC_TIME()
UTC_TIMESTAMP()
WEEK()
WEEKDAY()
WEEKOFYEAR()
YEAR()
YEARWEEK()
Documentation on Date and Time Functions: https://mariadb.com/kb/en/mariadb/date-and-time-functions/
Examples of date & time functions
Used in Queries and Data Manipulation Statements
SELECT NOW() + INTERVAL 1 DAY
- INTERVAL 1 HOUR
AS 'Day & Hour Earlier';
+---------------------+
| Day & Hour Earlier |
+---------------------+
| 2020-06-02 08:32:44 |
+---------------------+
Used in WHERE Clauses
UPDATE table1
SET col3 = 'today', col4 = NOW()
WHERE col5 = CURDATE();
Used in Bulk Load
load data local infile '/tmp/test.csv' into
table test fields terminated by ','
ignore 1 lines (id,@dt1)
set dts=str_to_date(@dt1,'%d/%m/%Y');
Manipulating strings
Functions for string manipulation
ASCII()
BIN()
BINARY
BIT_LENGTH()
CAST()
CHAR()
CHARACTER_LENGTH()
CHAR_LENGTH()
CHR()
CONCAT()
CONCAT_WS()
CONVERT()
ELT()
EXPORT_SET()
EXTRACTVALUE()
FIELD()
FIND_IN_SET()
FORMAT()
FROM_BASE64()
HEX()
INSERT()
INSTR()
LCASE()
LEFT()
LENGTH()
LIKE
LOAD_FILE()
LOCATE()
LOWER()
LPAD()
LTRIM()
MAKE_SET()
MATCH AGAINST()
MID()
NOT LIKE
NOT REGEXP
OCTET_LENGTH()
ORD()
POSITION()
QUOTE()
REPEAT()
REPLACE()
REVERSE()
RIGHT()
RPAD()
RTRIM()
SOUNDEX()
SOUNDS LIKE
SPACE()
STRCMP()
SUBSTR()
SUBSTRING()
SUBSTRING_INDEX()
TO_BASE64()
TRIM()
UCASE()
UNHEX()
UPDATEXML()
UPPER()
WEIGHT_STRING()
Documentation on String Functions: https://mariadb.com/kb/en/library/string-functions/
An example of a string function
Used in queries and data manipulation statements
SELECT domain, domain_count FROM
( SELECT SUBSTRING(email_address, LOCATE('@', email_address) +1 ) AS domain,
COUNT(*) AS domain_count
FROM clients_email
GROUP BY domain ) AS derived1
WHERE domain_count > 200
LIMIT 100;
Aggregate functions
Used for Summary Operations
Aggregate Functions Reduce a Set of Values to a Single Value
- AVG()
- COUNT()
- GROUP_CONCAT()
- MAX()
- MIN()
- STD()
- STDDEV
- SUM()
- VARIANCE()
DISTINCT
Removes Duplicate Values before Aggregation (e.g., with COUNT()
and GROUP_CONCAT()
)
SELECT COUNT(*)
FROM City;
+----------+
| COUNT(*) |
+----------+
| 4079 |
+----------+
SELECT COUNT(DISTINCT CountryCode)
FROM City;
+-----------------------------+
| COUNT(DISTINCT CountryCode) |
+-----------------------------+
| 232 |
+-----------------------------+
Aggregate multiplication
Using logarithm logic:
A chart shows the rate of return for a publicly traded company over 10 years
With an initial investment of $1,000 in 2009, calculate the value of the investment by 2019
- Possible solution:
SELECT AVERAGE(ROR) FROM investment;
- Problem: This would result in an average return of approximately 9.9% over the 10 year span, but it does not take into account compound interest.
Aggregate multiplication
Calculating compound interest: Find each year’s ROR:
SELECT YEAR, 1+ROR AS multiplier, LOG(1+ROR)
FROM investment;
+------+-----------+---------------------+
| YEAR | multiplier | LOG(1+ROR/100) |
+------+-----------+---------------------+
| 2010 | 1.44 | 0.36464311193222|
| 2011 | 0.75 | -0.28768207245178|
| 2012 | 1.08 | 0.07961013948044|
..
| 2017 | 1.11 | 0.10436001478726|
| 2018 | 0.92 | -0.083381606995421|
| 2019 | 0.79 | -0.23572232522169 |
+------+-----------+---------------------+
10 rows in set (#.## sec)
Aggregate multiplication
Calculating compound interest:
Add all the logarithmic results together
SELECT SUM(LOG(1+ROR)) FROM investment; +--------------------+ | SUM(LOG(1+ROR)) | +--------------------+ | 0.68010598814117 | +--------------------+
Determine the return percentage with compound interest
SELECT EXP(SUM(LOG(1+ROR))) FROM investment; +------------------------+ | EXP(SUM(LOG(1+ROR))) | +------------------------+ | 1.9740869509493 | +------------------------+
Aggregate multiplication
Calculating compound interest:
Calculate the result
Over ten years, even including the bad years, the initial investment of $1,000 nearly doubled
Creative SQL can be used to solve perceived limitations in the language.
SELECT EXP(SUM(LOG(1+ROR)))*1000 FROM investment;
+-------------------------+
| EXP(SUM(LOG(1+ROR)))*1000 |
+-------------------------+
| 1974.0869509493 |
+-------------------------+
Running total query
In the course of almost every analytical presentation, someone will request that a running total be presented!
The following SQL example uses a self-join on the investment table to create the running total
SELECT a.YEAR, a.ROR,
EXP(SUM(LOG(1+b.ROR)))*1000 AS balance
FROM investment a
JOIN investment b ON (a.YEAR >= b.YEAR)
GROUP BY a.YEAR, a.ROR;
Running total query
Notice how this statement works
SELECT a.YEAR, a.ROR,
EXP((LOG(1+b.ROR)))*1000 AS balance
FROM investment a
JOIN investment b ON (a.YEAR >= b.YEAR)
ORDER BY a.YEAR, a.ROR;
+------+----------+------------------+
| YEAR | ROR | balance |
+------+----------+------------------+
| 2000 | 0.0 | 1439.99999761581 |
| 2011 | -0.25 | 1439.99999761581 |
| 2011 | -0.25 | 1439.99999761581 | 750 |
| 2012 | 0.08 | 1439.99999761581 |
...
| 2019 | -0.21 | 1439.99999761581 |
| 2019 | -0.21 | 1439.9999821186 | 750 |
| 2019 | -0.21 | 1079.9999821186 |
+------+----------+------------------+
Control flow functions
IF(expr1,expr2,expr3)
- If
expr1
is TRUE (expr1 <> 0
andexpr1 <> NULL
) thenIF()
returnsexpr2
; otherwise it returnsexpr3
IFNULL(expr1,exprNULL)
returns expr1
, or exprNULL
if expr1
is NULL
Do not mix up with NULLIF(expr1,expr2)
which returns expr1
if expr1<>expr2
otherwise NULL
Some numeric functions
CEIL(X)
- Returns the smallest integer value not less than
X
- Synonym for
CEILING()
- Returns the smallest integer value not less than
FLOOR(X)
- Returns the largest integer value not greater than
X
- Returns the largest integer value not greater than
ROUND(X, D)
- Rounds or round up to
D
decimal places
- Rounds or round up to
ABS(X)
- Returns the non-negative number of
X
- Returns the non-negative number of
Views, triggers and events
Views
- Adapt and standardize table schema
- Simplify, split, or factor complex reporting queries
- Restrict visible table data to specific users and applications
An example of a view
- A SQL Statement Represented as a Table
- Views are Not Materialised
- The
SELECT
Query always Re-Executes
CREATE VIEW emp_names AS
SELECT emp_id, name_first, name_last
FROM employees;
ALTER VIEW emp_names AS
SELECT emp_id, name_first, name_middle, name_last,
FROM employees;
SHOW FULL TABLES WHERE Table_type = 'VIEW';
SHOW CREATE VIEW emp_names \G
DROP VIEW emp_names;
Lesson summary
- Gain an understanding of schema objects available within MariaDB
- Learn how to use and create databases and tables, including temporal data tables, and flexible and invisible columns
- Discuss data types and built-in functions, and their use cases
- Learn how to create and use views, triggers, events, and sequences
Lab exercises
- 2-1 Creating a Database
- 2-2 Creating a Table
- 2-3 Creating Virtual Columns
- 2-4 Setting the Database Default Character Set
- 2-5 Creating and Comparing Tables for Multiple Storage Engines
For academic and non-commercial usage, licensed under CC BY-NC-SA 4.0 by MariaDB plc.