MySQL Introduction


Getting to know MySQL


Background (connection points): connect to the database via JDBC, and link to the front end (controllers handle view jumps and pass data to the front end)

Database: stores data

The database is the core of every software system.

Concept: a data warehouse, i.e. software installed on top of the operating system.

Function: store data and manage data.

Classification

Relational database: stores data through relationships between tables and between rows and columns (structured query language, SQL)

  • MySQL
  • Oracle
  • SQL Server
  • DB2
  • SQLite

Non-relational database: key-value pairs or object storage; the structure is determined by the object's own attributes (NoSQL)

  • Redis
  • MongoDB

DBMS (Database Management System)

  • Database management software: manages, maintains, and retrieves data scientifically and efficiently
  • MySQL – database management system

Introduction to MySQL

  • Relational database management system
  • Originally developed by the Swedish company MySQL AB; now owned by Oracle
  • Open-source database software: small footprint, fast, and low total cost of ownership
  • Suitable for small and medium-sized websites, and also widely used by large websites

Version:

  • 5.7 Stable
  • 8.0

Installation suggestion:

  1. Avoid the .exe installer
  2. Prefer installing from the compressed (zip) package

Connect to the database

mysql -u username -p

Enter the password at the prompt. The password can also be supplied inline, but it must follow -p with no space: mysql -u username -ppassword

  • change Password

    update mysql.user set authentication_string=password('123456') where user='root' and Host='localhost';
    
  • Refresh permissions

    flush privileges;
    
  • View all databases

    show databases;
    
  • switch database

    use database name;
    
  • View all tables in the database

    show tables;
    
  • View all information in the table

    describe tablename;
    
  • create database

    create database database name;
    
  • exit the connection

    exit; -- single-line comment
    # also a single-line comment
    /*
    multi-line comment
    */
    

Operate the database

SQL language: CRUD (create, read, update, delete)

  • DDL: data definition language
  • DML: data manipulation language
  • DQL: data query language
  • DCL: data control language

MySQL keywords are not case-sensitive

Operating the database

  • create database

    create database [if not exists] database name;
    
  • delete database

    drop database [if exists] database name;
    
  • use database

    use `database name`; -- if the name is a keyword or contains special characters, wrap it in backticks ``
    
  • View database

    show databases;
    

database column type

  • Numerical value
    • tinyint: very small data, 1 byte
    • smallint: Smaller data, 2 bytes
    • mediumint: medium size data, 3 bytes
    • int: standard integer, 4 bytes
    • bigint: larger data, 8 bytes
    • float: single-precision floating point number, 4 bytes
    • double: double precision floating point number, 8 bytes
    • decimal: exact fixed-point number stored in string form; use for financial calculations
  • string
    • char: fixed-size character type, 0 to 255
    • varchar: variable string, 0~65535
    • tinytext: tiny text, 2^8-1
    • text: text string, 2^16-1
  • time date
    • date: YYYY-MM-DD, date format
    • time: HH:mm:ss, time format
    • datetime:YYYY-MM-DD HH:mm:ss, date+time
    • timestamp: timestamp in YYYY-MM-DD HH:mm:ss format, stored as seconds since 1970-01-01
    • year: year representation
  • null
    • no value, unknown
    • Note: do not use null in arithmetic; any operation involving null yields null
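Why decimal exists for financial calculations can be seen outside MySQL too. A minimal sketch using Python's decimal module as a stand-in for a DECIMAL column (the column/type names above are MySQL's; the code below is just an illustration of the rounding problem):

```python
from decimal import Decimal

# Binary floats accumulate rounding error -- the reason MySQL offers DECIMAL
# for financial data. 0.1 + 0.2 is not exactly 0.3 as a float:
float_sum = 0.1 + 0.2
print(float_sum)            # 0.30000000000000004

# Decimal keeps exact base-10 values, like a DECIMAL(10,2) column would:
exact_sum = Decimal("0.1") + Decimal("0.2")
print(exact_sum)
```

The same reasoning applies to float/double versus decimal columns: never store money in a binary floating-point column.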

Database field properties

  • Unsigned: Unsigned integer, cannot be set to negative
  • zerofill: zero fill, insufficient digits are filled with zeros
  • auto_increment: automatically adds 1 to the value of the previous record; usually used on the primary key as a unique identifier; must be an integer type; the starting value and step size can be customized
  • not null: the column must be assigned a value, otherwise an error is reported
  • default: set the default value

Note: by convention, the following 5 fields should exist in every table; they describe the existence of a record

  1. id: primary key
  2. version: for optimistic locking
  3. is_deleted: pseudo-deleted
  4. gmt_create: creation time
  5. gmt_update: modification time

create table

create table if not exists `student` (
    `id` int(4) not null auto_increment comment 'Student ID',
    `name` varchar(30) not null default 'anonymous' comment 'name',
    `pwd` varchar(20) not null default '123456' comment 'password',
    `gender` varchar(2) not null default 'female' comment 'gender',
    `birthday` datetime default null comment 'birthday',
    `address` varchar(100) default null comment 'home address',
    `email` varchar(50) default null comment 'email',
    primary key (`id`)
) engine=innodb default charset=utf8;

Format:

create table [if not exists] `table name` (
    `field name` column type [property] [index] [comment],
    ......
    `field name` column type [property] [index] [comment]
) [table type] [table charset] [comment];

Common commands

show create database database name; -- view the statement that created the database
show create table table name; -- view the statement that created the table
desc table name; -- view the structure of the table
  • Type of data sheet

    1. Innodb uses default

    2. Early use of MyIsam

      |                         | MyISAM        | InnoDB                          |
      |-------------------------|---------------|---------------------------------|
      | Transaction support     | Not supported | Supported                       |
      | Data locks              | Table locks   | Row locks                       |
      | Foreign key constraints | Not supported | Supported                       |
      | Full-text index         | Supported     | Not supported                   |
      | Tablespace size         | Smaller       | About twice as large as MyISAM  |

    General use operation:

    • MyISAM: saves space, fast
    • InnoDB: higher safety, transaction processing, multi-table and multi-user operations
  • The difference between the MySQL engine on the physical file

    • An InnoDB table has only a *.frm file in the database directory, plus the shared ibdata1 file in the parent directory
    • Files corresponding to a MyISAM table:
      • *.frm: table structure definition file
      • *.myd: data file
      • *.myi: index file
  • Character set encoding

    If not set, MySQL's default character set encoding is used, which does not support Chinese; specify charset=utf8 explicitly

Modify table

alter table table name rename as new table name; -- rename a table
alter table table name add field name column type; -- add a field
alter table table name modify field name column type; -- change a field's type/constraints
alter table table name change old field name new field name column type; -- rename a field (the column type must be restated)
alter table table name drop field name; -- delete a field

delete table

drop table [if exists] table name;

All creation and deletion should be judged as much as possible to avoid errors

be careful:

  • Field name, wrapped with ``
  • Comments – /**/
  • case insensitive
  • All symbols in English

Data management

foreign keys

create table if not exists `student` (
`id` int(4) not null auto_increment comment 'Student ID',
    `name` varchar(30) not null default 'anonymous' comment 'name',
    `pwd` varchar(20) not null default '123456' comment 'password',
    `gender` varchar(2) not null default 'female' comment 'gender',
    `birthday` datetime default null comment 'birthday',
    `gradeid` int(10) not null comment 'Student grade',
    `address` varchar(100) default null comment 'Home address',
    `email` varchar(50) default null comment 'email',
    primary key (`id`),
    key `FK_gradeid` (`gradeid`),
    constraint `FK_gradeid` foreign key (`gradeid`) references `grade` (`gradeid`)
) engine=innodb default charset=utf8;

When deleting tables that have a foreign-key relationship, first delete the table that references the other (the child table), then delete the referenced table (the parent table)

alter table `student`
add constraint `FK_gradeid` foreign key(`gradeid`) references `grade` (`gradeid`);

Physical (database-level) foreign keys are discouraged.

Best practice: enforce foreign-key relationships in application code.

DML language

data manipulation language

  • insert add

    insert into table name([field1, field2, ...]) values ('value1', 'value2', ...), ('value1', 'value2', ...), ...;
    

    If no field list is given, the values must match all columns of the table in order

  • update modification

    update table name set field1 = value1[, field2 = value2, ...] [where condition];
    

    Omitting the where condition changes every row; a value can be a literal or a variable

    operator:

    | Operator            | Meaning                           |
    |---------------------|-----------------------------------|
    | =                   | equal to                          |
    | <> or !=            | not equal to                      |
    | >                   | greater than                      |
    | <                   | less than                         |
    | >=                  | greater than or equal to          |
    | <=                  | less than or equal to             |
    | between ... and ... | within a range (closed interval)  |
    | and                 | logical AND                       |
    | or                  | logical OR                        |
  • delete

    1. delete

       delete from table name [where condition];

    Omitting the where condition deletes every row

    2. truncate

       truncate table tablename;
    

    delete and truncate:

    • Both delete the data but keep the table structure
    • truncate resets the auto-increment counter to zero
    • truncate does not run inside a transaction (it implicitly commits and cannot be rolled back)
    • delete and the auto-increment counter:
      • InnoDB: the counter restarts from 1 after a server restart (it is kept in memory and lost on power-off)
      • MyISAM: the counter continues from the previous value (it is stored in a file)
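The auto-increment behavior of DELETE can be demonstrated outside MySQL. A minimal sketch using Python's sqlite3 as a stand-in (SQLite has no TRUNCATE, and MySQL's engine-specific details differ; the table name `t` is made up):

```python
import sqlite3

# With AUTOINCREMENT, DELETE removes the rows but keeps the counter, so the
# next insert continues from the previous maximum id instead of reusing 1.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (id integer primary key autoincrement, name text)")
conn.execute("insert into t(name) values ('a'), ('b')")

conn.execute("delete from t")                   # rows gone, counter kept
conn.execute("insert into t(name) values ('c')")
new_id = conn.execute("select id from t").fetchone()[0]
print(new_id)   # 3, not 1 -- the auto-increment counter survived the DELETE
```

A TRUNCATE (in MySQL) would instead reset that counter, which is the difference the list above describes.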

DQL language

Data query language: the core language in the database.

    • Specify query fields

      select field from table;
      

      Aliases: use as to give a result a new name; both fields and tables can be aliased

      The function concat(a,b) string concatenation

      select studentno as 'student number', studentname as 'student name' from student;
      select concat('Name: ', studentname) as 'new name' from student as `student table`;
      

      Deduplication distinct, remove duplicate data, only display one

      select distinct studentno from result;
      

      database column (expression)

      select version();-- query system version
      select 1+2;-- evaluate expression
      select @@auto_increment_increment;-- query auto increment step size
      select studentno, studentresult+1 as 'adjusted score' from result;
      
    • where conditional clause

      Retrieve eligible values in data

      • Logical Operators

        | operator | syntax           | description |
        |----------|------------------|-------------|
        | and, &&  | a and b, a && b  | logical AND |
        | or, \|\| | a or b, a \|\| b | logical OR  |
        | not, !   | not a, !a        | logical NOT |

        Note: prefer the English keywords and/or/not

        select studentno, studentresult from result
        where studentresult>=95 and studentresult<=100;
              
        select studentno, studentresult from result
        where studentresult between 95 and 100;
              
        select studentno, studentresult from result
        where not studentno=1000;
        
      • Fuzzy query

        comparison operator

        | operator            | syntax            | description                          |
        |---------------------|-------------------|--------------------------------------|
        | is null             | a is null         | true if a is null                    |
        | is not null         | a is not null     | true if a is not null                |
        | between ... and ... | a between b and c | a is within [b, c]                   |
        | like                | a like b          | a matches the pattern b              |
        | in                  | a in (a1, a2, …)  | a equals one of the listed values    |

        % matches 0 or more characters; _ matches exactly 1 character

        select studentno, studentname from student
        where studentname like '%jia%';
              
        select studentno, studentname from student
        where studentno in (1001,1002,1003);
              
        select studentno, studentname from student
        where address='' or address is null;
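The like and in queries above can be reproduced in any SQL engine. A minimal sketch using Python's sqlite3 as a stand-in for MySQL (the student names are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (studentno integer, studentname text)")
conn.executemany("insert into student values (?, ?)",
                 [(1001, "jiaming"), (1002, "lihua"), (1003, "wangjia")])

# %jia% : 'jia' anywhere in the name (% = any run of characters)
like_rows = sorted(conn.execute(
    "select studentname from student where studentname like '%jia%'").fetchall())
print(like_rows)   # [('jiaming',), ('wangjia',)]

# in (...) : studentno equals one of the listed values
in_rows = sorted(conn.execute(
    "select studentno from student where studentno in (1001, 1003)").fetchall())
print(in_rows)     # [(1001,), (1003,)]
```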
        
      • Linked table query

        select s.studentno, studentname, subjectno, studentresult from student as s inner join result as r
        where s.studentno=r.studentno;
              
        select s.studentno, studentname, subjectno, studentresult from student as s left join result as r
        on s.studentno=r.studentno;
              
        select s.studentno, studentname, subjectno, studentresult from student as s right join result as r
        on s.studentno=r.studentno;
              
        select s.studentno, studentname, subjectno, studentresult from student as s left join result as r
        on s.studentno=r.studentno
        where studentresult is null;
        
        | operation  | description                                                               |
        |------------|---------------------------------------------------------------------------|
        | inner join | returns rows when there is at least one match in both tables              |
        | left join  | returns all rows from the left table, even with no match in the right     |
        | right join | returns all rows from the right table, even with no match in the left    |

        Note: join ... on expresses the join condition; a where clause can express the same equality filter

        select s.studentno,studentname,subjectname,studentresult
        from student as s
        right join result as r
        on r.studentno=s.studentno
        inner join subject as sub
        on r.subjectno=sub.subjectno;
        

        Self-join: own table and own table join

        Split a table into two identical tables

        select a.categoryName as 'parent column', b.categoryName as 'child column'
        from category as a, category as b
        where a.categoryid=b.pid;
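The self-join above can be checked with a small runnable example. A sketch using Python's sqlite3 as a stand-in for MySQL; the table and column names follow the example above, and the category data is made up:

```python
import sqlite3

# Self-join: the category table holds both parents and children; joining it to
# itself on a.categoryid = b.pid pairs each parent with its children.
conn = sqlite3.connect(":memory:")
conn.execute("create table category (categoryid integer, pid integer, categoryName text)")
conn.executemany("insert into category values (?, ?, ?)", [
    (2, 1, "IT"),
    (3, 1, "Photography"),
    (4, 2, "Databases"),          # pid 2 -> child of IT
    (5, 2, "Operating Systems"),  # pid 2 -> child of IT
])

rows = sorted(conn.execute("""
    select a.categoryName, b.categoryName
    from category as a, category as b
    where a.categoryid = b.pid
""").fetchall())
print(rows)   # [('IT', 'Databases'), ('IT', 'Operating Systems')]
```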
        
      • Pagination and sorting

        1. Sort: Ascending and Descending

          select subjectno, studentresult from result
          order by studentresult asc;
          

          asc ascending, desc descending

        2. Paging

          Syntax: limit start_index, page_size (start_index = (page - 1) * page_size)

          select subjectno, studentresult from result
          order by studentresult asc
          limit 0,2;
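The paging formula can be verified with a runnable sketch, again using Python's sqlite3 (which accepts the MySQL-style `limit offset, count` form; the score data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table result (subjectno integer, studentresult integer)")
conn.executemany("insert into result values (?, ?)",
                 [(1, 60), (1, 70), (1, 80), (1, 90), (1, 95)])

page, page_size = 2, 2
start = (page - 1) * page_size   # page 2 with size 2 starts at offset 2
rows = conn.execute(
    "select studentresult from result order by studentresult asc limit ?, ?",
    (start, page_size)).fetchall()
print(rows)   # [(80,), (90,)] -- the second page of two rows
```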
          
      • Subqueries

        Essence: Nest a subquery statement in the where statement

        select studentno, subjectno, studentresult
        from result
        where subjectno = (
        select subjectno from `subject`
        where subjectname='Advanced Mathematics-3'
        );
        

        Work from the inside out:

        select studentno, studentname from student
        where studentno in (
            select studentno from result
            where studentresult >= 80 and subjectno = (
                select subjectno from `subject`
                where subjectname = 'Advanced Mathematics'
            )
        );

  • Grouping and filtering

    group by group

    having filtered

    select subjectname, avg(studentresult), max(studentresult), min(studentresult)
    from result as r
    inner join `subject` as s
    on r.subjectno=s.subjectno
    group by r.subjectno
    having avg(studentresult)>=80;
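The group by / having query above runs unchanged on other SQL engines. A self-contained sketch using Python's sqlite3, with a made-up two-subject dataset:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table subject (subjectno integer, subjectname text)")
conn.execute("create table result (subjectno integer, studentresult integer)")
conn.executemany("insert into subject values (?, ?)", [(1, "Math"), (2, "English")])
conn.executemany("insert into result values (?, ?)",
                 [(1, 90), (1, 80), (2, 60), (2, 70)])

# group by collapses rows per subject; having filters the *groups* (where
# cannot, because the aggregate is only known after grouping).
rows = conn.execute("""
    select subjectname, avg(studentresult), max(studentresult), min(studentresult)
    from result as r
    inner join subject as s on r.subjectno = s.subjectno
    group by r.subjectno
    having avg(studentresult) >= 80
""").fetchall()
print(rows)   # [('Math', 85.0, 90, 80)] -- English (avg 65) is filtered out
```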
    
  • select complete syntax:

    select [all | distinct]
    {* | table.* | [table.field1 [as alias1]] [, table.field2 [as alias2]] [, ...]}
    from table_name [as table_alias]
    [left | right | inner join table_name2]
    [where...]
    [group by...]
    [having...]
    [order by...]
    [limit {[offset,] row_count | row_count offset offset}];
    

MySQL functions

Common functions

  • computation

    select abs(-8); -- absolute value
    select ceiling(5.2); -- round up
    select floor(5.2); -- round down
    select rand(); -- returns a random number between 0-1
    select sign(5); -- determine the sign of a number 0 returns 0, negative numbers return -1, positive numbers return 1
    
  • string

    select char_length('sdcdfcdf'); -- return the length of the string
    select concat('a','bc'); -- concatenate strings
    select insert('abcdefg',1,2,'tt'); -- starting at position 1, replace 2 characters with 'tt'
    select lower('ABC'); -- lowercase
    select upper('abc'); -- convert to uppercase
    select instr('abcdefg','c'); -- returns the index of the first occurrence
    select replace('abcddddd','abc','ttt'); -- replace string
    select substr('abcddddd',4,2); -- return the specified string (source string, intercepted position, intercepted length)
    select reverse('abc'); -- reverse the string
    
  • time and date functions

    select current_date(); -- get the current date
    select now(); -- get the current time
    select localtime(); -- get local time
    select sysdate(); -- get system time
    select year(now()); -- year
    select month(now()); -- month
    select day(now()); -- day
    select hour(now()); -- hour
    select minute(now()); -- minute
    select second(now()); -- seconds
    
  • System functions

    select system_user(); -- system user
    select user(); -- user
    select version(); -- version
    

Aggregate functions

| function name | description   |
|---------------|---------------|
| count()       | count         |
| sum()         | summation     |
| avg()         | average       |
| max()         | maximum value |
| min()         | minimum value |

select count(studentname) from student; -- count a specified column
select count(*) from student; -- count(*)
select count(1) from student; -- count(1)
  • the difference:
    • count specified fields will ignore null values
    • count(*), count(1) will not ignore null values
    • count(*), count(1) are essentially counting the number of rows
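The null-handling difference can be shown concretely. A sketch using Python's sqlite3 as a stand-in for MySQL, with one null name inserted on purpose:

```python
import sqlite3

# count(column) vs count(*) / count(1) when a row holds NULL.
conn = sqlite3.connect(":memory:")
conn.execute("create table student (studentname text)")
conn.executemany("insert into student values (?)", [("amy",), ("bob",), (None,)])

by_col, by_star, by_one = conn.execute(
    "select count(studentname), count(*), count(1) from student").fetchone()
print(by_col, by_star, by_one)   # 2 3 3 -- count(column) skipped the NULL row
```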

Database level md5

insert into testmd5
values (4, 'xiaoming', md5('123456'));
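What md5('123456') actually produces is the 32-character hex digest of the MD5 hash; Python's hashlib computes the same value:

```python
import hashlib

# MySQL's md5() returns the same hex string for the same input.
digest = hashlib.md5("123456".encode()).hexdigest()
print(digest)   # e10adc3949ba59abbe56e057f20f883e
```

Note that plain MD5 is easily reversed for common passwords; in practice a salted, slow hash is preferred.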

Transactions

Definition

Either all succeed or all fail.

Execute a set of SQL in one batch

Principles

https://blog.csdn.net/dengjili/article/details/82468576

ACID principle

  • Atomicity: Atomicity means that a transaction is an indivisible unit of work, and the operations in the transaction either all occur or none of them occur
  • Consistency: The integrity of the data before and after the transaction must be consistent
  • Isolation: The isolation of a transaction is that when multiple users access the database concurrently, the transaction opened by the database for each user cannot be interfered by the operation data of other transactions, and multiple concurrent transactions must be isolated from each other
  • Durability: once a transaction is committed, its changes to the data in the database are permanent and are not lost even if the database fails

Isolate the resulting problem

  • Dirty read: refers to a transaction that reads uncommitted data from another transaction
  • Non-repeatable read: within one transaction, reading the same row multiple times gives different results (not necessarily an error, but wrong in some scenarios)
  • Phantom read: within one transaction, reads see rows inserted by other transactions, so successive reads are inconsistent (typically an extra row appears)

execute transaction

MySQL enables automatic transaction commit by default

set autocommit = 0; -- off
set autocommit = 1; -- on (the default)

start transaction -- transaction start

insert xxx
insert yyy

commit; -- commit
rollback; -- rollback

-- end of transaction

Set a savepoint in a transaction: savepoint savepoint_name;

Roll back to a savepoint: rollback to savepoint savepoint_name;

Remove a savepoint: release savepoint savepoint_name;
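The commit/rollback flow above can be exercised end to end. A sketch using Python's sqlite3 as a stand-in for MySQL (the account table and the simulated mid-transfer failure are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table account (name text, money integer)")
conn.execute("insert into account values ('A', 100), ('B', 100)")
conn.commit()

try:
    # start of the transfer transaction
    conn.execute("update account set money = money - 50 where name = 'A'")
    raise RuntimeError("transfer failed mid-way")   # simulate an error
    conn.execute("update account set money = money + 50 where name = 'B'")
    conn.commit()   # all succeed
except RuntimeError:
    conn.rollback()  # ...or all fail: the half-done debit is undone

balance_a = conn.execute("select money from account where name='A'").fetchone()[0]
print(balance_a)   # 100 -- the rolled-back transfer left no trace
```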

Transaction isolation level

| setting          | description                                                                  |
|------------------|------------------------------------------------------------------------------|
| Serializable     | avoids dirty reads, non-repeatable reads, and phantom reads (serialization)  |
| Repeatable read  | avoids dirty reads and non-repeatable reads                                  |
| Read committed   | avoids dirty reads                                                           |
| Read uncommitted | lowest level; none of the above is guaranteed                                |
  • Read Uncommitted: Allows dirty reads, that is, it is possible to read data modified by uncommitted transactions in other sessions

  • Read Committed: Only the committed data can be read. Most databases such as Oracle default to this level (non-repeatable read)

  • Repeatable Read: queries within the same transaction see data consistent with the start of the transaction; the default level for InnoDB. In the SQL standard this level eliminates non-repeatable reads, but phantom reads can still occur

  • Serializable: fully serialized read, each read needs to obtain a table-level shared lock, and reads and writes will block each other

According to the SQL:1992 standard's transaction isolation levels, InnoDB defaults to REPEATABLE READ. MySQL/InnoDB provides all four transaction isolation levels described by the SQL standard. You can use the --transaction-isolation option on the command line, or in an options file, to set the default isolation level for all connections. For example, you can set this option in the [mysqld] section of the my.cnf file like this:

transaction-isolation = {READ-UNCOMMITTED | READ-COMMITTED | REPEATABLE-READ | SERIALIZABLE}

Users can use the SET TRANSACTION statement to change the isolation level for a single session or for all incoming connections. Its syntax is as follows:

SET [SESSION | GLOBAL] TRANSACTION ISOLATION LEVEL {READ UNCOMMITTED | READ COMMITTED | REPEATABLE READ | SERIALIZABLE}

Note: the default behavior (without SESSION or GLOBAL) is to set the isolation level for the next (not yet started) transaction. With the GLOBAL keyword, the statement sets the default transaction level globally for all new connections created from that point on (existing connections are unaffected); this requires the SUPER privilege. With the SESSION keyword, it sets the default transaction level for future transactions on the current connection. Any client can freely change its session isolation level (even in the middle of a transaction), or set the isolation level for its next transaction.

You can query the global and session transaction isolation levels with the following statement:

SELECT @@global.tx_isolation;
SELECT @@session.tx_isolation;
SELECT @@tx_isolation;

index

http://blog.codinglabs.org/articles/theory-of-mysql-index.html

Index (Index) is a data structure that helps MySQL obtain data efficiently

Index Classification

  • Primary key index (primary key)
    • Uniquely identifies a record; the primary key cannot repeat; a table can have only one primary key
  • Unique index (unique key)
    • Avoid duplicate columns, multiple columns can be identified as unique indexes
  • Regular index (key/index)
    • default
  • Full text index (fulltext)
    • Supported only under specific database engines (originally MyISAM)
    • Quickly locate data
show index from table name; -- show all index information
alter table table name add fulltext index name (column name); -- add index
explain sql statement; -- analyze the status of sql execution

Indexing Principles

  • More indexes are not always better

  • Do not add indexes to frequently changing data

  • Do not add indexes to tables with small amounts of data

  • Indexes are usually added to fields that are frequently used in queries

Indexed data structure

Hash type index

Btree: InnoDB's default index data structure

http://blog.codinglabs.org/articles/theory-of-mysql-index.html

  • B-Tree and B+Tree

At present, most database systems and file systems use B-Tree or its variant B+Tree as the index structure.

  • B-Tree

To describe a B-Tree, first define a data record as a two-tuple [key, data]: key is the record's key value, distinct for different records, and data is the record's data apart from the key. A B-Tree is then a data structure satisfying the following conditions:

d is a positive integer greater than 1, called the degree of B-Tree.

h is a positive integer called the height of the B-Tree.

Each non-leaf node consists of n-1 keys and n pointers, where d<=n<=2d.

Each leaf node contains at least one key and two pointers, and at most 2d-1 keys and 2d pointers, and the pointers of the leaf nodes are all null.

All leaf nodes have the same depth, equal to the tree height h.

The key and the pointer are spaced apart from each other, and the two ends of the node are pointers.

The keys in a node are arranged non-decreasingly from left to right.

All nodes form a tree structure.

Each pointer is either null or points to another node.

If a pointer is at the far left of a node and is not null, then all keys in the node it points to are less than v(key1), where v(key1) is the value of the node's first key.

If a pointer is at the far right of a node and is not null, then all keys in the node it points to are greater than v(keym), where v(keym) is the value of the node's last key.

If a pointer lies between two adjacent keys keyi and keyi+1 of a node and is not null, then all keys in the node it points to are less than v(keyi+1) and greater than v(keyi).

Due to these properties, the algorithm for retrieving data by key in a B-Tree is very intuitive: first, perform a binary search within the root node; if found, return the corresponding node's data, otherwise recursively search the node pointed to by the pointer of the corresponding interval, until a node is found or a null pointer is encountered (the former is success, the latter failure). Pseudocode for the B-Tree search algorithm:

BTree_Search(node, key)
{
    if (node == null) return null;
    foreach (i in node.keys)
    {
        if (node.key[i] == key) return node.data[i];
        if (node.key[i] > key) return BTree_Search(node.point[i], key);
    }
    return BTree_Search(node.point[i+1], key);
}
data = BTree_Search(root, my_key);
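The pseudocode above can be made runnable. A minimal Python sketch with a hypothetical node structure (sorted keys, parallel data, and children[i] holding the subtree left of keys[i]); this illustrates the search algorithm, not InnoDB's actual page layout:

```python
class Node:
    def __init__(self, keys, data, children=None):
        self.keys, self.data = keys, data
        # children[i] is the subtree left of keys[i]; children[-1] is rightmost
        self.children = children or [None] * (len(keys) + 1)

def btree_search(node, key):
    if node is None:
        return None                                     # fell off the tree: miss
    for i, k in enumerate(node.keys):
        if k == key:
            return node.data[i]                         # hit in this node
        if k > key:
            return btree_search(node.children[i], key)  # descend left of k
    return btree_search(node.children[-1], key)         # key > all keys here

leaf1 = Node([10, 20], ["a", "b"])
leaf2 = Node([40, 50], ["d", "e"])
root = Node([30], ["c"], [leaf1, leaf2])
print(btree_search(root, 40))   # d
print(btree_search(root, 15))   # None
```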

There are a number of interesting properties of B-Trees. For example, for a B-Tree of degree d indexing N keys, the upper limit of its tree height h is log_d((N+1)/2); to retrieve a key, the asymptotic complexity of the number of nodes visited is O(log_d N). From this it can be seen that a B-Tree is a very efficient index data structure.

In addition, since inserting and deleting new data records will destroy the properties of B-Trees, when inserting and deleting, it is necessary to perform operations such as splitting, merging, and transferring the tree to maintain the properties of B-Trees.

  • B+Tree

There are many variants of B-Tree, the most common of which is B+Tree. For example, MySQL generally uses B+Tree to implement its index structure.

Compared with B-Tree, B+Tree has the following differences:

The upper bound of the pointer per node is 2d instead of 2d+1.

Inner nodes do not store data, only keys; leaf nodes do not store pointers.

Since not all nodes have the same fields, leaf nodes and inner nodes in a B+Tree are generally of different sizes. This differs from a B-Tree: although the number of keys and pointers stored in different B-Tree nodes may vary, every node has the same fields and the same upper limits, so implementations often allocate the same amount of space for every node.

Generally speaking, B+Tree is more suitable for implementing an external storage index structure than B-Tree. The specific reasons are related to the principle of external storage and the principle of computer access, which will be discussed below.

  • B+Tree with sequential access pointers

The B+Tree structures generally used in database systems or file systems are optimized on the basis of classic B+Trees, adding sequential access pointers.

A B+Tree with sequential access pointers adds, in each leaf node, a pointer to the adjacent leaf node. The purpose of this optimization is to improve the performance of range access: for example, in Figure 4, to query all data records with keys from 18 to 49, once 18 is found you only need to traverse the nodes and pointers in order to reach all the data nodes at once, which greatly improves the efficiency of range queries.

This section gives a brief introduction to B-Tree and B+Tree. The next section combines the memory access principle to introduce why B+Tree is currently the preferred data structure for database systems to implement indexes.

  • Why use B-Tree (B+Tree)

Data structures such as red-black trees can also be used to implement indexes, but file systems and database systems generally use B-/+Tree as the index structure.

In general, the index itself is also very large and cannot all be stored in memory, so indexes are often stored on disk as index files. In this case, disk I/O is incurred during index lookups, and compared with memory access, I/O access costs several orders of magnitude more. Therefore the most important measure of a data structure's quality as an index is the asymptotic complexity of the number of disk I/O operations during a lookup; in other words, the index structure should minimize the number of disk I/O accesses during a search. The following first introduces the principles of memory and disk access, then uses these principles to analyze the efficiency of B-/+Tree as an index.

  • Main memory access principle

At present, the main memory used by computers is basically random read-write memory (RAM). The structure and access principle of modern RAM are more complicated. Here, the specific differences are abandoned, and a very simple access model is abstracted to illustrate the working principle of RAM.

From an abstract point of view, main memory is a matrix of a series of memory cells, each of which stores a fixed size of data. Each storage unit has a unique address. The addressing rules of modern main memory are more complicated. Here, it is simplified to a two-dimensional address: a row address and a column address can uniquely locate a storage unit.

The main memory access process is as follows:

When the system needs to read main memory, it puts the address signal on the address bus and passes it to main memory. After main memory reads the address signal, it parses the signal, locates the specified storage unit, and then puts that unit's data on the data bus for other components to read.

The process of writing to the main memory is similar. The system places the unit address and data to be written on the address bus and data bus respectively, and the main memory reads the contents of the two buses and performs the corresponding write operation.

It can be seen here that main-memory access time is linearly related only to the number of accesses: because there is no mechanical operation, the “distance” between two accessed addresses has no effect on the time. For example, reading A0 and then A1 takes the same time as reading A0 and then D3.

  • The principle of disk access

As mentioned above, indexes are generally stored on disk in the form of files, and index retrieval requires disk I/O operations. Unlike main memory, disk I/O has mechanical movement costs, so the time consumption of disk I/O is huge.

A disk consists of coaxial circular platters of the same size, which rotate in sync. On one side of the disk there is a head bracket that fixes a group of heads, each head responsible for accessing the contents of one platter surface. The heads cannot rotate, but they can move along the radius of the platter (actually an oblique tangential movement; multi-head independent technology also exists but is not covered here).

The platter is divided into a series of concentric rings, the center of which is the center of the platter, each concentric ring is called a track, and all tracks with the same radius form a cylinder. The track is divided into small segments along the radius line, each segment is called a sector, and each sector is the smallest storage unit of the disk. For simplicity, we assume below that the disk has only one platter and one head.

When data needs to be read from the disk, the system will transmit the logical address of the data to the disk, and the control circuit of the disk translates the logical address into a physical address according to the addressing logic, that is, to determine which track and sector the data to be read is in. In order to read the data in this sector, the head needs to be placed above the sector. In order to achieve this, the head needs to move to align with the corresponding track. This process is called seek, and the time it takes is called seek time. Then the disk rotates to The target sector rotates under the head, and the time spent in this process is called rotation time.

  • Locality principle and disk read-ahead

Due to the characteristics of the storage medium, the access of the disk itself is much slower than that of the main memory. In addition to the cost of mechanical movement, the access speed of the disk is often one-hundredth of the main memory. Therefore, in order to improve efficiency, it is necessary to reduce the number of disks as much as possible. I/O. In order to achieve this purpose, the disk is often not read strictly on demand, but will read ahead every time. Even if only one byte is required, the disk will start from this position and sequentially read data of a certain length backward into memory. The rationale for this is the well-known locality principle in computer science:

When a piece of data is used, its nearby data is usually used immediately.

The data required during program operation is usually concentrated.

Since disk sequential reads are very efficient (no seek time and only very little rotation time), read-ahead can improve I/O efficiency for programs that exhibit locality.

The read-ahead length is generally an integer multiple of a page. A page is a logical block of memory management: hardware and operating systems often divide main memory and disk storage into contiguous blocks of equal size, each called a page (in many operating systems a page is 4 KB), and main memory and disk exchange data in units of pages. When the data a program needs is not in main memory, a page-fault exception is triggered: the system sends a read signal to the disk, the disk locates the starting position of the data and reads one or more consecutive pages into memory, the exception handler returns, and the program continues to run.

  • Performance analysis of B-/+Tree indexes

At this point, we can finally analyze the performance of the B-/+Tree index.

As mentioned above, the number of disk I/Os is generally used to evaluate the pros and cons of the index structure. First, from the B-Tree analysis, according to the definition of B-Tree, it can be known that a retrieval needs to visit h nodes at most. The designers of the database system cleverly used the principle of disk read-ahead to set the size of a node equal to a page, so that each node can be fully loaded with only one I/O. In order to achieve this goal, the following techniques need to be used in the actual implementation of B-Tree:

Each time a new node is created, it directly applies for a page of space, which ensures that a node is also physically stored in a page. In addition, the computer storage allocation is page-aligned, so that only one I/O is required for a node.

A retrieval in B-Tree requires at most h-1 I/Os (the root node is resident in memory), and the asymptotic complexity is O(h) = O(log_d N). In practical applications the out-degree d is very large, usually more than 100, so h is very small (usually no more than 3).

To sum up, it is very efficient to use B-Tree as an index structure.

In a red-black tree, h is obviously much larger. Since logically adjacent nodes (parent and child) may be physically far apart, locality cannot be exploited; so although the I/O asymptotic complexity of a red-black tree is also O(h), it is significantly less efficient than a B-Tree.

As can be seen from the above analysis, the larger the d, the better the performance of the index, and the upper limit of the out-degree depends on the size of the key and data in the node:

dmax = floor(pagesize / (keysize + datasize + pointsize))

floor means rounding down. Since the nodes of a B+Tree carry no data field, a B+Tree can have a larger out-degree and therefore better performance.
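As a rough worked example (all the numbers below are illustrative assumptions, not measurements): with a 16 KB InnoDB page, an 8-byte BIGINT key, and an 8-byte child pointer, an internal B+Tree node (datasize = 0) would give:

```sql
-- Illustrative sizing only; a real InnoDB page also carries headers
-- and per-record overhead, so the actual out-degree is smaller.
SELECT FLOOR(16384 / (8 + 8)) AS d_max;   -- 1024 keys per internal node
-- A tree of height 3 can then address roughly 1024^3, about 10^9 rows,
-- with only 2-3 page reads per lookup.
```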

Privilege management and backup

User Management

User table: mysql.user

create user

create user username identified by 'password';

change Password

set password = password('new password'); -- modify the current user password
set password for username = password('new password'); -- modify the specified user password

Rename

rename user username to new username;

User authorization

grant all privileges on *.* to username;
grant permission on database_name.table_name to user_name;

all privileges grants all basic privileges, but not the right to pass privileges on to others

with grant option additionally allows the user to grant their privileges to others

Query permissions

show grants for username; -- view the permissions of the specified user
show grants for username@host; -- view the permissions of a user at a specific host (e.g. root@localhost)

Revoke permission

revoke all privileges on *.* from username;
revoke permission on database_name.table_name from username;

delete users

drop user username;
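Putting the statements above together, a typical session might look like this (the user name `shopper` and database `shop` are made-up examples; the database is assumed to exist):

```sql
create user shopper identified by 'secret';
grant select, insert on shop.* to shopper with grant option;
show grants for shopper;            -- verify what was granted
revoke insert on shop.* from shopper;
drop user shopper;                  -- remove the account entirely
flush privileges;                   -- reload the grant tables
```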

MySQL data backup

  • Guarantee important data is not lost
  • Data transfer

How to backup database:

  • Directly copy physical files

  • mysqldump command line use

mysqldump -h host -u username -ppassword database_name table_name > physical_disk_location/file_name
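For example (host, credentials, database, and file names are placeholders; note there is no space between `-p` and the password):

```shell
# back up a single table of the `shop` database to a file
mysqldump -h localhost -uroot -p123456 shop goods > /backup/shop_goods.sql
# restore it later by feeding the dump back into the client
mysql -h localhost -uroot -p123456 shop < /backup/shop_goods.sql
# alternatively, from inside the mysql client:
#   source /backup/shop_goods.sql
```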

Standard database design

Database Design

  • Requirements analysis: analyze the business and the data the database needs to handle
  • Conceptual design: draw the entity-relationship (E-R) diagram

Three paradigms

A badly designed schema leads to:

  • data duplication
  • update anomalies
  • insertion anomalies
  • deletion anomalies

Three paradigms:

https://www.cnblogs.com/wsg25/p/9615100.html

There are currently six normal forms for relational databases: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), Boyce-Codd Normal Form (BCNF), Fourth Normal Form (4NF), and Fifth Normal Form (5NF, also known as Perfect Normal Form).

In practice the most commonly used are the first three: 1NF, 2NF, and 3NF, hence the “three normal forms”.

  • First Normal Form (1NF): atomicity; every column holds indivisible atomic values
  • Second Normal Form (2NF): satisfies 1NF; each table describes only one thing, and every non-key column must depend on the whole primary key, not just part of it (this matters mainly for composite primary keys), eliminating partial dependence of non-key attributes on the primary key
  • Third Normal Form (3NF): satisfies 2NF; every column must depend directly on the primary key rather than indirectly, eliminating transitive dependencies
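A small sketch of what normalization fixes in practice (table and column names are invented): an order table that also stores the customer's city repeats data and breaks 3NF, because the city depends on the customer, not on the order. Splitting the table removes the transitive dependency:

```sql
-- Unnormalized (3NF violation): customer_city depends on customer_id,
-- which depends on order_id - a transitive dependency.
-- create table orders(order_id int, customer_id int, customer_city varchar(50));

-- Normalized into two tables, each describing one thing:
create table customer(
  customer_id int primary key,
  city varchar(50)
);
create table orders(
  order_id int primary key,
  customer_id int,
  foreign key (customer_id) references customer(customer_id)
);
```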

Normative and performance issues

As a rule of thumb, a join query should involve no more than three tables.

  • Considering commercial needs and goals (cost, user experience), database performance is more important
  • When performance is the concern, normalization still needs to be weighed appropriately
  • Deliberately add some redundant fields to some tables (turning multi-table queries into single-table queries)
  • Deliberately add some computed columns (turning large-volume queries into small-volume queries)

Integrity constraints

https://zhuanlan.zhihu.com/p/82319482

Data integrity constraints are data-checking rules enforced on tables and fields. To keep non-conforming data out of the database, the DBMS automatically checks data against these constraints whenever users insert, modify, or delete it (mainly constraints on null values and duplicate values), so that data that does not meet the rules cannot enter the database, guaranteeing the integrity and accuracy of stored data.

  • Integrity classification

    • Entity Integrity

Used to identify each record in the table; one record represents one entity

      1. Primary key constraints

        • There can only be one primary key in each table.

        • The primary key value must be non-null and non-repeating.

        • You can set a single-field primary key or a multi-field joint primary key.

        • With a composite (joint) primary key, the constraint is violated only when the values of all key fields match an existing row.

          -- Add primary key constraint when creating table
          create table <table name>(
          <field name 1> <field type 1>,
          <field name 2> <field type 2>,
          ......
          <field name n> <field type n>,
          [constraint primary key constraint name] primary key(field name 1[, field name 2,...field name n]));
          -- Add primary key constraint after table creation
          alter table <table name> add [constraint primary key constraint name] primary key (field name);
          -- remove the primary key constraint
          alter table <table name> drop primary key;
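Filling in the templates above with a made-up `student`/`score` schema, a single-field key and a composite key look like this:

```sql
-- single-field primary key, declared inline
create table student(
  id int primary key,
  name varchar(30)
);
-- composite (multi-field) primary key, added after creation
create table score(
  student_id int,
  course_id int,
  grade int
);
alter table score add constraint pk_score primary key (student_id, course_id);
```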
          
      2. Unique constraint (unique)

        • The data of the specified column cannot be repeated.

        • Can be null; in MySQL (InnoDB) a unique index accepts multiple NULL values, because NULL is not considered equal to NULL.

          -- add a unique constraint when creating a table
          create table <table name> (
          <field name 1> <field type 1>,
          <field name 2> <field type 2>,
          ...
          <field name n> <field type n>,
          [constraint unique constraint name] unique (field name 1[, field name 2...field name n]));
          -- Add a unique constraint after creating the table
          alter table <table name> add [constraint unique constraint name] unique (field name);
          -- remove the unique constraint
          alter table <table name> drop index <unique constraint name>;
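A concrete sketch (table and constraint names are invented); note that on InnoDB several rows with NULL in the unique column are accepted:

```sql
create table account(
  id int primary key,
  email varchar(100),
  constraint uq_email unique (email)
);
insert into account values (1, 'a@example.com');
-- insert into account values (2, 'a@example.com');  -- fails: duplicate entry
insert into account values (3, null);                 -- allowed
insert into account values (4, null);                 -- also allowed: NULL <> NULL
```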
          
      3. Auto-increment column (auto_increment)

        • The data of the specified field grows automatically.

        • Used with primary keys, and only for integer types.

        • The default value starts from 1, and the value of this field will increase by 1 for each additional record.

        • Even after rows are deleted, numbering continues from the highest value already allocated; deleted numbers are not reused.

          -- Add autogrow constraint when creating table
          create table <table name> (
          <field name 1> <field type 1> primary key auto_increment,
          <field name 2> <field type 2>,
          ...
          <field name n> <field type n>);
          -- Add autogrow constraint after table creation
          alter table <table name> modify <field name> <field type> auto_increment;
          -- set auto-growth initial value
          alter table <table name> auto_increment=initial value;
          -- remove autogrow constraint
          alter table <table name> modify <field name> <field type>;
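For example (the `log_entry` table is invented):

```sql
create table log_entry(
  id int primary key auto_increment,
  msg varchar(200)
);
insert into log_entry (msg) values ('first');   -- id = 1
insert into log_entry (msg) values ('second');  -- id = 2
delete from log_entry where id = 2;
insert into log_entry (msg) values ('third');   -- id = 3, not 2
alter table log_entry auto_increment = 100;     -- next id will be 100
```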
          
    • Domain Integrity

Domain integrity constrains the values of individual columns; it is checked per row, without reference to other rows.

      1. Not null constraint (not null)

        • The value of the field cannot be empty

          -- Add a not-null constraint when creating a table
          create table <table name> (
          <field name 1> <field type 1> not null,
          <field name 2> <field type 2>,
          ...
          <field name n> <field type n>);
          -- Add a non-null constraint after creating the table
          alter table <table name> modify <field name> <field type> not null;
          -- remove non-null constraints
          alter table <table name> modify <field name> <field type> [null];
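For example (the `member` table is invented):

```sql
create table member(
  id int primary key,
  name varchar(30) not null
);
-- insert into member (id) values (1);       -- fails in strict mode: name has no value
insert into member values (1, 'alice');       -- ok
alter table member modify name varchar(30);   -- drop the not-null constraint
```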
          
      2. Default constraints (default)

        • If this field is not assigned a value when a new record is inserted, the system will automatically assign this field the value set by the default constraint.

        • If an explicit NULL is inserted, the default value is not used; the default applies only when no value is supplied for the column.

          -- add default constraints when creating table
          create table <table name> (
          <field name 1> <field type 1> default default value,
          <field name 2> <field type 2>,
          ...
          <field name n> <field type n>);
          -- Add default constraints after table creation
          alter table <table name> modify <field name> <field type> default default value;
          -- remove default constraints
          alter table <table name> modify <field name> <field type>;
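For example (the `task` table is invented):

```sql
create table task(
  id int primary key,
  status varchar(10) default 'open'
);
insert into task (id) values (1);    -- status becomes 'open'
insert into task values (2, null);   -- status is NULL: an explicit NULL overrides the default
```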
          
    • Referential integrity

      Referential integrity refers to the correspondence between multiple tables. When performing operations such as data insertion, update, and deletion in one table, the DBMS will compare it with another table to avoid irregular operations and ensure the integrity of data storage.

      • foreign key constraints

        The value of a field in one table depends on the value of a field in another table

        The table containing the referenced primary key is the parent (primary) table; the table containing the foreign key is the child table

        Each foreign key value must correspond to a primary key value in another table

        -- Add foreign key constraint when creating table
        create table <table name> (
        <field name 1> <field type 1>,
        <field name 2> <field type 2>,
        ...
        <field name n> <field type n>,
        [constraint foreign key constraint name] foreign key (field name) references <main table> (primary key field));
        -- Add foreign key constraints after table creation
        alter table <table name> add [constraint foreign key constraint name] foreign key (field name) references <main table> (primary key field);
        -- remove foreign key constraints
        alter table <table name> drop foreign key <foreign key constraint name>;
        

        Precautions:

        1. Delete the child table before deleting the parent table
        2. Or drop the foreign key constraint first, then drop the table
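A concrete sketch (the `grade`/`student` schema and constraint name are invented), including the drop order described above:

```sql
create table grade(
  grade_id int primary key,
  grade_name varchar(30)
);
create table student(
  id int primary key,
  name varchar(30),
  grade_id int,
  constraint fk_student_grade foreign key (grade_id) references grade(grade_id)
);
-- drop order matters: remove the constraint (or the child table) first
alter table student drop foreign key fk_student_grade;
drop table student;   -- child
drop table grade;     -- parent
```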
    • User defined integrity

      It is defined according to the actual integrity requirements of the user. When performing operations such as data insertion and update, the DBMS will check whether the data meets the limited conditions in the check constraints, and avoid operations that do not meet the conditions to ensure the accuracy of data storage.

    • Check constraints (check)

      Specifies a condition that inserted or updated data must satisfy. Note that MySQL only enforces CHECK constraints from version 8.0.16 onward; earlier versions parse the clause but ignore it.

         -- Add a check constraint when creating the table
         create table <table name> (
         <field name 1> <field type 1>,
         <field name 2> <field type 2>,
         ...
         <field name n> <field type n>,
         check(<qualifier>));
         -- Add a check constraint after creating the table
         alter table <table name> add constraint <check constraint name> check(<qualification condition>);
         -- remove check constraint
         alter table <table name> drop constraint <check constraint name>;
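A concrete sketch (the `employee` table and constraint name are invented; enforced only on MySQL 8.0.16 and later):

```sql
create table employee(
  id int primary key,
  age int,
  constraint chk_age check (age between 18 and 65)
);
-- insert into employee values (1, 17);        -- rejected on MySQL >= 8.0.16
insert into employee values (2, 30);            -- ok
alter table employee drop constraint chk_age;   -- MySQL 8.0.19+ syntax
```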