Redis Introduction

56 minute read


NoSQL overview

In the era of big data, traditional relational databases can no longer analyze and process the data volumes involved.

  1. The era of the stand-alone MySQL server

    App -> DAL -> MySQL

    Traffic was modest, a single data source was entirely sufficient, and the server was not under much pressure. This breaks down when:

    • The amount of data is too large for one machine
    • The indexes on the data are too large to fit in the machine’s memory
    • The traffic (mixed reads and writes) is more than one server can bear
  2. Cache (Memcached) + MySQL + vertical split

    Read-write separation across multiple databases, with a cache to keep reads efficient (solves the read problem)

    Evolution: optimize data structures and indexes -> file cache (reduce IO) -> Memcached

  3. Sharding (splitting databases and tables) + horizontal split

    Splitting databases and tables relieves the write pressure

    MyISAM -> InnoDB

  4. Technology explosion

    Geolocation, hot lists, and similar features

    A relational database like MySQL is no longer enough: data volumes are large and change rapidly

    Storing relatively large content makes the tables huge and queries inefficient

    This calls for NoSQL databases, which handle these situations well

  • NoSQL: Not Only SQL

    Refers to non-relational databases in general

    Traditional relational databases struggle to cope with the demands of the Web 2.0 era

    Relational database: tables with rows and columns

    Non-relational database: data is stored as key-value pairs, with no fixed schema required (see the small example after this list)

  • NoSQL features

    1. Easy to expand (no relationship between data, easy to expand)
    2. Large data volume and high performance
    3. The data types are diverse (no need to design the database in advance, just take it and use it)
    4. The difference between traditional RDBMS and NoSQL
      • RDBMS
        1. Structured organization
        2. Structured Query SQL
        3. Data and relationships are stored in separate tables
        4. Data manipulation, data definition language
        5. Strict Consistency
        6. Basic Transaction Operations
      • NoSQL
        1. More than just data
        2. No fixed query language
        3. Key-value store, column store, document store, graph database
        4. Eventual consistency
        5. CAP theorem and BASE theory
        6. High performance, high availability, high scalability
  • Big Data Era

    1. 3V
      • Volume
      • Variety
      • Velocity
    2. The three highs
      • High concurrency
      • High scalability
      • High performance
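
As a small illustration of the data-model difference above, here is the same user record stored the relational way versus the Redis way (a sketch; the key and field names are made up):

    # Relational model: a users(id, name, age) table has to be designed first,
    # and every row must follow that schema.
    # Non-relational model (Redis): just store the data under a key, no schema needed.
    set user:1:name alice            # plain key-value pairs
    hset user:1 name alice age 30    # or a hash holding the whole object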

Four major categories of NoSQL

KV (key-value) stores

  • Redis
  • Memcached

Document database

BSON format (binary JSON)

  • MongoDB: a document database based on distributed storage, mainly used to store large numbers of documents
  • CouchDB

Column store databases

  • HBase
  • Distributed file systems

Graph databases

  • It’s not about storing graphics, it’s about storing relationships
  • Neo4j
  • InfoGrid
A comparison of the four categories:

  • Key-Value
    • Examples: Tokyo Cabinet/Tyrant, Redis, Voldemort, Oracle BDB
    • Typical usage: content caching, handling high access load on large amounts of data, and some logging systems
    • Data model: (key, value) pairs, usually implemented with a hash table
    • Advantages: fast lookups
    • Disadvantages: data is unstructured and is usually treated as strings or binary data
  • Column Store
    • Examples: Cassandra, HBase, Riak
    • Typical usage: distributed file systems
    • Data model: clustered columns; the data of one column is stored together
    • Advantages: fast lookups; easy to scale out
    • Disadvantages: limited functionality
  • Document Database
    • Examples: CouchDB, MongoDB
    • Typical usage: web applications (similar to key-value, but the value is structured, so the database knows what the value contains)
    • Data model: (key, value) pairs where the value is structured data
    • Advantages: loose schema constraints; the table structure can change without being defined in advance
    • Disadvantages: poor query performance; no unified query syntax
  • Graph Database
    • Examples: Neo4J, InfoGrid, Infinite Graph
    • Typical usage: social networks, recommendation systems, and other scenarios focused on building relationship graphs
    • Data model: graph structure
    • Advantages: can apply graph algorithms such as DFS
    • Disadvantages: sometimes the whole graph must be traversed to get an answer, and it is not well suited to distributed deployment

Redis

Redis (Remote Dictionary Server) is one of the most popular NoSQL technologies at the moment.

Function

  • In-memory storage with persistence (RDB/AOF)
  • High performance; can be used as a cache
  • Publish/subscribe system
  • Geospatial (map) information analysis
  • Timers and counters

Features

  1. Diverse data types
  2. Persistence
  3. Cluster
  4. Transactions

Usage

  1. Start the server with a specified configuration file: redis-server config/redis.conf (the binaries are typically installed under /usr/local/bin/)
  2. Connect to the server with the Redis client: redis-cli -p 6379
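
A minimal start-and-check session, assuming the configuration file path used above (paths will differ per installation):

    $ redis-server config/redis.conf   # start the server with the given config file
    $ redis-cli -p 6379                # connect with the command-line client
    127.0.0.1:6379> ping               # quick health check
    PONG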

Basics

Redis has 16 databases by default and uses database 0 by default; you can switch databases with select <index>

  • dbsize: View the current database size
  • flushdb: flush the current database
  • flushall: flush all databases
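
A short redis-cli session exercising these commands (a sketch; the sizes shown are examples):

    127.0.0.1:6379> select 1        # switch to database 1
    OK
    127.0.0.1:6379[1]> dbsize       # number of keys in the current database
    (integer) 0
    127.0.0.1:6379[1]> flushdb      # clear only the current database
    OK
    127.0.0.1:6379[1]> flushall     # clear all databases
    OK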

Redis is single-threaded and operates entirely in memory, so the CPU is not the performance bottleneck; the bottlenecks are the machine’s memory capacity and network bandwidth.

Redis data types

Five basic data types

Redis is an open source (BSD licensed), in-memory data structure store that can be used as a database, cache, and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, HyperLogLogs, and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions, and different levels of on-disk persistence, and provides high availability through Redis Sentinel and automatic partitioning with Redis Cluster.

  • Redis-key

      keys * #View all keys
      set name abc #Set a key
      exists name #Check whether the key exists
      move name 1 #Move the key to the specified database
      expire name 10 #Set the expiration time (in seconds)
      ttl name #View the remaining time to live of the current key
      get name #Get the value of the current key
      type name #View the type of the value stored at the key
    
  • String

      append key1 "hello" #Append a string to the key, if it does not exist, it is equivalent to set key
      strlen key1 #Get the length of the string
      incr views #self-increment 1
      decr views #self-decrease 1
      incrby views 10 #Set the step size to specify the increment
      decrby views 5 #Set the step size to specify the decrement
      getrange key1 0 3 #Intercept string range [0,3]
      getrange key1 0 -1 #Intercept string range [0, end]
      setrange key1 0 xx #replace the string starting at the specified position
      setex key2 30 "hello" #Set the value together with an expiration time (in seconds)
      setnx mykey abc #Set the value only if the key does not exist
      mset key1 v1 key2 v2 key3 v3 #Set multiple keys in one command
      mget key1 key2 key3 #Get multiple keys in one command
      msetnx key1 vv1 key4 vv4 #Set multiple keys only if none of them exist (atomic: it fails entirely if any key already exists)
      #Advanced usage
      mset user:1:name zhangsan user:1:age 2
      mget user:1:name user:1:age
      getset key1 aaa #Get first and then set, if there is no return nil and set the value, if there is, return the original value and set the new value
    

    The value of type String can be a string or a number

    1. Counter
    2. Statistics
  • List

    A list can be used as a stack, a queue, or a blocking queue

      lpush list one #Insert one or more values to the head of the list (left)
      rpush list four #Insert one or more values to the end of the list (right)
      lrange list 0 -1 #Get the value in the list
      lrange list 0 1 #Get the value of the specified range
      lpop list #Remove the first element at the head of the list (left)
      rpop list #Remove the first element at the end of the list (right)
      lindex list 0 #Get a value in the list by subscripting
      llen list #Get the length of the list
      lrem list 1 one #Remove the specified number of occurrences of a value from the list
      ltrim mylist 1 2 #Trim the list to the given index range (the list is modified in place)
      rpoplpush mylist mylist1 #Remove the last element of one list and push it onto another list
      lset list 0 item #Replace the value at the given index with another value (an error is raised if the index does not exist)
      linsert mylist before hello1 aaa #Insert a value before a given element in the list
      linsert mylist after hello1 bbb #Insert a value after a given element in the list
    

    Internally a list is a linked list: new elements can be inserted at either end (left or right) or before/after a given node

    If the key does not exist, a new linked list is created

    If the key already exists, new elements are added to the existing list

    If all values are removed, the empty list behaves the same as a non-existent key

    Inserting or updating at either end is efficient; operations in the middle are relatively inefficient

  • Set: an unordered collection of unique (non-repeating) elements

      sadd myset abc #Add an element to the set
      smembers myset #View all values in the specified set
      sismember myset eee #Determine whether the specified value exists in the set
      scard myset #Get the number of elements in the set collection
      srem myset eee #Remove the specified element in the set collection
      srandmember myset #Randomly select an element
      srandmember myset 2 #Randomly select multiple elements
      spop myset #remove elements randomly
      smove myset myset2 eee #Move the specified element to another set
      sdiff myset myset2 #difference set
      sinter myset myset2 #intersection
      sunion myset myset2 #union
    
  • Hash

    A hash is a map of fields to values: key -> {field: value}

      hset myhash field1 abc #Set the hash specific key-value
      hget myhash field1 #Get the value of the key specified by the hash
      hmset myhash field1 hello field2 world #Set hash multiple key-values
      hmget myhash field1 field2 #Get the value of multiple keys specified by hash
      hgetall myhash #Get all hash data
      hdel myhash field1 #Delete the field specified by hash, and the corresponding value will disappear
      hlen myhash #Get the number of fields in the hash
      hexists myhash field1 #Determine whether the specified field in the hash exists
      hkeys myhash #Get all the keys of the hash
      hvals myhash #Get all the values of the hash
      hincrby myhash field3 1 #hash specifies the value increment for the given step size
      hsetnx myhash field4 hello #If it does not exist, it can be set, otherwise it cannot be set
    

    Hashes are well suited to storing objects such as a user with name and age fields, especially information that changes frequently

  • Zset

    Sorted set: each member is added with a score, e.g. zadd key score member

      zadd myset 1 one #Add a value
      zadd myset 2 two 3 three #Add multiple values
      zrange myset 0 -1 #Get all values in zset
      zrangebyscore myset -inf +inf #Sort by score from small to large
      zrangebyscore myset -inf +inf withscores #Sort by score from small to large, return with score
      zrem myset bbb #Remove elements in zset
      zcard myset #Get the number in zset
      zcount myset 1 3 #Get the number of members in the specified interval
    

    zset can be used for leaderboard implementation
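
    A small leaderboard sketch with made-up player names (zrevrange returns members ordered from highest score to lowest):

      zadd leaderboard 100 alice 85 bob 92 carol #Add players with their scores
      zincrby leaderboard 10 bob #bob gains 10 points (now 95)
      zrevrange leaderboard 0 2 withscores #Top 3 players, highest score first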

Three special data types

  • geospatial

    Geographic locations: compute positions, the distance between two places, or find people within a given radius

      geoadd china:city 121.472644 31.231706 shanghai #Add a geographic location (positions near the poles cannot be added). Parameters: key longitude latitude name
      geopos china:city beijing #Get the longitude and latitude of the specified city
      geodist china:city beijing chongqing km #Get the distance between the specified cities
      georadius china:city 110 30 1000 km withdist count 3 #Find elements within the given radius of the specified longitude/latitude, returning their distances and at most 3 results
      georadiusbymember china:city beijing 1200 km #Find other elements within the specified distance of the given member
      geohash china:city beijing #Return the position as an 11-character geohash string; the more similar two strings are, the closer the positions
    

    The underlying data structure is a zset, so zset commands can be used to operate on geo data:

      127.0.0.1:6379> zrange china:city 0 -1
      1) "chongqing"
      2) "shanghai"
      3) "beijing"
    
  • Hyperloglog

    Used for cardinality statistics, the memory occupied is fixed, but there is a certain error rate

      pfadd mykey a b c d e f g h i j #Add elements
      pfcount mykey #Count the approximate number of distinct elements (the cardinality)
      pfmerge mykey3 mykey mykey2 #Merge two HyperLogLogs into a new one
    
  • BitMaps

    Bit storage: anything with only two states (e.g. active/inactive, signed-in/not) can use bitmaps; every operation records information in individual binary bits

      setbit sign 0 1 #Set bit i (here bit 0) to 1
      getbit sign 1 #Get the value of bit i
      bitcount sign #Count the number of bits set to 1
    

Redis transactions


Each single Redis command is atomic, but a Redis transaction does not guarantee atomicity

The essence of a Redis transaction is a batch of commands: all commands in a transaction are serialized and executed in order

One batch, ordered, exclusive

Redis transactions have no concept of isolation levels. Commands issued inside a transaction are not executed immediately; they are only executed when exec is called.

Stages of a Redis transaction:

  • open transaction (multi)
  • command enqueue(…)
  • Execute transaction (exec)
127.0.0.1:6379> multi #Open transaction
OK
#Commands are queued
127.0.0.1:6379> set k1 v1
QUEUED
127.0.0.1:6379> set k2 v2
QUEUED
127.0.0.1:6379> get k2
QUEUED
127.0.0.1:6379> set k3 v3
QUEUED
127.0.0.1:6379> exec #Execute transaction
1) OK
2) OK
3) "v2"
4) OK

127.0.0.1:6379>multi
OK
127.0.0.1:6379> set k1 v1
QUEUED
127.0.0.1:6379> set k2 v2
QUEUED
127.0.0.1:6379> set k4 v4
QUEUED
127.0.0.1:6379> discard #Abandon the transaction, none of the commands in the transaction will be executed
OK
127.0.0.1:6379> get k4
(nil)
  • Command error (caught when the command is queued, e.g. a wrong number of arguments): none of the commands in the transaction are executed
  • Runtime error (the command queues successfully but fails when executed, e.g. incrementing a string): the other commands in the transaction still execute normally, and only the failing command returns an error
127.0.0.1:6379>multi
OK
127.0.0.1:6379> set k1 v1
QUEUED
127.0.0.1:6379> set k2 v2
QUEUED
127.0.0.1:6379> set k3 v3
QUEUED
127.0.0.1:6379> getset k3
(error) ERR wrong number of arguments for 'getset' command
127.0.0.1:6379> set k4 v4
QUEUED
127.0.0.1:6379> set k5 v5
QUEUED
127.0.0.1:6379> exec #Command error, all commands will not be executed
(error) EXECABORT Transaction discarded because of previous errors.
127.0.0.1:6379> get k4
(nil)
127.0.0.1:6379> get k2
(nil)

127.0.0.1:6379>multi
OK
127.0.0.1:6379> set k1 vvv
QUEUED
127.0.0.1:6379> incr k1
QUEUED
127.0.0.1:6379> set k2 v2
QUEUED
127.0.0.1:6379> set k3 v3
QUEUED
127.0.0.1:6379> get k3
QUEUED
127.0.0.1:6379> exec #The runtime error does not stop the other commands; they execute normally
1) OK
2) (error) ERR value is not an integer or out of range
3) OK
4) OK
5) "v3"
127.0.0.1:6379> get k2
"v2"
127.0.0.1:6379> get k3
"v3"

Locks in Redis

  • Pessimistic lock: assume something will go wrong, so lock the data no matter what operation is performed
  • Optimistic lock: assume nothing will go wrong, so do not lock; when updating, check (for example with a version field or watch) whether anyone has modified the data in the meantime

Redis monitoring with watch:

  • normal execution

      127.0.0.1:6379> set money 100
      OK
      127.0.0.1:6379> set out 0
      OK
      127.0.0.1:6379> watch money
      OK
      127.0.0.1:6379>multi
      OK
      127.0.0.1:6379>decrby money 20
      QUEUED
      127.0.0.1:6379 > incrby out 20
      QUEUED
      127.0.0.1:6379>exec
      1) (integer) 80
      2) (integer) 20
    
  • abnormal execution

      127.0.0.1:6379> watch money #watch
      OK
      127.0.0.1:6379>multi
      OK
      127.0.0.1:6379>decrby money 10
      QUEUED
      127.0.0.1:6379> incrby out 10
      QUEUED
      127.0.0.1:6379> exec #Optimistic lock operation, the value is modified and the execution fails
      (nil)
      127.0.0.1:6379> unwatch #unlock first if execution fails
      OK
      127.0.0.1:6379> watch money #Get the latest optimistic lock
      OK
      127.0.0.1:6379>multi
      OK
      127.0.0.1:6379>decrby money 1
      QUEUED
      127.0.0.1:6379> incrby out 1
      QUEUED
      127.0.0.1:6379> exec # Compare whether the monitored value has changed, if there is no change, it can be executed successfully
      1) (integer) 199
      2) (integer) 21
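      # Why the earlier exec returned (nil): another client modified the watched key
      # between watch and exec, for example with a second session (hypothetical) running:
      #   127.0.0.1:6379> set money 200
      # Because the watched value changed, the transaction was aborted; after unwatch
      # and a fresh watch, the retry ran against the new value 200, which is why it
      # produced 199 and 21.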
    

config file

Units

# Note on units: when memory size is needed, it is possible to specify
# it in the usual form of 1k 5GB 4M and so forth:
#
# 1k => 1000 bytes
# 1kb => 1024 bytes
# 1m => 1000000 bytes
# 1mb => 1024*1024 bytes
# 1g => 1000000000 bytes
# 1gb => 1024*1024*1024 bytes
#
# units are case insensitive so 1GB 1Gb 1gB are all the same.

Include other configuration files

# Include one or more other config files here. This is useful if you
# have a standard template that goes to all Redis servers but also need
# to customize a few per-server settings. Include files can include
# other files, so use this wisely.
#
# Notice option "include" won't be rewritten by command "CONFIG REWRITE"
# from admin or Redis Sentinel. Since Redis always uses the last processed
# line as value of a configuration directive, you'd better put includes
# at the beginning of this file to avoid overwriting config change at runtime.
#
# If instead you are interested in using includes to override configuration
# options, it is better to use include as the last line.
#
# include /path/to/local.conf
# include /path/to/other.conf

Network

  • ip address
# By default, if no "bind" configuration directive is specified, Redis listens
# for connections from all the network interfaces available on the server.
# It is possible to listen to just one or multiple selected interfaces using
# the "bind" configuration directive, followed by one or more IP addresses.
#
# Examples:
#
# bind 192.168.1.100 10.0.0.1
# bind 127.0.0.1 ::1
#
# ~~~ WARNING ~~~ If the computer running Redis is directly exposed to the
# internet, binding to all the interfaces is dangerous and will expose the
# instance to everybody on the internet. So by default we uncomment the
# following bind directive, that will force Redis to listen only into
# the IPv4 loopback interface address (this means Redis will be able to
# accept connections only from clients running into the same computer it
# is running).
#
# IF YOU ARE SURE YOU WANT YOUR INSTANCE TO LISTEN TO ALL THE INTERFACES
# JUST COMMENT THE FOLLOWING LINE.
# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  • Protected mode
# Protected mode is a layer of security protection, in order to avoid that
# Redis instances left open on the internet are accessed and exploited.
#
# When protected mode is on and if:
#
# 1) The server is not binding explicitly to a set of addresses using the
# "bind" directive.
# 2) No password is configured.
#
# The server only accepts connections from clients connecting from the
# IPv4 and IPv6 loopback addresses 127.0.0.1 and ::1, and from Unix domain
# sockets.
#
# By default protected mode is enabled. You should disable it only if
# you are sure you want clients from other hosts to connect to Redis
# even if no authentication is configured, nor a specific set of interfaces
# are explicitly listed using the "bind" directive.
protected-mode yes
  • port
# Accept connections on the specified port, default is 6379 (IANA #815344).
# If port 0 is specified Redis will not listen on a TCP socket.
port 6379

General configuration

  • run as a daemon
# By default Redis does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
daemonize yes
  • pid process file
# If a pid file is specified, Redis writes it where specified at startup
# and removes it at exit.
#
# When the server runs non daemonized, no pid file is created if none is
# specified in the configuration. When the server is daemonized, the pid file
# is used even if not specified, defaulting to "/var/run/redis.pid".
#
# Creating a pid file is best effort: if Redis is not able to create it
# nothing bad happens, the server will start and run normally.
pidfile /var/run/redis_6379.pid
  • log
# Specify the server verbosity level.
# This can be one of:
# debug (a lot of information, useful for development/testing)
# verbose (many rarely useful info, but not a mess like the debug level)
# notice (moderately verbose, what you want in production probably)
# warning (only very important / critical messages are logged)
loglevel notice
# Specify the log file name. Also the empty string can be used to force
# Redis to log on the standard output. Note that if you use standard
# output for logging but daemonize, logs will be sent to /dev/null
logfile "/var/log/redis/redis-server.log"
  • the number of databases
# Set the number of databases. The default database is DB 0, you can select
# a different one on a per-connection basis using SELECT <dbid> where
# dbid is a number between 0 and 'databases'-1
databases 16
  • whether to display the logo
# By default Redis shows an ASCII art logo only when started to log to the
# standard output and if the standard output is a TTY. Basically this means
# that normally a logo is displayed only in interactive sessions.
#
# However it is possible to force the pre-4.0 behavior and always show a
# ASCII art logo in startup logs by setting the following option to yes.

always-show-logo yes

Snapshot

  • Persistence rules
# Save the DB on disk:
#
# save <seconds> <changes>
#
# Will save the DB if both the given number of seconds and the given
# number of write operations against the DB occurred.
#
# In the example below the behaviour will be to save:
# after 900 sec (15 min) if at least 1 key changed
# after 300 sec (5 min) if at least 10 keys changed
# after 60 sec if at least 10000 keys changed
#
# Note: you can disable saving completely by commenting out all "save" lines.
#
# It is also possible to remove all the previously configured save
# points by adding a save directive with a single empty string argument
# like in the following example:
#
# save ""

save 900 1
save 300 10
save 60 10000
  • whether to stop the persistence error
# By default Redis will stop accepting writes if RDB snapshots are enabled
# (at least one save point) and the latest background save failed.
# This will make the user aware (in a hard way) that data is not persisting
# on disk properly, otherwise chances are that no one will notice and some
# disaster will happen.
#
# If the background saving process will start working again Redis will
# automatically allow writes again.
#
# However if you have setup your proper monitoring of the Redis server
# and persistence, you may want to disable this feature so that Redis will
# continue to work as usual even if there are problems with disk,
# permissions, and so forth.
stop-writes-on-bgsave-error yes
  • whether to compress rdb files
# Compress string objects using LZF when dump .rdb databases?
# For default that's set to 'yes' as it's almost always a win.
# If you want to save some CPU in the saving child set it to 'no' but
# the dataset will likely be bigger if you have compressible values or keys.
rdbcompression yes
  • Whether to verify the rdb file
# Since version 5 of RDB a CRC64 checksum is placed at the end of the file.
# This makes the format more resistant to corruption but there is a performance
# hit to pay (around 10%) when saving and loading RDB files, so you can disable it
# for maximum performances.
#
# RDB files created with checksum disabled have a checksum of zero that will
# tell the loading code to skip the check.
rdbchecksum yes
  • rdb file save directory
# The working directory.
#
# The DB will be written inside this directory, with the filename specified
# above using the 'dbfilename' configuration directive.
#
# The Append Only File will also be created inside this directory.
#
# Note that you must specify a directory here, not a file name.
dir /var/lib/redis

Security

  • set password
# IMPORTANT NOTE: starting with Redis 6 "requirepass" is just a compatibility
# layer on top of the new ACL system. The option effect will be just setting
# the password for the default user. Clients will still authenticate using
# AUTH <password> as usually, or more explicitly with AUTH default <password>
# if they follow the new protocol: both will work.
#
# requirepass foobared
  • The client can also set the password directly:
config get requirepass #Get password
config set requirepass "123456" #Set password
auth 123456 #Authentication

Client

  • Limit the maximum number of client connections
# Set the max number of connected clients at the same time. By default
# this limit is set to 10000 clients, however if the Redis server is not
# able to configure the process file limit to allow for the specified limit
# the max number of allowed clients is set to the current file limit
# minus 32 (as Redis reserves a few file descriptors for internal uses).
#
# Once the limit is reached Redis will close all the new connections sending
# an error 'max number of clients reached'.
#
# IMPORTANT: When Redis Cluster is used, the max number of connections is also
# shared with the cluster bus: every node in the cluster will use two
# connections, one incoming and another outgoing. It is important to size the
# limit accordingly in case of very large clusters.
#
# maxclients 10000

Memory settings

  • configure max memory
# In short... if you have replicas attached it is suggested that you set a lower
# limit for maxmemory so that there is some free RAM on the system for replica
# output buffers (but this is not needed if the policy is 'noeviction').
#
# maxmemory <bytes>
  • Processing strategy when the memory reaches the upper limit
# MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
# is reached. You can select one from the following behaviors:
#
# volatile-lru -> Evict using approximated LRU, only keys with an expire set.
# allkeys-lru -> Evict any key using approximated LRU.
# volatile-lfu -> Evict using approximated LFU, only keys with an expire set.
# allkeys-lfu -> Evict any key using approximated LFU.
# volatile-random -> Remove a random key having an expire set.
# allkeys-random -> Remove a random key, any key.
# volatile-ttl -> Remove the key with the nearest expire time (minor TTL)
# noeviction -> Don't evict anything, just return an error on write operations.
#
# LRU means Least Recently Used
# LFU means Least Frequently Used
#
# Both LRU, LFU and volatile-ttl are implemented using approximated
# randomized algorithms.
#
# Note: with any of the above policies, Redis will return an error on write
# operations, when there are no suitable keys for eviction.
#
# At the date of writing these commands are: set setnx setex append
# incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
# sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
# zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
# getset mset msetnx exec sort
#
# The default is:
#
#maxmemory-policy noeviction

There are six classic maxmemory-policy options (the LFU variants listed above were added in Redis 4.0):

  • volatile-lru: apply approximated LRU eviction only to keys that have an expiration set

  • allkeys-lru: apply approximated LRU eviction to any key

  • volatile-random: randomly evict keys that have an expiration set

  • allkeys-random: randomly evict any key

  • volatile-ttl: evict the keys closest to expiring (smallest TTL)

  • noeviction: never evict; return an error on write operations instead (this is the default)
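
Both settings can also be inspected or changed at runtime through the client (a sketch; the values are examples only):

    config set maxmemory 100mb #Cap memory usage at 100 MB
    config set maxmemory-policy allkeys-lru #Evict any key using approximated LRU
    config get maxmemory-policy #Verify the current policy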

AOF configuration

  • AOF is not enabled by default; RDB persistence is used instead
# By default Redis asynchronously dumps the dataset on disk. This mode is
# good enough in many applications, but an issue with the Redis process or
# a power outage may result into a few minutes of writes lost (depending on
# the configured save points).
#
# The Append Only File is an alternative persistence mode that provides
# much better durability. For instance using the default data fsync policy
# (see later in the config file) Redis can lose just one second of writes in a
# dramatic event like a server power outage, or a single write if something
# wrong with the Redis process itself happens, but the operating system is
# still running correctly.
#
# AOF and RDB persistence can be enabled at the same time without problems.
# If the AOF is enabled on startup Redis will load the AOF, that is the file
# with the better durability guarantees.
#
# Please check http://redis.io/topics/persistence for more information.
appendonly no
  • Persist aof file
# The name of the append only file (default: "appendonly.aof")
appendfilename "appendonly.aof"
  • Sync frequency
# The fsync() call tells the Operating System to actually write data on disk
# instead of waiting for more data in the output buffer. Some OS will really flush
# data on disk, some other OS will just try to do it ASAP.
#
# Redis supports three different modes:
#
# no: don't fsync, just let the OS flush the data when it wants. Faster.
# always: fsync after every write to the append only log. Slow, Safest.
# everysec: fsync only one time every second. Compromise.
#
# The default is "everysec", as that's usually the right compromise between
# speed and data safety. It's up to you to understand if you can relax this to
# "no" that will let the operating system flush the output buffer when
# it wants, for better performances (but if you can live with the idea of
# some data loss consider the default persistence mode that's snapshotting),
# or on the contrary, use "always" that's very slow but a bit safer than
#everysec.
#
# More details please check the following article:
# http://antirez.com/post/redis-persistence-demystified.html
#
# If unsure, use "everysec".

# appendfsync always (sync every modification, consuming performance)
appendfsync everysec #(execute sync every second, may lose this 1s data)
# appendfsync no (do not execute sync, the operating system synchronizes itself, the fastest)

Redis persistence

Redis is an in-memory database. If the in-memory data cannot be saved to disk, then once the server process exits, the database state in the server will also disappear.

RDB (Redis DataBase)

Write a snapshot of data in memory to disk within a specified time interval, that is, a Snapshot snapshot, which reads the snapshot file directly into memory when it is restored

Redis forks a separate child process for persistence. It first writes the data to a temporary file, and when the persistence process finishes, it replaces the previous persisted file with the temporary one. The main process performs no IO during the whole procedure, which keeps performance high. If large-scale data recovery is needed and full integrity of the recovered data is not critical, RDB is more efficient than AOF. The drawback of RDB is that data written after the last snapshot may be lost.

The default is RDB. Generally, you do not need to modify this configuration. The saved file is dump.rdb by default.

In a production environment, this file is sometimes backed up

  • Trigger RDB mechanism (automatically generate dump.rdb)

    1. The save rule is satisfied
    2. Execute the flushall command
    3. Exit Redis
  • Restore RDB files

    1. Put the rdb file in the redis startup directory

    2. Redis will be loaded automatically after startup

    3. View the catalog:

       127.0.0.1:6379>config get dir
       1) "dir"
       2) "/var/lib/redis"
      
  • RDB pros and cons

    1. Suitable for large-scale data recovery
    2. Appropriate when the requirements on data integrity are not strict
    3. Snapshots are taken at intervals; if Redis goes down between snapshots, the most recent modifications are lost
    4. Forking the child process takes up a certain amount of extra memory
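
Besides the automatic triggers listed under "Trigger RDB mechanism" above, a snapshot can also be forced by hand (a sketch; save blocks the server while it writes, bgsave forks a child and returns immediately, and the timestamp shown is just an example):

    127.0.0.1:6379> bgsave            # write dump.rdb in the background
    Background saving started
    127.0.0.1:6379> lastsave          # unix timestamp of the last successful save
    (integer) 1650000000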

AOF (Append Only File)

Record every command; on recovery, the whole file is replayed.

AOF records each write operation in the form of a log (read operations are not recorded) and only ever appends to the file. When Redis starts, it reads this file to rebuild the data; in other words, after a restart the logged write commands are executed from front to back to complete the recovery.

AOF is not enabled by default; it has to be turned on manually (appendonly yes)

If the AOF file is corrupted, Redis will fail to start. Repair it with redis-check-aof --fix and the data can be restored after a restart.

  • AOF rewrite rules
    1. By default the AOF just keeps being appended to, so the file grows without bound
    2. When the AOF file exceeds 64MB, Redis forks a child process to rewrite (compact) the file (see the redis.conf snippet after this list)
  • AOF advantages and disadvantages
    1. Syncing on every modification gives the best file integrity
    2. Syncing once per second may lose up to one second of data
    3. Never syncing is the most efficient
    4. The AOF data file is much larger than the RDB file, and repairing it is slower
    5. AOF recovery is also slower than RDB
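
The rewrite rules described above correspond to these redis.conf directives (the values shown are the usual defaults and may differ per installation):

# Rewrite the AOF when it has grown by 100% since the last rewrite,
# but only once it is at least 64mb in size.
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb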

Extension

Author: TurboSnail. Link: https://juejin.im/post/6844903939339452430. Source: Juejin (Nuggets). Copyright belongs to the author; commercial reprints require the author’s authorization, and non-commercial reprints must credit the source.

  • If the data is not so sensitive and can be regenerated from elsewhere, then persistence can be turned off
  • If the data is more important, you don’t want to get it from other places, and you can endure data loss for several minutes, such as cache, etc., then you can only use RDB
  • If it is used as an in-memory database, it is recommended to use Redis persistence. It is recommended to enable both RDB and AOF, or periodically execute bgsave for snapshot backup. The RDB method is more suitable for data backup, and AOF can ensure that data is not lost.

Redis publish subscribe

Redis publish/subscribe is a message communication pattern: the publisher (pub) sends messages and the subscribers (sub) receive them

Redis clients can subscribe to any number of channels

Three roles:

  • message publisher
  • channel
  • message subscribers

https://www.runoob.com/redis/redis-pub-sub.html

Common commands

The common commands for Redis publish and subscribe are:

  1. PSUBSCRIBE pattern [pattern ...]: subscribe to one or more channels matching the given patterns.
  2. PUBSUB subcommand [argument [argument ...]]: inspect the state of the publish/subscribe system.
  3. PUBLISH channel message: send a message to the specified channel.
  4. PUNSUBSCRIBE [pattern [pattern ...]]: unsubscribe from all channels matching the given patterns.
  5. SUBSCRIBE channel [channel ...]: subscribe to the given channel or channels.
  6. UNSUBSCRIBE [channel [channel ...]]: unsubscribe from the given channels.

Example:

  • recipient
127.0.0.1:6379> subscribe abc
Reading messages... (press Ctrl-C to quit)
1) "subscribe"
2) "abc"
3) (integer) 1
1) "message"
2) "abc"
3) "hello"
1) "message"
2) "abc"
3) "hello222"
  • sender
127.0.0.1:6379>ping
PONG
127.0.0.1:6379> publish abc "hello"
(integer) 1
127.0.0.1:6379> publish abc "hello222"
(integer) 1
127.0.0.1:6379>

Principle

Redis is implemented in C; the pubsub.c file in the source tree contains the underlying implementation of the publish and subscribe mechanism

Redis implements publish and subscribe functions through PUBLISH, SUBSCRIBE, PSUBSCRIBE and other commands

  • After a client subscribes to a channel with the SUBSCRIBE command, the server maintains a dictionary whose keys are the channel names and whose values are linked lists of all the clients subscribed to each channel. The key action of SUBSCRIBE is therefore to add the client to the subscription list of the specified channel.
  • PUBLISH sends a message to the subscribers: the server looks up the specified channel in this dictionary to find the linked list of all clients subscribed to it, then traverses the list and delivers the message to each subscriber.

In Redis, messages are published to and subscribed from named channels: when a message is published to a channel, every client subscribed to that channel receives it.

Use cases:

  • Real-time communication system
  • Subscribe and follow system

Redis master-slave replication

Concept

Master-slave replication copies the data of one Redis server to other Redis servers. The former is called the master node (master/leader) and the latter are called slave nodes (slave/follower). Replication is one-way: data can only flow from the master to the slaves. The master handles mainly writes, and the slaves handle mainly reads.

By default, each Redis server is a master node, and a master node can have multiple slave nodes (or none), but a slave node can only have one master node.

  • role
    1. Data redundancy: master-slave replication realizes hot backup of data, which is another data redundancy method besides persistence
    2. Failure recovery: when the master node has a problem, the slave node can provide services to achieve rapid failure recovery, which is actually a service redundancy
    3. Load balancing: On the basis of master-slave replication, with read-write separation, the master node can provide write services and the slave nodes can provide read services (that is, the application connects to the master node when writing Redis data, and the application connects to the slave node when reading Redis data) , share the server load; especially in the case of writing less and reading more, sharing the read load through multiple slave nodes can greatly improve the concurrency of the Redis server
    4. High availability cornerstone: In addition to the above functions, master-slave replication is also the basis for sentinels and clusters to be implemented, so master-slave replication is the basis for high availability

Master-slave replication, read-write separation, can reduce server pressure, generally at least one master and three slaves

  1. A single Redis server will have a single point of failure, and one server needs to handle all the request load, which is under great pressure
  2. The memory capacity of a single Redis server is limited. Generally, the maximum memory used by a single Redis server cannot exceed 20G.

Environment configuration

Only configure the slave library, not the master library

View information about the current library

127.0.0.1:6379>info replication
# Replication
role:master
connected_slaves:0
master_replid: 8a93cd94244ed76ed8bcd89639dc033ce2c29877
master_replid2: 0000000000000000000000000000000000000000
master_repl_offset: 0
second_repl_offset: -1
repl_backlog_active:0
repl_backlog_size: 1048576
repl_backlog_first_byte_offset: 0
repl_backlog_histlen:0
  1. Modify the configuration file information:
  • port
  • pid name
  • log file name
  • dump.rdb file name
  2. Start the Redis server

    View process information:

$ ps -ef |grep redis
root 6573 1 0 16:38 ? 00:00:00 redis-server 127.0.0.1:6379
root 6582 1 0 16:38 ? 00:00:00 redis-server 127.0.0.1:6380
root 6591 1 0 16:38 ? 00:00:00 redis-server 127.0.0.1:6381
  3. Configure the slave: slaveof <ip> <port>

     127.0.0.1:6380> slaveof 127.0.0.1 6379
     OK
     127.0.0.1:6380>info replication
     # Replication
     role:slave
     master_host: 127.0.0.1
     master_port: 6379
     master_link_status:up
     master_last_io_seconds_ago:1
     master_sync_in_progress: 0
     slave_repl_offset: 0
     slave_priority: 100
     slave_read_only:1
     connected_slaves:0
     master_replid: a68f635adb45f4f29c792df61f0d48007b3b2fa0
     master_replid2: 0000000000000000000000000000000000000000
     master_repl_offset: 0
     second_repl_offset: -1
     repl_backlog_active:1
     repl_backlog_size: 1048576
     repl_backlog_first_byte_offset:1
     repl_backlog_histlen:0
    

    Host configuration view:

     127.0.0.1:6379>info replication
     # Replication
     role:master
     connected_slaves: 2
     slave0:ip=127.0.0.1,port=6380,state=online,offset=112,lag=1
     slave1:ip=127.0.0.1,port=6381,state=online,offset=112,lag=1
     master_replid: a68f635adb45f4f29c792df61f0d48007b3b2fa0
     master_replid2: 0000000000000000000000000000000000000000
     master_repl_offset: 112
     second_repl_offset: -1
     repl_backlog_active:1
     repl_backlog_size: 1048576
     repl_backlog_first_byte_offset:1
     repl_backlog_histlen: 112
    

    Note: The master-slave relationship can be configured in the configuration file

     replicaof <masterip> <masterport>
     masterauth <master-password>
    

Details

  1. The master can write, but the slave cannot write and can only read; all information and data in the master will be automatically saved by the slave

     127.0.0.1:6379> set k1 v1
     OK
     127.0.0.1:6381> keys *
     1) "k1"
     127.0.0.1:6381> set k2 v2
     (error) READONLY You can't write against a read only replica.
    
  2. If the master disconnects, the slaves stay connected and remain read-only; when the master comes back, the slaves can again read whatever the master writes

  3. If the master-slave relationship was configured with a command (rather than in the config file), a slave reverts to being a master after a restart; as soon as it is made a slave again, it immediately fetches the data from the master

  4. The principle of replication

    • Once a slave starts and successfully connects to the master, it sends a sync command
    • The master receives the command, starts a background save, and meanwhile buffers all newly received write commands; when the background save finishes, the master transfers the entire data file to the slave to complete one full synchronization
    • Full synchronization: the slave receives the database file, saves it, and loads it into memory
    • Incremental synchronization: the master then continues to forward each new write command to the slave in turn, keeping it in sync
    • Whenever a slave reconnects to the master, a full synchronization is performed automatically, so the master’s data is guaranteed to be visible on the slave

  5. Layer-by-layer chaining: the previous master connects to the next slave, and a slave can in turn act as the master of the slave behind it

     `M--S--S`

  6. If the master goes down, a slave can run slaveof no one to make itself the master, and the other nodes can then be manually pointed at this new master

Redis Sentinel Mode

How to automatically pick a master server

Concept

With manual master-slave switching, when the master goes down a slave has to be promoted to master by hand. This requires human intervention, is time-consuming and labor-intensive, and leaves the service unavailable for a while. Sentinel mode is the preferred way to solve this problem.

Sentinel mode monitors in the background whether the master is faulty; if it is, one of the slaves is automatically promoted to master based on the sentinels’ votes.

Sentinel mode is a special mode: the sentinel is an independent process that runs on its own. The principle is that the sentinel monitors multiple running Redis instances by sending them commands and waiting for their responses.

Sentinel role:

  1. By sending commands, have the Redis servers report their running status; this covers both the master and the slaves
  2. When the sentinel detects that the master is down, it automatically promotes one of the slaves to master and then notifies the other slaves through the publish-subscribe mechanism to switch to the new master

However, a sentinel process may have problems monitoring the Redis server, so multiple sentinels can be used for monitoring, and each sentinel will also be monitored. This is the multiple sentinel mode.

Suppose the master goes down and Sentinel 1 detects it first. The system does not fail over immediately: Sentinel 1 merely considers the master unavailable, which is called subjective offline. When enough other sentinels also detect that the master is unavailable and their number reaches the configured quorum, the master is considered objectively offline. The sentinels then hold a vote, and the sentinel that wins the vote initiates the failover. After the switch succeeds, each sentinel tells the slaves it monitors to switch to the new master via the publish-subscribe mechanism.

Configuration steps

  1. Configure the sentinel configuration file sentinel.conf

    sentinel monitor <master-name> <ip> <port> <quorum> (for example: sentinel monitor mymaster 127.0.0.1 6379 1)

  2. Start Sentinel

    redis-sentinel /path/to/sentinel.conf

  3. If the master goes down, one of the slaves is elected as the new master

  4. When the old master comes back, it can only rejoin as a slave of the new master

# Example sentinel.conf

# Sentinel instance running port default 26379
port 26379

# Sentinel's working directory
dir /tmp

# The ip port of the redis master node monitored by sentinel
# master-name The master node name that can be named by itself can only be composed of letters A-z, numbers 0-9, and these three characters ".-_".
# quorum Configure how many sentinel sentinels uniformly think that the master master node is out of contact, then objectively think that the master node is out of contact
# sentinel monitor <master-name> <ip> <redis-port> <quorum>
sentinel monitor mymaster 127.0.0.1 6379 2

# When the requirepass foobared authorization password is enabled in the Redis instance, all clients connecting to the Redis instance must provide the password
# Set the sentinel password to connect the master and slave. Note that the same verification password must be set for the master and slave
# sentinel auth-pass <master-name> <password>
sentinel auth-pass mymaster MySUPER--secret-0123passw0rd

# After specifying how many milliseconds, the master node does not respond to the sentinel sentinel. At this time, the sentinel subjectively thinks that the master node is offline. The default is 30 seconds
# sentinel down-after-milliseconds <master-name> <milliseconds>
sentinel down-after-milliseconds mymaster 30000

# This configuration item specifies how many slaves can synchronize the new master at the same time when failover master/slave switchover occurs.
# The smaller the number, the longer it will take to complete the failover,
#But if the number is larger, it means that more slaves are unavailable due to replication.
#By setting this value to 1, you can ensure that only one slave is in a state that cannot process command requests at a time.

# sentinel parallel-syncs <master-name> <numslaves>
sentinel parallel-syncs mymaster 1
# Failover timeout failover-timeout can be used in the following ways:
#1. The interval between two failovers of the same sentinel to the same master.
#2. Time is calculated when a slave synchronizes data from a wrong master. Until the slave is corrected to synchronize data to the correct master.
#3. The time it takes to cancel an ongoing failover.
#4. When failover, configure the maximum time required for all slaves to point to the new master. However, even after this timeout, the slaves will still be correctly configured to point to the master, but not according to the rules configured by parallel-syncs
# default three minutes
# sentinel failover-timeout <master-name> <milliseconds>

sentinel failover-timeout mymaster 180000

#SCRIPTS EXECUTION

#Configure the script that needs to be executed when an event occurs. You can use the script to notify the administrator. For example, when the system is not running normally, send an email to notify the relevant personnel.
#There are the following rules for the running result of the script:
#If the script returns to 1 after execution, the script will be executed again later, and the number of repetitions currently defaults to 10
#If the script returns 2 after execution, or a return value higher than 2, the script will not be executed repeatedly.
#If the script is terminated due to receiving a system interrupt signal during execution, the behavior is the same as when the return value is 1.
#The maximum execution time of a script is 60s. If this time is exceeded, the script will be terminated by a SIGKILL signal and then re-executed.

#Notification script: When sentinel has any warning level event (such as subjective failure and objective failure of redis instance, etc.), this script will be called. At this time, this script should be notified by email, SMS, etc. Information for the system administrator about the abnormal operation of the system. When calling the script, two parameters will be passed to the script, one is the type of the event and the other is the description of the event. If the script path is configured in the sentinel.conf configuration file, it must be ensured that the script exists in this path and is executable, otherwise sentinel cannot start successfully.
#Notification script
# shell programming
# sentinel notification-script <master-name> <script-path>
sentinel notification-script mymaster /var/redis/notify.sh

# Client reconfigure master node parameter script
# When a master changes due to failover, this script will be called to notify relevant clients that the master address has changed.
# The following parameters will be passed to the script when it is called:
# <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# Currently <state> is always "failover",
# <role> is one of "leader" or "observer".
# Parameters from-ip, from-port, to-ip, to-port are used to communicate with the old master and the new master (ie the old slave)
# This script should be generic and can be called multiple times, not targeted.
# sentinel client-reconfig-script <master-name> <script-path>
sentinel client-reconfig-script mymaster /var/redis/reconfig.sh # Usually configured by operation and maintenance!

Advantages and disadvantages

  1. Sentinel cluster is based on master-slave replication mode
  2. The master-slave can be switched, the fault can be transferred, and the availability of the system will be better
  3. Sentinel mode is an upgrade of master-slave mode, which is more robust
  4. Online scaling of Redis is troublesome; once the cluster capacity reaches its upper limit, expanding online is very difficult
  5. It is cumbersome to implement the configuration of the sentinel mode, and there are many options in it

Redis cache details

https://blog.csdn.net/ThinkWon/article/details/103522351#_800

Cache penetration (the requested data cannot be found)

The concept of cache penetration is simple: a user queries a piece of data that is not in the Redis cache (a cache miss), so the query goes on to the persistence layer (the database), where it is not found either, and the query fails. When many users issue such queries and none of them hit the cache (a flash-sale scenario, for example), all of the requests land on the persistence-layer database, putting it under heavy pressure. This is cache penetration.

  • solution:

    1. Bloom filter

      A Bloom filter is a data structure that stores all possible query keys in hashed form. Requests are checked against it at the control layer first, and requests for keys that cannot possibly exist are discarded, which avoids putting query pressure on the underlying storage system.

    2. Cache empty objects

      When the storage layer also misses, the returned empty object is cached anyway with an expiration time; later accesses to that key are served from the cache, which protects the back-end data source (a small sketch follows this list).

      question:

      • If null values can be cached, it means that the cache needs more space to store more keys, because there may be many keys with null values;

      • Even if the expiration time is set for the null value, there will still be inconsistencies between the data of the cache layer and the storage layer for a period of time, which will affect the business that needs to maintain consistency
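
      A minimal sketch of the "cache empty objects" idea in redis-cli, assuming a key naming scheme like user:<id> (the names are made up):

        # The database lookup for user 999 found nothing, so cache a placeholder
        setex user:999 60 "" #Empty value with a 60-second expiration
        get user:999 #Subsequent requests hit the cache instead of the database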

Cache Avalanche

Cache avalanche means that a large set of cached keys expires within the same time window, or that the Redis server itself goes down.

One cause of an avalanche: for example, suppose a flash sale starts at midnight and a batch of products is loaded into the cache shortly beforehand with a one-hour expiration. At one o’clock in the morning the cache for this whole batch of products expires at the same time, so all queries for these products fall through to the database, which sees a periodic pressure spike. All requests reach the storage layer, its call volume surges, and the storage layer may go down as well.

In fact, concentrated expiration is not the most fatal case. A more fatal cache avalanche happens when a cache server node goes down or loses network connectivity. With a naturally occurring avalanche, the cache entries were at least created within some concentrated period, and the database can usually withstand the resulting periodic pressure. A crashed cache node, however, puts unpredictable pressure on the database server, and it is very likely that the database will be overwhelmed in an instant.

  • solution

    1. Redis high availability

      The idea is that since any single Redis instance may fail, add more Redis instances so that the others can keep working after one fails; in other words, build a cluster (and deploy it across multiple locations for disaster recovery).

    2. Rate limiting and degradation

      After a cache entry expires, use locks or queues to control the number of threads that are allowed to read the database and write the cache. For example, for a given key only one thread is allowed to query the database and rebuild the cache, while the other threads wait.

    3. Data warm-up

      Data preheating means loading data that is likely to be accessed heavily into the cache before the formal deployment. Before a large burst of concurrent access, manually trigger the loading of the relevant keys into the cache and give them different expiration times, so that cache invalidation is spread out as evenly as possible.

Cache breakdown (a hot key expires under heavy concurrency)

Cache breakdown happens when the data is not in the cache but is in the database (usually because the cached entry has just expired). A large number of concurrent users all miss the cache for the same key at the same moment and all go to the database for it, so the database load spikes instantly and the pressure becomes excessive. Unlike a cache avalanche, cache breakdown is about many requests for the same piece of data; an avalanche is about many different keys expiring at once so that many lookups fall through to the database.

  • solution

    1. Set hotspot data to never expire.

    2. Add a mutex (see the sketch below)
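
      A rough sketch of the mutex idea with SET NX EX (the key and token names are made up): only the client that acquires the lock queries the database and rebuilds the cache; the others wait and retry.

        set lock:product:1 token-abc NX EX 10 #Acquire: succeeds only if the lock key does not already exist; expires after 10s
        # ... query the database and refill the cache ...
        del lock:product:1 #Release the lock when done

      In practice the release should verify the token (for example with a small Lua script) so that a client never deletes a lock it no longer holds.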

Cache warmup

Cache preheating means loading the relevant data into the cache system right after the system goes online. This avoids the situation where the first user requests have to query the database before the data gets cached; users query data that has already been preheated into the cache.

  • solution

    1. Write a cache refresh page directly, and do it manually when going online;
    2. The amount of data is not large, and it can be automatically loaded when the project is started;
    3. Refresh the cache regularly;

Cache downgrade

When traffic surges, services experience problems (such as slow or unresponsive response times), or non-core services affect the performance of core processes, there is still a need to ensure that services are available, even at a loss. The system can automatically downgrade according to some key data, or it can be configured with switches to achieve manual downgrade.

The ultimate goal of cache downgrade is to keep core services available, even at a loss. And some services cannot be downgraded (eg add to cart, checkout).

Before downgrading, you should review the system to decide which parts can be sacrificed to protect the core, i.e. which services must be protected and which can be downgraded. For example, you can use a scheme similar to log levels:

  1. General: For example, if some services occasionally time out due to network jitter or the service is going online, they can be automatically downgraded;
  2. Warning: The success rate of some services fluctuates within a period of time (such as between 95 and 100%), which can be automatically downgraded or manually downgraded, and an alarm will be sent;
  3. Errors: For example, the availability rate is lower than 90%, or the database connection pool is overwhelmed, or the access volume suddenly increases to the maximum threshold that the system can withstand. At this time, it can be automatically downgraded or manually downgraded according to the situation;
  4. Serious errors: For example, due to special reasons, the data is wrong, and emergency manual downgrade is required at this time.

The purpose of the service downgrade is to prevent the Redis service from failing, resulting in an avalanche problem with the database. Therefore, for unimportant cached data, a service downgrade strategy can be adopted. For example, a common practice is that if there is a problem with Redis, instead of querying the database, it directly returns the default value to the user.