Source code for this project can be found here

Introduction Link to heading

I was scrolling through the GitHub Discover page on a lowly Thursday evening when I came across this gem of a repository, build-your-own-x. This was a godsend - I’ve gotten tired of writing simple webservers and to-do apps to learn things, and I was itching to try something out on this page.

Given my current situation of migrating our applications to MySQL 8 given the upcoming Extended Support EOL of 5.7, I was immediately drawn to the databases section when the C++: Build Your Own Redis from Scratch link caught my eye. If you’ve spent any time working in development with a caching system, you likely already know how Redis is one of the most loved databases out there. It’s fast, simple, easy to use, and I wanted to give it a shot, but I didn’t want to write it in C++.

So I decided to write it in Go instead.

The RESP Protocol Link to heading

Ultimately, I wanted an actual Redis server with which I could use the redis-cli. So, I started looking into the Redis protocol to get familiar with how a client communicates with a Redis server.

And just like using Redis, its protocol to communicate over the wire is pretty simple. Here’s a quick rundown of the parts of the protocol that we care about:

  • Redis uses TCP to communicate
  • Redis uses a request-response model
  • Redis uses a simple string protocol to receive data, which consists of a few parts:
    • A * to denote the number of arguments
    • A \r\n to denote the end of the argument count
    • A $ to denote the number of bytes in the argument
    • A \r\n to denote the end of the argument length
    • The argument data
    • A \r\n to denote the end of the argument data
    • Repeat for each argument
  • Redis uses a simple string protocol to send data, which consists of a few parts:
    • A + to denote a simple string
    • A - to denote an error
    • A : to denote an integer
    • A $ to denote the number of bytes in the argument
    • A \r\n to denote the end of the argument length
    • The argument data
    • A \r\n to denote the end of the argument data

So, for example, if we were to receive the command SET foo bar from the CLI, it would come in as the following string:

*3\r\n$3\r\nSET\r\n$3\r\nfoo\r\n$3\r\nbar\r\n

Let’s say we wanted to send a response back to the client. We would, therefore, send the following string:

+OK\r\n

Then, if the client issued the `GET foo’ command, we would receive the following string:

*2\r\n$3\r\nGET\r\n$3\r\nfoo\r\n

To which we would send the following string:

+bar\r\n

And that’s it! That’s all we need to know about the protocol to implement the core key-value store functionality of Redis - it’s really that simple. Knowing how the protocol worked, I could start writing my server.

Writing the server and parser Link to heading

I started by writing a simple main.go file, which would be the entrypoint for my server. I wrote a simple main function that would start a TCP server on port 6379 and listen for incoming connections. Once a connection was established, I would spawn a new goroutine to handle the connection and then continue listening for new connections.

Note: Redis uses an event loop to handle connections. To keep this implementation simple and easy to understand, we’ll instead be using goroutines to handle each connection.

package main

import (
    "fmt"
    "io"

    "net"
    "os"
)

const addr = "0.0.0.0:6379"

func main() {
    l, err := net.Listen("tcp", addr)
    if err != nil {
        fmt.Printf("Failed to bind to %s\n", addr)
        os.Exit(1)
    }

    fmt.Printf("Starting Redis server at %s\n", addr)
    // Listen for inputs and respond
    for {
        conn, err := l.Accept()
        if err != nil {
            panic(err)
        }
        go func(conn net.Conn) {
            defer conn.Close()
            buf := make([]byte, 2014) // store out stuff somewhere

            for {
                len, err := conn.Read(buf)
                if err != nil {
                    if err != io.EOF {
                        fmt.Printf("Error reading: %#v\n", err)
                    }
                    break
                }

                // Parse the command out
                command := buf[:len]
                response := Parse(command)

                // Write the response back to the connection
                _, responseErr := conn.Write([]byte(response + "\r\n"))
                if responseErr != nil {
                    fmt.Printf("Error reading: %#v\n", err)
                    break
                }
            }
        }(conn)
    }
}

Now that I had a simple TCP server, I needed to write the Parse function, which would take the command string and return the response string (we’re only going to be dealing with this one data type for simplicities sake). I started by writing a Parse function which would parse out the syntax based on the rules from the spec above into a command string and an array of args for that command.

In order to actually see that it worked, I decided to also go ahead and attempt to implement the PING command. First, let’s write the Parse function:

parser.go

package main

import (
    "fmt"
    "strings"
)

type Parser struct {
    // The number of expected arguments we should see while parsing
    NumberOfExpectedArguments int
    // The character length of the next argument to parse
    LengthOfNextArgument int
    // The amount of arguments parsed
    NumberOfArgumentsParsed int
}

// Parse converts an incoming byte string into usable command with it's args
func Parse(command []byte) string {
    var cmd string
    args := []string{}

    parser := &Parser{
        NumberOfExpectedArguments: 0,
        LengthOfNextArgument:      0,
        NumberOfArgumentsParsed:   0,
    }

    position := 0
    for position < len(command) {
        switch command[position] {
        case '*':
            result, err := getIntArg(position+1, command)
            if err != nil {
                return err.Error()
            }
            // Enforce the number of expected arguments, and preallocate array without the first
            // argument, which will be the first command
            parser.NumberOfExpectedArguments = result.Result
            args = make([]string, parser.NumberOfExpectedArguments-1)

            position += result.PositionsParsed
        case '$':
            result, err := getIntArg(position+1, command)
            if err != nil {
                return err.Error()
            }
            // Enforce the length of the next argument
            parser.LengthOfNextArgument = result.Result

            position += result.PositionsParsed
        case '\r':
            position += 1
        case '\n':
            position += 1
        default:
            // Make sure we haven't reached here improperly through invalid argument syntax
            if parser.NumberOfExpectedArguments == 0 || parser.LengthOfNextArgument == 0 || parser.NumberOfArgumentsParsed >= parser.NumberOfExpectedArguments {
                return stringMsg("Invalid syntax")
            }

            // We've gotten past the checks - parse it!
            parsedItem := string(command[position : parser.LengthOfNextArgument+position])

            if parser.NumberOfArgumentsParsed > 0 {
                // it's an arg, add to args array
                args[parser.NumberOfArgumentsParsed-1] = parsedItem
            } else {
                // The first 'arg' we parse is the primary command
                cmd = parsedItem
            }
            // Move our position forward, as we have parsed the argument
            position += parser.LengthOfNextArgument

            // Prepare our parser for the next parsing sequence
            parser.LengthOfNextArgument = 0
            parser.NumberOfArgumentsParsed += 1
        }
    }

    return ParseCommand(cmd, args)
}

// ParseCommand routes the parsed request to the correct command processor
func ParseCommand(command string, args []string) string {
    cmd := strings.ToUpper(command)
    fmt.Printf("Received '%s' command\n", cmd)

    switch cmd {
    case "PING":
        return PerformPong(args)
    default:
        return errorMsg(fmt.Sprintf("unknown command '%s'", cmd))
    }
}

And then the PerformPong command to handle the request:

commands.go

package main

// PerformPong response pack with "PONG", or optionally a passed in argument
func PerformPong(args []string) string {
    if len(args) > 0 {
        return stringMsg(args[0])
    }
    return stringMsg("PONG")
}

I also defined a utility file to handle the common conversions to the response types (and moved the string->integer parsing from the buffer to the common utility to clean up the parser a bit)

utils.go

package main

import (
    "errors"
    "fmt"
    "strconv"
    "strings"
)

// errorMsg returns a formatted string error message
func errorMsg(msg string) string {
    return fmt.Sprintf("-ERR %s", msg)
}

// stringMsg returns a formatted string message
func stringMsg(msg string) string {
    return "+" + msg
}

// nilBulkStringMsg returns a nil bulk string Msg
func nilBulkStringMsg() string {
    return "$-1"
}

type GetIntArgResult struct {
    // The result of the parsing operation
    Result int
    // How many indexes were parsed from the provided byte array
    PositionsParsed int
}

// getIntArg parses an integer argument from a string representation and returns
// the result and skipped amount of bytes in the array
func getIntArg(startPosition int, arr []byte) (*GetIntArgResult, error) {
    var err error
    result := &GetIntArgResult{
        Result:          0,
        PositionsParsed: 0,
    }

    // Get digits until termination characters
    notDone := true
    position := startPosition

    // Building out a string with concatenation creates a new string each time. String builder
    // is more efficient for incrementally building a string
    var stringVal strings.Builder
    for notDone {
        if arr[position] == '\r' && arr[position+1] == '\n' {
            // We hit the termination, consider it done
            notDone = false
        } else {
            stringVal.WriteByte(arr[position])
        }

        result.PositionsParsed += 1
        position += 1
    }

    if stringVal.Len() == 0 {
        return nil, errors.New("no value was detected")
    }

    resultInt, err := strconv.Atoi(stringVal.String())
    if err != nil {
        return nil, errors.New("failed to parse int")
    }

    result.Result = resultInt
    return result, nil
}

Now let’s start our server by running go run . and attempt to connect up with the redis cli and test it out:

Great! We now have a server that, at minimum, works with the redis-cli and can respond to PINGs.

You may notice the other commands coming in - redis-cli issues various commands on connection to get information about the Redis server it connects to. However, responding to this command with anything, including our current ‘invalid command’ handler, allows us to still connect up and issue commands, so we will ignore this portion (for now).

Writing the SET and the store Link to heading

Redis works by storing items in memory (and optionally persisting to disk, which we won’t implement here….at least not yet). To issue SET commands, we need to set up the handler and an in-memory store.

We’ll start by creating a simple map of strings to a StoreItem struct holding the item’s value, expiry time, and mutex (we’ll get into why the mutex is needed in a bit):

commands.go

package main

import (
    "fmt"
    "strconv"
    "strings"
    "sync"
    "time"
)

type StoreItem struct {
    Value  string
    Expiry time.Time
    Mutex  sync.Mutex
}

var store = make(map[string]*StoreItem)

// ...
// ...

And in the same file, we’ll define the handler:

// ...

// PerformSet stores a value with an expiry in the database
func PerformSet(args []string) string {
    if len(args) < 2 {
        return errorMsg("invalid syntax provided to 'SET'")
    }
    key := &args[0]
    val := &args[1]
    var exp time.Time

    if len(args) > 2 {
        // We have options to parse. Let's parse them!
        position := 2
        for position < len(args) {
            switch strings.ToLower(args[position]) {
            case "px":
                // Set expiry in milliseconds
                if len(args) < position+1 {
                    return errorMsg("no time provided to 'PX'")
                }
                expMillis, err := strconv.Atoi(string(args[position+1]))
                if err != nil {
                    return errorMsg("invalid format provided to 'PX'")
                }
                exp = time.Now().Add(time.Millisecond * time.Duration(expMillis))

                position += 2
            case "ex":
                // Set expiry in seconds
                if len(args) < position+1 {
                    return errorMsg("no time provided to 'EX'")
                }
                expSeconds, err := strconv.Atoi(string(args[position+1]))
                if err != nil {
                    return errorMsg("invalid format provided to 'EX'")
                }
                exp = time.Now().Add(time.Second * time.Duration(expSeconds))

                position += 2
            default:
                return errorMsg(fmt.Sprintf("invalid argument '%s'", args[position]))
            }
        }
    }

    // Default to an hour
    if exp == (time.Time{}) {
        exp = time.Now().Add(time.Hour)
    }

    store[*key] = &StoreItem{
        Value:  *val,
        Expiry: exp,
    }
    return stringMsg("OK")
}

In our implementation, we parse the args coming in, ensure we have both a key and a value and optionally parse the PX and EX flags. If an expiry is not provided, we set the default expiry on the item to 1 hour.

Lastly, we’ll add the handler to the parser:

parser.go

// ParseCommand routes the parsed request to the correct command processor
func ParseCommand(command string, args []string) string {
    cmd := strings.ToUpper(command)
    fmt.Printf("Received '%s' command\n", cmd)
    switch cmd {
    // .. other commands
    case "SET":
        return PerformSet(args)
    // ...
    }
}

Let’s test it out:

Great! But now two open questions:

  • How do we get the item out we just stored?
  • That expiry is just a value on a struct. How does Redis actually expire the item?

We’ll answer both of these questions in the GET implementation.

Writing the GET Link to heading

According to the Redis documentation, Redis expires keys in two ways:

  • Passive expiration: The key is only expired when it is accessed, and it is found to be expired. This is the default behavior.
  • Active expiration: The key is expired periodically, in background, by Redis. This feature is called volatile-ttl and was introduced in Redis 2.1.3.

We’ll handle expire by performing passive expiration, as it’s the simplest to implement. Let’s start by adding a Get handler to our commands.go file:

// PerformGet retrieves a value from the database, if it exists and is not expired. If it is expired, it will be deleted
func PerformGet(args []string) string {
    if len(args) == 0 {
        return errorMsg("no value provided to 'GET'")
    }

    item := store[args[0]]

    // Check if it's null
    if item == nil {
        return nilBulkStringMsg()
    }

    // Item exists - enforce mutual exclusion on the expiry operation and retrieval
    item.Mutex.Lock()
    defer item.Mutex.Unlock()

    // Check the expiry
    now := time.Now()
    if item.Expiry.Before(now) {
        store[args[0]] = nil
        return nilBulkStringMsg()
    }

    return stringMsg(store[args[0]].Value)
}

You’ll notice three operations happening here:

  • We check if the item exists in the store. If it doesn’t, we return a nil bulk string message
  • We lock the mutex on the item, and defer unlocking to the end of the operation. This is to ensure that if the item is being expired, we don’t try to retrieve or expire it at the same time from another thread
  • We check the expiry on the item. If it’s expired, we delete it from the store and return a nil bulk string message

Lastly, we’ll add the handler to the parser:

// ParseCommand routes the parsed request to the correct command processor
func ParseCommand(command string, args []string) string {
    cmd := strings.ToUpper(command)
    fmt.Printf("Received '%s' command\n", cmd)
    switch cmd {
    // .. other commands
    case "GET":
        return PerformGet(args)
    // ...
    }
}

Let’s test it out:

Great! We now have a working Redis server that can store and retrieve items, and will expire items when they are accessed and found to be expired.

But what if we want to get rid of something?

Writing the DEL Link to heading

The last command we’ll implement is the DEL command, which will delete an item from the store. We’ll start by adding the handler to the commands.go file:

// PerformDel deletes a value from the database, if it exists
func PerformDel(args []string) string {
    if len(args) == 0 {
        return errorMsg("no value provided to 'DEL'")
    }

    item := store[args[0]]

    // Check if it's null
    if item == nil {
        return nilBulkStringMsg()
    }

    // Item exists - enforce mutual exclusion on the expiry operation and retrieval
    item.Mutex.Lock()
    defer item.Mutex.Unlock()

    // Delete the item
    delete(store, args[0])

    return stringMsg("OK")
}

And then add the handler to the parser:

// ParseCommand routes the parsed request to the correct command processor
func ParseCommand(command string, args []string) string {
    cmd := strings.ToUpper(command)
    fmt.Printf("Received '%s' command\n", cmd)
    switch cmd {
    // .. other commands
    case "DEL":
        return PerformDel(args)
    // ...
    }
}

Let’s test it out:

Great! We now have a working Redis server that can store, retrieve, expire, and delete items.

Conclusion Link to heading

I honestly had a lot of fun writing this. It’s pretty impressive how simple Redis can be under the hood. On top of that simplicity, Redis powers some of the most complex and demanding systems in the world, such as Twitter’s timeline and Uber’s real-time push platform.

You can find the source code for this project here. I may come back to this in the future to implement the disk persistence and active expiration, but for now, this was a fun project to distract me from the looming MySQL 5.7 EOL.

I guess it’s time to get back to dealing with the new MySQL keyword incompatibilities…😮‍💨💨