Before looking through concrete examples, let's compare concurrency and parallelism. A while back I read an interesting definition comparing the two:
Concurrency is the composition of independently executing processes, while parallelism is the simultaneous execution of computation. Parallelism is about executing many things at once; its focus is execution. Concurrency is about dealing with many things at once; its focus is structure.
This explanation does a good job of comparing and contrasting the two, which I've often seen confused. Understanding the difference is useful in this age of software development, where multi-threaded patterns are increasingly prevalent in cloud-native architecture.
Communicating Sequential Processes
Go provides an easy-to-understand paradigm of concurrency known as Communicating Sequential Processes (CSP). In short, this is a message-passing paradigm of concurrency built on channels, which can be thought of as queues of messages.
You may also be familiar with some other popular paradigms of concurrency seen in other languages:
- Actor Model — Erlang, Scala
- Threads — Java, C#, C++
I won't go into a full review of these, as that could easily be an entire discussion in itself. I would like to note, though, that no one paradigm is better at solving concurrency; each has its own tradeoffs and use-cases. There are even some neat community-written libraries where paradigms have been implemented across different languages, like the actor model in Go!
Like most things in Go, its implementation of concurrency shines in its simplicity and efficiency. This is partly due to first-class language support for both goroutines and channels, which provide an easy-to-use interface for concurrently passing messages around your application at runtime. In addition, the very low overhead of goroutines within Go's scheduler allows you to potentially spawn millions of simultaneous tasks, dramatically less overhead than the thread implementations seen in other languages. Ardan Labs has a much more detailed article explaining how it works.
For this article, I will stick with some examples of best practices, techniques and applications for using Go’s concurrency API. If you’re new to Go and you haven’t already, I recommend going through A Tour of Go first.
Now for the fun part, let’s jump into some examples and patterns of applying these concepts to real problems.
For the sake of this article, I'll mention two different types of processes and their scopes: request-level and server-level processes. When writing applications, especially server applications, you'll work with both of these scenarios very often.
Request-Level Processes

A request-level process can be thought of as a temporary process that lives for the scope of a single request.
One of the most common examples of a request-level process is a request being received by an HTTP server. You might be writing a service that serves a RESTful API over HTTP. As you receive requests, you'll want to process each one asynchronously, possibly executing some business logic and/or accessing some data storage medium. Each request should be handled independently of the others to prevent blocking new requests. Another common example is reading messages off of message queues like Kafka or RabbitMQ concurrently and processing each message as a stream.
Let’s look at some examples where we use goroutines and channels in this scenario. Below is a very simple use-case with a goroutine where we spawn an asynchronous process and wait.
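A minimal sketch of what that example might look like (the `waitForResult` helper name is my own):

```go
package main

import (
	"fmt"
	"time"
)

// waitForResult spawns an anonymous goroutine that reports back
// over a send-only channel, then blocks until the message arrives.
func waitForResult() string {
	ch := make(chan string)

	// The parameter type chan<- string makes the channel send-only
	// inside the goroutine.
	go func(out chan<- string) {
		time.Sleep(1 * time.Second)
		out <- "done"
	}(ch)

	// This receive blocks until the goroutine sends a message.
	res := <-ch
	return res
}

func main() {
	fmt.Println(waitForResult()) // prints "done" after about a second
}
```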
As seen, we spawn a goroutine from an anonymous function. We pass a send-only channel into the goroutine, which sends a value back to the main goroutine after waiting for one second. The receive into `res` blocks until a message is sent on the channel.
Alright, that’s the simplest use-case, let’s expand on that and process multiple things asynchronously.
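A sketch of such a static example, assuming we know up front there are exactly two tasks (the `runTwo` helper and the squaring workload are illustrative):

```go
package main

import (
	"fmt"
	"sync"
)

// runTwo spawns exactly two goroutines and collects both results.
func runTwo() []int {
	var waitGroup sync.WaitGroup
	waitGroup.Add(2) // we know the number of goroutines up front

	// The buffer size matches the total expected responses,
	// so neither send can block.
	results := make(chan int, 2)

	for i := 1; i <= 2; i++ {
		go func(n int) {
			defer waitGroup.Done()
			results <- n * n // stand-in for real asynchronous work
		}(i)
	}

	waitGroup.Wait()
	close(results)

	var out []int
	for res := range results {
		out = append(out, res)
	}
	return out
}

func main() {
	fmt.Println(runTwo())
}
```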
The Go standard library's `sync` package provides a mechanism for synchronizing goroutines called `WaitGroup`. As you can see in the example, we create a `WaitGroup` and call `waitGroup.Add(2)`, since we know up front how many goroutines we'll spawn. This also means we can set the channel's buffer size to match the total expected responses. That works, but it's a static example and doesn't provide much flexibility. Let's change this example and make it dynamic.
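A dynamic version might be sketched like this (the `processWords` helper and the three-slot buffer are my own choices):

```go
package main

import (
	"fmt"
	"sync"
)

// processWords fans each word out to its own goroutine, which sends
// the word into a buffered channel; a separate goroutine collects them.
func processWords(words []string) []string {
	wordCh := make(chan string, 3) // channel with a small buffer size
	done := make(chan struct{})
	var wg sync.WaitGroup
	var results []string

	// For each word in the "stream", increment the wait group and
	// spawn a new goroutine.
	for _, word := range words {
		wg.Add(1)
		word := word // copy the loop variable; see the gotcha discussed below (unneeded as of Go 1.22)
		go func() {
			defer wg.Done()
			wordCh <- word
		}()
	}

	// Separate goroutine reads messages off the channel until it closes.
	go func() {
		for w := range wordCh {
			results = append(results, w)
		}
		close(done) // unblock the main goroutine
	}()

	wg.Wait()     // block until all goroutines are Done
	close(wordCh) // signal the reader: we've finished producing words
	<-done
	return results
}

func main() {
	words := []string{"the", "quick", "brown", "fox", "jumped", "over", "the", "lazy", "dog"}
	for _, w := range processWords(words) {
		fmt.Println(w)
	}
}
```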
Now we have a more interesting use-case. Let’s step through it.
- Create a channel with a small buffer size.
- Create a slice of words (let's assume we don't know the size and pretend it's a stream of words.)
- For each word in the stream, we increment the wait group and spawn a new goroutine.
- Then spawn a separate goroutine to read the messages off the channel until the channel is closed.
- Wait and block until all goroutines are Done.
- Close the `wordCh` channel. This is important: we are signaling to the reader goroutine that we have finished processing new words.
- Close the `done` channel to unblock the main goroutine.
Note: It's usually important to have some mechanism to stop or short-circuit a goroutine! I provide a better example of this below with `context`.
What is the expected output?
the quick brown fox jumped over the lazy dog
No, not likely; that would assume the process is synchronous. We have to remember that these messages are processed asynchronously by goroutines, so we lose any guarantee of ordering in the output. There's still another problem, though.
The actual output will be "dog" printed N times, once for each word. What 🤯?! The reason is a common Go gotcha, a frustrating intended behavior that all Go developers run across at some point. There's a good explanation on the official Go GitHub Wiki, but simply put, the closure in the loop body captures the loop variable itself rather than its value for that iteration. Because the main goroutine's loop finishes before the spawned goroutines' Print calls run, `word` references the last element of the slice and each Print outputs the same value. The solution is simple: copy the variable (or pass it to the goroutine as an argument). Worth noting: as of Go 1.22, each loop iteration gets its own copy of the variable, so this gotcha only bites on earlier versions. Since my first time running across this, the GoLand IDE has added a warning for the problem. Nice!
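Passing the value as a goroutine argument is an equivalent fix; a sketch (the `collect` helper is illustrative, with a mutex guarding the shared slice):

```go
package main

import (
	"fmt"
	"sync"
)

// collect spawns a goroutine per word, passing the word as an
// argument so each goroutine receives its own copy of the value.
func collect(words []string) []string {
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		out []string
	)
	for _, word := range words {
		wg.Add(1)
		go func(w string) { // w is a per-call copy of word
			defer wg.Done()
			mu.Lock()
			out = append(out, w)
			mu.Unlock()
		}(word)
	}
	wg.Wait()
	return out
}

func main() {
	fmt.Println(collect([]string{"the", "quick", "brown", "fox"}))
}
```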
One last example I'd like to cover, which I believe is important for any request-level process, is handling deadlines. Whenever you perform any kind of I/O-bound operation, whether a network request or even a long-running in-memory calculation, you should account for it taking too long or, worst case, never finishing. Similar to how we signaled our goroutine to stop in the previous example, we can do the same with a deadline. One way of handling this problem is with the standard library's `context` package. A context can be used in a few different ways: as a timer and/or deadline signaler, or as a key-value map (not recommended).
In this example, we use the deadline signal. We fork the background parent context and set a deadline one second in the future. When the deadline passes, the context closes its Done channel, which signals our goroutines to perform any cleanup and return. Alternatively, you can use any channel to signal a goroutine to stop when there's no specific deadline.
Great, so now we’ve covered some common examples for a request-level use of goroutines and channels. Let’s move on.
Server-Level Processes

I like to think of server-level processes as long-running processes that live throughout your application's runtime. In the same HTTP server example, you can think of the HTTP server instance itself as the long-running process. There may be cases where you'll need to write custom processes that live throughout your runtime, such as a custom cache handler, router, message queue consumer/producer, scheduler, or job runner.
Again, let’s look at a simple use-case first with a token cache and refresher:
Perhaps our application depends on using a token with HTTP requests, this might be a JWT to authenticate against some remote resource. We want to store our token in-memory and automatically refresh it when it expires. To do this, one option is to create a process that will run from the start of our application till its shutdown while periodically refreshing the token. Typically, this process will be created and spawned in the initialization of our application.
In the example, we create a `TokenCache` struct that provides a `GetToken` method. We then spawn a new goroutine running the `Start` process, which accepts a done channel for shutting down, like the previous examples, and will continue processing until that channel is closed. It periodically refreshes the token and locks the `TokenCache` with a mutual exclusion lock (mutex) in case another goroutine calls `GetToken` in the middle of a refresh.
What can go wrong in this scenario? What happens if the underlying refresh method panics? Our goroutine will panic and crash the entire application! That's obviously not acceptable behavior for these types of processes. Unlike the request-level examples, which usually have middleware to recover from a panic within an HTTP request (or other request flow), here we have to implement our own recovery logic. In most cases it's ideal to emit a metric or log the problem, then recover and keep trying. If the problem continues to occur, you can set a threshold for how many of these panics you'll tolerate. When there's no way to recover, another option is to return an error or signal back to the main application goroutine that we need to shut down gracefully, returning a status code indicating the failure. This matters because there may be in-memory messages in other processes that need to be flushed through the system before closing.
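A minimal sketch of that recover-and-continue logic (the `safely` helper name is my own):

```go
package main

import "fmt"

// safely runs work, recovering from any panic so a long-running
// process doesn't take down the whole application.
func safely(work func()) {
	defer func() {
		if r := recover(); r != nil {
			// In a real server you'd emit a metric or log here,
			// and perhaps count panics against a tolerance threshold.
			fmt.Println("recovered from panic:", r)
		}
	}()
	work()
}

func main() {
	safely(func() { panic("refresh failed") })
	fmt.Println("still running") // the panic was contained
}
```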
Additionally, having the caller invoke `GetToken` and potentially wait on a lock isn't a very message-driven pattern for concurrency. A better solution is to give callers a channel that receives the token when it's available. This can be thought of as a Future or Promise.
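One way to sketch that Future (the token value and names are illustrative): the caller receives a buffered channel, fulfilled by a goroutine that still takes the lock internally.

```go
package main

import (
	"fmt"
	"sync"
)

type TokenCache struct {
	mu    sync.RWMutex
	token string
}

// GetToken hands the caller a channel that will receive the token
// once it's available: a simple Future/Promise.
func (c *TokenCache) GetToken() <-chan string {
	resp := make(chan string, 1) // buffered so the send never blocks
	go func() {
		c.mu.RLock()
		defer c.mu.RUnlock()
		resp <- c.token
	}()
	return resp
}

func main() {
	cache := &TokenCache{token: "jwt-abc123"} // hypothetical token value
	fmt.Println(<-cache.GetToken())
}
```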
We can still use the mutex internally as the state for our `TokenCache`, but I think we can do better. Some nice-to-have enhancements would be waiting on a response for a potentially new token rather than blocking on the lock, and supporting `context` as a parameter to our `GetToken` method. This benefits our callers by letting them better handle their own logic.
By creating a `request` message and processing the request directly in our refresh process, we can pass our `ctx` to the refresh implementation directly. It's usually an anti-pattern to store a context in a struct, but for the use-case of a message it's perfectly acceptable. There's a small chance a caller will see a delay if they happen to request right during the expiration; otherwise the refresh is caught in the `<-time.After(…)` case, so sequential requests fulfill their Future without being blocked by the refresh process. A couple of things I didn't include in this example: with the new changes, when `done` is closed we should close the request channel too, and add a check in our `GetToken` method indicating the shutdown. We could expand on this further by changing the return type to a `<-chan Response` message that can include an error.
I didn’t explain all of the syntax in the examples, so here’s a quick overview of some of the features around the Go goroutine and channel API:
- You can set the buffer size of a channel with `make(chan T, size)`.
- You can use the `len` function on a channel, and you can check its capacity with `cap`.
- You can constrain a channel's type in parameters and return types to be send-only, `chan<- T`, or receive-only, `<-chan T`.
- You can close a channel with `close(ch)` and detect a closed channel with the two-value receive `res, ok := <-ch`, where `ok` is `false` once the channel is closed and drained.
- You can `range` over a channel, which loops until the channel is closed.
- You cannot get a return value from a goroutine, because `go` is not an expression; e.g., `val := go something()` doesn't compile. Instead, use a channel, as in the examples, to pass a message back to the parent goroutine (this is particularly useful for error handling). Remember, we're using CSP, a message-passing concurrency paradigm.
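Most of these features can be demonstrated in a few lines:

```go
package main

import "fmt"

func main() {
	ch := make(chan int, 3) // buffered channel with capacity 3
	ch <- 1
	ch <- 2
	fmt.Println(len(ch), cap(ch)) // 2 3

	close(ch) // no more sends allowed

	// range drains the remaining values, then exits because ch is closed.
	for v := range ch {
		fmt.Println(v)
	}

	// Two-value receive: ok is false once the channel is closed and empty.
	v, ok := <-ch
	fmt.Println(v, ok) // 0 false
}
```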
Hopefully this article will be useful to some of you! When I first started developing in Go and getting into higher-performance server-side development, I found it difficult to find resources on these kinds of scenarios. These are just some best practices I've learned from my own experience and compiled from what I've been able to learn from others.
If you’re interested in taking your goroutine and channel experience to the next level, a lot of these techniques can be written more elegantly as a stream or pipeline. The official Go blog has some more advanced examples if you’re interested.
Thanks for reading. Feel free to leave a comment if you have any questions or see anything wrong!