Functions as State

Recently I’ve been working on a Slack bot project in Go using the wonderful nlopes/slack client. While nlopes makes it easy to post and consume messages to and from Slack, either via the Web API or Real Time API, I found myself struggling when trying to maintain conversation state between users and the bot.

Any good Slack bot is able to listen to and respond to messages given some cue or trigger. This could be as simple as the bot listening for the phrase ‘hello’ and responding ‘howdy!’. Where things get interesting is when you want your bot to be able to carry on a conversation with a user, or more accurately, a bunch of users at the same time. Doing this in a sane and performant manner is what this post is all about.

The way the nlopes/slack RTM client works is that once you authenticate and call the ManageConnection method in a separate goroutine, you are then able to ‘listen’ to incoming events from Slack via a channel. This channel is appropriately named IncomingEvents and works like this:
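A minimal sketch of that loop follows. To keep it runnable on its own, it uses stand-in types (Event, MessageEvent, ConnectedEvent) and a hand-fed channel rather than the real client; these names are assumptions for illustration. With the library you would range over the RTM client's IncomingEvents channel and switch on each event's Data field in the same way.

```go
package main

import "fmt"

// Stand-ins for the library's event types (assumptions for illustration only).
type MessageEvent struct {
	User string
	Text string
}

type ConnectedEvent struct{}

// Event wraps a payload, much like the values that arrive on IncomingEvents.
type Event struct {
	Data interface{}
}

// describe switches on the concrete type of the event payload.
func describe(ev Event) string {
	switch data := ev.Data.(type) {
	case *ConnectedEvent:
		return "connected"
	case *MessageEvent:
		return fmt.Sprintf("message from %s: %s", data.User, data.Text)
	default:
		return "ignored"
	}
}

func main() {
	incomingEvents := make(chan Event)

	// This goroutine plays the role of the connection manager feeding the channel.
	go func() {
		defer close(incomingEvents)
		incomingEvents <- Event{Data: &ConnectedEvent{}}
		incomingEvents <- Event{Data: &MessageEvent{User: "U123", Text: "hello"}}
	}()

	// Events arrive one at a time and are handled by type.
	for ev := range incomingEvents {
		fmt.Println(describe(ev))
	}
}
```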

As you can see, events come in one at a time and can be switched on by type, allowing you to process each of the Slack event types differently. This is great if events are distinct and do not require any external state to be meaningful, but when you are trying to emulate human conversation, this pattern is not very helpful.

For example, my bot asks the user a series of questions in order, waiting for a response before asking the next one. Depending on how the user previously answered, the line of questioning may change. This is already fairly complex, but the same bot will (hopefully) also be having ‘conversations’ with multiple users simultaneously, each at a different stage of the conversation.

Initially, I tried to represent this with a simple map of userIDs to state iota values such as:
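Something along these lines, as a sketch only: the state names and the next helper are made up for illustration, but the shape (a map from user ID to an iota-based state, advanced by a switch) is the approach described above.

```go
package main

import "fmt"

// state enumerates conversation stages; the names are hypothetical.
type state int

const (
	stateNone state = iota
	stateAskedName
	stateAskedColor
	stateDone
)

// next advances a user's state. Every new stage means another case here,
// and nothing records how the user arrived at the current value.
func next(states map[string]state, userID string) state {
	switch states[userID] {
	case stateNone:
		states[userID] = stateAskedName
	case stateAskedName:
		states[userID] = stateAskedColor
	default:
		states[userID] = stateDone
	}
	return states[userID]
}

func main() {
	states := map[string]state{}
	fmt.Println(next(states, "U1"))
	fmt.Println(next(states, "U1"))
	fmt.Println(next(states, "U2"))
}
```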

But this quickly became a nightmare to reason about, for a couple of reasons:

  1. There was no way to easily see the history of states or how a user got to a certain state; we ‘forgot’ the history each time a *slack.MessageEvent was received.
  2. Inside the MessageEvent block, the code was a mess of if/else or switch statements, each doing something different depending on the value of state.

Thinking in States

I soon realized that what I was really trying to model was a simple state machine. However, as a Go newbie, I didn’t really know the best way to represent this abstraction in code.

I brought this up with a friend over beers, and he pointed me to an awesome talk by Rob Pike on Lexical Scanning in Go, which is all about the implementation of the Go template lexer. If you haven’t seen it, please do yourself a favor and watch it. The simplicity and elegance of how Rob implements a state machine using function types in Go really opened my eyes.

Basically the idea is that you define a function type stateFn that itself returns a stateFn. This is a recursive type and is a little mind-bending at first if you are not used to looking at such types. Here’s the implementation in lex.go.
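In lex.go the type is declared as type stateFn func(*lexer) stateFn. The sketch below shows the idea in isolation; the toy lexer struct, the two states, and the run loop are assumptions for illustration, not code copied from the template package.

```go
package main

import "fmt"

// lexer is a toy stand-in for the real lexer struct; only the shape matters here.
type lexer struct {
	visited []string
}

// stateFn represents a state as a function that returns the next state.
type stateFn func(*lexer) stateFn

func lexStart(l *lexer) stateFn {
	l.visited = append(l.visited, "start")
	return lexEnd
}

func lexEnd(l *lexer) stateFn {
	l.visited = append(l.visited, "end")
	return nil // nil means there is no next state
}

// run drives the machine until a state returns nil.
func run(l *lexer) {
	for state := stateFn(lexStart); state != nil; {
		state = state(l)
	}
}

func main() {
	l := &lexer{}
	run(l)
	fmt.Println(l.visited)
}
```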

I adapted this pattern to fit my use case by defining a type chatFn that takes a pointer to Bot as an argument and returns a chatFn. This way each state can be represented as a separate function, allowing you to focus only on what that state should do and nothing else.

Here is a contrived example:
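A sketch of what such an example could look like. The Bot fields here (pause and sent) and the say helper are assumptions that let the code run without Slack; a real bot would post through the client instead of appending to a slice.

```go
package main

import (
	"fmt"
	"time"
)

// Bot is a stand-in: it records messages instead of posting them to Slack.
type Bot struct {
	pause time.Duration // how long the wait state sleeps
	sent  []string      // messages "sent" to the user
}

// chatFn is a conversation state that returns the next state.
type chatFn func(*Bot) chatFn

func (b *Bot) say(text string) { b.sent = append(b.sent, text) }

func hello(b *Bot) chatFn {
	b.say("Hello!")
	return wait
}

func wait(b *Bot) chatFn {
	time.Sleep(b.pause)
	return goodbye
}

func goodbye(b *Bot) chatFn {
	b.say("Goodbye!")
	return nil // ends the conversation
}

// chat runs states until one of them returns nil.
func (b *Bot) chat() {
	for state := chatFn(hello); state != nil; {
		state = state(b)
	}
}

func main() {
	// The post's example pauses for 5 seconds; shortened here so the demo ends quickly.
	b := &Bot{pause: 500 * time.Millisecond}
	b.chat()
	fmt.Println(b.sent)
}
```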

Here’s the flow on a call to chat:

  1. state would be initialized to the value of the hello function
  2. Since state does not equal nil, call the value of state and store the result again in state
  3. Repeat until state equals nil

This would result in the following:

  1. hello would execute, sending ‘Hello!’ to the user and return wait
  2. wait would execute, sleeping for 5 seconds and return goodbye
  3. goodbye would execute, sending ‘Goodbye!’ to the user and return nil, breaking us out of the for loop

Notice however that we don’t have the ability to get the actual MessageEvent from Slack. Also, this still doesn’t solve the issue of being able to ‘pick up’ a conversation in progress when a message from a user who is already ‘chatting’ comes in.

We’ll solve those issues now.

Representing the Conversation

Now that we have a simple way to represent states in the conversation with functions, we still need some way of representing the conversation itself. We can use a struct to accomplish this:
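A sketch of such a struct, assuming a stand-in MessageEvent type in place of the library's and a hypothetical newConversation constructor:

```go
package main

import "fmt"

// Bot is left empty here; only the conversation type is of interest.
type Bot struct{}

// MessageEvent is a stand-in for the library's *slack.MessageEvent.
type MessageEvent struct {
	User string
	Text string
}

// chatFn is a conversation state that returns the next state.
type chatFn func(*Bot) chatFn

// conversation tracks one user's dialogue with the bot.
type conversation struct {
	userID   string             // the user's ID, as reported by Slack
	incoming chan *MessageEvent // messages from this user are sent here
	state    chatFn             // the current state of this conversation
}

// newConversation is a hypothetical constructor that allocates the channel.
func newConversation(userID string) *conversation {
	return &conversation{
		userID:   userID,
		incoming: make(chan *MessageEvent),
	}
}

func main() {
	c := newConversation("U123")
	fmt.Println(c.userID, c.state == nil)
}
```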

A few things to notice:

  • conversation has a userID that we get from Slack
  • an incoming channel has been declared; this is the channel to which we will send all incoming messages from Slack that are generated by this user
  • state is defined as a chatFn; this is the current state of the conversation we are in with this user

I also create a map for the bot to keep track of all conversations in progress keyed by the user’s ID.

Note: This is just an example; in the real implementation I protect all access to this map with a sync.RWMutex to guard against concurrent access.
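One way the guarded map could look, as a sketch. The find, begin, and end helpers are hypothetical names, not taken from the real implementation.

```go
package main

import (
	"fmt"
	"sync"
)

type conversation struct{ userID string }

// Bot keeps every in-progress conversation, keyed by the user's ID.
type Bot struct {
	mu            sync.RWMutex
	conversations map[string]*conversation
}

// find returns the conversation for userID, if one is in progress.
func (b *Bot) find(userID string) (*conversation, bool) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	c, ok := b.conversations[userID]
	return c, ok
}

// begin returns the existing conversation or registers a new one.
func (b *Bot) begin(userID string) *conversation {
	b.mu.Lock()
	defer b.mu.Unlock()
	if c, ok := b.conversations[userID]; ok {
		return c
	}
	c := &conversation{userID: userID}
	b.conversations[userID] = c
	return c
}

// end removes a finished conversation.
func (b *Bot) end(userID string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	delete(b.conversations, userID)
}

func main() {
	b := &Bot{conversations: make(map[string]*conversation)}

	// Several goroutines register conversations at once; the mutex keeps the map safe.
	var wg sync.WaitGroup
	for _, id := range []string{"U1", "U2", "U1"} {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			b.begin(id)
		}(id)
	}
	wg.Wait()

	_, ok := b.find("U1")
	fmt.Println(len(b.conversations), ok)
}
```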

Now let’s implement the *slack.MessageEvent handling block from earlier, and also update the chat function:
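A minimal sketch of how the two pieces could fit together, under some assumptions: MessageEvent is a stand-in for *slack.MessageEvent, handle plays the role of the MessageEvent case inside Run, say records messages instead of posting to Slack, and the two states are invented. One deliberate difference from the description that follows: rather than closing incoming, this sketch closes a separate done channel, so a message that arrives just as a conversation ends is retried as a new conversation instead of being sent on a closed channel.

```go
package main

import (
	"fmt"
	"sync"
)

// MessageEvent stands in for the library's *slack.MessageEvent.
type MessageEvent struct{ User, Text string }

// chatFn is one conversation state; it returns the next state, or nil to finish.
type chatFn func(*Bot, *conversation) chatFn

type conversation struct {
	userID   string
	incoming chan *MessageEvent
	done     chan struct{} // closed when the conversation is over
	state    chatFn
}

type Bot struct {
	mu            sync.Mutex // guards conversations and sent
	conversations map[string]*conversation
	sent          []string       // stand-in for posting to Slack
	wg            sync.WaitGroup // lets callers wait for running conversations
}

func (b *Bot) say(text string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.sent = append(b.sent, text)
}

// hello consumes the user's first message and asks a question.
func hello(b *Bot, c *conversation) chatFn {
	<-c.incoming
	b.say("Hello! What is your name?")
	return askName
}

// askName waits for the reply and ends the conversation.
func askName(b *Bot, c *conversation) chatFn {
	ev := <-c.incoming
	b.say("Nice to meet you, " + ev.Text + "!")
	return nil
}

// chat runs one user's state machine, then removes it from the map.
func (b *Bot) chat(c *conversation) {
	defer b.wg.Done()
	for c.state = hello; c.state != nil; {
		c.state = c.state(b, c)
	}
	b.mu.Lock()
	delete(b.conversations, c.userID)
	b.mu.Unlock()
	close(c.done)
}

// handle routes a message to its user's conversation, starting one if needed.
func (b *Bot) handle(ev *MessageEvent) {
	b.mu.Lock()
	c, ok := b.conversations[ev.User]
	if !ok {
		c = &conversation{
			userID:   ev.User,
			incoming: make(chan *MessageEvent),
			done:     make(chan struct{}),
		}
		b.conversations[ev.User] = c
		b.wg.Add(1)
		go b.chat(c)
	}
	b.mu.Unlock()

	select {
	case c.incoming <- ev:
	case <-c.done:
		b.handle(ev) // the conversation ended after the lookup; start a fresh one
	}
}

func main() {
	b := &Bot{conversations: map[string]*conversation{}}
	for _, ev := range []*MessageEvent{
		{User: "U1", Text: "hi"},
		{User: "U2", Text: "hey"},
		{User: "U1", Text: "Ann"},
		{User: "U2", Text: "Bob"},
	} {
		b.handle(ev)
	}
	b.wg.Wait()
	fmt.Println(b.sent)
}
```

Because each conversation runs in its own goroutine, the order in which replies to different users are recorded is not fixed; only the order within a single user's conversation is.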

A couple of things happen in the Run method:

  • As each *slack.MessageEvent is received, we check to see if we are already in a conversation with this user
  • If we are, we simply send the incoming message to that conversation, which is running in a separate goroutine
  • If not, we create a new conversation, add it to the map, and then kick off chat in a goroutine, forwarding on the message

We also had to update the chat method so that:

  • chat now accepts a *conversation as an argument to initiate and update the conversation state
  • The chatFn signature has been modified to take in a *Bot as well as a *conversation in order to be able to access the incoming channel and other metadata
  • We close the incoming channel once the conversation is over
  • We also delete the conversation from the map to represent the conversation being complete

Note: Again, the closing of the incoming channel and the deletion of the conversation should be protected by a mutex in real life to guard against race conditions.

That’s pretty much it. We now have a better way to represent conversational states and can also carry on multiple conversations ‘simultaneously’ with multiple users!

I’m still a Go newb, so I may have muffed some things up with the implementation. However, I still think that this is a good pattern that every new Go programmer can add to their arsenal.

As always, please let me know what you thought of this post in the comments or on Twitter.

Mark Phelps

Senior Software Engineer @codeship. Aspiring entrepreneur.