All code for this article can be found here.
You can see the bot in action here.
What is Wordament, and why cheat?
Wordament is a word-searching game for Android, iPhone and Windows Phone (hah). The game displays a four by four grid of letters. The player’s job is to swipe through the letters to find words. Points are awarded for every word found, based on complexity. Every player plays the same board simultaneously and the game ranks players after each 90-second round.
I was always pretty bad at this game, making it to the top 35% on a good day. While playing I noticed that the players at the top of the scoreboard were either extremely good, or not legit. Many have scores and word counts seemingly too high to be achieved by a human being in 120 seconds. So I set out to beat these top players by writing a bot.
Why bother? I don’t care about getting the best score on Wordament, it just seemed like an interesting technical challenge.
Because I wanted to use this challenge to play with fun technologies my approach was a little convoluted. I wrote a web service that runs on the Microsoft Azure Websites platform (free tier). This service takes a Wordament board as input and gives all possible solutions to the board as output.
I then wrote an Android UI automation test. This test takes a screenshot of the board and finds letters on it. It sends the letters up to my Azure service to get the possible solutions. It then uses UI automation to swipe these solutions into the game board.
I call these components the ‘solver’, the ‘reader’ and the ‘player’.
Here we need a component that can find every solution given a 16-letter Wordament board. A ‘solution’ in this context is a word plus the tile positions on the board that it will need to swipe to input that word. For example if I give it the board
It might output something like
I would leave it up to the device-specific code to transform these board coordinates into screen coordinates.
Creating a dictionary
First we need a list of possible words. For this I Google’d ‘word list’ and ‘dictionary list’ and found a bunch of options to download. Towards the end of this project I found that every individual list I found had two deficiencies. None of them had the complete set of words for any given board, and they all had lots of words that you wouldn’t expect to find on a Wordament board.
I solved this by creating a quick program. It takes a directory of word lists and returns all the words which appear in at least three lists. This way every word in my final set is common enough to be found in many dictionaries (high likelihood of being a possible Wordament word). Also any common word missing from one word list is likely to be found in the others.
The data structure
Here’s how I wanted to search a Wordament board for a given word. For each letter on the board, find out if there are any words that start with it and each of the neighbouring letters. If there are, for each neighbouring letter, if any words in the current list finish with this letter then add the solution to the result. Repeat for every connected letter.
Here’s a pseudocode algorithm
It happens that there’s a data structure purpose built to perform these kinds of prefix searches. It’s called the prefix tree, or trie (pronounced ‘tray’). I’ve written before about the prefix tree and how I implemented mine in F#. You can find the code here or just grab the binaries from Nuget. I won’t go into details about the prefix tree here. Suffice it to say it exposes exactly the interface I need to perform this prefix search. We’ll see this interface in action in this article.
Searching for solutions
To solve the board with the given 16 letters (see the reader section below for how we get them) we just need to pass them through the search algorithm. We also need to pass our in trie which has been pre-loaded with the word list, described above. Here’s the actual F# code.
This is worth explaining.
We define a couple of types. Solution, which is just a word and a list of integers giving the positions of the letters. Letter, which is a node in the board which points to its neighbours (so we can represent the board as a graph). And Board, which takes a string representing the 16 letters and hooks them all together based on their locations on the grid (this is the generateNeighbours function).
Next we define a depth first search. This is a normal DFS algorithm, with the added feature that it takes a “short circuit” function. If this function returns true at any point during the search then the search will halt. This is so that when we search we can tell it to stop if the current word is not a valid prefix for any word in our trie.
The result of this search function is the list of all solutions, including the path required for each one.
The web service
I wrote a web service to host my solver code. It’s a simple ASP.NET MVC service with three controllers.
The first is named ReaderController. This is capable of taking a bitmap image (a screenshot of a Wordament board) from the body of a request and returns the letters on the board as text (more on the Reader later). I mostly used this for testing and debugging - it’s not used in the final application.
The next controller is called SolutionController. It takes 16 letters (the characters on a Wordament board). It sends them to my solver and returns the solutions for the board. Like ReaderController this was mostly for testing and debugging.
And the third controller is the poorly-named SolverController. This guy combines the behaviours of the previous two controllers. It reads a Wordament screenshot from the request and sends the letters as text to my solver. It then returns the solutions to the caller. This one was used by my Android program to find the solutions from a Wordament screenshot.
Getting the letters from a board took a bit of effort. Presumably to prevent cheating, Microsoft doesn’t put any kind of identifying metadata onto the objects that make up a board. My only option was to analyse a bitmap of the phone screen with the board on it.
I needed a simple OCR algorithm that would allow me to identify letters from bitmaps. After playing with some pretty crappy web services I decided to roll my own. My implementation is simple and inflexible, but it works well on Wordament boards. All the code we’ll discuss here is available in Reader.fs in my Bitbucket repository. This is written in F# but would be easy to port to any language.
First I needed to cut up a Wordament board so that I had individual bitmaps of each of the 16 letters. So I needed to search the bitmap for the orange letter tiles. I start by defining the start and end rows that I wanted to perform the search over. Here I just hard coded ratios of the screen size. I always used ratios and not absolute pixel values because I was testing this app on many devices with different resolutions.
The idea here is that I wanted to divide up the board so that the only orange pixels inside the boundaries are letter tiles. This way I can use that distinctive orange colour to easily divide the letter tiles up. This divided up a Wordament board like this:
Next is the getSquares function. Here I iterate through each row of pixels, looking for the first orange pixel on the row. When I find an orange pixel, I iterate across it and then down it until I run out of orange pixels. Now these boundaries make up a bounding rectangle for this tile. I then continue my search for the next tile by starting from the upper right of this rectangle and repeat. When I run out of room on this pixel row, I jump down a full tile’s height and start searching again from this row. Here’s the full code.
There are probably less stupid ways of finding bounding boxes but this works well ¯\(ツ)/¯ This divides the board into 16 square bitmaps, like so:
To match each bitmap to a letter, I divided the bitmaps into quarters and assigned each quadrant a value based on the ratio of white to orange pixels. For example, here’s the letter ‘G’, with its so-called ‘letter vector’:
How did I get these letter vector values? By painstakingly analysing a ton of screenshots of Wordament boards. In fact, if you look at my code you’ll see I’m actually missing Q and X. these letters appear so infrequently in Wordament that I wasn’t able to snap a screenshot to analyse them.
You may also notice I do not assign a value to the upper left quadrant. After implementing and testing this whole system, I found some letters were not matching when they should have been. I discovered that the digit in the upper left of the letter tile, the score, changes value on certain boards. This happens in special rounds, like where letters in corners are more valuable. This was enough to throw off my simple algorithm. I could have made my algorithm more permissive, but I found that the three other quadrants were enough to uniquely identify a letter. So I just threw that upper-right quadrant away.
So now I have a bitmap from the letter tile and I have a pre-defined vector to letter map. I can break my bitmap tile into quadrants. I can transform my quadrants into a vector of ratios of white to orange. And I can find the vector in my map that most closely matches the vector from my bitmaps. Here’s the code for doing this:
Now we just combine these two functions to break down the board into 16 bitmap letter tiles, and pass these tiles through the toLetter function. We wind up with 16 letters and return them. We can pass these letters into the solver defined above and out pops the solutions, ready to be input into the game.
Not surprisingly the player was the most finicky and least fun part of this project, but it’s fairly quick to describe.
I created a simple do-nothing Android project with a single test class. The test class takes a screenshot of the game board and sends it up to my web service, which sends it to the reader and solver. When the test receives the solutions it inputs them into the application.
Screenshotting the board
Screenshotting a board in an Android test is straight-forward.
Once we’ve taken the screenshot it needs to be prepared and sent to the web service. The important part here is how we compress the screenshot. Remember how we wrote quite a brittle reader? We need to make sure that the orange to white ratios in the image match our expectations. This means using a compression method that preserves colors and edges. JPEG is definitely out, but PNG is a good fit.
I also decided to scale the image down by a factor of two on each side. This preserves the ratios of white to orange but reduces the size of the image by a factor of four, which significantly improves performance (especially if you’re using this on a high-resolution device like a Pixel phone). It could probably scale perfectly fine to much smaller sizes but this was enough for me.
I used pretty standard JSON handling to manage the request and response.
Now that we have the solutions we need to figure out how to input them. The first step is figuring out where is the game board on the screen and where are the letter tiles. Here are a couple of utility methods for achieving this. This uses standard Android UI automation test tools for querying and interacting with the Android object model.
The result of these methods is the coordinates of the UIObject2’s that represent the letters. So now we know which letters to swipe and where they are, it’s a matter of performing the swiping with automated testing tools:
The Android test framework supplies very simple API for us to swipe through the UI so there’s little work for us to do here.
The result, and some problems
So here’s a quick video of the result. The bot will get us to the top of the leaderboard on any game it’s capable of solving well.
I didn’t go all the way with this project. There are certain boards that this bot won’t solve. There’s an example of such a board to the right.
You can see that this board has a letter combination. Ammending the bot to handle letter combos like this would be straight-forward, except for generating the letter vectors to identify every possible combination. Either a LOT of painstaking work or a smarter OCR system wouwld be required.
Another limitation is the special games that Wordament presents - some of my favourite games. These include games like “E in the corners” where every corner will have an ‘E’ and these tiles are worth much more points. Now, in most cases this bot will do well in such games as a result of being exhaustive, but as discussed in the word list section our dictionaries are not in fact completely exhaustive. If we are missing a significant number of ‘E’ results in our word list then we may miss out on too many points to win the game.
The good news is that in either of these cases the bot will still perform. Letters that are not recognised are returned as “*”, a letter that never occurs. So although that tile will never be used, the bot will continue to work around it.
These are not problems I intend to solve, as this bot was just a muck-around project for me. If you update the bot yourslef I’d love to see how, and am keen to hear any other feedback you have to offer. Thanks for reading!