This is the backend logic for a web app that plays the word game Codenames between a human and an AI opponent.
The AI is referred to as a 'search model'. It is trained to find the word in a large database (all 63,000 words from Merriam-Webster) that best satisfies an algorithm created to play Codenames. A trained model is included in the repository, and the code used to train it can be found here ''. The model is relatively small (<200 MB) because it is trained on embeddings generated by a pre-trained encoder-only transformer; the embeddings are cached in an SQLite database.
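As a rough illustration of the embedding cache, the sketch below stores each word's vector as a float32 blob in SQLite. The table name `embeddings`, the column names, and the helper functions are assumptions made for this example and are not taken from the repository.

```python
import sqlite3
from array import array


def open_cache(path=":memory:"):
    """Open (or create) a cache database. The schema here is hypothetical."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "word TEXT PRIMARY KEY, vector BLOB NOT NULL)"
    )
    return conn


def put_embedding(conn, word, vector):
    """Store a vector as packed float32 bytes, replacing any existing row."""
    blob = array("f", vector).tobytes()
    conn.execute(
        "INSERT OR REPLACE INTO embeddings (word, vector) VALUES (?, ?)",
        (word, blob),
    )
    conn.commit()


def get_embedding(conn, word):
    """Return the cached vector as a list of floats, or None if absent."""
    row = conn.execute(
        "SELECT vector FROM embeddings WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        return None
    vec = array("f")
    vec.frombytes(row[0])
    return vec.tolist()
```

Caching this way means the encoder only has to run once per word; later lookups are a single indexed query.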
Each turn is recorded in a MongoDB database.
A transformer-based sentence encoder is trained to generate an embedding (latent) space that captures the semantic relationships in the English language. The search model is then trained to search that latent space, given a board state, for the word or sentence that best satisfies a predefined algorithm. The embedding space itself is left unchanged. The search model takes the board state as input and outputs a single embedding; approximate nearest neighbor (ANN) search then finds the 20 most similar embeddings. These candidates are scored using the same algorithm used to train the model, and the highest-scoring word is returned. On average, the AI finds a hint word that leads to the selection of 6.5 of the 9 target words on the first turn. In comparison, the average person usually manages only about 2-3 of the 9 words.
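The retrieve-then-rescore step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: an exact brute-force cosine search stands in for ANN, and `score_hint` is a hypothetical scoring rule (the number of target words closer to the hint than the closest non-target word), since the actual algorithm is not spelled out here. All function and parameter names are invented for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_words(query, vocab, k=20):
    """Exact top-k search; a stand-in for the ANN index used in practice."""
    ranked = sorted(vocab, key=lambda w: cosine(query, vocab[w]), reverse=True)
    return ranked[:k]


def score_hint(hint_vec, target_vecs, other_vecs):
    """Hypothetical score: targets more similar to the hint than any non-target."""
    threshold = max((cosine(hint_vec, v) for v in other_vecs), default=-1.0)
    return sum(1 for v in target_vecs if cosine(hint_vec, v) > threshold)


def pick_hint(query, vocab, board_words, target_vecs, other_vecs, k=20):
    """Retrieve k candidates near the model output, then return the best scorer."""
    candidates = [w for w in nearest_words(query, vocab, k) if w not in board_words]
    return max(
        candidates,
        key=lambda w: score_hint(vocab[w], target_vecs, other_vecs),
        default=None,
    )
```

Here `query` plays the role of the single embedding output by the search model, and `vocab` maps each dictionary word to its cached embedding. Rescoring the retrieved candidates means the final hint is chosen by the scoring rule itself rather than by raw proximity to the model output.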