Giter Club home page Giter Club logo

quordlebot's Introduction

QuordleBot

Like WordleBot, but for Quordle. How can I get it in six?

Usage

Install / generate lookup tables:

python3 -m venv venv
pip install -r requirements.txt

./generate_lookup_table.py  # takes ~4 minutes
./generate_array.py
ls -lh words/map.pickle words/array.pickle

Find the best next play:

./quordlebot.py NYLON,SHELF,VIRAL,BUDGE ROAST
ROAST ['.y...', '...y.', 'y.y..', '.....'] -> [76, 94, 73, 266] = 138722192 +17.66 bits
Best next plays based on expected information gain:
CLINE -> +20.00 bits
INCLE -> +19.36 bits
CLIPE -> +19.29 bits
CLIME -> +19.26 bits

Find the best plays absent any feedback:

./priors.py
# (takes many hours to run!)

Notes

The word list is inlined into the Quordle JS:

{
  "blacklist": "GYPSY GIPSY MAMMY AGORA SLAVE HUSSY JUNTA JUNTO WELCH MORON",
  "wordBank": "ABACK ABASE ABATE...",
  "allowed": "AAHED AALII AARGH..."
}
  • blacklist: 10 words
  • wordBank: 2,315 words
  • allowed: 10,657 words

I assume "wordBank" is the words that can be answers and "allowed" is the words you're allowed to guess. Unclear to me exactly how "blacklist" works, these words appear in "wordBank" but not "allowed".

Assuming there are never dupes, that means there are

2314 * 2314 * 2313 * 2312 = 28634537426976 solutions
log2(28634537426976) = 44.7 bits of entropy

So maybe a good start is to look at which guesses maximize information gain for any given state.

Best and worst first words according to pure information gain:

[(5.885960110378853, 'SOARE'), (5.882779324291968, 'ROATE'), (5.865709709951878, 'RAILE'), (5.865457142861989, 'REAST'), (5.834581525865644, 'SALET'), (5.817161175806047, 'ORATE'), (5.792668103557574, 'CARTE'), (5.786709827456228, 'RAINE'), (5.775166510376183, 'ARIEL'), (5.774382034176954, 'CARET')]
[(2.2851367360614, 'GYPPY'), (2.267954317843193, 'KUDZU'), (2.2436395107418843, 'FUFFY'), (2.2341699553948438, 'JAFFA'), (2.2141711564276, 'PZAZZ'), (2.206893724562269, 'YUKKY'), (2.192000567958077, 'XYLYL'), (2.0525999694165993, 'IMMIX'), (2.0383091074888515, 'JUJUS'), (1.8918361327315534, 'QAJAQ')]

I remember ROATE!

I see no reason to use a different first word for Quordle. What about as a second word? Presumably it depends on the patterns you get out.

With optimal information gain play, my quordlebot solves today's in seven, which is the same number of guesses that I used.

Here's its sequence:

./quordlebot.py SOARE,y..y.,..y..,..g..,g.yy. CLIPT,....y,.....,.g...,...yy STRAP,yyy..,...y.,...y.,ggggg DUNGY,.y...,...y.,.....,.....
SOARE [16, 138, 40, 8]
CLIPT [2, 20, 6, 1]
STRAP [2, 17, 6, 1]
DUNGY [1, 2, 1, 1]
Quad 0 must be TRUSS
Quad 1 is one of ['GAMMA', 'MAGMA']
Quad 2 must be LLAMA

If your goal were to get a six, I wonder if there's a better strategy than maximizing information gain?

The lists are actually disjoint! So I need to add the wordbank words to the allowed guess list.

$ two-stats words/wordbank.txt words/allowed.txt
A: words/wordbank.txt
B: words/allowed.txt
|A|: 2315
|B|: 10657
|A ∪ B|: 12972
|A ∩ B|: 0
|A - B|: 2315
|B - A|: 10657

So 12,972 guessable words total.

Next steps:

  • Rerun initial play analysis with larger list of guessable words
  • Look for the best static pair of initial plays
    • Remove first word from list of seconds after examing
    • Parallelize the search
    • Is sorting the list and then doing counts faster?
    • Does factoring out the result[guess1] lookup matter?
  • See if using pickle is faster/more compact than JSON (it is: 7.468s->5.060s for quordlebot.py)
  • Print how many bits of information were gained by your actual guesses
  • Explore the game tree to find strategies to minimize expected remaining guesses
    • Expected number of remaining guesses doesn't seem right?
    • Sort un-forced guesses by information gain
    • Prune by "best so far"; but when can you bail out if you're looking for expected number of plays?
    • Only keep track of optimal number in the cases where that's what we care about
  • Ignore subsequent guesses after you've gotten a word
  • Switch quordlebot.py to use the array format
  • Add a mode that takes the four words and your guesses, rather than .yg
  • Factor out a Quordle class
  • ROAST / CLINE seems to always result in far more bits of information gain than priors.py suggests should be expected (~35 bits vs. 9.6 bits). What's going on? (It's four boards vs. one!)
  • How frequently does ROAST / CLINE / HUMID or ROAST / CLINE / DUMPY give you a guaranteed seven?
  • How often is CLINE the best second guess after ROAST?
  • Report the expected vs. actual information gain of each of your guesses
  • What are the odds of getting a six if you go for it?
  • Can Python distinguish guessable/wordbank indices via nominal types? (Sorta https://docs.python.org/3/library/typing.html#newtype)
  • Why do I have to import ResultDict to unpickle array.pickle?
  • Measure optimal performance
    • What's quordlebot's expected number of guesses / distribution?
    • How frequently does "go for 5" work? What does it cost you?
  • Add a --today (or default) option that mimics Quorlde's Mersenne Twister;
  • When's the next time that TRAIN / CLOSE / FILET / SONAR will be an answer?
  • Add a --no-spoilers option
  • Improve the startup time for quordle.py
  • Change progress printout to use a single line / progress bar
  • Print current word / performance on progress bar
  • Change all displayed numbers to be total plays (not additional plays)
  • Show expected number of remaining plays in --no-spoilers mode
  • Show other possibilities after you get a correct guess, even in --no-spoilers mode.

Updated best and worst first guesses:

[(5.885960110378853, 'SOARE'), (5.882779324291968, 'ROATE'), (5.877909690821479, 'RAISE'), (5.865709709951878, 'RAILE'), (5.865457142861989, 'REAST'), (5.855775376955964, 'SLATE'), (5.834581525865644, 'SALET'), (5.832589698902821, 'CRATE'), (5.831396980440786, 'IRATE'), (5.829323821643724, 'TRACE')] [(2.2851367360614, 'GYPPY'), (2.267954317843193, 'KUDZU'), (2.2436395107418843, 'FUFFY'), (2.2341699553948438, 'JAFFA'), (2.2141711564276, 'PZAZZ'), (2.206893724562269, 'YUKKY'), (2.192000567958077, 'XYLYL'), (2.0525999694165993, 'IMMIX'), (2.0383091074888515, 'JUJUS'), (1.8918361327315534, 'QAJAQ')]

Best first two guesses:

SOARE CLINT -> +9.63 bits
ROAST CLINE -> +9.61 bits *
SOCLE RIANT -> +9.61 bits
RIANT CLOSE -> +9.61 bits
LOAST CRINE -> +9.60 bits
SAINT CEORL -> +9.60 bits
TRAIL SONCE -> +9.59 bits
ROIST LANCE -> +9.58 bits
SONCE LIART -> +9.58 bits
TRICE SALON -> +9.58 bits
SARIN CLOTE -> +9.58 bits
LIANE CROST -> +9.58 bits
TORAN SLICE -> +9.58 bits
TOILE CARNS -> +9.57 bits
NOISE CLART -> +9.57 bits
TRAIN SOCLE -> +9.57 bits
TRINE COALS -> +9.57 bits
TOILS CRANE -> +9.57 bits
TOILE CRANS -> +9.57 bits
CRANE TOILS -> +9.57 bits *
DOILT CARSE -> +9.56 bits

ROAST/CLINE feels somewhat not made up; CRANE/TOILS are both clearly real words.

After playing ROAST/CLINE, there's a fully-determined word ~70% of the time.

If you only look at pairs of words from the wordbank, here are the top ones:

TRICE SALON -> +9.58 bits
STOLE CAIRN -> +9.56 bits
TRAIN CLOSE -> +9.56 bits
LINER COAST -> +9.55 bits
SLANT CRIED -> +9.52 bits
STAIR CLONE -> +9.52 bits
TRAIL SCONE -> +9.52 bits
SPILT CRANE -> +9.52 bits
SLANT PRICE -> +9.51 bits

"TRICE" feels made-up, but TRAIN / CLOSE is great.

Here's a very slow run (~7h) with mixed solution / non-solution plays:

./quordlebot.py CHAFF,LOWLY,SWORE,PAPER TRAIN CLOSE  14292.83s user 38.97s system 58% cpu 6:45:06.09 total
Best play by expected number of steps to complete:
1. 7.535 PUDGY (+5.535 plays, +6.73 bits)
2. 7.537 CHAMP (is solution, +5.537 plays, +5.72 bits)
3. 7.558 DOGMA (+5.558 plays, +5.73 bits)
4. 7.595 MOLDY (is solution, +5.595 plays, +5.76 bits)
5. 7.619 PADDY (+5.619 plays, +6.59 bits)
6. 7.619 GOLEM (+5.619 plays, +5.90 bits)
7. 7.624 DOPEY (+5.624 plays, +6.45 bits)
8. 7.639 GODLY (is solution, +5.639 plays, +5.06 bits)
9. 7.645 PYGMY (+5.645 plays, +6.64 bits)
10. 7.651 MADLY (+5.651 plays, +6.11 bits)
11. 7.654 DUMPY (+5.654 plays, +6.60 bits)
12. 7.663 MOGUL (is solution, +5.663 plays, +4.91 bits)

Another slow one:

./quordlebot.py HAREM,GAILY,TRITE,KNEAD TRAIN CLOSE
Best play by expected number of steps to complete:
1. 7.537 EMPTY (+5.537 plays, +7.26 bits)
2. 7.600 PYGMY (+5.600 plays, +7.35 bits)
3. 7.631 DEPTH (+5.631 plays, +6.53 bits)
4. 7.641 DUMPY (+5.641 plays, +7.33 bits)
5. 7.706 PARTY (+5.706 plays, +6.28 bits)
6. 7.713 BLIMP (+5.713 plays, +6.49 bits)
7. 7.725 APHID (+5.725 plays, +6.77 bits)
8. 7.740 EMBED (+5.740 plays, +6.54 bits)
9. 7.744 AMPLY (+5.744 plays, +7.18 bits)
10. 7.749 GIPSY (+5.749 plays, +6.39 bits)

If you want to go for five, what's the best first word to play? For some of our standard openers, here's how many words are fully-determined by the opening play, or a 50/50:

  • ROAST: 35 / 26
  • CLINE: 30 / 28
  • TRAIN: 32 / 32
  • CLOSE: 21 / 26

So play ROAST before CLINE and TRAIN before CLOSE.

What if you want to go for five? Some variations:

  • FILET / SONAR: 37 / 46, 9.276 information gain
  • FLIRT / CANOE: 32 / 52, 9.286 IG
  • SATIN / CRUEL: 35 / 38, 9.429 IG
  • TRIED / SALON: 35 / 36, 9.444 IG
  • ROAST / CLINE: 35 / 26, 9.61 IG
  • TRAIN / CLOSE: 32 / 32, 9.558 IG

(Note: this only considered wordbank words for the second guess for IG.)

quordlebot's People

Contributors

danvk avatar

Watchers

 avatar  avatar

Forkers

gm456

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.