
Perplexity vs Entropy: Using an LLM Metric to Accurately Measure Entropy

A password’s strength is determined by the number of guesses it would take, on average, to find it. This is usually measured in terms of entropy: the more entropy a password has, the more guesses it takes to crack, and the more secure it is.

But it’s not so simple! With modern machine learning, people are using AI to guess passwords with alarming accuracy. Since people tend to use predictable patterns when making passwords, deep learning models are well suited to learning those patterns and generating highly accurate guesses. This means that calculating the true strength of a password takes more than the basic log2(search_space ^ characters) formula we currently use.
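For reference, here is roughly what that basic calculation looks like (a minimal sketch; the 95-character printable ASCII alphabet and 12-character length are just illustrative assumptions, not anything specific to Fuzzypass):

```python
import math

def naive_entropy(search_space: int, length: int) -> float:
    # Classic strength estimate: each position is an independent,
    # uniformly random draw from the search space, so
    # entropy = log2(search_space ^ length) = length * log2(search_space)
    return length * math.log2(search_space)

# Illustrative example: a 12-character password over 95 printable ASCII characters
print(naive_entropy(95, 12))  # ~78.8 bits
```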

This brings us to the idea of perplexity. Perplexity, as well as being the name of an AI company, is a metric used to evaluate Large Language Models: it measures how well the model would have predicted each word in a given test sequence. Researchers use this metric to gauge how coherent and confident an LLM is on a given kind of text, and to help refine its preferred use case.

We can use this same metric to more accurately estimate how strong a passphrase is.
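To make that concrete, here is a minimal sketch of how perplexity is typically computed from a model’s per-token log-probabilities, using the Hugging Face transformers library with GPT-2 as a small, openly available stand-in (our calculator uses Llama 2 7b, but the mechanics are the same):

```python
# Minimal sketch: perplexity of a passphrase under a causal language model.
# GPT-2 is used here as a small stand-in; any causal LM works the same way.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    # Passing labels=input_ids makes the model return the average
    # cross-entropy (negative log-likelihood) per predicted token.
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    # Perplexity is the exponential of the average negative log-likelihood.
    return torch.exp(loss).item()

print(perplexity("mary had a little lamb whose fleece was white as snow"))
```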

Converting Perplexity to Entropy

While we were developing Fuzzypass as a stronger and more memorable alternative to a master password, we realized that this perplexity metric can and should be used to measure the strength of a passphrase.

With Fuzzypass, passphrases are made of words, and people use words to build logical sentences. Since we want people to remember their Fuzzypass, we don’t want to force them to choose random words the way Diceware does.

That means that when someone uses a logical sentence, the search space for a given word is much smaller than with a truly random passphrase. The true strength of a Fuzzypass passphrase is based not on the size of the dictionary the words came from, but on how likely an LLM would be to guess it.

Our math at this point is fairly simple. We calculate the perplexity of a given passphrase using this handy guide, divide it by two to add a significant security buffer, and then use that number as a replacement for the dictionary size in the traditional entropy formula.
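In code, the conversion looks roughly like this (a sketch of the formula as just described; fuzzypass_entropy is only an illustrative name, not an actual API):

```python
import math

def fuzzypass_entropy(perplexity: float, num_words: int) -> float:
    # Halve the perplexity as a security buffer, then treat that value as the
    # effective dictionary size in the traditional entropy formula:
    # entropy = log2(effective_size ^ num_words) = num_words * log2(effective_size)
    effective_size = perplexity / 2
    return num_words * math.log2(effective_size)
```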

For example, let’s take the phrase “mary had a little lamb whose fleece was white as snow”.

This is obviously a weak passphrase. But a traditional entropy calculation would say that it’s 11 words drawn from a dictionary of something like 8000 words, so the formula gives log2(8000^11) = 142.6 bits of entropy. That is ridiculous and wrong, and every online entropy calculator would lie to you about the security of such a passphrase.

But Fuzzypass takes perplexity into account. Using our online calculator, the entropy of this weak passphrase falls to a puny 28.8 bits when measured with Llama 2 7b.
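Working backwards from that number with the conversion sketched above (assuming I have stated it correctly), you can see just how predictable the model found the phrase:

```python
# Invert entropy = num_words * log2(perplexity / 2) for the 28.8-bit,
# 11-word result reported above.
bits, num_words = 28.8, 11
implied_perplexity = 2 * 2 ** (bits / num_words)
print(round(implied_perplexity, 1))  # ~12.3
```

A perplexity around 12 means the model is effectively choosing between about a dozen plausible next words at each step, not thousands.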

As a startup, we don’t have the resources to self-host a larger model. If someone wants to run the entropy calculation using a big boy model, please be my guest! Just let me know what the results are.

 
