How does character repetition affect password entropy?

user2824371 05/06/2018. 3 answers, 91 views
passwords

From this wiki page, I learned that the strength of a password is affected by two main factors, the length (L) and the possible symbols (N), and it's calculated using the equation:

H = L * log2(N)

Now, what about the character repetition? Wikipedia says that repetition should be avoided but how does it affect the equation or the strength of the password?

I thought of some solutions like:

  1. Counting repeated characters as 1 character. This way it affects the length (e.g. Password = MIC3333, Length = 4). But that's not accurate, because the number of possible passwords an attacker actually has to try is higher than the calculated entropy suggests.

  2. If L=10 with 5 repeated characters, the repetition percentage is 5/10 or 50%. So the strength of the password would be reduced by 50%. I think that's not accurate either.

I would like to know if there's an equation that includes character repetitions.

Thank you so much,

3 Answers


Therac 05/07/2018.

Password entropy is based on the number of possible combinations. For a pattern to reduce entropy, the pattern needs to be known to the attacker.

If the pattern isn't fixed and known, the reduction in complexity will be specific to the attacker's algorithm, and it will differ from one attacker to another. E.g., with a dictionary attack, the effective complexity for a dictionary word is the dictionary size; for a word outside the dictionary it is the full brute-force complexity.

I don't know whether there is or can be a mathematically optimal algorithm for letter repetitions. The simplest solution I can think of is to treat symbol repetitions, up to N repetitions, as extra graphemes. In that case, the password's strength will be determined by the number of graphemes it contains.

Against such an algorithm, the added strength of each consecutive symbol from the 2nd to the Nth symbol is 0, and equivalent to another grapheme up to 2N. But such an algorithm itself will be slower against random passwords due to a larger effective character set. For instance, checking for 2-long repetitions for every symbol will drop its speed by a factor of 2^(length-1). But against a password that is all double symbols, its speed will improve to only a square root of the above complexity.

If it's imperative that brute force efficiency is not compromised, a free tweak is to always start by trying the last grapheme as equal to the previous one. In that case the added strength from the first (for variable length) or all repeat symbols at the end (for fixed length) is 0. Doing so in the middle isn't a free tweak anymore. Neither is trying for extra repeats in a variable length password, although it's so cheap as to be nearly free.

In short, against an algorithm optimized to pick passwords with repeated characters, a password's strength can be estimated by dropping all consecutive repeats of a character past the first to get the effective length L'. Theoretical strength would be approximately (charset*N)^L', where N is the maximum number of repetitions the attacker is testing for.
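As a rough illustration of the estimate above, here is a minimal Python sketch. It assumes "repetition" means a run of identical adjacent characters. The function names, the charset size of 36 and the run limit of 4 are made up for the example and are not taken from any real cracking tool.

```python
import math
from itertools import groupby

def effective_length(password: str) -> int:
    """Number of runs of identical adjacent characters (L' in the text)."""
    return sum(1 for _ in groupby(password))

def run_aware_bits(password: str, charset_size: int, max_run: int) -> float:
    """log2 of (charset_size * max_run) ** L', the approximation given above."""
    return effective_length(password) * math.log2(charset_size * max_run)

# Hypothetical inputs: 36 symbols (A-Z, 0-9), attacker tests runs of up to 4.
print(effective_length("MIC3333"))        # 4
print(run_aware_bits("MIC3333", 36, 4))   # estimated strength in bits
```

The sketch does not handle runs longer than `max_run`; the estimate in the text only covers repetitions the attacker actually tests for.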

Against an algorithm optimized for brute-force efficiency, only the repeated characters at the end should be dropped. Theoretical strength with a naive algorithm would be charset^(L'-1)*(charset+N). Any practical algorithm will already be testing a lot of suffixes, though (passwords often end in "1" or "1!" to bypass complexity rules).
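The suffix-only estimate can be sketched the same way. This is an illustration under the assumption that L' is the password length after collapsing only the trailing run to a single character; `trailing_run` and `suffix_aware_bits` are hypothetical helper names.

```python
import math

def trailing_run(password: str) -> int:
    """Length of the run of identical characters at the end of the password."""
    if not password:
        return 0
    stripped = password.rstrip(password[-1])
    return len(password) - len(stripped)

def suffix_aware_bits(password: str, charset_size: int, max_run: int) -> float:
    """log2 of charset ** (L' - 1) * (charset + max_run), as in the text."""
    if not password:
        return 0.0
    # L': collapse only the trailing run to one character.
    l_prime = len(password) - trailing_run(password) + 1
    return (l_prime - 1) * math.log2(charset_size) + math.log2(charset_size + max_run)

# Hypothetical inputs: 36 symbols, attacker tests trailing runs of up to 4.
print(trailing_run("MIC3333"))              # 4
print(suffix_aware_bits("MIC3333", 36, 4))  # estimated strength in bits
```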

It's all a matter of what algorithm the attacker uses. Dictionaries, including leaked password lists, will generally be tried first.


Bob Brown 05/06/2018.

A friend whose doctorate is in statistics likes to say, "Often the sole significance of a statistical improbability is that the improbable has happened."

The formula you give holds only if each character is selected randomly, which implies selection independent of the other characters. So it is improbable, but not impossible, that a completely random password might be 3333333333.

With that said, attackers use heuristics. One such might be to look for repeats and, finding such, test that the next character is another repeat before trying more difficult combinations. So, I'd be tempted to reject password suggestions with three or more repeated characters.
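A check like the one suggested above could look roughly like this in Python. It assumes "three or more repeated characters" means three or more identical characters in a row; the function names and the default threshold are illustrative.

```python
from itertools import groupby

def longest_run(password: str) -> int:
    """Length of the longest run of identical adjacent characters."""
    return max((len(list(group)) for _, group in groupby(password)), default=0)

def should_reject(password: str, max_allowed_run: int = 2) -> bool:
    """Reject suggestions whose longest run exceeds max_allowed_run."""
    return longest_run(password) > max_allowed_run

print(should_reject("3333333333"))  # True
print(should_reject("MIC33"))       # False
```

As the other answers point out, such a filter also removes a few genuinely random candidates, so it trades a little brute-force strength for resistance to heuristics.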


Serge Ballesta 05/07/2018.

H = L * log(N) is the mathematical entropy of the set of possible passwords of L characters chosen at random from a set of N symbols. It is just the log of the number of such passwords: H = log(N^L).
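A quick numerical check of that identity, as a minimal Python sketch. Base 2 is assumed here so the result is in bits, and the example values (8 lowercase letters) are arbitrary.

```python
import math

def uniform_entropy_bits(length: int, alphabet_size: int) -> float:
    """H = L * log2(N) for passwords drawn uniformly at random."""
    return length * math.log2(alphabet_size)

# Example: 8 characters drawn from the 26 lowercase letters.
bits = uniform_entropy_bits(8, 26)

# Same value as the log of the number of possible passwords, N ** L.
assert math.isclose(bits, math.log2(26 ** 8))
print(round(bits, 1))  # 37.6
```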

But as soon as you add additional rules such as:

  • disallow repetition of same symbol
  • require presence of characters from disjoint subsets

you reduce the number of possible patterns and so reduce the strength (entropy). If an attacker knows that a password cannot contain repetitions of a character, he can optimize his algorithm accordingly. But in fact H measures the strength of the password against brute force attacks, where the attacker assumes that every combination has the same probability and systematically searches the whole set of possibilities.
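To see how much a "no repetition" rule costs, here is a small Python sketch. It assumes the rule forbids two identical characters in a row (not every reuse of a symbol), in which case the number of allowed passwords is N * (N-1)^(L-1). The helper names are illustrative.

```python
import math
from itertools import product

def count_no_adjacent_repeat(n: int, length: int) -> int:
    """Closed form: N choices for the first symbol, N-1 for each later one."""
    if length == 0:
        return 1
    return n * (n - 1) ** (length - 1)

def brute_force_count(n: int, length: int) -> int:
    """Enumerate every password and count those without adjacent repeats."""
    return sum(
        1
        for candidate in product(range(n), repeat=length)
        if all(a != b for a, b in zip(candidate, candidate[1:]))
    )

# The closed form agrees with enumeration on a small case.
assert brute_force_count(4, 5) == count_no_adjacent_repeat(4, 5)

# Eight digits: entropy with and without the rule, in bits.
unrestricted = math.log2(10 ** 8)
restricted = math.log2(count_no_adjacent_repeat(10, 8))
print(unrestricted - restricted)  # bits lost to the rule
```

For eight digits the rule costs roughly one bit, which is small compared with the damage done by a user picking 00000000.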

Such restrictions are common anyway, because most users do not choose their passwords truly at random, and some very simple patterns (00000000 or 12345678 for eight numeric characters) have a much higher probability of being chosen than others(*). So those restrictions try to rule out the simple passwords that could fall to dictionary or, more exactly, heuristic attacks.

That's where we enter the psychological game. If I assume that the attacker will use heuristics to test specific patterns first, I should disallow them, even though I know that this lowers the number of possible passwords, and hence the theoretical strength of the password.

TL;DR: The more constraints you add to the password, the more resistant it will be to quick heuristic attacks, and the more vulnerable it will be to brute force attacks.


(*) As it is hard to give precise probabilities for human-chosen passwords, it is hard to determine the actual entropy precisely. The formula above is only valid when every combination has exactly the same probability. As soon as the probabilities vary significantly, the general entropy formula below gives a much lower value. Extract from Wikipedia:

the entropy H of a discrete random variable X with possible values {x1, ..., xn} and probability mass function P(X) is:

H(X) = E[I(X)] = E[-ln(P(X))]

where E represents the expected value
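A short Python sketch of what that definition implies for skewed password choices. It uses log base 2 instead of ln so that the result is in bits, and the two toy distributions over eight possible passwords are invented for the illustration.

```python
import math

def shannon_entropy_bits(probabilities):
    """H(X) = sum of -p * log2(p) over all outcomes with p > 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

# Toy example with 8 possible passwords.
uniform = [1 / 8] * 8               # every password equally likely
skewed = [0.5] + [0.5 / 7] * 7      # one favourite chosen half the time

print(shannon_entropy_bits(uniform))  # 3.0, same as log2(8)
print(shannon_entropy_bits(skewed))   # lower than 3.0
```

With a uniform distribution the result matches L * log2(N); once one password is much more likely than the rest, the entropy drops even though the set of possible passwords is unchanged.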

