From this wiki page, I learned that the strength of a password is affected by two main factors, the length (L) and the number of possible symbols (N), and it's calculated using the equation:

H = L * log2(N)

Now, what about character repetition? Wikipedia says that repetition should be avoided, but how does it affect the equation or the strength of the password?

I thought of some solutions like:

- Counting repeated characters as 1 character. This way it affects the length (e.g. Password = *MIC3333*, Length = *4*). But that's not accurate, because the actual number of possible passwords an attacker has to consider is higher than the calculated entropy suggests.
- Using the repetition percentage. If L = 10 with 5 repeated characters, the repetition percentage is 5/10 or 50%, so the strength of the password would be reduced by 50%. I think that's not accurate either.

I would like to know if there's an equation that includes character repetitions.

Thank you so much,

Therac 05/07/2018.

Password entropy is based on the number of possible combinations. For a pattern to reduce entropy, the pattern needs to be known to the attacker.

If the pattern isn't fixed and known, the reduction in complexity will be *specific to the attacker's algorithm*. It will be different for different attackers. E.g., with a dictionary attack, the effective complexity of a dictionary word is the dictionary size, while a word outside the dictionary retains full brute-force complexity.
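To make the dictionary example concrete, here is a minimal sketch. The function name `effective_bits` and the tiny word list are hypothetical, and it assumes the attacker tries the whole dictionary first and only then falls back to plain brute force over N^L candidates.

```python
import math

def effective_bits(password: str, dictionary: set, charset_size: int) -> float:
    """Rough strength in bits against a dictionary-first attacker (assumed model)."""
    if password in dictionary:
        # At most len(dictionary) guesses are needed for a dictionary word.
        return math.log2(len(dictionary))
    # Anything else costs the full brute-force space: L * log2(N).
    return len(password) * math.log2(charset_size)

words = {"password", "letmein", "dragon", "monkey"}  # illustrative list only
print(effective_bits("dragon", words, 26))   # depends only on the list size
print(effective_bits("dragqn", words, 26))   # full 6 * log2(26)
```

The same six-letter length gives very different numbers depending on whether the word is in the attacker's list, which is the point about strength being relative to the algorithm.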

I don't know whether there is or can be a mathematically optimal algorithm for letter repetitions. The simplest solution I can think of is to treat symbol repetitions, up to N repetitions, as extra graphemes. In that case, the password's strength is determined by the number of graphemes it contains.

Against such an algorithm, the 2nd through Nth consecutive repeats of a symbol add no strength, and further repeats up to 2N are equivalent to one more grapheme. But such an algorithm will itself be slower against random passwords because of its larger effective character set. For instance, checking for 2-long repetitions of every symbol will drop its speed by a factor of 2^(length-1). Against a password made entirely of doubled symbols, though, its cost improves to only the square root of that complexity.

If it's imperative that brute-force efficiency is not compromised, a free tweak is to always start by trying the last grapheme as equal to the previous one. In that case the added strength from the first repeated symbol at the end (for variable length) or from all repeated symbols at the end (for fixed length) is 0. Doing so in the middle isn't a free tweak anymore. Neither is trying for extra repeats in a variable-length password, although it's so cheap as to be nearly free.

In short, *against an algorithm* optimized to pick passwords with repeated characters, a password's strength can be estimated by dropping every consecutive repeat of a character past the first to get the effective length L'. Theoretical strength would be approximately (charset*N)^L', where N is the maximum number of repetitions the attacker is testing for.

Against an algorithm optimized for brute-force efficiency, only the consecutive repeats at the end should be dropped. Theoretical strength with a naive algorithm would be charset^(L'-1)*(charset+N). Any practical algorithm will be testing for a lot of suffixes already, though (passwords often end in "1" or "1!" to bypass complexity rules).
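The two estimates above can be sketched as follows. This is only an illustration of the formulas as written, under my own naming (`collapse_all_runs`, `collapse_trailing_run` and the two `bits_*` functions are hypothetical), and it assumes a non-empty password. Results are in bits, i.e. log2 of the guess counts.

```python
import math
from itertools import groupby

def collapse_all_runs(password: str) -> int:
    """L' when every run of identical consecutive characters counts once."""
    return sum(1 for _ in groupby(password))

def collapse_trailing_run(password: str) -> int:
    """L' when only the run of identical characters at the end counts once."""
    if not password:
        return 0
    # Strip the trailing run, then count it back as a single character.
    return len(password.rstrip(password[-1])) + 1

def bits_vs_repeat_aware(password: str, charset: int, max_repeats: int) -> float:
    # log2 of (charset * N)^L'
    return collapse_all_runs(password) * math.log2(charset * max_repeats)

def bits_vs_suffix_tweak(password: str, charset: int, max_repeats: int) -> float:
    # log2 of charset^(L'-1) * (charset + N)
    l_eff = collapse_trailing_run(password)
    return (l_eff - 1) * math.log2(charset) + math.log2(charset + max_repeats)

# Example from the question: 36-symbol charset, attacker tries up to 4 repeats.
print(collapse_all_runs("MIC3333"))            # 4
print(bits_vs_repeat_aware("MIC3333", 36, 4))
print(bits_vs_suffix_tweak("MIC3333", 36, 4))
```

For `MIC3333` both effective lengths are 4. A password like `AAB33` shows the difference: 3 when all runs collapse, 4 when only the trailing run does.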

It's all a matter of what algorithm the attacker uses. Dictionaries, including leaked password lists, will generally be tried first.

Bob Brown 05/06/2018.

A friend whose doctorate is in statistics likes to say, "Often the sole significance of a statistical improbability is that the improbable has happened."

The formula you give holds *only* if each character is selected randomly, which implies selection independent of the other characters. So, it is improbable, but not impossible, that a completely random password might be 3333333333.

With that said, attackers use heuristics. One such might be to look for repeats and, finding such, test that the next character is another repeat before trying more difficult combinations. So, I'd be tempted to reject password suggestions with three or more repeated characters.
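A check along those lines might look like this. It is a minimal sketch: the names `longest_run` and `reject_for_repeats` are made up, and the threshold of three is just the figure mentioned above, not an established rule.

```python
from itertools import groupby

def longest_run(password: str) -> int:
    """Length of the longest run of identical consecutive characters."""
    return max((sum(1 for _ in group) for _, group in groupby(password)), default=0)

def reject_for_repeats(password: str, limit: int = 3) -> bool:
    """True if some character repeats `limit` or more times in a row."""
    return longest_run(password) >= limit

print(reject_for_repeats("MIC3333"))  # True
print(reject_for_repeats("aabbcc"))   # False
```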

Serge Ballesta 05/07/2018.

H = L * log(N) is the mathematical entropy of the set of possible passwords of L characters **randomly** chosen from a set of size N. It is just the log of the number of such passwords: H = log(N^{L}).

But as soon as you add additional rules such as:

- disallow repetition of same symbol
- require presence of characters from disjoint subsets

you reduce the number of possible patterns and so reduce the strength (entropy). If an attacker knows that a password cannot contain repetitions of a character, he can optimize his algorithm accordingly. But in fact H measures the strength of the password against brute-force attacks, where the attacker assumes that every combination has the same probability and systematically walks through the whole possible set.
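To put a number on that reduction, here is a small sketch. It assumes the rule only forbids the *same symbol twice in a row* (one possible reading of "no repetition"), which leaves N choices for the first character and N-1 for each later one. The function names are hypothetical.

```python
import math

def bits_unrestricted(length: int, charset: int) -> float:
    # log2(N^L)
    return length * math.log2(charset)

def bits_no_adjacent_repeat(length: int, charset: int) -> float:
    # log2(N * (N-1)^(L-1)): first symbol is free, each later one excludes its predecessor.
    return math.log2(charset) + (length - 1) * math.log2(charset - 1)

L, N = 8, 10  # eight numeric characters
print(bits_unrestricted(L, N))
print(bits_no_adjacent_repeat(L, N))
```

For eight digits the rule costs about one bit, so the loss against pure brute force is small compared with what is gained by ruling out passwords like `00000000`.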

Such restrictions are common anyway, because most users do not choose passwords truly at random, and some damn simple patterns (`00000000` or `12345678` for eight numeric characters) have a much higher probability of being chosen than others(*). So those restrictions try to rule out the simple passwords that could be attacked by *dictionary*, or more exactly heuristic, attacks.

That's where we enter the psychological game. If I assume that the attacker will use heuristics to test specific patterns first, I should disallow them, even though I know that this lowers the number of possible passwords and hence the theoretical strength of the password.

TL;DR: The more constraints you add to the password, the more resistant it will be to short attacks using heuristics, and the more vulnerable it will be to brute-force attacks.

As it is hard to give precise probabilities for human-chosen passwords, it is hard to determine the actual entropy precisely. The formula above is only valid when every combination has exactly the same probability. As soon as the probabilities vary in a major way, the general entropy formula gives a much lower value. Extract from Wikipedia:

the entropy H of a discrete random variable X with possible values {x1, ..., xn} and probability mass function P(X) is:

`H(X) = E[I(X)] = E[−ln(P(X))]`

where E represents the expectation of a variable
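As an illustration of how unequal probabilities lower the entropy, here is a minimal sketch of that expectation for a finite distribution. It uses log2 instead of ln so the result is in bits, and the probabilities are made up for the example, not measured from real passwords.

```python
import math

def shannon_entropy_bits(probabilities) -> float:
    """H(X) = -sum(p * log2(p)); zero-probability outcomes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # four equally likely passwords
skewed = [0.70, 0.10, 0.10, 0.10]    # one password chosen far more often

print(shannon_entropy_bits(uniform))  # 2.0, same as log2(4)
print(shannon_entropy_bits(skewed))   # noticeably below 2 bits
```

With equal probabilities the result matches log2 of the number of candidates. Once one candidate dominates, the value falls below that.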
