The Anatomy of a Password

Predicting Crack Time with Linear Regression

Gurman Basran  ·  DASC 4850  ·  April 2026

Question + Dataset

"Which features of a password actually predict how long it takes to crack?"

  • ~12k passwords total, target = log10(crack time)
  • 10k real from breach dumps: RockYou 2009, LinkedIn 2012, Adobe 2013, Dropbox 2016, Pwdb 2021 / COMB
  • 2k generated strong passwords (password manager output)
  • Threat model: 10 billion GPU guesses / sec against the leaked hash dump
GPU rack performing brute force password cracking
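
The target and threat model above can be sketched as follows. This is a minimal illustration that assumes crack time is simply the number of guesses needed divided by the guess rate. The 10 billion guesses/sec rate comes from the slide; the function name and the example guess counts are hypothetical, since the deck does not say how guess counts were estimated per password.

```python
import math

GUESSES_PER_SEC = 10e9  # threat model from the slide: 10 billion guesses/sec

def log10_crack_seconds(guesses: float, rate: float = GUESSES_PER_SEC) -> float:
    """log10 of the seconds needed to make `guesses` attempts at `rate` per second."""
    return math.log10(guesses) - math.log10(rate)

# Hypothetical guess counts, for illustration only (not the project's data).
print(log10_crack_seconds(1e6))   # found within the first million guesses: well under 1 second
print(log10_crack_seconds(1e35))  # needs ~10^35 guesses: roughly 10^25 seconds
```

Working in log10 keeps both ends of the range on one scale, which is why the target is log10(crack time) rather than raw seconds.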

Insight 1: The Gap Between Weak and Strong

  • Spike on the left: real people, cracked in <1 second
  • Strip on the right: generated strong, ~10^25+ seconds
  • Almost nothing in between
Either you reuse password123 or you let a manager generate something random. There is no middle ground in real human behavior.
Bimodal Crack Time Distribution

Insight 2: Dictionary Words Are A Death Sentence

"Do passwords containing common English words crack significantly faster?"

Two-sample t-test: p < 0.001 (the difference is very unlikely to be due to chance)

Passwords without a dictionary word are
~14,000× harder to crack on average
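
A two-sample comparison like the one above can be sketched with the standard library alone. The slide does not say which variant of the t-test was used, so this sketch assumes Welch's unequal-variance form; the function name and the sample values are made up for illustration and are not the project's data.

```python
import math
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    va = statistics.variance(a) / len(a)  # squared standard error of mean(a)
    vb = statistics.variance(b) / len(b)  # squared standard error of mean(b)
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Made-up log10(crack time) values, NOT the project's data.
no_word = [9.1, 11.4, 8.7, 12.0, 10.2]
has_word = [-3.0, -1.5, -2.2, 0.4, -4.1]

t, df = welch_t(no_word, has_word)
print(f"t = {t:.2f}, df = {df:.1f}")
```

The p-value then comes from the t distribution with `df` degrees of freedom (for example via a statistics library). Because the comparison is on log10 crack time, a gap of d between group means corresponds to a 10^d ratio in crack time.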

So Monkey123! is essentially as crackable as monkey. Capitalizing and appending digits or symbols are standard modifications that cracking tools test automatically.
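
One way to turn this into a feature is to undo the standard modifications before checking for a dictionary word. The sketch below is an assumption about how such a check could work, not the project's actual feature code; the word list, substitution table, and function name are all hypothetical stand-ins.

```python
import re

# Tiny stand-in word list; a real check would load a large dictionary.
COMMON_WORDS = {"monkey", "password", "dragon"}

# Common character substitutions, undone before the lookup (illustrative subset).
LEET = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a",
                      "5": "s", "7": "t", "@": "a", "$": "s"})

def contains_common_word(password: str) -> bool:
    """True if a common word survives after undoing standard modifications."""
    lowered = password.lower()
    # Two views of the password: letters only, and letters after undoing substitutions.
    candidates = {
        re.sub(r"[^a-z]", "", lowered),
        re.sub(r"[^a-z]", "", lowered.translate(LEET)),
    }
    return any(word in cand for cand in candidates for word in COMMON_WORDS)

print(contains_common_word("Monkey123!"))  # True
print(contains_common_word("monkey"))      # True
print(contains_common_word("K$3p9!aB"))    # False
```

Both Monkey123! and monkey reduce to the same base word, which is the sense in which the decorations add little.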

Dictionary Word vs Not — distributions barely overlap

The Model and How It Did

  • Ridge regression (handles correlated features without flipping signs)
  • 60 / 20 / 20 train / val / test split, cross validated
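
The split and the ridge penalty can be sketched in a few lines. This is a deliberately simplified, single-feature version using only the standard library; the real model uses multiple features, and the data, function names, and alpha value here are hypothetical.

```python
import random

def split_60_20_20(rows, seed=0):
    """Shuffle a copy of rows and cut it into 60% / 20% / 20% parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 6 // 10
    n_val = len(shuffled) * 2 // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def ridge_fit_1d(xs, ys, alpha):
    """Closed-form ridge fit for one feature; the intercept is not penalized."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / (sxx + alpha)  # alpha > 0 shrinks the slope toward zero
    return slope, mean_y - slope * mean_x

# Hypothetical rows of (password length, log10 crack time); not the project's data.
rows = [(n, 1.4 * n - 10.0) for n in range(4, 24)]
train, val, test = split_60_20_20(rows)

xs, ys = zip(*train)
ols_slope, _ = ridge_fit_1d(xs, ys, alpha=0.0)
ridge_slope, ridge_intercept = ridge_fit_1d(xs, ys, alpha=10.0)
print(len(train), len(val), len(test))
print(ols_slope, ridge_slope)
```

With alpha = 0 this reduces to ordinary least squares. In practice alpha would be chosen on the validation part, and the test part would be touched only once for the final metrics.
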
Metric              Test Set
R-squared           0.988
RMSE (log10 sec)    1.032
MAE (log10 sec)     0.531
The model explains ~99% of the variance in log10 crack time across 32 orders of magnitude. Predictions are typically within ~1 order of magnitude.
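
For reference, the three test-set metrics can be computed as in the sketch below. The function name and the example values are hypothetical and are not the project's predictions.

```python
import math

def regression_metrics(y_true, y_pred):
    """R-squared, RMSE, and MAE for paired true and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)               # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total variation
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
    }

# Made-up log10(crack time) values, for illustration only.
y_true = [-4.0, 0.0, 10.0, 25.0]
y_pred = [-3.0, 0.0, 9.0, 25.0]
print(regression_metrics(y_true, y_pred))
```

Because the target is log10 seconds, an error of 1.0 is a factor of 10 in crack time. The reported MAE of 0.531 therefore corresponds to a typical miss of about 10^0.531 ≈ 3.4×, and the RMSE of 1.032 to about 10.8×.
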
Predicted vs Actual

Takeaway, Limitation, Next Step

What the model says: length wins

correcthorsebatterystaple (20 lowercase) ≈ 26^20 ≈ 2×10^28
K$3p9!aB (8 characters, all 4 types) ≈ 95^8 ≈ 7×10^15
A lowercase passphrase is ~10^12× stronger despite a smaller alphabet.
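
The arithmetic behind this comparison can be checked directly. The sketch assumes a pure brute-force search space of alphabet_size ** length, using the alphabet sizes and lengths from the slide (26 lowercase letters, 95 printable ASCII characters); the function name is hypothetical.

```python
def keyspace(alphabet_size: int, length: int) -> int:
    """Number of candidate strings of a given length over a given alphabet."""
    return alphabet_size ** length

lower_20 = keyspace(26, 20)  # 20 lowercase letters
mixed_8 = keyspace(95, 8)    # 8 characters drawn from all 95 printable ASCII

print(f"{lower_20:.1e}")            # 2.0e+28
print(f"{mixed_8:.1e}")             # 6.6e+15
print(f"{lower_20 / mixed_8:.1e}")  # 3.0e+12
```

Each extra lowercase character multiplies the search space by 26, while widening the alphabet from 26 to 95 only multiplies it by about 3.7 per position, so 12 extra characters outweigh the larger alphabet.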

Practical advice

  • Password manager + 16+ char random for everything
  • 5-word Diceware passphrase for the master password

Limitation · Next step

Numbers assume MD5 speed; bcrypt/Argon2 shifts absolutes but not feature ranking. Next: deploy as a live strength checker in my Vaultwarden setup.
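
The claim that a slower hash shifts absolute times but not the ordering follows from the log scale: changing the guess rate subtracts a different constant from every log10 crack time. A small sketch, in which the slow-hash rate and the guess counts are hypothetical placeholders rather than measured values:

```python
import math

def log10_seconds(log10_guesses: float, guesses_per_sec: float) -> float:
    """log10 crack time for a password needing 10**log10_guesses guesses."""
    return log10_guesses - math.log10(guesses_per_sec)

FAST_RATE = 10e9  # the slide's threat model (fast hash)
SLOW_RATE = 1e4   # hypothetical rate for a slow hash; illustrative only

# Hypothetical log10 guess counts for three passwords.
log10_guesses = {"weak": 6.0, "medium": 15.8, "strong": 28.3}

fast = {name: log10_seconds(g, FAST_RATE) for name, g in log10_guesses.items()}
slow = {name: log10_seconds(g, SLOW_RATE) for name, g in log10_guesses.items()}

# Every password moves by the same amount, so the ranking is unchanged.
shifts = {name: slow[name] - fast[name] for name in log10_guesses}
print(shifts)
print(sorted(fast, key=fast.get) == sorted(slow, key=slow.get))
```

A constant shift in the target changes the regression intercept but leaves the feature coefficients, and therefore their ranking, as they were.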

Long elegant chain dominating a small tangled knot — length beats complexity