
Because remembering the password "horsebattery123" is way easier than "GFj27ef8%k$39"

One way I like to remember long yet high-entropy passwords is to memorise a long, somewhat nonsensical phrase and use characters from it. The reverse is also possible. E.g. that one could become "Gordon Freeman joins 27 electric fences 8% kills $39"
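The phrase-to-password step in that example can be sketched in a few lines of Python. This is only an illustration under an assumed rule (first letter of each alphabetic word, tokens containing digits or symbols kept whole); the function name `abbreviate` is made up for this sketch.

```python
def abbreviate(phrase: str) -> str:
    """Shrink a mnemonic phrase to a password (assumed rule, for illustration)."""
    parts = []
    for token in phrase.split():
        # Purely alphabetic words contribute only their first letter;
        # tokens with digits or symbols (e.g. "27", "8%", "$39") are kept whole.
        parts.append(token[0] if token.isalpha() else token)
    return "".join(parts)


print(abbreviate("Gordon Freeman joins 27 electric fences 8% kills $39"))
# prints GFj27ef8%k$39
```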



I am pretty suspicious of my meat-based random phrase generator. A lot of the analysis of the entropy of correcthorsebatterystaple-type passwords assumes the words are drawn uniformly from a vocabulary of however many thousand words. But if you have a large enough dataset of passwords, I bet you can find the true (non-uniform) distribution of words people actually draw from, and the entropy will be a bit lower. And if you further restrict it to almost grammatically correct phrases, it will be lower still[1].

Am I just paranoid?

[1] RNNs can learn the distribution of your grammar easily: http://karpathy.github.io/2015/05/21/rnn-effectiveness/. The worry is that human generators will condition their random words too much, e.g. "correct", ooohh, brain just did an adjective, let's throw it a noun next: "horse".
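To put a rough number on that worry, here is a small sketch comparing the Shannon entropy of a uniform word choice with a skewed one. The vocabulary size and the 1/rank weighting are assumptions picked for illustration, not measurements of how people actually choose words.

```python
import math


def shannon_entropy_bits(probabilities):
    """Shannon entropy, in bits, of one draw from the given distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


VOCAB_SIZE = 7776  # assumed vocabulary size (same as a Diceware list)

# Uniform choice: every word equally likely.
uniform = [1 / VOCAB_SIZE] * VOCAB_SIZE

# Assumed human-like bias: the word of rank r is picked with weight 1/r.
weights = [1 / rank for rank in range(1, VOCAB_SIZE + 1)]
total = sum(weights)
skewed = [w / total for w in weights]

uniform_bits = shannon_entropy_bits(uniform)
skewed_bits = shannon_entropy_bits(skewed)
print(f"uniform: {uniform_bits:.2f} bits per word")
print(f"skewed:  {skewed_bits:.2f} bits per word")
```

Whatever the real distribution is, any skew away from uniform can only lower the per-word figure; how much depends entirely on the assumed weights.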


This is where diceware comes in. You are supposed to pick whatever word you roll on the first try, no exceptions, for this very reason. Rolling until you get a word you "like" reduces randomness significantly. They even say you should only use real, meatspace dice to generate passwords, so that they are truly random, not pseudorandom.

http://world.std.com/~reinhold/dicewarefaq.html

>There are some obscure words in both lists. If your passphrase includes a word you don't know, look it up in a good dictionary. Learning the word's meaning will aid your memory and your vocabulary.

Of course, there are exceptions where you should start from scratch.

http://world.std.com/~reinhold/diceware.html

>Because some words on the diceware list are two characters or less, you can get a very short passphrase. If your passphrase, including the spaces between the words, is less than 17 characters long, we recommend that you start over and create a new passphrase. You should also start over if your passphrase is a recognizable English sentence or phrase. (These situations are extremely rare.)

(If it were me I'd just keep adding more and more words if my password was 17 chars or less)


> Rolling until you get a word you "like" reduces randomness significantly.

Rerolling 32 times loses 5 bits of security... not a big deal.


A 5 word diceware password has ~65 bits of security (five words at about 12.9 bits each from a 7776-word list).

5 bits is a significant part of 65 bits. Alternatively, it matters whether an attack takes a month or 2.67 years.
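The numbers in this exchange are easy to check. A quick sketch, assuming the standard 7776-word Diceware list (one word per outcome of five six-sided dice) and words chosen uniformly at random:

```python
import math

WORDLIST_SIZE = 6 ** 5  # 7776 words: one per outcome of five six-sided dice
bits_per_word = math.log2(WORDLIST_SIZE)

for words in (4, 5, 6, 7):
    print(f"{words} words: {words * bits_per_word:.1f} bits")

# Keeping your favourite out of 32 candidates costs at most log2(32) bits,
# i.e. an attacker's search gets at most 32 times faster.
reroll_penalty_bits = math.log2(32)
speedup = 2 ** reroll_penalty_bits
print(f"penalty: {reroll_penalty_bits:.0f} bits, speed-up: {speedup:.0f}x")

# A one-month attack versus the same attack 32 times slower.
print(f"32 months = {32 / 12:.2f} years")
```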


When I use diceware I use random.org for the dice rolls. They claim true randomness generated from atmospheric noise.


Is that because Random.org is really random, or because your threat model doesn't include state level actors taking over random.org?


If state level actors really are interested in what you have to say, then either you're a high enough level spook that you aren't using a web service for your entropy source, or your crypto is not going to be the weak point in your defense.


If every hacker uses random.org, it's more cost-effective to compromise the service than to take a wrench to every single hacker's kneecaps. It wouldn't be a targeted attack, but it might be very useful if a sufficient subset of 'interesting' people use random.org.


Because I don't often have physical dice handy. So it's that or pseudorandom numbers I generate on my local machine.
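For the local-machine option, a minimal sketch using Python's `secrets` module (which draws from the operating system's CSPRNG) to simulate the dice. The word list here is a made-up placeholder so the example runs on its own; a real run would load an actual Diceware list instead. The 17-character check follows the rule from the Diceware page quoted earlier in the thread, and the function names are invented for this sketch.

```python
import itertools
import secrets

# Placeholder word list: one made-up entry per five-dice outcome ("11111".."66666").
# A real run would load an actual Diceware list here instead.
WORDLIST = {
    "".join(roll): "word" + "".join(roll)
    for roll in itertools.product("123456", repeat=5)
}


def roll_die() -> int:
    """One simulated six-sided die, drawn from the OS CSPRNG."""
    return secrets.randbelow(6) + 1


def diceware_passphrase(num_words: int = 5, min_length: int = 17) -> str:
    """Build a passphrase from simulated rolls, five dice per word."""
    while True:
        keys = [
            "".join(str(roll_die()) for _ in range(5))
            for _ in range(num_words)
        ]
        phrase = " ".join(WORDLIST[key] for key in keys)
        # Start over only when the whole phrase is too short,
        # never because a particular word is disliked.
        if len(phrase) >= min_length:
            return phrase


print(diceware_passphrase())
```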


Getting your entropy for crypto purposes from a 3rd party is not a great idea.


'3rd party' is rather vague in this context, at least to me. Perhaps it's clearer to say that anything requiring network access is out of the question. E.g. you should be able to use a diceware application on a laptop which is completely off the grid.


Pseudorandom bits are statistically indistinguishable from random bits if seeded with true randomness. And if your OS is properly configured it should be seeded with true randomness.

If you prefer to use dice, that's fine, but you're going to get less than three bits of entropy per roll. Enjoy rolling your dice about 50 times to get 128 bits of entropy. ;)
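For reference, the entropy of a die roll can be worked out directly. This sketch assumes a fair six-sided die and independent rolls:

```python
import math

bits_per_roll = math.log2(6)  # one fair six-sided die
target_bits = 128
rolls_needed = math.ceil(target_bits / bits_per_roll)

print(f"{bits_per_roll:.3f} bits per roll")            # 2.585
print(f"{rolls_needed} rolls for {target_bits} bits")  # 50
```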


You don't roll them one by one.


Do you roll 10 of them in one hand and manually input the results back? For what purpose?


I'd love to see a study done on this.

I have a feeling that a "phrase" password has significantly more entropy than a "character" password of double or even triple the length (counting each word as one character for length purposes).

Even taking into account that a real sentence would need to follow a lot of rules, there are still a LOT of adjectives, a LOT of nouns, etc. I'm sure your "meat based" generator is more open to targeted attacks if someone knows your interests or something, but I have a feeling that it's still such a large pool that it's safe.

And if you start to include somewhat nonsensical phrases like "correct horse battery staple", that opens things up even more.

Including other things like spacing, capitalization, misspellings, made-up words, or even prepending or appending a "traditional" password gets you more still.


If you pick completely random words from the dictionary, you get about 17 bits per word. That's worth between two and three random characters.

If you take random words and arrange most of them into a sentence then you drop slightly but you're still okay. If you add any common words like "the" or "was", don't include them in your word count.

While you could boost it with spacing, capitalization, misspellings, etc. you gain very few bits for each modification you have to remember. You're better off tossing a random character or two onto your phrase, or simply making it a word longer.
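The "two to three characters" figure can be checked with a short calculation. The 17 bits per word is the number from the comment above; the two character sets are assumptions chosen for comparison.

```python
import math

BITS_PER_WORD = 17  # figure quoted above; implies 2**17 = 131072 words


def chars_per_word(alphabet_size: int) -> float:
    """How many uniformly random characters one random word is worth."""
    return BITS_PER_WORD / math.log2(alphabet_size)


ALPHANUMERIC = 26 + 26 + 10  # a-z, A-Z, 0-9
PRINTABLE_ASCII = 95         # assumed alternative: all printable ASCII characters

print(f"alphanumeric:    {chars_per_word(ALPHANUMERIC):.2f} characters per word")
print(f"printable ASCII: {chars_per_word(PRINTABLE_ASCII):.2f} characters per word")

# A single yes/no tweak (say, capitalising one agreed-upon word or not) adds
# at most 1 bit, while one extra random word adds the full 17.
print(f"one yes/no tweak: 1 bit; one extra word: {BITS_PER_WORD} bits")
```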


I don't know how to answer the other parts of your comment, but your "double or even triple length" estimate might be off:

From https://en.wikipedia.org/wiki/Entropy_(information_theory)

> English text has between 0.6 and 1.3 bits of entropy for each character of message.

For comparison, a random string of alphanumeric characters has lg(26 + 26 + 10) ≈ 5.95 bits per character.

So if your password is drawn from an English corpus and the low end of the estimate is correct, it's only about as strong as a random password about 10 times shorter (or about 4.6 times shorter on the high end).
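The ratio is quick to recompute. A sketch using the 0.6 to 1.3 bits-per-character range quoted from Wikipedia and a 62-character alphanumeric alphabet:

```python
import math

random_bits_per_char = math.log2(26 + 26 + 10)  # random alphanumeric character
english_low, english_high = 0.6, 1.3            # bits per character, range quoted above

ratio_low_end = random_bits_per_char / english_low
ratio_high_end = random_bits_per_char / english_high

print(f"random alphanumeric: {random_bits_per_char:.2f} bits per character")  # 5.95
print(f"ratio at low end:  {ratio_low_end:.1f}")   # 9.9
print(f"ratio at high end: {ratio_high_end:.1f}")  # 4.6
```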

But of course we don't want a grammatical English password. The question is how much entropy our meat-based random generator actually loses due to language bias, compared to random word selection from an English dictionary (I don't disagree with the analysis of the latter, as long as the words are machine-generated).


I wonder how much extra entropy you can add by introducing an extra language or two?

For instance, having a password like 'unterwasserboot-sparkle-mocidade-yogurt'.

It seems like multilingual folks would be at a distinct advantage here ... at least until you forget which of the words in your password was in which language, and you end up with 'submarine-faisca-jugend-yogurt' instead :)
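As a back-of-the-envelope answer: if every word is drawn uniformly from the combined pool, each doubling of the pool adds one bit per word. The list size per language and the no-overlap assumption below are made up for illustration; words a human picks by hand will do worse than this.

```python
import math

WORDS_PER_LANGUAGE = 7776  # assumed list size per language
PHRASE_LENGTH = 4          # words in the example passphrase above


def phrase_bits(languages: int) -> float:
    """Entropy of a phrase drawn uniformly from the pooled word lists."""
    pool = languages * WORDS_PER_LANGUAGE  # assumes no overlap between lists
    return PHRASE_LENGTH * math.log2(pool)


for languages in (1, 2, 3, 4):
    print(f"{languages} language(s): {phrase_bits(languages):.1f} bits")
```

Under these assumptions a second language buys about as much as a third of an extra word, so adding one more word is still the cheaper option.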


Beyond grammar, the addition of intentional errors or symbols or Finnegans Wake can help increase the entropy drastically too.


BIP39 is a spec for generating secure mnemonics (passphrases) with 128-bit entropy. I've written a version in Rust; have a look: github.com/leshow/rust_mnemonic

BIP39 usually generates 12-word or 24-word mnemonics.

This was just a small project I did, and it hasn't been checked for correctness. However, it should give you an idea of how you can generate word sequences.
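For anyone curious how the word sequence is derived, here is a rough Python sketch of the index step as described in the BIP39 spec: append the first ENT/32 bits of SHA-256(entropy) as a checksum, then split the result into 11-bit groups that index a 2048-word list. It is not the code from the linked repository, the function name `bip39_indices` is made up, and the word list lookup is left out.

```python
import hashlib
import secrets


def bip39_indices(entropy: bytes) -> list:
    """Map entropy to 11-bit word indices following the BIP39 layout."""
    ent_bits = len(entropy) * 8
    if ent_bits not in (128, 160, 192, 224, 256):
        raise ValueError("entropy must be 128 to 256 bits, in 32-bit steps")
    checksum_bits = ent_bits // 32
    digest = hashlib.sha256(entropy).digest()
    # Append the first checksum_bits bits of SHA-256(entropy) to the entropy.
    checksum = digest[0] >> (8 - checksum_bits)
    combined = (int.from_bytes(entropy, "big") << checksum_bits) | checksum
    total_bits = ent_bits + checksum_bits
    # Split into 11-bit groups, most significant first.
    return [
        (combined >> shift) & 0x7FF
        for shift in range(total_bits - 11, -1, -11)
    ]


indices = bip39_indices(secrets.token_bytes(16))  # 128 bits of entropy
print(len(indices), "indices, each in range 0..2047")
```

With 128 bits of entropy this gives 132 bits in total, hence 12 words; 256 bits gives 264 bits, hence 24 words.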


I do the same, but I skip the "and use characters from it". By using the full multi-word passphrase I get more entropy than by using just some of the letters, at the same memorization effort. The only exception is sites with low maximum password length.


Does it not occur to you that if you are going to go that route, you could just use the password GordonFreemanjoins27electricfences8%kills$39

which would be even more secure and just as easy, if not easier, to remember?


I'm all for mnemonic devices and I can understand how one could remember a password like "GFj27ef8%k$39". But can people remember 30 of those, assuming no password reuse?


You can use passphrases with a substitution or shift cipher and get a high entropy password that would also be resilient to combination dictionary attacks.


I use a German word for my WiFi password. 33 characters, and everybody I know enters it correctly on the first try :D


Just use diceware.



