To generate human-readable, pronounceable passwords, I usually rely on apg or pwgen. This time I needed a solution that can be used from a Python program, and that program has to be portable, so I can’t assume that apg or pwgen is installed on the computer.

First, we need to download Ned Batchelder’s hyphenate.py, which implements Frank Liang’s hyphenation algorithm (the one used in TeX). It lets us break words into syllables and build a big set of elements that we’ll be able to combine later.

In the script below, we load a dictionary (typically /usr/share/dict/words) and select only the words that are entirely lowercase alphabetic, which excludes proper nouns, words ending with 's, etc. We break each word into syllables and keep only the short ones, under five characters (some syllables are really long, up to full words, perhaps because the algorithm fails on some words). Finally, we print a Python module with two set objects, words and syllables.

# gendict.py
import sys, pprint
from hyphenate import hyphenate_word

words = set()
syllables = set()
for dictfile in sys.argv[1:]:
    with open(dictfile) as lines:
        for line in lines:
            word = line.strip()
            # keep only all-lowercase alphabetic words
            if word.isalpha() and word.islower():
                words.add(word)
                # keep only the short syllables (fewer than 5 characters)
                syllables.update(s for s in hyphenate_word(word)
                                 if len(s) < 5)
# print the two sets as Python source, to be redirected into a module
print("words = %s\n\n" % pprint.pformat(words))
print("syllables = %s\n" % pprint.pformat(syllables))

This script can be invoked as, for instance:

$ python gendict.py /usr/share/dict/words > wordlist.py

Then, we are ready to create our password generator:

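# pwgen.py
import sys, random
import wordlist

nonalpha = "0123456789-+/*=%,;.:!?$&#|@"
# random.sample() needs a sequence (sets are rejected since Python 3.11)
syllables = sorted(wordlist.syllables)
# translation table that deletes every character of nonalpha
strip_nonalpha = str.maketrans("", "", nonalpha)

def pwgen(length):
    while True:
        pw = []
        for s in random.sample(syllables, length):
            # randomly vary the case of each syllable
            if random.randint(0, 1):
                s = s.capitalize()
            if random.randint(0, 1):
                s = s.swapcase()
            pw.append(s)
        # add between 1 and length distinct symbols, then mix everything
        pw.extend(random.sample(nonalpha, random.randint(1, length)))
        random.shuffle(pw)
        attempt = "".join(pw)
        # retry if the letters alone spell a word from the dictionary
        word = attempt.translate(strip_nonalpha).lower()
        if word not in wordlist.words:
            return attempt

if __name__ == "__main__":
    for length in sys.argv[1:]:
        print(pwgen(int(length)))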

The while loop retries until the generated password, once stripped of its symbols and lowercased, is not a word that actually exists in our list.

$ python pwgen.py 1 2 3 4 5
LUM!
6nateR4
GRAB6SHIMALMS4
?lAXpORSviva&vels
Ba/saraZels9,mAKEBREW2

Hopefully you will not have the same output. ;-)
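
The dictionary check done by the while loop can be tried in isolation. The snippet below is only a sketch: the helper name is_dictionary_word and the two-word set are made up for illustration and stand in for wordlist.words.

```python
# Same symbol list as in pwgen.py
NONALPHA = "0123456789-+/*=%,;.:!?$&#|@"
# Translation table that deletes every symbol in NONALPHA
STRIP_NONALPHA = str.maketrans("", "", NONALPHA)

def is_dictionary_word(attempt, words):
    """Hypothetical helper: True if attempt, minus symbols and case, is in words."""
    return attempt.translate(STRIP_NONALPHA).lower() in words

# Tiny stand-in for wordlist.words (an assumption, not the real list)
words = {"grab", "password"}
print(is_dictionary_word("Pass6WORD!", words))  # True: spells "password"
print(is_dictionary_word("GRAB6SHIM", words))   # False: "grabshim" is not listed
```

A candidate for which this returns True would be thrown away and regenerated.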

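Using the random module is actually a weakness, because its default generator is a pseudo-random one (the Mersenne Twister) that is not considered secure enough for cryptography. On systems where os.urandom is available, it is used to seed the pseudo-random generator, which should result in decent passwords, but the generator itself remains deterministic once seeded. A stronger implementation would draw every value from os.urandom. There is no need to reimplement random.sample and random.randint by hand for that: the standard library's random.SystemRandom class offers the same methods on top of os.urandom.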
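
Here is a minimal sketch of what a generator based on random.SystemRandom might look like. It is not a drop-in replacement for pwgen.py: the name pwgen_secure and the small SYLLABLES list are assumptions made so that the snippet runs on its own, the case handling is simplified, and the dictionary check is left out for brevity.

```python
import random

NONALPHA = "0123456789-+/*=%,;.:!?$&#|@"
# Assumption: a small stand-in list replaces sorted(wordlist.syllables)
SYLLABLES = sorted({"ba", "lum", "grab", "shim", "vels", "nate", "sara"})

def pwgen_secure(length, syllables=SYLLABLES):
    """Hypothetical variant of pwgen() that draws all randomness from os.urandom."""
    rng = random.SystemRandom()  # same methods as random, backed by os.urandom
    pw = []
    for s in rng.sample(syllables, length):
        # randomly switch each syllable to upper case
        pw.append(s.upper() if rng.randint(0, 1) else s)
    # add between 1 and length distinct symbols, then mix everything
    pw.extend(rng.sample(NONALPHA, rng.randint(1, length)))
    rng.shuffle(pw)
    return "".join(pw)

print(pwgen_secure(3))
```

Since SystemRandom has no internal state to seed or recover, the output differs on every run and cannot be reproduced.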