A friend of mine recently implemented a brute-force password cracker in Python. I, having never looked into this kind of thing before, became curious about the implementation details. He showed me his code and it was surprisingly simple - just one recursive function and a small loop that called the function. It wasn’t actually a proper password cracker - you supplied a word and it would generate all possible words (by combining letters from a list) until the supplied word was matched. As it stands, it is useless but it wouldn’t be hard to make it produce SHA1 or MD5 hash of each word it generated - this would allow it to break hashed passwords.

The main problem with the code was that it was extremely slow. I wanted to see how much faster it would run if Psyco (a JIT complier for Python) was used. I wrote my own implementation of the original code and took a few timings:

#!/usr/bin/env python
 
import sys
 
chars = 'abcdefghijklmnopqrstuvwxyz0123456789'
 
def crack(w, pos, p, pwd):
    for c in chars:
        cr = None
        if pos < w:
            cr = crack(w, pos+1, p+c, pwd)
        if cr:
            return cr
        if p+c == pwd:
            return p+c
 
if __name__ == '__main__':
    try:
        pwd = sys.argv[1]
    except:
        print 'Usage: %s <password>' % sys.argv[0]
        sys.exit(1)
    p = None
    w = 1
    max_len = 10
 
    while not p and w < max_len:
        p = crack(w, 0, '', pwd)
        w += 1
    print 'Found', p

Here are the results of the timings for 4, 5 and 6 character passwords (all timings taken on a 2.4GHz Macbook Pro w/ 2GB RAM):

$ time ./python_brutepass zzzz
Found zzzz

real	0m0.567s
user	0m0.542s
sys	0m0.013s
$ time ./python_brutepass zzzzz
Found zzzzz

real	0m18.779s
user	0m18.632s
sys	0m0.046s
$ time ./python_brutepass zzzzzz
Found zzzzzz

real	11m9.131s
user	11m4.323s
sys	0m1.190s

11 minutes to match a 6 character word (which is only lowercase characters and numbers)?! Once you’d added in a hashing step, I imagine this would take a great deal longer still.

I added in a couple of lines at the top of the code:

import psyco
psyco.full()

and took the timings again:

$ time ./python_brutepass zzzz
Found zzzz

real	0m0.134s
user	0m0.112s
sys	0m0.014s
$ time ./python_brutepass zzzzz
Found zzzzz

real	0m3.486s
user	0m3.435s
sys	0m0.022s
$ time ./python_brutepass zzzzzz
Found zzzzzz

real	2m8.056s
user	2m6.994s
sys	0m0.254s

With Psyco enabled it ran ~5x quicker - not bad for an extra two lines of code.

Anyway, these examples show that such computationally expensive operations aren’t really possible in Python. To find a 5 character password, the maximum number of operations needed (assuming no hashing is done) is ~62193781. I worked this out as follows. If n is the length of the password (5), and l is the number of letters being used (36 in this case for lowercase letters + numbers from 0 to 9), the total number of possible permutations with repitition for each length of word is ln, but as the length of the word would not be known, the total number of permuatations with repitition is the sum of lk for k=0 to n. This is actually a geometric progression and thus can be calculated like this: