pdfcrack

PDFs can be encrypted, so that you need a password to be able to open them (or, sometimes, to be able to print them, or do other things with them).

pdfcrack is a password-cracker for PDFs, enabling you to try both brute-force and dictionary attacks on encrypted PDF files for which you don't have the password. It's also available as a Debian package.

I recently received an encrypted PDF (together with the corresponding password), so I thought I'd try out pdfcrack to see how well it worked.

The tool allows you to specify the minimum password length to try, and the maximum you want it to work its way up to. By default it starts with 4-character passwords, and it tries combinations of lower case letters, upper case letters, and digits. I have no idea whether there are restrictions on the symbols which can actually be used in PDF passwords.
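
To make the size of that search concrete, here is a minimal Python sketch of a brute-force candidate generator. It is only an illustration of the idea, not pdfcrack's actual implementation: the 62-symbol alphabet is taken from the description above, and the ordering of candidates is an assumption.

```python
import itertools
import string

# Assumed alphabet: the 62 symbols described above (lower case, upper case, digits).
# This is not read from pdfcrack; it is just for illustration.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits


def candidates(min_len, max_len, alphabet=ALPHABET):
    """Yield every string over `alphabet`, working up from min_len to max_len."""
    for length in range(min_len, max_len + 1):
        for combo in itertools.product(alphabet, repeat=length):
            yield "".join(combo)


# Peek at the first few 4-character candidates without generating them all.
first_three = list(itertools.islice(candidates(4, 4), 3))
print(first_three)  # ['aaaa', 'aaab', 'aaac']
```

Because the generator is lazy, it never holds the whole list of candidates in memory, which matters once the lengths get beyond a handful of characters.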

The encrypted PDF I was sent had a 16-character password (composed, fortunately, of lower case letters, upper case letters, and digits).

I ran pdfcrack against this, setting both the minimum and maximum password lengths to try to 4 characters. Obviously it was going to fail, but I wanted to see how long it took.

It used one core of my 8-core 4GHz AMD CPU, and took 4 minutes to decide that it couldn't work out what the password was. As it runs, it gives an update every 20 seconds on what "word" it is trying, and how many words per second it is scanning. In my case it reported an average of 60,000 words per second.

I then tested it with 5-character passwords, and it took 4½ hours to fail to find the password.

I always like cross-checking things, so:

26 upper case letters + 26 lower case letters + 10 digits = 62 symbols. Therefore there are 62^4 = 14,776,336 possible 4-character "words" to try, and dividing this by 60,000 per second gives 246 seconds, which is close enough to 4 minutes for my liking.

Similarly there are 62^5 = 916,132,832 5-character "words". Divide that by 60,000 per second and the answer is 15,269 seconds, or 4.24 hours. Again, close enough for me.
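
The same cross-check can be written as a few lines of Python. This is just a sketch of the arithmetic; the 60,000 words per second figure is the average rate reported on my machine, and the function name is made up for the example.

```python
ALPHABET_SIZE = 26 + 26 + 10  # upper case + lower case + digits = 62 symbols
RATE = 60_000                 # words per second, as measured on one core


def exhaust_seconds(length, rate=RATE, symbols=ALPHABET_SIZE):
    """Seconds needed to try every password of exactly `length` characters."""
    return symbols ** length / rate


for length in (4, 5):
    secs = exhaust_seconds(length)
    print(f"{length} characters: {ALPHABET_SIZE ** length:,} words, "
          f"{secs:,.0f} seconds, {secs / 3600:.2f} hours")
```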

I then established that I could run 8 copies of pdfcrack on my machine simultaneously, each trying a different password length, and each using a different CPU core, with each one still achieving around 50,000 attempts per second.

Thus on my (fairly modest and not at all expensive) PC, I could test 8 x 50,000 = 400,000 passwords per second.

This seemed pretty good, until I then worked out how many 16-character passwords there are, and did the arithmetic:

62^16 = 4.7 x 10^28

4.7 x 10^28 divided by 400,000 per second = 1.2 x 10^23 seconds

You can divide this by 3600 to get hours, 24 to get days, 365.25 to get years, but then you run out of convenient units of measurement. I'm not sure if there is a definition of "ages", but the answer turns out to be 3.8 x 10^15 years.

I believe the universe is currently estimated to be 14 billion years old, and assuming that the people who work these things out actually mean 14 thousand million years, even though they said "billion", that would be 1.4 x 10^10 years.

The age of the universe falls short of the time needed to crack a 16-character PDF password on my computer by a factor of 269,758. At least we've stopped having to use scientific notation by now. It does mean, though, that either I need 270 thousand computers of the sort I currently have, and must then wait for half the age of the universe to have a 50% chance of getting the password (for which I don't think I could afford the electricity bill), or I might as well just give up now and accept that 16-character randomly-chosen passwords are pretty damn good.
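
For anyone who wants to check the 16-character arithmetic, here is a small Python sketch. The inputs are the figures quoted above (8 cores at 50,000 attempts per second each, and a universe of 1.4 x 10^10 years); the variable names are my own.

```python
SECONDS_PER_YEAR = 3600 * 24 * 365.25

keyspace = 62 ** 16              # every 16-character password over 62 symbols
rate = 8 * 50_000                # attempts per second across all 8 cores
universe_years = 1.4e10          # assumed age of the universe, in years

seconds = keyspace / rate
years = seconds / SECONDS_PER_YEAR
ratio = years / universe_years   # how many universe-lifetimes a full search takes

print(f"keyspace: {keyspace:.1e} passwords")
print(f"time to exhaust: {seconds:.1e} seconds, or {years:.1e} years")
print(f"that is about {ratio:,.0f} times the age of the universe")
```

On average the password turns up halfway through the search, which is where the "half the age of the universe" for a 50% chance comes from once the work is spread over that many machines.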

A dictionary attack would be far more efficient, since my copy of the OED contains only 1058 16-letter words, such as:

  • acquaintanceship
  • ambidextrousness
  • arteriosclerosis
  • circumnavigation
  • contemptuousness
  • counterclockwise
  • deoxyribonucleic
  • disqualification
  • diverticulectomy
  • enthusiastically
  • hieroglyphically
  • hydrodynamometer
  • imperturbability
  • inconclusiveness
  • inextinguishable
  • intraventricular
  • lithophotography
  • metamathematical
  • mispronunciation
  • neuropathologist
  • onomatopoietical
  • periphrastically
  • phototypesetting
  • predetermination
  • quadragintesimal
  • restitutionalist
  • sanguinification
  • transmutationist
  • unmentionability

I tried running pdfcrack with the full list of 1058 16-letter words, but it was so fast that the timings were meaningless. 0.15 seconds was about average.
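
A list of words of one particular length can be pulled out of a larger word list with a few lines of Python. This is a sketch only: the sample list here is a made-up stand-in for a real dictionary file, and the function name is hypothetical.

```python
def words_of_length(words, length):
    """Return the distinct words with exactly `length` characters, sorted."""
    return sorted({w.strip() for w in words if len(w.strip()) == length})


# A tiny stand-in for a real word list (which would normally be read from a file).
sample = ["circumnavigation", "password", "mispronunciation", "pdf", "circumnavigation"]

sixteen = words_of_length(sample, 16)
print(sixteen)  # ['circumnavigation', 'mispronunciation']
```

The result can then be written out one word per line to make a word list for a dictionary attack.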

Even on the full OED list of 261,958 words (of all lengths) it took only 4.3 seconds to try them all.
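
As one last cross-check, the dictionary run implies a rate that can be compared with the brute-force figure. A quick sketch, using only the two numbers quoted above:

```python
words = 261_958   # size of the full word list
seconds = 4.3     # time taken to try them all

rate = words / seconds
print(f"{rate:,.0f} words per second")
```

That comes out at roughly 61,000 words per second, which agrees nicely with the 60,000 or so that pdfcrack reported during the brute-force runs.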

