Cryptic Crossword: Amateur Crypto and Reverse Engineering (muppetlabs.com)
131 points by breadbox on Feb 14, 2014 | 16 comments


The punchline to this story is that a few years ago the Times stopped scrambling their .puz files, making all this reverse-engineering work largely irrelevant.

(At the time I was using a modified version of the "xword" program in Debian's repo, which didn't detect whether the file was scrambled. In other words, it treated every letter as wrong because it didn't match the enciphered grid. I ended up hacking in some code to detect these files and disable the check/reveal features when playing them.)
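
For anyone wanting to do the same kind of detection, here is a minimal sketch in Python. It assumes the community-documented .puz header layout: a NUL-terminated "ACROSS&DOWN" magic string at offset 0x02 and a little-endian 16-bit "scrambled tag" at offset 0x32 that is zero for unscrambled files. The function name and constants are illustrative assumptions, not the code that went into xword.

```python
import struct

MAGIC = b"ACROSS&DOWN\x00"   # assumed file magic, NUL-terminated
MAGIC_OFFSET = 0x02          # assumed offset of the magic string
SCRAMBLED_TAG_OFFSET = 0x32  # assumed: little-endian uint16, 0 = unscrambled


def is_scrambled(data: bytes) -> bool:
    """Return True if the .puz header marks the solution grid as scrambled."""
    if len(data) < SCRAMBLED_TAG_OFFSET + 2:
        raise ValueError("too short to be a .puz file")
    if data[MAGIC_OFFSET:MAGIC_OFFSET + len(MAGIC)] != MAGIC:
        raise ValueError("missing ACROSS&DOWN magic")
    # "<H" = little-endian unsigned 16-bit integer
    (tag,) = struct.unpack_from("<H", data, SCRAMBLED_TAG_OFFSET)
    return tag != 0
```

A player could call `is_scrambled(open(path, "rb").read())` on load and turn off check/reveal whenever it returns True.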


Very true! I mentioned that fact the first time I gave this presentation, but it wound up being an anticlimactic ending, so I chose to omit it from the written essay. (And at this point, the focus is more about the process of reverse-engineering anyway.)

EDIT: To be precise, there were still a few other crossword publishers using the scrambling feature. None as important as the New York Times, though, of course.


Problem solving is its own reward!


This is a great and entertaining read about reverse engineering.

It's such a good read that this is almost beside the point... but, as it happens, I worked on and reverse-engineered this same "encryption" scheme (I hesitate to use the word) for an iOS app that never shipped. I just dumped the code (which seems to have been written in late 2008) up on GitHub... it's old, and messy, but hey, maybe it's fun for someone:

https://github.com/davepeck/puzfile


Very cool rundown of the approach to trying to decode this. FWIW there is also a significant archive of information about the format here, including information about the scrambling: https://code.google.com/p/puz/wiki/FileFormat


Indeed, it's staggering how patient and determined some people can be. I love this line, from when his friend asked whether he could reverse-engineer the scrambling algorithm:

"My response was: maybe. Hard to say, but I'm willing to try. Privately, though, my reaction was THIS IS MY DREAM PROJECT AND THERE IS NO WAY I'M NOT SPENDING ALL AVAILABLE FREE TIME ON THIS."


yeah, i found that extremely helpful when writing the acrosslite module for a crossword format converter. i decided not to bother with scrambled grids for the moment, though, since they are a very acrosslite-specific feature; once the project progresses a bit further i'll go back and add them in.


Extremely methodical and determined approach, especially the analysis of the errors that the partially successful attempts ran into. Well done.


>In a way this is just a restatement of Occam's Razor, but I like it because it clarifies why Occam's Razor is a good idea. It's not because simpler solutions are actually more likely to be true; they usually aren't. It's because it's almost always easier to improve a simple solution by adding complexity, than it is to improve a complicated solution by digging out a simple solution buried within it.

While the second part of that is an interesting observation, the first part is simply false. It basically comes down to prior probabilities and conjunctions. Every bit of new information implied by a hypothesis is another "and". It is a simple fact that P(X) ≥ P(X and Y), so the more conjunctions your hypothesis implies (the more complex it is) the lower its prior probability.
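
Spelled out with the product rule (a one-line derivation that uses only the standard definition of conditional probability):

```latex
P(X \land Y) \;=\; P(X)\,P(Y \mid X) \;\le\; P(X),
\qquad \text{since } 0 \le P(Y \mid X) \le 1 .
```

Equality holds only if P(Y | X) = 1 (or P(X) = 0), that is, when the extra claim Y adds no new information beyond X.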


Wouldn't it have been easier to disassemble the program that works with these files, and analyze the code?


Maybe, but the author mentioned that he wanted to reverse engineer it as a black box, out of legal concerns. It makes a more interesting challenge this way too.


I particularly liked the approach to automation of the application, even through WINE. Setting the time with LD_PRELOAD is a really neat trick.


Ah, I skimmed the article and didn't see that part - makes for a much more interesting writeup anyway.


Man, reading little-endian binary formats makes my head hurt. I get why it's done that way, but what a nightmare for comprehending what you're reading.


After enough time spent reading hexdumps you get used to it, and then all of a sudden big-endian feels really backwards.


Well, assuming you're reading little-endian hexdumps. I cut my teeth on OS X back in the PPC days, so to me it is just the opposite.
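
To make the byte-order point concrete, here is a small Python sketch (the four bytes are made up for illustration). The same bytes from a hexdump decode to different integers depending on which endianness you assume:

```python
import struct

raw = bytes.fromhex("78 56 34 12")  # hypothetical bytes, in hexdump order

(little,) = struct.unpack("<I", raw)  # little-endian: least significant byte first
(big,) = struct.unpack(">I", raw)     # big-endian: most significant byte first

print(hex(little))  # 0x12345678
print(hex(big))     # 0x78563412
```

Which of the two reads as "natural" in a hexdump mostly depends on which platform you learned on.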



