Making and Breaking Codes

Lies at the Speed of Light

Taming the Automobile

Making and Breaking Codes

There are two kinds of encryption in the world—one that will prevent your little nosy sister from peeking at your diaries—the other would prevent even the most powerful governments on earth. The previous sentence is paraphrased from a comment by Bruce Schneier in his book on Applied Cryptography. What makes these two types of encryption separate from each other is quite counter-intuitive. The former kind consists of techniques that depend on secrecy. The second kind depends on openness, full disclosure and peer review. The first kind should have become extinct by now, yet it keeps popping up in many current products. The second kind is of course, useful.

Encryption is a method for taking a written document (plaintext) and transforming it to a unreadable document (ciphertext) such that the original document is somehow recoverable from the ciphertext. For example, we can take this column and replace every occurrence of the letter “e” with the letter “x”. Now the column would be hard to read, but in a few seconds a reader would realize what was done, and be able to undo the replacements. Of course, we could invent a better scheme. Suppose we replace “a” with “p”, “b” with “c” and so on. The complete list of transpositions can be put into a table and used to encrypt (forward substitution) as well as decrypt (backward substitution). The table, is then known as they “key”. Anyone who has the key can decrypt a document if it is encrypted with the same key.

What if, Alice encrypts her diary using the above technique and then puts the key in a very safe place. Bob finds the encrypted diary, and wants to read it, but does not have the key. Can he manage to do it? The answer is Yes.

Bob can use one of two techniques—brute force or cryptanalysis. The brute force technique attempts to decrypt the message using all possible keys. In the above example, there are a total of 403,291,461,126,606,000,000,000,000 keys (26 factorial). The brute force technique tries all possible keys and looks at the resulting decryption and checks whether it looks like English. If a computer could test 100,000,000 keys a second (that would be faster than any computer invented yet) then it would take this computer 127,882,883,411 years to find Alice’s real writings.

The above makes it apparent that the simple transposition cipher is a very “strong” cipher that is very hard to “break”. By breaking a cipher, we mean, getting the plaintext from the ciphertext without having a key. However, the above example is severely flawed, the transposition cipher is, in reality, very weak.

There is no need to try a brute force on that transposition cipher—it takes a few minutes, or maybe an hour of computer time to decipher the diary without a key. The technique is to use letter statistics and letter positions (“e” is the most common letter, vowels are common, vowels are located inside words, and so on). Many possible keys can be built and then tried, and the original document can be recovered quite fast. This method (as well as many others) forms a class of code breaking techniques called cryptanalysis.

What is a good cipher? There is only one known cipher that is provably unbreakable; it is called the “one-time pad”. In this cipher, each letter is transposed to another letter a random distance away. The transposition distance for each letter is the key (the sequence of distances have to be as long as the message, and cannot contain repetitions). Since, the key is just as long as the message, this method is not at all useful.

Anyone can build a cipher. But how do we know the strength of the cipher? Normally the strength of a cipher is often expressed as the key length—or the number of binary digits needed to store the key. If a cipher has a strength of N bits, it means there are a total of 2^N different keys. Today, ciphers over 64 bits are considered somewhat strong. The IDEA cipher, which is heavily used on the Internet, has 128 bit keys. Note that the transposition cipher, we discussed earlier, has an 88-bit key that makes it apparently very strong, yet in reality it is a very weak cipher. Hence key length does not quite tell the whole story, the actual working of the encryption algorithm determines the strength of the cipher.

The quest for good ciphers has been the holy grail of cryptographers. Until about 1970 cryptographic work was done in almost complete secrecy, mainly by secret organizations formed by governments of powerful counties. In the US, the organization that has the most experience with cryptography is the National Security Agency or NSA. In fact, during the early days, the NSA was so secret that NSA denied any existence of itself. People, who knew NSA existed, used to call it “No Such Agency”.

Today NSA is entrusted with preserving the national security of the United States, using electronic surveillance. They eavesdrop on all communications (voice, data, satellite, cable) going in and out of the country. They figure out how to find “interesting” things in the huge amount of electronic traffic that speed along the data and voice superhighways connecting the United States to the rest of the world. It is a daunting task—we know the NSA does it, but no one knows how they do it. The NSA is also the world’s largest employer of high-caliber mathematicians. They are the largest powerhouse of code-makers and code-breakers.

In 1976 the National Bureau of Standards (NBS) published an encryption algorithm called the Digital Encryption Standard (DES). DES has a strength of 56 bits. The complete description of DES and how it works is freely available. DES has roots in an algorithm by an IBM researcher, and was subsequently scrutinized, modified and blessed by the NSA. DES became the most heavily used and studied cipher in the world. Given that the NSA had a hand in the creation of DES, many experts feel DES is flawed, that is there is a backdoor technique built into the algorithm, such that it is possible to decrypt DES encoded messages without knowing the key. Of course, this backdoor is only known to the NSA.

Till today, this backdoor has not been discovered (even though the entire algorithm is open to scrutiny). More so, no one has been able to find a method of breaking DES other than using brute force. The brute force attacks on DES have yielded some interesting results. An organization called EFF (Electronic Frontier Foundation) built a custom computer—a machine built to do brute force attacks on DES and then showed that this machine, called the EFF Cracker, can break DES in about 60 hours (July 1998). After that, another experiment used the EFF Cracker and 100,000 computers to get a DES breaking time of 22 hours.

Hence, we know today, that DES is weak. (A strong cipher should not be breakable in anything less than a few thousand years). A variant of DES called Triple-DES where the plaintext is encrypted three times with three different keys is almost universally accepted as a very strong encryption method.

Again, what is a strength of a cipher? DES is 56 bits and lives up to it, it can only be broken by brute force. The transposition cipher seems to have 88 bits, but it is weak. The IDEA cipher (128 bits) is very likely very strong, as no one yet has found any weaknesses. The weaknesses are often found by people who spend their lives inspecting ciphers to find a way to reverse engineer them and find the plaintext from the cipher without needing too many key test attempts.

Over the years, one irrefutable fact has emerged—good ciphers are ciphers are ones whose workings are made public. If I invent a cipher and publish all of its details, many of the cryptographers will take a very close look at it and try to see if there are ways to analyze, reverse engineer and break the cipher (without the need for brute force). If for many years no one is able to break my cipher, then it is likely to be a good cipher. If, as in DES, it is not broken in 25 years, then we can really depend on the cipher (DES is broken by brute force, there is ample evidence that DES has 56 bit strength).. In reality, most ciphers are broken very easily. This makes life quite difficult for those few misguided people who still think that a secure cipher is a secret cipher.

A lot of commercial products use secret (also called proprietary) ciphers. The mindset of these weak-minded people is that if the internal workings of the cipher is kept secret it would be hard to break. Over and over again, the cryptanalysts have proven them wrong and yet they keep committing the same costly folly. Two such idiocies in recent times are the large-scale adoptions of the GSM cipher and the DVD cipher.

The GSM cellular phone has a little chip in it that does encryption. The algorithms used by the GSM cipher were kept secret before it was deployed. The GSM industry did not publicly tell anyone how it worked. In 1998, Briceno, Goldberg and Wagner of University of California at Berkeley discovered (reverse engineered) the algorithm used in GSM authentication and found a cracking technique that takes 10 hours on a PC (easy). Subsequent to that, in 1999, Biryukov and Shamir of the Weizmann Institute (Israel) showed how to crack the GSM voice encoder in 2 seconds. The secrecy at the design phase is now blamed for this screw up.

But the folly keeps on going. The movie industry wanted to put encryption on DVD disks (the currently popular movie format) to prevent piracy. Some brilliant jokers invented a scheme in secrecy and convinced the DVD consortium (more techno illiterates) of its wonderful security features. In late 1996 this brew was accepted as a worldwide standard for DVD copy protection, and soon, in 1998 the DVD players were rolling off of the assembly lines. In 1999 DVD, the code was cracked by two unrelated groups of underground hackers, one call themselves MoRE (Masters of Reverse Engineering) and the other calls themselves DoD (Drink or Die). The software to unlock DVD’s (called DeCSS) is now available on hundreds of web sites all over the world. The movie industry is fighting back with huge lawsuits to stop the distribution of the DeCSS software. The genie is out of the bottle, and it is much too hard to put it back.

Of course, the GSM people or the DVD people could have just used any of the hundreds of well-known, publicly available ciphers—instead of building their own. They chose not to, for unknown possibly lunatic reasons, and have proved to the world, once again, that code making is a very complex art. Only openness and disclosure of ciphers make them strong and provide high levels of security.

Partha Dasgupta is on the faculty of the Computer Science and Engineering Department at Arizona State University in Tempe. His specializations are in the areas of Operating Systems, Cryptography and Networking.

Partha Dasgupta