Asymmetric Cryptography: A Different Animal

Message authentication using HMACs works just fine, but how do we distribute symmetric cipher keys among the users? We can pass them around on floppies or fancy USB pen-drives with encrypted partitions on them, but what if many users live all over the world? What if the physical key distribution method takes time and the keys must be frequently changed? This is the case with the traditional WEP, which should be rotated every few minutes.

Key-encrypting keys (KEKs) were offered as symmetric cipher keys used only to encrypt other symmetric cipher keys before they are distributed. Therefore, only the distribution of KEK is required. Still, how do we distribute the KEK in a secure manner? Won't it become a single point of failure for the whole organization? A model of physical KEK distribution would become very vulnerable to social engineering attacks and we know that social engineering tends to wreak more havoc than all known cracking tools combined (see Mitnick's The Art of Deception (John Wiley & Sons, 2002, ISBN: 0471237124) as a reference). Besides, from a management viewpoint, won't such a system give too much power and responsibility to a small group of people, perhaps even a single person on a technical team?

The answer lies in using asymmetric ciphers, something totally different from everything we have reviewed in this chapter so far. As we have seen, one-way hashes are nothing more than fancy symmetric ciphers that take a constant of necessary length as plaintext, enciphered data as a large "key," and run a huge amount of complex rounds to make the decryption unfeasible. Symmetric ciphers are nothing more than sophisticated, modern-day, digital Enigma-style rotor machines. Replace the rotors and cogwheels with CPU registers and available instructions, make them operate in accordance with well-established laws and principles (Shannon, Feistel, etc.), and you will get the idea.

Asymmetric ciphers, on the contrary, are based on solving specific mathematical tasks in the world of large numbers. In layman's terms, imagine an equation impossible to solve without a certain variable. That variable is kept secret and is called a private key. The rest of the variables can be given to anyone else to initiate the task; this is called a public key. The algorithm of the equation itself does not have to be secret, and encrypting or decrypting data depends on the success of solving the equation. To get closer to the heart of the problem, imagine a cryptographic hash function that is relatively easy to compute but practically impossible to invert, unless a certain value is known. That value (or, more likely, values) is called a trapdoor. The mathematical relationship between the trapdoor (the basis for the private key) and variables given to the public (the basis for the public key) is very costly to solve, making the deduction of private key from the public one close to impossible if you take into account the computational power of today's machines. This is referred to as a hard problem.

As far as the practical implementation of such a mathematical concept goes, mankind came up with three secure hard problems to use: factoring large numbers into prime factors, calculating discrete logarithms in a finite field, and, as a variation of this, calculating elliptic curve discrete logarithms. All these problems have one thing in common: Although conceptually they might not be too difficult to solve, in practice and with current computing power, solving one of these problems might take more time than it takes our universe to expand to the point of collapse and the next Big Bang.

Whitfield Diffie and Martin Hellman proposed the idea of asymmetric cryptography in 1976. Their method was based on calculating discrete logarithms in a finite field. Although it might sound sophisticated to a non-mathematician, in reality the Diffie–Hellman (DH) system is very simple and elegant.

The Examples of Asymmetric Ciphers: ElGamal, RSA, and Elliptic Curves

Let's take a look at the modular arithmetic first. Modular arithmetic differs from standard math by using numbers in a range limited from zero to some number n, which is the modulus. When an operation produces a number greater or equal to the modulus, that number is divided by the modulus and the reminder is taken as a result. When an operation produces a negative number, the modulus value is added to it until we get a result in the zero-to-modulus range. For example, 5 + 5 mod 8 = 2 and 3 - 5 mod 7 = 5. In modular arithmetic, exponentiation works as a one-way function. Whereas it is easy to calculate y=g^x mod n, it is much harder to find x knowing other numbers in the equation, in particular when the numbers are sufficiently large. This is the finite field (0 to n) discrete logarithm problem in a nutshell, because x is the logarithm of y base g mod n and the numbers used are finite and whole. Mathematically, we can take two discrete logarithm equations, let's say ya = g^xa mod p and yb = g^xbmod p, where p is a prime number (which means it can only be divided by 1 and itself). In these equations, xa and xb values are the private keys and ya and yb values are public keys generated from the private ones. Let's swap the public keys, keeping the private keys secret, and use these public keys instead of g to generate key K:

The essential part here is ya^xb mod p = yb^xamod p, which means that by exchanging the public keys, both sides can generate message key K, which the sides share but do not exchange! Obtaining the key K not knowing xa or xb is not an easy task, at least resource-wise. Let's illustrate it with small numbers. Take p = 11, g = 5, and private keys xa = 2 and xb = 3:

Now to the hard problem: Without using a calculator, try to find both private keys knowing p = 11, g = 5, and the public keys are 3 and 4. Even better, use larger values for p, g and both public keys ya and yb. When you are back from this task, remember that the private key numbers used in the real-world implementations of the DH system (and the closely related ElGamal system) are at least 1,024 bits long! Actually, the minimal recommended size of a private key for the U.S. government Digital Signature Algorithm (DSA) standard, which uses ElGamal, is 2,048 bits. You get the idea.

Another very common asymmetric cryptosystem is RSA from Rivest, Shamir, and Adleman. RSA was the first asymmetric encryption method applied in practice. It is based on a hard problem of factoring large numbers in a given group of numbers from 0 to the modulus n. Take two large prime numbers p and q. The modulus would be n = p x q. Then compute the number of integers that are less than n and cannot be divided by n: f(n) = (p - 1)(q - 1) (f is known as the Euler phi function). Select a random number b under the condition that b cannot be divided by f(n) (this is called "being relatively prime to f(n)"; f(n) would be relatively prime to n). Finally, calculate a = b – 1 mod f(n). Keep a, p, and q secret. Give n and b as a public key.

If you know numbers 15 and 7, can you easily deduce numbers 2, 3, and 5? How about trying it with 2,048-bit numbers?

Finally, the elliptic curves-based asymmetric cryptosystems use determining the coordinates of points on elliptic curves as a hard task, presenting a relation between two different points on a curve as the private key, and coordinates of one of these points as the public key. Essentially, elliptic curve systems are a variation of the discrete logarithm problem, but you use a two-dimensional universe of the curve instead of the straight linear algebra we saw in the discrete logarithm method. Let's take an elliptic curve restricted by a prime number modulus p, as shown in Figure 12-2.

To find out if a certain point is positioned on the curve, check if its coordinates (x,y) fit into the equation that describes the curve: y² = x³ + ax³ + b (mod p).

It is possible to define addition and subtraction of two or more points on the elliptic curve: If both P and Q are points on the curve, then P+Q and P-Q are also somewhere on the curve and their coordinates can be determined. Now, fix a prime modulus p and a curve E(F_q). Take the point P and the point Q, which is a multiple of P: Q = kP. Then the discrete logarithm problem is to find the number k (private key), knowing the coordinates of the point Q (public key). The complexity of this task in practical terms is such that a key only 224 bits long is considered to be as secure as the RSA 2,048-bit key. This saves both memory space (important for restricted-resource devices) and key generation time (important when the keys are frequently changed, e.g., on a per-session basis).

Practical Use of Asymmetric Cryptography: Key Distribution, Authentication, and Digital Signatures

The basic idea of using asymmetric cryptography is distributing public keys while keeping the private keys private and using a person's public key to encrypt data sent to this particular individual. This is defined as secure message format. The distribution of public keys can be done in a hierarchical manner (using X.509 certificates) or as a "brotherhood of the ring," establishing the ring of users who share each others' public keys. The last model is used by free privacy-protection software such as PGP and GnuPG. Public key infrastructure (PKI) can be deployed, so that anyone interested can download public keys from the centralized server instead of asking the receiving sides to send them. Such servers can be public (e.g., blackhole.pca.dfn.de and horowitz.surfnet.nl) or privately deployed by your company or organization.

Although the secure message format addresses data confidentiality, it does not provide authentication. This creates a well-documented vulnerability to man-in-the-middle attacks, when an attacker placed between both sides replaces public keys exchanged with his or her own public key. Thus, the attacker can decrypt the data coming from both ends with his or her own private key and forward it to some guy named Bill. At the same time, the attacker can encrypt the decrypted data with public keys of the victims and forward it to its intended destinations. Thus, the attack is completely transparent and the victims would not even suspect that their data has been snooped on. To avoid having Bill read your supposedly secret e-mails, some form of authentication is necessary. That can be done by reversing the process and encrypting the data with your private key. In such a case, anyone with your public key can decrypt and read the data, knowing that the data comes from you and no one else if it was decrypted successfully. This is defined as open message format. Open message format provides nonrepudiation service: An entity is bound to the pair of keys and cannot deny itself as a source of the data sent. The only claim the sending side can make is that the data was modified on the way to the destination. However, we know the method to prove (or disprove) such a claim: one-way hashes. Thus, we can take a one-way hash of the data and encrypt it with the public key before sending it. This is how digital signatures work, providing both nonrepudiation and data integrity services.

Digital signatures carry as much legal weight as conventional signatures, if not more, although the law in your country might be different on this issue; conventional signatures are much easier to forge. To forge a digital signature, the fraudsters must have root-level access to the server that stores the organization's private keys. Thus, such servers must use a stable, secure OS and undergo regular security audits. In some operational systems, commands exist that make the file immutable and undeletable (e.g., chattr +i in Linux). Applying such commands to the private key and then deleting the command binary from the system can confuse some attackers who manage to gain access to the system. It is a good idea to place the private-keys-storing host on a different subnet and implement fascist router access lists, restricting access to the server on a strict "need-to-know" basis. In higher security settings, private keys can be stored on a PDA or laptop kept offline in a durable safe and turned on only when enciphering and signing are necessary. Of course, a removable hard drive or Zip drive or read-only CD can be used for private keys instead of the whole machine; the choice of protection method is yours. Do not forget that the human factor is the weakest link, and only trusted personnel should have access to your private keys. The rest of the employees should not even know how and where the keys are stored.

There are two common digital signature algorithms in use: Digital Signature Algorithm (DSA) and the RSA Signature Scheme. The RSA Signature Scheme is founded on the RSA asymmetric cryptosystem and uses MD5 or SHA-1 for one-way hash generation. It was a de facto standard in digital signature generation and verification before the U.S. government introduced DSA. DSA is based on the ElGamal asymmetric cryptosystem and employs SHA-1. A more secure variety of DSA is the Elliptic Curve DSA (ECDSA). Although (provided the key size is 2,048 bits or higher) both RSA and DSA offer a sufficient level of security, the speed of operations involving both algorithms is different. RSA works much slower when operations involve the private key; the opposite is true for the DSA. Thus, DSA is far more efficient when it comes to signature generation and signing (server side), and RSA is more appropriate for signature verification (client side).

As you probably already realized, although digital signatures provide nonrepudiation and data integrity, no data confidentiality is supplied. A solution for this problem is secure and signed format:

Generate a message digest of the data.
Encrypt both data and hash with the private key.
Encrypt the result with the receiver's public key.

A secure key backup solution can be both a difficult task and a hard decision to make. We leave it to you, because the key backup saves you from the unfortunate consequences of key loss, but introduces an additional target for private key-hungry intruders.

The question is this: If there is a secure and signed asymmetric cryptography format, why do we still have to use symmetric ciphers?

There are two answers: performance and key size. If the throughput of symmetric ciphers is estimated in megabytes per second, throughput of asymmetric ones is counted in kilobytes per second. The speed of RSA encryption (1,024-bit key) is about 1,500 times slower than the speed of enciphering with any of the five AES finalists. Such performance can introduce unacceptable delays in host and network operation, in particular when wireless networking is involved. Also, even the smallest acceptable 1,024-bit asymmetric cipher keys can be a problem for limited-resource devices like smart cards or mobile phones. Thus, a compromise between asymmetric cryptography secure key exchange and nonrepudiation properties and the performance of symmetric ciphers has to be found. Such a compromise exists in the form of hybrid encryption or digital envelopes:

This model is used in operation of public key cryptographic systems employed by tools like PGP and GnuPG. These tools can use RSA or DSA for asymmetric key generation. A wireless-relevant implementation of GnuPG is its use by the NoCat wireless authentication portal to sign the messages exchanged, thus avoiding the forgery so easily performed on WLANs. When key exchange is implemented in various networking operations, the key agreement is frequently done using the original DH scheme operation based on the discrete logarithms in the finite space calculation problem. The DH standard is outlined in NIST FIPS PUB 186-1 and FIPS 186-2. Common DH key sizes are 768, 1,024, and 2,048 bits. Authenticated DH uses digital signatures to foil man-in-the-middle attacks and has proven to be quite reliable, but slow. ACLs based on the Authenticated DH signatures can be implemented when running IPSec. To address some of the DH cryptosystem drawbacks, the Elliptic Curve DH key exchange scheme was proposed. It has obvious performance and keyspace size advantages over the original DH implementation. Unfortunately, the Elliptic Curve DH key exchange scheme is not currently widely implemented by hardware and software vendors.

On this point we conclude our discussion of asymmetric cryptography and applied cryptography background in general and move to the security protocols and software tools that implement the principles and algorithms we have discussed.