< Day Day Up > |

## Modern-Day Cipher Structure and Operation ModesThe cipher strength depends on the key length, key secrecy (including appropriate key management and distribution), and the design of the algorithm itself. Claude Shannon, who is considered to be a father of modern cryptography by many researchers, proposed two essential elements of cryptographic systems. He designated these elements as diffusion and confusion. Diffusion refers to eliminating the relation of ciphertext statistical composition to that of the plaintext. The confusion element states that the relationship of the statistical composition between the ciphertext and the value of a key must be as complex as possible. The Lucifer cipher, published by Horst Feistel from the IBM cryptography team in 1973 (56 years after Shannon), was the first cryptosystem to employ Shannon's principles. The way these principles were implemented involved combining multiple permutations and substitutions in a sequence. Each performed substitution or permutation of data was designated as round, with the whole sequence known as iteration. Instead of using the key as it is, a key schedule algorithm was employed to generate the subkeys from the original key, further XORing different subkeys with blocks of data as it goes through the iteration process. Thus, a sufficient level of confusion was achieved. ## A Classical Example: Dissecting DESIt was on the basis of Lucifer that the Data Encryption Standard (DES) for the next 20 years was developed. Because of the importance of DES and the elegance of its design, we focus more on its mathematical description, clearly demonstrating Shannon's elements at work. Once you understand how DES works, you can easily grasp the inner workings of post-DES ciphers by comparing their machinery to that of DES. In our experience this tremendous simplification is very helpful. In addition to all previously described simple algorithms, DES is a symmetric block cipher. Symmetric refers to the fact that all users of such cryptosystems must share secret keys (usually one key per user) to encrypt and decrypt data. The secrecy of these keys should never be compromised; that is why shadowing of One would expect that if DES uses a 64-bit plaintext block size, the key size is also 64 bits. However, not everything is that simple. DES key bits in positions 8, 16, 24, 32, 40, 48, 56, and 64 are used for error control (which is basically assurance of bytes' odd parity; each byte has an odd number of 1s). Even more, when the data goes through DES rounds, only 48 bits of the remaining 56 bits of key space are selected for use. DES iteration starts and ends with the initial and final permutation functions that exist for data input–output convenience and does not affect overall DES security. If a 64-bit L => the data block is split on the "right" and "left" sides, 32 bits each._{0} = R_{0} = 32 bitThe initial permutation is followed by 16 rounds of substitution and transposition. If we define the cipher function as K as _{s}K, the 16 rounds can be described as follows:_{i}x This means that as the data goes through these rounds, the right side of data passes the cipher function and the left side is XORed with the function output on the right to give new function input on the right. The process then repeats itself until What happens inside of the First of all, we mentioned that only 48 bits of key space is eventually used. In practical terms it means IP(x) is 32 bits. Thus, the first step in a round is an expansion permutation E(R, which expands the right side of the data from 32 to 48 bits (apparently, it is the reason why the right side of data passes the cipher function). How does one magically turn 32 bits into 48?_{i})Split 32 bits of data into 4-bit input blocks. Then shift the data in input blocks so that the last bit of each 4-bit block becomes the first bit of the next output block and the first bit of each input block becomes the last bit of the last output block. Thus, using this shift, each input block donates 2 additional bits to each output block. Providing that the original 4 bits in every input block is passed into a corresponding output block, we get 8 x 6-bit output blocks out of 8 x 4-bit input blocks. Problem solved! Even more, the expansion permutation exhibits a very effective cryptographic property called the avalanche effect. Because the expansion permutation presence efficiently generates more ciphertext output from less plaintext input, small differences in plaintext produce vastly different blocks of ciphertext. Once we get our 48 blocks, we can XOR them with the subkeys: However, the round does not end with this XORing. Remember that we have to get 64 bits of encrypted data from 64 bits of plaintext at the end, so we need to get back to two 32-bit blocks of data, essentially reversing the expansion permutation function after it did its job. This is accomplished by splitting 48-bit blocks of data into 8 x 6-bit blocks and feeding them into so-called S-boxes, which produce 4-bit output from each 6-bit data block. In the S-box, the first and last bits of the 6 bits supplied form a binary number to select a row. The inner 4 bits are used to select a column. Thus, an S-box is simply a table with 4 rows and 16 columns. Four inner bits form a number positioned in the table and selected via row determined by 2 "external" bits and column determined by 4 "internal" bits. Thus, 2 outer bits are cut away (but participate in the 4 inner bits selection) and the remaining 4-bit numbers are reshuffled in a predetermined matter. The function of the S-box is the most nonlinear event in the whole process of DES iteration and is responsible for the lion's share of DES security. Total memory requirement for all eight S-boxes used is 256 bytes. The 4-bit output from each S-box is concatenated back into 32 bits of data before putting these 32-bit blocks through another permutation. This time the permutation is defined as "straight"; no bits are "reused" to expand the data and no bits are ignored. Basically, 32 bits of data is fed into a P-box with 2 rows and 32 columns, and numbers between rows exchange places. Did you already forget that everything we just discussed applies only to the right side of the initial 64-bit block? After the P-box spits out 32 bits of data, they are XORed with the left 32-bit half of the initial input. Then both halves switch places and the next round can begin. Following Round 16, the left and right halves are not exchanged. Instead they are concatenated and fed into the final permutation, which exchanges halves and concatenates them together in a way opposite to the initial permutation IP. The last thing to consider inside of the function box is First, the 8 parity bits are subtracted and the remaining 56 bits are split in half. This split is similar to C_{i} = LS_{i} (C_{i-1}) ; D_{i} = LS_{i} (D_{i-1}).Between Rounds 3 and 8 and 10 and 15, 2-bit left shifts are done, otherwise it is a 1-bit shift. After the shift, D are concatenated, and we are left with 56 bits of key data that must be XORed with 48 bits of data produced by the expansion permutation on the right side. Thus, we need a compression permutation function PC-2 for the keying material to match the corresponding (permutated) "plaintext": _{i}K._{i} = PC-2 (C_{i}D_{i})PC-2 does not use any specific algorithm to shift the positions of bits or drop some bits to get 48-bit output. Instead it uses a predefined table with numbers of bit positions. Bits are assigned their new positions in accordance with their position in the table. For example, position one from the beginning of the PC-2 table is 14, so the 14th bit is assigned the first position in PC-2 output. The table for PC-2 is widely published in the literature (see p. 274 in Schneier's Applied Cryptography). As a result of PC-2 we get a 48-bit The images of cipher structures from John Savard's home page cryptography section (http://home.ecn.ab.ca/~jsavard/crypto/entry.htm) are so wonderful that we could not resist borrowing them for this chapter. Sometimes, a picture can be worth a thousand lines of code! Figure 11-1 summarizes all we went through, with the left scheme representing the whole iteration of DES and the right scheme representing a single round. ## Figure 11.1. DES structure and operation.## Kerckhoff's Rule and Cipher SecrecyYou might have wondered, if DES is a U.S. government standard, why were its inner workings given away to the general public? The answer is this: Keeping the cipher closed from scrutiny does no good for the cipher, its developers, and its users. As early as 1883, Jean Guillaumen Hubert Victor Fransois Alexandre Auguste Kerckhoff von Nieuwenhof (yes, some people have rather long names) wrote that the key used and the cipher's function must be two separate entities and cryptosystems should rely on secrecy of keys, but not algorithms. Some 111 years later, a proprietary secret algorithm, RC4, was published on the Internet by an unknown hacker who posted its source code to the Cypherpunks mailing list. Opening the RC4 structure quickly led to the development of several attacks on the cipher (however, these attacks aren't related to the weakness of WEP, which uses RC4). Whereas the developers of RC4, RSA Data Security, Inc., are well-reputed cryptographers who created a variety of strong product ciphers, many small companies that claim to develop highly efficient and secure secret encryption algorithms often do not offer anything more than a variety of ROT13 with a single "round" of XORing. It appears that open sourcing ciphers, just like open sourcing software, has advantages when it comes to security, public scrutiny being one of them. Another advantage of DES openness is that now you have learned about S-boxes, subkeys, expansion, and compression, and straight permutations, using a classical and still practical (considering the use of 3DES and still running legacy cryptographics software and hardware) example. Now it is much easier to explain how post-DES ciphers work, not to mention saving a lot of space and our time. ## The 802.11i Primer: A Cipher to Help Another CipherAs a very relevant example, the per-packet key mixing function in TKIP is a small Feistel cipher on its own, developed by Doug Whiting and Ronald Rivest. The purpose of this function is producing a per-packet key from a temporal key and the IV or TKIP sequence counter (TSC). This per-packet key then serves as a secure WEP seed, eliminating the risk of an FMS-style attack. Such a WEP seed can be computed before it is used, which positively affects the network performance. Let's walk through the per-packet key mixing function in detail, as defined by the 802.11i standard drafts available at the time of writing. As mentioned in the previous chapter, there are two phases of per-packet key mixing function operation. However, both Phase 1 and Phase 2 of per-packet key generation rely on the S-box that substitutes one 16-bit value with another 16-bit value. The substitution function is nonlinear and is implemented as a table lookup. The table lookup can be implemented as a single large table with 65,536 entries and a 16-bit index (128 KB of table) or two different tables with 256 entries and an 8-bit index (1024 bytes for both tables). When the two smaller tables are chosen, the high-order byte is used to obtain a 16-bit value from one table and the low-order byte is used to obtain a 16-bit value using the other table. The S-box output in this case would be the XOR of the two 16-bit values selected. The inputs taken by the first phase are 80 bits of the 128-bit temporal session key (TK), the transmitter MAC address (TA), and the 32 bits of the IV = TSC. Its output (TTAK) is also 80 bits in length and constitutes an array of five 16-bit TTAK, _{1}TTAK, _{2}TTAK and _{3,}TTAK values. The description of the Phase 1 algorithm treats these values as 8-bit arrays: _{4}TA and _{0}..TA_{5}TK._{6}..TK_{12}XOR, ADD, and bitwise AND operations are used in the Phase 1 computation. A loop counter The Phase 1 algorithm consists of two steps. The first step does an initialization of TTAK from both IV and MAC address, but without the temporary key. The second step employs the S-box we outlined earlier to mix the keying material into the 80-bit TTAK and sets the PHASE1_LOOP_COUNT value to 8: Input: transmit address TA0...TA5, temporal key TK0..TK12, and TSC0..TSC2 Output: intermediate key TTAK0..TTAK4 PHASE1-KEY-MIXING(TA0...TA5, TK0..TK12, TSC0..TSC2) PHASE1_STEP1: TTAK0 <= TSC0 TTAK1 <= TSC1 TTAK2 <= Mk16(TA1,TA0) TTAK3 <= Mk16(TA3,TA2) TTAK4 <= Mk16(TA5,TA4) PHASE1_STEP2: for i = 0 to PHASE1_LOOP_COUNT -1 j = 2(i & 1) TTAK0 <= TTAK0 + S[TTAK4 ^= Mk16(TK1+j,TK0+j)] TTAK1 <= TTAK1 + S[TTAK0 ^= Mk16(TK5+j,TK4+j)] TTAK2 <= TTAK2 + S[TTAK1 ^= Mk16(TK9+j,TK8+j)] TTAK3 <= TTAK3 + S[TTAK2 ^= Mk16(TK13+j,TK12+j)] TTAK4 <= TTAK4 + S[TTAK3 ^= Mk16(TK1+j,TK0+j)]+i end The inputs to the second phase of the temporal key mixing function include the output of the first phase (TTAK), the TK, and the lower 16 bits of the TSC. The created WEP seed possesses an internal structure that conforms to the original WEP specification. The first 24 bits of the seed are transmitted in plaintext in the same way as with the old WEP IVs. As mentioned in the previous chapter, these 24 bits are used to convey the lower 16 bits of the TSC from transmitter to receiver. The remaining 32 bits are conveyed in the Extended IV (EIV) field, in Big-Endian order. In Phase 2, both TK and TTAK values are represented as in Phase 1. The WEP seed produced is an array of 8-bit values ranging from XOR, ADD, AND, OR, and the right bit shift operations (>>) are employed in the process of Phase 2 computation that relies on four functions: Lo8 references the least significant 8 bits of the 16-bit input value. Hi8 references the most significant 8 bits of the 16-bit value. RotR1 rotates its 16-bit argument 1 bit to the right. Mk16 was already described when outlining Phase 1.
Phase 2 consists of three steps: - STEP1 copies the TTAK and brings in the 16 bits of TSC.
- STEP2 is the S-box.
- STEP3 brings in the remaining TK bits and defines the 24-bit WEP IV values transmitted.
Input: intermediate key TTAK0...TTAK4, TK, and TKIP sequence counter TSC. Output: WEPSeed0 ...WEPSeed15 PHASE2-KEY-MIXING(TTAK0...TTAK4, TK, TSC) PHASE2_STEP1: PPK0 <= TTAK0 PPK1 <= TTAK1 PPK2 <= TTAK2 PPK3 <= TTAK3 PPK4 <= TTAK4 PPK5 <= TTAK4 + TSC PHASE2_STEP2: PPK0 <= PPK0 + S[PPK5 ^= Mk16(TK1,TK0)] PPK1 <= PPK1 + S[PPK0 ^= Mk16(TK3,TK2)] PPK2 <= PPK2 + S[PPK1 ^= Mk16(TK5,TK4)] PPK3 <= PPK3 + S[PPK2 ^= Mk16(TK7,TK6)] PPK4 <= PPK4 + S[PPK3 ^= Mk16(TK9,TK8)] PPK5 <= PPK5 + S[PPK4 ^= Mk16(TK11,TK10)] PPK0 <= PPK0 + RotR1(PPK5 ^= Mk16(TK13,TK12)) PPK1 <= PPK1 + RotR1(PPK0 ^= Mk16(TK15,TK14)) PPK2 <= PPK2 + RotR1(PPK1) PPK3 <= PPK3 + RotR1(PPK2) PPK4 <= PPK4 + RotR1(PPK3) PPK5 <= PPK5 + RotR1(PPK4) PHASE2_STEP3: WEPSeed0 <= Hi8(TSC) WEPSeed1 <= (Hi8(TSC) || 0x20) && 0x7F WEPSeed2 <= Lo8(TSC) WEPSeed3 <= Lo8((PPK5 ^= Mk16(TK1,TK0)) >> 1) for i = 0 to 5 WEPSeed Step 3 of Phase 2 determines the values of all three WEP IV octets. Its structure was designed to eliminate the use of known weak keys. The receiving device can easily reconstruct the least significant 16 bits of the TSC used by the sender by concatenating the first and third IV octets and ignoring the second one. The remaining 32 bits of the TSC are obtained from the EIV. Thus, you have been presented with an interesting case when a specific cipher is designed and implemented to correct a flaw in another cipher's implementation. ## There Is More to a Cipher Than the Cipher: Understanding Cipher Operation ModesUnderstanding how DES (or any other symmetric block cipher including the TKIP function we just described and the AES used by the final 802.11i standard) works is not sufficient per se. An extremely important detail is the unique mode of the block cipher operation. With DES you can encrypt 64 bits of plaintext into an equivalent amount of ciphertext, but what if you want to encrypt only 50 bits? 128 bits? 200 bits? Or 31337 bits of data? An obvious solution is to split the long string of data into 64-bit blocks and pad blocks with less than 64 bits of data with some regular pattern of 0s and 1s. This is the most basic and simplest mode of block cipher operation, the Electronic Codebook Mode (ECB). In a nutshell, if the plaintext string Another problem we have encountered with the ECB is a short block length. We have discovered that on a machine running the U.S. version of Debian (which still uses ECB-mode DES for password security due to the export regulations) maximum password length cannot exceed 8 characters, and any symbol in a password longer than that does not make any difference at all (e.g., you can log in with "password" if the real password is "password%^*&))@!#0x69"). Thus, the possibility of a successful dictionary attack or password brute-forcing is increased. If you recall that the block size is 64 bits, and one ASCII character takes 1 byte, the reason for such an event is obvious. Yet, the majority of system administrators we asked about this discovery looked rather puzzled. Sometimes a theory might be more practical than it seems. To summarize, ECB is reasonable for encryption of short data strings, like PIN numbers or database entry values (parallel encryption and decryption of large databases does have its advantages). It is not a suitable mode to rely on for encrypting strong passwords or large data volumes. The way to avoid the replay problem is to chain 64-bit blocks of data together so that they become interdependent. Thus, a next mode of block cipher operation we are looking at is a Cipher Block Chaining mode (CBC). It is based on XORing the plaintext with a previously obtained block of encrypted data before the plaintext is encrypted. Because when the encryption happens the first time there is no ciphertext to XOR with, a new parameter known as the initialization vector (IV; sometimes also called an injection vector, initializing value, or initial chaining value) is introduced. IV is nothing more than a block of random data in the size of a block the cipher uses (64 bits with DES and many other symmetric block ciphers out there). It could be a timestamp, x The use of IVs ensures that the ciphertexts resulting from the plaintexts that are similar in the first few bytes (e.g., LLC SNAP or IP headers) are different. There are two main disadvantages of CBC: Because chaining is applied, parallel encryption is not possible (although parallel decryption is still an option). An error in one block will propagate to other (chained) blocks, which is likely to result in data retransmission.
Still, we encrypt data in fixed-size blocks, chained or not. What if we want to encrypt it in smaller blocks, or bit-by-bit, starting the encryption before the whole block is received? For some applications (e.g., remote shells), data should be encrypted immediately, character by character (8-bit blocks). There are two solutions for this problem, namely Cipher-Feedback (CFB) and Output Feedback (OFB) modes. In CFB we start feeding the blocks of generated ciphertext into the encrypted IV value rather than into plaintext: z Unlike CBC, we can start sending enciphered data as the plaintext and m/n, where m is a block cipher size and n is a number of bits encrypted at a time. For example, 64-bit block cipher encrypting ASCII characters in CFB mode will work 64/8 = 8 times slower compared to the same cipher operating in ECB or CBC with 64-bit blocks. The CBC statement on parallel processing applies to the CFB as well. However, when it comes to error propagation, a ciphertext error would affect only the corresponding plaintext and the next full block.Finally, Output Feedback (OFB) mode removes chaining, and as such, removes error propagation. On the other hand, the data blocks are not interdependent anymore, so some external form of synchronization (e.g., similar to the CSMA/CA algorithm on 802.11 LANs) is needed. To remove chaining, OFB generates a constant stream of z does not depend on either plaintext or ciphertext:_{i}z Because there is only one stream of m/n rule applies to OFB as well as CFB.How about the counter mode (CCM) used by the 802.11i standard for Advanced Encryption Standard (AES) operation? It is quite similar to the OFB mode just reviewed. In the OFB mode, each z procedure. CCM is simpler: The IV values taken from the incrementing counter are encrypted and XORed with blocks of plaintext to produce the ciphertext, or:_{i} = e_{k} (z_{i-1})z The |

< Day Day Up > |