ASHBURN, Va. – Cryptography forms the foundation of many aspects of trusted computing. This article considers recent algorithms and cryptographic attacks, as well as some future directions for cryptography in deployed embedded systems.
This article provides a foundational understanding of the context of cryptographic algorithms, and delves into the details of secure hashing, common secure hashing algorithms, and some trusted computing applications of secure hashing.
Today there are three general categories of cryptographic algorithms that are common in trusted computing: secure hashing, symmetric cryptography, and asymmetric cryptography.
Hashing takes a relatively large piece of data and generates a relatively small unique value from that data. Given a large amount of data, if even one bit of the input data is changed, the resulting hash will change by a large amount and in a deterministic but unpredictable way. Cryptographic algorithms that use symmetric cryptography employ the same key to encrypt and decrypt the data.
Symmetric algorithms take data and encrypt them to be mathematically infeasible to extract the original information from the cipher text without the key. One key encrypts and decrypts to ensure the secrecy of that key, and so ensures the secrecy of the protected data.
In contrast, the third type of algorithm, asymmetric cryptography, uses a pair of keys. One is private and always should remain secret, while the other can be shared freely.
Asymmetric cryptography has uses like signing data, since it can prove who generated the data. Asymmetric cryptography also is for key agreement protocols to enable secure agreement on a shared secret key over an open network.
Hashing passes a large set of data through a mathematical one-way algorithm, and then generates a more compact, deterministic, and unique value. Hashing algorithms are fast, require minimal computational resources, and need to be “one-way” to ensure that it is computationally infeasible to find another input that matches the output value of the hash function. Some commonly used hash functions include MD5, SHA-1, SHA-2, and SHA-3.
Secure hash algorithms need the ability to resist collision attacks, pre-image attacks, second pre-image attacks, and length extension attacks. A collision attack attempts to find any two inputs A and B that, when hashed with the hash algorithm H(), generate the same output.
Find any two A and B such that H(A) = H(B)
During a pre image attack, the attacker knows output H(A) but does not know the input A. The attacker next attempts to find any input A, B, C… that will generate the same hash output H(A).
Given H(A) find any value B which may include A such that H(A) = H(B)
A second pre-image attack, while similar to a pre-image attack, stipulates that the attacker is trying to find some input other than the original input A that will generate the same hash as H(A).
Given H(A) find a B such that A != B and H(A) = H(B)
Related: Lowering the costs of encrypted data storage in trusted computing
A length extension attack is when an attacker, given the output hash H(A), and the length of the original message, tries to design a new input A || B without needing to know A, that can generate a known output.
Given H(A) design H(A || B) with known B such that H(A || B) is a given output
MD5 hash algorithm
The MD5 hash algorithm first was published in 1992, and uses a fixed digest size of 128 bits. While designed initially for security, several vulnerabilities have been found in MD5 for collision and preimage attacks. While still often used as a checksum for files to check for unintended corruption, MD5 should not be used for security in trusted computing systems.
SHA-1 hash algorithm
SHA-1 was released in 1995, with a fixed digest size of 160 bits. As its secure hash algorithm name implies, it was designed for security. Unfortunately several papers have shown that SHA-1 is vulnerable to collision attacks. Since 2010, SHA-1 has not been recommended for use in the security of U.S. trusted computing systems, although exceptions are made for interfacing with legacy systems.
SHA-2 hash algorithm
SHA-2 refers to a family of hash algorithms that were released in 2001. They are SHA-224, SHA-256, SHA-384, and SHA-512: the value denotes the digest size in bits. All members of the SHA-2 family use the same fundamental mathematical algorithm, with only slight variations to change the digest size.
Related: A guide to international authorities for global trusted computing standards certification
SHA-224 and SHA-384 essentially are truncated versions of the SHA-256 and SHA-512 algorithms. While there have been several papers published that reduce the computational complexity required to break SHA-2, it is still infeasible for larger digest sizes. The CNSS Policy 15, published 20 Oct. 2016, recommends SHA-384 as the minimum for protection of classified information on U.S. National Security Systems.
SHA-3 hash algorithm
The SHA-3 family of algorithms was released by NIST in 2015. These algorithms fundamentally are different from the MD5, SHA-1, and SHA-2 algorithms, which all share a similar structure. While SHA-3 is not considered “better” than SHA-2, these hash algorithms provide additional tools to ensure there are robust solutions readily available to fill the same role if fundamental vulnerabilities are found in SHA-2.
There are no current plans to require a move to SHA-3 from SHA-2 algorithms for U.S. National Security Systems. The SHA-3 algorithms don’t have a fixed digest size unlike the other common hash functions. That’s because they employ a different mathematical structure. SHA-3 relies on sponge construction that provides for the concept of absorbing data and then “squeezing” out data after applying permutations.
NIST Publication 202, though, does specify a set of defined digest sizes and algorithms that are specified as SHA3-224, SHA3-256, SHA3-384, and SHA3-512. These would allow SHA-3 to be used as a drop-in replacement for SHA-2. SHA-3 also offers two additional algorithms, called extensible output functions (XOF), that can generate any given output length at a specific security strength. These XOFs are called SHAKE128 and SHAKE256; SHAKE is a combination of SHA and Keccak, the fundamental mathematical algorithm that SHA-3 depends on.
Secure hashing algorithms generate digests from data, an approach that many higher-level protocols use in trusted computing systems. For example, while not required for generating digital signatures over arbitrary data, secure hash algorithms can increase the processing speed of signing and verification operations for asymmetric cryptographic operations.
Secure hashing can help process large quantities of data quickly, and generate a relatively small hash. Slower asymmetric cryptographic operations then can sign or verify the smaller digest. This approach can help reduce overall processing time over a wide range of data sizes.
Secure hashing algorithms also is for hashing message authentication code (HMAC). HMAC algorithms, defined in FIPS publication 198-1, concatenate a secret key (K) with the message (M) and then hash the resulting combined message H(K || M). The specific implementation includes additional padding and key length details that are important to security, but the overall concept is as above. The recipient of the message M uses its own copy of the shared secret key K, concatenates it with the received message, and then performs the same hash operation to verify that the message is authentic.
Secure hash algorithms also can help in key derivation functions (KDFs), which are for higher-order key management techniques like key stretching or key rolling. Defined KDFs that depend on hash algorithms acting as a pseudo-random function are specified in NIST special publication 800-108.
Related: Establishing a trusted supply chain for embedded computing design
Defined KDFs include a KDF operating in counter mode and feedback mode. In counter mode, keys are computed using a base key and a defined counter, which allows for missed blocks not impacting an individual key. In feedback mode, keys are computed using some set of previous keys that require some number of previous blocks to correctly calculate the next key.
One increasingly popular use of secure hashing is for developing hash chains, or more generally Merkle Trees. Hash chains and Merkle Trees are based on the repeated application of a hash function to data. Hash chains apply a hash recursively. This approach can support applications like one-time-use passwords.
A Merkle Tree applies a hash data blocks, which then generate the leaves of the tree. It computes each parent node of the leaf nodes as hashes of the set of the children nodes. This data structure enables application like verification of data integrity across distributed systems (git or mercurial content management system) or the verification of data integrity in file systems like ZFS.
Blockchain cryptocurrency, such as bitcoin, also relies on hash chains for recording the ledger of bitcoin transactions, with each transaction containing a hash linking it to a previous transaction.
Related: Decomposing system security to prevent cyber attacks in trusted computing architectures
Since secure hashing does not depend on mathematical operations that are solvable on quantum computers, secure hashing structures such as hash chains and Merkle Trees help to mitigate concerns over increasing capabilities in quantum computers to break other cryptographic algorithms.
David Sheets is senior principal security architect at Curtiss-Wright Defense Solutions. Contact him by email at [email protected].
Ready to make a purchase? Search the Military & Aerospace Electronics Buyer's Guide for companies, new products, press releases, and videos