PGP keys, software security, and much more threatened by new SHA1 exploit
Three years ago, Ars declared the SHA1 cryptographic hash algorithm officially dead after researchers performed the world’s first known instance of a fatal exploit known as a “collision” on it. On Tuesday, the dead SHA1 horse got clobbered again as a different team of researchers unveiled a new attack that’s significantly more powerful.
The new collision gives attackers more options and flexibility than were available with the previous technique. It makes it practical to create PGP encryption keys that, when digitally signed using SHA1 algorithm, impersonate a chosen target. More generally, it produces the same hash for two or more attacker-chosen inputs by appending data to each of them. The attack unveiled on Tuesday also costs as little as $45,000 to carry out. The attack disclosed in 2017, by contrast, didn’t allow forgeries on specific predetermined document prefixes and was evaluated to cost from $110,000 to $560,000 on Amazon’s Web Services platform, depending on how quickly adversaries wanted to carry it out.
The new attack is significant. While SHA1 has been slowly phased out over the past five years, it remains far from being fully deprecated. It’s still the default hash function for certifying PGP keys in the legacy 1.4 version branch of GnuPG, the open-source successor to PGP application for encrypting email and files. Those SHA1-generated signatures were accepted by the modern GnuPG branch until recently, and were only rejected after the researchers behind the new collision privately reported their results.
Git, the world’s most widely used system for managing software development among multiple people, still relies on SHA1 to ensure data integrity. And many non-Web applications that rely on HTTPS encryption still accept SHA1 certificates. SHA1 is also still allowed for in-protocol signatures in the Transport Layer Security and Secure Shell protocols.
In a paper presented at this week’s Real World Crypto Symposium in New York City, the researchers warned that even if SHA1 usage is low or used only for backward compatibility, it will leave users open to the threat of attacks that downgrade encrypted connections to the broken hash function. The researchers said their results underscore the importance of fully phasing out SHA1 across the board as soon as possible.
“This work shows once and for all that SHA1 should not be used in any security protocol where some kind of collision resistance is to be expected from the hash function,” the researchers wrote. “Continued usage of SHA1 for certificates or for authentication of handshake messages in TLS or SSH is dangerous, and there is a concrete risk of abuse by a well-motivated adversary. SHA1 has been broken since 2004, but it is still used in many security systems; we strongly advise users to remove SHA1 support to avoid downgrade attacks.”
A hashing primer
To recap, a hash is a cryptographic fingerprint of a message, file, or other type of digital input that, like traditional fingerprints, looks unique. Also known as message digests, hashes play a vital role in ensuring that software updates, cryptographic keys, emails, and other types of messages are the authentic product of a specific person or entity, as opposed to a counterfeit input created by an adversary. These digital fingerprints come in the form of a fixed sequence of numbers and letters that are generated when the message is inputted into a hash algorithm or function.
The entire security of a hashing scheme rests on the infeasibility of finding two or more different inputs that produce the same fingerprints. A function with a bit length of n should require a brute force attacker to test 2n/2 inputs before finding a collision (a mathematical concept known as the birthday paradox significantly reduces the number of guesses required, accounting for the n/2 in the equation). Hash functions with sufficient bit lengths and collision resistance require attackers are secure because they require an attacker to devote an infeasible amount of time and computing resources to generate a collision. Hash functions are considered broken when collisions can be found using fewer than 2n/2 tries.
The 128-bit MD5 hash function was one of the earlier widely used entrants to fall to collision attacks. Although researchers warned as early as 1996 that flaws in MD5 made it prone to collisions, it remained a key part of software and Web authentication for more than two decades afterwards.
Then, in 2008, researchers used MD5 collisions to create an HTTPS certificate for any website of their choosing. The demonstration eventually convinced browser-trusted certificate authorities to drop MD5, but the function continued to be widely used for other purposes. The full deprecation of MD5 for authentication purposes didn’t come until 2012, when the Flame espionage malware, which the US and Israel are reported to have used to spy on sensitive Iranian networks, wielded a collision attack to hijack Microsoft’s Windows Update mechanism so Flame could spread from computer to computer inside an infected network.
SHA1 is proving to follow a path that’s uncannily similar to that of MD5. Already a key part of the official standard for validating software updates, cryptographic keys, and other sensitive data, SHA1 became even more vital after the demise of MD5. But it, too, had collision vulnerabilities that have been known since 2004. The difficulty of transitioning to newer algorithms with better collision resistance allowed SHA1 to remain in wide-scale use even after 2015, when researchers predicted it could succumb to collision attacks by year’s end.
SHA1 is dead. Long live SHA1
Some 16 months later, researchers demonstrated the world’s first known collision attack against SHA1. It came in the form of two PDF files that, despite displaying different content, had the same SHA1 hash. The researchers behind it said it could allow a landlord to draft two rental agreements with colliding hashes. The landlord could get a tenant to digitally sign one document offering a low rental price and later claim the tenant signed the agreement for the lease agreeing to a much higher price.
The attack—which cost as little as $110,000 to carry out on Amazon’s cloud computing platform—was what cryptographers call a classical collision attack. Also known as an identical prefix collision, it results when two inputs have the same predetermined prefix—or beginning—and differing data that follows. Even though the two inputs are distinctly different, they can hash to the same value if additional data is appended to the files. Stated another way, for a hash function H, two distinct messages M1 and M2 will lead to the same hash output: H(M1) = H(M2).
Identical prefix collisions are powerful and a fatal blow against the security of a hash function, but their utility to attackers is also limited. A far more powerful form of collision is known as a chosen prefix attack, which is what allowed the MD5 attacks against the HTTPS certificate system in 2008 and against Microsoft’s update mechanism in 2012. While harder to carry out than identical prefix collisions, the chosen prefix cousins are generally much more useful.
That’s because chosen prefix attacks allow attackers to take two or more different prefixes—as opposed to the same prefix in traditional collision attacks—and append data to each so they hash to the same value. Given two message prefixes P1 and P2, an attacker can compute two messages M1 and M2 such that H(P1 || M1) = H(P2 || M2), where || denotes “concatenation,” or the act of linking the two. A more detailed explanation of chosen prefix collisions is available in this 2015 post from Nick Sullivan, head of research and cryptography at content delivery network Cloudflare.
PGP/GnuPG impersonation
The attack demonstrated Tuesday is the first known chosen prefix collision on SHA1. To demonstrate its potency, researchers Gaëtan Leurent and Thomas Peyrin of Inria France and the Nanyang Technological University in Singapore respectively, used the collision to perform a PGP/GnuPG impersonation attack. In their Real World Crypto paper the researchers explain:
The chosen prefixes correspond to headers of two PGP identity certificates with keys of different sizes, an RSA-8192 key and an RSA-6144 key. By exploiting properties of the the OpenPGP and JPEG format, we can create two public keys: key A with the victim name, and key B with the attacker name and picture, such that the identity certificate containing the attacker key and picture has the same SHA-1 hash as the identity certificate containing the victim key and name. Therefore, the attacker can request a signature of his key and picture from a third party (from the Web of Trust or from a CA) and transfer the signature to key A. The signature will still be valid because of the collision, while the attacker controls key A with the name of the victim, and signed by the third party. Therefore, he can impersonate the victim and sign any document in her name.
In a post further demonstrating the attack, the researchers provided both messageA and messageB. Despite containing differing user ID prefixes, they both map to the same SHA1 hash value of 8ac60ba76f1999a1ab70223f225aefdc78d4ddc0
.
The researchers’ results significantly improve the efficiency of SHA1 attacks, with a speedup factor of about 10. More precisely, the new attacks reduce the cost of an identical prefix collision attack from 264.7 to 261.2, and the cost of a chosen-prefix collision attack from 267.1 to 263.4 when performed on a GTX 970 graphics processor.
The researchers carried out the attack over a two-month period on a cluster of 900 Nvidia GTX 1060 GPUs they rented online. They said the rented cluster is a much more economical platform than Amazon Web Services and competing cloud services. The attack cost $74,000 when carried out a few months ago, but with an optimized implementation and computation costs that have continued to fall, the researchers say the same attack now costs $45,000. By 2025, the researchers estimate the attack will cost $10,000. The result: the same chosen prefix attacks that have been possible against MD5 since 2009 are now practical against SHA1 as well and will only become more affordable over time.
SHA1: May it (finally) rest in peace
The researchers privately reported their results to developers of software that is most affected. They included developers for:
- GnuPG. The developers responded by implementing a countermeasure in November that invalidates SHA1-based identity signatures that were created after January 2019.
- CAcert, a certificate authority that issues PGP keys. The researchers noticed a large number of CAcert-issued keys with recent SHA1 signatures on public keyservers. That may indicate that the CA still uses SHA1 to sign user keys. CAcert has acknowledged the issue, and it is planning to move away from SHA1.
- OpenSSL, a cryptographic library that continues to accept SHA1 certificates in many security-sensitive contexts. Developers responded by saying they’re considering disabling SHA1 in those contexts.
Given the number of applications and protocols that continue to rely on SHA1 for collision-resistant hashes, however, the researchers were unable to contact all affected developers. To prevent the attacks from being actively used in the wild, the researchers are withholding many of the collision details for the time being.
Matt Green, a John Hopkins University professor specializing in cryptography, said the results were impressive and underscored the oft-repeated observation that SHA1 can no longer be considered secure.
“For a secure hash function, a [speedup] factor of 10 shouldn’t make much of a difference, but when you’re down to something that’s pretty close to broken, those kinds of efficiencies really make a difference, especially when there’s lots of mining hardware out there,” he said in an interview. “We knew that one shoe had dropped and this is the next shoe dropping.”