SHA-1: From Hero to Zero
For two decades, SHA-1 was the backbone of internet security—protecting SSL certificates, signing software, and verifying Git commits. Then in 2017, Google shattered it. Literally.
This post traces SHA-1's journey from government-approved standard to cryptographic casualty, culminating in the SHAttered attack that proved its collision resistance was broken in practice.
What is SHA-1?
SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function that produces a 160-bit (20-byte) hash value, typically rendered as a 40-character hexadecimal string:
SHA1("Hello, World!") = 0a0a9f2a6772942557ab5355d76af442f8f65e01
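The digest above can be reproduced with Python's standard `hashlib` module. This is a minimal sketch; the helper name `sha1_hex` is made up for illustration.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of data as a hexadecimal string."""
    return hashlib.sha1(data).hexdigest()

digest = sha1_hex(b"Hello, World!")
print(digest)           # compare against the value quoted above
print(len(digest))      # 40 hex characters = 160 bits
```

Note that the input is a byte string: hashing text requires choosing an encoding first, and a different encoding gives a different digest.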
Like all cryptographic hash functions, SHA-1 was designed to provide:
- Preimage resistance: Given a hash, you can't find a message that produces it
- Second preimage resistance: Given a message, you can't find another message with the same hash
- Collision resistance: You can't find ANY two messages with the same hash
SHA-1's 160-bit output provides a theoretical collision resistance of 2^80 operations (due to the birthday paradox). In 1995, that seemed like plenty. It wasn't.
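The 2^80 figure comes from the birthday bound: for an n-bit hash, a collision is expected after roughly 2^(n/2) attempts. Searching the full 160 bits is out of reach, so the sketch below truncates SHA-1 to 24 bits (an assumption made purely for illustration), where a collision should turn up after about 2^12 attempts. The function names are hypothetical.

```python
import hashlib
from itertools import count

def truncated_sha1(data: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of the SHA-1 digest."""
    digest = hashlib.sha1(data).digest()
    return int.from_bytes(digest, "big") >> (160 - bits)

def find_truncated_collision(bits: int):
    """Hash the messages b"0", b"1", b"2", ... until two truncated digests match."""
    seen = {}
    for i in count():
        message = str(i).encode()
        value = truncated_sha1(message, bits)
        if value in seen:
            return seen[value], message, i + 1
        seen[value] = message

first, second, attempts = find_truncated_collision(24)
print(first, second, attempts)
print("birthday estimate:", 2 ** (24 // 2))
```

The attempt count lands in the thousands, not anywhere near 2^24. The same square-root effect is what turns 160 bits of output into only 80 bits of collision security.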
The Birth of SHA-1
SHA-1 was designed by the National Security Agency (NSA) and published by NIST in 1995 as FIPS PUB 180-1.
It was a revision of SHA-0 (published in 1993), which the NSA quietly withdrew due to an undisclosed "weakness." SHA-1 added a single left-rotation operation that significantly strengthened the algorithm—though the NSA never explained why this change was necessary.
The SHA Family Tree
SHA-0 (1993) — Withdrawn, weakness found
↓
SHA-1 (1995) — Subject of this post
↓
SHA-2 (2001) — SHA-224, SHA-256, SHA-384, SHA-512
↓
SHA-3 (2015) — Completely different design (Keccak)
How SHA-1 Works (Simplified)
SHA-1 processes messages in 512-bit blocks:
- Padding: Message is padded (a single 1 bit, then zeros, then the 64-bit message length) to a multiple of 512 bits
- Initialization: Five 32-bit state variables (h0-h4) are set to fixed constants
- Processing: Each block goes through 80 rounds of operations in 4 groups of 20
- Output: Final state variables are concatenated into a 160-bit hash
Each round combines:
- Bitwise operations (AND, OR, XOR, NOT)
- Modular addition
- Left rotations (by 5 and by 30 bits)
- A round-specific constant
The 80 rounds are divided into four stages, each using a different function:
- Rounds 0-19: (B AND C) OR (NOT B AND D)
- Rounds 20-39: B XOR C XOR D
- Rounds 40-59: (B AND C) OR (B AND D) OR (C AND D)
- Rounds 60-79: B XOR C XOR D
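The steps above fit in a short Python sketch. It is meant for reading, not production use. The initial values and round constants are not listed in this post; they are the ones specified in FIPS 180-4. The name `sha1_hex_from_scratch` is hypothetical, and the result is cross-checked against `hashlib`.

```python
import hashlib
import struct

MASK = 0xFFFFFFFF

def rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def sha1_hex_from_scratch(message: bytes) -> str:
    # Initialization: five 32-bit state words (constants from FIPS 180-4).
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    # Padding: 0x80 byte, zeros, then the original length in bits (64-bit big-endian).
    bit_length = len(message) * 8
    message += b"\x80"
    message += b"\x00" * ((56 - len(message) % 64) % 64)
    message += struct.pack(">Q", bit_length)

    # Processing: one 512-bit (64-byte) block at a time.
    for offset in range(0, len(message), 64):
        w = list(struct.unpack(">16I", message[offset:offset + 64]))
        # Message schedule: expand 16 words to 80 (this is where SHA-1 adds a rotation over SHA-0).
        for t in range(16, 80):
            w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))

        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (rotl(a, 5) + f + e + k + w[t]) & MASK
            a, b, c, d, e = temp, a, rotl(b, 30), c, d

        # Add this block's result into the running state.
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e))]

    # Output: concatenate the five state words into 160 bits.
    return "".join(f"{word:08x}" for word in h)

sample = b"The quick brown fox jumps over the lazy dog"
print(sha1_hex_from_scratch(sample) == hashlib.sha1(sample).hexdigest())
```

The running state `h` is the only thing carried from one block to the next, which is the property the collision attacks described later rely on.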
The Glory Days
SHA-1 became the dominant hash function of the internet age:
- SSL/TLS certificates: The entire web PKI relied on SHA-1 signatures
- Code signing: Microsoft, Apple, and others used SHA-1 to verify software authenticity
- Git: Every commit, tree, and blob is identified by its SHA-1 hash
- PGP/GPG: Email encryption and signing
- IPsec: VPN tunnels used HMAC-SHA1
- SSH: Host key fingerprints
- Digital signatures: DSA originally required SHA-1
By 2010, SHA-1 was processing billions of operations daily across the global internet infrastructure.
The Fall: A Timeline of Attacks
2005: The First Cracks
Xiaoyun Wang—the same cryptographer who broke MD5—struck again. Her team published a theoretical attack reducing SHA-1 collision complexity from 2^80 to 2^69.
While 2^69 operations was still impractical in 2005, the implications were severe:
- The attack would only get better as researchers refined it
- Moore's Law would make 2^69 achievable within years
- SHA-1's security margin had evaporated
NIST immediately recommended transitioning to SHA-2, but the internet moved slowly.
2011: NIST Deprecation
NIST officially deprecated SHA-1 for digital signatures in SP 800-131A:
"SHA-1 shall not be used for digital signature generation after December 31, 2013."
Despite this, SHA-1 remained widely deployed. Certificate authorities continued issuing SHA-1 certificates, and major platforms kept accepting them.
2015: The Freestart Collision
Marc Stevens and others demonstrated a "freestart collision" in the SHA-1 compression function. This wasn't a full SHA-1 collision (it required control over the initial state), but it proved the theoretical attacks were practical.
The researchers estimated a full collision would cost $75,000-$120,000 using cloud computing.
2017: SHAttered
On February 23, 2017, researchers from Google and CWI Amsterdam announced SHAttered: the first practical SHA-1 collision.
They created two different PDF files with identical SHA-1 hashes:
SHA1(pdf1) = 38762cf7f55934b34d179ae6a4c80cadccbb7f0a
SHA1(pdf2) = 38762cf7f55934b34d179ae6a4c80cadccbb7f0a
The PDFs displayed completely different content (one with a blue background, one with a red background), yet were cryptographically "identical" according to SHA-1.
The numbers behind SHAttered:
- 9,223,372,036,854,775,808 SHA-1 computations (9 quintillion)
- 6,500 CPU-years of computation
- 110 GPU-years of computation
- Equivalent cost: ~$110,000 in cloud computing
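To put the first figure in context, a short arithmetic check helps. The computation count is exactly 2^63, and comparing it with the 2^80 birthday bound shows how much work the attack saved over brute force. The variable names are illustrative only.

```python
shattered_work = 2 ** 63      # SHA-1 computations reported for SHAttered
brute_force_work = 2 ** 80    # generic birthday-bound collision search

print(f"{shattered_work:,}")                 # 9,223,372,036,854,775,808
print(brute_force_work // shattered_work)    # 131072, i.e. 2**17
```

In other words, the attack was more than 100,000 times cheaper than a generic collision search.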
While expensive, this was within reach of nation-states, large corporations, and well-funded criminal organizations. And the cost would only decrease.
How SHAttered Worked
The attack exploited SHA-1's Merkle-Damgård construction:
- Both PDFs share an identical prefix (the PDF header and some structure)
- The collision blocks are inserted: different bytes that produce the same intermediate hash state
- Both PDFs share an identical suffix (the remaining content)
Because SHA-1 processes blocks sequentially, once two messages reach the same internal state, appending the same suffix to both yields the same final hash, whatever that suffix contains.
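The researchers placed the collision blocks inside the PDF's embedded image data. The differing bytes change how a viewer parses the image data that follows, so each file displays a different image even though everything after the collision blocks is byte-for-byte identical and the hashes match.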
PDF Structure:
┌─────────────┐
│ PDF Header │ ← Identical in both files
├─────────────┤
│ Collision │ ← Different bytes, same hash state
│ Block │
├─────────────┤
│ Image Data │ ← Same bytes in both; shown as blue or red
├─────────────┤
│ PDF Footer │ ← Identical in both files
└─────────────┘
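The prefix / collision block / suffix layout can be reproduced on a toy scale. The sketch below is not SHA-1 and not the SHAttered technique: it builds a made-up Merkle-Damgård hash with a 16-bit state (small enough to collide by brute force) and shows that once two blocks collide after a shared prefix, any shared suffix preserves the collision. All names and parameters here are assumptions for illustration.

```python
import hashlib
from itertools import count

BLOCK = 8            # toy block size in bytes
IV = b"\x00\x00"     # toy 16-bit initial state

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: 16-bit state in, 16-bit state out."""
    return hashlib.sha1(state + block).digest()[:2]

def absorb(state: bytes, data: bytes) -> bytes:
    """Feed block-aligned data through the compression function."""
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def toy_hash(message: bytes) -> str:
    """Merkle-Damgård hash with length padding, like SHA-1 but tiny."""
    padded = message + b"\x80"
    padded += b"\x00" * (-len(padded) % BLOCK)
    padded += len(message).to_bytes(BLOCK, "big")
    return absorb(IV, padded).hex()

def find_block_collision(state: bytes):
    """Brute-force two different blocks that map `state` to the same next state."""
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block

prefix = b"%TOY-1.0"                       # exactly one block, shared by both files
block_a, block_b = find_block_collision(absorb(IV, prefix))
suffix = b"shared trailing content of any length"

file_a = prefix + block_a + suffix
file_b = prefix + block_b + suffix
print(file_a != file_b, toy_hash(file_a) == toy_hash(file_b))
```

Changing `suffix` to anything else keeps the two hashes equal, because both files enter the suffix with the same internal state and the same total length.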
2020: SHA-1 is a Shambles
Researchers went further with a chosen-prefix collision—the more dangerous variant that allows attackers to start with arbitrary, meaningful prefixes.
The "SHA-1 is a Shambles" paper demonstrated:
- Chosen-prefix collisions in 2^63.4 operations
- Cost: ~$45,000 (down from SHAttered's $110,000)
- Practical PGP/GPG key impersonation attacks
The attack produced two PGP keys with different user IDs whose certification signatures collide under SHA-1, so a signature issued for one key is also valid for the other, allowing an attacker to impersonate a victim.
Why SHAttered Matters
"So they found two PDFs with the same hash. Who cares?"
Here's why it matters:
1. Certificate Forgery
Before 2017, many certificate authorities still issued SHA-1 certificates. An attacker could:
- Create a legitimate certificate request
- Craft a CA certificate with the same SHA-1 hash
- Get the legitimate request signed
- Use that signature on the CA certificate
- Issue arbitrary "trusted" certificates
2. Software Supply Chain
If a software repository uses SHA-1 for integrity:
- Submit legitimate software for review
- Create malware with the same SHA-1
- After approval, distribute the malware
- Users verify the hash—it matches!
3. Version Control Attacks
Git identifies everything by SHA-1:
- What if two different commits have the same hash?
- What if malicious code and legitimate code hash identically?
- An attacker could potentially poison repositories
Linus Torvalds initially downplayed the Git risk (Git uses SHA-1 for content addressing, not security), but the project has been migrating to SHA-256.
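For context on what "identified by its SHA-1 hash" means in Git: a blob's ID is the SHA-1 of a short header (`blob`, the content length, a NUL byte) followed by the file content. A minimal sketch, with the helper name `git_blob_id` chosen for illustration:

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content (SHA-1 repositories)."""
    header = f"blob {len(content)}\0".encode()
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))                 # ID of the empty blob
print(git_blob_id(b"hello world\n"))
```

The output should match `git hash-object` on a file with the same bytes. Because the header includes the length, a raw SHA-1 collision between two files does not automatically collide as Git blobs; the colliding inputs would have to be built with the header in place.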
4. Legal Document Fraud
Digitally signed contracts using SHA-1 could be swapped:
- Create two contracts with the same hash
- Get the favorable one signed
- Substitute the malicious version
- The signature still validates
Current Status: Dead for Security
SHA-1 is now universally deprecated for cryptographic purposes:
| Organization | Action |
|---|---|
| NIST | Deprecated 2011, disallowed for signatures 2013 |
| CA/Browser Forum | Banned SHA-1 certificates from January 2016 |
| Google Chrome | Warnings from 2016, blocked from 2017 |
| Mozilla Firefox | Rejected SHA-1 certificates from 2017 |
| Microsoft | Blocked SHA-1 certificates in Edge/IE from 2017 |
| Apple | Rejected SHA-1 certificates from 2017 |
| Git | Migrating to SHA-256 (ongoing) |
What About Preimage Resistance?
Like MD5, SHA-1's collision resistance is broken, but its preimage resistance remains intact in practice. No known attack on full SHA-1 finds a preimage meaningfully faster than brute force, which takes about 2^160 operations and is computationally infeasible.
This means:
- Broken: Finding two messages with the same hash
- Not broken: Reversing a hash to find the original message
However, broken collision resistance is enough to retire SHA-1 from all security applications.
What Should You Use Instead?
| Purpose | Recommended |
|---|---|
| General hashing | SHA-256, SHA-3, BLAKE3 |
| Digital signatures | SHA-256 or SHA-3 |
| Code signing | SHA-256 |
| Certificates | SHA-256 (mandatory since 2016) |
| Password hashing | Argon2, bcrypt, scrypt (NOT any SHA) |
| Git | SHA-256 (migration in progress) |
| HMAC | HMAC-SHA256 |
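In Python, the general-hashing and HMAC rows of the table map directly onto the standard library. This is a minimal sketch; the helper names are illustrative, and the key shown is a placeholder, not a recommendation for key handling.

```python
import hashlib
import hmac

def sha256_hex(data: bytes) -> str:
    """General-purpose hashing: SHA-256 instead of SHA-1."""
    return hashlib.sha256(data).hexdigest()

def hmac_sha256_hex(key: bytes, message: bytes) -> str:
    """Keyed authentication tag: HMAC-SHA256 instead of HMAC-SHA1."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Compare tags in constant time to avoid leaking where they differ."""
    return hmac.compare_digest(hmac_sha256_hex(key, message), tag_hex)

key = b"placeholder-key-for-illustration"
tag = hmac_sha256_hex(key, b"release-1.2.3.tar.gz")
print(sha256_hex(b"abc"))
print(verify_tag(key, b"release-1.2.3.tar.gz", tag))
```

Password hashing is deliberately absent from this sketch: as the table says, it calls for a dedicated function such as Argon2, bcrypt, or scrypt, not a bare hash.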
Where SHA-1 Still Lurks
Despite deprecation, SHA-1 persists in legacy systems:
Still Acceptable (Non-Security)
- HMAC-SHA1: When used with a secret key, the collision weakness doesn't apply directly. Still, migration to HMAC-SHA256 is recommended.
- Non-cryptographic checksums: Detecting accidental corruption (not malicious tampering)
- Legacy identifiers: Old Git commits, historical records
Not Acceptable
- Digital signatures
- Certificate signing
- New cryptographic protocols
- Integrity verification against malicious actors
- Password storage (never was acceptable)
Lessons from SHA-1
1. Government Approval ≠ Security
SHA-1 was designed by the NSA and approved by NIST. It still fell. Cryptographic standards require continuous evaluation.
2. Deprecation Takes Forever
NIST deprecated SHA-1 in 2011. SHAttered happened in 2017. Six years of warnings, and SHA-1 was still everywhere when it finally broke.
3. Theoretical Attacks Become Practical
Wang's 2005 attack was "theoretical." Twelve years later, it was $110,000. Three years after that, $45,000. Theoretical attacks are early warnings, not false alarms.
4. Collision Resistance is Fragile
Both MD5 and SHA-1 fell to collision attacks while their preimage resistance held. When designing systems, assume collision resistance will break first.
5. Cryptographic Agility is Essential
Systems that hardcoded SHA-1 (like Git) faced painful migrations. Design systems to swap algorithms without major rewrites.
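One common way to get this agility is to store the algorithm name alongside every digest, so that a new algorithm can be introduced without invalidating old records. The sketch below is a hypothetical design, not how Git or any particular system does it. The `name:hexdigest` format, the allow-list, and the function names are all assumptions.

```python
import hashlib
import hmac

ALLOWED_ALGORITHMS = ("sha256", "sha3_256")   # assumed policy; SHA-1 is deliberately absent
DEFAULT_ALGORITHM = "sha256"

def tagged_digest(data: bytes, algorithm: str = DEFAULT_ALGORITHM) -> str:
    """Return a self-describing digest such as 'sha256:ab12...'."""
    if algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"algorithm not allowed: {algorithm}")
    return f"{algorithm}:{hashlib.new(algorithm, data).hexdigest()}"

def verify_tagged(data: bytes, tagged: str) -> bool:
    """Verify data against a tagged digest, rejecting algorithms outside the allow-list."""
    algorithm, _, expected = tagged.partition(":")
    if algorithm not in ALLOWED_ALGORITHMS:
        return False
    actual = hashlib.new(algorithm, data).hexdigest()
    return hmac.compare_digest(actual, expected)

record = tagged_digest(b"important payload")
print(record.split(":")[0], verify_tagged(b"important payload", record))
```

Retiring an algorithm then means removing it from the allow-list and re-hashing stored data, not changing every call site that assumed a fixed digest length.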
Conclusion
SHA-1 served the internet well for over 15 years. Its fall wasn't sudden—it was a slow-motion collapse that the cryptographic community saw coming for a decade.
The lesson isn't that SHA-1 was poorly designed. In 1995, it was solid. The lesson is that cryptography exists in an adversarial environment where attacks only improve. SHA-1's 160-bit output seemed generous in 1995. By 2017, it was barely enough to buy time.
Today, SHA-256 and SHA-3 stand where SHA-1 once stood. They're stronger, but they're not eternal. When their time comes—and it will—we'll need to be ready to migrate. Again.
References
- NIST FIPS 180-4. Secure Hash Standard
- Wang, X., et al. (2005). Finding Collisions in the Full SHA-1
- Stevens, M., et al. (2017). SHAttered: The First Practical SHA-1 Collision
- Leurent, G., Peyrin, T. (2020). SHA-1 is a Shambles
- NIST SP 800-131A Rev 2. Transitioning the Use of Cryptographic Algorithms
- CA/Browser Forum. Ballot 118 - SHA-1 Sunset
- Google Security Blog. Announcing the first SHA1 collision