SHA-1: From Hero to Zero

For two decades, SHA-1 was the backbone of internet security—protecting SSL certificates, signing software, and verifying Git commits. Then in 2017, Google shattered it. Literally.

This post traces SHA-1's journey from government-approved standard to cryptographic casualty, culminating in the SHAttered attack that proved its collision resistance was broken in practice.

What is SHA-1?

SHA-1 (Secure Hash Algorithm 1) is a cryptographic hash function that produces a 160-bit (20-byte) hash value, typically rendered as a 40-character hexadecimal string:

SHA1("Hello, World!") = 0a0a9f2a6772942557ab5355d76af442f8f65e01
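
To reproduce a digest like this yourself, Python's standard hashlib module is enough. This is a minimal sketch; the helper name sha1_hex is an illustrative assumption, not part of any library.

```python
import hashlib

def sha1_hex(message: bytes) -> str:
    """Return the SHA-1 digest of `message` as a 40-character hex string."""
    return hashlib.sha1(message).hexdigest()

digest = sha1_hex(b"Hello, World!")
print(digest)
print(len(digest) * 4)  # bits in the digest: 40 hex characters * 4 bits each
```

Note that the input is bytes, not text: the same string encoded differently (UTF-8 vs UTF-16, with or without a trailing newline) produces a completely different digest.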

Like all cryptographic hash functions, SHA-1 was designed to provide:

Preimage resistance: Given a hash, you can't find a message that produces it
Second preimage resistance: Given a message, you can't find another message with the same hash
Collision resistance: You can't find ANY two messages with the same hash

SHA-1's 160-bit output gives a generic collision resistance of about 2^80 operations (the birthday bound: roughly 2^(n/2) work for an n-bit hash). In 1995, that seemed like plenty. It wasn't.
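
The birthday effect is easy to see on a shrunken hash. The sketch below truncates SHA-1 to 32 bits (an assumption made purely so the search finishes in a moment) and hashes distinct messages until two collide. The function names are illustrative.

```python
import hashlib
from itertools import count

def truncated_sha1(message: bytes, bits: int) -> int:
    """Return the top `bits` bits of SHA-1(message) as an integer."""
    full = int.from_bytes(hashlib.sha1(message).digest(), "big")
    return full >> (160 - bits)

def birthday_collision(bits: int):
    """Hash distinct messages until two share the same truncated digest."""
    seen = {}
    for i in count():
        msg = str(i).encode()
        tag = truncated_sha1(msg, bits)
        if tag in seen:
            # Two different messages plus the number of hashes computed so far.
            return seen[tag], msg, i + 1
        seen[tag] = msg

m1, m2, tries = birthday_collision(32)
print(m1, m2, tries)
```

For a 32-bit output the search typically needs on the order of 2^16 hashes rather than 2^32. Scale the same square-root behaviour up to 160 bits and you get the 2^80 figure.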

The Birth of SHA-1

SHA-1 was designed by the National Security Agency (NSA) and published by NIST in 1995 as FIPS PUB 180-1.

It was a revision of SHA-0 (published in 1993), which the NSA quietly withdrew due to an undisclosed "weakness." SHA-1 added a single left-rotation operation that significantly strengthened the algorithm—though the NSA never explained why this change was necessary.

The SHA Family Tree

SHA-0 (1993) — Withdrawn, weakness found
    ↓
SHA-1 (1995) — Subject of this post
    ↓
SHA-2 (2001) — SHA-224, SHA-256, SHA-384, SHA-512
    ↓
SHA-3 (2015) — Completely different design (Keccak)

How SHA-1 Works (Simplified)

SHA-1 processes messages in 512-bit blocks:

Padding: Message is padded to a multiple of 512 bits
Initialization: Five 32-bit state variables (h0-h4) are set to fixed constants
Processing: Each block goes through 80 rounds of operations in 4 groups of 20
Output: Final state variables are concatenated into a 160-bit hash
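
The padding step is the easiest one to get subtly wrong, so here is a small sketch of it following FIPS 180-4: append a single 1 bit, zero-fill, then append the original length in bits as a 64-bit big-endian integer. The name sha1_pad is an illustrative assumption.

```python
import struct

def sha1_pad(message: bytes) -> bytes:
    """Pad `message` to a multiple of 64 bytes (512 bits) the way SHA-1 does."""
    bit_length = len(message) * 8
    padded = message + b"\x80"                      # a 1 bit followed by seven 0 bits
    padded += b"\x00" * ((56 - len(padded)) % 64)   # zero-fill to 56 bytes mod 64
    padded += struct.pack(">Q", bit_length)         # original length, 64-bit big-endian
    return padded

print(len(sha1_pad(b"abc")))       # 64
print(len(sha1_pad(b"a" * 56)))    # 128
```

A 56-byte message spills into a second block because the 0x80 byte and the 8-byte length no longer fit in the first one.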

Each round combines:

Bitwise operations (AND, OR, XOR, NOT)
Modular addition
Left rotations (by 5 and 30 bits inside the round, and by 1 bit in the message schedule)
A round-specific constant

The 80 rounds are divided into four stages, each using a different function:

Rounds 0-19: (B AND C) OR (NOT B AND D)
Rounds 20-39: B XOR C XOR D
Rounds 40-59: (B AND C) OR (B AND D) OR (C AND D)
Rounds 60-79: B XOR C XOR D
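
Putting the pieces together, the whole algorithm fits in a few dozen lines. The following is an educational sketch only, with constants taken from FIPS 180-4; the function names are illustrative, it repeats the padding step so the block stands alone, and real code should call a vetted library such as hashlib instead.

```python
import struct

def _rotl(x: int, n: int) -> int:
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1(message: bytes) -> str:
    """Educational SHA-1; returns the digest as a hex string."""
    # Initial state h0-h4.
    h = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0]

    # Padding: 0x80, zeros up to 56 bytes mod 64, then the 64-bit bit length.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)

    for offset in range(0, len(padded), 64):
        # Message schedule: 16 words from the block, expanded to 80.
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for t in range(16, 80):
            w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))

        a, b, c, d, e = h
        for t in range(80):
            if t < 20:
                f, k = (b & c) | (~b & d), 0x5A827999
            elif t < 40:
                f, k = b ^ c ^ d, 0x6ED9EBA1
            elif t < 60:
                f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
            else:
                f, k = b ^ c ^ d, 0xCA62C1D6
            temp = (_rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
            a, b, c, d, e = temp, a, _rotl(b, 30), c, d

        # Add this block's result into the running state.
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]

    return "".join(f"{x:08x}" for x in h)

print(sha1(b"abc"))  # a9993e364706816aba3e25717850c26c9cd0d89d (FIPS 180 test vector)
```

The 1-bit rotation in the message schedule is the single change that separates SHA-1 from SHA-0.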

The Glory Days

SHA-1 became the dominant hash function of the internet age:

SSL/TLS certificates: The entire web PKI relied on SHA-1 signatures
Code signing: Microsoft, Apple, and others used SHA-1 to verify software authenticity
Git: Every commit, tree, and blob is identified by its SHA-1 hash
PGP/GPG: Email encryption and signing
IPsec: VPN tunnels used HMAC-SHA1
SSH: Host key fingerprints
Digital signatures: DSA originally required SHA-1

By 2010, SHA-1 was being computed billions of times a day across the global internet infrastructure.

The Fall: A Timeline of Attacks

2005: The First Cracks

Xiaoyun Wang—the same cryptographer who broke MD5—struck again. Her team published a theoretical attack reducing SHA-1 collision complexity from 2^80 to 2^69.

While 2^69 operations was still impractical in 2005, the implications were severe:

The attack would only get better as researchers refined it
Moore's Law would make 2^69 achievable within years
SHA-1's security margin had evaporated

NIST soon recommended transitioning to SHA-2, but the internet moved slowly.

2011: NIST Deprecation

NIST officially deprecated SHA-1 for digital signatures in SP 800-131A:

"SHA-1 shall not be used for digital signature generation after December 31, 2013."

Despite this, SHA-1 remained widely deployed. Certificate authorities continued issuing SHA-1 certificates, and major platforms kept accepting them.

2015: The Freestart Collision

Marc Stevens and others demonstrated a "freestart collision" in the SHA-1 compression function. This wasn't a full SHA-1 collision (it required control over the initial state), but it showed that the theoretical attacks were close to practical.

The researchers estimated a full collision would cost $75,000-$120,000 using cloud computing.

2017: SHAttered

On February 23, 2017, researchers from Google and CWI Amsterdam announced SHAttered: the first practical SHA-1 collision.

They created two different PDF files with identical SHA-1 hashes:

SHA1(pdf1) = 38762cf7f55934b34d179ae6a4c80cadccbb7f0a
SHA1(pdf2) = 38762cf7f55934b34d179ae6a4c80cadccbb7f0a
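
If you download the two PDFs, a check along these lines makes the point concrete. This is a sketch: digest_report is an illustrative helper, and the file names in the comment are hypothetical local paths.

```python
import hashlib

def digest_report(a: bytes, b: bytes) -> dict:
    """Compare two byte strings directly, under SHA-1, and under SHA-256."""
    return {
        "same_bytes": a == b,
        "sha1_match": hashlib.sha1(a).digest() == hashlib.sha1(b).digest(),
        "sha256_match": hashlib.sha256(a).digest() == hashlib.sha256(b).digest(),
    }

# With the two SHAttered files saved locally (hypothetical names):
#   with open("shattered-1.pdf", "rb") as f1, open("shattered-2.pdf", "rb") as f2:
#       print(digest_report(f1.read(), f2.read()))
# A colliding pair reports same_bytes=False, sha1_match=True, sha256_match=False.
print(digest_report(b"blue", b"red"))
```

The SHA-256 comparison is the useful part: a collision crafted for SHA-1 does not carry over to an unrelated hash function.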

The PDFs displayed completely different content (one with a blue background, one with a red background), yet were cryptographically "identical" according to SHA-1.

The numbers behind SHAttered:

9,223,372,036,854,775,808 SHA-1 computations (9 quintillion)
6,500 CPU-years of computation for the first phase
110 GPU-years of computation for the second phase
Equivalent cost: ~$110,000 in cloud computing
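
The headline figure is exactly 2^63, which makes the comparison with the generic 2^80 birthday bound a one-liner. A quick arithmetic check (variable names are illustrative):

```python
shattered_work = 2 ** 63   # SHA-1 computations reported for SHAttered
birthday_bound = 2 ** 80   # generic collision search on a 160-bit hash

print(f"{shattered_work:,}")              # 9,223,372,036,854,775,808
print(birthday_bound // shattered_work)   # 131072
```

In other words, the attack was roughly 100,000 times cheaper than brute force.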

While expensive, this was within reach of nation-states, large corporations, and well-funded criminal organizations. And the cost would only decrease.

How SHAttered Worked

The attack exploited SHA-1's Merkle-Damgård construction:

Both PDFs share an identical prefix (the PDF header and some structure)
The collision blocks are inserted: different bytes that produce the same intermediate hash state
Both PDFs share an identical suffix (the remaining content)

Because SHA-1 processes blocks sequentially, once two equal-length messages reach the same internal state, appending the same data to both yields the same final hash.

The researchers placed the collision blocks inside a JPEG image embedded in the PDF. The differing bytes change how a viewer parses the image data that follows, so the two files render different pictures even though everything after the collision blocks is byte-for-byte identical.

PDF Structure:
┌─────────────┐
│ PDF Header  │ ← Identical in both files
├─────────────┤
│ Collision   │ ← Different bytes, same hash state
│ Block       │
├─────────────┤
│ Image Data  │ ← Identical bytes, parsed differently (blue vs red)
├─────────────┤
│ PDF Footer  │ ← Identical in both files
└─────────────┘
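
The prefix / collision block / suffix structure can be demonstrated on a toy Merkle-Damgård hash. Everything here is an assumption made for illustration: the 24-bit state and 8-byte blocks are deliberately tiny so a birthday search finds colliding blocks in a moment, whereas the real attack needed cryptanalysis, not brute force.

```python
import hashlib
from itertools import count

BLOCK = 8   # toy block size in bytes
STATE = 3   # toy chaining state in bytes (24 bits)

def compress(state: bytes, block: bytes) -> bytes:
    """Toy compression function: truncated SHA-1 of state || block."""
    return hashlib.sha1(state + block).digest()[:STATE]

def toy_hash(message: bytes) -> bytes:
    """Toy Merkle-Damgard hash with length padding (NOT secure)."""
    data = message + b"\x80"
    data += b"\x00" * (-(len(data) + 4) % BLOCK) + len(message).to_bytes(4, "big")
    state = b"\x00" * STATE
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def find_collision_blocks(prefix: bytes):
    """Birthday-search two different blocks that collide after a shared prefix."""
    assert len(prefix) % BLOCK == 0
    state = b"\x00" * STATE
    for i in range(0, len(prefix), BLOCK):
        state = compress(state, prefix[i:i + BLOCK])
    seen = {}
    for n in count():
        block = n.to_bytes(BLOCK, "big")
        out = compress(state, block)
        if out in seen:
            return seen[out], block
        seen[out] = block

prefix = b"%TOY-1.0"                       # shared 8-byte header
b1, b2 = find_collision_blocks(prefix)     # the differing "collision blocks"
suffix = b" any shared trailer works"      # identical in both messages
m1, m2 = prefix + b1 + suffix, prefix + b2 + suffix
print(m1 != m2, toy_hash(m1) == toy_hash(m2))  # True True
```

Once the chaining state matches after the colliding blocks, any identical suffix keeps the two hashes equal, which is why the attackers were free to choose the rest of the file.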

2020: SHA-1 is a Shambles

Researchers went further with a chosen-prefix collision—the more dangerous variant that allows attackers to start with arbitrary, meaningful prefixes.

The "SHA-1 is a Shambles" paper demonstrated:

Chosen-prefix collisions in 2^63.4 operations
Cost: ~$45,000 (down from SHAttered's $110,000)
Practical PGP/GPG key impersonation attacks

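The attack produced two PGP keys with different user IDs whose certification data collide under SHA-1, so a trust signature issued on one key also validates on the other, letting an attacker impersonate a victim in the web of trust.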

Why SHAttered Matters

"So they found two PDFs with the same hash. Who cares?"

Here's why it matters:

1. Certificate Forgery

Until the 2016 sunset, certificate authorities still issued SHA-1 certificates. An attacker able to produce chosen-prefix collisions could:

Create a legitimate certificate request
Craft a CA certificate with the same SHA-1 hash
Get the legitimate request signed
Use that signature on the CA certificate
Issue arbitrary "trusted" certificates

2. Software Supply Chain

If a software repository uses SHA-1 for integrity:

Craft a benign build and a malicious build with the same SHA-1
Submit the benign build for review
After approval, distribute the malicious build
Users verify the hash, and it matches!

3. Version Control Attacks

Git identifies everything by SHA-1:

What if two different commits have the same hash?
What if malicious code and legitimate code produce the same hash?
An attacker could potentially poison repositories

Linus Torvalds initially downplayed the Git risk (Git uses SHA-1 for content addressing, not security), but the project has been migrating to SHA-256.
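
To see what "identifies everything by SHA-1" means in practice, here is a sketch of how Git derives a blob's ID in a SHA-1 repository: it hashes a short header (object type, content length, a NUL byte) followed by the file content. The helper name git_blob_id is illustrative.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode() + b"\x00"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391, Git's empty blob
```

Because the header is part of the hashed data, a raw SHA-1 collision between two files does not automatically give two colliding Git blobs; an attacker would have to build the collision with the header in place.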

4. Legal Document Fraud

Digitally signed contracts using SHA-1 could be swapped:

Create two contracts with the same hash
Get the favorable one signed
Substitute the malicious version
The signature still validates

Current Status: Dead for Security

SHA-1 is now universally deprecated for cryptographic purposes:

Organization	Action
NIST	Deprecated 2011, disallowed for signatures 2013
CA/Browser Forum	Banned SHA-1 certificates from January 2016
Google Chrome	Warnings from 2016, blocked from 2017
Mozilla Firefox	Rejected SHA-1 certificates from 2017
Microsoft	Blocked SHA-1 certificates in Edge/IE from 2017
Apple	Rejected SHA-1 certificates from 2017
Git	Migrating to SHA-256 (ongoing)

What About Preimage Resistance?

Like MD5, SHA-1's collision resistance is broken, but its preimage resistance remains intact. No known preimage attack on the full function does meaningfully better than the generic 2^160 brute-force search, which is computationally infeasible.

This means:

Broken: Finding two messages with the same hash
Not broken: Reversing a hash to find the original message

However, broken collision resistance is reason enough to retire SHA-1 from signatures, certificates, and any other application that relies on hashes being unique.

What Should You Use Instead?

Purpose	Recommended
General hashing	SHA-256, SHA-3, BLAKE3
Digital signatures	SHA-256 or SHA-3
Code signing	SHA-256
Certificates	SHA-256 (mandatory since 2016)
Password hashing	Argon2, bcrypt, scrypt (NOT any SHA)
Git	SHA-256 (migration in progress)
HMAC	HMAC-SHA256
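
As a rough illustration of two rows from the table, the sketch below verifies a download against a published SHA-256 digest and derives a password hash with scrypt, both from Python's standard library. The function names and the scrypt parameters (n=2**14, r=8, p=1) are illustrative assumptions, not tuned recommendations.

```python
import hashlib
import hmac
import os

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large downloads need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_download(path: str, expected_hex: str) -> bool:
    """Check a file against a published SHA-256 digest."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())

def hash_password(password: str) -> tuple:
    """Derive a storage key with scrypt and a fresh random salt."""
    salt = os.urandom(16)
    key = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return salt, key

def check_password(password: str, salt: bytes, key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**14, r=8, p=1, dklen=32)
    return hmac.compare_digest(candidate, key)
```

The split matters: fast hashes like SHA-256 are right for integrity checks, while password storage wants a deliberately slow, salted function.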

Where SHA-1 Still Lurks

Despite deprecation, SHA-1 persists in legacy systems:

Still Acceptable (Non-Security)

HMAC-SHA1: When used with a secret key, the collision weakness doesn't apply directly. Still, migration to HMAC-SHA256 is recommended.
Non-cryptographic checksums: Detecting accidental corruption (not malicious tampering)
Legacy identifiers: Old Git commits, historical records
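
For new code, the HMAC migration is usually a one-argument change. A minimal sketch using Python's hmac module, where the helper names sign and verify are illustrative:

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Return an HMAC-SHA256 tag for `message` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Check an HMAC-SHA256 tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, message), tag_hex)

tag = sign(b"secret-key", b"payload")
print(verify(b"secret-key", b"payload", tag))    # True
print(verify(b"secret-key", b"tampered", tag))   # False
```

An attacker who doesn't know the key can't run the offline collision search that broke plain SHA-1, which is why HMAC-SHA1 has held up better than bare SHA-1.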

Not Acceptable

Digital signatures
Certificate signing
New cryptographic protocols
Integrity verification against malicious actors
Password storage (never was acceptable)

Lessons from SHA-1

1. Government Approval ≠ Security

SHA-1 was designed by the NSA and approved by NIST. It still fell. Cryptographic standards require continuous evaluation.

2. Deprecation Takes Forever

NIST deprecated SHA-1 in 2011. SHAttered happened in 2017. Six years of warnings, and SHA-1 was still everywhere when it finally broke.

3. Theoretical Attacks Become Practical

Wang's 2005 attack was "theoretical." Twelve years later, it was $110,000. Three years after that, $45,000. Theoretical attacks are early warnings, not false alarms.

4. Collision Resistance is Fragile

Both MD5 and SHA-1 fell to collision attacks while their preimage resistance held. When designing systems, assume collision resistance will break first.

5. Cryptographic Agility is Essential

Systems that hardcoded SHA-1 (like Git) faced painful migrations. Design systems to swap algorithms without major rewrites.
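
One common way to get that agility is to store the algorithm name alongside every digest. The sketch below is one possible design, not a standard: the "name:hex" format, the registry, and the function names are all illustrative assumptions.

```python
import hashlib
import hmac

# Registry of allowed algorithms; retiring one means editing this table, not every caller.
ALGORITHMS = {
    "sha256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}
DEFAULT_ALGORITHM = "sha256"

def tagged_digest(data: bytes, algorithm: str = DEFAULT_ALGORITHM) -> str:
    """Return a digest prefixed with its algorithm name, e.g. 'sha256:ab12...'."""
    return f"{algorithm}:{ALGORITHMS[algorithm](data).hexdigest()}"

def verify_tagged(data: bytes, tagged: str) -> bool:
    """Verify data against a tagged digest, rejecting unknown or retired algorithms."""
    algorithm, _, expected = tagged.partition(":")
    if algorithm not in ALGORITHMS:
        return False
    actual = ALGORITHMS[algorithm](data).hexdigest()
    return hmac.compare_digest(actual, expected)

record = tagged_digest(b"release-1.0.tar.gz contents")
print(record.split(":")[0])                                      # sha256
print(verify_tagged(b"release-1.0.tar.gz contents", record))     # True
print(verify_tagged(b"anything", "sha1:" + "0" * 40))            # False
```

Old records keep verifying under the algorithm they were created with, new records use the current default, and dropping an algorithm is a one-line change to the registry.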

Conclusion

SHA-1 served the internet well for over 15 years. Its fall wasn't sudden—it was a slow-motion collapse that the cryptographic community saw coming for a decade.

The lesson isn't that SHA-1 was poorly designed. In 1995, it was solid. The lesson is that cryptography exists in an adversarial environment where attacks only improve. SHA-1's 160-bit output seemed generous in 1995. By 2017, it was barely enough to buy time.

Today, SHA-256 and SHA-3 stand where SHA-1 once stood. They're stronger, but they're not eternal. When their time comes—and it will—we'll need to be ready to migrate. Again.

References

NIST FIPS 180-4. Secure Hash Standard
Wang, X., et al. (2005). Finding Collisions in the Full SHA-1
Stevens, M., et al. (2017). The First Collision for Full SHA-1
Leurent, G., Peyrin, T. (2020). SHA-1 is a Shambles
NIST SP 800-131A Rev 2. Transitioning the Use of Cryptographic Algorithms
CA/Browser Forum. Ballot 118 - SHA-1 Sunset
Google Security Blog. Announcing the first SHA1 collision