CITS3007 lab 8 (week 10) – Passwords

Passwords are one of the oldest,¹ most widely used mechanisms for authenticating users – found everywhere from web apps and APIs, to operating systems and IoT devices. Other authentication methods have since become available (such as biometric logins and hardware tokens), but passwords still underpin access control for the vast majority of services.

However, using passwords alone is increasingly considered insecure. Modern best practice favours multi-factor authentication (MFA), where a password is combined with something the user has (like a phone or hardware key) or something they “are” (like a fingerprint). Relying on passwords alone leaves systems too easily vulnerable to “credential stuffing” (attackers re-using stolen passwords) and brute-force attacks.

The way we hash and store passwords has evolved significantly in the past few decades. Simple hashing algorithms like MD5 or SHA-1 are no longer considered acceptable for use in securing systems. Today, secure systems use slow, salted, key-stretching algorithms like bcrypt, scrypt, or Argon2 to make attacks computationally expensive.

Guidelines have also shifted. Prior to the late 2010s, typical advice for generating passwords was that they should:

Much of the burden was placed on users to come up with strong passwords, remember dozens of them, rotate them regularly, and never write them down.

The principle of psychological acceptability.

Psychological acceptability is one of Saltzer and Schroeder’s classic principles of secure system design, and suggests that security mechanisms should not make the system harder to use than necessary, and should fit naturally into how users think and behave.

But old-style password rules make passwords:

hard to remember: humans are bad at memorising lots of complex, arbitrary strings.
hard to type correctly: instead of being English or other natural language words, they contain digits and punctuation in unusual places.
tempting to reuse, or write down: if people are forced to change passwords frequently, or aren’t encouraged to use a password manager, they’ll likely reuse passwords with slight variations, or write the passwords down (e.g. on a sticky note affixed to the computer screen).

By 2017, many standards bodies, such as NIST, had abandoned these practices. They resulted in users:

In short, as the comic XKCD explains, these practices trained people to use passwords that are hard for humans to remember, but easy for computers to guess.

2 Modern user password best practice

The UK National Cyber Security Centre (NCSC) provides a helpful page of advice for system owners on modern best practices, which is worth reading through.

3 Cryptography exercises

3.1 Digital signatures and hash collisions

Visit the MD5 Collision Demo page, at https://www.mscs.dal.ca/~selinger/md5collision/. In lectures, we’ve noted that the MD5 hash algorithm has been considered cryptographically broken since 2004. It is vulnerable to collision attacks, where two different inputs produce the same hash value – thus, it should never be used for password hashing, digital signatures, or sensitive data integrity checks.

Follow the provided links on that page to two PostScript documents – the first a letter of recommendation to an intern, and the second, an order to grant the intern a security clearance. Download them (e.g. with wget), and confirm that they have the same MD5 hash. (The command md5 <somefile> will display the MD5 hash.)

What exactly is a digital signature? In essence, it’s an assertion of the form:

But how do we make such an assertion? We need a way of doing so which also allows the assertion to be easily checked. The trick is: rather than working with the whole document contents, we pass it through a hash function. This produces a short, fixed-size value that acts like a “fingerprint” of the document – changing even a single byte of the document will result in a radically different hash value.

When someone “signs” a document, what they actually sign is this hash value. So the claim they are making is really: “I, person A, signed a document whose hash is H.” For a good cryptographic hash function, it is easy to compute the hash of a document, but infeasible to find a different document with the same hash. That means the hash uniquely (for practical purposes) identifies the document.

When MD5 was “broken” in 2005, researchers showed it was possible to generate a collision (like the PostScript documents linked to) on a typical home PC in several hours. Today, generating an MD5 collision typically takes only minutes or second (and if an attacker has access to GPUs to do the computation, it is effectively instantaneous).⁴

3.2 Checking for password breaches

Do you feel safe entering your password into the site? Perhaps not. Fortunately, you don’t have to. Save the following in your development environment as passcheck.sh, and make it executable with chmod a+rx passcheck.sh:

The Pwned Passwords site relies on a system call “k-anonymity”⁵ – you never actually send your plaintext password to the website, or even the full hash. Rather, you just send a small prefix of your hash (in this case, the first 5 characters) to the service, and it returns a list of all the suffixes (in this case, the last 35 characters) of breached password hashes that start with that prefix. So a full password hash is never sent between you and the server, or seen by the server.

The prefix you send is far too short to uniquely identify your password’s hash – many different hashes will share the same prefix, so the server never sees enough information to reconstruct or recognise your specific password.

And you can then check, locally, “Does my password match any of them exactly?” Your query is effectively “hidden” amongst many possible candidates, so your actual password (or its full hash) is never revealed.

as output, indicating that an instance of the password “password” has been found over 52 million times in known data breaches. The pass-phrase “correct horse battery staple” had never been used before Randall Munroe used it in his comic Xkcd – how many times has it now been found in known data breaches?

4 Cryptography questions

See if you can answer the following questions, after reviewing the material on cryptography in the lectures.

Sample solution

If we run the commands, we get output like the following:

$ printf mypassword | md5sum
34819d7beeabb9260a5c854bc85b3e44  -

The first “word” of output is the actual hash; the “-” represents the name of the file being hashed (in this case, “-” represents standard input).

The hash is a sequence of hexadecimal digits, and represents 16 bytes (or 128 bits).

Each pair of characters in the original hash represents one byte, so if stored in C as an array of bytes, we could write it as follows:

  // fragment 1
  char somehash[] = {0x34, 0x81, 0x9d, 0x7b, 0xee, 0xab, 0xb9, 0x26,
                     0x0a, 0x5c, 0x85, 0x4b, 0xc8, 0x5b, 0x3e, 0x44};

Strings in C also allow us to use hexadecimal escape sequences, so we could also write the following:

  // fragment 2
  char somehash[] = "\x34\x81\x9d\x7b\xee\xab\xb9\x26"
                    "\x0a\x5c\x85\x4b\xc8\x5b\x3e\x44";

The difference is that in fragment 1, somehash is a “plain” array or buffer, of size 16 elements, but in fragment 2, somehash is a null-terminated C string, so the array will be of size 17.

Note that best practice suggests that in the above examples, we should specify an exact size for the array, rather than relying on it being implicitly defined, and should use an enum or #define to specify the size, rather than a magic number – so the first example becomes

  // fragment 1
  enum {
    HASH_SIZE = 16
  };

  char somehash[HASH_SIZE] = {
      0x34, 0x81, 0x9d, 0x7b, 0xee, 0xab, 0xb9, 0x26,
      0x0a, 0x5c, 0x85, 0x4b, 0xc8, 0x5b, 0x3e, 0x44
  };

Sample solution

Hash collisions occur because a hash function maps a very large (in fact, infinite) set of possible inputs (“all possible files”, for instance) to a fixed, finite set of outputs (the hash values). For MD5, the number of possible hash values is \(2^{128}\) – the output is always exactly 16 bytes, and that is how many different values there are in 16 bytes.

Since there are far more possible inputs than outputs, some different inputs must inevitably produce the same hash value. We can see this more clearly if we imagine writing a hash function (not a very good one!) which outputs a hash value of only one byte. The function should be able to take in any possible file – but always gives one of only 256 results; so there must be many files which give the same result.

This is an application of the pigeonhole principle in mathematics: if you have more “items” (inputs) than “boxes” (possible hashes), at least one box must contain more than one item. In fact, for any given hash value, there are not just a few but infinitely many inputs that produce it.

Sample solution

No, this is not a good cryptographic hash function.

The function \(h(x) = x \bmod 10000\) produces only 10,000 possible output values, regardless of how large or varied the inputs are. This is far too small for a hash function: collisions (different inputs producing the same hash) will be extremely frequent. In fact, any two inputs that differ by a multiple of 10,000 will always have the same hash, so collisions are trivial to construct.

A good cryptographic hash function should make it computationally infeasible to find two inputs with the same hash (collision resistance), or to find an input that produces a particular hash (preimage resistance). This function fails both requirements: given a hash value, it is easy to find many inputs that produce it, and collisions can be generated immediately.

It also lacks any notion of mixing or diffusion: small, predictable changes to the input lead to equally predictable changes in the output. Overall, it behaves more like a simple checksum than a secure hash, and is not suitable for cryptographic use.

Sample solution

Salting passwords prevents several common attacks on passwords.

If a password is used unsalted, directly as is, then every user in the world who happens to use the password “qwerty”, for instance, will have exactly the same hash for their password (assuming the same algorithm is used). If we used the MD5 hashing algorithm,⁶ then every user who uses the password “qwerty” will get the hash d8578edf8458ce06fbc5bb76a58c5ca4:

$ printf qwerty | md5sum
d8578edf8458ce06fbc5bb76a58c5ca4  -

That means if an attacker happens to get hold of the list of hashed passwords, it’s extremely easy for them to find out what the user’s password is – despite the fact that hashes are “difficult to reverse”.

The attacker knows that many people choose very common passwords and that “qwerty” is one of these, and that the MD5 hash of “qwerty” is d8578edf8458ce06fbc5bb76a58c5ca4. So if the attacker has a list of the hashes of common passwords, they’ll easily recognize them whenever they appear.

Adding a random salt to the password destroys this straightforward correspondence between password and hash.

In a bit more detail – how could attackers try to make use of a list of leaked, but hashed passwords? Let’s assume an attacker has access to a list of hashes for 100,000 of our customers, and wants to obtain the original passwords. There are a few options, depending on whether the hashes are salted or unsalted, and what kind of hash algorithm we used.

Full brute-force with real-time hashing (fast hash algorithm and unsalted hashes only)

Modern GPUs can compute billions of hashes per second using fast hash algorithms like MD5 (e.g., ~67 billion/sec on a mid-range 5-year-old NVIDIA GPU).⁷ This sounds fast, and can be used to brute-force short passwords that use these algorithms with no salt.

But once we consider all alphanumeric passwords up to length 8, that’s roughly 200 trillion passwords,⁸ which would take over 8 hours for one full scan. For large-scale attacks (e.g., cracking 100,000 user hashes), this becomes practically impossible due to enormous total time and cost. So attackers don’t normally rely on pure brute force.

Salts are combined with the password (e.g. concatenated or prepended) before hashing, so they effectively expand the password space (e.g. just a 1-byte salt prepended to our \(\leq\) length 8 passwords expands the password space from 200 trillion to around 5 quadrillion). This makes pure brute force infeasible in most real-world cases.

(And as we know, modern systems shouldn’t be using fast hash algorithms like MD5, but dedicated slow password hashing algorithms like bcrypt, scrypt and Argon2.)

tl;dr: Only feasible for very short, unsalted passwords using fast hash functions like MD5.

Precomputed hashes (simple in-memory lookup table – only for fast hash algorithms, with small hash size, and unsalted hashes)

Leaked lists exist of, say, 10 million popular plaintext passwords (e.g. the RockYou list⁹). Assume each password–hash pair includes the plaintext (say, 10 bytes on average – true for the RockYou list) and an MD5 hash (16 bytes), total storage is then around 300 MB – easily small enough to fit into RAM. This allows for extremely fast hash-to-password lookups via a simple table.

The hashes need to be precomputed, but that doesn’t take long – at 67 billions of hashes per second, it takes less than a millisecond.

The approach breaks down completely, however, if the stored hashes are salted. Each user’s hash depends on both the password and a unique salt. For 10,000 users with unique salts, that’s effectively 10,000 versions of the 300 MB table – about 3 terabytes total. (Even more with larger dictionaries or hash algorithms that produce longer bytestrings. SHA-1 produces 20-byte output. Modern algorithms like scrypt, bcrypt and Argon2 are configurable, but typically are used to produce 32 byte hashes, so 6 terabytes total; and the time to precompute the hashes would be much longer, too.)

tl;dr: Very efficient for unsalted hashes, generated from fast hash algorithms which produce smallish-sized output (e.g. MD5), but completely undermined by salting.

Rainbow tables (in-memory tables with space–time trade-offs)

Rainbow tables are a clever optimisation of precomputed hash attacks. Rather than storing every possible password–hash pair, they perform additional calculations at runtime, but can reduce storage requirements drastically. (Wikipedia gives a fuller explanation, for those who are interested.)

For hash algorithms like MD5 or SHA-1, rainbow tables can compress the entire space of all short (e.g. \(\leq 8\) character) passwords into a few gigabytes, making it feasible to load the table into RAM. Once in memory, lookups are very fast, even with the extra calculations needed.

But there are two drawbacks. Firstly, rainbow tables need to be pre-prepared (we need to precompute all the hashes) – this can take hours or days. Secondly, the entire technique fails if even a single salt byte is added. Since each salt value changes the output hash, a separate rainbow table would be needed for each possible salt. This ends up multiplying storage needs by billions.

Widespread use of salts has rendered rainbow tables obsolete, except when attacking systems that are old or have very poor security. These do still occur, though – in June 2020, the online antiques marketplace LiveAuctioneers suffered a data breach, and was discovered to be storing passwords as unsalted MD5 hashes.¹⁰

tl;dr Rainbow tables shrink the storage needed for unsalted hash attacks, but are useless against salted hashes.

Salted hashes with slow, modern algorithms

Best-practice password storage uses a unique, random salt per user, and a key-stretching algorithm designed to be expensive to compute, such as bcrypt, scrypt, or Argon2. These algorithms deliberately slow down hashing to make large-scale attacks computationally impractical, even with modern GPUs or custom hardware.¹¹

In practice, how many guesses could an attacker attempt? The above algorithms allow system designers to alter the RAM and time needed to compute a single hash.
- With bcrypt at cost factor 12 (default in many libraries), a single GPU can manage only 100–200 guesses per second per target.
- Argon2 or scrypt with high memory usage may reduce this to 10 guesses per second or fewer.
So even if a user has used one of the 10 million most common passwords, attacking the hash of just that one user could take a million seconds, which is about 12 days. We can’t re-use the results of our calculations for other customers, because they all will have a different salt. And spending 12 days each for all the 10,000 customers is completely uneconomic.

This is why salts and slow hashes are important – they reduce targeted, per-user brute-force attacks to a very slow rate, even when the user has used a short, not very secure password – and prevent attackers from re-using the results of previous calculations.

And if a long, unique password is used? Then an attack is not feasible at all – the space of all possible passwords is just too large.

tl;dr: Renders brute force attacks infeasible even for short, bad passwords. For long, good passwords – no chance.

Roman soldiers used “watchwords” to identify each other (especially at night, to distinguish allies from potential enemies). These watchwords would often be changed daily for security. See Polybius’s Histories, translated by E.S. Shuckburgh (London: Macmillan, 1889), p 487, available at Project Gutenberg.↩︎
See section 5.1.1, “Memorized secrets” of NIST standard SP 800-63B ↩︎
Sometimes service accounts will authenticate themselves using a password-like value. But often a preferred approach is for them to use a public-private key pair. When the service account needs to authenticate itself to another system, it sends an authentication request; the foreign system provides a randomly generated value (a “challenge”) which the service account then must encrypt with its private key (producing a “response”), proving that it is who it claims to be. This is basically the same method the Git program uses to authenticate itself on your behalf to GitHub if you use an SSH key pair to authenticate – it’s called a challenge–response protocol.↩︎
You might ask – what exactly does it mean to “generate a collision”? It means that you can find two files which produces the same hash – but there are no other constraints on what the files need to contain. If you need the files to, say, both be valid PNG files, or both be human-readable text, the task becomes harder.
Other attacks besides “collision attacks” are “preimage attacks” and “second preimage attacks”. In a “preimage attack”, you know the hash of some file X, but don’t have access to X itself, and your task is to find a second file which produces the same hash. In a “second preimage attack”, you have some file X – and therefore can easily calculate its hash – and want to again find a second file which produces the same hash. MD5 is still secure for these latter two kind of attacks; it is only collision attacks it is vulnerable to. Nevertheless, that is enough to mean that it shouldn’t be used for any kind of cryptographic purpose.↩︎
See “Validating Leaked Passwords with k-Anonymity” and https://haveibeenpwned.com/API/v3#PwnedPasswords for more information on how this works.↩︎
Note that as per the lectures, MD5 should not be used in practice as a password hashing function; a dedicated function like SCrypt should be used instead.↩︎
See here for benchmarks of the Hashcat tool using unsalted MD5 hashes on an NVIDIA GeForce RTX 3090. Graphics processors are often used for password cracking, because they contain many thousands of cores, so can be used to calculate a large number of hashes in parallel.↩︎
The number of possible characters is \(26 + 26 + 10\) (uppercase, lowercase and digits). So the total number of passwords is \(62^1 + 62^2 + ... + 62^8 \approx 2.2 \times 10^{14}\), or around 200 trillion.↩︎
In 2009, software development company RockYou was breached; they stored a list of 14 million passwords in plain text. (See https://www.kaggle.com/datasets/wjburns/common-password-list-rockyoutxt). This corpus of passwords is widely used in security research and password cracking.↩︎
See Troy Hunt, https://x.com/troyhunt/status/1297036195315085313, linking to The Daily Swig cybersecurity news article, “LiveAuctioneers data breach: Millions of cracked passwords for sale, say researchers”↩︎
Field-programmable gate arrays (FPGAs) are more expensive than GPUs, but faster, and tyically have lower power consumption. Custom integrated circuits with a specific hash algorithm “baked in” to them (application-specific integrated circuit, or ASICs) are fastest of all and have even lower power consumption, but are inflexible and expensive to produce – they’re used for tasks like Bitcoin mining due to their speed and low operational costs.↩︎

CITS3007 lab 8 (week 10) – Passwords – solutions

1 Introduction

2 Modern user password best practice

3 Cryptography exercises

3.1 Digital signatures and hash collisions

3.2 Checking for password breaches

4 Cryptography questions