Cryptographic hash functions take an input of arbitrary length and produces a fixed length output. The special features are that the outputs are deterministic (the same input always generates the same output) and yet the outputs are apparently “random” – two similar inputs are likely to produce very different outputs, and it’s difficult to determine the input by looking at the output.
A common application of hashing is in storing passwords. Many systems store only the hash of a user’s password along with their username. This ensures that even if someone gains access to the stored data, users’ actual passwords are not exposed. When a user types in a password on such a system, the password handling software grants access if the hash value generated from the user’s entry matches the hash stored in the password database.
We said above that it is difficult to determine an input given an output. Password cracking is a family of techniques for accomplishing this difficult (but possible!) task. That is, let’s say we have access to a user’s password hash only. Can we figure out their password? We could then use it to log in, and it may also be shared across their accounts which we could also access.
In some cases, password cracking can exploit the structure of a hash function; this is a topic for a cryptography class. In our case, we will take a more brute-force approach: trying variations on existing known passwords, under the assumption that many passwords are easy to guess.
Getting Started
To get started, visit the Github Classroom assignment link. Select your username from the list (or if you don’t see it, you can skip and use your Github username). A repository will be created for you to use for your work. This PA should be completed on the ieng6 server. Refer this section in Week 1’s Lab for instructions on logging in to your account on ieng6 and working with files on the server.
NOTE
The auto-grader uses the C11 standard of the C programming language
Overall Task
We’ll start by describing the overall task and goal so it’s clear where you’re going. However, don’t start by trying to implement this exactly – use the milestones below to get there.
pwcrack Password Cracker
Write a program pwcrack that takes one command-line argument, the SHA256 hash of a password in hex format.
The program then reads from stdin potential passwords, one per line (assumed to be in UTF-8 format).
The program should, for each password:
Check if the SHA256 hash of the potential password matches the given hash
Check if the SHA256 hash of the potential password with each of its ASCII characters uppercased or lowercased matches the given hash
If a matching password is found, the program should print
Found password: SHA256(<matching password>) = <hash>
If a matching password is not found, the program should print:
Did not find a matching password
Compiling Your Code
On ieng6
Because the OpenSSL libraries on ieng6 are not in the default library search path, you need to use this command:
The -L flag tells the compiler where to find the crypto libraries during linking.
On Other Systems
On most other Linux systems or macOS, you can compile with:
gcc *.c -o pwcrack -lcrypto
Examples
seCret has a SHA256 hash of a2c3b02cb22af83d6d1ead1d4e18d916599be7c2ef2f017169327df1f7c844fd, and notinlist has a SHA256 hash of 21d3162352fdd45e05d052398c1ddb2ca5e9fc0a13e534f3c183f9f5b2f4a041
$ ./pwcrack a2c3b02cb22af83d6d1ead1d4e18d916599be7c2ef2f017169327df1f7c844fdPasswordNeverGuessMe!!secretFound password: SHA256(seCret) = a2c3b02cb22af83d6d1ead1d4e18d916599be7c2ef2f017169327df1f7c844fd$ ./pwcrack 21d3162352fdd45e05d052398c1ddb2ca5e9fc0a13e534f3c183f9f5b2f4a041PasswordNeverGuessMe!!secret<Press Ctrl-D for end of input>Did not find a matching password
We could also put the potential passwords in a file:
we only consider single-character changes when trying uppercase/lowercase variants of the guesses (i.e. we DON’T try all possible capitalizations of the string): In the example below, the correct password is NOT found because going from SECRET to seCret would require 5 characters to be changed.
$ ./pwcrack a2c3b02cb22af83d6d1ead1d4e18d916599be7c2ef2f017169327df1f7c844fdSECRET<Press Ctrl-D for end of input>Did not find a matching password
Using Problem Set
Most of the needed functionality for the PA is in the problem set problems! So part of your workflow for this PA should be to move code over from the appropriate part of your problem set and into your PA code.
Real password crackers try many more variations than just uppercasing and lowercasing. Do a little research on password cracking and suggest at least 2 other ways to vary a password to crack it. Describe them both, and for each, write a sentence or two about what modifications you would make to your code to implement them. (You don’t have to actually implement them).
In my research, I have found something called a “Rainbow Table Attack” where it uses a precomputed table for password cracking. This precomputed table is generated by taking common password and hash them in well known hashing functions (SHA) and store them. To incorporate this, I would include a file with a table (space separated) that has the common password followed by the hashed code. Then I can check if the hashed password is a match with hashed code stored in the table, hence the password crack.
Another technique I found was “Spidering” where the hackers who the individual’s information and compile a list of identifying information and common keywords. By using these terms, combined with dictionary attacks, can form high quality password cracking methods. To do this, one would need the information of the individual and place them in a file. Check each keyword and do some alterations to each keyword and see if the hashed password matches
How much working memory is needed to store all of the variables needed to execute the password cracker? Based on your response would you say that a password cracker is more memory-limited or is it more limited by how fast the process can run the code?
Based on the implementation of the password cracker, it needs a string to store the input string, the hashed representation of a variation of the input string, and a strign to store the hexadecimal representation of the hahashed password. That in total comes to around 200 in variable memory. Note that there is no need to store all variations of the current password nor the previous password attempt as those can be disregarded as it has failed to match the correct password.
With that, the limiting factor of the password cracker process is the time it takes, or rather, how fast the process can run the code. This is due to the astronomical number of possible combinations for a password is too vast and too random if it is randomly generated. Where the local memory of the program contributes little to the limiting factor.