Some observations on the Myspace dump

Posts: 1
Joined: Sat 2. Jul 2016, 18:23

Postby frekvent » Mon 4. Jul 2016, 19:07

Now that the Myspace dump is public, it might be interesting to add
the unsalted SHA-1 hashes (116.8 million unique) to the public "leaked
lists". The full dump is available as a torrent here: (15GiB)

Additionally, I have uploaded a file with just the unique unsalted sha1 hashes here: (2.1 GiB)

SHA1: 3c4da283e594773070404b646940fe14933668dd

Once unrared, the file is a 33 GiB text file with 360213049
rows. Here are 10 lines selected at random:

Code: Select all

543856369:************
499145944:************''
499250123:************''
29775087:************************:0x1C19E741FF826D9AAED6DB7A1909E4E8D8A92286:''
460542274:************''
543844275:************@hotmail.com1:543844275:0x1EF9E7810B8A65BCEE41F687E607658372F2AB3B:0x7C9A7B3F8BD57EC9B430440AACE143B6FF82CE02
408108080:************************:0x8C9BE18DCF82225AB7E76A4EA6389F668116DFDB:''
565243804:************************:0xFD8E45C9F9FD1BAB8C2D938EC9398F7B5E0F2C78:0x246D7F35E06BDD1D56737B657D5E4CD2C0E00CB5
146144766:::'':''
466504840:************''

The format of each record is

Code: Select all

id : email : id/username : sha1(strtolower(substr($pass, 0, 10))) : sha1($id . $pass)
  • Field 1 is an integer.
  • Field 2 should be an email address but can contain any junk including
    unescaped newlines and colons. This has to be taken into account when
    parsing the data.
  • Field 3 is either a user id identical to field 1 or a username.
  • Field 4 is a sha1 hash of the password. The password was converted to
    lowercase and truncated to 10 characters before hashing.
  • Field 5 is a salted sha1 hash of the password. The salt is the user
    id in field 1. Unlike field 4, the password doesn't appear to have
    been lowercased or truncated before hashing.

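To make the two hash constructions concrete, here is a minimal Python sketch. The function names are made up for illustration, and the UTF-8 encoding is an assumption; the dump itself does not say how non-ASCII passwords were encoded.

```python
import hashlib


def field4_hash(password: str) -> str:
    """Unsalted hash: SHA-1 of the first 10 characters, lowercased."""
    truncated = password[:10].lower()
    # Assumption: passwords are hashed as UTF-8 bytes.
    digest = hashlib.sha1(truncated.encode("utf-8")).hexdigest()
    return "0x" + digest.upper()


def field5_hash(user_id: int, password: str) -> str:
    """Salted hash: SHA-1 of the decimal user id followed by the full password."""
    salted = str(user_id) + password
    digest = hashlib.sha1(salted.encode("utf-8")).hexdigest()
    return "0x" + digest.upper()


# Two passwords that share their first 10 characters (ignoring case)
# end up with the same field 4 value but different field 5 values.
same_field4 = field4_hash("CorrectHorseBattery") == field4_hash("correcthor")
```

In this sketch, field 4 loses both the letter case and everything after the tenth character, while field 5 keeps the whole password.
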
Counting hashes
Each record that has a hash in field 5 also has a hash in field 4 but
the converse is not true. Some records have no hashes at all. In total
359006286 records have an associated password.

Code: Select all

$ tr -d '\r' < | grep -E "'':0x[A-F0-9]{40}$" | wc -l
0
$ tr -d '\r' < | grep -E ":0x[A-F0-9]{40}:''$" | wc -l
290524629
$ tr -d '\r' < | grep "'':''$" | wc -l
1206372
$ tr -d '\r' < | grep -E '(:0x[A-F0-9]{40}){2}$' | wc -l
68481657

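The same four-way split can be sketched in Python. This is only an illustration of the trailing-field patterns used by the grep commands above: the category names and the sample lines are made up, and the hashes in the sample are dummy values, not data from the dump.

```python
import re
from collections import Counter

HASH = r"0x[A-F0-9]{40}"
BOTH = re.compile(":" + HASH + ":" + HASH + "$")
ONLY_FIELD4 = re.compile(":" + HASH + ":''$")
ONLY_FIELD5 = re.compile("'':" + HASH + "$")
NEITHER = re.compile("'':''$")


def classify(line: str) -> str:
    """Classify a line by which of the two trailing hash fields are present."""
    line = line.rstrip("\r\n")  # the dump uses CRLF line endings
    if BOTH.search(line):
        return "both"
    if ONLY_FIELD4.search(line):
        return "field4_only"
    if ONLY_FIELD5.search(line):
        return "field5_only"
    if NEITHER.search(line):
        return "no_hash"
    return "other"  # e.g. a fragment of a record split by an embedded newline


# Dummy hashes and made-up records, for illustration only.
H1 = "0x" + "A" * 40
H2 = "0x" + "B" * 40
sample = [
    f"1:a@example.com:1:{H1}:{H2}\r\n",
    f"2:b@example.com:bob:{H1}:''\r\n",
    "3:::'':''\r\n",
    "4:broken\r\n",
]
counts = Counter(classify(line) for line in sample)
```
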
Recovering salted passwords
The password in field 4 is a truncated and lowercased version of the
password in field 5. One can use the truncated password in field 4 to
recover the full password in field 5. If the password is shorter than
10 characters this is trivial, because nothing was cut off and only the
letter case remains unknown.

Code: Select all

$ echo -n 123456 | openssl sha1 | sed 's/.* //;y/abcdef/ABCDEF/;s/^/0x/'
0x7C4A8D09CA3762AF61E59520943DC26494F8941B
$ grep -F 0x7C4A8D09CA3762AF61E59520943DC26494F8941B:0x | tr -d '\r' | awk -F: '{ print $(NF) ":" $1 }' | sed 's/..//' | tr A-F a-f > test.hash
$ wc -l test.hash
269356 test.hash
$ hashcat -m 120 test.hash -a3 123456 -o /dev/null
Initializing hashcat v2.00 with 4 threads and 32mb segment-size...
Added hashes from file test.hash: 269356 (269356 salts)
All hashes have been recovered
Input.Mode: Mask (123456) [6]
Index.....: 0/1 (segment), 1 (words), 0 (bytes)
Recovered.: 269356/269356 hashes, 269356/269356 salts
Speed/sec.: - plains, - words
Progress..: 1/1 (100.00%)
Running...: 00:00:00:05
Estimated.: --:--:--:--
...

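For passwords that contain letters, the short case needs one more step: the lowercased plaintext from field 4 has to be tried in every upper/lower-case combination against the salted hash. Below is a rough Python sketch of that idea, not what hashcat does internally. The function names are hypothetical, UTF-8 encoding is assumed, and it only makes sense when the recovered field 4 plaintext is shorter than 10 characters, so that nothing was truncated.

```python
import hashlib
from itertools import product
from typing import Optional


def salted_sha1(user_id: str, password: str) -> str:
    """sha1($id . $pass) as a lowercase hex string (UTF-8 is an assumption)."""
    return hashlib.sha1((user_id + password).encode("utf-8")).hexdigest()


def recover_case(user_id: str, lowered: str, target_hex: str) -> Optional[str]:
    """Try every case variant of `lowered` against the field 5 hash.

    `target_hex` is field 5 without the leading "0x".
    """
    # For each character, the set of spellings to try (one entry for digits).
    choices = [sorted({c.lower(), c.upper()}) for c in lowered]
    for combo in product(*choices):
        candidate = "".join(combo)
        if salted_sha1(user_id, candidate) == target_hex.lower():
            return candidate
    return None


# Made-up example: user id 1001 with the password "HeLLo42".
target = salted_sha1("1001", "HeLLo42")
recovered = recover_case("1001", "hello42", target)
```

The search space is at most 2^n for n letters, so at most 512 candidates for a 9-character password.
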
Extracting hashes from the dump
To extract the unsalted hashes in field 4 I use Awk. The fact that
field 2 can contain newlines and colons makes it more difficult.

Code: Select all

$ awk -F: '!/(:0x[A-F0-9]{40}|:''){2}$/ { print $(NF-1) }' | sed 's/..//' | tr -s '\n' | tr A-F a-f | sort -u > myspace-unsalted-sha1.txt
$ wc -l myspace-unsalted-sha1.txt
116825318 myspace-unsalted-sha1.txt
$ du -h myspace-unsalted-sha1.txt
4.5G myspace-unsalted-sha1.txt

Here is the result
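
For anyone who would rather not do this in Awk, here is a rough Python sketch of one way to cope with colons and newlines in field 2: join physical lines until one ends in the two trailing hash fields, then split the record from the right. `Record`, `parse_records` and `TAIL` are names invented for this example. It assumes that field 3 never contains a colon and that a stray line inside an email never ends in the hash pattern itself; I have not checked either assumption against the full dump.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple

# The last two fields are either a 0x-prefixed SHA-1 or an empty '' marker.
TAIL = re.compile(r":(0x[A-F0-9]{40}|''):(0x[A-F0-9]{40}|'')$")


class Record(NamedTuple):
    user_id: str
    email: str
    name: str
    hash4: str
    hash5: str


def parse_records(lines: Iterable[str]) -> Iterator[Record]:
    """Yield records, joining physical lines split by newlines in field 2."""
    parts: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        tail = TAIL.search(line)
        if tail is None:
            parts.append(line)  # record continues on the next physical line
            continue
        body = "\n".join(parts + [line[: tail.start()]])
        parts = []
        user_id, sep1, rest = body.partition(":")   # field 1 from the left
        email, sep2, name = rest.rpartition(":")    # field 3 from the right
        if not sep1 or not sep2:
            continue  # malformed record, skipped in this sketch
        yield Record(user_id, email, name, tail.group(1), tail.group(2))


# Made-up records with dummy hashes, for illustration only.
H1 = "0x" + "A" * 40
H2 = "0x" + "B" * 40
demo = [
    f"7:a:b@example.com:7:{H1}:{H2}\r\n",
    "8:foo\r\n",
    f"bar:bob:{H1}:''\r\n",
    "9:::'':''\r\n",
]
records = list(parse_records(demo))
```
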

That's it for now. The purpose of this thread is to discuss the Myspace dump.

Posts: 51
Joined: Thu 24. Sep 2015, 09:50
Location: Switzerland

Re: Some observations on the Myspace dump

Postby s3in!c » Thu 7. Jul 2016, 18:47

Hi frekvent

Those were some really good findings. As soon as I saw your post, we (CynoSure Prime) tried to use this information to recover the real passwords for the hashes where this is possible.
There is a more detailed text here: ... eyond.html

About your question, if the hashes get added to
Currently the list is just too big for the database to handle well, and storage space for it is limited. Also, since the plaintexts of the unsalted hashes are not 'real' passwords (they are lowercased and cut to length 10), I think there are better hashlists to import.
