Some observations on the Myspace dump

frekvent
Posts: 1
Joined: Sat 2. Jul 2016, 18:23

Some observations on the Myspace dump

Postby frekvent » Mon 4. Jul 2016, 19:07

Now that the Myspace dump is public, it might be interesting to add
the unsalted SHA-1 hashes (116.8 million unique) to the public "leaked
lists". The full dump is available as a torrent here:

https://myspace.thecthulhu.com (15GiB)

Additionally, I have uploaded a file containing just the unique unsalted SHA-1 hashes here:

https://transfer.sh/lGA95/myspace-unsalted-sha1.txt.xz (2.1 GiB)

SHA1: 3c4da283e594773070404b646940fe14933668dd

Once unrared, the file is a 33 GiB text file with 360213049 rows. Here
are 10 lines selected at random:

Code:

543856369:************@yahoo.com:543856369:0x89166CFDE7D45E39B21206DAFA612C9FB4DAA92F:0xC785C1A73DA65BAFA40C572BEC840D1E70AA0DAE
499145944:************@hotmail.com:499145944:0x612F560AB94F488B859B7BFB1D7A9D4EE4FA443B:''
499250123:************@woodyahoo.com:499250123:0xB2E6577D53B88CD3A00C404B11556AFF12454144:''
29775087:************@hotmail.com:************:0x1C19E741FF826D9AAED6DB7A1909E4E8D8A92286:''
460542274:************@yahoo.com:460542274:0xBAAFE1E377A382790B828DE507A58A8A20E87C2C:''
543844275:************@hotmail.com1:543844275:0x1EF9E7810B8A65BCEE41F687E607658372F2AB3B:0x7C9A7B3F8BD57EC9B430440AACE143B6FF82CE02
408108080:************@gmx.at:************:0x8C9BE18DCF82225AB7E76A4EA6389F668116DFDB:''
565243804:************@voila.fr:************:0xFD8E45C9F9FD1BAB8C2D938EC9398F7B5E0F2C78:0x246D7F35E06BDD1D56737B657D5E4CD2C0E00CB5
146144766:::'':''
466504840:************@yahoo.com:466504840:0x74E3090558267BFE8E7F491E007CE262F3BD3CCD:''
The format of each record is

Code:

id : email : id/username : sha1(strtolower(substr($pass, 0, 10))) : sha1($id . $pass)
  • Field 1 is an integer user id.
  • Field 2 should be an email address but can contain any junk, including
    unescaped newlines and colons. This has to be taken into account when
    parsing the data.
  • Field 3 is either a user id identical to field 1 or a username.
  • Field 4 is a SHA-1 hash of the password. The password was converted to
    lowercase and truncated to 10 characters before hashing (see the sketch
    after this list).
  • Field 5 is a salted SHA-1 hash of the password. The salt is the user
    id from field 1. Unlike field 4, the password does not appear to have
    been lowercased or truncated before hashing.
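
To make the field definitions concrete, here is a minimal sketch that reproduces both hash fields for an invented record (the id and password below are made up, not taken from the dump), assuming the lowercase-and-truncate-to-10 scheme described above:

Code:

# Invented example record: id 123456789, password "ExamplePassword1"
id=123456789
pass='ExamplePassword1'

# Field 4: SHA-1 of the password lowercased and truncated to 10 characters
printf '%s' "$pass" | tr 'A-Z' 'a-z' | cut -c1-10 | tr -d '\n' | openssl sha1

# Field 5: SHA-1 of the numeric user id concatenated with the unmodified password
printf '%s%s' "$id" "$pass" | openssl sha1

The digests printed by openssl correspond to fields 4 and 5 with the 0x prefix dropped and the hex letters in lowercase.
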
Counting hashes
Each record that has a hash in field 5 also has a hash in field 4, but
the converse is not true. Some records have no hashes at all. In total
290524629 + 68481657 = 359006286 records have an associated password.

Code:

$ tr -d '\r' <Myspace.com.txt | grep -E "'':0x[A-F0-9]{40}$" | wc -l
0
$ tr -d '\r' <Myspace.com.txt | grep -E ":0x[A-F0-9]{40}:''$" | wc -l
290524629
$ tr -d '\r' <Myspace.com.txt | grep "'':''$" | wc -l
1206372
$ tr -d '\r' <Myspace.com.txt | grep -E '(:0x[A-F0-9]{40}){2}$' | wc -l
68481657
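
For reference, the same breakdown can be gathered in a single pass with awk. This is only a sketch equivalent to the four greps above; the -v trick passes a literal apostrophe into the program without running into shell quoting trouble, and fragment lines caused by embedded newlines in field 2 are ignored:

Code:

$ tr -d '\r' <Myspace.com.txt | awk -F: -v q="'" '
    BEGIN { empty = q q; both = unsalted_only = salted_only = none = 0 }
    {
        f4 = ($(NF-1) ~ /^0x[A-F0-9]{40}$/)   # field 4 looks like an unsalted hash
        f5 = ($NF     ~ /^0x[A-F0-9]{40}$/)   # field 5 looks like a salted hash
        if      (f4 && f5)                         both++
        else if (f4 && $NF == empty)               unsalted_only++
        else if (f5 && $(NF-1) == empty)           salted_only++
        else if ($(NF-1) == empty && $NF == empty) none++
        # anything else is a fragment of a record whose email field held a newline
    }
    END { print both, unsalted_only, salted_only, none }'
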
Recovering salted passwords
The password hashed in field 4 is a truncated and lowercased version of
the password hashed in field 5, so a password cracked from field 4 can
be used as the basis for recovering the full password in field 5. If the
password is shorter than 10 characters, only the original letter case is
missing, and for a candidate without letters, such as 123456, recovery
is trivial.

Code:

$ echo -n 123456 | openssl sha1 | sed 's/.* //;y/abcdef/ABCDEF/;s/^/0x/'
0x7C4A8D09CA3762AF61E59520943DC26494F8941B
$ grep -F 0x7C4A8D09CA3762AF61E59520943DC26494F8941B:0x Myspace.com.txt | tr -d '\r' | awk -F: '{ print $(NF) ":" $1 }' | sed 's/..//' | tr A-F a-f > test.hash
$ wc -l test.hash
269356 test.hash
$ hashcat -m 120 test.hash -a3 123456 -o /dev/null
Initializing hashcat v2.00 with 4 threads and 32mb segment-size...
Added hashes from file test.hash: 269356 (269356 salts)
All hashes have been recovered
Input.Mode: Mask (123456) [6]
Index.....: 0/1 (segment), 1 (words), 0 (bytes)
Recovered.: 269356/269356 hashes, 269356/269356 salts
Speed/sec.: - plains, - words
Progress..: 1/1 (100.00%)
Running...: 00:00:00:05
Estimated.: --:--:--:--
...
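
The same idea scales beyond a single candidate. Once a batch of plains has been recovered from the unsalted field 4 hashes, they can be replayed as a wordlist against the corresponding salted hashes, with case-toggling rules to restore the original capitalization; plains that hit the 10-character truncation additionally need a short appended mask. This is only a rough sketch with hypothetical file names (cracked-lowercase.txt for the recovered plains and salted.hash for field 5 hashes in the hash:salt format built above), assuming hashcat's bundled rules/toggles1.rule:

Code:

# Straight wordlist attack: finds passwords that were all lowercase to begin with
hashcat -m 120 salted.hash -a 0 cracked-lowercase.txt

# toggles1.rule flips the case of one character position at a time
hashcat -m 120 salted.hash -a 0 cracked-lowercase.txt -r rules/toggles1.rule

# Hybrid wordlist + mask attack: append two more characters for plains that were
# cut off at the 10-character limit (other lengths need their own masks)
hashcat -m 120 salted.hash -a 6 cracked-lowercase.txt ?a?a
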
Extracting hashes from the dump
To extract the unsalted hashes in field 4 I use awk. The fact that field 2
can contain newlines and colons makes this harder than a simple cut: the
hash fields have to be addressed relative to the end of the line rather
than by a fixed field number, and lines that do not carry a well-formed
hash in the second-to-last field have to be skipped.

Code:

$ tr -d '\r' <Myspace.com.txt | awk -F: '$(NF-1) ~ /^0x[A-F0-9]{40}$/ { print $(NF-1) }' | sed 's/^0x//' | tr A-F a-f | sort -u > myspace-unsalted-sha1.txt
$ wc -l myspace-unsalted-sha1.txt
116825318 myspace-unsalted-sha1.txt
$ du -h myspace-unsalted-sha1.txt
4.5G    myspace-unsalted-sha1.txt
Here is the result
https://transfer.sh/lGA95/myspace-unsalted-sha1.txt.xz
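
If you just want to check a single candidate password against the extracted list, something like the following works (a sketch; the candidate below is invented, and look relies on the file being sorted, which the sort -u above guarantees; a plain grep -F would also work, just slower):

Code:

# Hash a candidate the way Myspace did (lowercase, first 10 characters),
# strip the "(stdin)= " prefix from the openssl output, then binary-search
# the sorted hash list with look
candidate='password1'
h=$(printf '%s' "$candidate" | tr 'A-Z' 'a-z' | cut -c1-10 | tr -d '\n' | openssl sha1 | sed 's/.* //')
look "$h" myspace-unsalted-sha1.txt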

That's it for now. The purpose of this thread is to discuss the Myspace dump.

s3in!c
Administrator
Posts: 58
Joined: Thu 24. Sep 2015, 09:50
Location: Switzerland

Re: Some observations on the Myspace dump

Postby s3in!c » Thu 7. Jul 2016, 18:47

Hi frekvent

Those were some really good observations. As soon as I saw your post, we (CynoSure Prime) used this information to recover the real passwords for the hashes where that is possible.
There is a more detailed text here: http://cynosureprime.blogspot.ch/2016/0 ... eyond.html

About your question of whether the hashes will be added to Hashes.org:
Currently the list is just too big to be handled well by the database, and storage space for it is limited. Also, since the plains from the unsalted hashes are not 'real' passwords (they are lowercased and cut to 10 characters), I think there are better hash lists to import.

