Ed2k-hash

From AniDB


What's an ed2k hash?

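A hash is a short, fixed-length value computed from the contents of a file. Identical files always produce the same hash, and different files almost always produce different ones.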

The ed2k hash is based on the MD4 algorithm, but rather than providing a single hash of the entire file, it breaks the file up into 9500 KiB (9728000 byte) chunks and produces a final hash based on the MD4 sums of the chunks. MD4 is no longer considered secure from a cryptographic perspective, but for the purpose of uniquely identifying files it is more than adequate. The hash is often listed as part of an ed2k link, which also includes the file size in bytes and a name.
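As a rough illustration of how the three parts of a link fit together, the sketch below assembles one in the commonly seen `ed2k://|file|name|size|hash|/` layout. The function name `make_ed2k_link` and the file name are made up for this example, and percent-encoding the name is an assumption (one common convention, not something this page specifies). The hash is borrowed from the table further down purely as a sample value.

```python
from urllib.parse import quote

HEX_DIGITS = set("0123456789abcdefABCDEF")

def make_ed2k_link(name: str, size: int, ed2k_hash: str) -> str:
    """Assemble an ed2k file link from a name, a size in bytes and a hex hash.

    Hypothetical helper for illustration only.
    """
    # An ed2k hash is a 128 bit MD4 value, i.e. 32 hexadecimal characters.
    if len(ed2k_hash) != 32 or not set(ed2k_hash) <= HEX_DIGITS:
        raise ValueError("expected 32 hexadecimal characters")
    if size < 0:
        raise ValueError("size must not be negative")
    # Assumption: percent-encode the name so '|' or spaces cannot break the fields.
    return f"ed2k://|file|{quote(name, safe='')}|{size}|{ed2k_hash.lower()}|/"

print(make_ed2k_link("example file.mkv", 9728000,
                     "D7DEF262A127CD79096A108E7A9FC138"))
```

The size and hash are the parts that matter for identification; the name is only a label and can differ between links to the same file.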

Why does AniDB require ed2k-hashes?

The main reason is that it prevents duplicate database entries. AniDB will not allow you to add a file with the same ed2k-hash as an existing one.

The file size and ed2k-hash of a file are used together to identify it globally.

You are allowed to add files without ed2k-hashes to AniDB, but you should edit those files later and add the missing ed2k-hashes. Once you have added a certain number of files without ed2k-hashes, you may no longer add new files without ed2k-hashes until you have edited your old files.

Why use ed2k instead of another type of hash if you do not support downloads?

  • The combination of ed2k-hash and file size makes ed2k effective for uniquely identifying files.
  • An ed2k-hash can be passed back and forth within a URL that carries both the hash and the file size in a widely recognized format, which makes it a convenient method for adding or checking files.
  • Other hashes are good for checking whether a file is corrupt when you already know which file you are comparing it against, but they cannot necessarily identify a file globally in the system the way the ed2k-hash can.
  • AniDB was designed around ed2k. Other hashes have been added to the file records for validation, but the internal structure is based on ed2k. If the site goes through a complete redesign then another hash may be made the primary hash, but at this point this is not likely to change.
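To illustrate the points above, here is a minimal sketch that reads the fields back out of an ed2k link and builds a lookup key from the size and hash. The names `Ed2kFile`, `parse_ed2k_link` and `identity_key` are invented for this example, and the sketch assumes the usual `ed2k://|file|name|size|hash|/` layout; it does not describe how AniDB itself processes links.

```python
from typing import NamedTuple, Tuple
from urllib.parse import unquote

class Ed2kFile(NamedTuple):
    name: str
    size: int
    ed2k_hash: str

def parse_ed2k_link(link: str) -> Ed2kFile:
    """Split an ed2k file link into name, size and hash (hypothetical helper)."""
    parts = link.strip().split("|")
    # Expected fields: 'ed2k://', 'file', name, size, hash, [optional extras], '/'
    if len(parts) < 6 or parts[0] != "ed2k://" or parts[1] != "file":
        raise ValueError("not an ed2k file link")
    return Ed2kFile(unquote(parts[2]), int(parts[3]), parts[4].lower())

def identity_key(f: Ed2kFile) -> Tuple[int, str]:
    """The name is ignored: only size and hash identify the file."""
    return (f.size, f.ed2k_hash)

a = parse_ed2k_link("ed2k://|file|first%20name.avi|9728000|d7def262a127cd79096a108e7a9fc138|/")
b = parse_ed2k_link("ed2k://|file|renamed.avi|9728000|D7DEF262A127CD79096A108E7A9FC138|/")
print(identity_key(a) == identity_key(b))
```

Two links that differ only in file name or in the letter case of the hash map to the same key, which is the property that makes the size and hash pair usable for spotting duplicates.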

Which software can be used to generate them?

If you have the file(s) on your hard disk or on CD you can use all kinds of tools to generate the ed2k-hash and other additional info:

  • AniDB_O'Matic by BennieB/PetriW generates ed2k-hashes, md5, sha-1 and crc32 in one go, and also lists details such as codec, resolution and bitrates.
  • ed2k_hash is another tool; it is command-line based and also available for Linux.
  • Filehash is a small Java program written by Malich.
  • Hashcalc can create md5, sha-1, crc32, ed2k-hashes and various other hashes in one go. However, at least one case is known in which it produced a wrong ed2k-hash!

How is an ed2k hash calculated exactly?

A file is hashed in 9728000 byte chunks using the md4 algorithm, which produces a 128 bit hash for each chunk. For files with only one chunk, the ed2k hash is the md4 of the file. For files with 2 or more chunks, the hash of each chunk is appended to those before it, and a further md4 of the concatenated hashes provides the ed2k hash of the file. Pseudo code is given below:

if filesize is less than (or equal to) 9728000:    # "or equal to": one variant only
    return md4 of file
for chunk of size up to 9728000 in file:
    append md4 of chunk to hashlist
if filesize is a multiple of 9728000:              # this step: the other variant only
    append md4 of null to hashlist
return md4 of hashlist

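Note that implementations treat the 9728000 byte boundary in two different ways in practice. One variant includes the "or equal to" in the first test, so a file of exactly 9728000 bytes is hashed as a single chunk and nothing extra is ever appended. The other variant omits "or equal to" and instead appends the md4 of null (empty input) whenever the file size is an exact multiple of 9728000. Everything else is common to both. These two variants correspond to the "Blue method" and "Red method" columns in the table below. The difference only affects a tiny number of files, but it is the one case where two 'valid' ed2k hashes might be produced from one file.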
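The pseudo code above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the function `ed2k_hash` and its `pad_exact_multiple` flag are names invented here to switch between the two boundary treatments, the whole file is held in memory for brevity, and a small pure-Python MD4 written from RFC 1320 is included because MD4 support in `hashlib` depends on the local OpenSSL build. Pure Python is slow, so a real tool would stream the file and use a native MD4.

```python
import struct

CHUNK_SIZE = 9728000
_ROUND3_ORDER = (0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15)

def _rol(x: int, n: int) -> int:
    """Rotate a 32 bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def md4(data: bytes) -> bytes:
    """Plain MD4 as described in RFC 1320; returns the 16 byte digest."""
    # Padding: 0x80, zeros up to 56 mod 64, then the bit length (little-endian).
    msg = (data + b"\x80" + b"\x00" * ((55 - len(data)) % 64)
           + struct.pack("<Q", (len(data) * 8) & 0xFFFFFFFFFFFFFFFF))
    state = [0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476]
    for off in range(0, len(msg), 64):
        x = struct.unpack("<16I", msg[off:off + 64])
        a, b, c, d = state
        for i in range(48):
            if i < 16:    # round 1
                f, k, s, const = (b & c) | (~b & d), i, (3, 7, 11, 19)[i % 4], 0
            elif i < 32:  # round 2
                f = (b & c) | (b & d) | (c & d)
                k, s, const = (i % 4) * 4 + (i - 16) // 4, (3, 5, 9, 13)[i % 4], 0x5A827999
            else:         # round 3
                f = b ^ c ^ d
                k, s, const = _ROUND3_ORDER[i - 32], (3, 9, 11, 15)[i % 4], 0x6ED9EBA1
            # Update one register, then rotate the register roles for the next step.
            a, b, c, d = d, _rol((a + f + x[k] + const) & 0xFFFFFFFF, s), b, c
        state = [(v + w) & 0xFFFFFFFF for v, w in zip(state, (a, b, c, d))]
    return struct.pack("<4I", *state)

def ed2k_hash(data: bytes, pad_exact_multiple: bool, chunk_size: int = CHUNK_SIZE) -> str:
    """ed2k hash of data as a hex string.

    pad_exact_multiple=False: variant with "or equal to" in the first test.
    pad_exact_multiple=True:  variant that appends md4 of empty input when
                              the size is an exact multiple of chunk_size.
    (The flag name is an assumption made for this sketch.)
    """
    size = len(data)
    if size < chunk_size or (size == chunk_size and not pad_exact_multiple):
        return md4(data).hex()
    hashlist = b"".join(md4(data[i:i + chunk_size]) for i in range(0, size, chunk_size))
    if pad_exact_multiple and size % chunk_size == 0:
        hashlist += md4(b"")
    return md4(hashlist).hex()

# Tiny demonstration with an artificially small chunk size so it runs instantly.
sample = b"abcdefgh"  # exactly two 4 byte chunks
print(ed2k_hash(sample, pad_exact_multiple=False, chunk_size=4))
print(ed2k_hash(sample, pad_exact_multiple=True, chunk_size=4))
```

The `chunk_size` parameter exists only so the boundary behaviour can be tried on small inputs; with the default of 9728000 the two settings differ only for non-empty files whose size is an exact multiple of that value.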

<list of which clients use which method>

  • edonkey2000 v1.4.3
  • mldonkey-2.5.30.17
  • HashCalc Version 2.01
  • edonkey-tool-hash-0.4.0
  • fsum 2.51
  • emule 0.46c
  • AOM 0.5.5.239
  • webaom v1.13
  • ed2k (Stephane D'Alu) 1.4

<list of affected files, filesize>

File Size (bytes) Blue method Red method
File of zeros 9728000 d7def262a127cd79096a108e7a9fc138 fc21d9af828f92a8df64beac3357425d
File of zeros 19456000 194ee9e4fa79b2ee9f8829284c466051 114b21c63a74b6ca922291a11177dd5c
http://anidb.info/f7047 145920000 1c2b1a6b142955d84af5d3210d3ece6f 4f79548623c6099896a489257163764e
http://anidb.info/f24359 136192000
http://anidb.info/f31383 136192000 df294338b38a29f81ad84f1f364b4504
http://anidb.info/f48530 175104000
http://anidb.info/f51131 107008000 aa399ff3a0ab9f8eb939dbcd7b7d0ec3
http://anidb.info/f51330 175104000 fa7fbadaed151b003032985eae5c3420 148b2cf54cb4d66f70939ec5224d7961
http://anidb.info/f55744 97280000 2fcd55bdeae2a92cc99d70763a64f048
http://anidb.info/f56411 165376000 f498072c0849cee180e4a1a7d34a26d2
http://anidb.info/f57766 145920000 7a54eda5d89ed525974487aa94515701 85d995b678284e7db5d52df1375971a9
http://anidb.info/f73921 184832000 fc9210c307f99ed7339556d5f05f3d59
http://anidb.info/f78552 194560000 ddcec8fdcddd43276a2c173498345789 d61b705c59199666e164a274e7f91bec
http://anidb.info/f80216 165376000
http://anidb.info/f92884 68096000
http://anidb.info/f123554 243200000 822fc0f338fe8e43d96b9a99fe9632ce ee3557fe68ccd056302710a185f4445f
http://anidb.info/f126410 184832000 87ac7a62de204473d3f5448214f4207c
http://anidb.info/f130233 184832000
http://anidb.info/f142402 155648000 4538e1fded9d7661e1ad7d56e7406054
http://anidb.info/f165143 165376000 0ced631bb9010d3ccd331689a2fb02de 6a092c056bc46e7a08d63408f918ba52
http://anidb.info/f166304 184832000 aab2ce19d5b786af20d6e4a15f63552f aa9930ccd300a2feac30b0e49830c321
http://anidb.info/f174421 223744000
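
Every size listed in the table is an exact multiple of 9728000 bytes, which is what places a file on the boundary where the two methods disagree. A minimal sketch for checking whether a given file size falls into this group (the helper name `is_boundary_case` is invented for this example):

```python
CHUNK_SIZE = 9728000

def is_boundary_case(size: int) -> bool:
    """True if a file of this size can yield two different 'valid' ed2k hashes."""
    # Empty files are excluded: both methods return the md4 of empty input.
    return size > 0 and size % CHUNK_SIZE == 0

# A few sizes taken from the table above, plus one that is off by a byte.
for size in (9728000, 19456000, 68096000, 145920000, 223744000, 145920001):
    print(size, size // CHUNK_SIZE, is_boundary_case(size))
```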