UDP API DEV: Difference between revisions

: Well, the hashes are likely to be unique. However, this is not enforced by AniDB. Whether two distinct files really have the same hash or whether it is just an input error (e.g. a copy-and-paste mistake) by the submitting user, the result is the same: you are doing a lookup for which you expect exactly one result or nothing at all, and you end up with multiple results. And yes, most clients do support other hashing methods and also submit such data to AniDB. However, internally they are all based on ed2k hashes AFAIK. [[User:Exp|Exp]] 08:52, 10 July 2007 (UTC)
:: The real question is: Should we enforce uniqueness? It's not really a problem to implement and shouldn't lead to any problems. We have reports on uniqueness and there are atm 2x2 files with equal md5, which obviously is wrong (if you look at them). Constraints would just mean that DerIdiot doesn't have to go around and fix such entries from time to time. On the other hand, supporting md5 lookup for clients is hardly important and it'll only have a negative impact on performance (although probably unnoticeable). --[[User:Epoximator|Epoximator]] 12:02, 10 July 2007 (UTC)
::: Well, enforcing uniqueness in the db brings a slight performance overhead with it, but as file additions are a very rare event, that wouldn't hurt us. We'd have a couple of extra indices on the file table, which would increase the storage requirements of the db, though even that wouldn't be all that much. Supporting MD5 hashes on the UDP API might simplify the writing of very simple UDP clients, as there are easily available MD5 libraries for every programming language out there. For ed2k/md4 libs you might have to search for a bit. However, I think I wouldn't go as far as to enforce uniqueness for all our hashes (i.e. sha1 and tth). But it might be seriously worth considering enforcing MD5 uniqueness, especially if we might someday drop ed2k hashes. MD5 hashes would offer a nice fallback in such a case. [[User:Exp|Exp]] 08:52, 11 July 2007 (UTC)
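
To make the trade-off discussed above concrete, here is a minimal sketch of what MD5 uniqueness could look like, using Python's standard <code>hashlib</code> and an in-memory SQLite database as a stand-in. The table name <code>file</code>, its columns, the index name and all helper functions are assumptions made for illustration only; they do not reflect the actual AniDB schema or database engine.

```python
import hashlib
import sqlite3


def md5_hex(data: bytes, chunk_size: int = 64 * 1024) -> str:
    """Return the hex MD5 digest of data, feeding the hasher in chunks."""
    hasher = hashlib.md5()
    for start in range(0, len(data), chunk_size):
        hasher.update(data[start:start + chunk_size])
    return hasher.hexdigest()


def make_db() -> sqlite3.Connection:
    """Create a hypothetical file table with a unique index on md5."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file ("
        " fid INTEGER PRIMARY KEY,"
        " size INTEGER NOT NULL,"
        " md5 TEXT)"  # md5 may be NULL for files without a submitted hash
    )
    # The extra index is the storage cost mentioned in the discussion.
    # SQLite treats NULLs as distinct, so rows without an md5 still coexist.
    conn.execute("CREATE UNIQUE INDEX file_md5_uniq ON file (md5)")
    return conn


def add_file(conn: sqlite3.Connection, size: int, md5) -> bool:
    """Insert a file row; return False if the md5 is already present."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO file (size, md5) VALUES (?, ?)", (size, md5)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def lookup_by_md5(conn: sqlite3.Connection, md5: str):
    """Return the single matching fid, or None if there is no match."""
    row = conn.execute(
        "SELECT fid FROM file WHERE md5 = ?", (md5,)
    ).fetchone()
    return row[0] if row else None
```

With the unique index in place, a lookup by MD5 can return at most one row, so a client gets the "exactly one result or nothing at all" behaviour, and a duplicate submission is rejected at insert time instead of having to be cleaned up by hand later.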


== Mylist Commands ==