Line 8: | Line 8: | ||
=== Anime === | === Anime === | ||
* | * episodes added for anime | ||
* files added for anime | * files added for anime | ||
* groups subbing the anime | * groups subbing the anime | ||
Line 21: | Line 21: | ||
=== Episode === | === Episode === | ||
* files added for this | * files added for this episode | ||
* users collecting this | * users collecting this episode | ||
=== File === | === File === | ||
Line 28: | Line 28: | ||
=== User === | === User === | ||
* anime in | * anime in MyList | ||
* | * episodes in MyList | ||
* files in | * files in MyList | ||
* size of files in | * size of files in MyList | ||
* last anime added to | * last anime added to MyList | ||
* date of last | * date of last MyList addition | ||
* anime added to AniDB | * anime added to AniDB | ||
* | * episodes added to AniDB | ||
* files added to AniDB | * files added to AniDB | ||
* groups added to AniDB | * groups added to AniDB | ||
Line 53: | Line 53: | ||
* independence percentage | * independence percentage | ||
* leech percentage | * leech percentage | ||
* number of watched | * number of watched episodes | ||
* watched percentage for | * watched percentage for MyList | ||
* watched percentage for AniDB | * watched percentage for AniDB | ||
* collected percentage for AniDB | * collected percentage for AniDB | ||
Line 63: | Line 63: | ||
The data collected is stored in an in-memory Perl hash, and any required updates are written back to the database at the end of each chunk in one transaction. | The data collected is stored in an in-memory Perl hash, and any required updates are written back to the database at the end of each chunk in one transaction. | ||
This leads to one big, monolithic cronjob which creates a lot of database, memory and | This leads to one big, monolithic cronjob which creates a lot of database, memory and CPU load. | ||
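The chunked collect-then-write-back pattern described above can be sketched roughly as follows. This is only an illustration, not the actual cronjob: it uses Python and SQLite instead of Perl, and the table names `anime` and `file`, the column `file_count`, and the function names are all assumptions made for the example.

```python
import sqlite3


def chunks(ids, size):
    """Yield successive slices of `ids` holding at most `size` entries."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def update_anime_stats(conn, chunk_size=1000):
    """Recompute a per-anime file count chunk by chunk.

    Hypothetical schema: anime(id, file_count), file(id, anime_id).
    Returns the number of rows that actually had to be rewritten.
    """
    anime_ids = [row[0] for row in conn.execute("SELECT id FROM anime ORDER BY id")]
    written = 0
    for chunk in chunks(anime_ids, chunk_size):
        marks = ",".join("?" * len(chunk))
        # Gather fresh values into an in-memory dict (stand-in for the Perl hash).
        fresh = dict(conn.execute(
            f"SELECT anime_id, COUNT(*) FROM file "
            f"WHERE anime_id IN ({marks}) GROUP BY anime_id", chunk))
        stored = dict(conn.execute(
            f"SELECT id, file_count FROM anime WHERE id IN ({marks})", chunk))
        # Keep only the entries whose stored value differs from the fresh one.
        updates = [(fresh.get(a, 0), a) for a in chunk if stored[a] != fresh.get(a, 0)]
        # Write the whole chunk back in a single transaction.
        with conn:
            conn.executemany("UPDATE anime SET file_count = ? WHERE id = ?", updates)
        written += len(updates)
    return written
```

Even in this reduced form, every run still reads the data for every entry; only the writes are limited to values that changed.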
Current Runtimes: (13.05.2007) | Current Runtimes: (13.05.2007) | ||
* Anime Stats | * Anime Stats | ||
** 1525 seconds (25 minutes) for <5210 | ** 1525 seconds (25 minutes) for <5210 anime | ||
* Group Stats | * Group Stats | ||
Line 80: | Line 80: | ||
* Total: 7602 seconds (127 minutes) | * Total: 7602 seconds (127 minutes) | ||
The process is mostly limited by the database (the script uses about 900 seconds of | The process is mostly limited by the database (the script uses about 900 seconds of CPU time). | ||
The key issue here is that these numbers rise all the time. In the early days we ran that script multiple times a day, then once a day, and now 3 times a week. If things continue as they are now, we'll reach a point where we can't run it at all any more the way it works right now. | The key issue here is that these numbers rise all the time. In the early days we ran that script multiple times a day, then once a day, and now 3 times a week. If things continue as they are now, we'll reach a point where we can't run it at all any more the way it works right now. | ||
Line 103: | Line 103: | ||
=== Dirty Flag === | === Dirty Flag === | ||
The current approach gathers data for all DB entries. It doesn't matter whether any of their stats values are likely to have changed. This is especially problematic for the user stats. With each | The current approach gathers data for all DB entries, regardless of whether any of their stats values are likely to have changed. This is especially problematic for the user stats. With each stats update we're collecting the data for all users, even though only a small percentage of them have made any changes to AniDB. They might not even have logged in since the last stats update. | ||
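To make the idea named in the heading concrete, here is a minimal sketch of a dirty flag for user stats. It is an assumption-laden illustration rather than AniDB's implementation: the tables `users` and `mylist`, the columns `stats_dirty` and `mylist_files`, and both function names are hypothetical, and SQLite via Python stands in for the real database and Perl script.

```python
import sqlite3


def mark_dirty(conn, user_id):
    """Flag a user whose stats may be stale, e.g. after a MyList change."""
    with conn:
        conn.execute("UPDATE users SET stats_dirty = 1 WHERE id = ?", (user_id,))


def refresh_dirty_user_stats(conn):
    """Recompute stats only for flagged users, then clear their flags.

    Hypothetical schema: users(id, mylist_files, stats_dirty),
    mylist(id, user_id). Returns the ids that were refreshed.
    """
    dirty = [row[0] for row in conn.execute(
        "SELECT id FROM users WHERE stats_dirty = 1 ORDER BY id")]
    with conn:  # one transaction for the whole refresh
        for user_id in dirty:
            (count,) = conn.execute(
                "SELECT COUNT(*) FROM mylist WHERE user_id = ?", (user_id,)).fetchone()
            conn.execute(
                "UPDATE users SET mylist_files = ?, stats_dirty = 0 WHERE id = ?",
                (count, user_id))
    return dirty
```

In this sketch, users who changed nothing since the last run are never read at all, which is the saving the paragraph above is after.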
Possible approaches for this would be to: | Possible approaches for this would be to: |