Maintenance DEV: Difference between revisions

= AniDB Stats =


Generating all the statistics and counters for the AniDB db entries is a very labour-intensive task. Optimization of this process is therefore on the to-do list.


== Data ==




* animes added to AniDB
* eps added to AniDB
* files added to AniDB
* groups added to AniDB
* producers added to AniDB
* anime titles added to AniDB
* anime categories added to AniDB
* anime-producer relation added to AniDB
* anime-group comments added to AniDB
* review comments added to AniDB
* reviews added to AniDB
* votes added to AniDB
 


* lame files (no ed2k link)
* number of watched eps
* watched percentage for mylist
* watched percentage for AniDB
* collected percentage for AniDB
 


== Current Approach ==


=== Dirty Flag ===
The current approach gathers data for all db entries, regardless of whether any of their stats values are likely to have changed. This is especially problematic for the user stats: with each stats update we collect the data for all users, even though only a small percentage of them have made any changes to AniDB. They might not even have logged in since the last stats update.


Possible approaches for this would be to:
* skip users who haven't logged in since the last update
* add a "dirty" boolean flag to entries in the user table, which is set whenever a user makes a change to AniDB that is potentially relevant to their stats
* ... ?