OstDB DEV: Difference between revisions

From AniDB

Revision as of 18:06, 23 February 2007

General

This is the place to contribute ideas on a possible future addition of anime OST data to AniDB.

For other areas of active development on AniDB, check: Development

Vision

The general idea is that AniDB clients would be extended with audio file support and would automatically provide AniDB with lots of raw data on the audio files collected by its user base. For audio files that are not yet known, interested users (aka work monkeys) would use either a client or the web interface to specify the song (or add it, if it is not yet listed on AniDB). Known audio files could automatically be added to the user's my(ost)list, could be renamed, or could have their ID3/Comment data updated.

Data

What are the things we should be able to store/provide?

... list all entities and their attributes here ...

Album

...

Song

...

Audio File

...

...

Implementation

General

One key factor in allowing a certain degree of automation is the automatic identification of audio files. Some services, such as MusicBrainz, already do this but tend to list only the very well-known OSTs. Reimplementing something like this for AniDB would clearly be infeasible. One possible approach would be to generate normal SHA1 hashes over the raw audio data (still in compressed form but without any ID3 tags, comments, and so on; basically this would mostly mean skipping the header during hash generation). This could be extended by storing additional TRM IDs from MusicBrainz, where available. Content hashes would differ for the same song from encode to encode. However, matching audio files to songs could probably be automated to a certain degree by using the ID3/Comment values found on the files in question.
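The hashing idea above can be sketched roughly as follows. This is only a minimal illustration and assumes MP3 files with an optional ID3v2 block at the start and an optional ID3v1 block at the end; other tag formats (APEv2, Vorbis comments, ...) are not handled. The function names `strip_id3` and `audio_sha1` are made up for this example.

```python
import hashlib


def strip_id3(data: bytes) -> bytes:
    """Return the file content without a leading ID3v2 tag and a trailing ID3v1 tag."""
    start, end = 0, len(data)

    # ID3v2 header: "ID3", 2 version bytes, 1 flags byte, 4-byte synchsafe size.
    if len(data) >= 10 and data[:3] == b"ID3":
        size = 0
        for byte in data[6:10]:
            # Synchsafe integer: only the low 7 bits of each byte are used.
            size = (size << 7) | (byte & 0x7F)
        start = 10 + size
        if data[5] & 0x10:
            # Footer flag set: the tag is followed by a 10-byte footer.
            start += 10

    # ID3v1 tag: fixed 128-byte block at the very end, starting with "TAG".
    if end - start >= 128 and data[end - 128:end - 125] == b"TAG":
        end -= 128

    return data[start:end]


def audio_sha1(data: bytes) -> str:
    """SHA1 over the (still compressed) audio data, ignoring ID3 tags."""
    return hashlib.sha1(strip_id3(data)).hexdigest()
```

With this, two copies of the same encode that differ only in their tags should get the same hash, while a re-encode of the same song would still hash differently, as noted above.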

Database

Approach 1

Here is one possible way of realizing the database structure. It is not exactly 100% correct UML, but you should get the idea. Classes are supposed to represent database entities. Lots of attributes are still missing, but I'd like some feedback on whether this general structure would be viable.

Umbrello UML Modeller source: http://anidb.info/tmp/draft1.xmi

To Fix:

  • ...

Don't Fix?:

  • we can't store very complex cases, but maybe we don't need to

Fixed?:

  • store type with song<->artist relations (i.e. composer, lyrics, singer, ...)
  • rename some tables to make them more generic (ID3-Tags -> MetaData?, Album -> Collection?)
  • album<->song relation needs int trackno attribute
  • maybe a song<->song relation "cover version of" ?
  • list bands/groups and members? -> artist<->artist relation "member of" and a type flag for artists: band/person?
  • make song<->audio file a many-to-many relation (for audio files which contain more than one song)
  • maybe try to unify typical data about a person in a new person table which can then be referred to by seiyu, artist and producer tables. (would also remove the need for a special artist<->seiyu relation)
  • multiplicities of released-by relation between audio file and audio group are wrong (switched) in diagram
  • anime<->song relation with attributes (type: OP/ED, first-ep: eid)
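To make some of the points above a bit more concrete, here is a rough sketch of how a few of the relations (song<->artist with a type, album<->song with a track number, many-to-many song<->audio file, anime<->song with type and first episode) might look as tables. It uses an in-memory SQLite database from Python purely for illustration. All table and column names are assumptions and are not taken from the draft diagram, and the sample rows are made up.

```python
import sqlite3

# Hypothetical schema; names are illustrative only.
SCHEMA = """
CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                     kind TEXT CHECK (kind IN ('band', 'person')));
CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE audio_file (id INTEGER PRIMARY KEY, sha1 TEXT NOT NULL UNIQUE);

-- song<->artist relation carrying a type (composer, lyrics, singer, ...)
CREATE TABLE song_artist (
    song_id INTEGER NOT NULL REFERENCES song(id),
    artist_id INTEGER NOT NULL REFERENCES artist(id),
    role TEXT NOT NULL,
    PRIMARY KEY (song_id, artist_id, role));

-- album<->song relation with an int trackno attribute
CREATE TABLE album_song (
    album_id INTEGER NOT NULL REFERENCES album(id),
    song_id INTEGER NOT NULL REFERENCES song(id),
    trackno INTEGER NOT NULL,
    PRIMARY KEY (album_id, trackno));

-- many-to-many song<->audio file (one file may contain several songs)
CREATE TABLE song_audio_file (
    song_id INTEGER NOT NULL REFERENCES song(id),
    audio_file_id INTEGER NOT NULL REFERENCES audio_file(id),
    PRIMARY KEY (song_id, audio_file_id));

-- anime<->song relation with attributes (type: OP/ED, first-ep: eid)
CREATE TABLE anime_song (
    aid INTEGER NOT NULL,
    song_id INTEGER NOT NULL REFERENCES song(id),
    type TEXT NOT NULL CHECK (type IN ('OP', 'ED')),
    first_eid INTEGER,
    PRIMARY KEY (aid, song_id, type));
"""


def build_demo() -> sqlite3.Connection:
    """Create the sketch schema in memory and add a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO artist (id, name, kind) VALUES (1, 'Example Band', 'band')")
    conn.execute("INSERT INTO song (id, title) VALUES (1, 'Example Song')")
    conn.execute("INSERT INTO album (id, title) VALUES (1, 'Example OST')")
    conn.executemany(
        "INSERT INTO song_artist (song_id, artist_id, role) VALUES (?, ?, ?)",
        [(1, 1, "composer"), (1, 1, "singer")],
    )
    conn.execute("INSERT INTO album_song (album_id, song_id, trackno) VALUES (1, 1, 3)")
    conn.execute(
        "INSERT INTO anime_song (aid, song_id, type, first_eid) VALUES (42, 1, 'OP', 1001)"
    )
    return conn


def roles_for_song(conn: sqlite3.Connection, song_id: int) -> list:
    """Return all roles recorded for a song, sorted alphabetically."""
    rows = conn.execute(
        "SELECT role FROM song_artist WHERE song_id = ? ORDER BY role", (song_id,)
    )
    return [row[0] for row in rows]
```

Putting the role into the primary key of `song_artist` lets the same artist appear more than once per song (e.g. as composer and singer); whether that is the right granularity is still open.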

Generic Person/Company Tables

As the CharDB, OstDB, MangaDB and AniDB's producers all tend to include data about individuals or companies, it would be a good idea to extract the generic data for individuals and companies into two separate tables which are then used by all other parts.

That is, instead of having all the data inside the producer table, it would only include those data fields which are producer-specific and would refer to a person or company table for the general data.

The question is which fields can be considered generic (and should thus be listed in the person/company table) and which fields are specific.

Another question is which relations should be possible between entries.

  • person<=>company relations? (works/ed for?)
  • company<=>company relations?
  • person<=>person relations? (married?)

And multiplicities, i.e. does each producer have only one person/company related to it, or multiple ones?

Generic Fields - Person

  • name
  • shortname
  • synonyms
  • image
  • url
  • url wiki (en)
  • url wiki (jp)
  • birth place
  • birth date
  • age (for cases where exact birth date is unknown)
  • day of death
  • gender
  • blood type
  • nationality
  • occupation
  • description
  • ...

Generic Fields - Company

  • name
  • shortname
  • synonyms
  • image
  • url
  • founded
  • closed down
  • description
  • ...
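As a rough illustration of the split between generic and specific data, the field lists above could be written down like this. This is only a sketch: the Python types are guesses, the `Producer` class is hypothetical, and the rule that a producer refers to exactly one person or one company is an assumption made for the example (the multiplicity question above is still open).

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Person:
    """Generic person data shared by seiyu, artist, producer, ... entries."""
    id: int
    name: str
    shortname: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    image: Optional[str] = None
    url: Optional[str] = None
    url_wiki_en: Optional[str] = None
    url_wiki_jp: Optional[str] = None
    birth_place: Optional[str] = None
    birth_date: Optional[date] = None
    age: Optional[int] = None  # for cases where the exact birth date is unknown
    day_of_death: Optional[date] = None
    gender: Optional[str] = None
    blood_type: Optional[str] = None
    nationality: Optional[str] = None
    occupation: Optional[str] = None
    description: str = ""


@dataclass
class Company:
    """Generic company data."""
    id: int
    name: str
    shortname: Optional[str] = None
    synonyms: List[str] = field(default_factory=list)
    image: Optional[str] = None
    url: Optional[str] = None
    founded: Optional[date] = None
    closed_down: Optional[date] = None
    description: str = ""


@dataclass
class Producer:
    """Hypothetical producer entry: producer-specific fields would go here,
    everything generic lives in the referenced person or company."""
    id: int
    person_id: Optional[int] = None
    company_id: Optional[int] = None

    def __post_init__(self) -> None:
        # Assumption for this sketch: exactly one of the two references is set.
        if (self.person_id is None) == (self.company_id is None):
            raise ValueError("producer must refer to exactly one person or company")
```

If producers turn out to need several related persons/companies, the two reference fields would have to become a separate relation table instead.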