Difference between revisions of "MusicBrainz Database/Schema"
m (→Recording: add link to wiki page in header) |
m (→Medium: add link to as of yet unexisting page) |
||
Line 26: | Line 26: | ||
* 1984 US release of "''The Wall''" by "''[[Artist:83d91898-7763-47d7-b03b-b92132375c47|Pink Floyd]]''", release on label "''Columbia Records''" with catalog number "C2K 36183" and UPC "074643618328", it's an official release and comes with two CDs in jewel case. |
* 1984 US release of "''The Wall''" by "''[[Artist:83d91898-7763-47d7-b03b-b92132375c47|Pink Floyd]]''", release on label "''Columbia Records''" with catalog number "C2K 36183" and UPC "074643618328", it's an official release and comes with two CDs in jewel case. |
||
− | === Medium === |
+ | === [[Medium]] === |
Piece of media, included in a release. Contains information about the format, position in the release and an optional title. Has attached CD TOCs. Example: |
Piece of media, included in a release. Contains information about the format, position in the release and an optional title. Has attached CD TOCs. Example: |
Revision as of 14:22, 21 May 2011
Objects
Artist
Same as our current artist entity, but with new "country" and "gender" attributes.
Artist Credit
List of artists, variations of artist names and pieces of text to join the artist names. Examples:
- "Queen & David Bowie" -- two artists ("Queen" and "David Bowie"), no name variations, joined with " & "
- "Jean-Michel Jarre*" -- one artist ("Jean Michel Jarre"), name variation "Jean-Michel Jarre"
- "Tracy W. Bush, Derek Duke, Jason Hayes and Glenn Stafford" -- four artists, no name variations, joined with ", ", ", ", ", " and " and ".
Release Group
Represents an abstract "album" entity. Technically it's a group of releases, with a specified type. The type is the same as current release type. Examples:
- Single "Under Pressure" by "Queen & David Bowie"
- Album "The Wall" by "Pink Floyd"
Release
Real-world release object you can buy in your music store. It has release date and country, list of catalog number and label pairs, packaging type and release status. Example:
- 1984 US release of "The Wall" by "Pink Floyd", release on label "Columbia Records" with catalog number "C2K 36183" and UPC "074643618328", it's an official release and comes with two CDs in jewel case.
Medium
Piece of media, included in a release. Contains information about the format, position in the release and an optional title. Has attached CD TOCs. Example:
- CD1 of the 1984 US release of "The Wall" by "Pink Floyd"
- CD2 of the 2005 UK release of "Aerial" by "Kate Bush", named "A Sky of Honey"
Tracklist
Simple list of tracks. Tracklists are used by mediums.
Track
This object is not visible to users on its own, only in the context of a tracklist. It contains a link to a recording, title, artist credit and the position on the tracklist.
Recording
Represents unique audio data. Has title, artist credit, duration, list of PUIDs and ISRCs. Examples (all are different Recordings):
- Album version of the track "Into the Blue" by "Moby"
- Remix "Into the Blue (Buzz Boys Main Room Mayhem mix)" by "Moby"
- Remix "Into the Blue (Underground mix)" by "Moby"
Work
One layer above recordings ("song", "composition", etc.). While recording represents audio data, work represents the composition behind the recording. Advanced Relationships should be used to link recording and work.
- Song "Into the Blue" by "Moby" -- all the recordings listed above will be linked to this object
Label
Same as our current label entity.
Database Schema
This diagrams shows the core database tables and relationships between them:
Because we are adding new core entities, and there are plans for adding more, ARs between them need to be optimized. Only the absolutely necessary information is in l_*_* tables, everything else is shared:
Data Dictionary
- artist = @id + @gid + name + sort name + (type) + (begin date) + (end date) + (gender) + (country) + (comment) + {artist alias}
- artist alias = @id + name
- artist credit = @id + 1{artist + name + (join phrase)}
- label = @id + @gid + name + sort name + (label code) + (type) + (begin date) + (end date) + (country) + (comment) + {label alias}
- label alias = @id + name
- release group = @id + @gid + artist credit + name + (type) + (comment) + {release}
- release = @id + @gid + artist credit + name + (barcode) + (date) + (country) + (status) + (packaging) + (language) + (script) + {label + catalog number} + {medium} + (comment)
- medium = @id + position + (format) + (name) + tracklist
- tracklist = @id + 1{recording + name + artist credit + length + position}
- recording = @id + @gid + artist credit + name + length + (comment) + {puid} + {isrc}
- work = @id + @gid + artist credit + name + {work alias}
- work alias = @id + name
- date = (year) + (month) + (day)
Data conversion
Next Generation Schema Next_Generation_Schema/Migration_Plan.
See Next_Generation_Schema/Migration_Notes.
For ARs see Next Generation Schema/Release_Relationships_Conversion and Next Generation Schema/Track Relationships Conversion
Some edge cases are compiled in Next Generation Schema/Conversion_Edge_Cases.
Code Changes
TODO
http://java.sun.com/blueprints/corej2eepatterns/Patterns/DataAccessObject.html
Caching
- memcache:
- MBID are cached in an additional layer to avoid duplication, instead of UUID -> object, there is UUID -> rowid, and then rowid -> object.
- Used to store:
- artist credits
- artists
- labels
- links
- (maybe release groups?)
- Local in-process cache:
- Very fast access, but we have no control over invalidating it in all processes, so there needs to be low TTL for all cached items.
- Used to store:
- release group types
- release statuses
- release packaging types
- medium formats
- artist types
- label types
- countries
- scripts
- languages