History:Track Grouping Proposal

From MusicBrainz Wiki
Revision as of 21:06, 11 November 2005 by Shepard (talk | contribs) ((Imported from MoinMoin))

Status: This is a proposal for a database schema change that will allow tracks that have something in common to be grouped together on several levels. If it gets sanctioned it will probably be part of the NextGenerationSchema.


Note: this is NOT about groups of tracks on albums (like several parts of a song or movements).

Why grouping?

Ever tried to create an instance of CoverRelationshipType? It's a mess! You have to look up the earliest release and, if the cover is released on several albums, link all of them, and normally you should also link all of the original releases together. But the real problem unfolds if you try to trace back such a relationship chain. Was this song covered? -> Look up the earliest release and see if it has a cover relationship. If you want to see the real mess I suggest you read DontMakeRelationshipClusters (and perhaps also LinkingDifferentArtistNames).

The problems listed there can be solved simply and effectively by adding group entities to the database which represent a more abstract layer of what the member objects have in common. That applies not only to tracks but also to albums and artists (see AlbumRework and AdvancedEntity). Such a group, functioning as a representative, can be used as the target of AdvancedRelationships such as CoverRelationshipType.

Developing a group model for tracks

If you want to group tracks you have to realise that you cannot simply say "track A, B and C have the same name so group them". You actually need several layers which build a hierarchy, becoming more abstract from the bottom to the top.

Let's roll this up from the top, where it starts in the artist's mind: the most abstract thing is a song. The artist(s) develop the idea for a song, and at first this is an abstract thing you cannot touch.

Then the artists start composing and writing lyrics. Sometimes songs are recomposed in a different way (for acoustic performance, for example) or lyrics are changed - but the basic idea stays the same. So there are different compositions which are about the same song. This is more "touchable" in that you can see the lyrics and notes.

After that the artists start producing audio material - either by recording a song in a studio or by performing it live (where it is then also recorded, as we don't care about unrecorded performances in MusicBrainz). The song may be re-recorded in the studio or, as said, played live - then you have different recordings/performances using the same composition, which builds the next layer. Also: if a song is covered, the performer changes and a new recording is produced but the composition stays the same.

The recorded audio material is - after being refined on computers - stored as the original version and then released. But this material may also be edited or mixed several times. So for one recording there are several versions/mixes.

You could think that here we have reached the track, the thing that is released. Unfortunately we have not. If the original version is released on an album it is often faded into the other tracks, which changes the audio data slightly. On a single it is mostly not faded. On a compilation, again, the original recording or a mixed version may be faded into the other tracks and adjusted in loudness - this is what is called a DJ-mix. Ergo: one version/mix, several unique tracks. Unique as in being completely identical in the audio data (which of course affects the duration), not only in the name. Note: if one unique track really does happen to be released on several albums we can still use the same object. This is for example the case when one album is released in two versions which differ only in that one has bonus tracks.

So to sum this up in a graphic:

firstmodel.png

Rather simple. Please note that every group normally contains several sub-items, not only one.

Applying the model to MB

This all sounds ok but how to tell the users? And what's the use? Ok, let's try to simplify it and see where we can add which fields and use what AdvancedRelationshipTypes.

First of all, those are too many groups. They are not easy for our editors to use. What problems would we run into if we merged the song layer with the composition layer? To answer that we have to examine how often a song is recomposed and how important it is for users to differentiate between the compositions. I don't know how often this happens with classical stuff but for the rest I guess it's rather rare. Like I said: acoustic versions, for example, could use another composition. So when we come to those cases we could either ignore the small composition/lyrics changes and store them all under one song group - or create two separate song groups if it's really important - like with covers: you would create two separate song groups for the original and the cover and link them with AdvancedRelationships.

Can we then merge the song layer with the recording/performance layer? No. One composer, different performers. That's important. Can recording/performance be merged with version/mix? No, the mix versions have completely different audio material and track titles, and often different mix artists too. But can we merge version/mix with track? Tricky... it depends on how accurate you want to be. The different tracks can still have different durations so I'd say better not.

Ok, at least we reduced it to 4 layers. Now what data can we store where? AdvancedRelationships no longer have to be linked to tracks only but can also be linked to groups. Most types can even be restricted to certain groups. I'm doing this only for some classes/types now and will extend it later. A song group can be the target of all AdvancedRelationships of the CompositionRelationshipClass and the origin and target of CoverRelationshipType. Recording groups are the target of the PerformanceRelationshipClass. Version/mix groups can be the target of RemixerRelationshipType and the origin and target of RemixRelationshipType.
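The restrictions just listed could be written down as a small lookup table. This is only a sketch: the `Layer` enum, the `ALLOWED_LAYERS` table and the `may_link` function are hypothetical names, and treating unlisted relationship types as unrestricted is an assumption based on "most types", not something decided in this proposal.

```python
from enum import Enum


class Layer(Enum):
    SONG = "song"
    RECORDING = "recording"
    VERSION = "version/mix"
    TRACK = "track"


# Layers each relationship class/type may be attached to, as listed above.
# For CoverRelationshipType and RemixRelationshipType this covers both the
# origin and the target end of the link.
ALLOWED_LAYERS = {
    "CompositionRelationshipClass": {Layer.SONG},
    "CoverRelationshipType": {Layer.SONG},
    "PerformanceRelationshipClass": {Layer.RECORDING},
    "RemixerRelationshipType": {Layer.VERSION},
    "RemixRelationshipType": {Layer.VERSION},
}


def may_link(relationship_type: str, layer: Layer) -> bool:
    """Return True if the relationship type may be attached to this layer.

    Assumption: types missing from the table are not restricted at all.
    """
    allowed = ALLOWED_LAYERS.get(relationship_type)
    return allowed is None or layer in allowed
```

With such a table an edit form could refuse, for example, a PerformanceRelationshipClass link that points at a song group.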

What other data is there apart from AdvancedRelationships? Well, the duration for example. That clearly belongs to the track level. Then we have AudioFingerprints - it depends on their accuracy whether they should be attached to track objects or version groups. This is a decision for the developers. But there is one thing that is not under our control: the ISRC, the International Standard Recording Code. This is a (young) code system that assigns unified code numbers to recordings of songs, which then come with the CD or whatever medium. We could store those as well. The only question is: how precisely are they defined, i.e. on which level would they belong? http://www.ifpi.org/isrc/isrc_faq.html#Heading44 is quite clear about that and makes me think they belong to the track layer - but that still has to be checked properly.

You have to keep the following in mind: those groups are only there for abstraction and for attaching meta data non-redundantly. They are helpers - what albums and artists link to are still the tracks and none of the layers above. When tracks are presented, though, they inherit all AdvancedRelationships from their super groups (which makes it a bit slower) and should also present links to their super groups.
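A minimal sketch of that inheritance, assuming every object has at most one super group: to present a track you walk up the chain and collect the relationships found on each layer. The `Node` class and `inherited_relationships` function are hypothetical names, and the relationship strings are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A track or one of its super groups (version, recording, song)."""
    name: str
    parent: Optional["Node"] = None
    relationships: List[str] = field(default_factory=list)


def inherited_relationships(node: Node) -> List[str]:
    """Collect the node's own relationships plus those of all super groups."""
    collected: List[str] = []
    current: Optional[Node] = node
    while current is not None:
        collected.extend(current.relationships)
        current = current.parent  # one extra lookup per layer
    return collected


song = Node("song group", relationships=["composed by A"])
recording = Node("recording group", parent=song, relationships=["performed by B"])
version = Node("version group", parent=recording, relationships=["remixed by C"])
track = Node("track", parent=version)
```

Each layer costs one extra lookup per track, which is where the slowdown mentioned above comes from.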

All in all it could look like this:

secondmodel.png

Converting to this model

This divides into two parts: the first is done automatically when converting to the new schema. For every track we have, a new version group, recording group and song group are created. The AdvancedRelationships can then easily be moved to the correct group. The next step is the hard work for us: merging the groups. This should be an AutoEdit for AutoEditors and go to vote for normal users.
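The two steps might look roughly like this. It is only a sketch of the idea, not the real conversion script: `convert` and `merge_groups` are hypothetical names, rows are plain dictionaries, and group ids are just running numbers.

```python
from itertools import count


def convert(track_titles):
    """Step 1 (automatic): give every track its own version, recording and song group."""
    ids = count(1)
    rows = []
    for title in track_titles:
        rows.append({
            "track": title,
            "version": next(ids),
            "recording": next(ids),
            "song": next(ids),
        })
    return rows


def merge_groups(rows, layer, keep, drop):
    """Step 2 (editors): repoint every track from group `drop` to group `keep` on one layer."""
    for row in rows:
        if row[layer] == drop:
            row[layer] = keep


rows = convert(["Example Song (album)", "Example Song (live)"])
# An editor decides both tracks are the same song but different recordings.
merge_groups(rows, "song", keep=rows[0]["song"], drop=rows[1]["song"])
```

After the merge both tracks share one song group while keeping separate recording and version groups.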

An example:

tree.png

Problems

Fast and usable or non-redundant?

Requesting all the information from a track's super groups will make the album view slower anyway - as will all the db schema changes in AlbumRework, I guess. But there is one open question. Should titles be stored redundantly? That is: if a TrackEntity stores its title, should the groups above do so too? Or the other way round: should only the song store a title, with all sub groups that don't differ from it having an empty title? That would reduce redundancy (less risk of inconsistency) but also be a bit more complicated to use and require more lookups in the database to actually retrieve titles.
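The non-redundant variant could be sketched like this: an empty title means "same as the group above", so retrieving a title may take several lookups. `Entity` and `effective_title` are hypothetical names and the titles are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """A track or group; `title` is None when it does not differ from the parent's."""
    title: Optional[str]
    parent: Optional["Entity"] = None


def effective_title(entity: Entity) -> Optional[str]:
    """Walk up the hierarchy until a stored title is found."""
    current: Optional[Entity] = entity
    while current is not None:
        if current.title:
            return current.title
        current = current.parent
    return None


song = Entity("Example Song")
recording = Entity(None, song)
original = Entity(None, recording)
remix = Entity("Example Song (Some Remix)", recording)
album_track = Entity(None, original)
remix_track = Entity(None, remix)
```

Here `album_track` needs three extra lookups to find its title while `remix_track` needs one, which shows the trade-off between redundancy and lookup cost.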

Track renaming

Editing a track title for style reasons is no problem. But if a tracklisting is wrong and a track is renamed to the title of a completely different song, this could lead to inconsistencies with the groups of that track. Therefore there should be a way to change the groups of the track, as well as an option "apply rename for version/recording/song group too" when editing track titles. Edits ignoring this would normally get voted down, of course, but that is the optimal case and won't happen for artists without subscribers.

One track multiple songs

We already identified several cases where more than one song/recording is put together in one track: mash-ups, medleys, megamixes and just multiple songs in one track (for example hidden bonus tracks). How is this modelled with groups? I see two possibilities:

Either the combined track can be a member of multiple song groups (with recording and version groups in between, of course). Then we have a direct link to the original material used in the track - AdvancedRelationships like SamplesRelationshipType and MashUpRelationshipType would be nearly obsolete; their only use would be to provide the missing description of what type of combination this is.

The other possibility is: the combined track is just a member of its own song group. The contained songs are linked to this song with the appropriate AdvancedRelationship. If they weren't released separately - like the hidden tracks mentioned above - then an empty abstract song group is simply created and linked to the song group of the combined track with some AdvancedRelationship.

Merging dependency

Example: You want to merge the song and recording groups for two tracks. The song group merge fails while the recording group merge goes through. What happens? It should fail dependency. Better, though, would be: merging groups of one layer automatically merges the connected groups of the higher layers. But this again leads to problems with multi-group membership.
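The "better" variant could be sketched as a merge that first merges the connected groups of the higher layers. This assumes every group has exactly one super group, so it deliberately ignores the multi-group membership problem; `Group`, `resolve` and `merge` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Group:
    layer: str
    parent: Optional["Group"] = None       # group on the next higher layer
    merged_into: Optional["Group"] = None  # set once this group was merged away


def resolve(group: Group) -> Group:
    """Follow merge pointers to the group that survived."""
    while group.merged_into is not None:
        group = group.merged_into
    return group


def merge(keep: Group, drop: Group) -> None:
    """Merge `drop` into `keep`, merging their higher-layer groups first."""
    keep, drop = resolve(keep), resolve(drop)
    if keep is drop:
        return
    if keep.layer != drop.layer:
        raise ValueError("can only merge groups of the same layer")
    if keep.parent is not None and drop.parent is not None:
        merge(keep.parent, drop.parent)
    drop.merged_into = keep


song_a, song_b = Group("song"), Group("song")
recording_a = Group("recording", parent=song_a)
recording_b = Group("recording", parent=song_b)
merge(recording_a, recording_b)
```

After the call both the recording groups and their song groups are merged, so there is no separate song merge left that could fail on its own.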

Special purpose song groups

Analogous to the SpecialPurposeArtists, we would define several song groups that are used by many different tracks. Those can act as collectors for tracks of certain types that were not composed and/or don't have lyrics, for example spontaneous performances at live shows or audiobooks.

Such songs could be: [silence], [instrumental], [solo], [jam], [spokenword], [reading], [narration], ...


Author: Shepard