History:Track Grouping Proposal

From MusicBrainz Wiki
Revision as of 21:49, 11 November 2005 by Shepard (talk | contribs) ((Imported from MoinMoin))

Status: This is a proposal for a database schema change that will allow tracks that have something in common to be grouped on several levels. If it is approved, it will probably be part of the NextGenerationSchema.


Note: this is NOT about groups of tracks on albums (like several parts of a song or movements).

Why grouping?

Ever tried to create an instance of CoverRelationshipType? It's a mess! You have to look up the earliest release and, if the cover is released on several albums, link all of them together (and normally you should also link all of the original releases together). But the real problem unfolds when you try to trace back such a relationship chain. Was this song covered? -> Look up the earliest release and see if it has a cover relationship. If you want to see the real mess, I suggest you read DontMakeRelationshipClusters (and perhaps also LinkingDifferentArtistNames).

The problems listed there can be solved simply and effectively by adding group entities to the database which represent a more abstract layer of what the member objects have in common. This applies not only to tracks but also to albums and artists (see AlbumRework and AdvancedEntity). Such a group, acting as a representative of its members, can be used as the target of AdvancedRelationships, such as CoverRelationshipType.

Developing a group model for tracks

If you want to group tracks, you have to see that you cannot simply say "track A, B and C have the same name so group them". You actually need several layers which build a hierarchy - becoming more abstract from the bottom to the top.

Let's start to roll this up from the top, where it starts in the artist's mind: the most abstract thing is a song. The artist(s) develop the idea for a song and first this is an abstract thing you cannot touch.

Then the artists start composing and writing lyrics. Sometimes songs are recomposed (for acoustic performance for example) in a different way or lyrics are changed, but the basic idea stays the same. So there are different compositions which are about the same song. This is more "touchable" as in you can see the lyrics and notes.

After that, the artists start producing audio material - either by recording a song in a studio or by performing it live (where it is then also recorded - as we don't care about the unrecorded ones in MusicBrainz). The song may be re-recorded in the studio or, as said, played live - then you have different recordings/performances using the same composition so that builds the next layer. Also: if a song is being covered, the performer changes and a new recording is produced but the composition stays the same.

The recorded audio material is, after being refined on computers, stored as the original version and then released. But this material may also be edited or mixed several times. So for one recording there are several versions/mixes.

You could think that we've reached the track here, that is, what is released. Unfortunately, we have not. If the original version is released on an album, it is often faded into the other tracks, which changes the audio data slightly. On a single, it is mostly not faded. On a compilation, again the original recording or a mixed version may be faded into the other tracks and adjusted in loudness - this is what is called a DJ-mix. Ergo: one version/mix, several unique tracks. Unique means having completely identical audio data (which of course affects the duration), not only the same name. Note: if one unique track really is released on several albums, we can still use the same object. This is for example the case when an album is released in two versions that differ only in that one has bonus tracks.

So to sum this up in a graphic:

firstmodel.png

Rather simple. Please note that every group contains several sub items normally, not only one.
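To make the layering concrete, here is a minimal Python sketch of the five layers described above (song, composition, recording/performance, version/mix, track). All class and field names are made up for illustration and are not part of any MusicBrainz schema; the point is only that each layer holds several items of the layer below.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Track:
    # Most concrete layer: one unique piece of audio data.
    title: str
    duration_ms: int


@dataclass
class Version:
    # One version/mix of a recording; may be released as several unique tracks.
    name: str
    tracks: List[Track] = field(default_factory=list)


@dataclass
class Recording:
    # One recording/performance of a composition (studio, live, cover, ...).
    performer: str
    versions: List[Version] = field(default_factory=list)


@dataclass
class Composition:
    # One concrete set of notes and lyrics for a song.
    description: str
    recordings: List[Recording] = field(default_factory=list)


@dataclass
class Song:
    # Most abstract layer: the idea of the song.
    title: str
    compositions: List[Composition] = field(default_factory=list)


def all_tracks(song: Song) -> List[Track]:
    """Walk down all layers and collect every track that belongs to the song."""
    return [
        track
        for composition in song.compositions
        for recording in composition.recordings
        for version in recording.versions
        for track in version.tracks
    ]
```

With this, an album track (faded) and a single track (not faded) of the same original version end up as two Track objects under one Version.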

Applying the model to MB

This all sounds OK but how to tell the users? And what's the use? OK, let's try to simplify it and see where we can add which fields and use what AdvancedRelationshipTypes.

First of all, those are too many groups. Not easy to use for our editors. What problems would we run into if we were to merge the song with the composition layer? For that, we have to examine how often a song is recomposed and how important it is to differentiate between the compositions for users. I don't know how often this happens with classical stuff, but for the rest, I guess it's rather rare. Like I said: acoustic versions, for example, could use another composition. So if we come to those cases, we could either ignore the small composition/lyrics changes and store them all under one song group - or we create two separate song groups if it's really important - like with covers: you would create two separate song groups for the original and the cover and link them with AdvancedRelationships.

Can we then merge the song layer with the recording/performance layer? No. One composer, different performers. That's important. Can recording/performance be merged with version/mix? No, the mix versions have completely different audio material and track titles and also often different mix artists. But can we merge version/mix with track? Tricky... depends on how accurate you want to be. The different tracks still can have different durations, so I'd say better not.

OK, at least we reduced it to 4 layers. Now what data can we store where? AdvancedRelationships now no longer have to be linked to tracks only but also to groups. Most types can even be restricted to certain groups. I'm doing this only for some classes/types now, will extend it later. A song group can be target of all AdvancedRelationships of the CompositionRelationshipClass and origin and target of CoverRelationshipType. Recording groups are target of the PerformanceRelationshipClass. Version/mix groups can be target of RemixerRelationshipType and origin and target of RemixRelationshipType.
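The restrictions just listed could be written down as a simple lookup table. The following Python sketch is only an illustration: the relationship names are the ones from the paragraph above, while the level names, the table layout and the function may_attach are assumptions of mine. It covers only the classes/types mentioned so far.

```python
# Which group level may play which role for which relationship class/type.
# Only the cases named in the text are listed; everything else is rejected.
ALLOWED = {
    "song": {
        "CompositionRelationshipClass": {"target"},
        "CoverRelationshipType": {"origin", "target"},
    },
    "recording": {
        "PerformanceRelationshipClass": {"target"},
    },
    "version": {
        "RemixerRelationshipType": {"target"},
        "RemixRelationshipType": {"origin", "target"},
    },
}


def may_attach(level: str, relationship: str, role: str) -> bool:
    """Return True if a group of this level may take this role in the relationship."""
    return role in ALLOWED.get(level, {}).get(relationship, set())
```

For example, may_attach("song", "CoverRelationshipType", "origin") is allowed, while a recording group is not accepted as an end of a cover relationship.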

What other data is there, apart from AdvancedRelationships? Well, the duration, for example. That clearly belongs to the track level. Then we have AudioFingerprints - whether they should be attached to track objects or to version groups depends on how accurate they are. This is a decision for the developers. But there is one thing that is not under our control: the ISRC, the International Standard Recording Code. This is a (young) code system that assigns standardized code numbers to recordings of songs, which then come with the CD or whatever medium. We could store those as well. The only question is: how precisely are they defined, and so on which level do they belong? http://www.ifpi.org/isrc/isrc_faq.html#Heading44 is quite clear about that and makes me think they belong to the track layer - but that has to be checked carefully.

You have to keep the following in mind: these groups exist only for abstraction and for attaching metadata without redundancy. They are helpers - what albums and artists link to are still the tracks and none of the layers above. When tracks are presented, though, they inherit all AdvancedRelationships from their super groups (which makes it a bit slower) and should also present links to their super groups.
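How a track could collect the relationships of its super groups can be sketched in a few lines of Python. This is a rough illustration under my own assumptions: every node has exactly one parent, relationships are plain strings, and the names Node and inherited_relationships are invented here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    level: str                       # "track", "version", "recording" or "song"
    name: str
    parent: Optional["Node"] = None  # super group, None for a song group
    relationships: List[str] = field(default_factory=list)


def inherited_relationships(node: Optional[Node]) -> List[str]:
    """Collect the node's own relationships plus those of all its super groups."""
    collected: List[str] = []
    while node is not None:
        collected.extend(node.relationships)
        node = node.parent  # one extra lookup per layer, hence "a bit slower"
    return collected
```

The walk up the chain costs one lookup per layer, which is where the slowdown mentioned above comes from.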

All in all it could look like this:

secondmodel.png

Converting to this model

This divides into two parts: the first is done automatically when converting to the new schema. For every track we create a new version group, recording group and song group. The AdvancedRelationships can then easily be moved to the correct group. The next step is the hard work for us: merging the groups. This should be an AutoEdit for AutoEditors and go to vote for normal users.
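The two parts could look roughly like this in Python. It is a sketch under simple assumptions: tracks are identified by integer ids, groups by generated strings, and the functions convert, merge_groups and distinct are hypothetical helpers, not existing MusicBrainz code. Voting and AutoEdit handling are left out.

```python
from typing import Dict, List, Set

# track id -> {"version": group id, "recording": group id, "song": group id}
Membership = Dict[int, Dict[str, str]]


def convert(track_ids: List[int]) -> Membership:
    """Automatic part: every track gets its own version, recording and song group."""
    return {
        tid: {
            "version": f"version-{tid}",
            "recording": f"recording-{tid}",
            "song": f"song-{tid}",
        }
        for tid in track_ids
    }


def merge_groups(membership: Membership, level: str, keep: str, drop: str) -> None:
    """Manual part: move all members of group `drop` into group `keep` on one level."""
    for groups in membership.values():
        if groups[level] == drop:
            groups[level] = keep


def distinct(membership: Membership, level: str) -> Set[str]:
    """All group ids currently in use on the given level."""
    return {groups[level] for groups in membership.values()}
```

After convert([1, 2, 3]) there are three song groups; merging song-2 into song-1 leaves two, while the recording and version groups stay untouched.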

An example:

tree.png

Problems

Fast and usable or non-redundant?

Requesting all the information from a track's super groups will make the album view slower anyway - as will all the db schema changes in AlbumRework, I guess. But there is one open question. Should titles be stored redundantly? That is: if a TrackEntity stores its title, should the groups above store it too? Or the other way round: should only the song store a title, and all sub groups that don't differ from it have an empty title? That would reduce redundancy (less risk of inconsistency) but would also be a bit more complicated to use and require more lookups in the database to actually retrieve titles.
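The non-redundant variant, where a title is stored only on the layer where it starts to differ, would need a lookup like the following. This is a Python sketch with invented names (TitledNode, effective_title); it also counts the lookups to show the extra cost.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class TitledNode:
    title: Optional[str] = None            # None or "" means "same as my super group"
    parent: Optional["TitledNode"] = None  # super group, None for a song group


def effective_title(node: Optional[TitledNode]) -> Tuple[Optional[str], int]:
    """Return the first non-empty title found on the way up, and the lookups needed."""
    lookups = 0
    while node is not None:
        lookups += 1
        if node.title:
            return node.title, lookups
        node = node.parent
    return None, lookups
```

A track of the original version then needs four lookups (track, version, recording, song) to find its title, while a track under a remix group with its own title needs only two.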

Track renaming

Editing a track title for style reasons is no problem. But if a tracklisting is wrong and a track title is renamed to be a completely different song then this could lead to inconsistencies with the groups of that track. Therefore, there should be possibilities to change the groups of the track as well as an option "apply rename for version/recording/song group too" when editing track titles. Edits ignoring this normally would get voted down, of course, but this is the optimal case and won't happen for artists without subscribers.

One track multiple songs

We already identified several cases where more than one song/recording is put together in one track: mash-ups, medleys, megamixes and just multiple songs in one track (for example hidden bonus tracks). How is this modelled with groups? I see two possibilities:

Either the combined track can be a member of multiple song groups (with recording and version groups in between, of course). Then we have a direct link to the original material used in the track - AdvancedRelationships like SamplesRelationshipType and MashUpRelationshipType would be nearly obsolete; their only use would be to provide the otherwise missing description of what type of combination this is.

The other possibility is: the combined track is just member of its own song group. The contained songs are linked to this song with the appropriate AdvancedRelationship. If they weren't released separately - like said hidden tracks - then simply an empty abstract song group is created and linked to the song group of the combined track with some AdvancedRelationship.

Merging dependency

Example: You want to merge the song and recording group for two tracks. The song group merge fails while the recording group merge goes through. What happens? It should fail dependency. Though better would be: merging groups of one layer automatically merges the connected groups of the higher layers. But this again leads to problems with multi-group-membership.
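The "better" behaviour, where merging on one layer automatically merges the connected higher layers, could be sketched like this in Python. It assumes that every group has exactly one super group, so it deliberately ignores the multi-group-membership problem just mentioned. The dict layout and the function name merge_with_cascade are made up for this example.

```python
from typing import Dict, Optional


def merge_with_cascade(parent: Dict[str, Optional[str]], keep: str, drop: str) -> None:
    """Merge group `drop` into `keep`, then do the same for their super groups.

    `parent` maps every track/group id to the id of its super group
    (None for song groups). The dict is modified in place.
    """
    current_keep: Optional[str] = keep
    current_drop: Optional[str] = drop
    while (
        current_keep is not None
        and current_drop is not None
        and current_keep != current_drop
    ):
        # Re-point all members of the dropped group to the kept group.
        for child in list(parent):
            if parent[child] == current_drop:
                parent[child] = current_keep
        # Continue one layer up with the two super groups.
        next_keep = parent[current_keep]
        next_drop = parent.pop(current_drop)
        current_keep, current_drop = next_keep, next_drop
```

Merging two recording groups this way also merges their song groups in the same step, so the case "recording merge goes through, song merge fails" cannot occur.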

Special purpose song groups

Analogous to the SpecialPurposeArtists, we would define several song groups that are used by many different tracks. Those can act as collectors for certain types of tracks that were not composed and/or don't have lyrics, for example spontaneous performances at live shows or audiobooks.

Such songs could be: [silence], [instrumental], [solo], [jam], [spokenword], [reading], [narration], ...


Author: Shepard