MusicBrainz XML Meta Data

The MusicBrainz XML Metadata Format

Introduction

The MusicBrainz XML Metadata Format (MMD) is an XML-based document format for representing music metadata. It has been designed to be easy to read, powerful, and extensible. MMD is the official successor of the old RDF-based metadata format, which was popular among semantic web enthusiasts but was not widely adopted otherwise because of its perceived complexity.

The initial, intended use of MMD is in the MusicBrainz XML API, but it may be useful for other applications, too. Third-party use and extensions to the format are discussed later in this document.

The official format description is a Relax NG schema. You can browse and access the schema and example documents via GitHub or clone it locally:

git clone https://github.com/metabrainz/mmd-schema.git

Questions about MMD can either be asked on the IRC channel or posted to mb-devel.

The rest of this document is a loose collection of examples and descriptions. Feel free to improve it to make it a more valuable resource for developers.

Structure

All valid MMD documents are in the http://musicbrainz.org/ns/mmd-1.0# namespace. The following is a perfectly valid, but empty, MMD document:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <!-- no elements -->
</metadata>

The metadata element may contain one of the following elements: artist, release, track, artist-list, release-list, track-list. Although the schema permits more than one of these elements to appear as children of metadata, it usually just contains one.
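
As a rough sketch of how a client might inspect the top-level structure, the following Python snippet uses the standard library's xml.etree.ElementTree to check the namespace and list the children of the metadata element. The helper name top_level_elements is hypothetical and not part of any MusicBrainz library.

```python
import xml.etree.ElementTree as ET

MMD_NS = "http://musicbrainz.org/ns/mmd-1.0#"

EMPTY_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <!-- no elements -->
</metadata>"""


def top_level_elements(xml_text):
    """Return the local names of the children of the metadata element."""
    root = ET.fromstring(xml_text)
    # ElementTree spells qualified names as "{namespace}local-name".
    if root.tag != "{%s}metadata" % MMD_NS:
        raise ValueError("not an MMD document: %r" % root.tag)
    # Comments are dropped by the default parser, so only elements remain.
    return [child.tag.split("}", 1)[-1] for child in root]


print(top_level_elements(EMPTY_DOC))
```

A document in a different namespace, or with a different root element, is rejected with a ValueError instead of being silently misread.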

The artist, release, and track elements

The artist, release and track elements represent a corresponding entity from the MusicBrainz database. This MMD document represents an artist:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="Person">
        <name>Tori Amos</name>
        <sort-name>Amos, Tori</sort-name>
        <life-span begin="1963-08-22"/>
        <release-list>
            <!-- releases omitted -->
        </release-list>
    </artist>
</metadata>

If there are two artists with the same name in the database, a disambiguation element can be used. It could contain the decade the artist was active, or the genre.
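
The sketch below shows one way to read the artist document with Python's standard library, treating disambiguation as an optional child. The function name read_artist and the dictionary keys are illustrative choices, not an official API; the embedded sample is the artist document from above with the release-list removed.

```python
import xml.etree.ElementTree as ET

NS = {"mmd": "http://musicbrainz.org/ns/mmd-1.0#"}

ARTIST_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="Person">
        <name>Tori Amos</name>
        <sort-name>Amos, Tori</sort-name>
        <life-span begin="1963-08-22"/>
    </artist>
</metadata>"""


def read_artist(xml_text):
    """Collect the basic fields of the artist element into a dict."""
    artist = ET.fromstring(xml_text).find("mmd:artist", NS)
    if artist is None:
        return None
    life_span = artist.find("mmd:life-span", NS)
    return {
        "id": artist.get("id"),
        "type": artist.get("type"),
        "name": artist.findtext("mmd:name", namespaces=NS),
        "sort_name": artist.findtext("mmd:sort-name", namespaces=NS),
        # Optional element: findtext returns None when it is absent.
        "disambiguation": artist.findtext("mmd:disambiguation", namespaces=NS),
        "begin": life_span.get("begin") if life_span is not None else None,
    }


artist = read_artist(ARTIST_DOC)
print(artist["name"], artist["begin"], artist["disambiguation"])
```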

The following document represents a release:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <release id="02232360-337e-4a3f-ad20-6cdd4c34288c" type="Album Official">
        <title>Little Earthquakes</title>
        <text-representation language="ENG" script="Latn"/>
        <asin>B000002IT2</asin>
        <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="Person">
            <name>Tori Amos</name>
            <sort-name>Amos, Tori</sort-name>
            <life-span begin="1963-08-22"/>
        </artist>
        <release-event-list>
            <event date="1992-01-13" country="GB"/>
            <event date="1992-01-17" country="DE"/>
            <event date="1992-02-25" country="US"/>
        </release-event-list>
        <disc-list>
            <disc id="ILKp3.bZmvoMO7wSrq1cw7WatfA-"/>
            <disc id="ejdrdtX1ZyvCb0g6vfJejVaLIK8-"/>
            <disc id="Y96eDQZbF4Z26Y5.Sxdbh3wGypo-"/>
        </disc-list>
        <track-list count="12"/>
    </release>
</metadata>

The text-representation element gives the language and script the release and track titles are written in. The asin element contains the Amazon Standard Identification Number (ASIN), the product identifier used by Amazon's shops.
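
A minimal sketch of reading such a release document in Python follows. It assumes the type attribute is a space-separated list (as the "Album Official" value suggests); read_release and its result keys are hypothetical names, and the sample is the release document above without the artist and disc-list elements.

```python
import xml.etree.ElementTree as ET

NS = {"mmd": "http://musicbrainz.org/ns/mmd-1.0#"}

RELEASE_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <release id="02232360-337e-4a3f-ad20-6cdd4c34288c" type="Album Official">
        <title>Little Earthquakes</title>
        <text-representation language="ENG" script="Latn"/>
        <asin>B000002IT2</asin>
        <release-event-list>
            <event date="1992-01-13" country="GB"/>
            <event date="1992-01-17" country="DE"/>
            <event date="1992-02-25" country="US"/>
        </release-event-list>
        <track-list count="12"/>
    </release>
</metadata>"""


def read_release(xml_text):
    """Extract title, types, text representation, ASIN and events."""
    release = ET.fromstring(xml_text).find("mmd:release", NS)
    text_rep = release.find("mmd:text-representation", NS)
    events = [
        (event.get("date"), event.get("country"))
        for event in release.findall("mmd:release-event-list/mmd:event", NS)
    ]
    return {
        "title": release.findtext("mmd:title", namespaces=NS),
        # Assumption: the type attribute holds several space-separated values.
        "types": release.get("type", "").split(),
        "language": text_rep.get("language") if text_rep is not None else None,
        "script": text_rep.get("script") if text_rep is not None else None,
        "asin": release.findtext("mmd:asin", namespaces=NS),
        "events": events,
    }


release = read_release(RELEASE_DOC)
# ISO-style dates sort correctly as plain strings.
earliest = min(release["events"])
print(release["title"], release["types"], earliest)
```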

The following MMD document describes a track:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
        <title>Silent All These Years</title>
        <duration>253466</duration>
        <puid-list>
            <puid id="c2a2cee5-a8ca-4f89-a092-c3e1e65ab7e6"/>
            <puid id="db5b66b3-fa97-4cfa-9296-de7e57ef05f4"/>
            <puid id="fbb3d1b6-f9e7-49ed-834c-03e9ec9726e5"/>
            <puid id="0e2e66e3-13d7-4138-af90-5ec8a4c2db99"/>
            <puid id="4778d58b-50fe-4004-8cd6-462a816114c8"/>
            <puid id="88541d46-9b74-4835-b8e8-d985fc77e02a"/>
            <puid id="42ab76ea-5d42-4259-85d7-e7f2c69e4485"/>
        </puid-list>
    </track>
</metadata>

There may also be an artist element, which is left out in this example. The puid-list element lists all PUIDs (acoustic fingerprint identifiers) associated with this track.
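
The following Python sketch reads a track document and formats its duration. This page does not state the unit of the duration element; the values in the examples are consistent with milliseconds, which is what this sketch assumes. The names read_track and format_duration are made up for illustration, and the sample keeps only two of the PUIDs from the document above.

```python
import xml.etree.ElementTree as ET

NS = {"mmd": "http://musicbrainz.org/ns/mmd-1.0#"}

TRACK_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
        <title>Silent All These Years</title>
        <duration>253466</duration>
        <puid-list>
            <puid id="c2a2cee5-a8ca-4f89-a092-c3e1e65ab7e6"/>
            <puid id="db5b66b3-fa97-4cfa-9296-de7e57ef05f4"/>
        </puid-list>
    </track>
</metadata>"""


def format_duration(milliseconds):
    """Format a duration as minutes:seconds (assumes milliseconds)."""
    seconds = milliseconds // 1000
    return "%d:%02d" % (seconds // 60, seconds % 60)


def read_track(xml_text):
    """Extract id, title, duration and PUIDs from a track document."""
    track = ET.fromstring(xml_text).find("mmd:track", NS)
    duration_text = track.findtext("mmd:duration", namespaces=NS)
    return {
        "id": track.get("id"),
        "title": track.findtext("mmd:title", namespaces=NS),
        "duration": int(duration_text) if duration_text else None,
        "puids": [p.get("id") for p in track.findall("mmd:puid-list/mmd:puid", NS)],
    }


track = read_track(TRACK_DOC)
print(track["title"], format_duration(track["duration"]), len(track["puids"]))
```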

The list elements

In addition to the elements discussed in the last section, MMD supports several list elements. The metadata element may contain artist-list, release-list, and track-list elements; the release element can contain a track-list element (among others); and so on.

Usually, a list contains other elements. A track-list contains track elements, a puid-list contains puid elements, and so on. In some cases, it is not possible to include all elements of the list because there would be too many. In this case, the count and offset attributes can be used, which are available for all lists in MMD. The following example shows a partial track list:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <track-list count="147" offset="0">
        <track id="748f2b79-8c50-4581-adb1-7708118a48fc">
            <title>Little Earthquakes</title>
            <duration>457760</duration>
            <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5">
                <name>Tori Amos</name>
            </artist>
            <!-- ... -->
        </track>
        <track id="51d2c2ff-a5fd-44f9-9c1c-7ca9fdc7dd1d">
            <title>Little Earthquakes</title>
            <duration>413693</duration>
            <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5">
                <name>Tori Amos</name>
            </artist>
            <!-- ... -->
        </track>
    </track-list>
</metadata>

The full track-list would have 147 entries (the count attribute), and this partial list starts at the first element (offset is zero-based).
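
To illustrate how count and offset can drive paging, here is a small Python sketch that works out where the next partial list would start. It assumes that a missing offset means zero and that a missing count means the list is complete; neither default is stated on this page. The helper name page_info is hypothetical, and the sample tracks are trimmed to their titles.

```python
import xml.etree.ElementTree as ET

NS = {"mmd": "http://musicbrainz.org/ns/mmd-1.0#"}

PARTIAL_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <track-list count="147" offset="0">
        <track id="748f2b79-8c50-4581-adb1-7708118a48fc">
            <title>Little Earthquakes</title>
        </track>
        <track id="51d2c2ff-a5fd-44f9-9c1c-7ca9fdc7dd1d">
            <title>Little Earthquakes</title>
        </track>
    </track-list>
</metadata>"""


def page_info(xml_text, list_name="track-list"):
    """Summarise a partial list and compute the offset of the next page."""
    lst = ET.fromstring(xml_text).find("mmd:%s" % list_name, NS)
    returned = len(lst)  # number of child elements actually present
    offset = int(lst.get("offset", "0"))  # assumed default: 0
    count_attr = lst.get("count")
    # Assumed default: without count, the list is taken to be complete.
    count = int(count_attr) if count_attr is not None else offset + returned
    next_offset = offset + returned
    return {
        "count": count,
        "offset": offset,
        "returned": returned,
        "next_offset": next_offset if next_offset < count else None,
    }


info = page_info(PARTIAL_DOC)
print(info)
```

A next_offset of None signals that the partial list already reaches the end of the full list.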

Relations

Relations are the most complex part of MMD. An artist, release, or track can be related to any other artist, release, track, or URL. This is what it looks like:

<?xml version="1.0" encoding="UTF-8"?>

<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="Person">
        <name>Tori Amos</name>
        <sort-name>Amos, Tori</sort-name>
        <life-span begin="1963-08-22"/>
        <relation-list target-type="Artist">
            <relation type="MemberOfBand" target="ee361394-914f-4074-891a-3b17b0f89a37">
                <artist id="ee361394-914f-4074-891a-3b17b0f89a37" type="Group">
                    <name>Y Kant Tori Read</name>
                    <sort-name>Y Kant Tori Read</sort-name>
                </artist>
            </relation>
            <relation type="Married" direction="backward" target="07538f3d-81d5-4c04-923e-4542b8ac9dbc" begin="1998">
                <artist id="07538f3d-81d5-4c04-923e-4542b8ac9dbc" type="Person">
                    <name>Mark Hawley</name>
                    <sort-name>Hawley, Mark</sort-name>
                </artist>
            </relation>
        </relation-list>
        <relation-list target-type="Release">
            <relation type="Producer" target="5201f3a4-4a9e-4980-909e-4ee3262334d0">
                <release id="5201f3a4-4a9e-4980-909e-4ee3262334d0" type="Album Official">
                    <title>From the Choirgirl Hotel</title>
                    <text-representation language="ENG" script="Latn"/>
                </release>
            </relation>
            <relation type="Vocal" attributes="Lead" target="44a57f0f-3de5-4e5f-a4eb-9334b4eac4f0">
                <release id="44a57f0f-3de5-4e5f-a4eb-9334b4eac4f0" type="Album Official">
                    <title>Y Kant Tori Read</title>
                </release>
            </relation>
        </relation-list>
        <relation-list target-type="Track">
            <relation type="Performer" target="0fcae62b-906c-4e61-b036-ed9c8fe8dd57">
                <track id="0fcae62b-906c-4e61-b036-ed9c8fe8dd57">
                    <title>Blue Skies (feat. Tori Amos)</title>
                    <duration>306040</duration>
                </track>
            </relation>
            <relation type="Performer" target="b6f882aa-6d53-4c2a-8bf2-d358e997a087">
                <track id="b6f882aa-6d53-4c2a-8bf2-d358e997a087">
                    <title>Down by the Seaside (feat. Tori Amos)</title>
                    <duration>256414</duration>
                </track>
            </relation>
        </relation-list>
        <relation-list target-type="Url">
            <relation type="OfficialHomepage" target="http://www.toriamos.com/"/>
            <relation type="Wikipedia" target="http://en.wikipedia.org/wiki/Tori_Amos"/>
            <relation type="IMDb" target="http://www.imdb.com/name/nm0002169/"/>
            <relation type="Fanpage" target="http://www.thedent.com/index.php"/>
            <relation type="Discography" target="http://www.yessaid.com/albums.html"/>
            <relation type="Discography" target="http://www.hereinmyhead.com/"/>
        </relation-list>
    </artist>
</metadata>

In the XML document above, there are four relation-list elements, each representing relations from the artist Tori Amos to a different target type (Artist, Release, Track, and Url).
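
One possible way to walk these relation lists in Python is sketched below: it groups the relations by the target-type of their enclosing relation-list. The function name relations_by_target_type is illustrative, the sample is a trimmed version of the document above (the nested target elements are left out), and the attributes value is assumed to be a space-separated list.

```python
import xml.etree.ElementTree as ET

NS = {"mmd": "http://musicbrainz.org/ns/mmd-1.0#"}

RELATIONS_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#">
    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="Person">
        <name>Tori Amos</name>
        <relation-list target-type="Artist">
            <relation type="MemberOfBand" target="ee361394-914f-4074-891a-3b17b0f89a37"/>
            <relation type="Married" direction="backward" target="07538f3d-81d5-4c04-923e-4542b8ac9dbc" begin="1998"/>
        </relation-list>
        <relation-list target-type="Release">
            <relation type="Vocal" attributes="Lead" target="44a57f0f-3de5-4e5f-a4eb-9334b4eac4f0"/>
        </relation-list>
        <relation-list target-type="Url">
            <relation type="OfficialHomepage" target="http://www.toriamos.com/"/>
        </relation-list>
    </artist>
</metadata>"""


def relations_by_target_type(xml_text):
    """Group an artist's relations by the target-type of their list."""
    artist = ET.fromstring(xml_text).find("mmd:artist", NS)
    grouped = {}
    for rel_list in artist.findall("mmd:relation-list", NS):
        relations = grouped.setdefault(rel_list.get("target-type"), [])
        for rel in rel_list.findall("mmd:relation", NS):
            relations.append({
                "type": rel.get("type"),
                "target": rel.get("target"),
                # None when the attribute is absent; no default is assumed.
                "direction": rel.get("direction"),
                # Assumption: attributes is a space-separated list.
                "attributes": rel.get("attributes", "").split(),
                "begin": rel.get("begin"),
            })
    return grouped


grouped = relations_by_target_type(RELATIONS_DOC)
for target_type, relations in grouped.items():
    print(target_type, [r["type"] for r in relations])
```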

IDs and Types

The IDs and types used in the XML format are URIs. To keep the transmission overhead low, all URIs in the MusicBrainz namespace may be used in their relative form. So if a track's fully qualified id is http://musicbrainz.org/track/d6118046-407d-4e06-a1ba-49c399a4c42f, it may be shortened to d6118046-407d-4e06-a1ba-49c399a4c42f in the XML. Note that this shortening is only allowed for URIs from the MusicBrainz namespace.

The following rules apply to create a fully qualified URI from a relative one:

  • The id attributes:
    • artist: http://musicbrainz.org/artist/relative URI
    • release: http://musicbrainz.org/release/relative URI
    • track: http://musicbrainz.org/track/relative URI
  • The type attributes:
    • artist: http://musicbrainz.org/ns/mmd-1.0#relative URI
    • release: http://musicbrainz.org/ns/mmd-1.0#relative URI (for each relative URI in the list)

Due to their large number, relations are in a namespace of their own to avoid clashes:

  • Various relation attributes:
    • type: http://musicbrainz.org/ns/rel-1.0#relative URI
    • attributes: http://musicbrainz.org/ns/rel-1.0#relative URI (for each relative URI in the list)
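
The rules above can be sketched in Python as follows. The function names expand_id and expand_types are hypothetical; a value is treated as already absolute when it carries a URI scheme, which is an implementation choice of this sketch rather than something the schema prescribes.

```python
from urllib.parse import urlsplit

MMD_NS = "http://musicbrainz.org/ns/mmd-1.0#"
REL_NS = "http://musicbrainz.org/ns/rel-1.0#"

ID_BASES = {
    "artist": "http://musicbrainz.org/artist/",
    "release": "http://musicbrainz.org/release/",
    "track": "http://musicbrainz.org/track/",
}


def is_absolute(uri):
    """Treat any value with a URI scheme (e.g. http) as fully qualified."""
    return urlsplit(uri).scheme != ""


def expand_id(entity, value):
    """Expand a relative id of an artist, release or track."""
    return value if is_absolute(value) else ID_BASES[entity] + value


def expand_types(value, namespace=MMD_NS):
    """Expand each relative URI in a space-separated type or attributes list."""
    return [v if is_absolute(v) else namespace + v for v in value.split()]


print(expand_id("track", "d6118046-407d-4e06-a1ba-49c399a4c42f"))
print(expand_types("Album Official"))
print(expand_types("MemberOfBand", REL_NS))
```

Passing REL_NS as the namespace covers the type and attributes values of relation elements; values from other namespaces are already absolute and pass through unchanged.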

Note: Don't confuse the URIs, especially the id URIs, with URLs. The URIs are just names and should not be used to query data from the server. However, they are in a permanent format that will always be valid and can easily be transformed into URLs. Example:

The following is an absolute, permanent MusicBrainz artist identifier which is the preferred representation. Shorter representations may be used for storing IDs in file tags or databases.

    http://musicbrainz.org/artist/c0b2500e-0cef-4130-869d-732b23ed9df5

This one is a URL created from the URI above, using a simple transformation. It can be used to request data from the MusicBrainz server via the API. URLs may change over time; the URIs will not.

    http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5

Note: When requesting XML data, you should set the type parameter to xml in the query string of the GET request. Currently only type xml is supported; more types might be supported in the future. For example:

    http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5?type=xml
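
The transformation from URI to URL is not spelled out here, but judging from the two examples it amounts to inserting the /ws/1 path prefix and appending the type parameter. The following Python sketch implements that reading; the function name uri_to_ws_url is hypothetical, the mapping is inferred from the examples only, and the code builds the URL without contacting any server.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

ENTITIES = ("artist", "release", "track")


def uri_to_ws_url(uri, ws_prefix="/ws/1"):
    """Turn a permanent id URI into a web service URL (inferred mapping)."""
    parts = urlsplit(uri)
    if parts.netloc != "musicbrainz.org":
        raise ValueError("not a MusicBrainz URI: %r" % uri)
    # The path has the form /<entity>/<id>.
    entity, _, entity_id = parts.path.strip("/").partition("/")
    if entity not in ENTITIES or not entity_id:
        raise ValueError("unexpected id URI: %r" % uri)
    path = "%s/%s/%s" % (ws_prefix, entity, entity_id)
    query = urlencode({"type": "xml"})
    return urlunsplit((parts.scheme, parts.netloc, path, query, ""))


print(uri_to_ws_url("http://musicbrainz.org/artist/c0b2500e-0cef-4130-869d-732b23ed9df5"))
```

Keeping the prefix as a parameter reflects the point that URLs may change while the URIs stay fixed: only the transformation needs updating, not the stored identifiers.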

Using the XML format for other applications

The XML format uses URIs for several IDs and types. The artist element, for example, has a type attribute which accepts a URI. Users of the format may use the URIs defined by MusicBrainz or use one from their own namespace. This example uses a definition from MusicBrainz:

    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="http://musicbrainz.org/ns/mmd-1.0#Group"/>

If an application needs "Orchestra" as an artist type, a different namespace has to be used:

    <artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="http://example.org/ext-7.2#Orchestra"/>

This method may be used in all places where the schema accepts an anyURI datatype. As mentioned earlier, there is a special rule for all URIs defined by MusicBrainz: They may be relative, as you can see in the examples above. The complete artist ids have the form http://musicbrainz.org/artist/c0b2500e-0cef-4130-869d-732b23ed9df5 and the type attribute in the first example could also be written as Group, without the namespace prefix.

To extend the format even further, the schema has several extension points (see def_extension) which allow adding arbitrary XML elements from a user-defined namespace. Using the MMD namespace or no namespace at all is not permitted for these elements. There are no namespace restrictions inside such an element, however, and unlimited nesting is possible, too.

If your private namespace is http://example.org/ext-9.1# and you want to add data from a rating system, for example, it could be coded like this:

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#" xmlns:ext="http://example.org/ext-9.1#">
    <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
        <title>Silent All These Years</title>
        <duration>253466</duration>
        <ext:rating value="9"/>
    </track>
</metadata>

Even more complicated things, like nested tags, are possible. Note that the em element doesn't belong to the ext namespace; because it has no prefix, it falls into the document's default (MMD) namespace.

<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#" xmlns:ext="http://example.org/ext-9.1#">
    <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
        <title>Silent All These Years</title>
        <duration>253466</duration>
        <ext:annotation>This is a <em>very</em> nice song.</ext:annotation>
    </track>
</metadata>

This is still valid according to the schema, but inside the extension elements, only well-formedness can be checked.
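
A consumer that wants to pick up such extension data could, as a rough sketch, collect every child element whose namespace is neither empty nor the MMD namespace. The helper names split_qname and extension_children below are made up for this example, and the sample is the rating document from above.

```python
import xml.etree.ElementTree as ET

MMD_NS = "http://musicbrainz.org/ns/mmd-1.0#"

EXT_DOC = """<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#" xmlns:ext="http://example.org/ext-9.1#">
    <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
        <title>Silent All These Years</title>
        <duration>253466</duration>
        <ext:rating value="9"/>
    </track>
</metadata>"""


def split_qname(tag):
    """Split ElementTree's "{namespace}local" form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


def extension_children(element):
    """Return (namespace, local name, attributes) for non-MMD children."""
    found = []
    for child in element:
        namespace, local = split_qname(child.tag)
        # Extension elements must live in a namespace other than MMD's.
        if namespace is not None and namespace != MMD_NS:
            found.append((namespace, local, dict(child.attrib)))
    return found


track = ET.fromstring(EXT_DOC).find("{%s}track" % MMD_NS)
extensions = extension_children(track)
print(extensions)
```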

Credits

The MusicBrainz XML Metadata Format has been designed by Matthias Friedrich and Robert Kaye. Most of the schema as well as the test suite have been written by Matthias Friedrich.



TODO

  • explain type, attributes, reading direction, dates in the relations section
  • what else?