History:Development/XML Web Service/Version 1

From MusicBrainz Wiki
Revision as of 21:24, 25 January 2006 by DonRedman (talk | contribs) (no free-text wiki links for unexistant pages (they look as if the'd exist) (Imported from MoinMoin))

Status: Currently under development. Please do not change unless you are told to!




Introduction

The web service discussed in this document is an interface to the MusicBrainz database, which contains a huge amount of music metadata, all maintained by the MusicBrainz community. It is aimed at developers of media players, CD rippers, taggers and other applications requiring music metadata. The service's architecture follows the REST design principles. Interaction with the web service is done using HTTP, and all content is served in a simple but flexible XML format.

This document first describes how the data in MusicBrainz is organized. Users who already have experience in using the website can safely skip this section and start with the specification sections.

The MusicBrainz Metadata Model

MusicBrainz uses an object-oriented schema to model music metadata. The main classes are artist, album and track, each with a different set of attributes and relationships. Apart from their traditional relationships (artists have albums, albums contain tracks), a more complex schema, called AdvancedRelationships, was introduced.

It allows users to link an object of one class to an object of any other class (URLs are permitted, too). Many different link types exist (see AdvancedRelationshipType for a list), which can be used to specify the artist who did background vocals on an album or track, who is married to whom, where an artist's official homepage is and a lot more. Those links themselves may have attributes, with their semantics depending on the link type.

The following sections discuss the main classes in more detail.

The Artist Class

In MusicBrainz, artists always have a unique ID, a name and a SortName. Additionally, an artist may be flagged as Person or Group, and it can have an ArtistBeginDate and an ArtistEndDate. For persons, these are the dates of birth and death; for groups, they are the founding and dissolving dates.

An artist can have any number of albums and relationships to other artists, albums, tracks and URLs.

The Album Class

All albums have a unique ID, a title and one or more tracks. Each album has a release type ("Album", "Compilation", "Single" etc.), a release status ("Official", "Promotion", "Bootleg") and language information. An album may also have release dates, which are represented as a list of (country, date) tuples.

A common use case is to look up an album using a DiscID generated from an Audio CD's table of contents (TOC). An album can have any number of DiscIDs (including none), mostly due to different pressings. In rare cases, where two different CDs have the same TOC, a DiscID may map to more than one album.

Relationships are used to link an album to other albums, artists, tracks and URLs.

The Track Class

Tracks have a unique ID, a title and one main artist. They may also have a length attribute indicating the play time. A track can have any number of TRMs, which are used to look up tracks. TRMs are audio fingerprints generated from music files, but they are not unique, so a TRM can be associated with many tracks.

As with the other classes, a track object can have any number of relationships to artists, albums, tracks and URLs.
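The three classes described above can be sketched as plain data structures. This is only an illustration of the model, not the database schema: the Python names (`sort_name`, `length_ms`, `tracks_for_trm` and so on) are assumptions made for this example, the length unit is assumed to be milliseconds, and AdvancedRelationships are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Artist:
    id: str                           # unique ID
    name: str
    sort_name: str
    type: Optional[str] = None        # "Person" or "Group", if flagged
    begin_date: Optional[str] = None  # birth or founding date
    end_date: Optional[str] = None    # death or dissolving date


@dataclass
class Track:
    id: str
    title: str
    artist: Artist                    # exactly one main artist
    length_ms: Optional[int] = None   # play time (unit is an assumption)
    trms: List[str] = field(default_factory=list)  # TRMs are not unique per track


@dataclass
class Album:
    id: str
    title: str
    tracks: List[Track]               # one or more tracks
    release_type: str                 # e.g. "Album", "Compilation", "Single"
    release_status: str               # e.g. "Official", "Promotion", "Bootleg"
    releases: List[Tuple[str, str]] = field(default_factory=list)  # (country, date)
    disc_ids: List[str] = field(default_factory=list)              # may be empty


def tracks_for_trm(tracks: List[Track], trm: str) -> List[Track]:
    """Return every track carrying the given TRM; there may be several."""
    return [t for t in tracks if trm in t.trms]


portishead = Artist("8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11", "Portishead", "Portishead", "Group")
mysterons = Track("b5d7d380-f43a-4c1f-a5de-694150b093ac", "Mysterons", portishead,
                  length_ms=306200, trms=["58bdde9c-9c47-487a-94ae-1e9c4001bd3d"])
dummy = Album("8f468f36-8c7e-4fc1-9166-50664d267127", "Dummy", [mysterons],
              release_type="Album", release_status="Official",
              releases=[("us", "1994-10-17")],
              disc_ids=["D5LsXhbWwpctL4s5xHSTS_SefQw-"])
```

Because a TRM can map to many tracks, a lookup by TRM returns a list rather than a single track.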

The URL Schema

All MusicBrainz objects (artists, albums, tracks) are modeled as resources. Resources have unique URLs and can be accessed using standard HTTP. Each resource is also part of a collection. This is a special resource which represents all objects of a type.

For this version of the web service, the http://musicbrainz.org/ws/1/ namespace has been reserved. It is further structured like this:

http://musicbrainz.org/ws/1/artist/ Collection of all artists
http://musicbrainz.org/ws/1/artist/UUID An individual artist
http://musicbrainz.org/ws/1/album/ Collection of all albums
http://musicbrainz.org/ws/1/album/UUID An individual album
http://musicbrainz.org/ws/1/track/ Collection of all tracks
http://musicbrainz.org/ws/1/track/UUID An individual track

Basically, there are two different ways to access MusicBrainz data. If you know the UUID (a globally unique identifier assigned to each object in the database), you can request the resource directly. To access the artist "Tori Amos" for example, the resource http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5 may be used.

Another option is to use the artist collection. Since this collection is huge, it is infeasible to request all of it and then extract the data you need. Instead, collections support filters, which allow you to limit the amount of data based on some criteria. For example, you can use a filter to request only artists with the name "Tori Amos": http://musicbrainz.org/ws/1/artist/?name=Tori+Amos. The filters supported depend on the collection and are described below.
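The two access paths can be sketched in a few lines of Python. The helper names `resource_url` and `collection_url` are made up for this example; only the namespace and the two Tori Amos URLs come from the text above.

```python
from urllib.parse import urlencode

# Namespace reserved for this version of the web service.
BASE_URL = "http://musicbrainz.org/ws/1/"


def resource_url(resource_type: str, uuid: str) -> str:
    """URL of an individual resource, addressed directly by its UUID."""
    return f"{BASE_URL}{resource_type}/{uuid}"


def collection_url(resource_type: str, **filters: str) -> str:
    """URL of a collection, optionally narrowed down by filter parameters."""
    url = f"{BASE_URL}{resource_type}/"
    query = urlencode(filters)  # encodes spaces as '+'
    return f"{url}?{query}" if query else url


tori_by_id = resource_url("artist", "c0b2500e-0cef-4130-869d-732b23ed9df5")
tori_by_name = collection_url("artist", name="Tori Amos")
```

`urlencode` takes care of escaping, so artist names containing spaces or other reserved characters do not need to be encoded by hand.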

In REST, HTTP methods are used to create (PUT), retrieve (GET), modify (POST) and delete (DELETE) resources. The most important method for this web service is GET, which returns a representation for the requested resource. Several different representations are possible, but at this point only the XML format discussed later in this document is supported.

By default, the web service only returns a basic representation of a resource. Additional information can be requested using the inc parameter, which depends on the resource. If you want to request an album including all tracks and release dates, for example, you can use this URL: http://musicbrainz.org/ws/1/album/02232360-337e-4a3f-ad20-6cdd4c34288c?type=xml&inc=tracks+releases

The following sections discuss the parameters available for each type of resource. The two parameters listed here are supported for all resource types (both collections and individual resources):

type Selects the representation of the resource. Currently only xml is supported.
inc A list of space-separated values describing how much detail should be included in the output. If there is no inc parameter, just the basic data for a resource is returned. For artists, that would be name and sortName.
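As a small sketch of how the type and inc parameters combine, the following Python function appends them to a resource URL. The function name `with_representation` is hypothetical; the parameter names and the space-separated inc list are taken from the description above.

```python
from urllib.parse import urlencode


def with_representation(url: str, inc=(), type_: str = "xml") -> str:
    """Append the 'type' and, if any values are given, the 'inc' parameter."""
    params = {"type": type_}
    if inc:
        # inc is a space-separated list; urlencode turns each space into '+'.
        params["inc"] = " ".join(inc)
    return f"{url}?{urlencode(params)}"


album = "http://musicbrainz.org/ws/1/album/02232360-337e-4a3f-ad20-6cdd4c34288c"
album_with_details = with_representation(album, inc=["tracks", "releases"])
```

Leaving `inc` empty produces a request for the basic representation only.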

artist resources

Parameters for http://musicbrainz.org/ws/1/artist/MBID:

inc Supported: 'albums', 'va-albums'

Parameters for http://musicbrainz.org/ws/1/artist/:

name Fetch a list of artists with a matching name
limit Limit the number of artists returned

album resources

Parameters for http://musicbrainz.org/ws/1/album/MBID:

inc Supported: 'cdids', 'tracks', 'releases'

Parameters for http://musicbrainz.org/ws/1/album/:

title Fetch a list of albums with a matching title
cdid Fetch all albums matching the given CdIndexId
limit Limit the number of albums returned

track resources

Parameters for http://musicbrainz.org/ws/1/track/MBID:

inc Supported: 'trms'

Parameters for http://musicbrainz.org/ws/1/track/:

title Fetch a list of tracks with a matching title
artist The returned tracks have to match the given artist name
album The returned tracks have to match the given album title
limit Limit the number of tracks returned
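The per-resource parameters listed in the three sections above can be collected into a lookup table, which a client could use to catch typos before sending a request. This is a sketch only: the table layout and the function `unsupported_params` are assumptions for this example, and the server may well be more lenient or stricter than this check.

```python
# inc values apply to individual resources, filters to the collections.
SUPPORTED = {
    "artist": {"inc": {"albums", "va-albums"},
               "filters": {"name", "limit"}},
    "album": {"inc": {"cdids", "tracks", "releases"},
              "filters": {"title", "cdid", "limit"}},
    "track": {"inc": {"trms"},
              "filters": {"title", "artist", "album", "limit"}},
}


def unsupported_params(resource_type, inc=(), filters=()):
    """Return (bad_inc, bad_filters): values not listed for this resource type."""
    spec = SUPPORTED[resource_type]
    bad_inc = sorted(set(inc) - spec["inc"])
    bad_filters = sorted(set(filters) - spec["filters"])
    return bad_inc, bad_filters


# 'tracks' is an album inc value and 'cdid' an album filter, so both are
# reported when used against the track resource.
problems = unsupported_params("track", inc=["trms", "tracks"], filters=["title", "cdid"])
```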

The XML Format

The following sample XML document describes a partial album:

<?xml version="1.0" encoding="UTF-8"?>
<mb:metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:mb="http://musicbrainz.org/ns/mb/1/"
             xmlns:az="http://www.amazon.com/gp/aws/landing.html#">
    <mb:album-list>
      <mb:album id="8f468f36-8c7e-4fc1-9166-50664d267127" type="album">

        <dc:title>Dummy</dc:title>
        <mb:counts track="11" cdid="10" trmid="73"/>
        <az:asin>B000001FI7</az:asin>

        <mb:artist id="8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11" begin="1991" end="2011">
          <dc:title>Portishead</dc:title>
          <mb:sortName>Portishead</mb:sortName>
        </mb:artist>

        <mb:release-list>
          <mb:release date="1994-10-17" country="us"/>
        </mb:release-list>

        <mb:cdid-list>
           <mb:cdid id="D5LsXhbWwpctL4s5xHSTS_SefQw-"/>
        </mb:cdid-list>

        <mb:track-list>
          <mb:track id="b5d7d380-f43a-4c1f-a5de-694150b093ac" duration="306200">
            <dc:title>Mysterons</dc:title>

            <!-- This block is shown only if the track artist != album artist -->
            <mb:artist id="8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11" begin="1991" end="2011" type="group">
              <dc:title>Portishead</dc:title>
              <mb:sortName>Portishead</mb:sortName>
            </mb:artist>

            <mb:trmid-list>
               <mb:trmid id="58bdde9c-9c47-487a-94ae-1e9c4001bd3d"/>
            </mb:trmid-list>

          </mb:track>
        </mb:track-list>

      </mb:album>
    </mb:album-list>
</mb:metadata>

The above XML parses and validates with the following Relax NG schema:

<?xml version="1.0" encoding="UTF-8"?>
<!--
   <hr>
   Relax NG Schema for MusicBrainz XML Metadata Version 0, draft 1

   Copyright (c) 2004 Robert Kaye
   
   The schema is released under the Creative Commons 
   Attribution-ShareAlike 2.0 license.

   http://creativecommons.org/licenses/by-sa/2.0/
   <hr>

   TODO:
     - Constrain text values and attributes
     - Add AR support
     - Add extensibility

-->
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes"
         xmlns:mb="http://musicbrainz.org/ns/mb/1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:az="http://www.amazon.com/gp/aws/landing.html#">

   <start>
     <ref name="metadata"/>
   </start>

   <define name="metadata">
       <element name="mb:metadata">
         <optional>
           <ref name="artist-list"/>
         </optional>
         <optional>
           <ref name="album-list"/>
         </optional>
         <optional>
           <ref name="track-list"/>
         </optional>
       </element>
   </define>

   <define name="album-list">
       <element name="mb:album-list">
           <oneOrMore>
               <ref name="album-entity"/>
           </oneOrMore>
       </element>
   </define>

   <define name="artist-list">
       <element name="mb:artist-list">
           <oneOrMore>
               <ref name="artist-entity"/>
           </oneOrMore>
       </element>
   </define>

   <define name="track-list">
       <element name="mb:track-list">
           <oneOrMore>
             <ref name="track-entity"/>
           </oneOrMore>
       </element>
   </define>

   <define name="artist-entity">
       <element name="mb:artist">
           <optional>
               <attribute name="id"/>
           </optional>
           <optional>
               <attribute name="begin"/>
           </optional>
           <optional>
               <attribute name="end"/>
           </optional>
           <optional>
               <attribute name="group"/>
           </optional>
           <optional>
               <attribute name="type"/>
           </optional>

           <element name="dc:title">
               <text/>
           </element>

           <optional>
               <element name="mb:sortName">
                   <text/>
               </element>
           </optional>
       </element>
   </define>

   <define name="album-entity">

       <element name="mb:album">
           <optional>
               <attribute name="id"/>
           </optional>
           <optional>
               <attribute name="type"/>
           </optional>
           <optional>
               <attribute name="status"/>
           </optional>

           <element name="dc:title">
               <text/>
           </element>
           <optional>
             <element name="mb:counts">
                 <optional>
                     <attribute name="track"/>
                 </optional>
                 <optional>
                     <attribute name="cdid"/>
                 </optional>
                 <optional>
                     <attribute name="trmid"/>
                 </optional>
                 <text/>
             </element>
           </optional>
           <optional>
               <element name="az:asin">
                   <text/>
               </element>
           </optional>

           <optional>
               <ref name="artist-entity"/>
           </optional>

           <zeroOrMore>
             <ref name="release-list"/>
           </zeroOrMore>

           <zeroOrMore>
             <ref name="cdid-list"/>
           </zeroOrMore>

           <optional>
             <ref name="track-list"/>
           </optional>

       </element>
   </define>

   <define name="track-entity">
       <element name="mb:track">
           <optional>
               <attribute name="id"/>
           </optional>
           <optional>
             <attribute name="duration"/>
           </optional>

           <element name="dc:title">
               <text/>
           </element>
           <optional>
             <ref name="artist-entity"/>
           </optional>
           <optional>
             <ref name="album-entity"/>
           </optional>
           <optional>
             <ref name="trmid-list"/>
           </optional>
       </element>
   </define>

   <define name="release-list">
       <element name="mb:release-list">
           <element name="mb:release">
               <optional>
                   <attribute name="date"/>
               </optional>
               <optional>
                   <attribute name="country"/>
               </optional>
           </element>
       </element>
   </define>

   <define name="cdid-list">
       <element name="mb:cdid-list">
           <element name="mb:cdid">
               <attribute name="id"/>
           </element>
       </element>
   </define>

   <define name="trmid-list">
       <element name="mb:trmid-list">
           <element name="mb:trmid">
               <attribute name="id"/>
           </element>
       </element>
   </define>

</grammar>
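To give an idea of how a client might consume this format, here is a minimal Python sketch that reads an abridged copy of the sample album document using only the standard library. The function `parse_albums` and the shape of its result are assumptions for this example, and the sketch does not validate against the Relax NG schema.

```python
import xml.etree.ElementTree as ET

# Prefix-to-namespace map matching the declarations in the sample document.
NS = {
    "mb": "http://musicbrainz.org/ns/mb/1/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Abridged version of the sample album document (XML declaration omitted).
SAMPLE = """
<mb:metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
             xmlns:mb="http://musicbrainz.org/ns/mb/1/">
  <mb:album-list>
    <mb:album id="8f468f36-8c7e-4fc1-9166-50664d267127" type="album">
      <dc:title>Dummy</dc:title>
      <mb:artist id="8f6bd1e4-fbe1-4f50-aa9b-94c450ec0f11">
        <dc:title>Portishead</dc:title>
      </mb:artist>
      <mb:track-list>
        <mb:track id="b5d7d380-f43a-4c1f-a5de-694150b093ac" duration="306200">
          <dc:title>Mysterons</dc:title>
        </mb:track>
      </mb:track-list>
    </mb:album>
  </mb:album-list>
</mb:metadata>
"""


def parse_albums(xml_text):
    """Extract id, title, artist name and (title, duration) track pairs."""
    root = ET.fromstring(xml_text)
    albums = []
    for album in root.findall("mb:album-list/mb:album", NS):
        # Only the album's own artist element, not artists nested in tracks.
        artist = album.find("mb:artist/dc:title", NS)
        albums.append({
            "id": album.get("id"),
            "title": album.findtext("dc:title", namespaces=NS),
            "artist": artist.text if artist is not None else None,
            "tracks": [
                (t.findtext("dc:title", namespaces=NS), t.get("duration"))
                for t in album.findall("mb:track-list/mb:track", NS)
            ],
        })
    return albums


albums = parse_albums(SAMPLE)
```

Since nearly every element in the schema is optional, the sketch treats a missing artist as `None` instead of assuming it is present.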


Development Notes

This page describes the next-generation XML-based web service for MusicBrainz. Initially we will discuss the proposed XML and its Relax NG schema for expressing MusicBrainz (and music in general) metadata. Once we hammer that out, we'll discuss the URL scheme proposed by Matthias Friedrich.

Use Cases

This new web service will have the following use cases:

  • Retrieve artist (via mbid)/album (via mbid/cdid)/track (via mbid)
    • Each of these should return a minimal amount of data and have options to return more detailed data:
      • artist: optional list of albums
      • album: optional artist info, cdids, release info, list of tracks
      • track: optional artist info
  • Retrieve a list of tracks that match a given trmid
    • same arguments that apply to track info should be used here
  • Full text search of the DB (via lucene query)
  • Login to MB
  • Submit trmids (after login)
  • Check donation status of user (for showing pop-ups in Picard)
  • Lookup track (like MBQ_FileLookup)

Unresolved Issues

There is a single schema for all responses of the web service. The definition isn't finished yet; there are a few issues that have to be discussed:

MatthiasFriedrich's questions:

  • Use 'mb' prefixes or make it the default namespace?
  • Do we really need amazon and DC namespaces? If we later want to make the format extensible by allowing people to add elements in their own namespaces, amazon and DC will look like extensions.
  • The 'mb:counts' element is weird. If we don't need the 'trmid' attribute, we can easily add the track counts to the 'track-list' element and 'cdid' to 'cdid-list'.
  • Some data is coded in attributes but most is in elements. How about removing all attributes except for ids?
  • ARs are currently missing.

RobertKaye's responses:

  • Let's use a default namespace and use mmd as the short name of the namespace. Music Meta Data.
  • I think it's good practice to use dc -- we should keep it. But declaring the az namespace for one element is silly.
  • The counts element is intended to convey metadata about an album -- it can be useful to know how many ids of the various types there are without actually getting any of the ids. Maybe we can find a better name?
  • I did that to conserve space in the output -- esp. with the counts, where you could have 1000%+ overhead to represent a single-digit integer... The RDF/XML is so VERBOSE that we're wasting TONS of bandwidth on a verbose encoding. Let's tighten this one and make it easier to look at.
  • Yup, still on the todo list.
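The overhead argument in the fourth response can be checked with a little arithmetic. The sketch below compares a single-digit count written as an attribute with the same value wrapped in its own element. The element name `mb:track-count` is hypothetical and chosen only for the comparison, and overhead is defined here as markup bytes relative to payload bytes.

```python
def overhead_percent(encoded: str, payload: str) -> float:
    """Markup bytes beyond the payload, as a percentage of the payload size."""
    payload_size = len(payload.encode("utf-8"))
    extra = len(encoded.encode("utf-8")) - payload_size
    return 100 * extra / payload_size


value = "5"  # a single-digit count

# Hypothetical element-based encoding of the count.
as_element = f"<mb:track-count>{value}</mb:track-count>"

# Attribute-based encoding, as one more attribute on an existing mb:counts element.
as_attribute = f' track="{value}"'

element_overhead = overhead_percent(as_element, value)
attribute_overhead = overhead_percent(as_attribute, value)
```

With these particular names the element form comes out well above the 1000% mark mentioned in the response, and the attribute form costs noticeably less per value.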