History:Development/XML Web Service/Version 1
Status: Beta testing in progress. Please do not change unless you are told to!
Introduction
The web service discussed in this document is an interface to the MusicBrainz database which contains a huge amount of music metadata, all maintained by the MusicBrainz community. It is aimed at developers of media players, CD rippers, taggers and other applications requiring music metadata. The service's architecture follows the REST design principles. Interaction with the web service is done using HTTP and all content is served in a simple but flexible XML format.
This document first describes how the data in MusicBrainz is organized. Users who already have experience in using the website can safely skip this section and start with the specification sections.
The MusicBrainz Metadata Model
MusicBrainz uses an object oriented schema to model music metadata. The main classes are artist, release and track, each with a different set of attributes and relations. Apart from their traditional relations (artists have releases, releases contain tracks), a more powerful schema was introduced (sometimes called AdvancedRelationships).
It allows users to link an object of one class to an object of any other class (URLs are permitted, too). Many different link types exist (see AdvancedRelationshipType for a list), which can be used to specify the artist who did background vocals on a release or track, who is married to whom, where an artist's official homepage is, and a lot more. Those links themselves may have attributes, with their semantics depending on the link type.
The following sections discuss the main classes in more detail.
The Artist Class
In MusicBrainz, artists always have a unique ID, a name and a SortName. If the artist name isn't unique, a disambiguation comment is used to provide more information about the artist (see IdenticallyNamedArtists). Additionally, an artist may be flagged as Person or Group and can have a begin date and an end date. For persons, these are the dates of birth and death; for groups, they are the founding and dissolving dates.
An artist can have any number of releases and relations to other artists, releases, tracks and URLs.
The Release Class
All releases have a unique ID, a title and one or more tracks. Each release has a type ("Album", "Compilation", "Single" etc.), a status ("Official", "Promotion", "Bootleg") and language information. A release may also have additional release information, which is represented as a list of (country, date) tuples, also called release events.
A common use case is to look up a release using a DiscID generated from an Audio CD's table of contents (TOC). A release can have any number of DiscIDs (including none), mostly due to different pressings. In rare cases, where two different CDs have the same TOC, a DiscID may map to more than one release.
Relations are used to link a release to other releases, artists, tracks and URLs.
The Track Class
Tracks have a unique ID, a title and one main artist. They may also have a duration attribute indicating the play time. There can be any number of PUIDs, which are used to look up tracks. PUIDs are audio fingerprints generated from music files, but they are not unique, so a PUID can be associated with many tracks.
As with the other classes, a track object can have any number of relations to artists, releases, tracks and URLs.
The URL Schema
All MusicBrainz objects (artists, releases, tracks) are modeled as resources. Resources have unique URLs and can be accessed using standard HTTP. Each resource is also part of a collection. This is a special resource which represents all objects of a type.
For this version of the web service, the http://musicbrainz.org/ws/1/ namespace has been reserved. It is further structured like this:
http://musicbrainz.org/ws/1/artist/ | Collection of all artists |
http://musicbrainz.org/ws/1/artist/MBID | An individual artist |
http://musicbrainz.org/ws/1/release/ | Collection of all releases |
http://musicbrainz.org/ws/1/release/MBID | An individual release |
http://musicbrainz.org/ws/1/track/ | Collection of all tracks |
http://musicbrainz.org/ws/1/track/MBID | An individual track |
There are two ways to access MusicBrainz data. If you know the MBID (a globally unique identifier assigned to each object in the database), you can request the resource directly. To access the artist "Tori Amos", for example, the resource http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5 may be used.
Another option is to use the artist collection. Since this collection is huge, it is infeasible to request all of it and then extract the data you need. Instead, collections support filters, which allow you to limit the amount of data based on some criteria. For example, you can use a filter to request only artists with the name "Tori Amos": http://musicbrainz.org/ws/1/artist/?type=xml&name=Tori+Amos. The filters supported depend on the collection and are described below.
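A filtered collection URL like the one above can be assembled with the Python standard library. This is only a minimal sketch: the helper name `collection_url` is hypothetical, and the code builds the URL without contacting any server.

```python
from urllib.parse import urlencode

# Namespace reserved for this version of the web service (from the text above).
WS_ROOT = "http://musicbrainz.org/ws/1/"


def collection_url(entity, **filters):
    """Build a collection URL with filters (hypothetical helper)."""
    params = {"type": "xml"}  # the type parameter goes first in every query
    params.update(filters)
    # urlencode() encodes spaces as '+', matching the example in the text.
    return f"{WS_ROOT}{entity}/?{urlencode(params)}"


print(collection_url("artist", name="Tori Amos"))
```

The call in the last line reproduces the "Tori Amos" filter URL shown in the text.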
In REST, HTTP methods are used to create (PUT), retrieve (GET), modify (POST) and delete (DELETE) resources. The most important method for this web service is GET, which returns a representation of the requested resource. Several different representations are possible, but at this point only the XML format discussed later in this document is supported.
By default, the web service only returns a basic representation of a resource. Additional information can be requested using the inc parameter, whose allowed values depend on the resource. If you want to request a release including all tracks and additional release information, for example, you can use this URL: http://musicbrainz.org/ws/1/release/02232360-337e-4a3f-ad20-6cdd4c34288c?type=xml&inc=tracks+release-events
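As a rough illustration, the URL for an individual resource with an inc list could be built as follows. The helper `resource_url` is hypothetical, and the MBID check assumes the usual 8-4-4-4-12 hexadecimal UUID layout, which this document does not spell out.

```python
import re
from urllib.parse import urlencode

WS_ROOT = "http://musicbrainz.org/ws/1/"

# Assumption: MBIDs follow the common 36-character UUID layout.
MBID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def resource_url(entity, mbid, inc=()):
    """Build the URL of an individual resource (hypothetical helper)."""
    if not MBID_PATTERN.fullmatch(mbid.lower()):
        raise ValueError(f"not a valid MBID: {mbid!r}")
    params = {"type": "xml"}
    if inc:
        # inc is a space separated list; urlencode() turns the spaces into '+'.
        params["inc"] = " ".join(inc)
    return f"{WS_ROOT}{entity}/{mbid}?{urlencode(params)}"


print(resource_url("release", "02232360-337e-4a3f-ad20-6cdd4c34288c",
                   inc=["tracks", "release-events"]))
```

Rejecting a malformed MBID on the client side avoids a round trip that the server would answer with an error anyway.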
The following sections discuss the parameters available for each type of resource. The type parameter is required for all web service queries:
type | Selects the representation of the resource. Currently only xml is supported. This is mandatory! |
The inc parameter is only allowed for individual resources (but not for collections):
inc | A list of space separated values describing how much detail should be included in the output. If there is no inc parameter, just the basic data for a resource is returned. For artists that would be name, sort-name, and disambiguation. |
The limit parameter is supported for all resource collections (but not for individual resources):
limit | An integer value defining how many entries should be returned. Only values between 1 and 100 (both inclusive) are allowed. If not given, this defaults to 25. |
Note that multiple parameters with the same name are not permitted.
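The parameter rules above can be checked on the client before a GET query is sent. The following is a sketch under that assumption; `check_query` is a hypothetical helper and does not reflect how the server itself validates requests.

```python
from urllib.parse import parse_qsl


def check_query(query, is_collection):
    """Return a list of rule violations in a query string (empty if none)."""
    pairs = parse_qsl(query, keep_blank_values=True)
    names = [name for name, _ in pairs]
    params = dict(pairs)
    problems = []
    if len(names) != len(set(names)):
        problems.append("duplicate parameter")
    if params.get("type") != "xml":
        problems.append("type must be xml")
    if is_collection and "inc" in params:
        problems.append("inc is not allowed for collections")
    if not is_collection and "limit" in params:
        problems.append("limit is not allowed for individual resources")
    if is_collection and "limit" in params:
        value = params["limit"]
        is_number = value.isascii() and value.isdigit()
        if not (is_number and 1 <= int(value) <= 100):
            problems.append("limit must be an integer from 1 to 100")
    return problems


print(check_query("type=xml&name=Tori+Amos&limit=10", is_collection=True))
print(check_query("type=xml&limit=500", is_collection=True))
```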
The following HTTP status codes are used:
code | cause |
200 OK | Resource retrieved successfully. |
400 Bad Request | Syntactically invalid MBID requested. |
400 Bad Request | Invalid parameter value (e.g. an invalid inc tag). |
400 Bad Request | Missing required parameter value (e.g. type not set). |
401 Unauthorized | Client requested a resource which requires authentication via HTTP Digest Authentication. |
401 Unauthorized | If sent even though user name and password were given: user name and/or password are incorrect. |
404 Not Found | Wrong web service prefix. /ws/ would be correct for the MusicBrainz server. |
404 Not Found | Invalid version number. Only version 1 is currently supported. |
404 Not Found | Invalid entity name. Only artist, release, track, or user are permitted. |
404 Not Found | Resource not found. There is no resource having this ID (maybe it was merged or deleted). |
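A client might translate these status codes into exceptions along the following lines. This is an illustrative sketch only: the exception class and the summary strings are hypothetical, and the summaries merely condense the causes listed above.

```python
# Condensed from the status code table; the wording is a summary, not server output.
CAUSES = {
    400: "invalid MBID, invalid parameter value, or missing required parameter",
    401: "authentication required, or user name and/or password incorrect",
    404: "wrong prefix, version or entity name, or no resource with this ID",
}


class WebServiceError(Exception):
    """Raised for any response other than 200 OK (hypothetical class)."""

    def __init__(self, code):
        self.code = code
        cause = CAUSES.get(code, "status code not listed in this document")
        super().__init__(f"{code}: {cause}")


def check_status(code):
    """Return silently for 200 OK and raise WebServiceError otherwise."""
    if code != 200:
        raise WebServiceError(code)


check_status(200)  # no exception
try:
    check_status(404)
except WebServiceError as error:
    print(error)
```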
artist resources
Parameters for http://musicbrainz.org/ws/1/artist/MBID:
inc | Supported: 'aliases', 'sa-'*, 'va-'*, 'artist-rels', 'release-rels', 'track-rels', 'url-rels' |
To get an artist's releases, the 'sa-' and 'va-' prefixes (for single artist and various artists releases, respectively) have to be used together with the desired release type. For example, the tag sa-Album requests single artist albums, while va-Bootleg requests various artists bootlegs. Multiple tags may be used, so inc=sa-Compilation+sa-Official is valid and returns all official compilations by that artist (release types are combined using AND).
Note: Only one of the two prefixes, 'sa-' or 'va-', may be used in an inc parameter.
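The tag construction described above could be sketched like this. Both helper names are hypothetical, and the check in `inc_value` assumes the note means that sa- and va- tags must not be mixed within one request.

```python
def release_inc_tags(prefix, release_types):
    """Return inc tags such as ['sa-Compilation', 'sa-Official']."""
    if prefix not in ("sa", "va"):
        raise ValueError("prefix must be 'sa' or 'va'")
    return [f"{prefix}-{name}" for name in release_types]


def inc_value(tags):
    """Join inc tags with '+' for use in a URL.

    Assumption: a single inc parameter may contain sa- tags or va- tags,
    but not both.
    """
    prefixes = {tag[:3] for tag in tags if tag[:3] in ("sa-", "va-")}
    if len(prefixes) > 1:
        raise ValueError("sa- and va- tags cannot be combined")
    return "+".join(tags)


tags = ["aliases"] + release_inc_tags("sa", ["Compilation", "Official"])
print("inc=" + inc_value(tags))
```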
Parameters for http://musicbrainz.org/ws/1/artist/:
name | Fetch a list of artists with a matching name |
limit | The maximum number of artists returned. Defaults to 25, the maximum allowed value is 100. |
release resources
Parameters for http://musicbrainz.org/ws/1/release/MBID:
inc | Supported: 'artist', 'counts', 'release-events', 'discs', 'tracks', 'artist-rels', 'release-rels', 'track-rels', 'url-rels' |
Parameters for http://musicbrainz.org/ws/1/release/:
title | Fetch a list of releases with a matching title |
discid | Fetch all releases matching to the given DiscID |
artist | The returned releases should match the given artist name |
artistid | The returned releases should match the given artist ID (36 character ASCII representation). If this is given, the artist parameter is ignored. |
releasetypes | The returned releases must match all of the given release types. This is a list of space separated values like Official, Bootleg, Album, Compilation, etc. |
limit | The maximum number of releases returned. Defaults to 25, the maximum allowed value is 100. |
For the releasetypes parameter, the MusicBrainz release type and release status values are used (see AlbumAttribute). Note that the current MusicBrainz server only supports one release type and one release status value per release, so for example releasetypes=Live+Compilation won't work because an AND conjunction is used for releasetypes.
track resources
Parameters for http://musicbrainz.org/ws/1/track/MBID:
inc | Supported: 'artist', 'releases', 'puids', 'artist-rels', 'release-rels', 'track-rels', 'url-rels' |
Parameters for http://musicbrainz.org/ws/1/track/:
title | Fetch a list of tracks with a matching title |
artist | The returned tracks have to match the given artist name. |
release | The returned tracks have to match the given release title. |
duration | The length of the track in milliseconds |
tracknum | The track number |
artistid | The artist's MBID. If this is given, the artist parameter is ignored. |
releaseid | The release's MBID. If this is given, the release parameter is ignored. |
puid | The returned tracks have to match the given PUID. |
limit | The maximum number of tracks returned. Defaults to 25, the maximum allowed value is 100. |
PUID submission works using POST on the collection of tracks. The parameters are sent url-encoded, that is, with a content type of application/x-www-form-urlencoded.
Parameters for posting to http://musicbrainz.org/ws/1/track/:
client | The ID of the client software submitting the PUIDs. This has to be the application's name and version number, not that of a client library (client libraries should use HTTP's User-Agent header)! The required format: application-version, where version must not contain a - character. |
puid | A (TrackId, PUID) pair, separated by a single space character. Both TrackId and PUID are in their 36 character ASCII representation. This parameter may appear multiple times. |
Users have to be logged in to submit PUIDs. This is independent from the website login and works via HTTP Digest Authentication. The realm is 'musicbrainz.org'.
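To make the submission format concrete, here is a sketch that prepares, but does not send, a PUID submission using only the Python standard library. The client ID `example-tagger-0.1` and the all-zero PUID are placeholders, and the helper names are hypothetical.

```python
import urllib.request
from urllib.parse import urlencode

TRACK_COLLECTION = "http://musicbrainz.org/ws/1/track/"


def check_client(client):
    """Check the application-version format; the version has no '-'."""
    name, separator, version = client.rpartition("-")
    if not (separator and name and version):
        raise ValueError("client must look like application-version")
    return client


def puid_request(client, pairs):
    """Build a POST request from (track_id, puid) pairs without sending it."""
    fields = [("client", check_client(client))]
    # The puid parameter may appear multiple times, once per pair.
    fields += [("puid", f"{track_id} {puid}") for track_id, puid in pairs]
    body = urlencode(fields).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Passing data makes urllib use the POST method.
    return urllib.request.Request(TRACK_COLLECTION, data=body, headers=headers)


def digest_opener(user, password):
    """Return an opener that answers Digest challenges for the realm."""
    manager = urllib.request.HTTPPasswordMgr()
    manager.add_password("musicbrainz.org", TRACK_COLLECTION, user, password)
    handler = urllib.request.HTTPDigestAuthHandler(manager)
    return urllib.request.build_opener(handler)


request = puid_request(
    "example-tagger-0.1",  # placeholder client ID
    [("d6118046-407d-4e06-a1ba-49c399a4c42f",
      "00000000-0000-0000-0000-000000000000")],  # placeholder PUID
)
print(request.get_method(), request.data.decode("ascii"))
```

Actually submitting would mean passing the request to `digest_opener(user, password).open(request)`, which is left out here on purpose.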
Examples
Below are a few user contributed examples to illustrate the use of this web service:
To get the official XTC releases that best match the album title "Wasp Star Apple Venus Volume 2", this would be the query: http://musicbrainz.org/ws/1/release/?type=xml&artist=XTC&releasetypes=Official&limit=10&title=Wasp+Star+Apple+Venus+Vol+2
The track listing can then be retrieved by picking a release ID from the result and requesting that release with its tracks: http://musicbrainz.org/ws/1/release/6d931ac2-e389-4e99-8a01-1da65162c372?type=xml&inc=tracks
The XML Format
To represent web service responses, which are basically representations of some part of the MusicBrainz database, a new XML format has been developed. It is easy to read, powerful and extensible. Unfortunately, there is no documentation on it yet, but some examples and a Relax NG schema are available via subversion:
svn co http://svn.musicbrainz.org/mmd-schema/trunk mmd-schema
This is also available via the trac source browser.
Bugs in the schema should be reported on the IRC channel or posted to mb-devel. Please note that we're not going to make major changes to the format, only remaining mistakes will be corrected.
IDs and Types
The IDs and types used in the XML format are URIs. To keep the transmission overhead low, all URIs in the MusicBrainz namespace may be used in their relative form. So if a track's fully qualified id is http://musicbrainz.org/track/d6118046-407d-4e06-a1ba-49c399a4c42f, it may be shortened to d6118046-407d-4e06-a1ba-49c399a4c42f in the XML. Note that this shortening is only allowed for URIs from the MusicBrainz namespace.
The following rules are used to create a fully qualified URI from a relative one:
- The id attributes:
  - artist: http://musicbrainz.org/artist/ + relative URI
  - release: http://musicbrainz.org/release/ + relative URI
  - track: http://musicbrainz.org/track/ + relative URI
- The type attributes:
  - artist: http://musicbrainz.org/ns/mmd-1.0# + relative URI
  - release: http://musicbrainz.org/ns/mmd-1.0# + relative URI (for each relative URI in the list)
Due to their large number, relations are in a namespace of their own to avoid clashes:
- Various relation attributes:
  - type: http://musicbrainz.org/ns/rel-1.0# + relative URI
  - attributes: http://musicbrainz.org/ns/rel-1.0# + relative URI (for each relative URI in the list)
Note: Don't confuse the URIs, especially the id URIs, with URLs. The URIs are just names; they should not be used to query data from the server. But they are in a permanent format which will always be valid and can easily be transformed to URLs. Example:
The following is an absolute, permanent MusicBrainz artist identifier which is the preferred representation. Shorter representations may be used for storing IDs in file tags or databases.
http://musicbrainz.org/artist/c0b2500e-0cef-4130-869d-732b23ed9df5
This one is a URL created from the URI above, using a simple transformation. It can be used to request data from the MusicBrainz server via the web service. URLs may change over time; the URIs will not.
http://musicbrainz.org/ws/1/artist/c0b2500e-0cef-4130-869d-732b23ed9df5
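The expansion rules and the URI to URL transformation can be sketched in a few lines. All function names are hypothetical. The code assumes that a value containing "://" is already fully qualified, that lists of relative URIs are whitespace separated, and that the transformation is the prefix substitution suggested by the example, which this document does not define formally.

```python
ID_PREFIXES = {
    "artist": "http://musicbrainz.org/artist/",
    "release": "http://musicbrainz.org/release/",
    "track": "http://musicbrainz.org/track/",
}
MMD_NS = "http://musicbrainz.org/ns/mmd-1.0#"
REL_NS = "http://musicbrainz.org/ns/rel-1.0#"
MB_ROOT = "http://musicbrainz.org/"


def absolute(value, prefix):
    """Expand a relative URI; leave fully qualified URIs untouched."""
    return value if "://" in value else prefix + value


def id_uri(entity, value):
    """Expand an id attribute of an artist, release or track element."""
    return absolute(value, ID_PREFIXES[entity])


def type_uris(value, namespace=MMD_NS):
    """Expand each entry of a type or attributes list; pass REL_NS for relations."""
    return [absolute(item, namespace) for item in value.split()]


def uri_to_url(uri, version="1"):
    """Turn a permanent id URI into a web service URL (assumed mapping)."""
    if not uri.startswith(MB_ROOT):
        raise ValueError("not a MusicBrainz URI")
    return f"{MB_ROOT}ws/{version}/{uri[len(MB_ROOT):]}"


uri = id_uri("artist", "c0b2500e-0cef-4130-869d-732b23ed9df5")
print(uri)
print(uri_to_url(uri))
```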
Using the XML format for other applications
The XML format uses URIs for several IDs and types. The artist element, for example, has a type attribute which accepts a URI. Users of the format may use the URIs defined by MusicBrainz or use one from their own namespace. This example uses a definition from MusicBrainz:
<artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="http://musicbrainz.org/ns/mmd-1.0#Group"/>
If an application needs "Orchestra" as an artist type, a different namespace may be used:
<artist id="c0b2500e-0cef-4130-869d-732b23ed9df5" type="http://example.org/ext-7.2#Orchestra"/>
This method may be used in all places where the schema accepts an anyURI datatype. As mentioned earlier, there is a special rule for all URIs defined by MusicBrainz: They may be relative, as you can see in the examples above. The complete artist ids have the form http://musicbrainz.org/artist/c0b2500e-0cef-4130-869d-732b23ed9df5 and the type attribute in the first example could also be written as Group, without the namespace prefix.
To extend the format even further, the schema has several extension points (see def_extension) which allow adding arbitrary XML elements from a user-defined namespace. Using the mmd-namespace or no namespace at all is not permitted. There are no namespace restrictions inside that element, however, and unlimited nesting is possible, too.
If your private namespace is http://example.org/ext-9.1# and you want to add data from a rating system, for example, it could be coded like this:
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#"
          xmlns:ext="http://example.org/ext-9.1#">
  <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
    <title>Silent All These Years</title>
    <duration>253466</duration>
    <ext:rating value="9"/>
  </track>
</metadata>
Even more complicated things, like nested tags, are possible. Note that the em element doesn't belong to the ext namespace.
<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#"
          xmlns:ext="http://example.org/ext-9.1#">
  <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
    <title>Silent All These Years</title>
    <duration>253466</duration>
    <ext:annotation>This is a <em>very</em> nice song.</ext:annotation>
  </track>
</metadata>
This is still valid according to the schema, but inside the extension elements, only well-formedness can be checked.
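For illustration, a namespace-aware parser can read such an extended document as shown in this sketch, which uses Python's standard xml.etree.ElementTree module on the annotation example. It only demonstrates reading the data and does not perform schema validation.

```python
import xml.etree.ElementTree as ET

MMD = "http://musicbrainz.org/ns/mmd-1.0#"
EXT = "http://example.org/ext-9.1#"  # the private example namespace

DOCUMENT = b"""<?xml version="1.0" encoding="UTF-8"?>
<metadata xmlns="http://musicbrainz.org/ns/mmd-1.0#"
          xmlns:ext="http://example.org/ext-9.1#">
  <track id="d6118046-407d-4e06-a1ba-49c399a4c42f">
    <title>Silent All These Years</title>
    <duration>253466</duration>
    <ext:annotation>This is a <em>very</em> nice song.</ext:annotation>
  </track>
</metadata>
"""

# Prefixes used in the search paths below; they are local to this script.
ns = {"mmd": MMD, "ext": EXT}

root = ET.fromstring(DOCUMENT)
track = root.find("mmd:track", ns)
title = track.findtext("mmd:title", namespaces=ns)
duration_ms = int(track.findtext("mmd:duration", namespaces=ns))

annotation = track.find("ext:annotation", ns)
text = "".join(annotation.itertext())
em = annotation[0]  # the nested element inside the extension

print(title, duration_ms)
print(text)
print(em.tag)  # shows which namespace the unprefixed em element falls into
```

Because em carries no prefix, the parser resolves it through the default namespace declared on metadata rather than through ext.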
A Note to Application Developers
The PythonMusicBrainz2 package is the reference implementation of a client library for this web service. It has been designed to be as simple to use as possible but still provides access to all parts of the service. If you are planning to write bindings for another programming language, you are encouraged to follow the programming model, object oriented schema, and terminology of PythonMusicBrainz2 as far as it makes sense for your language.
Development Notes
This section has to be integrated into the spec at some point.
Use Cases
This new web service will have the following use cases:
- Retrieve artist (via mbid)/album (via mbid/cdid)/track (via mbid)
  - Each of these should return a minimal amount of data and have options to return more detailed data:
    - artist: optional list of albums
    - album: optional artist info, cdids, release info, list of tracks
    - track: optional artist info
- Retrieve a list of tracks that match a given PUID
  - same arguments that apply to track info should be used here
- Full text search of the DB (via lucene query)
- Login to MB
- Submit PUIDs (after login)
- Check donation status of user (for showing pop-ups in Picard)
- Lookup track (like MBQ_FileLookup)
Collection filtering issues
The exact collection filtering hasn't been defined yet. I think tying query fields together using logical OR is probably the best choice. Lucene will automatically move the best matches to the top of the result list. However, there should be one exception: If IDs are given (artistid, releaseid, discid, puid), then the results should always match those IDs. An example of how a filter on the track collection could be built:
AND artistid AND releaseid AND ( title OR duration OR ... )
If both artist and artistid are given, the artist parameter should be ignored (likewise, release is ignored if releaseid is given). The Lucene query syntax is never used. That means you can always put data read from file tags into your queries without having to escape special characters.
Discussion
- When using ?name=... it says it retrieves resources with a matching name (or title). What kind of "matching"?
- How exactly does ?duration=... work? Does it support range or fuzzy matching, for example?
- If multiple filter arguments are given, how do they combine? Is it allowed to repeat a filter argument (e.g. ?artist=X&artist=Y)?
- AFAICT nowhere does it say what exactly all the ?inc options do. I can guess most of them, in a vague sort of way, but for example is ?inc=artist-rel artist releases or artist relationships (or something else)? If these have already been defined, then a pointer to that definition would be helpful.