MusicBrainz Database/Download: Difference between revisions

From MusicBrainz Wiki
(Update eu ftp link)
(→‎Download: Add link to Danish mirror.)
Line 20:
* http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/
* ftp://ftp.eu.metabrainz.org/pub/musicbrainz/data/fullexport/ (EU mirror - FR)
* https://mirrors.dotsrc.org/MusicBrainz/data/fullexport/ (EU mirror - DK)
* rsync://rsync.osuosl.org/musicbrainz/data/fullexport/ (rsync mirror)

== File Descriptions ==

Revision as of 12:39, 26 April 2017

Introduction

Please read the MusicBrainz Database product page and the database schema documentation if you are not familiar with the MusicBrainz Database.

Setup

There are two ways to get a local database up and running. You can either:

  • Download a pre-configured virtual image of the MusicBrainz Server, or
  • Download the data dumps and follow the relevant section of the INSTALL.md

Replication

If you are interested in keeping the data in sync with MusicBrainz using our live data feed, you can either:

  • Enable replication in the pre-configured virtual image,
  • Use an alternative PostgreSQL setup using mbslave that includes replication without the rest of MusicBrainz Server, or
  • Use an alternative MySQL setup using mbzdb that includes replication without the rest of MusicBrainz Server

Download

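The data dumps are available for download via HTTP, FTP, or rsync from the following locations:

  • http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/
  • ftp://ftp.eu.metabrainz.org/pub/musicbrainz/data/fullexport/ (EU mirror - FR)
  • https://mirrors.dotsrc.org/MusicBrainz/data/fullexport/ (EU mirror - DK)
  • rsync://rsync.osuosl.org/musicbrainz/data/fullexport/ (rsync mirror)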
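As a rough sketch of how a download could be scripted, the helper below only builds the URLs for the files you want from one snapshot directory. The function name `dump_urls` and the variable `BASE_URL` are made up for this illustration, and the snapshot name is the example used elsewhere on this page; substitute a snapshot directory that currently exists on the mirror you choose.

```shell
# Base URL of one of the mirrors listed on this page.
BASE_URL="http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport"

# Print one download URL per requested file in the given snapshot directory.
# Usage: dump_urls SNAPSHOT FILE...
dump_urls() {
    snapshot=$1
    shift
    for name in "$@"; do
        printf '%s/%s/%s\n' "$BASE_URL" "$snapshot" "$name"
    done
}

# Example (not run here): feed the URLs to wget, resuming partial downloads.
# dump_urls 20150718-003933 mbdump.tar.bz2 SHA256SUMS SHA256SUMS.asc |
#     wget --continue --input-file=-
```

Keeping URL construction separate from the transfer makes it easy to swap in another mirror or another download tool.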

File Descriptions

Each data dump snapshot provided over FTP includes a number of different files. Depending on your use cases, you may or may not require all of them. Here's a rundown of what they contain:

The Basics

If you're only looking for music metadata, you can start here with the basics. These files should help you get everything you need to replicate the core catalog.

If you're looking for more advanced, or more analytical data, you should still have a look at these basics, but make sure to also see the Advanced section below.

ASC files

All of the `.asc` files contain the PGP signatures for their respective files. You can use these to verify the files after you've downloaded them.

In order to verify the downloads, you must first fetch the MusicBrainz public key:

$ gpg --recv-keys C777580F
gpg: requesting key C777580F from hkp server keys.gnupg.net
gpg: /home/kevin/.gnupg/trustdb.gpg: trustdb created
gpg: key C777580F: public key "MusicBrainz (MusicBrainz data dump signing key) <support@musicbrainz.org>" imported
gpg: no ultimately trusted keys found
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)

Now you can verify the GPG signatures. For example, after downloading the SHA256SUMS file and its signature:

$ wget http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20150718-003933/SHA256SUMS
...
$ wget http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20150718-003933/SHA256SUMS.asc
...
# now you can run:
$ gpg --verify SHA256SUMS.asc SHA256SUMS
gpg: Signature made Sat 18 Jul 2015 03:10:45 AM UTC using RSA key ID C777580F
gpg: Good signature from "MusicBrainz (MusicBrainz data dump signing key) <support@musicbrainz.org>"
gpg: WARNING: This key is not certified with a trusted signature!
gpg:          There is no indication that the signature belongs to the owner.
Primary key fingerprint: D5E6 3B4B DCCE 1956 4294  8684 B8FC 2375 C777 580F


Note: If you don't use `gpg` very frequently, and haven't marked the key as trusted (or marked any other key as trusted), you'll see the above warning that the key is not certified. It doesn't mean that the signature is invalid, just that `gpg` won't be convinced that the source of the key you received is authentic until you tell it that you think it is.
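If you want to run the same check from a script, one possible approach is to rely on the exit status of `gpg --verify`, which is zero for a good signature even when the "not certified" warning is printed. The function name `verify_signature` is hypothetical, and the sketch assumes the signature sits next to the file under the same name plus `.asc`, as in the example above.

```shell
# Verify FILE against its detached signature FILE.asc.
# Returns 0 on a good signature and 1 otherwise.
verify_signature() {
    file=$1
    if gpg --verify "${file}.asc" "$file" 2>/dev/null; then
        echo "OK: $file"
    else
        echo "FAILED: $file" >&2
        return 1
    fi
}

# Example (not run here):
# verify_signature SHA256SUMS || exit 1
```

A zero exit status only says the signature matches a key in your keyring. Comparing the primary key fingerprint with the one shown above is still worthwhile.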

MD5SUMS and SHA256SUMS

These files contain the checksums for the hosted files. Run `md5sum` or `sha256sum` on a downloaded .tar.bz2 file and compare the output with the matching entry in the checksum file:

$ wget http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20150718-003933/mbdump-stats.tar.bz2
$ sha256sum mbdump-stats.tar.bz2
5ad5de5c6804c6c937729382f7a0db50f46dc9ae0a4a143e7720fb1d4bbbfeba  mbdump-stats.tar.bz2
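Rather than comparing hashes by eye, you can let `sha256sum` do the comparison. The sketch below assumes that SHA256SUMS uses the standard `sha256sum` output format with file names relative to the snapshot directory, and that your `sha256sum` is GNU coreutils 8.25 or later (needed for `--ignore-missing`). The function name `verify_checksums` is made up for this example.

```shell
# Check every file in DIR that is listed in DIR/SHA256SUMS.
# Files listed in SHA256SUMS but not downloaded are skipped.
verify_checksums() {
    dir=$1
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && sha256sum --check --ignore-missing SHA256SUMS )
}

# Example (not run here):
# verify_checksums ./20150718-003933
```

The function returns a non-zero status if any downloaded file fails its checksum, or if none of the listed files is present.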

mbdump.tar.bz2

This is the core MusicBrainz database, including the tables for Artist, Release, Recording, etc.

Most ordinary catalog use cases require only this database and the derived data.

mbdump-derived.tar.bz2

The derived data consists of annotations, user tags, and search indexes. Combining this with the core database should cover most music-metadata-related use cases.
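To see which tables a given archive actually contains before importing it, you can list its entries without extracting anything. This is a minimal sketch that assumes the table files are stored under an `mbdump/` directory inside each archive; the function name `list_dump_tables` is hypothetical.

```shell
# Print the names of the table files inside a dump archive, one per line.
# Assumes table data is stored under "mbdump/" in the archive.
list_dump_tables() {
    tar -tjf "$1" | sed -n 's|^mbdump/\(..*\)$|\1|p'
}

# Example (not run here):
# list_dump_tables mbdump-derived.tar.bz2
```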


More Advanced Features

mbdump-edit.tar.bz2

This is the complete edit history for the core database. If you want to see how metadata has evolved, make sure to grab this dump in addition to the core.

The history includes things like open and closed edits, edit notes, votes, and auto-editor elections. It does not include information about the people who made the edits. For that information, you'll need the next item as well.

mbdump-editor.tar.bz2

This dump includes non-personal user data about the editors who made the edits recorded in the edit dump above.

mbdump-cdstubs.tar.bz2

The CD Stub data is described over on its dedicated page. As mentioned over there, the stubs are submitted anonymously, and are treated as an untrusted source of data, separate from the core database.

mbdump-stats.tar.bz2

Metadata about the metadata (very meta!). The statistics database includes things that you might find over at http://musicbrainz.org/statistics.


Licenses

The license that applies to each file is described below.

Public Domain


The following database dumps are distributed under the CC0 license, which effectively places the data in the public domain:

  • mbdump.tar.bz2
  • mbdump-cdstubs.tar.bz2

Creative Commons


The following database dumps are distributed under the Attribution-NonCommercial-ShareAlike 3.0 license:

  • mbdump-derived.tar.bz2
  • mbdump-edit.tar.bz2
  • mbdump-editor.tar.bz2
  • mbdump-stats.tar.bz2
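If a script needs to decide which dumps it may redistribute or use commercially, the two lists above can be encoded as a small lookup. This is only an illustration of the mapping given on this page; the function name `dump_license` and the short labels it prints are made up for the example.

```shell
# Print the license label for a dump file name, per the lists on this page.
# Returns 1 for file names that are not listed.
dump_license() {
    case $1 in
        mbdump.tar.bz2|mbdump-cdstubs.tar.bz2)
            echo "CC0" ;;
        mbdump-derived.tar.bz2|mbdump-edit.tar.bz2|mbdump-editor.tar.bz2|mbdump-stats.tar.bz2)
            echo "CC BY-NC-SA 3.0" ;;
        *)
            echo "unknown"
            return 1 ;;
    esac
}

# Example (not run here):
# dump_license mbdump-edit.tar.bz2
```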