History:Debian VMWare Database Installation


This how-to walks step by step through creating a VMware Workstation virtual machine that contains just the MusicBrainz database (no test server). We will download all the necessary packages and the MusicBrainz database along the way. The procedure was tested successfully during the first week of June 2005.

Prerequisites

  • High speed connection
  • VMware Workstation 5
  • 1 or 2 hours in front of the computer to install and set up everything.
  • 2 to 5 hours to import the database. The import runs by itself.
  • About 20 GB of hard drive space (an estimate; 20 GB worked for me)
  • The latest Debian minimal net install CD.

Installing Debian into a VMWare Workstation

1. Create a VMware machine with a 15-20 GB hard drive. Be sure to preallocate the disk space.

2. Install Debian ...

2a. Download the latest snapshot installer from Debian.org. Important installation notes:

  • I let Debian use all the defaults for the installation process.
  • Make sure you set up the source for downloading apt packages.
  • When asked if I wanted to install any additional packages (like "X Desktop", "SQL Server", or "File server"), I chose none of them. We'll just install the ones we need later.

Installing Required Packages for MusicBrainz Server

3. When you're at the prompt, log in as root and run "apt-get update" to make sure that you have the latest package list from Debian.

4. Run "apt-get install postgresql" to install Postgres. If setup does not start, you probably have a problem with the apt package source (see step 2a). Important notes: When asked which locale you want, select "C". The default, "en_US", does not seem to work.

5. Run "apt-get install postgresql-dev". It is required by a Perl module that will later be installed.

6. Run "apt-get install subversion". Required to download the mb_server.

7. Run "apt-get install apache-perl". It provides include files required by mb_server's Perl scripts, which we'll download later.

8. Run "apt-get install bzip2". Required to uncompress the full exports.
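
Steps 3 through 8 can also be run in one pass. The following is only a sketch and not part of the tested procedure: the function name "install_mb_packages" and the DRY_RUN switch are made up for illustration, and the package names are simply the ones listed in the steps above. By default it only prints the commands so that you can review them first.

```shell
#!/bin/sh
# Packages named in steps 4 through 8 of this how-to.
MB_PACKAGES="postgresql postgresql-dev subversion apache-perl bzip2"

# install_mb_packages (hypothetical helper): refresh the package list, then
# install each package in turn. With DRY_RUN=1 (the default here) the
# commands are printed instead of executed.
install_mb_packages() {
    run=""
    [ "${DRY_RUN:-1}" = "1" ] && run="echo"
    $run apt-get update || return 1
    for pkg in $MB_PACKAGES; do
        $run apt-get install "$pkg" || return 1
    done
}
```

To install for real, run "DRY_RUN=0 install_mb_packages" as root. Nothing is answered automatically, so the locale question from step 4 still appears.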

Installing Perl Modules Required to Import MusicBrainz Database

9. Run "cpan" to start the CPAN shell. Important notes: Because this is the first time we've run CPAN, it will ask if we want to manually configure CPAN. I answered "no".

10. Type "install DBI" in the CPAN shell to install DBI.

11. Type "install DBD::Pg" in the CPAN shell to install DBD::Pg. Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

12. Type "install Text::Unaccent". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

13. Type "install Date::Calc". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

14. Type "install String::ShellQuote". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

Getting mb_server from MusicBrainz

17. Type the following at the command prompt. This creates the default directories that the mb_server Perl scripts expect, so that you won't have to change any lines in the MusicBrainz code.

mkdir /home/httpd
mkdir /home/httpd/musicbrainz
cd /home/httpd/musicbrainz

Important note: From this point on, we're going to be working in the "/home/httpd/musicbrainz" directory.

18. Run "svn checkout http://svn.musicbrainz.org/mb_server/trunk mb_server" to begin downloading the mb_server trunk.

19. Run "ln -s /usr/bin/perl mb_server/cgi-bin/perl" to symbolically link Perl. Some of the mb_server scripts look for Perl in a different place than it's really installed on Debian. This will fix that issue.
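
Steps 17 and 19 can be made safe to repeat, which helps if you have to start this section over. A sketch under my own assumptions: the function name "setup_mb_tree" is hypothetical, "mkdir -p" replaces the two separate "mkdir" calls, and the symlink is only created once the checkout from step 18 is present.

```shell
#!/bin/sh
# setup_mb_tree ROOT (hypothetical helper): create the working directory
# from step 17 and, if the checkout from step 18 already exists, add the
# perl symlink from step 19. ROOT defaults to the path used in this how-to.
setup_mb_tree() {
    root="${1:-/home/httpd/musicbrainz}"
    mkdir -p "$root" || return 1
    if [ -d "$root/mb_server/cgi-bin" ]; then
        # -f replaces a link left over from an earlier run
        ln -sf /usr/bin/perl "$root/mb_server/cgi-bin/perl"
    else
        echo "no checkout yet: run the svn checkout from step 18 in $root" >&2
    fi
}
```

Call "setup_mb_tree" once before the checkout and once after it. The second call adds the symlink.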

Change PostgreSQL's Permissions to Allow Everyone

20. Edit "/etc/postgresql/pg_hba.conf" so that we open Postgres's security up wide. If you are familiar with Postgres's security, then disregard steps #20 and #21 and change them to suit your own needs. For me, it's running entirely local, so security is not a concern.

20a. Uncomment line 60 to read: "local all all trust"

20b. Uncomment line 62 to read: "host all all 127.0.0.1 255.255.255.255 trust"

20c. Change line 86 to read: "local all postgres password"

20d. Change line 99 to read: "host all all 0.0.0.0 0.0.0.0 password"

21. Change the password for the PostgreSQL user "postgres" ...

21a. Run "su postgres" to switch to the postgres user.

21b. Run "psql template1" to start the PostgreSQL prompt.

21c. Type "alter user postgres with password 'postgres';" at the prompt to change the password.

21d. Type "\q" at the prompt to quit PostgreSQL.

21e. Type "exit" at the shell prompt to log out of the postgres user and return to root.

21f. Type "/etc/init.d/postgresql restart" to restart Postgres and make the changes effective.
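
The line numbers in step 20 depend on the exact pg_hba.conf that your Postgres package ships, so it helps to look at which rules are really active after editing. The sketch below just filters out comments and blank lines; the function name "active_hba_rules" is made up for this example. Keep in mind that Postgres uses the first rule that matches a connection, so the order of the printed lines matters.

```shell
#!/bin/sh
# active_hba_rules FILE (hypothetical helper): print the lines of a
# pg_hba.conf-style file that are neither blank nor comments, in file
# order. FILE defaults to the path used in step 20.
active_hba_rules() {
    grep -Ev '^[[:space:]]*(#|$)' "${1:-/etc/postgresql/pg_hba.conf}"
}
```

Run "active_hba_rules" as root and compare the output with the four lines from steps 20a through 20d.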

Downloading the MusicBrainz Database Dumps

22. If you haven't done so before starting this how-to, download all 5 of the MusicBrainz database dumps. For this how-to, I suggest putting them into "/home/httpd/musicbrainz". Download the latest release of the database. To see which one is the latest, go to http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/

22a. You can use "wget -c" to download them like this:

wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump.tar.bz2
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-derived.tar.bz2
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-moderation.tar.bz2
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-closedmoderation.tar.bz2
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-artistrelation.tar.bz2
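
Since the five URLs differ only in the dump name, they can be generated from the export directory name, which makes it easier to switch to a newer export. This is a sketch only: "dump_urls" is a name I made up, and the base URL and dump names are copied from the commands above.

```shell
#!/bin/sh
# Base URL and the five dump names, taken from step 22a.
MB_EXPORT_BASE="http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport"
MB_DUMPS="mbdump mbdump-derived mbdump-moderation mbdump-closedmoderation mbdump-artistrelation"

# dump_urls EXPORT_DIR (hypothetical helper): print one download URL per
# dump for the given export directory name, for example 20050528-074029.
dump_urls() {
    for name in $MB_DUMPS; do
        echo "$MB_EXPORT_BASE/$1/$name.tar.bz2"
    done
}
```

To download, pipe the list into wget, which reads URLs from standard input when given "-i -": run "dump_urls 20050528-074029 | wget -c -i -" from "/home/httpd/musicbrainz".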

Importing the Database into PostgreSQL

23. Run "mb_server/admin/InitDb.pl --createdb --echo --import mb*.tar.bz2" to begin importing the MusicBrainz database.

Important note: This process took my computer (P3, 1.0 GHz, 1 GB of RAM, VMware Workstation 5 with 384 MB of RAM for the VMware machine) 4.5 hours to complete! Be patient.

Troubleshooting:

  • If you see something similar to this: "Schema sequence mismatch - codebase is 6, snapshot files are 7.", it means that the Perl scripts that you downloaded in steps #17 through #19 do not match the database dumps that you downloaded in step #22. Make sure that you are using the current trunk as noted in step #18 and that you have the latest MusicBrainz database dumps from step #22!
  • If you see something similar to the following, it means that you ran out of hard drive space:
Error loading /tmp/MBImport-hYEom30W/mbdump/moderation_closed: Error loading data at /drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 259.
$sql->Rollback called without $sql->Begin at /drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 270
    It's a VMware machine, so make sure that you create the virtual hard drive with a large enough maximum size. 20 GB worked for me. Thanks to Dave for both tips.
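
Because running out of disk space only shows up partway through a multi-hour import, a quick check beforehand can save time. The error above shows the import unpacking under "/tmp", so that is one place to look. This is a rough sketch: the function names "free_kb" and "has_free_gb" are hypothetical, and the threshold you pass is your own guess, since I never measured how much space the import really needs.

```shell
#!/bin/sh
# free_kb DIR (hypothetical helper): print the free space, in 1024-byte
# blocks, on the filesystem that holds DIR. The -P flag selects the POSIX
# output format of df, where the available space is the fourth column of
# the second line.
free_kb() {
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

# has_free_gb DIR GB (hypothetical helper): succeed only if the filesystem
# holding DIR has at least GB gigabytes free.
has_free_gb() {
    [ "$(free_kb "$1")" -ge $(( $2 * 1024 * 1024 )) ]
}
```

For example, "has_free_gb /tmp 5 || echo not enough space" prints a warning when less than 5 GB is free. The 5 is just a placeholder value.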

That's all!

Final notes

When you're done, you can download the Postgres database client "pgAdmin III" for Windows from http://www.pgadmin.org/. Connect it to your VMWare machine using the bridged network built into VMWare, and you can do whatever you want with the data.

Good luck.