History:Debian VMWare Database Installation

From MusicBrainz Wiki
Revision as of 17:11, 25 July 2005 by Zout (talk | contribs) (migration fix (Imported from MoinMoin))

MusicBrainz Database under Debian and VMWare HOWTO

This how-to explains, step by step, how to create a VMware Workstation virtual machine that contains just the MusicBrainzDatabase (no TestServer). We will download all the necessary packages and the MusicBrainz database along the way. I tested it successfully during the first week of June 2005.

Prerequisites

  • High speed connection
  • VMware workstation 5
  • 1 or 2 hours in front of the computer to install and set up everything.
  • 2 to 5 hours to import the database. It runs by itself.
  • About 20 GB of hard drive space (my estimate)
  • The latest Debian minimal net install CD.

Installing Debian into a VMWare Workstation

1. Create a VMware machine with a 15-20 GB hard drive.

2. Install Debian ...

2a. Download the latest snapshot installer from Debian.org. Important installation notes:

  • I let Debian use all the defaults for the installation process.
  • When asked if I wanted to install any additional packages (such as "X Desktop", "SQL Server" or "File server"), I chose none of them. We'll just install the ones we need later.

Installing Required Packages for MusicBrainz Server

3. When you're at the prompt, login as root and run "apt-get update" to make sure that you have the latest package list from Debian.

4. Run "apt-get install postgresql" to install Postgres. Important notes: When asked which locale you want, select "C". The default selection, "en_us", does not seem to work.

5. Run "apt-get install postgresql-dev". It is required by a Perl module that will later be installed.

6. Run "apt-get install cvs". Required to download the mb_server via CVS.

7. Run "apt-get install apache-perl". It provides include files required by mb_server's Perl scripts, which we'll download later.

8. Run "apt-get install bzip2". Required to uncompress the full exports.
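If you would rather run steps 3 through 8 in one go, something like the sketch below should do it. It is only a convenience wrapper around the same apt-get commands. The function name install_mb_packages and the APT_GET override are made up for this example, and the package names are the ones used above (they may be different on newer Debian releases).

```shell
#!/bin/sh
# Sketch: steps 3-8 as a single function. Run as root.
# install_mb_packages and APT_GET are hypothetical names for this example.
# Set APT_GET=echo to do a dry run that only prints the arguments.
APT_GET=${APT_GET:-apt-get}
MB_PACKAGES="postgresql postgresql-dev cvs apache-perl bzip2"

install_mb_packages() {
    # Refresh the package list first (step 3).
    $APT_GET update || return 1
    # Install the packages from steps 4-8. You will still be asked for the
    # PostgreSQL locale; choose "C" as described in step 4.
    # $MB_PACKAGES is intentionally unquoted so that it splits into words.
    $APT_GET install $MB_PACKAGES
}

# Usage (as root): install_mb_packages
```

The function is only defined here, not called, so pasting the block does not install anything until you type install_mb_packages yourself.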

Installing Perl Modules Required to Import MusicBrainz Database

9. Run "cpan" to start the CPAN shell. Important notes: Because this is the first time we've run CPAN, it will ask if we want to manually configure CPAN. I answered "no".

10. Type "install DBI" in the CPAN shell to install DBI.

11. Type "install DBD::Pg" in the CPAN shell to install DBD::Pg. Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

12. Type "install Text::Unaccent". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

13. Type "install Date::Calc". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.

14. Type "install String::ShellQuote". Important notes: If it asks to follow other dependencies, say "yes" and let it download and install those too.
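Once steps 9 through 14 are finished, it can be worth confirming that every module actually loads before moving on. Here is a minimal sketch, assuming perl is on the PATH. The function name missing_perl_modules is invented for this example and is not part of mb_server.

```shell
#!/bin/sh
# Sketch: print the name of every Perl module that fails to load.
# missing_perl_modules is a hypothetical helper, not part of mb_server.
missing_perl_modules() {
    for module in "$@"; do
        # "perl -MModule -e 1" exits non-zero if the module cannot be loaded.
        perl -M"$module" -e 1 >/dev/null 2>&1 || echo "$module"
    done
}

# Usage: no output means all five modules from steps 10-14 are installed.
# missing_perl_modules DBI DBD::Pg Text::Unaccent Date::Calc String::ShellQuote
```

Any name it prints is a module that still needs an "install ..." in the CPAN shell.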

Getting mb_server from MusicBrainz CVS

15. Copy and paste the following lines directly into the command prompt. This will set up the environment variables necessary to download "mb_server" from CVS over an SSH connection.

CVSROOT=:ext:cvs@cvs.musicbrainz.org:/var/cvs 
 
export CVSROOT 
 
wget ftp://ftp.musicbrainz.org/pub/musicbrainz/misc/musicbrainz_anoncvs 
 
chmod 700 musicbrainz_anoncvs 
 
mv musicbrainz_anoncvs ~/ 
 
CVS_RSH=~/musicbrainz_anoncvs 
 
export CVS_RSH 
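Before starting the checkout, you may want to confirm that both variables from step 15 ended up set in your current shell. This is a rough sketch and check_cvs_env is a made-up helper name.

```shell
#!/bin/sh
# Sketch: sanity-check the CVS environment set up in step 15.
# check_cvs_env is a hypothetical helper name.
check_cvs_env() {
    if [ -z "$CVSROOT" ]; then
        echo "CVSROOT is not set" >&2
        return 1
    fi
    # CVS_RSH must point at the downloaded, executable helper script.
    if [ ! -x "$CVS_RSH" ]; then
        echo "CVS_RSH does not point to an executable file" >&2
        return 1
    fi
    echo "CVS environment looks OK: $CVSROOT via $CVS_RSH"
}

# Usage: check_cvs_env
```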
 

17. Type the following into the command prompt. This will create the default directories found in the mb_server Perl scripts so that you won't have to change any lines in the Musicbrainz code.

mkdir /home/httpd 
 
mkdir /home/httpd/musicbrainz 
 
cd /home/httpd/musicbrainz 
 

Important note: From this point on, we're going to be working in the "/home/httpd/musicbrainz" directory.

18. Run "cvs co -r RELEASE_20050527-BRANCH mb_server" to begin downloading the mb_server branch. This was the most current mb_server branch as of this writing. I would advise you to check http://musicbrainz.org/CVS/Tag to see which branch is the most current.

19. Run "ln -s /usr/bin/perl mb_server/cgi-bin/perl" to symbolically link Perl. Some of the mb_server scripts look for Perl in a different place than it's really installed on Debian. This will fix that issue.

Change PostgreSQL's Permissions to Allow Everyone

20. Edit "/etc/postgresql/pg_hba.conf" so that we open Postgres's security up wide. If you are familiar with Postgres's security, then disregard steps #20 and #21 and change them to suit your own needs. For me, it's running entirely local, so security is not a concern.

20a. Uncomment line 60 to read: "local all all trust"

20b. Uncomment line 62 to read: "host all all 127.0.0.1 255.255.255.255 trust"

20c. Change line 86 to read: "local all postgres password"

20d. Change line 99 to read: "host all all 0.0.0.0 0.0.0.0 password"
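The line numbers above are from my copy of the file and may well be different in yours, so after editing it helps to look at only the lines that are actually active. Keep in mind that PostgreSQL uses the first entry that matches a connection, so the order of the lines matters. The helper below is just a sketch and active_hba_entries is a name I made up.

```shell
#!/bin/sh
# Sketch: list the active (non-comment, non-blank) lines of a pg_hba.conf.
# active_hba_entries is a hypothetical helper name.
active_hba_entries() {
    # Drop lines that are empty or whose first non-space character is "#".
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Usage: active_hba_entries /etc/postgresql/pg_hba.conf
```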

21. Change the password for the PostgreSQL user "postgres" ...

21a. Run "su postgres" to switch to the "postgres" system user.

21b. Run "psql template1" to start the PostgreSQL prompt.

21c. Type "alter user postgres with password 'postgres';" at the prompt to change the password.

21d. Type "\q" at the prompt to quit PostgreSQL.

21e. Type "exit" at the shell prompt to logout of the postgres user and back to root.

21f. Type "/etc/init.d/postgresql restart" to restart Postgres and make the changes effective.
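Step 21 can probably also be done without the interactive psql session. The sketch below is one way to do it, not something I tested as part of this how-to. The function name set_postgres_password and the SU and INIT_SCRIPT override variables are assumptions for this example; the psql statement and the init script path are the ones from the steps above.

```shell
#!/bin/sh
# Sketch: step 21 without the interactive psql session. Run as root.
# set_postgres_password, SU and INIT_SCRIPT are hypothetical names.
# Set SU=echo and INIT_SCRIPT=echo for a dry run that only prints arguments.
SU=${SU:-su}
INIT_SCRIPT=${INIT_SCRIPT:-/etc/init.d/postgresql}

set_postgres_password() {
    # Note: a password containing a single quote would break the SQL below.
    new_password=$1
    # Run psql as the "postgres" system user and issue the ALTER USER statement.
    $SU postgres -c "psql -d template1 -c \"alter user postgres with password '$new_password';\"" || return 1
    # Restart PostgreSQL, as in step 21f.
    $INIT_SCRIPT restart
}

# Usage (as root): set_postgres_password postgres
```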

Downloading the MusicBrainz Database Dumps

22. If you haven't done so before starting this how-to, download all 5 of the MusicBrainz database dumps. For this how-to, I suggest putting them into "/home/httpd/musicbrainz". Download the latest release of the database. To check which one is the latest, go to http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/

22a. You can use "wget -c" to download them like this:

wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump.tar.bz2 
 
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-derived.tar.bz2 
 
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-moderation.tar.bz2 
 
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-closedmoderation.tar.bz2 
 
wget -c http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport/20050528-074029/mbdump-artistrelation.tar.bz2 
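Since the five URLs only differ in the file name, a small loop saves some typing when a newer export comes out. This is a sketch: dump_urls and download_dumps are made-up helper names, and the export directory name is passed in as an argument (20050528-074029 is the one used above).

```shell
#!/bin/sh
# Sketch: build the five download URLs from step 22a for a given export.
# dump_urls and download_dumps are hypothetical helper names.
MB_EXPORT_BASE="http://ftp.musicbrainz.org/pub/musicbrainz/data/fullexport"
MB_DUMPS="mbdump mbdump-derived mbdump-moderation mbdump-closedmoderation mbdump-artistrelation"

dump_urls() {
    export_id=$1   # directory name of the export, e.g. 20050528-074029
    for name in $MB_DUMPS; do
        echo "$MB_EXPORT_BASE/$export_id/$name.tar.bz2"
    done
}

download_dumps() {
    # "wget -c" resumes a partially downloaded file instead of starting over.
    for url in $(dump_urls "$1"); do
        wget -c "$url" || return 1
    done
}

# Usage: download_dumps 20050528-074029
```

Running dump_urls on its own just prints the URLs, which is a cheap way to check them before downloading anything.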
 

Importing the Database into PostgreSQL

23. Run "mb_server/admin/InitDb.pl --createdb --echo --import mb*.tar.bz2" to begin importing the Musicbrainz database.

Important note: This process took my computer (P3, 1.0 GHz, 1 GB of RAM, VMware Workstation 5 with 384 MB of RAM for the VMware machine) 4.5 hours to complete! Be patient.
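Because the "mb*.tar.bz2" pattern in step 23 simply matches whatever files are there, a dump that failed to download is skipped without any warning. Given how long the import takes, a quick check beforehand may save you a rerun. This is a sketch and missing_dumps is a made-up helper name.

```shell
#!/bin/sh
# Sketch: report which of the five dump files are absent from a directory.
# missing_dumps is a hypothetical helper name.
missing_dumps() {
    dir=${1:-.}   # directory to check; defaults to the current directory
    for name in mbdump mbdump-derived mbdump-moderation \
                mbdump-closedmoderation mbdump-artistrelation; do
        [ -f "$dir/$name.tar.bz2" ] || echo "$name.tar.bz2"
    done
}

# Usage: no output means all five files are present.
# missing_dumps /home/httpd/musicbrainz
```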

Troubleshooting:

  • If you see something similar to this: "Schema sequence mismatch - codebase is 6, snapshot files are 7.", it means that the Perl scripts that you downloaded in steps #15 through #19 do not match the database you downloaded in step #22. Make sure that you are using the most current branch as noted in step #18 and that you have the latest MusicBrainz databases in step #22!
  • If you see something similar to this:
Error loading /tmp/MBImport-hYEom30W/mbdump/moderation_closed: Error loading data at /drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 259. 
 
$sql->Rollback called without $sql->Begin at /drive2/home/httpd/musicbrainz/mb_server/admin/MBImport.pl line 270 
 
    ...it means that you ran out of hard drive space. Because this is a VMware machine, make sure that you create the virtual hard drive with a large enough maximum size. 20 GB worked for me. Thanks to Dave for both tips.
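To catch the disk space problem before the import dies hours in, you can look at the free space first. The error above shows the import unpacking under /tmp, so that is one place worth checking. This is only a sketch: free_kb and has_free_kb are made-up helper names, and the threshold is something you have to pick yourself, since I never measured the exact amount needed.

```shell
#!/bin/sh
# Sketch: show free space (in kilobytes) on the filesystem holding a directory.
# free_kb and has_free_kb are hypothetical helper names.
free_kb() {
    # -P forces one-line POSIX output, -k reports 1024-byte blocks;
    # column 4 of the second line is the available space.
    df -Pk "${1:-.}" | awk 'NR == 2 { print $4 }'
}

has_free_kb() {
    # $1 = directory, $2 = required free space in KB (pick your own threshold).
    [ "$(free_kb "$1")" -ge "$2" ]
}

# Usage: free_kb /tmp
# Usage: has_free_kb /tmp 1048576 || echo "less than 1 GB free in /tmp"
```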

That's all!

Final notes

When you're done, you can download the Postgres database client "pgAdmin III" for Windows from http://www.pgadmin.org/. Connect it to your VMWare machine using the bridged network built into VMWare, and you can do whatever you want with the data.

Good luck.