Difference between revisions of "User:RobertKaye/Schema Change Release May 2012"

From MusicBrainz Wiki
(Database changes)
 
(19 intermediate revisions by 4 users not shown)
Line 1: Line 1:
== Tickets under consideration for May 15 2012 ==
+
== Tickets to be implemented for May 2012 release ==
  
=== Conflict tickets ===
+
==== http://tickets.musicbrainz.org/browse/MBS-2885 ====
  
This ticket is a lot of work and may conflict with a Summer of Code proposal:
+
'''Goal''': Allow more than one ISWC per work
* http://tickets.musicbrainz.org/browse/MBS-799
 
  
I'm unsure how to proceed with this ticket.
+
===== Synopsis =====
  
=== Clearly stated tickets ===
+
This change will allow users to add more than one ISWC to a work, by moving the ISWC column from the work table into its own ''iswc'' table.
  
These tickets appear to be cleanly stated and nearly actionable:
 
 
==== http://tickets.musicbrainz.org/browse/MBS-2885 ====
 
 
===== Database changes =====
 
===== Database changes =====
 
<pre>
 
<pre>
 
CREATE TABLE iswc
 
CREATE TABLE iswc
 
(
 
(
     id            SERIAL,
+
     id            SERIAL NOT NULL PRIMARY KEY,
     work          INTEGER,
+
     work          INTEGER NOT NULL REFERENCES work (id),
     iswc          CHARACTER(15),
+
     iswc          CHARACTER(15) CHECK (iswc ~ '^T-?\d{3}.?\d{3}.?\d{3}[-.]?\d$'::text),
 
     source        SMALLINT,
 
     source        SMALLINT,
     edits_pending  INTEGER,
+
     edits_pending  INTEGER NOT NULL DEFAULT 0,
     created        TIMESTAMP WITH TIME ZONE
+
     created        TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now()
 
);
 
);
 
ALTER TABLE work DROP COLUMN iswc;
 
ALTER TABLE work DROP COLUMN iswc;
 
</pre>
 
</pre>
  
* ''TODO'': Clarify constraints, clarify primary key
+
==== http://tickets.musicbrainz.org/browse/MBS-2851 ====
 +
 
 +
'''Goal''': Timeline graph events should be in the database
 +
 
 +
===== Synopsis =====
 +
 
 +
Right now the timeline events are stored in code and this ticket moves these items to our database. This table will be created and replicated, but it will initially be blank as we wont have the time to fully implement this feature for May 15th. The actual feature will be implemented in a future non-schema change release.
  
==== http://tickets.musicbrainz.org/browse/MBS-2851 ====
 
 
===== Database changes =====
 
===== Database changes =====
 
<pre>
 
<pre>
Line 40: Line 41:
 
</pre>
 
</pre>
  
* ''TODO'': Does this belong in the <tt>musicbrainz</tt> schema, or do we introduce a <tt>statistics</tt> schema?
+
NOTE: This ticket will have the schema change made to the DB, but the actual features will be implemented after the May 15th release.
** The proper course of action is to introduce the new statistics schema and move all statistics tables there. Lets not create a new schema with some stuff in it, but not all statistics. But, I dont see this as necessary for this release.
 
  
 
==== http://tickets.musicbrainz.org/browse/MBS-2532 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-2532 ====
 +
 +
'''Goal''': Allow more than one IPI per artist
 +
 +
===== Synopsis =====
 +
 +
This change will allow users to add more than one IPI to an artist/label, by moving the IPI column from the artist/label tables into  ''artist_ipi'' and '''label_ipi''' tables.
 +
 
===== Database changes =====
 
===== Database changes =====
 
<pre>
 
<pre>
Line 67: Line 74:
  
 
==== http://tickets.musicbrainz.org/browse/MBS-3788 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-3788 ====
 +
 +
'''Goal''': Alias improvements
 +
 +
===== Synopsis =====
 +
 +
This change extends our capabilities for handling artist and label aliases. A new type, sort name, begin and end dates and a primary flag columns will be added.
 +
 
===== Database changes =====
 
===== Database changes =====
 
The changes just for artists are:
 
The changes just for artists are:
Line 92: Line 106:
  
 
CREATE UNIQUE INDEX artist_alias_idx_primary ON artist_alias (artist, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;
 
CREATE UNIQUE INDEX artist_alias_idx_primary ON artist_alias (artist, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;
 +
 +
CREATE TABLE label_alias_type (
 +
    id SERIAL NOT NULL PRIMARY KEY,
 +
    name TEXT NOT NULL
 +
);
 +
 +
ALTER TABLE label_alias ADD COLUMN type INTEGER;
 +
ALTER TABLE label_alias ADD COLUMN sort_name INTEGER NOT NULL;
 +
ALTER TABLE label_alias ADD COLUMN begin_date_year SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN begin_date_month SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN begin_date_day SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN end_date_year SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN end_date_month SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN end_date_day SMALLINT;
 +
ALTER TABLE label_alias ADD COLUMN primary_for_locale BOOLEAN NOT NULL DEFAULT false;
 +
 +
ALTER TABLE label_alias ADD CONSTRAINT primary_check
 +
CHECK (locale IS NULL AND primary_for_locale IS FALSE); 
 +
 +
DROP INDEX label_alias_idx_locale_label;
 +
 +
CREATE UNIQUE INDEX label_alias_idx_primary ON label_alias (label, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;
 +
 +
CREATE TABLE work_alias_type (
 +
    id SERIAL NOT NULL PRIMARY KEY,
 +
    name TEXT NOT NULL
 +
);
 +
 +
ALTER TABLE work_alias ADD COLUMN type INTEGER;
 +
ALTER TABLE work_alias ADD COLUMN sort_name INTEGER NOT NULL;
 +
ALTER TABLE work_alias ADD COLUMN begin_date_year SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN begin_date_month SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN begin_date_day SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN end_date_year SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN end_date_month SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN end_date_day SMALLINT;
 +
ALTER TABLE work_alias ADD COLUMN primary_for_locale BOOLEAN NOT NULL DEFAULT false;
 +
 +
ALTER TABLE work_alias ADD CONSTRAINT primary_check
 +
CHECK (locale IS NULL AND primary_for_locale IS FALSE); 
 +
 +
DROP INDEX work_alias_idx_locale_work;
 +
 +
CREATE UNIQUE INDEX work_alias_idx_primary ON work_alias (work, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;
 
</pre>
 
</pre>
  
* ''TODO'': Decide on whether we need separate type tables for each entity, or if they can share a type table
+
==== http://tickets.musicbrainz.org/browse/MBS-842 ====
 +
 
 +
'''Goal''': Allow vinyl style track numbers and sides
 +
 
 +
===== Synopsis =====
 +
 
 +
We currently only allow simple numerics for track numbers; this ticket will allow more flexible arbitrary track numbers so that we can correctly catalog vinyl recordings.
  
==== http://tickets.musicbrainz.org/browse/MBS-842 ====
 
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
ALTER TABLE track ADD COLUMN number VARCHAR(255);
+
ALTER TABLE track ADD COLUMN number TEXT;
 
UPDATE track SET number = position;
 
UPDATE track SET number = position;
 
ALTER TABLE track ALTER COLUMN number SET NOT NULL;
 
ALTER TABLE track ALTER COLUMN number SET NOT NULL;
 
</pre>
 
</pre>
 +
 +
* The above accurately describes the database changes, but the UPDATE command would run a very long time.  We will implement this differently in the schema upgrade scripts.
 +
** Yes, and it will make major changes to a replicated table, so this change requires a lot of careful thought.
  
 
==== http://tickets.musicbrainz.org/browse/MBS-3646 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-3646 ====
 +
 +
'''Goal''': Split release group attributes into two types
 +
 +
===== Synopsis =====
 +
 +
Right now the release group attributes are too inflexible; this ticket will improve this by adding a secondary type that can be used to flexibly modify the primary type.
 +
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
CREATE TABLE content_type (
+
CREATE TABLE release_group_secondary_type (
 
     id SERIAL NOT NULL PRIMARY KEY,
 
     id SERIAL NOT NULL PRIMARY KEY,
 
     name TEXT NOT NULL
 
     name TEXT NOT NULL
 
);
 
);
  
CREATE TABLE release_group_content_type (
+
CREATE TABLE release_group_secondary_type_join (
 
     release_group INTEGER NOT NULL REFERENCES release_group (id),
 
     release_group INTEGER NOT NULL REFERENCES release_group (id),
     content_type INTEGER NOT NULL REFERENCES content_type (id),
+
     content_type INTEGER NOT NULL REFERENCES release_group_secondary_type (id),
 
     PRIMARY KEY (release_group, content_type)
 
     PRIMARY KEY (release_group, content_type)
 
);
 
);
 +
 +
ALTER TABLE release_group_type RENAME to release_group_primary_type
 
</pre>
 
</pre>
  
* ''TODO'': Is <tt>content_type</tt> really the correct name?
+
==== http://tickets.musicbrainz.org/browse/MBS-1799 ====
 +
 
 +
'''Goal''': Add ISO 639-3 language codes to the database
 +
 
 +
===== Synopsis =====
 +
 
 +
As the label on the tin says: This ticket adds more ISO language codes to the database and normalizes column names.
  
==== http://tickets.musicbrainz.org/browse/MBS-1799 ====
 
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
Line 131: Line 211:
  
 
==== http://tickets.musicbrainz.org/browse/MBS-1798 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-1798 ====
 +
 +
'''Goal''': Lyrics language for works
 +
 +
===== Synopsis =====
 +
 +
This ticket adds a language column to the works table.
 +
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
Line 138: Line 225:
  
 
==== http://tickets.musicbrainz.org/browse/MBS-1385 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-1385 ====
 +
 +
'''Goal''': Support unknown end dates
 +
 +
===== Synopsis =====
 +
 +
This proposal adds an ''ended'' column to the link, artist and label tables to clearly indicate that the entity is no longer an active entity.
 +
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
Line 146: Line 240:
  
 
==== http://tickets.musicbrainz.org/browse/MBS-4337 ====
 
==== http://tickets.musicbrainz.org/browse/MBS-4337 ====
 +
 +
'''Goal''': Make user profile more social: add (optional) fields avatar, gender, birth year, country
 +
 +
===== Synopsis =====
 +
 +
Add more information to the user profiles. This ticket will add birth_year, gender, country and language skills to editor table. Note: The editor table is not replicated and will not affect Live Data Feed users.
 +
 
===== Database Changes =====
 
===== Database Changes =====
 
<pre>
 
<pre>
Line 166: Line 267:
 
CREATE INDEX editor_language_idx_language ON editor_language (language);
 
CREATE INDEX editor_language_idx_language ON editor_language (language);
 
</pre>
 
</pre>
 +
 +
 +
==== http://tickets.musicbrainz.org/browse/MBS-3189 ====
 +
 +
'''Goal''': Remove unused ref_count column and related functions
 +
 +
===== Synopsis =====
 +
 +
Remove the unused ref_count column from the url table.
 +
 +
===== Database Changes =====
 +
 +
<pre>
 +
ALTER TABLE url DROP COLUMN ref_count;
 +
</pre>
 +
  
 
== Next Steps ==
 
== Next Steps ==
  
For each of these tickets, we will require the following two pieces of work:
+
For each of these tickets, we will require the following pieces of work:
  
* Oliver and Warp will review these tickets and sanity check the proposed schema changes. Due date: 29 March
 
 
* Specify exactly what the user interface changes will be. Mock ups would be great. Due date: 2 April
 
* Specify exactly what the user interface changes will be. Mock ups would be great. Due date: 2 April
 
+
* Discuss what to do about the upgrade replication packet. As at least one bug requires updating an entire table, this will produce a very large packet.
  
 
Tickets that do not provide the required information by the given due date will be removed from the 2012-05-15 schema change.
 
Tickets that do not provide the required information by the given due date will be removed from the 2012-05-15 schema change.

Latest revision as of 22:46, 2 April 2012

Tickets to be implemented for May 2012 release

http://tickets.musicbrainz.org/browse/MBS-2885

Goal: Allow more than one ISWC per work

Synopsis

This change will allow users to add more than one ISWC to a work, by moving the ISWC column from the work table into its own iswc table.

Database changes
CREATE TABLE iswc
(
    id             SERIAL NOT NULL PRIMARY KEY,
    work           INTEGER NOT NULL REFERENCES work (id),
    iswc           CHARACTER(15) CHECK (iswc ~ '^T-?\d{3}.?\d{3}.?\d{3}[-.]?\d$'::text),
    source         SMALLINT,
    edits_pending  INTEGER NOT NULL DEFAULT 0,
    created        TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT now()
);
ALTER TABLE work DROP COLUMN iswc;

http://tickets.musicbrainz.org/browse/MBS-2851

Goal: Timeline graph events should be in the database

Synopsis

Right now the timeline events are stored in code and this ticket moves these items to our database. This table will be created and replicated, but it will initially be blank as we wont have the time to fully implement this feature for May 15th. The actual feature will be implemented in a future non-schema change release.

Database changes
CREATE TABLE timeline_event (
    event_date DATE PRIMARY KEY,
    event TEXT NOT NULL,
    description TEXT NOT NULL,
    link TEXT NOT NULL
);

NOTE: This ticket will have the schema change made to the DB, but the actual features will be implemented after the May 15th release.

http://tickets.musicbrainz.org/browse/MBS-2532

Goal: Allow more than one IPI per artist

Synopsis

This change will allow users to add more than one IPI to an artist/label, by moving the IPI column from the artist/label tables into artist_ipi and label_ipi tables.

Database changes
CREATE TABLE artist_ipi
(
    artist              INTEGER NOT NULL REFERENCES artist (id),
    ipi                 VARCHAR(11) NOT NULL,
    edits_pending       INTEGER NOT NULL DEFAULT 0 CHECK (edits_pending >= 0),
    created             TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

CREATE TABLE label_ipi
(
    label               INTEGER NOT NULL REFERENCES label (id),
    ipi                 VARCHAR(11) NOT NULL,
    edits_pending       INTEGER NOT NULL DEFAULT 0 CHECK (edits_pending >= 0),
    created             TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

ALTER TABLE artist DROP COLUMN ipi_code;
ALTER TABLE label DROP COLUMN ipi_code;

http://tickets.musicbrainz.org/browse/MBS-3788

Goal: Alias improvements

Synopsis

This change extends our capabilities for handling artist and label aliases. A new type, sort name, begin and end dates and a primary flag columns will be added.

Database changes

The changes just for artists are:

CREATE TABLE artist_alias_type (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
);

ALTER TABLE artist_alias ADD COLUMN type INTEGER;
ALTER TABLE artist_alias ADD COLUMN sort_name INTEGER NOT NULL;
ALTER TABLE artist_alias ADD COLUMN begin_date_year SMALLINT;
ALTER TABLE artist_alias ADD COLUMN begin_date_month SMALLINT;
ALTER TABLE artist_alias ADD COLUMN begin_date_day SMALLINT;
ALTER TABLE artist_alias ADD COLUMN end_date_year SMALLINT;
ALTER TABLE artist_alias ADD COLUMN end_date_month SMALLINT;
ALTER TABLE artist_alias ADD COLUMN end_date_day SMALLINT;
ALTER TABLE artist_alias ADD COLUMN primary_for_locale BOOLEAN NOT NULL DEFAULT false;

ALTER TABLE artist_alias ADD CONSTRAINT primary_check
CHECK (locale IS NULL AND primary_for_locale IS FALSE);  

DROP INDEX artist_alias_idx_locale_artist;

CREATE UNIQUE INDEX artist_alias_idx_primary ON artist_alias (artist, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;

CREATE TABLE label_alias_type (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
);

ALTER TABLE label_alias ADD COLUMN type INTEGER;
ALTER TABLE label_alias ADD COLUMN sort_name INTEGER NOT NULL;
ALTER TABLE label_alias ADD COLUMN begin_date_year SMALLINT;
ALTER TABLE label_alias ADD COLUMN begin_date_month SMALLINT;
ALTER TABLE label_alias ADD COLUMN begin_date_day SMALLINT;
ALTER TABLE label_alias ADD COLUMN end_date_year SMALLINT;
ALTER TABLE label_alias ADD COLUMN end_date_month SMALLINT;
ALTER TABLE label_alias ADD COLUMN end_date_day SMALLINT;
ALTER TABLE label_alias ADD COLUMN primary_for_locale BOOLEAN NOT NULL DEFAULT false;

ALTER TABLE label_alias ADD CONSTRAINT primary_check
CHECK (locale IS NULL AND primary_for_locale IS FALSE);  

DROP INDEX label_alias_idx_locale_label;

CREATE UNIQUE INDEX label_alias_idx_primary ON label_alias (label, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;

CREATE TABLE work_alias_type (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
);

ALTER TABLE work_alias ADD COLUMN type INTEGER;
ALTER TABLE work_alias ADD COLUMN sort_name INTEGER NOT NULL;
ALTER TABLE work_alias ADD COLUMN begin_date_year SMALLINT;
ALTER TABLE work_alias ADD COLUMN begin_date_month SMALLINT;
ALTER TABLE work_alias ADD COLUMN begin_date_day SMALLINT;
ALTER TABLE work_alias ADD COLUMN end_date_year SMALLINT;
ALTER TABLE work_alias ADD COLUMN end_date_month SMALLINT;
ALTER TABLE work_alias ADD COLUMN end_date_day SMALLINT;
ALTER TABLE work_alias ADD COLUMN primary_for_locale BOOLEAN NOT NULL DEFAULT false;

ALTER TABLE work_alias ADD CONSTRAINT primary_check
CHECK (locale IS NULL AND primary_for_locale IS FALSE);  

DROP INDEX work_alias_idx_locale_work;

CREATE UNIQUE INDEX work_alias_idx_primary ON work_alias (work, locale) WHERE primary_for_locale = TRUE AND locale IS NOT NULL;

http://tickets.musicbrainz.org/browse/MBS-842

Goal: Allow vinyl style track numbers and sides

Synopsis

We currently only allow simple numerics for track numbers; this ticket will allow more flexible arbitrary track numbers so that we can correctly catalog vinyl recordings.

Database Changes
ALTER TABLE track ADD COLUMN number TEXT;
UPDATE track SET number = position;
ALTER TABLE track ALTER COLUMN number SET NOT NULL;
  • The above accurately describes the database changes, but the UPDATE command would run a very long time. We will implement this differently in the schema upgrade scripts.
    • Yes, and it will make major changes to a replicated table, so this change requires a lot of careful thought.

http://tickets.musicbrainz.org/browse/MBS-3646

Goal: Split release group attributes into two types

Synopsis

Right now the release group attributes are too inflexible; this ticket will improve this by adding a secondary type that can be used to flexibly modify the primary type.

Database Changes
CREATE TABLE release_group_secondary_type (
    id SERIAL NOT NULL PRIMARY KEY,
    name TEXT NOT NULL
);

CREATE TABLE release_group_secondary_type_join (
    release_group INTEGER NOT NULL REFERENCES release_group (id),
    content_type INTEGER NOT NULL REFERENCES release_group_secondary_type (id),
    PRIMARY KEY (release_group, content_type)
);

ALTER TABLE release_group_type RENAME to release_group_primary_type

http://tickets.musicbrainz.org/browse/MBS-1799

Goal: Add ISO 639-3 language codes to the database

Synopsis

As the label on the tin says: This ticket adds more ISO language codes to the database and normalizes column names.

Database Changes
ALTER TABLE language ADD COLUMN iso_code_3 CHAR(3) NOT NULL;
ALTER TABLE language RENAME COLUMN iso_code_2 TO iso_code_1;
ALTER TABLE language RENAME COLUMN iso_code_3b TO iso_code_2b;
ALTER TABLE language RENAME COLUMN iso_code_3t TO iso_code_2t;

http://tickets.musicbrainz.org/browse/MBS-1798

Goal: Lyrics language for works

Synopsis

This ticket adds a language column to the works table.

Database Changes
ALTER TABLE work ADD COLUMN language INTEGER;
ALTER TABLE work ADD CONSTRAINT work_fk_language FOREIGN KEY (language) REFERENCES language (id);

http://tickets.musicbrainz.org/browse/MBS-1385

Goal: Support unknown end dates

Synopsis

This proposal adds an ended column to the link, artist and label tables to clearly indicate that the entity is no longer an active entity.

Database Changes
ALTER TABLE link ADD COLUMN ended BOOLEAN NOT NULL DEFAULT FALSE;
ALTER TABLE artist ADD COLUMN ended BOOLEAN NOT NULL DEFAULT FALSE;
ALTER TABLE label ADD COLUMN ended BOOLEAN NOT NULL DEFAULT FALSE;

http://tickets.musicbrainz.org/browse/MBS-4337

Goal: Make user profile more social: add (optional) fields avatar, gender, birth year, country

Synopsis

Add more information to the user profiles. This ticket will add birth_year, gender, country and language skills to editor table. Note: The editor table is not replicated and will not affect Live Data Feed users.

Database Changes
ALTER TABLE editor ADD COLUMN birth_year SMALLINT;
ALTER TABLE editor ADD COLUMN gender INTEGER;
ALTER TABLE editor ADD COLUMN country INTEGER;

ALTER TABLE editor ADD CONSTRAINT editor_fk_gender FOREIGN KEY (gender) REFERENCES gender (id);
ALTER TABLE editor ADD CONSTRAINT editor_fk_country FOREIGN KEY (country) REFERENCES country (id);

CREATE TYPE FLUENCY AS ENUM ('basic', 'intermediate', 'advanced', 'native');

CREATE TABLE editor_language (
    editor INTEGER NOT NULL REFERENCES editor (id),
    language INTEGER NOT NULL REFERENCES language (id),
    fluency FLUENCY NOT NULL,
    PRIMARY KEY (editor, language)
);

CREATE INDEX editor_language_idx_language ON editor_language (language);


http://tickets.musicbrainz.org/browse/MBS-3189

Goal: Remove unused ref_count column and related functions

Synopsis

Remove the unused ref_count column from the url table.

Database Changes
ALTER TABLE url DROP COLUMN ref_count;


Next Steps

For each of these tickets, we will require the following pieces of work:

  • Specify exactly what the user interface changes will be. Mock ups would be great. Due date: 2 April
  • Discuss what to do about the upgrade replication packet. As at least one bug requires updating an entire table, this will produce a very large packet.

Tickets that do not provide the required information by the given due date will be removed from the 2012-05-15 schema change.