History:Data Quality: Difference between revisions

From MusicBrainz Wiki
((Imported from MoinMoin))

Revision as of 23:02, 12 February 2007

Data Quality and Editing Strictness

  • Status: This is work in progress as of Feb 2007.

The concept of data quality is based on the release locking suggestions in QualityAndQuantity and ReleaseLocking. After much discussion and even more time, the concept has undergone a number of changes and will now actually be implemented. This page describes the idea, and the DataQualityDiscussion page lets users chime in on its merits.

Goals

The data quality idea has the following goals:

  • Establish a method to determine the quality of an artist and the releases that belong to that artist. This gives consumers of MusicBrainz data a clue about the relative quality of the data in the database.
  • Provide fine-grained control over how much effort is required to edit the database and to vote on those edits.
  • Provide editors with a means to allow easier editing of data that is deemed to be of poor quality.
  • Provide editors with a means to make it harder to edit data that is considered to be of good quality.
  • Reduce the overall number of edits in the system by tailoring the requirements for passing an edit to each edit type.

End user feature changes

To accomplish these goals, this feature will allow editors to indicate the quality of a given artist. An artist can be of unknown, low, medium or high data quality. The data quality indicator determines what level of effort is required to change the artist information or to add/remove albums from an artist. An artist with unknown or medium quality will require roughly the amount of effort that MusicBrainz currently requires to edit the database. An artist with low data quality will be easier to edit: adding/removing albums or changing the artist information (name, sortname, aliases) will take less effort. An artist with high data quality will require more effort to add/remove albums or to change the artist information. The data quality concept applies to releases in the same manner: changing a release with low data quality will be easier than changing a release with high data quality.

Each artist will have a new link in the edit bar: Change artist quality. This link will allow the user to select a new quality rating for the artist. Each album will have a similar link in its edit bar: Change release quality. As with the artist, this link allows the user to change the data quality rating for the release. Changing the quality rating for releases will also be available as a batch operation.

The daily artist subscription email will now inform users when the quality of an artist or a release belonging to that artist has been changed.

Data quality affects edit strictness

The quality rating for an artist/release will determine the following edit strictness values:

  • Edit voting duration (in days)
  • Number of unanimous votes to pass
  • Expire action: Accept, reject
  • EditTypes which are AutoEdits

As a rough illustration, the data quality levels could influence the edit strictness as follows:

Strictness value           | Normal or Unknown                          | High    | Low
Voting period              | 2 weeks (3 weeks if there are subscribers) | 2 weeks | 1 week
Yes votes required to pass | +1 (=1 more yes than no)                   | +3      | +1
Action on expiration       | accept                                     | reject  | accept
AutoEdits                  | see EditType                               | none    | All non-structural changes

(The table above needs to be replaced with a detailed table that lists all of the edit types and their associated edit strictness values)
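
The illustrative values above could be expressed as a small lookup table. The following Python sketch is not the MusicBrainz implementation; the names Quality, Strictness, STRICTNESS, voting_days and resolve_edit are hypothetical. It assumes that "+3" means three more yes votes than no votes (by analogy with the "+1" entry), and that the expiration action applies when the voting period ends without the required margin being reached. AutoEdits are left out because the per-EditType list does not exist yet.

```python
from dataclasses import dataclass
from enum import Enum


class Quality(Enum):
    UNKNOWN = "unknown"
    LOW = "low"
    NORMAL = "normal"   # the prose calls this level "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Strictness:
    voting_days: int     # length of the voting period in days
    yes_margin: int      # yes votes required beyond the number of no votes
    expire_action: str   # "accept" or "reject"


# Values copied from the rough illustration; they are not final.
STRICTNESS = {
    Quality.UNKNOWN: Strictness(14, 1, "accept"),
    Quality.NORMAL: Strictness(14, 1, "accept"),
    Quality.HIGH: Strictness(14, 3, "reject"),
    Quality.LOW: Strictness(7, 1, "accept"),
}


def voting_days(quality, has_subscribers=False):
    """Return the voting period in days for an artist/release quality."""
    days = STRICTNESS[quality].voting_days
    # The table extends Normal/Unknown to 3 weeks if there are subscribers.
    if has_subscribers and quality in (Quality.NORMAL, Quality.UNKNOWN):
        days = 21
    return days


def resolve_edit(quality, yes_votes, no_votes, expired):
    """Return "accept", "reject" or "open" (assumed semantics, see above)."""
    rule = STRICTNESS[quality]
    if yes_votes - no_votes >= rule.yes_margin:
        return "accept"
    if expired:
        return rule.expire_action
    return "open"
```

Under these assumptions, an edit to a high quality release that expires with only two yes votes is rejected, while an edit to a low quality release that expires without any votes is accepted.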

Unresolved Issues

Under what circumstances should an editor be able to change the data quality level? It would probably be wise to:

  • Make it easy to mark the data quality as higher than it currently is.
  • Make it harder to mark the data quality as lower than it currently is. We do not want to allow people to take a high quality artist/release, change it to low data quality and then quickly make all sorts of changes to the artist/release.
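
One possible way to express this asymmetry, purely as a sketch since the question is still unresolved: treat raising the quality as a lightweight change and lowering it as a change that must go through voting. The function name quality_change_needs_vote and the ranking are hypothetical, and placing "unknown" at the same rank as "normal" is an assumption based on the two levels requiring the same editing effort.

```python
# Hypothetical ordering of quality levels; "unknown" is ranked like "normal"
# because both are described as requiring the same editing effort.
QUALITY_RANK = {"low": 0, "unknown": 1, "normal": 1, "high": 2}


def quality_change_needs_vote(current, proposed):
    """Return True if the proposed change lowers the quality level.

    Lowering the level is the case the proposal wants to make harder;
    raising it (or keeping the same rank) is treated as easy here.
    """
    return QUALITY_RANK[proposed] < QUALITY_RANK[current]
```

With this rule, changing a release from high to low would require votes, while changing it from low to high would not.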

Does it make sense to offer a Keep Open option for expired mods, and to highlight those expired-but-still-open mods in the artist subscriber emails in order to get people to vote on them?