User:Ianmcorvidae/Recordings

From MusicBrainz Wiki
== Personal Use Cases ==
'''This is a draft and will change. Please do not link it to the general discussion page.'''
# Track IDs. I'd really like these: I want things (files, chunks of streams, CD tracks -- see below on documentation, but I think we mostly know what a track is) to be addressable by a single ID. This is somewhat independent of the discussion about what to do with recordings, which puts me in a good position to say that ''whatever'' we do, we should implement this.
# Interface Complexity. Our interface ''sucks.'' Anything that makes it more complex is very bad, because realistically we already have a lot of work ahead of us and we're probably not going to get any changes "right" the first time.
# Documentation. Our primary problem right now is that '''recordings aren't defined as anything specific'''. mb-style has been reverse-engineering what we-the-devs "intended" when we made them; I think we didn't intend anything. We should start by defining this entity (or these entities) carefully, and shortly follow by defining the ''other'' entities, which we also haven't defined but which are at least marginally more immediately understandable (Works are somewhat on the edge; I think we mostly know what artists, releases and release groups are, though we should write it down for posterity and the indoctrination of new people). I'm not sure the specifics of these definitions are hugely important: what matters most is that we define them ''at all''.
# Shared use. Somewhat related to interface complexity: we need to reduce duplicated work by our editors and help the edit queue be at least ''less'' insanely clogged. NES edit-grouping will help with this, but in some ways it only hides the problem; the same amount of editing will happen, just collapsed into fewer items. However, if we have real definitions we can start defining things like inheritance: a work marked as part of another work could inherit that parent work's relationships except where explicitly overridden, or pseudo-releases could share relationships/dates/etc. with a source release (or several of them -- pseudo-releases are another crazy corner of our data that could be fixed a lot of different ways).
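The inheritance idea above can be sketched in code. This is a hypothetical in-memory model, not anything from the MusicBrainz codebase: the `Work` class, its fields, and the example relationships are all invented for illustration of "a child work inherits its parent's relationships except where explicitly overridden".

```python
# Hypothetical sketch of relationship inheritance between works.
# None of these names exist in MusicBrainz; this only illustrates the
# "inherit from parent unless explicitly overridden" resolution rule.

class Work:
    def __init__(self, name, parent=None, relationships=None):
        self.name = name
        self.parent = parent  # the Work this one is "part of", or None
        # relationship type -> target, set explicitly on this work
        self.relationships = dict(relationships or {})

    def effective_relationships(self):
        """Walk up the part-of chain; explicit relationships on a child
        override anything inherited from its parents."""
        inherited = self.parent.effective_relationships() if self.parent else {}
        inherited.update(self.relationships)
        return inherited

# Invented example data:
symphony = Work("Symphony No. 5", relationships={"composer": "Beethoven"})
movement = Work("I. Allegro con brio", parent=symphony)
arranged = Work("I. Allegro (arr.)", parent=symphony,
                relationships={"arranger": "Liszt"})

print(movement.effective_relationships())  # composer inherited from the parent
print(arranged.effective_relationships())  # composer inherited, arranger explicit
```

The point of the sketch is that editors would only enter the overriding relationship once, on the child, instead of re-entering everything the parent already carries.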


== Ideal System ==

Short term: Define recordings as mixes. Define other entities. Make tracks real entities with IDs. Tell people who care about masters to put them in recording annotations for now.

Middle term: Do a review of our design; build real user models/personas; list real use cases/scenarios for everything. Outline a path for interface consistency (especially considering strange corners like FreeDB, CDStubs, etc.) and reduction of editor work (inheritance, batch tools, etc.).

Longer term: Implement the design. Begin ''considering'' adding more levels, such as masters, transliterated tracklist entities, etc., where they are not considered necessary parts of fixing our UI.

Or, in short: nikki's proposal, but with a review of the entire site's design between steps 1 and 2, possibly throwing out everything after step 1 as a result of said review. Or, more cynically: I think the problem isn't our schema, it's us focusing too much on the schema instead of the rest of MB.


== AcoustID ==
First, a note: AcoustID is quite out of scope for this discussion (as I've been telling folks repeatedly) except in terms of imagining what benefits it might lend. MusicBrainz does not control AcoustID; luks will look at whatever new system is created and do what he thinks is best (and I'm quite confident that if any of us make suggestions, he will be right, and we will be wrong).


That being said, it is valuable to consider AcoustID's role in such a new system. So: what is AcoustID good at? The fingerprinting process highlights "chroma features" -- i.e. combinations of pitch (in a very western, 12-note sense), time and intensity (for the first two minutes, most recently; 30 seconds before that<ref name="chromaprint_06" />). After this, it uses a filtering process to turn the image into a series of numbers; this process was trained by machine learning<ref name="chromaprint_overview" /> using luks' collection (as I recall from IRC, no citation on that). The fingerprint must then be assigned to an AcoustID (sometimes a new one), which happens by way of a comparison process I don't deeply understand<ref name="acoustid_compare" /> but which compares the two fingerprints bitwise (or part of them<ref name="inside_acoustid" />) along with the length of the track (the latter since the fingerprint itself never covers more than two minutes). These are then matched to (at present, of course) recordings.
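To make "compares the two fingerprints bitwise, plus the length of the track" concrete, here is a crude illustrative sketch. This is ''not'' the real logic in acoustid_compare.c; the function names and thresholds are invented. It only assumes the one thing stated above: Chromaprint fingerprints are sequences of 32-bit integers, and the length check is needed because the fingerprint never covers more than the first two minutes.

```python
# Illustrative only -- not the actual AcoustID server comparison.
# Fingerprints are modeled as lists of 32-bit integers; similarity is the
# fraction of identical bits over the overlapping prefix.

def bit_similarity(fp_a, fp_b):
    """Fraction of matching bits over the overlapping part of two fingerprints."""
    n = min(len(fp_a), len(fp_b))
    if n == 0:
        return 0.0
    # XOR leaves a 1 bit wherever the fingerprints disagree; count those.
    diff_bits = sum(bin(a ^ b).count("1") for a, b in zip(fp_a[:n], fp_b[:n]))
    return 1.0 - diff_bits / (32 * n)

def plausibly_same_track(fp_a, len_a, fp_b, len_b,
                         min_similarity=0.9, max_length_diff=7):
    """Invented decision rule: lengths (in seconds) must roughly agree,
    because a 2:05 radio edit and a 6:00 album cut can fingerprint alike
    when only the first two minutes are fingerprinted."""
    if abs(len_a - len_b) > max_length_diff:
        return False
    return bit_similarity(fp_a, fp_b) >= min_similarity

# Two fingerprints differing in a single bit, lengths one second apart:
print(plausibly_same_track([0xDEADBEEF, 0xCAFE1234], 125,
                           [0xDEADBEEF, 0xCAFE1235], 126))  # True
```

The real server works on compressed fingerprints inside PostgreSQL and uses far more careful alignment and scoring; this sketch only shows why both the bit comparison and the track length matter.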


So: AcoustID is good at comparing/differentiating things that differ in the times at which certain notes are played at a given intensity, with tolerance levels tuned for differentiating within luks' collection. Historically we've known this not to do very well with things like karaoke versions versus the original, alternate lyrics over the same base, or remasters. There are also some cases where multiple AcoustIDs will be assigned despite similarity, due to the fuzziness of the algorithm.
In most of the mix/master/track proposed systems, therefore, AcoustIDs probably distinguish best among mixes. In systems with only recording/track, aside from the path-of-least-resistance benefits, they probably distinguish best among recordings. However, in neither case is AcoustID sufficient for defending either merging or splitting, though it's somewhat better for defending splits.

== References ==
<references>
<ref name="acoustid_compare">https://github.com/lalinsky/acoustid-server/blob/master/postgresql/acoustid_compare.c#L119</ref>
<ref name="chromaprint_06">http://oxygene.sk/2011/12/chromaprint-0-6-released/</ref>
<ref name="inside_acoustid">http://oxygene.sk/2011/12/inside-the-acoustid-server/</ref>
</references>

Latest revision as of 21:52, 28 December 2012