Guides/AcoustID: Difference between revisions

From MusicBrainz Wiki
(One intermediate revision by the same user not shown)
Line 7: Line 7:
== Reading an AcoustID page==
== Reading an AcoustID page==


Take a look at [http://acoustid.org/track/5502d840-500f-48a6-b64c-fb9780bb14a2 this one]
Take a look at [https://acoustid.org/track/5502d840-500f-48a6-b64c-fb9780bb14a2 this one]


The first thing you see is a list of fingerprints. These are the slightly-different audio selections that were considered similar enough to get the same AcoustID. From there you can select a couple of the fingerprints to compare how different a “same” fingerprint might be.
The first thing you see is a list of fingerprints. These are the slightly-different audio selections that were considered similar enough to get the same AcoustID. From there you can select a couple of the fingerprints to compare how different a “same” fingerprint might be.
[http://acoustid.org/fingerprint/11831566 11831566] and [http://acoustid.org/fingerprint/11185730 11185730] are the most-submitted ones, so let’s take a look at those. Open each in a new tab so you can easily switch between them.
[https://acoustid.org/fingerprint/11831566 11831566] and [https://acoustid.org/fingerprint/11185730 11185730] are the most-submitted ones, so let’s take a look at those. Open each in a new tab so you can easily switch between them.


[[File:igotyoubabe-fpr1.png|link=http://acoustid.org/fingerprint/11831566]] [[File:igotyoubabe-fpr2.png|link=http://acoustid.org/fingerprint/11185730]]
[[File:igotyoubabe-fpr1.png|link=https://acoustid.org/fingerprint/11831566]] [[File:igotyoubabe-fpr2.png|link=https://acoustid.org/fingerprint/11185730]]


The vertical axis is time. The whole bar represents 2 minutes (or 1 minute in a few cases, due to the older fingerprints), so each vertical pixel is a small fragment of time.
The vertical axis is time. The whole bar represents 2 minutes (or 1 minute in a few cases, due to the older fingerprints), so each vertical pixel is a small fragment of time.
Line 19: Line 19:


===When not to merge===
===When not to merge===
Now we can add a fingerprint from an entirely [http://acoustid.org/track/fd3a066b-5f20-46ec-ae73-3c7a6900802d different acoustID] from a different recording:
Now we can add a fingerprint from an entirely [https://acoustid.org/track/fd3a066b-5f20-46ec-ae73-3c7a6900802d different AcoustID] from a different recording:
the most commonly-submitted fingerprint is [http://acoustid.org/fingerprint/11993528 11993528] so let’s look at that.
the most commonly-submitted fingerprint is [https://acoustid.org/fingerprint/11993528 11993528] so let’s look at that.


[[File:igotyoubabe-buster-fpr1.png|link=http://acoustid.org/fingerprint/11993528|a different one]] [[File:igotyoubabe-fpr1.png|link=http://acoustid.org/fingerprint/11831566|the original]]
[[File:igotyoubabe-buster-fpr1.png|link=https://acoustid.org/fingerprint/11993528|a different one]] [[File:igotyoubabe-fpr1.png|link=https://acoustid.org/fingerprint/11831566|the original]]


Compare it to either of the other ones and you can see that it is substantially different. Similar enough that it is probably the same song (the general appearance is the same), but still different enough to be almost certainly a different recording in some way. Of course, there’s no way to tell exactly what the differences are without hearing it.
Compare it to either of the other ones and you can see that it is substantially different. Similar enough that it is probably the same song (the general appearance is the same), but still different enough to be almost certainly a different recording in some way. Of course, there’s no way to tell exactly what the differences are without hearing it.


===When to merge different AcoustIDs===
===When to merge different AcoustIDs===
Finally, lets look at the one which is a different acoustID but I still think would be the [http://acoustid.org/track/b7b0e619-f0fc-4275-b7d5-15577fe41e97 same recording]:
Finally, lets look at the one which is a different AcoustID but I still think would be the [https://acoustid.org/track/b7b0e619-f0fc-4275-b7d5-15577fe41e97 same recording]:
The most-submitted fingerprint is [http://acoustid.org/fingerprint/21014712 21014712]. Comparing this back to [http://acoustid.org/fingerprint/11831566 11831566] you can see again that they are practically the same.
The most-submitted fingerprint is [https://acoustid.org/fingerprint/21014712 21014712]. Comparing this back to [https://acoustid.org/fingerprint/11831566 11831566] you can see again that they are practically the same.


[[File:igotyoubabe-fpr3.png|link=http://acoustid.org/fingerprint/21014712|a very similar one]] [[File:igotyoubabe-fpr1.png|link=http://acoustid.org/fingerprint/11831566|the original]]
[[File:igotyoubabe-fpr3.png|link=https://acoustid.org/fingerprint/21014712|a very similar one]] [[File:igotyoubabe-fpr1.png|link=https://acoustid.org/fingerprint/11831566|the original]]


IMO recordings having these two acoustIDs are probably OK to merge.
IMO recordings having these two AcoustIDs are probably OK to merge.


Further down on the AcoustID page you can see a list of recordings which have this AcoustID. In general, in my experience it is safe to merge recordings which have the same AcoustID, as long as they each only have the one AcoustID. If they each have multiple AcoustIDs its probably because some releases have a different recording and someone picked wrong. Also be mindful of different performance ARs and ISRCs (though liner notes and the various ISRC sources have been known to be wrong as well.)
Further down on the AcoustID page you can see a list of recordings which have this AcoustID. In general, in my experience it is safe to merge recordings which have the same AcoustID, as long as they each only have the one AcoustID. If they each have multiple AcoustIDs its probably because some releases have a different recording and someone picked wrong. Also be mindful of different performance [[AR]]s and [[ISRC]]s (though liner notes and the various ISRC sources have been known to be wrong as well.)


==Problems==
==Problems==
There are a few places where AcoustID has trouble.
There are a few places where AcoustID has trouble.


False matches: Karaoke versions; instrumental versions; radio edits (where only a small amount is edited, bleeped out, etc.). Recordings which only diverge after the 2-minute mark where the acoustID fingerprint ends. I have been told that very short (15–30s) tracks also have this problem, but I haven’t seen it myself.
False matches: Karaoke versions; instrumental versions; radio edits (where only a small amount is edited, bleeped out, etc.). Recordings which only diverge after the 2-minute mark<ref>https://github.com/acoustid/chromaprint/blob/master/src/cmd/fpcalc.cpp#L26</ref> where the acoustID fingerprint ends. I have been told that very short (15–30s) tracks also have this problem, but I haven’t seen it myself.


False differences: A time-shift of more than a few seconds will often cause a new/different AcoustID to be assigned.
False differences: A time-shift of more than a few seconds will often cause a new/different AcoustID to be assigned.


It is also worth noting that recordings that differ by more than 7<ref>https://github.com/lalinsky/acoustid-server/blob/master/acoustid/const.py#L19</ref> seconds will always be given a different AcoustID, even if the fingerprint data matches 100%. So if you see an acoustID with a track attached that with a difference greater than that, it always indicates something wrong (I’ve seen the recording being simply the wrong duration, and also recordings from two releases that shouldn’t have used the same recording.)
It is also worth noting that recordings that differ by more than 7<ref>https://github.com/lalinsky/acoustid-server/blob/master/acoustid/const.py#L19</ref> seconds will always be given a different AcoustID, even if the fingerprint data matches 100%. So if you see an AcoustID with a track attached that with a difference greater than that, it always indicates something wrong (I’ve seen the recording being simply the wrong duration, and also recordings from two releases that shouldn’t have used the same recording.)


==See Also==
==See Also==
* [http://oxygene.sk/lukas/2011/01/how-does-chromaprint-work/ luks’ explanation of how Chromaprint works]
* [https://oxygene.sk/2011/01/how-does-chromaprint-work/ luks’ explanation of how Chromaprint works]


==References==
==References==

Latest revision as of 00:50, 15 May 2020

Recordings with multiple AcoustIDs

This recording has four different AcoustIDs (look at the Fingerprints tab). This is almost certainly due to the huge variety of different releases the recording is listed as belonging to: the slightly different versions each appear on a different subset of the releases listed.

For that recording in particular: The hard part is determining which tracks belong to which AcoustID; the only real way to do it is to acquire copies of the files which are definitively from each of those releases, find a recording which matches only that AcoustID, and move the track to that recording.

Reading an AcoustID page

Take a look at this one: https://acoustid.org/track/5502d840-500f-48a6-b64c-fb9780bb14a2

The first thing you see is a list of fingerprints. These are the slightly-different audio selections that were considered similar enough to get the same AcoustID. From there you can select a couple of the fingerprints to compare how different a “same” fingerprint might be. 11831566 and 11185730 are the most-submitted ones, so let’s take a look at those. Open each in a new tab so you can easily switch between them.

[Figure: fingerprint images igotyoubabe-fpr1.png (fingerprint 11831566) and igotyoubabe-fpr2.png (fingerprint 11185730)]

The vertical axis is time. The whole bar represents 2 minutes (or 1 minute in a few cases, due to the older fingerprints), so each vertical pixel is a small fragment of time.

Once you have both those fingerprints open, you can see that the difference between them is very small: only a slight vertical shift, plus a single pixel different here and there.
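The visual comparison above can also be approximated numerically. The following is a minimal sketch, not AcoustID's actual matching code: it assumes each fingerprint is available as a list of 32-bit integers (one per row of the image), and the function names `bit_error_rate` and `best_alignment` are made up for illustration. It counts differing bits after trying a few small vertical shifts.

```python
def bit_error_rate(fp_a, fp_b, offset=0):
    """Fraction of differing bits between two fingerprints.

    A positive offset skips the first `offset` rows of fp_a;
    a negative offset skips rows of fp_b instead.
    """
    if offset >= 0:
        pairs = list(zip(fp_a[offset:], fp_b))
    else:
        pairs = list(zip(fp_a, fp_b[-offset:]))
    if not pairs:
        return 1.0  # no overlap at this shift, treat as a complete mismatch
    # XOR leaves a 1 wherever the two rows disagree; count those bits.
    differing = sum(bin((a ^ b) & 0xFFFFFFFF).count("1") for a, b in pairs)
    return differing / (32 * len(pairs))


def best_alignment(fp_a, fp_b, max_offset=10):
    """Try small vertical shifts and return (offset, error_rate) with the lowest error."""
    candidates = (
        (off, bit_error_rate(fp_a, fp_b, off))
        for off in range(-max_offset, max_offset + 1)
    )
    return min(candidates, key=lambda pair: pair[1])
```

A low error rate at a small offset corresponds to what the images show for 11831566 and 11185730: the same pattern, slightly shifted, with a few stray pixels.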

When not to merge

Now we can bring in a fingerprint from an entirely different AcoustID, belonging to a different recording. Its most commonly submitted fingerprint is 11993528, so let’s look at that.

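[Figure: fingerprint images igotyoubabe-buster-fpr1.png (fingerprint 11993528, “a different one”) and igotyoubabe-fpr1.png (fingerprint 11831566, “the original”)]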

Compare it to either of the other ones and you can see that it is substantially different. Similar enough that it is probably the same song (the general appearance is the same), but still different enough to be almost certainly a different recording in some way. Of course, there’s no way to tell exactly what the differences are without hearing it.

When to merge different AcoustIDs

Finally, let’s look at one that has a different AcoustID but which I still think is the same recording. The most-submitted fingerprint is 21014712. Comparing this back to 11831566, you can see again that they are practically the same.

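[Figure: fingerprint images igotyoubabe-fpr3.png (fingerprint 21014712, “a very similar one”) and igotyoubabe-fpr1.png (fingerprint 11831566, “the original”)]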

IMO recordings having these two AcoustIDs are probably OK to merge.

Further down on the AcoustID page you can see a list of recordings which have this AcoustID. In my experience it is generally safe to merge recordings which have the same AcoustID, as long as each of them has only that one AcoustID. If they each have multiple AcoustIDs, it’s probably because some releases have a different recording and someone picked wrong. Also be mindful of differing performance ARs and ISRCs (though liner notes and the various ISRC sources have been known to be wrong as well).
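The rule of thumb above can be written down as a small check. This is only a sketch of the heuristic as described here, not a MusicBrainz or AcoustID API: the function name `same_single_acoustid` and the input shape (a dict mapping a recording label to its list of AcoustIDs) are assumptions made for illustration. It does not look at ARs or ISRCs, so a True result still calls for a manual look.

```python
def same_single_acoustid(acoustids_by_recording):
    """Return True when every recording has exactly one AcoustID and all share it."""
    id_sets = [set(ids) for ids in acoustids_by_recording.values()]
    if len(id_sets) < 2:
        return False  # nothing to merge
    if any(len(ids) != 1 for ids in id_sets):
        return False  # a recording with several (or no) AcoustIDs needs a closer look
    return len(set.union(*id_sets)) == 1


# Usage with made-up recording labels and the AcoustIDs discussed on this page.
candidates = {
    "recording A": ["5502d840-500f-48a6-b64c-fb9780bb14a2"],
    "recording B": ["5502d840-500f-48a6-b64c-fb9780bb14a2"],
}
print(same_single_acoustid(candidates))
```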

Problems

There are a few places where AcoustID has trouble.

False matches: karaoke versions; instrumental versions; radio edits (where only a small amount is edited, bleeped out, etc.); and recordings which only diverge after the 2-minute mark[1], where the AcoustID fingerprint ends. I have been told that very short (15–30s) tracks also have this problem, but I haven’t seen it myself.

False differences: A time-shift of more than a few seconds will often cause a new/different AcoustID to be assigned.

It is also worth noting that recordings whose durations differ by more than 7[2] seconds will always be given different AcoustIDs, even if the fingerprint data matches 100%. So if you see an AcoustID with a recording attached whose duration differs from the others by more than that, it always indicates something wrong (I’ve seen the recording simply having the wrong duration, and also recordings from two releases that shouldn’t have used the same recording).
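The 7-second limit lends itself to a quick sanity check on the recordings attached to an AcoustID. The sketch below is illustrative only: the constant name `MAX_LENGTH_DIFF_SECONDS`, the function `duration_outliers`, and the input shape (recording label mapped to duration in seconds) are assumptions, and the reference duration is whatever you take to be the typical length of the fingerprinted audio.

```python
MAX_LENGTH_DIFF_SECONDS = 7  # the limit cited above; this constant name is hypothetical


def duration_outliers(durations, reference):
    """Return the recordings whose duration is more than the limit away from `reference`."""
    return {
        name: seconds
        for name, seconds in durations.items()
        if abs(seconds - reference) > MAX_LENGTH_DIFF_SECONDS
    }


# Usage with made-up durations: only "recording C" is flagged for a closer look.
attached = {"recording A": 190, "recording B": 197, "recording C": 198}
print(duration_outliers(attached, reference=190))
```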

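See Also

luks’ explanation of how Chromaprint works: https://oxygene.sk/2011/01/how-does-chromaprint-work/

References

[1] https://github.com/acoustid/chromaprint/blob/master/src/cmd/fpcalc.cpp#L26
[2] https://github.com/lalinsky/acoustid-server/blob/master/acoustid/const.py#L19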
