History:Amazon Matching: Difference between revisions

From MusicBrainz Wiki
((Imported from MoinMoin))
mNo edit summary
(8 intermediate revisions by 6 users not shown)
Line 1: Line 1:
{{HistoryHeader}}


The Amazon Matcher is a process that searches [[Amazon]]'s catalog for albums in the [[MusicBrainz]] database and records their URLs. This allows the website to display album cover art, and also provides the links that allow you to buy the album just by clicking on it. See also [[ASIN]].
The Amazon Matcher is a process that searches [[Amazon]]'s catalog for releases in the [[MusicBrainz]] database and records their URLs. This allows the website to display release cover art, and also provides the links that allow you to buy the release just by clicking on it. See also [[ASIN]].


[[Image:Alert.png]] '''Please note: The Amazon Matcher process is currently not running! To add cover art to an album / change incorrect cover art you can use the [[Amazon Relationship Type|AmazonRelationshipType]]. [[How To Change Cover Art|HowToChangeCoverArt]] will give you some hints.'''
[[Image:Alert.png]] '''Please note: The Amazon Matcher process is no longer running! To add cover art to a release you can use the [[Amazon Relationship Type|AmazonRelationshipType]] or link to an URL to one of the [[Cover Art Sites|CoverArtSites]]. To change incorrect cover art, [[How To Change Cover Art|HowToChangeCoverArt]] will give you some hints.'''


'''2004-06-01:''' Andy Grundman has submitted an update to the Amazon Matcher that should match a much higher percentage of albums. It contains track name matching as well as various artist support. This page will be updated to reflect the current state of the matcher after this new version goes live. For now, the Missed Matches section below has been updated to show which albums are now matched by the new script.
'''2004-06-01:''' Andy Grundman has submitted an update to the Amazon Matcher that should match a much higher percentage of releases. It contains track name matching as well as various artist support. This page will be updated to reflect the current state of the matcher after this new version goes live. For now, the Missed Matches section below has been updated to show which releases are now matched by the new script.


==Algorithm==
The current algorithm for matching MB albums to Amazon albums is as follows:

# Download all the albums for a given artist from amazon
The current algorithm for matching MB releases to Amazon releases is as follows:
# Download all the releases for a given artist from amazon
# Pass 1:
# Pass 1:
# For each album in musicbrainz:
# For each release in musicbrainz:
## Tokenize both album names: convert to lowercase, remove accents, punctuation and whitespace
## Tokenize both release names: convert to lowercase, remove accents, punctuation and whitespace
## Compare the album names and store its similarity
## Compare the release names and store its similarity
## Pick an album with the highest similarity rating, as long as it is above 80%
## Pick a release with the highest similarity rating, as long as it is above 80%


# Pass 2: (Try chopping () and [] to get a better match)
# Pass 2: (Try chopping () and [] to get a better match)
# For each album in musicbrainz
# For each release in musicbrainz
## remove anything appended in () from the MB album names
## remove anything appended in () from the MB release names
## remove anything appended in () or [] from the amazon album names
## remove anything appended in () or [] from the Amazon release names
## Tokenize the chopped album names and store the similarity
## Tokenize the chopped release names and store the similarity
## Pick an album with the highest similarity rating, as long as it is above 80%
## Pick a release with the highest similarity rating, as long as it is above 80%


Question: is pass #2 really done for each album in musicbrainz (as stated above), or is it only done for those left unmatched after the first pass? This could explain a couple of the mis-matches below.
Question: is pass #2 really done for each release in musicbrainz (as stated above), or is it only done for those left unmatched after the first pass? This could explain a couple of the mis-matches below.


==Todo items==
==Todo items==


* Select the best image server based on which amazon store a user has selected
* Select the best image server based on which Amazon store a user has selected
** NOTE: some album covers are not on all image servers -RJ (And in my experience 'imports' usually have lower quality scans than the same album in it's local amazon, though why this should be I have no idea - bawjaws)
** NOTE: some release covers are not on all image servers -RJ (And in my experience 'imports' usually have lower quality scans than the same release in its local amazon, though why this should be I have no idea - bawjaws)


* Match various artist albums
* Match various artist releases


This is not perfect, but it does get a reasonable number of matches. If you have observed things that should've matched, but didn't, please add them to the Missed Matches list below. Also, if some intrepid perl hacker would like to try tuning the matching script, I would appreciate that!
This is not perfect, but it does get a reasonable number of matches. If you have observed things that should've matched, but didn't, please add them to the Missed Matches list below. Also, if some intrepid perl hacker would like to try tuning the matching script, I would appreciate that!
Line 63: Line 66:
</pre>
</pre>


Not matched because "Classics" should be an artist album under Aphex Twin and not a VA album. <pre>"The Perfect Drug Versions", a4db9744-347f-47f5-a4bd-394fde23831c, "Perfect Drug [CD-SINGLE]", B000001Y7W
Not matched because "Classics" should be an artist release under Aphex Twin and not a VA release. <pre>"The Perfect Drug Versions", a4db9744-347f-47f5-a4bd-394fde23831c, "Perfect Drug [CD-SINGLE]", B000001Y7W
</pre>
</pre>


Probably not matched because the track names differ too much. <pre>"Silent Hill 2", 72ea51fd-0d61-48fc-ba16-e1ae178b408d, "Silent Hill V.2["IMPORT"]", ["B00005NO3D"]
Probably not matched because the track names differ too much. <pre>"Silent Hill 2", 72ea51fd-0d61-48fc-ba16-e1ae178b408d, "Silent Hill V.2["IMPORT"]", ["B00005NO3D"]
</pre>
</pre>
<ul><li style="list-style-type:none">Not matched because Amazon lists the artist as "Game Music". This is not a VA album though, so I am not sure how best to match this one.
<ul><li style="list-style-type:none">Not matched because Amazon lists the artist as "Game Music". This is not a VA release though, so I am not sure how best to match this one.
</ul>
</ul>


Line 83: Line 86:
</pre>
</pre>


are showing the cover art for the Final Fantasy IX Original Soundtrack. The Final Fantasy VII soundtrack should have the cover art from ASIN B000038I2O and the Final Fantasy VIII one should have ASIN [[B00003 C K5 N|B00003CK5N]].
are showing the cover art for the Final Fantasy IX Original Soundtrack. The Final Fantasy VII soundtrack should have the cover art from ASIN B000038I2O and the Final Fantasy VIII one should have ASIN "B00003CK5N".


<pre>"Led Zeppelin IV" in MB is matched to "Led Zeppelin II" in Amazon.
<pre>"Led Zeppelin IV" in MB is matched to "Led Zeppelin II" in Amazon.
Line 94: Line 97:
==Other Matching Problems==
==Other Matching Problems==


'Guerrilla' by the Super Furry Animals ([http://www.musicbrainz.org/showalbum.html?albumid=94127 http://www.musicbrainz.org/showalbum.html?albumid=94127]) matches to an album in Amazon called 'Guerrilla[[import]]' that doesn't have an image. But there is an album called just 'Guerrilla' that does have an image. The only things that I can think of that would stop it matching according to the algorithm above is that A) you can't currently buy it from Amazon as they are out of stock and B) the import copy now has a higher 'popularity' rank than the non-import as it is still available for purchase.
'Guerrilla' by the Super Furry Animals ([http://www.musicbrainz.org/showalbum.html?albumid=94127 http://www.musicbrainz.org/showalbum.html?albumid=94127]) matches to an album in Amazon called 'Guerrilla "import"' that doesn't have an image. But there is an album called just 'Guerrilla' that does have an image. The only things that I can think of that would stop it matching according to the algorithm above is that A) you can't currently buy it from Amazon as they are out of stock and B) the import copy now has a higher 'popularity' rank than the non-import as it is still available for purchase.


The album 'The Charlatans' ([http://www.musicbrainz.org/showalbum.html?albumid=42524 http://www.musicbrainz.org/showalbum.html?albumid=42524]) by the band also called 'The Charlatans' (who are also known in the states as'The Charlatans UK') retrieves the image for the album 'The Charlatans' by the american group called 'The Charlatans' (listed in Amazon as 'The Charlatans (1960's)'. A bit of an odd corner case I know but I'm surprised that it can get the other albums by this artist correct and get this one wrong. The real album/image is listed as both 'Charlatans[[import]]' and 'The Charlatans[[UK]]' by the artist 'Charlatans UK'.
The album 'The Charlatans' ([http://www.musicbrainz.org/showalbum.html?albumid=42524 http://www.musicbrainz.org/showalbum.html?albumid=42524]) by the band also called 'The Charlatans' (who are also known in the states as 'The Charlatans UK') retrieves the image for the album 'The Charlatans' by the american group called 'The Charlatans' (listed in Amazon as 'The Charlatans (1960's)'. A bit of an odd corner case I know but I'm surprised that it can get the other albums by this artist correct and get this one wrong. The real album/image is listed as both 'Charlatans "import"' and 'The Charlatans "UK"' by the artist 'Charlatans UK'.


Many of the albums by 'The Tragically Hip' have the cover art show up fine; however, if using a store like amazon.ca, following the buy link takes you to the[[IMPORT]] album (which is listed as unavailable) rather than the Canadian release (which are identical AFAIK). For example, the ASIN for "Fully Completely" is [[B000002 OMP|B000002OMP]] in Canada (this is invalid in the US), not [[B00000 IJRC|B00000IJRC]], which is fine with the US store, but the[[IMPORT]] version in Canada. I suspect this is a problem with other artists (but I haven't found any specific ones).
Many of the albums by 'The Tragically Hip' have the cover art show up fine; however, if using a store like amazon.ca, following the buy link takes you to the "IMPORT" album (which is listed as unavailable) rather than the Canadian release (which are identical AFAIK). For example, the ASIN for "Fully Completely" is "B000002OMP" in Canada (this is invalid in the US), not "B00000IJRC", which is fine with the US store, but the "IMPORT" version in Canada. I suspect this is a problem with other artists (but I haven't found any specific ones).


"With The Beatles" (a91b9173-b958-401b-9551-b15db0e7bc5d/B000002UAC) retrieves the cover for "The Beatles (The White Album)" ASIN [[B000002 UAX|B000002UAX]].
"With The Beatles" (a91b9173-b958-401b-9551-b15db0e7bc5d/B000002UAC) retrieves the cover for "The Beatles (The White Album)" ASIN "B000002UAX".


The self-titled first album by "Creedence Clearwater Revival" (6da15b06-b848-487c-a74a-af8fe26f1069) retrieves the cover of a best of called "The Best of Creedence Clearwater Revival" (ASIN: [[B000006 X V2|B000006XV2]]).
The self-titled first album by "Creedence Clearwater Revival" (6da15b06-b848-487c-a74a-af8fe26f1069) retrieves the cover of a best of called "The Best of Creedence Clearwater Revival" (ASIN: "B000006XV2").


This: [http://musicbrainz.org/album/8dac0482-cc08-4a45-82be-899604becbcb.html http://musicbrainz.org/album/8dac0482-cc08-4a45-82be-899604becbcb.html] is mis-matched because of the decision we made not to include text like "Music From The Motion Picture" in soundtrack titles.
This: [http://musicbrainz.org/album/8dac0482-cc08-4a45-82be-899604becbcb.html http://musicbrainz.org/album/8dac0482-cc08-4a45-82be-899604becbcb.html] is mis-matched because of the decision we made not to include text like "Music From The Motion Picture" in soundtrack titles.
Line 108: Line 111:
It should be: [http://tinyurl.com/34lsq http://tinyurl.com/34lsq]
It should be: [http://tinyurl.com/34lsq http://tinyurl.com/34lsq]


Things may have changed since the previous comment but at the moment the album is mismatched because there are four different Bullitt albums in Amazon: "Bullitt (1968 Film)[[SOUNDTRACK]]", "Bullitt (1968 Film)[[SOUNDTRACK]][[IMPORT]]", "Bullitt (Music from the Motion Picture)[[SOUNDTRACK]][[IMPORT]][[ORIGINA L RECORDIN G REMASTERED|ORIGINAL RECORDING REMASTERED]]" and "Bullitt (Music Recreated from and Inspired by the Motion Picture)[[SOUNDTRACK]][[IMPORT]][[ORIGINA L RECORDIN G REMASTERED|ORIGINAL RECORDING REMASTERED]]". The second and third of which appear to match the tracklist of the album in musicbrainz linked to above. Since the algorithm outlined above discards everything in brackets for the second pass then all of these match equally and I assume one of the four contenders is then chosen at random. The absence/presence of "Music from the motion picture" etc. is therefore in this case a red herring, though it probably does apply in the "Conquest of Paradise" case listed above.
Things may have changed since the previous comment but at the moment the album is mismatched because there are four different Bullitt albums in Amazon: "Bullitt (1968 Film) "SOUNDTRACK"", "Bullitt (1968 Film) "SOUNDTRACK" "IMPORT"", "Bullitt (Music from the Motion Picture) "SOUNDTRACK" "IMPORT" "ORIGINAL RECORDING REMASTERED"" and "Bullitt (Music Recreated from and Inspired by the Motion Picture) "SOUNDTRACK" "IMPORT" "ORIGINAL RECORDING REMASTERED"". The second and third of which appear to match the tracklist of the album in musicbrainz linked to above. Since the algorithm outlined above discards everything in brackets for the second pass then all of these match equally and I assume one of the four contenders is then chosen at random. The absence/presence of "Music from the motion picture" etc. is therefore in this case a red herring, though it probably does apply in the "Conquest of Paradise" case listed above.


Albums with the same name, but extra tracks don't match correctly. There are 3 different Weezer (Green) albums named the same and share the first 10 tracks, but the UK release has 11 tracks, and the Japanese release has 12 tracks [http://musicbrainz.org/showalbum.html?albumid=56450 http://musicbrainz.org/showalbum.html?albumid=56450] however they all match to ASIN [[B00005 ICAW|B00005ICAW]], but the 12 track release should be ASIN [[B00005 B7 U2|B00005B7U2]], and the 11 track release should be [[B00005 JHYM|B00005JHYM]]
Releases with the same name, but extra tracks don't match correctly. There are 3 different Weezer (Green) albums named the same and share the first 10 tracks, but the UK release has 11 tracks, and the Japanese release has 12 tracks [http://musicbrainz.org/showalbum.html?albumid=56450 http://musicbrainz.org/showalbum.html?albumid=56450] however they all match to ASIN "B00005ICAW", but the 12 track release should be ASIN B00005B7U2, and the 11 track release should be "B00005JHYM"


This is missing a match [http://www.musicbrainz.org/album/502cf184-caaa-4c77-ab81-87ff38c30c34.html http://www.musicbrainz.org/album/502cf184-caaa-4c77-ab81-87ff38c30c34.html] - as amazon has a 'dead page duplicate' for the album. it should point to: [http://www.amazon.co.uk/exec/obidos/ASIN/B00004XN08/ http://www.amazon.co.uk/exec/obidos/ASIN/B00004XN08/]
This is missing a match [http://www.musicbrainz.org/album/502cf184-caaa-4c77-ab81-87ff38c30c34.html http://www.musicbrainz.org/album/502cf184-caaa-4c77-ab81-87ff38c30c34.html] - as amazon has a 'dead page duplicate' for the album. it should point to: [http://www.amazon.co.uk/exec/obidos/ASIN/B00004XN08/ http://www.amazon.co.uk/exec/obidos/ASIN/B00004XN08/]


Albums [http://www.musicbrainz.org/album/7cee1d42-14f7-47e2-988c-14c46d55e162.html http://www.musicbrainz.org/album/7cee1d42-14f7-47e2-988c-14c46d55e162.html] and [http://www.musicbrainz.org/album/55ff080e-6fd1-4e2e-872f-8eff966bcb7d.html http://www.musicbrainz.org/album/55ff080e-6fd1-4e2e-872f-8eff966bcb7d.html] are mis-matched. Correct MB album for that Amazon match is only [http://www.musicbrainz.org/album/3900cff7-3334-4007-8eeb-29307c25a8ed.html http://www.musicbrainz.org/album/3900cff7-3334-4007-8eeb-29307c25a8ed.html]
Albums [http://www.musicbrainz.org/album/7cee1d42-14f7-47e2-988c-14c46d55e162.html http://www.musicbrainz.org/album/7cee1d42-14f7-47e2-988c-14c46d55e162.html] and [http://www.musicbrainz.org/album/55ff080e-6fd1-4e2e-872f-8eff966bcb7d.html http://www.musicbrainz.org/album/55ff080e-6fd1-4e2e-872f-8eff966bcb7d.html] are mis-matched. Correct MB album for that Amazon match is only [http://www.musicbrainz.org/album/3900cff7-3334-4007-8eeb-29307c25a8ed.html http://www.musicbrainz.org/album/3900cff7-3334-4007-8eeb-29307c25a8ed.html]

[[Category:To Be Reviewed]] [[Category:Development]]

Revision as of 16:12, 25 October 2011

Status: This Page is Glorious History!

The content of this page either is bit-rotted, or has lost its reason to exist due to some new features having been implemented in MusicBrainz, or maybe just described something that never made it in (or made it in a different way), or possibly is meant to store information and memories about our Glorious Past. We still keep this page to honor the brave editors who, during the prehistoric times (prehistoric for you, newcomer!), struggled hard to build a better present and dreamed of an even better future. We also keep it for archival purposes because possibly it still contains crazy thoughts and ideas that may be reused someday. If you're not into looking at either the past or the future, you should just disregard this page's content entirely and look for an up-to-date documentation page elsewhere.

The Amazon Matcher is a process that searches Amazon's catalog for releases in the MusicBrainz database and records their URLs. This allows the website to display release cover art, and also provides the links that allow you to buy the release just by clicking on it. See also ASIN.

Please note: The Amazon Matcher process is no longer running! To add cover art to a release, you can use the AmazonRelationshipType or link to a URL at one of the CoverArtSites. To change incorrect cover art, see HowToChangeCoverArt for some hints.

2004-06-01: Andy Grundman has submitted an update to the Amazon Matcher that should match a much higher percentage of releases. It contains track name matching as well as various artist support. This page will be updated to reflect the current state of the matcher after this new version goes live. For now, the Missed Matches section below has been updated to show which releases are now matched by the new script.

Algorithm

The current algorithm for matching MB releases to Amazon releases is as follows:

  1. Download all the releases for a given artist from Amazon
  2. Pass 1: for each release in MusicBrainz:
    1. Tokenize both release names: convert to lowercase, remove accents, punctuation and whitespace
    2. Compare the release names and store their similarity
    3. Pick the release with the highest similarity rating, as long as it is above 80%
  3. Pass 2 (try chopping () and [] to get a better match): for each release in MusicBrainz:
    1. Remove anything appended in () from the MB release names
    2. Remove anything appended in () or [] from the Amazon release names
    3. Tokenize the chopped release names and store the similarity
    4. Pick the release with the highest similarity rating, as long as it is above 80%

Question: is pass #2 really done for each release in musicbrainz (as stated above), or is it only done for those left unmatched after the first pass? This could explain a couple of the mis-matches below.
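
The two passes described above can be sketched roughly as follows. This is only an illustration in Python, not the matcher's actual Perl code: the similarity measure (difflib's `ratio()`), the helper names `tokenize`, `chop`, `best_match` and `match_release`, and the choice to run pass 2 only when pass 1 finds nothing are all assumptions, since the page does not say how the real script handles any of these.

```python
import re
import unicodedata
from difflib import SequenceMatcher

THRESHOLD = 0.80  # the "above 80%" cut-off from the description

# Bracketed groups appended to the end of a name (hypothetical patterns).
ROUND_ONLY = re.compile(r"\s*\([^()]*\)\s*$")                       # MB names
ROUND_OR_SQUARE = re.compile(r"\s*(?:\([^()]*\)|\[[^\[\]]*\])\s*$")  # Amazon names


def tokenize(name):
    """Lowercase, strip accents, drop punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", name.lower())
    # Combining accent marks are not alphanumeric, so they are dropped here too.
    return "".join(c for c in decomposed if c.isalnum())


def chop(name, pattern):
    """Repeatedly remove bracketed groups appended to the end of a name."""
    previous = None
    while previous != name:
        previous, name = name, pattern.sub("", name)
    return name


def similarity(a, b):
    """Stand-in similarity in [0, 1]; the real script's measure is not documented here."""
    return SequenceMatcher(None, a, b).ratio()


def best_match(mb_name, amazon_names, chopped):
    """Return the Amazon name with the highest similarity above THRESHOLD, or None."""
    mb_key = tokenize(chop(mb_name, ROUND_ONLY) if chopped else mb_name)
    best, best_score = None, THRESHOLD
    for amazon_name in amazon_names:
        key = tokenize(chop(amazon_name, ROUND_OR_SQUARE) if chopped else amazon_name)
        score = similarity(mb_key, key)
        if score > best_score:
            best, best_score = amazon_name, score
    return best


def match_release(mb_name, amazon_names):
    """Pass 1 on full names; pass 2 on chopped names (assumed: only if pass 1 fails)."""
    return (best_match(mb_name, amazon_names, chopped=False)
            or best_match(mb_name, amazon_names, chopped=True))


if __name__ == "__main__":
    print(match_release("Euphoria (Firefly)", ["Cascade", "Euphoria [CD-SINGLE]"]))
```

In this sketch "Euphoria (Firefly)" fails pass 1 because the bracketed text differs too much, and is only matched once both names are chopped down to "Euphoria". If pass 2 were instead run for every release, as the list above literally states, a chopped match could override a good pass 1 match.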

Todo items

  • Select the best image server based on which Amazon store a user has selected
    • NOTE: some release covers are not on all image servers -RJ (And in my experience 'imports' usually have lower quality scans than the same release in its local amazon, though why this should be I have no idea - bawjaws)
  • Match various artist releases

This is not perfect, but it does get a reasonable number of matches. If you have observed things that should have matched but didn't, please add them to the Missed Matches list below. Also, if some intrepid Perl hacker would like to try tuning the matching script, I would appreciate that!

  • This seems as good a place as any to note that the addition of BarCodes to the database would make this kind of matching much easier, more efficient, and more reliable --MatthewExon

Previously Missed, Now Matching

These previously missed matches are now matched successfully by the latest version!

"Strangeitude", dfdb6572-97d0-4852-a4e4-a5f55f27711b, "Strangeitude", ["B000000QGN"]
"1492 - Conquest of Paradise", 7b249234-0aa3-45e9-aa70-3043d5fc28f2, "1492: Conquest of Paradise - Original Motion Picture Soundtrack["SOUNDTRACK"]", ["B000002IUK"]
"Please Please Me", ade577f6-6087-4a4f-8e87-38b0f8169814, "Please Please Me", ["B000002UA9"]
"Fold Your Hands Child, You Walk Like a Peasant", 94a5439b-f067-4b1d-9d4f-024dad95bdf6, "Fold Your Hands Child, You Walk Like a Peasant", B00004T8ZB
"Legal Man", 788604c8-74fc-4235-ae68-e414f2f1c475, "Legal Man [CD-SINGLE]", ["B00004SWH2"]
"Dog on Wheels", 64fd5312-24db-4c4f-b1a4-9b7e33a64f98, "Dog on Wheels / State I Am in / String Bean [CD-SINGLE]["IMPORT"]", ["B000007WND"]
"3.. 6.. 9 Seconds of Light", 64989a07-675f-4b29-8c19-7d68528550f6, "3-6-9 Seconds of Light [CD-SINGLE]["EP"]["IMPORT"]", ["B000007WNC"]
"Storytelling", 6d1d433e-709b-4c6b-8d09-7e8b845be806, "Storytelling["SOUNDTRACK"]", ["B00005OM56"]
"Tactical Neural Implant", 872b430a-170b-42d8-9942-bfe1fe96447e, "Tactical Neural Implant", B000007U3A
"Analogue Bubblebath IV", ca6dbf75-970b-4ff7-8c82-c8baf263ca50, "Analogue Bubblebath 4["EP"]", ["B00000FEOX"]
"The Day The World Went Away", ae8d1dea-e9b6-4018-b4c6-755ff43553ed, "Day the World Went Away [CD-SINGLE] ", ["B00000JNIR"]
"Light", 04c29e01-ef17-4e48-bb05-1737fd8b65e4, "Light [CD-SINGLE]", ["B000003RIV"]
"Cascade", f09e2b42-b4c6-47ba-b2c9-4160855d880a, "Cascade", ["B000005LB3"]
"Euphoria (Firefly)", 7a51b9fb-363d-4fb1-90e4-17b986aa2732, "Euphoria [CD-SINGLE]", ["B000005DD3"]
"Analogue Bubblebath", f4e39fbf-743c-4186-bd23-2a5be5365551, "Analogue Bubblebath [CD-SINGLE]", ["B000000GRN"]
"06:21:03:11 Up Evil", a6d4018b-244a-4041-ba0a-2aae6cd7cb3b, "06:21:03:11 Up Evil", ["B0000028ZU"]

Missed Matches

MB Album Name, MB Album ID, Amazon Album Name, Amazon ASIN

"Classics", ff0dff59-9a2a-4498-ad9f-09915b91ba8a, "Classics", B00005Y1TM

Not matched because "Classics" should be an artist release under Aphex Twin and not a VA release.

"The Perfect Drug Versions", a4db9744-347f-47f5-a4bd-394fde23831c, "Perfect Drug [CD-SINGLE]", B000001Y7W

Probably not matched because the track names differ too much.

"Silent Hill 2", 72ea51fd-0d61-48fc-ba16-e1ae178b408d, "Silent Hill V.2["IMPORT"]", ["B00005NO3D"]
  • Not matched because Amazon lists the artist as "Game Music". This is not a VA release though, so I am not sure how best to match this one.

Incorrectly Matched

"Final Fantasy VII Original Soundtrack (disc 1)" (64a20811-f819-4f0b-b305-7ffbf127ab64)
"Final Fantasy VII Original Soundtrack (disc 2)" (6131c8f5-ccb3-4156-b925-1c2f2a12ed20)
"Final Fantasy VII Original Soundtrack (disc 3)" (78a11ee2-42d7-4035-9463-fdf9fe4640c3)
"Final Fantasy VII Original Soundtrack (disc 4)" (62dd9e39-28b5-4d59-8380-087f9f2d42b1)
"Final Fantasy VIII Original Soundtrack (disc 1)" (1c82c54c-58e2-46e3-8a53-23185af40795)
"Final Fantasy VIII Original Soundtrack (disc 2)" (0827d683-933b-431d-a97e-dcb71d3bc3a4)
"Final Fantasy VIII Original Soundtrack (disc 3)" (bed52222-4ba1-4e83-b514-34f017e44f46)
"Final Fantasy VIII Original Soundtrack (disc 4)" (1d11fa6b-2b33-41bc-a16b-dff27b44c394)

are showing the cover art for the Final Fantasy IX Original Soundtrack. The Final Fantasy VII soundtrack should have the cover art from ASIN B000038I2O and the Final Fantasy VIII one should have ASIN "B00003CK5N".

"Led Zeppelin IV" in MB is matched to "Led Zeppelin II" in Amazon. 
 

Peter Gabriel (1978) - the "scratch" album (8e66ea2b-b57b-47d9-8df0-df4630aeb8e5) is pointing to Amazon's Peter Gabriel (1977) - the "car" album.

"The Best of James" (http://musicbrainz.org/album/5575f9cd-3a0d-4bf1-b4b3-ffce44ea1806.html) is pointing to Amazon's "The Best of James Taylor".
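
Two of the mismatches above are what a purely title-based 80% threshold would be expected to produce. The snippet below is a rough check, assuming Python's difflib `ratio()` as a stand-in for the matcher's undocumented similarity measure; `tokenize` and `similarity` are hypothetical helpers that follow the tokenizing step described in the Algorithm section.

```python
import unicodedata
from difflib import SequenceMatcher


def tokenize(name):
    """Lowercase, strip accents, drop punctuation and whitespace."""
    decomposed = unicodedata.normalize("NFKD", name.lower())
    return "".join(c for c in decomposed if c.isalnum())


def similarity(mb_name, amazon_name):
    """Stand-in similarity in [0, 1] between two tokenized titles."""
    return SequenceMatcher(None, tokenize(mb_name), tokenize(amazon_name)).ratio()


# Title pairs taken from the reports above.
PAIRS = [
    ("Led Zeppelin IV", "Led Zeppelin II"),
    ("The Best of James", "The Best of James Taylor"),
]

if __name__ == "__main__":
    for mb_name, amazon_name in PAIRS:
        print(f"{mb_name!r} vs {amazon_name!r}: {similarity(mb_name, amazon_name):.2f}")
```

With this stand-in measure both pairs score above 0.80, so a different release can pass the threshold whenever the correct one is missing from the candidate list or scores no higher. The real script may compute similarity differently, so the exact figures should not be read as its actual output.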

Other Matching Problems

'Guerrilla' by the Super Furry Animals (http://www.musicbrainz.org/showalbum.html?albumid=94127) matches to an album in Amazon called 'Guerrilla "import"' that doesn't have an image. But there is an album called just 'Guerrilla' that does have an image. The only things I can think of that would stop it matching according to the algorithm above are that A) you can't currently buy it from Amazon as they are out of stock and B) the import copy now has a higher 'popularity' rank than the non-import as it is still available for purchase.

The album 'The Charlatans' (http://www.musicbrainz.org/showalbum.html?albumid=42524) by the band also called 'The Charlatans' (who are known in the States as 'The Charlatans UK') retrieves the image for the album 'The Charlatans' by the American group called 'The Charlatans' (listed in Amazon as 'The Charlatans (1960's)'). A bit of an odd corner case, I know, but I'm surprised that it can get the other albums by this artist correct and get this one wrong. The real album/image is listed as both 'Charlatans "import"' and 'The Charlatans "UK"' by the artist 'Charlatans UK'.

Many of the albums by 'The Tragically Hip' have the cover art show up fine; however, if using a store like amazon.ca, following the buy link takes you to the "IMPORT" album (which is listed as unavailable) rather than the Canadian release (which is identical AFAIK). For example, the ASIN for "Fully Completely" is "B000002OMP" in Canada (this is invalid in the US), not "B00000IJRC", which is fine in the US store but is the "IMPORT" version in Canada. I suspect this is a problem with other artists too (but I haven't found any specific ones).

"With The Beatles" (a91b9173-b958-401b-9551-b15db0e7bc5d/B000002UAC) retrieves the cover for "The Beatles (The White Album)" ASIN "B000002UAX".

The self-titled first album by "Creedence Clearwater Revival" (6da15b06-b848-487c-a74a-af8fe26f1069) retrieves the cover of a best of called "The Best of Creedence Clearwater Revival" (ASIN: "B000006XV2").

This: http://musicbrainz.org/album/8dac0482-cc08-4a45-82be-899604becbcb.html is mis-matched because of the decision we made not to include text like "Music From The Motion Picture" in soundtrack titles.

It should be: http://tinyurl.com/34lsq

Things may have changed since the previous comment, but at the moment the album is mismatched because there are four different Bullitt albums in Amazon: "Bullitt (1968 Film) "SOUNDTRACK"", "Bullitt (1968 Film) "SOUNDTRACK" "IMPORT"", "Bullitt (Music from the Motion Picture) "SOUNDTRACK" "IMPORT" "ORIGINAL RECORDING REMASTERED"" and "Bullitt (Music Recreated from and Inspired by the Motion Picture) "SOUNDTRACK" "IMPORT" "ORIGINAL RECORDING REMASTERED"". The second and third appear to match the tracklist of the album in MusicBrainz linked to above. Since the algorithm outlined above discards everything in brackets for the second pass, all of these match equally, and I assume one of the four contenders is then chosen at random. The absence/presence of "Music from the motion picture" etc. is therefore a red herring in this case, though it probably does apply in the "Conquest of Paradise" case listed above.
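
The tie described in the Bullitt comment can be reproduced with a small sketch of the pass 2 chopping step. Two assumptions are made here: Amazon's tags such as SOUNDTRACK and IMPORT are written in square brackets (they appear as quoted words on this page), and "appended" means bracketed groups at the end of the title. The `chop` helper and its regular expression are hypothetical and not taken from the real script.

```python
import re

# A bracketed "(...)" or "[...]" group appended to the end of a title.
TRAILING_GROUP = re.compile(r"\s*(?:\([^()]*\)|\[[^\[\]]*\])\s*$")


def chop(title):
    """Repeatedly remove bracketed groups appended to the end of a title."""
    previous = None
    while previous != title:
        previous, title = title, TRAILING_GROUP.sub("", title)
    return title


# The four Amazon titles from the comment, with tags assumed to be in [].
AMAZON_TITLES = [
    "Bullitt (1968 Film) [SOUNDTRACK]",
    "Bullitt (1968 Film) [SOUNDTRACK] [IMPORT]",
    "Bullitt (Music from the Motion Picture) [SOUNDTRACK] [IMPORT] "
    "[ORIGINAL RECORDING REMASTERED]",
    "Bullitt (Music Recreated from and Inspired by the Motion Picture) "
    "[SOUNDTRACK] [IMPORT] [ORIGINAL RECORDING REMASTERED]",
]

# Every title collapses to the same chopped name, so pass 2 cannot tell them apart.
CHOPPED = {chop(title) for title in AMAZON_TITLES}

if __name__ == "__main__":
    print(CHOPPED)
```

All four titles reduce to the same chopped name, so any similarity score computed on the chopped names is identical for each of them. Breaking such ties would need information beyond the title, for example the track list or track count.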

Releases with the same name but extra tracks don't match correctly. There are 3 different Weezer (Green) albums that are named the same and share the first 10 tracks, but the UK release has 11 tracks and the Japanese release has 12 tracks (http://musicbrainz.org/showalbum.html?albumid=56450). They all match to ASIN "B00005ICAW", but the 12 track release should be ASIN B00005B7U2, and the 11 track release should be "B00005JHYM".

This is missing a match: http://www.musicbrainz.org/album/502cf184-caaa-4c77-ab81-87ff38c30c34.html - as Amazon has a 'dead page duplicate' for the album. It should point to: http://www.amazon.co.uk/exec/obidos/ASIN/B00004XN08/

Albums http://www.musicbrainz.org/album/7cee1d42-14f7-47e2-988c-14c46d55e162.html and http://www.musicbrainz.org/album/55ff080e-6fd1-4e2e-872f-8eff966bcb7d.html are mis-matched. The only correct MB album for that Amazon match is http://www.musicbrainz.org/album/3900cff7-3334-4007-8eeb-29307c25a8ed.html