Bots/Blacklist

== Motivation ==

[[Bots]] are a good thing because they can do boring (easily automatable) stuff much more efficiently than humans, so that humans can concentrate on more interesting/complicated [[Edit|edits]]. However, bots make mistakes, and some mistakes are very hard (if not impossible) to fix. Reasons include:
* Errors in other databases: A bot trying to link to Discogs can't know if the Discogs release is faulty. And it is not always possible to fix the other database.
* Matching thresholds: When matching by similarity, bots usually apply a similarity threshold. It often has to be lower than 100% so that it doesn't exclude too many correct matches, but then it can't be guaranteed that there won't be a (very) small percentage of false positives (see the sketch below this list).

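To illustrate the trade-off, here is a minimal sketch of threshold-based matching in Python. It is not the code of any actual bot: the helper functions, the example titles and the 0.9/0.6 thresholds are illustrative assumptions.

<pre>
# Minimal sketch of threshold-based similarity matching (illustrative only).
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_match(title: str, candidates: list[str], threshold: float = 0.9) -> str | None:
    """Return the best-scoring candidate if it clears the threshold, else None."""
    best = max(candidates, key=lambda c: similarity(title, c), default=None)
    if best is not None and similarity(title, best) >= threshold:
        return best
    return None


candidates = ["Nevermind (Deluxe Edition)", "Nevermore"]
# A strict threshold rejects both candidates, even though the deluxe edition is the
# release we actually want; a looser one accepts the unrelated "Nevermore", because
# this crude ratio happens to score it higher.
print(find_match("Nevermind", candidates, threshold=0.9))  # -> None
print(find_match("Nevermind", candidates, threshold=0.6))  # -> Nevermore (a false positive)
</pre>

Whatever value the threshold takes, it only shifts the balance between missed correct matches and accepted wrong ones; it never eliminates false positives completely.
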
Edits like these are usually entered by bots as normal edits, so [[Editor|editors]] can (and should) vote them down. Clever bots keep a persistent memory of the edits they have made, and won't make them again (see the sketch below). And here is the problem: if another bot author later runs the same (or a similar) bot, it will probably make the same error again (and again and again).

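Such a memory can be as simple as a local file of keys for the edits a bot has already entered. A minimal sketch, assuming a plain-text file and a made-up key format (neither is any kind of standard):

<pre>
# Sketch of a per-bot memory of already-entered edits, kept in a local file so the
# same bot never proposes an edit twice. File name and key format are illustrative.
from pathlib import Path

MEMORY_FILE = Path("edits_already_made.txt")


def load_memory() -> set[str]:
    """Return the set of edit keys this bot has already submitted."""
    if MEMORY_FILE.exists():
        return set(MEMORY_FILE.read_text(encoding="utf-8").splitlines())
    return set()


def remember(edit_key: str) -> None:
    """Append one edit key to the persistent memory."""
    with MEMORY_FILE.open("a", encoding="utf-8") as f:
        f.write(edit_key + "\n")


memory = load_memory()
# Placeholder key; a real bot would build this from the edit it is about to make.
edit_key = "release=00000000-0000-0000-0000-000000000001;discogs=https://www.discogs.com/release/1"
if edit_key not in memory:
    # ... enter the edit here ...
    remember(edit_key)
</pre>
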
To give humans a chance to defend themselves against this problem, there are blacklists for certain kinds of bot edits. If you want to protect a [[MusicBrainz Entity|MusicBrainz entity]] against future faulty bot edits, add it to the corresponding list. Bot authors should download the appropriate blacklist before every run and skip any edits that are listed (a sketch of that workflow follows below).

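Here is a hedged sketch of that workflow in Python. The URL (a raw export of the blacklist wiki page) and the one-entry-per-line parsing are assumptions for illustration only; check the actual blacklist page for its real location and format.

<pre>
# Sketch of the recommended workflow: fetch the blacklist once per run and drop
# any planned edit whose target is on it. The URL below (a hypothetical raw export
# of the blacklist wiki page) and the line-per-entry parsing are assumptions.
import urllib.request

BLACKLIST_URL = (
    "https://wiki.musicbrainz.org/index.php"
    "?title=Bots/Blacklist/Discogs_Links&action=raw"
)


def load_blacklist(url: str = BLACKLIST_URL) -> set[str]:
    """Download the blacklist and return the blacklisted identifiers as a set."""
    with urllib.request.urlopen(url) as response:
        text = response.read().decode("utf-8")
    return {
        line.lstrip("* ").strip()                      # tolerate wiki bullet markup
        for line in text.splitlines()
        if line.strip() and not line.startswith("=")   # skip headings and blank lines
    }


def filter_edits(planned_edits, blacklist):
    """Drop planned edits whose target a human has already blacklisted."""
    return [(mbid, url) for mbid, url in planned_edits if mbid not in blacklist]


if __name__ == "__main__":
    blacklist = load_blacklist()   # refresh before every run, not once and cached
    planned = [
        # placeholder MBID/URL pair; a real bot would generate these itself
        ("00000000-0000-0000-0000-000000000001", "https://www.discogs.com/release/1"),
    ]
    for mbid, discogs_url in filter_edits(planned, blacklist):
        ...  # submit the edit with whatever framework the bot uses
</pre>

Refreshing the list at the start of every run (rather than caching it once) is what keeps newly blacklisted entities protected.
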
== Blacklists ==

* [[Bots/Blacklist/Discogs_Links|Discogs Links]]
* [[Bots/Blacklist/Amazon_Links|Amazon Links]]
