Bots/Blacklist

From MusicBrainz Wiki
== Motivation ==
 
[[Bots]] are a good thing because they can do boring (easily automatable) stuff much more efficiently than humans, so that they can concentrate on more interesting/complicated [[Edit|edits]]. However, they make mistakes, and some mistakes are very hard (if not impossible) to fix. Reasons include:
 
* Errors in other databases: A bot trying to link to Discogs can't know if the Discogs release is faulty. And it is not always possible to fix the other database.
* Matching thresholds: When matching by similarity, bots usually have some thresholds. These often have to be lower than 100% so that they don't exclude too many true positives, but then a (very) small percentage of false positives can't be ruled out.
   
Edits like this are usually entered by bots as normal edits, so [[Editor|editors]] can (and should) vote them down. Clever bots have a persistent memory of the edits they made, and won't do them again. And here is the problem: If some other bot author at a later point in time runs the same (or a similar) bot, it will probably make the same error again (and again and again).
   
To give humans a chance to defend themselves against this problem, there are blacklists for certain kinds of bot edits. If you want to protect a [[MusicBrainz Entity|MusicBrainz entity]] against future faulty bot edits, enter them in the corresponding list. Bot authors should download the appropriate blacklist before every run, and omit edits that are listed.
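The "download before every run, then skip listed edits" step could be sketched as follows. This is a minimal illustration, not an official client: the blacklist URL and its format (one MBID per line, with `#` comments) are assumptions, so check the actual blacklist page for the real location and format before using anything like this.

```python
# Sketch of a bot consulting a blacklist before submitting edits.
# The URL and the one-MBID-per-line format are hypothetical.
from urllib.request import urlopen

BLACKLIST_URL = "https://wiki.musicbrainz.org/Bots/Blacklist/Discogs_Links"  # hypothetical


def parse_blacklist(text):
    """Return the set of blacklisted MBIDs from a plain-text list."""
    mbids = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            mbids.add(line.lower())
    return mbids


def fetch_blacklist(url=BLACKLIST_URL):
    """Download and parse the blacklist; call this before every bot run."""
    with urlopen(url) as resp:
        return parse_blacklist(resp.read().decode("utf-8"))


def filter_edits(candidate_edits, blacklist):
    """Drop candidate edits whose target entity is blacklisted."""
    return [e for e in candidate_edits if e["mbid"].lower() not in blacklist]
```

Re-fetching on every run (rather than caching a local copy) is the point: the list grows as editors vote down bad bot edits, and a stale copy would let a new bot repeat mistakes that have already been recorded.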
   
 
== Blacklists ==
 
* [[Bots/Blacklist/Discogs_Links|Discogs Links]]
* [[Bots/Blacklist/Amazon_Links|Amazon Links]]

Latest revision as of 12:15, 2 May 2013
