User talk:Jeroen: Difference between revisions

From MusicBrainz Wiki
(New section: Second hand information)

Revision as of 10:37, 24 June 2010

A few comments on the method

  • Please create a specific user for all the automated edits, which will then be given bot status. See e.g. Editor:ffimon_bot. Murdos 15:58, 27 May 2010 (UTC)
  • Ah, I checked for something like this, but could not find anything in the docs. I created Editor:JeroenBot. Jeroen 16:11, 27 May 2010 (UTC)
  • Please limit the number of open edits. You say that your scripts/reports are error-prone and that you're ready to react (cancel the edit, fix the script) each time someone finds a mistake. However, if you flood the open edits queue, nobody will ever be able to review your edits and the errors will just silently pass. Murdos 15:58, 27 May 2010 (UTC)
  • What would you propose? Jeroen 16:11, 27 May 2010 (UTC)
  • You currently have way too many open edits (~7000). So I suggest you wait before editing again, and then limit yourself to 500-1000 open edits. Murdos 08:28, 28 May 2010 (UTC)
  • Do you plan to open the source of your scripts? I'm not interested in running them, but if I'm able to check how you're doing your business, I may be able to spot errors at source. And I'm not really inclined to trust a black box machine. Murdos 15:58, 27 May 2010 (UTC)
  • Sure, if it helps. Do you have a proposed way of doing that? Jeroen 16:11, 27 May 2010 (UTC)
  • Not really. If I'm the only one interested, you can just send them by mail or upload them somewhere. Murdos 08:28, 28 May 2010 (UTC)
  • Do you really 100% trust Discogs? Murdos 15:58, 27 May 2010 (UTC)
  • No, of course not, just like I don't 100% trust MusicBrainz. But I make sure that the tools will never repeat the same edit, so if there's a fix, it won't be re-added. It helps that editors are watching the artists they know. By the way, I'm not focusing on Discogs in particular. I think the next data source on my roadmap is the structured infoboxes on Wikipedia. Jeroen 16:11, 27 May 2010 (UTC)
  • FYI Wikipedia content is covered by a different license (CC), so you may not be allowed to extract information there to include it here. Murdos 08:28, 28 May 2010 (UTC)
  • Focus on releases. Currently you're doing 1 or 2 edits on a release and then switching to a completely unrelated release/artist/track. Instead: pick a release, try to add as much "safe" information as you can, then continue on a new release. Murdos 08:28, 28 May 2010 (UTC)
  • Don't add partial credits. E.g. it happens that you add a "composed by" credit for only one artist while the work was done by 2 or more artists. I'd prefer that you do not enter any information rather than partial and misleading information. Murdos 08:28, 28 May 2010 (UTC)
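
To make the requests above concrete, here is a minimal sketch of a gate a bot could run before submitting each edit. It is only an illustration under stated assumptions, not Jeroen's actual scripts (which have not been published here): the names `ProposedEdit`, `EditGate`, `source_credit_count` and `MAX_OPEN_EDITS` are hypothetical, and the open-edit counter is tracked locally rather than read from MusicBrainz.

```python
from dataclasses import dataclass, field

# Assumption: upper end of the 500-1000 open-edit range suggested above.
MAX_OPEN_EDITS = 1000


@dataclass(frozen=True)
class ProposedEdit:
    """Hypothetical description of one relationship edit the bot wants to make."""
    release_id: str
    relationship: str         # e.g. "composed by"
    artists: frozenset        # artists the bot managed to match
    source_credit_count: int  # how many artists the source lists for this credit

    def key(self):
        # Identity used to recognise an edit that was already attempted.
        return (self.release_id, self.relationship, self.artists)


@dataclass
class EditGate:
    open_edits: int = 0
    seen: set = field(default_factory=set)

    def should_submit(self, edit):
        # 1. Keep the open-edit queue small enough for voters to review.
        if self.open_edits >= MAX_OPEN_EDITS:
            return False
        # 2. Never repeat an edit, so a human fix is not re-added later.
        if edit.key() in self.seen:
            return False
        # 3. Skip partial credits: better no data than misleading data.
        if len(edit.artists) < edit.source_credit_count:
            return False
        return True

    def record(self, edit):
        # Call after a successful submission.
        self.seen.add(edit.key())
        self.open_edits += 1
```

A bot loop would call `should_submit` for each candidate, submit only when it returns `True`, then call `record`. Lowering `open_edits` as edits close is left out of the sketch, as is grouping candidates by release.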

Spotted errors. Are they fixed?

  • Edit:12582655 Two different artists added for only one performance credit. Murdos 15:58, 27 May 2010 (UTC)
  • Will do that. Nothing is running now, but before I restart I will add a check for common 'ancestors' in the equivalence map. Jeroen 16:11, 27 May 2010 (UTC)

Second hand information

I usually think we shouldn't use second-hand information such as user-contributed websites (Discogs, Wikipedia), but only first-hand information from the physical release. And here a bot should really be strict about this, maybe not even using official websites. Mistakes get copied and pasted all over the net when everyone works this way. (cf. edit:12691335 at http://musicbrainz.org/show/edit/?editid=12691335, the latest I can remember, but I so often see mistakes coming from second-hand references, including in my own edits of course.) So I think a bot shouldn't do automatic imports from such second-hand sources. But maybe I'm just too strict. However, once an AR is there, we tend to assume it's correct; that is the problem here. Jesus2099 10:37, 24 June 2010 (UTC)