User talk:Jeroen

From MusicBrainz Wiki
Revision as of 21:17, 7 July 2010 by Jeroen (talk | contribs) (Respond to feature request + move past conversations to Archive)

Feature requests

Duplicate artist name check

I was sorting out the two Randy Thorntons in the database and came across Edit:12770647. The bot is adding the correct Randy to the track (the conductor), but in this instance the other Randy (the Disney producer/engineer) had already been added to the track.

My request is for the bot to notice when an artist with the same artist name has already been added to the track for the same credit. It doesn't have to do anything about it, it just has to flag it for later review. To make it more convenient, you could publish this "to be reviewed" list online somewhere so that others can deal with it as well. --navap 16:06, 7 July 2010 (UTC)

  • Thanks, I'll put that on the feature list. Finding them shouldn't be very hard to do. My plan is to have reports run regularly and placed on the wiki for review. This could also be used to spot suspect Discogs URLs for releases (with tracks that don't seem to match). Before I do that, though, I want to get the Discogs data extraction right. First make sure I can restart editing...
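A minimal sketch of the check requested above, assuming each credit on a track can be represented as an artist ID, an artist name, and a credit type. The `Credit` class and `find_name_conflicts` function are hypothetical names for illustration and are not taken from the bot's actual code. The function only reports conflicts for later review; it does not block or change the edit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credit:
    """Hypothetical representation of one credit on a track."""
    artist_id: str    # unique artist identifier (assumed)
    artist_name: str  # display name, may be shared by several artists
    credit_type: str  # e.g. "conductor", "producer"


def find_name_conflicts(existing, proposed):
    """Return existing credits that share the proposed credit's name and
    type but point to a different artist. These are candidates for the
    "to be reviewed" list."""
    name = proposed.artist_name.casefold()
    return [
        c for c in existing
        if c.credit_type == proposed.credit_type
        and c.artist_name.casefold() == name
        and c.artist_id != proposed.artist_id
    ]
```

An empty result means no same-name artist already holds that credit on the track; a non-empty result could be appended to a report page for manual review.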

Archive

  • I usually think we shouldn't use second-hand information such as user-contributed websites (Discogs, Wikipedia) but only first-hand information from the physical release. And here a bot should really be strict on this, maybe not even using official websites. Mistakes get copied and pasted all over the net when everyone works like this. (cf. edit:12691335, the latest I can remember, but I often see so many mistakes coming from second-hand references, including in my own edits of course)

So I think a bot shouldn't do automatic imports from such second-hand sources. But maybe I'm just too strict. However, once an AR is there, we tend to think it's correct; that is the problem here. Jesus2099 10:37, 24 June 2010 (UTC)

  • I can certainly see the reasoning behind that. It does mean that you have to make a trade-off. People can make mistakes adding information to MusicBrainz, just as they can adding information to Discogs or Wikipedia. At some point, you have to accept that there can be errors in your data. From a user's perspective, I think including information from sources like Discogs or Wikipedia adds much more value than the occasional error takes away. Nikki makes a good point in the edit you referenced: if you include the source information, then it's possible to trace the source of the data and make a judgement. Jeroen 11:01, 24 June 2010 (UTC)
  • Don't add partial credits. E.g. it happens that you add a "composed by" credit for only one artist while the work was done by 2 or more artists. I'd prefer that you do not enter any information rather than partial and misleading information. Murdos 08:28, 28 May 2010 (UTC)
  • Latest version does this. - Jeroen 21:17, 7 July 2010 (UTC)
  • Focus on releases. Currently you're doing 1 or 2 edits on a release then change to a completely unrelated release/artist/track. Instead: pick a release, try to add as much "safe" information as you can, then continue on a new release. Murdos 08:28, 28 May 2010 (UTC)
  • Latest version does this. - Jeroen 21:17, 7 July 2010 (UTC)
  • Do you really 100% trust Discogs? Murdos 15:58, 27 May 2010 (UTC)
  • No, of course not, just like I don't 100% trust MusicBrainz. But I make sure that the tools will never repeat the same edit, so if there's a fix, it won't be re-added. It helps that editors are watching the artists they know. By the way, I'm not focusing on Discogs in particular. I think the next data source on my roadmap is the structured infoboxes on Wikipedia. Jeroen 16:11, 27 May 2010 (UTC)
  • FYI Wikipedia content is covered by a different license (CC), so you may not be allowed to extract information there to include it here. Murdos 08:28, 28 May 2010 (UTC)
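One way the "never repeat the same edit" guarantee mentioned above could work is to keep a log of fingerprints of every edit already submitted. This is only a sketch under assumptions: the `EditLog` class is a hypothetical name, an edit is assumed to be a plain dictionary of JSON-serializable values, and a real tool would need to persist the log between runs rather than keep it in memory.

```python
import hashlib
import json


class EditLog:
    """Hypothetical in-memory record of edits that were already submitted."""

    def __init__(self):
        self._seen = set()

    @staticmethod
    def fingerprint(edit):
        # Serialize with sorted keys so that key order does not matter.
        canonical = json.dumps(edit, sort_keys=True)
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def should_submit(self, edit):
        """Return True the first time an edit is seen, False afterwards."""
        key = self.fingerprint(edit)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True
```

With such a log, an edit that an editor has reverted or corrected is not proposed again, because its fingerprint is already recorded.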


  • Edit:12582655 Two different artists added for only one performance credit. Murdos 15:58, 27 May 2010 (UTC)
  • Will do that. Nothing is running now, but before I restart I will add a check for common 'ancestors' in the equivalence map. Jeroen 16:11, 27 May 2010 (UTC)
  • Is in the current version. - Jeroen 21:17, 7 July 2010 (UTC)
  • Please create a specific user for all the automated edits, that will then be given a bot status. See e.g. Editor:ffimon_bot. Murdos 15:58, 27 May 2010 (UTC)
  • Ah, I checked for something like this, but could not find anything in the docs. I created Editor:JeroenBot. Jeroen 16:11, 27 May 2010 (UTC)
  • Please limit the number of open edits. You're saying that your scripts/reports are error-prone and that you're ready to react (cancel edit, fix script) each time someone finds a mistake. However, if you're flooding the open edits queue, nobody will ever be able to review your edits and the errors will just silently pass. Murdos 15:58, 27 May 2010 (UTC)
  • What would you propose? Jeroen 16:11, 27 May 2010 (UTC)
  • You currently have way too many open edits (~7000). So I suggest you wait before editing again, and then limit yourself to 500-1000 open edits. Murdos 08:28, 28 May 2010 (UTC)
  • Current implementation limits to 200 unreviewed edits. That is: max 200 edits that are Open and have not been manually reviewed by me (voted Abstain on)
  • Do you plan to open the source of your scripts? I'm not interested in running them, but if I'm able to check how you're doing your business, I may be able to spot errors at source. And I'm not really inclined to trust a black box machine. Murdos 15:58, 27 May 2010 (UTC)
  • Sure, if it helps. Do you have a proposed way of doing that? Jeroen 16:11, 27 May 2010 (UTC)
  • Not really. If I'm the only one interested, you can just send them by mail or upload them somewhere. Murdos 08:28, 28 May 2010 (UTC)
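The cap on unreviewed open edits discussed earlier in this thread (at most 200 edits that are Open and not yet manually reviewed) could be sketched as below. The dictionary keys `status` and `reviewed` and both function names are assumptions made for illustration; they do not describe the bot's real data model.

```python
def unreviewed_open_count(edits):
    """Count edits that are still open and have not been manually reviewed."""
    return sum(
        1 for e in edits
        if e["status"] == "open" and not e["reviewed"]
    )


def edits_allowed(edits, limit=200):
    """Return how many new edits may be submitted before hitting the limit."""
    return max(0, limit - unreviewed_open_count(edits))
```

The bot would call `edits_allowed` before each batch and submit no more than the returned number, so the open edits queue stays small enough for others to review.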