MusicBrainz Summit/2012 Mini-Summit/Notes

General Schedule

  • 10:30 Sign in, settle in, introductions
  • 10:45 discuss agenda, prioritize topics
  • 11:00ish Start discussion agenda points
  • 12:30ish Lunch -- we can continue informal conversations over lunch.
  • 14:00 Continue on agenda points
  • 16:00 Justin Barker from Universal joins us. Universal Artist Gateway demo.
  • 17:00 Informal networking and open discussion -- no agenda
  • 18:00 Adjourn to the Tower Tavern, continue open discussion

MusicBrainz Update

Current projects (right now):

  • Cover art archive: Waiting on archive.org to finish POST and COPY features. Hoping to be done near the end of Feb.
  • Product revamp: more replication options, changed pricing for dumps, a MetaBrainz web page overhaul, a non-rate-limited paid/commercial WS, etc.
  • Ongoing release editor improvements, plus some newly created edit submissions via WS
  • AR editor: Allow batch creation of links to make the link-adding process easier.

Medium term projects (approx. in 2012):

  • Classical schema planning
  • Edit system rewrite
  • Group multiple release events together
  • Data quality
  • 3rd party data set integration

Longer term (past 2012):

See the remaining bits from the Summit 11 roadmap

New features desired

No specific new features were requested, but Andy requested the creation of a few new AR types. We outlined the community-driven process for creating new ARs and explained that BBC editors can engage in the process directly. We suggested that the BBC outline their desires and goals to the Style List in order to get feedback about their ideas. The community know what has been tried and what has been previously discussed. Getting feedback and suggestions from the community is a good way to get some buy-in before putting forth actual proposals.

Orpheus data

  • Orpheus is a database that BBC Radio 3 have built themselves to help track what music has been played. This has a ton of classical information, so it’d be very interesting to MusicBrainz!
  • They have evolved this database over time to create the “Morpheus” database, which is essentially their next-generation database.
  • There is interest in getting this data into MusicBrainz, and when Nick gets some free tuits, he’ll try to get some sort of usable dump to us.

Fingerprinting

General discussion about the state of fingerprinting. First PUID’s sad current state and end of life. Update on Echoprint status and the apparent lack of progress. Detailed discussion of Acoustid and its capabilities. Paul gave supporting statements in favour of Acoustid. 7digital expressed a lot of interest in Acoustid and is considering running their catalog through it.

Otherwise no real new developments on this front.

Data import/data source/label feeds

We’ve asked Sony, EMI and Universal for label feed data. Both EMI and Universal suggested making MusicBrainz a digital distribution partner; that would give us access to the label data feeds. On Tuesday, Oliver and Robert will take the small amounts of label data already provided to MB and some data from last.fm’s Music Manager to attempt to hack on a very crude label data search index. This search index is designed to expose the data to MusicBrainz users for general investigation. At this point in time we are not suggesting any methods for importing this data into MB -- the sole purpose of this search index is to expose the data and to get people thinking about how we might efficiently use this data.
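
As a rough illustration of what a "very crude" search index could look like, here is a minimal sketch of an in-memory inverted index. Nothing here reflects what Oliver and Robert will actually build: the record fields (artist, title, label) and the sample records are hypothetical, and real label feeds will have a different shape.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into case-folded word tokens."""
    return re.findall(r"\w+", text.casefold())


def build_index(records, fields=("artist", "title", "label")):
    """Map each token to the set of record positions that contain it.

    The field names are assumptions made for this sketch.
    """
    index = defaultdict(set)
    for position, record in enumerate(records):
        for field in fields:
            for token in tokenize(record.get(field, "")):
                index[token].add(position)
    return index


def search(index, records, query):
    """Return the records that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return [records[i] for i in sorted(hits)]


# Hypothetical sample records, only to show the call pattern.
SAMPLE_RECORDS = [
    {"artist": "Example Artist", "title": "First Album", "label": "Example Records"},
    {"artist": "Another Artist", "title": "Second Album", "label": "Example Records"},
]

if __name__ == "__main__":
    sample_index = build_index(SAMPLE_RECORDS)
    for match in search(sample_index, SAMPLE_RECORDS, "example first"):
        print(match["artist"], "/", match["title"])
```

An index like this only exposes the data for browsing; it says nothing about how the data would be matched to or imported into MB.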

7digital

  • 7digital is an online distributor of music. They have many customers who take their data and music feeds and create stores or other solutions that offer 7digital content. Customers have expressed their frustrations with the limitations of the 7digital offerings.
  • 7digital is currently revamping their offerings to improve these issues. They have expressed great interest in the work that last.fm is doing and have looked to MusicBrainz NGS for inspiration for reworking their own offerings.
  • The 7digital team expressed a lot of interest in getting MBIDs into their product offering in order to make the lives of their customers easier. They will investigate the use of MusicBrainz in their offering further and will stay in touch with us.

BBC

Current Happenings

  • New artist pages have been rolled out which use a lot of MusicBrainz data.
  • The BBC didn’t expect NGS to ship so quickly, so they didn’t have the necessary changes in place quite when we launched. This means that instead of working with the database directly, they actually generate pages by making web service queries directly to musicbrainz.org (with lots and lots of caching to make it fast).
  • They are working on getting a local database set up and eventually moving back to hourly replication.
  • Lots of cross linking between various music sites, with the goal to get more people to discover new music.
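
The "web service queries plus lots of caching" approach described above can be sketched roughly as follows. This is not the BBC's implementation, which was not described in detail; the class name, the injected fetch function and the time-to-live value are all assumptions for illustration. A real fetch function would make an HTTP request to the MusicBrainz web service.

```python
import time


class TTLCache:
    """Cache the results of an expensive lookup for a fixed time.

    `fetch` is any callable taking a key (for example an MBID) and
    returning the data for it. It is injected so that this sketch
    runs without network access.
    """

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expiry time, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, no web service call needed
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    def fake_fetch(mbid):
        # Hypothetical stand-in for a web service request.
        print("fetching", mbid)
        return {"id": mbid}

    cache = TTLCache(fake_fetch, ttl_seconds=3600)
    cache.get("hypothetical-mbid")  # triggers a fetch
    cache.get("hypothetical-mbid")  # served from the cache
```

The trade-off is the same one noted in these notes: a longer time-to-live means fewer queries to musicbrainz.org but staler pages, whereas hourly replication into a local database bounds the staleness at about an hour without any per-page queries.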

Future Happenings

Ben Chapman summarized the current offerings of the BBC and pointed out that the current offerings are very much directed towards the average music consumer. They feel that their future offerings should be geared more towards some of the music experts and be able to pull them into the fold.

BBC Introducing landing page

  • This is a project that helps new musicians get their music played. Artists can submit their music to BBC for review, and if they like what they hear then it will be played on the radio.
  • All of the data going through BBC Introducing has MusicBrainz IDs, which means that new artists do have to do MusicBrainz editing. This process could be better, and can sometimes be confusing for new editors. There was talk about “observing” an artist through this process, and seeing how they find the experience and how we can improve it.

MusicBrainz community engagement / New editor mentoring program

  • MusicBrainz can be daunting for new users, so it would be good if there was a set of features to help experienced editors mentor users. This would roughly include closer watching of their edits, and a clear way to communicate and request help.
  • ocharles mentioned various other related ideas to generally help foster new users to the site, mostly around highlighting specific edits - for example, the user’s first “remove relationship” - so that we can make sure they understand what they are doing. He says to expect a mailing list post summarising these thoughts later in the week.

New MusicBrainz features

The BBC have requested that more information be included in the MusicBrainz database, but it seems like most of this can be solved by adding more ARs. The BBC have been encouraged to take charge here, determine what these new ARs are, and also research whether each AR has been proposed and rejected in the past.

Editorial data (artist reviews, album reviews, artist pictures)

  • There was a lot of interest in collecting album reviews, but it’s unclear whether or not this belongs in MusicBrainz proper, or somewhere else. The general consensus was that a separate product to collect reviews (perhaps CC licensed) might be the right approach.
  • Discussion also forked briefly into discussing a “read only” MusicBrainz web site, with a primary focus on exposing as much of our data as possible. Trying to be “less databasey” and more exploratory.
  • Artist pictures are certainly something that people are interested in, but no clear decision was reached due to the very unstable legal ground.
  • Artist profiles are handwritten biographies that the BBC use in place of Wikipedia, and there was interest in being able to have this data. Again, it seems to fit into the larger picture of this separate product.

last.fm update

  • Last.fm have previously had a script that synced artist names to MBIDs, but this got lost in various staff switch-overs. However, it’s finally been resurrected, and greatly improved! Coverage is rapidly going up for artist, release and recording MBIDs.
  • Currently they can only sync a single MBID per artist “name”, but this is going to change soon and they will be able to handle multiple MBIDs per artist, etc.
  • Meetings about the ongoing disambiguation problems have gone well, and MusicBrainz is going to be the source of data for artist disambiguation.
  • Last.fm will use MusicBrainz data to determine if artists are ambiguous (that is, there are multiple artists with the same name), and to split these artists apart. In the interface, Last.fm will display disambiguation comments as a way to help users determine which artist they are interested in.
  • Existing data in their catalogue will automagically be moved over time, when it’s clear which artist certain recordings/releases/etc belong to. This will be done partially by using more MusicBrainz data, and partially by mining their own data.
  • Last.fm want to have 100% coverage in the future - meaning they want to import artists/releases/recordings that are not even in Last.fm. Their goal is to fully track MusicBrainz.
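
The ambiguity check described above (several artists sharing one name, told apart by their disambiguation comments) can be sketched as follows. This is only an illustration under assumptions, not Last.fm's actual code: the record layout, the function names and the sample artists with their placeholder IDs are all hypothetical.

```python
from collections import defaultdict


def find_ambiguous_names(artists):
    """Return {case-folded name: [artists]} for names shared by more than one MBID."""
    by_name = defaultdict(list)
    for artist in artists:
        by_name[artist["name"].casefold()].append(artist)
    return {
        name: group
        for name, group in by_name.items()
        if len({artist["mbid"] for artist in group}) > 1
    }


def display_label(artist):
    """Show the disambiguation comment next to the name when there is one."""
    comment = artist.get("disambiguation")
    return f'{artist["name"]} ({comment})' if comment else artist["name"]


# Hypothetical artists with placeholder IDs, not real MBIDs.
SAMPLE_ARTISTS = [
    {"mbid": "placeholder-id-1", "name": "Example Band", "disambiguation": "UK rock band"},
    {"mbid": "placeholder-id-2", "name": "Example Band", "disambiguation": "US jazz trio"},
    {"mbid": "placeholder-id-3", "name": "Solo Artist", "disambiguation": ""},
]

if __name__ == "__main__":
    for name, group in find_ambiguous_names(SAMPLE_ARTISTS).items():
        print(name, "->", [display_label(artist) for artist in group])
```

Grouping by exact case-folded name is the simplest possible rule; deciding which scrobbles or recordings belong to which of the split artists is the harder part and is not covered here.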

Classical schema summit (contributors from: Last.fm, Naxos?, Universal, BBC, IMSLP?)

The classical schema summit should happen before we re-write the edit system so that we can re-write the edit system with the pending classical improvements in mind. This will be an invite-only summit, likely to take place in London. It should have one database person (MB) and one community representative (MB), Rob Kaye as MC, and two walking classical edge case generators from BBC, Universal, Naxos, Last.fm, ArtistXite or IMSLP. The summit should be limited to no more than 5 people and will likely require more coordination beforehand than the previous NGS schema summit.

RDF data consumption

Gioele asked for a status update on RDF. RDF is currently of interest to academic institutions but until a killer app has been developed that utilises RDF there doesn’t seem to be that much interest outside of academia.

Universal Artist Gateway

Provides an artist page for their most popular artists that aggregates data from a variety of sources including MusicBrainz, and has some similarity to the BBC artist pages. A beta with the top 200 artists should be available within a couple of months.

Users' Questions

  • last.fm
    • last.fm artist URL relationships in MusicBrainz
    • How does last.fm map to MBIDs?
    • Can http://www.last.fm/mbid/<MBID> URL be safely used? Is it public and will it stay available?
      • It seems the artists always point to the MBID that was most recently updated, so currently the links are not stable
        • Last.fm says it will become stable and able to resolve all artist/release MBIDs, with an ETA of around 3 months; including actually creating pages for any MB artists that are not in last.fm at the moment. Auto-linking by MBID should then be safe and never result in a 404 (except in the hour or so that is needed for replication)
    • How often are new links pulled for artists (official homepage, myspace, etc.)? BBC does it (somehow) instantly...
      • Actually, the BBC are just querying the ws directly now. It'll go back to a 1 hour delay once they go back to using replication. As for last.fm, they don't pull them from us at all right now apparently
  • BBC
    • Is an API planned to get the list of reviews (to cross-link them from MB)?
    • Some reviews are not yet (or not anymore) linked to an MB release, is it planned to do this? Can MB editors help?
      • Here's the current list of reviews (admittedly not exactly an API...): https://gist.github.com/1704822
        • More talks need to happen about how to work together on reviews, see the Editorial part earlier in the notes.
        • Thanks a lot! I’ve started adding the already linked reviews