MusicBrainz Summit/10/Session Notes



Please add yourself if you attended.
  • Rob Kaye (ruaok)
  • Paul Taylor (ijabz)
  • Nikolai (pronik)
  • Nikki
  • Oliver Charles (aCiD2)
  • Carol Smith (luciddreamer)
  • chirlu
  • Mathias Kunter (mathiaskunter)
  • Age Bosma (Prodoc)
  • Kuno Woudt (warp)

A big thanks goes to luciddreamer for taking notes the entire day!


  1. Welcome, thank you for coming. Happy 10th Anniversary! There's cake.
  2. No one finished NGS overnight, which is a shame.

Agenda Brainstorming

  1. Status of NGS (aCiD2)
    1. Status
    2. Edge Cases
  2. Status of MusicBrainz (Rob)
    1. BBC
    2. Amplified Media Services
    3. 24 Hour Diner
  3. NGS for Taggers
  4. MB Hosting
  5. Internationalization
  6. Future
    1. Classical Music
    2. Edit System
    3. Post NGS
      1. Events, Venues, Studios (geographical data)
      2. all kinds of other data and then some
      3. Recommendations
    4. Facts vs. concerts/collections
    5. Editor karma/rating
    6. Stash
    7. Submit Edits to Queue
    8. Roles, Podcasts, Radioshows ??
  7. Avoiding/Dealing with poisonous people, Community Issues
  8. Long Term Plan
  9. NGS Install Trouble
  10. Rollout NGS
  11. DVCS
  12. mb-server implementation (aCiD2)
  13. Tagging Tags

State of MusicBrainz

  1. London was depressing. Nothing bad for MB, but nothing good either.
  2. BBC
    1. Articles about Mark Thompson talking about the future of the BBC - apparently the BBC is doing things too well
    2. Independent radio stations aren't keeping up with the BBC
    3. Have a general plan to replace upper management completely every six months
    4. Given that, the BBC is changing their online strategy and the exact outcomes are unclear at the moment.
    5. MB at the BBC had two purposes
      1. organize the BBC better - focus on music, not on the products
      2. internal systems of keeping track of everything they play out - all the things in the system have MB IDs and everything they play out helps their reporting and gives them more information about the
      3. People outside of the BBC get more access to the music
    6. Still going to be a part under the hood for the BBC to organize all their stuff
    7. Beneficial thing is the goal is to hire someone - need initial amounts of money to bring in a new developer
    8. BBC has agreed to a "bounty program" or "milestone program" for more funding
      1. Need to figure out how we want to implement this
      2. We now need to define what these bounties are and what we want to do
      3. Should have 4-6 milestones and attach money to each of these milestones. Should probably run until the delivery of NGS.
      4. We determine how much money we ask for for each milestone, but they're welcome to say no. We should do due diligence on how much effort it will take to do these tasks
      5. Need a "person month" idea and present that to the BBC - for each task you can figure out how much it translates to in a person's salary
      6. aCiD2, warp and mayhem will break out on this topic and talk about it tomorrow.
  3. Amplified Media Services
    1. Isn't going well.
  4. Recommendations, 24 Hour Diner (demo:
    1. A guy who made the recommendation engine for Amazon now runs a service called 24 Hour Diner
    2. Edits, advanced relationships, some other data was provided to this guy in the form of MB IDs and he ran it through his engine
    3. Demo on metawebsite -
    4. For mayhem's musical taste, it wasn't that great. But for more popular stuff, it's better.
    5. There may be some information we could extract from the server logs to make this more useful/relevant to feed into and improve the system
    6. Once we have a recommendation system, we can replace the recommendations on MB
    7. We *could* build a recommendation engine, but:
    8. A sexier application would be improving our voting system!
    9. On the to do list to go improve and clean up that data
  5. Venues, Concerts company
    1. Has been pushing mayhem to do integration between MB and their system
    2. Been putting them on hold while NGS is in progress
    3. mayhem will send a link to this -
    4. How do we integrate their technology? Their suggestion is a link to what concerts are playing and pull in this data and display it on the MB page for that artist
    5. Idea is for crosslinking - but we should decide what we want to do
    6. Maybe not useful to have information on "here's an artist you might like who's playing 200km from here" or maybe it is?
    7. We should discuss this for the "future" discussion point
    8. State of MB
      1. Fairly stable - infrastructure has been improved quite a bit
      2. Planning on installing an instance of Zen for each machine - keep things in a better state
      3. MB is stable, replicatable, yay!
      4. Thank you Paul for your work!
      5. We have about $80,000 in the bank, we're stable and breaking even every year.
      6. Mayhem is hoping we will end up on a cloud-hosting platform of some sort. How this would be applied is still very squishy, but this is the idea.

State of NGS

  1. We started working on the NGS server at the start of SOC
  2. We've completely tested the schema - you can get all the data in the schema
  3. First milestone was a RO interface and we've gotten that far
  4. Now we're trying to get to the editing step
  5. As part of implementing the backend, we are completely changing the frontend
  6. We started working on the edit system, have most of the major edit types in there now - a few things as well like editing works and recordings
  7. Oliver is working on the release editor, which was working but is now broken again
  8. Edit system is 60% of the way there
  9. We have a lot of bugs! Have a bug report on bug tracker
    2. 23 tickets in there atm, but want more in there
  10. Don't have statistics support, or the dashboard from the last release on MB
  11. Past the halfway milestone. Lots of little things that haven't been done.
  12. Some to-do tests are also in the unit tests
  13. Three milestones set - Beta 1, Beta 2, NGS final
  14. Where something belongs is up to you, a little squishy
  15. is more or less automatically updated now
  16. Data population script re: works
    1. The reason we've left it out is we don't know how people are going to use it.
    2. We need to talk about how we're going to use it
    3. BBC Radio 3 has a tremendous database of classical music and works, we should work with Nick to figure out structure for works
  17. Main advantages of NGS
    1. Artist credits
    2. Schema itself
    3. Easier to accept bug reports
  18. How long will it take to finish?
    1. It would be nice to have milestone 1 out by the end of this year.
    2. If we get to hire another person, this will move things a lot faster.

NGS for Taggers

  1. What does NGS mean for taggers?
  2. Firstly, we'll have a compatibility service. In the long term, we'll have new possibilities with the NGS - better selection of identical albums from one release
  3. In NGS track IDs are being moved forward to become recording IDs, the IDs that are being thrown out are going to the GID table
  4. Only have identifiers for recordings, not tracks. If you want to identify one track on a release you use the combination.
  5. Is NGS going to change webservice for existing applications?
    1. Idea is we'll have backward-compatible v1 layer.
    2. Concepts don't map perfectly - like artist credits.
  6. Artist credits
    1. Artist is an individual person or band but not a compound band.
      1. Queen & David Bowie - artist credit should capture what it will say on the spine of a CD you would buy
        1. Queen --> MBID
        2. David Bowie --> MBID
        3. =Queen "&" David Bowie
    2. Release has an artist credit and you can string them together
    3. Paul's suggestion (but he's not bothered as much by this problem now)
      1. <artist>
        1. <id>
        2. <name>
        3. <artist credit> (optional)
      2. <artist>
  7. We need to encourage people to change to v2 as soon as possible.
  8. Tagger authors all need to talk to their community about this.
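
The artist credit idea in item 6 above can be sketched as a small data structure. This is only an illustration, not the actual NGS schema: the class names, the `join_phrase` field, and the placeholder identifiers are all assumptions made for the example. Each credited artist keeps its own ID, and the credit as a whole produces the text you would see on the spine of the CD.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Artist:
    mbid: str  # placeholder identifier for illustration, not a real MBID
    name: str


@dataclass(frozen=True)
class CreditPart:
    artist: Artist
    join_phrase: str = ""  # text printed after this artist, e.g. " & "


@dataclass
class ArtistCredit:
    parts: List[CreditPart]

    def display_name(self) -> str:
        # Concatenate each artist name with the phrase that follows it.
        return "".join(p.artist.name + p.join_phrase for p in self.parts)

    def artist_ids(self) -> List[str]:
        # Every individual artist stays addressable by its own ID.
        return [p.artist.mbid for p in self.parts]


queen = Artist(mbid="queen-id-placeholder", name="Queen")
bowie = Artist(mbid="bowie-id-placeholder", name="David Bowie")
credit = ArtistCredit(parts=[CreditPart(queen, " & "), CreditPart(bowie)])

print(credit.display_name())
print(credit.artist_ids())
```

A v1-compatible layer would presumably have to flatten such a credit into a single artist name, which is one reason the concepts do not map perfectly.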

Release Process for NGS

  1. Edit Structure is changing - not fixing the edit system
  2. We need to fix the edit system and take the archive of edits and put them back in the database, but this means we have to close all the edits
    1. We would need to either write a script to tweak all the data in the old format into the new format or close MB for 3 weeks to be read-only
    2. We'll need a conversion script anyway, maybe we should just write it now. Oliver will figure out about how long this might take. "It looks scary but doable." Easily a week of writing the script itself.
  3. Two options depending on our decision on edits:
    1. If we close edits, we announce it widely that we are taking down MB and get as many people as we can to vote on edits over a space of 2-3 days. Voting system determines when we fail all the other edits, and then we put MB in a read-only state. (May need to purchase a new machine for this to make it read-only).
      1. For live update feed people, we'll do a dump of the data to the ftp site and those people
      2. Then run a database upgrade script
      3. Then..?
      4. Update the web frontends
      5. Run all the reports
      6. Then some people will see the new information
      7. Then, somehow, later, NGS is upon the world and MB is back and open for business.
    2. Process is similar if we decide to run a script for the open edits without the 2-3 days of voting.

NGS Install Troubles

  1. What can we do to reduce the amount of troubles?
  2. Best Practices for making sure the make file is up-to-date and correct.
  3. We've always specified Linux for the install and it works better for that.
  4. Rob will take the information on how he got it running on MacPorts into a
  5. Need debian repository with all the dependencies?
  6. Rob needs to do a new updated NGS dump
  7. CPAN problems are most of the trouble-makers, it seems

Schema/UI Change

  1. NGS is clearly a schema change, but after it we'll probably have a rapid succession of releases with no schema changes, and those releases should allow us to iterate a lot faster.


Internationalization

  1. pronik has been translating the website. Isn't that nice of him?
  2. MusicBrainz translation is basically ready
  3. Making translations to other languages much easier as well
  4. Changing the language from the webpage is on the to-do list, right now it's based on the browser's default language
  6. Transifex (
  7. A lot of work to do still on this - much of it to come after the NGS release

MB Hosting

  2. Moving to cloud-based computing is a scary option, since everything needs to be redundant.
  3. Each of the components of the MB architecture will need to come out of a debian package essentially.


Edit System

  1. Need to start thinking about a way to group edit
  2. Concept brought forth is an "edit" is a collection of small changes
  3. Entire edit either passes or fails, not partials
  4. Could be some concept of "Starting a New Edit" and then "Done with Edit"; this will probably need to be implemented analogously to a database transaction
  5. May need to set some guidelines of "keep changes small" so that edits stay easy to vote on
  6. Need to be able to resubmit edits and edit edits
  7. Should be as easy as possible for the editors and a bit harder for the voters
  8. Each sub-transaction part of a larger transaction can be scored. e.g. if I change 1 part of a 1,000 part edit I get 1 point and the original editor gets 999 points.
  9. Each change can be voted on: yes, no, amend, no vote.
  10. Problem: We don't want people to put edits that aren't related in one grouping.
  11. Subscriptions to particular searches would be great: e.g. I want to be notified or search for each time there's a new release in Russian.
  12. Editor karma/ratings
    1. We want a way to give people respect/cred in the community for doing frequent, high-quality changes. They'll get less scrutiny on their changes and make the edits faster. It will hopefully make the open edit queue smaller.
    2. We should capture more information about edits once the new edit system is in place - run reports and propose rules for a karma system.
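
The grouped edit idea above (an edit is a collection of small changes that passes or fails as a whole, with points split per change) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a design for the real edit system: `Change`, `GroupedEdit`, `set_field`, the one-point-per-change rule, and the sample data are all hypothetical.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Change:
    author: str
    description: str
    apply: Callable[[dict], None]  # mutates a working copy of the data


@dataclass
class GroupedEdit:
    changes: List[Change] = field(default_factory=list)

    def add(self, change: Change) -> None:
        self.changes.append(change)

    def commit(self, data: dict) -> dict:
        # Work on a copy so a failing change leaves the original untouched,
        # similar to rolling back a database transaction.
        working = copy.deepcopy(data)
        for change in self.changes:
            change.apply(working)  # any exception aborts the whole edit
        return working

    def points(self) -> Dict[str, int]:
        # Assumed scoring rule: one point per change, credited to its author.
        totals: Dict[str, int] = {}
        for change in self.changes:
            totals[change.author] = totals.get(change.author, 0) + 1
        return totals


def set_field(key: str, value: str) -> Callable[[dict], None]:
    def apply(data: dict) -> None:
        if key not in data:
            raise KeyError(key)  # refuse to touch fields that do not exist
        data[key] = value
    return apply


release = {"title": "Exmaple Album", "country": "XX"}

edit = GroupedEdit()
edit.add(Change("alice", "fix title typo", set_field("title", "Example Album")))
edit.add(Change("bob", "set country", set_field("country", "DE")))
updated = edit.commit(release)
points = edit.points()

bad = GroupedEdit()
bad.add(Change("alice", "rename", set_field("title", "Other")))
bad.add(Change("alice", "invalid change", set_field("missing", "x")))
try:
    bad.commit(release)
    failed = False
except KeyError:
    failed = True  # the first change was not applied either
```

Per-change voting (yes, no, amend, no vote) is left out here; it could hang off each `Change` in the same structure.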

Submit Edits to Queue

  1. Spam
  2. People who use MB as an API and start submitting 1,000s of edits at a time.
  3. How do we prevent a n00b from starting to use an application to make 10,000 edits?
    1. API key to block a tool from doing something on behalf of a user
    2. Should go in a different holding queue instead of an open edit queue?
    3. Mapping between what edits are allowed to be submitted to the queue and what aren't?
    4. Rollout slowly and give people more access over time
    5. Have a handful to begin with and see what works
    6. Limit the number of open edits any user can have (with karma this will work well because you can give a person with good karma more open edit permissions)
    7. Undo worst damages later if we discover something doesn't work
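
The idea of limiting open edits per user and loosening the limit with good karma could look something like the sketch below. No karma system was defined at the summit, so the function names and every number here (base limit, per-karma bonus, cap) are made-up assumptions for illustration.

```python
def open_edit_limit(karma: int, base_limit: int = 10,
                    per_karma: int = 5, max_limit: int = 500) -> int:
    """Return how many open edits a user may have at once.

    All defaults are hypothetical. Negative karma is treated as zero,
    so nobody drops below the base limit.
    """
    return min(max_limit, base_limit + per_karma * max(0, karma))


def may_submit(open_edits: int, karma: int) -> bool:
    # A new edit is accepted only while the user is below their limit.
    return open_edits < open_edit_limit(karma)


print(open_edit_limit(0))   # new user
print(open_edit_limit(4))   # user with some karma
print(may_submit(10, 0))    # new user who already has 10 open edits
```

A cap like `max_limit` also bounds how much damage a misbehaving tool can do on behalf of a trusted user.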


Stash

  1. Sit down and write a bit of an edit and finish it later (could do this with an iPhone in the store, for example)
  2. Get a list of previous stashes when you come back to MusicBrainz

Classical Music

  1. NGS genesis is artist revamp; releases, release groups, and mediums; needs a "master" concept
  2. Need a "master" table, this is the missing component- the idea of the original work that can be composed by someone, performed by someone else, and then remixed or included in an album later.
  3. Opus Numbers support
  4. Also, how do you keep track of an orchestra of 100 people, for example?
  5. Work support needs to be improved
  6. BBC is really interested in us handling classical well.
  7. We could start the discussion after NGS with the work support and iterate
  8. On the roadmap for sooner rather than later
  9. Need to define "works". No one volunteered to do that. :-)
  10. Probably need to send this out to the community - make a post to the blog and see what people think. AI for Rob to do this.

Events and Venues

  2. To what extent do we want to get venues or gigs into MB?
  3. Venues tie in nicely with recordings and would help the bootlegs community
  4. Concerts that have happened as well as upcoming concerts
  5. Setlists would be great to base the bootlegs off of
  6. How do we feel about offering this data? Is it a question of MB being a database of characteristics related to music or is it a music site in general?
  7. How do we integrate someone who's very eager to work with us to be one degree of separation from MB?
  8. Would be useful to have the concert information for the music, will offer a more well-rounded experience
  9. Proposal: preamped will provide the data, MB will pull the data from their site as often as they provide it, we will cache it, and there will be a link on the side bar for the concerts associated with that artist. Link will be clear what concert provider you're going to.
    1. Will not be exposed through any API
    2. Additional data feed services will be randomized into a link on the sidebar
  10. If we have geographic information, we can talk about providing a more targeted link based on that information later.
  11. Need to expose this on the webservice as well.


  1. This is an easy win - but the question of what the priority is vs. getting classical done.

Roles, Podcasts, Radioshows ??

  1. MusicBrainz features happen because people continually request them - so the official answer is "yes, we could see these things..."
  2. Podcasts are a modern version of radioshows.
  3. Do we have to have a use for data in order for it to be in MB?
  4. Does anybody keep podcasts around? Yes.
  5. Is this merely a task for creating style guidelines or do we need to write code to make it happen?
  6. We can easily make a new release group type, just need to work out how we name it and what we associate with it
  7. Anything which is published as audio is ok for MusicBrainz
  8. To Do: Need to add it to the style guidelines for NGS for podcasts and radioshows
  9. Roles: Will be done with classical.

Facts vs. Collections

  1. Covered somewhat with the discussion that we want to keep track of facts; things that happened in the past, we don't want things that are going to happen
  2. Collections could be expressed as facts, but ratings and tags cannot be expressed as facts
  3. Porting collections from the current server? Keeping it mostly the same.

Tagging Tags

  1. In lieu of genres, we should have a flag we can attach to certain tags to show it is representative of a genre
  2. Classifications for tags, maybe? Tag categories? Meta-tags?
  3. How would it help?
  4. Would like to get better documentation on how to develop for MB and gather more information on tags
  5. Not a high priority, we'll revisit later
  6. There will be no limit on tag submissions in the new web service
  7. The first issue we need to solve is getting people to start tagging things more, define the communal
  8. Remove frictions, get more tagging capabilities, make it more visible, revisit at the next summit
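
As one possible answer to "How would it help?", the genre flag from item 1 can be pictured as a boolean attached to a tag, which lets a client pick out genre-like tags from free-form ones. This is a hypothetical sketch: the `Tag` class, the `is_genre` field, and the sample tags are assumptions, not part of any agreed design.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tag:
    name: str
    is_genre: bool = False  # hypothetical flag marking a tag as a genre


def genre_names(tags: List[Tag]) -> List[str]:
    # Keep only the tags flagged as genres, in alphabetical order.
    return sorted(t.name for t in tags if t.is_genre)


tags = [Tag("rock", is_genre=True), Tag("seen live"), Tag("jazz", is_genre=True)]
print(genre_names(tags))
```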

Long-Term Plan

  1. Rob wants a heads-up display of information about music
  2. "We're in the business of knowing where information about music can be found." - Rob
  3. Start a ConcertBrainz project under the MetaBrainz foundation as a separate project?
  4. We need a more complete and concise mission statement - Rob needs to think about this when he's less tired.