https://wiki.musicbrainz.org/api.php?action=feedcontributions&user=Valerio.paolini&feedformat=atomMusicBrainz Wiki - User contributions [en]2024-03-29T04:42:06ZUser contributionsMediaWiki 1.39.4https://wiki.musicbrainz.org/index.php?title=MusicBrainz_Summit/11/Session_Notes&diff=48620MusicBrainz Summit/11/Session Notes2011-10-18T13:04:52Z<p>Valerio.paolini: </p>
<hr />
<div>== Attendees ==<br />
<br />
* Kuno Woudt (warp)<br />
* Pavan Chander (navap)<br />
* Rob Kaye (ruaok)<br />
* Nikki<br />
* Oliver Charles (ocharles)<br />
* Jamie McDonald (jdamcd)<br />
* Nicolás Tamargo (reosarevok)<br />
* CatCat<br />
* Per Øyvind Øygard (Wizzcat)<br />
* Paul Taylor (ijabz)<br />
* Mathias Kunter (mathiaskunter)<br />
* Hilbert Woudt (monedula)<br />
<br />
Sponsor representatives<br />
<br />
* musiXmatch: Valerio Paolini<br />
* Last.fm: Adrian Woodhead<br />
* Google/Freebase: Micah Saul (micahsaul)<br />
* Zvooq: Andrey Popp (andreypopp)<br />
* BBC: Dave Evans (djce)<br />
<br />
== Customer introductions ==<br />
<br />
=== Last.fm ===<br />
<br />
* They have had a lot of personnel changes over the last few years, but would like to re-establish a relationship with MB<br />
* They are looking to switch to the NGS schema by the end of the year<br />
* They would like to use MBIDs internally to make communication between incoming data sets easier<br />
* They will consider sharing partial label feed/data<br />
* They might actually solve their artist disambiguation issue soon-ish<br />
<br />
=== Zvooq ===<br />
<br />
* They are a Spotify competitor in Russia that focuses on music released worldwide<br />
<br />
=== musiXmatch ===<br />
<br />
* They are a lyrics database<br />
* They hold a worldwide license from Sony, Universal, EMI, Warner, BMG, and Kobalt<br />
<br />
=== Freebase ===<br />
<br />
* Freebase is a big data repository of various data sets covering movies, music, sports, people, locations, and others<br />
* http://freebase.org<br />
<br />
=== BBC ===<br />
<br />
* They are looking to finally make the switch to NGS<br />
* Their music news website now uses ws/2<br />
* They outsource their album reviews and MB data entry to Unique Broadcasting Company<br />
<br />
== Discussions ==<br />
<br />
=== Friday (Oct 14) ===<br />
<br />
==== Single sign on & password security ====<br />
<br />
Goals<br />
<br />
* Not storing plaintext passwords<br />
* Not having knowable (i.e. reversible) passwords<br />
* Not transmitting passwords in the clear<br />
* Single sign on<br />
<br />
Questions<br />
<br />
* What specific password issues are we trying to solve?<br />
<br />
Discussed proposals<br />
<br />
* Implement OpenID<br />
* Using digest authentication (still requires storing and transferring the clear text password)<br />
* Using SSL (requires updating web service libraries)<br />
* Using a separate LDAP server (password no longer in MB database and stored elsewhere, also allows for possible single sign on integration)<br />
<br />
'''Conclusion:''' Use LDAP and phase in SSL to increase password security. Bonus: LDAP makes single sign on possible.<br />
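The first two goals (no plaintext and no reversible passwords) are commonly met by storing a salted, deliberately slow hash instead of the password. The following is a minimal sketch using PBKDF2 from the Python standard library; the function names and the iteration count are assumptions for illustration, and with the LDAP approach the directory server would normally do this work itself.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # assumed work factor, not a value from the summit


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key); only these are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```

Hashing protects the stored value only; the password still travels in the clear on login unless the connection uses SSL, which is why the conclusion pairs the two.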
<br />
=== Saturday (Oct 15) ===<br />
<br />
==== Cover art archive ====<br />
<br />
* Universal is considering handing over their entire cover art archive to us<br />
* Labels actually don't own copyright on cover art<br />
* There are potentially messy legal issues with using cover art<br />
* The Internet Archive functions as a library and can act as a 'cover art shelter' for us<br />
* Possible process:<br />
** A release's MBID can be used to receive a cover art image<br />
** If you know a release's MBID you can do a GET and receive a cover art image<br />
** Track cover art uploads by user and also use the regular voting process<br />
** Images will be provided in a hi-res version (~15 MB) and a low-res version (500 px)<br />
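The lookup in the process above amounts to building a URL from the release MBID and issuing a GET. A rough sketch follows; the host name, path layout, and size names are placeholders, since the notes do not fix an endpoint.

```python
import uuid

# Hypothetical base URL; the summit notes do not specify one.
BASE_URL = "https://coverart.example.org"


def cover_art_url(release_mbid: str, size: str = "full") -> str:
    """Build the GET URL for a release's cover art.

    `size` is "full" (hi-res) or "small" (500 px); both names are assumed.
    """
    mbid = str(uuid.UUID(release_mbid))  # raises ValueError on a malformed MBID
    if size not in ("full", "small"):
        raise ValueError(f"unknown size: {size!r}")
    return f"{BASE_URL}/release/{mbid}/{size}"
```

Validating the MBID before building the URL keeps malformed identifiers from ever reaching the image host.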
<br />
Questions<br />
<br />
* Does the user have to upload JPEG or can the server transcode?<br />
* What status code will we return when a 'darkened' image exists but we're not allowed to display it?<br />
* If we get cover art from Universal, how do we match each image up with a release?<br />
* How do we handle a release group with many releases (i.e., do we use the same image)?<br />
* How do we handle multiple images (e.g. front, back, obi, liner notes, CD faces, etc.)?<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Archive<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Wishlist<br />
<br />
==== Data quality ====<br />
<br />
* '''See Sunday'''<br />
<br />
==== Edit system ====<br />
<br />
Goals<br />
<br />
* Allow grouping edits together and bulk submitting them<br />
* Allow editing an edit and resubmitting it without impacting the edit queue<br />
* Allow editing via the web service, eventually<br />
<br />
Bookbrainz<br />
<br />
* Oliver's pet project and testing ground for future MB framework changes<br />
* BB emulates git<br />
** It allows building a stack of changes and then submitting all of them together in one 'commit'<br />
** It takes a snapshot of the data at the time. We don't have that with historical edits so migrating old edits is a problem<br />
<br />
Further reading regarding <br />
<br />
* http://en.wikipedia.org/wiki/Inter-rater_reliability (especially the links at the end)<br />
* http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00074<br />
<br />
==== Web service ====<br />
<br />
* Roll out 3scale and move all commercial users over to a pay2play system with different packages<br />
* Non-commercial users would use the free2play rate-limited system with the option of paying for better access<br />
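A rate-limited free tier is often implemented as a token bucket. The sketch below is a generic illustration in Python, not 3scale's API; the class name and parameters are assumptions.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second with bursts of up to `capacity`."""

    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start with a full bucket
        self._now = now                # injectable clock, handy for testing
        self._last = now()

    def allow(self) -> bool:
        current = self._now()
        # Refill according to elapsed time, capped at the bucket size.
        elapsed = current - self._last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self._last = current
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A paying user would simply get a bucket with a higher `rate` and `capacity`.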
<br />
==== Audio fingerprinting ====<br />
<br />
* We all hate PUIDs and we need to move forward<br />
* Acoustid looks very promising: it's open source, file oriented, and has strong ties with MB<br />
* http://acoustid.org/<br />
* May be possible to bulk fingerprint some data sources<br />
<br />
==== Concert support ====<br />
<br />
* Do we go with one provider or several?<br />
** Start with Songkick, but stay open to the option of different providers - especially to gain global coverage<br />
* Do we concentrate on future events or archived events?<br />
** Initially link to Songkick for future events<br />
** Create a new setlist entity for past events<br />
** Create a new venue entity<br />
* Need to consider Location; it would be useful for artists as well as for events.<br />
<br />
==== Tracks vs recordings (vs works) ====<br />
<br />
* Similar to the remaster issue<br />
* Do we add further levels of abstraction?<br />
** No. We're already saturated with entities. We need better definitions<br />
** ...and we still haven't totally defined works<br />
* Do we count silence as a divergence point?<br />
<br />
==== Service segregation ====<br />
<br />
* Announce the closing of trac (and all its tickets) and the deprecation of subversion<br />
* [http://svn.musicbrainz.org svn.musicbrainz.org] will remain as an interface for the search server<br />
* Consider replacing gitweb with GitHub in a more official capacity<br />
<br />
==== Genres ====<br />
<br />
* A new field that is to be used specifically for genres<br />
* Features: autocomplete, canonical names, <br />
* Micah is offering genre data based on Wikipedia<br />
<br />
==== Product offering ====<br />
<br />
:'''This is not a complete or final model and not official!'''<br />
<br />
* "Drug dealer" model - free the first time, get addicted, pay for easy further access<br />
* Data dumps (twice a week)<br />
** Public $100 (suggested)<br />
** CC-NC $250 (paying for commercial use of NC-licensed data)<br />
* Live data feed ($/mth)<br />
** Twice-weekly $500<br />
** Daily $1500<br />
** Hourly $2500<br />
* Web service calls (flat fee)<br />
** 10K $10<br />
** 25K $20<br />
** 50K $30<br />
** 100K $50<br />
* Virtual machine<br />
** VM + Data $300<br />
** VM + Data + Search $400<br />
* Tagger Affiliate Program<br />
** TBD: Clarification of the scope of the program<br />
** TBD: Web service referral kickbacks<br />
<br />
=== Sunday (Oct 16) ===<br />
<br />
==== 3rd party data set integration ====<br />
<br />
* '''Lyrics from musiXmatch'''<br />
** Daily updates, but will start with weekly ones<br />
** Updates will include all MB/mXm matched lyrics<br />
** Lyrics can also be added from the edit interface<br />
** How do we best use their lyrics data?<br />
*** '''Solution''': Link to mxm via a lyrics icon in the tracklist and a proper link on the recording page<br />
* '''See also Monday'''<br />
<br />
==== Tracklist/medium overhaul with video support ====<br />
<br />
* Videos are becoming increasingly common as a music release medium (e.g. on iTunes)<br />
* Will require major schema changes and a look at the long-term goals of MusicBrainz<br />
* '''Solution''': Table the discussion for now and reopen it in a different setting with developers<br />
<br />
==== Group multiple release events (country+date) together ====<br />
<br />
* There is a need to group multiple releases together when each release is the exact same - just released in a different country<br />
* Due to tradition, different countries/regions issue releases on different days of the week<br />
* '''Solution''': Allow multiple release events per release when the label, barcode, and tracklist are the same<br />
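One way to picture the solution is a release that carries a list of (country, date) events, with two releases folded together only when label, barcode, and tracklist match. This is a sketch in Python with assumed class and field names, not the actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReleaseEvent:
    country: str  # e.g. a two-letter country code
    date: str     # ISO 8601 date, possibly partial ("2011" or "2011-10")


@dataclass
class Release:
    title: str
    label: str
    barcode: str
    tracklist: Tuple[str, ...]
    events: List[ReleaseEvent] = field(default_factory=list)


def merge_if_same(a: Release, b: Release) -> Optional[Release]:
    """Fold b's events into a when label, barcode, and tracklist all match."""
    if (a.label, a.barcode, a.tracklist) != (b.label, b.barcode, b.tracklist):
        return None  # genuinely different releases stay separate
    merged = list(a.events)
    merged += [e for e in b.events if e not in merged]
    return Release(a.title, a.label, a.barcode, a.tracklist, merged)
```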
<br />
==== Date improvements ====<br />
<br />
* Unknown end date (dead/disbanded, but we don't exactly know when)<br />
** '''Solution''': Add a column to the date table to specifically state that the entity is dead/disbanded, but we don't know when<br />
* Fuzzy dates (16th century composer edge cases)<br />
** '''Solution''': Use a 'century' column<br />
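The two solutions can be pictured as one extra flag and one extra field on a partial date. The sketch below uses Python dataclasses; the names `ended` and `century` are assumed stand-ins for the proposed columns.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PartialDate:
    year: Optional[int] = None
    month: Optional[int] = None
    day: Optional[int] = None
    century: Optional[int] = None  # assumed column, e.g. 16 for the 16th century


@dataclass
class Lifespan:
    begin: PartialDate
    end: PartialDate
    ended: bool = False  # assumed flag: dead/disbanded even when `end` is empty


def describe_end(span: Lifespan) -> str:
    """Pick the most precise statement the stored data supports."""
    if span.end.year is not None:
        return str(span.end.year)
    if span.end.century is not None:
        return f"century {span.end.century}"
    return "ended, date unknown" if span.ended else "ongoing or unknown"
```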
<br />
==== Data quality ====<br />
<br />
* http://wiki.musicbrainz.org/User:Wizzcat/Data_Quality_Extension<br />
* http://wiki.xabbu.net/Data_quality<br />
* Current implementation of data quality has a bad name, is poorly defined, and isn't used<br />
* What do we want to solve?<br />
** Explicitly state that a release has been reviewed/verified<br />
*** '''Solution''': +1 / -1 votes that decay in weight over some function of time<br />
** Protect against ignorance (The White Album vs The Beatles)<br />
*** '''Solution''': Add a 'Protected' flag (i.e. edits expire by default)<br />
** Measure of completeness<br />
*** '''Solution''': "Completed as per liner notes" checkbox that is accessible via the WS<br />
* '''Conclusion''': High quality is the protected flag, default quality is default, low quality goes away<br />
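The notes leave the decay as "some function of time". As one possible reading, the sketch below weights each +1 / -1 vote by exponential decay with a configurable half-life; the choice of exponential decay and the one-year default are assumptions.

```python
import math
from typing import Iterable, Tuple


def review_score(votes: Iterable[Tuple[int, float]], now: float,
                 half_life_days: float = 365.0) -> float:
    """Sum +1/-1 votes, each weighted by exponential decay of its age.

    `votes` holds (value, cast_at) pairs; `cast_at` and `now` are in days.
    """
    decay = math.log(2) / half_life_days
    return sum(value * math.exp(-decay * (now - cast_at)) for value, cast_at in votes)
```

With this weighting a vote counts half as much after one half-life, so a release that was verified long ago gradually drifts back toward "unreviewed".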
<br />
==== Release group attributes ====<br />
<br />
* Currently, 'remix' and 'soundtrack' are at the same meta level as 'album' or 'lp'<br />
* '''Conclusion''': Postponed till a proposal can be drawn up<br />
<br />
==== Reports ====<br />
<br />
* Improve the explanation that is shown at the top of each report's page<br />
* Improve report flow (e.g. ability to hide items from reports)<br />
<br />
* Allow marking an entry as 'done'<br />
* Default report list should filter out all entries marked as done with more than X votes<br />
* Allow viewing the report with the filtered out entries<br />
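The filtering rule above can be sketched in a few lines of Python. The field name `done_votes` and the function name are assumptions, and the threshold corresponds to the "X" left open in the notes.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ReportEntry:
    name: str
    done_votes: int = 0  # how many editors marked the entry as done


def visible_entries(entries: Iterable[ReportEntry], threshold: int,
                    show_filtered: bool = False) -> List[ReportEntry]:
    """Hide entries with more than `threshold` done-votes unless asked not to."""
    if show_filtered:
        return list(entries)
    return [e for e in entries if e.done_votes <= threshold]
```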
<br />
==== Site notifications + subscriptions ====<br />
<br />
* List all emails in a site inbox<br />
* Create a dynamic list of subscribed artists with open edits<br />
<br />
==== Testing ====<br />
<br />
* As finances improve, employ a dedicated person who will lead the testing<br />
<br />
==== Pagination ====<br />
<br />
* Filter on release group properties<br />
* Use infinite scroll<br />
* Be able to reorder, add, remove, and sort columns<br />
<br />
==== Medium attributes (12" vinyl, 8 cm CD) ====<br />
<br />
* Switch from a hierarchical tree to attributes<br />
<br />
==== Music dashboard ====<br />
<br />
* What information should we show?<br />
** http://www.discogs.com/<br />
** http://www.freebase.com/<br />
** http://www.soundunwound.com/<br />
<br />
==== Instrument tree ====<br />
<br />
* Change from a tree to a graph<br />
** Flatten the graph into a tree and allow an instrument to have multiple parents<br />
* Add model support to the instrument tree<br />
* Importing Freebase data<br />
** How often do we sync the data?<br />
** How do we reconcile differences in data?<br />
** How often do deletes/merges/changes happen?<br />
* Going forward, if we need a new instrument we would add it to Freebase<br />
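Letting an instrument have several parents turns the tree into a directed acyclic graph. The Python sketch below stores each instrument's parent set and walks upward to collect all ancestors; the class name and the instrument names in the usage note are illustrative only.

```python
from typing import Dict, Set


class InstrumentGraph:
    """Instruments keyed by name, each with a set of parent instruments."""

    def __init__(self) -> None:
        self.parents: Dict[str, Set[str]] = {}

    def add(self, instrument: str, *parents: str) -> None:
        self.parents.setdefault(instrument, set()).update(parents)
        for parent in parents:
            self.parents.setdefault(parent, set())  # make sure parents exist

    def ancestors(self, instrument: str) -> Set[str]:
        """Collect every instrument reachable by following parent links."""
        seen: Set[str] = set()
        stack = list(self.parents.get(instrument, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen
```

For example, after `g.add("guitar", "strings")` and `g.add("electric guitar", "guitar", "electronic")`, the call `g.ancestors("electric guitar")` contains both the guitar branch and the electronic branch, which a strict tree could not express.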
<br />
==== Universal Music Group International ====<br />
<br />
* "I am very happy to declare Universal's support for MusicBrainz and its community" - Innovation Manager at Universal Music Group International<br />
<br />
==== Release editor ====<br />
<br />
* Default tracklist page shows the advanced view<br />
* For new releases you see the add disc dialog<br />
* The track parser moves into the add disc dialog<br />
* There needs to be a way to reparse from the advanced view<br />
<br />
==== Wiki ====<br />
<br />
* Remove unneeded extensions<br />
* Update to Ubuntu's MediaWiki package<br />
* Get the API working<br />
* Install wiki at /wiki/Article and then redirect to /Article<br />
* Write a wiki test suite<br />
<br />
=== Monday (Oct 17th) ===<br />
<br />
==== Initial dates on release group ====<br />
<br />
* Last.fm would like to create 'best of the decade' lists and filter out data such as the 2009 re-release of The Beatles <br />
* Currently, release group dates match the date of the earliest release in that group, but in the case of re-releases we often only have data on the modern release and are missing (for example) the original '70s vinyl release<br />
* '''Solution''': Add an editable initial date field at the release group level<br />
** The date field will default to empty because anyone who wants the group date can guess via its earliest release (like MB does now)<br />
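The fallback described here (use the editable initial date when it is set, otherwise the earliest release) can be sketched as follows in Python. The function name is an assumption, and dates are assumed to be ISO 8601 strings, which sort chronologically as plain text.

```python
from typing import Iterable, Optional


def release_group_date(initial_date: Optional[str],
                       release_dates: Iterable[Optional[str]]) -> Optional[str]:
    """Prefer the editable initial date; otherwise guess from the earliest release."""
    if initial_date:
        return initial_date
    known = [d for d in release_dates if d]  # skip releases without a date
    return min(known) if known else None
```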
<br />
==== musiXmatch ====<br />
<br />
* Short description of musiXmatch expectations<br />
* Feedback from MB Editors on musiXmatch contributions<br />
* Editors' willingness to help musiXmatch (IRC channel)<br />
* musiXmatch will report unexpected Edit Interface behaviour (for example Split Artists while adding a Release)<br />
* Change usernames to make them easily identifiable (add customer name to username)<br />
* Provide guidelines for interactions between MB Editors and external Editors<br />
<br />
<br />
==== 3rd party data set integration ====<br />
<br />
* How do we properly link to different data sets? (e.g., musiXmatch, soundunwound, last.fm, etc.)<br />
** '''Solution''': Build a generic framework that allows us to import any external data set and reconcile it with the data we have<br />
** Use a second "integration database" that contains all raw data from external sources (label feeds, partners, etc.)<br />
** Import data into the main database with a de-duplication script, but do not remove any of the original raw data (this allows further parsing in the future)<br />
** Also look into Google Refine for manual reconciliation: http://code.google.com/p/google-refine/<br />
* A long term goal is to create an editing API that we can gradually open up to our data partners and the ecosystem<br />
** This will allow partners like Zvooq to edit data on their website, but feed the changes back to the rest of the MB ecosystem<br />
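The import step (keep every raw record, but add only new items to the main database) might look like the Python sketch below. The in-memory list and dict stand in for the two databases, and the matching key of case-folded artist and title is an assumed, deliberately naive de-duplication rule.

```python
from typing import Callable, Dict, Hashable, Iterable, List


def default_key(record: dict) -> Hashable:
    """Naive match key: case-insensitive artist and title."""
    return (record["artist"].casefold(), record["title"].casefold())


def import_records(raw_store: List[dict], main_db: Dict[Hashable, dict],
                   records: Iterable[dict],
                   key: Callable[[dict], Hashable] = default_key) -> int:
    """Append every record to the raw store; add only unseen keys to main_db."""
    added = 0
    for record in records:
        raw_store.append(record)  # raw data is never discarded
        k = key(record)
        if k not in main_db:
            main_db[k] = record
            added += 1
    return added
```

Because the raw store keeps everything, a better matching rule can be rerun over it later without asking the partner for the data again.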
<br />
== Feature prioritization ==<br />
<br />
Feature (votes)<br />
<br />
# Edit system (9)<br />
# Group multiple release events together (6)<br />
# Data quality (6)<br />
# 3rd party data set integration (5)<br />
# Single sign on & password security (5)<br />
# Instrument tree (4)<br />
# Genres (4)<br />
# Medium attributes (4)<br />
# Release group attributes (3)<br />
# Music dashboard (2)<br />
# Tracklist/medium overhaul with video support (1)<br />
# Pagination (1)<br />
# Site notifications (1)<br />
# Report improvements (1)<br />
# Date improvements (0)<br />
# Auto-editor elections (0)</div>Valerio.paolinihttps://wiki.musicbrainz.org/index.php?title=MusicBrainz_Summit/11/Session_Notes&diff=48619MusicBrainz Summit/11/Session Notes2011-10-18T12:54:48Z<p>Valerio.paolini: /* 3rd party data set integration */</p>
<hr />
<div>== Attendees ==<br />
<br />
* Kuno Woudt (warp)<br />
* Pavan Chander (navap)<br />
* Rob Kaye (ruaok)<br />
* Nikki<br />
* Oliver Charles (ocharles)<br />
* Jamie McDonald (jdamcd)<br />
* Nicolás Tamargo (reosarevok)<br />
* CatCat<br />
* Per Øyvind Øygard (Wizzcat)<br />
* Paul Taylor (ijabz)<br />
* Mathias Kunter (mathiaskunter)<br />
* Hilbert Woudt (monedula)<br />
<br />
Sponsor representatives<br />
<br />
* musiXmatch: Valerio Paolini<br />
* Last.fm: Adrian Woodhead<br />
* Google/Freebase: Micah Saul (micahsaul)<br />
* Zvooq: Andrey Popp (andreypopp)<br />
* BBC: Dave Evans (djce)<br />
<br />
== Customer introductions ==<br />
<br />
=== Last.fm ===<br />
<br />
* They have had a lot of personnel changes over the last few years, but would like to re-establish a relationship with MB<br />
* They are looking to switch to the NGS schema by the end of the year<br />
* They would like to use MBIDs internally to make communication between incoming data sets easier<br />
* They will consider sharing partial label feed/data<br />
* They might actually solve their artist disambiguation issue soon-ish<br />
<br />
=== Zvooq ===<br />
<br />
* They are a Spotify competitor in Russia that focuses on music released worldwide<br />
<br />
=== musiXmatch ===<br />
<br />
* They are a lyrics database<br />
* They hold a worldwide license from Sony, Universal, EMI, Warner, BMG, and Kobalt<br />
<br />
=== Freebase ===<br />
<br />
* Freebase is a big data repository of various data sets covering movies, music, sports, people, locations, and others<br />
* http://freebase.org<br />
<br />
=== BBC ===<br />
<br />
* They are looking to finally make the switch to NGS<br />
* Their music news website now uses ws/2<br />
* They outsource their album reviews and MB data entry to Unique Broadcasting Company<br />
<br />
== Discussions ==<br />
<br />
=== Friday (Oct 14) ===<br />
<br />
==== Single sign on & password security ====<br />
<br />
Goals<br />
<br />
* Not storing plaintext passwords<br />
* Not having knowable (i.e. reversible) passwords<br />
* Not transmitting passwords in the clear<br />
* Single sign on<br />
<br />
Questions<br />
<br />
* What specific password issues are we trying to solve?<br />
<br />
Discussed proposals<br />
<br />
* Implement OpenID<br />
* Using digest authentication (still requires storing and transferring the clear text password)<br />
* Using SSL (requires updating web service libraries)<br />
* Using a separate LDAP server (password no longer in MB database and stored elsewhere, also allows for possible single sign on integration)<br />
<br />
'''Conclusion:''' Use LDAP and phase in SSL to increase password security. Bonus: LDAP makes single sign on possible.<br />
<br />
=== Saturday (Oct 15) ===<br />
<br />
==== Cover art archive ====<br />
<br />
* Universal is considering handing over their entire cover art archive to us<br />
* Labels actually don't own copyright on cover art<br />
* There are potentially messy legal issues with using cover art<br />
* The Internet Archive functions as a library and can act as a 'cover art shelter' for us<br />
* Possible process:<br />
** A release's MBID can be used to receive a cover art image<br />
** If you know a release's MBID you can do a GET and receive a cover art image<br />
** Track cover art uploads by user and also use the regular voting process<br />
** Images will be provided in a hi-res version (~15 MB) and a low-res version (500 px)<br />
<br />
Questions<br />
<br />
* Does the user have to upload JPEG or can the server transcode?<br />
* What status code will we return when a 'darkened' image exists but we're not allowed to display it?<br />
* If we get cover art from Universal, how do we match each image up with a release?<br />
* How do we handle a release group with many releases (i.e., do we use the same image)?<br />
* How do we handle multiple images (e.g. front, back, obi, liner notes, CD faces, etc.)?<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Archive<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Wishlist<br />
<br />
==== Data quality ====<br />
<br />
* '''See Sunday'''<br />
<br />
==== Edit system ====<br />
<br />
Goals<br />
<br />
* Allow grouping edits together and bulk submitting them<br />
* Allow editing an edit and resubmitting it without impacting the edit queue<br />
* Allow editing via the web service, eventually<br />
<br />
Bookbrainz<br />
<br />
* Oliver's pet project and testing ground for future MB framework changes<br />
* BB emulates git<br />
** It allows building a stack of changes and then submitting all of them together in one 'commit'<br />
** It takes a snapshot of the data at the time. We don't have that with historical edits so migrating old edits is a problem<br />
<br />
Further reading regarding <br />
<br />
* http://en.wikipedia.org/wiki/Inter-rater_reliability (especially the links at the end)<br />
* http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00074<br />
<br />
==== Web service ====<br />
<br />
* Roll out 3scale and move all commercial users over to a pay2play system with different packages<br />
* Non-commercial users would use the free2play rate-limited system with the option of paying for better access<br />
<br />
==== Audio fingerprinting ====<br />
<br />
* We all hate PUIDs and we need to move forward<br />
* Acoustid looks very promising: it's open source, file oriented, and has strong ties with MB<br />
* http://acoustid.org/<br />
* May be possible to bulk fingerprint some data sources<br />
<br />
==== Concert support ====<br />
<br />
* Do we go with one provider or several?<br />
** Start with Songkick, but stay open to the option of different providers - especially to gain global coverage<br />
* Do we concentrate on future events or archived events?<br />
** Initially link to Songkick for future events<br />
** Create a new setlist entity for past events<br />
** Create a new venue entity<br />
* Need to consider Location; it would be useful for artists as well as for events.<br />
<br />
==== Tracks vs recordings (vs works) ====<br />
<br />
* Similar to the remaster issue<br />
* Do we add further levels of abstraction?<br />
** No. We're already saturated with entities. We need better definitions<br />
** ...and we still haven't totally defined works<br />
* Do we count silence as a divergence point?<br />
<br />
==== Service segregation ====<br />
<br />
* Announce the closing of trac (and all its tickets) and the deprecation of subversion<br />
* [http://svn.musicbrainz.org svn.musicbrainz.org] will remain as an interface for the search server<br />
* Consider replacing gitweb with GitHub in a more official capacity<br />
<br />
==== Genres ====<br />
<br />
* A new field that is to be used specifically for genres<br />
* Features: autocomplete, canonical names, <br />
* Micah is offering genre data based on Wikipedia<br />
<br />
==== Product offering ====<br />
<br />
:'''This is not a complete or final model and not official!'''<br />
<br />
* "Drug dealer" model - free the first time, get addicted, pay for easy further access<br />
* Data dumps (twice a week)<br />
** Public $100 (suggested)<br />
** CC-NC $250 (paying for commercial use of NC-licensed data)<br />
* Live data feed ($/mth)<br />
** Twice-weekly $500<br />
** Daily $1500<br />
** Hourly $2500<br />
* Web service calls (flat fee)<br />
** 10K $10<br />
** 25K $20<br />
** 50K $30<br />
** 100K $50<br />
* Virtual machine<br />
** VM + Data $300<br />
** VM + Data + Search $400<br />
* Tagger Affiliate Program<br />
** TBD: Clarification of the scope of the program<br />
** TBD: Web service referral kickbacks<br />
<br />
=== Sunday (Oct 16) ===<br />
<br />
==== 3rd party data set integration ====<br />
<br />
* '''Lyrics from musiXmatch'''<br />
** Daily updates, but will start with weekly ones<br />
** Updates will include all MB/mXm matched lyrics<br />
** Lyrics can also be added from the edit interface<br />
** How do we best use their lyrics data?<br />
*** '''Solution''': Link to mxm via a lyrics icon in the tracklist and a proper link on the recording page<br />
* '''See also Monday'''<br />
<br />
==== Tracklist/medium overhaul with video support ====<br />
<br />
* Videos are becoming increasingly common as a music release medium (e.g. on iTunes)<br />
* Will require major schema changes and a look at the long-term goals of MusicBrainz<br />
* '''Solution''': Table the discussion for now and reopen it in a different setting with developers<br />
<br />
==== Group multiple release events (country+date) together ====<br />
<br />
* There is a need to group multiple releases together when each release is the exact same - just released in a different country<br />
* Due to tradition, different countries/regions issue releases on different days of the week<br />
* '''Solution''': Allow multiple release events per release when the label, barcode, and tracklist are the same<br />
<br />
==== Date improvements ====<br />
<br />
* Unknown end date (dead/disbanded, but we don't exactly know when)<br />
** '''Solution''': Add a column to the date table to specifically state that the entity is dead/disbanded, but we don't know when<br />
* Fuzzy dates (16th century composer edge cases)<br />
** '''Solution''': Use a 'century' column<br />
<br />
==== Data quality ====<br />
<br />
* http://wiki.musicbrainz.org/User:Wizzcat/Data_Quality_Extension<br />
* http://wiki.xabbu.net/Data_quality<br />
* Current implementation of data quality has a bad name, is poorly defined, and isn't used<br />
* What do we want to solve?<br />
** Explicitly state that a release has been reviewed/verified<br />
*** '''Solution''': +1 / -1 votes that decay in weight over some function of time<br />
** Protect against ignorance (The White Album vs The Beatles)<br />
*** '''Solution''': Add a 'Protected' flag (i.e. edits expire by default)<br />
** Measure of completeness<br />
*** '''Solution''': "Completed as per liner notes" checkbox that is accessible via the WS<br />
* '''Conclusion''': High quality is the protected flag, default quality is default, low quality goes away<br />
<br />
==== Release group attributes ====<br />
<br />
* Currently, 'remix' and 'soundtrack' are at the same meta level as 'album' or 'lp'<br />
* '''Conclusion''': Postponed till a proposal can be drawn up<br />
<br />
==== Reports ====<br />
<br />
* Improve the explanation that is shown at the top of each report's page<br />
* Improve report flow (e.g. ability to hide items from reports)<br />
<br />
* Allow marking an entry as 'done'<br />
* Default report list should filter out all entries marked as done with more than X votes<br />
* Allow viewing the report with the filtered out entries<br />
<br />
==== Site notifications + subscriptions ====<br />
<br />
* List all emails in a site inbox<br />
* Create a dynamic list of subscribed artists with open edits<br />
<br />
==== Testing ====<br />
<br />
* As finances improve, employ a dedicated person who will lead the testing<br />
<br />
==== Pagination ====<br />
<br />
* Filter on release group properties<br />
* Use infinite scroll<br />
* Be able to reorder, add, remove, and sort columns<br />
<br />
==== Medium attributes (12" vinyl, 8 cm CD) ====<br />
<br />
* Switch from a hierarchical tree to attributes<br />
<br />
==== Music dashboard ====<br />
<br />
* What information should we show?<br />
** http://www.discogs.com/<br />
** http://www.freebase.com/<br />
** http://www.soundunwound.com/<br />
<br />
==== Instrument tree ====<br />
<br />
* Change from a tree to a graph<br />
** Flatten the graph into a tree and allow an instrument to have multiple parents<br />
* Add model support to the instrument tree<br />
* Importing Freebase data<br />
** How often do we sync the data?<br />
** How do we reconcile differences in data?<br />
** How often do deletes/merges/changes happen?<br />
* Going forward, if we need a new instrument we would add it to Freebase<br />
<br />
==== Universal Music Group International ====<br />
<br />
* "I am very happy to declare Universal's support for MusicBrainz and its community" - Innovation Manager at Universal Music Group International<br />
<br />
==== Release editor ====<br />
<br />
* Default tracklist page shows the advanced view<br />
* For new releases you see the add disc dialog<br />
* The track parser moves into the add disc dialog<br />
* There needs to be a way to reparse from the advanced view<br />
<br />
==== Wiki ====<br />
<br />
* Remove unneeded extensions<br />
* Update to Ubuntu's MediaWiki package<br />
* Get the API working<br />
* Install wiki at /wiki/Article and then redirect to /Article<br />
* Write a wiki test suite<br />
<br />
=== Monday (Oct 17th) ===<br />
<br />
==== Initial dates on release group ====<br />
<br />
* Last.fm would like to create 'best of the decade' lists and filter out data such as the 2009 re-release of The Beatles <br />
* Currently, release group dates match the date of the earliest release in that group, but in the case of re-releases we often only have data on the modern release and are missing (for example) the original '70s vinyl release<br />
* '''Solution''': Add an editable initial date field at the release group level<br />
** The date field will default to empty because anyone who wants the group date can guess via its earliest release (like MB does now)<br />
<br />
<br />
<br />
==== 3rd party data set integration ====<br />
<br />
* How do we properly link to different data sets? (e.g., musiXmatch, soundunwound, last.fm, etc.)<br />
** '''Solution''': Build a generic framework that allows us to import any external data set and reconcile it with the data we have<br />
** Use a second "integration database" that contains all raw data from external sources (label feeds, partners, etc.)<br />
** Import data into the main database with a de-duplication script, but do not remove any of the original raw data (this allows further parsing in the future)<br />
** Also look into Google Refine for manual reconciliation: http://code.google.com/p/google-refine/<br />
* A long term goal is to create an editing API that we can gradually open up to our data partners and the ecosystem<br />
** This will allow partners like Zvooq to edit data on their website, but feed the changes back to the rest of the MB ecosystem<br />
<br />
== Feature prioritization ==<br />
<br />
Feature (votes)<br />
<br />
# Edit system (9)<br />
# Group multiple release events together (6)<br />
# Data quality (6)<br />
# 3rd party data set integration (5)<br />
# Single sign on & password security (5)<br />
# Instrument tree (4)<br />
# Genres (4)<br />
# Medium attributes (4)<br />
# Release group attributes (3)<br />
# Music dashboard (2)<br />
# Tracklist/medium overhaul with video support (1)<br />
# Pagination (1)<br />
# Site notifications (1)<br />
# Report improvements (1)<br />
# Date improvements (0)<br />
# Auto-editor elections (0)</div>Valerio.paolinihttps://wiki.musicbrainz.org/index.php?title=MusicBrainz_Summit/11/Session_Notes&diff=48618MusicBrainz Summit/11/Session Notes2011-10-18T12:47:51Z<p>Valerio.paolini: /* musiXmatch */</p>
<hr />
<div>== Attendees ==<br />
<br />
* Kuno Woudt (warp)<br />
* Pavan Chander (navap)<br />
* Rob Kaye (ruaok)<br />
* Nikki<br />
* Oliver Charles (ocharles)<br />
* Jamie McDonald (jdamcd)<br />
* Nicolás Tamargo (reosarevok)<br />
* CatCat<br />
* Per Øyvind Øygard (Wizzcat)<br />
* Paul Taylor (ijabz)<br />
* Mathias Kunter (mathiaskunter)<br />
* Hilbert Woudt (monedula)<br />
<br />
Sponsor representatives<br />
<br />
* musiXmatch: Valerio Paolini<br />
* Last.fm: Adrian Woodhead<br />
* Google/Freebase: Micah Saul (micahsaul)<br />
* Zvooq: Andrey Popp (andreypopp)<br />
* BBC: Dave Evans (djce)<br />
<br />
== Customer introductions ==<br />
<br />
=== Last.fm ===<br />
<br />
* They have had a lot of personnel changes over the last few years, but would like to re-establish a relationship with MB<br />
* Are looking to switch to NGS schema by the end of the year<br />
* They would like to use MBIDs internally to make communication between incoming data sets easier<br />
* Will consider sharing partial label feed/data<br />
* Might actually solve their artist disambiguation issue soon..ish<br />
<br />
=== Zvooq ===<br />
<br />
* They are a Spotify competitor in Russia that focuses on music released worldwide<br />
<br />
=== musiXmatch ===<br />
<br />
* They are a lyrics database<br />
* Worldwide license from Sony, Universal, EMI, Warner, BMG, Kobalt<br />
<br />
=== Freebase ===<br />
<br />
* Freebase is a big data repository of various data sets covering movies, music, sports, people, locations, and others<br />
* http://freebase.org<br />
<br />
=== BBC ===<br />
<br />
* They are looking to finally make the switch to NGS<br />
* Their music news website now uses ws/2<br />
* They outsource their album reviews and MB data entry to Unique Broadcasting Company<br />
<br />
== Discussions ==<br />
<br />
=== Friday (Oct 14) ===<br />
<br />
==== Single sign on & password security ====<br />
<br />
Goals<br />
<br />
* Not storing plaintext passwords<br />
* Not having knowable (i.e. reversible) passwords<br />
* Not transmitting passwords in the clear<br />
* Single sign on<br />
<br />
Questions<br />
<br />
* What specific password issues are we trying to solve?<br />
<br />
Discussed proposals<br />
<br />
* Implement OpenID<br />
* Using digest authentication (still requires storing and transferring the clear text password)<br />
* Using SSL (requires updating web service libraries)<br />
* Using a separate LDAP server (password no longer in MB database and stored elsewhere, also allows for possible single sign on integration)<br />
<br />
'''Conclusion:''' Use LDAP and phase in SSL to increase password security. Bonus: LDAP makes single sign on possible.<br />
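<br />
The first two goals above (no plaintext and no reversible passwords) can be illustrated with salted, iterated hashing. This is only a minimal sketch and not the LDAP setup the summit settled on; the function names and the iteration count are assumptions chosen for illustration.<br />

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor, not a value from the summit


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest). Only these two values are stored, never the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Because only the salt and digest are kept, a leaked database does not directly reveal passwords; transport security (the SSL proposal) is still needed separately.<br />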
<br />
=== Saturday (Oct 15) ===<br />
<br />
==== Cover art archive ====<br />
<br />
* Universal is considering handing over their entire cover art archive to us<br />
* Labels actually don't own copyright on cover art<br />
* There are potential messy legal issues to using cover art<br />
* The Internet Archive functions as a library and can act as a 'cover art shelter' for us<br />
* Possible process:<br />
** A release's MBID can be used to receive a cover art image<br />
** If you know a release's MBID you can do a GET and receive a cover art image<br />
** Track cover art uploads by user and also use regular voting process<br />
** Images will be provided in a hi-res version (~15 MB) and a low-res version (500 px)<br />
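<br />
The MBID-based lookup described above could be sketched as a simple URL builder. The host name, the path layout, and the size names below are hypothetical, chosen only to illustrate the idea; the notes do not define the actual URL scheme.<br />

```python
from urllib.parse import quote

BASE_URL = "https://coverart.example.org"  # hypothetical host, not from the notes


def cover_art_url(release_mbid: str, size: str = "full") -> str:
    """Build the URL a client would GET to fetch a release's front cover.

    size is "full" for the hi-res image or "500" for the 500 px version.
    """
    if size not in ("full", "500"):
        raise ValueError("size must be 'full' or '500'")
    path = "/release/" + quote(release_mbid) + "/front"
    if size == "500":
        path += "-500"  # assumed suffix for the low-res variant
    return BASE_URL + path
```

A client that already knows a release's MBID needs no search step: it builds the URL and issues a single GET.<br />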
<br />
Questions<br />
<br />
* Does the user have to upload JPEG or can the server transcode?<br />
* What status code will we return when a 'darkened' image exists but we're not allowed to display it?<br />
* If we get cover art from Universal, how do we match each image up with a release?<br />
* How do we handle a release group with many releases (i.e., do we use the same image?)<br />
* How do we handle multiple images (e.g. front, back, obi, liner notes, cd faces, etc.)<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Archive<br />
<br />
http://wiki.musicbrainz.org/Cover_Art_Wishlist<br />
<br />
==== Data quality ====<br />
<br />
* '''See Sunday'''<br />
<br />
==== Edit system ====<br />
<br />
Goals<br />
<br />
* Allow grouping edits together and bulk submitting them<br />
* Allow editing an edit and resubmitting it without impacting the edit queue<br />
* Allow editing via the web service, eventually<br />
<br />
BookBrainz<br />
<br />
* Oliver's pet project and testing ground for future MB framework changes<br />
* BB emulates git<br />
** It allows building a stack of changes and then submitting all of them together in one 'commit'<br />
** It takes a snapshot of the data at the time. We don't have that with historical edits so migrating old edits is a problem<br />
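<br />
The stack-then-commit idea above can be sketched as follows. This is an illustrative model only, not BookBrainz code; the class and method names are assumptions.<br />

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EditSession:
    """Collects staged changes and applies them together as one 'commit'."""
    data: Dict[str, str]
    staged: List[Tuple[str, str]] = field(default_factory=list)
    history: List[Dict[str, str]] = field(default_factory=list)

    def stage(self, key: str, value: str) -> None:
        # Staging does not touch the live data yet.
        self.staged.append((key, value))

    def commit(self) -> int:
        # Snapshot the data as it was before this commit is applied.
        self.history.append(dict(self.data))
        for key, value in self.staged:
            self.data[key] = value
        applied = len(self.staged)
        self.staged.clear()
        return applied
```

The snapshot taken at commit time is what historical MB edits lack, which is why migrating old edits into such a model is hard.<br />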
<br />
Further reading regarding <br />
<br />
* http://en.wikipedia.org/wiki/Inter-rater_reliability (especially the links at the end)<br />
* http://www.mitpressjournals.org/doi/abs/10.1162/COLI_a_00074<br />
<br />
==== Web service ====<br />
<br />
* Roll out 3scale and move all commercial users over to a pay2play system with different packages<br />
* Non-commercial users would use the free2play rate-limited system with the option of paying for better access<br />
<br />
==== Audio fingerprinting ====<br />
<br />
* We all hate PUIDs and we need to move forward<br />
* Acoustid looks very promising, it's open source, file oriented, and has strong ties with MB<br />
* http://acoustid.org/<br />
* May be possible to bulk fingerprint some data sources<br />
<br />
==== Concert support ====<br />
<br />
* Do we go with one provider or several?<br />
** Start with Songkick, but stay open to the option of different providers - especially to gain global coverage<br />
* Do we concentrate on future events or archived events?<br />
** Initially link to Songkick for future events<br />
** Create a new setlist entity for past events<br />
** Create a new venue entity<br />
* Need to consider location, which would be useful for artists as well as for events<br />
<br />
==== Tracks vs recordings (vs works) ====<br />
<br />
* Similar to the remaster issue<br />
* Do we add further levels of abstraction?<br />
** No. We're already saturated with entities. We need better definitions<br />
** ...and we still haven't totally defined works<br />
* Do we count silence as a divergence point?<br />
<br />
==== Service segregation ====<br />
<br />
* Announce the closing of trac (and all its tickets) and the deprecation of subversion<br />
* [http://svn.musicbrainz.org svn.musicbrainz.org] will remain as an interface for the search server<br />
* Consider replacing gitweb with github in a more official capacity<br />
<br />
==== Genres ====<br />
<br />
* A new field that is to be used specifically for genres<br />
* Features: autocomplete, canonical names<br />
* Micah is offering genre data based on Wikipedia<br />
<br />
==== Product offering ====<br />
<br />
:'''This is not a complete or final model and not official!'''<br />
<br />
* "Drug dealer" model - free the first time, get addicted, pay for easy further access<br />
* Data dumps (twice a week)<br />
** Public $100 (suggested)<br />
** CC-NC $250 (paying for commercial use of NC data)<br />
* Live data feed ($/mth)<br />
** Twice-weekly $500<br />
** Daily $1500<br />
** Hourly $2500<br />
* Web service calls (flat fee)<br />
** 10K $10<br />
** 25K $20<br />
** 50K $30<br />
** 100K $50<br />
* Virtual machine<br />
** VM + Data $300<br />
** VM + Data + Search $400<br />
* Tagger Affiliate Program<br />
** TBD: Clarification of the scope of the program<br />
** TBD: Web service referral kickbacks<br />
<br />
=== Sunday (Oct 16) ===<br />
<br />
==== 3rd party data set integration ====<br />
<br />
* '''See Monday'''<br />
<br />
==== Tracklist/medium overhaul with video support ====<br />
<br />
* Videos are becoming increasingly common as a music release medium (e.g. iTunes)<br />
* Will require major schema changes and a look at the long-term goals of MusicBrainz<br />
* '''Solution''': Table the discussion for now, reopen in a different setting with developers<br />
<br />
==== Group multiple release events (country+date) together ====<br />
<br />
* There is a need to group multiple releases together when they are identical and differ only in the country of release<br />
* Due to tradition, different countries/regions issue releases on different days of the week<br />
* '''Solution''': Allow multiple release events per release when the label, barcode, and tracklist are the same<br />
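<br />
The solution above could be modelled roughly like this. The sketch is not the MusicBrainz schema; the class names, field names, and the merge helper are assumptions for illustration.<br />

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReleaseEvent:
    country: str
    date: str  # ISO 8601 date string


@dataclass
class Release:
    label: str
    barcode: str
    tracklist: Tuple[str, ...]
    events: List[ReleaseEvent] = field(default_factory=list)


def merge_release_events(a: Release, b: Release) -> Optional[Release]:
    """Combine two releases into one with several events, if they qualify.

    Returns None when label, barcode, or tracklist differ.
    """
    if (a.label, a.barcode, a.tracklist) != (b.label, b.barcode, b.tracklist):
        return None
    events = list(a.events)
    for event in b.events:
        if event not in events:  # avoid duplicate country/date pairs
            events.append(event)
    return Release(a.label, a.barcode, a.tracklist, events)
```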
<br />
==== Date improvements ====<br />
<br />
* Unknown end date (dead/disbanded, but we don't exactly know when)<br />
** '''Solution''': Add a column to the date table to specifically state that the entity is dead/disbanded, but we don't know when<br />
* Fuzzy dates (16th century composer edge cases)<br />
** '''Solution''': Use a 'century' column<br />
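<br />
Both date solutions above can be pictured as extra fields alongside the usual begin and end dates. This is a sketch under assumed field names, not the actual table layout.<br />

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lifespan:
    begin_year: Optional[int] = None
    end_year: Optional[int] = None
    ended: bool = False                  # dead/disbanded even if end_year is unknown
    begin_century: Optional[int] = None  # fuzzy date, e.g. 16 for the 16th century

    def describe_begin(self) -> str:
        if self.begin_year is not None:
            return str(self.begin_year)
        if self.begin_century is not None:
            return "century " + str(self.begin_century)
        return "unknown"

    def describe_end(self) -> str:
        if self.end_year is not None:
            return str(self.end_year)
        return "ended, date unknown" if self.ended else "ongoing or unknown"
```

The separate flag matters because an empty end date alone cannot distinguish "still active" from "ended at an unknown time".<br />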
<br />
==== Data quality ====<br />
<br />
* http://wiki.musicbrainz.org/User:Wizzcat/Data_Quality_Extension<br />
* http://wiki.xabbu.net/Data_quality<br />
* Current implementation of data quality has a bad name, is poorly defined, and isn't used<br />
* What do we want to solve?<br />
** Explicitly state that a release has been reviewed/verified<br />
*** '''Solution''': +1 / -1 votes that decay in weight over some function of time<br />
** Protect against ignorance (The White Album vs The Beatles)<br />
*** '''Solution''': Add a 'Protected' flag (i.e. edits expire by default)<br />
** Measure of completeness<br />
*** '''Solution''': "Completed as per liner notes" checkbox that is accessible via the WS<br />
* '''Conclusion''': High quality is the protected flag, default quality is default, low quality goes away<br />
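<br />
The decaying +1 / -1 votes mentioned above could be scored as in the sketch below. The notes leave the decay function open; exponential decay with a one-year half-life is purely an assumption here, as are the function and parameter names.<br />

```python
from typing import Iterable, Tuple

HALF_LIFE_DAYS = 365.0  # assumed half-life; the notes do not specify one


def decayed_score(votes: Iterable[Tuple[int, float]], now_day: float,
                  half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Sum +1/-1 votes, halving each vote's weight every half_life_days.

    votes is an iterable of (value, day_cast) pairs.
    """
    total = 0.0
    for value, day_cast in votes:
        age = max(0.0, now_day - day_cast)  # votes from the future count fully
        total += value * 0.5 ** (age / half_life_days)
    return total
```

With this choice, a recent -1 outweighs an old +1, so a release that was verified long ago can lose its reviewed status once newer doubts come in.<br />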
<br />
==== Release group attributes ====<br />
<br />
* Currently, 'remix' and 'soundtrack' are at the same meta level as 'album' or 'lp'<br />
* '''Conclusion''': Postponed till a proposal can be drawn up<br />
<br />
==== Reports ====<br />
<br />
* Improve the explanation that is shown at the top of each report's page<br />
* Improve report flow (e.g. ability to hide items from reports)<br />
<br />
* Allow marking an entry as 'done'<br />
* Default report list should filter out all entries marked as done with more than X votes<br />
* Allow viewing the report with the filtered out entries<br />
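<br />
The filtering rule above can be sketched as a small function. The entry layout (a dictionary with a done_votes count) and the parameter names are assumptions for illustration.<br />

```python
from typing import Dict, List


def visible_entries(entries: List[Dict], max_done_votes: int,
                    show_filtered: bool = False) -> List[Dict]:
    """Return the report entries to display.

    Entries with more than max_done_votes 'done' votes are hidden by default;
    show_filtered=True returns the full report including them.
    """
    if show_filtered:
        return list(entries)
    return [e for e in entries if e["done_votes"] <= max_done_votes]
```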
<br />
==== Site notifications + subscriptions ====<br />
<br />
* List all emails in a site inbox<br />
* Create a dynamic list of subscribed artists with open edits<br />
<br />
==== Testing ====<br />
<br />
* As finances improve, employ a dedicated person that will lead the testing<br />
<br />
==== Pagination ====<br />
<br />
* Filter on release group properties<br />
* Use infinite scroll<br />
* Be able to reorder, add, remove, and sort columns<br />
<br />
==== Medium attributes (12" vinyl, 8 cm CD) ====<br />
<br />
* Switch from a hierarchical tree to attributes<br />
<br />
==== Music dashboard ====<br />
<br />
* What information should we show?<br />
** http://www.discogs.com/<br />
** http://www.freebase.com/<br />
** http://www.soundunwound.com/<br />
<br />
==== Instrument tree ====<br />
<br />
* Change from a tree to a graph<br />
** Flatten the graph into a tree and allow an instrument to have multiple parents<br />
* Add model support to the instrument tree<br />
* Importing Freebase data<br />
** How often do we sync the data?<br />
** How do we reconcile differences in data?<br />
** How often do deletes/merges/changes happen?<br />
* Going forward, if we need a new instrument we would add it to Freebase<br />
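<br />
Allowing an instrument to have multiple parents turns the tree into a directed graph. A minimal sketch of such a structure follows; the mapping layout and the sample instrument names used to exercise it are assumptions, not MusicBrainz or Freebase data.<br />

```python
from typing import Dict, List, Set


def ancestors(parents: Dict[str, List[str]], instrument: str) -> Set[str]:
    """Collect every ancestor of an instrument in a multi-parent graph.

    parents maps an instrument name to the list of its direct parents.
    """
    seen: Set[str] = set()
    stack = list(parents.get(instrument, []))
    while stack:
        node = stack.pop()
        if node not in seen:  # also guards against accidental cycles
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen
```

For example, with a mapping where "electric violin" has both "violin" and "electric instruments" as parents, the function returns both branches and everything above them.<br />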
<br />
==== Universal Music Group International ====<br />
<br />
* "I am very happy to declare Universal's support for MusicBrainz and its community" - Innovation Manager at Universal Music Group International<br />
<br />
==== Release editor ====<br />
<br />
* Default tracklist page shows the advanced view<br />
* For new releases you see the add disc dialog<br />
* The track parser moves into the add disc dialog<br />
* There needs to be a way to reparse from the advanced view<br />
<br />
==== Wiki ====<br />
<br />
* Remove unneeded extensions<br />
* Update to Ubuntu's MediaWiki package<br />
* Get the API working<br />
* Install wiki at /wiki/Article and then redirect to /Article<br />
* Write a wiki test suite<br />
<br />
=== Monday (Oct 17) ===<br />
<br />
==== Initial dates on release group ====<br />
<br />
* Last.fm would like to create 'best of the decade' lists and filter out data such as the 2009 re-release of The Beatles <br />
* Currently, release group dates match the date of the earliest release in that group, but in the case of re-releases we often only have data on the modern release and are missing (for example) the original '70s vinyl release<br />
* '''Solution''': Add an editable initial date field at the release group level<br />
** The date field will default to empty because anyone who wants the group date can guess via its earliest release (like MB does now)<br />
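<br />
The fallback behaviour described above can be sketched as follows. The function and parameter names are assumptions, and dates are taken to be ISO 8601 strings so that plain string comparison orders them.<br />

```python
from typing import List, Optional


def release_group_date(initial_date: Optional[str],
                       release_dates: List[Optional[str]]) -> Optional[str]:
    """Return the editable initial date if set, else the earliest known release date."""
    if initial_date:
        return initial_date
    known = [d for d in release_dates if d]  # skip releases with no date
    return min(known) if known else None
```

With only a 2009 re-release in the database, the fallback would return 2009; an editor filling in the initial date fixes that without needing the original release to be entered.<br />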
<br />
==== 3rd party data set integration ====<br />
<br />
* How do we properly link to different data sets? (e.g., musiXmatch, soundunwound, last.fm, etc.)<br />
** '''Solution''': Build a generic framework that allows us to import any external data set and reconcile it with the data we have<br />
** Use a second "integration database" that contains all raw data from external sources (label feeds, partners, etc.)<br />
** Import data into the main database with a de-duplication script, but do not remove any of the original raw data (this allows further parsing in the future)<br />
** Also look into Google Refine for manual reconciliation: http://code.google.com/p/google-refine/<br />
* A long term goal is to create an editing API that we can gradually open up to our data partners and the ecosystem<br />
** This will allow partners like Zvooq to edit data on their website, but feed the changes back to the rest of the MB ecosystem<br />
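<br />
The two-database import described above (keep all raw data, de-duplicate on the way into the main database) might look like this in outline. The record fields and the matching key are assumptions; real reconciliation would need far more careful matching.<br />

```python
from typing import Dict, Iterable, List, Tuple


def import_records(raw_store: List[dict],
                   main_db: Dict[Tuple[str, str], dict],
                   records: Iterable[dict]) -> int:
    """Archive every raw record, then add only unseen ones to main_db.

    Returns the number of records newly added to main_db.
    """
    added = 0
    for record in records:
        raw_store.append(dict(record))  # raw data is never discarded
        # Assumed matching key: normalised artist and title.
        key = (record["artist"].strip().lower(), record["title"].strip().lower())
        if key not in main_db:
            main_db[key] = record
            added += 1
    return added
```

Keeping the untouched raw copies means a better de-duplication script can be rerun later over the same inputs.<br />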
<br />
== Feature prioritization ==<br />
<br />
Feature (votes)<br />
<br />
# Edit system (9)<br />
# Group multiple release events together (6)<br />
# Data quality (6)<br />
# 3rd party data set integration (5)<br />
# Single sign on & password security (5)<br />
# Instrument tree (4)<br />
# Genres (4)<br />
# Medium attributes (4)<br />
# Release group attributes (3)<br />
# Music dashboard (2)<br />
# Tracklist/medium overhaul with video support (1)<br />
# Pagination (1)<br />
# Site notifications (1)<br />
# Report improvements (1)<br />
# Date improvements (0)<br />
# Auto-editor elections (0)</div>Valerio.paolini