Development/Summer of Code/2011: Difference between revisions

From MusicBrainz Wiki
((Imported from MoinMoin))
 
 
(62 intermediate revisions by 22 users not shown)
Line 1: Line 1:
The MetaBrainz Foundation has applied as a [http://code.google.com/soc Google Summer of Code] organization in 2011. This will allow MusicBrainz hackers to apply for the Summer of Code program and if accepted, get paid for hacking on MusicBrainz.
=Ideas for Google's Summer of Code=


This page lays out various ideas for projects that people can take on during their Summer of Code. If you have your own ideas for Summer of Code, please add them at the bottom of this page.
The MetaBrainz Foundation is applying as a [http://code.google.com/soc Google Summer of Code] organization this year. This will allow MusicBrainz hackers to apply for the Summer of Code program and if accepted, get paid for hacking on MusicBrainz.


All applications for Summer of Code must pass a community review process where the proposer must clearly define their idea and present it to the community at large. Proposers must be/become active members of the community and must adapt their proposals according to community feedback. If the community does not approve of the project, the project will not be accepted by the MetaBrainz Foundation. If your project makes it into the final round of consideration for acceptance, be ready for an interview and possibly even an entrance test to verify the skills claimed on your Summer of Code application.
This page discusses various ideas for projects that people can take on during their Summer of Code. If you have your own ideas for Summer of Code, please add them at the bottom of this page.


Furthermore, all projects must develop '''new''' features for MusicBrainz. Proposals for replacing existing and working projects for the sake of making them ''more open'' will not be accepted. Proposals for extending existing projects with new features have a much greater chance at being accepted.
All applications for Summer of Code must pass a community review process where the proposer must clearly define their idea and present it to the community at large. Proposers must be/become active members of the community and must adapt their proposals according to community feedback. If the community does not approve of the project, the project will not be accepted by Google or by the MetaBrainz Foundation.


We '''strongly''' encourage students to delve into MusicBrainz and provide their own ideas for Summer of Code. We're listing a few projects here that we care about greatly, but we're more excited to hear what students want to do!
Furthermore, all projects must develop '''new''' features for MusicBrainz. Proposals for replacing existing and working projects for the sake of making them ''more open'' will not be accepted. Proposals for extending existing projects with new features have a much greater chance at being accepted.


== Mentors ==
==Collaborative Filtering: Artist - Artist Relationships==
This year Robert Kaye, Oliver Charles and Kuno Woudt will be mentoring students. That's ruaok (Robert), warp (Kuno) and ocharles (Oliver) on IRC, if you want to come and speak to us first.


== Proposals ==
Currently MusicBrainz has very old and stale artist to artist relationship data. This data provides an indication of how closely (musically) two artists are related. The data we're currently using was derived from crawling the Gnutella network and determining relationship data based on what music people have in their music collection.


<br>
Rather than relying on external data sources, it would be best to rely on our internal data from the MusicBrainz project. Useful pieces of data include:
'''IMPORTANT NOTE''': All of these ideas should assume that they are going to be built on top of our [[Next Generation Schema]]. We will not accept proposals based on our old code base.
* various artist albums -- two artists that are on a compilation together are likely to be somewhat similar.
* search logs -- one user searching for a number of artists also gives an indication of similarity.
* artist subscription -- the artists that MusicBrainz users subscribe to should also yield some information.


=== Artist credit utilization ===
This project should figure out what data sources to use, write the code to collect data from these sources and then generate new artist-artist data.
Artists and artist pages are a very central convergence point for the data in MusicBrainz, but unfortunately the current UI doesn't fully capitalize on all the data available such as [[Artist Credit|artist credits]]. This proposal is to redesign the artist and related pages with the intention of providing more thorough access to artist credit data.


Artist credits offer two features: one is the ability to credit artists with various alternative names and still link everything back to the one artist, and the second is to be able to assign multiple artists to a given recording, release, release group, etc.
Skills required: Perl, Python and SQL


Possible outcomes of this proposal would be the ability to:
==User Interface improvements: Creative Commons Integration==
* "Filter" the data on the artist page based on what artist credit was used (similar to Discogs' artist name variation).
* Find what all data a specific collaboration artist credit is credited to (similar to the current server's collaboration artist pages).
* Select a "frequently used" artist credit when doing a lookup for an artist in the add release wizard.


=== Database size/growth visualization ===
Right now the support for indicating Creative Commons licenses inside of MusicBrainz is minimal and has not been widely used. This project would entail working with the Creative Commons to improve the current support. Also, people dedicating a new piece of music on the Creative Commons site should have some way of being passed over to MusicBrainz to enter metadata for their music into MusicBrainz. This would involve making a number of user interface improvements to MusicBrainz to make the site more friendly to Creative Commons users who are not steeped in the MusicBrainz philosophy.


Our current database size/growth visualization is quite lacking. We would love to have an improved means of inspecting the size/growth of our database over time. We would like to see:
Skills required: Perl, JavaScript and SQL
* How big the database is today
* How it has grown over all time or over a specific section of time.
* What is the rate of change of the data growth and can these changes be correlated to features or style policy changes?
* Anything else that gives us more insight into our database and the community of people working on that database.


=== Embeddable widgets ===
==User Interface improvements: Artist Page Redesign==


All the data in MusicBrainz is easily accessible to programmers, but it's not as easy to use the data if you're not a programmer. To allow third party websites (specifically bloggers) to use data from MusicBrainz we would like to create a set of widgets. Widgets will make it easier for someone writing an article about an artist, or a blog post reviewing a particular release to include correct metadata about the artist or release. It will also let their visitors (human or searchbot) know exactly which release or artist is being talked about.
The current artist/release pages are in dire need of being redesigned. Some work on this front was started in [[Artist Page Redesign|ArtistPageRedesign]], but it has since been abandoned. This work needs to be re-evaluated and freshened up. The implementation of this would require adding support for JSON in our current web service and lots of user interface work.


=== Proposal Tracker ===
Skills required: Perl, JavaScript and SQL


The StyleCouncil sometimes has a hard time keeping track of proposals. This project would develop a proposal tracker for the [[Proposals|Style Council]], either modified from an existing issue tracker or written from scratch. The goal is for anyone interested to be able to see:
==Implementation of the "Next Generation Schema"==


* the current status of a proposal (e.g. RFC/RFV/withdrawn/approved/implemented)
The current MusicBrainz database schema is limited and needs to be redesigned to capture all the information required to build a complete discography. Implementation of the [[Next Generation Schema|NextGenerationSchema]] will be a long-running project, but hopefully it will be possible to split the work into a few independent steps. One of these steps could be done during the SoC.
* when it was last discussed on the mailing list
* who is responsible for it
* the proposal itself


===Your idea here!===
Skills required: Perl and SQL


We really like getting completely different suggestions, so please do not feel ''at all'' limited to the above proposals. If there's something cool you think could be done with MusicBrainz and want to make it happen - please suggest away.
==Improved Statistics and Trivia==


====Idea:Embeddable Widgets====
The [http://musicbrainz.org/stats.html current database statistics] page is very simplistic. We would like to see a more comprehensive statistics gathering module complete with more intelligent data graphs that visualize the MusicBrainz data set over time. Part of this project would be a set of trivia pages that show useful information contained in the MusicBrainz database:
We could add widgets that embed different types of information, such as: artist profiles; an artist's latest or greatest songs; album cover art (images in the public domain); direct links to songs (for example, on Grooveshark); RSS feeds for artist news and concerts; and tune identification (HTML5 allows mic access). [[User:Divya|Divya]]
* Upcoming releases
* Recently deceased artists
* Useful statistics from [[Advanced Relationship|AdvancedRelationship]] links


====Idea====
For more details on this project idea, please see [[Data Trivia|DataTrivia]].
Adding data into MusicBrainz is largely a manual process, which I think is unsustainable (new releases are coming along much quicker than they can be handled). It would be interesting to look at some way of automatically entering data from outside the web interface (it would still require manual voting), or at improvements to modbots for fixing existing data. This could range from entering a new release, to creating a link to a Discogs release, to converting (Disc 1) in titles to release groups.

[[User:Ijabz|Ijabz]]
Skills required: Perl and SQL
* There is a bot that's not currently part of MusicBrainz; a project could be to work with the author of this bot to make it part of MusicBrainz proper.

==PicardQT==

Help with the development of the newest version of the tagger, perhaps by working on a self-contained module within the program or on a specific aspect of it. Pretty much anything that helps move that development along is welcome. We much prefer to see proposals that work to make PicardQT better, rather than creating a new tagger from scratch.


==Proposals NOT wanted==


We are not interested in mentoring the following projects:
* Creation of new tagging applications: We would much rather see proposals that extend the Picard tagger and help along with its development.
* Acoustic fingerprinting projects: There are new and open acoustic fingerprint projects coming out soon, so we don't need to focus on this.
* Artist-artist collaborative filtering: We have a third party that has volunteered to work on this for us pro bono.
* Collaborative similarity, along the lines of tags, as well as similarity cloud-link charts, between artists, labels, or releases.
* Ports of MusicBrainz: Yes, we're aware of Django and Ruby on Rails, but porting all of MusicBrainz to run on these platforms is not at all practical.


* Creation of new tagging applications: We would much rather see proposals that extend the PicardQT tagger and help along with its development. See the PicardQT section above.
* Acoustic fingerprinting projects: We have an excellent partner in MusicIP who provides our current fingerprinting technology. Submitting a proposal to replace MusicIP is not going to be accepted since we are very happy with our current arrangement for acoustic fingerprinting.


== About proposals ==
==More ideas==
Before you dive in and send a proposal to us through Google, it's a good idea to take some time and learn about the MusicBrainz community. At MusicBrainz we pride ourselves on having a strong community - most of us know each other in some way, and some of us know each other face to face from development summits.


A good way to get a feel for this would be to lurk around on IRC, or to talk about your proposals on the mailing list.
If you have more ideas for the Summer of Code, please add them here.


[[Category:To Be Reviewed]] [[Category:Development]]
[[Category:Development]]

Latest revision as of 19:46, 31 January 2012

The MetaBrainz Foundation has applied as a Google Summer of Code organization in 2011. This will allow MusicBrainz hackers to apply for the Summer of Code program and if accepted, get paid for hacking on MusicBrainz.

This page lays out various ideas for projects that people can take on during their Summer of Code. If you have your own ideas for Summer of Code, please add them at the bottom of this page.

All applications for Summer of Code must pass a community review process where the proposer must clearly define their idea and present it to the community at large. Proposers must be/become active members of the community and must adapt their proposals according to community feedback. If the community does not approve of the project, the project will not be accepted by the MetaBrainz Foundation. If your project makes it into the final round of consideration for acceptance, be ready for an interview and possibly even an entrance test to verify the skills claimed on your Summer of Code application.

Furthermore, all projects must develop new features for MusicBrainz. Proposals for replacing existing and working projects for the sake of making them more open will not be accepted. Proposals for extending existing projects with new features have a much greater chance at being accepted.

We strongly encourage students to delve into MusicBrainz and provide their own ideas for Summer of Code. We're listing a few projects here that we care about greatly, but we're more excited to hear what students want to do!

Mentors

This year Robert Kaye, Oliver Charles and Kuno Woudt will be mentoring students. That's ruaok (Robert), warp (Kuno) and ocharles (Oliver) on IRC, if you want to come and speak to us first.

Proposals


IMPORTANT NOTE: All of these ideas should assume that they are going to be built on top of our Next Generation Schema. We will not accept proposals based on our old code base.

Artist credit utilization

Artists and artist pages are a very central convergence point for the data in MusicBrainz, but unfortunately the current UI doesn't fully capitalize on all the data available such as artist credits. This proposal is to redesign the artist and related pages with the intention of providing more thorough access to artist credit data.

Artist credits offer two features: one is the ability to credit artists with various alternative names and still link everything back to the one artist, and the second is to be able to assign multiple artists to a given recording, release, release group, etc.

Possible outcomes of this proposal would be the ability to:

  • "Filter" the data on the artist page based on what artist credit was used (similar to Discogs' artist name variation).
  • Find what all data a specific collaboration artist credit is credited to (similar to the current server's collaboration artist pages).
  • Select a "frequently used" artist credit when doing a lookup for an artist in the add release wizard.

Database size/growth visualization

Our current database size/growth visualization is quite lacking. We would love to have an improved means of inspecting the size/growth of our database over time. We would like to see:

  • How big the database is today
  • How it has grown over all time or over a specific section of time.
  • What is the rate of change of the data growth and can these changes be correlated to features or style policy changes?
  • Anything else that gives us more insight into our database and the community of people working on that database.
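
As a rough illustration of the "rate of change" question above, the following minimal sketch computes average growth per day between dated snapshots of a row count. The function name, the snapshot format and the numbers are all assumptions made for illustration; they are not real MusicBrainz statistics or an existing MusicBrainz API.

```python
from datetime import date


def growth_rates(snapshots):
    """Return (start, end, rows_per_day) for each consecutive pair of snapshots.

    `snapshots` is a list of (date, row_count) tuples; this format is an
    assumption for illustration, not an existing MusicBrainz data source.
    """
    ordered = sorted(snapshots)
    rates = []
    for (d0, n0), (d1, n1) in zip(ordered, ordered[1:]):
        days = (d1 - d0).days
        if days > 0:  # skip duplicate dates to avoid dividing by zero
            rates.append((d0, d1, (n1 - n0) / days))
    return rates


# Hypothetical row counts, for illustration only.
snapshots = [
    (date(2011, 1, 1), 1000),
    (date(2011, 1, 11), 1200),
    (date(2011, 1, 31), 1300),
]
rates = growth_rates(snapshots)
```

A change in the per-day rate between two periods could then be lined up against the dates of feature releases or style policy changes.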

Embeddable widgets

All the data in MusicBrainz is easily accessible to programmers, but it's not as easy to use the data if you're not a programmer. To allow third party websites (specifically bloggers) to use data from MusicBrainz we would like to create a set of widgets. Widgets will make it easier for someone writing an article about an artist, or a blog post reviewing a particular release to include correct metadata about the artist or release. It will also let their visitors (human or searchbot) know exactly which release or artist is being talked about.
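
To make the idea a little more concrete, here is a minimal sketch of what the server side of such a widget might emit: a small HTML snippet linking an artist name to its MusicBrainz page. The function name, the CSS class and the placeholder MBID are hypothetical, and the URL pattern is assumed rather than taken from this page.

```python
from html import escape


def artist_widget(mbid, name):
    """Build a tiny HTML snippet that links an artist name to MusicBrainz.

    The URL pattern and the "mb-widget" class are assumptions for this sketch.
    """
    url = "http://musicbrainz.org/artist/" + escape(mbid, quote=True)
    # Escape the name so that characters such as & or < cannot break the markup.
    return '<a class="mb-widget" href="{}">{}</a>'.format(url, escape(name))


# Placeholder MBID, for illustration only.
snippet = artist_widget("00000000-0000-0000-0000-000000000000", "Simon & Garfunkel")
```

A blogger would paste the resulting snippet into a post; because the link carries the MBID, both human visitors and search bots can tell exactly which artist is meant.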

Proposal Tracker

The StyleCouncil sometimes has a hard time keeping track of proposals. This project would develop a proposal tracker for the Style Council, either modified from an existing issue tracker or written from scratch. The goal is for anyone interested to be able to see:

  • the current status of a proposal (e.g. RFC/RFV/withdrawn/approved/implemented)
  • when it was last discussed on the mailing list
  • who is responsible for it
  • the proposal itself
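
The four items above map naturally onto a small record type. The sketch below is one possible shape for it, assuming the statuses listed in the first bullet; the class, field and function names are hypothetical and do not describe an existing tracker.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Status(Enum):
    RFC = "RFC"
    RFV = "RFV"
    WITHDRAWN = "withdrawn"
    APPROVED = "approved"
    IMPLEMENTED = "implemented"


@dataclass
class Proposal:
    title: str
    status: Status                  # current status of the proposal
    last_discussed: Optional[date]  # last mailing list discussion, if any
    owner: str                      # who is responsible for it
    text: str                       # the proposal itself


def open_proposals(proposals):
    """Return the proposals that are still under discussion (RFC or RFV)."""
    return [p for p in proposals if p.status in (Status.RFC, Status.RFV)]


# Hypothetical entries, for illustration only.
tracker = [
    Proposal("Example proposal A", Status.RFC, date(2011, 3, 1), "alice", "..."),
    Proposal("Example proposal B", Status.IMPLEMENTED, None, "bob", "..."),
]
still_open = open_proposals(tracker)
```

Whether such records live in a modified issue tracker or in a purpose-built tool is left open by the proposal; the sketch only shows the data each entry would need to carry.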

Your idea here!

We really like getting completely different suggestions, so please do not feel at all limited to the above proposals. If there's something cool you think could be done with MusicBrainz and want to make it happen - please suggest away.

Idea:Embeddable Widgets

We could add widgets that embed different types of information, such as: artist profiles; an artist's latest or greatest songs; album cover art (images in the public domain); direct links to songs (for example, on Grooveshark); RSS feeds for artist news and concerts; and tune identification (HTML5 allows mic access). Divya

Idea

Adding data into MusicBrainz is largely a manual process, which I think is unsustainable (new releases are coming along much quicker than they can be handled). It would be interesting to look at some way of automatically entering data from outside the web interface (it would still require manual voting), or at improvements to modbots for fixing existing data. This could range from entering a new release, to creating a link to a Discogs release, to converting (Disc 1) in titles to release groups. Ijabz

  * There is a bot that's not currently part of MusicBrainz; a project could be to work with the author of this bot to make it part of MusicBrainz proper.
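
As one small example of the kind of automated clean-up mentioned in this idea, the sketch below splits a "(Disc 1)" suffix off a release title and groups titles that share the same base name. The regular expression, the function names and the sample titles are assumptions for illustration; this is not the logic of any existing MusicBrainz bot, and real titles would need more cases (disc subtitles, "CD 1", and so on).

```python
import re

# Matches a trailing "(Disc N)" suffix, case-insensitively.
DISC_SUFFIX = re.compile(r"^(?P<title>.+?)\s*\(disc\s+(?P<num>\d+)\)\s*$", re.IGNORECASE)


def split_disc_suffix(title):
    """Return (base_title, disc_number), or (title, None) if there is no suffix."""
    match = DISC_SUFFIX.match(title)
    if match is None:
        return title, None
    return match.group("title"), int(match.group("num"))


def group_discs(titles):
    """Group release titles by base title, keeping each title's disc number."""
    groups = {}
    for title in titles:
        base, num = split_disc_suffix(title)
        groups.setdefault(base, []).append((num, title))
    return groups


# Hypothetical titles, for illustration only.
groups = group_discs(["Example Album (Disc 1)", "Example Album (disc 2)", "Other Album"])
```

Any edit generated from such grouping would still go through manual voting, as the idea itself notes.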

Proposals NOT wanted

We are not interested in mentoring the following projects:

  • Creation of new tagging applications: We would much rather see proposals that extend the Picard tagger and help along with its development.
  • Acoustic fingerprinting projects: There are new and open acoustic fingerprint projects coming out soon, so we don't need to focus on this.
  • Artist-artist collaborative filtering: We have a third party that has volunteered to work on this for us pro bono.
  • Collaborative similarity, along the lines of tags, as well as similarity cloud-link charts, between artists, labels, or releases.
  • Ports of MusicBrainz: Yes, we're aware of Django and Ruby on Rails, but porting all of MusicBrainz to run on these platforms is not at all practical.


About proposals

Before you dive in and send a proposal to us through Google, it's a good idea to take some time and learn about the MusicBrainz community. At MusicBrainz we pride ourselves on having a strong community - most of us know each other in some way, and some of us know each other face to face from development summits.

A good way to get a feel for this would be to lurk around on IRC, or to talk about your proposals on the mailing list.