Development/Summer of Code/2011

From MusicBrainz Wiki
Revision as of 13:25, 28 July 2007 by Murdos (talk | contribs) (added status for accepted projects (Imported from MoinMoin))

Ideas for Google's Summer of Code

The MetaBrainz Foundation has applied as a Google Summer of Code organization in 2007. This will allow MusicBrainz hackers to apply for the Summer of Code program and, if accepted, get paid for hacking on MusicBrainz.

This page discusses various ideas for projects that people can take on during their Summer of Code. If you have your own ideas for Summer of Code, please add them at the bottom of this page.

All applications for Summer of Code must pass a community review process where the proposer must clearly define their idea and present it to the community at large. Proposers must be/become active members of the community and must adapt their proposals according to community feedback. If the community does not approve of the project, the project will not be accepted by Google or by the MetaBrainz Foundation.

Furthermore, all projects must develop new features for MusicBrainz. Proposals for replacing existing and working projects for the sake of making them more open will not be accepted. Proposals for extending existing projects with new features have a much greater chance of being accepted.

Collaborative Filtering: Artist - Artist Relationships

Status: Accepted.

Student: Sharon Myrtle Paradesi, Mentor: Robert Kaye

Currently MusicBrainz has very old and stale artist-to-artist relationship data. This data provides an indication of how closely (musically) two artists are related. The data we're currently using was derived from crawling the Gnutella network and determining relationships based on what music people have in their collections.

Rather than relying on external data sources, it would be best to rely on our internal data from the MusicBrainz project. Useful pieces of data include:

  • various artist albums -- two artists that are on a compilation together are likely to be somewhat similar.
  • search logs -- one user searching for a number of artists also gives an indication of similarity.
  • artist subscription -- the artists that MusicBrainz users subscribe to should also yield some information.

This project should figure out what data sources to use, write the code to collect data from these sources and then generate new artist-artist data.
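
As a rough illustration of the first data source, the sketch below scores artist pairs by how often they appear together on various-artists releases. The input format (one collection of artist names per compilation), the function name and the cosine-style normalisation are all assumptions made for this example; the project itself would have to decide which sources and which scoring to use.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def artist_similarity(compilations):
    """Score artist pairs by how often they share a compilation.

    `compilations` is an iterable of artist-name collections, one per
    various-artists release (a hypothetical input format).
    """
    appearances = Counter()  # artist -> number of compilations it appears on
    together = Counter()     # (artist_a, artist_b) -> number of shared compilations

    for artists in compilations:
        unique = sorted(set(artists))  # sorted so each pair has one canonical order
        appearances.update(unique)
        together.update(combinations(unique, 2))

    # Cosine-style normalisation (an assumption) so that artists who appear
    # on many compilations do not dominate the scores.
    return {
        pair: count / sqrt(appearances[pair[0]] * appearances[pair[1]])
        for pair, count in together.items()
    }
```

Search logs and artist subscriptions could be fed through the same function by treating each user's searched or subscribed artists as one collection.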

Skills required: Perl, Python and SQL

User Interface improvements: Creative Commons Integration

Right now the support for indicating Creative Commons licenses inside of MusicBrainz is minimal and has not been widely used. This project would entail working with Creative Commons to improve the current support. Desired improvements include:

  • /edit/relationship/addcc.html should attempt to scrape the license from the album URL provided. If more than one potential license URL is found, the user should be given a choice.
  • If no license found on album URL and user really wants to associate a CC license with the album, the license should be taken in the form of a URL instead of a dropdown (the dropdown does not account for all CC licenses, and there are too many variations to reasonably use a dropdown).
  • On an album page, e.g., http://musicbrainz.org/release/e90879a2-9cd9-41bb-ba9a-ac5d7801686a.html with an associated CC license, the license should be linked to rather than merely spelled out.
  • It should be possible to associate a license with an individual track, not just an album. Many compilation albums have tracks with different licenses.
  • XML should include license info for albums and tracks, e.g., http://musicbrainz.org/ws/1/release/e90879a2-9cd9-41bb-ba9a-ac5d7801686a?type=xml&inc=artist+counts+release-events+discs+tracks
  • http://musicbrainz.org/search.html should allow filtering results by license property along the lines of http://search.yahoo.com/cc
  • The submission API should support submitting CC license info with an album/track, if it doesn't already, so that services like Jamendo and Magnatune can potentially auto-submit new releases to MusicBrainz with appropriate license info.
  • Import catalogs from ccMixter, Magnatune, and Jamendo. This includes coordinating a proper data hand-off interface between MusicBrainz and ccMixter, Magnatune and Jamendo, and implementing the MusicBrainz side of the data import features.
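
The license-scraping step in the first item could look roughly like the sketch below. It is written in Python for brevity, although the site itself is Perl; the class and function names and the URL pattern are assumptions for illustration, and the pattern only covers the common creativecommons.org/licenses/... form.

```python
import re
from html.parser import HTMLParser

# Assumed pattern for CC license URLs; other variants (for example
# public-domain dedications) would need additional patterns.
CC_LICENSE_RE = re.compile(
    r"^https?://(www\.)?creativecommons\.org/licenses/"
    r"[a-z\-]+/\d+\.\d+(/[a-z\-]+)?/?$",
    re.IGNORECASE,
)


class LicenseLinkParser(HTMLParser):
    """Collect href values of <a> and <link> tags that look like CC licenses."""

    def __init__(self):
        super().__init__()
        self.licenses = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        href = dict(attrs).get("href")
        # Keep each matching URL once, in the order it appears on the page.
        if href and CC_LICENSE_RE.match(href) and href not in self.licenses:
            self.licenses.append(href)


def find_cc_licenses(html):
    """Return the distinct CC license URLs linked from an HTML page."""
    parser = LicenseLinkParser()
    parser.feed(html)
    return parser.licenses
```

With a helper like this, an empty result would lead to asking the user for a license URL, a single result could be used directly, and several results would be offered to the user as a choice.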

Skills required: Perl, JavaScript and SQL

User Interface improvements: Artist Page Redesign

The current artist/release pages are in dire need of being redesigned. Some work on this front was started in ArtistPageRedesign, but it has since been abandoned. This work needs to be re-evaluated and freshened up. Implementing it would require adding JSON support to our current web service, as well as a lot of user interface work.

Skills required: Perl, JavaScript and SQL

Implementation of the "Next Generation Schema"

Status: Accepted.

Student: Erik Dalén, Mentor: Lukáš Lalinský

The current MusicBrainz database schema is limited and needs to be redesigned to capture all the information required to build a complete discography. Implementation of the NextGenerationSchema will be a long-running project, but hopefully it will be possible to split the work into a few independent steps. One of these steps could be done during the SoC.

Skills required: Perl and SQL

Improved Statistics and Trivia

Status: Accepted.

Student: Guelson Fostine, Mentor: Lukáš Lalinský

The current database statistics page is very simplistic. We would like to see a more comprehensive statistics gathering module, complete with more intelligent data graphs that visualize the MusicBrainz data set over time. Part of this project would be a set of trivia pages that show useful information contained in the MusicBrainz database.

For more details on this project idea, please see DataTrivia.

Skills required: Perl and SQL

PicardQT

Help with development of the newest version of the tagger. A project could be encapsulated as a module within the program or focus on a specific aspect of it; pretty much anything that helps move development along is welcome. We much prefer proposals that make PicardQT better over proposals that create a new tagger from scratch.

Multilingual interface and/or data

The current MusicBrainz website is only available in English. Multilingual support would be a valuable addition to the project, allowing us to serve people without sufficient knowledge of English. Multilingual data support would address some issues as well, but this might need more consideration first.

'Displayed on the cover as' and 'AKA' (Also Known As) track titles support

At the moment, track titles can only be added to MusicBrainz according to the style guidelines, and each track can have only one title. Because of this we lose the (fancy) styling used on, e.g., the release cover, which some people prefer. It is also the cause of heated debate every now and again when dealing with so-called 'artist intent' exceptions (e.g. 'Fuck the System' vs. 'F**k the System'). Allowing one or more 'AKA' entries, much like the ArtistAlias system we already have, would reduce these issues and preserve this information.
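
One way to picture the idea is a track that keeps its style-guideline title alongside a list of alternative titles. The sketch below is only an illustration of that shape: the class names, the `kind` labels and the case-insensitive matching are hypothetical and do not reflect the existing schema or the ArtistAlias implementation.

```python
from dataclasses import dataclass, field


@dataclass
class TrackTitleAlias:
    title: str
    kind: str  # hypothetical labels, e.g. "cover" (as displayed) or "aka"


@dataclass
class Track:
    title: str  # the title according to the style guidelines
    aliases: list = field(default_factory=list)

    def add_alias(self, title, kind="aka"):
        """Attach an alternative title without changing the main one."""
        self.aliases.append(TrackTitleAlias(title, kind))

    def all_titles(self):
        """Return the main title followed by every alias title."""
        return [self.title] + [alias.title for alias in self.aliases]

    def matches(self, query):
        """Case-insensitive exact match against any known title."""
        wanted = query.casefold()
        return any(title.casefold() == wanted for title in self.all_titles())
```

For example, a track titled 'Fuck the System' could carry 'F**k the System' as a "cover" alias, so that a lookup by either spelling finds the same track.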

Proposals NOT wanted

We are not interested in mentoring the following projects:

  • Creation of new tagging applications: We would much rather see proposals that extend the PicardQT tagger and help along with its development. See the PicardQT section above.
  • Acoustic fingerprinting projects: We have an excellent partner in MusicIP, who provides our current fingerprinting technology. A proposal to replace MusicIP will not be accepted, since we are very happy with our current arrangement for acoustic fingerprinting.

More ideas

If you have more ideas for the Summer of Code, please add them here.