Development/Summer of Code/2014

Mentors

This year Robert Kaye, Ian McEwen and Michael Wiencek will probably be amongst our mentors. That's ruaok (Robert), ianmcorvidae (Ian) and bitmap (Michael) on IRC, if you want to come and speak to us first. Some potential mentors are listed by each project; this is far from a normative list, but it might give you somebody to ask about the project.

Suggestions

This is our set of starting ideas for 2014. Add more ideas if you have them!

Add Events to MusicBrainz

Proposed mentors: ianmcorvidae

After many years of wanting "Events" (concerts, performances, etc.), we're finally in a position to take on that project. We'd like a student who is already familiar with MusicBrainz and with how our dev team works to implement Events in MusicBrainz. Because the project is so involved, we are unlikely to consider students who are new to MusicBrainz.

Read-only, browser-oriented site

Proposed mentors: ruaok, navap, Freso

There's been some discussion of implementing a site better geared toward casual browsing users rather than editors, containing things like Wikipedia bios, reviews, embedded streaming for those recordings we have relationships for, etc. This is very open-ended at the moment; likely issues include maintainability (potentially two codebases!?), what exactly needs to be shown, and the effect on the existing site. It's an open question whether this would be a set of changes to the current site (most likely shown when not logged in) or an additional site/codebase/view on our data.

Move MusicBrainz Search to SOLR

Proposed mentors: ruaok

Currently MusicBrainz uses custom search code that rebuilds its indexes every few hours. We'd like someone to replace our custom code with Apache SOLR and to work out a way to implement in-place index updates, giving us near-real-time index updating. Students who work on this should be familiar with SOLR, JSON, Perl, Postgres and Python. Understanding how MusicBrainz works and having contributed to the project before GSoC is a great plus.
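To illustrate the difference between rebuilding an index and updating it in place, here is a minimal sketch of a toy in-memory inverted index. It is not SOLR and not our current search code; the class and method names are made up for illustration. The point is only that changing one document touches that document's entries and nothing else.

```python
from collections import defaultdict


class InvertedIndex:
    """Toy in-memory index mapping each term to a set of document ids.

    Hypothetical illustration only; this is not how SOLR is implemented.
    """

    def __init__(self):
        self._postings = defaultdict(set)  # term -> ids of docs containing it
        self._terms_by_doc = {}            # doc id -> terms it was indexed under

    @staticmethod
    def _tokenise(text):
        """Split on whitespace and lower-case; real analysers do far more."""
        return set(text.lower().split())

    def remove(self, doc_id):
        """Drop one document's entries, leaving all other documents alone."""
        for term in self._terms_by_doc.pop(doc_id, set()):
            self._postings[term].discard(doc_id)
            if not self._postings[term]:
                del self._postings[term]

    def upsert(self, doc_id, text):
        """Index or re-index a single document in place."""
        self.remove(doc_id)
        terms = self._tokenise(text)
        for term in terms:
            self._postings[term].add(doc_id)
        self._terms_by_doc[doc_id] = terms

    def search(self, term):
        """Return the ids of all documents containing the term."""
        return set(self._postings.get(term.lower(), set()))
```

With something like this, an edit to one release title would call `upsert` for that release only, and the change would be searchable immediately instead of after the next full rebuild.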

Finish & Deploy CritiqueBrainz

Proposed mentors: ruaok

Last year we had a student write the CritiqueBrainz project, which allows editors to write music reviews from a non-neutral point of view. Our student did a great job and finished everything he set out to do; however, there wasn't enough time to actually deploy the project and fix initial bugs. In this proposed project, one student would spend roughly the first half of the summer finishing up the project: adding more styling, fixing open bugs and writing documentation for users and developers. In the latter half of the summer, the student would work to deploy the project on MusicBrainz's servers and run a short alpha testing period. During this phase the student would fix bugs that appear and generally work to get the site stable and running well. The student who works on this should be familiar with Linux, Python, Postgres and nginx. Any experience hosting sites would be a big plus. This project would be a great opportunity for someone who already knows how to code, but would love to learn more about finishing and deploying a web site.

Give Picard a website

Currently Picard's "website" is https://musicbrainz.org/doc/MusicBrainz_Picard - a doc page buried in the MusicBrainz site. It's hard to navigate and the surrounding context is all MusicBrainz, not Picard. One idea would be to give Picard its own smaller site, similar to http://metabrainz.org/. This would have its own menu with a link back to MusicBrainz's main site and links to downloads, plugins, documentation and the tagger support section of the forums.

A separate site could also be used to improve plugin support. Right now, if you want to add a plugin to MusicBrainz_Picard/Plugins, you need to figure out that you can get to the wiki from the /doc/ page in the first place, and you also need to find somewhere to host your plugin. If you want to download a new plugin, you have to go to the page, download the plugin and then install it manually. If we had a site for it, we could have a database to store plugins in; users could log in (using their MB account details), upload a plugin and set various details about it (license, compatible versions, compatible OSes). The site could then offer an API for Picard to call, so that users no longer have to download and install plugins manually. It could also track how many downloads a plugin has (which could help us decide which features to add to Picard itself), and it would make it possible for Picard to notify users when a newer version of a plugin they use is available.
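As a rough sketch of the update-notification idea, the check Picard might run against such a plugin database could look something like the following. Everything here is an assumption for illustration: the `PluginRelease` fields, the dotted version format and the function names are hypothetical, and a plain list stands in for the site's database and API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PluginRelease:
    """One uploaded plugin version (all fields are hypothetical)."""
    name: str
    version: tuple          # e.g. (0, 5) for "0.5"
    license: str
    picard_versions: tuple  # Picard versions this release says it supports


def parse_version(text):
    """Turn '1.2.0' into (1, 2, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def available_updates(installed, registry, picard_version):
    """Return {plugin name: newest compatible release} for outdated plugins.

    installed: dict mapping plugin name -> installed version string
    registry:  iterable of PluginRelease, standing in for the site's database
    """
    updates = {}
    for release in registry:
        if release.name not in installed:
            continue  # the user does not have this plugin
        if picard_version not in release.picard_versions:
            continue  # release is not marked compatible with this Picard
        current = parse_version(installed[release.name])
        best = updates.get(release.name)
        if release.version > current and (
            best is None or release.version > best.version
        ):
            updates[release.name] = release
    return updates
```

Picard could run a check like this at startup and show a notice for each entry in the result; the compatibility filter keeps it from offering a release that the uploader did not mark as working with the user's Picard version.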

Geordi

Proposed mentors: ianmcorvidae

MusicBrainz's new (alpha-quality) ingestion/matching/importing tool, geordi, is currently in the midst of a major rewrite (branch 'big-refactor' on GitHub), in order to correct some of the mistakes of the first version and focus the project somewhat. Because of this, there are a variety of things that could serve as GSoC projects within geordi:

  • Additional data sources: The first version of geordi includes data from Discogs and from the Internet Archive (the 'wcd' index). At present, the new version supports much less, and from different sources. Code to import data (into geordi), mapping code to geordi's internal format, and any necessary improvements to other code within geordi could be undertaken for a variety of public data sources such as Discogs, Jamendo, and public collections on the Internet Archive, such as the live music and netlabels collections.
  • Importer tools: We're hoping to have basic Release Editor Seeding, at least, done before the summer. Proposals to improve and extend this functionality, as well as importing tools for other types of data (relationships, other entity types, cover art) could be considered. This project would likely involve a lot of flexibility on the part of the student, since the base importer functionality isn't yet written; it would also likely require changes to the main musicbrainz-server codebase, to support any desired functionality.
  • Matching improvements: The new version of geordi only has basic, manual matching near the top of its priority list. Many improvements are possible, including tools for automatic or semi-automatic matching (e.g. matching a release's artist in geordi based on the artist of the matched release in MusicBrainz, or more complex tasks like matching releases based on tracklist similarity) and making use of matches already stored on the MusicBrainz side (for example: Discogs URL relationships mark matches between Discogs and MusicBrainz already, so it's redundant to store them separately in geordi when Discogs is added as a data source). Many potential projects under this category would involve architectural work to make such tools possible and extensible.

All of these projects would require familiarity with the geordi code and with MusicBrainz generally, and tolerance for things changing under you, since the project is in somewhat early stages and not widely used. Geordi is a web application using Python and Flask.
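To give a feel for the "matching releases based on tracklist similarity" idea mentioned under matching improvements, here is a minimal sketch using only Python's standard library. It is not geordi code: the function names, the position-by-position comparison and the 0.8 threshold are all assumptions chosen for illustration, and a real matcher would probably also want to consider track durations, artists and reordered tracks.

```python
from difflib import SequenceMatcher


def normalise(title):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(title.lower().split())


def title_similarity(a, b):
    """Similarity of two track titles in [0, 1], using difflib's ratio."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def tracklist_similarity(tracks_a, tracks_b):
    """Average position-by-position title similarity of two tracklists.

    Positions that exist in only one tracklist count as 0, so tracklists
    of different lengths are penalised.
    """
    longest = max(len(tracks_a), len(tracks_b))
    if longest == 0:
        return 0.0
    total = sum(title_similarity(a, b) for a, b in zip(tracks_a, tracks_b))
    return total / longest


def best_match(tracklist, candidates, threshold=0.8):
    """Return (candidate id, score) for the best-scoring candidate, or None.

    candidates: dict mapping a release id to its list of track titles.
    threshold:  hypothetical cut-off below which no match is suggested.
    """
    scored = [
        (tracklist_similarity(tracklist, tracks), release_id)
        for release_id, tracks in candidates.items()
    ]
    if not scored:
        return None
    score, release_id = max(scored)
    return (release_id, score) if score >= threshold else None
```

In a semi-automatic workflow, a result from `best_match` would be shown to an editor as a suggestion rather than stored as a confirmed match.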

Proposals

About proposals

Before you dive in and send a proposal to us through Google, it's a good idea to take some time and learn about the MusicBrainz community. At MusicBrainz we pride ourselves on having a strong community - most of us know each other in some way, and some of us know each other face to face from development summits.

A good way to get a feel for this would be to lurk on IRC, or to talk about your proposals on the mailing lists.