Development/Summer of Code/2021/MusicBrainz

Revision as of 19:48, 11 February 2021 by Reosarevok
[*] Don't let the idea of writing Perl discourage you from checking out some of these projects! The MusicBrainz Server is written in readable, well-structured Perl, using the web application framework Catalyst. If you're comfortable in e.g. Python or Ruby web frameworks, then you'll probably be able to jump in and understand this codebase with only a little extra effort.


MusicBrainz data visualization with React

Proposed mentors: bitmap, reosarevok, yvanzo
Languages/skills: JavaScript (React), data visualization
Forum for discussion

MusicBrainz has a timeline of global statistics, which is being rewritten in React and d3.js.

Moreover, a timeline (or another kind of chart) would be very helpful for visualizing relationships for famous artists with long careers, such as Mstislav Rostropovich. For example, a history of band members would be useful when there are many changes over time, to better show which musicians played together. The same could probably apply to famous works too, e.g. Over the Rainbow.

Note that the data is already available to the React components that render the artist/work Relationships tabs, so you won’t have to deal with the PostgreSQL/Perl data layer that MusicBrainz runs on.
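As a rough illustration of the band-member history idea, the sketch below computes which members were in a band at the same time, which is the kind of data a timeline chart would draw. It is written in Python for brevity, although the project itself targets React. The `Membership` shape, the function names and the sample names are all hypothetical and do not reflect the actual MusicBrainz relationship format.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Membership:
    """One 'member of band' relationship (hypothetical, simplified shape)."""
    member: str
    begin_year: int
    end_year: Optional[int]  # None means the membership has no end date yet


def overlap(a: Membership, b: Membership, today: int) -> Optional[Tuple[int, int]]:
    """Return the (start, end) years two memberships share, or None."""
    start = max(a.begin_year, b.begin_year)
    a_end = today if a.end_year is None else a.end_year
    b_end = today if b.end_year is None else b.end_year
    end = min(a_end, b_end)
    return (start, end) if start <= end else None


def played_together(
    memberships: List[Membership], today: int = 2021
) -> Dict[Tuple[str, str], Tuple[int, int]]:
    """Map each pair of distinct members to the period they overlapped.

    Simplification: if a pair overlaps more than once, the last period wins.
    """
    result = {}
    for a, b in combinations(memberships, 2):
        if a.member == b.member:
            continue
        shared = overlap(a, b, today)
        if shared is not None:
            result[(a.member, b.member)] = shared
    return result


# Fictional band line-up, for illustration only.
lineup = [
    Membership("Alice", 1990, 1995),
    Membership("Bob", 1993, None),
    Membership("Carol", 1996, 2000),
]
print(played_together(lineup))
```

Each resulting pair and period could then be rendered as an overlapping bar on a timeline.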

This would lay the foundations of data visualization in the React-rendered website of MusicBrainz.

Automate areas management in MusicBrainz

Proposed mentors: yvanzo, bitmap, reosarevok
Languages/skills: SQL (Postgres), Python
Forum for discussion

Areas (such as cities, regions and countries) are used in MusicBrainz to indicate the location of concert halls and recording studios, the place of birth of artists, and so on. But the goal of MusicBrainz is curating music metadata, not geographical metadata, and thus we should rely on an external database instead.

Originally areas were automatically added from Wikidata with our old Perl bot, but that was dropped when some editors started making bad edits on Wikidata to ensure some specific area was added by the bot.

Currently, area editing is mostly reserved for editor dr_saunders, who voluntarily addresses issues reported via AREQ tickets, provided that references are given (usually Wikidata or GeoNames). This has worked for years but has some issues:

  • Maintaining areas manually takes a fair amount of the area editor's time;
  • Requested areas are not created immediately, so they are not immediately available to link to, which causes delays;
  • Area data is missing localized names that are added to the referenced sources later on;
  • Area data becomes silently outdated, except when an editor reports issues to be fixed by hand.

Nowadays Wikidata has much stronger anti-vandalism tools and we have the ability to report, admonish and temporarily ban any users we find trying to game the system, so we can probably go back to an automatic system using Wikidata. The old Perl bot is complicated and mostly abandoned, so ideally this would be done via the somewhat more active Python bot. This has the benefit of being able to use existing Python libraries for dealing with the Wikidata side of the task as well.

The first and main task for a student who picks this should be to look through the Python bot, add an "add_area" function to it, and find and import relevant areas in Wikidata that are still missing from MusicBrainz (with their Wikidata and GeoNames links, and marked as part of the appropriate area already in MusicBrainz). Once this is working, the bot should also add missing aliases in other languages to the areas and keep them updated by regularly checking that they haven't been removed from Wikidata (to avoid keeping old or incorrect data).
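The alias-maintenance step described above can be sketched as a plain comparison between what MusicBrainz holds and what Wikidata currently reports. This is a minimal sketch under stated assumptions: `plan_alias_sync`, `AliasPlan` and the one-alias-per-locale dictionaries are hypothetical simplifications, not part of the existing Python bot, and fetching the data from either side is left out entirely.

```python
from typing import Dict, NamedTuple


class AliasPlan(NamedTuple):
    """Changes needed to bring MusicBrainz aliases in line with Wikidata."""
    to_add: Dict[str, str]
    to_update: Dict[str, str]
    to_remove: Dict[str, str]


def plan_alias_sync(
    mb_aliases: Dict[str, str], wikidata_labels: Dict[str, str]
) -> AliasPlan:
    """Compare locale -> name mappings and decide what the bot should do.

    Assumption: mb_aliases only contains aliases the bot itself manages,
    so removing one that vanished from Wikidata is safe.
    """
    to_add = {
        locale: name
        for locale, name in wikidata_labels.items()
        if locale not in mb_aliases
    }
    to_update = {
        locale: name
        for locale, name in wikidata_labels.items()
        if locale in mb_aliases and mb_aliases[locale] != name
    }
    to_remove = {
        locale: name
        for locale, name in mb_aliases.items()
        if locale not in wikidata_labels
    }
    return AliasPlan(to_add, to_update, to_remove)


# Made-up area names, for illustration only.
plan = plan_alias_sync(
    {"en": "Exampleville", "fr": "Exemple-Ville", "de": "Beispielstadt"},
    {"en": "Exampleville", "fr": "Exempleville", "es": "Villaejemplo"},
)
print(plan)
```

Keeping the comparison separate from the network and editing code would let the bot log or dry-run a plan before submitting any edits.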

Integrate Internet Archive in MusicBrainz

Proposed mentors: yvanzo, bitmap
Languages/skills: RabbitMQ, JavaScript (React), Perl (Catalyst) [*]
Forum for discussion

The Internet Archive offers many resources that can mix very well with MusicBrainz.

  • The Wayback Machine is able to take and render a snapshot of public webpages at any given time. It can be used for URL relationships added to the MusicBrainz database, and for references given in edit notes and annotations.
  • The 78 RPMs and Cylinder Recordings collection contains digitized recordings from physical releases of the early 20th century. Each recording comes with audio streaming and a metadata web service. It can be used to retrieve metadata automatically and to embed a player in the MusicBrainz website. Many similar music collections are hosted by the Internet Archive.
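For the Wayback Machine part, a lookup of the closest archived snapshot of a URL might look like the sketch below. It assumes the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) and a JSON response with an `archived_snapshots.closest` object; both should be checked against the current Internet Archive documentation. The helper names are hypothetical, and the HTTP request itself is omitted so that the example runs offline.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoint of the Wayback Machine availability API.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(url: str, timestamp: Optional[str] = None) -> str:
    """Build the query URL; timestamp is an optional YYYYMMDD[hhmmss] string."""
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    return AVAILABILITY_ENDPOINT + "?" + urlencode(params)


def closest_snapshot(response_body: str) -> Optional[str]:
    """Extract the snapshot URL from a response body, or None if absent."""
    data = json.loads(response_body)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Hand-written sample response in the assumed shape, not real API output.
sample_response = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20210211000000/https://example.com/page",
            "timestamp": "20210211000000",
            "status": "200",
        }
    }
})

print(availability_query("https://example.com/page", "20210211"))
print(closest_snapshot(sample_response))
```

A timestamp taken from the date of the edit would let an edit note link to the page as it looked when the reference was given.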

The Internet Archive team specifically offered assistance with supporting such a project.

Improve editing interface for event setlist on the MusicBrainz website

Proposed mentors: yvanzo, bitmap, reosarevok
Languages/skills: JavaScript (React), Perl (Catalyst) [*]
Forum for discussion

Since the summer of 2015, concerts can be added to MusicBrainz, including a detailed setlist. However, the implementation requires editors to know a very specific syntax for setlists, and doesn't even provide a preview option to make sure they're doing it right. This causes a lot of problems. In general, having to put the setlist together by hand is fairly user-unfriendly.

The task here is to build a new editing interface, ideally similar to the Tracklist page of the release editor, that allows users to add the information through a form that doesn't require them to learn any syntax. This will also allow us to potentially change the way we store the data in the background later on, without actually requiring big differences in the way the user-facing form works.
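To illustrate how a form could hide the syntax from editors, the sketch below turns structured rows into setlist text. It assumes the current syntax uses a line prefix per row type (`@` for an artist, `*` for a work, `#` for additional information) and `[mbid|name]` for entity links; these details should be verified against the setlist documentation. `SetlistRow` and both functions are hypothetical names, and the example is in Python although the real interface would be written in React.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed line prefixes of the existing setlist syntax.
PREFIXES = {"artist": "@", "work": "*", "info": "#"}


@dataclass
class SetlistRow:
    """One row of the form (hypothetical shape)."""
    kind: str                   # "artist", "work" or "info"
    text: str
    mbid: Optional[str] = None  # set when the row links to an entity


def serialize_row(row: SetlistRow) -> str:
    """Render one row as a line of setlist text.

    Raises KeyError for an unknown row kind.
    """
    prefix = PREFIXES[row.kind]
    if row.mbid and row.kind != "info":
        body = f"[{row.mbid}|{row.text}]"
    else:
        body = row.text
    return f"{prefix} {body}"


def serialize_setlist(rows: List[SetlistRow]) -> str:
    """Join all rows into the text stored for the event."""
    return "\n".join(serialize_row(row) for row in rows)


# Placeholder names and a placeholder MBID, for illustration only.
rows = [
    SetlistRow("artist", "Example Band"),
    SetlistRow("work", "Example Song", "00000000-0000-0000-0000-000000000000"),
    SetlistRow("info", "Encore"),
]
print(serialize_setlist(rows))
```

Because the form only deals with structured rows, the serializer is the single place that would change if the storage format were replaced later.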

Improve the UX of voting for edits of the MusicBrainz database

Proposed mentors: yvanzo, bitmap
Languages/skills: JavaScript (React), Perl (Catalyst) [*], SQL (Postgres)
Forum for discussion

Edits made to the MusicBrainz database are either applied automatically or opened for voting, usually for 7 days, depending on the edit type. Combined with the Subscription feature, this process allows editors to review the edits made by other editors.

The current issue is that many editors don’t receive any votes on their own edits. We can imagine several ways to address this: a redesign of the edit pages, gamification of the voting system, gamification of subscriptions, more to gain from subscriptions, round-robin notifications for edits by editors who have gone without votes for a long time, and so on.

The goal is to select a few suggestions (from those above or your own) together with the community and to implement them in the MusicBrainz Server. The main part of the implementation will be the user interface, coded using React/JSX.
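As one possible reading of the round-robin notification idea listed above, the sketch below spreads unvoted edits across a pool of reviewers in turn. It assumes editors should not be asked to review their own edits; the function name, the input shapes and the sample editors are hypothetical and not part of the MusicBrainz Server.

```python
from typing import Dict, List, Tuple


def assign_round_robin(
    edits: List[Tuple[int, str]], reviewers: List[str]
) -> Dict[str, List[int]]:
    """Assign each (edit_id, author) pair to the next eligible reviewer.

    Reviewers are visited in a fixed rotation; the author of an edit is
    skipped. An edit stays unassigned if its author is the only reviewer.
    """
    assignments: Dict[str, List[int]] = {name: [] for name in reviewers}
    if not reviewers:
        return assignments
    position = 0  # index of the reviewer whose turn is next
    for edit_id, author in edits:
        for offset in range(len(reviewers)):
            candidate = reviewers[(position + offset) % len(reviewers)]
            if candidate != author:
                assignments[candidate].append(edit_id)
                position = (position + offset + 1) % len(reviewers)
                break
    return assignments


# Fictional edits and editors, for illustration only.
pending = [(101, "alice"), (102, "bob"), (103, "carol"), (104, "alice")]
print(assign_round_robin(pending, ["alice", "bob", "carol"]))
```

A real implementation would also need to decide who belongs in the reviewer pool, for example subscribers or recently active voters.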

Add social features to MusicBrainz

Proposed mentor: ruaok
Languages/skills: Perl and/or Python, SQL (Postgres)
Forum for discussion

We recently added event (read: concerts) support to MusicBrainz. Our main motivation was to add this feature for historical concerts, but it can also be used for future concerts. In the past the crowd-sourced concerts on were the best place to find concerts, but in the past few years has begun to fade from people's awareness. There is a possibility that MusicBrainz can take the former place of and become the best crowd source concert information site on the net. In order for this to happen, we would need to add a few more features to MusicBrainz:

  • Social notifications: MB users should be able to post to Facebook/Twitter when they plan to attend a concert.
  • Other features: What features should we add to build a community around concert information curation?

These social features are important for building a community of users around concerts. The goal is to engage users to enter information about concerts and venues and then talk about upcoming concerts. The more people use MusicBrainz to talk about concerts publicly, the more people will be drawn in to improve the concert listings in MusicBrainz.

Integrate more *Brainz in more *Brainz

Languages/skills: Perl and/or Python and/or Node.js, probably SQL (Postgres)
Forum for discussion

We have a bunch of different projects under the MetaBrainz umbrella by now, but they do not necessarily utilise each other to their fullest extent. MusicBrainz in particular is lacking utilisation of features/data from e.g., AcousticBrainz and ListenBrainz.

I don't have any specific things to do or not do with this, but a prospective student thinking about this should definitely approach us on IRC and talk with us about what they have in mind and if there's anything the community can think of.