Development/Summer of Code/2018/MusicBrainz

From MusicBrainz Wiki
[*] Don't let the idea of writing Perl discourage you from checking out some of these projects! The MusicBrainz Server is written in readable, well-structured Perl, using the web application framework Catalyst. If you're comfortable in e.g. Python or Ruby web frameworks, then you'll probably be able to jump in and understand this codebase with only a little extra effort.

Ideas

Improve editing interface for event setlist on the MusicBrainz website

Proposed mentors: yvanzo, bitmap
Languages/skills: JavaScript (React), Perl (Catalyst) [*]
Forum for discussion

Since summer 2015, it has been possible to add concerts to MusicBrainz along with their detailed setlists. However, this initial support requires editors to learn yet another dedicated syntax for setlists, with no preview of the result. This is anything but handy.

This idea is part of the broader UX redesign of the event setlist editing UI, which is tracked in MBS-9533, so it must follow our UX redesign process. So far, there are two potential ways to improve the situation:

  • Create a visual editor for event setlists that doesn’t require editors to know the setlist syntax at all (JavaScript)

Or

  • Extend the setlist editor with a live preview on the right side, see MBS-8085 (JavaScript)

Possible extensions to this idea:

  • Define a new syntax for event setlist as a subset of MarkBrainz, see MBS-8120 (Perl, JavaScript)
  • Add support for guest artists to work/song performance in an event setlist, see MBS-8366 (Perl, JavaScript)
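A live preview could be driven by a small line classifier. The sketch below is written in Python for brevity rather than in the JavaScript this project calls for. It assumes the current setlist markers (`@` for an artist, `*` for a work, `#` for additional information, and `[mbid|name]` for links), which are not described on this page; the function names are hypothetical.

```python
import re

# Assumed markers of the existing setlist syntax (not described on this page).
LINE_KINDS = {"@": "artist", "*": "work", "#": "info"}
# Assumed link form: [mbid|display name]
LINK = re.compile(r"\[([0-9a-f-]{36})\|([^\]]+)\]")


def parse_setlist(text):
    """Turn raw setlist text into (kind, display text, linked MBIDs) tuples."""
    items = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        kind = LINE_KINDS.get(line[0])
        if kind is None:
            # Unknown marker: keep the line so the preview can flag it.
            items.append(("invalid", line, []))
            continue
        body = line[1:].strip()
        mbids = [m.group(1) for m in LINK.finditer(body)]
        display = LINK.sub(lambda m: m.group(2), body)
        items.append((kind, display, mbids))
    return items


def render_preview(items):
    """Render parsed items as indented plain text, one line per item."""
    prefixes = {"artist": "", "work": "  - ", "info": "    (", "invalid": "?? "}
    lines = []
    for kind, display, _ in items:
        suffix = ")" if kind == "info" else ""
        lines.append(prefixes[kind] + display + suffix)
    return "\n".join(lines)
```

Keeping parsing separate from rendering means the same parsed items could feed either a preview pane or a visual editor, and lines tagged `invalid` give the UI something concrete to highlight.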

Add support for in-place localization of the MusicBrainz website

Proposed mentor: yvanzo
Languages/skills: JavaScript (React), Perl (Catalyst) [*], Python (Django)
Forum for discussion

The MusicBrainz website was originally published in English only. Since 2016, it has been available in three more languages, and the beta site now features six more half-completed translations. Technically, it is based on files in the GNU Gettext format, which are synchronized with Transifex. However, this localization platform is not fully satisfactory with regard to the context of messages, communication with translators, and other aspects such as the review workflow and the glossary.

Server internationalisation would be much easier for translators if it could be done in place. This is the main advantage of Pontoon.

  • Make the MusicBrainz Server work with a local instance of Pontoon
  • Update the policy for localized messages containing links
  • Deploy an instance of Pontoon at pontoon.metabrainz.org with project MusicBrainz
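Two of the pain points above, message context and links inside messages, can be illustrated with a small sketch. It assumes an in-memory catalog keyed by (context, message) in place of real Gettext files, and a `{name|link text}` placeholder form for links; the catalog contents, placeholder form, and function names are all hypothetical.

```python
import re

# Hypothetical in-memory catalog keyed by (context, msgid). A real setup
# would load these entries from the Gettext files mentioned above.
CATALOG_FR = {
    ("edit button", "Edit"): "Modifier",
    ("noun", "Edit"): "Modification",
    (None, "See the {doc|documentation}."): "Voir la {doc|documentation}.",
}

# Assumed placeholder form for links inside messages: {name|link text}
PLACEHOLDER = re.compile(r"\{(\w+)\|([^}]*)\}")


def translate(catalog, msgid, context=None):
    """Look up a message, falling back to the English source string."""
    return catalog.get((context, msgid), msgid)


def expand_links(message, urls):
    """Replace {name|text} placeholders with HTML links taken from `urls`."""
    def to_anchor(match):
        name, text = match.group(1), match.group(2)
        if name not in urls:
            # Unknown placeholder: show the text without a link.
            return text
        return '<a href="{}">{}</a>'.format(urls[name], text)
    return PLACEHOLDER.sub(to_anchor, message)
```

With this split, translators only ever see the link text and never the URL, and the same English string can map to different translations depending on its context. The sketch does not escape HTML, which a real implementation would need to do.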

A possible extension to this idea is to do the same migration for other *Brainz projects.

Embed documentation into the MusicBrainz website

Proposed mentors: yvanzo, bitmap
Languages/skills: Perl (Catalyst) [*], SQL (Postgres), JavaScript (React)
Forum for discussion

Most of the user documentation for the MusicBrainz project is held on the MusicBrainz Wiki and made available to MusicBrainz Server through the WikiDocs transclusion mechanism. This has some drawbacks: relevant bits of documentation cannot be displayed directly within the MusicBrainz website; localization is not enabled and would use a different format from the rest of the MusicBrainz website; and updating code and updating the related documentation are two separate processes.

At the latest MetaBrainz summit, we decided to improve the situation by embedding more documentation directly into the user interface, instead of the current help links that redirect to static pages transcluded from the wiki. Most of these pages consist of a descriptive paragraph followed by a descriptive list of properties, for example Release Group. These property descriptions should be embedded directly into the website pages where the property is used. The full documentation page should be built automatically by gathering these properties and their descriptions. WikiDocs pages that cannot be generated should be moved into the code repository as Markdown files.

  • Embed user documentation bits into the MusicBrainz website
  • Automatically generate full documentation pages for entity types and property types
  • Move the rest of the documentation pages into the code repository as Markdown files
  • Integrate both documentation bits/pages into the localization process
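The idea of a single source for both inline help and full pages can be sketched as follows. The data structure, its placeholder texts, and the function names are assumptions for illustration, not an existing part of MusicBrainz Server.

```python
# Hypothetical structure: one summary per entity type plus an ordered list of
# (property name, description) pairs. The texts are placeholders.
ENTITY_DOCS = {
    "Release Group": {
        "summary": "Placeholder summary of what a release group is.",
        "properties": [
            ("Title", "Placeholder description of the title property."),
            ("Type", "Placeholder description of the type property."),
        ],
    },
}


def property_help(entity_type, property_name):
    """Return the description to show next to a form field, or None."""
    for name, description in ENTITY_DOCS[entity_type]["properties"]:
        if name == property_name:
            return description
    return None


def full_page(entity_type):
    """Build the full documentation page from the same descriptions."""
    doc = ENTITY_DOCS[entity_type]
    lines = [entity_type, "", doc["summary"], ""]
    lines += ["{}: {}".format(n, d) for n, d in doc["properties"]]
    return "\n".join(lines)
```

Because both functions read the same entries, a description updated alongside the code changes the inline help and the generated page together, and each string is a natural unit for localization.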

Improve the UX of voting for edits of the MusicBrainz database

Proposed mentors: yvanzo, bitmap
Languages/skills: JavaScript (React), Perl (Catalyst) [*], SQL (Postgres)
Forum for discussion

Edits made to the MusicBrainz database are either applied automatically or opened for voting, usually for 7 days, depending on the edit type. Combined with the Subscription feature, this process allows editors to review the edits made by other editors.

The current issue is that many editors don’t receive any votes on their own edits. There are several ways we could try to address this issue: redesign of edits pages, gamification of the voting system, gamification of subscriptions, more gain from subscriptions, round-robin notification for edits made by editors missing votes for a long time, and so on.

The goal is to select a few suggestions (given above or of your own) with the community and to implement them in the MusicBrainz Server. The main part of the implementation will be the user interface, to be coded using React/JSX.
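As one illustration, the round-robin notification suggestion mentioned above could look like the sketch below. The edit records are plain dictionaries with hypothetical fields (`id`, `editor`, `votes`, `opened`); a real implementation would query the database instead.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def edits_missing_votes(edits, now, min_age):
    """Return edits with no votes that are at least `min_age` old, oldest first."""
    waiting = [
        e for e in edits
        if e["votes"] == 0 and now - e["opened"] >= min_age
    ]
    return sorted(waiting, key=lambda e: e["opened"])


def assign_round_robin(edits, voters):
    """Deal edits out to voters in turn, never giving an editor their own edit."""
    assignments = defaultdict(list)
    if not voters:
        return assignments
    turn = 0
    for edit in edits:
        # Try each voter once, starting from whoever is next in line.
        for offset in range(len(voters)):
            voter = voters[(turn + offset) % len(voters)]
            if voter != edit["editor"]:
                assignments[voter].append(edit["id"])
                turn = (turn + offset + 1) % len(voters)
                break
    return assignments
```

An edit whose only candidate voter is its own editor is simply left unassigned, so the caller can decide how to handle it.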

Add social features to MusicBrainz

Proposed mentor: ruaok
Languages/skills: Perl and/or Python, SQL (Postgres)
Forum for discussion

We recently added event (read: concert) support to MusicBrainz. Our main motivation was to add this feature for historical concerts, but it can also be used for future concerts. In the past, the crowd-sourced concert listings on last.fm were the best place to find concerts, but in the past few years last.fm has begun to fade from people's awareness. There is a possibility that MusicBrainz can take the former place of last.fm and become the best crowd-sourced concert information site on the net. In order for this to happen, we would need to add a few more features to MusicBrainz:

  • Social notifications: MB users should be able to post to Facebook/Twitter when they plan to attend a concert.
  • Other features: What features should we add to build a community around concert information curation?

These social features are important for building a community of users around concerts. The goal is to engage users to enter information about concerts and venues and then talk about upcoming concerts. The more people use MusicBrainz to talk about concerts publicly, the more people will get drawn in to improve the concert listings in MusicBrainz.

Implement genres based on tags in MusicBrainz

Proposed mentor: bitmap
Languages/skills: Perl [*], JavaScript (React), SQL (Postgres)
Forum for discussion

Storing genre info in MusicBrainz has been discussed for years. Currently, we support arbitrary Folksonomy Tagging that users can apply to entities, and people have used that to store genres. But there's no way to tell if a tag is a genre or something else (like "seen live"). So, we've decided that the best way to support genres is to start with a hardcoded list of tags that we consider to be genres. This is documented at MBS-8600, and part of the project will be combining the sources there to come up with an appropriate list.

On top of that, we'd obviously like a way to present these tags as genres in the UI and web service. You should design and implement a UI for selecting genres for entities (with autocomplete), distinct from the normal tag list. You'd be using React for the editing interface. You also need to figure out how to modify our web service (Development/XML Web Service/Version 2) to indicate which tags are genres. (The web service is written in Perl.)

Additionally, come up with a way to manage our list of genres, possibly with support for aliasing them (e.g., if someone tags something as "aussie hip hop", it should be replaced by "australian hip hop"). This is where some familiarity with Postgres (coming up with a schema) will come in handy.
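The genre list and aliasing described above can be sketched in a few lines. The genre set and alias table here are made-up sample data (the real list is to be compiled from the sources in MBS-8600, and would be stored in Postgres), and the function names are hypothetical.

```python
# Hypothetical sample data; the real list would come from MBS-8600 and live
# in the database rather than in code.
GENRES = {"australian hip hop", "hip hop", "rock"}
GENRE_ALIASES = {"aussie hip hop": "australian hip hop"}


def normalize_tag(tag):
    """Lowercase a tag, collapse whitespace, and resolve genre aliases."""
    cleaned = " ".join(tag.lower().split())
    return GENRE_ALIASES.get(cleaned, cleaned)


def split_tags(tags):
    """Separate an entity's tags into genres and other folksonomy tags."""
    genres, others = [], []
    for tag in tags:
        name = normalize_tag(tag)
        target = genres if name in GENRES else others
        if name not in target:
            target.append(name)
    return genres, others
```

For example, `split_tags(["Aussie  Hip Hop", "seen live", "Rock"])` puts "australian hip hop" and "rock" in the genre list and leaves "seen live" as an ordinary tag.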

Redesign the artist overview pages on MusicBrainz

Proposed mentor: bitmap
Languages/skills: Perl [*], JavaScript (React)
Forum for discussion

Our artist pages are very boring and inaccessible right now. Example: https://musicbrainz.org/artist/b10bbbfc-cf9e-42e0-be17-e2c3e1d2600d

We should be displaying release group cover art from the Cover Art Archive where possible (as an option, since people might prefer the current fast/compact view) and provide a better interface for filtering things, sort of like Discogs does in their left sidebar (see https://www.discogs.com/artist/82730-The-Beatles). Especially nice would be a way to show aggregate credits and filter release groups based on them (e.g. release groups where the artist has a vocal credit on any linked release or recording).

You should propose a new design including the features mentioned above and/or some of your own ideas. The musicbrainz-server codebase is written in Perl, and the current artist pages use Template Toolkit. You can use React to code the new templates, and use JavaScript to make the page "dynamic," but there should be fallbacks in place if JavaScript is disabled.
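The aggregate-credit filter mentioned above could work roughly as in this sketch. It assumes each release group has already been reduced to a title plus the credit types the artist holds on any linked release or recording; this shape and the function names are hypothetical.

```python
from collections import Counter


def aggregate_credits(release_groups):
    """Count, per credit type, how many release groups carry that credit."""
    counts = Counter()
    for group in release_groups:
        # Use a set so a credit repeated within one group counts once.
        counts.update(set(group["credits"]))
    return counts


def filter_by_credit(release_groups, credit):
    """Return the titles of release groups where the artist has `credit`."""
    return [g["title"] for g in release_groups if credit in g["credits"]]
```

The counts could populate a sidebar of filters (for instance "vocal (2)"), and selecting one would call `filter_by_credit` or its server-side equivalent, which also gives a natural fallback when JavaScript is disabled.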

Integrate more *Brainz in more *Brainz

Languages/skills: Perl and/or Python and/or Node.js, probably SQL (Postgres)
Forum for discussion

We have a bunch of different projects under the MetaBrainz umbrella by now, but they do not necessarily utilise each other to their fullest extent. MusicBrainz in particular is lacking utilisation of features/data from e.g., AcousticBrainz and ListenBrainz.

I don't have any specific things to do or not do with this, but a prospective student thinking about this should definitely approach us on IRC and talk with us about what they have in mind and if there's anything the community can think of.

SpamBrainz

Proposed mentors: ruaok, yvanzo
Languages/skills: Perl [*] and/or Python/Go/…, data science, machine learning
Forum for discussion

MusicBrainz has recently been plagued by a lot of automatic spam.

During MusicBrainz Summit 17, we decided to create a new MetaBrainz project, SpamBrainz, to automate spam detection and handling in the future. The general idea is that SpamBrainz should use some sort of machine learning to score new, recently modified, or user-reported entities (including user accounts) and flag them as spam or not. Flagged entities are then brought to the attention of spam ninjas, who hide and eventually delete them. For now its scope is limited to MusicBrainz, but the project should be designed to be able to handle other MetaBrainz projects in the future as well.

Before SpamBrainz can be deployed there has to be a significant amount of offline training using data collected from MusicBrainz. The final goal is to have spam reported by SpamBrainz automatically hidden. Our thoughts on SpamBrainz are summed up in this Google Doc.
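To make the "train offline, then score" idea concrete, here is a minimal sketch using a word-based naive Bayes classifier. This is only a stand-in for "some sort of machine learning": the page does not say which technique SpamBrainz would use, and the class name and training texts are hypothetical.

```python
import math
from collections import Counter


class NaiveBayesSpamScorer:
    """Multinomial naive Bayes over words, with add-one smoothing.

    Needs at least one training example for each label before scoring.
    """

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def _log_score(self, words, label):
        counts = self.word_counts[label]
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total = sum(counts.values()) + len(vocabulary)
        prior = self.doc_counts[label] / sum(self.doc_counts.values())
        score = math.log(prior)
        for word in words:
            if word in vocabulary:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / total)
        return score

    def spam_probability(self, text):
        words = text.lower().split()
        diff = self._log_score(words, "ham") - self._log_score(words, "spam")
        if diff > 700:  # avoid overflow in exp(); probability is ~0
            return 0.0
        return 1.0 / (1.0 + math.exp(diff))
```

The returned probability could be compared against a threshold to decide whether an entity is flagged for review; choosing that threshold is exactly what the offline training and evaluation phase would inform.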