Difference between revisions of "Development/Summer of Code/2021/ListenBrainz"

From MusicBrainz Wiki

Latest revision as of 17:07, 29 March 2021

ListenBrainz is one of the newest MetaBrainz projects. Read more about it on its homepage (https://listenbrainz.org).

Getting started

(see also: Getting started with GSoC)

If you want to work on ListenBrainz, you should show that you are able to set up the server software and understand how some of the infrastructure works. Here are some things that we might ask you about:

  • Show that you understand the goals that ListenBrainz wants to achieve, which are written on its homepage
  • Create an OAuth application on the MusicBrainz website and add its configuration to your ListenBrainz server. Use it to log in to your server with your MusicBrainz account
  • Use the import script that is part of the ListenBrainz server to load scrobbles from Last.fm into your ListenBrainz server, or into the main ListenBrainz server
  • Use your preferred programming language to write a submission tool that sends Listen data to ListenBrainz. The song and artist names don't have to be real; you can make up fake data
  • Try to delete the ListenBrainz database on your local server to remove the fake data that you added
  • Look at the list of tickets that we have open for ListenBrainz and see if you understand what tasks the tickets involve
  • If you want to, see if you can contribute to fixing a ticket. Either add a comment to the ticket or ask in IRC for clarification if you don't understand what the ticket means
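
As a starting point for the submission-tool exercise above, here is a minimal sketch in Python. It assumes the `/1/submit-listens` endpoint with a `Token` authorization header and a `single` listen payload; check these against the current ListenBrainz API documentation before relying on them. The function name `build_submit_request` and the fake artist and track names are made up for illustration. The request is only built, not sent, so the sketch can be run without a server.

```python
import json
import time
import urllib.request

# Assumption: the public API root. Point this at your local server instead.
API_ROOT = "https://api.listenbrainz.org"


def build_submit_request(token, artist, track, listened_at=None, api_root=API_ROOT):
    """Build (but do not send) a request that submits a single listen."""
    if listened_at is None:
        listened_at = time.time()
    body = {
        "listen_type": "single",
        "payload": [
            {
                "listened_at": int(listened_at),
                "track_metadata": {"artist_name": artist, "track_name": track},
            }
        ],
    }
    return urllib.request.Request(
        f"{api_root}/1/submit-listens",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_submit_request("YOUR-USER-TOKEN", "Fake Artist", "Fake Song")
    print(req.get_method(), req.full_url)
    # To actually send the listen: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the JSON before pointing the tool at a real server.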

Ideas

We're adding a number of new social features to ListenBrainz that we hope will enable people to discover more music they like, as well as other users whose music tastes are similar to their own. We're working on some of these features now, but we will need help with the others:

Create short music reviews on ListenBrainz using the CritiqueBrainz API

Proposed mentors: mayhem, iliekcomputers
Languages/skills: JavaScript, React

On the ListenBrainz timeline (a coherent, non-sucky version of the FB timeline) we would like to enable people to write "mini-reviews" about artists, releases or tracks (recordings). Once a review is written, the ListenBrainz page should submit the review to CritiqueBrainz using its API. This project must also include loading any reviews that already exist in CritiqueBrainz into the timeline. This task may not be sufficient for a whole GSoC project, so you should consider proposing a stretch goal to aim for in case the project is finished early.
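
To make the submission step a little more concrete, here is a hedged sketch of validating a mini-review and shaping it for submission. The field names (`entity_id`, `entity_type`, `text`, `language`) and the set of entity types are assumptions made for illustration, not taken from the CritiqueBrainz API documentation; a real implementation would follow that documentation and its authentication flow. The sketch is in Python for brevity, although the project itself targets JavaScript/React.

```python
# Hypothetical entity types, mirroring "artists, releases or tracks (recordings)".
ENTITY_TYPES = {"artist", "release", "recording"}


def build_review_payload(entity_id, entity_type, text, language="en"):
    """Validate a mini-review and return a dict ready to be sent as JSON.

    All field names here are assumptions for illustration only.
    """
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    text = text.strip()
    if not text:
        raise ValueError("review text must not be empty")
    return {
        "entity_id": entity_id,
        "entity_type": entity_type,
        "text": text,
        "language": language,
    }


if __name__ == "__main__":
    payload = build_review_payload(
        "00000000-0000-0000-0000-000000000000",  # placeholder MBID
        "recording",
        "  Short, sweet, and stuck in my head.  ",
    )
    print(payload)
```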

Pin 'My jam' on ListenBrainz

Proposed mentors: mayhem, iliekcomputers
Languages/skills: JavaScript, React, Postgres, Python/Flask

On the ListenBrainz timeline mentioned above we would also like to allow a user to pick one track as "their jam" and pin it on their timeline. This idea is taken from the defunct "This is my jam" website, where users could post the song that was currently stuck in their head or really resonated with them. This project, much like the short music reviews idea, might not be large enough on its own; we could consider having one student do both as their Summer of Code project.
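
The core rule of this feature is that each user has at most one pinned track at a time. A minimal in-memory sketch of that rule follows; the names `Pin`, `PinStore` and `blurb` are hypothetical, and a real implementation would store pins in Postgres behind Flask views rather than in a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Pin:
    user_id: int
    recording_mbid: str
    blurb: str = ""  # optional short note shown next to the pinned track


class PinStore:
    """Keeps at most one pinned recording per user (in memory only)."""

    def __init__(self) -> None:
        self._pins: Dict[int, Pin] = {}

    def pin(self, user_id: int, recording_mbid: str, blurb: str = "") -> Pin:
        # Pinning a new track replaces whatever the user had pinned before.
        new_pin = Pin(user_id, recording_mbid, blurb)
        self._pins[user_id] = new_pin
        return new_pin

    def current(self, user_id: int) -> Optional[Pin]:
        return self._pins.get(user_id)

    def unpin(self, user_id: int) -> None:
        self._pins.pop(user_id, None)


if __name__ == "__main__":
    store = PinStore()
    store.pin(1, "first-recording-mbid")
    store.pin(1, "second-recording-mbid", "stuck in my head")
    print(store.current(1))
```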

Create HTML pages that display data missing from MusicBrainz

Proposed mentors: mayhem, iliekcomputers
Languages/skills: JavaScript, React, Python/Flask, Postgres

In the LB database we've been collecting metadata about artists, releases and recordings that may be missing from MusicBrainz. This project needs to add some new Python Flask views and React components to display what is known about the missing data we've collected. These pages should then explain to our users why they should add this data to MusicBrainz (it will make their stats and recommendations better) and provide links to the MusicBrainz data submission pages with the missing data pre-populated. For adding releases, the release editor should be seeded according to the release editor seeding guidelines (https://musicbrainz.org/doc/Development/Release_Editor_Seeding).
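
As an illustration of what seeding could look like, the sketch below turns a missing release into flat form fields and renders them as a hidden HTML form. The field names (`name`, `artist_credit.names.0.name`, `mediums.0.track.N.name`) and the `release/add` target are written from memory of the seeding guidelines and should be treated as assumptions to verify there; the helper names are hypothetical.

```python
from html import escape


def release_seed_fields(title, artist, tracks):
    """Flatten a release into form fields for the release editor.

    Field names are assumptions to be checked against the seeding guidelines.
    """
    fields = {"name": title, "artist_credit.names.0.name": artist}
    for index, track_name in enumerate(tracks):
        fields[f"mediums.0.track.{index}.name"] = track_name
    return fields


def seed_form_html(fields, action="https://musicbrainz.org/release/add"):
    """Render the fields as a POST form with one hidden input per field."""
    inputs = "\n".join(
        f'<input type="hidden" name="{escape(key)}" value="{escape(value)}">'
        for key, value in fields.items()
    )
    return (
        f'<form action="{escape(action)}" method="post">\n'
        f"{inputs}\n"
        '<button type="submit">Add to MusicBrainz</button>\n'
        "</form>"
    )


if __name__ == "__main__":
    seed = release_seed_fields("Fake Album", "Fake Artist", ["Intro", "Outro"])
    print(seed_form_html(seed))
```

Escaping every name and value matters here because the titles come from user-submitted listens and may contain quotes or angle brackets.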

Create a music recommendation algorithm using the Troi toolkit

Proposed mentors: mayhem
Languages/skills: Python, possibly Postgres.

Our Troi recommendation toolkit (https://github.com/metabrainz/troi-recommendation-playground) is our playground for developing recommendation algorithms. The toolkit already knows how to fetch stats and collaborative-filtered track recommendations from ListenBrainz, metadata from MusicBrainz, and tracks with similar acoustic features from AcousticBrainz. We're looking for one student with an original idea that can be implemented as a Troi plugin, ideally using the existing data sets rather than inventing or creating new ones. The plugin should create a new feature that allows users to discover new music. Please note: we're going to be very selective about which proposals we accept for this project. Before you propose an algorithm to us, you'll need to carefully familiarize yourself with the Troi toolkit and the features it provides. Your idea needs to be novel, at least in the context of Troi.
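
To show the general shape of such an algorithm (data in, ranked recordings out), here is a deliberately simple baseline. It does not use the Troi API, and it is far too ordinary to count as a proposal; the function name and the input format (plain lists of recording identifiers) are assumptions for illustration. It ranks recordings that similar users have played but the target user has not.

```python
from collections import Counter


def recommend_unheard(user_listens, neighbour_listens, limit=3):
    """Rank recordings played by similar users that this user has not heard.

    user_listens: recording ids the target user has listened to.
    neighbour_listens: one list of recording ids per similar user.
    """
    heard = set(user_listens)
    counts = Counter(
        recording
        for listens in neighbour_listens
        for recording in listens
        if recording not in heard
    )
    # Most-played first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [recording for recording, _ in ranked[:limit]]


if __name__ == "__main__":
    mine = ["rec-a"]
    others = [["rec-a", "rec-b", "rec-c"], ["rec-b", "rec-d"], ["rec-b", "rec-c"]]
    print(recommend_unheard(mine, others, limit=2))
```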

Integrate more music services for recording listens and playing music

Proposed mentors: _lucifer
Languages/skills: Python/Flask, TypeScript/React

LB has a number of music discovery features that use BrainzPlayer to facilitate track playback. BrainzPlayer (BP) is a custom React component in LB that uses multiple data sources to search for and play a track. As of now, it supports Spotify, YouTube and SoundCloud as backends. LB also supports linking a Spotify account to record listening history. We are currently reworking the integration of external music services in LB to make adding other services easier. We have looked into some other services and found that Deezer and Apple Music also provide both music playback and listening history. Integrating these services into LB would make for a good SoC project.
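
One way to picture "making it easier to add other services" is a small common interface that every service implements, with the player trying each service in turn. The sketch below is a hypothetical illustration in Python; `MusicService`, `FakeService` and `first_playable` are invented names, and the real BrainzPlayer is a React component with its own design.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, Optional, Tuple


class MusicService(ABC):
    """Hypothetical common interface for an external music service."""

    name: str

    @abstractmethod
    def search_track(self, artist: str, title: str) -> Optional[str]:
        """Return a playable URL for the track, or None if it is not found."""


class FakeService(MusicService):
    """Stand-in service backed by a dictionary instead of a real API."""

    def __init__(self, name: str, catalogue: Dict[Tuple[str, str], str]) -> None:
        self.name = name
        self._catalogue = catalogue

    def search_track(self, artist: str, title: str) -> Optional[str]:
        return self._catalogue.get((artist.lower(), title.lower()))


def first_playable(
    services: Iterable[MusicService], artist: str, title: str
) -> Optional[Tuple[str, str]]:
    """Try each service in order and return (service name, URL) for the first hit."""
    for service in services:
        url = service.search_track(artist, title)
        if url:
            return service.name, url
    return None


if __name__ == "__main__":
    services = [
        FakeService("service-one", {}),
        FakeService("service-two", {("fake artist", "fake song"): "https://example.org/1"}),
    ]
    print(first_playable(services, "Fake Artist", "Fake Song"))
```

With an interface like this, adding a new service means writing one new class rather than touching the playback logic.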