Development/Summer of Code/2016/CritiqueBrainz
Ideas
Performance improvements for CritiqueBrainz
Proposed mentor: Gentlecat
Languages/skills: Python, Flask, SQL, PostgreSQL
Forum for discussion
Currently CritiqueBrainz uses the MusicBrainz web service to get information about release groups, artists, and other entities, and it depends on this information heavily. Every time we show a review, it needs to be accompanied by information about the reviewed entity (an event or a release group). Unfortunately, requests to the web service take a significant amount of time, and there is no way to request information about multiple entities in one request. This slows down the website considerably, especially on pages where we show multiple (10-40) reviews.
One way to improve this is to query the MusicBrainz database directly. Caching can help as well, and we already use it in some places. Once this problem is solved, it should allow us to do more advanced things.
Improve database access in CritiqueBrainz
Proposed mentor: Gentlecat
Languages/skills: Python, Flask, SQL, PostgreSQL
Forum for discussion
From the start, the CritiqueBrainz server has used the SQLAlchemy ORM to interact with the database. Unfortunately, we have noticed that it adds too many constraints that we have to work around: writing complex queries and updating old ones is harder, and caching becomes more complicated. Apart from this, a lot of things happen implicitly in the background when you use an ORM. Database access code has ended up spread all over the codebase (even in templates).
It might be worth replacing all ORM usage in CritiqueBrainz with raw SQL queries and improving the code around it. We already have a similar implementation in the AcousticBrainz project, which can be used as a reference.
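As a rough sketch of what such a data access layer could look like, the module below keeps all SQL for reviews in one place and returns plain dictionaries, so that views and templates never touch the database themselves. This is not the AcousticBrainz or CritiqueBrainz implementation. The review table, its columns, and the function names are assumptions made for illustration, and SQLite stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3
import uuid

# Hypothetical, simplified schema used only for this example.
SCHEMA = """
CREATE TABLE review (
    id          TEXT PRIMARY KEY,
    entity_id   TEXT NOT NULL,
    entity_type TEXT NOT NULL,
    text        TEXT NOT NULL
)
"""


def connect(path=":memory:"):
    """Open a connection whose rows can be converted to dictionaries."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    return conn


def create(conn, entity_id, entity_type, text):
    """Insert a review and return its newly generated ID."""
    review_id = str(uuid.uuid4())
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "INSERT INTO review (id, entity_id, entity_type, text) "
            "VALUES (?, ?, ?, ?)",
            (review_id, entity_id, entity_type, text),
        )
    return review_id


def get_by_id(conn, review_id):
    """Return one review as a dict, or None if it does not exist."""
    row = conn.execute(
        "SELECT id, entity_id, entity_type, text FROM review WHERE id = ?",
        (review_id,),
    ).fetchone()
    return dict(row) if row else None


def list_for_entity(conn, entity_id, limit=10, offset=0):
    """Return a page of reviews for one entity as a list of dicts."""
    rows = conn.execute(
        "SELECT id, entity_id, entity_type, text FROM review "
        "WHERE entity_id = ? ORDER BY id LIMIT ? OFFSET ?",
        (entity_id, limit, offset),
    ).fetchall()
    return [dict(row) for row in rows]
```

Because every function returns plain dictionaries rather than ORM objects, the results are straightforward to cache, and each query is visible in exactly one place when it needs to be changed.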