Development/Summer of Code/2021/BookBrainz: Difference between revisions

From MusicBrainz Wiki
(Created page with "This page will discuss the current list of suggested ideas for students to develop proposals for Google's Summer of Code for BookBrainz. If you're a student, feel free to base...")
 
(Update for 2021)
Line 7: Line 7:
The first thing to do to get started with BookBrainz is to clone the bookbrainz-site [https://github.com/BookBrainz/bookbrainz-site GitHub repository], and follow the README.md file to get the site up and running on your computer.


Come and speak to us in the MetaBrainz [[IRC]] (freenode/#metabrainz) if you finish all of that, or get stuck at any point!
Come and speak to us in the BookBrainz [[IRC]] (freenode/#bookbrainz) if you finish all of that, or get stuck at any point!


==Ideas==
Line 18: Line 18:
[https://community.metabrainz.org/c/bookbrainz Forum for discussion]


This project requires a solid knowledge of the BookBrainz schema and edit pages.
If you are familiar with BookBrainz, you will know that each entity has a separate creation page. You can see these in the website's menu bar under "+ Add".


This leads to a complex workflow for simple use cases (such as 'adding a book') and repeating data (for example the title of a work repeated for an edition and an edition group, in a simple use case of a book), increasing chances of making errors.
As each entity has a separate creation page, users are faced with a complex workflow for simple use cases (such as 'adding a book I have') with lots of repetition (for example the title of a work repeated for an edition and an edition group, in a simple use case of a book), increasing chances of making errors.
Such an evident workflow should be a lot more straightforward (especially for inexperienced users). The goal of this project is to design and implement a single form that will improve the editing experience.


Such a basic workflow should be a lot more straightforward (especially for inexperienced users). The goal of this project is to design and implement a single form that will improve the editing experience.
Some form of separation (like tabs) will likely be necessary to separate the different entities forms as steps.

Some form of separation (like tabs) will be necessary to separate the different entities forms as steps.
For each step, entities can be either searched for and selected, or created.
While it would be best to show only the strictly necessary fields for each step at first to limit clutter, users should have easy access to the full range of options of the current forms.
Line 30: Line 31:
Keep in mind that "adding a book" is a simple case, and that we should be able to optionally enter more complicated cases. For example, an anthology of short stories will have multiple Works, and the title of the Edition may be different from any of the Works it contains; or an anthology of various Authors might be attributed on the cover to a single Author.


The first phase should be to make a mockup with a prototyping/wireframing tool such as Sketch, Figma or Pencil Project. You will be expected to develop the components using MetaBrainz' new React-storybook [https://github.com/metabrainz/design-system design system].
The first phase should be to make a mockup with a prototyping/wireframing tool (for example: Sketch, Figma, Pencil Project).
We will work together iteratively during both the UX and implementation sections of the project.



Here are three example scenarios for users of different experience levels. The unified form should accommodate each of them and provide the best user experience possible.
Line 54: Line 54:
The magazine has multiple editors.


=== Use Solr search engine ===


=== Documentation and style guides ===
Proposed Mentors: ''Monkey''<br>
Languages/skills: Node.js, Solr, (PostgreSQL)

[https://community.metabrainz.org/c/bookbrainz Forum for discussion]


BookBrainz currently uses ElasticSearch for its search engine, and we are aiming to replace it with [https://lucene.apache.org/solr/guide/7_5/getting-started.html Solr].


This will simplify hosting BookBrainz on the same infrastructure as the other MetaBrainz projects and harmonize the technologies we use. ElasticSearch is also quite resource intensive and creates issues for developers on slower computers.

A good understanding of NodeJS and ExpressJS is required to find and replace the relevant components in the web server.

You will set up Solr for use with BookBrainz' schema, drawing inspiration from existing MusicBrainz code.

On the Node side, you will aim to reproduce the current search functionalities (have a look [https://github.com/bookbrainz/bookbrainz-site/blob/56c54a07ad9368c8ae79a803943e32dda4650429/src/server/routes/search.js here]), including updating the Solr index on creation/modification of entities (as described [https://lucene.apache.org/solr/guide/6_6/uploading-data-with-index-handlers.html#UploadingDatawithIndexHandlers-JSONFormattedIndexUpdates in the docs]; currently this is done [https://github.com/bookbrainz/bookbrainz-site/blob/c2f9bd31293290b1dc83f7d8e1eb10e70a836d58/src/server/helpers/search.js#L185 like this] for ES). In addition, we would like to add pagination to the search page, which should be fairly simple to achieve with Solr.

Communicating with Solr can be done simply [https://lucene.apache.org/solr/guide/6_6/using-javascript.html via HTTP requests]. Alternatively, a JS library could be used if deemed suitable.
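
As a rough illustration of the HTTP approach and of pagination, here is a minimal TypeScript sketch that builds a paginated Solr query URL. `q`, `start`, `rows` and `wt` are standard Solr query parameters; the host, the `bookbrainz` core name, the `name` field and the helper names are assumptions made up for this example, not part of the existing codebase.

```typescript
// Hypothetical helper: builds the URL for a paginated Solr "select" request.
// The base URL and core name are placeholders chosen for illustration.
interface SearchOptions {
  query: string;
  page: number; // 1-based page number
  pageSize: number;
}

function buildSolrSearchUrl(baseUrl: string, core: string, options: SearchOptions): string {
  const page = Math.max(1, Math.floor(options.page));
  const rows = Math.max(1, Math.floor(options.pageSize));
  const params: Array<[string, string]> = [
    ["q", options.query],
    // Solr paginates with an offset (start) and a page size (rows).
    ["start", String((page - 1) * rows)],
    ["rows", String(rows)],
    ["wt", "json"]
  ];
  const queryString = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${baseUrl}/solr/${core}/select?${queryString}`;
}

// Total number of pages, given the numFound value from Solr's JSON response.
function pageCount(numFound: number, pageSize: number): number {
  return Math.ceil(numFound / pageSize);
}

// Third page of 20 results for a query on an assumed "name" field.
const exampleUrl = buildSolrSearchUrl("http://localhost:8983", "bookbrainz", {
  query: "name:dune",
  page: 3,
  pageSize: 20
});
```

The resulting URL can be requested with any HTTP client; the search page would then read the matching documents and the total hit count from the JSON response.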

You will also be involved in the preparation of the production deployment, adapting existing Docker configurations from MusicBrainz.

=== User collections ===


Proposed Mentors: ''Monkey''<br>
Languages/skills: React, Node.js, User Experience, SQL
Languages/skills: Knowledge of BookBrainz/MusicBrainz editing, writing documentation


[https://community.metabrainz.org/c/bookbrainz Forum for discussion]


The BookBrainz [https://bookbrainz-user-guide.readthedocs.io/en/latest/ user documentation] is currently only a stub with very little content.
With all this data available, it would be great if BookBrainz users could use the website to save the books they have read, mark the ones they would like to read, mark gifts ideas, etc.
This project aims to adapt the existing MusicBrainz documentation to BookBrainz.


Using test-driven development, you will develop a user collection feature that will allow users to create, organize and retrieve collections.


You will write a beginner user's guide based on the [https://musicbrainz.org/doc/Beginners_Guide MusicBrainz beginner's guide], adapt the [https://musicbrainz.org/doc/How_To How To pages] and flesh out the existing [https://bookbrainz.org/help BookBrainz FAQs] (using the [https://musicbrainz.org/doc/Frequently_Asked_Questions MusicBrainz FAQs] as a reference).
A collection can only contain one type of entity (Author, Work, Edition Group, etc.).
A user can create an arbitrary number of collections with an arbitrary number of items in it.
A collection can be set to private or public, and can have an optional description.


You will also take the [https://musicbrainz.org/doc/Style MusicBrainz style guides] and adapt them to the world of BookBrainz.
This includes descriptions of each field for each entity type, relationships between them, identifiers, and every part of the interface that requires an explanation.


On the front-end, there should be a new page (`/user/{USERNAME}/collections`) to view another user's public collections, or view and manage (edit name, set private/public and description, delete) all one's own collections.


=== Import the (now defunct) Bookogs database ===
Each collection has its own page for display purposes, where a user can remove and add items, and change the collection's settings.


There should also be easily accessible ways of adding entities to your collections, at the very least in the entity display pages, but also possibly in the search page, homepage, etc. You will suggest appropriate places for such buttons.

Clicking on an "add to collection" button will show a component to choose which collection to add to (filtered by the type of the entity being added), or to create a new collection on the fly.
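
The collection rules described in this proposal (one entity type per collection, a private/public flag, an optional description, and filtering by entity type when adding an item) could be modelled roughly as follows. This is only a sketch: the type names, fields and helper functions are hypothetical and do not reflect the actual BookBrainz schema, and the list of entity types is illustrative.

```typescript
// Illustrative list of entity types; the real set lives in the BookBrainz schema.
type EntityType = "Author" | "Work" | "Edition" | "EditionGroup" | "Publisher";

// Hypothetical in-memory shape of a user collection.
interface Collection {
  id: string;
  ownerId: string;
  name: string;
  description?: string; // optional, as described above
  isPublic: boolean;
  entityType: EntityType; // a collection holds a single entity type
  itemIds: string[];
}

// Adds an entity to a collection; returns false if it was already present.
function addToCollection(collection: Collection, entityType: EntityType, entityId: string): boolean {
  if (collection.entityType !== entityType) {
    throw new Error(`Collection "${collection.name}" only accepts ${collection.entityType} entities`);
  }
  if (collection.itemIds.indexOf(entityId) !== -1) {
    return false;
  }
  collection.itemIds.push(entityId);
  return true;
}

// Collections to offer in the "add to collection" component for a given entity.
function collectionsAccepting(collections: Collection[], entityType: EntityType): Collection[] {
  return collections.filter((collection) => collection.entityType === entityType);
}
```

Keeping the entity type on the collection itself makes the "filtered by type" picker a simple filter, and rejects mismatched items in one place.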


Using a prototyping/wireframing tool will help for the design and user experience iterations.
You will learn to use MetaBrainz' new React-based [https://github.com/metabrainz/design-system design system].


This feature could also help us build a recommendation engine in the future, along with data from CritiqueBrainz.

=== Documentation and style guides ===


Proposed Mentors: ''Monkey''<br>
Languages/skills: Node.js, SQL, writing documentation
Languages/skills: SQL, Node.js, knowledge of BookBrainz schema


[https://community.metabrainz.org/c/bookbrainz Forum for discussion]


The sites Bookogs and Comicogs, sister projects of Discogs, were [https://community.metabrainz.org/t/bookogs-has-been-archived/489375/11 closed] in 2020; some of their editors chose BookBrainz to continue contributing open data.
The BookBrainz documentation (both technical/development and style/usage) is currently very poor, fragmented and hard to find.
There is a [https://bb-user-guide.readthedocs.io/en/latest/ stub of user guides here]. Auto-generated developer documentation was previously [https://doclets.io/bookbrainz/bookbrainz-site/master running here], but the hosting service doclets.io is now defunct.


The Bookogs database dumps were made publicly available for download in JSON format right after the project closed.
The goal of this project is to revive, correct and complete both of these documentations.
In order to prevent the loss of Bookogs contributions we want to import all the entries from the database dumps, as discussed in [https://community.metabrainz.org/t/import-data-from-bookogs-and-comicogs/487094/15 this thread].


This will require processing very large JSON files in a robust manner and creating "adapters" to transform entities from one database schema to the other, in a way that allows the process to be repeated without duplicating entries.
For the user guides, you will take the MusicBrainz guides as an example.
You will put together a detailed plan of action ahead of time.
You will write a beginner user's guide and flesh out the [https://bookbrainz.org/help existing FAQ]
You will also take the [https://musicbrainz.org/doc/Style MusicBrainz style guides] and try to apply everything you find there to the world of books.
This also includes descriptions of each field for each entity, relationships between them, identifiers, and every part of the interface that requires an explanation.


Discussions are in progress for matching [https://community.metabrainz.org/t/matching-bookogs-credit-roles-to-bookbrainz/489397 roles], [https://community.metabrainz.org/t/matching-bookogs-formats-to-bookbrainz/489495/2 formats] and [https://community.metabrainz.org/t/bookogs-genres/489689 genres] to BookBrainz' schema.
For the developer documentation, you will need to find a replacement for the defunct doclets.io and set up automated deployment of the docs (from JSDoc comments in the [https://github.com/bookbrainz/bookbrainz-site main codebase]).
You will also identify (with the mentor) the areas of the code and of the project that are not well documented, identify the most critical ones (for example those blocking new contributors), and improve their documentation.
Current code coverage is around 42%, which can be significantly improved.

Revision as of 11:59, 5 February 2021

This page discusses the current list of suggested ideas for students to develop proposals for Google's Summer of Code for BookBrainz. If you're a student, feel free to base your proposal on one of these ideas, or pick an entirely new idea that you think might be useful to us.

Getting Started

(see also: Getting started with GSoC)

The first thing to do to get started with BookBrainz is to clone the bookbrainz-site GitHub repository, and follow the README.md file to get the site up and running on your computer.

Come and speak to us in the BookBrainz IRC (freenode/#bookbrainz) if you finish all of that, or get stuck at any point!

Ideas

Design and implement a unified creation form

Proposed Mentors: Monkey
Languages/skills: User interfaces, User Experience, React, Node.js

Forum for discussion

This project requires a solid knowledge of the BookBrainz schema and edit pages.

As each entity has a separate creation page, users face a complex workflow for simple use cases (such as 'adding a book I have') with a lot of repetition (for example, in the simple case of a book, the title of a work is repeated for an edition and an edition group), which increases the chance of making errors.

Such a basic workflow should be a lot more straightforward (especially for inexperienced users). The goal of this project is to design and implement a single form that will improve the editing experience.

Some form of separation (like tabs) will be necessary to present the different entity forms as steps. For each step, entities can be either searched for and selected, or created. While it would be best to show only the strictly necessary fields for each step at first to limit clutter, users should have easy access to the full range of options of the current forms. The form will automatically create the relevant relationships and links between the entities.
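
One way to picture the "select or create" steps and the automatic relationships is the TypeScript sketch below. It is an illustration under stated assumptions only: the step names, the `new:` placeholder references and the relationship type labels are hypothetical, and the real form would cover more entity types (edition group, publisher, and so on).

```typescript
// A step either selects an existing entity (by its BBID) or drafts a new one.
type StepValue =
  | { kind: "existing"; bbid: string }
  | { kind: "new"; name: string };

type StepName = "author" | "work" | "edition";

interface UnifiedFormState {
  author: StepValue;
  work: StepValue;
  edition: StepValue;
}

// Relationship to create on submission; the type labels are made up for this sketch.
interface PlannedRelationship {
  type: string;
  source: string;
  target: string;
}

// Existing entities are referenced by BBID, new ones by a temporary placeholder.
function refFor(step: StepName, value: StepValue): string {
  return value.kind === "existing" ? value.bbid : `new:${step}`;
}

// Derives the relationships the form would create without asking the user.
function planRelationships(state: UnifiedFormState): PlannedRelationship[] {
  const author = refFor("author", state.author);
  const work = refFor("work", state.work);
  const edition = refFor("edition", state.edition);
  return [
    { type: "author-wrote-work", source: author, target: work },
    { type: "edition-contains-work", source: edition, target: work }
  ];
}

// An existing author combined with a newly created work and edition.
const plan = planRelationships({
  author: { kind: "existing", bbid: "author-1" },
  work: { kind: "new", name: "Dune" },
  edition: { kind: "new", name: "Dune" }
});
```

Modelling each step as a tagged union keeps the "searched for and selected" and "created" cases explicit, so the submission logic can resolve placeholders to real identifiers once the new entities exist.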

Keep in mind that "adding a book" is a simple case, and that we should be able to optionally enter more complicated cases. For example, an anthology of short stories will have multiple Works, and the title of the Edition may be different from any of the Works it contains; or an anthology of various Authors might be attributed on the cover to a single Author.

The first phase should be to make a mockup with a prototyping/wireframing tool (for example: Sketch, Figma, Pencil Project). We will work together iteratively during both the UX and implementation sections of the project.

Here are three example scenarios for users of different experience levels. The unified form should accommodate each of them and provide the best user experience possible.

Scenario 1: User is new to BookBrainz and adding their first book. They don’t know what the different entities are, or what relationships are. The book to enter is a physical book the user has, a novel. The author’s name and the title of the book are on the cover.

Scenario 2: User has already entered some books for an author. The book is a collection of 20 short stories from that same author, written under a pen name. The pen name is on the cover, and the title of the book is different from the title of any of the short stories. The user has an ebook version and a paperback from another publisher, borrowed from the library. The book is a translation, and the translator is well known and their name appears on the first page.

Scenario 3: User has been using BookBrainz for a long time and understands how all entities relate. User has a series of magazines (35 issues) they want to add. The magazine has multiple editors.


Documentation and style guides

Proposed Mentors: Monkey
Languages/skills: Knowledge of BookBrainz/MusicBrainz editing, writing documentation

Forum for discussion

The BookBrainz user documentation is currently only a stub with very little content. This project aims to adapt the existing MusicBrainz documentation to BookBrainz.


You will write a beginner user's guide based on the MusicBrainz beginner's guide, adapt the How To pages and flesh out the existing BookBrainz FAQs (using the MusicBrainz FAQs as a reference).

You will also take the MusicBrainz style guides and adapt them to the world of BookBrainz. This includes descriptions of each field for each entity type, relationships between them, identifiers, and every part of the interface that requires an explanation.


Import the (now defunct) Bookogs database

Proposed Mentors: Monkey
Languages/skills: SQL, Node.js, knowledge of BookBrainz schema

Forum for discussion

The sites Bookogs and Comicogs, sister projects of Discogs, were closed in 2020; some of their editors chose BookBrainz to continue contributing open data.

The Bookogs database dumps were made publicly available for download in JSON format right after the project closed. In order to prevent the loss of Bookogs contributions, we want to import all the entries from the database dumps, as discussed in this thread.

This will require processing very large JSON files in a robust manner and creating "adapters" to transform entities from one database schema to the other, in a way that allows the process to be repeated without duplicating entries. You will put together a detailed plan of action ahead of time.
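
To make the "adapter" and "repeatable without duplicates" ideas more concrete, here is a minimal TypeScript sketch. Everything in it is an assumption for illustration: the `BookogsWork` and `ImportedWork` shapes are invented (the real dump format and the BookBrainz import schema will differ), and an in-memory object stands in for the database. A real importer would also read the dump as a stream rather than load it whole.

```typescript
// Hypothetical shape of one record in a Bookogs dump.
interface BookogsWork {
  id: number;
  title?: string;
}

// Hypothetical, simplified target shape; not the actual BookBrainz schema.
interface ImportedWork {
  originSource: "bookogs";
  originId: string;
  name: string;
}

// Adapter: converts one source record, rejecting records that cannot be used.
function adaptWork(record: BookogsWork): ImportedWork {
  const name = (record.title || "").trim();
  if (name === "") {
    throw new Error(`Bookogs work ${record.id} has no title`);
  }
  return { originSource: "bookogs", originId: String(record.id), name };
}

interface ImportReport {
  created: number;
  skipped: number;
  failed: string[];
}

// Imports records into a store keyed by source and origin ID, so that
// running the import again does not create duplicate entries.
function importWorks(records: BookogsWork[], store: Record<string, ImportedWork>): ImportReport {
  const report: ImportReport = { created: 0, skipped: 0, failed: [] };
  for (const record of records) {
    try {
      const work = adaptWork(record);
      const key = `${work.originSource}:${work.originId}`;
      if (Object.prototype.hasOwnProperty.call(store, key)) {
        report.skipped += 1;
      } else {
        store[key] = work;
        report.created += 1;
      }
    } catch (error) {
      // One bad record is logged and skipped instead of aborting the whole run.
      report.failed.push(String(error));
    }
  }
  return report;
}

const store: Record<string, ImportedWork> = {};
const dump: BookogsWork[] = [{ id: 1, title: "Dune" }, { id: 2 }, { id: 1, title: "Dune" }];
const firstRun = importWorks(dump, store);
const secondRun = importWorks(dump, store);
```

Keying every imported entry on its source and original identifier is what makes the second run a no-op: entries that already exist are counted as skipped rather than inserted again.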

Discussions are in progress for matching roles, formats and genres to BookBrainz' schema.