User:YvanZo/Draft/Cover Art Metadata Recognition: Difference between revisions

From MusicBrainz Wiki
(→‎Metadata recognition from cover art: Explain the difficulty and consider ML for a very long project)

Revision as of 18:29, 29 November 2023

Metadata recognition from cover art

Proposed mentors: bitmap, reosarevok, yvanzo
Languages/skills: React.js, WebAssembly
Forum for discussion
Estimated Project Length: 175 hours (or 350 hours if machine learning)
Difficulty: hard

MusicBrainz gathers metadata about releases and their cover art through the Cover Art Archive. Editors very often have to type in by hand the data shown in the cover art images. Parsing these images programmatically to extract as much metadata as possible (free text, title, artist credit, label code, barcode, tracklist…) would save them a great deal of effort.

The optical character recognition engine Tesseract can be used through either Naptha’s JavaScript port, Tesseract.js, or Knight’s WebAssembly build, tesseract-wasm. In either case, the web user interface has to be written in React.js to allow future integration into the website.

Tesseract has many parameters that allow tuning it for a specific use or restricting recognition to selected areas of an image. However, the main part of the project will probably be turning its raw output into something useful, that is, parsing the recognized text and mapping it to metadata fields. This parsing/mapping could potentially be done with machine learning, but that would likely double the project length (350 hours instead of 175).
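To give an idea of what the parsing/mapping step could look like without machine learning, here is a minimal rule-based sketch in TypeScript. It is only an illustration under stated assumptions, not the project's design: the `Track` and `ParsedCover` shapes, the function names and the regular expressions are all hypothetical, and the input is assumed to be the plain text returned by the OCR engine, one recognized line per text line. The only outside fact it relies on is the standard check digit rule for UPC-A and EAN-13 barcodes.

```typescript
// Hypothetical result shapes; the real project would map to MusicBrainz fields.
interface Track {
  position: number;
  title: string;
  length?: string; // e.g. "3:45", only when printed next to the title
}

interface ParsedCover {
  barcodes: string[];
  tracks: Track[];
  unparsed: string[]; // lines left for the editor to handle manually
}

// Validates the check digit of a UPC-A (12 digits) or EAN-13 (13 digits) code.
function hasValidCheckDigit(digits: string): boolean {
  if (!/^\d{12,13}$/.test(digits)) return false;
  // A UPC-A code is an EAN-13 code with an implicit leading zero.
  const padded = digits.padStart(13, "0");
  let sum = 0;
  for (let i = 0; i < 12; i++) {
    sum += Number(padded[i]) * (i % 2 === 0 ? 1 : 3);
  }
  return (10 - (sum % 10)) % 10 === Number(padded[12]);
}

// Assumed tracklist layout: number, optional separator, title, optional length.
const TRACK_LINE = /^(\d{1,2})[.)\-:]?\s+(.+?)(?:\s+(\d{1,2}:\d{2}))?$/;

function parseOcrText(text: string): ParsedCover {
  const result: ParsedCover = { barcodes: [], tracks: [], unparsed: [] };
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "") continue;

    // Barcode digits are often printed in groups, so allow inner spaces.
    const barcodes = (line.match(/\d[\d ]{10,}\d/g) ?? [])
      .map((candidate) => candidate.replace(/ /g, ""))
      .filter(hasValidCheckDigit);
    if (barcodes.length > 0) {
      result.barcodes.push(...barcodes);
      continue;
    }

    const match = TRACK_LINE.exec(line);
    if (match) {
      const track: Track = { position: Number(match[1]), title: match[2] };
      if (match[3] !== undefined) track.length = match[3];
      result.tracks.push(track);
      continue;
    }

    result.unparsed.push(line);
  }
  return result;
}
```

For example, given the four lines "THE ARTIST", "1. Intro 1:05", "2) Hello World" and "4 006381 333931", `parseOcrText` returns one barcode (4006381333931), two tracks, and leaves "THE ARTIST" in `unparsed`. Real cover art is far messier than this (multiple columns, stylized fonts, OCR misreads), which is why hand-written rules of this kind are at best a baseline and why a machine learning approach is considered for the longer project.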