User:YvanZo/Draft/Cover Art Metadata Recognition


Metadata recognition from cover art

Proposed mentors: bitmap, reosarevok, yvanzo
Languages/skills: React.js, WebAssembly
Forum for discussion
Estimated Project Length: TBD hours
Difficulty: TBD

MusicBrainz gathers metadata about releases and collects their cover art through the Cover Art Archive. Editors very often have to manually type in the data printed on the cover art images. A drastic boost for them would be to programmatically parse these images and extract as much metadata as possible: free text, title, artist credit, label code, barcode, tracklist…
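
For illustration only, here is a rough sketch of the kind of structured result such parsing could aim for. The field names are hypothetical and do not correspond to any existing MusicBrainz or Cover Art Archive schema.

```javascript
// Hypothetical shape of the metadata extracted from a single cover art image.
// Field names are illustrative only, not an existing MusicBrainz schema.
const extractedMetadata = {
  freeText: '',      // raw OCR output of the whole image
  title: '',         // release title as printed
  artistCredit: '',  // artist credit as printed
  labelCode: '',     // e.g. "LC 01234"
  barcode: '',       // EAN/UPC digits
  tracklist: [],     // e.g. [{ position: 1, title: '', length: '' }]
};
```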

The optical character recognition (OCR) engine Tesseract can be used through either Naptha’s JavaScript port Tesseract.js or Knight’s WebAssembly build tesseract-wasm. In either case, the web user interface has to be written in React.js to allow future integration into the website.
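
As a non-authoritative sketch of the OCR step, the following assumes the Tesseract.js v5 worker API; the function name is a placeholder and the language choice (English traineddata) is an assumption.

```javascript
import { createWorker } from 'tesseract.js';

// Sketch: run OCR on a cover art image and return the raw recognized text.
// Assumes the Tesseract.js v5 worker API; error handling is omitted.
async function recognizeCoverArt(imageUrl) {
  const worker = await createWorker('eng'); // load English traineddata (assumption)
  const { data } = await worker.recognize(imageUrl);
  await worker.terminate();
  return data.text; // free text to be parsed into candidate release metadata
}
```

A React.js component could then call such a function from an event handler and let the editor review the recognized text before anything is applied to the release.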