Tagger Script

From MusicBrainz Wiki
Revision as of 12:32, 19 March 2009 by Zout (talk | contribs) (removed author(s))
Status: This Page is Glorious History!

The content of this page either is bit-rotted, or has lost its reason to exist due to some new features having been implemented in MusicBrainz, or maybe just described something that never made it in (or made it in a different way), or possibly is meant to store information and memories about our Glorious Past. We still keep this page to honor the brave editors who, during the prehistoric times (prehistoric for you, newcomer!), struggled hard to build a better present and dreamed of an even better future. We also keep it for archival purposes because possibly it still contains crazy thoughts and ideas that may be reused someday. If you're not into looking at either the past or the future, you should just disregard this page's content entirely and look for an up-to-date documentation page elsewhere.

What is TaggerScript?

There are many requests to add certain options to the PicardTagger, like "Convert all first letters to upper case" or setting a certain frame in the ID3Tag of an MP3. It is possible, though, to handle all of those requests more flexibly than by adding checkboxes to the options dialogue.

TaggerScript is a general idea of enhancing the PicardTagger with lightweight scripting features that allow influencing the tagging process easily according to the preferences of the user.

There are two ways of doing so.

Variant 1: Variable System

This would work similarly to the naming description fields we have for filenames now. You would have fields like Album, Artist, Title and so on that represent the target; for an MP3 file, that would be the frame of the ID3Tag you want to put something into. And then it would have variables like $albumname, $sortname, $begindate and so on that represent the values that come from the server request. When assigning values to the fields you could use both variables and free text, but nothing more. It would look like this:

Album = $albumname [$year]
Artist = $sortname [$begindate - $enddate]

What is to be decided is whether it uses default values for certain fields (so you don't have to write everything down, such as the MusicBrainzIdentifiers) or whether it really only changes the values of those fields that you list in the description box (or wherever). This could be an option.
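As a rough illustration of Variant 1, the substitution could be sketched in Python with the standard library's string.Template, which happens to use the same $name notation. The function names, the use_defaults switch and the example values are assumptions made for this sketch, not anything Picard provides.

```python
from string import Template

def effective_templates(user_templates, default_templates, use_defaults=True):
    """Combine user field descriptions with defaults (hypothetical option)."""
    if not use_defaults:
        # Only the fields the user wrote down are changed.
        return dict(user_templates)
    merged = dict(default_templates)
    merged.update(user_templates)  # user entries win over defaults
    return merged

def fill_fields(templates, values):
    """Expand $variables in each field template using the server values."""
    # safe_substitute leaves an unknown $name in place instead of raising.
    return {field: Template(text).safe_substitute(values)
            for field, text in templates.items()}

# Hypothetical example data, not real server output.
server_values = {"albumname": "Example Album", "year": "2009",
                 "sortname": "Artist, Example", "begindate": "1990",
                 "enddate": "2000", "title": "Example Track"}
default_templates = {"Title": "$title", "Album": "$albumname"}
user_templates = {
    "Album": "$albumname [$year]",
    "Artist": "$sortname [$begindate - $enddate]",
}

tags = fill_fields(effective_templates(user_templates, default_templates),
                   server_values)
only_listed = fill_fields(
    effective_templates(user_templates, default_templates, use_defaults=False),
    server_values)
```

With use_defaults switched off, only Album and Artist would be touched; with it on, Title is also filled from the default description.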

Possible extensions: Add some predefined functions like upper_case_first_letters() which can also be used when assigning the server values to the fields. Use generic field names ("Album") that are matched to certain ID3Tags / FLAC comments depending on the tagging settings, but also define specialized field names that override the values of the generic fields. Those specialized field names are the exact equivalents of the places their values are written to (that is, you have one field for the ID3v2.3 TALB frame and one field for the ID3v2.4 TALB frame).
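A minimal Python sketch of these two extensions might look as follows. The mapping table, the "format:frame" spelling of specialized field names and the resolve() helper are all assumptions chosen for illustration; only upper_case_first_letters() is named in the text above.

```python
def upper_case_first_letters(text):
    """Upper-case the first letter of every space-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))

# Hypothetical table: generic field name -> target per tag format.
GENERIC_TARGETS = {
    "Album": {"id3v2.3": "TALB", "id3v2.4": "TALB", "vorbis": "ALBUM"},
}

def resolve(fields, tag_format):
    """Map field assignments to concrete targets for one tag format."""
    resolved = {}
    # First pass: generic names, translated through the table.
    for name, value in fields.items():
        target = GENERIC_TARGETS.get(name, {}).get(tag_format)
        if target is not None:
            resolved[target] = value
    # Second pass: specialized names ("<format>:<target>") override generics.
    prefix = tag_format + ":"
    for name, value in fields.items():
        if name.startswith(prefix):
            resolved[name[len(prefix):]] = value
    return resolved

fields = {
    "Album": upper_case_first_letters("example album"),
    "id3v2.4:TALB": "Example Album [2009]",  # specialized override
}
```

Here resolve(fields, "id3v2.3") would use the generic Album value, while resolve(fields, "id3v2.4") would pick up the specialized one.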

Variant 2: Complete scripting language

This would add a complete scripting language to Picard. You could define different scripts for different use cases, such as full tagging versus just updating. It would make the tagging process more complicated but would allow very complex operations through loops, if statements, string functions and so on.

Of course both variants could be combined - that is, you normally only see a text box where you can enter a syntax as described in variant 1. In the background this is transformed into the complete scripting syntax, and by clicking on an "Advanced" button you'd see the result and could edit it. Users of WinAmp might know this method from the simple view editor and its advanced variant. However, once the script code has been edited it might not be possible to transform it back to the simple syntax. Another way would be to make the syntax of the complete scripting language so easy that, for such simple tasks, it does not look much different from the syntax described in variant 1. Then you wouldn't need two editors.

Overkill?

One might think variant 2 is complete overkill and variant 1 is quite enough for what users might want to do with the tagger. But at least once the database converts to the NextGenerationSchema, it will be so abstract and contain so much more depth that there will be many more ways a file could be tagged. Here are some of them:

Use case 1: Disc numbers

The NextGenerationSchema is to separate many things we currently put together. Among those are SubTitles and DiscNumbers / DiscNames. That means the MainTitle and SubTitle of tracks should no longer be stored in one field, and neither should ReleaseTitle and DiscName. This also works for tagging: ID3 has separate fields for all of those parts, though very few players support them. Therefore some users might want to use the separate fields and some users might want to keep handling it as we do now. So usergroup 1 might want to do something like this:

Album = $albumname;
PartOfSet = $discnumber;
SetSubTitle = $discname;

(Where PartOfSet and SetSubTitle refer to the frames TPOS and TSST in the ID3v2.4.0 standard). And usergroup 2 would do:

Album = $albumname;
if ($discnumber != "")
{
        Album = Album + " (disc " + $discnumber;
        if ($discname != "")
                Album = Album + ": " + $discname;
        Album = Album + ")";
}
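If the scripting language were plain Python, the two user groups' preferences could be sketched roughly as below. The function names and the dictionary of server values are assumptions for illustration; the field names follow the pseudocode above.

```python
def tags_separate_fields(values):
    """Usergroup 1: keep album title, disc number and disc name apart."""
    return {
        "Album": values["albumname"],
        "PartOfSet": values.get("discnumber", ""),
        "SetSubTitle": values.get("discname", ""),
    }

def tags_combined_title(values):
    """Usergroup 2: fold disc number and disc name into the album title."""
    album = values["albumname"]
    discnumber = values.get("discnumber", "")
    discname = values.get("discname", "")
    if discnumber:
        album += " (disc " + discnumber
        if discname:
            album += ": " + discname
        album += ")"
    return {"Album": album}

# Hypothetical example values.
values = {"albumname": "Example Album", "discnumber": "2", "discname": "Live Set"}
```

For these values, tags_combined_title(values) yields an Album of "Example Album (disc 2: Live Set)", and an album without a disc number is left unchanged.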

Use case 2: Artist names

The current ObjectModel for the NextGenerationSchema makes a distinction between an artist (a - virtual or real - person or group of discographic relevance) and a release artist, which is mainly a label for how an artist name is written on a release. Users who want to stay close to the release might prefer using the release artist string; users who don't want several entries for one and the same artist in their media library might prefer using the Artist name string. This could still be done with a simple variable system. But when it comes to classical releases you have a composer linked to the composition object, a performer linked to the recording object and perhaps something completely different as release artist. With some if statements you could do wonders. ;)
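One possible policy, sketched in Python under heavy assumptions: the keys composer, performer, releaseartist and artistname, the Performer field and the prefer_release_artist switch are all invented for this example, since the text leaves the actual rules open.

```python
def choose_artist_tags(values, prefer_release_artist=True):
    """Pick Artist (and possibly Performer) tags from hypothetical server values."""
    tags = {}
    composer = values.get("composer", "")
    performer = values.get("performer", "")
    if composer:
        # Classical release: tag the composer, keep the performer separately.
        tags["Artist"] = composer
        if performer:
            tags["Performer"] = performer
    elif prefer_release_artist and values.get("releaseartist"):
        # Stay close to what is printed on the release.
        tags["Artist"] = values["releaseartist"]
    else:
        # One consistent name per artist in the media library.
        tags["Artist"] = values.get("artistname", "")
    return tags

# Hypothetical example values.
pop = {"artistname": "Example Artist", "releaseartist": "E. Artist"}
classical = {"composer": "Example Composer", "performer": "Example Orchestra",
             "releaseartist": "Various"}
```

A user who prefers library consistency would call choose_artist_tags(pop, prefer_release_artist=False) and get the Artist name string instead of the release artist string.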

Use case 3: Live tracks on live albums

Currently we don't always label tracks as what they are, but as what they are labeled as on a release, or we follow certain guidelines for this. For example, an album version on a single is not labeled as such. Or some mix may just be labeled as the normal song on a release. In the ObjectModel we have a mix object which shows which version of a song it is. This could contain the title of the mix. We also have a recording object which could contain information on whether the song was recorded live or in the studio.

With scripting features in the tagger, users could decide whether they prefer what is written on the release or what it really is. For tracks on a live album they could decide to append " (live)" to all track titles (which we of course don't do in the database because normally all tracks on a live album are live).
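As a small Python sketch of this preference: the recorded_live flag is assumed to come from the recording object, and the function name and suffix parameter are made up for illustration.

```python
def title_for_tag(title, recorded_live, mark_live=True, suffix=" (live)"):
    """Append a live marker to a track title if the user asked for it."""
    # Skip titles that already carry the marker to avoid doubling it.
    if mark_live and recorded_live and not title.endswith(suffix):
        return title + suffix
    return title

# Hypothetical track list from a live album: (title, recorded_live).
tracks = [("Example Song", True), ("Other Song (live)", True), ("Studio Bonus", False)]
tagged_titles = [title_for_tag(title, live) for title, live in tracks]
```

Users who prefer the titles exactly as they appear on the release would simply pass mark_live=False.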

GUI proposal

Here is how I imagine the GUI of TaggerScript would look. The v] at the end indicates a drop-down menu.

[Fill v] (1) [Album v](2) with [Album title without disc no. v](3) -(4)
[Fill v] (1) [Comment v](2) with [Album's disc no with 'disc: ' v](3) -(4)
+ (4)

(1) Fill, Append to, Prepend to

(2) Tag: depending on the format (id3v2, Ogg, ..)

(3) The TaggerScript "script", with a last item "Create New".

  • Either created with a new GUI or via REGEXP. For the above example I think of it as follows:
    * Album title without disc no.: $albumname( "s/\([^(]*\) (.*/\1/" )
    * Album's disc no with 'disc: ': $albumname( "s/[^(]* (disc \([0-9]+\)/disc: \1/" )
    As stated in the wiki this could be normal Python code too. Each would be one .py in a scripts subfolder.

(4) +/- to add another/ remove the current rule
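Written as normal Python code, the two scripts from item (3) might look roughly like this. The function names are assumptions, and the patterns are a Python re translation of the sed-style expressions above rather than an exact copy (the second one extracts just the number, so no trailing parenthesis is left behind).

```python
import re

def album_without_disc(albumname):
    """Strip a trailing ' (...)' part, e.g. ' (disc 2)', from the album title."""
    return re.sub(r"^([^(]*) \(.*$", r"\1", albumname)

def disc_comment(albumname):
    """Return 'disc: N' if the album title contains '(disc N', else ''."""
    match = re.search(r"\(disc ([0-9]+)", albumname)
    return "disc: " + match.group(1) if match else ""

# Hypothetical album title in the current naming style.
albumname = "Example Album (disc 2: Live Set)"
rules = {
    "Album": album_without_disc(albumname),
    "Comment": disc_comment(albumname),
}
```

Each function could live in its own .py file in the scripts subfolder and show up as one entry in drop-down (3).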

Different tag formats are chosen via radio buttons or tabs. Or even just check boxes, which would require a little exclamation mark or something similar to indicate that a tag field isn't present in one or more formats.

Implementation

See PicardQt/Scripting for details.

Discussion

You could strap python (or something similar) to picard, and have a module (or whatever, Perl is my language of choice) which provides canned functions for mangling data for tags, then extract all assigned variables and put them into tags. This would allow for both the simple concatenation of data, and also allow for writing hairy code to create tags. --MartinRudat

  • Yes, that's exactly what I want to do. Since Picard is written in Python I don't see need to implement a scripting language in another scripting language. --LukasLalinsky

I would suggest that there be at least two presets for the TaggerScript, one that supports the full ID3 spec, and the other that supports the way that MP3s are tagged now. Also, to be pedantic (and given I use OggVorbis), there's also an (unofficial) Vorbis comment spec too, which possibly needs its own support. --MartinRudat