Tagger Script: Difference between revisions

From MusicBrainz Wiki
((Imported from MoinMoin))
(Redirected page to MusicBrainz Picard)
 
#REDIRECT [[MusicBrainz Picard]]
==What is TaggerScript?==

There are many requests to add certain options to the [[Picard Tagger|PicardTagger]], such as "Convert all first letters to upper case" or setting a certain frame in the ID3Tag of an MP3. It is possible, though, to handle all of those requests more flexibly than by adding checkboxes to the options dialogue.

TaggerScript is the general idea of enhancing the [[Picard Tagger|PicardTagger]] with lightweight scripting features that let users influence the tagging process according to their own preferences.

There are two ways of doing so.

==Variant 1: Variable System==

This would work similarly to the naming description fields we have for filenames now. You would have ''fields'' like Album, Artist, Title and so on that represent the target; for an MP3 file, that would be the frame of the ID3Tag you want to put something into. You would also have ''variables'' like $albumname, $sortname, $begindate and so on that represent the values that come from the server request. When assigning values to the ''fields'' you could use both variables and free text, but nothing more. It would look like this:

<pre>Album = $albumname [$year]
Artist = $sortname [$begindate - $enddate]
</pre>
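
As a rough illustration of the variable system, the field descriptions above can be expanded with Python's standard <code>string.Template</code>, which happens to use the same <code>$name</code> notation. This is only a sketch: the sample values and the helper name <code>fill_fields</code> are made up for the example and are not part of any existing Picard code.

```python
from string import Template

# Hypothetical values as they might come back from a server request.
server_values = {
    "albumname": "Example Album",
    "year": "1999",
    "sortname": "Artist, Example",
    "begindate": "1990",
    "enddate": "2005",
}

# Field descriptions in the Variant 1 style: free text plus $variables.
field_templates = {
    "Album": "$albumname [$year]",
    "Artist": "$sortname [$begindate - $enddate]",
}


def fill_fields(templates, values):
    """Expand every field template with the given variable values."""
    return {
        field: Template(text).substitute(values)
        for field, text in templates.items()
    }


tags = fill_fields(field_templates, server_values)
print(tags["Album"])   # Example Album [1999]
print(tags["Artist"])  # Artist, Example [1990 - 2005]
```

Fields that are not listed in <code>field_templates</code> are simply left out here, which corresponds to the "only change the fields you note" behaviour rather than to the default-values alternative.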

What remains to be decided: does it use default values for certain fields (so you don't have to write everything down, like the [[MusicBrainz Identifier|MusicBrainzIdentifier]]s), or does it only change the values of those fields that you list in the description box (or wherever)? This could be an option.

Possible extensions: Add some predefined functions like upper_case_first_letters() that can also be used when assigning the server values to the fields. Use generic field names ("Album") that are matched to certain ID3Tags / FLAC comments depending on the tagging settings, but also define specialized field names that overwrite the values of the generic fields. Those specialized field names are the exact equivalents of the places their values are written to (that is, you have one field for the ID3v2.3 TALB frame and one field for the ID3v2.4 TALB frame).
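
The following Python sketch shows one way these extensions could fit together. Everything in it is an assumption made for illustration: the <code>format:TAG</code> spelling for specialized field names, the <code>GENERIC_FIELDS</code> table and the <code>resolve</code> helper are hypothetical, and only <code>upper_case_first_letters</code> takes its name from the text above.

```python
# Hypothetical mapping from generic field names to format-specific targets.
GENERIC_FIELDS = {
    "Album": {"id3v2.3": "TALB", "id3v2.4": "TALB", "vorbis": "ALBUM"},
    "Artist": {"id3v2.3": "TPE1", "id3v2.4": "TPE1", "vorbis": "ARTIST"},
}


def upper_case_first_letters(text):
    """Capitalize the first letter of every space-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def resolve(fields, tag_format):
    """Map field values to concrete tag names; specialized names win."""
    result = {}
    # First pass: generic fields such as "Album".
    for name, value in fields.items():
        if name in GENERIC_FIELDS:
            result[GENERIC_FIELDS[name][tag_format]] = value
    # Second pass: specialized "format:TAG" fields overwrite generic ones.
    for name, value in fields.items():
        prefix, _, tag = name.partition(":")
        if tag and prefix == tag_format:
            result[tag] = value
    return result


fields = {
    "Album": upper_case_first_letters("example album"),
    "id3v2.4:TALB": "Override For 2.4 Only",
}
print(resolve(fields, "id3v2.3"))  # {'TALB': 'Example Album'}
print(resolve(fields, "id3v2.4"))  # {'TALB': 'Override For 2.4 Only'}
```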

==Variant 2: Complete scripting language==

This would add a complete scripting language to Picard. You could define different scripts for different use cases, such as full tagging versus just updating. It would make the tagging process more complicated but would allow very complex operations through loops, if statements, string functions and so on.

Of course both variants could be combined: normally you only see a text box where you can enter the syntax described in variant 1. In the background this is transformed into the complete scripting syntax, and by clicking on an "Advanced" button you would see the result and could edit it. Users of [[WinAmp]] might know this method from the simple view editor and its advanced variant. However, once the script code has been edited it might not be possible to transform it back to the simple syntax. Another way would be to make the syntax of the complete scripting language so easy that, for such simple tasks, it looks little different from the syntax described in variant 1. Then you wouldn't need two editors.

==Overkill?==

One might think variant 2 is complete overkill and variant 1 is quite enough for what users might want to do with the tagger. But at the latest when the database converts to the [[Next Generation Schema|NextGenerationSchema]], the data will be so abstract and contain so much more depth that there will be many more possibilities for how a file could be tagged. Here are some of them:

===Use case 1: Disc numbers===

The [[Next Generation Schema|NextGenerationSchema]] is meant to separate many things we currently put together. Among those are [[Subtitle|SubTitle]]s and [[Disc Number|DiscNumber]]s / [[Disc Name|DiscName]]s. That means the [[Main Title|MainTitle]] and [[Subtitle|SubTitle]] of a track should no longer be stored in one field, and neither should [[Release Title|ReleaseTitle]] and [[Disc Name|DiscName]]. This separation also works for tagging: ID3 has separate fields for all of those parts, though very few players support them. Therefore some users might want to use the separate fields, and some users might want to keep handling it as we do now. So usergroup 1 might want to do something like this:

<pre>Album = $albumname;
PartOfSet = $discnumber;
SetSubTitle = $discname;
</pre>

(Where [[Part Of Set|PartOfSet]] and [[Set Sub Title|SetSubTitle]] refer to the frames TPOS and TSST in the [http://www.id3.org/id3v2.4.0-frames.txt ID3v2.4.0 standard]). And usergroup 2 would do:

<pre>Album = $albumname;
if ($discnumber != "")
{
    Album = Album + " (disc " + $discnumber;
    if ($discname != "")
        Album = Album + ": " + $discname;
    Album = Album + ")";
}
</pre>
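
For comparison, the second user group's rule can be written as a small Python function. This is a sketch only; the function name <code>album_with_disc</code> and the sample values are invented for the example, and it assumes the album title should be kept unchanged when there is no disc number.

```python
def album_with_disc(albumname, discnumber="", discname=""):
    """Fold disc number and disc name back into a single album title."""
    album = albumname
    if discnumber:
        album += " (disc " + discnumber
        if discname:
            album += ": " + discname
        album += ")"
    return album


print(album_with_disc("Example Album"))                   # Example Album
print(album_with_disc("Example Album", "2"))              # Example Album (disc 2)
print(album_with_disc("Example Album", "2", "Live Set"))  # Example Album (disc 2: Live Set)
```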

===Use case 2: Artist names===

The current [[Object Model|ObjectModel]] for the [[Next Generation Schema|NextGenerationSchema]] makes a distinction between an [[Object Model/Artist Object|artist]] (a - virtual or real - person or group of discographic relevance) and a [[Object Model/Release Artist Object|release artist]], which is mainly a label for how an artist name is written on a release. Users who want to stay close to the release might prefer using the release artist string; users who don't want several entries for one and the same artist in their media library might prefer using the artist name string. This could still be done with a simple variable system. But when it comes to classical releases you have a composer linked to the [[Object Model/Composition Object|composition object]], a performer linked to the [[Object Model/Recording Object|recording object]] and perhaps something completely different as release artist. With some if statements you could do wonders. ;)

===Use case 3: Live tracks on live albums===

Currently we don't always label tracks as what they are, but as what they are labeled on a release, or we follow certain guidelines for this. For example, an album version on a single is not labeled as such. Or a mix may just be labeled as the normal song on a release. In the [[Object Model|ObjectModel]] we have a [[Object Model/Mix Object|mix object]] which shows which version of a song it is. This could contain the title of the mix. We also have a [[Object Model/Recording Object|recording object]] which could record whether the song was recorded live or in the studio.

With scripting features in the tagger, users could decide whether they prefer what is written on the release or what it really is. For tracks on a live album they could decide to put " (live)" after all track titles (which we of course don't do in the database, because normally all tracks on a live album are live).

==GUI proposal==

Here is how I imagine the TaggerScript GUI would look. A "v]" at the end indicates a drop-down menu.
{| border="1"
|-
| [Fill v] (1) || [Album v](2) || with [Album title without disc no. v](3) || -(4)
|-
| [Fill v] (1) || [Comment v](2) || with [Album's disc no with 'disc: ' v](3) || -(4)
|-
| + (4)
|}

(1) Fill, Append to, Prepend to

(2) Tag: depending on the format (id3v2, Ogg, ..)

(3) The TaggerScript "script", with a last item "Create New".
<ul><li style="list-style-type:none">Either created with a new GUI or via regular expressions. For the example above I imagine something like the following:
* Album title without disc no.: $albumname( "s/\([^(]*\) (.*/\1/" ) <br>
* Album's disc no with 'disc: ': $albumname( "s/[^(]* (disc \([0-9]+\)/disc: \1/" )
As stated in the wiki, this could be normal Python code too. Each script would be one .py file in a scripts subfolder.
</ul>
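
If each rule were a small Python script, as suggested above, the two example rules might look roughly like this. The function names are hypothetical, and the regular expressions are written in Python's <code>re</code> syntax; they are similar in spirit to the sed-style expressions above rather than exact translations.

```python
import re


def album_without_disc(album):
    """Strip a trailing ' (disc ...)' suffix, if there is one."""
    return re.sub(r"\s*\(disc [^)]*\)\s*$", "", album)


def disc_label(album):
    """Return 'disc: N' if the album title carries a disc number, else ''."""
    match = re.search(r"\(disc (\d+)", album)
    return "disc: " + match.group(1) if match else ""


title = "Example Album (disc 2: Live Set)"
print(album_without_disc(title))  # Example Album
print(disc_label(title))          # disc: 2
```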

(4) +/- to add another/ remove the current rule

Different tag formats are chosen via radio buttons or tabs. Or even just check boxes, which would require a little exclamation mark or something similar to indicate that a tag field isn't present in one or more formats.

==Implementation==

This section describes the simple scripting language implemented in [[User:LukasLalinsky/PicardQt|PicardQt]].

===Syntax===

The syntax comes from [http://wiki.hydrogenaudio.org/index.php?title=Foobar2000:Titleformat_Reference Foobar2000's titleformat]. There are three base elements: '''text''', '''variable''' and '''function'''. Variables consist of alphanumeric characters enclosed in percent signs (e.g. <code><nowiki>%artist%</nowiki></code>). Functions start with a dollar sign and end with an argument list enclosed in parentheses (e.g. <code><nowiki>$lower(...)</nowiki></code>).
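
To make the three base elements concrete, here is a minimal Python sketch of an evaluator for this kind of syntax. It is not the PicardQt implementation. It assumes that unknown variables expand to an empty string, that all arguments are evaluated before the function is called, and that there is no escaping, so literal commas and closing parentheses cannot appear inside arguments. The name <code>evaluate</code> and the two sample functions are invented for the example.

```python
def evaluate(script, variables, functions):
    """Evaluate a titleformat-style script and return the resulting string."""
    pos = 0

    def parse(stop_chars):
        nonlocal pos
        out = []
        while pos < len(script) and script[pos] not in stop_chars:
            ch = script[pos]
            if ch == "%":
                # Variable: %name%
                end = script.index("%", pos + 1)
                out.append(variables.get(script[pos + 1:end], ""))
                pos = end + 1
            elif ch == "$":
                # Function: $name(arg1,arg2,...)
                open_paren = script.index("(", pos)
                name = script[pos + 1:open_paren]
                pos = open_paren + 1
                args = [parse(",)")]
                while script[pos] == ",":
                    pos += 1
                    args.append(parse(",)"))
                pos += 1  # skip the closing ")"
                out.append(functions[name](*args))
            else:
                # Plain text
                out.append(ch)
                pos += 1
        return "".join(out)

    return parse("")


sample_functions = {
    "upper": lambda text: text.upper(),
    "if": lambda cond, then, otherwise="": then if cond else otherwise,
}
result = evaluate(
    "$upper(%artist%) - $if(%album%,%album%,unknown)",
    {"artist": "abc"},
    sample_functions,
)
print(result)  # ABC - unknown
```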

===Variables===

<ul><li style="list-style-type:none">''TODO: Add all tag names from [[Unified Tagging|UnifiedTagging]] here.''
</ul>

===Functions===

====$if(if,then,else)====

<ul><li style="list-style-type:none">If <code><nowiki>if</nowiki></code> is not empty, returns <code><nowiki>then</nowiki></code>; otherwise returns <code><nowiki>else</nowiki></code>.
</ul>

====$if2(a1,a2,a3,...)====

<ul><li style="list-style-type:none">Returns the first non-empty argument.
</ul>

====$lower(text)====

<ul><li style="list-style-type:none">Returns <code><nowiki>text</nowiki></code> in lower case.
</ul>

====$upper(text)====

<ul><li style="list-style-type:none">Returns <code><nowiki>text</nowiki></code> in upper case.
</ul>

====$left(text,num)====

<ul><li style="list-style-type:none">Returns the first <code><nowiki>num</nowiki></code> characters of <code><nowiki>text</nowiki></code>.
</ul>

====$right(text,num)====

<ul><li style="list-style-type:none">Returns the last <code><nowiki>num</nowiki></code> characters of <code><nowiki>text</nowiki></code>.
</ul>

====$replace(text,search,replace)====

<ul><li style="list-style-type:none">..
</ul>

====$search(text)====

<ul><li style="list-style-type:none">..
</ul>
<ul><li style="list-style-type:none">''TODO: add more functions, add descriptions''
</ul>
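
The functions that already have a description above are simple enough to sketch in Python. The following is an illustration based only on those descriptions, not the PicardQt code; the <code>fn_</code> prefix and the <code>FUNCTIONS</code> table are naming choices made for the example, and <code>$replace</code> and <code>$search</code> are left out because their behaviour is not specified yet.

```python
def fn_if(cond, then, otherwise=""):
    """$if: return `then` if `cond` is not empty, else `otherwise`."""
    return then if cond else otherwise


def fn_if2(*args):
    """$if2: return the first non-empty argument (or '' if all are empty)."""
    for arg in args:
        if arg:
            return arg
    return ""


def fn_lower(text):
    """$lower: return text in lower case."""
    return text.lower()


def fn_upper(text):
    """$upper: return text in upper case."""
    return text.upper()


def fn_left(text, num):
    """$left: return the first `num` characters of text."""
    return text[:max(int(num), 0)]


def fn_right(text, num):
    """$right: return the last `num` characters of text."""
    count = int(num)
    return text[-count:] if count > 0 else ""


# Script arguments arrive as strings, so `num` is converted with int().
FUNCTIONS = {
    "if": fn_if,
    "if2": fn_if2,
    "lower": fn_lower,
    "upper": fn_upper,
    "left": fn_left,
    "right": fn_right,
}

print(FUNCTIONS["left"]("abcdef", "2"))   # ab
print(FUNCTIONS["right"]("abcdef", "2"))  # ef
```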

===Examples===

====Use case 1: Disc numbers====

<pre>$set(discnumber,$search(%album%,\(disc (\d+),))
$set(album,$replace(%album%,\(disc \d+(: [^)]+)?\),))
</pre>

====Use case 2: Artist names====

<pre>$if($search(%album%,(feat. conductor)),
$set(artist,%orchestra%))
</pre>
<ul><li style="list-style-type:none">''Stupid assumption that all classical albums have "feat. conductor" in the title, but it shows the idea. :)''
</ul>

====Use case 3: Live tracks on live albums====

<pre>$if($and($eq(%albumstatus%,live),$not($search(%title%,(\(live\))))),$set(title,%title% (live)))
</pre>
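
The same rule can be spelled out in Python to make the logic easier to follow. This is a sketch; the function name is made up, and it assumes that the album status is the plain string "live" and that <code>$search</code> is a regular-expression search.

```python
import re


def add_live_suffix(title, albumstatus):
    """Append ' (live)' to titles on live albums unless already present."""
    if albumstatus == "live" and not re.search(r"\(live\)", title):
        return title + " (live)"
    return title


print(add_live_suffix("Some Song", "live"))         # Some Song (live)
print(add_live_suffix("Some Song (live)", "live"))  # Some Song (live)
print(add_live_suffix("Some Song", "official"))     # Some Song
```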

====Lower case filenames with underscores====

<pre>$lower($replace(%albumartist%/%album%/$num(%tracknumber%,2) %title%, ,_))
</pre>
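
In plain Python the same naming rule could look like the sketch below. It assumes that <code>$num(%tracknumber%,2)</code> means "zero-pad the track number to two digits", which is not defined on this page, and the function name is invented for the example.

```python
def lower_case_path(albumartist, album, tracknumber, title):
    """Build 'artist/album/NN title', lower-cased, spaces as underscores."""
    path = f"{albumartist}/{album}/{int(tracknumber):02d} {title}"
    return path.lower().replace(" ", "_")


print(lower_case_path("Example Artist", "Example Album", "3", "Some Title"))
# example_artist/example_album/03_some_title
```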

==Discussion==

You could strap Python (or something similar) onto Picard, and have a module (or whatever; Perl is my language of choice) that provides canned functions for mangling data for tags, then extract all assigned variables and put them into tags. This would allow both simple concatenation of data and hairy code for creating tags. --[[User:MartinRudat|MartinRudat]]
<ul><li style="list-style-type:none">Yes, that's exactly what I want to do. Since Picard is written in Python, I don't see the need to implement a scripting language in another scripting language. --[[User:LukasLalinsky|LukasLalinsky]]
</ul>

I would suggest that there be at least two presets for the TaggerScript, one that supports the full ID3 spec, and the other that supports the way that MP3s are tagged now. Also, to be pedantic (and given I use [[Ogg Vorbis|OggVorbis]]), there's also an (unofficial) Vorbis comment spec too, which possibly needs its own support. --[[User:MartinRudat|MartinRudat]]
----
Original author: [[User:Shepard|Shepard]]

[[Category:To Be Reviewed]] [[Category:Terminology]] [[Category:Development]] [[Category:Proposal]]

Latest revision as of 22:51, 20 May 2015
