Indexed Search Syntax: Difference between revisions

From MusicBrainz Wiki
(Not listed anywhere, but it works :) (Imported from MoinMoin))
(Updated for the new version of the search server now in production. (Imported from MoinMoin))
Revision as of 22:44, 22 February 2008

This page describes the syntax for MusicBrainz indexed searches, which use the Lucene text search engine. The indexes for these searches are updated once a day, so results may not reflect up-to-the-minute changes. The direct search queries the database directly and is always up to date, but it can only carry out simple keyword searches with no boolean logic.

Overview

Lucene offers a great deal of flexibility in defining search queries. To make it easier to understand, this page is divided into subpages: this one introduces the most commonly used features, while the others explain more advanced search operators and constructs.

First, some words on the terminology used in these pages:

Query
A query is the complete expression you put in one of the search fields.
Term
A term is the smallest unit inside a query. In the default case each single word inside a query is a term of its own, except for ...
Phrases
A phrase is a group of words surrounded by quotation marks. Even though it contains more than one word, a phrase is handled like a term.
Operators
Operators (or search operators) are special characters and words that define either how a single term is processed by the search system (e.g. in -house the - tells the search system not to return anything with the word house) or how terms are to be combined in the search (e.g. one AND love means search for anything that has both words one and love).
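
To make the terminology concrete, here is a minimal Python sketch that splits a query into operators, phrases and plain terms. The `classify` helper, the regular expression and the operator list are illustrative assumptions; they are not part of MusicBrainz or Lucene and only mimic the definitions above.

```python
import re

# Hypothetical helper: it only illustrates the terminology, it is not
# how the search server actually parses a query.
OPERATORS = {"AND", "OR", "NOT"}

# A token is either a quoted phrase (optionally prefixed by "field:")
# or a run of non-whitespace characters.
TOKEN = re.compile(r'(?:\w+:)?"[^"]*"|\S+')


def classify(query):
    """Return a list of (kind, text) pairs for each piece of the query."""
    parts = []
    for token in TOKEN.findall(query):
        if token in OPERATORS:
            kind = "operator"
        elif '"' in token:
            kind = "phrase"
        else:
            kind = "term"
        parts.append((kind, token))
    return parts


if __name__ == "__main__":
    for kind, text in classify('"voodoo people" AND artist:"the prodigy"'):
        print(kind, text)
```

Prefix operators such as the - in -house stay attached to their term in this sketch, so -house is reported as a single term.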

The sections under Query syntax below describe simple and commonly used operators; in /AdvancedSyntax you'll find the more complicated and seldom-needed features of the search interface.

But first take a look at a few simple examples which might show everything necessary for the majority of your searches.

Example searches

Artists

  • tori amos - search artist, sortname and alias fields
  • comment:electronic - search for the word electronic in artist disambiguation (comment) fields
  • begin:1984 AND type:group - search for all groups formed in 1984

Albums

  • café del mar - search for all Café del Mar albums
  • "the understanding" AND artist:royksopp - search for the album The Understanding by Röyksopp
  • date:1999 AND country:de AND rock - search for releases released in Germany in 1999 with the word rock in them

Tracks

  • type:album AND amadeus - search for tracks from albums with the title amadeus
  • day life will retrieve A Day In The Life and Life In A Day but also This Day and That's Life
  • day AND life will retrieve A Day In The Life and Life In A Day but not This Day or That's Life
  • "day in the life" will retrieve A Day In The Life but not Life In A Day or This Day or That's Life
  • "voodoo people" AND artist:"the prodigy" - search for all tracks Voodoo People by The Prodigy
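
The difference between the three "day life" examples above can be reproduced with a small Python sketch. It assumes plain lowercase, whitespace-separated word matching, which is a simplification: the real index may analyse text differently. The function names are hypothetical.

```python
TITLES = ["A Day In The Life", "Life In A Day", "This Day", "That's Life"]


def words(text):
    """Lowercase the text and split it on whitespace (a simplification)."""
    return text.lower().split()


def match_any(query, title):
    """Mimics 'day life': at least one query word occurs in the title."""
    return any(w in words(title) for w in words(query))


def match_all(query, title):
    """Mimics 'day AND life': every query word occurs in the title."""
    return all(w in words(title) for w in words(query))


def match_phrase(query, title):
    """Mimics '"day in the life"': the words occur adjacently, in order."""
    q, t = words(query), words(title)
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))


if __name__ == "__main__":
    print([t for t in TITLES if match_any("day life", t)])
    print([t for t in TITLES if match_all("day life", t)])
    print([t for t in TITLES if match_phrase("day in the life", t)])
```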

Query syntax

Wildcards

To perform a single-character wildcard search, use the "?" symbol. To perform a multiple-character wildcard search, use the "*" symbol. For example, to search for "text" or "test" you can use the search te?t; to search for "test", "tests" or "tester", you can use the search test*.

Note: You cannot use a * or ? symbol as the first character of a search.
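
As a rough illustration of these wildcard rules, the sketch below uses Python's standard `fnmatch` module, whose "?" and "*" happen to behave the same way for simple patterns. The `wildcard_match` helper is hypothetical, and `fnmatch` also gives "[" a special meaning that the search syntax does not, so this only approximates the behaviour described here.

```python
from fnmatch import fnmatchcase


def wildcard_match(pattern, word):
    """Return True if word matches pattern, where ? is exactly one
    character and * is any run of characters (possibly empty)."""
    if pattern[:1] in ("*", "?"):
        # Mirrors the note above: no wildcard as the first character.
        raise ValueError("a search cannot start with * or ?")
    return fnmatchcase(word, pattern)


if __name__ == "__main__":
    for candidate in ["text", "test", "tests", "tester"]:
        print(candidate, wildcard_match("te?t", candidate), wildcard_match("test*", candidate))
```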

Fuzzy searches

To do a fuzzy search, use the tilde symbol, "~", at the end of a single-word term. Optionally, you can specify the required similarity, a value between 0 and 1. For example, to search for a term similar in spelling to "roam", use the fuzzy search roam~ or roam~0.8.

MusicBrainz specific search fields

The artist index contains the following fields you can search:

field     Description
arid      artist id
artist    artist name
sortname  artist sortname
type      artist type (person or group)
begin     artist birth date/band founding date
end       artist death date/band dissolution date
comment   artist comment to differentiate similar artists
alias     the aliases/misspellings for this artist

Artist search terms with no fields specified search the artist, sortname and alias fields.

The release/album index contains these fields:

field    Description
reid     release id
release  release name
arid     artist id
artist   artist name
type     release type (album, single, ep, compilation, soundtrack, spokenword, interview, audiobook, live, remix, other)
status   release status (official, promotion, bootleg, pseudo-release)
tracks   number of tracks in the release
discids  number of cd ids for the release
date     earliest release date for the release
asin     the Amazon ASIN for
lang     The language for this release. Use the three character ISO 639 codes to search for a specific language. (e.g. lang:eng)
script   The 4 character script code (e.g. latn) used for this release
country  The two letter country code for the release country
date     The release date
label    The name of the label for this release
catno    The catalog number for this release
barcode  The barcode in a release event attached to a release


Album search terms with no fields search the release field only.
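
Queries that combine free text with the fields listed above can be assembled mechanically. The Python sketch below is a hypothetical helper, not something MusicBrainz provides: it joins clauses with AND and wraps values containing spaces in quotation marks so they are treated as phrases. It does not escape special characters, so it is only suitable for simple values.

```python
def field_clause(field, value):
    """Format one field:value clause, quoting multi-word values."""
    value = str(value)
    if " " in value:
        value = '"%s"' % value
    return "%s:%s" % (field, value)


def build_query(text=None, **fields):
    """Join optional free text and field clauses with AND.

    Free text is passed through unchanged, so quote phrases yourself.
    """
    clauses = []
    if text:
        clauses.append(text)
    clauses.extend(field_clause(name, value) for name, value in fields.items())
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(build_query("rock", date=1999, country="de"))
    print(build_query('"voodoo people"', artist="the prodigy"))
```

The first call reproduces the release example from this page in a different clause order; since all clauses are joined with AND, the order should not change which results match.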

And track searches can contain:

field    Description
trid     track id
track    track name
arid     artist id
artist   artist name
reid     release id
release  release name
type     release type (album, single, ep, compilation, soundtrack, spokenword, interview, audiobook, live, remix, other)
tracks   number of tracks in the release
dur      duration of track in milliseconds
qdur     quantized duration (duration / 2000)
tnum     track number

Track search terms with no fields search the track field only.
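
The qdur field groups tracks into two-second buckets, which makes it possible to search for tracks of roughly a given length. The sketch below assumes that the division rounds down to a whole number; the table only states "duration / 2000", so the rounding is a guess, and both function names are hypothetical.

```python
QUANTUM_MS = 2000  # bucket size in milliseconds, from "duration / 2000"


def qdur(dur_ms):
    """Quantized duration of a track, assuming the result is rounded down."""
    return dur_ms // QUANTUM_MS


def qdur_query(dur_ms):
    """Build a qdur clause for a track of the given length."""
    return "qdur:%d" % qdur(dur_ms)


if __name__ == "__main__":
    # A track of 3 minutes 35 seconds is 215000 milliseconds long.
    print(qdur_query(215000))
```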

Questions

If you know the answer to these, please remove the question and integrate the answer into the docs above.

  • Is there a way to search for an album based on its length? I'm currently grabbing all results and filtering them by summing their track durations. Is the value computed on the fly from the tracks for each release page? --ChrisColvard
    • No, that is not possible. You can search for the total number of tracks. --RobertKaye
  • Is there also a searchable field for the media type, not just the date? ie "media:1" where 1=CD? -- BrianSchweitzer 12:44, 28 October 2007 (UTC)