User:Jokipii: Difference between revisions
From MusicBrainz Wiki
(possible future bot sets (Advanced relationships between releases and artists where both are already linked to discogs))
(update)
Revision as of 12:34, 9 March 2012
Who am I?
MusicBrainz editor Jokipii and operator of Jokipii_bot. I have both the MusicBrainz and Discogs databases installed on PostgreSQL, and I am currently trying to improve the linking between them. The bot code can be found at musicbrainz-bot (https://github.com/Jokipii/musicbrainz-bot), and the code that produces the Discogs database from the monthly XML dumps is at discogs-xml2db (https://github.com/Jokipii/discogs-xml2db).
Userscripts
Here is a userscript that makes voting on Discogs links easier.
Bot queue
Set descriptions and number of links
In Progress
* Release links identified by exact match on catalog number, release name, linked label, format, number of tracks and release country.
** 18000
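The matching rule above can be sketched in Python as an exact comparison of key tuples. This is an illustration only, not the bot's actual code: the field names (`catno`, `name`, `label_id`, `format`, `track_count`, `country`) and the dictionary layout are assumptions, and `label_id` stands for a label that is already linked between the two databases.

```python
from collections import defaultdict

# Hypothetical field names, chosen for this sketch only.
KEY_FIELDS = ("catno", "name", "label_id", "format", "track_count", "country")


def match_key(release):
    """Build the tuple of fields that must be identical on both sides."""
    return tuple(release[field] for field in KEY_FIELDS)


def find_release_links(mb_releases, discogs_releases):
    """Return sorted (mb_id, discogs_id) pairs whose key tuples match exactly.

    A key shared by several releases on either side is skipped; that
    precaution is an assumption of this sketch, not part of the set description.
    """
    mb_by_key = defaultdict(list)
    for release in mb_releases:
        mb_by_key[match_key(release)].append(release["id"])

    discogs_by_key = defaultdict(list)
    for release in discogs_releases:
        discogs_by_key[match_key(release)].append(release["id"])

    links = []
    for key, mb_ids in mb_by_key.items():
        discogs_ids = discogs_by_key.get(key, [])
        if len(mb_ids) == 1 and len(discogs_ids) == 1:
            links.append((mb_ids[0], discogs_ids[0]))
    return sorted(links)
```

Grouping by the full key on both sides keeps the comparison strict: a release that differs in any one field, such as track count or country, never produces a link.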
Not Started
* Artist Discogs links
** Exact name match. One or more already linked various-artists release(s) where the artist has track(s). All tracks found that way point to the same artist at Discogs.
** Example
** 21546
* Advanced relationships between releases and artists where both are already linked to Discogs
** Producer (hand-made example): 74164
** Mastered: 27970
** and certainly many more in other relationship classes
* Artist name with exact (case-insensitive) match, is a member of groups with Discogs links, and all groups found that way have the same Discogs artist as a member.
** 8692
* Artist (type:group) name with exact (case-insensitive) match, has members with Discogs links, and all members found that way are also marked as members in the Discogs entry.
** 2083
* Artist that has a Discogs link, does not have a type (person/group) set, and has multiple members in Discogs (indicating type:group)
** 1309
* Artist that has a Discogs link, does not have a type (person/group) set, and has a Discogs realname without the characters "&,/+" and the word "and" (indicating type:person)
** 1699
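The last two sets above, which guess a missing artist type from Discogs data, could be sketched roughly as follows. The function name and its parameters are hypothetical, and combining both rules in one function (with the group rule checked first) is a choice made for this sketch; the sets are listed separately above.

```python
import re

# Characters that suggest a realname lists more than one person.
FORBIDDEN_CHARS = set("&,/+")
# The standalone word "and", so that names like "Andy" or "Sanders" still pass.
AND_WORD = re.compile(r"\band\b", re.IGNORECASE)


def guess_artist_type(discogs_members, discogs_realname):
    """Return "group", "person", or None when neither rule applies."""
    # Rule 1: multiple members in Discogs indicate type:group.
    if len(discogs_members) > 1:
        return "group"
    # Rule 2: a realname without "&,/+" and without the word "and"
    # indicates type:person.
    if (
        discogs_realname
        and not FORBIDDEN_CHARS & set(discogs_realname)
        and not AND_WORD.search(discogs_realname)
    ):
        return "person"
    return None
```

Matching "and" only as a whole word is an interpretation of the rule; a plain substring test would wrongly reject names that merely contain those letters.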
Done
* Artist Discogs links
** Exact name match. Has release(s) with Discogs links. All releases found that way point to the same artist at Discogs.
* Artist types based on disambiguation comment
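The artist-link rule in the list above amounts to a unanimity check. A minimal Python sketch, assuming the Discogs artists found through the linked releases are available as `(discogs_id, discogs_name)` pairs (a hypothetical input shape, not the bot's data model):

```python
def propose_artist_link(mb_artist_name, discogs_artists_from_releases):
    """Return the Discogs artist id to link, or None if the evidence disagrees.

    discogs_artists_from_releases holds one (discogs_id, discogs_name) pair
    per already linked release of the MusicBrainz artist.
    """
    ids = {discogs_id for discogs_id, _ in discogs_artists_from_releases}
    # No linked releases, or releases pointing at different Discogs artists.
    if len(ids) != 1:
        return None
    discogs_id, discogs_name = discogs_artists_from_releases[0]
    # The names must match exactly.
    if discogs_name != mb_artist_name:
        return None
    return discogs_id
```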
Bot programming tasks
- Merge bot code to musicbrainz-bot Done
- Start using discogs-xml2db to produce Discogs database Done
- Better documentation
- Map Discogs credits <-> MB Advanced relationships
Some stats
 | MusicBrainz Total | Discogs Total | Links (all these are not unique) | Percent done (compared to smaller total) | Sum of unique MusicBrainz releases connected to linked entities | Percent of all MusicBrainz releases | Sum of unique Discogs releases connected to linked entities | Percent of all Discogs releases |
---|---|---|---|---|---|---|---|---|
Releases: | 988301 | 2720810 | 171170 | 17% | | | | |
Release groups: | 822442 | 365081 | 47387 | 13% | 106445 | 11% | 277207 | 10% |
Artist: | 626598 | 2100250 | 110825 | 18% | 606737 | 61% | 1738474 | 64% |
Label: | 55844 | 245988 | 16004 | 29% | 414436 | 42% (see note) | 1542803 | 57% |
Note: In MB only 567392 releases have label information; 420909 do not.
2012-02-23 | MusicBrainz Total | Discogs Total | Links (all these are not unique) | Percent done (compared to smaller total) |
---|---|---|---|---|
Releases: | 1008061 | 2926422 | 182581 | 18% |
Release Groups: | 839314 | 405891 | 73656 | 18% |
Artists: | 644784 | 2251519 | 126210 | 20% |
Labels: | 58038 | 300452 | 17141 | 30% |
2012-03-09 | MusicBrainz Total | Discogs Total | Links (all these are not unique) | Percent done (compared to smaller total) |
---|---|---|---|---|
Releases: | 1012072 | 2926422 | 193183 | 19% |
Release Groups: | 842794 | 405891 | 76053 | 19% |
Artists: | 647860 | 2251526 | 128111 | 20% |
Labels: | 58432 | 300452 | 17224 | 29% |
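The "Percent done (compared to smaller total)" column can be recomputed as the number of links divided by the smaller of the two database totals. The helper below is a sketch; the function name is made up, and rounding to the nearest whole percent is an assumption that happens to reproduce the 2012-03-09 figures.

```python
def percent_done(mb_total, discogs_total, links):
    """Links as a whole-number percentage of the smaller database total."""
    smaller_total = min(mb_total, discogs_total)
    return round(100 * links / smaller_total)


# Rows of the 2012-03-09 table: (MusicBrainz total, Discogs total, links).
stats = {
    "Releases": (1012072, 2926422, 193183),
    "Release Groups": (842794, 405891, 76053),
    "Artists": (647860, 2251526, 128111),
    "Labels": (58432, 300452, 17224),
}

for entity, (mb_total, discogs_total, links) in stats.items():
    print(f"{entity}: {percent_done(mb_total, discogs_total, links)}%")
```

For release groups the smaller total is the Discogs one (405891), which is why that row reaches 19% even though MusicBrainz has more than twice as many release groups.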