History:Display Inheritance Proposal

From MusicBrainz Wiki
Revision as of 17:16, 28 August 2016 by CallerNo6 (talk | contribs) (CallerNo6 moved page History:Display Inheritance Propodal to History:Display Inheritance Proposal without leaving a redirect: typo)
Status: This page describes a failed proposal. It is not official, and should only be used, if at all, as the basis for a new proposal.

Proposal number: RFC-Unassigned
Champion: None
Status: Failed (officially closed as abandoned, March 24, 2010)
This proposal was not tracked in Trac.

Status: this proposal still lacks a lot of flesh and thought, and may have quite a deep development impact, making it either difficult to implement with the current schema version or not worth the effort before NGS.

This proposal, which tries to solve redundancy problems, is based on an idea that someone (I don't remember who) had in the discussion around the redundancy of Björk's birth date (see also DateOfBirthStyle).

The problem is: we have multiple entities which share the same data, or should share it. At the moment we can only store it multiple times. This produces unwanted DataRedundancy.


  • Björk is a performance name for the person Björk Guðmundsdóttir (actually it isn't, but that's another discussion). A user browsing Björk's artist page wants to see her birth date. If it's not present, he will probably try to add it. But the date is already present on the artist page of Björk Guðmundsdóttir, so here we have redundancy.
  • Jeanette is a performance name for the person Jeanette Biedermann. The German Wikipedia article is at http://de.wikipedia.org/wiki/Jeanette_Biedermann, so I stored it under her LegalName. But it would be nice if it were also visible under the performance name. If, however, the article were at http://de.wikipedia.org/wiki/Jeanette and she had another performance name, I would not want it to appear under that other performance name.
  • Album reviews are normally about albums in general and not about certain releases or editions, so I want a review link to be visible on all releases of an album without storing it under each of them. But sometimes an article is only about a certain release or edition, and then I want the review link to be shown only for this edition and not for the others.
  • An album consisting of two discs is stored as two separate AlbumEntities in the db. But I don't want to store the release date twice. And I don't want to link to the Discogs entry twice (they have only one page for albums consisting of several discs).

How can this be solved? Using the PowerOfAR!


Let's look at it more abstractly. We have some entities which are linked in different ways through AR. Each entity has data fields in the db and AR links to additional information. You can identify one entity as the MainEntity (for artists: the LegalName or BirthName; for albums: the earliest release) and the others as SubEntities. Store all information which is also valid for the SubEntities under the MainEntity (this does not have to be the root of the whole tree but can also be the root of a subset). When building the page for a SubEntity, try to look up data from the linked MainEntity under certain rules (for example, depending on the relationship type) and with a certain search depth (should data from the parent of the parent node also be looked up?), and display it on the page as if it belonged to the SubEntity. The SubEntity inherits data from its MainEntity only in the display, hence DisplayInheritance.

This sounds rather technical. An example? Of course:

Jeanette Biedermann is the MainEntity, Jeanette is its SubEntity. When building the page for Jeanette, the birth date of Jeanette Biedermann and her links are looked up and displayed as if they were stored under Jeanette. The same would apply to other PerformanceNames if she had any. But if you store a link under Jeanette, it will only be shown under this entity. Only if Jeanette had further children would they inherit this data.

If you have an earliest release of an album and have linked a remaster or re-release and a special edition to it (the link type for this is yet to be invented), then you can use the link type as a rule for deciding whether a SubEntity should inherit data. For example, the earliest release has a review page. Then you decide: all linked re-releases and remasters inherit this review page. Linked special editions do not.

So this will require a well-thought-out model for what will be inherited and what not. And it will require some more link types which link SubEntities to MainEntities.
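
A minimal sketch of such a lookup, to make the idea concrete. It is not an implementation against the real schema: the link type names, the field names, the `Entity` class and the `INHERIT_RULES` table are all hypothetical, and the sketch assumes that list-valued fields (such as links) are merged while single-valued fields (such as a birth date) are only inherited if the SubEntity has no value of its own.

```python
from dataclasses import dataclass, field

# Hypothetical rule table: for each link type, the fields a SubEntity may
# inherit from the MainEntity it is linked to. None of these names are taken
# from the MusicBrainz schema.
INHERIT_RULES = {
    "performance_name_of": {"birth_date", "links"},
    "re_release_of": {"reviews"},
    "remaster_of": {"reviews"},
    "special_edition_of": set(),  # special editions inherit nothing
}


@dataclass
class Entity:
    name: str
    data: dict = field(default_factory=dict)      # fields stored on this entity
    parents: list = field(default_factory=list)   # (link_type, Entity) pairs


def display_data(entity, max_depth=2):
    """Return the fields to show on an entity's page.

    Stored data is never modified; inheritance happens only in the result.
    """
    shown = dict(entity.data)
    if max_depth <= 0:
        return shown
    for link_type, parent in entity.parents:
        allowed = INHERIT_RULES.get(link_type, set())
        inherited = display_data(parent, max_depth - 1)
        for key, value in inherited.items():
            if key not in allowed:
                continue
            if isinstance(value, list):
                # Assumption: list fields are merged, own entries first.
                shown[key] = shown.get(key, []) + value
            elif key not in shown:
                # Assumption: the SubEntity's own single value wins.
                shown[key] = value
    return shown


# Usage with placeholder data (the date is a dummy value, not a real one).
legal_name = Entity(
    "Jeanette Biedermann",
    {"birth_date": "YYYY-MM-DD",
     "links": ["http://de.wikipedia.org/wiki/Jeanette_Biedermann"]},
)
performance_name = Entity(
    "Jeanette", {}, [("performance_name_of", legal_name)]
)
page = display_data(performance_name)  # contains birth_date and links
```

The `max_depth` parameter stands in for the search depth mentioned above: with a depth of 1 only the direct MainEntity is consulted, with 2 the parent of the parent as well.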


So basically what you are proposing is that AdvancedRelationshipTypes get a direction of inheritance. There would be a set of directional inheritance rules attached to each AR type. I like this idea, but do you realize the dependencies that this will build up? These inheritance rules and the database fields will be so tightly linked that changing something in the database schema will probably become quite a nightmare. --DonRedman

  • These dependencies already exist in the upcoming ArtistPageRedesign, which makes use of the different types of ARs to build views depending on them. AlbumGroups using ARs to link to albums would also depend on the type used. And I think this is exactly what AR is for: to make use of the different types to represent the data in different ways. This is what I then call the PowerOfAR. ;) And no, I don't think we need such a rule for every AdvancedRelationshipType. So it's nothing which has to be stored with the types but could also be stored separately. But that's a question of implementation. --Shepard

You could potentially use this concept to solve the DontMakeRelationshipClusters problem. If an artist SubEntity inherits the "is a sibling of" relationships from its MainEntity, then adding Rebbie Jackson as a sibling of Michael Jackson would also add her as a sibling of Janet Jackson. If the search depth is large enough, users could add siblings between whichever entities they want, and it'd all be calculated correctly on the artist page.

The difference is that there would be no distinction between a SubEntity and a MainEntity for that case. You're not working in a tree, but rather in an arbitrary network. This can still be handled algorithmically pretty easily - you just maintain a list of the entities you've already looked at, and make sure you don't look at them twice.
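
A rough sketch of that traversal, assuming the "is a sibling of" relationships are available as a plain mapping from artist name to linked artist names, with each relationship stored in both directions. The function name `all_siblings` and the mapping layout are made up for illustration.

```python
from collections import deque


def all_siblings(start, sibling_links):
    """Collect every artist reachable from `start` over sibling links.

    `sibling_links` is a hypothetical mapping: name -> set of linked names.
    The `seen` set is the list of entities already looked at, so no entity
    is visited twice even if the links form cycles.
    """
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for other in sibling_links.get(current, ()):
            if other not in seen:
                seen.add(other)
                queue.append(other)
    seen.discard(start)  # an artist is not their own sibling
    return seen


# Only two relationships are stored, both attached to Michael Jackson.
links = {
    "Michael Jackson": {"Rebbie Jackson", "Janet Jackson"},
    "Rebbie Jackson": {"Michael Jackson"},
    "Janet Jackson": {"Michael Jackson"},
}
siblings_of_janet = all_siblings("Janet Jackson", links)
```

Here Rebbie Jackson turns up as a sibling of Janet Jackson although no direct relationship between the two is stored. Each visited artist costs one lookup, so the work grows with the size of the connected group.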

The main problem I'm worried about is that the code for this would be rather complicated. If it was just relationships that got inherited that'd be simpler, but you're also trying to treat "normal" database fields like this. The code could get quite hairy.

I'm less concerned about the problem of changing the database schema. The code would have to be robust enough to check for fields it's expecting not being present. If it's expecting, say, date of birth, and it doesn't exist, then that's fine: it simply doesn't display that field on the SubEntity's page.


  • Wait, I don't get that. You add an "is a sibling of" relationship between Rebbie Jackson and Michael Jackson and then Janet Jackson inherits this relationship? How, if she's not yet connected to them? So you have to connect her to one of them. Using what? Using the "is a sibling of" relationship. So this doesn't really save you anything, at least not if you look at it as an arbitrary network. But if you look at it as a tree and add the relationships to one main entity (the oldest Jackson), then the SubEntities can inherit them. The problem is that the users really have to know what they are doing. And yes, the code can become rather ugly. :/ --Shepard
    • Yes, this doesn't save anything. But the users won't have to know what they are doing, since you don't need Main- and SubEntities any longer. And the code won't become that ugly: I could do it with a simple loop with only 3-5 (depending on the language) commands inside the loop. - But, sure, this will become increasingly slow with more "siblings", as you need one lookup per artist. --derGraph