User:Nikki/Recording lengths 2

Revision as of 16:33, 26 February 2013 by Kuno (talk | contribs) (The median and the mode both suck! I think we should ...)

After the initial round of feedback, all but one of the people who responded supported always setting recording lengths automatically. We decided in a dev chat that we will always set them automatically, and we are now focusing on how to calculate the length. As in the initial round of feedback, the discussion on this page does not apply to standalone recordings.

Determining the length

The goals I think we have when determining the length are:

  • To use one of the actual track lengths.
  • To avoid anomalous values, whether that's from appended silence, clipping or erroneous data.
  • To avoid huge changes in lengths when data is added or removed (though this is difficult with only one or two values).

Given those goals:

  • The mean (0 votes previously) does not fit, since it will often not return one of the actual track lengths.
  • Sorting the releases and taking the length from the first one (0 votes previously) does not fit, since it does not avoid anomalous values and does not avoid huge changes in lengths when data is added or removed.
  • Using the shortest length (2 votes previously) does not fit, because it does not avoid all anomalous values (it works for anomalies that make the track longer, but not ones which make it shorter) and also does not avoid huge changes in lengths when data is added or removed.

The two which do fit are the median (3 votes previously) and the mode (4 votes previously). The median is problematic when there is an even number of values, since you would then normally take the mean of the two middle values, which would not necessarily result in an actual track length (e.g. given 3:00 and 5:00, it would give 4:00). The mode is also problematic when there are multiple modes (e.g. given 3:00 and 5:00 again, neither is more common than the other). We could however avoid both problems by taking the shortest of the middle values for the median, or the shortest of the most common values for the mode.
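The two "take the shortest" variations can be sketched as follows. This is only an illustration, not a proposed implementation: the function names `lower_median` and `shortest_mode` are made up for this example, and lengths are assumed to be whole numbers of seconds.

```python
from collections import Counter


def lower_median(lengths):
    """Median that is always an actual track length.

    With an even number of values, the shorter of the two
    middle values is returned instead of their mean.
    """
    ordered = sorted(lengths)
    return ordered[(len(ordered) - 1) // 2]


def shortest_mode(lengths):
    """Most common track length; ties go to the shortest one."""
    counts = Counter(lengths)
    highest = max(counts.values())
    return min(length for length, count in counts.items() if count == highest)


# 3:00 and 5:00, expressed in seconds (hypothetical data)
print(lower_median([180, 300]))   # 180
print(shortest_mode([180, 300]))  # 180
```

With only 3:00 and 5:00 available, both variations fall back to 3:00, so in that case they behave like "use the shortest length". They only start to differ from it once more values are available.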

Votes and reasons for how the length should be determined

I think the median is better than the mode, because ...

  • if there are an equal number of different track lengths (e.g. {1:03, 1:04, 1:07, 1:08}) then all of them are the mode. Not very useful. However, the median could produce a decimal duration. Hawke (talk)

I think the mode is better than the median, because ...

  • An actual track length. JonnyJD (talk) 03:12, 26 February 2013 (UTC)

The median and the mode both suck! I think we should ...

  • OliverCharles (talk) 13:08, 26 February 2013 (UTC) Do both! I prefer using a real track length, so I'd put preference on the mode. However, in the case of length being multimodal, I'd just take the median, rounded to the nearest actual track length.
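One possible reading of the "do both" suggestion above, as a rough sketch: use the mode when there is exactly one, otherwise take the ordinary median and snap it to the nearest actual track length. The function name is hypothetical, lengths are assumed to be in seconds, and breaking distance ties in favour of the shorter length is an assumption that the comment above does not specify.

```python
from collections import Counter
from statistics import median


def mode_else_nearest_to_median(lengths):
    """Unique mode if there is one, else the actual length nearest the median."""
    counts = Counter(lengths)
    highest = max(counts.values())
    modes = [length for length, count in counts.items() if count == highest]
    if len(modes) == 1:
        return modes[0]
    middle = median(lengths)
    # Assumption: when two lengths are equally close, prefer the shorter one.
    return min(set(lengths), key=lambda length: (abs(length - middle), length))


print(mode_else_nearest_to_median([180, 180, 300]))   # 180 (unique mode)
print(mode_else_nearest_to_median([63, 64, 67, 68]))  # 64 (median 65.5, tie between 64 and 67)
```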

I still don't care, just calculate it automatically somehow.

  • Nikki (talk) 02:48, 26 February 2013 (UTC)
  • Ianmcorvidae (talk) 03:22, 26 February 2013 (UTC) (as long as the "choose the shortest" variation is chosen in order to have an actual track length)
  • Reosarevok (talk) 13:03, 26 February 2013 (UTC)

Which lengths should be included?

In the initial round of feedback there were some suggestions of only including lengths from releases with disc IDs if any of the releases have disc IDs. Which track lengths should be included for determining the recording length?

I don't care

All track lengths

  • Nikki (talk) 02:48, 26 February 2013 (UTC)
  • Ianmcorvidae (talk) 03:22, 26 February 2013 (UTC) (if we do something about disc IDs or official releases, IMO it should be weighting, not exclusion)
  • OliverCharles (talk) 13:03, 26 February 2013 (UTC)
  • Kepstin (talk) 16:28, 26 February 2013 (UTC)

If there are official releases, only include lengths from official releases

If there are releases with disc IDs, only include lengths from releases with disc IDs

  • This, but what ian said above: weighting not exclusion. Hawke (talk)

Weighted (Official/TOC)

  • "Add a track twice" in the list where the mode/median is chosen when the release is official or has a Disc ID. Not more when both is the case though. That might be too much. --JonnyJD (talk) 04:37, 26 February 2013 (UTC)
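The "add a track twice" weighting could look roughly like the sketch below. The tuple layout `(length, is_official, has_disc_id)` and the function name are assumptions made for this example only. A track counts twice if its release is official or has a disc ID, and still only twice if both apply.

```python
def weighted_lengths(tracks):
    """Expand track lengths so that favoured releases count twice.

    tracks: iterable of (length, is_official, has_disc_id) tuples.
    """
    expanded = []
    for length, is_official, has_disc_id in tracks:
        # Official OR disc ID gives two copies; both together still give two.
        copies = 2 if (is_official or has_disc_id) else 1
        expanded.extend([length] * copies)
    return expanded


# Hypothetical data, lengths in seconds
tracks = [(180, True, True), (300, False, False), (181, False, True)]
print(weighted_lengths(tracks))  # [180, 180, 300, 181, 181]
```

The expanded list would then be fed into whichever of the mode or median is chosen.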