-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Release types #50
Release types #50
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merging currently takes a union of types across all providers. It probably should be more clever. Especially for the primary type, where there should be only a single type, there should maybe be some logic to select only one.
The merging logic should probably based on rules. Some initial ideas:
- Album is always discarded in favour of another primary type (EP, Single, Broadcast)
- All secondary types will be preserved (unless they conflict, like Audiobook and Audio drama would, but this is not relevant yet)
- If both EP and Single are specified, discard both as the type is ambiguous? (depends on when the automatic guessing happens, see the next question)
The generic guessing functions could be applied for all providers. Should this be done per provider, or should we have this automatically run for any provider after
convertRawRelease
?
I agree that the guessing functions should not have to be called in each provider implementation.
What I had in mind is to do only the required, provider-specific release type processing steps inside the provider and handle everything else later on.
There would be two possibilities for the second processing step:
- Apply the guessing functions once after merging. Unless there are any issues with this approach, it would be my favourite because it is more efficient and we could display the "original source values" (without guessed values) as alternative values on the website.
- Apply the guessing functions for each individual before merging. Might yield better results if the titles from a secondary metadata source contain more useful information?
On #15 there was some discussions about additional hints that could be used at least for validating if types are plausible. I personally think they involve a bit much guesswork, but maybe this could be used to trigger warnings if a type seems unlikely.
Yes, I would probably keep these track count and duration metrics for a separate step.
providers/Spotify/mod.ts
Outdated
// Type can also be "album", but this might be too generic | ||
// to be reliably set as a MusicBrainz type. | ||
// 'album': 'Album', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can set the type to album as soon as we have a clever merge algorithm which prefers other primary types over Album. Although most providers likely declare everything an Album which is not a Single or an EP, it would feel strange to ignore the most common case completely.
There might be some other cases for which we should also ignore an Album type (like non-music releases), but we can have this logic at a higher level and keep the providers as dumb as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not convinced by this. All of Spotify, Deezer and Tidal use album for everything they don't consider a Single or EP (or Compilation for Spotify).
I think this is similar to e.g. the uncertainty around release countries, where in the discussion around atisket the consensus seems to be "if in doubt leave it empty". I think we should apply the same careful approach here and leave it to the user if we can't have a certain confidence in a type.
But I'll try it once the merge algorithm is in place.
This also changes HarmonyRelease.types to be an array instead of a set.
This keeps only a single primary type and sorts the result.
@kellnerd I think I have now addressed most issues and this is usable. This has become much more complex than I had originally anticipated. I have implemented sorting the types with primary type first in 0f08a80 and a merging algorithm in d9db1f1 . The merging takes the secondary types as given, but for the primary types keeps only one. It prefers more specific types over "Other" or "Album", and it prefers "EP" over "Single" (as e.g. Spotify has not EP, just Single, and EP is usually the more specific classification). This overly seems to work quite well and I have even allowed the "Album" type from Deezer, Spotify and Tidal again, as you suggested.
The final release merging now runs I'm unsure how to go about the "original source values" , but maybe we can move this to a separate PR. This one is already big enough. Also I'm unsure whether we need to do something about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has become much more complex than I had originally anticipated.
Indeed, I also thought this would be more straightforward. Thank you for carefully thinking this through with me!
I'm unsure how to go about the "original source values" , but maybe we can move this to a separate PR.
Oh, that part is a lot simpler than it sounds because the necessary tools are already there.
I have left a suggestion for it, it is basically a one-liner without the formatting.
Also I'm unsure whether we need to do something about preferSameProvider in
merge.ts
(https://github.com/kellnerd/harmony/blob/main/harmonizer/merge.ts#L146-L148). In the current implementation the types get merged across all releases regardless. This is essentially the same as regions are handled as well. But maybe this is unexpected ifpreferSameProvider
is true. I'm not sure about the wanted behavior here.
Interesting point, I had not thought about that. The concept of the provider preferences only applies to "immutable" properties so far (those for which we only use one source) but not to "combinable" properties:
harmony/harmonizer/properties.ts
Lines 4 to 9 in dfdffd4
export const combinableReleaseProperties: Array<keyof HarmonyRelease> = [ | |
'externalLinks', | |
'availableIn', | |
'excludedFrom', | |
'info', | |
]; |
Note that this constant is not actually used, it just exists for book-keeping purposes and to accompany the "immutable" constants. These are used in the merge algorithm and to restrict the valid keys of
ProviderPreferences
.Strictly speaking 'types' should also be part of
combinableReleaseProperties
since it is combined from multiple sources. On the other hand, the primary type is immutable, which makes 'types' partially immutable 🤔
I would say we leave 'types' as a combinable property for now and focus on improving the merging algorithm rather than allowing the user to select a preferred provider for them.
Co-authored-by: David Kellner <52860029+kellnerd@users.noreply.github.com>
@kellnerd I think I addressed all your latest comments. |
Avoids merging and sorting the types multiple times.
This adds support for release group types, see #15. I'm not fully sure this is ready to go, there are some loose ends, so I open this as a draft.
Some notes:
- Single
and- EP
suffixes used there and also removed them (this closes Remove- Single
&- EP
from release names automagically #9)For discussion:
convertRawRelease
?