Support grouping of sounds #271
Comments
We'd need this ideally for Lexibank 2.0. |
One option would be to add the |
@LinguList Tagging you on this again to see how we proceed with this. |
The current workaround, which also guarantees backwards compatibility, is to define a specific Lexeme class:

    import attr
    from pylexibank import Lexeme

    @attr.s
    class CustomLexeme(Lexeme):
        # Extra column holding the grouped form of the segments
        Grouped_Segments = attr.ib(default=None, metadata={"datatype": "string", "separator": " "})

And then you add a function:

    def ungroup(segments):
        # Split each group on "." and flatten the result into a single list
        return [segment for segment_group in [s.split(".") for s in segments] for segment in segment_group]

Then you add in |
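To make the effect of the ungroup helper concrete, here is a minimal, self-contained sketch in plain Python. The sample segments are made up for illustration, and the `sep` parameter is an assumption added here, not part of the snippet above or of pylexibank.

```python
def ungroup(segments, sep="."):
    """Split grouped segments such as 'a.ʔ' into individual sounds."""
    # Each token is split on the separator; the pieces are flattened in order.
    return [sound for group in segments for sound in group.split(sep)]


# Hypothetical grouped segments for a single form
grouped = ["t", "a.ʔ", "k", "u"]
flat = ungroup(grouped)
print(flat)  # ['t', 'a', 'ʔ', 'k', 'u']
```

The grouped form would go into the custom Grouped_Segments column, while the flattened list remains usable wherever ungrouped segments are expected.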
Yes, that's what I am doing right now in my repositories. I just thought we could add this function to pylexibank and modify the Lexeme class to make this workaround unnecessary. |
I think this would be too much by now, since it is not part of the CLDF specification. |
I also wonder how we want to handle this in the future. If we say that Segments can potentially be grouped, we may end up with clashes. My idea would be to propose Grouped_Segments to CLDF as another representation of segments, but I suggest we wait for the reviews of the grouping sounds paper to see how we react here. With the paper, we have the reference to add this to lexibank. |
Yes, I think both are needed: a reference and more experience, including actively searching for cases with conflicting options to group. I think in the worst case, grouping of segments would introduce a degree of freedom that invites abuse, where fine-tuned grouping together with fine-tuned analysis algorithms produces opaque results. I could also imagine that grouping and trimming used together could have funny effects. |
We already have examples of this kind. The freedom that this introduces may at times be so great that one can get two different outcomes of the same analysis due to grouping alone. The current solution leaves everything open but makes clear that this is a currently tested candidate for inclusion into CLDF in a later version. We can already discuss now -- also with respect to the modification of the paper -- whether we want to propose a small plugin that would make the conversion easier in lexibank (but would need to be installed on top of it and could be added there later). |
I agree with waiting until the reviews to see how we do this. With respect to the degrees of freedom: We would not touch the |
That is also an open question for now: do we add |
As referenced in an example case, we should modify pylexibank in a way that allows us to support grouped sounds (e.g. a.ʔ) instead of reporting a transcription error for those cases.
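One way to picture the requested behaviour is a check that splits grouped tokens before looking up each sound, so that a.ʔ is judged by its parts rather than rejected as a whole. The following is only a sketch under stated assumptions: `KNOWN_SOUNDS` and `unknown_sounds` are hypothetical names invented for this illustration and do not reflect how pylexibank actually validates transcriptions.

```python
# Hypothetical sound inventory, for illustration only
KNOWN_SOUNDS = {"a", "ʔ", "t", "k", "u"}


def unknown_sounds(segments, known=KNOWN_SOUNDS, sep="."):
    """Return the sounds that are not in the inventory, looking inside groups."""
    missing = []
    for token in segments:
        # A grouped token like 'a.ʔ' is checked sound by sound
        for sound in token.split(sep):
            if sound not in known:
                missing.append(sound)
    return missing


print(unknown_sounds(["t", "a.ʔ", "k"]))  # []
print(unknown_sounds(["t", "a.ʔ", "x"]))  # ['x']
```

With a check of this shape, only genuinely unknown sounds would be reported, and a group made of known sounds would pass.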