CollateX, bug, now with e-mail #85
Comments
Hi Karsten,

Gregor Middell forwarded your example data files to me. I ran the collation and looked at the internal alignment result, represented as a variant graph, that is, the result independent of the chosen output format. It looks as follows:

This means that CollateX finds only two points of variation: a ":" (W1) replaced by a "," (W2), and "night" (W1) replaced by "knight" (W2). This seems correct to me. If you agree, the question becomes how that internal result should be represented in the requested output format. In the TEI output there is a

Before we discuss possible solutions: am I thinking in the right direction so far, or is there something else that you wanted to bring to our attention with the example in your report?

Best,
Dear Ronald Dekker,

Thank you very much for your rapid reply. You are indeed thinking in the right direction. It confirms my suspicion that a change of case is in some respects treated as a difference and in others not. Your graph is correct in the sense that the difference in case does not constitute a 'semantic difference'. Nevertheless, you have saved the 'not semantically different' version (It) somewhere. It is not represented in the graph (nor in the --format graphml output), but it is in the TEI output, as two separate elements.

My problem is that this non-semantic difference (it vs. It) is thereby mixed up with the truly invariant text surrounding it, which may be very extensive. I would have expected either (it and It are different)
or (it and It are not different, consistent with the graph)
I do take your point that the latter would prevent you from reconstructing the original witnesses. I also think the former is preferable (the could be attributed type="notSemanticalDifference"), but you would not be able to construct it from your graph unless you run a recursive collation on the readings.

Have I missed something in the documentation saying that changes in upper and lower case are ignored by default during alignment (BTW, the same goes for changes in spacing)? And does 'by default' mean that I can change it, as suggested in the documentation, with the --script option? If so, how do I do this (back to the first question)?

Yours,
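To make the case question concrete, here is a minimal sketch of how aligning on a normalized form of each token can hide a difference such as it vs. It while the original spelling is still kept alongside. It is not CollateX's code and does not use its API: the `tokenize`, `normalize` and `variation_points` functions are hypothetical, the alignment is done with Python's standard `difflib`, and the two witnesses are invented for illustration (they are not the data files from this report).

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    # Hypothetical tokenizer: runs of word characters, or single punctuation marks.
    # Whitespace is dropped, so differences in spacing never reach the aligner.
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(token):
    # Assumed normalization: compare tokens case-insensitively.
    return token.lower()


def variation_points(w1, w2, norm=normalize):
    # Align the normalized tokens, but report the original tokens
    # for every region where the two witnesses do not match.
    t1, t2 = tokenize(w1), tokenize(w2)
    matcher = SequenceMatcher(
        None, [norm(t) for t in t1], [norm(t) for t in t2], autojunk=False
    )
    return [
        (t1[i1:i2], t2[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Invented witnesses, for illustration only.
W1 = "the night was dark: it rained"
W2 = "The knight was dark, it rained"

# Case-insensitive comparison: "the" and "The" count as a match.
insensitive = variation_points(W1, W2)

# Identity normalization: the case difference becomes part of a variant.
sensitive = variation_points(W1, W2, norm=lambda token: token)
```

With the case-insensitive normalization, only night/knight and ":"/"," are reported; with the identity function, the/The is reported as well. Whether and how the real tool lets you swap the comparison (for instance through --script) is exactly the open question above.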
I have an issue with CollateX, actually a bug in the program, which I would like to present and ask how to deal with. Is there an e-mail address I can write to? I would like to include a (short) example.
Karsten Kynde
karsten@kynde.dk