This repository has been archived by the owner on Aug 27, 2019. It is now read-only.

properly handle tags in annotate transcription #16

Open
ggdhines-zz opened this issue Sep 15, 2015 · 0 comments
@ggdhines-zz
Contributor

I need to make sure that tags in annotate (and, in the future, Folger) are aggregated correctly. A big part of the problem is that MAFFT, which I use to align multiple transcriptions, only accepts basic ASCII characters, and even some of those aren't allowed (spaces, for example). So representing tags as special characters is a problem.

The solution is to convert all normal characters to lower case and reserve upper-case characters for special meanings. For example, "A" would represent a space, "B" could represent "", etc. (I have already implemented this.) In theory this could cause the text to be misaligned, since case can matter, but in practice I highly doubt it ever will. The benefit is that aggregating the tags becomes as simple as aggregating the text. The challenge is that for the final text we need to go back to the individual transcriptions and decide case on a character-by-character basis. (I still need to do this.)

Also, for Folger there are too many tags: we would run out of capital letters. So I think I will use "B" (for example) to represent all tags, and then at the end go back and vote on what each tag should actually be, just like deciding on case.
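To make the idea concrete, here is a rough Python sketch of the encoding step and the later vote. It is not the code in this repository. The function names `encode` and `vote`, the bracketed tag syntax `[unclear]`, and the assumption that the alignment step has already grouped the candidates for one position are all hypothetical, chosen only for illustration.

```python
from collections import Counter

SPACE = "A"  # the issue's example: "A" stands for a space
TAG = "B"    # assumed here: one placeholder shared by every tag


def encode(text, tags):
    """Return an alignment-safe string plus the tags found, in order.

    Ordinary characters are lower-cased, spaces become SPACE and each
    known tag becomes TAG, so upper case only ever carries special meaning.
    """
    lowered = text.lower()
    found = []
    out = []
    i = 0
    while i < len(lowered):
        for tag in tags:
            # Skip empty tags so the scan always moves forward.
            if tag and lowered.startswith(tag.lower(), i):
                found.append(tag)
                out.append(TAG)
                i += len(tag)
                break
        else:
            out.append(SPACE if lowered[i] == " " else lowered[i])
            i += 1
    return "".join(out), found


def vote(choices):
    """Majority vote for one aligned position (ties go to the first seen)."""
    return Counter(choices).most_common(1)[0][0]


encoded, found = encode("The [unclear] Cat", ["[unclear]"])
print(encoded, found)
print(vote(["C", "c", "C"]))
```

The same `vote` helper covers both open items: pass it the original characters at an aligned position to decide case, or the original tags behind a "B" to decide which tag it should be.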
