Bad UTF-8 content in GraphML output on plain text #50

Open
tla opened this issue Feb 1, 2018 · 0 comments

tla commented Feb 1, 2018

In constructing a simple test case for other purposes, I ran across a bug in the GraphML output of CollateX, which should be reproducible as follows:

Taras-iMac:MatthewEdessa tla$ cat first.txt 
The quick brown fox jumped over the lazy dogs.
Taras-iMac:MatthewEdessa tla$ cat second.txt 
the quick brown fox jumped over the lazy sleeping dog.
Taras-iMac:MatthewEdessa tla$ cat third.txt 
The quick brown fox jumped over the sleeping cat.
Taras-iMac:MatthewEdessa tla$ file *.txt
first.txt:  ASCII text
second.txt: ASCII text
third.txt:  ASCII text
Taras-iMac:MatthewEdessa tla$ ~/bin/collatex -t -f graphml first.txt second.txt third.txt > test.xml

Here is a zip file of the output; you can see an enormous blob of bad data in the middle of node 14.
test.xml.zip
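
To pin down exactly where the bad data sits, one option is to scan the output for byte runs that do not decode as UTF-8. The following is only a minimal sketch and is not part of CollateX: `find_invalid_utf8` is a hypothetical helper name, and it assumes the whole file fits in memory.

```python
def find_invalid_utf8(data: bytes) -> list[tuple[int, bytes]]:
    """Return (offset, bad_bytes) for each run of bytes that is not valid UTF-8."""
    problems = []
    pos = 0
    while pos < len(data):
        try:
            # If the rest of the buffer decodes cleanly, there is nothing left to report.
            data[pos:].decode("utf-8")
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are relative to the slice that was decoded.
            start = pos + err.start
            end = pos + err.end
            problems.append((start, data[start:end]))
            # Resume scanning just past the offending bytes.
            pos = end
    return problems
```

Reading `test.xml` in binary mode and passing its contents to this function should give the byte offsets of any invalid sequences, which can then be matched against the position of node 14 in the file. An empty list would suggest the blob is valid UTF-8 but still unwanted content, which is a different kind of bug.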
