ZnCharacterReadStream (and friends) should publish position in terms of Character position #80

dalehenrich · 2022-12-12T19:22:33Z

The current implementation wraps a byte stream, and ZnCharacterReadStream >> position returns the byte position of the underlying stream rather than the character position. As a result, the following test fails: position returns 4, not the expected character position of 3:

	| string bytes stream res |
	string := 'eißendeße'.
	bytes := ZnUTF8Encoder new encodeString: string.
	stream := (ZnCharacterReadStream on: bytes readStreamPortable).
	res := stream next; next; next.
	self assert: res equals: $ß.
	self assert: stream position equals: 3.

This is from the test ZnCharacterStreamTests >> testUtf8EncodingStreamPosition.
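
The mismatch comes from ß taking two bytes in UTF-8, so the first three characters of the test string occupy four bytes. A minimal sketch of that arithmetic, using only the encoder call from the test above (the variable name prefix is just for illustration):

	| prefix bytes |
	prefix := 'eiß'.
	bytes := ZnUTF8Encoder new encodeString: prefix.
	prefix size. "3 characters"
	bytes size. "4 bytes, because ß is encoded as 16rC3 16r9F"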

…reamTests and ZnLegacyCharacterStreamTests: testUpToAll, testUpToAllTwice and testUtf8EncodingStreamPositionFor...) are all apparently due to Issue #80

kurtkilpela · 2023-01-17T18:09:41Z

I took a look at Pharo's behavior around ZnCharacterReadStream>>#position. Using a similar test case, Pharo's implementation also returns the byte position rather than the character position. Is this a semantic change we want to make, or do we want to keep the same semantics as Pharo?

	| string bytes stream res |
	string := 'eißßßßßßßßßßßßendeße'.
	bytes := ZnUTF8Encoder new encodeString: string.
	stream := (ZnCharacterReadStream on: bytes readStream).
	res := stream next; next; next; next; next.
	stream position. "-> 8 rather than 5"

dalehenrich · 2023-01-17T19:24:56Z

I am in favor of having the code work correctly over Pharo compatibility, and skipping half a character does not seem correct to me ...

kurtkilpela · 2023-02-06T21:24:57Z

The tests associated with this issue have been temporarily removed from the test set in ecf4900. They will be restored when this issue is addressed.

dalehenrich assigned kurtkilpela Dec 12, 2022

dalehenrich mentioned this issue Dec 12, 2022

ZnCharacterStreamTests #testUpToAll and #testUpToAllTwice are now failing #79

Closed

This was referenced Dec 14, 2022

For GemStone support we need to be able to specify a file encoding and StringClass ala topaz #75

Closed

Issue 75: add stringClass attributes to FileSystem #82

Merged
