Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix invalid WKT empty geometries #160

Merged
merged 7 commits into from
Aug 17, 2023
Merged

Conversation

Oreilles
Copy link
Contributor

Currently, empty geometries would be writter in WKT as:

LINESTRING()
POLYGON()
GEOMETRYCOLLECTION()

which isn't valid WKT.

This PR addresses this issue and output instead:

LINESTRING EMPTY
POLYGON EMPTY
GEOMETRYCOLLECTION EMPTY

Copy link
Member

@michaelkirk michaelkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

} else {
self.out.write_all(b"(")?;
}
Ok(())
}
fn geom_end(&mut self) -> Result<()> {
self.out.write_all(b")")?;
if let Some(geometry_size) = self.geometry_sizes.pop() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would we ever expect this to be None?

Copy link
Contributor Author

@Oreilles Oreilles Aug 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A geometry would have to end before it began, so in theory no ? I just tried to get around the fact that Vec::pop always returns an option!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding something like else { debug_assert!(false, "ended geometry that didn't start") } as a "this should never happen but we don't want to blow up in production if it does"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch... #160 (comment)

self.geom_begin(idx, b"POINT EMPTY")
// we intentionally omit calling geom_end(), because POINT EMPTY has no closing paren
self.geom_begin(b"POINT", true, 0, idx)?;
self.geom_end()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems less weird. 👍

@Oreilles
Copy link
Contributor Author

Oreilles commented Aug 13, 2023

We have a problem with the csv BufferingWktWriter... since it instanciate a new writer for every processor function call, the state isn't persisted and the implementation cannot work.

If we had access to the geometry size in processor functions polygon_end etc., then we wouldn't have to manage the geometry_sizes stack 🤔

@michaelkirk
Copy link
Member

We have a problem with the csv BufferingWktWriter... since it instanciate a new writer for every processor function call, the state isn't persisted and the implementation cannot work.

I've just opened Oreilles#3 against your own branch that I think addresses this (and simplifies my old crappy code from yesteryear!). Take a look and merge into your branch if you think it makes sense.

@michaelkirk
Copy link
Member

If we had access to the geometry size in processor functions polygon_end etc., then we wouldn't have to manage the geometry_sizes stack 🤔

I'm also open to this approach (or we could do both) but it seems like a bigger breaking change, so my preference at this point would be to avoid it.

impl<'a, W: Write> WktWriter<'a, W> {
pub fn new(out: &'a mut W) -> WktWriter<'a, W> {
impl<W: Write> WktWriter<W> {
pub fn new(out: W) -> Self {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, but wanted to highlight the change for other reviewers that we're changing WktWriter to take any impl Write rather than only &mut impl Write.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for simplifying writters so that they own the underlying impl Write - but should it possibly be a separate PR that changes them all? It feels like a counter-pattern to introduce it in just one writer as part of an unrelated empty geometry support. Then we can rebase this PR on top of it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with doing it for all writers. Who wants?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not unrelated, in that it's required to fix the bug. But yes, I agree it would be "cleaner" to switch all the API's over, rebase this PR and then merge it rather than follow up with changing the rest of the API's after this is merged. My judgement call was that it wasn't worth the extra paperwork, but if you feel differently thats fine.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok either way, just as long as we don't release until all writers are more-or-less lined up (and of course its a minor rather than a patch change

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and of course its a minor rather than a patch change

Just to be clear, changing from &mut Write -> Write is a non-breaking change, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WktWriter is part of the public API - if you change its signature, just like we changed writers in #155 and #150, I think all of these should be released as 0.11, or else any existing code using this in dependencies: geozero = "0.10.0" and using MvtWriter::new() or using let w = WktWriter::new(); w.dims.z = true; will break.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a breaking change to WktWriter in this PR if that's what you mean (see the changelog entry).

So is everyone OK to merge this?

Copy link
Member

@nyurik nyurik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@michaelkirk
Copy link
Member

bors r+

@bors
Copy link

bors bot commented Aug 16, 2023

Configuration problem:
bors.toml: not found

@nyurik
Copy link
Member

nyurik commented Aug 16, 2023

i thought there were some discussions in rust itself to stop using bors because github now covers most such usecases?

@michaelkirk
Copy link
Member

michaelkirk commented Aug 17, 2023 via email

@michaelkirk michaelkirk merged commit 47c02ea into georust:main Aug 17, 2023
1 check passed
@michaelkirk
Copy link
Member

I just realized that this project does not, and never did, use bors. 😅

@nyurik
Copy link
Member

nyurik commented Aug 18, 2023

@michaelkirk in Martin, we have branch protection rule for main:

  • Require a pull request before merging
  • Require approvals (1)
  • Require status checks to pass before merging (with most of the key CI jobs listed as required)

I think if you also enable Require branches to be up to date before merging, you will achieve what you want: a way to +1 something, plus a way to say "squash once CI passes" - this way it will not be squashed unless everything is good, and it will not pass if another PR merges ahead of it. Granted that it will fail if another PR merges first though.

@michaelkirk
Copy link
Member

I got it working! Ultimately I think the issue was something else, but thanks for the sanity check @nyurik.

michaelkirk added a commit to michaelkirk/geozero that referenced this pull request Aug 21, 2023
As of georust#160, this breaks with invalid output:

    GEOMETRYCOLLECTION EMPTYPOINT(-122.329051 47.6069),POINT(-122.266529 47.515984)
michaelkirk added a commit to michaelkirk/geozero that referenced this pull request Aug 21, 2023
As of georust#160, this breaks with invalid output:

    GEOMETRYCOLLECTION EMPTYPOINT(-122.329051 47.6069),POINT(-122.266529 47.515984)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants