Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

XPath returns a node instead of an attribute #340

Open
dennwc opened this issue Dec 11, 2018 · 7 comments
Open

XPath returns a node instead of an attribute #340

dennwc opened this issue Dec 11, 2018 · 7 comments
Assignees
Labels

Comments

@dennwc
Copy link
Member

dennwc commented Dec 11, 2018

From the new Python client:

... There is a failing test that I've left commented out, testFiltertoken, PTAL, this xpath query:

it = ctx.filter("//*[@token='else']/@token")

AFAIK it should return the @token property value (else) but it returns the full object containing it.

See bblfsh/python-client#128 (comment)

Related either to XPath projection in SDK, or XPath library itself.

@juanjux
Copy link
Contributor

juanjux commented Dec 12, 2018

The problem is probably the @ in the @token (and @type, @pos, etc...) since XPath uses the @ symbol to retrieve attribute values.

@juanjux
Copy link
Contributor

juanjux commented Dec 12, 2018

More info: the xpath library seems to be Ok and it even have a test for this case so unless we're hitting a corner case with it with our specific usage parameters (I'll do more tests to discard it) the problem seems to be the one stated above.

Also, this website (https://www.freeformatter.com/xpath-tester.html) that the Go-xpath lib seems to use as independent testing of the validity of its library (so I guess it uses another lib internally) also fails when the properties start with a "@".

We could do this workaround:

  • Project @token, @type, etc as something like "_BBLFSH_token/type/etc".
  • For these, and only these, names, translate the @ to _BBLFSH_, before passing it to the library.

Not very pretty, but it should work and would avoid changing all the drivers, fixtures and testing code to replace the @ by something else.

@dennwc
Copy link
Member Author

dennwc commented Dec 12, 2018

_BBLFSH_ is too hard to use, so maybe we can pick something different?

The token specifically is mapped to the node's content. Since does might have both token and children, I also projected is as a separate child node (<token> or <Token>?).

@type is projected to a node element name, so we don't need to handle it that way.

@pos is projected to attributes as: <pos-name>-start-offset, etc.

And we don't have any other @ fields defined.

@juanjux
Copy link
Contributor

juanjux commented Dec 12, 2018

Yeah, _BBLFSH it was just an example, but the user would still use @ and it would be translated internally (would still be ugly on the fixtures). The length is to avoid possible collisions with drivers' existing properties. But since we translate whatever field is the token in the native AST, we should be right using another "token" property for this specific case as you proposed.

@creachadair
Copy link
Contributor

Is this an issue that arises from queries generated inside the library (rather than directly by users), and if so can we work around it by adding quotes explicitly? I'm pretty sure XPath allows "@" in values, as long as they're escaped.

@juanjux
Copy link
Contributor

juanjux commented Dec 12, 2018

Nope, no scaping of symbols in property names for XPath 1.0 as far as I investigated. Double quotes doesn't work either. This would also affect users, actually I found about this while upgrading the Python client's unittests.

The solution proposed by @dennwc of creating internally a 'token' property with the @token should be simple to implement.

@dennwc
Copy link
Member Author

dennwc commented Dec 12, 2018

P.S. I believe it was already implemented and merged ;) We just haven't updated the docs.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants