Skip to content
This repository has been archived by the owner on Mar 8, 2020. It is now read-only.

Python client v3 (UASTv2) #128

Merged
merged 53 commits into from
Mar 12, 2019
Merged

Python client v3 (UASTv2) #128

merged 53 commits into from
Mar 12, 2019

Conversation

dennwc
Copy link
Member

@dennwc dennwc commented Oct 4, 2018

Work-in-progress implementation of Python client v3 for the new libuast.

TODO:

  • Proper memory management
  • Fix install scripts
  • Update gRPC client to use v1 and v2 protocols
  • Test that the static linking works in Windows and macos and include static files in the libuast package.
  • Tests
  • Fix some TO-DOs in the code about potential memory problems.
  • Update README.md
  • Enable failing test testFilterToken once the SDK problem has been fixed.
  • Add Node compatibility layer.

Signed-off-by: Denys Smirnov denys@sourced.tech

Signed-off-by: Denys Smirnov <denys@sourced.tech>
@dennwc dennwc self-assigned this Oct 4, 2018
@dennwc dennwc requested a review from juanjux October 4, 2018 01:00
@dennwc dennwc changed the title Client v3 (UASTv2) Python client v3 (UASTv2) Oct 4, 2018
Denys Smirnov added 2 commits October 11, 2018 21:11
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
bzz and others added 9 commits October 17, 2018 13:02
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Signed-off-by: Alexander Bezzubov <bzz@apache.org>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux
Copy link
Contributor

juanjux commented Oct 22, 2018

Install script should be fixed.

Juanjo Alvarez and others added 3 commits October 22, 2018 19:19
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@bzz
Copy link
Contributor

bzz commented Oct 28, 2018

Was trying to test bblfshd/client-python.v3 locally (have libuast.3-rc2 globally installed)

virtualenv -p python3 .venv3
source .venv3/bin/activate
pip3 install -r requirements.txt
 edit setup.py // GET_LIBUAST = False
python3 setup.py install --getdeps --log
python3 setup.py install //build seems to finish fine

but then

python3 -c "import bblfsh"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "~/src-d/bblfsh/client-python/bblfsh/__init__.py", line 1, in <module>
    from bblfsh.client import BblfshClient
  File "~/src-d/bblfsh/client-python/bblfsh/client.py", line 6, in <module>
    from bblfsh.aliases import ParseRequest, DriverStub
  File "~/src-d/bblfsh/client-python/bblfsh/aliases.py", line 15, in <module>
    Node = uast_module.Node
AttributeError: module 'bblfsh.gopkg.in.bblfsh.sdk.v2.uast.generated_pb2' has no attribute 'Node'

@juanjo I belive it has to do with 24fd7b6 as bblfsh.gopkg.in.bblfsh.sdk.v2.uast.generated_pb2 indeed does not have Node, only v1 has.

When I manually roll it back to aliases.py it seems to work.

Denys Smirnov and others added 4 commits October 30, 2018 18:57
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@juanjux
Copy link
Contributor

juanjux commented Oct 31, 2018

@dennwc PTAL at the last commit. Should work on Linux (works on my machine and my Docker image) but as we talked on Slack it'll require the libuast package to include static libraries and it's untested on Windows and Darwin. It also have GET_LIBUAST to False to ease testing this, should be changed before merging.

Denys Smirnov added 3 commits November 1, 2018 21:22
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
Signed-off-by: Denys Smirnov <denys@sourced.tech>
README.md Show resolved Hide resolved
@juanjux
Copy link
Contributor

juanjux commented Dec 13, 2018

@bzz everything in your review should be solved.

@bzz
Copy link
Contributor

bzz commented Dec 17, 2018

@vmarkovtsev would you be so kind to please review the API changes? Thanks!

Copy link
Contributor

@zurk zurk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My minor comments and questions. Main part looks good 👍 .

.travis.yml Show resolved Hide resolved
bblfsh/aliases.py Outdated Show resolved Hide resolved
bblfsh/client.py Show resolved Hide resolved
bblfsh/client.py Outdated Show resolved Hide resolved
bblfsh/client.py Outdated Show resolved Hide resolved
bblfsh/fixtures/test.py Show resolved Hide resolved
bblfsh/result_context.py Show resolved Hide resolved
bblfsh/tree_order.py Outdated Show resolved Hide resolved
setup.py Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
# are alterative typed versions:
x = next(ctx.filter("boolean(//*[@strtOffset or @endOffset])").get_bool()
y = next(ctx.filter("name(//*[1])")).get_str()
z = next(ctx.filter("count(//*)").get_int() # or get_float()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not very Pythonic. IDK if it is hard to guess the type, but the following is more Pythonic:

x = next(ctx.filter("boolean(//*[@strtOffset or @endOffset])").iterate())
for name in ctx.filter("name(//*[1])").iterate():
    print(name)

Where iterate() would be an unordered iterator over the resulting values. Or name it results(). Or remove it and iterate directly, anyway.

Copy link
Contributor

@juanjux juanjux Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are typed functions for queries returning boolean/string/integer/float values instead of nodes so the "typed get" is needed when the user doesn't have to do isinstance of the results to check that they're right every time (but if he wants, he can use the normal get()).

for i in newiter: ...

# You can also get the non semantic UAST or native AST:
ctx = client.parse("file.py", mode=bblfsh.ModeDict["NATIVE"])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use an enum class or define separate constants instead of raw strings.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's mapping the protocol_v2_module.Mode.DESCRIPTOR.values_by_name field. Maybe we could generate variable names from these strings using eval or exec but that's almost worse.

bblfsh/client.py Outdated Show resolved Hide resolved
bblfsh/client.py Show resolved Hide resolved
bblfsh/client.py Outdated Show resolved Hide resolved
bblfsh/client.py Outdated Show resolved Hide resolved
@vmarkovtsev
Copy link
Contributor

So given that we no longer have a Node, we have to rewrite literally everything on our side, because Node is used everywhere in ML code, in nearly half of the files. Alternatively, we will write our own emulation wrapper because we heavily rely on that abstraction anyway and don't want to spend months on debugging very very tricky stuff bound to nodes :)

@vmarkovtsev
Copy link
Contributor

vmarkovtsev commented Dec 18, 2018

Daaaamn, we have to rewrite bloody months of work in the style-analyzer's feature extraction without this emulation. Very painful.

@juanjux
Copy link
Contributor

juanjux commented Dec 18, 2018

@vmarkovtsev That was my point of view, when I disliked the breaking changes. It doesn't make sense for you to write a Node compatibility layer because if there is any it should be here if it's needed (from bblfsh.past import Node or something like that).

Juanjo Alvarez added 2 commits December 18, 2018 12:15
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
bblfsh/aliases.py Outdated Show resolved Hide resolved
Juanjo Alvarez added 2 commits December 18, 2018 14:06
Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
…y Vadim

Signed-off-by: Juanjo Alvarez <juanjo@sourced.tech>
@dennwc
Copy link
Member Author

dennwc commented Jan 3, 2019

Re "typed get" issue mentioned above, I agree with @vmarkovtsev - I don't see the reason for typed get helpers. Python is a dynamic language, so it should be okay to iterate over nodes directly (that already are strings/ints/whatever). Old client had those helpers because the way libxml was wrapped. It's not necessary right now.

@juanjux
Copy link
Contributor

juanjux commented Jan 14, 2019

Python is dynamically typed but not weakly typed so it's not considered good practice on Python to have variant return types from the same function other than None. Also, having typed functions make easier to use mypy and avoid doing "if isinstance(...)" on every value, which would also be slower (or risking having buggy code using types without checking them). Plus is basically free to have them on the API.

@dennwc
Copy link
Member Author

dennwc commented Jan 14, 2019

@juanjux OK, thanks for clarifying it. I trust your decision here.

@bzz bzz mentioned this pull request Jan 24, 2019
@juanjux
Copy link
Contributor

juanjux commented Jan 24, 2019

Update: I started to work on a compatiblity layer with v2 before the holidays, but it's stopped to work on the very prioritary C# driver, I'll continue with it ASAP once the driver is finished so we can finally publish this version.

@juanjux
Copy link
Contributor

juanjux commented Mar 12, 2019

Since this passed review from the team, I propose to merge it (WITHOUT releasing v3 yet) so the compatiblity layer PR will be much easier to review as a separate one. Both PRs, once merged, would then be released as V3. What do you think?

@creachadair
Copy link
Contributor

Since this passed review from the team, I propose to merge it (WITHOUT releasing v3 yet) so the compatiblity layer PR will be much easier to review as a separate one. Both PRs, once merged, would then be released as V3. What do you think?

That sounds reasonable to me.

@dennwc
Copy link
Member Author

dennwc commented Mar 12, 2019

Sounds good. Let's merge it.

@juanjux juanjux merged commit 375fd7e into bblfsh:master Mar 12, 2019
@dennwc dennwc deleted the v3 branch March 12, 2019 16:29
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants