Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bytecode optimizer #429

Closed
wants to merge 17 commits into from
Closed

Bytecode optimizer #429

wants to merge 17 commits into from

Conversation

markw65
Copy link

@markw65 markw65 commented Aug 27, 2023

This stack implements a framework for performing optimizations to the bytecode (file interp-state.js), and then implements a number of fairly simple transformations via a new pass, optimize-bytecode.js.

After the initial commits of interp-state.js and its tests, I've committed each optimization separately, with an "artifacts" diff to show how it affects lib/parser.js, to make it easy to review/discuss. I plan to remove all the artifacts diffs, and add a single "Build Artifacts" commit once the pull request settles down.

Most of what I've implemented so far is pretty local, and depends on recognizing simple patterns (like PUSH_NULL, POP). If this framework is accepted I plan to add some more global analysis, so that eg an unused PUSH_CURR_POS can be removed, even when the POP is hidden behind a lot of more complex code; or immediately dropping unused elements from a plucked sequence, rather than generating them all, then ignoring the ones that aren't plucked (obviously their side effects still have to be executed); or not bothering to build an array for a sequence wrapped in a text node.

lib/peg.d.ts Outdated
Comment on lines 469 to 476
type OutputType =
| "ast"
| "parser"
| "source-and-map"
| "source-with-inline-map"
| "source"
;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I missed something here... I should probably be using SourceBuildOptions rather than Options. Although SourceOutputs doesn't include "ast". I'll take a look...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a change that reverts this, and uses SourceBuildOptions<SourceOutputs> instead

@Mingun
Copy link
Member

Mingun commented Aug 28, 2023

I only have one concern: the bytecode was introduced by David (creator of pegjs) to be an abstraction over piece of JS code, that can be tested easily and have simple semantic. This change turns the bytecode into a more complicated thing, which probably could be simplified by using a more low-level bytecode.

So probably we should create that low-level bytecode and generate it directly? And then use it to generate JS. Roughly speaking, that means we should just renumber our bytecode instructions. It seems to me that it will be more in the spirit of the library, didn't it?

@hildjj
Copy link
Contributor

hildjj commented Aug 28, 2023

I definitely want to rework code generation at some point, because the current mechanism of generating source maps isn't really sustainable, particularly as we want to add other output formats.

@markw65
Copy link
Author

markw65 commented Aug 28, 2023

So definitely having a lower level bytecode would make it possible to optimize more. Some of the bytecodes just do too much currently. But changing the bytecode presumably breaks all existing plugins - so I was trying to avoid doing that.

If you don't think this is something you want to take, I can redo it as a plugin, and maintain it myself until the big bytecode rewrite, when it will hopefully become redundant...

@markw65
Copy link
Author

markw65 commented Aug 30, 2023

So probably we should create that low-level bytecode and generate it directly

I think you'd still want to keep code generation simple, and then optimize the bytecode. I mean, without changing the bytecode at all, the code generator could emit optimized bytecode similar to what I'm producing here. But I think it would be harder to do...

@markw65
Copy link
Author

markw65 commented Sep 6, 2023

So I've taken this, and some more work I did on top, and converted it to a plugin.

Since this depends on some non-exported functionality, and since I also need a few updates to generate-js.js to get the full benefit of the bytecode optimizations, I ended up just using rollup to create the plugin from my fork of peggy. Its published as @markw65/peggy-optimizer, and includes #425 and #427.

I still think this (plus the additional work Ive done) is worth taking; the generated code with the plugin is much cleaner - and much closer to what a person might write (although loops are still awkward).

@hildjj
Copy link
Contributor

hildjj commented Jan 27, 2024

I'm going to close this in favor of the plugin for now. Might revisit after we look at code generation.

@hildjj hildjj closed this Jan 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants