Bytecode optimizer #429

markw65 · 2023-08-27T19:26:27Z

This stack implements a framework for performing optimizations on the bytecode (in interp-state.js), and then implements a number of fairly simple transformations via a new pass, optimize-bytecode.js.

After the initial commits of interp-state.js and its tests, I've committed each optimization separately, with an "artifacts" diff to show how it affects lib/parser.js, to make it easy to review/discuss. I plan to remove all the artifacts diffs, and add a single "Build Artifacts" commit once the pull request settles down.

Most of what I've implemented so far is pretty local, and depends on recognizing simple patterns (like PUSH_NULL, POP). If this framework is accepted I plan to add some more global analysis, so that e.g. an unused PUSH_CURR_POS can be removed even when the POP is hidden behind a lot of more complex code. Other candidates are immediately dropping unused elements from a plucked sequence, rather than generating them all and then ignoring the ones that aren't plucked (obviously their side effects still have to be executed), and not bothering to build an array for a sequence wrapped in a text node.
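To illustrate the kind of local pattern matching described above, here is a minimal sketch of a peephole step that drops a push whose value is immediately popped. It assumes a simplified, hypothetical instruction format (an array of opcode-name strings); the real bytecode is a flat array of numbers with inline operands, and the function name `dropDeadPushes` and the set of pushes treated as side-effect free are assumptions made for this example, not code from the pull request.

```javascript
// Hypothetical, simplified instruction format: one opcode name per entry.
// The real bytecode is numeric and carries operands inline, so this only
// illustrates the idea and is not the implementation in optimize-bytecode.js.
const SIDE_EFFECT_FREE_PUSHES = new Set([
  "PUSH_NULL",
  "PUSH_UNDEFINED",
  "PUSH_FAILED",
  "PUSH_EMPTY_ARRAY",
  "PUSH_CURR_POS",
]);

// Remove every side-effect-free push that is immediately discarded by a POP.
function dropDeadPushes(code) {
  const out = [];
  for (const op of code) {
    const prev = out[out.length - 1];
    if (op === "POP" && SIDE_EFFECT_FREE_PUSHES.has(prev)) {
      // The pushed value is never observed, so drop both the push and the POP.
      out.pop();
      continue;
    }
    out.push(op);
  }
  return out;
}

// Usage: the nested pair PUSH_NULL, PUSH_CURR_POS, POP, POP collapses too,
// because the output array is re-examined after each removal.
console.log(
  dropDeadPushes(["PUSH_NULL", "POP", "MATCH_ANY", "PUSH_NULL", "PUSH_CURR_POS", "POP", "POP"])
);
```

Building the result incrementally means a removal can expose a new PUSH/POP pair, which a single left-to-right scan over the original array would miss.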

fixes

markw65 · 2023-08-28T01:35:54Z

lib/peg.d.ts

+  type OutputType =
+    | "ast"
+    | "parser"
+    | "source-and-map"
+    | "source-with-inline-map"
+    | "source"
+    ;
+


I think I missed something here... I should probably be using SourceBuildOptions rather than Options. Although SourceOutputs doesn't include "ast". I'll take a look...

I pushed a change that reverts this and uses SourceBuildOptions<SourceOutputs> instead.

Mingun · 2023-08-28T15:32:05Z

I only have one concern: the bytecode was introduced by David (creator of pegjs) to be an abstraction over a piece of JS code that can be tested easily and has simple semantics. This change turns the bytecode into a more complicated thing, which probably could be simplified by using a more low-level bytecode.

So probably we should create that low-level bytecode and generate it directly? And then use it to generate JS. Roughly speaking, that means we should just renumber our bytecode instructions. It seems to me that it would be more in the spirit of the library, wouldn't it?

hildjj · 2023-08-28T15:45:40Z

I definitely want to rework code generation at some point, because the current mechanism of generating source maps isn't really sustainable, particularly as we want to add other output formats.

markw65 · 2023-08-28T17:32:11Z

So definitely having a lower level bytecode would make it possible to optimize more. Some of the bytecodes just do too much currently. But changing the bytecode presumably breaks all existing plugins - so I was trying to avoid doing that.

If you don't think this is something you want to take, I can redo it as a plugin, and maintain it myself until the big bytecode rewrite, when it will hopefully become redundant...
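As a rough sketch of what the plugin route could look like, the following slots an extra pass between bytecode generation and JS generation. It assumes the plugin hook `use(config, options)` and a `config.passes.generate` array containing a function named `generateBytecode`; both should be checked against the Peggy version in use. `optimizeBytecode` and `insertPassAfter` are hypothetical names invented for this example, and the pass body is a placeholder.

```javascript
// Hypothetical optimizer pass. A real one would rewrite the bytecode attached
// to each rule; this placeholder leaves the AST untouched.
function optimizeBytecode(ast, options) {
  // Optimizations would go here.
}

// Insert `pass` directly after the pass whose function name is `afterName`.
function insertPassAfter(passes, afterName, pass) {
  const index = passes.findIndex(p => p.name === afterName);
  if (index === -1) {
    throw new Error(`Pass "${afterName}" not found`);
  }
  passes.splice(index + 1, 0, pass);
}

// Assumed plugin shape: an object with a `use(config, options)` method that
// edits the pass lists before compilation starts.
const optimizerPlugin = {
  use(config) {
    insertPassAfter(config.passes.generate, "generateBytecode", optimizeBytecode);
  },
};
```

Under these assumptions the plugin would be supplied through the `plugins` option when generating a parser. Looking the pass up by name, rather than by position, keeps the sketch independent of how many passes the stage happens to contain.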

markw65 · 2023-08-30T18:54:00Z

> So probably we should create that low-level bytecode and generate it directly

I think you'd still want to keep code generation simple, and then optimize the bytecode. I mean, without changing the bytecode at all, the code generator could emit optimized bytecode similar to what I'm producing here. But I think it would be harder to do...

markw65 · 2023-09-06T21:52:14Z

So I've taken this, and some more work I did on top, and converted it to a plugin.

Since this depends on some non-exported functionality, and since I also need a few updates to generate-js.js to get the full benefit of the bytecode optimizations, I ended up just using rollup to create the plugin from my fork of peggy. It's published as @markw65/peggy-optimizer, and includes #425 and #427.

I still think this (plus the additional work I've done) is worth taking; the generated code with the plugin is much cleaner - and much closer to what a person might write (although loops are still awkward).

hildjj · 2024-01-27T18:43:13Z

I'm going to close this in favor of the plugin for now. Might revisit after we look at code generation.

markw65 added 3 commits August 27, 2023 09:06

Implement a framework for creating a bytecode optimizer

e82e982

fixes

Add some tests for interp-state.js

a18cdca

Add an optimization step to combine consecutive if-blocks

6739ba3

markw65 mentioned this pull request Aug 28, 2023

Make generate-js.js ts clean #430

Merged

markw65 added 14 commits August 28, 2023 07:46

Revert changes to peg.d.ts

f55227f

artifacts

0a06c13

Drop various PUSH opcodes, if they're immediately discarded

fcbe9e2

artifacts

2b8e82a

Drop consecutive POP_CURR_POS

993fe9f

artifacts

943d972

Don't restore currPos if it hasn't changed

b5b7ff6

artifacts

7297da1

Include WHILE_NOT_FAILED in the if/else optimization

2a80538

artifacts

3a0f60d

Drop nested SILENT_FAILS*

36de099

Kill redundant updates to currPos

3e33288

artifacts

0f21542

Add some optimizer tests

821912a

markw65 force-pushed the bytecode-optimizer branch from 00eb942 to 821912a Compare August 28, 2023 15:17

hildjj closed this Jan 27, 2024
