Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi file #417

Merged
merged 4 commits into from
Dec 28, 2023
Merged

Multi file #417

merged 4 commits into from
Dec 28, 2023

Conversation

hildjj
Copy link
Contributor

@hildjj hildjj commented Apr 28, 2023

This is a PR draft exploring another part of #292.

What if you could just pass multiple files in on the command line, and have Peggy stitch them together into a single output?

This points out a few issues:

  • Code generation does not rely on source information in the AST adequately. Instead, it guesses based on the grammarSource passed in. This is another strong argument for changing how we do code generation.
  • topLevelInitializer and initializer in the AST should be arrays. This is so that when they get copied into the output, each initializer section would be marked up with the correct source-map info.
  • There might be imports in the top-level initializers of multiple files. They should probably all be moved to the top, but this is a JS-specific transform that should be in a compiler pass.

@hildjj
Copy link
Contributor Author

hildjj commented Apr 28, 2023

There's a timing bug in the watchfile tests as well. Will deal with it if we end up using this.

@hildjj
Copy link
Contributor Author

hildjj commented Dec 10, 2023

I think the sourcemap thing isn't an issue. This works surprisingly well already. Still needs a compiler pass that checks if output type is "es" and looks for import lines to move to the top.

@hildjj hildjj force-pushed the multi-file branch 3 times, most recently from 519d96f to 121273b Compare December 11, 2023 21:14
@hildjj hildjj marked this pull request as ready for review December 11, 2023 23:33
@hildjj hildjj requested a review from Mingun December 11, 2023 23:33
@hildjj
Copy link
Contributor Author

hildjj commented Dec 11, 2023

Still needs docs, a changelog entry, and some thought about how this affects the version number.

@hildjj
Copy link
Contributor Author

hildjj commented Dec 12, 2023

Being able to specify a URL in the CLI would be pretty cool. In order to do this, I'd look for filenames starting with https?://, use fetch to pull down the contents, and cache it, like deno. This would up the minimum version of node required to be v18 -- That's fine with me because:

a) we're taking a breaking change in this release for node version anyway.
b) Node 16 is no longer supported.

This would make importing rules from a central repository easier, I think.

bin/peggy-cli.js Outdated Show resolved Hide resolved
@frostburn
Copy link
Contributor

Tested combining grammars from two repositories successfully: https://github.com/frostburn/peggy-string-concat

bin/peggy-cli.js Outdated
}
this.outputFile = this.progOptions.output;
this.outputJS = this.progOptions.output;

if ((this.inputFile === "-") && this.argv.watch) {
if ((this.inputFiles.indexOf("-") !== -1) && this.argv.watch) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about the more readable inputFiles.includes("-") insteads?

bin/peggy-cli.js Outdated
const req = (
// In node 12+, createRequire is documented.
// In node 10, createRequireFromPath is the least-undocumented approach.
Module.createRequire || Module.createRequireFromPath
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using nullish coalescing ?? would be more idiomatic here. Are we on a node version that doesn't support it? Not seeing it anywhere in the source.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hildjj, you say that you would like to bump required node version to 18, so you can use only createRequire

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think createRequire is in node12, but I'm fine bumping to Node18 required, and using anything new we want in the CLI code. The web-facing code we should be a little more careful with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just removed node 10 compatibility.

Copy link
Member

@Mingun Mingun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good, but I don't like the fact that peggy grammar becomes tied to JS. That is the same problem that we have now for unbalanced {} which could be present in JS comments or strings and actually does not create unbalanced braces, but peggy grammar does not understand that because code blocks are language-agnostic.

I suggest to merge this without changes in the grammar and propose grammar changes in separate PR. Ideally we should find a way to make grammar modularised so language plugins can provide they own methods to parse code blocks.

bin/peggy-cli.js Outdated
if (this.inputFiles.indexOf("-") === -1) {
let inFile = this.inputFiles[0];
// You might just want to run a fragment grammar as-is,
// particularyly with a specified start rule.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misprint

Suggested change
// particularyly with a specified start rule.
// particularly with a specified start rule.

bin/peggy-cli.js Outdated
const req = (
// In node 12+, createRequire is documented.
// In node 10, createRequireFromPath is the least-undocumented approach.
Module.createRequire || Module.createRequireFromPath
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hildjj, you say that you would like to bump required node version to 18, so you can use only createRequire

bin/peggy-cli.js Outdated
// In node 10, createRequireFromPath is the least-undocumented approach.
Module.createRequire || Module.createRequireFromPath
)(prevSource);
prevSource = req.resolve(source.slice(4));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a comment, what parts are skipped here.

bin/watcher.js Outdated
}
const filename = path.join(dir, fn);
// Might be a different fil changing in one of the target dirs
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Misprint

Suggested change
// Might be a different fil changing in one of the target dirs
// Might be a different file changing in one of the target dirs

Comment on lines 129 to 140
<p>If you specify multiple input files, they will be folded together in the
order specified before generating a parser. <code>import</code> statements in
the top-level initializers from each of the inputs will be moved to the top of
the generated code, and all other top-level initializers will be inserted
directly after those imports, in the order of the inputs. This approach can
be used to keep libraries of often-used grammar rules in separate files.</p>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe document possible caveats of automatic import movement. For example, if imports in the block comment (/*...*/), would them be moved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See if the comment I added was good enough? Importantly, the ECMAscript grammar treats all comments after an import as part of the import.

Comment on lines 18 to 21
// Put library code above code that uses it, so that variables
// will be in scope.
const res = bb.filter(x => x).concat(aa.filter(x => x)); // Remove nulls.
return res;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that comment should be at call site rather than here

Suggested change
// Put library code above code that uses it, so that variables
// will be in scope.
const res = bb.filter(x => x).concat(aa.filter(x => x)); // Remove nulls.
return res;
return bb.filter(x => x).concat(aa.filter(x => x)); // Remove nulls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. As well, if we aren't generating es-js, the inputs should not be reordered. That's a decision for the generator.

src/parser.pegjs Outdated
@@ -57,14 +57,64 @@ Grammar
}

TopLevelInitializer
= "{" code:CodeBlock "}" EOS {
return {
= "{" "{" imports:ImportDeclarations code:BareCodeBlock "}" "}" EOS {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can create problems for plugins that add support for other languages, because they would have other mechanisms for imports.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree this is an issue. I suppose we could have a sub-grammar that can be used in generate-js to perform this transform.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've implemented the sub-grammar approach, which only gets called from generate-js, and only if we are in "es" format. This should be a lot safer.

@hildjj
Copy link
Contributor Author

hildjj commented Dec 26, 2023

Another patch coming with other issues fixed.

@hildjj
Copy link
Contributor Author

hildjj commented Dec 26, 2023

OK, I think I fixed all of the outstanding issues. I've left this as two patches since review to make it easier to see the changes, but I'll reorder and clean up the patch before merging.

/ StringLiteral

BindingIdentifier = id:IdentifierName &{
return reservedWords.indexOf(id[0]) === -1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.includes instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this one needs to run in the browser.

bin/peggy-cli.js Outdated
this.argv.watch = false; // Make error throw.
this.error("Can't watch stdin");
}

if (!this.outputFile) {
if (this.inputFile !== "-") {
this.outputJS = this.inputFile.slice(
if (this.inputFiles.indexOf("-") === -1) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.includes instead

@hildjj
Copy link
Contributor Author

hildjj commented Dec 27, 2023

This is ready to squash and merge.

…nto one output. Not ready. Still needs change to source-map generation, for topLevelInitializer to be an array, and to make moving imports to the top of all topLevelInitializers (keeping track of source info) a JS-specific pass.
@hildjj
Copy link
Contributor Author

hildjj commented Dec 27, 2023

ready to go if anyone wants to take a last look. Next step is to merge the imports grammar back into the main grammar, and give it two entry points. Then, I'll use the imports grammar to add imports syntax before toplevel initializers.

@hildjj hildjj merged commit cbea181 into peggyjs:main Dec 28, 2023
9 checks passed
@hildjj hildjj deleted the multi-file branch December 28, 2023 16:35
hildjj added a commit to hildjj/peggy that referenced this pull request Jan 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants