Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial support for binding patterns in SemIR #4221

Open
wants to merge 13 commits into
base: trunk
Choose a base branch
from

Conversation

geoffromer
Copy link
Contributor

@geoffromer geoffromer commented Aug 16, 2024

Introduces the BindingPattern and SymbolicBindingPattern insts, and a separate stack of pattern blocks that they are emitted into. The intent is to generate the corresponding pattern-matching insts (like BindName) from them in a separate pass, but that is deferred to future PRs.

See here for the design this is based on, but note that during review we have chosen to deviate from that design by putting the patterns in separate blocks, and omitting the "forward references" from a BindingPattern to its corresponding BindName. This in turn necessitates having separate inst kinds for symbolic and non-symbolic binding patterns.

@github-actions github-actions bot requested a review from josh11b August 16, 2024 18:01
@geoffromer geoffromer marked this pull request as draft August 16, 2024 18:05
@geoffromer geoffromer marked this pull request as ready for review August 16, 2024 18:07
@github-actions github-actions bot requested a review from jonmeow August 16, 2024 18:07
@geoffromer geoffromer removed the request for review from josh11b August 16, 2024 18:08
Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be sure I understand correctly, is this change supposed to only be adding BindingPattern without really using it? Can you please use the PR description to elaborate a little on what's being implemented here?

toolchain/sem_ir/inst_namer.cpp Outdated Show resolved Hide resolved
toolchain/sem_ir/inst_kind.def Outdated Show resolved Hide resolved
toolchain/sem_ir/function.cpp Outdated Show resolved Hide resolved
toolchain/check/eval.cpp Outdated Show resolved Hide resolved
toolchain/check/handle_pattern_list.cpp Outdated Show resolved Hide resolved
toolchain/sem_ir/formatter.cpp Outdated Show resolved Hide resolved
toolchain/check/import_ref.cpp Outdated Show resolved Hide resolved
toolchain/check/handle_binding_pattern.cpp Outdated Show resolved Hide resolved
toolchain/lower/file_context.cpp Outdated Show resolved Hide resolved
Copy link
Contributor Author

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be sure I understand correctly, is this change supposed to only be adding BindingPattern without really using it?

This PR "really uses it" in the way that it's designed to be used. In particular, we traverse the BindingPatterns in order to emit the corresponding BindNames.

Can you please use the PR description to elaborate a little on what's being implemented here?

How's this?

toolchain/check/eval.cpp Outdated Show resolved Hide resolved
toolchain/check/handle_binding_pattern.cpp Outdated Show resolved Hide resolved
toolchain/check/import_ref.cpp Outdated Show resolved Hide resolved
toolchain/lower/file_context.cpp Outdated Show resolved Hide resolved
toolchain/lower/handle.cpp Outdated Show resolved Hide resolved
toolchain/sem_ir/function.cpp Outdated Show resolved Hide resolved
toolchain/sem_ir/inst_kind.def Outdated Show resolved Hide resolved
toolchain/sem_ir/typed_insts.h Show resolved Hide resolved
toolchain/check/handle_pattern_list.cpp Outdated Show resolved Hide resolved
Copy link
Contributor Author

@geoffromer geoffromer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm particularly looking for feedback on the textual IR changes, because I'm not familiar enough with the conventions to tell if what I have is good, or how to improve them if not.

toolchain/check/handle_binding_pattern.cpp Outdated Show resolved Hide resolved
case SymbolicBindingPattern::Kind: {
auto inst = untyped_inst.As<AnyBindingPattern>();
add_inst_name_id(
sem_ir_.entity_names().Get(inst.entity_name_id).name_id, ".patt");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: the ".patt" never appears in the textual IR output. The only effect of omitting that parameter here is that the instruction numbering changes slightly. I can't tell whether any of that is working as intended.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@looking at an example, I think this is related to:

%.loc11_8: i32 = binding_pattern a

I would expect that should be named. My guess would be that the block isn't getting touched. This is probably worth some debugging about why the assignment isn't taking, e.g. by figuring out one of the InstIds for a BindingPattern and then looking at naming calls related to it. (I'd probably just insert logs in the get and set code for starters, maybe logging for every BindingPattern instruction in a test where there's only one)

Do you want help figuring this out, or is this something you'd like to figure out what's going on?

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, I'm still looking at this today, but wanted to publish comments in case you want to discuss something during the toolchain meeting. Overall I think the direction this PR is taking seems a good start, though there are still some details that may be good to discuss (particularly see DeclInfo).

@@ -147,6 +148,13 @@ auto Context::AddPlaceholderInst(SemIR::LocIdAndInst loc_id_and_inst)
return inst_id;
}

auto Context::AddPatternInst(SemIR::LocIdAndInst loc_id_and_inst)
-> SemIR::InstId {
auto inst_id = sem_ir().insts().AddInNoBlock(loc_id_and_inst);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had you considered calling either Context::AddInstInNoBlock or Context::AddPlaceholderInstInNoBlock here? Note both of those provide a little extra logic (for VLOG and constant), which is why they're called from AddInst/AddPlaceholderInst. Normally we'd want a constant associated, which would be AddInstInNoBlock.

-> bool {
// TODO: when/if parameterized aliases are supported, this must be
// attached to the `BindAlias` inst.
(void)context.pattern_block_stack().Pop();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
(void)context.pattern_block_stack().Pop();
context.pattern_block_stack().PopAndDiscard();

Generally I don't think we write (void) except where required by the compiler for an unused variable/parameter. If you're trying to indicate that there's a value being discarded, maybe the PopAndDiscard call would capture that?

(note, this same comment would apply elsewhere that you're doing this)

@@ -302,6 +302,36 @@ struct BindValue {
InstId value_id;
};

// Common representation for various `*binding_pattern` nodes.
struct AnyBindingPattern {
// TODO: Also handle BindTemplateName once it exists.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this TODO be updated?

case SymbolicBindingPattern::Kind: {
auto inst = untyped_inst.As<AnyBindingPattern>();
add_inst_name_id(
sem_ir_.entity_names().Get(inst.entity_name_id).name_id, ".patt");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@looking at an example, I think this is related to:

%.loc11_8: i32 = binding_pattern a

I would expect that should be named. My guess would be that the block isn't getting touched. This is probably worth some debugging about why the assignment isn't taking, e.g. by figuring out one of the InstIds for a BindingPattern and then looking at naming calls related to it. (I'd probably just insert logs in the get and set code for starters, maybe logging for every BindingPattern instruction in a test where there's only one)

Do you want help figuring this out, or is this something you'd like to figure out what's going on?

Comment on lines +448 to +450
DeclInfo decl = sem_ir_.decls().Get(inst.decl_id);
CollectNamesInBlock(scope_id, decl.pattern_block_id);
CollectNamesInBlock(scope_id, decl.decl_block_id);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the copies, had you considered a helper function for this?

Comment on lines +12 to +19
// Information about a declaration.
struct DeclInfo {
// The pattern block, containing the pattern insts for the parameter pattern.
InstBlockId pattern_block_id;
// The declaration block, containing the declaration's parameters and their
// types.
InstBlockId decl_block_id;
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What had you considered as alternatives before the DeclInfo approach? A couple I'm wondering about are:

  • Adding an instruction to decl_block_id which just declares that there's a pattern
    • e.g. struct PatternBlock { InstBlockId block_id; }
    • i.e., as a generic approach that could apply to any block that adds a pattern.
  • Putting pattern_block_id on EntityWithParams
    • i.e., tracking it alongside parameters that it's providing a pattern for

To explain why I'm thinking about this, adding more ValueStores feels a little incremental complexity for understanding the toolchain. decl_block_id was added to instructions because it fit into the instruction and was mainly for formatting. However, as part of splitting out to a new structure, it's no longer taking advantage of free space in the declaration's instruction.

Maybe it'd help for me to understand, what's the access pattern you expect for pattern_block_id?

If only kept for the printed IR, the PatternBlock instruction approach might be good: it keeps an association that you could get formatted instructions for, and also provides an approach that would work for arbitrary constructs. The overhead is marginally higher (essentially 12 more bytes), but more flexible beyond the current set of declarations. I expect this approach would need the PatternBlock to be skipped for lowering, but it'd just be the one instruction.

If you need to go back to the pattern block from the decl instruction later during checking, then EntityWithParams might be good: we would probably need it to be associated with the params themselves. EntityWithParams is already shared where you're using DeclInfo, meaning overhead is marginally lower (essentially 4 fewer bytes).

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I guess I was almost done looking through, just a couple more comments.

@@ -41,6 +41,8 @@ auto HandleParseNode(Context& context, Parse::ClassIntroducerId node_id)
context.decl_name_stack().PushScopeAndStartName();
// This class is potentially generic.
StartGenericDecl(context);
// Start a new pattern block for the signature.
context.pattern_block_stack().Push();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note, but something else to consider with storage might be whether we want the parameters and implicit parameters to each have their own pattern block. A single block is fine for now, I'm not asking for a change in this PR, but it might be something that affects the implementation more later.

class Context;

// Stack of blocks containing pattern insts.
class PatternBlockStack {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had you reconsidered reusing InstBlockStack? There's a lot of logic there that might apply here too: avoiding allocation for empty blocks, debug logging, printing support for stack traces, and so on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants