Define Ambiguity node #337
Signed-off-by: Denys Smirnov <denys@sourced.tech>
While I think this is a nice feature to have, it was unplanned, and the impact of this and the Bool PR is not trivial, since it would require updating and debugging all drivers again. So we should probably discuss not so much when to merge this (it seems to be backward compatible until you start upgrading nodes) but when to upgrade the drivers. I suspect having finished I also have a couple of questions:
If I understand the proposal correctly, this is effectively adding sum types to our abstract type system. That's not necessarily a bad idea; several languages have variations of sum-typing that we might want to incorporate anyway. However, I agree with @juanjux that it's a much larger design question, and we should consider it as a new feature rather than adding it in piecewise. I tend to agree that short term we should probably be aiming for "breadth" of coverage (e.g., more languages) before "depth" of coverage (e.g., more detailed types). Both are valuable, but I think the immediate impact for our customers is higher if we go broad first.
The aim of the proposal is to start a discussion about this specific problem. Just in case: I'm not proposing to update all the drivers :) I also agree that extending language coverage should be the highest priority. This PR is more like a brain-dump of a problem I was thinking about in the background. If you think the proposal is reasonable, let's discuss it (again, in the background).

@juanjux Those are really good questions. First, this node type won't break annotation code, because the Semantic pass happens before the Annotation pass, and the latter is already prepared to deal with arbitrary Semantic types. And annotations won't ever target Semantic types directly, because it doesn't make sense to do so (Semantic types are "roles" themselves).

@creachadair I'm not sure that this is similar to sum types. Sum types assume that there is a set of defined types and that the sum type may consist of a few of them. And this is somehow expressed in the AST/code/type system/whatever. Instead, the transformation code will search for specific patterns (

I think it's a useful change, in general. For most languages, we know 2-3 possible meanings of operators or some identifiers, and we can easily be more specific than "unknown binop %" or "identifier
This PR defines a new node type for SemUAST called `Ambiguity`. This node will represent multiple alternative meanings of the same node in case the driver cannot determine its type with 100% certainty (without running the compiler or symbol resolution), but can propose a small number of alternative meanings.

The first use case is that languages like Python, Go and some others have built-in names like `True`/`False` that can be redefined. The driver might mark them as `Identifier` and be done with it, but this won't be that helpful for the end user. In fact, the driver knows that in this specific language the `True` identifier is, with 99% certainty, a boolean literal. So instead of dumping a generic `Identifier`, the driver may emit an `Ambiguity` node with both `Bool` and `Identifier` nodes.

The second use case is that sometimes operators might be overloaded in the language itself. I don't want to state that this will solve the issue of user-defined operator overloading, though. The current use case is a bit simpler for now. For example, in Python there is a `%` operator that can be a modulo operator or a string interpolation operator. The driver may emit both possible versions as an `Ambiguity` node if it cannot assert the type of the first operand.

Later, this may help to identify places in the UAST that need additional analysis. And it will immediately allow searching for nodes of more specific types in Babelfish instead of generic ones.