Disjunctive normal form attempt #558
base: master
Conversation
…var_with_guard_should_pass.erl
Test a tuple with an inner call returned directly and through a type var. The inner call returns a union, which complicates the typing. Move passing tests to their own file.
Hmm, now that I look at the results again, they don't look THAT bad:
Maybe it's not such a bad idea after all... I remembered the result from June as much worse 🤔
Nice! I believe this is a good approach, especially since CDuce is the language where these theories by Castagna et al. are tested. The two papers mentioned below are listed on the CDuce website. (I read them some years ago.) In "Gradual Typing with Union and Intersection Types" (2017) they talk about DNF and Gradual-DNF, some other concepts, and the steps needed to convert types into this form, compute subtyping, etc. In "Covariance and Contravariance: a fresh look at an old issue (a primer in advanced type systems for learning functional programmers)" (2018), there's an efficient data structure, a "binary decision diagram with lazy unions", for representing types in DNF, and an algorithm for transforming types.
Indeed, especially "Covariance and Contravariance" is very informative. It contains one more brilliant idea which I tested in Gradualizer, but with quite a dramatic impact on test results. The idea is defining subtyping by type difference, i.e. redefining Gradualizer's current `subtype(T1, T2)` as `is_empty(type_diff(T1, T2))`. While this PR's approach to DNF doesn't seem to be as dramatic, I think it's not complete either. Together with the subtyping redefinition, we'd be getting very close to the theory described there. The question, though, is whether it's worth changing fundamental parts of Gradualizer in light of etylizer being developed from scratch according to that theory. As we speak, it still covers only a small subset of Erlang's language features, but it seems to have a more sound and better-understood theoretical underpinning 🤔
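For reference, a minimal sketch of that redefinition, assuming types in the erl_parse abstract format; `is_empty/1` is a hypothetical helper, and this is an illustration rather than Gradualizer's actual code:

```erlang
%% Subtyping by type difference: T1 <: T2 iff removing T2 from T1 leaves
%% nothing.
subtype(T1, T2) ->
    is_empty(type_diff(T1, T2)).

%% The empty type in the erl_parse abstract format is {type, Anno, none, []}.
is_empty({type, _, none, []}) -> true;
is_empty(_) -> false.
```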
Etylizer is not gradual typing, is it? Anyway, I don't have the time and energy to do such major work in Gradualizer at this time.
At the very least they accept the dynamic type on the syntax level, so I imagine they would like to expand in that direction.
I'm afraid that's the main problem. There's a chance for some EEF (Erlang Ecosystem Foundation) funding (I spoke to one of the board members about it - nothing certain, but still), but apart from a budget, we would also need people with capacity. cc @xxdavid @josefs
Good idea! Thanks for experimenting with it. :) 20 failed tests don’t sound like that much. It seems to me that most of them are related to maps and glb, so maybe it would be sufficient to patch these places. But it might also require some deep refactoring, who knows…

Also, have you thought about the exponential explosion? Do you think it’s safe to use this expansion on type expressions of “practical size”?

Regarding the redefinition of `subtype`, what do you think would be the advantages (and possibly disadvantages) of it compared to the current implementation?

Funding for Gradualizer sounds really cool! I think it could help the project a lot. Not having anyone who would work on it regularly and as part of their job is one of the things that distinguishes Gradualizer from eqWAlizer and etylizer, and probably one of the reasons why it doesn’t move very fast (compared to the other projects). Sadly, I don’t have much time for it either. I’m currently immersed in trying to improve the polymorphism support, and do not have much capacity left.
> 20 failed tests don’t sound like that much.

Indeed, I might give it a go and try fixing them when I find some time.
> Also, have you thought about the exponential explosion? Do you think it’s safe to use this expansion on type expressions of “practical size”?

I don't know :| I think it can be managed, given that CDuce, etylizer, and the upcoming Elixir type system all have to use this approach (because they follow the paper that suggests using DNF).
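For a feel of the blow-up: expanding a structure whose elements are unions into a union of plain structures multiplies the alternatives at every position, e.g. `{t, a | b, c | d}` becomes `{t, a, c} | {t, a, d} | {t, b, c} | {t, b, d}`. A minimal sketch of that expansion, with types modelled as informal Erlang terms rather than Gradualizer's internal representation:

```erlang
%% Expand a tuple type whose elements may be unions into the union of all
%% plain tuples, i.e. the cartesian product of the per-position choices.
%% With N union positions of K alternatives each, the result has K^N members.
%% Types are modelled informally as: {union, [T]} | {tuple, [T]} | atom().
expand({tuple, Elems}) ->
    Choices = [expand(E) || E <- Elems],
    [{tuple, Combo} || Combo <- cartesian(Choices)];
expand({union, Ts}) ->
    lists:append([expand(T) || T <- Ts]);
expand(T) ->
    [T].

cartesian([]) -> [[]];
cartesian([Choices | Rest]) ->
    [[C | Tail] || C <- Choices, Tail <- cartesian(Rest)].
```

Four positions each holding a three-way union already yield 81 alternatives, which is why "practical size" is the right question.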
> Regarding the redefinition of `subtype`, what do you think would be the advantages (and possibly disadvantages) of it compared to the current implementation?

Currently we have two somewhat redundant and not completely compatible relations comparing types in Gradualizer: `subtype` and `type_diff`. `refinable` succinctly defines for which types `type_diff` is defined. This has led to some confusion a few times already, e.g. here - #346 (comment). To sum up, `type_diff` is not total on types (type pairs).

So `type_diff` "compares" some types to tell if one can be refined with the other and what the result is. `subtype` compares types to tell if one is a subtype of another. These are quite similar notions. It seems to me that `disjoint` is another way of saying "not a subtype or supertype". (If one type is a subtype or supertype of another, then some refinement in that pair must be possible, either to a narrower type or, ultimately, to `none()`.) Expressing `subtype` with `type_diff` seems natural in this light. However, `subtype` has to be a total relation - otherwise we couldn't use it on arbitrary types - but our `type_diff` isn't.

Practically, unifying them would mean we could have just one function implementing the non-trivial logic. I'm not sure if it would give any other significant benefits (like expressiveness), but it would help clarify a concept which currently seems to be a bit hazy in Gradualizer.
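A rough sketch of what that unification could look like, assuming `type_diff/2` were made total; `refine/2` and `is_empty/1` are hypothetical names, not Gradualizer's current API:

```erlang
%% Hypothetical sketch: one total type_diff/2 driving both relations.

%% Subtyping, as in the redefinition discussed earlier in the thread.
subtype(T1, T2) ->
    is_empty(type_diff(T1, T2)).

%% Refinement, e.g. during exhaustiveness checking: what remains of a
%% scrutinee's type after a clause matching PatTy has been tried.
refine(ScrutineeTy, PatTy) ->
    type_diff(ScrutineeTy, PatTy).
```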
> Funding for Gradualizer sounds really cool! [...]
> Sadly, I don’t have much time for it either.

I think if the funding happened, it would be a fixed budget / one-time grant, not a long-term cooperation or employment opportunity. However, it seems none of us, the latest contributors, can invest the time or finds it compelling enough :|
I'm creating this draft PR with the intention of closing it immediately afterwards. It's not a working solution; I just want to describe the problem and show an experiment aimed at solving it.
In #524 (comment) I said I'd experiment with disjunctive normal form to address the following problem from Gradualizer's own code. The problem is quite prevalent - it occurs everywhere we have a structure with an inner union, for example pretty much every time we handle `type()` instances:

In the above example we see that even though there's a dedicated clause matching the case of `Args1 = any` (and symmetrically `Args2 = any`), we still have to assert that `Args1 :: [type()]` (respectively `Args2 :: [type()]`) in the clause which handles exactly that - `Args1` and `Args2` already refined to `[type()]`. We have to use a superfluous assertion, even though it's obvious to us and should be obvious to the typechecker.

I think it's caused by the following:
In other words, while Gradualizer understands that `{t, a} | {t, b} :: {t, a | b}`, it does not understand that `{t, a | b} :: {t, a} | {t, b}`. It seems to me that this property holding in both directions is crucial for our needs.

As you can see in `notes.1.md` in this PR, I contrasted it with the Elixir prototype based on the CDuce set-theoretic type checker. For Elixir/CDuce, the property holds in both directions.

To some extent it can be achieved by "expanding" unions as done in this PR - and it fixes some particular cases, especially the new tests in `typechecker_tests` - however, the general effect on tests is disastrous :(
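Since the inline example isn't reproduced above, here is a hypothetical sketch of the kind of clause being described; the function name, types, and body are illustrative only, not the actual Gradualizer code:

```erlang
%% Args1/Args2 are either the atom 'any' (as in fun((...) -> T)) or a list
%% of argument types.
-type type() :: term().                 %% stand-in for Gradualizer's type()
-type fun_args() :: any | [type()].

-spec compat_fun_args(fun_args(), fun_args()) -> boolean().
compat_fun_args(any, _Args2) ->
    true;
compat_fun_args(_Args1, any) ->
    true;
compat_fun_args(Args1, Args2) ->
    %% Both arguments can only be [type()] in this clause, but without the
    %% {t, a | b} :: {t, a} | {t, b} direction the checker still sees them
    %% as 'any | [type()]', so a superfluous assertion (or cast) is needed
    %% before list operations are accepted.
    length(Args1) =:= length(Args2).
```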