O₃ examples don't work because we don't have event-based channels #19

Open
ashton314 opened this issue Nov 15, 2024 · 0 comments

ashton314 commented Nov 15, 2024

We can't support examples from the Ozone paper because our communication channels can't have methods bound to them that fire on message reception.

It's not clear how much this would give us: Elixir manages just fine without channel triggers. Perhaps we can find a way to construct a non-channel-trigger version of a system for every channel-trigger setup?

We have some tests in test/ooo_test.exs. The original plan was to write a failing test and then fix it by introducing line-number-based communication integrity tokens, but as of writing we can't trigger that bug.

Details:

Motivation: supporting O₃-like choreographies

We want to be able to specify actions to happen as messages arrive. From the Ozone paper, figure 15:

public class ConcurrentSend@(KS, CS, S, C) {
    public void start(
                      String@KS key, String@CS txt, Client@C client,
                      Token@(KS, CS, S, C) tok,
                      AsyncChannel@(KS, S) ch1, AsyncChannel@(CS, S) ch2, AsyncChannel@(S, C) ch3
                      ){
        // Services send data to the server.
        CompletableFuture@S keyS = ch1.fcom(key, 1@(KS,S), tok);
        CompletableFuture@S txtS = ch2.fcom(txt, 2@(CS,S), tok);

        // Server forwards data to the client.
        ch3.fcom(keyS, 3@(S,C), tok)
            .thenAccept(client::decrypt);
        ch3.fcom(txtS, 4@(S,C), tok)
            .thenAccept(client::display);
    }
}

The last four lines of the start function register callbacks that run on the client when a message arrives on ch3.

Problem: Elixir does not have channels to do this

We don’t have a channel object that we can set callbacks on to fire when a message arrives. We can block a process until it receives a message matching one of several patterns, but full asynchrony like the Choral example doesn’t work.
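To make the limitation concrete, here is a minimal sketch of what plain Elixir does offer: a selective receive over several patterns. The module name BlockingClient and the message shapes {:key, _} and {:txt, _} are assumptions for illustration, not Chorex's actual wire format.

```elixir
defmodule BlockingClient do
  # Blocks the calling process until one message matching either
  # pattern arrives, then returns a tag describing what to do with it.
  # Nothing runs "in the background": the caller is stuck here until
  # a message shows up.
  def await_one do
    receive do
      {:key, key} -> {:decrypt, key}
      {:txt, txt} -> {:display, txt}
    end
  end
end

# Usage: both messages are handled in arrival order, but only because
# the process explicitly calls await_one/0 twice.
send(self(), {:txt, "hello"})
send(self(), {:key, "k1"})
IO.inspect(BlockingClient.await_one())
IO.inspect(BlockingClient.await_one())
```

Unlike .thenAccept, there is no object to attach a callback to; the process has to decide where in its control flow it is willing to wait.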

Solution 1: Turn every process into a GenServer

This is hard.

Projecting every process into a GenServer would let us create choreographies that can run some function as soon as a message is received.

The problem is that we would break straight-line choreography code up across several message handling functions. Variables created as part of the choreography would no longer be in scope throughout what appears to be a block in the choreography.

We could fake this by storing all variables created by the choreography in the GenServer state. The issue then becomes making this state available to projected local expressions. Two options:

  1. Rewrite local expressions so that each occurrence of a variable, e.g. x, becomes state.vars[:x].

    This is difficult because Elixir macros cannot reason about binding information. We figured this out as part of type tailoring. Just to be sure I wasn’t missing something, I asked a question on the Elixir forum (post here) and José confirmed that Elixir macros cannot safely reason about variable bindings.

  2. Splat relevant pieces of state into each callback function.

    E.g. if the choreography variables are x, y, and z, we can put

    x = state.vars[:x]
    y = state.vars[:y]
    z = state.vars[:z]

    at the beginning of each handler function. A tricky problem remains to stuff the right variables back into the state at the end of the function, but I think that’s more tractable than solving problem №1.
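Option 2 might look roughly like the following hand-written sketch of what a projection could emit. The module name SplatServer, the %{vars: ...} state layout, the {:msg, payload} message shape, and the :vars call are all assumptions for illustration; none of this is existing Chorex output.

```elixir
defmodule SplatServer do
  use GenServer

  @impl true
  def init(vars), do: {:ok, %{vars: vars}}

  @impl true
  def handle_info({:msg, payload}, state) do
    # Splat the choreography variables into local scope.
    x = state.vars[:x]
    y = state.vars[:y]

    # Stand-in for a projected local expression; it rebinds y.
    y = x + y + payload

    # Stuff the (possibly rebound) variables back into the state.
    {:noreply, %{state | vars: %{state.vars | x: x, y: y}}}
  end

  # Only here so the example can observe the stored variables.
  @impl true
  def handle_call(:vars, _from, state), do: {:reply, state.vars, state}
end

{:ok, pid} = GenServer.start_link(SplatServer, %{x: 1, y: 2})
send(pid, {:msg, 10})
IO.inspect(GenServer.call(pid, :vars))
```

The write-back at the end is the tricky part mentioned above: this sketch writes back every variable unconditionally, which sidesteps working out which ones the handler actually rebound.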

Either way, this would be a substantial rewrite of how Chorex works, and we would lose some nice things, e.g. automatically getting some checks thanks to Chorex just being a simple (for sufficiently tricky values of “simple”) macro.

We would gain something by letting processes get stopped mid-choreography by a supervisor. We would reserve messages starting with e.g. {:chorex_control, …} for handlers that can recover from errors and whatnot. Again, more complexity, but more possibilities.

Solution 2: Introduce a limited form of thread joining

So far this is my favorite option.

Instead of making processes fully async, we could have a notion of “until all these messages are received, wait and fire off functions for each process”. While not as general, we could get the behavior of some of the Ozone examples, and possibly the most important/realistic ones. JavaScript’s Promise.all function is kind of like this.

This might rule out some exotic error-recovery options. That said, as I’ve started thinking about how error recovery with supervision trees might work, I’m not sure how much granularity we will need.

My current thoughts with error recovery are that we should start with a rather coarse-grained approach and refine later as needed.
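As a rough illustration of the limited join, the sketch below blocks until one message per expected tag has arrived and runs each handler as soon as its message is received. The module name LimitedJoin, the function await_all/1, and the {tag, payload} message shape are hypothetical; this is not a proposal for the actual Chorex syntax.

```elixir
defmodule LimitedJoin do
  # handlers: map from message tag to a one-argument function.
  # Returns :ok once every handler has fired exactly once.
  def await_all(handlers) when map_size(handlers) == 0, do: :ok

  def await_all(handlers) do
    receive do
      # Only messages whose tag we are still waiting on are taken;
      # anything else stays in the mailbox.
      {tag, payload} when is_map_key(handlers, tag) ->
        {fun, rest} = Map.pop(handlers, tag)
        fun.(payload)
        await_all(rest)
    end
  end
end

# Usage: handlers fire in arrival order (txt first here), and the
# process continues only after both messages have been handled.
send(self(), {:txt, "ciphertext"})
send(self(), {:key, "secret"})

:ok =
  LimitedJoin.await_all(%{
    key: fn k -> IO.puts("decrypt with #{k}") end,
    txt: fn t -> IO.puts("display #{t}") end
  })
```

This keeps the choreography as straight-line code: everything after await_all/1 can assume that all of the joined messages have been processed.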

Solution 3: Use a different message backend that allows for registering actions

Something like Phoenix PubSub might work.

A separate library (either existing, or maybe we roll our own) might let us get some of the “do this action when you get this message” behavior of the .thenAccept function for low effort.

This feels like the least elegant solution and I’m a little hesitant to adopt it: we’d be pulling in a dependency and proxying all communication.
