WARNING: this crate is experimental and even careful use is likely undefined behavior.
This crate exposes four C standard library functions to Rust:
pub fn setjmp(env: *mut jmp_buf) -> c_int;
pub fn sigsetjmp(env: *mut sigjmp_buf, savesigs: c_int) -> c_int;
pub fn longjmp(env: *mut jmp_buf, val: c_int) -> c_void;
pub fn siglongjmp(env: *mut sigjmp_buf, val: c_int) -> c_void;
as well as the jmp_buf and sigjmp_buf types needed to use them.
See setjmp(3) for details and caveats.
Also see RFC #2625.
To interact better with C code that may use setjmp()/longjmp():
- If C code calls rust code, and the rust code calls C code, and a longjmp() happens, you may want the rust code to catch the longjmp(), transform it into a panic (to safely unwind), then catch_unwind(), then turn it back into a longjmp() to return to someplace in the C code (the last place that called setjmp()).
- If rust code calls C code, the rust code might want to catch a longjmp() from the C code and handle it somehow.
- Rust code might want to longjmp() to return control to C code.
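The catch_unwind() step in the first scenario can be sketched in plain Rust. This is a minimal illustration, not this crate's API: the names STATUS_OK, STATUS_PANICKED, run_guarded and rust_entry are hypothetical, and the sketch stops at producing a status code. Turning that status back into a longjmp() is left to the caller.

```rust
use std::os::raw::c_int;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical status codes shared with the C side.
pub const STATUS_OK: c_int = 0;
pub const STATUS_PANICKED: c_int = 1;

/// Run `body`, catching any panic so it cannot unwind into C frames.
pub fn run_guarded<F: FnOnce()>(body: F) -> c_int {
    match panic::catch_unwind(AssertUnwindSafe(body)) {
        Ok(()) => STATUS_OK,
        Err(_) => STATUS_PANICKED,
    }
}

/// Hypothetical entry point that C code would call.
pub extern "C" fn rust_entry(value: c_int) -> c_int {
    run_guarded(|| {
        if value < 0 {
            // Stands in for "a longjmp() was observed and turned into a panic".
            panic!("negative input");
        }
    })
}
```

Calling rust_entry(1) yields STATUS_OK, while rust_entry(-1) panics internally and yields STATUS_PANICKED instead of unwinding across the FFI boundary.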
It is possible to use setjmp()/longjmp() just for managing
control flow in rust (without interacting with C), but that would be
quite dangerous and has no clear use case.
Ordinarily, using a C function from rust is easy: you just declare it. Why go to the trouble of making a special crate?
- Document the numerous problems and caveats, as done in this document.
- Explore the problem space enough that the rust language team might feel comfortable defining the behavior (in at least some narrow circumstances).
- Provide tests to see if something breaks in an obvious way.
- Handle some platform issues:
  - The jmp_buf and sigjmp_buf types are not trivial and are best defined using bindgen on the system's <setjmp.h> header.
  - libc implementations often use macros to change the symbols actually referenced, and this is done differently on different platforms. For instance, instead of sigsetjmp the actual libc symbol might be __sigsetjmp, and there may be a macro to rewrite the sigsetjmp() call into __sigsetjmp().
The invocation of setjmp can appear only in the following contexts (see this comment):
- the entire controlling expression of match, e.g. match setjmp(env) { ... }.
- if setjmp(env) $integer_relational_operator $integer_constant_expression { ... }
- the entire expression of an expression statement: setjmp(env);
See tests for examples.
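The three permitted call shapes can be laid out side by side. In this sketch JmpBuf is a hypothetical placeholder for the crate's jmp_buf, and setjmp is a local STAND-IN that always returns 0, so the code compiles and runs without touching libc. It illustrates syntax only and says nothing about how the real function behaves.

```rust
use std::os::raw::c_int;

/// Hypothetical placeholder for the crate's `jmp_buf` type.
#[repr(C)]
pub struct JmpBuf {
    _opaque: [u8; 256],
}

impl JmpBuf {
    pub fn zeroed() -> Self {
        JmpBuf { _opaque: [0u8; 256] }
    }
}

/// STAND-IN for the real `setjmp`: always reports the direct return (0).
/// It does not save any context and is not the libc function.
pub unsafe fn setjmp(_env: *mut JmpBuf) -> c_int {
    0
}

/// Form 1: the call is the entire controlling expression of a `match`.
pub unsafe fn form_match(env: *mut JmpBuf) -> &'static str {
    match setjmp(env) {
        0 => "direct return",
        _ => "returned via longjmp",
    }
}

/// Form 2: the call is compared against an integer constant in an `if`.
pub unsafe fn form_if(env: *mut JmpBuf) -> bool {
    if setjmp(env) != 0 {
        return true; // with the real function: arrived here through longjmp()
    }
    false
}

/// Form 3: the call is the entire expression of an expression statement.
pub unsafe fn form_statement(env: *mut JmpBuf) {
    setjmp(env);
}
```

Shapes to avoid under this rule would include storing the result first (let r = setjmp(env);) or embedding the call in a larger expression.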
Beyond the many challenges using setjmp/longjmp in C, there are additional challenges using them from rust.
- The behavior of these functions is defined in terms of C, and therefore any application to rust is by analogy (until rust defines the behavior).
- Rust has destructors, and C does not. Any longjmp() must be careful to not jump over any stack frame that owns references to variables that have destructors.
- Rust doesn't have a concept of functions that return multiple times, like fork() or setjmp(), so it's easy to imagine that rust might generate incorrect code around such a function.
- Rust uses LLVM during compilation, which needs to be made aware of functions that return multiple times by using the returns_twice attribute; but rust has no way to propagate that attribute to LLVM. Without this attribute, it's possible that LLVM itself will generate incorrect code (see this comment).
- Jumping can interrupt well-bracketed control flow, circumventing guarantees about what code has run.
- Jumping can return control to a point before a value was moved, thereby allowing use-after-drop bugs.
- Jumping deallocates variables without destructing them (it doesn't merely leak them).
Given these problems, you should seriously consider alternatives.
One alternative is to use C wrappers when entering a rust stack frame
from C or a C stack frame from rust. The wrappers could turn special
return values from rust into a C longjmp() if necessary, or catch
a longjmp() from C and turn it into a rust panic!(), respectively.
This is not always practical, however, so sometimes calling
setjmp()/longjmp() from rust is still the best solution.
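The Rust half of the second wrapper direction (C catches the longjmp(), Rust raises the panic) might look like the sketch below. Everything here is an assumption for illustration: c_wrapper_simulated stands in for a real C wrapper that would call setjmp() itself and return nonzero when a longjmp() lands back in it, and it is written in Rust only so the example runs without any C code.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for a C wrapper. A real wrapper would call
/// setjmp(), invoke the underlying C function, and return nonzero if a
/// longjmp() brought control back to the wrapper.
pub extern "C" fn c_wrapper_simulated(should_jump: c_int) -> c_int {
    if should_jump != 0 {
        1
    } else {
        0
    }
}

/// Rust side: turn the wrapper's status into an ordinary panic, so that
/// Rust frames unwind normally and destructors run.
pub fn call_through_wrapper(should_jump: c_int) {
    let status = c_wrapper_simulated(should_jump);
    if status != 0 {
        panic!("C code jumped out with status {}", status);
    }
}
```

With this split, no Rust frame is ever skipped by a jump: the jump starts and ends in C, and Rust only sees a return value.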
- Mark any function calling setjmp() with #[inline(never)] to reduce the chances for misoptimizations.
- Code between the point where setjmp() returns 0 and a possible longjmp() should be as minimal as possible. Typically, this might just be saving/setting global variables and calling a C FFI function (which might longjmp()). This code should avoid allocating memory on the heap, using types that implement the Drop trait, or code that is complex enough that it might trigger misoptimizations.
- Code before a longjmp(), including code in any parent stack frames, should also be minimal. Typically, this would be just enough code to retrieve a return value from a callee, or catch a panic with catch_unwind(). This code should avoid allocating memory on the heap, using types that implement the Drop trait, or code that is complex enough that it might trigger misoptimizations.
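One narrow part of this advice, avoiding types with destructors, can be checked mechanically with std::mem::needs_drop. The sketch below is only an aid: JumpState and safe_to_skip_drop are hypothetical names, and passing the check does not make a jump safe. It only confirms that skipping a value of that type would not skip any drop glue.

```rust
use std::mem::needs_drop;

/// Hypothetical plain-data state kept alive between setjmp() and longjmp().
#[derive(Clone, Copy)]
pub struct JumpState {
    pub depth: u32,
    pub last_status: i32,
}

/// Returns true if values of `T` have no drop glue, i.e. nothing would be
/// left undone if a jump deallocated them without running a destructor.
pub fn safe_to_skip_drop<T>() -> bool {
    !needs_drop::<T>()
}
```

A debug_assert!(safe_to_skip_drop::<JumpState>()) at the top of a function that calls setjmp() would flag a later refactor that adds, say, a String field to the state.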