Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect infinite recursions in nickel doc #2055

Merged
merged 3 commits into from
Oct 2, 2024
Merged

Conversation

yannham
Copy link
Member

@yannham yannham commented Sep 27, 2024

Closes #1967.

This commit fixes an issue where legit recursive configurations cause infinite recursion when trying to extract their documentation. Indeed, nickel doc calls to eval_record_spine, which is unable to detect loops that spans multiple evaluation steps (an evaluation step is evaluating a field to a WHNF).

This commit adds a lock and unlock method to the interface of thunks, which amounts to blackholing, and use them in the implementation of eval_record_spine such that the thunks of the parents of the field that we are currently evaluating - the active parents - are always locked. If there is a cycle, this will raise an infinite recursion error, that we ignore - we can still print the unevaluated value for the user - but will prevent the evaluation from going on forever.

This technique should be used to fix other similar issues with force and deep_seq, but this is left for future work.

/// Generate an update frame from this thunk and set the state to `Blackholed`. Return an
/// error if the thunk was already black-holed.
pub fn mk_update_frame(&mut self) -> Result<ThunkUpdateFrame, BlackholedError> {
if self.data.borrow().state == ThunkState::Blackholed {
let mut data_ref = self.data.borrow_mut();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in this function are cosmetic (avoid two calls to borrow()/borrow_mut() in a row: just borrow mut from the beginning)

@github-actions github-actions bot temporarily deployed to pull request September 27, 2024 12:04 Inactive
}
} else {
Ok(None)
// If the thunk is already evaluated, we don't return an update frame.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes are a bit cosmetic, but not only: previously, if idx.should_update() was false, we would never look at the state of the thunk and always proceed with evaluation. This is actually not correct, because if the thunk is blackholed, we should raise an infinite recursion error. For eval record spine in particular, the parents thunk are in weak head normal form (they are records), so should_update() is false and without this additional change, the detection of the infinite cycle fails.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this code path is no longer triggered in eval_record_spine? But maybe it makes sense to do this anyway...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I suspect the previous code could miss some infinite recursion errors as well because of that.

Copy link
Contributor

github-actions bot commented Sep 27, 2024

🐰 Bencher Report

Branch2055/merge
Testbedubuntu-latest

⚠️ WARNING: The following Measure does not have a Threshold. Without a Threshold, no Alerts will ever be generated!

Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds CLI flag.

Click to view all benchmark results
BenchmarkLatencynanoseconds (ns)
fibonacci 10📈 view plot
⚠️ NO THRESHOLD
495,290.00
pidigits 100📈 view plot
⚠️ NO THRESHOLD
3,231,100.00
product 30📈 view plot
⚠️ NO THRESHOLD
847,830.00
scalar 10📈 view plot
⚠️ NO THRESHOLD
1,537,800.00
sum 30📈 view plot
⚠️ NO THRESHOLD
840,870.00
🐰 View full continuous benchmarking report in Bencher

@yannham yannham requested a review from jneem September 27, 2024 12:22
@jneem
Copy link
Member

jneem commented Sep 27, 2024

I think the recursion check has some false positives: with this PR, the input:

{
  outer = {
    mid = { inner | doc "this is inner" },
    z | doc "this is z" = outer.mid,
  }
}

doesn't generate documentation for outer.z.inner. But if I replace the value of z by mid instead of outer.mid then it does generate documentation for outer.z.inner.

@yannham
Copy link
Member Author

yannham commented Sep 27, 2024

I wonder if we can do much better than that, at least using the thunk machinery, because when z refers to the thunk outer, we can't really decide if this use is problematic or not. It depends if another field of outer is then accessed, and even then, just seeing mid doesn't tell us if there's an issue or not, as we could have something like:

{
  outer = {
    mid = { inner = outer.z | doc "this is inner" },
    z | doc "this is z" = outer.mid,
  }
}

Which is looping. So either we think it's ok to not print documentation in this case because the structure of the configuration is a bit intricate - do we expect configuration from which we extract doc to exhibit this sort of shape? I honestly don't know.

Or we can lower the detection power and accept to have more false negatives, but it becomes very ad-hoc. For example, we could only detect unguarded recursion, that is when a field directly hosts a naked variable as in z = outer.

Another possible solution is to try to count the number of times we unwrap the same thunk, and instead to halt at one, halt at some other higher empirical value (with the idea that if you recurse 10 times in the same thunk, there is a great change that this is an infinite loop). It has to be baked in the interpreter though, because eval_record_spine only has access to the thunks stored directly at the level of fields, but in z = outer.mid, only the evaluator has the knowledge that outer has been used at some point.

@yannham
Copy link
Member Author

yannham commented Sep 27, 2024

What makes me wonder if this example is really a good working example is that you would rather refer to mid directly as a sibling here, and this is handled fine by the implementation of this PR.

However, while it might be ok for documentation, the false positives preclude the usage of this technique for deep sequing and forcing, I believe.

@yannham
Copy link
Member Author

yannham commented Sep 27, 2024

Maybe one last possible approach, which is I believe what you hinted at in #1815 (comment), would be to detect regular trees, that is when a contract or a field evaluates to a parent thunk directly, rather than this PR that is more sensitive and detects mere usage. I would indeed require an additional stack/set of active thunks, but shouldn't be very hard to implement either. Le met give it a try

@github-actions github-actions bot temporarily deployed to pull request September 30, 2024 06:44 Inactive
@yannham
Copy link
Member Author

yannham commented Sep 30, 2024

@jneem I implemented the last proposal, which seems to handle both your example and the original repro. I'm not sure it's a good idea to do that for deep_seq, as it has a cost - albeit small. I don't know if we want to pay that price for each and every deep_seq. I think the trade-off is different for nickel doc, where we can afford a bit more machinery to avoid infinite recursion.

In any case, it's read for a review.

Copy link
Member

@jneem jneem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, except that I think there are some left-overs from the previous version.

I haven't thought too much about the semantics, but if performance is a concern during normal eval then I think the state could be moved into the thunks instead of having an external hash set: give each thunk a bit that says whether or not it's active.

data_ref.state = ThunkState::Evaluated;
}
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lock/unlock are no longer used, right?

}
} else {
Ok(None)
// If the thunk is already evaluated, we don't return an update frame.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this code path is no longer triggered in eval_record_spine? But maybe it makes sense to do this anyway...

@yannham
Copy link
Member Author

yannham commented Oct 2, 2024

I haven't thought too much about the semantics, but if performance is a concern during normal eval then I think the state could be moved into the thunks instead of having an external hash set: give each thunk a bit that says whether or not it's active.

This is a very good point; we could just "mix" the two solutions to get the best of both worlds (no false positive and fast check). For now I'm going to proceed with the current version, but it's good to keep in mind for the deep_seq issue.

@github-actions github-actions bot temporarily deployed to pull request October 2, 2024 09:18 Inactive
@yannham
Copy link
Member Author

yannham commented Oct 2, 2024

@jneem Sorry I changed my mind 😛 as the lock and unlock methods were left over, I thought it would not be much effort to implement your last suggestion. Given alignment and padding, adding a bit for locking in ThunkData doesn't change the size of any datastructure, and will let us implement the detection for deep seq easily.

The only drawback is that cleaning up after an eval error is a bit more involved (the external hashset was just dropped upon error, but here I reckon we have to make sure the thunk's state is properly reset, or it could interfere with future evaluation e.g. when deep_seqing in the REPL). It's still reasonable IMHO.

The PR could deserve a new review though.

This commit fixes an issue where legit recursive configurations cause
infinite recursion when trying to extract their documentation. Indeed,
`nickel doc` calls to `eval_record_spine`, which is unable to detect
loops that span multiple evaluation step (an evaluation step is
evaluating one field to a weak head normal form).

This commit changes `eval_record_spine` and related methods to lock (in
practice just flip a bit in the thunk's state) the active thunks, that
is the thunks of parents of the field that we are currently evaluating.

If we ever come across a thunk that is already locked, the field is left
unevaluated, avoiding infinite recursion.
@github-actions github-actions bot temporarily deployed to pull request October 2, 2024 09:46 Inactive
core/src/program.rs Outdated Show resolved Hide resolved
core/src/program.rs Show resolved Hide resolved
@yannham yannham enabled auto-merge October 2, 2024 13:07
@github-actions github-actions bot temporarily deployed to pull request October 2, 2024 13:09 Inactive
@yannham yannham added this pull request to the merge queue Oct 2, 2024
Merged via the queue into master with commit 0b251e8 Oct 2, 2024
7 checks passed
@yannham yannham deleted the fix/doc-infinite-recursion branch October 2, 2024 13:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nickel doc overflows the stack with recursive schemas
2 participants