-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add DebugText
for self-documenting f-strings
#6167
add DebugText
for self-documenting f-strings
#6167
Conversation
PR Check ResultsEcosystem✅ ecosystem check detected no changes. BenchmarkLinux
Windows
|
fn self_documenting_f_string() { | ||
assert_round_trip!(r#"f"{ chr(65) = }""#); | ||
assert_round_trip!(r#"f"{ chr(65) = !s}""#); | ||
assert_round_trip!(r#"f"{ chr(65) = !r}""#); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a test case with a format specification?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, an additional test case to mix both conversion and format specification would be nice as well: f"{a=!r:0.05f}"
.../src/snapshots/ruff_python_parser__string__tests__parse_fstring_self_doc_trailing_space.snap
Show resolved
Hide resolved
start_location, | ||
) | ||
})?; | ||
let leading = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this feels a little hacky but was my best idea. suggestions welcome
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which part do you find hacky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pulling slices out of expression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. I ultimately want to re-write this entire file and use Cursor
and use string slices from the source text directly everywhere. But @dhruvmanila may make this whole code redundant before when implementing the 3.12 F-string changes.
135c98b
to
757dc4f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great!
I would like to hear your opinion on keeping Conversion::Repr
for debug strings to ease downstream use and we may be able to speed this up a bit (the whole module requires a rework but we can start somewhere).
@@ -592,10 +592,26 @@ pub struct ExprCall<'a> { | |||
#[derive(Debug, PartialEq, Eq, Hash)] | |||
pub struct ExprFormattedValue<'a> { | |||
value: Box<ComparableExpr<'a>>, | |||
debug_text: Option<DebugText<'a>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to store a reference to a debug text instead of defining an extra DebugText
node?
debug_text: Option<DebugText<'a>>, | |
debug_text: Option<&'a DebugText>, |
#[derive(Debug, PartialEq, Eq, Hash)] | ||
pub struct DebugText<'a> { | ||
leading: &'a str, | ||
trailing: &'a str, | ||
} | ||
|
||
impl<'a> From<&'a ast::DebugText> for DebugText<'a> { | ||
fn from(debug_text: &'a ast::DebugText) -> Self { | ||
Self { | ||
leading: debug_text.leading.as_str(), | ||
trailing: debug_text.trailing.as_str(), | ||
} | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above
#[derive(Debug, PartialEq, Eq, Hash)] | |
pub struct DebugText<'a> { | |
leading: &'a str, | |
trailing: &'a str, | |
} | |
impl<'a> From<&'a ast::DebugText> for DebugText<'a> { | |
fn from(debug_text: &'a ast::DebugText) -> Self { | |
Self { | |
leading: debug_text.leading.as_str(), | |
trailing: debug_text.trailing.as_str(), | |
} | |
} | |
} |
crates/ruff_python_ast/src/nodes.rs
Outdated
@@ -925,6 +926,14 @@ impl ConversionFlag { | |||
} | |||
} | |||
|
|||
#[derive(Clone, Debug, PartialEq)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#[derive(Clone, Debug, PartialEq)] | |
#[derive(Clone, Debug, PartialEq, Eq, Hash)] |
fn unparse_formatted( | ||
&mut self, | ||
val: &Expr, | ||
debug_text: &Option<DebugText>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I find Option<&ref>
more natural than passing a reference to an option (except if the option is mutable).
debug_text: &Option<DebugText>, | |
debug_text: Option<&DebugText>, |
start_location, | ||
) | ||
})?; | ||
let leading = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which part do you find hacky?
@@ -230,6 +230,7 @@ impl<'a> StringParser<'a> { | |||
// match a python 3.8 self documenting expression | |||
// format '{' PYTHON_EXPRESSION '=' FORMAT_SPECIFIER? '}' | |||
'=' if self.peek() != Some('=') && delimiters.is_empty() => { | |||
trailing_seq.push('='); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like it that we now use the text ranges to get leading
and before_eq
. Would it be possible also to use text ranges to extract trailing
? It would reduce 2 string allocations for each self-documenting string (one for trailing_seq
, and one for formatting trailing
in debug text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think so. we'd need to store the =
and any trailing spaces inside expression
which is a bit odd but doable if we just track where the actual expression ends. will push something with what i mean (and maybe you have ideas for how to improve that)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good! Thanks for working on this :)
fn self_documenting_f_string() { | ||
assert_round_trip!(r#"f"{ chr(65) = }""#); | ||
assert_round_trip!(r#"f"{ chr(65) = !s}""#); | ||
assert_round_trip!(r#"f"{ chr(65) = !r}""#); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If possible, an additional test case to mix both conversion and format specification would be nice as well: f"{a=!r:0.05f}"
instead of modelling self-documenting f-strings (`f"{ foo= }"`) as a (simplified) `Constant("foo=")` followed by a `FormattedValue(Expr("x"))`, instead model this case with a `DebugText(leading, trailing)` attribute on the `FormattedValue` so that we don't have to synthesize nodes (which results in siblings with overlapping ranges). We need to be able to preserve the whitespace for self-documenting f-strings, as well as reproduce the source (eg unparse, format).
vec![Expr::from(ast::ExprFormattedValue { | ||
value: Box::new(value), | ||
debug_text: Some(ast::DebugText { | ||
leading: leading.to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we store references into the source here rather than allocating new strings? i guess many more things would need to change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I played around with an ast that uses Cow<'source, str>
for all string fields. I spent about 2h fixing lifetime issues and run out of patience. The problem is that every node must be parametrized by the source lifetime.
We may still want to do this because using a bump allocator for the AST nodes would also require a new lifetime. Something to explore in the future.
757dc4f
to
ff200c6
Compare
instead of modelling self-documenting f-strings (
f"{ foo= }"
) as a (simplified)Constant("foo=")
followed by aFormattedValue(Expr("x"))
, instead model this case with aDebugText(leading, trailing)
attribute on theFormattedValue
so that we don't have to synthesize nodes (which results in siblings with overlapping ranges). We need to be able to preserve the whitespace for self-documenting f-strings, as well as reproduce the source (eg unparse, format).Fixes #5970
Test Plan
Existing snapshots, a few more tests. More needed?