[flake8-type-checking] Adds implementation for TCH007 and TCH008 #12927

Daverball · 2024-08-16T11:43:49Z

This is part of a series of pull requests fulfilling my promise in #9573 to port some of the enhancements and new rules from flake8-type-checking to ruff.

Summary

This adds the missing flake8-type-checking rules TC(H)007 and TC(H)008 including auto fixes.

There is some overlap between TC(H)007 and TC(H)004. Currently the existing rule takes precedence, although ideally we would still emit both violations and share a fix for overlaps based on the selected strategy (if it is possible for violations of two different rules to share the same fix). We could probably move the analysis for TC(H)007 into the same function as TC(H)004 in order to accomplish that, but we could defer that to a future pull request.

There is also some potential overlap between TC(H)008 and TC(H)010. Currently TC(H)010 is prioritized, but since it currently has no fix it might make more sense to either prioritize TC(H)008 or add a fix for TC(H)010, although that could be covered by a future pull request.

Test Plan

cargo nextest run

github-actions · 2024-08-16T11:57:39Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+6 -0 violations, +0 -0 fixes in 2 projects; 52 projects unchanged)

latchbio/latch (+1 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ src/latch_cli/snakemake/config/utils.py:13:33: TCH008 [*] Remove quotes from type alias

zulip/zulip (+5 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ analytics/views/stats.py:342:14: TCH008 [*] Remove quotes from type alias
+ analytics/views/stats.py:344:16: TCH008 [*] Remove quotes from type alias
+ confirmation/models.py:76:5: TCH008 [*] Remove quotes from type alias
+ confirmation/models.py:77:5: TCH008 [*] Remove quotes from type alias
+ zerver/lib/push_notifications.py:75:49: TCH008 [*] Remove quotes from type alias

Changes by rule (1 rules affected)

code	total	+ violation	- violation	+ fix	- fix
TCH008	6	6	0	0	0

codspeed-hq · 2024-08-16T13:25:35Z

CodSpeed Performance Report

Merging #12927 will not alter performance

Comparing Daverball:feat/tch007-tch008 (fd8c045) with main (0dbcecc)

Summary

✅ 32 untouched benchmarks

Daverball · 2024-08-18T09:00:22Z

@charliermarsh It might be worth moving the TCH010 logic to where the TCH008 logic currently sits, so we can actually autofix TCH010 in the same cases where we would otherwise emit a TCH008, by removing the quotes when we know that the quoted expression should be entirely available at runtime.

Concretely, we would perform the check at the stage where we parse deferred string annotations and check whether the current parent expression is a `|` BinOp, rather than walking the AST of type definitions to collect the string literals (which currently doesn't appear to handle deeply nested cases such as list["Foo" | None] properly).

The drawback would be that it may be more difficult to create an autofix in the opposite case that expands the quotes to the entire type expression, since we would need to walk the parent nodes in order to determine the root node of the type expression.
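To make the nested case from this comment concrete, the sketch below shows why a partially quoted union is a runtime problem while a fully quoted one is not. The function names are made up for illustration and nothing here reflects Ruff's implementation.

```python
def nested_partial_quote():
    # Only "Foo" is quoted, so Python evaluates `"Foo" | None` as an ordinary
    # expression on a str and None, and neither operand implements `|`.
    return list["Foo" | None]


def nested_full_quote():
    # Quoting the whole union defers evaluation, so subscripting succeeds.
    return list["Foo | None"]


try:
    nested_partial_quote()
    partial_quote_error = None
except TypeError as exc:
    partial_quote_error = exc
```

Either removing the quotes (when `Foo` is available at runtime) or extending them to the whole expression avoids the `TypeError`, which matches the two fix directions discussed above.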

charliermarsh · 2024-09-03T00:49:11Z

Thanks @Daverball, sorry that I haven't had a chance to review this yet.

Daverball · 2024-10-09T08:32:35Z

@charliermarsh Just a friendly reminder in case this PR slipped through the cracks somehow

Daverball · 2024-11-05T14:00:02Z

@charliermarsh Another reminder. Please let me know if these reminders are a bother. The contribution guidelines don't really seem to contain any information about whether it's acceptable/appreciated/frowned upon to bump review requests; it tends to vary from project to project and maintainer to maintainer.

charliermarsh · 2024-11-05T14:26:58Z

No that's fine @Daverball -- sorry, it's just blocked on me finding time and clearly I haven't done a good job of doing so yet.

sbrugman · 2024-11-09T13:44:02Z

crates/ruff_linter/src/rules/flake8_type_checking/rules/type_alias_quotes.rs

+///     from foo import Foo
+/// OptFoo: TypeAlias = "Foo | None"
+/// ```
+///


Why is the fix marked as unsafe? We can document this in a `## Fix safety` section.

Good question. I went by the precedent set by runtime_import_in_type_checking_block, which uses the same function to quote annotations in its fix and also doesn't explain its reasoning, so I don't know what the original reasoning was. It might just be the fact that the forward reference cannot be resolved at runtime, which is a problem with flake8-type-checking in general.

Looks like it is at least in part because it can remove comments on multi-line expressions. Maybe this documentation gap should first be addressed in a separate PR though, since it also affects other existing rules.

It would be great if we can fill the gap for new rules. But I agree, filling the gap for existing rules can be done in a separate PR
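The concern about forward references that cannot be resolved at runtime can be illustrated with a small sketch. It uses the standard `typing.get_type_hints` helper; the function `price_of` and the typing-only `Decimal` import are hypothetical and not taken from the PR.

```python
from typing import TYPE_CHECKING, get_type_hints

if TYPE_CHECKING:
    # Never executed at runtime, so `Decimal` stays unbound in this module.
    from decimal import Decimal


def price_of(item: str) -> "Decimal":
    raise NotImplementedError


def resolve_hints():
    """Try to evaluate the annotations the way runtime introspection would."""
    try:
        return get_type_hints(price_of)
    except NameError as exc:
        # The quoted name cannot be looked up because it was never imported.
        return exc
```

Code that introspects annotations at runtime would hit this `NameError`, which is one plausible reason a quoting fix is treated cautiously.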

sbrugman · 2024-11-09T13:50:10Z

crates/ruff_linter/resources/test/fixtures/flake8_type_checking/TCH007.py

+f: TypeAlias = Bar   # TCH007
+g: TypeAlias = Foo | Bar  # TCH007 x2
+h: TypeAlias = Foo[str]  # TCH007
+


What does the fix look like when comments are present?

example:

>>> i: TypeAlias = (int |  # TCH007 x2
...     float)
>>> i
int | float

Until I know why this fix was marked unsafe in the other rules, I will continue to always mark this fix as unsafe. If the only reason was the fix potentially eating comments, I can apply the same comment detection logic to TCH007.
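As a rough illustration of the comment detection idea mentioned here, the sketch below checks whether the source text of a type expression contains a comment, using Python's standard `tokenize` module. Ruff is written in Rust and has its own machinery for this, so the helper `contains_comment` is purely hypothetical.

```python
import io
import tokenize


def contains_comment(expression_source: str) -> bool:
    """Return True if the given expression source contains a `#` comment."""
    tokens = tokenize.generate_tokens(io.StringIO(expression_source).readline)
    return any(tok.type == tokenize.COMMENT for tok in tokens)


# A parenthesized multi-line alias value with a trailing comment on line one.
multi_line = "(int |  # keep me\n    float)"
# The same value on one line, with nothing a rewrite could drop.
single_line = "int | float"
```

A fix that replaces the whole expression with a quoted string could then be marked unsafe only when `contains_comment` reports a comment, and safe otherwise.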

MichaReiser

Nice. This overall looks good to me, but I know very little about typing or that part of the code base. That's why it would be great to get a second opinion from either @carljm or @AlexWaygood

MichaReiser · 2024-11-19T08:53:10Z

crates/ruff_linter/src/rules/flake8_type_checking/rules/type_alias_quotes.rs

+///     from foo import Foo
+/// OptFoo: TypeAlias = "Foo | None"
+/// ```
+///


It would be great if we can fill the gap for new rules. But I agree, filling the gap for existing rules can be done in a separate PR

MichaReiser · 2024-11-19T08:59:49Z

crates/ruff_linter/src/checkers/ast/analyze/deferred_scopes.rs

+            // FIXME: This does not seem quite right, if only TC004 is enabled
+            //        then we don't need to collect the runtime imports


What's the reason we can't fix this today?

We can, I just didn't want to introduce unrelated changes to this PR, so I opted for a comment as reminder to myself to clean this up in a follow-up pull request.

MichaReiser · 2024-11-19T11:27:50Z

crates/ruff_python_semantic/src/model.rs

+            // FIXME: Shouldn't this happen above where `class_variables_visible` is set?
+            seen_function |= scope.kind.is_function();


Can you expand on why you think this should happen above?

My assumption is that it's here because it updates the state for the next iteration. The only possible issue I see is that it fails to update seen_function if self.bindings[binding_id].kind is an Annotation. Unfortunately, I don't have a good enough understanding of this code to assess whether this is a) possible and b) a problem.

I'm fairly confident it's incorrect to skip updating seen_function if you see a BindingKind::Annotation.

That being said, situations where this would actually lead to the wrong result should be incredibly rare, since the corresponding Python code itself would be highly suspect and probably incorrect.

MichaReiser · 2024-11-19T11:40:55Z

crates/ruff_python_semantic/src/model.rs

+    /// Simulates a runtime load of a given [`ast::ExprName`].
+    ///
+    /// This should not be run until after all the bindings have been visited.
+    pub fn simulate_runtime_load(&self, name: &ast::ExprName) -> Option<BindingId> {


I don't think I'm the right person reviewing this. @AlexWaygood or @carljm could either of you take a look to see if that makes sense (overall)?

I took a look and left some comments, though I think familiarity with Ruff's semantic model would make @AlexWaygood a better reviewer here.

carljm · 2024-11-20T00:31:56Z

crates/ruff_linter/src/checkers/ast/mod.rs

+    /// Visit an [`Expr`], and treat it as a [PEP 613] explicit type alias.
+    ///
+    /// [PEP 613]: https://peps.python.org/pep-0613/
+    fn visit_explicit_type_alias(&mut self, expr: &'a Expr) {
+        let snapshot = self.semantic.flags;
+        self.semantic.flags |= SemanticModelFlags::EXPLICIT_TYPE_ALIAS;
+        self.visit_type_definition(expr);
+        self.semantic.flags = snapshot;
+    }
+
+    /// Visit an [`Expr`], and treat it as a [PEP 695] generic type alias.
+    ///
+    /// [PEP 695]: https://peps.python.org/pep-0695/#generic-type-alias
+    fn visit_generic_type_alias(&mut self, expr: &'a Expr) {
+        let snapshot = self.semantic.flags;
+        // even though we don't visit these nodes immediately we need to
+        // modify the semantic flags before we push the expression and its
+        // corresponding semantic snapshot
+        self.semantic.flags |= SemanticModelFlags::GENERIC_TYPE_ALIAS;


Naming nit: I would append _value to the names of both these methods to clarify that they handle the RHS or "value", not the entire type alias.

More substantive naming issue: I think that using "generic type alias" as an all-encompassing description of PEP 695 type aliases is inaccurate and confusing. (It is strange that the PEP text uses this as a header over that entire section of the PEP, but it's clear in code examples in that section that only a type statement with type parameters is actually generic; one without type parameters is clearly described there as "non-generic".)

Using "explicit type alias" to describe PEP 613 type aliases is not unreasonable, given that's the title of PEP 613, but I still think it's ambiguous (the type statement is equally explicit!) and based on an indirect reference that requires looking up the PEP to understand.

If we don't mind referencing PEPs, then I think we should just go all-in on that, and opt for concise and totally unambiguous, with names like visit_pep613_type_alias_value, visit_pep695_type_alias_value, PEP613_TYPE_ALIAS, PEP695_TYPE_ALIAS.

If we want to avoid using PEP numbers like this (which is totally unambiguous, but does require knowing or looking up the PEP numbers), then I think we have to be more verbose to be unambiguous: visit_typealias_annotated_type_alias_value, visit_type_statement_type_alias_value, TYPEALIAS_ANNOTATED_TYPE_ALIAS, TYPE_STATEMENT_TYPE_ALIAS.

One other note: either way, a code example in the docstring of each method could go a long way towards making it immediately obvious to the reader what form each one is handling.
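To show the two alias forms under discussion side by side, here is a small sketch that parses each with Python's `ast` module. The alias sources and the helper `describe_alias` are illustrative assumptions; the `type` statement only parses on Python 3.12 or newer, so those cases should be guarded by a version check.

```python
import ast
import sys

# PEP 613: an annotated assignment whose annotation is `TypeAlias`.
PEP613_ALIAS = 'from typing import TypeAlias\nOptInt: TypeAlias = "int | None"\n'
# PEP 695: a `type` statement without type parameters (non-generic).
PEP695_ALIAS = "type OptInt = int | None\n"
# PEP 695: a `type` statement with type parameters (generic).
PEP695_GENERIC_ALIAS = "type Pair[T] = tuple[T, T]\n"


def describe_alias(source: str) -> tuple[str, int]:
    """Return the AST node class name of the last statement and the number
    of type parameters it declares (0 when the node has none)."""
    node = ast.parse(source).body[-1]
    type_params = getattr(node, "type_params", [])
    return type(node).__name__, len(type_params)


pep613_description = describe_alias(PEP613_ALIAS)
```

On Python 3.12 or newer, `describe_alias` reports the same node class for both `type` statements, and only the second declares a type parameter. That is the distinction behind the objection to calling every PEP 695 alias "generic".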

I'll have a little think about what I prefer. Initially I did use the PEP numbers in their names, but it seemed too easy to accidentally read one as the other when you're just skimming the code, so I decided to switch to the names used in the PEP title/section header respectively, even if they're far from ideal.

The descriptive names are a little too tongue-twistery for my liking, but they may be the correct choice after all.

carljm · 2024-11-20T01:00:03Z

crates/ruff_python_semantic/src/model.rs

+    /// Simulates a runtime load of a given [`ast::ExprName`].
+    ///


It was very much unclear to me, until I spent quite a while reading through the code, what it meant to "simulate a runtime load" and why or how this is different from what lookup_symbol and resolve_load do.

It seems the differences are a) this method ignores typing-only bindings (this explains the name choice), and b) this method is supposed to run after all bindings have been collected, but still do a flow-sensitive lookup (i.e. find the applicable binding for a name at a particular point in control flow, which is not necessarily the end of the scope.) Or semi-flow-sensitive, anyway; flow-sensitive as approximated by "source order within the scope, with no understanding of branches."

Difference (a) makes sense, though taken alone it seems like it could be a flag rather than an entirely separate method.

I have a few questions/thoughts about difference (b), some of which may be due to lack of familiarity with Ruff and its semantic model.

It seems like this can be an improvement over Ruff's current model in some cases. But why is this PR the one introducing this capability? What is it about TCH007 and TCH008 that requires this capability, when Ruff's many other rules involving looking up name bindings have so far gotten away without it?

My understanding is that largely Ruff handles this by doing name lookups while constructing the semantic model, so inapplicable bindings just don't exist yet. Is that approach not workable in this case? Why do we need a method that must instead be run after all bindings are collected, but then ignores the "late" ones, in inner scopes?

If this capability isn't critical to the functioning of the new rules specifically, but is really a more general improvement to Ruff's semantic model, perhaps it should be looked at more holistically in view of all Ruff rules, and not introduced as part of this PR?

If we do definitely need this new method for these new rules to work at all, then I think this also should be more clearly described in a longer doc comment on the method.

> My understanding is that largely Ruff handles this by doing name lookups while constructing the semantic model, so inapplicable bindings just don't exist yet. Is that approach not workable in this case? Why do we need a method that must instead be run after all bindings are collected, but then ignores the "late" ones, in inner scopes?

The main reason is that TCH007/TCH008 perform speculative lookups, rather than the ones that would actually happen at runtime, to ensure that we don't produce violations for the parts of the type alias value that need to be available at runtime and vice versa. (In fact, they perform both the speculative and the regular lookup to avoid violations/fixes going in circles.)

The normal flow of the semantic model does not cover the speculative nature of questions like "what if this annotation wasn't a forward reference" or "what if this annotation was turned into a forward reference?".

There are certainly other ways to implement this check (i.e. performing the speculative lookup at the right time during semantic analysis), but this seemed like the least disruptive one (especially since at the time I wrote this there wasn't yet a cache for parsing string annotations) and matches my implementation in the flake8-plugin I wrote, so I can be more confident that it works and produces zero false positives in several large code-bases of mine.

I was also worried that speculatively walking the type definition earlier/later than it was supposed to could lead to other rules triggering or unwanted state leaking into the semantic model. It would be difficult to ensure that this excursion would have no unwanted side-effects.

> Difference (a) makes sense, though taken alone it seems like it could be a flag rather than an entirely separate method.

And in fact, in my own implementation in the flake8 plugin they are the same method with a flag. However, the potentially incorrect placement of the seen_function update prompted me to keep this method separate for now, in order to avoid potentially subtle regressions in lookup_symbol.

We could either take the risk and merge these methods now or put it off until later, when we're more confident that the method is stable.
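The two differences discussed in this thread (skipping typing-only bindings, and a source-order lookup performed after all bindings are known) can be sketched with a toy model. This is not Ruff's semantic model: the `Binding` class, the offsets, and `simulate_runtime_load_sketch` are hypothetical, and the sketch covers a single scope with no understanding of branches.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    name: str
    offset: int        # source position where the binding is created
    typing_only: bool  # e.g. imported under `if TYPE_CHECKING:`


def simulate_runtime_load_sketch(
    bindings: list[Binding], name: str, load_offset: int
) -> Optional[Binding]:
    """Return the binding a runtime load at `load_offset` would see, if any.

    Typing-only bindings are skipped, and bindings created at or after the
    load position are ignored even though they are already collected.
    """
    candidates = [
        b
        for b in bindings
        if b.name == name and not b.typing_only and b.offset < load_offset
    ]
    return max(candidates, key=lambda b: b.offset, default=None)


bindings = [
    Binding("Foo", offset=10, typing_only=True),
    Binding("Bar", offset=20, typing_only=False),
    Binding("Foo", offset=80, typing_only=False),
]
```

With these bindings, a load of `Foo` at offset 50 finds nothing (only the typing-only binding precedes it), while a load at offset 100 sees the runtime binding created at offset 80. That is the kind of "what if this were evaluated at runtime here" question the speculative lookup answers.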

carljm · 2024-11-20T01:01:35Z

crates/ruff_python_semantic/src/model.rs

+    /// Simulates a runtime load of a given [`ast::ExprName`].
+    ///
+    /// This should not be run until after all the bindings have been visited.
+    pub fn simulate_runtime_load(&self, name: &ast::ExprName) -> Option<BindingId> {


I took a look and left some comments, though I think familiarity with Ruff's semantic model would make @AlexWaygood a better reviewer here.

Adds initial implementation for TCH008

3de92a5

The implementation for TCH007 is incomplete and also requires some changes to TCH004 in order to avoid conflicts between TCH004 and TCH007

Fixes mkdocs and clippy complaints

e1ca904

Fixes runtime forward reference binding lookup

3201588

Daverball added 7 commits August 17, 2024 11:38

Simulates proper runtime name lookups for more accurate results

bc45146

Adds missing test case back and fixes range overlap check

0df3d12

Also fixes lexicographical lookup not taking shadowed bindings into account.

Fixes docstring

4cb32e1

Refactors class name book keeping code

d9cea8f

Removes unnecessary borrow

096eb9d

Fixes some logical errors and simplifies class name lookups

e420697

Guards against TC010/TCH008 overlap

fa9d4df

Daverball added 2 commits August 20, 2024 15:48

Adds implementation and tests for TCH007

dd78b52

Fixes TCH004/TCH007 overlap

08b7ddc

Daverball marked this pull request as ready for review August 20, 2024 16:01

MichaReiser requested a review from charliermarsh September 2, 2024 06:08

MichaReiser added the rule Implementing or modifying a lint rule label Sep 2, 2024

Daverball mentioned this pull request Sep 2, 2024

[flake8-type-checking] Missing rules, autofix strategies, overlapping rules and more #13208

Open

Merge branch 'main' into feat/tch007-tch008

bd31311

Daverball added 3 commits October 24, 2024 15:36

Merge branch 'main' into feat/tch007-tch008

b185e4b

Clean up unused import that was accidentally left in

b17c959

Simplifies quote_type_expression

643ddb6

Daverball mentioned this pull request Oct 30, 2024

RUF027 False positive for non-runtime context binding #14000

Open

sbrugman reviewed Nov 9, 2024

crates/ruff_linter/src/rules/flake8_type_checking/rules/type_alias_quotes.rs (outdated)

sbrugman reviewed Nov 9, 2024

crates/ruff_linter/src/rules/flake8_type_checking/rules/type_alias_quotes.rs (outdated)

sbrugman reviewed Nov 9, 2024

crates/ruff_linter/src/rules/flake8_type_checking/rules/type_alias_quotes.rs (outdated)

sbrugman reviewed Nov 9, 2024

crates/ruff_linter/resources/test/fixtures/flake8_type_checking/TCH008.py (outdated)

Daverball and others added 10 commits November 9, 2024 14:58

Apply suggestions from code review

09986c7

Co-authored-by: Simon Brugman <sbrugman@users.noreply.github.com>

Re-formats code slightly.

4044dfa

Marks TCH007 fix as unsafe, since it may remove comments. Adds additional test cases.

Merge branch 'main' into feat/tch007-tch008

f51ea65

Fixes errors and only marks fix as unsafe when there are comments

81bb172

Removes unnecessary return statement

2555f8a

Merge branch 'main' into feat/tch007-tch008

a36bab6

Merge branch 'main' into feat/tch007-tch008

f2f384e

Fixes new semantic flag values

6799411

Merge branch 'main' into feat/tch007-tch008

231804b

quote_type_annotation is once again no longer guaranteed to work

bfffb7b

dhruvmanila added the preview Related to preview mode features label Nov 18, 2024

MichaReiser approved these changes Nov 19, 2024

Daverball added 2 commits November 19, 2024 15:58

Addressed some of the feedback from the code review

fd051a8

Avoids changing the behavior of TCH004 for now.

0fc0d4b

Ensures there's no circularity issues between `TCH007` and `TCH008` by always running test cases with both rules enabled.

carljm reviewed Nov 20, 2024

Daverball and others added 2 commits November 20, 2024 08:20

Apply suggestions from code review

f1ec697

Co-authored-by: Carl Meyer <carl@oddbird.net>

Fixes snapshot and applies a couple of other suggestions

fd8c045
