feat: make frontend normalize line endings to LF #3903
Conversation
```diff
@@ -125,7 +125,7 @@ def applyDocumentChange (oldText : FileMap) : (change : Lsp.TextDocumentContentC
   | TextDocumentContentChangeEvent.rangeChange (range : Range) (newText : String) =>
     replaceLspRange oldText range newText
   | TextDocumentContentChangeEvent.fullChange (newText : String) =>
-    newText.toFileMap
+    newText.crlfToLf.toFileMap
```
Would it make sense to move `crlfToLf` into `toFileMap`?
The `replaceLspRange` case made me decide not to incorporate `crlfToLf` into `toFileMap`. We only want to normalize the line endings of the new text, and it's worth making it explicit where we're normalizing.
There's still something unsatisfactory, however: if somehow a `\r\n` is split and re-merged by edit operations, then we can get into a situation where there's an unnormalized `\r\n`. I'm not sure if this is an issue that could crop up in practice.
If we're not happy with leaving it in the current form, here are some solutions that came to mind:
- Have a second FileMap with the unnormalized text. Make sure LSP operations edit that, and then re-normalize it to create the actual FileMap used by elaboration each time.
- Do that, but accelerate it somehow. There's probably some fancy data structure supporting inserting-and-normalizing.
- Make the LSP throw an error when it detects a loose `\r`. It's not clear how to get this information back to the user in a nice way.
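The split-and-re-merge hazard can be made concrete (a sketch that assumes `String.crlfToLf` from this patch; `#guard` evaluates the checks):

```lean
-- A lone `\r` is untouched by normalization:
#guard "\r".crlfToLf = "\r"
-- So if an edit inserts just "\r" before an existing "\n" and only the
-- inserted fragment is normalized, an unnormalized CRLF survives:
#guard "a" ++ "\r".crlfToLf ++ "\nb" = "a\r\nb"
```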
The current state seems fine. I'm more worried about accidentally forgetting `crlfToLf` somewhere in the future, but I agree that this is not that important, and explicitly seeing the normalization in the code is helpful, too.
To reduce differences between Windows and other platforms, the frontend now normalizes all CRLF line endings to LF.

Effects:
- Lake's hashes already use normalized line endings. This change makes the hashes be faithful to what Lean sees.
- Docstrings are affected by line endings. In particular, this fixes `#guard_msgs` failing multiline tests for Windows users using CRLF.
- Now strings don't have different lengths depending on the platform. The following theorem is true for LF files and false for CRLF files.
  ```lean
  example : "
  ".length = 1 := rfl
  ```
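Spelled with explicit escapes (an illustrative restatement, not part of the patch), the platform dependence looks like this:

```lean
-- an LF newline is a single character:
example : "\n".length = 1 := rfl
-- a CRLF sequence is two characters, so the theorem in the description
-- fails when the file, and hence the string literal, uses CRLF:
example : "\r\n".length = 2 := rfl
```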
Mathlib CI status (docs):
Now that we have position termination proofs in core (e.g., …
Now that the Lean frontend normalises all line endings to LF (since leanprover/lean4#3903), this check is not necessary any more. It is also one fewer Python linter to rewrite in Lean.
To eliminate parsing differences between Windows and other platforms, the frontend now normalizes all CRLF line endings to LF, like in Rust.

Effects:
- This fixes `#guard_msgs` failing multiline tests for Windows users using CRLF.

Note: the normalization will take `\r\r\n` and turn it into `\r\n`. In the elaborator, we reject loose `\r`'s that appear in whitespace. Rust instead takes the approach of making the normalization routine fail. They do this so that there's no downstream confusion about any `\r\n` that appears.

Implementation note: the LSP maintains its own copy of a source file that it updates when edit operations are applied. We are assuming that edit operations never split or join CRLFs. If this assumption is not correct, then the LSP copy of a source file can become slightly out of sync. If this is an issue, there is some discussion here.
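The `\r\r\n` behavior described above can be spelled out (again a sketch assuming `String.crlfToLf` from this patch):

```lean
-- Only the trailing `\r\n` pair collapses; the preceding `\r` remains,
-- so the normalized output can itself still contain a CRLF:
#guard "a\r\r\nb".crlfToLf = "a\r\nb"
-- Ordinary CRLF and already-normalized text behave as expected:
#guard "a\r\nb".crlfToLf = "a\nb"
#guard "a\nb".crlfToLf = "a\nb"
```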