dotnet format for very large codebases #43930

Open
KirillOsenkov opened this issue Oct 6, 2024 · 2 comments
Labels
Area-Format untriaged Request triage from a team member

Comments

@KirillOsenkov
Member

dotnet format uses MSBuildWorkspace to read the .sln file and instantiate all projects in memory at once. For large solutions with thousands of projects this approach does not scale (I ran into the same issue earlier with https://github.com/KirillOsenkov/SourceBrowser).

I prototyped a simple tool that instead reads the compiler invocations from an MSBuild binlog and then processes each project in isolation (by creating a ProjectInfo from the Csc command line arguments for that project).
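To make the per-project idea concrete, here is a minimal sketch of turning one Csc argument list into a standalone project description. It is written in Python purely for illustration and is not the prototype's code: `ProjectDescription` and `parse_csc_arguments` are hypothetical names standing in for Roslyn's ProjectInfo and a real command line parser, and only a handful of options (`/out`, `/reference`, `/define`) are handled, assuming the arguments have already been split the way they appear in the binlog.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ProjectDescription:
    """Hypothetical stand-in for what a Roslyn ProjectInfo would carry."""
    output: str = ""
    references: list = field(default_factory=list)
    defines: list = field(default_factory=list)
    sources: list = field(default_factory=list)


def split_option(arg):
    """Return (name, value) for '/name:value' or '-name:value', else None."""
    if not arg or arg[0] not in "/-":
        return None
    name, _, value = arg[1:].partition(":")
    return name.lower(), value


def split_list(value):
    """Split a ';' or ',' separated option value, dropping empty entries."""
    return [item for item in re.split(r"[;,]", value) if item]


def parse_csc_arguments(args):
    """Build a project description from one compiler invocation.

    Simplified on purpose: unknown options are ignored, reference aliases
    are not handled, and any argument ending in '.cs' counts as a source file.
    """
    project = ProjectDescription()
    for arg in args:
        option = split_option(arg)
        name, value = option if option else (None, "")
        if name == "out":
            project.output = value
        elif name in ("reference", "r"):
            project.references.extend(split_list(value))
        elif name in ("define", "d"):
            project.defines.extend(split_list(value))
        elif arg.lower().endswith(".cs"):
            project.sources.append(arg)
    return project


def process_in_isolation(invocations, process):
    """Describe and process one project at a time, never the whole solution."""
    for args in invocations:
        process(parse_csc_arguments(args))
```

The point of `process_in_isolation` is that only one project's description is alive at a time, so memory use is bounded by the largest project rather than by the size of the solution.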

Here's the prototype:
https://github.com/KirillOsenkov/CodeCleanupTools/tree/main/CodeFixer

For a small-ish 50-project solution it takes 51 seconds (vs. 1:07 for dotnet format). For larger solutions I expect the difference to become more and more pronounced, to the point where dotnet format chokes altogether while my tool continues to work fine.

Another problem with dotnet format is that it effectively makes two passes over the solution to find diagnostics: a first pass to find all diagnostics, and then a second when Fix All scans the solution for diagnostics again. Roslyn might cache things, but it's still double the work.

I'm not sure how actionable this issue is. I contemplated contributing a PR that adds support for a new type of workspace (BinlogWorkspace), but the way dotnet format is currently written, it would need non-trivial refactoring because it assumes there's a single large solution rather than many individual projects.

At the very least I'm filing this issue so that people for whom dotnet format chokes on their solution have a workaround.

@dotnet-issue-labeler dotnet-issue-labeler bot added the untriaged Request triage from a team member label Oct 6, 2024

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

