Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add task to verify indexing against Top 100 RubyGems #2330

Merged
merged 1 commit into from
Jul 24, 2024

Conversation

andyw8
Copy link
Contributor

@andyw8 andyw8 commented Jul 18, 2024

Motivation

To help catch issues related to indexing, we can expand our indexing testing to cover a set of RubyGems

As per @kddnewton's suggestion, let's begin with just the top 100. We can later consider a periodic job which runs against every gem on RubyGems.org.

Implementation

Shamelessly copied from https://github.com/ruby/prism/blob/main/rakelib/lex.rake

Questions:

  • Do we want to run this as part of CI? Locally: 17.04s user 3.65s system 60% cpu 34.393 total
  • If so, do we care about caching the downloads between runs? (~120MB).

Automated Tests

None, since this is a test itself.

Manual Tests

Run with bundle exec rake index:topgems.

sorbet/config Show resolved Hide resolved
@andyw8 andyw8 changed the title Add task to verify against against Top 100 RubyGems Add task to verify indexing against Top 100 RubyGems Jul 18, 2024
@andyw8 andyw8 added chore Chore task server This pull request should be included in the server gem's release notes labels Jul 18, 2024
@andyw8 andyw8 marked this pull request as ready for review July 18, 2024 16:09
@andyw8 andyw8 requested a review from a team as a code owner July 18, 2024 16:09
@andyw8 andyw8 requested review from st0012 and vinistock July 18, 2024 16:09
Copy link
Member

@st0012 st0012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I say let's run it on CI so it'll be maintained (or deleted when not used). I'm fine with downloading things on build if it's not too flaky.

lsp.code-workspace Outdated Show resolved Hide resolved
Copy link
Member

@vinistock vinistock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think enabling this in CI is a great idea. We can run this only on the latest Ruby version and only if the PR includes modifications to the ruby_indexer directory or lockfile.

Rakefile Outdated Show resolved Hide resolved
lib/tasks/index.rake Outdated Show resolved Hide resolved
lib/tasks/index.rake Outdated Show resolved Hide resolved
lib/tasks/top_100_gems.yml Outdated Show resolved Hide resolved
sorbet/config Show resolved Hide resolved
@andyw8 andyw8 force-pushed the andyw8/index-top-gems branch 3 times, most recently from fa2418e to 1e5f04d Compare July 19, 2024 18:20
.github/workflows/ci.yml Outdated Show resolved Hide resolved
Copy link
Member

@vinistock vinistock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If indexing succeeds, this is fine. However, if it fails, we will keep the copies of every gem still installed, even if only a single file out of the 100 gems have failed - which is why we need to ignore the tmp directory for Sorbet.

You also get prints for every single file path, even the ones that succeed, which is not useful information for someone running the script.

Finally, if there is more than one error, you need to re-run the script because it will stop at the first failure.

How about we do something like this instead?

errors = Dir[File.join(directory, "**", "*.rb")].filter_map do |filepath|
  print "."
  code = File.read(filepath)
  index.index_single(RubyIndexer::IndexablePath.new(nil, filepath), code)
  nil
rescue => e
  { message: e.message, file: filepath }
end

# => Copy only the files with errors to some directory 
# => group errors by filepath and then print them
# => include the error message as part of what is printed

lib/tasks/index.rake Outdated Show resolved Hide resolved
.github/workflows/ci.yml Outdated Show resolved Hide resolved
@andyw8 andyw8 force-pushed the andyw8/index-top-gems branch 2 times, most recently from 2ae548c to 3b8616e Compare July 19, 2024 18:43
@andyw8 andyw8 enabled auto-merge (squash) July 24, 2024 19:14
@andyw8 andyw8 merged commit 77de05f into main Jul 24, 2024
33 checks passed
@andyw8 andyw8 deleted the andyw8/index-top-gems branch July 24, 2024 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
chore Chore task server This pull request should be included in the server gem's release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants