Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce guessed receiver types #2210

Merged
merged 5 commits into from
Jul 23, 2024
Merged

Conversation

vinistock
Copy link
Member

@vinistock vinistock commented Jun 19, 2024

Motivation

This PR adds the experiment of guessed receiver types, where we try to guess the type of receivers based on their identifier.

Implementation

The relevant part of the implementation is all in TypeInferrer, everything else is just displaying to users why we picked a certain type.

The idea is to try to guess the types like this:

  1. Take the raw receiver slice
  2. Sanitize that name to be camel case and discard @ symbols
  3. First, try to resolve the name inside the current nesting. If we find something, return that
  4. Otherwise, search for the first type that matches the unqualified name of the identifier

More details in the Markdown documentation.

Validation

I used Spoom's access to the Sorbet LSP to compare the guessed types vs the actual types informed by Sorbet. I also compared 4 approaches:

In the Ruby LSP repo, these are the accuracy results for each approach

  1. First resolve then fallback to unqualified name: 15% of correct types
  2. Unqualified only: 11%
  3. Resolve with nesting only: 9%
  4. Fuzzy search: 2% (in addition to being the worse accuracy, fuzzy search was also unbearably slow)

In Core, the analysis script took way too long to finish, so I sampled a subset of the codebase. The results there were worse than in the Ruby LSP codebase, peaking at about 5% of correct types.

Surely, the level of accuracy will vary a lot between different codebases. That said, I still believe the experiment would be worth the try and would love to hear feedback from users about the usefulness of this.

Script:

# typed: strict
# frozen_string_literal: true

require "spoom"
require "ruby_lsp/internal"

class Visitor < Prism::Visitor
  extend T::Sig

  sig { returns(T.nilable(RubyLsp::Document)) }
  attr_accessor :document

  sig { returns(Integer) }
  attr_reader :total, :correct

  sig { returns(T::Hash[String, T.nilable(String)]) }
  attr_reader :comparison

  sig { params(inferrer: RubyLsp::TypeInferrer, lsp_client: Spoom::LSP::Client).void }
  def initialize(inferrer, lsp_client)
    @inferrer = inferrer
    @lsp_client = lsp_client
    @total = T.let(0, Integer)
    @correct = T.let(0, Integer)
    @document = T.let(nil, T.nilable(RubyLsp::Document))
    super()
  end

  sig { params(node: Prism::CallNode).void }
  def visit_call_node(node)
    receiver_loc = node.receiver&.location
    return super unless receiver_loc

    receiver = node.receiver
    unless receiver.is_a?(Prism::CallNode) || receiver.is_a?(Prism::LocalVariableReadNode) ||
        receiver.is_a?(Prism::InstanceVariableReadNode)
      return super
    end

    hover = @lsp_client.hover(T.must(@document).uri.to_s, receiver_loc.start_line - 1, receiver_loc.start_column)

    if hover
      hovered_type = if /returns\((.*)\)/ =~ hover.contents
        T.must(T.must(hover.contents.match(/returns\((.*)\)/))[1])
      else
        hover.contents
      end

      return super if hovered_type == "T.untyped" || hovered_type == "T::Private::Methods::DeclBuilder"

      loc = T.must(node.message_loc)

      node_context = T.must(@document).locate_node(
        {
          line: loc.start_line - 1,
          character: loc.start_column,
        },
        node_types: [Prism::CallNode],
      )

      type = @inferrer.infer_receiver_type(node_context)

      @total += 1

      if type
        parts = type.split("::")
        parts.reject! { |e| e.include?("<Class:") }
        corrected_type = parts.join("::")

        if hovered_type.include?(corrected_type)
          @correct += 1
        end
      end
    end

    super
  end
end

index = RubyIndexer::Index.new
index.index_all

inferrer = RubyLsp::TypeInferrer.new(index)
workspace_path = Dir.pwd

client = Spoom::LSP::Client.new(
  Spoom::Sorbet::BIN_PATH,
  "--lsp",
  "--enable-all-experimental-lsp-features",
  "--disable-watchman",
)
client.open(workspace_path)

begin
  visitor = Visitor.new(inferrer, client)
  files = Dir.glob("#{workspace_path}/**/*.rb")
  RubyVM::YJIT.enable

  Signal.trap("INT") do
    puts "Total: #{visitor.total}"
    puts "Correct: #{visitor.correct}"
    puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
    client.close
    exit
  end

  files.each_with_index do |file, index|
    document = RubyLsp::RubyDocument.new(
      source: File.read(file),
      version: 1,
      uri: URI::Generic.from_path(path: File.expand_path(file)),
    )
    visitor.document = document
    Prism.parse_file(file).value.accept(visitor)

    print("\033[M\033[0KCompleted #{index + 1}/#{files.length}")
  end

  puts "Total: #{visitor.total}"
  puts "Correct: #{visitor.correct}"
  puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
ensure
  client.close
end

Automated Tests

Added tests.

Manual Tests

Type any existing class name as a variable. After typing a dot, you should see completion options for that type (e.g.: pathname.).

@vinistock vinistock added enhancement New feature or request server This pull request should be included in the server gem's release notes labels Jun 19, 2024
@vinistock vinistock self-assigned this Jun 19, 2024
@vinistock vinistock requested a review from a team as a code owner June 19, 2024 20:27
@vinistock vinistock requested review from andyw8 and st0012 June 19, 2024 20:27
@vinistock vinistock changed the base branch from main to vs/move_exp_features_into_global_state June 19, 2024 20:27
@andyw8
Copy link
Contributor

andyw8 commented Jun 19, 2024

Another thing we could do, especially for the benefit of tests, is to match on a type name followed by a number, e.g. product_1.

Base automatically changed from vs/move_exp_features_into_global_state to main June 20, 2024 13:18
@vinistock vinistock force-pushed the vs/introduce_guessed_receiver_types branch from b0615a5 to 886a7aa Compare June 20, 2024 13:18
Copy link
Contributor

@Morriar Morriar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add the script you used to benchmark somewhere?

DESIGN_AND_ROADMAP.md Outdated Show resolved Hide resolved
lib/ruby_indexer/lib/ruby_indexer/index.rb Show resolved Hide resolved
lib/ruby_lsp/type_inferrer.rb Show resolved Hide resolved
test/type_inferrer_test.rb Show resolved Hide resolved
lib/ruby_lsp/listeners/hover.rb Outdated Show resolved Hide resolved
@vinistock vinistock force-pushed the vs/introduce_guessed_receiver_types branch from 886a7aa to 1df800c Compare June 20, 2024 18:23
@vinistock vinistock force-pushed the vs/introduce_guessed_receiver_types branch 2 times, most recently from 656e424 to 7cfdd9a Compare July 11, 2024 17:25
Copy link
Member

@st0012 st0012 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be very honest, I'm not convinced that this approach worth the extra complexity based on the accuracy metrics. But let's see how users react to it first 👍
Additionally, is it possible to identify the guessed type responses from our metrics? It'd be interesting to see how much completion requests are based on it vs not.

DESIGN_AND_ROADMAP.md Outdated Show resolved Hide resolved
@st0012
Copy link
Member

st0012 commented Jul 22, 2024

Do you think we're able to package the script into a flag, like ruby-lsp --report-guess-type-accuracy? (return early if spoom is not available)
It will help us continuously evaluating this feature in the future, and we can ask some community users who also use Sorbet to give us result too.

@vinistock vinistock force-pushed the vs/introduce_guessed_receiver_types branch from 7cfdd9a to e597de1 Compare July 23, 2024 15:05
@vinistock
Copy link
Member Author

Talked to Stan and we agreed to ship this and follow up with an executable to estimate the type accuracy of guessed types.

@vinistock vinistock enabled auto-merge (squash) July 23, 2024 15:08
@vinistock vinistock merged commit e66513d into main Jul 23, 2024
35 checks passed
@vinistock vinistock deleted the vs/introduce_guessed_receiver_types branch July 23, 2024 15:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request server This pull request should be included in the server gem's release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants