Introduce guessed receiver types #2210

vinistock · 2024-06-19T20:27:28Z

Motivation

This PR adds the experiment of guessed receiver types, where we try to guess the type of receivers based on their identifier.

Implementation

The relevant part of the implementation is all in TypeInferrer, everything else is just displaying to users why we picked a certain type.

The idea is to try to guess the types like this:

Take the raw receiver slice
Sanitize that name to be camel case and discard @ symbols
First, try to resolve the name inside the current nesting. If we find something, return that
Otherwise, search for the first type that matches the unqualified name of the identifier

More details in the Markdown documentation.

Validation

I used Spoom's access to the Sorbet LSP to compare the guessed types vs the actual types informed by Sorbet. I also compared 4 approaches:

In the Ruby LSP repo, these are the accuracy results for each approach

First resolve then fallback to unqualified name: 15% of correct types
Unqualified only: 11%
Resolve with nesting only: 9%
Fuzzy search: 2% (in addition to being the worse accuracy, fuzzy search was also unbearably slow)

In Core, the analysis script took way too long to finish, so I sampled a subset of the codebase. The results there were worse than in the Ruby LSP codebase, peaking at about 5% of correct types.

Surely, the level of accuracy will vary a lot between different codebases. That said, I still believe the experiment would be worth the try and would love to hear feedback from users about the usefulness of this.

Script:

# typed: strict
# frozen_string_literal: true

require "spoom"
require "ruby_lsp/internal"

class Visitor < Prism::Visitor
  extend T::Sig

  sig { returns(T.nilable(RubyLsp::Document)) }
  attr_accessor :document

  sig { returns(Integer) }
  attr_reader :total, :correct

  sig { returns(T::Hash[String, T.nilable(String)]) }
  attr_reader :comparison

  sig { params(inferrer: RubyLsp::TypeInferrer, lsp_client: Spoom::LSP::Client).void }
  def initialize(inferrer, lsp_client)
    @inferrer = inferrer
    @lsp_client = lsp_client
    @total = T.let(0, Integer)
    @correct = T.let(0, Integer)
    @document = T.let(nil, T.nilable(RubyLsp::Document))
    super()
  end

  sig { params(node: Prism::CallNode).void }
  def visit_call_node(node)
    receiver_loc = node.receiver&.location
    return super unless receiver_loc

    receiver = node.receiver
    unless receiver.is_a?(Prism::CallNode) || receiver.is_a?(Prism::LocalVariableReadNode) ||
        receiver.is_a?(Prism::InstanceVariableReadNode)
      return super
    end

    hover = @lsp_client.hover(T.must(@document).uri.to_s, receiver_loc.start_line - 1, receiver_loc.start_column)

    if hover
      hovered_type = if /returns\((.*)\)/ =~ hover.contents
        T.must(T.must(hover.contents.match(/returns\((.*)\)/))[1])
      else
        hover.contents
      end

      return super if hovered_type == "T.untyped" || hovered_type == "T::Private::Methods::DeclBuilder"

      loc = T.must(node.message_loc)

      node_context = T.must(@document).locate_node(
        {
          line: loc.start_line - 1,
          character: loc.start_column,
        },
        node_types: [Prism::CallNode],
      )

      type = @inferrer.infer_receiver_type(node_context)

      @total += 1

      if type
        parts = type.split("::")
        parts.reject! { |e| e.include?("<Class:") }
        corrected_type = parts.join("::")

        if hovered_type.include?(corrected_type)
          @correct += 1
        end
      end
    end

    super
  end
end

index = RubyIndexer::Index.new
index.index_all

inferrer = RubyLsp::TypeInferrer.new(index)
workspace_path = Dir.pwd

client = Spoom::LSP::Client.new(
  Spoom::Sorbet::BIN_PATH,
  "--lsp",
  "--enable-all-experimental-lsp-features",
  "--disable-watchman",
)
client.open(workspace_path)

begin
  visitor = Visitor.new(inferrer, client)
  files = Dir.glob("#{workspace_path}/**/*.rb")
  RubyVM::YJIT.enable

  Signal.trap("INT") do
    puts "Total: #{visitor.total}"
    puts "Correct: #{visitor.correct}"
    puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
    client.close
    exit
  end

  files.each_with_index do |file, index|
    document = RubyLsp::RubyDocument.new(
      source: File.read(file),
      version: 1,
      uri: URI::Generic.from_path(path: File.expand_path(file)),
    )
    visitor.document = document
    Prism.parse_file(file).value.accept(visitor)

    print("\033[M\033[0KCompleted #{index + 1}/#{files.length}")
  end

  puts "Total: #{visitor.total}"
  puts "Correct: #{visitor.correct}"
  puts "Accuracy: #{100 * (visitor.correct.to_f / visitor.total)}"
ensure
  client.close
end

Automated Tests

Added tests.

Manual Tests

Type any existing class name as a variable. After typing a dot, you should see completion options for that type (e.g.: pathname.).

andyw8 · 2024-06-19T20:56:55Z

Another thing we could do, especially for the benefit of tests, is to match on a type name followed by a number, e.g. product_1.

Morriar

Should we add the script you used to benchmark somewhere?

DESIGN_AND_ROADMAP.md

lib/ruby_indexer/lib/ruby_indexer/index.rb

lib/ruby_lsp/type_inferrer.rb

test/type_inferrer_test.rb

lib/ruby_lsp/listeners/hover.rb

lib/ruby_lsp/requests/completion_resolve.rb

st0012

To be very honest, I'm not convinced that this approach worth the extra complexity based on the accuracy metrics. But let's see how users react to it first 👍
Additionally, is it possible to identify the guessed type responses from our metrics? It'd be interesting to see how much completion requests are based on it vs not.

DESIGN_AND_ROADMAP.md

lib/ruby_lsp/listeners/definition.rb

st0012 · 2024-07-22T17:27:45Z

Do you think we're able to package the script into a flag, like ruby-lsp --report-guess-type-accuracy? (return early if spoom is not available)
It will help us continuously evaluating this feature in the future, and we can ask some community users who also use Sorbet to give us result too.

vinistock · 2024-07-23T15:06:58Z

Talked to Stan and we agreed to ship this and follow up with an executable to estimate the type accuracy of guessed types.

vinistock added enhancement New feature or request server This pull request should be included in the server gem's release notes labels Jun 19, 2024

vinistock self-assigned this Jun 19, 2024

vinistock requested a review from a team as a code owner June 19, 2024 20:27

vinistock requested review from andyw8 and st0012 June 19, 2024 20:27

vinistock changed the base branch from main to vs/move_exp_features_into_global_state June 19, 2024 20:27

Base automatically changed from vs/move_exp_features_into_global_state to main June 20, 2024 13:18

vinistock force-pushed the vs/introduce_guessed_receiver_types branch from b0615a5 to 886a7aa Compare June 20, 2024 13:18

Morriar reviewed Jun 20, 2024

View reviewed changes

vinistock force-pushed the vs/introduce_guessed_receiver_types branch from 886a7aa to 1df800c Compare June 20, 2024 18:23

st0012 reviewed Jun 21, 2024

View reviewed changes

lib/ruby_lsp/requests/completion_resolve.rb Outdated Show resolved Hide resolved

vinistock mentioned this pull request Jul 3, 2024

Properly handle methods for definition, completion and hover #899

Closed

3 tasks

vinistock force-pushed the vs/introduce_guessed_receiver_types branch 2 times, most recently from 656e424 to 7cfdd9a Compare July 11, 2024 17:25

st0012 approved these changes Jul 19, 2024

View reviewed changes

DESIGN_AND_ROADMAP.md Outdated Show resolved Hide resolved

st0012 reviewed Jul 22, 2024

View reviewed changes

lib/ruby_lsp/listeners/definition.rb Show resolved Hide resolved

vinistock added 5 commits July 23, 2024 10:59

Introduce guessed types

8fc3da3

Use guessed types in hover

d129b9e

Use guessed types in signature help

f6893e8

Use guessed types in completion

b961833

Address PR feedback

e597de1

vinistock force-pushed the vs/introduce_guessed_receiver_types branch from 7cfdd9a to e597de1 Compare July 23, 2024 15:05

vinistock enabled auto-merge (squash) July 23, 2024 15:08

vinistock merged commit e66513d into main Jul 23, 2024
35 checks passed

vinistock deleted the vs/introduce_guessed_receiver_types branch July 23, 2024 15:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Introduce guessed receiver types #2210

Introduce guessed receiver types #2210

vinistock commented Jun 19, 2024 •

edited

Loading

andyw8 commented Jun 19, 2024

Morriar left a comment

st0012 left a comment

st0012 commented Jul 22, 2024

vinistock commented Jul 23, 2024

Introduce guessed receiver types #2210

Introduce guessed receiver types #2210

Conversation

vinistock commented Jun 19, 2024 • edited Loading

Motivation

Implementation

Validation

Automated Tests

Manual Tests

andyw8 commented Jun 19, 2024

Morriar left a comment

Choose a reason for hiding this comment

st0012 left a comment

Choose a reason for hiding this comment

st0012 commented Jul 22, 2024

vinistock commented Jul 23, 2024

vinistock commented Jun 19, 2024 •

edited

Loading