Bump classifier version to 1.4.4 and improve LSI content node scaling
- Update classifier version from 1.4.3 to 1.4.4 in Gemfile.lock and gemspec
- Enhance LSI content node scaling by considering unique words count
- Add test case for classifying repeated words in LSI
cardmagic committed Jul 31, 2024
1 parent 40f3215 commit bb971a2
Showing 4 changed files with 5 additions and 3 deletions.
2 changes: 1 addition & 1 deletion Gemfile.lock
@@ -1,7 +1,7 @@
PATH
remote: .
specs:
- classifier (1.4.3)
+ classifier (1.4.4)
fast-stemmer (~> 1.0)
mutex_m (~> 0.2)
rake
2 changes: 1 addition & 1 deletion classifier.gemspec
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'classifier'
- s.version = '1.4.3'
+ s.version = '1.4.4'
s.summary = 'A general classifier module to allow Bayesian and other types of classifications.'
s.description = 'A general classifier module to allow Bayesian and other types of classifications.'
s.author = 'Lucas Carlson'
3 changes: 2 additions & 1 deletion lib/classifier/lsi/content_node.rb
@@ -45,10 +45,11 @@ def raw_vector_with(word_list)

# Perform the scaling transform
total_words = $GSL ? vec.sum : vec.sum_with_identity
+ total_unique_words = vec.count { |word| word != 0 }

# Perform first-order association transform if this vector has more
# than one word in it.
- if total_words > 1.0
+ if total_words > 1.0 && total_unique_words > 1
weighted_total = 0.0

vec.each do |term|
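
The effect of the new guard can be sketched outside the gem. The example below is a minimal illustration, not the library's code: plain Ruby arrays stand in for the vector type, the helper names `entropy_weight` and `scale?` are hypothetical, and the `p * log(p)` weighting is an assumption about what the truncated loop accumulates into `weighted_total`. Under that assumption, a document made of a single repeated word yields a weighted total of zero, which would be unsafe to use as a divisor, and the added `total_unique_words > 1` condition skips the transform in that case.

```ruby
# Illustrative sketch only: plain arrays stand in for the gem's vector type.
# `entropy_weight` and `scale?` are hypothetical names, not part of the gem.

# Sum of p * log(p) over the non-zero term counts, where p = term / total.
# This formula is an assumption about the loop that the diff truncates.
def entropy_weight(vec)
  total_words = vec.sum.to_f
  vec.sum(0.0) do |term|
    next 0.0 unless term.positive?

    share = term / total_words
    share * Math.log(share)
  end
end

# The condition from the diff: more than one word in total
# and more than one distinct word.
def scale?(vec)
  total_words = vec.sum
  total_unique_words = vec.count { |word| word != 0 }
  total_words > 1.0 && total_unique_words > 1
end

repeated = [2, 0, 0] # one known word appearing twice
mixed    = [1, 1, 0] # two distinct words, once each

puts "repeated: weight=#{entropy_weight(repeated)} scale?=#{scale?(repeated)}"
puts "mixed:    weight=#{entropy_weight(mixed)} scale?=#{scale?(mixed)}"
```

For `repeated`, the only non-zero share is 1.0, so its logarithm and the whole sum are zero, and the old `total_words > 1.0` check alone would still have entered the transform. This appears to be the situation the new `'Bird me to Bird'` test exercises, assuming only "Bird" survives word filtering.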
1 change: 1 addition & 0 deletions test/lsi/lsi_test.rb
@@ -40,6 +40,7 @@ def test_basic_categorizing
assert_equal 'Dog', lsi.classify(@str1)
assert_equal 'Cat', lsi.classify(@str3)
assert_equal 'Bird', lsi.classify(@str5)
+ assert_equal 'Bird', lsi.classify('Bird me to Bird')
end

def test_external_classifying