Recognize Kind regards as a signature #71

Open
wants to merge 8 commits into
base: master
1 change: 1 addition & 0 deletions email_reply_parser.gemspec
@@ -63,6 +63,7 @@ Gem::Specification.new do |s|
test/emails/email_sig_delimiter_in_middle_of_line.txt
test/emails/greedy_on.txt
test/emails/pathological.txt
test/emails/email_with_kind_regards.txt
]
# = MANIFEST =

67 changes: 64 additions & 3 deletions lib/email_reply_parser.rb
@@ -32,6 +32,26 @@
class EmailReplyParser
VERSION = "0.5.9"

class << self
attr_writer :configuration

# Public: Configuration
#
# Returns a Configuration instance.
#
def configuration
@configuration ||= Configuration.new
end

# Public: Configures EmailReplyParser
#
# block - a default configuration instance is exposed in the block
#
def configure
yield(configuration)
end
end

# Public: Splits an email body into a list of Fragments.
#
# text - A String email body.
@@ -50,6 +70,18 @@ def self.parse_reply(text)
self.read(text).visible_text
end

### Configuration

# A Configuration instance.
class Configuration
# Configuration has an Array of regards
attr_accessor :regards
Contributor

I think, at least generally in English, we'd refer to:

Warm regards,
Matt

as a signature as well, just a particular type of signature.

Rather than introduce the more-specific domain term regards, I think we should name this in terms of signature customization.
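
To make the naming suggestion concrete, here is a minimal sketch of the same `Configuration` class expressed in terms of signatures. The accessor name `signature_phrases` is hypothetical, chosen only for illustration; nothing in this PR or the library defines it.

```ruby
# Hypothetical sketch: same shape as the Configuration class in this diff,
# but named after signatures instead of the narrower term "regards".
class Configuration
  # Array of String phrases that open a signature block,
  # e.g. "Kind regards" or "Warm regards". The name is an assumption.
  attr_accessor :signature_phrases

  def initialize
    @signature_phrases = []
  end
end

config = Configuration.new
config.signature_phrases << 'Kind regards' << 'Warm regards'
```

With a name like this, "Warm regards" and "Kind regards" read as two entries of one signature setting rather than as a separate concept.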


def initialize
@regards = []
end
end

### Emails

# An Email instance represents a parsed body String.
@@ -141,6 +173,24 @@ def read(text)
SIG_REGEX = Regexp.new(SIGNATURE)
end

# Regular expression for regards
#
# Returns a Regexp instance if regards are configured, otherwise it returns
# nil
def regards_regex
return nil if EmailReplyParser.configuration.regards.empty?
value = EmailReplyParser.configuration.regards.map do |regard|
"(#{regard.reverse}$)"
end.join('|')

begin
require 're2'
RE2::Regexp.new(value, case_sensitive: false)
rescue LoadError
Regexp.new(value, ignore_case: true)
Contributor

Currently, we'll create a new Regexp object with each email message we parse, though the content of the regular expression won't be changing. I think we want to memoize it in one way or another.
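
One possible shape for that memoization, as a rough sketch only. `RegardsRegexCache` is a hypothetical helper, not part of the library. It omits the `re2` branch for brevity, and it uses `Regexp.escape` and `Regexp::IGNORECASE`, which are my additions rather than what the PR does. The cache is keyed by the phrase list, so a changed configuration builds a fresh pattern.

```ruby
# Hypothetical helper, not part of EmailReplyParser.
module RegardsRegexCache
  @cache = {}

  # Returns nil when no phrases are configured. Otherwise returns a Regexp,
  # building it only the first time a given list of phrases is seen.
  def self.fetch(phrases)
    return nil if phrases.empty?

    @cache[phrases.dup.freeze] ||= begin
      # Phrases are reversed because the parser matches against reversed lines.
      alternatives = phrases.map { |p| "(#{Regexp.escape(p.reverse)}$)" }
      Regexp.new(alternatives.join('|'), Regexp::IGNORECASE)
    end
  end
end

first  = RegardsRegexCache.fetch(['Kind regards'])
second = RegardsRegexCache.fetch(['Kind regards'])
reused = first.equal?(second) # the second call returns the cached object
```

Keying on the phrase list avoids having to invalidate the cache explicitly when `configure` is called again, at the cost of keeping old patterns around.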

end
end

### Line-by-Line Parsing

# Scans the given line of text and figures out which fragment it belongs
@@ -166,6 +216,16 @@ def scan_line(line)
end
end

# Mark the current Fragment as a regards if regards are configured and
# the current line is empty and the Fragment starts with a common regards
# indicator.
if regards_regex && @fragment && line == EMPTY
Contributor

Not to keep repeating myself, but as discussed above I think we should fold this into the signature check.

But this does raise a question: should the signature customization be purely additive, or as a user of the library should I be able to clear out the default signature matchers? I'm inclined toward the latter.
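
As a sketch of the "clearable defaults" option: the configuration could start from a copy of the default matchers, so callers can either append to the list or replace it outright. `SignatureConfig`, `patterns`, and the default patterns below are all hypothetical and are not the library's actual `SIGNATURE` expression. For readability the sketch matches unreversed lines, unlike the parser.

```ruby
# Hypothetical sketch; names and default patterns are illustrative only.
class SignatureConfig
  # Illustrative defaults, not the library's real signature pattern.
  DEFAULT_PATTERNS = [/\A--\s*\z/, /\ASent from my /i].freeze

  attr_accessor :patterns

  def initialize
    # Each instance gets its own copy, so the defaults stay untouched.
    @patterns = DEFAULT_PATTERNS.dup
  end

  # True when any configured pattern matches the (unreversed) line.
  def signature?(line)
    patterns.any? { |pattern| pattern.match?(line) }
  end
end

additive = SignatureConfig.new
additive.patterns << /\AKind regards/i    # extend the defaults

replaced = SignatureConfig.new
replaced.patterns = [/\AKind regards/i]   # discard the defaults entirely
```

Under this design, purely additive behavior is just the case where the caller never reassigns `patterns`.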

if regards_regex.match @fragment.lines.last
@fragment.regards = true
finish_fragment
end
end

# If the line matches the current fragment, add it. Note that a common
# reply header also counts as part of the quoted Fragment, even though
# it doesn't start with `>`.
@@ -217,7 +277,7 @@ def finish_fragment
if @fragment
@fragment.finish
if !@found_visible
- if @fragment.quoted? || @fragment.signature? ||
+ if @fragment.quoted? || @fragment.signature? || @fragment.regards? ||
@fragment.to_s.strip == EMPTY
@fragment.hidden = true
else
@@ -235,7 +295,7 @@ def finish_fragment
# Represents a group of paragraphs in the email sharing common attributes.
# Paragraphs should get their own fragment if they are a quoted area or a
# signature.
- class Fragment < Struct.new(:quoted, :signature, :hidden)
+ class Fragment < Struct.new(:quoted, :signature, :hidden, :regards)
# This is an Array of String lines of content. Since the content is
# reversed, this array is backwards, and contains reversed strings.
attr_reader :lines,
@@ -245,7 +305,7 @@ class Fragment < Struct.new(:quoted, :signature, :hidden)
:content

def initialize(quoted, first_line)
- self.signature = self.hidden = false
+ self.signature = self.hidden = self.regards = false
self.quoted = quoted
@lines = [first_line]
@content = nil
@@ -255,6 +315,7 @@ def initialize(quoted, first_line)
alias quoted? quoted
alias signature? signature
alias hidden? hidden
alias regards? regards

# Builds the string content by joining the lines and reversing them.
#
17 changes: 17 additions & 0 deletions test/email_reply_parser_test.rb
@@ -9,6 +9,14 @@
EMAIL_FIXTURE_PATH = dir + 'emails'

class EmailReplyParserTest < Test::Unit::TestCase
def test_regards_configuration
EmailReplyParser.configure do |config|
config.regards = ['best regards']
end

assert_equal ['best regards'], EmailReplyParser.configuration.regards
end

def test_encoding_should_be_maintained
body = IO.read EMAIL_FIXTURE_PATH.join("email_1_1.txt").to_s
EmailReplyParser.read body
@@ -222,6 +230,15 @@ def test_doesnt_remove_signature_delimiter_in_mid_line
assert_equal 1, reply.fragments.size
end

def test_kind_regards_signature
EmailReplyParser.configure do |config|
config.regards = ['Kind regards']
end
reply = email('email_with_kind_regards')
assert_match(/Thats a great idea/, reply.fragments[0].to_s)
assert_equal [false, true], reply.fragments.map { |f| f.regards? }
end

def email(name)
body = IO.read EMAIL_FIXTURE_PATH.join("#{name}.txt").to_s
EmailReplyParser.read body
9 changes: 9 additions & 0 deletions test/emails/email_with_kind_regards.txt
@@ -0,0 +1,9 @@
Hey,

Thats a great idea!


Kind regards

Tim Tommy
CEO
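
For reference, a small standalone sketch of why `regards_regex` builds its pattern from `regard.reverse`. It assumes, as the `Fragment` comment in this diff says, that the parser works on reversed content. It does not call the library; it only replays the string handling on this fixture.

```ruby
# Standalone illustration; does not use EmailReplyParser itself.
body = "Hey,\n\nThats a great idea!\n\n\nKind regards\n\nTim Tommy\nCEO"

# Reversing the whole body yields the lines bottom-up, each one reversed.
reversed_lines = body.reverse.split("\n")

# Same construction as in the diff: the reversed phrase anchored at line end,
# which corresponds to the original line *starting* with the phrase.
pattern = Regexp.new("(#{'Kind regards'.reverse}$)", Regexp::IGNORECASE)

# Collect the matching lines and flip them back for display.
hits = reversed_lines.select { |line| pattern.match?(line) }.map(&:reverse)
```

Here `hits` contains only the "Kind regards" line, the line where the test above expects the hidden regards fragment to begin.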