Allow single-quote style parsing as well as double-quote style (" vs ') and record as an attribute on the parsed string. Re-use this style when outputting expressions as SXP.
gkellogg committed Oct 11, 2023
1 parent a0bb2fa commit f00f8bc
Showing 7 changed files with 45 additions and 11 deletions.
4 changes: 3 additions & 1 deletion README.md
@@ -25,7 +25,7 @@ S-Expressions derive from LISP, and include some basic datatypes common to all v
<dt>Symbols</dt>
<dd>Of the form <code>with-hyphen ?@!$ a\ symbol\ with\ spaces</code></dd>
<dt>Strings</dt>
<dd>Of the form <code>"Hello, world!"</code><br/>
<dd>Of the form <code>"Hello, world!"</code> or <code>'Hello, world!'</code><br/>
Strings may include the following special characters:
<ul>
<li><code>\b</code> &mdash; Backspace</li>
@@ -36,6 +36,7 @@ S-Expressions derive from LISP, and include some basic datatypes common to all v
<li><code>\u<i>xxxx</i></code> &mdash; 2-byte Unicode character escape</li>
<li><code>\U<i>xxxxxxxx</i></code> &mdash; 4-byte Unicode character escape</li>
<li><code>\"</code> &mdash; Double-quote character</li>
<li><code>\'</code> &mdash; Single-quote character</li>
<li><code>\\</code> &mdash; Backslash</li>
</ul>
Additionally, any other character may follow <code>\</code>, representing the character itself.
@@ -124,6 +125,7 @@ In addition to the standard datatypes, the SPARQL dialect supports the following
<dd>Strings are interpreted as an RDF Literal with datatype <code>xsd:string</code>. It can be followed by <code>@<i>lang</i></code> to create a language-tagged string, or <code>^^<i>IRI</i></code> to create a datatyped-literal. Examples:
<ul>
<li><code>"a plain literal"</code></li>
<li><code>'another plain literal'</code></li>
<li><code>"a literal with a language"@en</code></li>
<li><code>"a typed literal"^^&lt;http://example/></code></li>
<li><code>"a typed literal with a PNAME"^^xsd:string</code></li>
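The escape table in the README hunk above can be read as a small decoding routine. What follows is a minimal sketch under stated assumptions, not the gem's implementation: `unescape_sxp` and `SIMPLE_ESCAPES` are hypothetical names, the input is the text found between the quotes, and `\t`, `\n`, `\f` and `\r` are assumed to be among the collapsed list items because the specs in this commit exercise them.

```ruby
# Hypothetical helper, not part of the sxp gem: decodes the escapes the
# README lists, given the text found between the opening and closing quote.
SIMPLE_ESCAPES = {
  'b' => "\b", 't' => "\t", 'n' => "\n", 'f' => "\f", 'r' => "\r"
}.freeze

def unescape_sxp(body)
  # A backslash followed by uXXXX, UXXXXXXXX, or any single character.
  body.gsub(/\\(u\h{4}|U\h{8}|.)/m) do
    esc = Regexp.last_match(1)
    if esc.length > 1
      [esc[1..-1].hex].pack('U')      # Unicode escape: hex digits to code point
    else
      SIMPLE_ESCAPES.fetch(esc, esc)  # any other character stands for itself
    end
  end
end
```

Because any character other than the named escapes maps to itself, `\"`, `\'` and `\\` need no entries of their own in the table.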
20 changes: 18 additions & 2 deletions lib/sxp/extensions.rb
@@ -56,6 +56,7 @@ def to_sxp(**options)
class String
##
# Returns the SXP representation of this object. Uses SPARQL-like escaping.
# Uses any recorded quote style from an originally parsed string.
#
# @return [String]
def to_sxp(**options)
@@ -69,14 +70,29 @@ def to_sxp(**options)
when (0x0C) then '\f'
when (0x0D) then '\r'
when (0x0E..0x1F) then sprintf("\\u%04X", u.ord)
when (0x22) then '\"'
when (0x22) then as_dquote? ? '\"' : '"'
when (0x27) then as_squote? ? "\\'" : "'"
when (0x5C) then '\\\\'
when (0x7F) then sprintf("\\u%04X", u.ord)
else u.chr
end
end
'"' + buffer + '"'
if as_dquote?
'"' + buffer + '"'
else
"'" + buffer + "'"
end
end

# Record quote style used when parsing
# @return [:dquote, :squote]
attr_accessor :quote_style

# Render string using single quotes
def as_squote?; quote_style == :squote; end

# Render string using double quotes (the default when no style is recorded)
def as_dquote?; quote_style != :squote; end
end

##
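The recorded style lives in an instance variable on one particular String object, so it follows copies made with `dup` but not strings derived by other operations. A minimal standalone sketch of that behaviour, assuming no sxp gem is loaded: it re-declares only the `quote_style` accessor from this diff, and `render` is a hypothetical helper that picks the delimiter without doing any escaping.

```ruby
# Standalone sketch: re-declares only the accessor added in this commit.
class String
  attr_accessor :quote_style
end

# Hypothetical helper: the recorded style selects the delimiter.
# Escaping of the body is left out to keep the sketch short.
def render(str)
  quote = str.quote_style == :squote ? "'" : '"'
  quote + str + quote
end

parsed = +"hello"              # unary plus yields a mutable string
parsed.quote_style = :squote

copy     = parsed.dup          # dup copies instance variables
appended = parsed + "!"        # a new String with no recorded style

puts render(parsed)    # 'hello'
puts render(copy)      # 'hello'
puts render(appended)  # "hello!"
```

One consequence worth keeping in mind: a frozen string cannot record a style, since assigning an instance variable on it raises `FrozenError`, so code that annotates strings this way needs them to be mutable.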
12 changes: 7 additions & 5 deletions lib/sxp/reader/basic.rb
@@ -15,7 +15,7 @@ class Basic < Reader
def read_token
case peek_char
when ?(, ?) then [:list, read_char]
when ?" then [:atom, read_string] #"
when ?", ?' then [:atom, read_string] #" or '
else super
end
end
@@ -36,16 +36,18 @@ def read_atom
# @return [String]
def read_string
buffer = ""
skip_char # '"'
until peek_char == ?" #"
quote_char = read_char
until peek_char == quote_char # " or '
buffer <<
case char = read_char
when ?\\ then read_character
else char
end
end
skip_char # '"'
buffer
skip_char # " or '

# Return string, annotating it with the quotation style used
buffer.tap {|s| s.quote_style = (quote_char == '"' ? :dquote : :squote)}
end

##
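The change to `read_string` above amounts to letting the opening quote character decide both the terminator and the recorded style. A minimal sketch of that idea over a `StringIO`, under stated assumptions: `read_quoted` is a hypothetical standalone function rather than the gem's reader, escapes are simplified to "keep the next character", the end-of-input checks are an addition of this sketch, and the style is returned alongside the text instead of being stored on the String.

```ruby
require 'stringio'

# Hypothetical standalone reader, not the gem's Basic reader.
# Returns [text, style], where style is :dquote or :squote.
def read_quoted(io)
  quote_char = io.getc
  unless ['"', "'"].include?(quote_char)
    raise ArgumentError, "expected a quote character"
  end

  buffer = +""
  loop do
    char = io.getc
    raise EOFError, "unterminated string" if char.nil?
    break if char == quote_char   # only the opening quote style terminates

    if char == "\\"
      char = io.getc              # simplification: keep the escaped character
      raise EOFError, "unterminated string" if char.nil?
    end
    buffer << char
  end

  [buffer, quote_char == '"' ? :dquote : :squote]
end

p read_quoted(StringIO.new(%q('say "hi"' rest)))  # ["say \"hi\"", :squote]
```

Returning a pair avoids reopening `String`; the commit instead annotates the string itself, which keeps the reader's return type unchanged for existing callers.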
2 changes: 1 addition & 1 deletion lib/sxp/reader/common_lisp.rb
@@ -115,7 +115,7 @@ def read_vector
# @return [Array]
def read_quote
skip_char # "'"
[options[:quote] || :quote, read]
[options[:quote] || :quote, read.tap {|s| s.quote_style = :squote if s.is_a?(String)}]
end

##
2 changes: 2 additions & 0 deletions lib/sxp/reader/sparql.rb
@@ -94,6 +94,7 @@ def initialize(input, **options, &block)
def read_token
case peek_char
when ?" then [:atom, read_rdf_literal] # "
when ?' then [:atom, read_rdf_literal] # '
when ?< then [:atom, read_rdf_uri]
else
tok = super
@@ -144,6 +145,7 @@ def read_token
#
# @example
# "a plain literal"
# 'another plain literal'
# "a literal with a language"@en
# "a typed literal"^^<http://example/>
# "a typed literal with a PNAME"^^xsd:string
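In the SPARQL reader both quote characters now lead to the same literal-reading path, so a single-quoted literal may carry the same `@lang` or `^^datatype` suffix as a double-quoted one. The sketch below illustrates that shape with a regular expression. It is an assumption-laden illustration only: `SketchLiteral`, `LITERAL_PATTERN` and `parse_literal` are hypothetical names, escapes in the body are not decoded, and the result is a plain Struct rather than the RDF literal the README describes.

```ruby
# Hypothetical stand-in for an RDF literal.
SketchLiteral = Struct.new(:value, :language, :datatype)

# Opening quote of either style, a body with no unescaped closing quote,
# the same quote again, then an optional @lang or ^^datatype suffix.
LITERAL_PATTERN = /
  \A(?<q>["'])
  (?<body>(?:\\.|(?!\k<q>)[^\\])*)
  \k<q>
  (?:@(?<lang>[A-Za-z]+(?:-[A-Za-z0-9]+)*)|\^\^(?<dt>\S+))?
  \z
/mx

def parse_literal(src)
  m = LITERAL_PATTERN.match(src)
  raise ArgumentError, "not a literal: #{src}" if m.nil?

  SketchLiteral.new(m[:body], m[:lang], m[:dt])
end

lit = parse_literal(%q('un littéral'@fr))
puts lit.value     # un littéral
puts lit.language  # fr
```

The backreference `\k<q>` is what ties the closing quote to the opening one, mirroring the `quote_char` comparison in the basic reader.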
4 changes: 3 additions & 1 deletion spec/common_lisp_spec.rb
@@ -61,7 +61,9 @@

context "when reading strings" do
it "reads `\"foo\"` as a string" do
expect(read(%q("foo"))).to eq "foo"
res = read(%q("foo"))
expect(res).to eq "foo"
expect(res.quote_style).to eq :dquote
end
end

12 changes: 11 additions & 1 deletion spec/reader_spec.rb
@@ -60,6 +60,7 @@
%q{"\n"} => "\n",
%q{"\r"} => "\r",
%q{"\t"} => "\t",
%q{"\'"} => "\'",
%q{"\u0080"} => "\u0080",
%q("\u07FF") => "\u07FF",
%q("\u0800") => "\u0800",
@@ -75,9 +76,18 @@
%q("\U000FFFFD") => "\u{FFFFD}",
%q("\U00100000") => "\u{100000}",
%q("\U0010FFFD") => "\u{10FFFD}",

%q{'\b'} => "\b",
%q{'\f'} => "\f",
%q{'\n'} => "\n",
%q{'\r'} => "\r",
%q{'\t'} => "\t",
%q{'\''} => "\'",
}.each do |input, output|
it "reads #{input} as #{output.inspect}" do
expect(read(input)).to eq output
res = read(input)
expect(res).to eq output
expect(res.quote_style).to eql(input.start_with?('"') ? :dquote : :squote)
end
end
end