Lexer does not conform to ECMA-262's definition of whitespace #84

Open
jtbraun opened this issue Mar 16, 2016 · 0 comments
jtbraun commented Mar 16, 2016

ECMA-262 specifies the allowed whitespace characters in Table 32. slimit's lexer rejects some of them as invalid characters. The spec says:

ECMAScript implementations must recognize as WhiteSpace code points listed in the “Separator, space” (Zs) category by Unicode 5.1. ECMAScript implementations may also recognize as WhiteSpace additional category Zs code points from subsequent editions of the Unicode Standard.

Here's a small test that exhibits some of the problems. There may be other characters in the Unicode Zs category that must also be included; I haven't looked for those here.

from __future__ import print_function  # use print() as a function on Python 2 as well

from itertools import product
import unicodedata

from slimit.parser import Parser as sParser


def replace_spaces(s, wschar):
    """Yield s unchanged, then one variant per space with that space replaced by wschar."""
    yield "WITHOUT REPLACEMENT", s
    offsets = [i for i, c in enumerate(s) if c == ' ']

    # Control characters such as TAB have no Unicode name; fall back to repr().
    try:
        name = unicodedata.name(wschar[0])
    except ValueError:
        name = repr(wschar)

    for i in offsets:
        yield "WITH REPLACEMENT OF " + name, s[:i] + wschar + s[i+1:]


jsparser = sParser()
for src, wschar in product(
        [u" function_name( 'arg' ) "],
        [u"\x09", u"\x0b", u"\x0c",   # TAB, VT, FF
         u"\x20", u"\xa0",            # SP, NBSP
         u"\uFEFF"]):                 # ZWNBSP (byte order mark)
    for prefix, js in replace_spaces(src, wschar):
        print(prefix, "=>", js)
        try:
            jsparser.parse(js)
        except SyntaxError as e:
            print("Syntax error", e)
    print()
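
To find the remaining Zs characters that the test above does not cover, something like the following sketch could be used. It is written for Python 3 and is not part of slimit; `zs_code_points` and `whitespace_code_points` are hypothetical helper names. Note that it reports the Zs category of whatever Unicode version the interpreter bundles (see `unicodedata.unidata_version`), which may differ from Unicode 5.1. For example, U+180E was Zs in 5.1 but has since been reclassified, so it would likely be missing from the output on a recent Python.

```python
import sys
import unicodedata

# Code points that ECMA-262 lists individually as WhiteSpace, in addition
# to the Zs category: TAB, VT, FF, SP, NBSP and ZWNBSP (the byte order mark).
EXPLICIT_WHITESPACE = (0x0009, 0x000B, 0x000C, 0x0020, 0x00A0, 0xFEFF)


def zs_code_points():
    """Return every code point whose general category is Zs, according to
    the Unicode database bundled with this interpreter."""
    return [cp for cp in range(sys.maxunicode + 1)
            if unicodedata.category(chr(cp)) == "Zs"]


def whitespace_code_points():
    """Hypothetical helper: explicit ECMA-262 whitespace plus all of Zs."""
    return sorted(set(EXPLICIT_WHITESPACE) | set(zs_code_points()))


if __name__ == "__main__":
    print("Unicode database version:", unicodedata.unidata_version)
    for cp in whitespace_code_points():
        # Control characters such as TAB have no Unicode name.
        print("U+%04X %s" % (cp, unicodedata.name(chr(cp), "<unnamed>")))
```

Each code point printed this way could be added to the replacement list in the test above to check whether the lexer accepts it.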
metatoaster added a commit to calmjs/calmjs.parse that referenced this issue Jun 8, 2017
- Conform to ECMA 262, section 7.2, table 2.
- Test case provided by rspivak/slimit#84 on github.