Skip to content

Latest commit

 

History

History
1002 lines (682 loc) · 37.8 KB

CHANGELOG.md

File metadata and controls

1002 lines (682 loc) · 37.8 KB

Changelog

This file contains the notable changes in the ISLa project since version 0.2a1 (February 2022). Changes prior to this date are not documented.

[unreleased]

[1.14.4] - 2024-01-12

Changed

  • TAR formalization: Changed order field of checksum predicate to ensure checksum is evaluated last.

[1.14.3] - 2024-01-09

Changed

  • Internal refactoring: Replaced custom Maybe/Exceptional classes by the returns library.
  • Switched to pyproject.toml configuration.

[1.14.2] - 2023-11-01

Changed

  • Improved the inference of intervals from grammars for numeric variables (relevant for the enable_optimized_z3_queries mode).
  • Fixed a bug in the evaluation of (concrete) arithmetic expressions: So far, we assumed that Z3 addition and multiplication operations can only have two children, while more are possible. Thus, some elements were ignored when computing sums or products.

[1.14.1] - 2023-07-10

Changed

  • Supporting literal negative numbers in ISLa constraints (concrete syntax), such as in str.to.int(<int>) < -1. This raised a SyntaxError before.

[1.14.0] - 2023-07-05

Changed

  • Improved optimized handling of str.to.int (when enable_optimized_z3_queries is enabled). For variables exclusively appearing in str.to.int, we support a wider range of possible solutions, such as negative numbers or languages with an optional + or padding of zeroes in front. Furthermore, ISLa tries to infer the domains of these variables automatically, e.g., to exclude "0" or negative values.

[1.13.9] - 2023-04-27

Changed

  • Bug fix to HUGE unsoundness: Solver returned a solution for "false" constraint.

[1.13.8] - 2023-04-27

Changed

  • Fixed a parser bug occurring when a conjunctive formula contains two independent sets of XPath expressions. Previously, the produced quantifiers were nested, which is incorrect. Now, they're "pushed in" as expected.
  • Fixed a bug in isla.solver.subtree_solutions surfacing for the reST example when a subtree gets substituted by a smaller subtree produced by a semantic predicate (in that case, extend_crop).

[1.13.7] - 2023-04-26

Changed

  • Bug fix: When solving SMT formulas with multiple independent clusters, an exponential number of solutions was pre-computed (many more than asked for), which resulted in a potentially huge performance problem.

[1.13.6] - 2023-04-26

Changed

  • Bug fix for parsing a formula where two XPath expressions are appended to the same bound variable, and one of the possible expansions that can be chosen cannot be merged.

[1.13.5] - 2023-04-26

Changed

  • Introducing a global configuration module global_config.py. Currently, this contains a flag for disabling assertions guarded by assertions_activated(), which are many of the more costly assertions. This flag is an opportunity to programmatically turn off at least some assertions if setting the global -O flag is not an option.
  • Suppressing potentially meaningless error messages issued during the procedure implemented in 1.13.4.

[1.13.4] - 2023-04-26

Changed

  • Fixed bug concerning numeric solutions to SMT formulas with "padded" variables: only "01," e.g., is a valid solution, not "1."

[1.13.3] - 2023-04-17

Changed

  • Fix to partial last bug fix.

[1.13.2] - 2023-04-17

Changed

  • Fixed unparsing of null byte unicode literal: Got "\u{}" before (which could not be parsed by Z3!), now obtain the correct "\u{0}".

[1.13.1] - 2023-04-06

Changed

  • BugFix in ISLaSolver.eliminate_all_ready_semantic_predicate_formulas: Subtrees in computed substitutions are now considered (as they were already for SMT formulas).
  • When defining a new semantic predicate, an order parameter can be set that specifies when a semantic predicate is evaluated relatively to the other semantic predicates.

[1.13.0] - 2023-04-06

Changed

  • Refactoring & optimization in implementation of count semantic predicate. We now select candidates with larger numbers of <needle> occurrences first.
  • Optimization in ISLaSolver.extract_regular_expression: Better re-use of already computed regexes for frequently re-used nonterminals (such like <byte>)

[1.12.0] - 2023-04-06

Changed

  • Change in the implementation of the count predicate: We now only ask for one tree insertion solution. This increases efficiency in our examples (before, we asked for 50 insertion solutions). Yet, it is not entirely certain if this step might cause prolems in more complex examples.

[1.11.2] - 2023-04-04

Changed

[1.11.1] - 2023-03-24

Changed

  • New optimization in SMT formula solving: Variables exclusively occurring in str.to.int contexts are passed to Z3 as integers and later converted to strings, which reduces the solver's load.
  • Cleaned up / improved code corresponding to SMT formula solving optimizations.

[1.11.0] - 2023-03-22

Changed

  • The default for the ISLaSolver parameter enforce_unique_trees_in_queue changed from True to False. The old setting could improve performance if the same derivation tree can be generated by expansion and tree insertion (and similar cases), but can have very unpleasant side effects, e.g., in the presence of disjunctions (see GitHub Issue 53). If you encounter negative performance effects in your existing use cases, consider setting this flag to True when creating the ISLaSolver.

[1.10.4] - 2023-03-15

Changed

  • Solved parser problem: Arithmetic differences (a = b - c) not parsed correctly before.
  • Eliminated bug in solver: If an explicit start symbol was given, the given grammar was mutated. Solver invocations (including the constructor) should, however, not have such side effects. In fact, that problem rendered the test suite flaky.
  • delete_unreachable (which deletes unreachable nonterminal keys in a grammar) no longer mutates grammars, but instead returns a new grammar.

[1.10.3] - 2023-03-09

Changed

  • Upgrade to ANTLR 4.12.0

[1.10.2] - 2023-03-09

Changed

  • The BNF parser now handles XML-like terminals (e.g., "<a>") that are no longer treated as nonterminal symbols. We introduce a fresh nonterminal "<langle>" ("<langle_i>" for smallest possible i if the other alternative(s) are not free) expanding to "<" and replace "<" in terminals with that nonterminal in the resulting Python grammar.

[1.10.1] - 2022-01-10

Changed

  • Deterministic hashing of structural predicates.
  • Fixed too aggressive simplification of existential quantifiers (see test test_subsitute_in_existential_formula in test_language.py).
  • Fixed too aggressive simplification of universal quantifiers for recursive grammars: even if a quantifier already matched a tree node, we cannot simplify it away in the absence of other current matches if the quantified nonterminal can still be reached from the already mached tree.
  • Deactivated custom solver for str.len in presence of "too concrete" tree substitutions, e.g., a str.len(<chars>) < 10 constraint with a <char><char> tree, where 2 is the only possible instantiation of str.len.

[1.10.0] - 2022-01-05

Changed

  • Resolves GitHub Issue #36, which in essence refers to solving multiple "conflicting" SMT formulas. These are formulas whose solutions change subtrees of substitutions in other SMT formulas. Before, we did not check that the chosen "priority" formulas are not themselves influenced by other formulas. See test case test_issue_36 in test_solver.py.
  • Resolves GitHub Issue #39. The count predicate now better handles grammars with epsilon productions: If the number of "needles" is already reached, it can "finish off" the remaining nonterminals that might be instantiated with needle nonterminals if alternative instantiations are available.
  • Adds the feature requested in GitHub Issue #38: one can now pass a start symbol (parameter start_symbol) to the ISLaSolver constructor as alternative to the (also optional) initial derivation tree.
  • Resolves GitHub Issue #40. The default value for formula in ISLaSolver is now None and no longer "true", with the same semantics. This means that you can still pass Formula objects and strings, but additionally None for the trivially valid constraint.

[1.9.9] - 2022-12-22

Changed

  • Improved DerivationTree.to_dot(): Now, nodes at same levels appear at the same levels, and the horizontal order of child nodes is retained.

[1.9.8] - 2022-12-20

Changed

  • Fixed "build" shield in README.md
  • Changed computation of symbol costs in solver; now computed statically from grammar graph, not using fuzzer.

[1.9.7] - 2022-12-16

Changed

  • Fix in performance evaluator; improves output, where an error (w/o consequences) was shown in case of an intentional timeout. Now, the solvers terminate silently after the set timeout.
  • Small performance improvements in establishing normal form, computing potential matches, and performing substitutions in SMT formulas.

[1.9.6] - 2022-12-15

Changed

  • Fixed hashing in call to solver.is_valid_combination
  • Performance improvements, mainly by avoiding replace_formula for top-level formulas (performing shallow replace instead) and re-using the same objects for atomic SMT formulas.

[1.9.5] - 2022-11-21

Changed

  • Bug fix in optimized solution of length constraints: (1) Preventing infinite loop for unsolvable constraints (e.g., zero length for non-nullable nonterminal), (2) preventing solver from trying to produce strings with negative lengths.

[1.9.4] - 2022-11-13

Changed

  • Can use Unicode symbols like "\x01" in the nonprintable range in BNF grammars (parsing and unparsing now works for these symbols).

[1.9.3] - 2022-11-10

Changed

  • Further improvements in input repair: Better ordering of abstracted trees. Now, all abstractions are generated at once (not lazily in a generator), which might impact performance; however, the results are ordered by tree size, i.e., less abstract trees come first, which is the expected behavior. Code is simplified.

[1.9.2] - 2022-11-10

Changed

  • Improved input repair:
    • Results are more local now, performance improvement for more complicated repairs
    • Fixed tree abstraction for the top (empty) path

[1.9.1] - 2022-11-08

Changed

  • Bug fix: Python extension files could not use local definitions before (e.g., helper functions); this is fixed now.
  • Buf fix: Epsilon productions are correctly processed in unparse_grammar.

[1.9.0] - 2022-11-08

Added

  • It is now possible to extend ISLa with custom structural or semantic predicates when using the CLI (this concerns isla solve, isla parse, isla check, isla fuzz, isla mutate, isla find, and isla mutate) by passing a Python file as input that contains a function

    predicates() -> typing.Set[isla.language.StructuralPredicate | isla.language.SemanticPredicate]

    Additionally or alternatively, that file may define a variable grammar or a function grammar() specifying the grammar of the language under test. This behavior, already implemented in previous ISLa versions (for the variable declaration, see "Changed"), is unaffected by the predicate extension feature.

Changed

  • In Python extension files, grammars may alternatively be defined as the return value of a function def grammar() -> Dict[str, List[str]] instead of an assignment to a variable grammar. This is the preferred style; the variable assignment may get deprecated in the future.

[1.8.0] - 2022-10-21

Added

  • Added isla find command to filter files passing syntactic & semantic constraints

Changed

  • isla create now prints out the created files.

[1.7.3] - 2022-10-13

Changed

  • Performance fix with most impact in instantiating large structures with the grammar fuzzer: More caching & propagation of the "is open" status of derivation trees.

[1.7.2] - 2022-10-13

Changed

  • Creation of zero-length trees for str.len(var) expressions, if var only occurs inside str.len applications and optimized Z3 queries are enabled, now works; before, such a tree was not found.

[1.7.1] - 2022-10-13

Changed

  • Asserting that creation of fixed-length trees (see comment to version 1.5.0) did work as expected; error message suggests disabling optimized queries or refining constraints.
  • The CLI captures all exceptions for isla solve and reports them to the command line rather than crashing ungracefully.

[1.7.0] - 2022-10-13

Added

  • CLI argument --tree for isla solve to produce JSON output (derivation trees) rather than "unparsed" strings.
  • The CLI commands check, repair, and mutate now also accept derivation trees in JSON format as inputs, circumventing the need for parsing if, e.g., piping an input produced by a grammar fuzzer to the checker.

[1.6.0] - 2022-10-12

Added

  • Command line argument -O which runs ISLa in optimized mode with deactivated assertions. Recommended if speed is an issue.

[1.5.0] - 2022-10-12

Added

  • Added special handling for str.len applications: For variables exclusively occurring inside str.len, we only ask Z3 for a solution to the numeric length, and then randomly create a string of that length. The ISLaSolver has a new option enable_optimized_z3_queries to disable this behavior. Future releases will also have a CLI option for that purpose.

Changed

  • Bug fix: Wrong precedence of multiplication/division and addition/subtraction in ISLa parser; x * y + z was parsed x * (y + z). This is now corrected.
  • Bug fix: Null bytes at the level of SMT expressions (\u{}) were not handled correctly in z3_helpers.evalute_z3_expression; instead of \x00, they were treated as the 4-character string \u{}.

[1.4.1] - 2022-10-06

Changed

  • Bug fix: Added forgotten requirement. The "toml" package needs to be installed to use ISLa 1.4.0.

[1.4.0] - 2022-10-06

Added

  • Configuration of default ISLa CLI options via .islarc files. The ISLa CLI searches for an .islarc files first in the current working directory, then in the user's home directory, and uses a bundled resource file as a fallback. All found configurations are merged; in case of conflicts, earlier found ones take precedence (i.e., the options specified in ./.islarc override all others). The file format for .islarc files is TOML.
  • Added isla config command to write the default configuration either to a specified output file or to the standard output. This comes handy when one is unsure about the available options and to start a complete custom configuration.

[1.3.3] - 2022-10-04

Changed

  • Solved packaging bug: Resource files for the isla create command were not packaged in installation packages; this is now fixed.

[1.3.2] - 2022-09-30

Changed

  • Bugfix in unparse_grammar related to escaping.
  • Changed standard cost weight vector in CLI (now defaults to isla.solver.STD_COST_SETTINGS.weight_vector).
  • Describing individual cost vector components in CLI help text.

[1.3.1] - 2022-09-30

Changed

  • Fixed bug in unparse_grammar: Backslashes are correctly escaped now.
  • Fixed bug in parsing of match expressions: Escaped symbols are not handled correctly.
  • Made order of terminals in REST_GRAMMAR deterministic.

[1.3.0] - 2022-09-27

Added

  • Added the CLI command isla mutate and solver method ISLaSolver.mutate for mutating inputs in a semantics-preserving way.

[1.2.0] - 2022-09-27

Changed

  • Added the option to pass multiple constraints via multiple --constraint command line parameters in the CLI.

[1.1.0] - 2022-09-27

Added

  • Added the CLI command isla repair and solver method ISLaSolver.repair for repairing inputs violating semantic constraints.

[1.0.1] - 2022-09-23

Changed

  • The evaluation function evaluate_legacy now better accounts for open derivation trees, issuing "unknown" in more cases. For example, if a universal quantifier matches possible expansions of an open derivation tree, we return "unknown;" similarly if an existential quantifier is not satisfied by any concrete match, but there are potential matches (of instantiations of open subtrees) left
  • Fix in can_extend_leaf_to_make_quantifier_match_parent, which sometimes did not satisfy its intended semantics of only returning True for potential matches, and not for current ones
  • Bug fix: Epsilon expansions in match expressions caused problems before (see test case "test_solve_config_grammar_leaddigit_equality" in "test_solver.py")

[1.0.0] - 2022-09-22

This first major ISLa release comes with a command line interface and a new solver API. Please note that the new API is incompatible with the 0.* versions, such that things can break if you upgrade to 1.0.0 from a previous version. Read the change list below to learn how you can adapt your code to work with 1.0.0.

Changed

  • The previous solver function ISLaSolver.solve() was removed
  • The function ISLaSolver.fuzz() was renamed to ISLaSolver.solve()
  • The signatures of the old ISLaSolver.fuzz() and the new ISLaSolver.solve() functions are different: The new function
    • only ever returns DerivationTree objects
    • raises a StopIteration if no (more) solutions could be found
    • raises a SolverTimeout exception if the solver timed out
  • The previous solver function ISLaSolver.evaluate() was renamed to ISLaSolver.check()
  • The new ISLaSolver.check() function returns a "normal" Boolean value (instead of a ThreeValuedTruth) and returns an UnknownResultError if truth cannot be determined (which should only occur in rare edge cases).
  • The function ISLaSolver.parse() now also checks for semantic validity. It
    • raises a SyntaxError if the input cannot be parsed with the grammar,
    • raises a SemanticError if the input can be parsed, but does not satisfy the constraint, and
    • returns a DerivationTree object otherwise.

[0.11.0] - 2022-09-22

Added

  • Added command line interface: Run isla -h after installing ISLa to obtain usage assistance
  • Integrated the black code style
  • Set up GitHub workflows checking black and the flake8 linter

Changed

  • Improved code quality, now satisfying flake8 checks
  • Fixes in octal_to_dec semantic predicate (relevant for TAR)
  • Bug fix in solver: Avoiding error if no k-paths are present due to too large k for (nonrecursive) grammar
  • Bug fix in ensure_unique_bound_variables: This function sometimes produced results that were not semantically equivalent
  • Can parse constraints such as forall <var> var: var = "a" or forall <var> var: var = "b", where different quantifiers bind the same variable. Raised a SyntaxError before.

[0.10.10] - 2022-09-01

Changed

  • Upgraded grammar_graph library, which now integrates better caching. Significant performance improvement.
  • Upgraded grammar_to_regex library, which incorporates a bug fix in expansing nonregular subgrammars.

[0.10.9] - 2022-09-01

Changed

  • Upgraded grammar_graph library, which now caches (k)-paths of subtrees. This gives a performance boost in particular for long trees with redundant subtrees like from the TAR case study.
  • Fixed the TAR parser from the TAR case study.

[0.10.8] - 2022-08-31

Changed

  • Passing bytes rather than str to the ijson library to address deprecation warning.
  • Increased version of required grammar_to_regex library; works better when expressing constraints on nonterminals with certain nonregular sub-languages.

[0.10.7] - 2022-08-31

Changed

  • Non-recursive length computation for derivation trees; prevents recursion depth errors for deep trees.

[0.10.6] - 2022-08-30

Changed

  • Bug fix in performance evaluator.

[0.10.5] - 2022-08-30

Changed

  • Upgraded grammar_graph library; includes performance fix.
  • Including ISLa 0.10.15 in Dockerfile.

[0.10.4] - 2022-08-30

Changed

  • Added solver parameter grammar_unwinding_threshold to customize the grammar unwinding depth for nonregular nonterminal grammars involved in SMT-LIB formulas.

[0.10.3] - 2022-08-29

Bugfix release for 0.10.2.

[0.10.2] - 2022-08-29

Changed

  • If multiple solutions to an SMT formula are required (see solver parameter max_number_smt_instantiations), we add a negated conjunction of the values for all variables in a previous solution; before, we only added a negation of the individual values, which sometimes resulted in fewer solutions than possible (see this GitHub issue).
  • Using iterative JSON deserialization in performance evaluators and derivation tree deserialization, works better for large inputs (e.g., TAR files).

[0.10.1] - 2022-08-26

Changed

  • Fixed a bug in computing more than one solution to an SMT-LIB formula. Before, it was pure coincidence if another solution dropped out.
  • Updated dependency to grammar_graph library, which had a major bug before which could render k-path coverage guidance useless.
  • Recalibrated weight vectors (including default weight vector) to account for the bug fix in the grammar graph library.

[0.10.0] - 2022-08-22

Changed

  • Major overhaul of cost computation. The solver finds much more diverse inputs now.

[0.9.6] - 2022-08-20

Changed

  • Bug fix in performance_evaluator, regression due to API change in ISLaSolver (return of TIMEOUT values)
  • Increased maximum recursion depth for evaluation; this is needed in the TAR case study.

[0.9.5] - 2022-08-19

Changed

  • Bug fix deserialization of isla.language.SMTFormula in conjunction with escaped quotation marks.

[0.9.4] - 2022-08-19

Changed

  • Bug fix in ISLaUnparser: Quotation marks in SMT-LIB string literals are escaped according to the ISLa language specification, using \".

[0.9.3] - 2022-08-19

Added

  • Added functions isla.solver.implies and isla.solver.equivalent for implication and equivalence checking.

Changed

  • Bug fix in unsatisfiability testing for existential quantifiers. Before, SAT was reported instead of UNSAT in certain cases.

[0.9.2] - 2022-08-19

Changed

  • Using Trie implementation that is more efficient in retrieving sub-tries. All trie functionality is now bundled in the isla.trie.SubtreesTrie class.

[0.9.1] - 2022-08-19

Changed

  • Re-introduced PEGParser to isla.parser
  • Using PEGParser first in bind expression parsing; make a huge performance difference for complex (PEG) grammars.

[0.9.0] - 2022-08-18

Added

  • With the solver flag activate_unsat_support set, an additional unsatisfiability check is executed for existential formulas arising during solving, which, in many cases, can make the solver terminate with an adequate ISLaSolver.UNSAT response. This flag should only be set if an unsatisfiable answer is expected or deemed possible, since the additional checks impact the solver performance. With this release, ISLa is approaching the state of being a solver/checker for both satisfiable and unsatisfiable problems.

[0.8.18] - 2022-08-18

Changed

  • The ISLaSolver.fuzz() and ISLaSolver.solver() methods now return either (a generator of) derivation trees or the special constants ISLaSolver.TIMEOUT or ISLaSolver.UNSAT for occurred timeouts or unsatisfiable problems.
  • Changes to a crucial solver function drastically improved performance.
  • Small bug fix in DerivationTree.

[0.8.17] - 2022-08-17

Added

  • Added a method ISLaSolver.fuzz() that produces one solution at each call, i.e., not a generator of solutions as returned by ISLaSolver.solver().

Changed

  • Better support for unsatisfiable formulas: States with unsatisfiable SMT-LIB atoms in their formulas are discarded early. Before, it could happen that the solver diverged. When checking unsatisfiable formulas with existential quantifiers, you have to set a timeout; the solver usually diverges (without producing an output) in these cases.
    To use this, pass the parameter activate_unsat_support=True to the solver.

[0.8.16] - 2022-08-16

Added

  • The descendant axis .. is now supported in XPath expressions, as in <xml-open-tag>.<id>..<id-char> = "a". Consult the ISLa Language Specification for more information.

Changed

  • Escaping in unparsing of grammars.

[0.8.15] - 2022-08-15

Added

  • We added an unparse_grammar function that produces a string in concrete BNF syntax from a Python grammar.

Changed

  • Consolidated match expression matching and the conversion of match expression to tree prefixes; now conforms to the state described in the ISLa Language Specification.
  • We removed the mandatory semicolons as line terminators from the concrete grammar syntax; they are not necessary for parsing in our syntax, even with multi-line expansions. Now, the syntax is identical to classical BNF. Adding semicolon line terminators remains possible, but is optional (and superfluous).

[0.8.14] - 2022-08-10

Added

  • Added parser for concrete BNF syntax. One can now call the ISLaSolver and the evaluate function with a string representation of the grammar, e.g.,

     <start> ::= <stmt> ;
     <stmt> ::= <assgn> | <assgn> " ; " <stmt> ;
     <assgn> ::= <var> " := " <rhs> ;
     <rhs> ::= <var> | <digit> ;
     <var> ::= "a" | "b" | "c" ;
     <digit> ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9" ; 
    

[0.8.13] - 2022-08-09

Changed

  • Bug fix in DerivationTree serialization.

[0.8.12] - 2022-08-09

Changed

  • More space-efficient serialization of DerivationTree objects.

[0.8.11] - 2022-08-09

Changed

  • Better serialization of DerivationTree objects.

[0.8.10] - 2022-08-09

Changed

  • Linking new Fuzzing Book tutorial on ISLa in README.md.
  • Added evaluation scripts.

[0.8.9] - 2022-08-08

Changed

  • Propositional combinators (and, or, ...) can not be used also on SMT level, i.e., in s-expression syntax.

[0.8.8] - 2022-08-08

Changed

  • Formulas like forall <a> in <start>: ... get now parsed like forall <a> in start: .... Before, another quantifier was added to close over <start>.
  • The count predicate now accepts a literal value as third argument, expected a variable before.
  • Bug fix in handling of XPath expressions where at least two share a prefix (test case test_length_prefixed_strings in test_concrete_syntax.py).
  • Bug fix in extracting strings from Z3 models: Now also handling null characters ("\u{}") correctly.

[0.8.7] - 2022-08-07

Added

  • isla.language.unparse_isla shortcut to ISLaUnparser for getting the textual representation of an ISLa formula.
  • Added ISLaSolver.evaluate(tree) method to quickly evaluate inputs based on an existing ISLaSolver object.

Changed

  • Bug fix: Indexes in XPath expressions (like <a>.<b>[2].<c>) work now. Remember: Counting starts from 1, which also is the default if you omit an index.
  • Bug fix: ISLa's unparser returns correct result for nullary SMT-LIB functions (e.g., re.all instead of (re.all)).
  • Bug fix: SMT-LIBs div and mod operators work now (also in infix syntax).
  • Added mod to evaluate_z3_expression shortcut (which does not require Z3 solver calls).

[0.8.6] - 2022-08-05

Added

  • Implemented direct_child predicate upon request.

[0.8.5] - 2022-08-04

Small bugfix release.

[0.8.4] - 2022-08-04

Changed

  • Updated build/test/install information in README.

[0.8.1], [0.8.2], [0.8.3] - 2022-08-03 to 2022-08-04

Small bugfix releases.

[0.8] - 2022-08-03

Added

  • Brief descriptions of the built-in predicates added to the README.md file.

Changed

  • Project package information was extended to prepare for publishing at PyPi.
  • Version jump to signal PyPi maturity :) We're planning for an 1.0 release soon!

[0.3] - 2022-08-03

Added

  • Cost computation is not outsourced to exchangeable CostComputer classes. The existing cost computation strategy is encapsulated in the class GrammarBasedBlackboxCostComputer.
  • ISLa now has a simplified syntax layer, which is translated to the core syntax (it's "syntactic sugar"):
    • Names in quantifiers may be omitted; quantified elements can be referred to by their nonterminal types (if unambiguous) in quantifier core.
    • "Free" nonterminals are permitted, and are universally quantified (and in <start>) as default.
    • The in ... part of quantifiers may be omitted and defaults to in <start>.
    • For SMT expressions, we support an infix (for binary operators) and prefix (for unary ones) syntax instead of the usual SMT-LIB S-Expressions.
    • Additionally to match expressions, we permit the descendant axis . (.. is not yet implemented, but planned) borrowed from the XPath abbreviated syntax (there, one writes / and //). For ., the specification of a position predicate [n], for $n=1,2,\dots$, is required, as in <id>.<char>[2] (the second <char> occurrence in an <id> nonterminal). If it is omitted, it defaults to [1].

Changed

  • Removed deprecated/unused "cost phase" feature; CostSettings now expect a single CostWeightVector and a value for k, exclusively.
  • Removed deprecated "vacuous penalty" cost feature from cost vector.
  • Performance fix in tree expansion: If there is a limit on the number of desired expansions, we don't compute all of them and choose a subset afterward, but only compute the desired number of (random) expansions.
  • Bug fix in tree expansion, more precisely in determining whether a nonterminal can be freely instantiated: If the nonterminal can be associated with a dummy variable in the match expression, we not correctly report that it is not bound by the quantifier and can be freely instantiated. This mitigates state explosion in certain cases.
  • Performance fix: Lazy computation of complex string arguments passed to the logger.
  • Removed all fuzzingbook dependencies. This significantly reduces the number of indirect dependencies that have to be installed.
  • Split dependencies into required, dev, and test dependencies. If you want to develop or test ISLa, use pip install -e .[dev,test] in your virtual environment.
  • Removed an assertion that made the solver fail for SMT constraints on non-regular nonterminals. For nonterminals with a context-free language, the regular expression passed to the SMT solver will always be an approximation, and it cannot be asserted that the languages of the subgrammar and the regex are identical.

[0.2b3] - 2022-06-03

Changed

  • Simplification in assignment language example in README.
  • Added README section about the new ISLa Docker image.

[0.2b2] - 2022-06-03

Added

  • More precise translation of ISLa formulas to SMT (evaluator.isla_to_smt_formula), implication (evaluator.implies) and equivalence (evaluator.equivalent) checks.

Changed

  • Bugfixes in translation of Z3 regexes to Python.
  • Z3 translation method evaluate_z3_expression can handle open variables and returns closures. Better caching opportunities.
  • Factored out Z3 helpers from isla.helpers to isla.z3_helpers.
  • z3.ExprRef.__eq__ is pointed to z3.AstRef.__eq__ when loading isla.language. The reason is that __eq__ is called, e.g., when requesting elements from hash maps, but the z3.ExprRef implementation creates a new z3.BoolRef instead of returning a bool. We perform a structural comparison instead. To construct z3.BoolRef equations, use isla.z3_helpers.z3_eq instead.
  • Performance optimization for semantic predicates.
  • Performance optimizations in match expression matching.
  • Cleaned up and relaxed requirements.

[0.2b1] - 2022-02-25

Added

  • Evaluation routine for concrete Z3 expressions without solver calls. Speedups of 50 % to 100 % in calls to helpers.is_valid(Formula).
  • Added a structural predicate nth(idx, tree, in_tree) which holds if tree is the idx-th occurrence of a tree with its nonterminal within in_tree. This is useful, e.g., in CSV files.

Changed

  • More efficient DerivationTree.replaceTree method.
  • The evaluator for concrete SMT-LIB expressions handles (str.to.int float_str) for strings representing floating point numbers by rounding them to Integers. This differs from the standard SMT-LIB/Z3 semantics, where the result is -1 for all strings that are not positive Integers. It is planned to also integrate this into the more general Z3-based evaluation and into the solver.

[0.2a2] - 2022-02-03

Added

  • Both the evaluator and the solver are passed default sets of predicates. Thus, unless some user-defined special predicates should be used, one does not need to pass the predicates to the solver/evaluator when generating from/checking a constraint in ISLa's concrete syntax (a string).

  • Added universal integer quantification method for a special case:

    forall int i:
      exists <?> elem in container: 
        not phi(elem, i) 
    

    is transformed to

    exists int i: (
      exists <?> elem in container: 
        phi(elem, i) and 
      exists <?> elem in container: 
        not phi(elem, i))
    

    if the predicate / formula phi exactly holds true for a single instantiation of i once elem (and potentially other parameters) have fixed values. In the code, we briefly sketch a correctness proof that this transformation is equivalence-preserving.

  • Constraint checking can consider preconditions. This is highly useful when eliminating existential quantifiers (esp. after re-insertion): They may not generally hold, but taking the already existing constraints into account.

  • Added forgotten mention of the renaming of isla.isla to isla.language to the changelog (this file), section 0.2a1.

  • It is now possible to create infinite streams of solutions from existential formulas. Before, the number of solutions was usually exactly one, unless one increased the number of free instantiations.

Changed

  • Factored out functions related to ISLa evaluations from isla.language to isla.evaluator. The previous isla.evaluator file was named isla.performance_evaluator. This change is also reflected in the organization of the tests in tests/.
  • Checking whether existential (integer / tree) quantifiers are already satisfied, taking into account existing constraints (assumptions).
  • Corrections in README.txt + clarifies in README that ISLa is a grammar-aware String constraint solver.

Removed

  • Removed "eval_results" folder. New evaluation scripts output SQLITE files.
  • The deprecated solver parameter "expand_after_existential_elimination" was removed. The solver now always expands in parallel to existential elimination / tree insertion, thus infinite solution streams for existential formulas become possible. In recent ISLa versions, this parameter was anyway ignored.

[0.2a1] - 2022-02-01

Added

  • Support for universal and existential quantifiers over integers: forall int x: ... and exists int x: .... Existential quantification replaces the previous num x: ... syntax, universal quantification is new. ISLa is now also closed under negation for numeric quantifiers. The ISLa solver uses an incomplete approach to instantiate universal quantifiers over integers, depending on SMT formula (and possibly semantic predicates) on top level inside the inner formula.
  • ISLa now has its own, performance-optimized grammar fuzzer (forked from the Fuzzing Book).

Changed

  • The ISLa concrete syntax was changed. In the new syntax, there is not variable definition block (vars) and the constant declaration is now optional (only needed if the top-level constant is not named "start" and has the nonterminal type <start>). Instead, all types are directly declared inside the constraint: E.g., forall <expr> expr="{<sum> sum} + {<factor> factor}": .... The constraint keyword was also removed; an ISLa specification either consists of a constant declaration followed by an ISLa formula, or only of an ISLa formula.
  • The ISLa source is now organized with a so-called "src layout".
  • The package isla.isla (i.e., the isla.py file in the isla package) was renamed to isla.language. Thus, you have to replace all from isla.isla import ... import statements (or similar) by from isla.language import ....
  • The existing language formalizations (e.g., Scriptsize-C and TAR) were moved into a package isla_formalizations under the src directory. Tests are now in the top-level directory tests, the evaluation scripts for the formalizations in the top-level evaluations directory. The demo file xml_demo.py moved to inside the tests directory. You can still run tox in the top-level directory to run tests. For running tests without tox, you might have to do a pip install -e . for an editable installation of ISLa (preferrably within a virtual environment).
  • Switched to a PEP517-based build system. You can now run python3 -m build from inside the virtual environment to run a build. All settings are now in setup.cfg, the setup.py file is only there for backward compatibility and to make the "setuptools-antlr" package work (python3 setup.py antlr). Only ISLa developers need this.
  • Various bug fixes, enhancements and new safety assertions related to existential quantifier solving (both instantiation and tree insertion).

Removed

  • The outdated formalizations for tiny-C and minipy were removed.