
2015 006 Additional string conversion functionality

John Reppy edited this page Aug 16, 2015 · 1 revision

Proposal 2015-006

Additional string conversion functionality

Author: Andreas Rossberg
Last revised: August 16, 2015
Status: proposed
Discussion: issue #7


Synopsis

signature CHAR =
  sig
    ...
    val isBinDigit : char -> bool
    val isOctDigit : char -> bool

    val scanC : (char,'a) StringCvt.reader -> (char,'a) StringCvt.reader
  end
signature STRING =
  sig
    ...
    val scan : (char,'a) StringCvt.reader -> (string,'a) StringCvt.reader
    val scanC : (char,'a) StringCvt.reader -> (string,'a) StringCvt.reader
  end
signature INTEGER =
  sig
    ...
    val scan : StringCvt.radix -> (char,'a) StringCvt.reader -> (int,'a) StringCvt.reader
    val fromString : string -> int option
  end
signature WORD =
  sig
    ...
    val scan : StringCvt.radix -> (char,'a) StringCvt.reader -> (word,'a) StringCvt.reader
    val fromString : string -> word option
  end

Description

CHAR

  • isBinDigit c
    returns true iff c is a binary digit (i.e., 0 or 1).

  • isOctDigit c
    returns true iff c is an octal digit (i.e., 0 to 7).

  • scanC getc strm
    scans a character (including space) or a C escape sequence representing a character from the prefix of a character stream. It is similar to scan, except that it uses C escape conventions, as fromCString does.
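
The two predicates are simple enough to sketch directly. The definitions below are stand-alone illustrations of the behaviour described above, not the text of any implementation; under the proposal they would be members of CHAR.

```sml
(* Illustrative stand-alone definitions; the proposal places these in CHAR. *)

(* true iff c is #"0" or #"1" *)
fun isBinDigit (c : char) : bool = c = #"0" orelse c = #"1"

(* true iff c is in the range #"0" .. #"7" *)
fun isOctDigit (c : char) : bool = #"0" <= c andalso c <= #"7"
```

Both mirror the existing Char.isDigit and Char.isHexDigit, which cover the decimal and hexadecimal cases.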

STRING

  • scan getc strm
  • scanC getc strm
    scan a string from a character stream as it would appear in an SML source program (scan) or a C source program (scanC), converting escape sequences into the appropriate characters. These functions are similar to fromString and fromCString, respectively, but can convert from arbitrary streams.
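
One way to picture a stream-based string scanner is as repeated application of the existing Char.scan. The sketch below is a simplification under that assumption: scanStringLit is a hypothetical name, and unlike fromString it always succeeds, returning the empty string when nothing can be converted.

```sml
(* Sketch only: collect characters with Char.scan until it fails.
   scanStringLit is a hypothetical name, not part of the proposal.
   Simplification: the result is always SOME, even for an empty prefix. *)
fun scanStringLit (getc : (char, 'a) StringCvt.reader) (strm : 'a)
    : (string * 'a) option =
  let
    fun loop (acc, s) =
      case Char.scan getc s of
          SOME (c, s') => loop (c :: acc, s')
        | NONE => SOME (String.implode (rev acc), s)
  in
    loop ([], strm)
  end

(* The input holds the four characters a, backslash, t, b;
   the escape sequence is converted to a single tab character. *)
val unescaped = StringCvt.scanString scanStringLit "a\\tb"
```

Because the scanner takes an arbitrary reader, the same function works on substrings, lists of characters, or input streams, which is the point of offering scan alongside fromString.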

INTEGER

  • fromString s
  • scan getc strm
    behave as in the current Basis Library, except that underscores may appear among the digits as separators; at least one digit is still required. The scan function thus accepts the following formats:
StringCvt.BIN   [+~-]?[0-1_]*[0-1][0-1_]*
StringCvt.OCT   [+~-]?[0-7_]*[0-7][0-7_]*
StringCvt.DEC   [+~-]?[0-9_]*[0-9][0-9_]*
StringCvt.HEX   [+~-]?(0x|0X)?[0-9a-fA-F_]*[0-9a-fA-F][0-9a-fA-F_]*
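
To make the DEC format concrete, here is a minimal sketch of a scanner that accepts it. The name scanDecInt is hypothetical, and the sketch deliberately omits the leading-whitespace skipping and overflow checks that Int.scan performs.

```sml
(* Sketch of the proposed DEC format [+~-]?[0-9_]*[0-9][0-9_]*.
   scanDecInt is a hypothetical helper; whitespace skipping and
   overflow handling are left out. *)
fun scanDecInt (getc : (char, 'a) StringCvt.reader) (strm : 'a)
    : (int * 'a) option =
  let
    (* optional sign: +, ~ or - *)
    val (neg, s0) =
      case getc strm of
          SOME (#"+", s) => (false, s)
        | SOME (#"~", s) => (true, s)
        | SOME (#"-", s) => (true, s)
        | _ => (false, strm)
    (* consume digits and underscores, remembering whether a digit was seen *)
    fun loop (acc, seen, s) =
      case getc s of
          SOME (#"_", s') => loop (acc, seen, s')
        | SOME (c, s') =>
            if Char.isDigit c
            then loop (acc * 10 + (Char.ord c - Char.ord #"0"), true, s')
            else (acc, seen, s)
        | NONE => (acc, seen, s)
    val (n, seen, rest) = loop (0, false, s0)
  in
    (* the format requires at least one digit *)
    if seen then SOME (if neg then ~n else n, rest) else NONE
  end

val million  = StringCvt.scanString scanDecInt "1_000_000"  (* SOME 1000000 *)
val negative = StringCvt.scanString scanDecInt "~4_2"       (* SOME ~42 *)
val noDigits = StringCvt.scanString scanDecInt "__"         (* NONE *)
```

Note that the formats as written also permit leading and trailing underscores, which is why the loop treats an underscore the same way wherever it occurs.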

WORD

  • fromString s
  • scan getc strm
    behave as in the current Basis Library, except that underscores may appear among the digits as separators; at least one digit is still required. The scan function thus accepts the following formats:
StringCvt.BIN   (0w)?[0-1_]*[0-1][0-1_]*
StringCvt.OCT   (0w)?[0-7_]*[0-7][0-7_]*
StringCvt.DEC   (0w)?[0-9_]*[0-9][0-9_]*
StringCvt.HEX   (0wx|0wX|0x|0X)?[0-9a-fA-F_]*[0-9a-fA-F][0-9a-fA-F_]*

Rationale

The functions added to CHAR and STRING fill a few holes and remove some asymmetries in the current library.

The changes to the integer and word scanners bring them in line with the more liberal literal syntax proposed for Successor ML.

Impact

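The change to the integer and word scanning functions might affect existing programs, because the functions would accept input that they currently reject or consume only in part: an underscore, which today ends the scan, would be read as part of the number. However, it seems unlikely that this breaks programs in practice.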
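
A small example of the kind of input whose treatment would change. It assumes an implementation that follows the current Basis specification; the second result is what the DEC format in this proposal implies, not an observed output.

```sml
(* Under the current Basis specification the scan stops at the first
   underscore, so this evaluates to SOME 1 and "_000" is ignored.
   Under the DEC format proposed here the same call would yield SOME 1000. *)
val parsed = Int.fromString "1_000"
```

A program is affected only if it relies on the scan stopping at an underscore, for example by reading the remainder of the stream as a separate field.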


History

  • [2015-08-16] Proposed
