STRING signature
signature STRING
structure String :> STRING
where type string = string
where type string = CharVector.vector
where type char = Char.char
structure WideString :> STRING (* OPTIONAL *)
where type string = WideCharVector.vector
where type char = WideChar.char
The STRING signature specifies the basic operations on a string type, which is a vector of the underlying character type char as defined in the structure.
The STRING signature is matched by two structures, the required String and the optional WideString. The former implements strings based on the extended ASCII 8-bit characters, and is a companion structure to the Char structure. The latter provides strings of characters of some size greater than or equal to 8 bits, and is related to the structure WideChar. In particular, the type String.char is identical to the type Char.char and, when WideString is defined, the type WideString.char is identical to the type WideChar.char. These connections are made explicit in the Text and WideText structures, which match the TEXT signature.
eqtype string
eqtype char
val maxSize : int
val size : string -> int
val sub : string * int -> char
val extract : string * int * int option -> string
val substring : string * int * int -> string
val ^ : string * string -> string
val concat : string list -> string
val concatWith : string -> string list -> string
val str : char -> string
val implode : char list -> string
val explode : string -> char list
val map : (char -> char) -> string -> string
val translate : (char -> string) -> string -> string
val tokens : (char -> bool) -> string -> string list
val fields : (char -> bool) -> string -> string list
val isPrefix : string -> string -> bool
val isSubstring : string -> string -> bool
val isSuffix : string -> string -> bool
val compare : string * string -> order
val collate : (char * char -> order)
-> string * string -> order
val < : string * string -> bool
val <= : string * string -> bool
val > : string * string -> bool
val >= : string * string -> bool
val toString : string -> String.string
val scan : (char, 'a) StringCvt.reader
-> (string, 'a) StringCvt.reader
val fromString : String.string -> string option
val toCString : string -> String.string
val fromCString : String.string -> string option
val maxSize : int
size s
sub (s, i)
Raises Subscript if i < 0 or |s| <= i.
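As a minimal sketch of size and sub, assuming the required String structure (the value names n, c, and r are illustrative only):

```sml
(* Number of characters in the string. *)
val n = String.size "abc"                 (* 3 *)

(* Indexing counts from zero. *)
val c = String.sub ("abc", 1)             (* #"b" *)

(* Index 3 is out of range for a string of size 3, so Subscript is raised. *)
val r = SOME (String.sub ("abc", 3)) handle Subscript => NONE   (* NONE *)
```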
extract (s, i, NONE)
extract (s, i, SOME j)
substring (s, i, j)
The first form returns the substring of s from index i to the end of the string; it raises Subscript if i < 0 or |s| < i. The second form returns the substring of size j starting at index i, i.e., the string s[i..i+j-1]. It raises Subscript if i < 0 or j < 0 or |s| < i + j. Note that, if defined, extract returns the empty string when i = |s|.
The third form returns the substring s[i..i+j-1], i.e., the substring of size j starting at index i. This is equivalent to extract(s, i, SOME j).
Implementation note:
Implementations of these functions must perform bounds checking in such a way that the Overflow exception is not raised.
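The three forms can be sketched as follows, assuming the String structure; the sample string and the value names are illustrative only:

```sml
val s = "hello world"                                   (* size 11 *)

val tail  = String.extract (s, 6, NONE)                 (* "world" *)
val head  = String.extract (s, 0, SOME 5)               (* "hello" *)
val same  = String.substring (s, 0, 5)                  (* "hello" *)

(* i = |s| is allowed and yields the empty string. *)
val empty = String.extract (s, String.size s, NONE)     (* "" *)

(* 8 + 10 exceeds the size of s, so Subscript is raised. *)
val safe = SOME (String.substring (s, 8, 10)) handle Subscript => NONE
```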
s ^ t
Raises Size if |s| + |t| > maxSize.
concat l
Raises Size if the sum of all the sizes is greater than maxSize.
concatWith s l
Raises Size if the size of the resulting string would be greater than maxSize.
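A short sketch of the three concatenation functions, assuming the String structure (value names are illustrative):

```sml
val a = "foo" ^ "bar"                            (* "foobar" *)
val b = String.concat ["a", "b", "c"]            (* "abc" *)

(* The separator goes between elements only, never at either end. *)
val c = String.concatWith ", " ["x", "y", "z"]   (* "x, y, z" *)
val d = String.concatWith ", " []                (* "" *)
```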
str c
implode l
Equivalent to concat (List.map str l). This raises Size if the resulting string would have size greater than maxSize.
explode s
map f s
Equivalent to implode(List.map f (explode s)).
translate f s
Equivalent to concat(List.map f (explode s)).
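The stated equivalences can be checked with a small sketch, assuming the String structure; the helper doubleA is hypothetical and exists only for this illustration:

```sml
val up = String.map Char.toUpper "abc"                  (* "ABC" *)
val up' = String.implode (List.map Char.toUpper (String.explode "abc"))

(* Hypothetical helper: doubles every #"a", leaves other characters alone. *)
fun doubleA c = if c = #"a" then "aa" else String.str c

val tr = String.translate doubleA "banana"              (* "baanaanaa" *)
val tr' = String.concat (List.map doubleA (String.explode "banana"))
```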
tokens f s
fields f s
Two tokens may be separated by more than one delimiter, whereas two fields are separated by exactly one delimiter. For example, if the only delimiter is the character #"|", then the string "|abc||def" contains two tokens "abc" and "def", whereas it contains the four fields "", "abc", "" and "def".
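The example above can be reproduced directly, assuming the String structure; the predicate name isBar is illustrative:

```sml
(* The only delimiter is the character #"|". *)
fun isBar c = (c = #"|")

val toks = String.tokens isBar "|abc||def"   (* ["abc", "def"] *)
val flds = String.fields isBar "|abc||def"   (* ["", "abc", "", "def"] *)
```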
isPrefix s1 s2
isSubstring s1 s2
isSuffix s1 s2
These functions return true if the string s1 is a prefix, substring, or suffix (respectively) of the string s2. Note that the empty string is a prefix, substring, and suffix of any string, and that a string is a prefix, substring, and suffix of itself.
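A brief sketch of the three predicates, assuming the String structure (value names are illustrative). Note the argument order: the candidate s1 comes first.

```sml
val p = String.isPrefix    "ab" "abc"    (* true *)
val m = String.isSubstring "b"  "abc"    (* true *)
val x = String.isSuffix    "bc" "abc"    (* true *)

(* The empty string and the string itself always qualify. *)
val e = String.isPrefix "" "abc"         (* true *)
val i = String.isSuffix "abc" "abc"      (* true *)

(* A longer string is never a prefix of a shorter one. *)
val n = String.isPrefix "abc" "ab"       (* false *)
```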
compare (s, t)
Performs a lexicographic comparison of the two strings using the ordering Char.compare on the characters. It returns LESS, EQUAL, or GREATER, if s is less than, equal to, or greater than t, respectively.
collate f (s, t)
val < : string * string -> bool
val <= : string * string -> bool
val > : string * string -> bool
val >= : string * string -> bool
These functions compare two strings lexicographically, using the underlying ordering on the char type.
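A sketch of the comparison functions, assuming the String structure; the case-insensitive ordering ciOrder is a hypothetical helper written for this illustration, not part of the signature:

```sml
val c1 = String.compare ("abc", "abd")            (* LESS *)
val c2 = String.compare ("abc", "ab")             (* GREATER: longer string with equal prefix *)

(* Hypothetical ordering that ignores case. *)
fun ciOrder (a, b) = Char.compare (Char.toLower a, Char.toLower b)
val c3 = String.collate ciOrder ("ABC", "abc")    (* EQUAL *)

(* Under the default ordering, #"Z" precedes #"a". *)
val lt = String.< ("Zebra", "apple")              (* true *)
```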
toString s
scan getc strm
fromString s
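These functions stop scanning when they reach the end of the source or a non-printing character (i.e., one not satisfying isPrint), or if they encounter an improper escape sequence. fromString ignores the remaining characters, while scan returns the remaining characters as the rest of the stream.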
The function fromString is equivalent to StringCvt.scanString scan.
If no conversion is possible, e.g., if the first character is non-printable or begins an illegal escape sequence, NONE is returned. Note, however, that fromString "" returns SOME("").
For more information on the allowed escape sequences, see the entry for CHAR.fromString. SML source also allows escaped formatting sequences, which are ignored during conversion. The rule is that if any prefix of the input is successfully scanned, including an escaped formatting sequence, the functions return some string. They only return NONE in the case where the prefix of the input cannot be scanned at all. Here are some sample conversions:
| Input string s | fromString s |
|---|---|
| "\\q" | NONE |
| "a\^D" | SOME "a" |
| "a\\ \\\\q" | SOME "a" |
| "\\ \\" | SOME "" |
| "" | SOME "" |
| "\\ \\\^D" | SOME "" |
| "\\ a" | NONE |
Implementation note:
Because of the special cases, such as fromString "" = SOME "", fromString "\\ \\\^D" = SOME "", and fromString "\^D" = NONE, the functions cannot be implemented as a simple iterative application of CHAR.scan.
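A few of the conversions described above can be sketched as follows, assuming the String and StringCvt structures (value names are illustrative). Inside SML source, "\\" denotes a single backslash character, so "a\\nb" is the four-character string a, backslash, n, b.

```sml
(* toString replaces the newline character with the escape sequence \n. *)
val t  = String.toString "a\nb"                       (* "a\\nb" *)

(* fromString reverses that escaping. *)
val f1 = String.fromString "a\\nb"                    (* SOME "a\nb" *)

(* Special case: the empty input converts to the empty string. *)
val f2 = String.fromString ""                         (* SOME "" *)

(* \q is not a legal escape and nothing precedes it, so no prefix scans. *)
val f3 = String.fromString "\\q"                      (* NONE *)

(* The stated equivalence with StringCvt.scanString scan. *)
val f4 = StringCvt.scanString String.scan "a\\nb"     (* SOME "a\nb" *)
```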
toCString s
fromCString s
The semantics are identical to fromString above, except that C escape sequences are used (see ISO C standard ISO/IEC 9899:1990).
For more information on the allowed escape sequences, see the entry for CHAR.fromCString. Note that fromCString accepts an unescaped single quote character, but does not accept an unescaped double quote character.
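A minimal sketch of the C-style conversions, assuming the String structure (value names are illustrative):

```sml
(* The tab character becomes the two-character C escape \t. *)
val c1 = String.toCString "a\tb"        (* "a\\tb" *)

(* fromCString converts the escape back to a tab. *)
val c2 = String.fromCString "a\\tb"     (* SOME "a\tb" *)

(* An unescaped single quote is accepted. *)
val c3 = String.fromCString "it's"      (* SOME "it's" *)
```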
See also: CHAR, CharArray, CharVector, StringCvt, SUBSTRING, TEXT, WideCharVector
Generated April 12, 2004
Last Modified October 17, 2000
Comments to John Reppy.
This document may be distributed freely over the internet as long as the copyright notice and license terms below are prominently displayed within every machine-readable copy.
Copyright © 2004 AT&T and Lucent Technologies. All rights reserved.
Permission is granted for internet users to make one paper copy for their
own personal use. Further hardcopy reproduction is strictly prohibited.
Permission to distribute the HTML document electronically on any medium
other than the internet must be requested from the copyright holders by
contacting the editors.
Printed versions of the SML Basis Manual are available from Cambridge
University Press.
To order, please visit
www.cup.org (North America) or
www.cup.cam.ac.uk (outside North America).