Method sscanf()
- Method
sscanf
-
int sscanf(string data, string format, mixed ... lvalues)
- Description
-
sscanf matches the string data against a format string and places the
matching results into a list of variables. The lvalues are destructively
modified (which is only possible because sscanf really is an opcode,
rather than a Pike function) with the values extracted from the data
according to the format specification. Only the variables up to the last
matching directive of the format string are touched.
The format string can contain strings separated by special matching
directives like %d, %s, %c and %f. Every such directive corresponds to
one of the lvalues, in the order they are listed. An lvalue is anything
that can be assigned to: the name of a variable, or an index in an
array, mapping or object. It is because of these lvalues that sscanf
cannot be implemented as a normal function.
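As a minimal sketch of the lvalue kinds described above, the following assigns into a local variable, an array slot and a mapping entry in one call. The function name lvalue_demo and the variable names are illustrative only and do not come from the manual.

```pike
// Illustrative sketch; lvalue_demo, count, parts and info are made-up names.
array lvalue_demo()
{
  int count;                  // a plain local variable
  array parts = allocate(1);  // parts[0] is an array-index lvalue
  mapping info = ([]);        // info["answer"] is a mapping-index lvalue

  // Each directive fills the lvalue in the same position.
  int matched = sscanf("3 apples 42", "%d %s %d",
                       count, parts[0], info["answer"]);
  return ({ matched, count, parts[0], info["answer"] });
}

int main()
{
  write("%O\n", lvalue_demo());
  return 0;
}
```

Under these assumptions count should end up as 3, parts[0] as "apples" and info["answer"] as 42.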
Whenever a percent character is found in the format string, a match is
performed, according to which operator and modifiers follow it:
"%b"          | Reads a binary integer ("0101" makes 5).
"%d"          | Reads a decimal integer ("0101" makes 101).
"%o"          | Reads an octal integer ("0101" makes 65).
"%x"          | Reads a hexadecimal integer ("0101" makes 257).
"%D"          | Reads an integer that is either octal (leading zero),
              | hexadecimal (leading 0x) or decimal ("0101" makes 65).
"%c"          | Reads one character and returns it as an integer
              | ("0101" makes 48, or '0', leaving "101" for later
              | directives). Using the field width and endianness
              | modifiers, you can decode integers of any size and
              | endianness. For example, "%-2c" decodes "0101" into
              | 12592, leaving "01" for later directives. The sign
              | modifier can be used to change the signedness of the
              | data, making "%+1c" decode "ä" into -28.
"%f"          | Reads a float ("0101" makes 101.0).
"%F"          | Reads a float encoded according to the IEEE single
              | precision binary format ("0101" makes 6.45e-10,
              | approximately). Given a field width modifier of 8 (4 is
              | the default), the data will be decoded according to the
              | IEEE double precision binary format instead. (You will,
              | however, still get a float, unless your Pike was compiled
              | with the configure argument --with-double-precision.)
"%s"          | Reads a string. If followed by %d, %s will only read
              | non-numerical characters. If followed by a %[], %s will
              | only read characters not present in the set. If followed
              | by normal text, %s will match all characters up to but not
              | including the first occurrence of that text.
"%[set]"      | Matches a string containing a given set of characters
              | (those given inside the brackets). %[^set] means any
              | character except those inside the brackets. Ranges of
              | characters can be defined by using a minus character
              | between the first and the last character to be included
              | in the range. Example: %[0-9H] means any digit or 'H'.
              | Note that sets that include the character - must have it
              | first in the brackets to avoid having a range defined.
              | Sets including the character ']' must list it first
              | (even before -), since it would otherwise close the set.
"%{format%}"  | Repeatedly matches 'format' as many times as possible and
              | assigns an array of arrays with the results to the lvalue.
"%O"          | Matches a Pike constant, such as a string or an integer
              | (currently only integer, string and character constants
              | are functional).
"%%"          | Matches a single percent character (this is how you quote
              | the % character so that it is matched literally instead
              | of starting a directive).
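The "0101" examples from the table can be checked with a short script. This is only a sketch: the helper names decode_all and parse_pairs, and the "a=1,b=22," input, are assumptions made for illustration.

```pike
// Sketch only; decode_all and parse_pairs are hypothetical helper names.
mapping decode_all(string data)
{
  int as_bin, as_dec, as_oct, as_hex, as_auto, as_char;
  sscanf(data, "%b", as_bin);   // binary
  sscanf(data, "%d", as_dec);   // decimal
  sscanf(data, "%o", as_oct);   // octal
  sscanf(data, "%x", as_hex);   // hexadecimal
  sscanf(data, "%D", as_auto);  // base chosen from the prefix
  sscanf(data, "%c", as_char);  // first character as an integer
  return ([ "b": as_bin, "d": as_dec, "o": as_oct,
            "x": as_hex, "D": as_auto, "c": as_char ]);
}

// %{...%} collects one inner array per repetition of the inner format.
array parse_pairs(string data)
{
  array pairs;
  sscanf(data, "%{%s=%d,%}", pairs);
  return pairs;
}

int main()
{
  write("%O\n", decode_all("0101"));
  write("%O\n", parse_pairs("a=1,b=22,"));
  return 0;
}
```

For "0101" the mapping should hold the values listed in the table (5, 101, 65, 257, 65 and 48), and parse_pairs should yield one ({ name, number }) array per "name=number," group.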
Similar to sprintf, you may supply modifiers between the % character
and the operator, to slightly change its behaviour from the default:
"*"           | The operator will only match its argument, without
              | assigning any variable.
number        | You may define a field width by supplying a numeric
              | modifier. The directive then matches exactly that number
              | of characters in the input data, whether they are read as
              | a string, an integer or something else ("0101" using the
              | format %2c would read an unsigned short 12337, leaving
              | the final "01" for later operators, for instance).
"-"           | Supplying a minus sign switches the decoding to
              | little-endian byte order, rather than the default network
              | (big-endian) byte order.
"+"           | Interpret the data as a signed entity. In other words,
              | "%+1c" will read "\xFF" as -1 instead of 255, as "%1c"
              | would have.
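The modifiers can be combined with the operators as in the following sketch, which reuses the inputs quoted in the two tables. The function name modifier_demo and its variable names are illustrative assumptions.

```pike
// Sketch only; modifier_demo is a hypothetical helper name.
array modifier_demo()
{
  int big, little, signed_byte, unsigned_byte;
  string rest, word;

  sscanf("0101", "%2c%s", big, rest);     // width 2, big-endian (default)
  sscanf("0101", "%-2c", little);         // width 2, little-endian
  sscanf("\xFF", "%+1c", signed_byte);    // one byte, signed
  sscanf("\xFF", "%1c", unsigned_byte);   // one byte, unsigned
  sscanf(" \t test", "%*[ \t]%s", word);  // skip leading blanks, keep the rest

  return ({ big, rest, little, signed_byte, unsigned_byte, word });
}

int main()
{
  write("%O\n", modifier_demo());
  return 0;
}
```

According to the tables, the results should be 12337 with "01" left over, 12592, -1, 255 and "test".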
- Note
-
sscanf does not use backtracking. It simply looks at the format string
up to the next % and tries to match that with the string, then proceeds
to the next part. If a part does not match, sscanf immediately returns
the number of % directives matched so far. In that case, the lvalues for
the directives that were not matched are left unchanged.
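A small sketch of this early-return behaviour: the literal text " pears" fails to match, so the second lvalue keeps its initial value. The function name partial_match and the sentinel values -1 and "unset" are assumptions chosen for illustration.

```pike
// Sketch only; partial_match and the sentinel values are illustrative.
array partial_match(string data)
{
  int qty = -1;
  string unit = "unset";

  // %d is matched first; if the literal " pears" then fails,
  // sscanf returns immediately and unit is never assigned.
  int matched = sscanf(data, "%d pears%s", qty, unit);
  return ({ matched, qty, unit });
}

int main()
{
  write("%O\n", partial_match("12 apples"));  // literal text does not match
  write("%O\n", partial_match("12 pears!"));  // every part matches
  return 0;
}
```

For "12 apples" only qty is expected to change, with a return value of 1; for "12 pears!" both lvalues should be assigned.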
- Example
-
// a will be assigned "oo" and 1 will be returned
sscanf("foo", "f%s", a);
// a will be 4711 and b will be "bar", 2 will be returned
sscanf("4711bar", "%d%s", a, b);
// a will be 4711, 2 will be returned
sscanf("bar4711foo", "%*s%d", a);
// a will become "test", 2 will be returned
sscanf(" \t test", "%*[ \t]%s", a);
// Remove "the " from the beginning of a string
// If 'str' does not begin with "the " it will not be changed
sscanf(str, "the %s", str);
// It is also possible to declare a variable directly in the sscanf call;
// another reason for sscanf not to be an ordinary function:
sscanf("abc def", "%s %s", string a, string b);
- Returns
-
The number of directives matched in the format string. Note that a string
directive (%s or %[]) counts as a match even when it matches just the
empty string (which either directive may do).
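Because the return value counts matched directives, a common pattern is to compare it with the number of directives you expect before trusting the lvalues. The sketch below assumes hypothetical helpers empty_match and parse_version; neither is part of Pike.

```pike
// Sketch only; empty_match and parse_version are hypothetical helpers.
array empty_match()
{
  string tail = "unset";
  // %s has nothing left to read, but it still counts as a match.
  int matched = sscanf("abc", "abc%s", tail);
  return ({ matched, tail });
}

// Returns ({ major, minor }) or 0 when the input is not "<int>.<int>".
array|int parse_version(string s)
{
  int major, minor;
  if (sscanf(s, "%d.%d", major, minor) != 2)
    return 0;
  return ({ major, minor });
}

int main()
{
  write("%O\n", empty_match());
  write("%O\n", parse_version("7.8"));
  write("%O\n", parse_version("7"));
  return 0;
}
```

Here empty_match should report one match with an empty string, and parse_version should reject "7" because only one of the two directives matches.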
- See also
-
sprintf, array_sscanf