Reference: TT (token trees)
General
TTs are a form of generic tree structure for data storage and manipulation.
They are a fundamental building block in Flux.
Properties overview
- Can have one parent, or none, in which case it's a root node.
- Can have up to (2^32)-1 direct children.
- Ordered relative to its siblings. Knows about the two siblings
appearing directly prior and next to itself, if any.
- Can have any form of data, up to (2^32)-1 bytes, associated with it.
- Homogeneous; any node in a tree represents a tree in itself, and can be
referenced as such.
- Memory requirements: Without data, 32 bytes per node on 32-bit
architectures. 56 bytes per node on 64-bit architectures.
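To make the properties above concrete, here is a minimal sketch of a node layout that would satisfy them. It is not the actual Flux definition: demo_node and every field name are hypothetical. With 4-byte pointers this particular layout comes to 32 bytes, and with 8-byte pointers to 56, which happens to agree with the figures quoted above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the real TT struct: five links, a data
 * pointer, a 32-bit data size and a 32-bit flag word. */
typedef struct demo_node {
    struct demo_node *parent;       /* NULL for a root node */
    struct demo_node *prev;         /* previous sibling, or NULL */
    struct demo_node *next;         /* next sibling, or NULL */
    struct demo_node *first_child;  /* NULL for a leaf node */
    struct demo_node *last_child;
    void *data;                     /* associated data, or NULL */
    uint32_t size;                  /* data size, at most 2^32 - 1 bytes */
    uint32_t flags;                 /* status bits (assumed) */
} demo_node;

/* Per-node overhead of this sketch, excluding the data itself. */
size_t demo_node_overhead(void)
{
    return sizeof(demo_node);
}
```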
Casts
- TT(x)
Casts any kind of tree node to a TT.
Iterators
- TT_FOR_EACH(TT *root, TT *child) statements
Iterates over the direct children of given root
node, using child to store the per-iteration pointer.
Used like a for statement.
- TT_FOR_ALL(TT *root, TT *child) statements
Iterates over the entire subtree of root in infix order,
storing the per-iteration pointer in child. Used like a
for statement.
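The definitions of these iterators are not given here. As a rough sketch, a macro that is "used like a for statement" can be a thin wrapper around one. demo_node, DEMO_FOR_EACH and the helpers below are hypothetical stand-ins, not Flux code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not the Flux TT type. */
typedef struct demo_node {
    struct demo_node *parent, *prev, *next, *first, *last;
} demo_node;

/* Expands to a plain for header, so a statement or block may follow it. */
#define DEMO_FOR_EACH(root, child) \
    for ((child) = (root)->first; (child) != NULL; (child) = (child)->next)

/* Appends an unconnected node to the end of parent's child list. */
void demo_add_last(demo_node *parent, demo_node *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->prev = parent->last;
    child->next = NULL;
    if (parent->last != NULL)
        parent->last->next = child;
    else
        parent->first = child;
    parent->last = child;
}

/* Counts direct children by iterating with the macro. */
size_t demo_count_children(demo_node *root)
{
    demo_node *child;
    size_t n = 0;

    DEMO_FOR_EACH(root, child)
        n++;
    return n;
}
```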
Allocation
- TT *tt_new();
Allocates an empty, unconnected node from memory and returns its pointer.
- TT *tt_new_with_data(void *data, int len);
Allocates an unconnected node, copies given data to it
and returns pointer to the node.
- TT *tt_new_with_parent_and_data(TT *parent, void *data, int len);
Allocates a node, connects it as the last child of
parent, copies given data to it
and returns pointer to the node.
- void tt_del(TT *tt);
Detaches and frees node pointed to by tt, and all of its
children, recursively.
- TT *tt_dup(TT *tt);
Makes an unconnected duplicate of given node and its data.
- TT *tt_dup_all(TT *tt);
Makes an internally connected duplicate of the tree defined by given node.
That is, its children are also duplicated and connected to form a tree with
duplicate of given node as root.
- TT *tt_split(TT *tt, u32 pos);
Splits the data of tt before the byte indicated by
pos, putting the second part in a new node. The new node
is connected as a sibling immediately following tt, and
can be retrieved with tt_get_next(tt). If
pos is zero, tt_size(tt)
will be zero afterwards. If pos equals
tt_size(tt) before the operation,
tt_size(tt_get_next(tt)) will be zero afterwards.
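The split semantics, including the two edge cases, can be illustrated on a plain byte buffer. This is only a sketch of the arithmetic: demo_buf and demo_split are hypothetical names, and the real function operates on nodes and links the new node in as the next sibling.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a node's data; not a Flux type. */
typedef struct demo_buf {
    unsigned char *data;
    size_t size;
} demo_buf;

/* Keeps bytes [0, pos) in head and copies bytes [pos, size) to tail.
 * Returns 0 on success, -1 on a bad position or allocation failure. */
int demo_split(demo_buf *head, demo_buf *tail, size_t pos)
{
    size_t tail_size;

    if (pos > head->size)
        return -1;
    tail_size = head->size - pos;

    tail->data = NULL;
    tail->size = 0;
    if (tail_size > 0) {
        tail->data = malloc(tail_size);
        if (tail->data == NULL)
            return -1;
        memcpy(tail->data, head->data + pos, tail_size);
        tail->size = tail_size;
    }
    /* Only the logical size shrinks; head's storage is left as it is. */
    head->size = pos;
    return 0;
}
```

With pos equal to zero, head ends up empty and tail receives everything; with pos equal to the original size, tail ends up empty.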
Connectivity
These functions must be used to link trees (nodes) to form larger trees.
- void tt_add_as_first_child(TT *parent_tt, TT *tt);
Adds tt at the beginning of
parent_tt's child list.
- void tt_add_as_last_child(TT *parent_tt, TT *tt);
Adds tt to the end of
parent_tt's child list.
- void tt_add_as_first_sibling(TT *sibling_tt, TT *tt);
Adds tt at the beginning of
sibling_tt's parent's child list. Note: If
sibling_tt is a root, and thus has no parent to hold
child lists, a new root will be created implicitly, holding
sibling_tt and tt.
See tt_is_fake_root on dealing with implicitly
created root nodes.
- void tt_add_as_last_sibling(TT *sibling_tt, TT *tt);
Adds tt to the end of
sibling_tt's parent's child list. Note: If
sibling_tt is a root, and thus has no parent to hold
child lists, a new root will be created implicitly, holding
sibling_tt and tt.
See tt_is_fake_root on dealing with implicitly
created root nodes.
- void tt_add_before(TT *next_tt, TT *tt);
Adds tt before next_tt in
next_tt's parent's child list. Note: If
next_tt is a root, and thus has no parent to hold
child lists, a new root will be created implicitly, holding
next_tt and tt.
See tt_is_fake_root on dealing with implicitly
created root nodes.
- void tt_add_after(TT *prev_tt, TT *tt);
Adds tt after prev_tt in
prev_tt's parent's child list. Note: If
prev_tt is a root, and thus has no parent to hold
child lists, a new root will be created implicitly, holding
prev_tt and tt.
See tt_is_fake_root on dealing with implicitly
created root nodes.
- int tt_add(TT *parent_tt, TT *tt);
Shortcut to tt_add_as_last_child.
- void tt_swap(TT *tt0, TT *tt1);
Swaps tt0 and tt1's connectivity
contexts, meaning their positions relative to other nodes are exchanged.
Nothing else is touched, and the nodes take their respective data with them.
- void tt_detach(TT *tt);
Disconnects the subtree denoted by tt from its parent
and siblings. After the operation, tt will be a root
node.
- int tt_is_in_path(TT *tt0, TT *tt1);
Returns TRUE if tt0 is in
the path of (or rather, is a direct or indirect parent of)
tt1.
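The connectivity operations amount to pointer updates on a doubly linked sibling list with parent links. The following sketch shows appending a child, detaching a subtree and an ancestor test. All names are hypothetical, and the append helper assumes the node being added is unconnected; whether the real functions detach a connected node first is not stated here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not the Flux TT type. */
typedef struct demo_node {
    struct demo_node *parent, *prev, *next, *first, *last;
} demo_node;

/* Appends n, assumed unconnected, to the end of parent's child list. */
void demo_add_as_last_child(demo_node *parent, demo_node *n)
{
    assert(parent != NULL && n != NULL);
    n->parent = parent;
    n->prev = parent->last;
    n->next = NULL;
    if (parent->last != NULL)
        parent->last->next = n;
    else
        parent->first = n;
    parent->last = n;
}

/* Unlinks n from its parent and siblings; n keeps its own children
 * and becomes the root of its subtree. */
void demo_detach(demo_node *n)
{
    if (n->parent != NULL) {
        if (n->parent->first == n)
            n->parent->first = n->next;
        if (n->parent->last == n)
            n->parent->last = n->prev;
    }
    if (n->prev != NULL)
        n->prev->next = n->next;
    if (n->next != NULL)
        n->next->prev = n->prev;
    n->parent = n->prev = n->next = NULL;
}

/* Returns 1 if ancestor is a direct or indirect parent of n, else 0. */
int demo_is_in_path(const demo_node *ancestor, const demo_node *n)
{
    for (n = n->parent; n != NULL; n = n->parent)
        if (n == ancestor)
            return 1;
    return 0;
}
```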
Navigation
- int tt_is_root(TT *tt);
Returns TRUE if tt
is a root node (has no parent).
- int tt_is_leaf(TT *tt);
Returns TRUE if tt
is a leaf node (has no children).
- int tt_is_first(TT *tt);
Returns TRUE if tt
comes before all of its siblings (has no previous node).
- int tt_is_last(TT *tt);
Returns TRUE if tt
comes after all of its siblings (has no next node).
- int tt_is_sibling(TT *tt0, TT *tt1);
Returns TRUE if tt0
and tt1 are siblings (have same parent).
- int tt_has_child(TT *parent, TT *child);
Returns TRUE if child
is a direct child of parent.
- TT *tt_get_root(TT *tt);
Returns the root node in tt's tree.
- TT *tt_get_first_sibling(TT *tt);
Returns the first sibling of tt.
- TT *tt_get_last_sibling(TT *tt);
Returns the last sibling of tt.
- TT *tt_get_prev(TT *tt);
Returns the previous sibling of tt, or
NULL if tt is the first
node.
- TT *tt_get_next(TT *tt);
Returns the next sibling of tt, or
NULL if tt is the last
node.
- TT *tt_get_parent(TT *tt);
Returns the parent of tt, or
NULL if tt is the root
node.
- TT *tt_get_first_child(TT *tt);
Returns the first child of tt, or
NULL if tt is a leaf node.
- TT *tt_get_last_child(TT *tt);
Returns the last child of tt, or
NULL if tt is a leaf node.
- TT *tt_get_next_infix(TT *tt, TT *top);
Returns the next node in a depth-first, infix traversal of the tree defined
by top, where tt is the last
node traversed. Returns NULL if all nodes have been
visited. To traverse all nodes under top, you should
visit it first, then, on the first call, pass tt
equal to top, and pass the returned node back as
tt on each successive iteration. See also the
iterator TT_FOR_ALL which implements the same
method, but does more work for you.
- TT *tt_get_common_parent(TT *tt0, TT *tt1);
Returns the first common parent in the path going upwards (towards the root)
from tt0 and tt1, or
NULL if they belong to disparate trees.
- TT *tt_get_next_in_breadth_with_level(TT *tt, int depth, int level);
Made for breadth-first traversal, when you know all the parameters. There are
higher-level, more intuitive (though marginally slower) calls you can use.
To determine the next breadth node, three parameters are required:
tt is the node to start looking from.
depth is the depth of this node (see
tt_get_depth). Passing a wrong value for this parameter
is an error - results are undefined. level is the
tree depth you want to breadth-first traverse. To determine which node
gets returned, the function considers 1) the children of
tt, 2) any siblings following tt
and 3) parents of tt. Steps 1) and 2) are repeated
for each node. Note that the passed node is not considered, so this can
be used iteratively to get all nodes on a given level. Returns
NULL when no more nodes are found.
- TT *tt_get_next_in_breadth(TT *tt, int depth);
Breadth-first traversal. Returns the next node following
tt, at the same level. depth
is the common depth of these two. Returns NULL if
there are no more nodes at this level. See
tt_get_next_in_breadth_with_level for details.
- TT *tt_get_next_in_same_depth(TT *tt);
Breadth-first traversal. This is the highest-level function to this end.
tt is the node to start looking from. Returns the
first node found which is at the same level as tt, or
NULL if there are no more.
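The calling convention described for tt_get_next_infix (visit top first, then keep passing the returned node back in) can be sketched as follows. This is one plausible way to write such a step function, not the Flux source; demo_node and both functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node; not the Flux TT type. */
typedef struct demo_node {
    struct demo_node *parent, *prev, *next, *first, *last;
} demo_node;

/* Appends an unconnected node to the end of parent's child list. */
void demo_add_last(demo_node *parent, demo_node *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->prev = parent->last;
    child->next = NULL;
    if (parent->last != NULL)
        parent->last->next = child;
    else
        parent->first = child;
    parent->last = child;
}

/* Returns the node that follows n in a depth-first walk of the subtree
 * rooted at top, or NULL once every node has been visited. */
demo_node *demo_next_depth_first(demo_node *n, demo_node *top)
{
    /* Descend first: a node's children come right after the node. */
    if (n->first != NULL)
        return n->first;

    /* Otherwise climb until some ancestor below top has a next sibling. */
    while (n != NULL && n != top) {
        if (n->next != NULL)
            return n->next;
        n = n->parent;
    }
    return NULL;
}
```

For a tree with root children a and b, where a has children c and d, starting from the root yields a, c, d, b and then NULL. Siblings of top are never returned, so the walk stays inside the subtree.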
Data
- int tt_get_external_handle(TT *tt);
Returns a numeric filehandle to the data in tt. Makes
sense only if tt_data_is_internal is
FALSE for this node.
- void tt_data_swap(TT *tt0, TT *tt1);
Swaps the data content of tt0 and
tt1, not otherwise touching the nodes.
- void tt_data_del(TT *tt);
Frees all data associated with tt.
- void tt_data_set_internal(TT *tt, void *src, u32 len, unsigned int copy);
Lowest-level data assignment function. Removes any previously set data for
tt, and sets it to src, with
length len. If copy is
TRUE, a local duplicate is made and assigned.
Otherwise, the node will just reference the data, and will not free it
when the node is deleted (see tt_data_is_local).
- int tt_data_set_file(TT *tt, char *path, int local);
Assigns data found in file defined by path to node
tt. Returns TRUE only
if the file exists and can be opened. After a successful assignment,
the file acts as the node's data holder, transparently. If the
local argument is TRUE,
the file will be deleted whenever the node is.
- void tt_data_set_bytes(TT *tt, void *src, u32 start, u32 len);
Changes len bytes of data in tt,
starting at start, to src.
Expands data size if it crosses the boundary, so this can be used to
set data in previously empty nodes too.
- void tt_data_append_bytes(TT *tt, void *src, u32 len);
Adds len bytes of data from src
at the end of tt.
- void tt_data_prepend_bytes(TT *tt, void *src, u32 len);
Adds len bytes of data from src
to the beginning of tt.
- void tt_data_set_int(TT *tt, int val);
Shortcut for setting the data of tt to an integer
value, val.
- void tt_data_set_ptr(TT *tt, void *ptr);
Shortcut for setting the data of tt to a pointer
value, ptr.
- void tt_data_set_str(TT *tt, char *str);
Shortcut for setting the data of tt to a string,
str.
- void *tt_data_get(TT *tt);
Returns a pointer to tt's data (if it is internal), or
a pointer to its (null-terminated) path (if it is external).
- u32 tt_data_get_bytes(TT *tt, void *dest, u32 start, u32 len);
Copies len bytes of data from tt,
starting at start, to dest.
- int tt_data_get_int(TT *tt);
Shortcut. Takes the initial bytes of tt, returning them
as an int.
- void *tt_data_get_ptr(TT *tt);
Shortcut. Takes the initial bytes of tt, returning them
as a generic pointer.
- char *tt_data_get_str(TT *tt);
Returns a null-terminated copy of the data in tt. Yes,
you have to free this yourself, when you're done with it. If
tt is empty, it still returns a valid string -
zero-length but null-terminated.
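Two of the behaviours described above lend themselves to a small illustration: writing past the end of the data grows it, and the string getter returns a freshly allocated, null-terminated copy even for empty data. The sketch below uses a hypothetical demo_buf rather than a TT node. Zero-filling a gap between the old end and start is an assumption, and overflow of start + len is not checked.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a node's data; not a Flux type. */
typedef struct demo_buf {
    unsigned char *data;
    size_t size;
} demo_buf;

/* Overwrites len bytes at start, growing the buffer if the write
 * crosses its end. Returns 0 on success, -1 on allocation failure. */
int demo_set_bytes(demo_buf *b, const void *src, size_t start, size_t len)
{
    size_t end = start + len;

    if (end > b->size) {
        unsigned char *grown = realloc(b->data, end);
        if (grown == NULL)
            return -1;
        /* Assumption: a gap before start is filled with zero bytes. */
        if (start > b->size)
            memset(grown + b->size, 0, start - b->size);
        b->data = grown;
        b->size = end;
    }
    if (len > 0)
        memcpy(b->data + start, src, len);
    return 0;
}

/* Returns a null-terminated copy that the caller must free, or NULL
 * if allocation fails. Empty data yields an empty string. */
char *demo_get_str(const demo_buf *b)
{
    char *s = malloc(b->size + 1);

    if (s == NULL)
        return NULL;
    if (b->size > 0)
        memcpy(s, b->data, b->size);
    s[b->size] = '\0';
    return s;
}
```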
Matching
- int tt_cmp(TT *tt0, TT *tt1);
Returns TRUE if the data of
tt0 and tt1 are an exact match.
- int tt_casecmp(TT *tt0, TT *tt1);
Returns TRUE if the data of
tt0 and tt1 are an exact match,
disregarding case.
- int tt_memcmp(TT *tt, void *p, u32 len);
Returns TRUE if the data of
tt and the data defined by p
and len are an exact match.
- int tt_memcasecmp(TT *tt, void *p, u32 len);
Returns TRUE if the data of
tt and the data defined by p
and len are an exact match, disregarding case.
- int tt_strcmp(TT *tt, char *s);
Returns TRUE if the data of
tt and the string passed as s
are an exact match.
- int tt_strcasecmp(TT *tt, char *s);
Returns TRUE if the data of
tt and the string passed as s
are an exact match, disregarding case.
- int tt_chr(TT *tt, int c);
Returns the position of the first byte matching c in
tt's data.
- int tt_rchr(TT *tt, int c);
Returns the position of the last byte matching c in
tt's data.
- int tt_regcmp_precomp(TT *tt, regex_t *preg);
Returns TRUE if the data of
tt matches the compiled regexp
preg.
- int tt_regcmp(TT *tt, char *regex);
Returns TRUE if the data of
tt matches the regexp regex.
This function does the regexp compilation for you, so watch out if
you're concerned with performance.
- int tt_regcasecmp(TT *tt, char *regex);
Returns TRUE if the data of
tt matches the regexp regex,
disregarding case. This function does the regexp compilation for you, so
watch out if you're concerned with performance.
- size_t tt_spn(TT *tt, const char *accept);
Calculates the length of the initial segment of tt's
data which consists entirely of characters in accept.
- size_t tt_cspn(TT *tt, const char *reject);
Calculates the length of the initial segment of tt's
data which consists entirely of characters not in reject.
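Note that the comparison functions above are documented as returning TRUE on a match, which is the opposite of the C library's memcmp and strcmp, where zero means equal. A length-aware match of that kind can be sketched as below; the function names are hypothetical and the code works on raw buffers rather than nodes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if both buffers have the same length and the same bytes. */
int demo_mem_equal(const void *a, size_t a_len, const void *b, size_t b_len)
{
    if (a_len != b_len)
        return 0;
    return a_len == 0 || memcmp(a, b, a_len) == 0;
}

/* As above, but compares bytes after folding them to lower case. */
int demo_mem_case_equal(const void *a, size_t a_len,
                        const void *b, size_t b_len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    size_t i;

    if (a_len != b_len)
        return 0;
    for (i = 0; i < a_len; i++)
        if (tolower(pa[i]) != tolower(pb[i]))
            return 0;
    return 1;
}
```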
Searching
- TT *tt_find_first_sibling(TT *tt, void *data, u32 len);
Works like tt_get_first_sibling, except it returns
the first sibling matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_find_last_sibling(TT *tt, void *data, u32 len);
Works like tt_get_last_sibling, except it returns
the last sibling matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_find_next(TT *tt, void *data, u32 len);
Works like tt_get_next, except it returns
the next sibling matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_find_prev(TT *tt, void *data, u32 len);
Works like tt_get_prev, except it returns
the previous sibling matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_find_first_child(TT *tt, void *data, u32 len);
Works like tt_get_first_child, except it returns
the first child matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_find_last_child(TT *tt, void *data, u32 len);
Works like tt_get_last_child, except it returns
the last child matching the exact data defined by
data and len, or
NULL if there are no matches.
- TT *tt_match_first_sibling(TT *tt, char *regexp);
Works like tt_get_first_sibling, except it returns
the first sibling matching the regexp regexp, or
NULL if there are no matches.
- TT *tt_match_last_sibling(TT *tt, char *regexp);
Works like tt_get_last_sibling, except it returns
the last sibling matching the regexp regexp, or
NULL if there are no matches.
- TT *tt_match_next(TT *tt, char *regexp);
Works like tt_get_next, except it returns
the next sibling matching the regexp regexp, or
NULL if there are no matches.
- TT *tt_match_prev(TT *tt, char *regexp);
Works like tt_get_prev, except it returns
the previous sibling matching the regexp regexp, or
NULL if there are no matches.
- TT *tt_match_first_child(TT *tt, char *regexp);
Works like tt_get_first_child, except it returns
the first child matching the regexp regexp, or
NULL if there are no matches.
- TT *tt_match_last_child(TT *tt, char *regexp);
Works like tt_get_last_child, except it returns
the last child matching the regexp regexp, or
NULL if there are no matches.
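Each find function combines a navigation step with an exact data match. A sketch of the "next sibling" variant is shown below, on a hypothetical node type that carries only what the search needs. It assumes the starting node itself is not considered, in line with "works like tt_get_next".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a node with data; not the Flux TT type. */
typedef struct demo_node {
    struct demo_node *prev, *next;
    const void *data;
    size_t size;
} demo_node;

/* Returns the first sibling after n whose data equals the given bytes
 * exactly (same length, same content), or NULL if there is none. */
demo_node *demo_find_next(demo_node *n, const void *data, size_t len)
{
    for (n = n->next; n != NULL; n = n->next) {
        if (n->size != len)
            continue;
        if (len == 0 || memcmp(n->data, data, len) == 0)
            return n;
    }
    return NULL;
}
```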
Statistics
- u32 tt_depth(TT *tt);
Reports the depth at which tt is connected (distance
to root node). For a root node itself, this value is zero.
- void tt_stat_children_all(TT *root, u32 *count, u32 *size);
Recursively gets number of children and total size of their associated data
for root, and stores the values in
count and size. Note that
the values for root itself are not included.
- int tt_size(TT *tt);
Returns the size of tt's associated data. A size
of zero means there is none.
- u32 tt_size_children(TT *root);
Returns the total size of data associated with root's
direct (i.e. non-recursive) children. The size of root
itself is not included.
- u32 tt_size_children_all(TT *root);
Returns the total size of data associated with root's
direct and indirect (i.e. recursive) children. The size of root itself is not included.
- u32 tt_count_children(TT *tt);
Returns the number of direct children under tt.
tt itself is not included.
- u32 tt_count_children_all(TT *tt);
Returns the number of direct and indirect children under
tt. tt itself is not
included.
- u32 tt_count_siblings(TT *tt);
Returns the number of nodes having the same direct parent as
tt. tt itself is
included.
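The counting rules above differ in what they include: depth is zero for a root, child counts exclude the node itself, and the sibling count includes it. A sketch that follows those rules, using hypothetical names and a hypothetical node type:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a tree node; not the Flux TT type. */
typedef struct demo_node {
    struct demo_node *parent, *prev, *next, *first, *last;
} demo_node;

/* Appends an unconnected node to the end of parent's child list. */
void demo_add_last(demo_node *parent, demo_node *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->prev = parent->last;
    child->next = NULL;
    if (parent->last != NULL)
        parent->last->next = child;
    else
        parent->first = child;
    parent->last = child;
}

/* Distance to the root; zero for the root itself. */
uint32_t demo_depth(const demo_node *n)
{
    uint32_t depth = 0;

    for (; n->parent != NULL; n = n->parent)
        depth++;
    return depth;
}

/* Direct and indirect children; the node itself is not counted. */
uint32_t demo_count_children_all(const demo_node *root)
{
    const demo_node *child;
    uint32_t count = 0;

    for (child = root->first; child != NULL; child = child->next)
        count += 1 + demo_count_children_all(child);
    return count;
}

/* Nodes sharing n's parent; n itself is counted. */
uint32_t demo_count_siblings(const demo_node *n)
{
    const demo_node *s;
    uint32_t count = 1;

    for (s = n->prev; s != NULL; s = s->prev)
        count++;
    for (s = n->next; s != NULL; s = s->next)
        count++;
    return count;
}
```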
Status indicators; getting
- int tt_has_data(TT *tt);
Returns TRUE if tt has any
data associated with it.
- int tt_is_ready(TT *tt);
Used internally.
Used in some constructors to determine if the node has been fully
constructed. Outside these constructors, the flag has no meaning except for
whatever you might assign to it. You can safely use this flag for your own
purposes, as long as you don't depend on its initial value. It is not touched
by any elementary operations, and its value is preserved in duplicates.
- int tt_data_is_internal(TT *tt);
Returns TRUE if the node's data resides in main memory (fast access).
- int tt_data_is_local(TT *tt);
Returns TRUE if tt owns
the data it is associated with, meaning the data will be deleted/freed along
with the node. This is true unless you explicitly assigned non-local data
to the node - normal operations just make local copies.
- int tt_is_fake_root(TT *tt);
Returns TRUE if the node was implicitly created
because the tree's upper branch expanded (e.g. you added a sibling to the old
root node), and this new root was required to maintain a valid tree structure.
You'd only have to use this if your code added siblings to potential root
nodes - in practice, it should be avoided.
Status indicators; setting
- int tt_set_ready(TT *tt, int ready);
Used internally.
Sets the node's state of readiness to TRUE or
FALSE, depending on the
ready argument. Used in some constructors to determine
if the node has been fully constructed. You can safely use this flag for your
own purposes, as long as you don't depend on its initial value. It is not
touched by any elementary operations, and its value is preserved in
duplicates.
- int tt_set_internal(TT *tt, int internal);
Used internally.
Used to indicate if the node's data resides in main memory (fast access) or
on an external medium. You shouldn't have to use this.
- int tt_set_fake_root(TT *tt, int fake_root);
Used internally.
Used to indicate if the node was implicitly created because the tree's upper
branch expanded (e.g. you added a sibling to the old root node), and this new
root was required to maintain a valid tree structure. You shouldn't have to
use this.
Hashing
- u32 tt_hash(TT *tt);
Returns a RIPEMD-160 hash of tt's data, XOR-collapsed
to 32 bits.
- u32 tt_hash_all(TT *tt);
Returns a running RIPEMD-160 hash of the tree defined by
tt, XOR-collapsed to 32 bits.
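"XOR-collapsed" presumably means that the 160-bit digest is folded into one 32-bit value by XORing its five 32-bit words together. The sketch below shows that folding step only; it does not compute RIPEMD-160. The function name is hypothetical, and reading each word in little-endian byte order is an assumption, since the byte order used by Flux is not stated here.

```c
#include <stdint.h>

/* Folds a 20-byte digest into 32 bits by XORing its five 32-bit words.
 * Assumption: each word is read in little-endian byte order. */
uint32_t demo_xor_collapse(const uint8_t digest[20])
{
    uint32_t out = 0;
    int i;

    for (i = 0; i < 20; i += 4) {
        uint32_t word = (uint32_t)digest[i]
                      | ((uint32_t)digest[i + 1] << 8)
                      | ((uint32_t)digest[i + 2] << 16)
                      | ((uint32_t)digest[i + 3] << 24);
        out ^= word;
    }
    return out;
}
```

Folding keeps the result dependent on every byte of the digest, but a 32-bit value is far more prone to collisions than the full 160 bits.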
Printable I/O
- TT *tt_scan_from_file(FILE *in);
Reads a printable token tree from in, and if it's
well-formed, returns the root of the resulting token tree. Otherwise,
it returns NULL.
- void tt_print_to_file(TT *tt, FILE *out, TT_PRINT_MODE mode, int honour_meta);
Prints a token tree defined by tt in human-readable,
7-bit ASCII to out, using the indenting style specified
by mode. honour_meta is not
used for anything yet. Set it to 0.
Supported values for mode:
TT_PRINT_COMPACT: Uses as little spacing as
possible. Output will look like a solid block of text, almost unreadable.
TT_PRINT_KNR: Uses K&R-style indenting.
TT_PRINT_ALLMAN: Uses Allman-style (also called
BSD-style) indenting.