Container Concepts¶
Generics¶
The containers in this library are template based. This library uses a rudimentary #define mechanism to generate generic versions from a common source supporting all the primitive types as well as arbitrary user defined types. A C++ compiler is not required to build the templated code.
Not only does the C-Craft Collections Library implement many common collections, it also provides a mechanism by which a container of any type may be generated. The following is how a list of user structs may be declared in ‘list-user.h’.
#ifndef __LIST_USER_H__
#define __LIST_USER_H__
#define TYPE struct user *
#define NAME user
#define UNAME USER
#include <c-craft/list-decl.h>
#undef TYPE
#undef NAME
#undef UNAME
#endif /* __LIST_USER_H__ */
and how it might be implemented in ‘list-user.c’.
#include "list-user.h"
#define TYPE struct user *
#define NAME user
#define UNAME USER
#include <c-craft/list-impl-c.h>
#undef TYPE
#undef NAME
#undef UNAME
By including ‘list-user.h’ and linking with ‘list-user.c’ you may now write code like:
#include "list-user.h"
struct user *freddy = { .name = "freddy", .password = "secret" };
LIST_USER user_list = list_user_create(mem);
list_user_insert(user_list, freddy);
Template Interface¶
In the above examples the template arguments are implemented as #defines prior to including the template code. You can generate code for any type by #defining the type and names you want. Each container type is documented in this guide with explanations of the required #defines. For simple containers these are standardised as above, but be aware that maps require more #define parameters, because of the complexity.
Note the pattern of #defining parameters, including the template and then #undefing the parameters.
The generated code follows the general pattern of functions and data types of the form:
CONTAINER_UNAME cntr = container_name_create(...);
type container_name_operation(cntr, ...);
This is required because C does not support function overloading by type. Different names are needed to avoid ambiguity. For example, here is the list-double.h file provided as part of the C-Craft collections library.
#define TYPE double
#define NAME double
#define UNAME DOUBLE
#include "list-decl.h"
#undef TYPE
#undef NAME
#undef UNAME
And here are example functions generated.
LIST_DOUBLE list = list_double_create(mem);
list_double_append(list, 0.2345);
double x = list_double_get(list, 0);
Note that often the NAME and UNAME defines match the types, as in list doubles. However, this is not always the case. The corresponding C-Craft list of char * is defined as:
#include "list-str.h"
#define TYPE const char *
#define NAME str
#define UNAME STR
#include <c-craft/list-impl-c.h>
#undef TYPE
#undef NAME
#undef UNAME
In this case the example usage would be:
LIST_STR list = list_str_create(mem);
list_str_append(list, "Hello, collections library !!");
char *msg = list_str_get(list, 0);
Maps¶
Maps are more complex. Here is a map of char * keyed by char * in a file ‘map-str-str.h’.
#define VNAME str
#define KNAME str
#define UVNAME STR
#define UKNAME STR
#define KTYPE char *
#define VTYPE char *
#define VECTOR
#define STRING_HASH
#include <c-craft/map-decl.h>
#undef KNAME
#undef VNAME
#undef KTYPE
#undef VTYPE
#undef UVNAME
#undef UKNAME
#undef VECTOR
#undef STRING_HASH
Hashing¶
Notice that the above definition includes the line:
#define STRING_HASH
This is required for maps keyed by a char *. When a key is used to find an item in a map the first operation is to hash the key to select the slot for lookup. A hash of a char * directly would only work for the exact same char * value. This is not usually what is required for a string lookup. The string key for lookup may be the same characters but a different char * pointer. For this common use case a string hash may be used instead, by specifying STRING_HASH.
The effect of not using STRING_HASH is that lookups expected to succeed may fail, and this can be confusing initially.
Note that if you are using atoms for keys you do not need to use STRING_HASH since an exact string match will also be the same char * pointer, so will hash to the same slot.
Floating Point Keys¶
If you create a map based on a key with floating point numbers you must provide a custom comparison function. This is because floating point numbers cannot be reliably compared for equality, only approximate equality, and key equality is critical for map operation.
Hash Slots¶
Note that when creating maps the create call must specify the number of slots for the hash table. This value is quite important for optimum lookup performance and depends on the speed/space trade offs you wish to make. It is also important to choose a value that reduces hash collisions. You should research before choosing an arbitrary value.
Implementation Choices¶
Two implementations of maps are supported. Both versions maintain a table indexed by the hash of the key, but the slot implementation may be a LIST or a VECTOR. Vectors are generally more efficient for map usage so you should prefer that. However, lists may scale better, so remain an option.
Naming¶
There is scope for confusion between the contained type and the lookup key type, although not for the above case since the key and value are the same types.
The convention is that key comes after the contained type. So for a VECTOR_MAP_STR_INT think of it as a ‘map of strings keyed by int’. EG:-
/* Map of names keyed by age in years for fast lookup */
VECTOR_MAP_STR_INT map = vector_map_str_int_create(mem, 32, cmp_str);
Comparison¶
Some of the container functions require a comparison function. For example, searching functions, such as list_type_where(), and, of course, for map lookup. The C-Craft library provides a set of comparison functions for all the fundamental types and char *. You may create you own:-
#define TYPE USER
#define NAME user
#include <c-craft/cmp-decl.h>
#undef TYPE
#undef NAME
#define TYPE USER
#define NAME user
#include <c-craft/cmp-impl-c.h>
#undef TYPE
#undef NAME
However, be aware that this will generate comparison functions that simply compare using ‘<’, ‘>’, and ‘==’. This may not be what you want and may not even compile, depending on the type.
For complex user types you should write custom comparison functions that are meaningful for the type.
Comparison functions are often used as parameters to a container library call. They may also be used directly as first class functions. Using comparison functions simply requires:
#include <c-craft/cmp.h>
cmp_str("Hello", "Goodbye"); /* Returns non-zero */
cmp_int(23, 23); /* Returns zero */
Maps use comparison functions to test element equality. The function used should be the generic comparison for the key name in the map declaration (NAME #define). A map keyed by char* (str) will ultimately use the standard C library function strcmp(). Primitive types, including void* (ptr) and floating point will use a direct comparison.
Floating point comparison may not behave as you might expect. By their nature, floating point values are an approximation, to some degree, and two seemingly identical values, may not precisely match. It is recommended to use a custom comparison function that includes a delta value. For example:
/* Compare doubles with delta */
int cmp_double(double v1, double v2)
{
const double delta = 0.0001;
double v = v1 - v2;
if(abs(v) < delta)
return 0;
return v > 0.0 ? 1 : -1;
}
Operator Functions¶
Operator functions are useful for consistent conditional tests. They are used in this library for filtering collections. All operator functions compare one value to another and return TRUE or FALSE.
#include <c-craft/cmp.h>
less_than_int(4, 3); /* Returns TRUE */
less_than_int(23, 23); /* Returns FALSE */
There are similar functions for greater_than and equals.
Memory Considerations¶
Transient generic collections are extremely useful and memory safe. However, if you require a collection to be long-lived, such as in a service application you must consider memory usage. Every added element will consume memory, and memory is not released until the memory scope is released. Eventually the service will run out of memory.
You can counteract this by recycling collections periodically. You do this with the collection copy function.
QUEUE_TASK tasks; /* Long lived collection. */
MEM_SCOPE mem; /* Scope used by long-lived collection. */
MEM_SCOPE old_memory = mem;
mem = sm_create(0);
tasks = queue_task_copy(mem, tasks, NULL);
sm_free(old_memory);
Now ‘tasks’ has been moved to a new collection constructed of only the active elements and stale memory is released. However, it will now point to a different in memory location so you will need to be careful to manage access to it. Specifically, do not allow copies of ‘tasks’. If you do they will need to be updated.
