Hash tables

Submitted by raevnos on Mon, 2007-06-04 22:26

Hash tables are a (usually) fast lookup function for (string key, value) tuples. They are used for things like looking up softcode functions. They support lookup of exact key names, and iteration through all elements in a random order.


To use hash tables, #include "htab.h" if it isn't already. Hash tables can then be declared with the HASHTAB type.

For example:

HASHTAB htab_sample;

Before using a hash table, it must be initialized with hash_init. This takes three arguments: A pointer to the hash table, an initial size of the table, and a pointer to a function that is passed the data stored in an entry when it's deleted (Or NULL). hashinit() is a macro that passes a NULL cleanup function and only takes the first two arguments.

hash_init(&htab_sample, 256, NULL);

When done with a hashtable, call hash_free() on it.

Once you've initialized a table, you can add elements to it with hashadd(). It takes three arguments: The string to use as a key, a pointer to the data to store, and a pointer to the table. The key is copied; the data pointer is stored untouched. It is this that's passed to a cleanup function if given in hash_init().

hashadd("foo", &some_value, &htab_sample);.

You can look up elements with hashfind(). Its arguments are the key to look up and a pointer to the table. It returns the data pointer for the associated key, or NULL if the key was not found. Hence, it's not a good idea to use a NULL data pointer.

if (hashfind("foo", &htab_sample)) { item_present(); }

Delete elements from the table with hashdelete(), which takes the key and pointer to table arguments just like hashadd().

Iterate through all data pointers in the hash table with hash_firstentry()/hash_nextentry(), which returns the first data pointer, followed by every other data pointer, returning NULL after the last. hash_firstentry_key()/hash_nextentry_key() does the same thing for the keys; you cannot currently (1.8.3p3) get both key and data in one iteration; this might change in the future.


Hash tables use cuckoo hashing, which produces very dense memory-saving tables. They strongly favor usages where there are few insertions (Which can potentially be very costly), and many lookups (Which are very fast and cheap).