Giter Club home page Giter Club logo

klib's Introduction

Klib: a Generic Library in C

Overview

Klib is a standalone and lightweight C library distributed under MIT/X11 license. Most components are independent of external libraries, except the standard C library, and independent of each other. To use a component of this library, you only need to copy a couple of files to your source code tree without worrying about library dependencies.

Klib strives for efficiency and a small memory footprint. Some components, such as khash.h, kbtree.h, ksort.h and kvec.h, are among the most efficient implementations of similar algorithms or data structures in all programming languages, in terms of both speed and memory use.

A new documentation is available here which includes most information in this README file.

Common components

Components for more specific use cases

Methodology

For the implementation of generic containers, klib extensively uses C macros. To use these data structures, we usually need to instantiate methods by expanding a long macro. This makes the source code look unusual or even ugly and adds difficulty to debugging. Unfortunately, for efficient generic programming in C that lacks template, using macros is the only solution. Only with macros, we can write a generic container which, once instantiated, compete with a type-specific container in efficiency. Some generic libraries in C, such as Glib, use the void* type to implement containers. These implementations are usually slower and use more memory than klib (see this benchmark).

To effectively use klib, it is important to understand how it achieves generic programming. We will use the hash table library as an example:

#include "khash.h"
KHASH_MAP_INIT_INT(m32, char)        // instantiate structs and methods
int main() {
    int ret, is_missing;
    khint_t k;
    khash_t(m32) *h = kh_init(m32);  // allocate a hash table
    k = kh_put(m32, h, 5, &ret);     // insert a key to the hash table
    if (!ret) kh_del(m32, h, k);
    kh_value(h, k) = 10;             // set the value
    k = kh_get(m32, h, 10);          // query the hash table
    is_missing = (k == kh_end(h));   // test if the key is present
    k = kh_get(m32, h, 5);
    kh_del(m32, h, k);               // remove a key-value pair
    for (k = kh_begin(h); k != kh_end(h); ++k)  // traverse
        if (kh_exist(h, k))          // test if a bucket contains data
			kh_value(h, k) = 1;
    kh_destroy(m32, h);              // deallocate the hash table
    return 0;
}

In this example, the second line instantiates a hash table with unsigned as the key type and char as the value type. m32 names such a type of hash table. All types and functions associated with this name are macros, which will be explained later. Macro kh_init() initiates a hash table and kh_destroy() frees it. kh_put() inserts a key and returns the iterator (or the position) in the hash table. kh_get() and kh_del() get a key and delete an element, respectively. Macro kh_exist() tests if an iterator (or a position) is filled with data.

An immediate question is this piece of code does not look like a valid C program (e.g. lacking semicolon, assignment to an apparent function call and apparent undefined m32 'variable'). To understand why the code is correct, let's go a bit further into the source code of khash.h, whose skeleton looks like:

#define KHASH_INIT(name, SCOPE, key_t, val_t, is_map, _hashf, _hasheq) \
  typedef struct { \
    int n_buckets, size, n_occupied, upper_bound; \
    unsigned *flags; \
    key_t *keys; \
    val_t *vals; \
  } kh_##name##_t; \
  SCOPE inline kh_##name##_t *init_##name() { \
    return (kh_##name##_t*)calloc(1, sizeof(kh_##name##_t)); \
  } \
  SCOPE inline int get_##name(kh_##name##_t *h, key_t k) \
  ... \
  SCOPE inline void destroy_##name(kh_##name##_t *h) { \
    if (h) { \
      free(h->keys); free(h->flags); free(h->vals); free(h); \
    } \
  }

#define _int_hf(key) (unsigned)(key)
#define _int_heq(a, b) (a == b)
#define khash_t(name) kh_##name##_t
#define kh_value(h, k) ((h)->vals[k])
#define kh_begin(h, k) 0
#define kh_end(h) ((h)->n_buckets)
#define kh_init(name) init_##name()
#define kh_get(name, h, k) get_##name(h, k)
#define kh_destroy(name, h) destroy_##name(h)
...
#define KHASH_MAP_INIT_INT(name, val_t) \
	KHASH_INIT(name, static, unsigned, val_t, is_map, _int_hf, _int_heq)

KHASH_INIT() is a huge macro defining all the structs and methods. When this macro is called, all the code inside it will be inserted by the C preprocess to the place where it is called. If the macro is called multiple times, multiple copies of the code will be inserted. To avoid naming conflict of hash tables with different key-value types, the library uses token concatenation, which is a preprocessor feature whereby we can substitute part of a symbol based on the parameter of the macro. In the end, the C preprocessor will generate the following code and feed it to the compiler (macro kh_exist(h,k) is a little complex and not expanded for simplicity):

typedef struct {
  int n_buckets, size, n_occupied, upper_bound;
  unsigned *flags;
  unsigned *keys;
  char *vals;
} kh_m32_t;
static inline kh_m32_t *init_m32() {
  return (kh_m32_t*)calloc(1, sizeof(kh_m32_t));
}
static inline int get_m32(kh_m32_t *h, unsigned k)
...
static inline void destroy_m32(kh_m32_t *h) {
  if (h) {
    free(h->keys); free(h->flags); free(h->vals); free(h);
  }
}

int main() {
	int ret, is_missing;
	khint_t k;
	kh_m32_t *h = init_m32();
	k = put_m32(h, 5, &ret);
	if (!ret) del_m32(h, k);
	h->vals[k] = 10;
	k = get_m32(h, 10);
	is_missing = (k == h->n_buckets);
	k = get_m32(h, 5);
	del_m32(h, k);
	for (k = 0; k != h->n_buckets; ++k)
		if (kh_exist(h, k)) h->vals[k] = 1;
	destroy_m32(h);
	return 0;
}

This is the C program we know.

From this example, we can see that macros and the C preprocessor plays a key role in klib. Klib is fast partly because the compiler knows the key-value type at the compile time and is able to optimize the code to the same level as type-specific code. A generic library written with void* will not get such performance boost.

Massively inserting code upon instantiation may remind us of C++'s slow compiling speed and huge binary size when STL/boost is in use. Klib is much better in this respect due to its small code size and component independency. Inserting several hundreds lines of code won't make compiling obviously slower.

Resources

  • Library documentation, if present, is available in the header files. Examples can be found in the test/ directory.
  • Obsolete documentation of the hash table library can be found at SourceForge. This README is partly adapted from the old documentation.
  • Blog post describing the hash table library.
  • Blog post on why using void* for generic programming may be inefficient.
  • Blog post on the generic stream buffer.
  • Blog post evaluating the performance of kvec.h.
  • Blog post arguing B-tree may be a better data structure than a binary search tree.
  • Blog post evaluating the performance of khash.h and kbtree.h among many other implementations. An older version of the benchmark is also available.
  • Blog post benchmarking internal sorting algorithms and implementations.
  • Blog post on the k-small algorithm.
  • Blog post on the Hooke-Jeeve's algorithm for nonlinear programming.

klib's People

Contributors

arrbee avatar attractivechaos avatar daviesrob avatar dnbaker avatar droe avatar immerrr avatar jkbonfield avatar jmarshall avatar johnm avatar kloetzl avatar leecbaker avatar lh3 avatar pmelsted avatar sciascid avatar valeriuo avatar zhanxw avatar zjf avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

klib's Issues

Request for function: store and load for khash.h

I have been using khash a lot in the last couple of years, it is very useful code, but one limitation keeps showing up - there is no way to maintain the hash information between runs of a program. It would be fantastic if store and load operations were provided. I suggested this about 3 years ago here:

https://attractivechaos.wordpress.com/2009/09/29/khash-h/#comment-2193

but as far as I can tell those capabilities have not been added.

In its current state khash is great for indexing a lot of information, then doing some enormous calculation that requires accessing large parts of that information. Once. What it isn't good for is retrieving a little bit of that pile of information, from time to time. To do that the entire input set must be parsed and reloaded each time, and doing that can take many orders of magnitude longer than the
single (or a few) look up operations that are the actual function needed. In the program I am currently working on this manifests as slow debugging cycles. The data load operations with parsing take about 5 minutes, all that before the code being worked on is entered. The C data structures outside of khash could easily be stored to disk and then reloaded or memory mapped. But I don't have any way of doing that with the khash data structures.

I'm guessing that the reason this has not yet been implemented is that khash uses char pointers to strings, so a simple binary save will not work. One way around that might be to have an option where khash is commanded to enter a "static" state, where it gathers all of those strings into one big array and releases the little bits and pieces. After that those string references would be from from base+offset rather than by simple pointer. (Store offset in the char * values, just cast it before use as an offset.) That single big array could then be stored/loaded rapidly, and the offset values in the "h" structure would still be valid when reloaded too. That is, if a current bit of code is:

char *sptr = KHASH_OPERATION_RETURNING_POINTER_BY_INT(x);
it becomes

char *sptr = h.base + (unsigned long long) KHASH_OPERATION_RETURNING_POINTER_BY_INT(x);
Note that if h.base is 0 this is equivalent to the current situation. Once "static" it would be hard to add more data, which would be a down side.

Helper functions for khash (patch available)

I've been using klib's khash for my own project, and I've written some convenience functions. I was wondering if there might be interest in incorporating them into the core implementation:

  • kh_getval(name, h, key, defval): Return the value corresponding to key key, or defval if the key is not found.
  • kh_set(name, h, key, val): Set the value corresponding to key key to val, inserting into table if key is absent and overwriting if present.
  • kh_unset(name, h, key): A more user-friendly kh_del(): checks if key is in the table, and deletes if it is.

I would be happy to submit a pull request incorporating these.

Array keys

Klib looks really interesting! I would love to use it in our Snabb Switch project. Snabb Switch is a simple and fast IP-networking toolkit based on LuaJIT.

Questions!

  • Is there a LuaJIT binding anywhere for khash and/or ktree?
  • Could these data structures support fixed-size array keys (with embedded zeros)?

I expect that we would want to have hundreds of instances of tables/trees, each with up to around a million items, and key types like int64, char[16], char[40] containing information that we have extracted from network packets (e.g. IPv6 addresses).

I have previously investigated Judy Arrays in 1250 LOC but I did not manage to match that to our needs.

Help to get started would be much appreciated!

kvec.h: kv_a compilation error with c-compiler (not cpp)

I can not compile kvec.h with kv_a used with a pure c-compiler.
I think, it has something to with with the comma operator on the left hand site.
The error message is:

lvalue required as left operand of assignment

I tried:

~/cpp-test $ cat comma.c
int main(void) {
int i , j;

j = 0;
(j++, i) = 1;

return 0;

}
~/cpp-test $ CC=gcc make comma
gcc comma.c -o comma
comma.c: In function ‘main’:
comma.c:5:14: error: lvalue required as left operand of assignment
make: *** [comma] Error 1
~/cpp-test $ CC=g++ make comma
g++ comma.c -o comma
~/cpp-test $

Need to check how much memory my hash table has taken after inserting 1 million key values

Here is the code.

https://github.com/attractivechaos/klib/blob/master/khash.h

What I want: I am entering n unique entry in this table. and after inserting key and value i want to check how much space this hastable is consuming.

Issue: Example after entering 1000 entry. When i try to get the size of hastable using kh_size(h) function given in "khash.h" I am getting 387 entry only. But when i am trying get the value by giving the key I am able to get.

Here is my code: PFA.
`#include <stdio.h>
#include <stdlib.h>
#include "khash.h"
#include "string.h"

#define kh_get_val(kname, hash, key, defVal) ({k=kh_get(kname, hash, key);(k!=kh_end(hash)?kh_val(hash,k):defVal);})
#define kh_set(kname, hash, key, val) ({int ret; k = kh_put(kname, hash,key,&ret); kh_value(hash,k) = val; ret;})

const int khIntInt = 33;

KHASH_MAP_INIT_INT(khIntInt, long long) // setup khash to handle string key with int payload
//KHASH_MAP_INIT_STR(khStrInt, long long) // setup khash to handle string key with int payload
int int_key_int_body_example(char * filename)
{
printf("\n\nBegin khash test string key with int payload\n");
int ret, is_missing;
khiter_t k;
char *tval;
khash_t(khIntInt) *h = kh_init(khIntInt); // create a hashtable
printf("\n --------------------------------- FILE READ AND WRITE OPERTAION -------------------\n");

    FILE *fp;
	int * ch;
	char line[200];
	fp=fopen(filename,"r");
	if(!fp)
	{
            perror("could not find the file");
            exit(0);
	}

printf("\n ------------------------------------------------- PUTTING ALL KEYS ---------------------\n");

int count=0;
long long val_line=0;
    while (fgets ( line, 200, fp ) != NULL ) {

    printf(" Trying to Add key = %s and value = %s count =%lu \n", line,line,count++);
    val_line = strtoll(line, NULL, 0);
    printf("\n int_key_int_body: Here is the value in integer After Atoi = %ld \n",val_line);
    kh_set(khIntInt, h, val_line, val_line);
    printf(" ### DEBUG: h->n_buckets=%d,h->size=%d,h->n_occupied=%d,h->upper_bound=%d \n ",h->n_buckets,h->size,h->n_occupied,h->upper_bound);
    }
           fseek(fp, 0, SEEK_SET);
           printf("\n ------------------------------------------------- GETTING ALL VALUES ---------------------\n");
           count=0;
          while (fgets ( line, 200, fp ) != NULL ) {
                 val_line = strtoll(line, NULL, 0);
                 printf("\n int_key_int_body: Here is the value in integer After Atoi = %ld \n",val_line);
                 k=kh_get(khIntInt, h, val_line);
                if (k == kh_end(h)) {
                     printf("no key found for line=%ld",val_line);
                } else {
                    printf("GET value for  key = %s and value = %d count =%lu and k_iterator=%d \n", 
                    line,kh_val(h,k),count++,k);
              }
           }
          long long 

h,k),count++,k);
}
}
long long no_of_elements = kh_size(h);
printf("\n No_of_elements = %lld", no_of_elements);
printf("\n ------------------------------------------------- ---------------------\n");
kh_destroy(khIntInt, h);
return 0;
}
int main( int argc, char *argv[] )
{
char *filename;
filename=argv[1];
printf("\n filename=%s",filename);
int_key_int_body_example(filename);
}`

Example for string/int, String/string khash usage

It took me a while to figure out how to use khash from the built in example. I saw similar complaints on other blog pages so I don't think I am unique. I suggest adding the following as file to the repository as an example and modifying the home page of the repository to link to it. It compiled and ran fine with gcc with the std=c11 flag set.

// khash example for string key int payload.  And string key with string payload.
//
// javadoc style docs
//   http://samtools.sourceforge.net/samtools/khash/index.html?PDefines/PDefines.html
// https://attractivechaos.wordpress.com/2008/09/02/implementing-generic-hash-library-in-c/
// https://attractivechaos.wordpress.com/2009/09/29/khash-h/
// https://attractivechaos.wordpress.com/tag/hash/
// Wrapper code for khash which is a good start for wrapping khash for ffi access from luajit.
//   https://github.com/attractivechaos/klib/issues/44

#include <stdio.h>
#include <stdlib.h>
#include "khash.h"


// shorthand way to get the key from hashtable or defVal if not found
#define kh_get_val(kname, hash, key, defVal) ({k=kh_get(kname, hash, key);(k!=kh_end(hash)?kh_val(hash,k):defVal);})

// shorthand way to set value in hash with single line command.  Returns value
// returns 0=replaced existing item, 1=bucket empty (new key), 2-adding element previously deleted
#define kh_set(kname, hash, key, val) ({int ret; k = kh_put(kname, hash,key,&ret); kh_value(hash,k) = val; ret;})

// name part of init must be unique for the key, value types.
// in this instance 33 is arbitrary symbolic name for a hashtable
// that contains string keys and int values.
const int khStrInt = 33;
KHASH_MAP_INIT_STR(khStrInt, int) // setup khash to handle string key with int payload
int string_key_int_body_example()
{
   printf("\n\nBegin khash test string key with int payload\n");
   int ret, is_missing;
   khiter_t k;
   khash_t(khStrInt) *h = kh_init(khStrInt); // create a hashtable

   // Add a value Key "apple" to the hashtable
   k = kh_put(khStrInt, h, "apple", &ret); // add the key
   kh_value(h, k) = 10; // set the value of the key

   // shorthand way to set a value in the
   // with a macro 
   kh_set(khStrInt, h, "jimbo", 99);
   kh_set(khStrInt, h, "john haze", 98);
   kh_set(khStrInt, h, "jaki joiner", 97);

   // Retrieve the value for key "apple"
   k = kh_get(khStrInt, h, "apple");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for apple");
   } else {
      printf ("key at apple=%d\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // Retrieve the value for key "apple"
   k = kh_get(khStrInt, h, "john haze");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for john haze\n");
   } else {
      printf ("key at john haze=%d\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // retreive key for the key "not_present"
   // which should print the error message
   k = kh_get(khStrInt, h, "not_present");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for not_present\n");
   } else {
      printf ("key at not_present=%d\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // shorthand method to get value that returns value found
   // or specified default value if not present.
   int tval = kh_get_val(khStrInt, h, "apple", -1);
   printf ("shortcut tval for apple = %d\n", tval);

   tval = kh_get_val(khStrInt, h, "john haze", -1);
   printf ("shortcut tval for john haze = %d\n", tval);

   // missing key should return default value of -1
   tval = kh_get_val(khStrInt, h, "not_present", -1);
   printf ("shortcut tval for not_present = %d\n", tval);


   // Try to delete a key and then retrieve it
   // to see if it is really gone.
   k = kh_get(khStrInt, h, "john haze"); // get the ieterator
   if (k != kh_end(h)) {  // if it is found
      printf("deleting key john_haze\n");
      kh_del(khStrInt, h, k);  // then delete it.
   }
   // now see if it is really gone
   tval = tval = kh_get_val(khStrInt, h, "john haze", -1);
   printf ("after delete tval for john haze = %d\n", tval);


   // Ieterate the hash table by key, value and print out
   // the values found.
   printf("\nIeterate all keys\n");
   for (k = kh_begin(h); k != kh_end(h); ++k) {
      if (kh_exist(h, k)) {
         const char *key = kh_key(h,k);
         tval = kh_value(h, k);
         printf("key=%s  val=%d\n", key, tval);
      }
   }

   // cleanup and remove our hashtable
   kh_destroy(khStrInt, h);
   return 0;
}

char *missing = "MISSING";
const int khStrStr = 32;
KHASH_MAP_INIT_STR(khStrStr, char*); // setup khash to handle string key with string body
int str_key_str_body_example()
{
   printf("\n\nBegin khash test string key with string payload\n");
   int ret, is_missing;
   khiter_t k;
   khash_t(khStrStr) *h = kh_init(khStrStr); // create a hashtable

   // Add a value Key "apple" to the hashtable
   k = kh_put(khStrStr, h, "apple", &ret); // add the key
   kh_value(h, k) = "fruit"; // set the value of the key

   // shorthand way to set a value in hash with single line. 
   kh_set(khStrStr, h, "jimbo", "artist");
   kh_set(khStrStr, h, "john haze", "engineer");
   kh_set(khStrStr, h, "jaki joiner", "architect");

   // Retrieve the value for key "apple"
   k = kh_get(khStrStr, h, "apple");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for apple");
   } else {
      printf ("key at apple=%s\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // Retrieve the value for key "apple"
   k = kh_get(khStrStr, h, "john haze");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for john haze\n");
   } else {
      printf ("key at john haze=%s\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // retreive key for the key "not_present"
   // which should print the error message
   k = kh_get(khStrStr, h, "not_present");  // first have to get ieter
   if (k == kh_end(h)) {  // k will be equal to kh_end if key not present
      printf("no key found for not_present\n");
   } else {
      printf ("key at not_present=%s\n", kh_val(h,k)); // next have to fetch  the actual value
   }

   // shorthand method to get value that returns value found
   // or specified default value if not present.   Would normally
   // use NULL default if key is not present but easier to printf
   // a real value.
   char *tval = kh_get_val(khStrStr, h, "apple", missing);
   printf ("shortcut tval for apple = %s\n", tval);

   tval = kh_get_val(khStrStr, h, "john haze", missing);
   printf ("shortcut tval for john haze = %s\n", tval);

   // missing key should return default value of -1
   tval = kh_get_val(khStrStr, h, "not_present", missing);
   printf ("shortcut tval for not_present = %s\n", tval);


   // Try to delete a key and then retrieve it
   // to see if it is really gone.
   k = kh_get(khStrStr, h, "john haze"); // get the ieterator
   if (k != kh_end(h)) {  // if it is found
      printf("deleting key john_haze\n");
      kh_del(khStrStr, h, k);  // then delete it.
   }
   // now see if it is really gone
   tval = tval = kh_get_val(khStrStr, h, "john haze", missing);
   printf ("after delete tval for john haze = %s\n", tval);


   // Ieterate the hash table by key, value and print out
   // the values found.
   printf("\nIeterate all keys\n");
   for (k = kh_begin(h); k != kh_end(h); ++k) {
      if (kh_exist(h, k)) {
         const char *key = kh_key(h,k);
         tval = kh_value(h, k);
         printf("key=%s  val=%s\n", key, tval);
      }
   }

   // cleanup and remove our hashtable
   kh_destroy(khStrStr, h);
   return 0;
}


int main()
{
   string_key_int_body_example();
   str_key_str_body_example();
}

resize issue, erase function added, clear function added

kv_resize does not take into account .n member. This is an issue if we are resizing to a smaller number of items.

Also:

I wrote an erase function, which yanks out an element that one wants to erase, and memory shifts remainder of the array over the value erased. n is reduced by one, max is held the same.

define kv_erase(type, v, i) ((--(v).n), memcpy(&kv_A(v, i), &kv_A(v, i + 1), ((v).n - i) * sizeof(type)))

I wrote a kv_clear function which is just a wrapper over kv_destroy (I renamed it kv_free) and kv_init

define kv_clear(v) kv_free(v); kv_init(v)

#9 "fix" broke kv_a

When I run the kvec example in the header:

kvec_t(int) array;
kv_init(array);
kv_push(int, array, 10); // append
kv_a(int, array, 20) = 5; // dynamic
kv_destroy(array);

The line on kv_a gives me error: lvalue required as left operand of assignment. However, when I revert the change done in issue #9, removing one level of parens, it works perfectly.

strange kb_generic_cmp function

Hello,
In kbtree.h why

define kb_generic_cmp(a, b) (((b) < (a)) - ((a) < (b)))

is not simply

define kb_generic_cmp(a, b) ((a) - (b))

Compiler warnings for kh_clear & kh_destroy with Keil uVision ARM CC

When I cross-compile on ARM using Keil ARM CC, the hash table functions work but I do get compiler warnings:
warning: #177-D: function "kh_clear" was declareed but never referenced
warning: #177-D: function "kh_destroy" was declareed but never referenced

Any reason why these warnings are popping up?

lock question

I plan to use khash as a cache in a long lived server process (web server).

At times I'll need to synchronize add / updates to khash.
I'd like to avoid locking the entire table. I wanted to confirm this strategy.

cache_entry;
kh_put(..)
lock()
if (put_ret == 0)
get & free entry
add / replace entry
unlock()

Can you describe the difference between the various ksw_*

@attractivechaos @lh3

I am trying to figure out the types of alignment based on the function names, guessing the following:

  • ksw_align - local alignment: a sub-sequence of the query aligned to a sub-sequence of the target
  • ksw_extend - a prefix of the query aligned to a prefix of the target**
  • ksw_global - alignment of the full query to the full target

** doesn't behave this way, which is why I am asking

Also, I have a ksw_glocal implementation if you are looking for a contribution.

Bug in kvec.h

I think there is a bug in kv_a function in kvec.h. Line 87 is currently:

: (v).n <= (size_t)(i)? (v).n = (i) \

But it should be:

: (v).n <= (size_t)(i)? (v).n = (i+1) \

Description is required.

Hello. I found that the lib is a little hard for me to learn how to use. Most of source code has no comment. I cann't understand the use of most of the library.

For example,
how to use double kmin_hj(kmin_f func, int n, double *x, void *data, double r, double eps, int max_calls) ?

If you can finish some minimal document of each function, I will be very appreciate:)

Thanks for your great library!

khash: kh_exists triggers a SIGSEGV when hash is empty

I encountered a crash in khash.h. kh_exists triggers a segmentation fault when the hash is empty.

When any key has been inserted, it works as expected:

// Working example
// Create first hash
khash_t(ptr)* hash1 = kh_init(ptr);

// Insert a value
int ret;
khiter_t k = kh_put(ptr, hash1, "key", &ret);
kh_value(hash1, k) = "asd";

// Check, if another key exists
k = kh_get(ptr, hash1, "another_key");
printf("Iterator: %u\n", k);

int val = kh_exist(hash1, k);
printf("Exists: %d\n", val);

But if no key is inserted (hash is empty), it crashes:

// Crashing example
// Create another hash
khash_t(ptr)* hash2 = kh_init(ptr);

// Check, if some key exists
k = kh_get(ptr, hash2, "key");
printf("Iterator: %u\n", k);

val = kh_exist(hash2, k); // <--- SIGSEGV
printf("Exists: %d\n", val);

memory frag question

Thanks for creating and maintain this lib.
My plan is to use khash to act as a cache in a long lived server process.
Over time my program will add / replace cache entries to a pretty well know MAX number of entries.

Should I be using a strategy to pre-alloc memory? Or is this something I need to be concerned with?

kvec : kv_a causes a compilation error

The example provided in the header file itself:

kv_a(int, array, 20) = 5;

doesn't compile under a C compiler, leading to the following message:

lvalue required as left operand of assignment

I think there is a bug in kv_a function in kvec.h. Lines 84-88 are currently:

#define kv_a(type, v, i) (((v).m <= (size_t)(i)? \
						  ((v).m = (v).n = (i) + 1, kv_roundup32((v).m), \
						   (v).a = (type*)realloc((v).a, sizeof(type) * (v).m), 0) \
						  : (v).n <= (size_t)(i)? (v).n = (i) + 1 \
						  : 0), (v).a[(i)])

#endif

hence kv_a(type, v, i) is a comma expression so cannot stand as a left value. I guess you have to remove the outer parenthesis.

EDIT
In standard C, a comma expression is not a lvalue contrary to C++. So the test file kvec_test.cc provided in the test folder compiles correctly. So the problem is a C-bug only.

[Question] Using code from Klib (Kson) into another project

In a project of mine, distributed under the LGPL license, I based a JSON reader/writer implementation upon the Kson parser code. Is the following header (placed on top of the modified files) enough to give credit to the original author ?

/* The MIT License

   Original work Copyright (c) 2014-2016 Heng Li <[email protected]>

   Permission is hereby granted, free of charge, to any person obtaining
   a copy of this software and associated documentation files (the
   "Software"), to deal in the Software without restriction, including
   without limitation the rights to use, copy, modify, merge, publish,
   distribute, sublicense, and/or sell copies of the Software, and to
   permit persons to whom the Software is furnished to do so, subject to
   the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
   EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
   NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
   BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
   ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
   CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
   SOFTWARE.
*/

//////////////////////////////////////////////////////////////////////////////////////////
//                                                                                      //
//  Modified work Copyright (c) 2016-2017 Leonardo Consoni <[email protected]>   //
//                                                                                      //
//  This file is part of Platform Utils.                                                //
//                                                                                      //
//  Platform Utils is free software: you can redistribute it and/or modify              //
//  it under the terms of the GNU Lesser General Public License as published            //
//  by the Free Software Foundation, either version 3 of the License, or                //
//  (at your option) any later version.                                                 //
//                                                                                      //
//  Platform Utils is distributed in the hope that it will be useful,                   //
//  but WITHOUT ANY WARRANTY; without even the implied warranty of                      //
//  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the                        //
//  GNU Lesser General Public License for more details.                                 //
//                                                                                      //
//  You should have received a copy of the GNU Lesser General Public License            //
//  along with Platform Utils. If not, see <http://www.gnu.org/licenses/>.              //
//                                                                                      //
//////////////////////////////////////////////////////////////////////////////////////////

kurl detect http to https redirecting

Hi,

Say I need to visit an url: http://myurl.com, while the website was automatically redirected to https://myurl.com
Is it possible that if I use kurl to visit http://myurl.com, could automatically visit https://myurl.com instead?

$ curl -I http://xxxx/xxx.bed.gz
HTTP/1.1 301 Moved Permanently
Date: Fri, 25 Apr 2014 15:35:39 GMT
Server: Apache/2.2.22 (Ubuntu)
Location: https://xxx/xxx.bed.gz
Vary: Accept-Encoding
Content-Type: text/html; charset=iso-8859-1

Thanks.

kbtree documentation

This has no documentation at all. I can decipher most of the functionality(getp,putp,delp)
but how can i iterate on the tree?

Got wrong hash for the same string for hash map

Reproduction code:

KHASH_MAP_INIT_STR(var, double);
static khash_t(var) *h = NULL;
static double get_var(const char *name) {
    khiter_t k = kh_get(var, h, name);
    double val = k != kh_end(h) ? kh_val(h, k) : 0;
    printf("value of %s is %g\n", name, val);
    return val;
}

static void set_var(const char *name, double val) {
    printf("set %s to %g\n", name, val);

    int ret;
    khiter_t k = kh_put(var, h, name, &ret);
    kh_value(h, k) = val;
}

void test() {
    if (h == NULL) {
        h = kh_init(var);
    }

    const char *foo = strdup("foo");

    set_var(foo, 20);
    free(foo);

    get_var("foo");
    get_var(strdup("foo"));
}

Result:

set foo to 20
value of foo is 0
value of foo is 20

If I turned on optimization:

set foo to 20
value of foo is 0
value of foo is 0

wtf

Consistent code style

I suggest running clang-format over all the code (with your preferred style, ofcourse).

kb_itr_end() ?

Hello,

it's not an issue. Just a kbtree question.

There's kb_itr_first() which works perfectly. Is there a way to implement function returning end() iterator of btree?

Thanks

Packaging?

Any plans on doing releases of klib? Versioning, tarballing and so on. I'd really like to see changelog notes outside the header files as well.

kh_put does't work?

init two khash map
khash_t(32) *hash_map_offset = kh_init(32);
khash_t(32) *hash_map_pktlen = kh_init(32);

after some kh_put and kh_del operating, kh_put will work incorrectly.

kvec.h kv_a need an additional level of parentheseis

Should be

define kv_a(type, v, i) (((v).m <= (size_t)(i)? \

                      ((v).m = (v).n = (i) + 1, kv_roundup32((v).m), \
                       (v).a = (type*)realloc((v).a, sizeof(type) * (v).m), 0) \
                      : (v).n <= (size_t)(i)? (v).n = (i)           \
                      : 0), (v).a[(i)] )

Otherwise, , in function call variable number of arguments, like printf,the last comma in the kv_a define will cause wrong argument mapping.

klib overview states khash is based on double hashing, but it looks like this is no longer true

It looks like it isn't true anymore because khash.h contains this comment:

2013-05-02 (0.2.8):

    * Use quadratic probing. When the capacity is power of 2, stepping function
      i*(i+1)/2 guarantees to traverse each bucket. It is better than double
      hashing on cache performance and is more robust than linear probing.

      In theory, double hashing should be more robust than quadratic probing.
      However, my implementation is probably not for large hash tables, because
      the second hash function is closely tied to the first hash function,
      which reduce the effectiveness of double hashing.

    Reference: http://research.cs.vt.edu/AVresearch/hashing/quadratic.php

multiple definition of `kb_init_xxx`

When a.h file is used to initialize a Btree structure with KBTREE_INIT, a.c includes a.h, then include a.h in main.c and a multiple definition error will occur with the a.o link.

I think the reason is that kbtree.h:62 has less static keywords.

I am working normally after adding.

`kh_int_hash_func` does nothing

Hi, sorry if I'm missing something obvious: Why is kh_int_hash_func defined to only return the key?

#define kh_int_hash_func(key) (khint32_t)(key)

Thanks.

kh_exist when using __ac_Wang_hash

Not sure if it happens with other hash functions, but when I use __ac_Wang_hash
khit = kh_get(h,key)
If key does not exist: khit = h->n_buckets.
If i then do kh_exist(h,khit) it sometimes returns TRUE, when it should be FALSE.
We can change:

define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)))

TO this:

define kh_exist(h, x) (!__ac_iseither((h)->flags, (x)) && ((x) < (h)->n_buckets))

  1. Should I have used (khit < kh_end(h)) and not kh_exist(h,khit)?
  2. Would the proposed change have any unwanted side-effect?

is khash thread-safe when used read-only?

Hi,

I want to populate a hash set in the beginning of program and want to do lookups from several threads. Hash set will be read-only. Is khash thread-safe for this type of situation (i.e concurrent readers)? I looked documentation but couldn't find anything about it.

infinite loop in kseq_read for gz files that fail crc check

When a gz file fails the crc check kseq_read enters an infinite loop.

This happens in https://github.com/attractivechaos/klib/blob/master/kseq.h#L101

The __read method when using gzread returns -1 when the crc check fails and the for loop never exits.

I have reproduced this bug using the test program from your website and the two supplied files. They are identical except I messed with a CRC byte in t.fastq.gz

reads_2.fastq.gz
t.fastq.gz

Interestingly seqtk does not fail, although it doesn't read all the lines

> seqtk seq t.fastq.gz | wc -l
   39842
> seqtk seq reads_2.fastq.gz| wc -l
   40000

and gzcat (mac version of zcat) notices the error, but it returns the original data.

> gzcat reads_2.fastq.gz > /dev/null
> gzcat t.fastq.gz>  /dev/null
gzcat: invalid compressed data--crc error
gzcat: t.fastq.gz: uncompress failed

kh_get question

Hello,

I have a question about how kh_get is implemented in khash. https://github.com/attractivechaos/klib/blob/master/khash.h#L237 (Line 237)
Before the loop runs, the last index is set to the starting value of i.
In the worst case scenario, the loop would have to go through all the elements.
But as the code is currently written, I believe the loop will make 2*h->n_buckets -1 iterations.
If the assignment of last is moved inside the loop, the iterations should become h->n_buckets.
So my question is whether I'm wrong in this assumption?

Also, if my math is correct, using if(step == h->n_buckets) should also work (which would also remove the need for the last variable. Or maybe it should be if(step > h->n_buckets).

Storing and returning pointer values?

I'm replacing my own map implementation with khash in a given project. My old map could return pointers or values. This pattern still makes sense for my project. However, khash is built to return values for performance reasons, I believe, so returning pointers reliably isn't really supported, FWICS (?).

I'd like to use khash for both performance-critical and less demanding work where I just need string-key/pointer-value pairs, so am here to ask, what would you do to store/return a pointer in a khash map?

I can see 3 ways:

  • Use khash to store array indices rather than pointers... but here we'd have to also ref into that array
  • Hack it using int64_t... this should theoretically work on systems with smaller word sizes as well, but it seems a bit dodgy... my project needs to work for Arm (32-bit) & Intel (64-bit)
  • The better solution seems to be to use intptr_t as the value type, but am worried this will have some problems.

Any advice greatly appreciated.

Error in Make Test!

$ g++ -v

Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.1-2ubuntu2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.1 20160929 (Ubuntu 5.4.1-2ubuntu2) 

klib/test$ make

gcc -g -Wall -O2 -I.. -o kbtree_test kbtree_test.c
In file included from kbtree_test.c:10:0:
../kbtree.h:333:13: warning: ‘kb_itr_get_int’ defined but not used [-Wunused-function]
  static int kb_itr_get_##name(kbtree_##name##_t *b, const key_t * __restrict k,
             ^
../kbtree.h:372:2: note: in expansion of macro ‘__KB_ITR’
  __KB_ITR(name, key_t)
  ^
kbtree_test.c:11:1: note: in expansion of macro ‘KBTREE_INIT’
 KBTREE_INIT(int, uint32_t, kb_generic_cmp)
 ^
../kbtree.h:333:13: warning: ‘kb_itr_get_str’ defined but not used [-Wunused-function]
  static int kb_itr_get_##name(kbtree_##name##_t *b, const key_t * __restrict k,
             ^
../kbtree.h:372:2: note: in expansion of macro ‘__KB_ITR’
  __KB_ITR(name, key_t)
  ^
kbtree_test.c:12:1: note: in expansion of macro ‘KBTREE_INIT’
 KBTREE_INIT(str, str_t, kb_str_cmp)
 ^
gcc -g -Wall -O2 -I.. -o khash_keith khash_keith.c
gcc -g -Wall -O2 -I.. -o khash_keith2 khash_keith2.c
khash_keith2.c: In function ‘main’:
khash_keith2.c:32:9: warning: unused variable ‘l’ [-Wunused-variable]
  int i, l, n = 1000, ret;
         ^
gcc -g -Wall -O2 -I.. -o khash_test khash_test.c
gcc -g -Wall -O2 -I.. -o klist_test klist_test.c
gcc -g -Wall -O2 -I.. -o kseq_test kseq_test.c -lz
gcc -g -Wall -O2 -I.. -o kseq_bench kseq_bench.c -lz
gcc -g -Wall -O2 -I.. -o kseq_bench2 kseq_bench2.c -lz
In file included from kseq_bench2.c:6:0:
kseq_bench2.c: In function ‘ks_getc’:
kseq_bench2.c:7:19: warning: implicit declaration of function ‘read’ [-Wimplicit-function-declaration]
 KSTREAM_INIT(int, read, 4096)
                   ^
../kseq.h:74:14: note: in definition of macro ‘__KS_GETC’
    ks->end = __read(ks->f, ks->buf, __bufsize); \
              ^
kseq_bench2.c:7:1: note: in expansion of macro ‘KSTREAM_INIT’
 KSTREAM_INIT(int, read, 4096)
 ^
kseq_bench2.c: In function ‘main’:
kseq_bench2.c:39:3: warning: implicit declaration of function ‘close’ [-Wimplicit-function-declaration]
   close(fd);
   ^
gcc -g -Wall -O2 -I.. -o ksort_test ksort_test.c
g++ -g -Wall -O2 -I.. -o ksort_test-stl ksort_test.cc
In file included from ksort_test.cc:7:0:
ksort_test.cc: In function ‘void ks_sample_int(size_t, size_t, int*)’:
../ksort.h:276:10: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
    if (k != n - pop - 1) tmp = a[k], a[k] = a[n-pop-1], a[n-pop-1] = tmp; \
          ^
../ksort.h:295:36: note: in expansion of macro ‘KSORT_INIT’
 #define KSORT_INIT_GENERIC(type_t) KSORT_INIT(type_t, type_t, ks_lt_generic)
                                    ^
ksort_test.cc:8:1: note: in expansion of macro ‘KSORT_INIT_GENERIC’
 KSORT_INIT_GENERIC(int)
 ^
ksort_test.cc: In instantiation of ‘void _RadixSort_Unsigned_PowerOf2Radix_1(_Type*, long int, _Type, long unsigned int) [with _Type = unsigned int; long unsigned int PowerOfTwoRadix = 256ul; long unsigned int Log2ofPowerOfTwoRadix = 8ul; long int Threshold = 32l]’:
ksort_test.cc:776:102:   required from here
ksort_test.cc:748:55: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
   while( _current >= startOfBin[ nextBin ] && nextBin < numberOfBins )
                                                       ^
ksort_test.cc:750:70: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
  while( endOfBin[ nextBin - 1 ] == startOfBin[ nextBin ] && nextBin < numberOfB
                                                                     ^
g++ -g -Wall -O2 -I.. -o kvec_test kvec_test.cc
gcc -g -Wall -O2 -I.. -o kmin_test kmin_test.c ../kmath.c
/tmp/cc13nLlA.o: In function `kr_normal':
/home/guest/klib/test/../kmath.c:81: undefined reference to `log'
/home/guest/klib/test/../kmath.c:81: undefined reference to `sqrt'
/tmp/cc13nLlA.o: In function `kf_lgamma':
/home/guest/klib/test/../kmath.c:309: undefined reference to `log'
/home/guest/klib/test/../kmath.c:309: undefined reference to `log'
/tmp/cc13nLlA.o: In function `_kf_gammap':
/home/guest/klib/test/../kmath.c:369: undefined reference to `log'
/home/guest/klib/test/../kmath.c:369: undefined reference to `log'
/tmp/cc13nLlA.o: In function `_kf_gammaq':
/home/guest/klib/test/../kmath.c:390: undefined reference to `log'
/tmp/cc13nLlA.o:/home/guest/klib/test/../kmath.c:390: more undefined references to `log' follow
/tmp/cc13nLlA.o: In function `kf_betai_aux':
/home/guest/klib/test/../kmath.c:432: undefined reference to `exp'
/tmp/cc13nLlA.o: In function `kf_erfc':
/home/guest/klib/test/../kmath.c:336: undefined reference to `exp'
/tmp/cc13nLlA.o: In function `_kf_gammap':
/home/guest/klib/test/../kmath.c:369: undefined reference to `exp'
/tmp/cc13nLlA.o: In function `_kf_gammaq':
/home/guest/klib/test/../kmath.c:390: undefined reference to `exp'
collect2: error: ld returned 1 exit status
Makefile:48: recipe for target 'kmin_test' failed
make: *** [kmin_test] Error 1

how fix this?

How to use khash in a new thread

I'm not looking for locking or a lock-free implementation of khash (although that would be nice!)

I hate it declared as this at the top of the .c file:

KHASH_MAP_INIT_STR(string_hashmap, hw_route_entry*)

In the .h file I have this:

extern void* routes;

Later on I spawn a new thread, and inside the new thread I do the following

void set_route(void* hashmap, char* name, hw_route_entry* route_entry)
{
    routes = kh_init(string_hashmap);
    int ret;
    khiter_t k;
    khash_t(string_hashmap) *h = hashmap;
    k = kh_put(string_hashmap, h, strdup(name), &ret);
    kh_value(h, k) = route_entry;
}

The kh_put() call segfaults with the following:

Program received signal SIGSEGV, Segmentation fault.
0x000000000041a390 in __ac_X31_hash_string (s=0xffffffffb0000950 <Address 0xffffffffb0000950 out of bounds>)
    at lib/haywire/src/haywire/khash.h:387
387     khint_t h = (khint_t)*s;
(gdb) bt
#0  0x000000000041a390 in __ac_X31_hash_string (s=0xffffffffb0000950 <Address 0xffffffffb0000950 out of bounds>)
    at lib/haywire/src/haywire/khash.h:387
#1  0x000000000041aa48 in kh_put_string_hashmap (h=0x7fffb0000920, 
    key=0xffffffffb0000950 <Address 0xffffffffb0000950 out of bounds>, ret=0x7fffb83e1dd8)
    at lib/haywire/src/haywire/http_server.c:34

I'm pretty new to C, is there something I need to do to make this work in a separate thread? It works fine if I don't use a thread. I'm not sure why there's a problem because I'm calling all the khash methods mentioned above in the same thread.

Any help would be greatly appreciated :)

About Declarations and Definitions of khash

KHASH_DECLARE declares with extern, but KHASH_INIT is replaced with KHASH_INIT2(SCOPE = static kh_inline), SCOPE is always static. I want to know how to use KHASH_DECLARE. It looks like a bug.

Thanks to you.

kbtree iterator interface

I've stumbled upon this thread while looking for an adequate way to iterate over kbtree keys. Then i've ran across this kbtree variation which exposes an iterator interface instead of __kb_traverse. The following example illustrates their usage:

        int cnt = 0;
        kbitr_t itr;
        kb_itr_first_int(h, &itr);
        do {
            ++cnt;
            printf("%u\n", kb_itr_key(int, &itr));
        } while (kb_itr_next_int(h, &itr) != 0); 
        printf("[ht_khash_int] traverse size: %d\n", cnt);

Assuming that this repo is the "official" for klib, any chance that iterator version can be landed here?

Please enable the wiki on github

I've been writing some documentation for klib for personal reference, and if you enabled the wiki feature on github, i'd be able to upload it for all to see and benefit. Its a great library, and with some good docs it has great potential. Thanks again for writing it.

kgraph

Thank you for klib. Awesome!
Any plans for a basic graph library?

Thanks.

A question about how to use khash.h

I wrote the following code, kh_put -> kh_del -> kh_put

KHASH_MAP_INIT_INT(1, int)
KHASH_MAP_INIT_INT(2, int)
int main()
{
int ret;
khiter_t k;
khiter_t j;
khash_t(1) *h = kh_init(1);
khash_t(2) *i = kh_init(2);

int x = 0, y = 0;
for (; x < 100; x++)
{
    k = kh_put(1, h, x, &ret);
    kh_value(h, k) = x;

    j = kh_put(2, i, x, &ret);
    kh_value(i, j) = x + 1;
}

printf("h size:%d, i size: %d\n", kh_size(h), kh_size(i));

for (x = kh_begin(h); x < kh_end(h); x++)
{
    kh_del(1, h, x);
}

for (x = kh_begin(i); x < kh_end(i); x++)
{
    kh_del(2, i, x);
}

printf("h size:%d, i size: %d\n", kh_size(h), kh_size(i));

for (; x < 100; x++)
{
    k = kh_put(1, h, x, &ret);
    kh_value(h, k) = x;

    j = kh_put(2, i, x, &ret);
    kh_value(i, j) = x + 1;
}
printf("h size:%d, i size: %d\n", kh_size(h), kh_size(i));
return 0;

}

expect the result should be
h size:100, i size: 100
h size:0, i size: 0
h size:100, i size: 100

but got:
h size:100, i size: 100
h size:0, i size: 0
h size:0, i size: 0

can anyone tell me what's the problem? thanks.

calculating size for flags array

In the version "0.2.8" of khash you have changed the method used to calculate the size of the flags array (__ac_fsize). This change results in Keys being overwritten or not stored correctly. In my project, I changed the '__ac_fsize' function as following:

#define __ac_fsize(m) (((m)>>4) + 1)

The following implementation:

#define __ac_fsize(m) ((m) < 16? 1 : (m)>>4)

does not ensure that the size of flags > n_buckets, right?

Thanks in advance,
TB

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.