Share: Brisingr

Good hunting, O Wild Ones. May the wind rise under your wings, may the sun always be at your backs, and may you catch your prey napping. And, Wolf-Eyes, I hope that when you find the one who left your paws in his traps, you do not kill him too quickly.

DRY, YAGNI and other principles.

Better version of those randomly thrown about terms. Taken from here
That the Temple is in need; that the need is best met by code; that the code does not yet exist; that the existence may be achieved with reasonable effort; that the effort is best expended now and by myself.

Share: Brisingr

“I have not the slightest idea. Tenga always had a question he was trying to answer. If he succeeded, he immediately chose another one, and so on. He may have answered a hundred questions since I last saw him, or he may still be gnashing his teeth over the same conundrum as when I left him.” (Moon+ Reader Pro v2.6.5, Brisingr)

Share: Brisingr

To while away the day contemplating evils that might have been is to poison the happiness we already have (Moon+ Reader Pro v2.6.5, Brisingr)

Bloom Filter — How the code looks and reads

And here comes bloom filter. This one's not from any OSS project, but found here.
<<bloom.h>>=
#ifndef __BLOOM_H__
#define __BLOOM_H__

#include<stdlib.h>

typedef unsigned int (*hashfunc_t)(const char *);
typedef struct {
size_t asize;
unsigned char *a;
size_t nfuncs;
hashfunc_t *funcs;
} BLOOM;

BLOOM *bloom_create(size_t size, size_t nfuncs, ...);
int bloom_destroy(BLOOM *bloom);
int bloom_add(BLOOM *bloom, const char *s);
int bloom_check(BLOOM *bloom, const char *s);

#endif
Again, the data structure doesn't tell the story, but hints some things. Infact, I think this is one of those "more algorithms than data structure" category math object, even though it's more often called a data structure. Here's the code that actually does the clever bit.
#define SETBIT(a, n) (a[n/CHAR_BIT] |= (1<<(n%CHAR_BIT)))
#define GETBIT(a, n) (a[n/CHAR_BIT] & (1<<(n%CHAR_BIT)))
int bloom_add(BLOOM *bloom, const char *s)
{
size_t n;

for(n=0; n<bloom->nfuncs; ++n) {
SETBIT(bloom->a, bloom->funcs[n](s)%bloom->asize);
}

return 0;
}

int bloom_check(BLOOM *bloom, const char *s)
{
size_t n;

for(n=0; n<bloom->nfuncs; ++n) {
if(!(GETBIT(bloom->a, bloom->funcs[n](s)%bloom->asize))) return 0;
}

return 1;
}
Let's take closer look at those two functions above. 1. bloom_add, Let's ignore the for loop, as it seems to apply a function to the given value before setting the BF's bit. For the moment assume it's one function. The macro definition, finds the character corresponding(i.e: by taking modulo of the value/systems' default CHAR_BIT value, as defined by limits.h.) What does SETBIT do?(It takes in a pointer(array of chars, a value(after applying one of the hash functions defined to original value and taking a modulo of asize(filter size ) So far it has all been stuff, that's unrelated to the interesting math part, but necessary for the <side effects galore> real world computation. At the core of the logic, it's straightforward: a, Left shift the Char value( after finding modulo, for splitting the characters inside the char array) b, Or it with the already existing char value. 2. For the bloom_check function,Ignoring the for loop, we find the GETBIT macro as the core functionality. The GETBIT, simply returns the result of 'AND' operation on the existing value, and the new value. So this is all fine and dandy, but why is this special? or for that matter, how's it different from some other scheme. We the modulo,(i.e splitting the input value into <sizeof(char)> sized parts ) does nothing to limit the size of the input value. Here's where that for loop we ignored comes in. It applies one hash function(or many) to the input. These hash functions do the job of transforming/mapping the actual unlimited sized input into a limited sized char array. Here's the hash function code that comes along with the example program at the site.
unsigned int sax_hash(const char *key)
{
unsigned int h=0;

while(*key) h^=(h<<5)+(h>>2)+(unsigned char)*key++;

return h;
}

unsigned int sdbm_hash(const char *key)
{
unsigned int h=0;
while(*key) h=(unsigned char)*key++ + (h<<6) + (h<<16) - h;
return h;
}

So how exactly does this hash function, take in a random set of characters and always produces a finite size/limited number of characters? And what are the trade offs? How does it maintain unique mapping under that condition? when does it fail? Stay Tuned.

Share: Dune: House Corrino

All proofs inevitably lead to propositions that have no proof. All things are known because we want to believe in them. (Moon+ Reader Pro v2.6.5, Dune: House Corrino)

Share: Dune: House Corrino

Shaddam needed someone he could trust to act without question, because there would be many questions (Moon+ Reader Pro v2.6.5, Dune: House Corrino)

Quotes:House Harkonnen

The capacity to learn is a gift; The ability to learn is a skill; The willingness to learn is a choice. (Moon+ Reader Pro v2.6.5, Dune: House Harkonnen)

Share: Dune: House Atreides

Blindness can take many forms other than the inability to see. Fanatics are often blinded in their thoughts. Leaders are often blinded in their hearts. —The Orange Catholic Bible (Moon+ Reader Pro v2.6.5, Dune: House Atreides)