Skip to content

Implementation of Cuckoo Filter, a probabilistic data structure used for fast set membership testing.

License

Notifications You must be signed in to change notification settings

tradman1/cuckoo-filter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

65 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cuckoo-filter

Implementation of Cuckoo Filter, a probabilistic data structure used for fast set membership testing.

This is a project for Bionformatics course at FER, University of Zagreb. You can visit the course website here.

Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. Unlike Bloom filter, Cuckoo filters provide the flexibility to add and remove items dynamically. A cuckoo filter is based on cuckoo hashing and it is essentially a cuckoo hash table storing each key's fingerprint. Cuckoo hash tables can be highly compact and can use less space than Bloom filters, which is useful for applications that require low false positive rates.

Build

To build the example test:

$ make test

To build the benchmarks:

$ make benchmark

To build the main program for genome testing:

$ make main

Run

test and benchmark are run with following parameter:

  • Number of inserted elements: int

Cuckoo filter can be tested with by running main in the terminal with the following parametrs:

  • Apsolute path to genome txt:string
  • Length of k-meres:int
  • Algorithm:int (it represents wheater lookup will be done with random generated strings or substrings from uploaded genome):
    • 1 – substrings from genome
    • else – random strings

The program will generate an output .txt file that will have:

  • Number of inserted substrings
  • Percentage of inserted substrings
  • Percentage of false positive rates
  • Execution speed

Authors

About

Implementation of Cuckoo Filter, a probabilistic data structure used for fast set membership testing.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published