

Bloom Filters

Given a file hash (SHA, MD5), a Bloom filter gives a
      quick pre-check before doing an intensive lookup

+ Very fast to identify that a file is unknown
+ Fixed, known distribution size
+ Easy updates
+ No false negatives

- False positives possible
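
A minimal sketch of the idea in Python, not the implementation behind this slide. The class name `BloomFilter`, the use of SHA-256 to derive bit positions, and the double-hashing scheme are all assumptions made for illustration. It shows why a "no" answer is definitive (no false negatives) while a "yes" answer still needs the full lookup.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; names and hashing scheme are assumptions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)  # all bits start at 0

    def _positions(self, item):
        # Derive num_hashes bit positions from one SHA-256 digest using
        # double hashing: position_i = (h1 + i * h2) mod num_bits.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force odd, never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def __contains__(self, item):
        # False means "definitely not added"; True means "possibly added".
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(item))


# Small filter for demonstration; the slide's sizes are 2**29 and 2**32 bits.
known = BloomFilter(num_bits=2**16, num_hashes=30)
known.add("5d41402abc4b2a76b9719d911017c592")  # MD5 of "hello"
known.add("d41d8cd98f00b204e9800998ecf8427e")  # MD5 of the empty string

for md5 in ("5d41402abc4b2a76b9719d911017c592",
            "00000000000000000000000000000000"):
    if md5 in known:
        print(md5, "possibly known: do the intensive lookup")
    else:
        print(md5, "definitely unknown: skip the lookup")
```

Updating the filter is just another `add` call, which sets a few more bits in place and leaves the distributed size unchanged.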

Fitting 10,000,000 MD5s on a CD distro
      2**29 bits (64 MiB), 30 hashes, 10**7 inputs = 0.000000% FP (about 9e-10 %)

Fitting 100,000,000 MD5s on a DVD distro
      2**32 bits (512 MiB), 30 hashes, 10**8 inputs = 0.000000109% FP
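
The figures above can be checked with the standard approximation for the false-positive rate, p = (1 - e^(-k*n/m))^k, for m bits, k hashes, and n inputs. That the slide used this exact formula is an assumption, and the function name `bloom_fp_rate` is hypothetical, but the approximation reproduces the DVD figure.

```python
import math


def bloom_fp_rate(num_bits, num_hashes, num_items):
    """Approximate false-positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


cd = bloom_fp_rate(2**29, 30, 10**7)   # CD distro: 10,000,000 MD5s
dvd = bloom_fp_rate(2**32, 30, 10**8)  # DVD distro: 100,000,000 MD5s

# Filter sizes in MiB: bits -> bytes -> MiB
cd_mib = 2**29 // 8 // 2**20
dvd_mib = 2**32 // 8 // 2**20

print(f"CD:  {cd_mib} MiB, {cd * 100:.3e}% FP")
print(f"DVD: {dvd_mib} MiB, {dvd * 100:.9f}% FP")
```

The CD rate is nonzero but too small to show at six decimal places, which is why it appears as 0.000000%.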