What are the chances of sampling the same thing twice?
How many unique bugs are in there?
Creation of test communities
Similarity/distance coefficient on binary data