Vast numbers of counts are recorded both in fundamental search structures, such as binary search trees augmented with access frequencies, and in retrieval structures, such as inverted files. These counts improve effectiveness, but they take up space. In this project, we explore the trade-offs between effectiveness, query processing time, and space consumption that arise when sketching approaches, which come with mathematical proofs of performance, are brought to search and retrieval.


Required knowledge: C/C++/Java programming experience; strong competence with algorithms and data structures

