Authors
Toan Nguyen Mau and Yasushi Inoguchi, JAIST Inoguchi Laboratory, Japan
Abstract
Locality-sensitive hashing(LSH) is a significant algorithm for big-data hashing. The original LSH uses a static hash-table as a reduce mapping for the data. Which make LSH challenging to apply on real-time information retrieval system. The database of a realtime system needs to be scalably updated over time. In this research, we concentrate on increasing the accuracy, searching speed and throughput of the nearest neighbor searching problem on big dynamic database. The dynamic Locality-sensitive hashing(DLSH) is proposed for facing the static problem of original LSH. DLSH is targeted for deploying on main memory or GPGPU's global memory, which can increase the throughput searching by parallel processing on multiple cores. We analyzed the efficiency of DLSH by building the big dataset of structured audio fingerprint and comparing the performance with original LSH. To achieve the dynamics, DLSH requires more memory space and takes slightly slower than the LSH. With DLSH's advantages, it can be improved and fully applied in practice in a real-life information retrieval system.
Keywords
Locality-sensitive hashing, Structured dataset, GPGPU Memory, Similarity Searching, Parallel Processing