In the thesis proposal we intended to analyze a new framework for nearest neighbor search in high-dimensional Euclidean spaces and its applications, but the initial work did not give the results we had hoped for.
The work on NNS has been temporarily put on hold for an internship at Microsoft Bing in London which focused on algorithm engineering of succinct data structures. The work resulted in an implementation of a structure for prefix search with precomputed results which is competitive in time with classical data structures but with significantly smaller space occupancy. A paper with an experimental analysis is currently being written.
This work also spun off a new technique, which we called “semi-indexing”, to speed up the access data encoded in semi-structured form (such as JSON or XML) by storing information about the parse tree along with the data, while keeping the overhead practically negligible thanks to the use of succinct data structures. A paper has been submitted to a conference and we are waiting for response.
Work on NNS has been recently resumed in the context of an internship at Microsoft Research Cambridge which focuses on the application of the techniques developed for nearest neighbor search in Project Emporia, real-time recommendation engine for news articles.
For the future, we propose to continue working on nearest neighbor search problems, putting the emphasis on the practicality of the approaches considered and their real-world applications.