Genetic genealogy search has emerged as a powerful technique for identifying individuals by leveraging their genetic information and a genealogical network. The current practice relies on searching within a pre-constructed database containing genetic data from many individuals, and as such exposes those in the database to substantial privacy risks.
Motivated by these privacy concerns, we propose a framework for genealogy search that takes into account the amount of privacy exposure. In contrast to the existing static approach of collecting a large amount of genetic data beforehand, we advocate a new search paradigm in which genetic samples are accessed sequentially.
Our results show that carefully designed sequential search procedures can significantly outperform existing static approaches in terms of the trade-off between cost and privacy exposure. We further characterize the optimal trade-off and propose a family of search strategies that provably achieve it over path- and grid-like networks.
Finally, we validate our findings via numerical experiments on both real and synthetic genealogical networks and discuss the policy implications of our results.