Seeing the Unseen Network: Inferring Hidden Social Ties from Respondent-Driven Sampling
This work addresses a key challenge in epidemiological and public health research: understanding social ties in hidden populations. Its contribution to network inference methods is incremental.
The paper tackles the problem of inferring hidden social network structure from respondent-driven sampling data, which is collected when surveying hard-to-reach populations such as drug users. It develops an algorithm called RENDER that reconstructs the underlying network, and it demonstrates the algorithm's effectiveness on synthetic and real data.
Learning about the social structure of hidden and hard-to-reach populations --- such as drug users and sex workers --- is a major goal of epidemiological and public health research on risk behaviors and disease prevention. Respondent-driven sampling (RDS) is a peer-referral process widely used by many health organizations, where research subjects recruit other subjects from their social network. In such surveys, researchers observe who recruited whom, along with the time of recruitment and the total number of acquaintances (network degree) of respondents. However, due to privacy concerns, the identities of acquaintances are not disclosed. In this work, we show how to reconstruct the underlying network structure through which the subjects are recruited. We formulate the dynamics of RDS as a continuous-time diffusion process over the underlying graph and derive the likelihood for the recruitment time series under an arbitrary recruitment time distribution. We develop an efficient stochastic optimization algorithm called RENDER (REspoNdent-Driven nEtwork Reconstruction) that finds the network that best explains the collected data. We support our analytical results through an exhaustive set of experiments on both synthetic and real data.
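To make the data-generating process concrete, the following is a minimal sketch of RDS viewed as a continuous-time diffusion over a known graph. It is an illustration only, not the RENDER algorithm and not the paper's likelihood. It assumes exponentially distributed recruitment times along each edge (one possible choice of recruitment time distribution) and ignores practical details such as coupon limits. The function name `simulate_rds`, the adjacency-dictionary input, and the `rate` parameter are all hypothetical choices made for this example.

```python
import heapq
import random


def simulate_rds(adj, seed, rate=1.0, n_recruits=None, rng=None):
    """Simulate a simplified RDS process on graph `adj` (dict: node -> set of neighbours).

    Assumption: every edge from a recruited subject to a not-yet-recruited
    neighbour fires after an independent Exp(rate) waiting time.
    Returns the data a researcher would observe, in recruitment order:
    (subject, recruiter, recruitment_time, degree).
    """
    rng = rng or random.Random(0)
    recruited = {seed: (None, 0.0)}  # subject -> (recruiter, time)
    heap = []  # pending recruitment events: (time, recruiter, recruit)

    def push_edges(u, t):
        # Schedule a candidate recruitment along each edge to an unrecruited neighbour.
        for v in adj[u]:
            if v not in recruited:
                heapq.heappush(heap, (t + rng.expovariate(rate), u, v))

    push_edges(seed, 0.0)
    while heap and (n_recruits is None or len(recruited) < n_recruits):
        t, u, v = heapq.heappop(heap)
        if v in recruited:
            continue  # v was already recruited by someone else earlier
        recruited[v] = (u, t)
        push_edges(v, t)

    ordered = sorted(recruited.items(), key=lambda kv: kv[1][1])
    # Only the recruiter, the time, and the degree are observed; the identities
    # of the other neighbours stay hidden, which is what makes inference hard.
    return [(v, r, t, len(adj[v])) for v, (r, t) in ordered]


if __name__ == "__main__":
    toy_graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    for record in simulate_rds(toy_graph, seed=0):
        print(record)
```

The output contains exactly the three kinds of observation listed in the abstract: who recruited whom, when, and each respondent's degree. The reconstruction task is the inverse problem: given only such records, find a graph that best explains them.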