Joshua Vogelstein

ML
6 papers
76 citations
Novelty 47%
AI Score 23

6 Papers

ML Apr 1, 2021
Dynamic Silos: Increased Modularity in Intra-organizational Communication Networks during the Covid-19 Pandemic

Tiona Zuzul, Emily Cox Pahnke, Jonathan Larson et al.

Workplace communications around the world were drastically altered by Covid-19, related work-from-home orders, and the rise of remote work. To understand these shifts, we analyzed aggregated, anonymized metadata from over 360 billion emails within 4,361 organizations worldwide. By comparing month-to-month and year-over-year metrics, we examined changes in network community structures over 24 months before and after Covid-19. We also examined shifts across multiple communication media (email, instant messages, video calls, and calendaring software) within a single global organization, and compared them to communications shifts that were driven by changes in formal organizational structure. We found that, in 2020, organizations around the world became more siloed than in 2019, evidenced by increased modularity. This shift was concurrent with decreased stability within silos. Collectively, our analyses indicate that following the onset of Covid-19, employees began to shift more dynamically between subcommunities (teams, workgroups or functional areas). At the same time, once in a subcommunity, they limited their communication to other members of that community. We term these network changes dynamic silos. We provide initial insights into the meaning and implications of dynamic silos for the future of work.
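
The abstract above uses modularity as its evidence of siloing. As a rough illustration only, the sketch below computes the standard Newman modularity of a fixed partition on a small undirected, unweighted graph. The function name `modularity`, the edge-list input, and the toy graph are assumptions made for this example; the paper's own pipeline and community-detection method are not described here.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def modularity(edges: Iterable[Edge], community: Dict[Hashable, Hashable]) -> float:
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of  L_c / m - (d_c / (2 m))^2
    where m is the number of edges, L_c the number of edges with both
    endpoints in c, and d_c the total degree of the vertices in c.
    """
    edge_list = list(edges)
    m = len(edge_list)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    internal_edges = defaultdict(int)  # L_c
    degree_sum = defaultdict(int)      # d_c
    for u, v in edge_list:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal_edges[cu] += 1

    return sum(
        internal_edges[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical toy network: two triangles joined by a single bridge edge.
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_silos = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {v: "all" for v in range(6)}

print(modularity(toy_edges, two_silos))  # 5/14, roughly 0.357
print(modularity(toy_edges, one_group))  # 0.0
```

Higher values mean that more edges fall inside communities than a degree-matched random graph would predict, which is the sense in which increased modularity is read as more siloed communication.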

DC Nov 10, 2020
PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment

Meghana Madhyastha, Kunal Lillaney, James Browne et al.

We present methods to serialize and deserialize tree ensembles that optimize inference latency when models are not already loaded into memory. This arises whenever models are larger than memory, but also systematically when models are deployed on low-resource devices, such as in the Internet of Things, or run as Web micro-services where resources are allocated on demand. Our packed serialized trees (PACSET) encode reference locality in the layout of a tree ensemble using principles from external memory algorithms. The layout interleaves correlated nodes across multiple trees, uses leaf cardinality to collocate the nodes on the most popular paths and is optimized for the I/O blocksize. The result is that each I/O yields a higher fraction of useful data, leading to a 2-6 times reduction in classification latency for interactive workloads.
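
To make the locality idea above concrete, here is a deliberately simplified sketch. It is not the PACSET layout algorithm. It assumes complete binary trees stored in heap order, assumes the left child is always the more popular one, and measures cost as the number of distinct fixed-size blocks touched by one inference. The function names and parameters are hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (tree id, heap index inside that tree)


def per_tree_layout(n_trees: int, n_nodes: int) -> List[Node]:
    """Baseline: each tree is stored contiguously, in heap order."""
    return [(t, i) for t in range(n_trees) for i in range(n_nodes)]


def packed_layout(n_trees: int, n_nodes: int) -> List[Node]:
    """Toy packing: all roots first (interleaved across trees), then each
    tree's remaining nodes depth-first with the popular (left) child first."""
    layout: List[Node] = [(t, 0) for t in range(n_trees)]
    for t in range(n_trees):
        stack = [0]
        while stack:
            i = stack.pop()
            if i != 0:
                layout.append((t, i))
            left, right = 2 * i + 1, 2 * i + 2
            # Push right before left so that left is popped (stored) first.
            for child in (right, left):
                if child < n_nodes:
                    stack.append(child)
    return layout


def blocks_touched(layout: List[Node], path: List[int],
                   n_trees: int, block_size: int) -> int:
    """Distinct blocks read when every tree follows the same heap-index path."""
    position = {node: p for p, node in enumerate(layout)}
    return len({position[(t, i)] // block_size
                for t in range(n_trees) for i in path})


# 4 trees of 7 nodes, 4 nodes per block, every tree takes root -> left -> left.
popular_path = [0, 1, 3]
print(blocks_touched(per_tree_layout(4, 7), popular_path, 4, 4))  # 7
print(blocks_touched(packed_layout(4, 7), popular_path, 4, 4))    # 5
```

In this toy setting the packed order reads 5 blocks instead of 7 on the popular path, because the roots share one block and each tree's popular nodes sit next to each other. The method described in the abstract additionally interleaves correlated nodes across trees and tunes the layout to the I/O block size, which this sketch does not attempt.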

ML Jun 7, 2019
Vertex Classification on Weighted Networks

Hayden Helm, Joshua Vogelstein, Carey Priebe

This paper proposes a discrimination technique for vertices in a weighted network. We assume that the edge weights and adjacencies in the network are conditionally independent and that both sources of information encode class membership information. In particular, we introduce an edge weight distribution matrix to the standard K-Block Stochastic Block Model to model weighted networks. This allows us to develop simple yet powerful extensions of classification techniques using the spectral embedding of the unweighted adjacency matrix. We consider two assumptions on the edge weight distributions and propose classification procedures in both settings. We show the effectiveness of the proposed classifiers by comparing them to quadratic discriminant analysis following the spectral embedding of a transformed weighted network. Moreover, we discuss and show how the methods perform when the edge weights do not encode class membership information.

ML Dec 4, 2013
Multiscale Dictionary Learning for Estimating Conditional Distributions

Francesca Petralia, Joshua Vogelstein, David B. Dunson

Nonparametric estimation of the conditional distribution of a response given high-dimensional features is a challenging problem. It is important to allow not only the mean but also the variance and shape of the response density to change flexibly with the features, which are massive-dimensional. We propose a multiscale dictionary learning model, which expresses the conditional response density as a convex combination of dictionary densities, with the densities used and their weights dependent on the path through a tree decomposition of the feature space. A fast graph partitioning algorithm is applied to obtain the tree decomposition, with Bayesian methods then used to adaptively prune and average over different sub-trees in a soft probabilistic manner. The algorithm scales efficiently to approximately one million features. State-of-the-art predictive performance is demonstrated for toy examples and two neuroscience applications including up to a million features.

OC Nov 25, 2013
Robust Multimodal Graph Matching: Sparse Coding Meets Graph Matching

Marcelo Fiori, Pablo Sprechmann, Joshua Vogelstein et al.

Graph matching is a challenging problem with important applications in a wide range of fields, from image and video analysis to biological and biomedical problems. We propose a robust graph matching algorithm inspired by sparsity-related techniques. We cast the problem, resembling group or collaborative sparsity formulations, as a non-smooth convex optimization problem that can be efficiently solved using augmented Lagrangian techniques. The method can deal with weighted or unweighted graphs, as well as multimodal data, where different graphs represent different types of data. The proposed approach is also naturally integrated with collaborative graph inference techniques, solving general network inference problems where the observed variables, possibly coming from different modalities, are not in correspondence. The algorithm is tested and compared with state-of-the-art graph matching techniques in both synthetic and real graphs. We also present results on multimodal graphs and applications to collaborative inference of brain connectivity from alignment-free functional magnetic resonance imaging (fMRI) data. The code is publicly available.

ML Nov 23, 2013
Robust Vertex Classification

Li Chen, Cencheng Shen, Joshua Vogelstein et al.

For random graphs distributed according to stochastic blockmodels, a special case of latent position graphs, adjacency spectral embedding followed by appropriate vertex classification is asymptotically Bayes optimal; but this approach requires knowledge of and critically depends on the model dimension. In this paper, we propose a sparse representation vertex classifier which does not require information about the model dimension. This classifier represents a test vertex as a sparse combination of the vertices in the training set and uses the recovered coefficients to classify the test vertex. We prove consistency of our proposed classifier for stochastic blockmodels, and demonstrate that the sparse representation classifier can predict vertex labels with higher accuracy than adjacency spectral embedding approaches via both simulation studies and real data experiments. Our results demonstrate the robustness and effectiveness of our proposed vertex classifier when the model dimension is unknown.