Crime Event Embedding with Unsupervised Feature Selection
This work supports crime analysis for law enforcement by improving event embedding. The contribution is incremental: it builds on existing Restricted Boltzmann Machines (RBMs) and adds a new regularization approach.
The authors tackle the problem of embedding crime events by jointly capturing their time, location, and text components with regularized Restricted Boltzmann Machines, and report better embeddings and interpretable feature selection on a large-scale crime dataset.
We present a novel event embedding algorithm for crime data that jointly captures the time, location, and complex free-text component of each event. The embedding is obtained with regularized Restricted Boltzmann Machines (RBMs), and we introduce a new regularizer that imposes an $\ell_1$ penalty on the conditional distributions of the observed variables of the RBM. This regularization performs feature selection and is also computationally efficient, since its gradient can be computed in closed form. The feature selection forces the embedding to rely on the most important keywords, which capture the common modus operandi (M.O.) in crime series. In numerical experiments on a large-scale crime dataset, we show that our regularized RBMs achieve better event embeddings and that the selected features are highly interpretable to humans.
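The closed-form gradient mentioned above can be illustrated with a minimal sketch. The abstract does not give the exact form of the penalty, so everything here is an assumption made for illustration: the visible units are taken to be Bernoulli, the penalty is taken to be the $\ell_1$ norm of the conditional probabilities $p(v_j = 1 \mid h)$ for one fixed hidden vector $h$, and the function names (`visible_probs`, `l1_penalty`, `l1_penalty_grad`) are hypothetical. Because probabilities are non-negative, this $\ell_1$ norm reduces to a plain sum, and its gradient follows directly from the derivative of the sigmoid.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def visible_probs(W, b, h):
    """p(v_j = 1 | h) for Bernoulli visible units (an assumed model).

    W is a list of n_visible rows, each of length n_hidden.
    """
    return [sigmoid(b[j] + sum(w * hk for w, hk in zip(W[j], h)))
            for j in range(len(b))]


def l1_penalty(W, b, h):
    # Probabilities are non-negative, so the l1 norm is simply their sum.
    return sum(visible_probs(W, b, h))


def l1_penalty_grad(W, b, h):
    """Closed-form gradient of the assumed penalty w.r.t. W and b."""
    p = visible_probs(W, b, h)
    slope = [pj * (1.0 - pj) for pj in p]  # derivative of the sigmoid
    grad_W = [[s * hk for hk in h] for s in slope]
    grad_b = slope
    return grad_W, grad_b


def finite_diff_grad_W(W, b, h, eps=1e-6):
    """Central-difference gradient w.r.t. W, used only as a sanity check."""
    grad = [[0.0] * len(h) for _ in W]
    for j in range(len(W)):
        for k in range(len(h)):
            original = W[j][k]
            W[j][k] = original + eps
            up = l1_penalty(W, b, h)
            W[j][k] = original - eps
            down = l1_penalty(W, b, h)
            W[j][k] = original  # restore the weight
            grad[j][k] = (up - down) / (2.0 * eps)
    return grad


# Toy sizes: 5 visible units (e.g. keywords) and 3 hidden units.
rng = random.Random(0)
n_visible, n_hidden = 5, 3
W = [[rng.gauss(0.0, 1.0) for _ in range(n_hidden)] for _ in range(n_visible)]
b = [rng.gauss(0.0, 1.0) for _ in range(n_visible)]
h = [1.0, 0.0, 1.0]  # one fixed binary hidden configuration

grad_W, grad_b = l1_penalty_grad(W, b, h)
numeric_W = finite_diff_grad_W(W, b, h)
max_abs_diff = max(abs(a - n)
                   for row_a, row_n in zip(grad_W, numeric_W)
                   for a, n in zip(row_a, row_n))
```

In this sketch the analytic gradient needs only the probabilities already computed in the forward pass, which is one plausible reading of why such a penalty is cheap to optimize. In a training loop the penalty gradient would be scaled by a regularization weight and added to the usual RBM likelihood gradient; that weight and the choice of which hidden configurations to average over are not specified in the abstract.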