Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model
This work is an incremental improvement for researchers analyzing citation networks: it addresses the limitations of existing models by integrating two approaches, a latent factor model and a logistic regression model.
The authors model citation networks with a combined latent factor and logistic regression model that captures both the main technological trends and the remaining ad-hoc dependencies. Simulation results indicate that the method works well in practical settings, and an application to a real citation dataset yields interesting findings.
We propose a combined model for citation networks that integrates a latent factor model with a logistic regression model. We observe that neither a latent factor model nor a logistic regression model alone is sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) component that represents the main technological trends (i.e., the factors), and adds a sparse component that captures the remaining ad-hoc dependence. Parameters are estimated by minimizing an objective built from the joint likelihood of the edges and suitably chosen penalty terms. The convexity of the objective function allows us to develop an efficient algorithm, while the penalty terms push the estimate towards a low-dimensional latent component and a sparse graphical structure. Simulation results show that the proposed method works well in practical situations. The proposed method is applied to a real dataset, a citation network of statisticians (Ji and Jin, 2016), and some interesting findings are reported.
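The estimation idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm, and everything specific in it is an assumption made for illustration: the log-odds of an edge between nodes i and j are taken to be L[i, j] + S[i, j], the low-dimensional latent component is encouraged with a nuclear-norm penalty on L, the sparse component with an l1 penalty on S, and the resulting convex objective is minimized by plain proximal gradient descent. The function names and the tuning parameters `gamma`, `delta`, and `step` are hypothetical.

```python
import numpy as np


def soft_threshold(M, t):
    """Proximal operator of t * ||M||_1 (entrywise shrinkage towards zero)."""
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)


def svd_threshold(M, t):
    """Proximal operator of t * ||M||_* (shrinks the singular values)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt


def objective(A, L, S, gamma, delta):
    """Penalized negative log-likelihood of the edges (assumed form)."""
    Z = L + S  # assumed log-odds of each edge
    nll = np.sum(np.logaddexp(0.0, Z) - A * Z)  # Bernoulli/logistic loss
    return nll + gamma * np.linalg.norm(L, "nuc") + delta * np.abs(S).sum()


def fit_low_rank_plus_sparse(A, gamma, delta, step=1.0, n_iter=200):
    """Proximal gradient descent on (L, S), starting from zero matrices.

    The gradient of the smooth part is 1/2-Lipschitz in (L, S) jointly,
    so any step size up to 2 gives a non-increasing objective.
    """
    L = np.zeros(A.shape)
    S = np.zeros(A.shape)
    history = []
    for _ in range(n_iter):
        # Gradient of the logistic loss; it is the same for L and for S.
        G = 1.0 / (1.0 + np.exp(-(L + S))) - A
        # The penalty is separable, so each block gets its own prox step.
        L = svd_threshold(L - step * G, step * gamma)
        S = soft_threshold(S - step * G, step * delta)
        history.append(objective(A, L, S, gamma, delta))
    return L, S, history


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic 0/1 adjacency matrix, used only to exercise the code.
    A = (rng.random((30, 30)) < 0.2).astype(float)
    L_hat, S_hat, hist = fit_low_rank_plus_sparse(A, gamma=2.0, delta=0.5)
    print("rank of L:", np.linalg.matrix_rank(L_hat, tol=1e-6))
    print("nonzeros in S:", int(np.count_nonzero(S_hat)))
```

In this sketch, larger values of `gamma` drive more singular values of L to zero (fewer factors), and larger values of `delta` zero out more entries of S (fewer ad-hoc dependencies). For simplicity the sketch treats every entry of the matrix, including the diagonal, as an observed edge indicator; a real analysis would likely exclude self-citations.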