LG · AI · Feb 23

Detecting High-Potential SMEs with Heterogeneous Graph Neural Networks

arXiv:2602.19591v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses a practical challenge for policymakers and early-stage investors: assessing SME potential efficiently. The advance is incremental, since it applies a new graph-based method to an existing problem rather than posing a new one.

The paper tackles the systematic identification of high-potential Small and Medium Enterprises (SMEs) by predicting which SBIR Phase I awardees will advance to Phase II funding. Its Heterogeneous Graph Transformer framework achieves an AUPRC of 0.621 and 89.6% precision at a screening depth of 100 companies.

Small and Medium Enterprises (SMEs) constitute 99.9% of U.S. businesses and generate 44% of economic activity, yet systematically identifying high-potential SMEs remains an open challenge. We introduce SME-HGT, a Heterogeneous Graph Transformer framework that predicts which SBIR Phase I awardees will advance to Phase II funding using exclusively public data. We construct a heterogeneous graph with 32,268 company nodes, 124 research topic nodes, and 13 government agency nodes connected by approximately 99,000 edges across three semantic relation types. SME-HGT achieves an AUPRC of 0.621 ± 0.003 on a temporally-split test set, outperforming an MLP baseline (0.590 ± 0.002) and R-GCN (0.608 ± 0.013) across five random seeds. At a screening depth of 100 companies, SME-HGT attains 89.6% precision with a 2.14× lift over random selection. Our temporal evaluation protocol prevents information leakage, and our reliance on public data ensures reproducibility. These results demonstrate that relational structure among firms, research topics, and funding agencies provides meaningful signal for SME potential assessment, with implications for policymakers and early-stage investors.
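
The screening metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes that "precision at a screening depth of k" means the fraction of true positives among the k highest-scored companies, and that "lift" is that precision divided by the overall positive rate; the abstract does not spell out either definition. The function names and toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of positives among the k highest-scored items."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices by descending model score and keep the top k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k


def lift_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Precision@k divided by the base rate (assumed definition of lift)."""
    base_rate = sum(labels) / len(labels)
    if base_rate == 0:
        raise ValueError("lift is undefined when there are no positives")
    return precision_at_k(scores, labels, k) / base_rate


# Toy data (hypothetical): six companies, three of which reach Phase II.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
toy_labels = [1, 1, 0, 1, 0, 0]

toy_precision = precision_at_k(toy_scores, toy_labels, k=3)
toy_lift = lift_at_k(toy_scores, toy_labels, k=3)

# Back-of-envelope check on the reported figures: under the assumed
# definition, 89.6% precision with a 2.14x lift implies this base rate.
implied_base_rate = 0.896 / 2.14

print(f"toy precision@3 = {toy_precision:.3f}, toy lift@3 = {toy_lift:.3f}")
print(f"implied base rate = {implied_base_rate:.3f}")
```

Under that assumed definition, the reported figures would correspond to a test-set Phase II transition rate of roughly 42%. This is an inference from the two quoted numbers, not a value stated in the abstract.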

