GNCL
Jul 3, 2025

Seeing Through Green: Text-Based Classification and the Firm's Returns from Green Patents

arXiv:2507.02287v2
h-index: 11
Originality: Synthesis-oriented
AI Analysis

The method gives policymakers and firms a more accurate classification of green patents, though the contribution is incremental: it applies existing NLP techniques to a specific domain.

The paper introduces an NLP method to identify 'true' green patents from official supporting documents. It finds that only about 20% of patents previously classified as green qualify as truly green, and that holding at least one such patent raises firm sales, market share, and productivity in the EU.

This paper introduces Natural Language Processing for identifying "true" green patents from official supporting documents. We start our training on about 12.4 million patents that had been classified as green from previous literature. Thus, we train a simple neural network to enlarge a baseline dictionary through vector representations of expressions related to environmental technologies. After testing, we find that "true" green patents represent about 20% of the total of patents classified as green from previous literature. We show heterogeneity by technological classes, and then check that "true" green patents are about 1% less cited by following inventions. In the second part of the paper, we test the relationship between patenting and a dashboard of firm-level financial accounts in the European Union. After controlling for reverse causality, we show that holding at least one "true" green patent raises sales, market shares, and productivity. If we restrict the analysis to high-novelty "true" green patents, we find that they also yield higher profits. Our findings underscore the importance of using text analyses to gauge finer-grained patent classifications that are useful for policymaking in different domains.
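
The dictionary-enlargement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy vectors, the 0.9 similarity threshold, the function names (`expand_dictionary`, `is_green`), and the substring-matching rule are all assumptions made for illustration. In the paper, the vector representations come from a neural network trained on patent text.

```python
import math

# Hypothetical toy vectors for illustration only. In practice these would be
# learned by a model trained on patent documents, not written by hand.
EMBEDDINGS = {
    "solar": [0.9, 0.1, 0.0],
    "photovoltaic": [0.85, 0.2, 0.05],
    "wind turbine": [0.8, 0.0, 0.3],
    "combustion": [0.1, 0.9, 0.1],
    "gearbox": [0.0, 0.3, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_dictionary(seed_terms, embeddings, threshold=0.9):
    """Add every expression whose vector is close to at least one seed term.

    The threshold is an assumed value, not one taken from the paper.
    Candidates are compared against the seed terms only, so the expansion
    is not transitive.
    """
    seeds = [s for s in seed_terms if s in embeddings]
    expanded = set(seed_terms)
    for candidate, vector in embeddings.items():
        if candidate in expanded:
            continue
        if any(cosine(vector, embeddings[s]) >= threshold for s in seeds):
            expanded.add(candidate)
    return expanded


def is_green(text, dictionary):
    """Crude rule: flag a document if it mentions any dictionary expression."""
    lowered = text.lower()
    return any(term in lowered for term in dictionary)


green_terms = expand_dictionary({"solar"}, EMBEDDINGS)
print(sorted(green_terms))
print(is_green("A photovoltaic module with improved efficiency", green_terms))
print(is_green("A gearbox for combustion engines", green_terms))
```

Starting from a single seed expression, the sketch pulls in neighbouring expressions from the vector space and then uses the enlarged dictionary to screen patent text. A real pipeline would need a validated threshold and a more careful matching rule than plain substring search.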
