Recreation of the Periodic Table with an Unsupervised Machine Learning Algorithm
This work addresses a foundational question in chemistry and data science: whether machine learning can replicate a feat of human cognition such as the construction of the periodic table. The contribution is incremental, since it applies an existing method to a new domain.
The study recreates the periodic table with an unsupervised algorithm, the periodic table generator (PTG), which is based on generative topographic mapping and produces various two-dimensional and three-dimensional arrangements of the chemical elements from physicochemical data.
In 1869, the first draft of the periodic table was published by Russian chemist Dmitri Mendeleev. In terms of data science, his achievement can be viewed as a successful example of feature embedding based on human cognition: the chemical properties of all elements known at that time were compressed onto a two-dimensional grid for tabular display. In this study, we ask whether machine learning can reproduce or recreate the periodic table from the observed physicochemical properties of the elements. To this end, we developed a periodic table generator (PTG). The PTG is an unsupervised machine learning algorithm based on generative topographic mapping (GTM), which automates the translation of high-dimensional data into a tabular form with varying layouts on demand. The PTG autonomously produced various arrangements of chemical symbols, organized as a two-dimensional array such as Mendeleev's periodic table or as a three-dimensional spiral table, according to the underlying periodicity in the given data. We further show what the PTG learned from the element data and how element features, such as melting point and electronegativity, are compressed into lower-dimensional latent spaces.
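To make the idea of mapping element features onto a grid more concrete, the following is a minimal sketch of a plain GTM fitted by expectation-maximization, written with NumPy. It is not the authors' PTG: the function names (`fit_gtm`, `rbf_design`, `make_grid`, `responsibilities`), the grid and basis sizes, the regularization constant, and the random stand-in data are all assumptions made for illustration. Real use would replace `X` with a standardized matrix of element properties.

```python
import numpy as np

def make_grid(shape):
    # Regular grid of points in [-1, 1]^2, flattened in row-major order.
    a, b = np.meshgrid(np.linspace(-1, 1, shape[0]),
                       np.linspace(-1, 1, shape[1]), indexing="ij")
    return np.column_stack([a.ravel(), b.ravel()])

def rbf_design(U, centers, width):
    # Gaussian radial basis functions evaluated at the latent nodes, plus a bias column.
    d2 = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    Phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([Phi, np.ones((U.shape[0], 1))])

def responsibilities(Y, X, beta):
    # E-step: posterior probability of each latent node (rows) for each sample (columns).
    dist = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    log_r = -0.5 * beta * dist
    log_r -= log_r.max(axis=0, keepdims=True)  # numerical stability
    R = np.exp(log_r)
    return R / R.sum(axis=0, keepdims=True), dist

def fit_gtm(X, grid_shape=(4, 6), rbf_shape=(3, 3), n_iter=30, reg=1e-3, seed=0):
    # Hypothetical helper: fits a plain GTM and returns the (nodes x samples) responsibilities.
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Phi = rbf_design(make_grid(grid_shape), make_grid(rbf_shape), width=1.0)
    W = rng.normal(scale=0.1, size=(Phi.shape[1], D))
    W[-1] = X.mean(axis=0)          # start the bias at the data mean
    beta = 1.0 / X.var()            # initial noise precision
    for _ in range(n_iter):
        R, _ = responsibilities(Phi @ W, X, beta)
        # M-step: regularized weighted least squares for the mapping weights.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + (reg / beta) * np.eye(Phi.shape[1])
        W = np.linalg.solve(A, Phi.T @ R @ X)
        # Update the noise precision from the weighted reconstruction error.
        _, dist = responsibilities(Phi @ W, X, beta)
        beta = N * D / (R * dist).sum()
    R, _ = responsibilities(Phi @ W, X, beta)
    return R

# Stand-in data (assumption): 20 "elements" with 4 standardized random features.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(20, 4))
X = (X - X.mean(axis=0)) / X.std(axis=0)

grid_shape = (4, 6)
R = fit_gtm(X, grid_shape=grid_shape)
# Place each sample at its most probable grid cell (posterior mode).
rows, cols = np.unravel_index(R.argmax(axis=0), grid_shape)
```

In this sketch several samples can land in the same cell, because plain GTM places no limit on how many samples a node may hold. Producing a table with one symbol per cell would need an additional step to resolve such collisions, which is omitted here.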