Benchmarking data encoding methods in Quantum Machine Learning
This work addresses a domain-specific challenge for researchers in Quantum Machine Learning, but it is incremental as it focuses on benchmarking existing methods.
The paper tackles the problem of selecting a data encoding method in Quantum Machine Learning, for which there is no universal rule, by benchmarking commonly used encoding types on different datasets. It does not report specific numerical results.
Data encoding plays a fundamental and distinctive role in Quantum Machine Learning (QML). While classical approaches process data directly as vectors, QML may require transforming classical data into quantum states through encoding circuits, known as quantum feature maps or quantum embeddings. This step exploits the high-dimensional and non-linear structure of Hilbert space, which may allow data to be separated more efficiently in complex feature spaces that are inaccessible to classical methods. The encoding step significantly affects the performance of a QML model, so it is important to choose an encoding method suited to the dataset at hand. In practice, however, this choice is largely arbitrary, since there is no "universal" rule for selecting an encoding for a given dataset. A variety of encoding methods currently exist, built from different quantum logic gates. We studied the most commonly used types of encoding methods and benchmarked them on different datasets.
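To make the idea of a quantum feature map concrete, the following is a minimal sketch of two encodings that are common in the QML literature: angle encoding and amplitude encoding. The abstract does not name the specific encodings that were benchmarked, so these two are assumptions chosen purely for illustration. The function names (angle_encode, amplitude_encode, fidelity_kernel) are hypothetical, and the states are simulated classically with NumPy rather than built as circuits in any quantum framework.

```python
import numpy as np


def angle_encode(x):
    """Angle encoding (illustrative assumption, not the paper's method).

    Each feature x_i rotates its own qubit with RY(x_i), giving the
    single-qubit state [cos(x_i / 2), sin(x_i / 2)]. The full state is
    the tensor product of these, so n features use n qubits.
    """
    state = np.array([1.0])
    for xi in np.asarray(x, dtype=float):
        qubit = np.array([np.cos(xi / 2.0), np.sin(xi / 2.0)])
        state = np.kron(state, qubit)
    return state


def amplitude_encode(x):
    """Amplitude encoding (illustrative assumption, not the paper's method).

    The feature vector is zero-padded to the next power of two and
    normalized, so its entries become the amplitudes of the state.
    n features use ceil(log2(n)) qubits.
    """
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)
    if norm == 0.0:
        raise ValueError("the zero vector cannot be normalized into a state")
    dim = 1 << (len(x) - 1).bit_length()  # next power of two >= len(x)
    padded = np.zeros(dim)
    padded[: len(x)] = x
    return padded / norm


def fidelity_kernel(state_a, state_b):
    """Squared overlap |<a|b>|^2, one common way to compare encoded points."""
    return float(abs(np.vdot(state_a, state_b)) ** 2)
```

For example, angle_encode([0.0, np.pi]) produces a four-dimensional state on two qubits, while amplitude_encode([3.0, 4.0]) stores both features in a single qubit. The difference in qubit count and in how the data enters the state is one reason the choice of encoding can matter for a given dataset.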