Quantum Convolutional Neural Networks for High-Energy Physics Analysis at the LHC | GSoC 2022 @ ML4SCI
This blog post briefly describes my Google Summer of Code (GSoC) 2022 project under ML4SCI. The goal of the study is to demonstrate the capabilities of quantum machine learning (QML), in particular quantum convolutional neural networks (QCNNs), for classifying HEP image datasets.
Synopsis
Determining whether the image of a jet particle corresponds to signal or background is one of the many challenges faced in High Energy Physics. Classical CNNs have proven effective at classifying jet particle images. Quantum computing is promising in this regard, and as the QML field evolves, this project aims to understand and implement QCNNs and gain some enhancement over classical approaches.
The goal of this study is to show the capabilities of QML, especially QCNNs, for classifying HEP image datasets. A QCNN can be fully quantum or a hybrid of quantum and classical components; the aim is to implement both. In the fully quantum setting, we replace the final fully connected classical layers with a quantum variational classifier. This gives deeper insight into the quantum capabilities that can be exploited in the near term.
Code
The code is hosted in a GitHub repository. Training quantum circuits takes time because simulation is slow. I tried and tested various frameworks, including TensorFlow Quantum, PennyLane and JAX; JAX with PennyLane proved to be the fastest for large amounts of data (~500k to 700k images). I also added some classical models for benchmarking, and experimented with classical-quantum transfer learning and layer-wise learning using TensorFlow Quantum.
About me
I am Gopal Ramesh Dahale. I completed my Bachelor of Technology in Electrical Engineering with honours in Computer Science at the Indian Institute of Technology, Bhilai. I am also a Qiskit Advocate, and I am highly inclined toward quantum machine learning and quantum circuit optimization.
Why Quantum CNNs?
Quantum CNNs, and quantum machine learning in general, aim to use quantum computers for machine learning applications. Quantum computers can naturally solve certain problems with complex correlations between inputs that are incredibly hard for traditional, or "classical", computers [1]. In this study, the aim was to gain a higher AUC (area under the ROC curve) than current state-of-the-art classical methods. Another advantage of quantum machine learning is that the number of parameters is drastically reduced: ResNets have millions of parameters, whereas a simple QCNN has ~20 parameters and obtains results on par.
Another advantage, shown by the recent publication "Generalization in quantum machine learning from few training data" [2], is generalization over test data from only a few training data points. In contrast, classical models require a large amount of training data to obtain a reasonable AUC.
Hybrid quantum-classical Convolutional Neural Networks (HQCCNN)
I spent most of my summer on this, so I will describe it briefly.
The HQCCNN architecture is exactly the same as a classical CNN, except that the classical kernel matrix is replaced by a trainable quantum circuit. The figure above shows a 2 x 2 kernel fed into a quantum circuit with 4 qubits. Here we measure all the qubits and feed the output to a classical fully connected layer. The loss computation and weight updates are performed classically.
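To make the idea concrete, here is a minimal NumPy statevector sketch of such a quantum kernel. The encoding (RY of pi times the pixel value), the CNOT ring, and the single trainable RY per qubit are illustrative assumptions, not the project's exact circuit:

```python
import numpy as np

I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

def kron_all(ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

def cnot(n, ctrl, tgt):
    """CNOT on qubits (ctrl, tgt) of an n-qubit register."""
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    return (kron_all([P0 if q == ctrl else I for q in range(n)]) +
            kron_all([P1 if q == ctrl else (X if q == tgt else I)
                      for q in range(n)]))

def quantum_kernel(patch, weights):
    """Map a 2x2 patch (4 pixels in [0, 1]) to four <Z> expectations."""
    n = 4
    state = np.zeros(2 ** n); state[0] = 1.0                  # |0000>
    state = kron_all([ry(np.pi * x) for x in patch]) @ state  # encode pixels
    for q in range(n):                                        # entangling ring
        state = cnot(n, q, (q + 1) % n) @ state
    state = kron_all([ry(w) for w in weights]) @ state        # trainable layer
    # The four <Z> expectations play the role of the convolution output.
    return [float(state @ kron_all([Z if q == j else I for q in range(n)])
                  @ state) for j in range(n)]

print(quantum_kernel([0, 0, 0, 0], [0, 0, 0, 0]))  # [1.0, 1.0, 1.0, 1.0]
```

Sliding this kernel over the image with a stride, exactly as in a classical convolution, yields the quantum feature maps that the fully connected layer then consumes.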
Datasets
Electron Photon Electromagnetic Calorimeter (ECAL)
This is the dataset that I spent most of my time with. The dataset contains images of electrons and photons captured by the ECAL detector. These are 32 x 32 x 2 images where the intensity of each pixel corresponds to the amount of energy measured in that cell. The two channels correspond to energy and time. For this study, we used the energy channel only.
A total of 498k samples, equally distributed between the two classes, are present in the dataset. More details can be found in this paper [3].
Quarks and Gluons
The dataset contains images of size 125 x 125 x 3 of simulated quark and gluon jets. The first channel is the reconstructed tracks of the jet, the second channel is the images captured by the electromagnetic calorimeter (ECAL) detector, and the third channel is the images captured by the hadronic calorimeter (HCAL) detector.
Since the original size of 125 x 125 pixels is too large for quantum computing simulation, we cropped the images; for now, we limit them to 40 x 40 pixels. In this study, we focus on the ECAL channel only. More details can be found in this paper [4].
Data Encoding
I implemented various data embedding techniques including Angle [5], Double Angle [6], and Amplitude [7] maps. For a quick revision, I suggest having a look at the Data Encoding section in the Qiskit Quantum Machine Learning course.
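As a quick illustration, here is a minimal NumPy sketch of two of these maps (assuming an RY-based angle map; the exact gate conventions used in the project may differ):

```python
import numpy as np

def angle_encode(x):
    """Angle map: one qubit per feature, state = tensor of RY(x_i)|0>."""
    state = np.array([1.0])
    for xi in x:
        state = np.kron(state, np.array([np.cos(xi / 2), np.sin(xi / 2)]))
    return state

def amplitude_encode(x):
    """Amplitude map: 2^n features packed into the amplitudes of n qubits."""
    x = np.asarray(x, dtype=float)
    return x / np.linalg.norm(x)

print(angle_encode([0.0, np.pi]))              # ~[0, 1, 0, 0], the |01> state
print(amplitude_encode([3.0, 0.0, 4.0, 0.0]))  # [0.6, 0.0, 0.8, 0.0]
```

Angle encoding needs one qubit per feature, while amplitude encoding packs exponentially many features into few qubits at the cost of a more expensive state-preparation circuit.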
Ansatzes a.k.a. PQCs
Ansatzes, also known as parameterized quantum circuits (PQCs), are circuits whose gates have tunable parameters; these parameters are learnt by the model during training. PQCs are the building blocks of near-term quantum machine learning algorithms.
During this study, I implemented various ansatzes. The Pauli-Z basis is used for measurement, giving expectation values between -1 and +1.
Chen [6]
I call this the Chen ansatz, after one of the authors of the paper "Quantum Convolutional Neural Networks for High Energy Physics Data Analysis" [6]. The ansatz uses a circular CNOT entangling layer followed by one-qubit unitary gates. We measure the first qubit here.
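A toy statevector version of this ansatz on 4 qubits, using RY as a stand-in for the general one-qubit unitaries (the paper's gates carry more parameters):

```python
import numpy as np

I = np.eye(2)
X = np.array([[0.0, 1.0], [1.0, 0.0]])
Z = np.diag([1.0, -1.0])

def ry(t):
    c, s = np.cos(t / 2), np.sin(t / 2)
    return np.array([[c, -s], [s, c]])

def kron_all(ops):
    out = np.array([[1.0]])
    for op in ops:
        out = np.kron(out, op)
    return out

def cnot(n, ctrl, tgt):
    P0, P1 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
    return (kron_all([P0 if q == ctrl else I for q in range(n)]) +
            kron_all([P1 if q == ctrl else (X if q == tgt else I)
                      for q in range(n)]))

def chen_ansatz(state, params):
    """Circular CNOT entangling layer, then one rotation per qubit; <Z_0>."""
    n = int(np.log2(len(state)))
    for q in range(n):                                 # circular entanglement
        state = cnot(n, q, (q + 1) % n) @ state
    state = kron_all([ry(p) for p in params]) @ state  # one-qubit unitaries
    Z0 = kron_all([Z] + [I] * (n - 1))                 # readout: first qubit
    return float(state @ Z0 @ state)

init = np.zeros(16); init[0] = 1.0  # |0000>
print(chen_ansatz(init, [0.0] * 4))  # 1.0
```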
Cong [8]
This ansatz is inspired by this paper [8] and the TensorFlow-quantum tutorial [9]. It uses alternate quantum convolution and pooling layers (separated using a barrier in the figure). The output is obtained by measuring the last qubit.
Farhi [10]
The paper [10] and the tutorial [11] describe an ansatz built from two-qubit gates. One of the two qubits is always the readout qubit, which is measured at the end. The figure shows a quantum circuit using RXX and RZZ as the two-qubit gates, but one can also use RYY and other two-qubit gates.
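Both RXX and RZZ have simple closed forms, which makes a toy simulation easy. The sketch below couples the readout qubit to a single data qubit (a one-data-qubit toy, not the full circuit):

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.diag([1.0, -1.0]).astype(complex)
I4 = np.eye(4, dtype=complex)

def rxx(t):
    """exp(-i t/2 XX): two-qubit XX rotation."""
    return np.cos(t / 2) * I4 - 1j * np.sin(t / 2) * np.kron(X, X)

def rzz(t):
    """exp(-i t/2 ZZ): two-qubit ZZ rotation."""
    return np.cos(t / 2) * I4 - 1j * np.sin(t / 2) * np.kron(Z, Z)

def readout_expectation(data_bit, t_xx, t_zz):
    """Readout qubit |0> coupled to a data qubit |b>; return <Z> on readout."""
    readout = np.array([1, 0], dtype=complex)
    data = np.array([1 - data_bit, data_bit], dtype=complex)
    state = rzz(t_zz) @ rxx(t_xx) @ np.kron(readout, data)
    Zr = np.kron(Z, np.eye(2, dtype=complex))
    return float(np.real(state.conj() @ Zr @ state))

print(readout_expectation(0, 0.0, 0.0))              # 1.0
print(round(readout_expectation(0, np.pi, 0.0), 6))  # -1.0
```

In the full ansatz, one such two-qubit gate connects the readout qubit to every data qubit in turn, and the sign of the final <Z> expectation gives the predicted class.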
Tree-Tensor Network [12]
Tree-tensor-network quantum circuits are based on classical tree tensor networks and emulate their connectivity and shape. I was inspired by the PennyLane tutorial [13] and implemented it in TensorFlow Quantum. The figure shows a TTN on 8 qubits (2³).
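The tree structure amounts to a wiring pattern: each layer applies a two-qubit block to pairs of active qubits and carries one qubit of each pair forward, halving the active set until a single readout qubit remains. A small illustration (keeping the second qubit of each pair is an assumed convention):

```python
# Tree-tensor-network wiring on 8 qubits: pair up the active qubits,
# apply a two-qubit block to each pair, keep one qubit per pair.
active = list(range(8))
layers = []
while len(active) > 1:
    pairs = [(active[i], active[i + 1]) for i in range(0, len(active), 2)]
    layers.append(pairs)
    active = [pair[1] for pair in pairs]  # second qubit of each pair survives

print(layers)
# [[(0, 1), (2, 3), (4, 5), (6, 7)], [(1, 3), (5, 7)], [(3, 7)]]
print(active)  # [7] -> the qubit measured at the end
```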
Data-Reuploading Circuits [14]
To increase the expressibility [15] of a quantum circuit, the data encoded on a single qubit is re-uploaded between parameterized gates. Expressibility can be enhanced further by scaling the input parameters with trainable parameters. The next two sections show two different ways of constructing data re-uploading circuits.
NQubitPQC-Sparse
This ansatz is based on [14]. The idea is to use a linear combination of inputs, weights and biases as the input to a rotation gate (Rx, Ry or Rz).
Every gate has different weights and biases but the same input vector. The figure above shows a quantum circuit with 4 qubits and 2 layers (the second parameterized layer comes after the CZ entangling layer). In total, every gate uses n + 1 trainable parameters.
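A small NumPy sketch of how the per-gate angles could be computed under this parameterization (the array names and shapes are illustrative, not the project's actual code):

```python
import numpy as np

rng = np.random.default_rng(42)
n_features, n_qubits, n_layers = 8, 4, 2

x = rng.normal(size=n_features)                        # shared input vector
W = rng.normal(size=(n_layers, n_qubits, n_features))  # per-gate weights
b = rng.normal(size=(n_layers, n_qubits))              # per-gate biases

# Each gate rotates by an affine map of the *full* input:
# theta[l, q] = W[l, q] . x + b[l, q]  ->  n_features + 1 params per gate.
theta = W @ x + b
print(theta.shape)  # (2, 4)
```

Each angle in `theta` then feeds one rotation gate in the corresponding layer, with entangling CZ gates placed between the layers.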
NQubitPQC
This is similar to NQubitPQC-Sparse but uses more gates. Given an (n, 1) input vector, every element of the vector gets its own rotation gate. Each element is multiplied by a weight and shifted by a bias, and every gate therefore uses 3 trainable parameters. This gives the gates more flexibility to rotate the quantum state.
The advantage of NQubitPQC and NQubitPQC-Sparse is that they do not need a separate data-encoding technique such as the Angle or Amplitude map, and can therefore be extended to any number of qubits or layers.
The quantum circuit in the above figure serves as a fully quantum single-qubit classifier. We can also add a classical fully connected layer after measurement.
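A minimal sketch of the elementwise angle computation, assuming an angle of w_i * x_i + b_i per gate (this shows two of the trainable parameters; the exact per-gate parameter count in the project's circuits may differ):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
x = rng.normal(size=n)  # (n, 1) input vector, one rotation gate per element
w = rng.normal(size=n)  # one weight per gate
b = rng.normal(size=n)  # one bias per gate

# Each feature drives its own gate, rotating by w_i * x_i + b_i, so the
# angles are learned elementwise rather than from the full input at once.
theta = w * x + b
print(theta.shape)  # (6,)
```

Because encoding and trainable parameters live in the same gates, adding qubits or layers only grows this angle array; no separate feature map is needed.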
Implementation details
Creating custom models and using predefined models is fairly simple.
To create a quantum convolutional layer, import and define the layer.
To use the data re-uploading circuits (NQubitPQC and NQubitPQC-Sparse), specify the number of qubits and the sparse argument; the feature map and ansatz can then be omitted.
For more details check out the tutorials.
Results
I sampled both datasets at random and used 90k, 10k and 20k samples for the train, validation and test sets respectively. I used the data re-uploading architectures since their number of qubits and layers is tunable. I also tried to implement a ResNet-style residual block architecture.
Results of the Electron Photon dataset
Using NQubitPQC-Sparse ansatz.
The best Test AUC is 0.7518.
Using NQubitPQC ansatz.
The best Test AUC is 0.7458.
Results of Quark Gluon dataset
Using NQubitPQC-Sparse ansatz.
The best Test AUC is 0.6887.
Results with Full dataset
With the full dataset, on the electron-photon dataset, the best Test AUC was 0.7684 and for Quark-gluon it was 0.699.
Future work
Although HQCCNNs show some extremely encouraging results, there is still room for improvement in how well they perform compared to classical state-of-the-art models. The next step is to explore quantum self-attention networks and quantum contrastive learning. Exploring gradient-free optimizers will also be helpful, as it can reduce training time by quite a margin.
Acknowledgements
I want to thank my mentor Sergei V. Gleyzer for guiding me throughout the summer and patiently resolving my queries. I also want to thank the contributors working in QMLHEP alongside me: Abhay Kamble, Amey Bhatuse (Quantum GANs) and Tom Magorsch (Quantum Autoencoders). Thanks as well to the entire ML4SCI community for their support.
References
- Google Research Blog: Quantum Machine Learning and the Power of Data.
- Caro, M.C., Huang, HY., Cerezo, M. et al. Generalization in quantum machine learning from few training data. Nat Commun 13, 4919 (2022). https://doi.org/10.1038/s41467-022-32550-3.
- Andrews, M., Paulini, M., Gleyzer, S. & Poczos, B. (2018). End-to-End Event Classification of High-Energy Physics Data. Journal of Physics: Conference Series. 1085. 042022. 10.1088/1742-6596/1085/4/042022.
- Andrews, Michael & Alison, John & An, Sitong & Bryant, Patrick & Burkle, Bjorn & Gleyzer, Sergei & Narain, Meenakshi & Paulini, Manfred & Poczos, Barnabas & Usai, Emanuele. (2019). End-to-End Jet Classification of Quarks and Gluons with the CMS Open Data.
- LaRose, R. and Coyle, B., "Robust data encodings for quantum classifiers", Physical Review A, vol. 102, no. 3, 2020. doi:10.1103/PhysRevA.102.032420.
- Chen, Samuel & Wei, Tzu-Chieh & Zhang, Chao & Yu, Haiwang & Yoo, Shinjae. (2020). Quantum Convolutional Neural Networks for High Energy Physics Data Analysis.
- Schuld, M., Petruccione, F. (2018). Information Encoding. In: Supervised Learning with Quantum Computers. Quantum Science and Technology. Springer, Cham. https://doi.org/10.1007/978-3-319-96424-9_5
- Cong, I., Choi, S. & Lukin, M.D. Quantum convolutional neural networks. Nat. Phys. 15, 1273–1278 (2019). https://doi.org/10.1038/s41567-019-0648-8
- TensorFlow Quantum: Quantum Convolutional Neural Network.
- Farhi, Edward & Neven, Hartmut. (2018). Classification with Quantum Neural Networks on Near Term Processors.
- TensorFlow Quantum: MNIST classification.
- Huggins, William J., Piyush S. Patil, Bradley K. Mitchell, K. Birgitta Whaley and Edwin Miles Stoudenmire. “Towards quantum machine learning with tensor networks.” Quantum Science and Technology 4 (2018): n. pag.
- PennyLane: Tensor-network quantum circuits.
- Pérez-Salinas, Adrián & Cervera-Lierta, Alba & Gil-Fuster, Elies & Latorre, José. (2020). Data re-uploading for a universal quantum classifier. Quantum. 4. 226. 10.22331/q-2020-02-06-226.
- Schuld, Maria & Sweke, Ryan & Meyer, Johannes. (2020). The effect of data encoding on the expressive power of variational quantum machine learning models.
- Liu, J., Lim, K.H., Wood, K.L. et al. Hybrid quantum-classical convolutional neural networks. Sci. China Phys. Mech. Astron. 64, 290311 (2021). https://doi.org/10.1007/s11433-021-1734-3
- TensorFlow Quantum Research: Layerwise Learning.
- Transfer learning in hybrid classical-quantum neural networks, XanaduAI.