How well can quantum machine learning (QML) classify classical datasets? In this work (https://arxiv.org/abs/2108.00661), we use Quantum Convolutional Neural Networks (QCNNs) for image classification tasks. We benchmark QCNN models that differ in the structure of their parameterized quantum circuits, quantum data encoding methods, classical data pre-processing methods, cost functions, and optimizers on the MNIST and Fashion MNIST datasets. This is an introductory tutorial for the paper; the full code can be found at https://github.com/takh04/QCNN.
The QCNN is a local and translationally invariant variational quantum model proposed by Cong et al. (https://arxiv.org/abs/1810.03787). A key feature of the QCNN is that it reduces the number of qubits at each layer. Because of its similarity to classical convolutional neural networks, its building blocks are often called convolutional and pooling layers. The original work uses the QCNN to classify quantum data in a quantum phase recognition problem. Here, we focus on classifying classical data.
First, we load the MNIST dataset for classification.
import pennylane as qml
from pennylane import numpy as np
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train[..., np.newaxis] / 255.0, x_test[..., np.newaxis] / 255.0
Filter classes 0 and 1 of the MNIST dataset for the binary classification task.
x_train_filter_01 = np.where((y_train == 0) | (y_train == 1))
x_test_filter_01 = np.where((y_test == 0) | (y_test == 1))
X_train, X_test = x_train[x_train_filter_01], x_test[x_test_filter_01]
Y_train, Y_test = y_train[x_train_filter_01], y_test[x_test_filter_01]
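The filtering step above can be illustrated on toy data. This is a minimal sketch using plain NumPy; the arrays `images` and `labels` are hypothetical stand-ins, not the real MNIST data.

```python
import numpy as np

# Hypothetical stand-ins for MNIST: six 2x2 "images" and their labels.
images = np.arange(6 * 4).reshape(6, 2, 2)
labels = np.array([0, 3, 1, 7, 1, 0])

# np.where on a boolean mask returns a tuple of index arrays,
# which can be used directly to index along the first axis.
keep = np.where((labels == 0) | (labels == 1))

images_01 = images[keep]
labels_01 = labels[keep]

print(images_01.shape)  # (4, 2, 2)
print(labels_01)        # [0 1 1 0]
```

The same index tuple is applied to both the images and the labels, so the two arrays stay aligned after filtering.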
Resize each 28 × 28 image into a 256-dimensional vector so that it fits into the 8-qubit QCNN model (8 qubits hold 2^8 = 256 amplitudes).
X_train = tf.image.resize(X_train[:], (256, 1)).numpy()
X_test = tf.image.resize(X_test[:], (256, 1)).numpy()
X_train, X_test = tf.squeeze(X_train).numpy(), tf.squeeze(X_test).numpy()
To process classical data on quantum hardware, we must first map the classical data to a quantum state. Here, we use the amplitude embedding scheme.
def data_embedding(X):
    qml.AmplitudeEmbedding(X, wires=range(8), normalize=True)
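As a rough illustration of what amplitude embedding does, the sketch below normalizes a feature vector by hand with plain NumPy. The vector `x` is a hypothetical example input, and the assumption is that `normalize=True` corresponds to dividing by the Euclidean norm.

```python
import numpy as np

n_qubits = 8

# Hypothetical 256-dimensional feature vector (one entry per amplitude).
x = np.linspace(0.0, 1.0, 2 ** n_qubits)

# Dividing by the Euclidean norm gives a valid quantum state vector.
state = x / np.linalg.norm(x)

# Measurement probabilities are the squared magnitudes of the amplitudes.
probs = np.abs(state) ** 2
print(state.shape)  # (256,)
```

Because the state must have unit norm, only the direction of the input vector is encoded, and an all-zero input cannot be normalized.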
The QCNN model uses local trainable unitaries. Here, we are interested in a 2-local architecture. Below is a parameterized circuit that can express a general SU(4) unitary with 15 parameters.
def U_SU4(params, wires):  # 15 params
    qml.U3(params[0], params[1], params[2], wires=wires[0])
    qml.U3(params[3], params[4], params[5], wires=wires[1])
    qml.CNOT(wires=[wires[0], wires[1]])
    qml.RY(params[6], wires=wires[0])
    qml.RZ(params[7], wires=wires[1])
    qml.CNOT(wires=[wires[1], wires[0]])
    qml.RY(params[8], wires=wires[0])
    qml.CNOT(wires=[wires[0], wires[1]])
    qml.U3(params[9], params[10], params[11], wires=wires[0])
    qml.U3(params[12], params[13], params[14], wires=wires[1])
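The count of 15 matches the dimension of SU(4), which is 4^2 - 1 = 15. As a sanity check, the gate sequence above can be multiplied out as 4 × 4 matrices. The sketch below uses plain NumPy and assumes the usual matrix conventions for U3, RY, RZ, and CNOT, with the first wire as the most significant bit; the helper names are hypothetical and not part of the tutorial code.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)

def u3(theta, phi, delta):
    # Assumed convention for the general single-qubit rotation U3.
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([
        [c, -np.exp(1j * delta) * s],
        [np.exp(1j * phi) * s, np.exp(1j * (phi + delta)) * c],
    ])

def ry(theta):
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def rz(theta):
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])

# Basis order |q0 q1>: CNOT_01 has control q0, CNOT_10 has control q1.
CNOT_01 = np.array([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex)
CNOT_10 = np.array([[1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]], dtype=complex)

def su4_matrix(params):
    # Gates listed in the order they are applied in the circuit.
    gates = [
        np.kron(u3(*params[0:3]), I2),
        np.kron(I2, u3(*params[3:6])),
        CNOT_01,
        np.kron(ry(params[6]), I2),
        np.kron(I2, rz(params[7])),
        CNOT_10,
        np.kron(ry(params[8]), I2),
        CNOT_01,
        np.kron(u3(*params[9:12]), I2),
        np.kron(I2, u3(*params[12:15])),
    ]
    U = np.eye(4, dtype=complex)
    for g in gates:
        U = g @ U  # later gates multiply from the left
    return U

rng = np.random.default_rng(0)
U = su4_matrix(rng.normal(size=15))
is_unitary = np.allclose(U.conj().T @ U, np.eye(4))
print(is_unitary)  # True
```

The resulting matrix is unitary for any choice of the 15 parameters. Its determinant need not be exactly 1, since a global phase has no physical effect.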
Here, we define the QCNN structure. In each convolutional layer, we apply a two-qubit unitary to every pair of nearest-neighbour qubits. We assume periodic boundary conditions, so the first and last qubits are connected as well. For the pooling layer, we trace out qubits without applying any additional gates.
def conv_layer1(U, params):
    U(params, wires=[0, 7])
    for i in range(0, 8, 2):
        U(params, wires=[i, i + 1])
    for i in range(1, 7, 2):
        U(params, wires=[i, i + 1])

def conv_layer2(U, params):
    U(params, wires=[0, 6])
    U(params, wires=[0, 2])
    U(params, wires=[4, 6])
    U(params, wires=[2, 4])

def conv_layer3(U, params):
    U(params, wires=[0, 4])

def QCNN_structure_without_pooling(U, params):
    param1 = params[0:15]
    param2 = params[15:30]
    param3 = params[30:45]
    conv_layer1(U, param1)
    conv_layer2(U, param2)
    conv_layer3(U, param3)
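To make the shrinking structure explicit, the sketch below lists the wire pairs each convolutional layer acts on and derives the qubits that remain active. It is plain Python that mirrors the layer definitions above; the variable names are illustrative only.

```python
# Wire pairs acted on by each convolutional layer, in application order.
layer1 = [(0, 7)] + [(i, i + 1) for i in range(0, 8, 2)] + [(i, i + 1) for i in range(1, 7, 2)]
layer2 = [(0, 6), (0, 2), (4, 6), (2, 4)]
layer3 = [(0, 4)]
layers = [layer1, layer2, layer3]

# Qubits touched by each layer: 8, then 4, then 2.
active = [sorted({w for pair in layer for w in pair}) for layer in layers]

# Every layer shares one 15-parameter unitary across all of its pairs.
n_params = 15 * len(layers)

for i, wires in enumerate(active, start=1):
    print(f"layer {i}: {len(layers[i - 1])} unitaries on wires {wires}")
print("total parameters:", n_params)  # 45
```

Because the same parameters are reused for every pair within a layer, the model has only 45 trainable parameters in total, regardless of how many unitaries each layer applies.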
The full QCNN model is constructed by concatenating the data embedding, the QCNN ansatz, and the measurement.
dev = qml.device('default.qubit', wires=8)
@qml.qnode(dev)
def QCNN(X, params):
    data_embedding(X)
    QCNN_structure_without_pooling(U_SU4, params)
    result = qml.expval(qml.PauliZ(4))
    return result
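The model output is the expectation value of Pauli-Z on wire 4, so it lies in the interval [-1, 1]. The sketch below computes this quantity directly from a state vector with plain NumPy. It assumes that wire 0 corresponds to the most significant bit of the basis-state index; `expval_z` is a hypothetical helper, not a PennyLane function.

```python
import numpy as np

def expval_z(state, wire, n_wires=8):
    """Expectation value of Pauli-Z on one wire of a pure state."""
    probs = np.abs(state) ** 2
    idx = np.arange(len(state))
    # Extract the bit belonging to `wire` (wire 0 = most significant bit).
    bits = (idx >> (n_wires - 1 - wire)) & 1
    # Z has eigenvalue +1 for bit 0 and -1 for bit 1.
    return float(np.sum(probs * (1 - 2 * bits)))

zero = np.zeros(256)
zero[0] = 1.0           # |00000000>
flipped = np.zeros(256)
flipped[1 << 3] = 1.0   # only wire 4 is in |1>

print(expval_z(zero, 4))     # 1.0
print(expval_z(flipped, 4))  # -1.0
```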
We train the QCNN model with the mean squared error (MSE) loss function. Here, we train for 100 steps with a batch size of 25 and a learning rate of 0.01.
def square_loss(labels, predictions):
    loss = 0
    for l, p in zip(labels, predictions):
        loss = loss + (l - p) ** 2
    loss = loss / len(labels)
    return loss

def cost(params, X, Y):
    predictions = [QCNN(x, params) for x in X]
    loss = square_loss(Y, predictions)
    return loss
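For reference, the loop in `square_loss` computes the mean of the squared differences between labels and predictions. A vectorized equivalent in plain NumPy might look like the sketch below. The function name is hypothetical, and because it uses plain NumPy it is meant for checking values rather than as a drop-in replacement inside the differentiable cost function.

```python
import numpy as np

def square_loss_vectorized(labels, predictions):
    """Mean squared error between labels and predictions."""
    labels = np.asarray(labels, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    return np.mean((labels - predictions) ** 2)

# (0 - 0)^2 + (1 - 0.5)^2 + (1 - 1)^2 = 0.25, averaged over 3 samples.
loss_value = square_loss_vectorized([0, 1, 1], [0.0, 0.5, 1.0])
print(loss_value)
```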
steps = 100
learning_rate = 0.01
batch_size = 25
params = np.random.randn(45, requires_grad=True)
opt = qml.NesterovMomentumOptimizer(stepsize=learning_rate)
loss_history = []
for it in range(steps):
    batch_index = np.random.randint(0, len(X_train), (batch_size,))
    X_batch = [X_train[i] for i in batch_index]
    Y_batch = [Y_train[i] for i in batch_index]
    params, cost_new = opt.step_and_cost(lambda v: cost(v, X_batch, Y_batch), params)
    loss_history.append(cost_new)
    if it % 10 == 0:
        print("iteration: ", it, " cost: ", cost_new)
/Users/tak/anaconda3/envs/QC/lib/python3.11/site-packages/autograd/numpy/numpy_vjps.py:698: ComplexWarning: Casting complex values to real discards the imaginary part
  onp.add.at(A, idx, x)
iteration: 0 cost: 0.49934547842218835
iteration: 10 cost: 0.2503536678322024
iteration: 20 cost: 0.11762641399484976
iteration: 30 cost: 0.07426772535056775
iteration: 40 cost: 0.07729063180222724
iteration: 50 cost: 0.09388518272820143
iteration: 60 cost: 0.0797389077599842
iteration: 70 cost: 0.08666048601336815
iteration: 80 cost: 0.0826604040027213
iteration: 90 cost: 0.0870977108502367
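To give some intuition for the optimizer used in the training loop, here is a minimal sketch of one common formulation of Nesterov momentum, applied to a simple quadratic function instead of the QCNN cost. The momentum coefficient of 0.9 and the exact form of the update rule are assumptions made for illustration; they are not taken from the tutorial or verified against the PennyLane implementation.

```python
import numpy as np

def f(x):
    return float(np.sum(x ** 2))

def grad_f(x):
    return 2.0 * x  # gradient of f(x) = sum(x**2)

stepsize, momentum = 0.01, 0.9  # momentum value is an assumption
x = np.array([1.0, -2.0])
velocity = np.zeros_like(x)
initial_loss = f(x)

for _ in range(100):
    # Evaluate the gradient at the "look-ahead" point, not at x itself.
    lookahead = x - momentum * velocity
    velocity = momentum * velocity + stepsize * grad_f(lookahead)
    x = x - velocity

final_loss = f(x)
print(initial_loss, final_loss)
```

The accumulated velocity lets the optimizer make faster progress than plain gradient descent with the same small step size, which is one reason the loss in the log above drops quickly in the first few dozen iterations before fluctuating from batch to batch.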
Now that we have trained the QCNN model, let's see how well it classifies the images.
def accuracy_test(predictions, labels):
    acc = 0
    for l, p in zip(labels, predictions):
        if np.abs(l - p) < 0.5:
            acc = acc + 1
    return acc / len(labels)
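The accuracy criterion counts a prediction as correct when it lies within 0.5 of the label. A vectorized version in plain NumPy is sketched below; the function name and the sample values are hypothetical.

```python
import numpy as np

def accuracy_vectorized(predictions, labels):
    """Fraction of predictions that lie within 0.5 of their label."""
    predictions = np.asarray(predictions, dtype=float)
    labels = np.asarray(labels, dtype=float)
    return float(np.mean(np.abs(labels - predictions) < 0.5))

# The last prediction (0.3) is more than 0.5 away from its label (1).
acc = accuracy_vectorized([0.1, 0.9, -0.2, 0.3], [0, 1, 0, 1])
print(acc)  # 0.75
```

Note that the model output ranges over [-1, 1] while the labels are 0 and 1, so under this criterion an output below -0.5 is counted as incorrect for either label.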
predictions = [QCNN(x, params) for x in X_test]
accuracy = accuracy_test(predictions, Y_test)
accuracy = accuracy * 100
print(f"Test data accuracy with QCNN model: {accuracy:.3}%")
Test data accuracy with QCNN model: 95.8%
Our simple QCNN model classifies the MNIST digits 0 and 1 with 95.8% test accuracy!
In this introductory tutorial, we only looked at a QCNN with resizing (interpolation) as the classical pre-processing, amplitude embedding, an SU(4) ansatz without pooling layers, and the MSE loss function, applied to the MNIST dataset. For more results, see the original paper and code!