Note

Toward cost-effective quantum circuit simulation with performance tuning techniques

Article: 2349541 | Received 18 Jul 2023, Accepted 25 Apr 2024, Published online: 09 May 2024
 

Abstract

Quantum circuit simulation is a popular approach to evaluating novel quantum algorithms before a physical quantum computer is available. Unfortunately, the simulation is often done with the full-state scheme, and the huge memory space that a full-state simulator requires limits the size of the qubit system that can be simulated. In this work, storage devices are introduced into full-state quantum circuit simulation to support larger qubit systems. A vertical qubit simulation design is proposed for storage-based, full-state quantum circuit simulation, covering qubit representation, the threading model, and parallel state manipulations over storage devices. An empirical method for tuning simulator parameters is developed to achieve higher simulation performance. Our experimental results show that, compared with the state-of-the-art memory-only simulator (QuEST), the storage-based simulation achieves a 61x higher cost-delay ratio and can simulate a 39-qubit system on a commodity computer. These encouraging results indicate that the proposed simulator can help scale full-state simulation to larger quantum circuits and achieve higher performance through the performance tuning method.
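
To give a sense of why memory limits full-state simulation, the following minimal sketch estimates the size of a full state vector for a given number of qubits. It assumes each amplitude is stored as a double-precision complex number (16 bytes); this is an assumption made for illustration, not a figure taken from the article, and the function names are hypothetical.

```python
# Assumption: one amplitude = one double-precision complex number = 16 bytes.
BYTES_PER_AMPLITUDE = 16


def state_vector_bytes(num_qubits: int) -> int:
    """Bytes needed to hold all 2**num_qubits amplitudes of a full state."""
    return (2 ** num_qubits) * BYTES_PER_AMPLITUDE


def human_readable(num_bytes: int) -> str:
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]
    value = float(num_bytes)
    for unit in units:
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == units[-1]:
            return f"{value:.0f} {unit}"
        value /= 1024


if __name__ == "__main__":
    for n in (30, 35, 39):
        print(f"{n} qubits: {human_readable(state_vector_bytes(n))}")
```

Under this assumption, every additional qubit doubles the footprint, so a 39-qubit state occupies 2^43 bytes (8 TiB), which is beyond the main memory of a commodity computer but within reach of a set of SSDs.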

Disclosure statement

No potential conflict of interest was reported by the author(s).

Notes

1 SnuQS is available at https://github.com/mcrl/SnuQS.

2 Based on the experimental results provided by SnuQS, the simulation time grows log-linearly with the simulated qubit size.

3 The following quantum gates are tested: H gate, S gate, T gate, X gate, Y gate, Z gate, P gate, Unitary 1-qubit gate, CX gate, CY gate, CZ gate, CP gate, Unitary 2-qubit gate, SWAP gate, and Unitary U3 gate.

4 This is the requirement that the product of a gate matrix and its conjugate transpose, taken in either order, be the identity matrix, expressed as $UU^{\dagger} = U^{\dagger}U = I$.

5 When C is eight, the chunk size is 4,096 bytes ($2^{8+4} = 4096$). Please refer to Section 3.2 for details.

6 In this experiment, the parameters are set as follows: T = 6, F = 6, C = 12, and M = N − 18, with direct I/O turned on when N > 30.

7 With the two mechanisms, quantum states can be kept in the memory without incurring file I/O operations, provided that there is sufficient available memory.

8 QuEST is optimised for the diagonal matrix computations, representing the CPhase gates frequently used in QFT.

9 A PCIe adaptor is used to mount eight SSDs.

10 The experimental results assume that the simulation of each quantum gate can incur one external data access in the gate-by-gate simulation scheme.

11 The cost of the computer cluster with sixteen machine nodes and 256 TB of SSDs is around US$128,000.
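
Notes 4 and 5 can be illustrated with a short sketch. The first part checks the unitarity condition of Note 4 (a gate times its conjugate transpose, in either order, gives the identity) for small gate matrices; the second reads the chunk size in Note 5 as 2^(C+4) bytes. The helper names and the tolerance are hypothetical choices for illustration and are not taken from the simulator described in the article.

```python
import cmath
import math


def conjugate_transpose(m):
    """Return the conjugate transpose of a matrix given as a list of rows."""
    rows, cols = len(m), len(m[0])
    return [[complex(m[r][c]).conjugate() for r in range(rows)]
            for c in range(cols)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def is_unitary(u, tol=1e-9):
    """Check that U * U^dagger and U^dagger * U both equal the identity."""
    n = len(u)
    u_dag = conjugate_transpose(u)
    for product in (matmul(u, u_dag), matmul(u_dag, u)):
        for i in range(n):
            for j in range(n):
                expected = 1.0 if i == j else 0.0
                if abs(product[i][j] - expected) > tol:
                    return False
    return True


def chunk_size_bytes(c):
    """Chunk size in bytes for parameter C, read from Note 5 as 2**(C + 4)."""
    return 2 ** (c + 4)


# Two of the gates listed in Note 3, plus a matrix that is not unitary.
s = 1 / math.sqrt(2)
H_GATE = [[s, s], [s, -s]]
T_GATE = [[1, 0], [0, cmath.exp(1j * math.pi / 4)]]
NOT_A_GATE = [[1, 1], [0, 1]]

if __name__ == "__main__":
    print(is_unitary(H_GATE), is_unitary(T_GATE), is_unitary(NOT_A_GATE))
    print(chunk_size_bytes(8), chunk_size_bytes(12))
```

A tolerance is used in the comparison because entries such as 1/√2 are not exactly representable in floating point, so the product only approximates the identity.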

Additional information

Funding

This work was supported by the National Science and Technology Council.