Experimental Quantum Advantage with Quantum Coupon Collector

An increasing number of communication and computational schemes with quantum advantages have recently been proposed, which implies that quantum technology has fertile application prospects. However, demonstrating these schemes experimentally continues to be a central challenge because of the difficulty in preparing high-dimensional states or highly entangled states. In this study, we introduce and analyze a quantum coupon collector protocol by employing coherent states and simple linear optical elements, which was successfully demonstrated using realistic experimental equipment. We showed that our protocol can significantly reduce the number of samples needed to learn a specific set compared with the classical limit of the coupon collector problem. We also discuss the potential values and expansions of the quantum coupon collector by constructing a quantum blind box game. The information transmitted by the proposed game also broke the classical limit. These results strongly prove the advantages of quantum mechanics in machine learning and communication complexity.


Introduction
The "second quantum revolution" is aimed at exploring the superiority of quantum resources over classical resources in terms of communication, computation, and artificial intelligence. To demonstrate that this goal is feasible in practice, a series of schemes with quantum advantages were experimentally implemented. These schemes included improving on the security of communication [1][2][3][4][5][6][7][8][9][10][11][12], enhancing computational power for specific tasks [13][14][15][16][17][18][19][20][21][22][23][24][25], and reducing the necessary resources used to complete specific communication tasks [26][27][28][29][30][31]. In addition, machine learning can extract useful knowledge from data, which can then have a significant impact on productivity, technology, and the economy [32]. This has led to an increasing interest in the question of quantum machine learning [33]: Can we improve machine learning by using quantum resources? Owing to the unique entanglement properties of quantum states, quantum models may be able to produce atypical patterns that cannot be effectively produced by classical models or effectively reduce training time. Therefore, some studies have made bold attempts. For example, in Ref. [34], a quantum autoencoder that can successfully denoise specific quantum states subjected to specific noises was developed. Reference [35] described a quantum neural network that can accurately recognize quantum states associated with a one-dimensional symmetry-protected topological phase. Moreover, there are several other excellent studies [36][37][38][39][40][41].
On the other hand, most of these attempts are heuristic and have not theoretically proven that quantum machine learning exhibits a better performance or shorter training time than classical machine learning. The training time of a model includes the time complexity of the learning algorithm. If the algorithm takes a constant amount of time for processing each sample, the concern for the time complexity translates into a concern of the sample complexity. Probably approximately correct (PAC) learning theory [42,43] provides the minimum number of samples necessary for a learning algorithm to complete a learning task. Researching quantum machine learning using this theory can therefore lay a positive theoretical foundation for exploring quantum advantages in machine learning.
Reference [44] was the first to use PAC learning theory to rigorously prove that quantum technology can provide a learning algorithm with a quantum advantage for machine learning. The method applied clever quantum measurements to the learning task known as the "coupon collector problem" [45]. Specifically, Ref. [44] gives a surprising result: for the coupon collector problem, the sample complexity of the quantum learning algorithm does not change with changes in the search space of the algorithm. This result is impossible in classical machine learning [42]. However, the experimental demonstration of this algorithm [44] requires high-dimensional states that are difficult to prepare. Even if these states are decomposed into a tensor product of qubits, these qubits must be highly entangled [26,46,47]. These requirements are far beyond the scope of current technology. More seriously, the algorithm requires specific projective measurements, which are difficult to implement experimentally.
In this work, we experimentally demonstrate the quantum coupon collector algorithm by proposing a coherent-state quantum coupon collector protocol. Our protocol avoids the abovementioned difficulties. To do this, our protocol not only maintains the important properties of the original one but also introduces new conceptual tools that can be implemented using only linear optics operations and single-photon detectors. Even without a quantum computer, these tools enable us to demonstrate the quantum advantages in machine learning at the current technological level. Moreover, the coupon collector problem can be considered as a communication task. Similar to quantum fingerprinting [27,28], our protocol also experimentally demonstrates the advantages of quantum mechanics in the context of communication complexity. These results, in addition to their fundamental interest [48][49][50], will further inspire new designs of communication systems, large-scale integration circuit designs, and data structures [51], thus paving the way for other communication or computational tasks that rely on similar principles.

Coherent-State Quantum Coupon Collector Protocol.
To clearly describe our protocol, we briefly introduce the coupon collector problem [45]. This problem can be abstracted as learning exactly an unknown set $S$. Specifically, this set $S$ is limited to a subset of the set $[n] := \{1, \dots, n\}$, where the size $k$ $(k < n)$ of set $S$ is known. To learn the set $S$, several copies of $S$ are given, and only one element is allowed to be extracted from each copy. The task is to determine the minimum number of copies required to learn $S$ exactly.
Because the elements in $S$ are independent of each other, the best strategy for learning $S$ is to randomly extract one of these elements in each copy. Under this strategy, if $i$ $(i < k)$ distinct elements have been obtained, the expected number of copies needed to learn the $(i+1)$th element is $k/(k-i)$. Therefore, the expected number of copies needed to learn $S$ is $\sum_{i=0}^{k-1} k/(k-i) \sim k \ln k$. Continuing based on this, Ref. [45] shows that $\Theta(k \log_2 k)$ copies are necessary and sufficient for learning $S$ with a high probability.
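The expectation above can be checked numerically. The following is a minimal sketch, assuming uniform random extraction from each copy; the function names `expected_draws` and `simulate_draws` are illustrative and do not come from Ref. [45].

```python
import math
import random


def expected_draws(k: int) -> float:
    """Exact expectation sum_{i=0}^{k-1} k/(k-i), i.e. k times the k-th harmonic number."""
    return sum(k / (k - i) for i in range(k))


def simulate_draws(k: int, rng: random.Random) -> int:
    """Draw uniformly from k coupons until every coupon has been seen once."""
    seen = set()
    draws = 0
    while len(seen) < k:
        seen.add(rng.randrange(k))
        draws += 1
    return draws


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, only for reproducibility of the sketch
    for k in (10, 100, 1000):
        trials = 50
        mean_sim = sum(simulate_draws(k, rng) for _ in range(trials)) / trials
        # Columns: k, exact expectation, leading-order estimate k ln k, simulated mean
        print(k, round(expected_draws(k), 1), round(k * math.log(k), 1), round(mean_sim, 1))
```

The exact sum exceeds $k \ln k$ by a term linear in $k$, which is why the text states the result only asymptotically.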
However, if these copies are quantum copies in the form of $|S\rangle := (1/\sqrt{k}) \sum_{i \in S} |i\rangle$, the number of copies required can be further reduced. This is because we can perform more quantum operations on $|S\rangle$ before measuring them on the computational basis. In other words, by using quantum copies $|S\rangle$, the strategy for learning $S$ is no longer limited to random sampling. Reference [44] shows that when the number of missing elements $m = n - k$ is very small, the number of copies of $|S\rangle$ used to learn $S$ is reduced to $\Theta(k \log_2 (m+1))$. The method first performs a 2-outcome projective measurement with operators $|[n]\rangle\langle[n]|$ and $I - |[n]\rangle\langle[n]|$ on copies of $|S\rangle$, where $|[n]\rangle := (1/\sqrt{n}) \sum_{i \in [n]} |i\rangle$. After the measurement, the second outcome $|\psi\rangle = \sqrt{m/n}\,|S\rangle - \sqrt{k/n}\,|\bar{S}\rangle$ is obtained with probability $m/n$, where $\bar{S}$ is the set of missing elements, and $|\bar{S}\rangle := (1/\sqrt{m}) \sum_{i \in \bar{S}} |i\rangle$. Then, $|\psi\rangle$ is measured on the computational basis to obtain the missing elements. This method transforms $|S\rangle$ into $|\psi\rangle$ and then infers the elements in $|S\rangle$ by learning the elements in $|\bar{S}\rangle$ from $|\psi\rangle$. Because $|\bar{S}\rangle$ has fewer elements, this method reduces the number of copies required to learn $S$. However, difficulties arise when we attempt to demonstrate this method experimentally. This is because it is difficult to experimentally construct a single $k$-dimensional quantum state [52][53][54]. Even if this state is decomposed into a tensor product of qubits, these qubits must be highly entangled [26,46,47]. This is also not feasible using current technology. More seriously, the operators $|[n]\rangle\langle[n]|$ and $I - |[n]\rangle\langle[n]|$ in this method are difficult to implement in experiments.
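The second measurement outcome can be verified with plain vector arithmetic. The sketch below is an illustration only, not the implementation of Ref. [44]: it applies the projector $I - |[n]\rangle\langle[n]|$ to $|S\rangle$ with real amplitudes, and the helper name `second_outcome` is hypothetical.

```python
import math


def second_outcome(n: int, S: set) -> tuple:
    """Apply I - |[n]><[n]| to |S>; return (outcome probability, normalized state).

    Elements are labeled 1..n, and S must be a proper subset (k < n), as in the text.
    """
    k = len(S)
    if not 0 < k < n:
        raise ValueError("S must be a nonempty proper subset of [n]")
    s = [1 / math.sqrt(k) if i in S else 0.0 for i in range(1, n + 1)]
    overlap = sum(s) / math.sqrt(n)                 # <[n]|S> = sqrt(k/n)
    proj = [a - overlap / math.sqrt(n) for a in s]  # unnormalized post-measurement state
    prob = sum(a * a for a in proj)                 # equals m/n
    psi = [a / math.sqrt(prob) for a in proj]
    return prob, psi


if __name__ == "__main__":
    n, S = 10, set(range(1, 9))                     # k = 8, m = 2 missing elements
    prob, psi = second_outcome(n, S)
    missing_weight = sum(psi[i - 1] ** 2 for i in range(1, n + 1) if i not in S)
    print(prob, missing_weight)                     # compare with m/n and k/n
```

Measuring the resulting state on the computational basis returns a missing element with total weight $k/n$, which is why this outcome is useful for learning $\bar{S}$ when $m$ is small.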
Therefore, we introduce an alternative scheme, which is defined as the "coherent-state quantum coupon collector protocol." This scheme maintains the main idea of the original one, which is to learn $S$ by measuring the missing elements. In addition, this scheme uses a sequence of coherent states to implement copies of $|S\rangle$. Coherent states are easy to prepare and can be transformed using simple linear optical elements. Therefore, this scheme is particularly attractive from a practical point of view.
In our scheme, copies of $|S\rangle$ are implemented using a time sequence of $n$ weak coherent optical pulses
$$|\alpha, S\rangle := \bigotimes_{i=1}^{n} |(-1)^{j}\alpha\rangle_i,$$
where $\alpha$ is a complex number and $|(-1)^{j}\alpha\rangle_i$ is a coherent state with amplitude $(-1)^{j}\alpha$ at the $i$th time mode, where $j = 0$ for $i \in S$ and $j = 1$ otherwise. The phases of these coherent pulses depend on $S$, but their intensities are the same. Thus, the state $|\alpha, S\rangle$ has the mean photon number $\mu = n|\alpha|^2$.
Note that $|S\rangle$ is given by projecting the state $|\tilde{\alpha}, S\rangle := \bigotimes_{i \in S} |\alpha\rangle_i$ into the single-photon subspace. The total intensity $k|\alpha|^2$ of $|\tilde{\alpha}, S\rangle$ represents the number of copies of $|S\rangle$. However, our scheme uses $|\alpha, S\rangle$ instead of $|\tilde{\alpha}, S\rangle$. In other words, when $i \notin S$, our scheme sends state $|{-\alpha}\rangle_i$ instead of a vacuum state. This method can improve the efficiency of detecting missing elements, but it also causes the number of copies used by our scheme to be slightly different from that of a scheme using $|\tilde{\alpha}, S\rangle$. Specifically, the number of copies used by our scheme is at most $O(n|\alpha|^2)$. For comparison, the number of copies used by a scheme using $|\tilde{\alpha}, S\rangle$ is at most $O(k|\alpha|^2)$. However, as we will see later, this subtle difference can be ignored, and using $|\alpha, S\rangle$ greatly improves the efficiency of detecting missing elements.
As we discussed previously, our scheme maintains the important characteristics of the original one; that is, the time bin $i$ of state $|{-\alpha}\rangle_i$ in $|\alpha, S\rangle$ is found through complementary measurements, and the elements in $S$ are derived from these time bins. To do this, a local state $|\alpha, [n]\rangle$ is prepared and sent to a 50 : 50 beam splitter (BS) to interfere with $|\alpha, S\rangle$ (Figure 1(a)), where
$$|\alpha, [n]\rangle := \bigotimes_{i=1}^{n} |\alpha\rangle_i.$$
The interference result is recorded by a single-photon detector $D_R$. If the detector $D_R$ clicks at the $i$th time bin, then we consider the pulse of the $i$th time bin in $|\alpha, S\rangle$ to be $|{-\alpha}\rangle_i$. Thus, all time bins containing $|{-\alpha}\rangle$ in $|\alpha, S\rangle$ can be learned exactly from the outcomes of the detector $D_R$. Without loss of generality, let Alice be a coupon maker and Bob be a coupon collector.
Bob's task is to learn all elements in $S$ from the state $|\alpha, S\rangle$ prepared by Alice. The detailed steps are described as follows:
(1) Alice selects $k$ elements from set $[n]$ as set $S$ and selects an appropriate value $|\alpha|^2$ as the intensity of each pulse
(2) Alice encodes the pulses $|\alpha, S\rangle$ according to $S$ and the intensity $|\alpha|^2$; if $i \in S$, Alice prepares a coherent pulse $|\alpha\rangle_i$ at the $i$th time bin and sends it to Bob; otherwise, Alice sends $|{-\alpha}\rangle_i$ to Bob
(3) Alice announces the value of $|\alpha|^2$, the elements of $[n]$, and the size $k$ of $S$
(4) Bob encodes the pulses $|\alpha, [n]\rangle$ according to $[n]$ and intensity $|\alpha|^2$: Bob prepares a coherent pulse $|\alpha\rangle$ for all time bins
(5) Bob uses a 50 : 50 beam splitter and two single-photon detectors to perform interference measurements on the pulses $|\alpha, S\rangle$ and $|\alpha, [n]\rangle$
(6) Bob records whether the detector $D_R$ clicks or not at each time bin, and then learns $S$ exactly based on the outcomes of the detector $D_R$ and the size $k$ of the set $S$
At each time bin $i$, the output state after the BS is
$$\left|\frac{1 + (-1)^{j}}{\sqrt{2}}\,\alpha\right\rangle_{D_L} \otimes \left|\frac{1 - (-1)^{j}}{\sqrt{2}}\,\alpha\right\rangle_{D_R}.$$
Because $D_L$ is unnecessary in this work, it is not drawn in Figure 1(a). It is easy to verify that in the ideal case, the value of $j$ determines whether the pulses output by the BS go to $D_L$ or $D_R$, thus helping Bob to learn the state $|\alpha, S\rangle$.
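Steps (1) to (6) can be mimicked by a small Monte Carlo model. This is a sketch under stated assumptions rather than the experimental procedure: the beam splitter is ideal, $\alpha$ is taken real and positive (only $|\alpha|^2$ matters here), the detector is a lossless threshold detector without dark counts, and the names `bs_output_intensities` and `run_protocol` are hypothetical.

```python
import math
import random
from typing import Optional


def bs_output_intensities(alpha_sq: float, in_S: bool) -> tuple:
    """Mean photon numbers reaching (D_L, D_R) for one time bin of an ideal 50:50 BS."""
    sign = 1.0 if in_S else -1.0                # Alice sends |alpha> if i is in S, else |-alpha>
    amp = math.sqrt(alpha_sq)                   # Bob's local pulse is always |alpha>
    left = (amp + sign * amp) / math.sqrt(2)
    right = (amp - sign * amp) / math.sqrt(2)
    return left ** 2, right ** 2


def run_protocol(n: int, S: set, alpha_sq: float, rng: random.Random) -> Optional[set]:
    """Return Bob's estimate of S, or None when the click count differs from n - k."""
    k = len(S)
    clicked = set()
    for i in range(1, n + 1):
        _, mu_right = bs_output_intensities(alpha_sq, i in S)
        # A threshold detector clicks with probability 1 - exp(-mean photon number).
        if rng.random() < 1 - math.exp(-mu_right):
            clicked.add(i)
    if len(clicked) != n - k:
        return None                             # Bob knows k, so he can discard this run
    return set(range(1, n + 1)) - clicked


if __name__ == "__main__":
    rng = random.Random(0)
    S = set(range(1, 101)) - {7, 42}            # n = 100, m = 2 missing elements
    runs = [run_protocol(100, S, 1.0, rng) for _ in range(1000)]
    print(sum(r == S for r in runs) / len(runs))  # empirical success rate at |alpha|^2 = 1
```

In this ideal model a time bin with $i \in S$ never produces a click at $D_R$, so every usable run is also a correct one; the only failure mode is a missed $|{-\alpha}\rangle_i$ pulse.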
However, even under ideal conditions, the proposed scheme has an intrinsic failure probability. This is because a coherent pulse may collapse to a vacuum state after being measured and thus cannot be detected by detector $D_R$. Specifically, when Alice sends the coherent pulse $|\alpha\rangle_i$, $D_R$ will never click. Therefore, Bob can always obtain the correct results. However, when Alice sends a coherent pulse $|{-\alpha}\rangle_i$, the detector $D_R$ has a nonclick probability of $P_{\mathrm{not}} = e^{-2|\alpha|^2}$ (the detection efficiency is assumed to be 100% in the ideal case). Note that if Alice sends a vacuum state when $i \notin S$, then the nonclick probability of $D_R$ is increased to $e^{-|\alpha|^2/2}$. This is why when $i \notin S$, our scheme sends $|{-\alpha}\rangle_i$ instead of a vacuum state. Only when detector $D_R$ detects all $|{-\alpha}\rangle_i$ sent by Alice can we infer the elements in $S$. Therefore, the success probability of our scheme without experimental imperfections is $P(m) = (1 - P_{\mathrm{not}})^m$, where $m$ is the number of missing elements of $S$. Note that although the use of the coherent-state sequence $|\alpha, S\rangle$ for implementing copies of $|S\rangle$ is easier to demonstrate experimentally, it also introduces an intrinsic failure probability. Fortunately, as we will see later, this failure probability is negligible compared to the failure probability introduced by experimental imperfections.
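The ideal-case formulas above are easy to evaluate. The following sketch computes the nonclick probabilities for both encodings and inverts $P(m) = (1 - P_{\mathrm{not}})^m$ for the intensity needed to reach a target success probability; the function names are illustrative.

```python
import math


def p_not(alpha_sq: float) -> float:
    """Nonclick probability at D_R when Alice sends |-alpha> (ideal detector)."""
    return math.exp(-2 * alpha_sq)


def p_not_vacuum(alpha_sq: float) -> float:
    """Nonclick probability at D_R if Alice sent vacuum for a missing element instead."""
    return math.exp(-alpha_sq / 2)


def success_probability(alpha_sq: float, m: int) -> float:
    """Ideal success probability P(m) = (1 - P_not)^m for m missing elements."""
    return (1 - p_not(alpha_sq)) ** m


def intensity_for_target(target: float, m: int) -> float:
    """Smallest |alpha|^2 such that (1 - exp(-2|alpha|^2))^m reaches the target."""
    return -0.5 * math.log(1 - target ** (1 / m))


if __name__ == "__main__":
    for m in (1, 2, 4):
        a = intensity_for_target(0.9, m)
        print(m, round(a, 4), round(success_probability(a, m), 4))
```

For any fixed intensity, `p_not` is smaller than `p_not_vacuum`, which is the quantitative reason the text gives for sending $|{-\alpha}\rangle_i$ rather than vacuum.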
To demonstrate the quantum advantage of our scheme experimentally, we need to eliminate the influence of the failure probability as much as possible. On the one hand, we can reduce the failure probability by increasing the mean photon number per coherent pulse. On the other hand, we can calculate the required number of copies based on the expectation of 100% success. However, these methods increase the number of copies required. Therefore, the selection of an appropriate mean photon number is particularly important.

Protocol in the Presence of Experimental Imperfections.
So far, we have only discussed the success probability of our protocol under ideal conditions. However, owing to experimental imperfections, the success probability formula of our protocol must be modified. We consider imperfect experimental models characterized by three parameters: the combined effects of limited detector efficiency and channel loss $\eta$, dark count rate $p_d$ of the single-photon detector, and the limited visibility $\nu$ of the interferometer.
By replacing $|\alpha, S\rangle$ with $|\sqrt{\eta}\alpha, S\rangle$, we can eliminate the effect of $\eta$ without changing the form of the success probability formula. However, $p_d$ and $\nu$ will cause $D_R$ to click at incorrect time bins with a nonzero probability; thus, the success probability formula must be modified. In this case, when Alice sends a pulse $|\alpha\rangle_i$, the probability $P_{\alpha}$ that detector $D_R$ clicks is given by [equation missing], and when Alice sends a pulse $|{-\alpha}\rangle_i$, the probability $P_{-\alpha}$ that the detector $D_R$ clicks is [equation missing].
Because $D_R$ may click when Alice sends $|\alpha\rangle_i$ and may not click when Alice sends $|{-\alpha}\rangle_i$, the number of clicks of the detector $D_R$ may differ from the number $n - k$ of missing elements implied by the size $k$ of $S$. In this case, Bob can determine that the experimental results are not available and must be discarded. Therefore, Alice and Bob need to repeat multiple experiments to obtain usable experimental data. Here, we define a new efficiency $E = M/N$ to measure the ratio of the number $M$ of available experimental results to the total number $N$ of experiments, which can be calculated by the following formula: [equation and accompanying definitions missing].
In addition, even if Bob obtains usable experimental results, $D_R$ has a probability of clicking at the wrong time bin, thus making him misjudge the elements in $S$. This requires us to define the correct probability $P(m, k) = P_{-\alpha}^{m} (1 - P_{\alpha})^{k} / E$ when the experimental results are available. Thus, the success probability is modified to $P_{\mathrm{suc}} = E \times P(m, k)$. Based on the expectation of 100% success, the number of quantum samples required to complete the task is [equation missing].
Without loss of generality, we choose a special set $S_j$ to verify the quantum advantage of our protocol, where $S_j = \{1, 2, \dots, j-1, j+1, \dots, n\}$ and size $k_j := |S_j| = n - 1$. Therefore, the correct probability can be simplified as $P(1, n-1) = P_{-\alpha} (1 - P_{\alpha})^{n-1} / E$. Based on the above formulae, we present the numerical simulation in Figures 1(b) and 1(c). The simulation results also guide our experimental demonstration.
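The relation between the efficiency, the correct probability, and the success probability can be illustrated numerically. The sketch below rests on two assumptions that are not taken from the text: clicks in different time bins are independent, and a run counts as usable exactly when the number of clicks equals $m = n - k$. The per-bin click probabilities are passed in as plain numbers (`p_minus` for $P_{-\alpha}$, `p_plus` for $P_{\alpha}$) because their dependence on $\eta$, $p_d$, and $\nu$ is not modeled here.

```python
import math


def efficiency(m: int, k: int, p_minus: float, p_plus: float) -> float:
    """Probability that exactly m of the m + k time bins click (assumed usability rule)."""
    total = 0.0
    for i in range(m + 1):          # i clicks among the m bins carrying |-alpha>
        j = m - i                   # remaining clicks fall among the k bins carrying |alpha>
        if j > k:
            continue
        total += (math.comb(m, i) * p_minus ** i * (1 - p_minus) ** (m - i)
                  * math.comb(k, j) * p_plus ** j * (1 - p_plus) ** (k - j))
    return total


def success_probability(m: int, k: int, p_minus: float, p_plus: float) -> float:
    """P_suc = P_{-alpha}^m (1 - P_alpha)^k: every missing bin clicks, no other bin does."""
    return p_minus ** m * (1 - p_plus) ** k


def correct_probability(m: int, k: int, p_minus: float, p_plus: float) -> float:
    """P(m, k) = P_suc / E, the probability of a correct answer given a usable run."""
    return success_probability(m, k, p_minus, p_plus) / efficiency(m, k, p_minus, p_plus)


if __name__ == "__main__":
    # Illustrative inputs only; these are not the experimental values.
    m, k, p_minus, p_plus = 1, 9999, 0.95, 1e-5
    print(efficiency(m, k, p_minus, p_plus), correct_probability(m, k, p_minus, p_plus))
```

With `p_plus = 0` the correct probability is exactly 1, which matches the ideal case in which a usable run is always correct.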
Figure 1(b) shows that a higher intensity results in a higher correct probability $P(1, n-1)$. This is because $D_R$ can more easily detect a higher intensity $|{-\alpha}\rangle_i$ sent by Alice. In contrast, as the visibility $\nu$ of the interferometer is not perfect, higher intensity also increases the probability of $D_R$ clicking when Alice sends $|\alpha\rangle_i$. These two factors lead to the experimental efficiency $E$ and success probability $P_{\mathrm{suc}}$ increasing first and then decreasing with an increase in light intensity. $P_{\mathrm{suc}}$ directly affects the number of quantum samples required to complete the task. Therefore, we must choose an appropriate light intensity to achieve a balanced trade-off between improving the correct probability and reducing quantum resources. Figure 1(c) shows a comparison between the resources required by our protocol and the size of the set $[n]$ with the correct probability $P(1, n-1) = 90\%$. To highlight our quantum advantage, we also draw the classical limit of the required samples for comparison. The cost of our protocol is lower than the classical limit when $n < 29{,}000$. When $n < 20{,}000$, the samples required by our protocol are less than half of those required by the classical protocols.
Note that the objective of this article was to construct a specific computational task for experimentally demonstrating quantum advantages in the context of machine learning and communication complexity. Therefore, to simplify the implementation of our protocol, we do not consider the case in which Alice sends incorrect coherent pulses to Bob. In fact, this case belongs to a more complex communication task and is beyond the scope of this article.

Experimental Setup and Results
We used linear optics components and single-photon detectors to present a proof-of-principle experimental demonstration of the coherent-state quantum coupon collector protocol (Figure 2(a)). First, the continuous-wave light with a wavelength of 1550.12 nm emitted by a laser source was carefully modulated into optical pulses at a repetition rate of 312.5 MHz by using an intensity modulator (IM). To make the modulated waveform as perfect as possible, the modulated light was monitored by a single-photon detector instead of an oscilloscope during the modulation phase. Then, the modulated optical pulses were attenuated to the desired level by a variable optical attenuator (VOA) and separated by a 50 : 50 BS into two identical pulse sequences $|\alpha, [n]\rangle_A$ and $|\alpha, [n]\rangle_B$. These two pulse sequences travel clockwise and counterclockwise in the Sagnac loop. After passing through a phase modulator (PM), the clockwise pulse sequence $|\alpha, [n]\rangle_A$ is modulated to $|\alpha, S\rangle$ according to the value of the coupon $S$. To convert the counterclockwise sequence $|\alpha, [n]\rangle_B$ into the local state $|\alpha, [n]\rangle$, the PM is turned off when $|\alpha, [n]\rangle_B$ passes through it. We used the Sagnac loop to automatically stabilize the phase fluctuation of the channel to improve interference visibility. In addition, to make the two pulse sequences pass through the PM at different times, one arm of the Sagnac loop was designed to be 1 m longer than the other arm. The optical circulator was used to prevent the previous optical pulses transmitted back by the BS from affecting the subsequent optical pulses. Finally, the interference results were detected using a superconducting nanowire single-photon detector ($D_R$).
Our scheme combines all copies of the $|S\rangle$ required to learn $S$ into a sequence of $n$ coherent states. Compared with a state that consists of a single photon in $n$ modes, the coherent-state sequence is easier to prepare. Moreover, combining all copies of $|S\rangle$ enhances the mean photon number of each time mode of the coherent-state sequence, thus increasing the probability of measuring each mode. These features make our scheme easier to implement in experiments and more effective.
Note that we can only learn $S$ correctly if each element $i \in [n]$ is correctly classified as belonging to $S$ or $\bar{S}$. Therefore, even if $p_d$ and $\nu$ have a small effect on a single pulse, it is difficult to correctly classify all elements $i \in [n]$ when $n$ is very large. This means that $p_d$ and $\nu$ limit the maximum size of $[n]$ for which our protocol can achieve quantum advantage with a given correct probability. However, in our experiment, because $p_d$ can reach the order of $10^{-8}$ under the current experimental conditions, the effects of the random detection events caused by $p_d$ can be ignored. In contrast, the maximum visibility $\nu$ of an interferometer that can be achieved in a laboratory is on the order of $1 - 10^{-5}$. As a result, $\nu$ seriously limits the success probability of learning $S$, even if the size of $[n]$ is relatively small. In addition, $\eta$ directly affects the number of quantum samples required to complete the task. Therefore, $\nu$ and $\eta$ are the experimental parameters that need to be improved. When $\nu$ is improved in the future, our protocol can also achieve quantum advantage over $[n]$ with a larger size.
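The way a small per-pulse error accumulates over many time bins can be made concrete. The sketch below is only an illustration: it assumes each of the $k$ bins carrying $|\alpha\rangle_i$ independently produces a false click with a hypothetical probability `p_false`, and it does not model how `p_false` depends on $p_d$, $\nu$, or the intensity.

```python
import math


def prob_no_false_click(p_false: float, k: int) -> float:
    """Probability that none of k independent bins produces a false click."""
    return (1 - p_false) ** k


def max_bins_for_target(p_false: float, target: float) -> int:
    """Largest k for which prob_no_false_click(p_false, k) is still at least target."""
    return math.floor(math.log(target) / math.log(1 - p_false))


if __name__ == "__main__":
    # Hypothetical per-bin false-click probabilities, chosen only to show the scaling.
    for p_false in (1e-4, 1e-5, 1e-6):
        row = [round(prob_no_false_click(p_false, k), 4) for k in (2000, 10000, 18000)]
        print(p_false, row, max_bins_for_target(p_false, 0.9))
```

Because the bound on $k$ scales roughly as the inverse of the per-bin error, an order-of-magnitude improvement in the false-click probability extends the reachable set size by about the same factor.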
To improve the success probability, we selected a BS whose visibility in the Sagnac loop was nearly 99.9993%. We also reduced the magnitude of the dark count probability $p_d$ to $10^{-8}$. At this magnitude, the dark count probability hardly affects the system performance. In addition, we tried to adjust the amplitude of the radio-frequency signal driving the PM to achieve an accurate π-phase shift. However, it is almost impossible for a PM to apply a perfect π-phase shift on a specific pulse without influencing other pulses in a period, especially as the duty cycle in this experimental demonstration is on the order of $10^{-4}$ to $10^{-5}$. Consequently, the number of $D_R$ clicks caused by the pulses $|\alpha\rangle_i$ in a coupon period is much higher than theoretically expected, which means that the experimental interference visibility is lower than that of the BS. Fortunately, during data processing, we found that selecting certain time windows can significantly improve the visibility for interference. However, this approach inevitably filters out certain detection events, thereby affecting the final success probability. By repeatedly selecting different time windows, we achieved a better trade-off between improving visibility and reducing detection events, thereby reducing the number of quantum samples of different coupon lengths.
The experiment was performed with different sizes of the set $[n]$ ranging from 2000 to 18000. For each size $L$, we ran the experiment for 5 s and analyzed the detection results. The relevant experimental parameters are listed in Figure 2(c). The detailed results are listed in Table 1. Our protocol consumes fewer samples than the classical protocol for $L < 16000$ (Figure 2(b)). The gray area in Figure 2(b) indicates the region in which our protocol consumes more samples than the classical one. By choosing different time windows, we set $\nu = 0.99996$ for $n \le 6000$ and $\nu = 0.99999$ for $n \ge 8000$. Note that these visibility levels are approximate values based on experimental data. To achieve better experimental results, we did not choose the time window with the most detection events, but the window with the highest visibility. The result is that only a fraction of the photons were detected, which is equivalent to reducing the detection efficiency of detector $D_R$. Therefore, the intensities of the coherent-state pulses were modulated much higher than theoretically expected in this experiment. Furthermore, because the quantum resources consumed in our protocol are proportional to the intensity $|\alpha|^2$ of each pulse in $|\alpha, S\rangle$, the degree of photon dissipation directly affects the results of the experiment. When the photon dissipation is larger, a higher intensity is required to compensate for the photon dissipation, thus consuming more quantum resources. Therefore, in our experiments, we strive to improve the channel efficiency $\eta_{\mathrm{cha}}$ between Alice and $D_R$ and the detection efficiency $\eta_{\mathrm{det}}$ to reduce the consumed quantum resources. Reducing the voltage fluctuations of the phase modulator and selecting a better time window can further improve the experimental results.
Figure 2: (a) Experimental setup for the coherent-state quantum coupon collector protocol. The optical pulses were generated by a 1550.12 nm continuous-wave (CW) laser source with an intensity modulator (IM) driven by an external arbitrary waveform generator (AWG). The frequency of the pulse sequences and the duration of a single pulse were 312.5 MHz and 900 ps, respectively. These pulses were attenuated to the single-photon level by a variable optical attenuator (VOA) and then separated into two pulse sequences with different propagation directions by a 50 : 50 beam splitter (BS). The clockwise pulse sequence was set to $|\alpha, S\rangle$ after passing a phase modulator (PM) controlled by the AWG. The counterclockwise pulse sequence was used as the local state $|\alpha, [n]\rangle$. Finally, these two pulses interfered at the BS and were detected by a superconducting nanowire single-photon detector $D_R$. A polarisation controller (PC) was used to modify the polarisation of the incident pulses to achieve the maximum detection efficiency. The detection events were recorded using a time-to-digital converter (TDC). (b) Relationship between the required samples and the size of the set $[n]$. We compare the classical lower bound, the samples consumed by our protocol in the practical setup with the experimental parameters listed in Figure 2(c), and its theoretically expected values. The experimental results were in line with the theoretically expected values. For an input size of under 14,000, our results outperformed the classical lower bound, thus demonstrating quantum advantage. (c) Experimental parameters corresponding to our numerical simulation and experimental demonstration. Note that the channel loss $\eta$ is split into channel efficiency $\eta_{\mathrm{cha}}$ and detector efficiency $\eta_{\mathrm{det}}$. Here, $p_d$ is the dark count rate of the single-photon detector and $\nu$ is the limited visibility of the interferometer.

Quantum Blind Box.
Our protocol can not only be used to verify quantum advantage in machine learning from the PAC theory but can also be regarded as a communication task to demonstrate quantum advantage in communication complexity. To this end, we designed a specific application scenario for our protocol, which is called a quantum blind box game. In this game (Figure 3(a)), Alice acts as a merchant, and Bob acts as a customer. Alice prepares $n$ small balls with different patterns and packs them in boxes to form blind boxes. Alice then takes $n - m$ of these boxes as a blind box system, where $m \ge 1$, and encodes the coherent state $|\alpha, S\rangle$ according to the patterns of the balls in that system. Alice then tells Bob all patterns of the balls and the number of blind boxes in the blind box system. To determine the patterns contained in the blind box system, Bob uses the same encoding method as Alice to create a local state $|\alpha, [n]\rangle$. The merchant Alice provides the entire measurement system, which is the same as the experimental device shown in Figure 2(a). Bob can decide the total intensity $\mu$ of the coherent state $|\alpha, S\rangle$ sent by Alice, but the money he needs to pay to Alice is equal to the quantum resources consumed by $|\alpha, S\rangle$. The required quantum resources are $O(\mu \log_2 n)$ bits [27,54]. Therefore, the higher the intensity $\mu$ of $|\alpha, S\rangle$, the more money Bob needs to pay. Finally, Bob judges the patterns in the blind box system based on the results of the single-photon detector $D_R$. If Bob gives the correct answer, he will be rewarded with $(n - m) \log_2 (n - m) \log_2 n$ dollars, which is the minimum expected value of the information that needs to be transmitted to obtain the correct answer using classical resources. Therefore, when Bob consumes fewer quantum resources than the classical limit, he can obtain a positive return in this transaction.
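Bob's balance in this game can be sketched with a few lines of arithmetic. The reward formula is taken from the text; everything else is an assumption made for illustration: the $O(\mu \log_2 n)$ cost is evaluated with a hypothetical constant `c = 1`, the success probability is passed in as a number, and repeated games are treated as independent so that the expected number of games until the first correct answer is the reciprocal of the success probability.

```python
import math


def reward(n: int, m: int) -> float:
    """Reward (n - m) log2(n - m) log2(n) paid for a correct answer."""
    k = n - m
    return k * math.log2(k) * math.log2(n)


def per_game_cost(mu: float, n: int, c: float = 1.0) -> float:
    """Cost of one game, c * mu * log2(n); the constant c is hypothetical."""
    return c * mu * math.log2(n)


def expected_cost(mu: float, n: int, p_success: float, c: float = 1.0) -> float:
    """Expected total cost when Bob replays until his first correct answer."""
    return per_game_cost(mu, n, c) / p_success


def expected_return(mu: float, n: int, m: int, p_success: float, c: float = 1.0) -> float:
    """Reward minus expected cost; positive values correspond to beating the classical limit."""
    return reward(n, m) - expected_cost(mu, n, p_success, c)


if __name__ == "__main__":
    # Illustrative inputs only; the success probabilities are not measured values.
    for mu, p in ((50.0, 0.05), (200.0, 0.6), (800.0, 0.9)):
        print(mu, round(expected_return(mu, 100, 2, p), 1))
```

The division by the success probability captures the trade-off in choosing $\mu$: a weak pulse sequence is cheap per game but rarely succeeds, whereas a strong one succeeds often but costs more each time.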
Let us consider $n = 100$ and $m \in \{2, 3, 4\}$ as an example for demonstrating this game. The experimental results show that for a given value of $m$, the expected quantum resources are significantly affected by the light intensity (Figure 3(b)). The detailed experimental results are presented in Materials and Methods. Note that the comparison used herein is the expected value of the quantum resources spent to obtain the correct answer. When the light intensity is small, Bob will not spend a large amount of resources in each game, but it is also difficult to obtain the correct answer. To obtain the bonus, Bob may need to play multiple games. As a result, the expected value of the quantum resources that he spends has increased. This is the reason why quantum resources at a 0.5 light intensity are relatively high for $m = 3$ and $m = 4$. Figure 3(b) also shows that Bob can break the classical limit by choosing the proper intensity for achieving a positive return. In other words, Bob successfully uses quantum resources to design a better strategy than random extraction in this game. This means that in the quantum blind box game, Alice can also allow customers to design various coding methods and measurement strategies for guessing the blind box. The smaller the expected value of the information required for a strategy designed by a customer, the more he can get in return. Overall, quantum advantage in communication complexity has been successfully demonstrated in experiments through the quantum blind box game. We reasonably expect that the ideas contained in this game can be used to design communication protocols with a lower amount of information needed to complete specified communication tasks.
Table 1: Summary of the experimental data. The input size $L$ ranges from 2,000 to 18,000 with a step size of 2,000. For each size $L$, we collected data for 5 seconds. The table shows the number of coupons sent, the number of detection events within the corresponding time windows, the number of events in which the detector clicked only once in each coupon period, the number of events in which the detector clicked only once and clicks at the correct time bin, the correct probability, the efficiency, the success probability, the minimum number of classical samples required for classical protocols, and the number of quantum samples needed to obtain the correct result utilized in our experiments.
Figure 3(b): The Z-axis represents the expected value of the quantum resources that Bob spends to obtain the correct answer. The plane represents the classical limit. Note that for different $m$, the classical limit is different. Therefore, the plane is not strictly a plane, but it looks like a plane because the gap is too small. The quantum resources at certain light intensities do not exceed the plane, which means that Bob can use these light intensities to break the classical limit.

Discussion
Prior works have mainly demonstrated the quantum advantages of improving the security of communication and enhancing computational power for specific tasks. Although many studies have attempted to find the superiority of quantum machine learning, these studies have not theoretically proven the quantum advantages of machine learning. In this work, we propose a coherent-state quantum coupon collector protocol and demonstrate it experimentally by using simple linear optical elements and coherent states. Experimental results show that our protocol can effectively reduce the number of samples required to learn coupons exactly with up to 14,000 elements on the basis of a 90% correct probability. Combined with the arguments in Ref. [44], our result strongly demonstrates the quantum advantages of machine learning under current technology. To compare with quantum advantages achieved by other studies, we summarize them in Table 2. Note that our scheme does not resort to immature technologies, such as complicated entangled states or ideal single-photon sources. This makes our scheme particularly practical, especially for exemplifying the ability of linear optics. In addition to the demonstration of the quantum advantage of machine learning based on the PAC learning theory for the first time, we also specifically designed a quantum blind box game based on our protocol and experimentally demonstrated quantum advantage in communication complexity through this game. Our protocol does not save resources exponentially like other communication tasks with quantum advantages. Nevertheless, our protocol can still effectively learn the missing elements in the set $[n]$. We hope that the ideas contained in this game can inspire other useful applications, such as quantum voting.
Overall, despite potential limitations, our study provides new opportunities for the development of quantum machine learning and quantum communication complexity. We expect that the ability of linear optics can help us achieve more quantum-advantaged communication and computational schemes.

Materials and Methods
Selection of Time Windows.
As described above, the visibility $\nu$ of the interferometer plays a crucial role in determining the cost in terms of quantum resources. Without phase modulation, $\nu$ in our scheme can easily reach over 99.999%. However, imperfections in PM often cause extra counts in unexpected places, which leads to a decrease in visibility.
Fortunately, we find that the visibility varies when different time windows are chosen, which is why the simulation shown in Figure 2(b) uses two visibilities. The choice of time window also affects the number of detection events. In our experiment, as visibility increased, the number of detected events tended to decrease, which is equivalent to a decrease in the detection efficiency. Therefore, the equivalent detection efficiencies and visibilities for different time windows are different.
To ensure the correct probability $P(m, k) > 90\%$, we traverse different time windows to find the minimum value of the expected value of the required quantum resources. For the expected values shown in Figure 2(b), we adapted the corresponding experimental parameters according to the time window.

Experimental Details of Quantum Blind Box.
In the experiment, we correlate the patterns of the balls to the positions of the pulses. The corresponding positions of the balls that are not in the blind box system are loaded with a π-phase. Therefore, this game can be realized using the experimental setup shown in Figure 2(a).
The system was run at a repetition rate of 10 MHz, and each round was 5 seconds long. The duty cycle of a pulse was approximately 5%. The dark count rate per 5 ns detection gate was approximately $6 \times 10^{-7}$. Considering the channel loss, the detection efficiency was approximately 68%. The detailed experimental results are presented in Table 3. The experimental apparatus was the same as that used before.

Data Availability
All data that support the findings of this study are available from the corresponding authors upon reasonable request.