Robust phase retrieval with complexity-guidance for coherent X-ray imaging

Reconstruction of a stable and reliable solution from noisy incomplete Fourier intensity data recorded in a coherent X-ray imaging (CXI) experiment is a challenging problem. The Relaxed Averaged Alternating Reflections (RAAR) algorithm that is concluded with a number of Error Reduction (ER) iterations is a popular choice. The RAAR-ER algorithm is usually employed for several hundreds of times starting with independent random guesses to obtain trial solutions that are then averaged to obtain the phase retrieval transfer function (PRTF). In this paper, we examine the phase retrieval solution obtained using the RAAR-ER methodology from perspective of the complexity parameter that was introduced by us in recent works. We observe that a single run of the RAAR-ER algorithm produces a solution with higher complexity compared to what is expected based on the complexity parameter as manifested by spurious high frequency grainy artifacts in the solution that do not seem to go away completely even after a number of trial solutions are averaged. We then describe a CG-RAAR (Complexity Guided RAAR) phase retrieval method that can effectively address this inconsistency problem and provides artifact-free solutions. The CG-RAAR methodology is first illustrated with simulated unblocked noisy Fourier intensity data and later applied to centrally-blocked noisy cyanobacterium data which is available from the CXIDB database. Our simulation and experimental results using CG-RAAR suggest two important improvements over the popular RAAR-ER algorithm. The CG-RAAR solutions after the averaging procedure is more reliable in the sense that it contains smallest features consistent with the resolution estimated by the PRTF curve. Secondly, since the single run of the CG-RAAR solution does not have grainy artifacts, the number of trial solutions needed for the averaging process is reduced.

1 Introduction 28 Imaging of micro and nano-sized objects remains one of the prime interests in structural biology, 29 material science, chemistry, medical science etc. With the rapid advancements in coherent diffraction 30 imaging hardware, particularly in X-ray free electron lasers (XFELs) [1], it is possible to perform 31 improve the performance of existing phasing algorithms in coherent X-ray imaging. 83 2 Methods: Fourier phase retrieval 84 In a typical coherent X-ray imaging experiment, where X-ray beam illuminates the object or sample, the Fourier intensity I(u) measured at the detector can be mathematically expressed as: where, r = (x, y) and u = (f x , f y ) are the spatial and Fourier space co-ordinates. ρ(r) is the electron density of the object andρ(u) is the Fourier transform of ρ(r). Through out the paper the hat notation will be the representative of Fourier domain. The aim of iterative phase retrieval algorithms is to find ρ(r) from the measured intensity I(u). Starting with a random guess ρ 0 (r), the Fourier magnitude constraint can be defined [18] as: =ρ(u), u ∈ Z.
Here,P M is the Fourier modulus projection operator in the Fourier domain and Z is the set of pixels in I(u) that have zero photon counts (either due to Poisson noise or missing pixels). The above equation can be translated in the real space as: where, P M is the Fourier projection operator in the object domain. The second projection operator in object domain is the support projection P S which signifies the extent of the object contained within the support S and is defined as: P S ρ(r) = ρ(r), r ∈ S (4) = 0, r / ∈ S Commonly used phase retrieval algorithms aim to reach a solution that satisfies the Fourier magnitude and a priori object domain constraints in an iterative fashion. Inspired by the work of Gerchberg and Saxton [19] from electron microscopy for two measurement, the error reduction (ER) algorithm [8] tries to solve the phase retrieval problem for single Fourier measurement via following solution update: ρ n+1 (r) = P S P M ρ n (r).
The Fourier constraint being non-convex in nature makes the problem harder and the ER algorithm is known to suffer from stagnation in the sense that the quality of the solution does not improve even after a large number of iterations. The hybrid input output (HIO) algorithm [7] addresses the stagnation issue by introducing a negative feedback outside the object support. The HIO feedback step can be stated as: ρ n+1 (r) = P M ρ n (r), r ∈ S = ρ n (r) − βP M ρ n (r), r / ∈ S where, β is the feedback parameter in the range (0, 1). For noiseless data case, in a wide variety of images the HIO algorithm shows a significantly improved reconstruction that is close to an ideal solution [20]. However, for the noisy data case, HIO solution also shows stagnation artifacts [8] that corrupt the quality of the recovered solution. Phase retrieval with noisy data is an ill-posed problem, and therefore a unique solution to the problem cannot be estimated by satisfying both the Fourier magnitude (P M ) and object support (P S ) constraints. The difference map (DM) proposed by Elser [9] is an interesting approach which avoids local minima by using relaxed projections. The (n + 1)-th iteration of the DM is given as: Here are the relaxed generalized projections of P M and P S respectively [21] and1 is the identity operator. β is the relaxation parameter which needs to be optimized depending on the problem at hand while the other two parameters γ M and γ S are generally chosen to be 1/β and −1/β [22]. Finally, in DM the final solution is obtained as P M ρ(r). Another approach that is also known to surpass the stagnation problems is the relaxed averaged alternating reflections (RAAR)[11] algorithm, which is a descendant of the hybrid projection reflection (HPR) algorithm [23]. The RAAR algorithm shows good convergence to a stable solution even when an exact solution satisfying both the Fourier magnitude and the support constraints does not exist. The RAAR update is given as: where, R S and R M are the reflections of P S and P M that are obtained by setting γ S = γ M = 1.

85
Apart from this, some other variants of HIO  Fourier domain [16,27]. We use the combination of RAAR algorithm followed by 100 iterations of 107 ER throughout this paper which will be referred by the name 'RAAR-ER' algorithm.
Here, ∇ x and ∇ y are the partial derivatives with respect to the x and y co-ordinates respectively.

113
Using Parseval's theorem and employing the derivative property of Fourier transform along with the 114 notion of modified wave-numbers, the above equation can be equivalently written as: where ∆x, ∆y are the grid spacing in spatial domain and (f xm , f ym ) denote the spatial frequency 116 samples as per the FFT convention. Note that the usual continuous case of this result is evident by 117 setting ∆x, ∆y → 0. Interestingly, the complexity parameter is a real space constraint but derived 118 directly from the raw Fourier intensity data I(f x , f y ) = |ρ(f x , f y )| 2 . Moreover, the complexity 119 information in the form of ζ 0 as defined in Eq. (10) is readily available a priori to all the phase 120 retrieval algorithms. We observe empirically that the complexity parameter is not too sensitive to 121 the typical Poisson noise in the Fourier magnitude data. As discussed in Section 3, we will explain 122 how the complexity information may be utilized when applying object domain constraints to obtain 123 a noise robust phase retrieval solution. Fourier magnitude information at all pixels. As we will describe later in Section 6, we handle the 133 missing pixels in the real data with the standard methodology prevalent in the literature.

134
In order to illustrate the effect of noise on the phase retrieval solution, we simulate two Fourier 135 intensity data corresponding to a red blood cell (RBC) object in a computational window of size 136 512 × 512 pixels. The amplitude of this test object (included within the central 160 × 160 pixels) 137 is shown in Fig. 1(a). The complex-valued RBC object for simulating noisy diffraction patterns 138 was generated using a digital holographic microscope system (make: Holmarc) and corresponds to

143
These noise levels are comparable to those in the experimental CXI data to be discussed in Section 144 6. The Fourier intensities are over-sampled by a factor of approximately 3 in both directions. 145 We ran 1100 iterations of RAAR-ER algorithm comprising of 1000 RAAR iterations, as per Eq.

146
(8), that were concluded by 100 ER iterations starting with the data in Fig. 1  Eq. (9) and is denoted by ζ. In a similar vein, complexity inside ζ in and outside ζ out the support 156 can be evaluated by summing over corresponding pixels. Clearly we have ζ = ζ in + ζ out . Figure 2  light levels of 1000 photons/pixel and 500 photons/pixel respectively. The complexities ζ and ζ out for both the cases stabilize to values higher and lower than the ground-truth complexity ζ 0 . In 161 terms of complexity behaviour of solution, we observe that the solution's complexity within the 162 support ζ in stabilizes at a higher level for data with higher noise. The outside support complexity 163 ζ out is also seen to stabilize to a slightly higher value for the higher noise data. The manifestation based on the PRTF curve as we will see later in Section 5.

179
The prior literature [12,24,25] has stated the significance of outside support region in relation opinion in the context of the RAAR algorithm. Therefore, most complexity reduction in the present 199 case needs to be done within the support region. It is interesting to note that an improvement over 200 the native RAAR algorithm has been suggested previously by introducing a generalized feedback 201 term inside the support region [28]. Based on our observations we control the total complexity of 202 the RAAR solution primarily by suppressing the complexity within the support which is essentially 203 like providing a complexity feedback within the support. 204 We remark that complexity-guidance is essentially a regularization scheme where complexity (or degree of fluctuations) in the desired solution is pre-determined from the raw data. In order to implement complexity-guidance the recursive iterations for complexity reduction are applied to the RAAR solution ρ n+1 (r) in Eq. (8) in the outside and inside the support region in each outer RAAR iteration. For brevity of notation we denote the outside and inside part of the solution as ρ out and ρ in respectively. Firstly, recursive sub-iterations are applied outside the support as represented by: Next, further recursive complexity reduction iterations similar to above are also applied to the ρ in : For both the above equations τ is the step size which was chosen to be 5 × 10 −3 so that the solution changes by a small amount in each sub-iteration andû is the normalized functional gradient of complexity defined as:û where, ∇ ρ * ζ(ρ, ρ * ) = −∇ 2 ρ is the complex derivative [29] of complexity parameter ζ. As explained in [15], the number of sub-iterations for ζ out reduction are adjusted by actively tracking the trend of ζ out . Particularly, the value ζ out is nudged towards the mean minus the standard deviation of ζ out of the previous 20 iterates in order to control any growth in ζ out . After that, the complexity reduction sub-iterations within the support are applied till the numerical value of the total complexity ζ in + ζ out is close to (within 0.5%) the ground-truth complexity ζ 0 . The number of sub-iterations in this step typically depends on the noise level (higher noise data requires more complexity reduction sub-iterations inside the support). The (n+1)-th iterate of CG-RAAR is updated as: Here ρ in and ρ out refer to the inside and outside support solutions after applying the complexity- In particular the high frequency artifacts within the support are now suppressed by the complexity-213 reduction steps. Therefore, the CG-RAAR solution is smoother in nature and is much closer to the 214 ground-truth shown in Fig. 1(a). We want to emphasize here that all the results shown in Fig. 1 215 correspond to a single run of the corresponding algorithms (RAAR or RAAR-ER or CG-RAAR).

216
Use of complexity constraint with noisy data is observed to provide a better estimate for single run 217 of the phase retrieval algorithm. In turn this is beneficial for the averaging process as well as we will 218 see in the next section. The trial solutions selected for the averaging process can be improved further  The reconstruction of solution from the experimental Fourier transform intensity data consisting of noise and missing pixels due to detector blocking poses challenges. First of all, generally just one realization of the data is often available as the sample may get destroyed after exposure to high energy X-ray pulse. Secondly, a single run of the algorithm starting with a random guess cannot be fully relied upon when the data is corrupted by noise and missing pixels. It is therefore a standard practice to arrive at a better estimate of the solution by averaging a large number of trial solutions [13] starting with the same data. These independent trial solutions are obtained by running the phase retrieval algorithm over hundreds of runs that are initialized with different random guesses. Among a number of trial reconstructions only some following a certain criterion are selected for the averaging process. The averaging criterion can be decided on the basis of least errors in real space or Fourier space. Here we have selected correlation coefficient between trial solution and a reference solution with least error as an averaging criterion. Since the final reconstruction is based on the averaging process, the goodness of reconstruction can be assessed using phase retrieval transfer function (PRTF) which compares the magnitude of Fourier transform of averaged complex reconstruction with the measured Fourier modulus and defined as [27,32]: Here, the average of complex Fourier estimatesρ avg (u) can be simplified as : where t = 1, 2, ....., T denotes the trial number index and ρ t (r) represents the phase adjusted trial and CG-RAAR algorithms respectively for noisy data in Fig. 1(b). The correlation coefficient with value 0.96 is chosen as the criterion for selecting a trial solution. For the simulated data, we have not assumed any missing pixels due to detector blocking and assumed that the data is corrupted by Poisson noise alone. In this case we observe that an average over a small number of trial solutions is sufficient to stabilize the PRTF curve. One can clearly see that the averaging process has reduced the grainy appearance of the RAAR-ER solution. However, some of the faint artifacts are still persistent which may possibly get misinterpreted as the real features, particularly in the case of experimental Fourier intensity data where the ground-truth object is unknown. Since the individual CG-RAAR solution was already free from grainy artifacts, the averaged CG-RAAR solution closely resembles the ground-truth object. The performance of the CG-RAAR and RAAR-ER solutions are tabulated in Table 1 in terms of error in Fourier and real domain. The real space error metric E R is defined as [33]: where, Here, ρ est represents the solution estimate (individual trial or average solution), ρ is the ground-truth object, and corr(ρ est , ρ) is the correlation between ρ est and ρ. Next the relative Fourier domain error E F for the Fourier modulus estimateρ est (u) with respect to the measured Fourier modulus I(u) can be written as: Both RAAR-ER and CG-RAAR solutions seem to have errors of similar orders in terms of the above 232 metrics despite the fact that CG-RAAR solution has much reduced grainy artifacts.  Table 2.

273
We observe that the PRTF curve is stabilized for both RAAR-ER and CG-RAAR cases and 274 additional trial runs will not have any perceptible effect on the PRTF curve. In order to assess the 275 quality of averaged solutions, we plot the PRTF as a function of spatial frequency which is shown 276 in Fig. 4(k). The PRTF curve is vanishing near zero frequency since there are blocked pixels in to be smoother and consistent with the spatial resolution reflected in PRTF curve. 284 We remark that the spatial resolution given here is in terms of pixels however one can always 285 express it in physical units by the relation ∆x = λz ∆x det N , where λ: wavelength of the X-rays, z: 286 distance between specimen and detector, ∆x det : pixel size of the detector, and N: half pixel number.

287
For the Fourier intensity data shown in Fig. 4(b), one can make similar observations from Figs. 4(g)-

288
(j) and (n)-(p In conclusion, we have examined the nature of phase retrieval solutions for noisy Fourier intensity were judged using the PRTF plot.

310
The traditional RAAR-ER solution seems to retain some grainy artifacts (even after averaging 311 over a number of trial solutions), whose feature size is smaller than the resolution estimated by 312 Table 2: Performance of single run and averaged RAAR-ER and CG-RAAR solutions for the CXI Fourier intensity data in Fig. 4(a) referred here as data A and Fig. 4 data from Coherent X-ray Imaging Data Bank (CXIDB). It is worth highlighting that the single 317 run of CG-RAAR produces solutions with much reduced artifacts and as a result the number of 318 trial solutions required for the averaging process with this methodology is less than the half number 319 that are needed for the traditional RAAR-ER method. We believe that the complexity-guidance 320 methodology can benefit the field of X-ray coherent diffraction imaging by providing reliable and 321 self-consistent phase retrieval solution estimates.