Symmetry, Integrability and Geometry: Methods and Applications

Lattice Field Theory with the Sign Problem and the Maximum Entropy Method

Although numerical simulation in lattice field theory is one of the most effective tools to study non-perturbative properties of field theories, it faces serious obstacles coming from the sign problem in some theories such as finite density QCD and lattice field theory with the $\theta$ term. We reconsider this problem from the point of view of the maximum entropy method.


Introduction
Lattice field theory is a powerful method to study non-perturbative aspects of quantum field theory. Although numerical simulation is one of the most effective tools to study non-perturbative properties of field theories, it faces serious obstacles in theories such as finite density QCD and lattice field theory with the θ term, because the Boltzmann weight there is complex, which makes it difficult to perform Monte Carlo (MC) simulations on a Euclidean lattice. This is the complex action problem, or the sign problem. In the present talk, we review the analysis of the sign problem based on the maximum entropy method (MEM) [1,2,3]; for details, refer to [4,5,6]. The MEM is well known as a powerful tool for so-called ill-posed problems, in which the number of parameters to be determined is much larger than the number of data points. It has been applied to a wide range of fields, such as radio astronomy and condensed matter physics.
In this talk we deal only with lattice field theory with the θ term. It is believed that the θ term could affect the low-energy dynamics and the vacuum structure of QCD, but it is known from experimental evidence that the value of θ is strongly suppressed in Nature. From the theoretical point of view, the reason for this is not yet clear. Hence, it is important to study the properties of QCD with the θ term in order to clarify the structure of the QCD vacuum [7,8,9]. For theories with the θ term, it has been pointed out that rich phase structures could be realized in θ space. For example, the phase structure of the Z(N) gauge model was investigated using free energy arguments, and it was found that oblique confinement phases could occur [8,9]. In CP^{N−1} models, which share several dynamical properties with QCD, it has been shown that a first-order phase transition exists at θ = π [10,11,12].
In order to circumvent the sign problem, the following method is conventionally employed [11,12]. The partition function Z(θ) is obtained by Fourier-transforming the topological charge distribution P(Q), which is calculated with a real positive Boltzmann weight:
$$ Z(\theta) = \sum_Q P(Q)\, e^{i\theta Q}, \qquad P(Q) = \frac{\int [d\bar z\, dz]_Q\; e^{-S}}{\int [d\bar z\, dz]\; e^{-S}}. \qquad (1) $$
The measure $[d\bar z\, dz]_Q$ in equation (1) restricts the integral to configurations of the field z with topological charge Q, and S denotes the action.
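As a simple numerical illustration of this Fourier-transform method, the following sketch transforms a toy topological charge distribution into Z(θ). The distribution, grid, and function name are ours, chosen only for illustration; the symmetry P(−Q) = P(Q) is assumed, so the sum reduces to a real cosine series.

```python
import numpy as np

def ftm_partition_function(p, thetas):
    """Fourier transform of the topological charge distribution,
    Z(theta) = sum_Q P(Q) exp(i theta Q).  Using the symmetry
    P(-Q) = P(Q), the sum reduces to a real cosine series."""
    q = np.arange(1, len(p))
    return p[0] + 2.0 * np.cos(np.outer(thetas, q)) @ p[1:]

# Toy distribution (illustrative only), normalised so that Z(0) = 1,
# i.e. P(0) + 2 * sum_{Q>0} P(Q) = 1.
p = np.exp(-0.5 * np.arange(8) ** 2)
p /= p[0] + 2.0 * p[1:].sum()

thetas = np.linspace(0.0, np.pi, 64)
z = ftm_partition_function(p, thetas)
```

For a Gaussian-like P(Q), Z(θ) decreases monotonically from Z(0) = 1 toward its minimum at θ = π, where the alternating signs e^{iπQ} = (−1)^Q cause the strong cancellations behind the sign problem.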
In the study of CP^{N−1} models, it is known that this algorithm works well for small lattice volumes V and in the strong coupling region [10,11,13,14]. As the volume increases, or in the weak coupling region, however, this strategy too suffers from the sign problem for θ ≃ π. The error in P(Q) masks the true values of Z(θ) in the vicinity of θ = π, which results in a fictitious signal of a phase transition [13,14]. This is called 'flattening', because the free energy becomes almost flat for θ larger than a certain value. The problem could in principle be remedied by reducing the error in P(Q); this, however, is hopeless in practice, because the amount of data needed to reduce the error to a given level increases exponentially with V.
Here, we are interested in whether the MEM can be applied effectively to the study of the θ term, and we reconsider the flattening phenomenon of the free energy in terms of the MEM. The MEM is based upon Bayes' theorem: it derives the most probable parameters by utilizing the data sets and our knowledge about these parameters in terms of probabilities. The relevant probability distribution, called the posterior probability, is given by the product of the likelihood function and the prior probability. The latter is represented by the Shannon-Jaynes entropy, which plays an important role in guaranteeing the uniqueness of the solution, and the former is given by χ². It should be noted that no artificial assumptions are needed in the calculations, because the determination of a unique solution is carried out according to probability theory. Our task is to determine the image for which the posterior probability is maximized.
We present the results of the analysis i) using mock data and ii) using MC data. For the former, we use the Gaussian form of P(Q), which is realized in many cases, such as the strong coupling region of the CP^{N−1} model and the 2-d U(1) gauge model. For the latter, we simulate the CP^{N−1} model and apply the MEM to the obtained data. This paper is organized as follows. In the following section, we give an overview of the origin of flattening and summarize the procedure of the MEM analysis. The results obtained by use of the MEM are presented in Section 3.

Flattening
The free energy density f(θ) is calculated by Fourier-transforming P(Q) obtained by MC simulation; we call this method the FTM. The quantity f(θ) is defined as
$$ f(\theta) = -\frac{1}{V}\ln Z(\theta), $$
where V = L², the square of the lattice size. The MC data for P(Q) consist of the true value, P̂(Q), and its error, ∆P(Q). When the error at Q = 0 dominates because of the exponential damping of P(Q), f(θ) is closely approximated by
$$ f(\theta) \simeq -\frac{1}{V}\ln\left[ e^{-V \hat f(\theta)} + \Delta P(0) \right], $$
where f̂(θ) is the true value of f(θ). Because f̂(θ) is an increasing function of θ, ∆P(0) dominates for large values of θ. If |∆P(0)| ≃ e^{−V f̂(θ)} at θ = θ_f, then f(θ) becomes almost flat for θ ≳ θ_f. This is called "flattening of the free energy density", and it has been misleadingly identified as a first-order phase transition, because the first derivative of f(θ) appears to jump at θ = θ_f. To avoid this problem, the FTM would require on the order of e^V measurements.
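The mechanism described above can be reproduced numerically. In the sketch below, the "true" free energy density f̂(θ) is a made-up increasing function, and the values of V and ∆P(0) are illustrative choices, not numbers from any simulation.

```python
import numpy as np

# Sketch of how an error Delta P(0) produces flattening.  The "true"
# free energy density f_hat is a made-up increasing function; V and
# delta_p0 are illustrative values.
V = 50.0
delta_p0 = 1e-7
thetas = np.linspace(0.0, np.pi, 200)
f_hat = 0.05 * thetas ** 2                   # assumed true f(theta)

# f(theta) ~ -(1/V) ln[ exp(-V f_hat) + Delta P(0) ]
f_meas = -np.log(np.exp(-V * f_hat) + delta_p0) / V

# The measured f flattens once exp(-V f_hat) drops below Delta P(0),
# i.e. beyond theta_f where f_hat(theta_f) = -ln(Delta P(0)) / V.
theta_f = np.sqrt(-np.log(delta_p0) / (V * 0.05))
```

For θ well below θ_f the measured f(θ) tracks f̂(θ); beyond θ_f it saturates near the constant −ln ∆P(0)/V, giving the fictitious kink at θ = θ_f.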

MEM
In this subsection, we briefly explain the MEM in terms of the θ term. In a parameter inference such as χ² fitting, the inverse Fourier transform
$$ P(Q) = \frac{1}{2\pi} \int_{-\pi}^{\pi} d\theta\, Z(\theta)\, e^{-i\theta Q} $$
is used. In the numerical calculations, we use its discretized version, $P(Q) = \sum_n K_{Q,n} Z_n$, where $K_{Q,n}$ is the Fourier integral kernel and $Z_n \equiv Z(\theta_n)$. In order for the continuous function Z(θ) to be reconstructed, a sufficiently large number of values of θ, which we denote by N_θ, is required, so that the relation N_θ > N_Q holds, where N_Q represents the number of data points in P(Q) (Q = 0, 1, ..., N_Q − 1). A straightforward application of χ² fitting to the case N_θ > N_Q leads to degenerate solutions; this is an ill-posed problem.
In the MEM, the likelihood function is given by
$$ \operatorname{prob}(P(Q)\,|\,Z(\theta), I) = \frac{1}{X_L}\, e^{-\frac{1}{2}\chi^2}, $$
where X_L is a normalization constant and χ² is a standard χ² function. The prior probability prob(Z(θ)|I) is given in terms of an entropy S as
$$ \operatorname{prob}(Z(\theta)\,|\,I, \alpha, m) = \frac{1}{X_S(\alpha)}\, e^{\alpha S}, $$
where α and X_S(α) are a positive parameter and an α-dependent normalization constant, respectively. As S, the Shannon-Jaynes entropy is conventionally employed:
$$ S = \sum_n \left[ Z_n - m_n - Z_n \ln\frac{Z_n}{m_n} \right]. $$
Here $m_n \equiv m(\theta_n)$ represents a default model. The posterior probability prob(Z_n|P(Q), I), thus, is given by
$$ \operatorname{prob}(Z_n\,|\,P(Q), I, \alpha, m) \propto e^{\alpha S - \frac{1}{2}\chi^2} \equiv e^{W[Z]}. $$
For the prior information I, we impose the criterion Z_n > 0, so that prob(Z_n ≤ 0 | I, α, m) = 0.
The most probable image of Z_n, denoted as Ẑ_n, is calculated according to the following procedure [3,4].

1. Maximizing W[Z] for a given α to obtain the most probable image $Z^{(\alpha)}_n$:
$$ \left.\frac{\partial W[Z]}{\partial Z_n}\right|_{Z = Z^{(\alpha)}} = 0. $$
2. Averaging $Z^{(\alpha)}_n$ over α to obtain the α-independent most probable image Ẑ_n:
$$ \hat{Z}_n = \int d\alpha\, Z^{(\alpha)}_n\, \operatorname{prob}(\alpha\,|\,P(Q), I, m). $$
3. Error estimation: the error of the most probable output image Ẑ_n is calculated as the uncertainty of the image, which takes into account the correlations of the image among various values of θ_n:
$$ \langle (\delta \hat{Z}_n)^2 \rangle = \int d\alpha\, \langle (\delta Z^{(\alpha)}_n)^2 \rangle\, \operatorname{prob}(\alpha\,|\,P(Q), I, m). \qquad (3) $$
Here $\delta \hat{Z}_n$ and $\delta Z^{(\alpha)}_n$ represent the error in Ẑ_n and that in $Z^{(\alpha)}_n$, respectively.
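Step 1 above can be sketched numerically. The following toy code maximizes W[Z] = αS − χ²/2 for a single fixed α by damped gradient ascent; the kernel discretization, grid sizes, mock data, α, and error σ are all illustrative choices of ours, and steps 2 and 3 (the α averaging and the error estimation) are omitted.

```python
import numpy as np

# Toy MEM step 1: maximize W[Z] = alpha*S - chi^2/2 at fixed alpha.
n_theta, n_q = 40, 8
thetas = np.linspace(0.0, np.pi, n_theta)
dtheta = thetas[1] - thetas[0]
# K[Q, n] discretizes P(Q) = (1/pi) * int_0^pi Z(theta) cos(Q theta) dtheta
K = np.cos(np.outer(np.arange(n_q), thetas)) * dtheta / np.pi

z_true = np.exp(-0.5 * thetas ** 2)      # mock "true" image (noise omitted)
p_data = K @ z_true
sigma = 0.05                             # assumed uniform error on P(Q)

alpha = 1.0
m = np.ones(n_theta)                     # flat default model

def grad_w(z):
    # dW/dZ_n = alpha * dS/dZ_n - (1/2) dchi^2/dZ_n
    #         = -alpha * ln(Z_n / m_n) + sum_Q K[Q, n] (P_Q - (K Z)_Q) / sigma^2
    return -alpha * np.log(z / m) + K.T @ (p_data - K @ z) / sigma ** 2

z = m.copy()
for _ in range(8000):                    # damped gradient ascent
    z = np.maximum(z + 1e-3 * grad_w(z), 1e-10)

chi2 = np.sum((p_data - K @ z) ** 2) / sigma ** 2
```

The entropy term keeps the image positive and pulls the unconstrained directions toward the default model m, while the χ² term drives agreement with the data; the entropy gradient −ln(Z/m) diverges as Z → 0, which enforces the criterion Z_n > 0.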

Mock data: Gaussian
Firstly, we present the results of the analysis using mock data. For this, we use the Gaussian P(Q),
$$ P(Q) = A\, e^{-\frac{c}{V} Q^2}, $$
where, in the case of the 2-d U(1) gauge model, c is a constant depending on the inverse coupling constant β, and V is the lattice volume. The constant A is fixed so that $\sum_Q P(Q) = 1$.
The distribution P(Q) is analytically transformed by use of the Poisson sum formula into the partition function
$$ Z(\theta) = A \sqrt{\frac{\pi V}{c}}\, \sum_{n=-\infty}^{\infty} \exp\left[ -\frac{V (\theta - 2\pi n)^2}{4c} \right]. $$
To prepare the mock data, we add noise with variance δ × P(Q) to the Gaussian P(Q). In the analysis, we consider sets of data with various values of δ and study the effects of δ. In Fig. 1, the Gaussian topological charge distribution and the corresponding f(θ) obtained by using the FTM are shown for various lattice volumes. For small volumes, the behavior of f(θ) is smooth. For large volume (V = 50), however, clear flattening is observed. For V = 30, some data are missing; this is because Z(θ) can take negative values when the errors are large. This is also regarded as flattening, because its origin is the same as that stated above.
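The Poisson resummation can be checked numerically. Assuming the Gaussian form P(Q) = A e^{−cQ²/V}, the direct Fourier sum over Q should coincide with the resummed expression A √(πV/c) Σ_n exp[−V(θ − 2πn)²/(4c)]; the values of A, c and V below are illustrative.

```python
import numpy as np

# Numerical check of the Poisson resummation for a Gaussian P(Q).
c, V, A = 1.0, 10.0, 1.0
thetas = np.linspace(0.0, np.pi, 50)

# Direct Fourier sum over topological charges (tails beyond |Q| = 50
# are negligible for these parameters)
qs = np.arange(-50, 51)
z_direct = (np.exp(-c * qs ** 2 / V)[None, :]
            * np.exp(1j * np.outer(thetas, qs))).sum(axis=1).real * A

# Resummed form: A * sqrt(pi V / c) * sum_n exp(-V (theta - 2 pi n)^2 / (4 c))
ns = np.arange(-5, 6)
z_poisson = A * np.sqrt(np.pi * V / c) * np.exp(
    -V * (thetas[:, None] - 2.0 * np.pi * ns[None, :]) ** 2 / (4.0 * c)
).sum(axis=1)
```

The resummed form makes the flattening mechanism transparent: for large V, Z(θ) near θ = π is a sum of exponentially small Gaussian tails, easily overwhelmed by the error ∆P(0).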
For the P(Q) data corresponding to small volumes without flattening, the MEM successfully reproduces f(θ).

MC data
In this subsection, we apply the MEM to real Monte Carlo data obtained by simulating the CP³ model. (For details, see [6].) For this we used a fixed point action [15,16] and various lattice volumes L × L. Among these, we concentrate on the data for L = 38 as the non-flattening case and L = 50 as the flattening case. We systematically studied the flattening phenomenon by adopting a variety of default models m(θ) and prior probabilities of the parameter α. For the latter, we investigated the dependence on g(α) appearing in
$$ \operatorname{prob}(\alpha\,|\,P(Q), I, m) \equiv P(\alpha) \propto g(\alpha)\, e^{W(\alpha) + \Lambda(\alpha)}. $$
The function g(α) represents the prior probability of α and is chosen according to prior information. In general, two types of g(α) are employed: one according to Laplace's rule, $g_{\rm Lap}(\alpha) = {\rm const}$, and one according to Jeffreys' rule, $g_{\rm Jef}(\alpha) = 1/\alpha$. The latter rule is determined by requiring that P(α) be invariant with respect to a change of scale, because α is a scale factor; the former rule means that we have no prior knowledge about α. In general, the most probable image Ẑ(θ) depends on g(α). We investigate the sensitivity of Ẑ(θ) to the choice of g(α) by studying a relative difference ∆(θ) between $\hat{Z}_{\rm Lap}(\theta)$ and $\hat{Z}_{\rm Jef}(\theta)$, which represent the most probable images obtained with Laplace's rule and Jeffreys' rule, respectively. In the case without flattening (L = 38), the MEM yielded images Ẑ(θ) that are almost independent of m(θ) and g(α). The most probable images Ẑ(θ) agree with the result of the FTM within the errors.
In the case with flattening (L = 50), we found that the statistical fluctuations of Ẑ(θ) become smaller as the number of measurements increases, except near θ = π. We also found that Ẑ(θ) with large errors depends strongly on g(α) in the region of large θ, where the g(α) dependence of Ẑ(θ) was estimated using the quantity ∆(θ). For θ ≲ 2.3, Ẑ(θ) agrees with the result of the FTM. For θ ≳ 2.3, Ẑ(θ) behaves smoothly, while the FTM develops flattening. In Fig. 3, we compare Ẑ(θ) at larger values of θ for $g_{\rm Lap}(\alpha)$ and $g_{\rm Jef}(\alpha)$ when two different Gaussian default models are used (γ = 5.0 and 13.0). We note that γ = 5.0 is the case in which the smallest values of ∆(θ) in this θ region are observed among the various default models.
Our results are summarized in Fig. 4. All the results obtained using the MEM behave smoothly over the entire range of θ. Errors are estimated from the uncertainties of the images according to equation (3). For larger values of θ, Ẑ(θ) depends strongly on m(θ). Each of these images could be a candidate for the true image, and this m(θ) dependence of Ẑ(θ) may reflect the flattening phenomenon. If we had proper knowledge about m(θ) as prior information, we could identify the true image in a probabilistic sense. Such knowledge may also allow us to clarify the relationship between the default model dependence and the systematic error, which is not included in the figure. This will be a task to be pursued at the next stage.
The MEM provides a probabilistic point of view in the study of theories with the sign problem. It may then be worthwhile to study lattice QCD at finite density in terms of the MEM.