COMPUTING THE TRACY-WIDOM DISTRIBUTION FOR ARBITRARY β > 0

. We compute the Tracy-Widom distribution describing the asymptotic distribution of the largest eigenvalue of a large random matrix by solving a boundary-value problem posed by Bloemendal [Blo11]. The distribution is computed in two ways. The first method is a second-order finite-difference method and the second is a highly accurate Fourier spectral method. Since β is simply a parameter in the boundary-value problem, any β > 0 can be used, in principle. The limiting distribution of the n th largest eigenvalue can also be computed. Our methods are available in the Julia package TracyWidomBeta.jl .


Introduction
Tracy and Widom [23,24,25] introduced the Tracy-Widom distribution that gives the limiting distribution of the rescaled largest eigenvalue of a random matrix taken from an appropriate symmetry class.More precisely, the largest eigenvalue λ max satisfies the following fundamental limit where F β is the Tracy-Widom distribution and β = 1, 2, 4 if A n ∼ Gaussian orthogonal ensemble, Gaussian unitary ensemble, Gaussian symplectic ensemble, respectively.For an n × n Gaussian ensemble A n with ordered eigenvalues {λ j } n j=1 , the joint probability density function (jpdf) for its eigenvalues is given by where Z n,β is the partition function.For β = 1, 2, 4, (1.1) is solvable: all correlation functions in finite dimensions can be explicitly calculated using Hermite polynomials.This provides definitive local limit theorems and establishes clear limits for the random points [15].Importantly, (1.1) also describes a one-dimensional Coulomb gas at inverse temperature β for any β > 0.
Dumitriu and Edelman [7] established that a family of symmetric tridiagonal (Jacobi) matrix models have (1.1) as their eigenvalue density for any β > 0.More specifically, the matrix given by has (1.1) as the jpdf of its eigenvalues.Here the entries are independent random variables, up to symmetry, g ∼ N (0, 2) and χ k ∼ Chi(k).We call H β n the β-Hermite ensemble.Sutton and Edelman [8,21] then presented an argument 1 describing how the spectrum of the rescaled operator , where I is the n × n identity matrix, should be described by the spectrum of the stochastic Airy operator as n → ∞, where b ′ x is standard Gaussian white noise.As a result, the following eigenvalue problem was considered in [19] with a Dirichlet boundary condition f (0) = 0. Ramírez, Rider, and Virág [19] proved the following theorem.
Ramírez, Rider, and Virág showed that the distribution of −Λ 0 is a consistent definition of Tracy-Widom(•) for general β > 0. Based on [19,26], Bloemendal and Viraǵ [1,2] considered a generalized eigenvalue problem with boundary condition f ′ (0) = ωf (0), where ω ∈ R represents a scaling parameter.Let H β,ω denote H β together with this boundary condition.According to [19], the distribution F β,ω of −Λ 0 in the case ω = +∞ thus coincides with Tracy-Widom for general β > 0. Bloemendal and Virág [2] further showed that F = F β,ω can be characterized using the solution of a boundary value problem, as we now discuss in the following section.We also point out that Bloemendal [1] provided Mathematica code to approximate F .The goal of the current work is to expand and improve upon this scheme.
This paper is laid out as follows.In Section 2, we outline our two algorithms to compute the Tracy-Widom distribution.In Section 3, we validate and compare our methods.In Section 4, we present a number of additional numerical results.We also include two appendices to discuss some nuances in the numerical algorithms (Appendix A) and to discuss the large β limit (Appendix B).Code to produce all figures in this paper can be found here [27].

Finite-difference discretization
One way to solve this boundary-value problem is by discretizing it using finite differences, see, for example, [12].For 0 ≤ n ≤ N , 0 ≤ m ≤ M , define Given that β > 0, one integrates equation (2.3) backward in "time" with respect to the timelike variable x to guarantee its well-posedness: ∆x < 0. Denote the approximation of H(x, θ m ) by H m (x).We then replace partial derivatives of H with respect to θ by centered differences, with grid spacing h.The method of lines formulation reads where H H H M excludes H 0 since H 0 (x) = 0.The coefficients in the last row of Ǔ come from the parameters of the two-step backward difference formula (BDF2).Note that there is no need to replace the last row of Ť using a backward difference formula since θ M = π and, mathematically, there is no need to set an additional boundary condition since (2.5) has vanishing diffusivity for θ = π.Let and where (2.9) Noting that x is the time-like variable, we apply the trapezoidal rule with time step ∆x < 0 to (2.9) yielding where H H H n M ≈ H H H M (x n ).Upon rearranging, this gives (2.10) Finally, we obtain F β (x) ≈ Fβ (x) := H M (x) ≈ H n M , n = 0, 1, . . ., N .Below, we also explore other time integration methods, in addition to the trapezoidal method.We use the trapezoidal method as our default due to its A-stability, but it can be outperformed by BDF methods in this context.

Spectral discretization
To obtain better accuracy, we apply a Fourier spectral method.As before, for 0 ≤ n ≤ N , define where and Here S ±k represent the (Fourier modes) shift matrices, S ±k = S k ±1 , where Also D 1 and D 2 represent the first and second-order differentiation matrices in the Fourier space, i.e., D 2 = D 2 1 , where .
The Fourier coefficients of the initial condition are obtained via Instead of applying the second-order accurate trapezoidal rule to integrate the system of ODEs for ρ(x, θ), we suggest the use of a five-step backward differentiation formula method (BDF5), see [12,Section 8.4].To use BDF5, we need four more starting conditions, i.e., a a a M (x 0 − i∆x), i = 1, 2, 3, 4, which can be obtained by (2.4) and (2.16).We then proceed with BDF5 to solve for a a a n M ≈ a a a (2.17) Other time integration methods can be used, and this will be discussed further below.
Finally, the value of H(x, θ) can be recovered from The Tracy-Widom distribution can then be approximated by setting θ = π, from which we find

Algorithm validation and comparison
At this point, we have approximated the values of F β (x) on equally-spaced grid points x 0 , x 1 , . . ., x N .To obtain a continuous function, we perform Fourier interpolation.Since Fourier interpolation has high accuracy when the function being interpolated is periodic, we define where erf denotes the error function [18].Then consider which is nearly a periodic function.Note that other functions can also be used instead of the error function erf(x) in constructing a periodic function.We perform Fourier interpolation to G β (x) on the grid x 1 , . . ., x N .The resulting interpolant is then evaluated on a shifted and scaled Chebyshev grid on [x N , x 0 ].By adding back in φ(x) evaluated on the same Chebyshev grid and then interpolating with Chebyshev polynomials, we obtain a useful high-accuracy approximation of F β (x).The number of Chebyshev coefficients is user-decided (we use 10 3 by default).We point out that using either (2.5) or (2.13), we can also obtain the approximation of F ′ β (x) on the grid x 0 , x 1 , . . ., x N at nearly no extra cost.We then perform the same procedure of interpolation to get an approximation of F ′ β (x), without using φ(x) since the function being interpolated is already nearly periodic.
Algorithm 1 shows the pseudocode for the Fourier interpolation, where T is the grid points x N , x N −1 , . . ., x 0 with grid spacing ∆x, K is an integer that gives the number of Chebyshev coefficients, F is the value of Fβ (x j ), j = 1, . . ., N , and f is the value of F ′ β (x j ).We think of F, f as functions on T and need to extend them to all of [x N , x 0 ].
Perform Fourier interpolation to G(T) and f(T) to obtain the interpolant G(x) and f(x), x ∈ D 4: Set F=G(x)+φ(x) and f(x)=f(x), where x is a Chebyshev grid on D with K points 5: Perform Chebyshev interpolation to F and f and return the interpolants For the finite-difference discretization using trapezoidal method to compute either the cumulative distribution function (cdf) or the probability density function (pdf) of F β using TW(β), as provided in TracyWidomBeta.jl, the following default values for the parameters are used: where ⌊•⌋ denotes the floor function.
Notation.Throughout we use TW(β; params) to refer to our implementation with different choices of parameters.For example, TW(β; method="finite", step="trapz", pdf=true), refers to using the finite-difference discretization, time-stepped with the trapezoidal method, outputting an approximation of F ′ β with the default parameters (3.1).The default values of x 0 and x N are chosen to be ⌊13/ √ β⌋ and −10 respectively so that F β (x 0 ) ≈ 1 and F β (x N ) ≈ 0 for β ≥ 1.Though not optimal, larger values of x 0 and smaller values of x N can also be used.See Section 3.3.3for a discussion on the selection of the default value for x 0 .For 0 < β < 1, a larger domain for x should be used.The values of ∆x and M are chosen so that M = ⌊−1/∆x⌋.In this way, the local truncation error of the trapezoidal method is of the optimal order.Algorithm 2 shows the pseudocode for TW(β; pdf=true).Step 6 may be replaced by solving an alternate discretization of (2.9).
Similarly, for the spectral discretization using BDF5 method to compute either the cdf or the pdf using TW(β; method="spectral", step="bdf5") or TW(β; method="spectral", step="bdf5", pdf=true), Algorithm 2 TW(β; pdf=true) Use (2.5) and denote the result by h 8: end for 9: Take the values of H and h for θ=π and denote the results by F and f 10: Perform Algorithm 1 with F, f, and T=x 11: Return the interpolant for f the following values for the parameters are used: One thing to note here is that θ M is set to be 20π, which is much larger than π as used for finitedifference discretization.This setting is necessary since a periodic boundary condition (2.12) is imposed on ρ(x, θ), or equivalently, on H(x, θ).If the length of the domain, θ M , is not large enough, due to periodicity, the approximation to H(x, θ) will propagate to the end of the domain and reappear at the lower boundary θ = 0.In other words, θ M depends on the speed of propagation of the approximation to H(x, θ), and it turns out that setting θ M = 20π is sufficient.M is set to be 8 × 10 3 regardless of the value of ∆x to make sure we have large enough number of Fourier modes to represent the initial condition.See Section 3.4 for details on which value of M to use for each method in terms of both accuracy and computation time.Algorithm 3 shows the pseudocode for TW(β; method="spectral", step="bdf5", pdf=true).As with the finite-difference method, step 11 can be replaced with solving an alternate discretization of (2.13).
Note that when β is very large, numerical instabilities can occur.This instability arises primarily because the initial condition closely resembles a step function.To get an accurate result in this case, first of all, one needs to use finer grid for x and larger value for M , i.e., refinement in both time and space.Also, one needs to use more Chebyshev coefficients.With current values for the parameters, TW(β) and TW(β; method="spectral", step="bdf5") exhibits stability roughly for 1 ≤ β ≤ 30.

Eigenvalues of ∆x[T (β, θ
for finite-difference discretization The default time-stepping routine for the finite-difference discretization is the trapezoidal method but BDF3, BDF4, BDF5, and BDF6 can also be used on (2.5) to solve for H H H M .It turns out that with values for the parameters as in (3.1), convergence occurs for all these methods.Since with the finite-difference discretization, the trapezoidal method and BDF3 are usually used, Figures 1 and 2 show the absolute stability regions of trapezoidal method and BDF3 along with eigenvalues of ∆x(T + U ) for β = 2, x = x 0 , x 0 − 1, . . ., x N , ∆x = −10 −3 , and M = 10 3 .The eigenvalues in the left-half plane are firmly within the region of absolute stability in each case.2), the eigenvalues in the left-half plane are firmly within the region of absolute stability for these methods.We choose BDF5 over BDF6 because with the same values of these parameters, the performance is roughly the same yet BDF5 has a larger absolute stability region.Selecting ∆x is more nuanced than one might think, see Appendix A for more details.

Error analysis
We now compare the accuracy of these two algorithms when computing the cdf for β = 1, 2, 4.
For reference solutions, we implement the Fredholm determinant representations for β = 1, 2, 4 [3,4,9] by porting the code in the Julia package RandomMatrices to Mathematica and implementing it in high-precision arithmetic.This ensures that our reference solutions are accurate to beyond the ≈ 10 −16 machine precision for standard double precision.Observe that for F 4 (x), x should be divided by a factor of 2 1/6 .This difference in variance convention has also been underscored in [16, p. 47], [14,17], and [1, Remark 5.1.4].With values of the parameters given as in (3.1) and (3.2), Figure 5 shows the absolute errors of the two algorithms across the domain.Tables 1 and 2 show the absolute errors of some selected x-values with same values for the parameters.

Error across the domain
For finite-difference discretization, from either Figure 5 (a) or Table 1, the error is roughly on the order of 10 −7 around the peak of the distribution, and it improves when x approaches either end of the domain.This is due to the fact that exact values, 0 and 1, for the initial condition are imposed on the endpoints of the domain, and the Dirichlet boundary condition ensures that the solution tends to zero.
For spectral discretization, from either Figure 5 (b) or Table 2, the error is roughly on the order of 10 −13 throughout the entire domain.Regardless of the order of error, the reason for this discrepancy from finite-difference discretization near the edges of the domain is that an error is introduced for the initial condition when we represent ρ(x, θ) in terms of a truncated Fourier series.

Order of error
For β = 2 evaluated at x = −2, Figure 6 shows the order of error plots of finite-difference discretization and spectral discretization.We choose x = −2 since it is near the peak of the distribution.With ∆x = −0.004,−0.002, −0.001 for finite-difference discretization and ∆x = −0.2,−0.1, −0.05 for spectral discretization, Figure 7 shows the change of error as the value of |∆x| decreases.For both the finite-difference discretization and the spectral discretization, from Figure 6, the error has the expected order with respect to |∆x|.However, if we decrease the value of |∆x| further, each BDF method has a corresponding range for ∆x that causes instability.Moreover, once the value of ∆x exits this range, the corresponding BDF method becomes accurate again.See Appendix A for more details.Remark 3.2.In addition to the trapezoidal and BDF methods, one could consider A-stable and L-stable diagonally implicit Runge-Kutta methods as alternative options for time-stepping.Furthermore, one could even consider adaptive time-stepping.While these alternative methods can effectively address the stability concerns arising from the spectrum of ∆x(A + xB) and ∆x(T + U ), they do tend to increase the computation time.These methods likely require more than one linear solve at each time step (possibly one matrix factorization and multiple uses of it).And our target time step is |∆x| = 10 −3 and, with this time step, BDF5 is stable with the default parameters, requiring only one (sparse) linear solve per time step.Then the comparison of methods becomes a complicated competition between more expensive, higher-order methods that allow a larger time step and less expensive, but still fairly high-order, methods requiring a smaller time step.It is possible that savings can be achieved here, but we did not see it in our experiments.

Error with respect to x 0
Figure 8 shows how the maximum error over x = −8, −7, . . ., x 0 changes with respect to the value of x 0 .For β = 4 with spectral discretization using BDF5, the error for x 0 = 12, 13 is roughly O 10 −11 , which can be improved to O 10 −12 using more Fourier modes.Except for  these two cases, for both methods, as the value of β increases, the value of x0 , the minimum value of x 0 that can be used without affecting the accuracy, decreases.This is exactly what we expect since the larger value of β is, the more concentrated the distribution becomes.Based on [5,6], one expects x0 ≈ C/β

204.295s
Table 3.For each discretization in the first column and time-stepping method in the second column, computation time to get the interpolated cdf for the Tracy-Widom distribution and the corresponding error at x = −2 for β = 2 are recorded in the next two columns.The computation times in the third column are generated using the default parameters as in (3.1) and (3.2), with the corresponding errors in the fourth column.If we aim for an error of O 10 −6 , the last column shows the corresponding minimum observed computation times for each discretization and time-stepping method to achieve this error.
conservative and use β

Computation time
The values in Table 3 are obtained by running on a computer with processor: 11th Gen Intel(R) Core(TM) i7-11800H 2.30GHz with 16.0GB of RAM.
Table 3 shows the computation time with default values of the parameters as in (3.1) and (3.2) for β = 2 along with the corresponding error at x = −2.The last column provides the computation time if we aim for an error of O 10 −6 at x = −2.For finite-difference discretization, to have the error of O 10 −6 at x = −2, ∆x = −0.01 can be used instead of ∆x = −0.001.For spectral discretization, to have the error of O 10 −6 at x = −2, M = 4×10 3 can be used instead of M = 8 × 10 3 with ∆x = −0.01 for trapezoidal method, ∆x = −0.05for BDF3, ∆x = −0.07 for BDF4, and ∆x = −0.1 for BDF5.It turns out that the error using BDF6 will jump from O 10 −5 to O 10 −12 .From Table 3, it takes about 204.295s using BDF6 to have the error of O(10 −12 ) with parameters as in (3.2).To have the error of O 10 −5 , it takes about 0.639s with M = 5×10 3 and ∆x = −0.2.Work to improve the speed of the spectral discretization is ongoing.
Using the same machine, Bloemendal's code (see [1,Section 6]), which uses Mathematica's NDSolve, takes approximately 3 seconds to compute the approximation to the function H.We point out that Mathematica 13.2, generates warnings from NDSolve regarding the errors in the approximate solution.The accuracy of this approach, while simple, appears to be limited to a maximum of 7 digits.Yet, the fact that this is indeed so simple, and works, is an important reminder of the conceptual simplicity of this representation of the Tracy-Widom distribution function.
Our methods expand upon this, providing both the cdfs and pdfs as output and producing high-accuracy interpolants while leaving the trade-off between accuracy and speed to the user's discretion.And since we consider the general-β Tracy-Widom distribution functions as important nonlinear special functions, developing methods, with the highest possible accuracy, is critical.See Section 5 for some thoughts on further improvements.

Additional numerical results
In this section, we present additional plots to demonstrate the power and flexibility of the code.

Comparison with large random matrices
We verify numerically that the pdf generated by our algorithm agrees with the model presented by Dumitriu and Edelman [7].Recall that H β n in (1.2) has (1.1) as the jpdf for its eigenvalues, and the distribution of its largest eigenvalue, after rescaling, converges to F β for any β > 0 as n → ∞.
Histograms in Figure 9 are the normalized histograms for n 4.2 Evolution of the density ρ(x, θ) := ∂H(x, θ)/∂θ Figure 11 shows the contour plots of the approximation of H(x, θ) using finite-difference discretization with trapezoidal method with values of the parameters given as in (3.1).For each contour plot, the initial condition H(x 0 , θ) is found along the top of the plot and the approximate Tracy-Widom distribution, F β (x) ≈ H(x, θ M ), is obtained on the right-hand side of the plot.

Density of the Tracy-Widom distribution
x

Limiting densities of other eigenvalues
By [1,Theorem 2.4.3], one finds the limiting density of the kth largest eigenvalue, after rescaling, of the β-Hermite ensemble H β n at θ = kπ.Using finite-difference discretization with trapezoidal method with values for the parameters as in (3.1), Figure 13 shows the limiting densities of the largest three eigenvalues of H β n , namely, F ′ (•, π), F ′ (•, 2π), and F ′ (•, 3π) from right to left respectively.They can also be interpreted as the densities of −Λ 0 , −Λ 1 , and −Λ 2 of the stochastic Airy operator H β .As the value of k increases, the limiting distribution becomes more concentrated.

Outlook and open questions
The error analysis in this paper was empirical, as a proof of convergence is elusive.It is likely that eigenvalue perturbation theory could be used to help show that the spectrum of the xdependent families of matrices we consider remains close to, or inside of, the stability regions for the time-steppers we have chosen -at least for sufficiently small time steps.A challenge here is to obtain quantitative bounds, making "sufficiently small" precise, and giving useful conclusions.Furthermore, even if the spectrum is understood, it is currently not known how to estimate the eigenvector condition numbers and pseudospectra of these families of matrices.
We also believe it to be possible to achieve better accuracy by adapting the global spectral method of Olver and Townsend [22] and future work will be in this direction.A high-precision implementation of this idea could help corroborate conjectures about the tail behavior of the Tracy-Widom distribution.

A The range of ∆x that causes instability
When applying BDF methods to the spectral discretization, we find that each BDF method has a corresponding range for ∆x that causes instability.Moreover, once the value of ∆x exits this range, convergence appears to occur at the expected rate.As a result, when an error plot is generated with respect to the value of |∆x|, we will observe a non-monotonic pattern of

M 12 :
Use (2.13) and denote the result by b n M 13: end for 14: Compute F= a n M , integ and f= b n M , integ 15: Perform Algorithm 1 with F, f, and T=x 16: Return the interpolant for f

Figures 3 and 4
Figures3 and 4show the absolute stability regions of BDF5 and BDF6 along with eigenvalues of ∆x(A + xB) for β = 2, x = x 0 , x 0 − 1, . . ., x N , ∆x = −10 −3 , and M = 8 × 10 3 .It is clear that, with the values of the parameters given as in (3.2), the eigenvalues in the left-half plane are firmly within the region of absolute stability for these methods.We choose BDF5 over BDF6 because with the same values of these parameters, the performance is roughly the same yet BDF5 has a larger absolute stability region.Selecting ∆x is more nuanced than one might think, see Appendix A for more details.

Figure 7 .
Figure 7. Change of error of finite-difference discretization using trapezoidal method with ∆x = −0.004,−0.002, −0.001 and spectral discretization using BDF5 with ∆x = −0.2,−0.05.Values of the other parameters are used as in (3.1) and (3.2).When applying BDF5 to the spectral discretization, errors are more pronounced for smaller time steps at the left edge due to the onset of instability discussed in the appendix.

Figure 8 .
Figure 8. Change of error of finite-difference discretization using trapezoidal method and spectral discretization using BDF5 with respect to the value of x 0 .β = 1, 2, 4 and x 0 = 13, 12, . . ., 2 are used.Values of the other parameters are the same as in (3.1) and (3.2).For each value of β and x 0 , the error is taken to be the maximum over x = −8, −7, . . ., x 0 .Discretization Method Time (default) Error (default) Time 10 −6

Figure 10
Figure10shows the waterfall plots of the approximation of ρ(x, θ) := ∂H(x, θ)/∂θ for θ ∈ [0, 10π] and x = 0, −2, . . ., −10 using finite-difference discretization with trapezoidal method on (2.11) with θ M = 10π and M = 10 4 .Values of the other parameters are the same as in (3.1).As the value of x decreases, the density of the solution propagates to the right.Figure11shows the contour plots of the approximation of H(x, θ) using finite-difference discretization with trapezoidal method with values of the parameters given as in (3.1).For each contour plot, the initial condition H(x 0 , θ) is found along the top of the plot and the approximate Tracy-Widom distribution, F β (x) ≈ H(x, θ M ), is obtained on the right-hand side of the plot.

Figure 12 (
Figure 12 (a) shows the plot of the approximation of F ′ β for β = 1 to 10 and x ∈ [−6, 4] using finite-difference discretization with trapezoidal method with values of the other parameters as in (3.1).As we can see from the plot, as the value of β increases, F ′ β becomes more concentrated, and its peak moves leftwards (See Appendix B for a discussion of the exact limiting behavior as β → ∞).Figure 12 (b) is a two-dimensional version of Figure 12 (a), which is also generated using the finite-difference discretization with trapezoidal method.It provides a closer view of F ′ β for β = 1 to 4 with step size = 0.2.The red curves from right to left correspond to β = 1, 2, 4 respectively.The black curves show F ′ β for the other values of β.

Figure 12 (
Figure 12 (a) shows the plot of the approximation of F ′ β for β = 1 to 10 and x ∈ [−6, 4] using finite-difference discretization with trapezoidal method with values of the other parameters as in (3.1).As we can see from the plot, as the value of β increases, F ′ β becomes more concentrated, and its peak moves leftwards (See Appendix B for a discussion of the exact limiting behavior as β → ∞).Figure 12 (b) is a two-dimensional version of Figure 12 (a), which is also generated using the finite-difference discretization with trapezoidal method.It provides a closer view of F ′ β for β = 1 to 4 with step size = 0.2.The red curves from right to left correspond to β = 1, 2, 4 respectively.The black curves show F ′ β for the other values of β.

( a )Figure 20 .
Figure 20.The non-monotonic convergence for BDF6 along with the mean fraction ⟨δ (x, ∆x)⟩ of the unstable eigenvalues and the mean value ⟨µ(x, ∆x)⟩.The detailed computing process for each plot is the same as for BDF3, see the caption of Figure14.

Figure 22 shows 9 )
Figure22shows the contour plot of (B.8) with respect to x and θ for x ∈ [−10, 10] and θ ∈ [0, π].The red curves correspond to the level zero, which occurs when c 1 ̸ = 0 and c 2 = 0. We can see that there is one and only one red curve along which θ → 0 as x → ∞.Denote the first real zero (closest to x = 0) of Ai(x) by x 1 ≈ −2.33811.It can be verified from (B.8) by setting c 2 = 0 and c 1 ̸ = 0 that x → x + 1 as θ → π − .Set F (x, ω) in the following way:

Table 1 .
Absolute errors of finite-difference discretization using trapezoidal method with values of the parameters as in (3.1).

Table 2 .
Absolute errors of spectral discretization using BDF5 with values of the parameters as in (3.2).