As a reminder, what we measure in lattice QCD are correlation functions

*C(t)=<O(t)O(0)>*of composite fields

*O(t)*. From Feynman's functional integral formula, these are equal to the vacuum expectation value of the corresponding products of operators. Changing from the Heisenberg to the Schrödinger picture, it is straightforward to show that (for infinite temporal extent of the lattice) these have a spectral representation

*C(t)=Σ*, which in principle contains all information about the energies

_{n}|ψ_{n}|^{2}e^{-Ent}*E*and matrix elements

_{n}*ψ*of all states in the theory.

_{n}=<0|O|n>The problem with getting that information from the theory is twofold: Firstly, we only measure the correlator on a finite number of timeslices; the task of inferring an infinite number of

*E*and

_{n}*ψ*from a finite number of

_{n}*C(t*is therefore infinitely ill-conditioned. Secondly, and more importantly, the measured correlation functions have associated statistical errors, and the number of timeslices on which the excited states' (

_{k})*n>1*) contributions are larger than the error is often rather small. We are therefore faced with a difficult data analysis task.

The simplest idea of how to extract information beyond the ground state would be to just perform a multi-exponential fit with a given number of exponentials on the measured correlator. This approach fails spectacularly, because multi-exponential fits are rather ill-conditioned. One finds that changing the number of fitted exponentials will affect the best fit values found rather strongly, leading to a large and unknown systematic error; moreover, the fits will often tend to wander off into unphysical regions (negative energies, unreasonablely large matrix elements for excited states). This instability therefore needs addressing if one wishes to use a

*χ*-based method for the analysis of excited state masses.

^{2}The first such stabilisation that has been proposed and is widely used is known as Bayesian or constrained fitting. The idea here is to augment the χ

^{2}functional by prior information that one has about the spectrum of the theory (such as that energies are positive and less than the cutoff, but if one wishes also perhaps more stringent constraints coming e.g. from effective field theories or models). The reason one may do this is Bayes' theorem, which can be read as stating that the probability distribution of the parameters

*M*given the data

*D*is the product of the probability distribution of the data given the parameters times the probability distribution of the parameters absent any data:

*P(M|D)=P(D|M)/P(D) P(M)*; taking the logarithm of both sides and maximising of

*M*, we then want to maximise

*log(P(D|M)) + log(P(M))*. Now

*log(P(D|M))*is known to be proportional to

*-χ*, so if

^{2}*P(M)*was completely flat, we would end up minimizing

*χ*. If we take

^{2}*P(M)*to be Gaussian instead, we end up with an augmented

*χ*that contains an additional term

^{2}*Σ*that forces the parameters

_{n}(M_{n}-I_{n})^{2}/σ_{n}^{2}*M*towards their initial guesses ("priors")

_{n}*I*, and hence stabilises the fit -- in principle even with an infinite number of fit parameters. The widths

_{n}*σ*are arbitrary in principle; fitted values

_{n}*M*that noticeably depend on

_{n}*σ*are determined by the priors and not the data and must be discarded. In practice the lowest few energies and matrix elements do not show a significant dependence on

_{n}*σ*or on the number of higher states included in the fit, and may therefore be taken to have been determined by the data.

_{n}Bayesian fitting is a very powerful tool, but not everyone is happy with it. One objection is that adding any external information, even as a constraint, compromises the status of lattice QCD as a first-principles determination of physical quantities. Another common worry is the GIGO (garbage in-garbage out) principle with regards to the priors.

A way to address the former concern that has been proposed is the Sequential Empirical Bayes Method (SEBM). Here, one first performs an unstabilised single-exponential fit at large times

*t*, where the ground state is known to dominate. Then one performs a constrained two-exponential fit over a larger range of

*t*using the first fit result as a prior (with its error as the width). The result of this fit is then used as the prior in another three-exponential fit over an even larger time range, and so forth. (There is some variation as to the exact procedure followed, but this is the basic idea). In this way, all priors have been determined by the data themselves.

In the next post of this series we will look at a completely different approach to extracting excited state masses and matrix elements that does not rely on

*χ*at all.

^{2}