The fourth-root trick for the staggered determinant has long been controversial. Most recently, the debate has been rekindled by a series of papers by Mike Creutz, in which he argues that the rooting procedure fails in specific ways. While some of the arguments have been refuted by members of the staggered community, criticisms related to the question whether the rooted staggered theory can describe the axial anomaly correctly remain important. A direct physical probe of the axial anomaly is given by the η'-η splitting. Unfortunately, the determination of this splitting requires the evaluation of disconnected contributions to the η' correlator, which are very noisy and cannot be measured with sufficient precision to make a clear statement at the current time. In his recent paper, Stephan Dürr approaches the question of the correctness of the rooting procedure from the angle of a theory in which sufficient statistics can be readily obtained, namely the Schwinger model.
The Schwinger model is simply QED in 1+1 spacetime dimensions, as far as its action is concerned. Its physics is, however, radically different from that of QED in 3+1 dimensions, since firstly there is neither spin nor a physical gauge boson degree of freedom in 1+1d, and secondly the one-dimensional Coulomb potential is linear and hence confining. The Schwinger model therefore has a spectrum similar to that of QCD, with a mass gap and meson degrees of freedom (note that there are neither baryons nor "photoballs" due to the abelian nature of the interaction [although there aren't any glueballs in 1+1d QCD either, due to the absence of the gauge boson as a degree of freedom]), and can therefore serve as a laboratory for ideas in QCD. In particular, the basic meson η of the Schwinger model, which Schwinger demonstrated to have a mass squared of m^2 = e^2/π (where e is the dimensionful gauge coupling in 1+1d), is an analogue of the η' in QCD, since its mass is mainly due to the axial anomaly.
The Schwinger model is much easier to simulate than QCD both because two dimensions are easier than four, and also because it turns out that reweighting works very well in two dimensions where the fermionic determinant can be evaluated exactly due to its comparably small size, so that one can generate quenched ensembles and include the fermionic determinant via reweighting. In particular the latter feature allows the generation of huge statistics (80,000 configurations in this case). Dürr employs an algorithm incorporating the introduction of instantons and antiinstantons as well as parity transformations to optimise the sampling of topological sectors. The resulting ensembles are then used to simulate the Nf=1(2) Schwinger model via reweighting with the rooted (unrooted) staggered fermion determinant. The latter is correct by construction; testing the former is the motivation for the study.
Using all-to-all propagators and U(1)-projected triply APE-smeared gauge links, Dürr is able to show the validity of the staggered index theorem with impressive precision. Turning to the meson spectrum, he finds that the connected part of the η has the same mass as the Nf=2 π meson up to cut-off effects, so that the mass of the physical η in the chiral limit comes entirely from the disconnected part. The ratio of the disconnected to the connected Green's functions for the η approaches the correct limiting value expected if the rooting trick works correctly. After a continuum and chiral extrapolation, he finds that the mass of the Nf=1 η meson agrees with Schwinger's analytical result.
This paper provides a very interesting study that adds to the empirical support for the correctness of the rooting procedure for staggered quarks. Of course it remains to be seen whether this result will carry over to QCD, but I'd be honestly surprised if it didn't. An analytical construction demonstrating the correctness of the rooted staggered formalism would of course be very welcome. Perhaps some of the recent results regarding the connection between staggered and overlap fermions will point the way in that regard.
Thursday, April 12, 2012
Friday, February 11, 2011
What's new at the fermion zoo?
If there is anything more typical of the landscape of lattice QCD than collaboration acronyms that mean something very different to people from outside the lattice community (a car manufacturer, a color model, or an old DOS command, say), it has to be the fact that each of these collaborations uses a fermion action that is in some way different from those of all other collaborations. For gauge actions, there isn't all that much variety (Wilson, tree-level Symanzik, Lüscher-Weisz with or without O(Nf αs a2) corrections, and Iwasaki), but for fermions there is a veritable zoo.
Of course, for every zoo there is a Linnaean system establishing a taxonomy, so the fermion zoo can be ordered by grouping the fermion actions into different classes:
- Wilson fermions get rid of the doublers by adding a term (the Wilson term) to the action that explicitly breaks chiral symmetry and thus lifts the degeneracy of the doublers, giving them masses of the order of the cut-off. Wilson fermions can be subdivided further into straight Wilson fermions (which have O(a) discretisation effects and hence are rarely used) and O(a)-improved Wilson fermions, which add another term, the Sheikholeslami-Wohlert term, to reduce the lattice artifacts to O(a2). The numerous individual actions being used then differ mainly by the kind of links that go into the discretised derivatives (and possibly into the SW term): whether they are thin links, for rigorous locality and positivity properties, or different kinds of smeared links, for empirically better statistical behaviour of various observables.
- twisted-mass fermions are close relatives of Wilson fermions, consisting of a doublet of unimproved Wilson fermions with a twisted mass term proportional to iγ5τ3; the doublet is interpreted as the up/down isospin doublet. One of the attractive features of twisted-mass fermions is that spectral observables are automatically O(a)-improved. On the other hand, isospin and parity are violated by cut-off effects, which leads to potentially undesirable features such as a neutral pion with the quantum numbers of the vacuum.
- staggered fermions reduce the number of doublers to four by redistributing the spinor degrees of freedom between lattice sites. Here, too, improvement by adding an additional three-link term (the Naik term) is commonly employed. Significant use is made of smearing to reduce the impact of high-momentum gluons whose exchange results in interactions mixing the different "tastes" of remaining doublers. An advantage of the staggered formalism is the preservation of a residual chiral symmetry; a disadvantage is the need to take the fourth root of the determinant of the Dirac operator (unless one wants to simulate with Nf=4 degenerate flavours), an issue that has been surrounded by some controversy. The actions in current use are the asqtad and HISQ actions.
- overlap fermions are constructed as an exact solution to the Ginsparg-Wilson relation by means of the overlap operator, which is essentially the matrix sign function of the Wilson Dirac operator. While having the obvious theoretical advantage of exact chiral symmetry at finite lattice spacing, overlap fermions are very expensive to simulate, and thus are not in widespread use yet.
- domain-wall fermions use a fictitious fifth dimension to realise chiral symmetry by localising the opposite chiralities on different "branes" or domain walls in the fifth direction. They are likewise rather expensive to simulate.
Of course, life being incredibly diverse, every taxonomist will sooner or later run into a creature which defies the existing taxonomic scheme. The past year has, I think, been such an occasion for the fermion zoo, which was increased by the addition of what may become two new families of fermions that straddle the boundaries between the classes outlined above.
One is the family of minimally doubled fermions, which are being championed by Mike Creutz and by people here at Mainz. The idea is to find an action which has the minimal number of doublers permitted for a chirally symmetric Dirac operator by the Nielsen-Ninomiya theorem, i.e. a doublet of fermions that can then be interpreted as the up/down doublet. There are two realisations of this idea, now known as Karsten-Wilczek and Creutz-Borici fermions, respectively, both of which rely on the addition of a Wilson-like term to the action. In a way, this puts them somewhere between Wilson and staggered fermions, the latter because of the existence of taste-changing interactions; of course, no rooting is required to simulate an Nf=2 theory with minimally doubled fermions. The price paid is that, because the line connecting the two poles in momentum space defines a preferred direction, at least one of the discrete spatiotemporal symmetries must be broken; this leads to the possibility of generating additional (relevant in the RG sense) dimension-3 operators in the action, which have to be fine-tuned away. Simulations with minimally doubled fermions are in preparation and will have to deal with these questions; it remains to be seen if this formulation will have practical relevance beyond its obvious theoretical impact.
The other new fermion family are the staggered overlap fermions introduced at this year's lattice conference by David Adams, which, as the name suggests, close the gap between staggered and overlap fermions. The idea here is to perform a construction similar to that used to obtain the overlap operator from the Wilson Dirac operator, but taking the staggered Dirac operator as the starting point. As it turns out, this naturally results in a theory with two fermion flavours, so again no rooting is required to simulate an up/down doublet in this fashion.
Like all taxonomy-defying creatures, these new fermion actions hold the potential to reveal hitherto unknown connections between previously unconnected classes of entities, in this case perhaps by establishing new connections between the number of flavours, chiral symmetry, doubling and the staggered formalism.
Labels:
lattice fermions,
quarks
Sunday, May 18, 2008
Trento
So I've obviously been a really bad blogger recently, but I was quite busy. One of the things I was doing was attending a workshop at the ECT* (not sure what's up with the star; I suppose ECTRNPARA was a little too long, so they used a shell wildcard) in Trento, Italy. The workshop was sort of a miniature version of the lattice conference, with representatives from all major collaborations talking about the state of the art in simulations with dynamical fermions. I briefly considered live-blogging it like I do with the lattice conferences, but in the end decided against it for various reasons. The ECT* is very nicely located in a historical villa a little outside of Trento itself; the meeting room is in the basement of a side building, though, so there is nothing to distract one from the talks. The workshop was very well organised, with hotels, meals and everything arranged in advance, so five stars to the organisers and ECT* staff for that.
Contentwise, the workshop brought few real surprises, but a lot of confirmation of the fact that dynamical fermion simulations are now pretty far advanced due to a combination of algorithmic advances and ever greater and faster parallel computers. To all but eliminate systematic errors, ultimately, one will need to simulate at small lattice spacings (0.04 fm, say), large volumes (5 fm, say) and at the physical light quark masses. At the moment, each major group is accomplishing at least one of these, with some approaching two out of the three. In three or four years at the latest, somebody will have an ensemble of configurations fulfilling all three. Given that lattice spacings this small, or quark masses anywhere in the vicinity of the physical point, were considered completely out of reach just three years ago, it is fair to say that the lattice has come a long way in a short time.
Some people will therefore sometimes use phrases like "when we will have solved QCD", but great as that sounds, one first needs to consider what solving QCD means. Even when we have predictions for the hadronic ground state mass spectrum with essentially zero systematic error, there will still be excited states, decay constants and widths, scattering lengths, form factors, multi-hadron states and potentials, and so forth coming from QCD, and many of these will likely require considerable effort in terms of new theoretical developments in order to make it viable to extract them from lattice simulations. So unless "solving QCD" means "computing the hadronic ground state mass spectrum", we won't solve it for a fair while to come. Which is good news, because otherwise I'd really have to start looking for a different job, and I actually like this one.
And of course then there is the often-mentioned possibility that the LHC might find evidence of technicolor or some other new strongly coupled physics at higher energies, putting lattice theorists at the cutting edge of the energy frontier. That sounds more like some kind of dream though.
I've also been doing other interesting things, but I'll save those for a different post. If everything goes as hoped for, there may also be an exciting guest post on this blog in the not too distant future.
Labels:
conferences,
lattice fermions,
travel
Thursday, February 14, 2008
arXiv catchup
I have been too lazybusy recently to blog anything. However, in the spirit of the day, I'd like to share a romantic little poem extolling the nonabelian nature of strong attraction:
Roses are red, violets are blue
quarks come in colours, and so does glue.
No, I won't give up physics and become a card designer for H$llm$rk, don't worry. But after softening your hearts with this touching verse, I'd like to blog about some rather old stuff, which I hope hasn't gone stale in the meantime.
One paper on the arXiv that struck me as interesting in the last couple of months was this paper by Jeffrey Mandula (of Coleman-Mandula No-Go fame), who discusses the consequences of Lüscher's nonlinear realisation of chiral symmetry for Ginsparg-Wilson fermions. We recall that this symmetry can be written in two inequivalent ways by putting the phase factor e^{iαγ5} either on the quark field ψ or on its conjugate ψ̄. The crucial fact that Mandula points out is that both of these are independent symmetries of the lattice theory, and they don't commute! Hence, we have to look at the symmetry algebra they generate, which turns out to be infinite-dimensional. The lattice theory therefore has an infinite number of conserved currents, a structure quite different from the continuum theory. However, it would really appear that the differences between any two of these lattice currents are just lattice artifacts of order a or higher that should disappear in the continuum limit, if the latter is properly defined. So some of the objections that the paper raises are likely a lot less serious than stated (in particular, the non-locality exhibited for free overlap fermions [eq. (38)] goes away once one realises that the continuum limit must be taken with the negative mass parameter s held constant in lattice units), but it appears that Ginsparg-Wilson fermions may have their own set of problems beyond just being expensive to simulate. Any comments on this from Ginsparg-Wilson specialists would be of great interest.
Another interesting paper was this one by Mike Creutz, who proposed a new fermion discretisation based on features of the electronic structure of graphene. Apparently the low-lying electronic excitations of a graphene layer are described by the massless Dirac equation, and a lattice model based on this (obtained by reducing the links in one of the three graphene hexagonal directions to points, and rescaling everything to make the lattice rectangular again) exploits this to achieve the minimum number (two) of doublers permitted in a conventional chiral lattice theory by the Nielsen-Ninomiya theorem. This construction can be extended to four dimensions and gauged to get a lattice discretisation of QCD with two light quark flavours. It was quickly followed up by a similar proposal for a minimally-doubling quark action, and by this paper, which shows that any minimally-doubling chiral lattice theory necessarily has to break either of the discrete symmetries P or T such that their product PT is broken; this allows the generation of additional (relevant) dimension-3 operators that have to be removed by fine-tuning, precluding the use of minimally-doubling chiral actions in practice (unless some additional non-standard symmetry should conspire to do that fine-tuning itself, a possibility hinted at in the conclusion).
Labels:
arXiv,
lattice fermions
Sunday, December 09, 2007
Algorithms for dynamical fermions -- Hybrid Monte Carlo
In the previous post in this series parallelling our local discussion seminar on this review, we reminded ourselves of some basic ideas of Markov Chain Monte Carlo simulations. In this post, we are going to look at the Hybrid Monte Carlo algorithm.
To simulate lattice theories with dynamical fermions, one wants an exact algorithm that performs global updates, because local updates are not cheap if the action is not local (as is the case with the fermionic determinant), and which can take large steps through configuration space to avoid critical slowing down. An algorithm satisfying these demands is Hybrid Monte Carlo (HMC). HMC is based on the idea of simulating a dynamical system with Hamiltonian H = p^2/2 + S(q), where one introduces fictitious conjugate momenta p for the original configuration variables q, and treats the action as the potential of the fictitious dynamical system. If one now generates a Markov chain with fixed-point distribution e^{-H(p,q)}, then the distribution of q ignoring p (the "marginal distribution") is the desired e^{-S(q)}.
To build such a Markov chain, one alternates two steps: Molecular Dynamics Monte Carlo (MDMC) and momentum refreshment.
MDMC is based on the fact that besides conserving the Hamiltonian, the time evolution of a Hamiltonian system preserves the phase space measure (by Liouville's theorem). So if at the end of a Hamiltonian trajectory of length τ we reverse the momentum, we get a mapping from (p,q) to (-p',q') and vice versa, thus obeying detailed balance: e^{-H(p,q)} P((-p',q'),(p,q)) = e^{-H(p',q')} P((p,q),(-p',q')), ensuring the correct fixed-point distribution. Of course, we can't actually integrate Hamilton's equations exactly in general; instead, we are content with numerical integration using an integrator that preserves the phase space measure exactly (more about which presently), but only approximately conserves the Hamiltonian. We make the algorithm exact nevertheless by adding a Metropolis step that accepts the new configuration with probability min(1, e^{-δH}), where δH is the change in the Hamiltonian under the numerical integration.
The Markov step of MDMC is of course totally degenerate: the transition probability is essentially a δ-distribution, since one can only get to one other configuration from any one configuration, and this relation is reciprocal. So while it does indeed satisfy detailed balance, this Markov step is hopelessly non-ergodic.
To make it ergodic without ruining detailed balance, we alternate between MDMC and momentum refreshment, where we redraw the fictitious momenta at random from a Gaussian distribution without regard to their present value or that of the configuration variables q: P((p',q),(p,q)) ∝ e^{-p'^2/2}. Obviously, this step will preserve the desired fixed-point distribution (which is after all simply Gaussian in the momenta). It is also obviously non-ergodic, since it never changes the configuration variables q. However, it does allow large changes in the Hamiltonian and breaks the degeneracy of the MDMC step.
While it is generally not possible to prove with any degree of rigour that the combination of MDMC and momentum refreshment is ergodic, intuitively and empirically this is indeed the case. What remains, in order to make this a practical algorithm, is to find numerical integrators that exactly preserve the phase space measure.
This order is fulfilled by symplectic integrators. The basic idea is to consider the time evolution operator exp(τ d/dt) = exp(τ(-∂_q H ∂_p + ∂_p H ∂_q)) = exp(τh) as the exponential of a differential operator on phase space. We can then decompose the latter as h = -∂_q H ∂_p + ∂_p H ∂_q = P + Q, where P = -∂_q H ∂_p and Q = ∂_p H ∂_q. Since ∂_q H = S'(q) and ∂_p H = p, we can immediately evaluate the action of e^{τP} and e^{τQ} on the state (p,q) by applying Taylor's theorem: e^{τQ}(p,q) = (p, q+τp), and e^{τP}(p,q) = (p-τS'(q), q).
Since each of these maps is simply a shear along one direction in phase space, they are clearly area preserving; so are all their powers and mutual products. In order to combine them into a suitable integrator, we need the Baker-Campbell-Hausdorff (BCH) formula.
The BCH formula says that for two elements A,B of an associative algebra, the identity
log(e^A e^B) = A + (∫_0^1 [(x log x)/(x-1)]_{x = e^{ad A} e^{t ad B}} dt)(B)
holds, where (ad A )(B) = [A,B], and the exponential and logarithm are defined via their power series (around the identity in the case of the logarithm). Expanding the first few terms, one finds
log(e^A e^B) = A + B + 1/2 [A,B] + 1/12 [A-B,[A,B]] - 1/24 [B,[A,[A,B]]] + ...
Applying this to a symmetric product, one finds
log(e^{A/2} e^B e^{A/2}) = A + B - 1/24 [A+2B,[A,B]] + ...
where in both cases the dots denote fifth-order terms.
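Since the signs in these expansions are easy to get wrong, here is a quick numerical sanity check of the symmetric formula (a throwaway sketch in Python, assuming NumPy and SciPy are available; the random matrices and all names are purely illustrative):

```python
# Numerically check the symmetric BCH expansion quoted above, using random antisymmetric
# matrices so that the exponentials are orthogonal matrices close to the identity.
import numpy as np
from scipy.linalg import expm, logm

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4)); A = A - A.T
B = rng.normal(size=(4, 4)); B = B - B.T

def comm(X, Y):
    return X @ Y - Y @ X

for dt in (0.2, 0.1, 0.05):
    lhs = logm(expm(dt * A / 2) @ expm(dt * B) @ expm(dt * A / 2))
    bch = dt * (A + B) - dt**3 / 24 * comm(A + 2 * B, comm(A, B))
    # if the third-order term above is right, this residual falls off like dt^5
    print(dt, np.linalg.norm(lhs - bch))
```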
We can then use this to build symmetric products (we want symmetric products to ensure reversibility) of e^{δτ P} and e^{δτ Q} that are equal to e^{τh} up to some controlled error. The simplest example is
(e^{δτ/2 P} e^{δτ Q} e^{δτ/2 P})^{τ/δτ} = e^{τ(P+Q)} + O((δτ)^2)
and more complex examples can be found that either reduce the order of the error (although doing so requires one to use negative time steps -δτ as well as positive ones) or minimize the error by splitting the force term P into pieces P_i that each get their own time step δτ_i to account for their different sizes.
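To see how the pieces -- momentum refreshment, the symmetric leapfrog splitting, and the Metropolis accept/reject step -- fit together, here is a minimal HMC sketch in Python for a single variable with the toy action S(q) = q^2/2 + q^4/4 (all names, step sizes and the action itself are illustrative choices, not anything resembling production lattice code):

```python
import numpy as np

rng = np.random.default_rng(0)

def S(q):  return 0.5 * q**2 + 0.25 * q**4
def dS(q): return q + q**3                      # S'(q), i.e. the "force" up to a sign

def leapfrog(p, q, dtau, nsteps):
    # the symmetric e^{dtau/2 P} e^{dtau Q} e^{dtau/2 P} splitting: reversible and area-preserving
    p = p - 0.5 * dtau * dS(q)
    for _ in range(nsteps - 1):
        q = q + dtau * p
        p = p - dtau * dS(q)
    q = q + dtau * p
    p = p - 0.5 * dtau * dS(q)
    return p, q

def hmc_update(q, tau=1.0, nsteps=20):
    p = rng.normal()                            # momentum refreshment from a Gaussian
    H_old = 0.5 * p**2 + S(q)
    p_new, q_new = leapfrog(p, q, tau / nsteps, nsteps)
    dH = 0.5 * p_new**2 + S(q_new) - H_old
    if rng.random() < np.exp(-dH):              # Metropolis accept/reject makes the algorithm exact
        return q_new
    return q

q, samples = 0.0, []
for _ in range(10000):
    q = hmc_update(q)
    samples.append(q)
print("<q^2> =", np.mean(np.square(samples)))   # estimate in the distribution e^{-S(q)}
```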
Next time we will hear more about how to apply all of this to simulations with dynamical fermions.
Labels:
lattice fermions,
quarks,
seminars,
simulations
Thursday, November 29, 2007
Algorithms for dynamical fermions -- preliminaries
It has been a while since we had any posts with proper content on this blog. Lest my readers become convinced that this blog has become a links-only intellectual wasteland, I hereby want to commence a new series on algorithms for dynamical fermions (blogging alongside our discussion seminar at DESY Zeuthen/Humboldt University, where we are reading this review paper; I hope that is not too lazy to lift this blog above the waste level...).
I will assume that readers are familiar with the most basic ideas of Markov Chain Monte Carlo simulations; essentially, one samples the space of states of a system by generating a chain of states using a Markov process (a random process where the transition probability to any other state depends only on the current state, not on any of the prior history of the process). If we call the desired distribution of states Q(x) (which in field theory will be a Boltzmann factor Z^{-1} e^{-S(x)}), and the probability that the Markov process takes us from y to x P(x,y), we want to require that the Markov process keep Q(x) invariant, i.e. Q(x) = Σ_y P(x,y) Q(y). A sufficient, but not necessary, condition for this is that the Markov process satisfy the condition of detailed balance: P(y,x) Q(x) = P(x,y) Q(y).
The simplest algorithm that satisfies detailed balance is the Metropolis algorithm: choose a candidate x at random and accept it with probability P(x,y) = min(1, Q(x)/Q(y)), or else keep the previous state y as the next state.
Another property that we want our Markov chain to have is that it is ergodic, that is, that the probability to go to any state from any other state is non-zero. While for a system with a state space as huge as that of a lattice field theory it may be hard to design a single ergodic Markov step, we can achieve ergodicity by chaining several different non-ergodic Markov steps (such as first updating site 1, then site 2, etc.) so as to obtain an overall Markov step that is ergodic. As long as each substep has the right fixed-point distribution Q(x), e.g. by satisfying detailed balance, the overall Markov step will also have Q(x) as its fixed-point distribution, in addition to being ergodic. This justifies generating updates by 'sweeping' through the lattice point by point with local updates.
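As a concrete (and entirely toy) illustration of such a sweep of local Metropolis updates -- a free scalar field on a one-dimensional periodic lattice rather than anything QCD-like, with all parameter values picked arbitrarily -- one might write something like:

```python
import numpy as np

rng = np.random.default_rng(0)
L, m2, step = 32, 0.5, 1.0                       # lattice size, mass^2, proposal width
phi = np.zeros(L)

def local_action(phi, x, val):
    # only the terms of S = sum_x [(phi(x+1)-phi(x))^2/2 + m^2 phi(x)^2/2] touching site x
    left, right = phi[(x - 1) % L], phi[(x + 1) % L]
    return 0.5 * ((val - left)**2 + (right - val)**2) + 0.5 * m2 * val**2

def sweep(phi):
    for x in range(L):                           # chain of local (individually non-ergodic) updates
        old, new = phi[x], phi[x] + step * rng.uniform(-1, 1)
        dS = local_action(phi, x, new) - local_action(phi, x, old)
        if rng.random() < np.exp(-dS):           # Metropolis accept/reject at each site
            phi[x] = new
    return phi

for _ in range(1000):
    phi = sweep(phi)
print(np.mean(phi**2))
```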
Unfortunately, successive states of a Markov chain are not really very independent, but in fact have correlations between them. This of course means that one does not get truly independent measurements from evaluating an operator on each of those states. To quantify how correlated successive states are, it is useful to introduce the idea of an autocorrelation time.
It is a theorem (which I won't prove here) that any ergodic Markov process has a fixed-point distribution to which it converges. If we consider P(x,y) as a matrix, this means that it has a unique eigenvalue λ_0 = 1, and all other eigenvalues λ_i (ordered such that |λ_{i+1}| ≤ |λ_i|) lie in the interior of the unit circle. If we start our process on a state u = Σ_i c_i v_i (where v_i is the eigenvector belonging to λ_i), then P^N u = Σ_i λ_i^N c_i v_i = c_0 v_0 + λ_1^N c_1 v_1 + ..., and hence the leading deviation from the fixed-point distribution decays exponentially with a characteristic time N_exp = -1/log|λ_1| called the exponential autocorrelation time.
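For a toy case where everything can be worked out explicitly, consider a two-state Markov chain (the numbers below are arbitrary, purely for illustration):

```python
import numpy as np

# transition matrix with columns summing to one: P[x, y] = probability of going from y to x
P = np.array([[0.9, 0.2],
              [0.1, 0.8]])

evals = np.linalg.eigvals(P)                     # one eigenvalue is exactly 1, the other is 0.7
lam1 = sorted(abs(evals))[0]
N_exp = -1.0 / np.log(lam1)                      # exponential autocorrelation time

u = np.array([1.0, 0.0])                         # start far from the fixed-point distribution
for N in (1, 5, 20):
    print(N, np.linalg.matrix_power(P, N) @ u)   # deviation from the fixed point dies off like exp(-N/N_exp)
print("N_exp =", N_exp)
```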
Unfortunately, we cannot readily determine the exponential autocorrelation time in any except the very simplest cases, so we have to look for a more accessible measure of autocorrelation. If we measure an observable O on each successive state x_t, we can define the autocorrelation function of O as the t-average of measurements that are d steps apart: C_O(d) = <O(x_{t+d}) O(x_t)>_t / <O(x_t)^2>_t, and the integrated autocorrelation time A_O = Σ_d C_O(d) gives us a measure of how many additional measurements we will need to iron out the effect of autocorrelations.
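In practice one estimates this from the measured time series itself. Here is a rough sketch (it uses the common convention τ_int = 1/2 + Σ_{d>0} C_O(d) for the connected autocorrelation function, whose normalisation differs slightly from the plain sum quoted above, and a naive fixed window instead of a proper windowing procedure; `measurements` is a hypothetical list of successive measurements of some observable):

```python
import numpy as np

def autocorr(obs, max_lag):
    obs = np.asarray(obs, dtype=float)
    fluct = obs - obs.mean()                     # work with the connected correlator
    var = np.mean(fluct**2)
    return np.array([np.mean(fluct[:len(fluct) - d] * fluct[d:]) / var
                     for d in range(max_lag)])

def tau_int(obs, max_lag=100):
    C = autocorr(obs, max_lag)
    return 0.5 + np.sum(C[1:])                   # integrated autocorrelation time

# e.g. tau_int(measurements) tells you that only about one in every ~2*tau_int successive
# configurations contributes a statistically independent measurement.
```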
With these preliminaries out of the way, in the next post we will look at the Hybrid Monte Carlo algorithm.
Labels:
lattice fermions,
seminars,
simulations
Monday, June 11, 2007
Unquenching meets improvement
In a recent post, I explained how the fact that the vacuum in quantum field theory is anything but empty affects physical calculations by means of Feynman diagrams with loops, and specifically how one has to take account of these contributions in lattice field theory via perturbative improvement. In this post, I want to say some words about the relationship between perturbative improvement and unquenching.
To obtain accurate results from lattice QCD simulations, one must include the effects not just of virtual gluons, but also of virtual quarks. Technically, this happens by including the fermionic determinant that arises from integrating over the (Grassmann-valued) quark fields. Since the historical name for omitting this determinant is "quenching", its inclusion is called "unquenching", and since quenching gives rise to an uncontrollable systematic error, unquenched simulations are absolutely crucial for the purpose of precise predictions and subsequent experimental tests of lattice QCD.
However, the perturbative improvement calculations that have been performed so far correct only for the effects of gluon loops. This leads to a mismatch in unquenched calculations using the perturbatively improved actions: while the simulation includes all the effects of both gluon and quark loops (including the discretisation artifacts they induce), only the discretisation artifacts caused by the gluon loops are removed. Therefore the discretisation artifacts caused by the quark loops remain uncorrected. Now, for many quantities of interest these artifacts are small higher-order effects; however, increased scaling violations in unquenched simulations (when compared with quenched simulations) have been seen by some groups. It is therefore important to account for the effects of the quark loops on the perturbative improvement of the lattice actions used.
This is what a group of collaborators including myself has done recently. For details of the calculations, I refer you to our paper. The calculation involved the numerical evaluation of a number of lattice Feynman diagrams (using automated methods that we have developed for the purpose) on a lattice with twisted periodic boundary conditions at a number of different fermion masses and lattice sizes, and the extrapolation of the results to the infinite-lattice and massless-quark limits. The computing resources needed were quite significant, as were the controls employed to ensure the correctness of the results, which involved both repeated evaluations using independent implementations by different authors and comparison with known physical constraints, giving us great confidence in our results. The results show that the changes in the coefficients in the actions needed for O(αs a2) improvement caused by unquenching are rather large for Nf=3 quark flavours, which is the case relevant to most unquenched simulations.
Labels:
arXiv,
improvement,
lattice fermions
Thursday, April 19, 2007
Some quick links
Superweak has an interesting post on blind analysis, which is the first technique that has been carried over from medicine into nuclear and particle physics (rather than the other way around, as were NMR, PET and a host of others). More on blind analysis techniques in experimental particle physics can be found in this review. Reading this, I was wondering whether any lattice groups used blinding in their data analyses; I am not aware of any that do, and the word "blind" does not appear to occur in hep-lat abstracts (except for phrases like "blindly relying on" and such). It may not be necessary, because we don't do the same kind of analyses that the experimenters do (like imposing cuts on the data), but the possibility of some degree of "experimenter's (!?) bias" may still exist in the choice of operators used, priors imposed on fits, etc.
There is a new paper on the arXiv which reports on tremendous gains in simulation efficiency that the authors have observed when using a loop representation for fermions instead of the conventional fermion determinant. Unfortunately their method does not work with gauge theories (except in the strong coupling limit) because it runs into a fermion sign problem, so it won't revolutionise QCD simulations, but it is very interesting, not least because it looks a lot like some kind of duality (between a theory of self-interacting fermions and a theory of self-avoiding loops) is at work.
Labels:
data analysis,
experiment,
lattice fermions
Tuesday, January 23, 2007
Evil, bad, diseased, or just ugly?
"Evil" is a word rarely heard in scientific discourse, at least among physicists, whose subject of study is after all morally neutral for pretty much any sensible definition of "morally". "Bad", "diseased" or "ugly" might be heard occasionally. But having all of them applied to a topic as relatively arcane as the fourth-root prescription for staggered fermions is, well, staggering. At last year's lattice meeting there was a lot of discussion as to whether this prescription was diseased or merely ugly. Now Mike Creutz has taken the discussion from the medicinally-aesthetic to the moral level by suggesting that rooting is actually evil. The arguments are much the same as before: The rooted staggered theory has a complicated non-locality structure at non-vanishing lattice spacing, and there is no complete proof (although there are strong arguments that many find very convincing) that this non-locality goes away in the continuum level. The debate will no doubt simmer on until a fully conclusive proof either way is found; the question is only, what kinds of unusual title words are we still going to see?
Labels:
lattice fermions
Monday, May 29, 2006
Non-Relativistic QCD
This is another installment in our series about fermions on the lattice. In the previous posts in this series we had looked at various lattice discretisations of the continuum Dirac action, and how they dealt with the problem of doublers posed by the Nielsen-Ninomiya theorem. As it turned out, one of the main difficulties in this was maintaining chiral symmetry, which is important in the limit of vanishing quark mass. But what about the opposite limit -- the limit of infinite quark mass?
As it turns out, that limit is also difficult to handle, but for entirely different reasons: the correlation functions, from which the properties of bound states are extracted, show an exponential decay of the form exp(-aM n_t), where n_t is the number of timesteps and aM is the product of the state's mass and the lattice spacing. Now for a heavy quark, e.g. a bottom, and the lattice spacings that are feasible with the biggest and fastest computers in existence today, aM is large, which means that the correlation functions for an ϒ decay far too fast to extract a meaningful signal. (Making the lattice spacing smaller is so hard because in order to fill the same physical volume you need to increase the number of lattice points accordingly, which requires a large increase in computing power.)
Fortunately, in the case of heavy quark systems the kinetic energies of the heavy quarks are small compared to their rest masses, as evidenced by the relatively small splittings between the ground and excited states of heavy quarkonium mesons. This means that the heavy quarks are moving at non-relativistic velocities and can hence be well described by a Schrödinger equation instead of the full Dirac equation, after integrating out the modes with energies of the order of the heavy quark mass M. The corresponding effective field theory is known as Non-Relativistic QCD (NRQCD) and can be schematically written using the Lagrangian

L = ψ†(iD_t - H)ψ

where ψ is a non-relativistic two-component Pauli spinor and the Hamiltonian is

H = H_0 + δH

In actual practice, this is not a useful way to write things, since a naive discretisation of the time evolution is numerically unstable for small values of aM; instead one uses an action corresponding to a timestep-by-timestep evolution of the quark Green's function of the (schematic) form

G(t+1) = (1 - aδH/2) (1 - aH_0/(2n))^n U_4†(t) (1 - aH_0/(2n))^n (1 - aδH/2) G(t)

where

H_0 = -Δ^2/(2M)

is the leading-order kinetic term, whereas δH incorporates the relativistic and other corrections, and n is a numerical stability parameter that keeps the evolution stable even at moderately small values of aM.
This complicated form makes NRQCD rather formidable to work with, but it can be and has been successfully used in the description of the ϒ system and in other contexts. In fact, some of the most precise predictions from lattice QCD rely on NRQCD for the description of heavy quarks.
It should be noted that the covariant derivatives in NRQCD are nearest-neighbour differences -- the reasons for having to take symmetric derivatives don't apply in the non-relativistic case; hence there are no doublers in NRQCD.
Labels:
lattice fermions
Monday, April 24, 2006
A debate about staggered fermions
Recently, there have been a number of short papers on the arXiv that discussed some potential problems that the usual procedure of taking the fourth root of the staggered fermion determinant to obtain a single-flavour theory might bring with it.
As a little reminder, staggered fermions are obtained from naive fermions by redistributing the spinor degrees of freedom across different lattice sites. As a result, staggered fermions describe a theory with four (rather than the 16 naive) degenerate fermion flavours, usually called "tastes" to distinguish them from real flavours. In order to obtain a theory with a single physical flavour, one usually takes the fourth root of the fermionic determinant for staggered fermions; this is correct in the free theory and in perturbation theory, but nobody really knows whether it makes sense nonperturbatively.
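As a concrete (if very much toy) illustration of the free-field statement, here is a little numpy check that when the fermion operator really is four exactly degenerate tastes, the fourth root of its determinant is just the one-flavour determinant; all names and matrix sizes are made up, and the random matrix is of course not a real staggered Dirac operator:

```python
import numpy as np

# Toy illustration of the rooting trick: if the fermion operator is exactly
# four degenerate copies ("tastes") of a one-flavour operator D1 -- as in the
# free staggered theory after taste diagonalisation -- then the fourth root
# of its determinant is just the one-flavour determinant.

rng = np.random.default_rng(0)
N = 20
D1 = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))  # stand-in one-flavour operator

D4 = np.kron(np.eye(4), D1)        # four exactly degenerate tastes

# use log-determinants to avoid overflow; compare |det D4|^(1/4) with |det D1|
_, logdet1 = np.linalg.slogdet(D1)
_, logdet4 = np.linalg.slogdet(D4)
print(np.isclose(logdet4 / 4.0, logdet1))   # True: rooting is trivially exact here

# In the interacting staggered theory the four tastes are not exactly
# degenerate at finite lattice spacing, and the question is whether
# det(D_staggered)^(1/4) still defines a local one-flavour theory as a -> 0.
```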
In the paper starting this recent debate, Creutz claimed that this procedure leads to unphysical results. His argument is based on the observation that with an odd number of quark flavours, physics is not invariant under a change of sign of the quark mass term, and hence the chiral expansion must contain odd powers of the quark mass. Since the staggered theory is invariant under a change of sign of the quark mass, so will be its fourth-rooted descendant, and hence it can only pick up even terms in the chiral expansion. Thus, Creutz claims, staggered fermions describe incorrect physics.
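Stated slightly more explicitly (in my own notation, not Creutz's): for two flavours the sign of the mass can be undone by a non-anomalous flavoured chiral rotation,
$$\psi \to e^{i\frac{\pi}{2}\gamma_5\tau_3}\psi\,,\qquad \bar\psi \to \bar\psi\, e^{i\frac{\pi}{2}\gamma_5\tau_3}\,,\qquad m\,\bar\psi\psi \to -m\,\bar\psi\psi\,,$$
so physics is even in $m$; for a single flavour the only rotation available to flip the sign of the mass is the anomalous $U(1)_A$, so no such argument applies and odd powers of $m$ are allowed in the chiral expansion.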
Within a week, there was a reply from Bernard, Golterman, Shamir and Sharpe, who claim that Creutz's argument is flawed since the quark mass in the theory corresponding to the continuum limit of the rooted staggered theory is always positive, regardless of the sign of the original quark mass, and since moreover the nonanalyticity inherent in taking a root leads to the emergence of odd powers of the (positive) mass in the continuum limit.
This was followed by a third paper by Dürr and Hoelbling, in which they show how one may define a "smart" determinant for staggered fermions (by including a phase factor that depends on the topological index of the gauge field background) that allows one to reach the regime of negative quark masses. I have to admit that I do not fully understand this work, and enlightenment from readers is appreciated.
The debate over the correctness of the fourth root trick for staggered fermions is likely to go on for a while, particularly given the fact that the choice of fermion discretisation has become an almost religious issue within the lattice community. Personally, I certainly hope that staggered fermions give the correct physics, but I am not sure whether I actually have enough evidence or understanding to have an opinion either way.
Update: The paper by Creutz has been updated with a reply to the objections raised by Bernard et al. (leading to the rather strange situation of circular citations between papers bearing different date stamps). Creutz now argues that while the problems he mentions may go away in the continuum limit, observables that develop a divergent dependence on the regulator at isolated points (such as the chiral condensate at m=0) constitute an "absurd behaviour" for a regulator, and that Wilson fermions are preferable in this regard. I am not entirely sure to what extent the existence of exceptional configurations is a less absurd behaviour, though. I suppose there may be another round in this debate (with yet more circular citations).
Labels:
lattice fermions
Tuesday, March 28, 2006
Twisted Mass Fermions
Time for another post in our series about lattice fermions. So far we have talked about naive, Wilson, staggered and Ginsparg-Wilson fermions. In this post, we are going to take a look at a still fairly new approach to lattice fermions that is known under the name of twisted mass QCD (tmQCD).
What one does in this approach is to take the Dirac operator for a flavour doublet of fermions and add to it a chirally twisted mass term $i\mu\gamma_5\tau_3$, so that the Dirac operator becomes
$$D_{tw} = D + m + i\mu\gamma_5\tau_3\,,$$
where the $\tau_3$ acts in flavour space. This extra term together with the doublet structure has the consequence that the worrisome exceptional configurations that plague Wilson quarks (remember, those were the configurations where the additive mass renormalisation that is allowed for Wilson fermions because they violate chiral symmetry takes the renormalised mass through zero) no longer exist, since the twisted Dirac operator has positive determinant:
$$\det[D_{tw}] = \det[D^\dagger D + \mu^2] > 0$$
and hence does not have any zero eigenvalues.
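For the sceptical, the positivity is easy to check numerically on toy matrices; the only ingredient that matters is $\gamma_5$-hermiticity, $D^\dagger = \gamma_5 D \gamma_5$, which the Wilson operator has. The matrices below are small random stand-ins, not a real lattice Dirac operator:

```python
import numpy as np

# Toy check of the positivity argument for the twisted-mass determinant.
# Gamma5 and D are random stand-ins with the one property that matters,
# gamma5-hermiticity  D^dag = Gamma5 D Gamma5.

rng = np.random.default_rng(1)
N = 8                                    # toy "spinor x site" dimension (even)
Gamma5 = np.diag([1.0] * (N // 2) + [-1.0] * (N // 2))

H = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H = (H + H.conj().T) / 2                 # Hermitian
D = Gamma5 @ H                           # gamma5-hermitian "Wilson-like" operator

mu = 0.3
Dup = D + 1j * mu * Gamma5               # up-type flavour  (+mu)
Ddn = D - 1j * mu * Gamma5               # down-type flavour (-mu)

det_doublet = np.linalg.det(Dup) * np.linalg.det(Ddn)
det_positive = np.linalg.det(D.conj().T @ D + mu**2 * np.eye(N))

print(det_doublet)      # real and positive (up to rounding)
print(det_positive)     # the same number: det[D^dag D + mu^2] > 0
```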
A flavour-dependent chiral rotation
$$\psi \to e^{i\omega\gamma_5\tau_3/2}\psi\,,\qquad \bar\psi \to \bar\psi\, e^{i\omega\gamma_5\tau_3/2}$$
leaves the form of the continuum action with an added twisted mass term invariant, but mixes the ordinary mass $m$ with the twisted mass $\mu$. Hence one can see the twisted mass action for a given twist angle $\omega$ as being the result of applying this chiral rotation to the ordinary continuum QCD action, and vice versa. The basis in which the $\mu$ term vanishes is known as the physical basis.
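Explicitly (again in my own notation, and up to sign conventions for the rotation angle), the mass parameters mix as
$$m \to m\cos\omega - \mu\sin\omega\,,\qquad \mu \to \mu\cos\omega + m\sin\omega\,,$$
so only the polar mass $M = \sqrt{m^2+\mu^2}$ and the twist angle $\tan\omega_0 = \mu/m$ have an invariant meaning.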
On the lattice, the twisted mass is usually added to the Wilson Dirac operator (which needs it most, since it suffers from exceptional configurations). The resulting action can then be used to study quarks at small masses, where the Wilson action itself would fail. It also has the added benefit that certain observables are automatically free of $\mathcal{O}(a)$ lattice artifacts with a twisted mass (at maximal twist).
The twisted mass theory has its own problems, though: The appearance of $\tau_3$ in the twisted mass term means that the up- and down-type quarks have opposite signs of the twisted mass, and hence isospin is no longer conserved. Also, the appearance of $\gamma_5$ implies that parity is no longer a symmetry, although a generalised parity operation involving the twist angle can be defined as a symmetry of the twisted theory.
In closing, it should be stressed again that the exact meaning and properties of twisted mass are still a very active field of research, and some surprises may still be expected. I should also add that I am not really an expert on tmQCD (though other people here in Regina are), so corrections and additional remarks are particularly welcome on this post.
Labels:
lattice fermions
Thursday, February 02, 2006
Exactly chiral fermions
In the last post in this series, we looked at the Ginsparg-Wilson relation and how it might provide a way to get past the Nielsen-Ninomiya theorem. In this post we shall have a look at how this can happen in practice.
One way in which a four-dimensional theory of a chiral fermion can be realized is by dimensional reduction from a five-dimensional theory. Let us consider the five-dimensional continuum theory of a Dirac fermion coupled to a scalar background field $\phi(s)$ depending on only the fifth dimension $s$:
$$S = \int d^4x\, ds\; \bar\psi\left[\gamma_\mu\partial_\mu + \gamma_5\partial_s + \phi(s)\right]\psi\,,$$
where the scalar field is assumed to be a (smoothed) step function of the same general form as $\phi(s) = \Lambda\tanh(s/L)$. The plane $s=0$ can be understood as a domain wall of width $L$ separating domains of $\phi \approx -\Lambda$ and $\phi \approx +\Lambda$. The case of interest has $\Lambda$ large.
From the square of the Dirac equation, we have for a fermion with four-momentum $p_\mu$, $p^2 = -m^2$, and wavefunction $\psi(x,s) = e^{ip\cdot x}\chi(s)$, that
$$\left[-\partial_s^2 + \gamma_5\,\partial_s\phi(s) + \phi(s)^2\right]\chi(s) = m^2\,\chi(s)$$
and the allowed masses on the four-dimensional domain wall are determined by the eigenvalue spectrum of a differential operator in $s$. All non-zero eigenvalues are of order $\Lambda^2$ and hence large. For the zero eigenvalues, the Dirac equation can be decoupled into
$$\left[-\gamma_5\partial_s + \phi(s)\right]\chi(s) = 0\,,\qquad \gamma_\mu p_\mu\,\chi(s) = 0$$
with solutions
$$\chi_\pm(s) = \exp\left(\pm\int_0^s \phi(s')\,ds'\right)\chi_\pm(0)\,,\qquad \gamma_5\chi_\pm = \pm\chi_\pm\,.$$
Of these, only the negative chirality solution is normalizable, and hence the low-energy spectrum on the domain wall consists of a single left-handed chiral fermion.
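As a concrete example (in the sharp-wall limit of the illustrative profile used above), take $\phi(s) = \Lambda\,\mathrm{sgn}(s)$: then
$$\chi_\pm(s) \propto e^{\pm\Lambda|s|}\,,$$
so the positive-chirality solution blows up at large $|s|$, while the negative-chirality one is exponentially bound to the wall with localisation width $1/\Lambda$.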
The presence of the scalar background field $\phi(s)$ is a little awkward, but we may simplify the situation to the case of an ultra-massive five-dimensional fermion, schematically
$$S = \int d^4x \int_0^\infty ds\; \bar\psi\left[\gamma_\mu\partial_\mu + \gamma_5\partial_s + M\right]\psi\,,$$
in the half-space $s \geq 0$ subject to the Dirichlet boundary condition
$$P_+\psi\big|_{s=0} = 0\,,\qquad P_\pm = \tfrac{1}{2}\left(1\pm\gamma_5\right)\,,$$
and perform the same analysis with $\phi(s)$ replaced by $M$.
In the early nineties, Kaplan discovered that the same domain wall effect still occurred on a lattice when the Wilson operator was used to discretize the five-dimensional theory. The apparent violation of the Nielsen-Ninomiya theorem is due to the fact that the four-dimensional theory is not the whole story: with a finite extent $L_s$ in the fifth direction, there will necessarily be another domain wall with opposite orientation, on which a massless chiral fermion of opposite chirality will live, thus fulfilling the Nielsen-Ninomiya theorem in the five-dimensional theory and ensuring the mutual cancellation of the chiral anomalies stemming from either fermion. The anomalous divergence simply becomes a flow of charge onto and off the domain wall from the extra dimension.
Around the same time, Narayanan and Neuberger discovered a formulation of chiral fermions in terms of the overlap between the ground states of two Hamiltonians representing "time" evolution to $s = \pm\infty$ along the fifth direction. Later, Neuberger discovered a way to write the overlap as the determinant of a Dirac operator, the overlap operator (in lattice units and up to normalisation)
$$D_{ov} = \frac{1}{2}\left(1 + \gamma_5\,\mathrm{sign}(H_W)\right)\,,\qquad H_W = \gamma_5 D_W\,,$$
where $D_W$ is the Wilson Dirac operator (taken at a suitable negative mass). This formulation avoids the need for an explicit fifth dimension, but at the expense of introducing the slightly awkward operator sign function $\mathrm{sign}(H_W)$.
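Since the Ginsparg-Wilson relation comes up in the next paragraph, here is a toy numpy check that this construction does satisfy it; $\Gamma_5$ and the "Wilson operator" below are small random stand-ins with the one property that matters ($\gamma_5$-hermiticity), and the normalisation is the one used in the formula above with $a=1$:

```python
import numpy as np

# Toy numerical check that the overlap construction satisfies the
# Ginsparg-Wilson relation  {gamma5, D} = 2 D gamma5 D  (with this
# normalisation of D and a = 1).  The matrices are random stand-ins,
# not a real lattice Dirac operator.

rng = np.random.default_rng(2)
N = 8
Gamma5 = np.diag([1.0] * (N // 2) + [-1.0] * (N // 2))

H = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H = (H + H.conj().T) / 2
DW = Gamma5 @ H                      # gamma5-hermitian "Wilson-like" operator

# operator sign function via the eigendecomposition of the Hermitian HW = gamma5 DW
HW = Gamma5 @ DW
w, V = np.linalg.eigh(HW)
sign_HW = V @ np.diag(np.sign(w)) @ V.conj().T

D_ov = 0.5 * (np.eye(N) + Gamma5 @ sign_HW)    # overlap operator

lhs = Gamma5 @ D_ov + D_ov @ Gamma5
rhs = 2.0 * D_ov @ Gamma5 @ D_ov
print(np.allclose(lhs, rhs))                   # True: Ginsparg-Wilson holds
```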
Later, it was shown that the domain wall and overlap formulations were essentially equivalent. It can also be shown that both the overlap operator and the effective Dirac operator for fermions on the domain wall satisfy the Ginsparg-Wilson relation, thereby making it possible to describe exactly chiral fermions on the lattice.
So what is the bad news? The bad news is that these exactly chiral fermion formulations are extremely hard to simulate. Domain wall fermions need to be simulated in five dimensions, greatly increasing the computational demand, and for overlap fermions the operator sign function is rather difficult to compute. So while these actions are exactly chiral, and hence in a way closer to the real continuum physics, simulating them at reasonable volumes and lattice spacings will require a huge computational effort. If one considers the effort to which MILC had to go to get 1%-level predictions using staggered fermions (which are very efficient to simulate), it becomes clear that high-precision predictions from dynamical simulations using exactly chiral fermions are still a fair while in the future.
In the next, and probably final post in this series, we will go and have a look at a fairly new lattice fermion action, known as twisted mass.
Labels:
lattice fermions