_{ub} (currently subject to some significant tension between inclusive and exclusive determinations) are in the final stages of analysis. The other effective theory was QCD with N_{f}<6 flavours -- which is of course technically an effective theory where the heavy quarks have been integrated out! Rainer presented a new factorisation formula that relates the mass of a light hadron in the theory with a heavy quark to that of the same hadron in a theory in which the heavy quark is massless by a factor dependent on the hadron and a universal perturbative factor. The factorisation formula has been tested for gluonic observables in the pure gauge theory matched to the two-flavour theory.

After tea, we had a session focussed on algorithms and machines. The first speaker was Andreas Frommer, who spoke about multigrid solvers for the Dirac equation in lattice QCD. A multigrid solver consists of a smoother and a coarse-grid correction. For the smoother for the Dirac equation, the Schwarz Alternating Procedure (SAP) is a natural choice, whereas for the coarse-grid correction, aggregate-based interpolation (essentially the same idea as Lüscher-style inexact deflation) can be used. The resulting multigrid algorithm is very similar to the domain-decomposed algorithm used in the DD-HMC and openQCD codes, but generalises to more than two levels, which may lead to better performance. Applications to the overlap operator were presented.
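To make the two ingredients concrete, here is a minimal two-grid sketch in Python with NumPy. It is only a toy illustration and not the solver from the talk: the Dirac operator is replaced by a 1D Laplacian, SAP is replaced by damped Jacobi as the smoother, and the aggregates are simply pairs of neighbouring sites with piecewise-constant interpolation. All function names are made up for this example.

```python
import numpy as np

def laplacian_1d(n):
    """Dense 1D Laplacian tridiag(-1, 2, -1); toy stand-in for the Dirac operator."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def aggregation_prolongator(n, agg_size=2):
    """Piecewise-constant interpolation: one coarse unknown per aggregate of sites."""
    n_coarse = n // agg_size
    P = np.zeros((n, n_coarse))
    for i in range(n):
        # Leftover sites (if n is not a multiple of agg_size) join the last aggregate.
        P[i, min(i // agg_size, n_coarse - 1)] = 1.0
    return P

def smooth(A, x, b, sweeps=1, omega=2.0 / 3.0):
    """Damped Jacobi sweeps (assumption: used here in place of SAP)."""
    d = np.diag(A)
    for _ in range(sweeps):
        x = x + omega * (b - A @ x) / d
    return x

def two_grid_cycle(A, P, A_coarse, x, b):
    x = smooth(A, x, b)                               # pre-smoothing
    r_coarse = P.T @ (b - A @ x)                      # restrict the residual
    x = x + P @ np.linalg.solve(A_coarse, r_coarse)   # coarse-grid correction
    return smooth(A, x, b)                            # post-smoothing

def two_grid_solve(A, b, cycles=40):
    P = aggregation_prolongator(A.shape[0])
    A_coarse = P.T @ A @ P                            # Galerkin coarse operator
    x = np.zeros_like(b)
    for _ in range(cycles):
        x = two_grid_cycle(A, P, A_coarse, x, b)
    return x

n = 64
A = laplacian_1d(n)
b = np.ones(n)
x = two_grid_solve(A, b)
rel_residual = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
print(f"relative residual after 40 cycles: {rel_residual:.2e}")
```

A true multigrid method would apply the same cycle recursively to the coarse system instead of solving it exactly, which is the generalisation to more than two levels.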

Next, Stephan Solbrig presented the QPACE2 project, which aims to build a supercomputer based on Intel Knights Corner (Xeon Phi) cards as processors. Each node consists of four Xeon Phis linked via a PCIe switch to each other, to a weak host CPU used only for booting, and to an InfiniBand card. The whole system uses hot-water cooling, building on experience gathered in the iDataCool project. The 512-bit-wide registers of the Xeon Phi necessitate several programming tricks such as site fusing to make optimal use of computing resources; the resulting code seems to scale almost perfectly as long as there are enough domains to keep all nodes busy. An interesting side note was that apparently there are extremophile bacteria that thrive in the copper pipes of water-cooled computer clusters.
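The site-fusing trick can be pictured as a change of data layout. The sketch below is my own illustration in Python with NumPy, not code from the project: it assumes single-precision spinors with 24 real numbers per site and fuses 16 consecutive sites, so that the same spinor component from 16 sites sits contiguously in memory, as it would in one 512-bit register. Which sites are actually fused in the real code is not specified here, and the function names are hypothetical.

```python
import numpy as np

SIMD_LANES = 16        # 512 bits / 32-bit floats
REALS_PER_SITE = 24    # 4 spins x 3 colours x (real, imaginary) for one spinor

def fuse_sites(field_aos):
    """Convert site-major data (n_sites, 24) to fused layout (n_sites // 16, 24, 16).

    In the fused array the last axis holds the same spinor component
    taken from 16 different sites.
    """
    n_sites, n_comp = field_aos.shape
    if n_sites % SIMD_LANES != 0:
        raise ValueError("number of sites must be a multiple of the lane count")
    blocks = field_aos.reshape(n_sites // SIMD_LANES, SIMD_LANES, n_comp)
    return np.ascontiguousarray(blocks.transpose(0, 2, 1))

def unfuse_sites(field_fused):
    """Inverse of fuse_sites: back to one row of 24 numbers per site."""
    n_blocks, n_comp, lanes = field_fused.shape
    site_major = np.ascontiguousarray(field_fused.transpose(0, 2, 1))
    return site_major.reshape(n_blocks * lanes, n_comp)

rng = np.random.default_rng(0)
field = rng.standard_normal((64, REALS_PER_SITE)).astype(np.float32)
fused = fuse_sites(field)
print(fused.shape)
```

With this layout, an operation on one spinor component acts on a full vector of 16 sites at once, instead of leaving part of a register idle because 24 numbers per site do not fill 16-wide registers evenly.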

Pushan Majumdar rounded off the session with a talk about QCD on GPUs. The special programming model of GPUs (small amount of memory per core, restrictions on branching, CPU/GPU data transfer as a bottleneck) makes programming GPUs challenging. The OpenACC compiler standard, which aims to offload the burden of dealing with GPU particulars onto the compiler vendor, may offer a way to easily port OpenMP-based code written for CPUs to GPUs, and Pushan showed some worked examples of Fortran 90 OpenMP code adapted for OpenACC.

After lunch, I had to retire to my room for a little (let me hasten to add that the truly excellent lunch provided by the extremely hospitable TIFR is definitely absolutely blameless in this), and thus missed the afternoon's first two talks, catching only the end of Jyotirmoy Maiti's talk about exploring the spectrum of the pure SU(3) gauge theory using the Wilson flow.

Gunnar Bali closed the day's proceedings with a very nice colloquium talk for a larger scientific audience, summarising the Standard Model and lattice QCD in an accessible manner for non-experts before proceeding to present recent results on the sea quark content and spin structure of the proton.