## Approximating Integrals in High Dimensions — An Informal Report

**1. Introduction**

For half a century or more many researchers have grappled with the problem of numerical integration in many dimensions — on the one hand to understand what is possible, on the other to construct and analyse effective algorithms. This is an informal and rather personal account of one part of that effort.

For simplicity, we consider only the problem of integration over the unit cube $[0,1]^s$ in $s$ dimensions.

Thus the task is to approximate the integral

$$I_s(f) = \int_{[0,1]^s} f(\mathbf{x}) \,\mathrm{d}\mathbf{x}, \qquad (1)$$

where $s$ is an integer greater than 1, and $f$ is a given continuous function (or a function from a given class), hopefully having also some additional smoothness property. The interesting case is when $s$ is large, say in the hundreds or even thousands.

The simplest kind of $s$-dimensional integration rule is just a product of 1-dimensional rules, obtained by using a favourite quadrature rule (say an $m$-point Gauss rule) in each dimension. But when $s$ is large no such product rule is feasible, since the total number of points for the $s$-dimensional rule is then $m^s$, an impossibly large number for large $s$: if $s$ is 100 then even a product of 2-point Gauss rules would need $2^{100} \approx 10^{30}$ function evaluations, a number unlikely ever to be feasible on a conventional computer.
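The arithmetic behind that claim is easy to check; the sketch below (with a hypothetical helper name, not from the text) simply counts the points of a product rule.

```python
def product_rule_cost(m, s):
    """Total function evaluations for a product rule built from an
    m-point one-dimensional rule applied in each of s dimensions."""
    return m ** s

# Even m = 2 is hopeless in 100 dimensions: 2**100 is about 1.27e30.
cost = product_rule_cost(2, 100)
```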

**2. Quasi-Monte Carlo Rules**

In this short report we consider only $n$-point integration rules of the form

$$Q_{n,s}(f) = \frac{1}{n} \sum_{k=1}^{n} f(\mathbf{t}_k), \qquad (2)$$

that use prescribed points $\mathbf{t}_1, \ldots, \mathbf{t}_n \in [0,1]^s$. A rule of this form is called a Quasi-Monte Carlo (or QMC) rule. In contrast, if the points $\mathbf{t}_k$ are independent random samples from a uniform distribution on $[0,1]^s$, then we have the simplest Monte Carlo rule. Monte Carlo methods have many virtues, the most prominent being that smoothness of the integrand is not needed: all that is needed is that $f$ belong to $L_2([0,1]^s)$. But the Monte Carlo method suffers from slow convergence: the root mean square error (in the sense of expectation) has the celebrated convergence rate of $O(n^{-1/2})$. Our aim is to do better than that.
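As a concrete baseline, here is a minimal plain Monte Carlo sketch (function and parameter names are illustrative, not from the text): it averages $f$ over $n$ independent uniform points, with RMS error decaying only like $O(n^{-1/2})$.

```python
import numpy as np

def monte_carlo(f, s, n, seed=None):
    """Plain Monte Carlo: average f over n iid uniform points in [0,1]^s."""
    rng = np.random.default_rng(seed)
    pts = rng.random((n, s))          # shape (n, s), uniform on [0,1)^s
    return float(np.mean(f(pts)))

# Illustrative integrand f(x) = x_1 * ... * x_8; its exact integral
# over [0,1]^8 is (1/2)^8.
estimate = monte_carlo(lambda x: np.prod(x, axis=1), s=8, n=4096, seed=0)
```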

The central question for the QMC method is how to choose the points $\mathbf{t}_1, \ldots, \mathbf{t}_n$. That is a big subject, explored in a recent survey [1]. But here the view is personal and selective, so we can be brief.

There are now two main strategies for choosing QMC points, which we may call the *low discrepancy *and *lattice *approaches. Here the focus is on the second.

**3. Lattice Rules**

In the simplest form of lattice rule, the only kind considered here, the points are given by the simple formula

$$\mathbf{t}_k = \left\{ \frac{k \mathbf{z}}{n} \right\}, \qquad k = 1, \ldots, n, \qquad (3)$$

where $\mathbf{z}$, known as the *generating vector*, is an integer vector of length $s$ having no factor in common with $n$, and the braces around a vector indicate that we take the fractional part of each component in the vector.
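In code, formula (3) is a one-liner; the sketch below (illustrative names) generates the $n$ lattice points for a given generating vector.

```python
import numpy as np

def lattice_points(z, n):
    """Rank-1 lattice points t_k = {k z / n} for k = 1, ..., n, as in (3).
    Reducing k z mod n before dividing by n gives the fractional parts exactly."""
    z = np.asarray(z, dtype=np.int64)
    k = np.arange(1, n + 1)[:, None]         # column of indices k
    return (k * z[None, :] % n) / n          # each entry lies in [0, 1)

pts = lattice_points([1, 3], 8)              # 8 points in 2 dimensions
```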

Lattice methods began life in the 1950s in the hands of number theorists, who obtained many beautiful results, especially for functions that are 1-periodic with respect to each component of $\mathbf{x}$. For surveys of the classical results see [4] and [5]. Those results, based on Fourier analysis, are of limited practical use, because high-dimensional integrals arising from applications are usually not naturally periodic; and forcing periodicity through a change of variables has a tendency to turn easy high-dimensional problems into difficult ones. In the developments reported here we make no periodicity assumption.

The point set (3) depends entirely on the choice of the integer vector $\mathbf{z}$. How should $\mathbf{z}$ be chosen? Clearly there is no sense in allowing a component of $\mathbf{z}$ to lie outside the range $1, \ldots, n-1$. Within these constraints, how are we to find a good choice for $\mathbf{z}$? Up to the present time no closed formula for a good generating vector $\mathbf{z}$ (whatever “good” might mean) is known, beyond the case $s = 2$. And intuition seems of little use in high dimensions. To find a good $\mathbf{z}$ for a given $n$ and $s$ we need some mathematical structure.

**4. Worst Case Error, and a Norm with Weights**

We assume that our functions $f$ all belong to some Hilbert space $H$, and then choose a generating vector $\mathbf{z}$ to minimise (or at least make small) the *worst case error* in this function space. The worst case error in the space $H$ of a QMC rule using the point set $\mathbf{t}_1, \ldots, \mathbf{t}_n$ is the largest error for $f$ in the unit ball of $H$, that is

$$e_n^{\mathrm{wor}}(\mathbf{t}_1, \ldots, \mathbf{t}_n) := \sup_{\|f\|_H \le 1} \left| I_s(f) - Q_{n,s}(f) \right|.$$

So now we have again changed the problem, this time to one of choosing a good function space $H$. It turns out that a good choice is to take $H$ to consist of $s$-variate absolutely continuous functions whose *mixed first derivatives are square-integrable*. More precisely, we take

$$H = \left\{ f : \frac{\partial^{|u|} f}{\partial \mathbf{x}_u}(\mathbf{x}_u; \mathbf{0}) \in L_2([0,1]^{|u|}) \ \text{for all } u \subseteq \{1:s\} \right\},$$

where $\{1:s\}$ is shorthand for $\{1, 2, \ldots, s\}$, while $\mathbf{x}_u$ denotes the components $x_j$ of $\mathbf{x}$ with $j \in u$, and $(\mathbf{x}_u; \mathbf{0})$ denotes $\mathbf{x}$ with all components $x_j$ with $j \notin u$ set equal to 0, which we refer to as the *anchor*.

But how exactly should we define the norm in $H$? It turns out to be a fruitful idea to introduce positive numbers called *weights* into the norm: in the most general case we introduce one weight $\gamma_u$ for each subset $u \subseteq \{1:s\}$ (with $\gamma_\emptyset = 1$), taking the norm of $f$ to be

$$\|f\| = \left[ \sum_{u \subseteq \{1:s\}} \frac{1}{\gamma_u} \int_{[0,1]^{|u|}} \left( \frac{\partial^{|u|} f}{\partial \mathbf{x}_u}(\mathbf{x}_u; \mathbf{0}) \right)^2 \mathrm{d}\mathbf{x}_u \right]^{1/2}.$$

Note that for $f$ to be in the unit ball of $H$, any term in the sum over $u$ with a small weight $\gamma_u$ must also have a small mixed first derivative $\partial^{|u|} f / \partial \mathbf{x}_u$. Thus the weight parameters describe the relative importance of different subsets of the variables: small weights correspond to unimportant subsets.

The space $H$ is of course a Hilbert space, with the obvious inner product, which we denote by $\langle \cdot, \cdot \rangle$. Less obviously, $H$ is a *reproducing kernel Hilbert space* with the relatively simple kernel

$$K(\mathbf{x}, \mathbf{y}) = \sum_{u \subseteq \{1:s\}} \gamma_u \prod_{j \in u} \min(x_j, y_j).$$

That is to say, $K$ is a symmetric function on $[0,1]^s \times [0,1]^s$ with the property that $K(\cdot, \mathbf{y}) \in H$ for all $\mathbf{y} \in [0,1]^s$, *and* the reproducing property (easily verified for $s = 1$)

$$f(\mathbf{y}) = \langle f, K(\cdot, \mathbf{y}) \rangle \quad \text{for all } f \in H \text{ and } \mathbf{y} \in [0,1]^s.$$

Why is the reproducing kernel property helpful? It is because in a reproducing kernel Hilbert space with kernel $K$ the worst case error is computable, via the formula

$$\left[ e_n^{\mathrm{wor}}(\mathbf{t}_1, \ldots, \mathbf{t}_n) \right]^2 = \int_{[0,1]^s} \int_{[0,1]^s} K(\mathbf{x}, \mathbf{y}) \,\mathrm{d}\mathbf{x} \,\mathrm{d}\mathbf{y} - \frac{2}{n} \sum_{k=1}^{n} \int_{[0,1]^s} K(\mathbf{t}_k, \mathbf{y}) \,\mathrm{d}\mathbf{y} + \frac{1}{n^2} \sum_{k=1}^{n} \sum_{k'=1}^{n} K(\mathbf{t}_k, \mathbf{t}_{k'}).$$

This can easily be proved by simple Hilbert space arguments together with the reproducing property of the kernel.

But is the worst case error really computable for our function space $H$, given that the sum over $u$ in our expression for $K$ contains $2^s$ terms? Clearly computations with general weights are not feasible when $s$ is large, but they become so for some special cases. The original weights introduced in [6] were of the *product* form

$$\gamma_u = \prod_{j \in u} \gamma_j,$$

in which case it is easily seen that $K(\mathbf{x}, \mathbf{y})$ can be expressed as the easily evaluated product

$$K(\mathbf{x}, \mathbf{y}) = \prod_{j=1}^{s} \left[ 1 + \gamma_j \min(x_j, y_j) \right].$$
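For product weights the worst case error formula of the previous section can then be evaluated directly, since each of its three terms factorises over the coordinates. A minimal sketch (illustrative names; it uses $\int_0^1 \min(t,y)\,\mathrm{d}y = t - t^2/2$ and $\int_0^1\int_0^1 \min(x,y)\,\mathrm{d}x\,\mathrm{d}y = 1/3$):

```python
import numpy as np

def wce_squared(points, gamma):
    """Squared worst case error of a QMC rule in the anchored weighted space
    with product weights, using K(x, y) = prod_j [1 + gamma_j min(x_j, y_j)]."""
    pts = np.asarray(points, dtype=float)    # shape (n, s)
    g = np.asarray(gamma, dtype=float)       # per-coordinate weights, shape (s,)
    # double integral of K over the cube: prod_j (1 + gamma_j / 3)
    term1 = np.prod(1.0 + g / 3.0)
    # (1/n) sum_k int K(t_k, y) dy, with int_0^1 min(t, y) dy = t - t^2/2
    term2 = np.prod(1.0 + g * (pts - pts ** 2 / 2.0), axis=1).mean()
    # (1/n^2) sum over pairs of K(t_k, t_k')
    mins = np.minimum(pts[:, None, :], pts[None, :, :])
    term3 = np.prod(1.0 + g * mins, axis=2).mean()
    return term1 - 2.0 * term2 + term3
```

A handy sanity check: for a single point $t = 1/2$ in one dimension with $\gamma_1 = 1$ the value works out by hand to $1/3 - t + t^2 = 1/12$.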

Many results take an especially simple form in the product case. In particular, the necessary and sufficient condition established in [6] for the worst case error to be bounded independently of $s$ is just

$$\sum_{j=1}^{\infty} \gamma_j < \infty.$$

This condition fails spectacularly in the classical case in which $\gamma_j = 1$ for all $j$, but is satisfied, for example, if

$$\gamma_j = \frac{1}{j^2}.$$

Nevertheless, it has to be admitted that product weights, in spite of their popularity, were introduced for mathematical convenience, rather than being guided by application. In this report general weights have been retained, because certain non-product weights of so-called *product and order dependent* or POD form have arisen recently in applications, see [2]. The POD weights have the form

$$\gamma_u = \Gamma_{|u|} \prod_{j \in u} \gamma_j,$$

where $\Gamma_1, \Gamma_2, \ldots$ and $\gamma_1, \gamma_2, \ldots$ are given positive numbers. As with product weights, computations with POD weights turn out to be feasible (see below).

**5. A Little Randomisation**

It turns out to be useful, for reasons of both theory and practice, to modify slightly the QMC algorithm by introducing some randomisation: specifically, we replace the lattice rule given by (2) and (3) by the *randomly shifted lattice rule*

$$Q_{n,s}^{\Delta}(f) = \frac{1}{n} \sum_{k=1}^{n} f\left( \left\{ \frac{k \mathbf{z}}{n} + \boldsymbol{\Delta} \right\} \right),$$

where $\boldsymbol{\Delta}$ is a random vector of length *s* drawn from a uniform distribution on $[0,1]^s$. This has some flavour of the Monte Carlo method: indeed it reduces to the Monte Carlo method if $n = 1$. In practice the randomly shifted lattice rule is implemented, once $\mathbf{z}$ is chosen, by picking some fixed number (say 10 or 30) of random (or rather pseudo-random) shifts and averaging the cubature results for the separate shifts to obtain an estimate of the integral, while using the spread of the results to estimate the error, just as with the Monte Carlo method. The worst case error is now replaced by the root mean square expected value of the error, known as the *shift-averaged worst case error*,

$$e_{n,s}^{\mathrm{sh}}(\mathbf{z}) := \sqrt{\ \mathbb{E}\left[ e_n^{\mathrm{wor}}\big( \{\mathbf{t}_1 + \boldsymbol{\Delta}\}, \ldots, \{\mathbf{t}_n + \boldsymbol{\Delta}\} \big)^2 \right] \ },$$

with the expectation taken over the uniformly distributed shift $\boldsymbol{\Delta}$.
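In practice this is only a few lines of code. The sketch below uses illustrative names; the Fibonacci-type choice $n = 89$, $\mathbf{z} = (1, 55)$ is merely a convenient 2-dimensional example, not a recommendation.

```python
import numpy as np

def shifted_lattice(f, z, n, num_shifts=10, seed=None):
    """Randomly shifted lattice rule: average the rule over several
    independent uniform shifts; the spread yields an error estimate."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=np.int64)
    k = np.arange(1, n + 1)[:, None]
    base = (k * z[None, :] % n) / n            # unshifted points {k z / n}
    results = np.array([np.mean(f((base + rng.random(len(z))) % 1.0))
                        for _ in range(num_shifts)])
    estimate = results.mean()
    std_error = results.std(ddof=1) / np.sqrt(num_shifts)
    return estimate, std_error

# Integrand f(x) = x_1 x_2, with exact integral 1/4 over the unit square.
est, err = shifted_lattice(lambda x: x[:, 0] * x[:, 1], z=[1, 55], n=89, seed=42)
```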

For the case of our particular Hilbert space it can be computed by the formula (see [1, Lemma 5.7])

$$\left[ e_{n,s}^{\mathrm{sh}}(\mathbf{z}) \right]^2 = \sum_{\emptyset \ne u \subseteq \{1:s\}} \gamma_u \left[ \frac{1}{n} \sum_{k=1}^{n} \prod_{j \in u} \left( B_2\left( \left\{ \frac{k z_j}{n} \right\} \right) + \frac{1}{3} \right) - \left( \frac{1}{3} \right)^{|u|} \right],$$

where $B_2(x) = x^2 - x + \frac{1}{6}$ is the Bernoulli polynomial of degree 2. For the case of product weights the expression reduces to the easily computable

$$\left[ e_{n,s}^{\mathrm{sh}}(\mathbf{z}) \right]^2 = \frac{1}{n} \sum_{k=1}^{n} \prod_{j=1}^{s} \left[ 1 + \gamma_j \left( B_2\left( \left\{ \frac{k z_j}{n} \right\} \right) + \frac{1}{3} \right) \right] - \prod_{j=1}^{s} \left( 1 + \frac{\gamma_j}{3} \right).$$
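The product-weight expression costs only $O(ns)$ operations for an $n$-point rule in $s$ dimensions. A sketch of it, under the formula as reconstructed above (names illustrative):

```python
import numpy as np

def shift_avg_wce_sq(z, n, gamma):
    """Squared shift-averaged worst case error of the lattice rule with
    generating vector z, for product weights gamma_j."""
    z = np.asarray(z, dtype=np.int64)
    g = np.asarray(gamma, dtype=float)
    k = np.arange(1, n + 1)[:, None]
    x = (k * z[None, :] % n) / n          # the fractional parts {k z_j / n}
    b2 = x ** 2 - x + 1.0 / 6.0           # Bernoulli polynomial B_2
    return (np.prod(1.0 + g * (b2 + 1.0 / 3.0), axis=1).mean()
            - np.prod(1.0 + g / 3.0))
```

A quick consistency check: for $n = 1$ every point collapses to the origin, $B_2(0) = 1/6$, and the value reduces to $\prod_j (1 + \gamma_j/2) - \prod_j (1 + \gamma_j/3)$.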

**6. Good Lattice Rules Exist!**

As explained below, we now know the following:

For every $n$ there exists $\mathbf{z}$ such that, for all $\lambda \in (\tfrac{1}{2}, 1]$,

$$e_{n,s}^{\mathrm{sh}}(\mathbf{z}) \le \left( \frac{1}{\varphi(n)} \sum_{\emptyset \ne u \subseteq \{1:s\}} \gamma_u^{\lambda} \left( \frac{2 \zeta(2\lambda)}{(2\pi^2)^{\lambda}} + \frac{1}{3^{\lambda}} \right)^{|u|} \right)^{1/(2\lambda)}. \qquad (4)$$

Here $\zeta$ is the Riemann zeta function, and $\varphi(n)$ is the Euler totient function, that is, it is the number of integers between 1 and $n$ that are relatively prime to $n$. If $n$ is prime then $\varphi(n) = n - 1$, and in that case the order of convergence is $O(n^{-1/(2\lambda)})$. We would set $\lambda = \tfrac{1}{2}$ if that were possible, because then the order of convergence would be $O(n^{-1})$, but that is not possible because $\zeta(2\lambda)$ diverges as $2\lambda$ approaches 1 from above. But the result shows (since $\lambda$ can be chosen arbitrarily close to $\tfrac{1}{2}$) that for every $s$ the rate of convergence will ultimately be arbitrarily close to $O(n^{-1})$. Can good rates of convergence be obtained independently of dimension? The answer depends on the weights $\gamma_u$: the answer is yes if the weights decay sufficiently quickly.
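For product weights the $2^s$-term sum in (4) collapses to a product, $\sum_{\emptyset \ne u} \gamma_u^{\lambda} c^{|u|} = \prod_j (1 + \gamma_j^{\lambda} c) - 1$, so the bound itself is cheap to evaluate. A self-contained sketch (all names illustrative; $\zeta$ is approximated by direct summation with an integral tail correction):

```python
import math

def zeta_approx(sigma, terms=100_000):
    """Riemann zeta(sigma) for sigma > 1: partial sum plus integral tail."""
    return (sum(m ** -sigma for m in range(1, terms + 1))
            + terms ** (1.0 - sigma) / (sigma - 1.0))

def euler_phi(n):
    """Euler totient: how many 1 <= k <= n are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if math.gcd(k, n) == 1)

def bound_rhs(n, gamma, lam):
    """Right-hand side of (4) for product weights gamma_u = prod_{j in u} gamma_j."""
    c = 2.0 * zeta_approx(2.0 * lam) / (2.0 * math.pi ** 2) ** lam + 3.0 ** -lam
    prod = 1.0
    for g in gamma:
        prod *= 1.0 + g ** lam * c
    return ((prod - 1.0) / euler_phi(n)) ** (1.0 / (2.0 * lam))
```

At $\lambda = 1$ the constant is $c = \zeta(2)/\pi^2 + 1/3 = 1/2$, so for $s = 1$, $\gamma_1 = 1$ and $n = 3$ the bound is $(0.5/2)^{1/2} = 0.5$, a hand-checkable value.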

**7. Good Lattice Rules are (sometimes) Constructible!**

Lattice rules that achieve the bound (4) are in principle constructible — indeed, the proof of (4) (see [1, Theorem 5.9]) is by an inductive argument based on an explicit construction. That construction is the so-called *component by component* (or CBC) construction.

The idea of the CBC construction is that the components of the generating vector are determined one after the other, starting with $z_1 = 1$, with the successive components $z_d$ for $d = 2, \ldots, s$ being minimisers of a quantity related to the shift-averaged worst case error for the $d$-dimensional problem. For our particular choice of norm, the quantity to be minimised in the case of general weights is not exactly the shift-averaged worst case error, see [1, Theorem 5.9], but if we change slightly the definition of the norm, to

$$\|f\| = \left[ \sum_{u \subseteq \{1:s\}} \frac{1}{\gamma_u} \int_{[0,1]^{|u|}} \left( \int_{[0,1]^{s-|u|}} \frac{\partial^{|u|} f}{\partial \mathbf{x}_u}(\mathbf{x}) \,\mathrm{d}\mathbf{x}_{-u} \right)^2 \mathrm{d}\mathbf{x}_u \right]^{1/2},$$

then the CBC algorithm precisely minimises the shift-averaged worst case error. Here $\mathbf{x}_{-u}$ denotes the components $x_j$ of $\mathbf{x}$ with $j$ in $\{1:s\}$ but *not* in $u$, so that now the components of $\mathbf{x}$ that do not appear in the mixed derivative are integrated out instead of being anchored at 0. The formula for the shift-averaged worst case error is the same as before, except that the terms $\tfrac{1}{3}$ are now replaced by zero. The bound achieved by the CBC construction (proved inductively in [1, Theorem 5.8]) is exactly (4), except that again the term $\tfrac{1}{3^{\lambda}}$ is replaced by zero. Finally, fast CBC construction is possible for both product weights and POD weights, see respectively [1, Section 5.5] and [1, Section 5.6].
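As a concrete illustration, here is a naive $O(s n^2)$ CBC sketch for product weights under the unanchored criterion (the $\tfrac{1}{3}$ terms dropped), with illustrative names; the fast CBC construction referred to above reduces the cost dramatically via FFT techniques.

```python
import math
import numpy as np

def cbc(n, s, gamma):
    """Component-by-component construction for product weights gamma_j,
    minimising (1/n) sum_k prod_j [1 + gamma_j B_2({k z_j / n})] - 1
    over one component z_d at a time (unanchored criterion)."""
    def b2(x):
        return x * x - x + 1.0 / 6.0                # Bernoulli polynomial B_2
    k = np.arange(1, n + 1)
    z = [1]                                         # the first component is 1
    prod = 1.0 + gamma[0] * b2((k * z[0] % n) / n)  # running product over j <= d
    for d in range(1, s):
        candidates = [c for c in range(1, n) if math.gcd(c, n) == 1]
        errs = [np.mean(prod * (1.0 + gamma[d] * b2((k * c % n) / n)))
                for c in candidates]
        z.append(candidates[int(np.argmin(errs))])
        prod = prod * (1.0 + gamma[d] * b2((k * z[-1] % n) / n))
    return z, float(np.mean(prod) - 1.0)            # vector and squared error

z_vec, e2 = cbc(n=97, s=5, gamma=[1.0, 0.5, 0.25, 0.125, 0.0625])
```

Each step reuses the running product over the components already fixed, which is what makes the component-by-component order cheap compared with searching over all of $\{1, \ldots, n-1\}^s$ at once.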

**8. Conclusion**

We have seen that lattice rules for high dimensional integration are at an interesting stage of development: for given dimensionality $s$ and given weights $\gamma_u$ we can in principle construct a generating vector $\mathbf{z}$ for which the order of convergence for $f$ in the unit ball of $H$ is arbitrarily close to $O(n^{-1})$, with an implied constant that is known explicitly for a given convergence rate, and that is independent of the dimensionality if the weights are small enough. The construction is feasible for both product weights and POD weights.

This informal report is incomplete in many ways. Not reported on are higher order QMC methods, which aim to achieve a rate of convergence faster than $O(n^{-1})$. Also not reported is infinite-dimensional integration, a topic now showing high levels of activity. We have not done justice to the important work on fast implementation of CBC, initiated by Dirk Nuyens and Ronald Cools. We have not discussed how the weights should be chosen for a given practical problem. We have not considered lattice rules that are extensible in number of points or dimension. We have not considered integration over unbounded regions. All these matters are considered in [1].

Finally, many people have contributed to the recent developments outlined here, too many to mention them all, but especially I want to acknowledge Henryk Woźniakowski, Stephen Joe, Frances Kuo and Josef Dick. I am grateful to these and all others for their contributions, and to the Australian Research Council for its sustained support.

**References**

[1] J. Dick, F. Y. Kuo and I. H. Sloan, High-dimensional integration: the quasi-Monte Carlo way, *Acta Numerica*, 22 (2013) 133–288.

[2] F. Y. Kuo, C. Schwab and I. H. Sloan, Quasi-Monte Carlo finite element methods for a class of elliptic partial differential equations with random coefficients, *SIAM J. Numerical Analysis*, 50 (2012) 3351–3374.

[3] F. Y. Kuo and I. H. Sloan, Lifting the curse of dimensionality, *Notices of the AMS*, 52 (2005) 1320–1328.

[4] H. Niederreiter, *Random Number Generation and Quasi-Monte Carlo Methods* (SIAM, Philadelphia, 1992).

[5] I. H. Sloan and S. Joe, *Lattice Methods for Multiple Integration *(Oxford University Press, 1994).

[6] I. H. Sloan and H. Woźniakowski, When are quasi-Monte Carlo algorithms efficient for high-dimensional integrals? *J. Complexity*, 14 (1998) 1–33.

University of New South Wales, Australia

Ian Sloan completed physics and mathematics degrees at Melbourne University, a Master’s degree in mathematical physics at Adelaide, and a PhD in theoretical atomic physics (under the supervision of HSW Massey) at the University of London, finishing in 1964. After a decade of research on few-body collision problems in nuclear physics, his main research interests shifted to computational mathematics.

He was Head of the School of Mathematics at the University of New South Wales and a member of the ARC's Research Grants Committee, and is a former President of the Australian Mathematical Society. His awards and recognitions include Fellowship of the Australian Academy of Science and the ANZIAM Medal. In 2008 he was appointed an Officer of the Order of Australia (AO). He is currently Deputy Director of MASCOS, the ARC Centre of Excellence for Mathematics and Statistics of Complex Systems.

*[ This article originally appeared in the Asia Pacific Mathematics Newsletter, Volume 3 No. 3 (July 2013), published by World Scientific. It has been republished here with a special permission from World Scientific.]*

