How to add primes Jan Vonk
Contents

1 Exponential sums
1.1 Motivating exponential sums
1.1.1 A historic example
1.1.2 A motivation for exponential sums
1.2 Möbius randomness
1.2.1 Discussion and consequences
1.2.2 A note on generalisation
1.3 A strategy outline
1.4 Sums involving $\Lambda$
1.4.1 An estimate for sums of $\Lambda^\infty$
1.4.2 An estimate for a double sum with $\Lambda^0$
1.4.3 A consequence of both estimates
2 The Goldbach problem
2.1 The binary Goldbach problem
2.2 The ternary Goldbach problem
2.3 Conclusion
A The toolbox
B The Möbius function $\mu$
B.1 A proof of Davenport's result
B.1.1 The case $x/\tau < q \le \tau$
B.1.2 The case $q \le x/\tau$
B.2 Discussion of Vinogradov's method
B.3 Discussion of importance
Introduction

When one first starts thinking about additive properties of primes, one is immediately confronted with the fundamental difficulty underlying the statements. Prime numbers are in a certain sense meant for multiplying, not for adding. Being defined as the fundamental numbers for multiplication, they seem hard to characterise when we start adding them. The first challenge is therefore to find a good way of adding primes.

Part 1. We start by motivating an interest in exponential sums and outlining our general strategy using Fourier transforms. With this motivation, we investigate the fundamental Möbius function and discuss an exponential sum involving this function and its consequences. A general heuristic is stated. We then proceed to sums involving another fundamental function: the von Mangoldt function. We derive results on sums involving this function, using our results on the Möbius function. At the end of the first chapter, we will have the machinery needed in the next part.

Part 2. This part is about Goldbach's problem. Based on a conjecture by Goldbach in a letter to Euler, we suspect that every even integer greater than 2 can be written as the sum of two primes (binary Goldbach problem) and every odd integer greater than 5 can be written as the sum of three primes (ternary Goldbach problem). These questions have been settled to some extent. We will prove the following celebrated theorem:

Theorem. (Vinogradov, 1937) For any fixed $A > 0$,
\[ \sum_{k_1+k_2+k_3=n} \Lambda(k_1)\Lambda(k_2)\Lambda(k_3) = \frac{1}{2}\,\mathfrak{S}_3(n)\,n^2 + O\!\left(n^2(\log n)^{-A}\right), \]
where the implied constant depends only on $A$, and
\[ \mathfrak{S}_3(n) = \prod_{p \mid n}\left(1 - \frac{1}{(p-1)^2}\right)\prod_{p \nmid n}\left(1 + \frac{1}{(p-1)^3}\right). \]

This theorem implies the asymptotic result for the ternary Goldbach conjecture. A similar theorem for sums of two primes remains unproven to date. The main conjecture is

Conjecture 1.
For any even $n \ge 4$ and fixed $A > 0$,
\[ \sum_{k_1+k_2=n} \Lambda(k_1)\Lambda(k_2) = \mathfrak{S}_2(n)\,n + O\!\left(n(\log n)^{-A}\right), \]
where the implied constant depends only on $A$, and
\[ \mathfrak{S}_2(n) = \prod_{p \mid n}\left(1 + \frac{1}{p-1}\right)\prod_{p \nmid n}\left(1 - \frac{1}{(p-1)^2}\right). \]

This would indeed imply the asymptotic result for the binary Goldbach problem. We will prove this conjecture for almost all even numbers, hence 'almost' establishing the conjectured asymptotic result. A full proof is still unknown.

A note on the implied constant. In analytic number theory, one deals almost exclusively with asymptotic results. This means certain results hold 'from some number onwards'. Depending on the method of proof, it might be possible (although it is certainly no trivial task) to determine this number exactly. For example, Vinogradov [9] showed, as discussed above, that every odd number from a certain number onwards can be written as the sum of three primes. Borodzkin [2] showed that the number can be taken to be $3^{3^{15}}$. This is still beyond all hope of checking the remaining cases.

An example. The reader who is unfamiliar with asymptotic results might find it useful to see an example. This example will be of use to us later. Imagine we want to count the number of solutions $(x, y) \in \mathbb{N}^2$ of $ax + by = n$, for given $a, b, n \in \mathbb{N}$ with $(a, b) = 1$. We could try to find explicit examples, and quickly find that for $(a, b, n) = (2, 3, 5)$ we have 1 solution, while $(a, b, n) = (1, 3, 7)$ gives us 3 solutions. One easily sees that it suffices to solve $ax \equiv n \pmod{b}$ for $0 \le x \le \frac{n}{a}$. Instead of trying to make this explicit, we say the number of solutions is roughly $\frac{n}{ab}$. Here 'roughly' means not deviating by more than a constant from $\frac{n}{ab}$ (which is not necessarily an integer). We restate this in the language of this essay as
\[ \sum_{ax+by=n} 1 = \frac{n}{ab} + O(1). \]
We use the definition of $O$ which can be found for example in [1], [4], or [6]. Notice the convenience of this approach.
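The counting example above is easy to check by brute force. The following is a minimal sketch, assuming that $\mathbb{N}$ includes 0 (this is the reading under which the two counts quoted in the text come out as 1 and 3); the function name `count_solutions` is ours and purely illustrative.

```python
from math import gcd


def count_solutions(a: int, b: int, n: int) -> int:
    """Count pairs (x, y) of non-negative integers with a*x + b*y == n."""
    # For each admissible x, y is determined; it is an integer
    # exactly when a*x is congruent to n modulo b.
    return sum(1 for x in range(n // a + 1) if (n - a * x) % b == 0)


if __name__ == "__main__":
    for a, b, n in [(2, 3, 5), (1, 3, 7), (5, 7, 1000)]:
        assert gcd(a, b) == 1
        # Print the exact count next to the 'rough' value n / (a*b).
        print((a, b, n), count_solutions(a, b, n), n / (a * b))
```

The exact count is the number of integers in one residue class modulo $b$ inside an interval of length $n/a$, which is why it can differ from $n/(ab)$ by at most 1.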
Any explicit formula is bound to become so complicated that it obscures the calculations. In the end the result will not be greatly affected by using the explicit formula rather than the 'roughly correct' answer. Furthermore, whether or not we allow one of $x, y$ to be zero is irrelevant, since it will only change the constant. We therefore choose to work with $O$-notation, making the arguments cleaner and giving us more computational comfort. The price we pay is the unknown implied constant, possibly taking on monstrous proportions as in our previous remark.

Acknowledgements. We mainly follow Iwaniec and Kowalski [8] in most proofs. The rest of the essay is a collection of ideas taken from various sources, notably Davenport [4], Apostol [1], and Green [6].

Part 1 Exponential sums

1.1 Motivating exponential sums

This section is mainly intended to illustrate the importance, and outline the general procedure, of using exponential sums to prove certain claims. We will be inspired by the theory of Fourier transformation on locally compact abelian groups. Our hope is to convince the reader that it is important, and maybe even natural, to try to obtain good results about a function $f$ by finding estimates for sums of the form $\sum_n f(n)e^{2\pi i n\alpha}$, for various values of $\alpha$.

1.1.1 A historic example

Quadratic reciprocity. Suppose we want to prove the quadratic reciprocity law, which states that for any distinct odd primes $p, q$ we have
\[ \left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}. \]
Since this theorem deals with Legendre symbols, we want to get to know them better. To do that, we wish to apply classical analysis and the multitude of tools available there. We are confronted with the fact that Legendre symbols are defined on finite groups of residue classes, and do not live in the realm of classical real or complex analysis. We want to make them continuous somehow, and do it in a symmetric way that does not favour $\left(\frac{2}{p}\right)$ above $\left(\frac{3}{p}\right)$, say.
We choose to examine (consider it an experiment at this point)
\[ G = \sum_{m=1}^{p-1} \left(\frac{m}{p}\right) e^{2\pi i m/p}. \]
This choice seems promising, since exponential functions behave nicely when analytically manipulated, and the periodicity of the Legendre symbol is reflected in the periodicity of the exponentials. It is called a Gauss sum, honouring the man who discovered a considerable number of its properties and applications.

Setting $\varepsilon = (-1)^{\frac{p-1}{2}}$, a simple computation shows that $G^2 = \varepsilon p$. Note that this does not uniquely determine $G$. Finding the sign of $G$ took Gauss several years, so the problem is undeniably hard. It turns out that
\[ G = \begin{cases} \sqrt{p} & \text{if } p \equiv 1 \pmod 4, \\ \sqrt{-p} = i\sqrt{p} & \text{if } p \equiv 3 \pmod 4. \end{cases} \]
Dirichlet proved this in more generality with analytic methods. In fact, he evaluated the more general sum
\[ S := \sum_{k=0}^{n-1} e^{2\pi i k^2/n}. \]
One can easily rewrite $G$ to see that it is nothing more than the special case $n = p$. Dirichlet then rewrote the definition of a Gauss sum using an identity commonly called Poisson's summation formula (see Appendix A). Applying this formula to $f(x) = \cos\left(\frac{2\pi x^2}{n}\right)$, and likewise to the corresponding sine, he obtains
\[ S = n\sum_{\nu=-\infty}^{+\infty} \int_0^1 e^{2\pi i n(t^2 + \nu t)}\,dt. \]
After rewriting this result with various substitutions and evaluations, he ended up with
\[ S = n\left(1 + i^{-n}\right)\int_{-\infty}^{+\infty} e^{2\pi i n t^2}\,dt, \]
reducing the problem to integration, a theory which is very well developed and presents us with few problems. Indeed, if one is familiar with the theory of Lebesgue integration, the above integral is an easy exercise. In fact, the simple substitution $t = \frac{u}{\sqrt{n}}$ reduces the integral to a rather famous one. In particular, one obtains the above result for $G$. For full details, see [4, Chapter 2].

Finally, we look at the behaviour of a prime $q$ in the extension $\mathbb{Q}(\zeta_p)/\mathbb{Q}$. We know that $K := \mathbb{Q}(G)$ is the unique quadratic subfield of $\mathbb{Q}(\zeta_p)$, since $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \cong (\mathbb{Z}/p\mathbb{Z})^\times$.
Now we look at the splitting of $q$ in two ways:

• By Kummer-Dedekind (note that $q \nmid [\mathcal{O}_K : \mathbb{Z}[G]]$ since $q \ne 2$), $q$ splits completely in $K/\mathbb{Q}$ if and only if $x^2 - \varepsilon p$ has a root mod $q$. Clearly this happens if and only if $(-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}\left(\frac{p}{q}\right) = 1$.

• Since $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/K)$ is the unique subgroup of index 2 in $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \cong (\mathbb{Z}/p\mathbb{Z})^\times$, it is exactly the subgroup of squares. Therefore $q$ splits in $K/\mathbb{Q}$ if and only if $\left(\frac{q}{p}\right) = 1$.

This proves the quadratic reciprocity law.

1.1.2 A motivation for exponential sums

Inspired by the success of this approach, we try to imitate it in a general setting. The reader might object that all the magic happens in the algebraic number theory, which will be hard to generalise to, say, the Möbius function. A valid concern, but it seems that the only effect of using deeper theorems is a shorter proof. The true core of the proof seems to lie in the determination of the sign of $G$, merely sketched here for brevity.

Let us try to find the ideas in the preceding proof that made everything work. We found a crucial piece of information by taking the function under consideration (the Legendre symbol) and associating to it a new number (the Gauss sum). The latter has the fortunate property of being suitable for the use of analytic tools (the Poisson summation formula). We then have to find a way of translating the information back to our original function (here from the Gauss sum to the Legendre symbol).

How do we associate a 'Gauss sum' to our 'Legendre symbol'? A method is suggested by the theory of Fourier transformation. Indeed, setting $\left(\frac{0}{p}\right) = 0$, the Legendre symbol is a function $f : \mathbb{Z}/p\mathbb{Z} \to \mathbb{C}$, and hence its Fourier transform is
\[ \hat{f}(r) = \sum_{m=0}^{p-1} \left(\frac{m}{p}\right) e^{2\pi i r m/p}, \]
where $\hat{f}(1) = G$. So we could try to generalise.

General strategy. Given an interesting, well-behaved arithmetic function $f : \mathbb{Z} \to \mathbb{C}$, we define $\hat{f} : \mathbb{R} \to \mathbb{C}$ as
\[ \hat{f}(\alpha) = \sum_{n \in \mathbb{Z}} f(n)e^{2\pi i n\alpha}. \]
This function is attacked using our extensive toolbox from analysis.
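The finite Fourier transform of the Legendre symbol, and with it the Gauss sum $G = \hat{f}(1)$ of Section 1.1.1, can be evaluated numerically for small primes. The sketch below is only a sanity check of the stated values $\sqrt{p}$ and $i\sqrt{p}$, not a proof; the helper names `legendre`, `fourier_transform` and `gauss_sum` are our own, and the Legendre symbol is computed with Euler's criterion.

```python
import cmath


def legendre(m: int, p: int) -> int:
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)  # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r


def fourier_transform(p: int, r: int) -> complex:
    """f^(r) = sum over m mod p of (m/p) * exp(2*pi*i*r*m/p)."""
    return sum(
        legendre(m, p) * cmath.exp(2j * cmath.pi * r * m / p) for m in range(p)
    )


def gauss_sum(p: int) -> complex:
    """The Gauss sum G, which is the Fourier transform at r = 1."""
    return fourier_transform(p, 1)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        g = gauss_sum(p)
        # Print p mod 4 next to G and G^2 to compare with the case distinction.
        print(p, p % 4, g, g * g)
```

For $r$ coprime to $p$ a change of variables gives $\hat{f}(r) = \left(\frac{r}{p}\right) G$, so in this finite setting the whole transform is determined by the single number $G$.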
As noted before, this theory is well developed and hence might enable us to find an important piece of information. Information obtained about $\hat{f}$ will transform back, using Fourier inversion, to information about $f$, our object of interest. In this fashion we will be able to substitute the less developed theory of the realm in which $f$ lives with a theory we understand better and have more experience in. These two worlds are connected by the theory of Fourier transformation.

Application to the Möbius function. This looks like a promising idea. It will however require some thought to make it work in concrete situations. Let us investigate one. The Möbius function $\mu$ seems, in comparison to the Legendre symbol, to live on a much more fundamental level. It is defined as
\[ \mu(n) = \begin{cases} (-1)^r & \text{if } n \text{ is the product of } r \text{ distinct primes,} \\ 0 & \text{otherwise.} \end{cases} \]
This function shows up in various places in number theory. Notably, it plays a central role in the theory of Dirichlet convolution, resulting in useful theorems such as Möbius inversion that allow us to restate the definition of many interesting arithmetic functions in terms of $\mu$. It has many other equivalent definitions, and it is remarkable how easily it can be defined in terms of divisibility. We therefore expect this function to be very fundamental and would like to know more about it.

Since we do not want to worry about convergence, we adapt our 'Fourier' attack and instead consider
\[ \sum_{n \le x} \mu(n)e^{2\pi i n\alpha}. \]
Notice that we have two very good reasons to assume that an analysis of this quantity will be considerably harder. Firstly, we have an unnatural cutoff $x$. Secondly, the Möbius function is much harder to grasp and, as remarked, more fundamental. It contains a considerable amount of information regarding divisibility of numbers, and will therefore not likely give away its secrets easily.
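To get a feeling for the truncated sum $\sum_{n \le x} \mu(n)e^{2\pi i n\alpha}$, one can tabulate $\mu$ with a sieve and evaluate the sum directly. The following is a rough numerical sketch under our own naming (`mobius_sieve`, `mobius_exp_sum`); it only illustrates the size of the sum for a few sample values of $\alpha$ and says nothing about the rate of cancellation in general.

```python
import cmath


def mobius_sieve(x: int) -> list[int]:
    """Return a list mu with mu[n] the Möbius function for 1 <= n <= x."""
    mu = [1] * (x + 1)
    mu[0] = 0  # index 0 is unused
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            # Flip the sign once for every prime factor p ...
            for m in range(p, x + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]
            # ... and kill everything divisible by p^2.
            for m in range(p * p, x + 1, p * p):
                mu[m] = 0
    return mu


def mobius_exp_sum(x: int, alpha: float) -> complex:
    """Evaluate sum_{n <= x} mu(n) * exp(2*pi*i*n*alpha) directly."""
    mu = mobius_sieve(x)
    return sum(
        mu[n] * cmath.exp(2j * cmath.pi * n * alpha) for n in range(1, x + 1)
    )


if __name__ == "__main__":
    x = 10_000
    for alpha in (0.0, 0.5, 1 / 3, 2 ** 0.5):
        # Compare the absolute value with the trivial bound x.
        print(alpha, abs(mobius_exp_sum(x, alpha)), x)
```

At $\alpha = 0$ the sum is the classical Mertens function $M(x) = \sum_{n \le x}\mu(n)$.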
These comments make us aware that it might not be possible to give an explicit evaluation of our sum for a given $x$ and $\alpha$, as we did before in the particular case of the Gauss sum. Instead, we aim to find good estimates of the sum. This is exactly our goal in the next section.

1.2 Möbius randomness

In this section we discuss the chosen exponential sum in the Möbius function $\mu$. The results stated here are roughly the prerequisites of this essay. Since the ideas are fundamental and form the true core of the results, their proofs are sketched in the appendices. The main theorem is

Theorem 1. For any real $\alpha$, any $A > 0$ and $x \ge 2$, we have
\[ \sum_{m \le x} \mu(m)e^{2\pi i \alpha m} = O\!\left(x(\log x)^{-A}\right). \]
The implied constant depends only on $A$.

The first proof was found by Davenport [3, Theorem 1], based on the technique used by Vinogradov [9] in his solution of the ternary Goldbach problem. However, we feel that it is more natural to turn things around when adopting the viewpoint of exponential sums as we do. We therefore give a proof of Vinogradov's theorem (see Chapter 2) in a more modern setting, where we base ourselves on Theorem 1. Since Vinogradov's ideas are fundamental, we decided to sketch the proof of Theorem 1 in Appendix B. Full details can be found in [3], and in [8] for a more modern treatment, which we will follow.

1.2.1 Discussion and consequences

In a way, this theorem can be considered the true core of many theorems in number theory. Examples of rather impressive (and by no means trivial) corollaries are the prime number theorem and Vinogradov's theorem, as we will see in Chapter 2. For the latter, we recast the theorem in the following essentially equivalent form that will be used later on.

Corollary 1. For any real $\alpha$, $A > 0$ and $x \ge 2$, we have
\[ \sum_{m \le x} \mu(m)\log(m)\,e^{2\pi i \alpha m} = O\!\left(x(\log x)^{-A}\right). \]
The implied constant depends only on $A$.

Remark. This is a typical example of an essentially equivalent reformulation that follows straightforwardly from partial summation.
The indeterminacy in the log-factor absorbs any extra such factors that arise, resulting in an unchanged error term. This applies in general to estimates with error term $O\!\left(x(\log x)^{-A}\right)$, making an insertion of extra log-factors in the sum possible. This error term is more common than one might expect. It arises for whole families of sums in a result of Green and Tao that we will mention at the end of this section.

Proof. Summation by parts (see Appendix A) gives us
\[ \sum_{m \le x} \mu(m)\log(m)\,e^{2\pi i \alpha m} = \log x \sum_{m \le x} \mu(m)e^{2\pi i \alpha m} - \int_1^x \left(\sum_{m \le y} \mu(m)e^{2\pi i \alpha m}\right)\frac{1}{y}\,dy. \]
This last integral is hard to calculate, but easy to estimate from above, keeping in mind that the logarithm is strictly increasing. The extra log-factor is absorbed in the error term we obtain from Theorem 1, giving the desired result.

We conclude this section by presenting the prime number theorem, which we all know very well, as a corollary of Theorem 1. Define first the von Mangoldt function as
\[ \Lambda(n) = \begin{cases} \log p & \text{if } n = p^m \text{ for some prime } p \text{ and some } m \ge 1, \\ 0 & \text{otherwise.} \end{cases} \]
This function can be related to $\mu$, since $\sum_{d \mid n} \Lambda(d) = \log n$ implies by Möbius inversion that
\[ \Lambda(n) = \sum_{d \mid n} \mu(d)\log\frac{n}{d} = -\sum_{d \mid n} \mu(d)\log d. \]

Remark. This function will play a crucial role in our attack on the Goldbach problem. For now it turns out to be a convenient counter for the abundance of primes. In the next section, we will emphasise how it incorporates the multiplicative information of primes in an additive way.

Corollary 2. Let $\psi(x) = \sum_{n \le x} \Lambda(n)$. Then for any $A > 0$ we have
\[ \psi(x) = x + O\!\left(x(\log x)^{-A}\right). \]

A proof outline based on Theorem 1 can be found in Appendix B. By estimating our exponential sum in $\mu$, we obtain crucial information which is translated to the above result. We are encouraged to try our luck again with some function different from $\mu$. When we were confronted with quadratic reciprocity, we identified the Legendre symbol as the fundamental function and investigated the associated Gauss sum.
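The definitions of $\Lambda$ and $\psi$ above are easy to tabulate, which gives a quick check of the identity $\sum_{d \mid n}\Lambda(d) = \log n$ and a first impression of Corollary 2. This is a minimal sketch with illustrative names (`von_mangoldt_table`, `psi`, `divisor_identity_gap`) that are not taken from the essay.

```python
from math import log


def von_mangoldt_table(x: int) -> list[float]:
    """lam[n] = log p if n = p^m with p prime and m >= 1, else 0 (0 <= n <= x)."""
    lam = [0.0] * (x + 1)
    is_composite = [False] * (x + 1)
    for p in range(2, x + 1):
        if not is_composite[p]:
            for m in range(2 * p, x + 1, p):
                is_composite[m] = True
            q = p
            while q <= x:  # mark p, p^2, p^3, ...
                lam[q] = log(p)
                q *= p
    return lam


def psi(x: int) -> float:
    """Chebyshev's function psi(x) = sum_{n <= x} Lambda(n)."""
    return sum(von_mangoldt_table(x))


def divisor_identity_gap(n: int, lam: list[float]) -> float:
    """Return |sum_{d | n} Lambda(d) - log n|, which should be ~0."""
    total = sum(lam[d] for d in range(1, n + 1) if n % d == 0)
    return abs(total - log(n))


if __name__ == "__main__":
    lam = von_mangoldt_table(200)
    print(max(divisor_identity_gap(n, lam) for n in range(1, 201)))
    for x in (100, 1_000, 10_000):
        # Print psi(x) / x for a few values of x.
        print(x, psi(x) / x)
```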
Now our ultimate goal is to prove the Goldbach conjecture. In the next section, we identify the fundamental function $\Lambda$ and investigate an exponential sum which will be the analogue of the Gauss sum.

Remark. We recall that there are various stronger forms of the prime number theorem whose proofs generally depend on the location of zeroes of $L$-functions. However, elementary proofs are sometimes possible. We will state a stronger version which will be used later on. Note that even stronger results have been obtained, but we choose to state this one since it is sufficient for our later estimates. We define for any $(a, q) = 1$ the function
\[ \psi(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \ (\mathrm{mod}\ q)}} \Lambda(n). \]

Corollary 3 (Prime number theorem). For any fixed $A > 0$ we have, with $(a, q) = 1$, that
\[ \psi(x; q, a) = \frac{x}{\varphi(q)} + O\!\left(x(\log x)^{-A}\right). \]

Remark. This form of the prime number theorem can be interpreted as the statement that the prime numbers are uniformly distributed over the invertible residue classes modulo $q$. Of course there is at most one prime in each of the other residue classes, since $(a, q)$ is a divisor of any number in such a class.

1.2.2 A note on generalisation

This is the place to discuss a general heuristic. As it turns out, the Möbius function $\mu$ behaves randomly in the following way: if $a_n$ is any 'reasonable' sequence, then the sum $\sum_{n \le x} \mu(n)a_n$ is very 'small' due to 'cancellation' of terms by the flipping sign of $\mu$. This is a good heuristic to keep in mind when estimating sums involving the Möbius function, and indeed we will let it guide us frequently throughout this essay.

Of course we need to find a rigorous result every time we consider a concrete sequence $a_n$. One method we could adopt is the so-called Vinogradov method. We will discuss the method in some generality in Appendix B, after applying it to the sequence $a_n = e^{2\pi i \alpha n}$ when proving Theorem 1. It gives a fairly explicit description of a method that seems to give us the desired results in many cases.
When working with this heuristic, we need some feeling for which sequences turn out to be 'reasonable' and how 'small' exactly the resulting sum is. Precise meanings can be given to these words to some extent. If we consider $O\!\left(x(\log x)^{-A}\right)$ for every $A > 0$ as 'small', then Theorem 1 says that $a_n = e^{2\pi i n\alpha}$ is 'reasonable' for any $\alpha$. Green and Tao's result [5] presents a much larger class of 'reasonable' sequences. The mere statement of their result requires deep definitions and will not be discussed in detail here. The reader should however be aware that there is a huge family of sequences for which our heuristic is proven to hold when 'reasonable' and 'small' are defined as above. This heuristic seems to have almost unlimited power, and it is not clear what the boundaries are for a general meaning of 'small' and 'reasonable', and even less how we could prove it.

1.3 A strategy outline

Our ultimate goal is to obtain additive properties of prime numbers. As we noted in the introduction, the first challenge is to find a convenient way of adding primes which retains their characteristic multiplicative behaviour. The logarithm has the convenient property of converting products to sums. The von Mangoldt function $\Lambda$ seems the natural setting. Indeed, notice that
\[ \log n = \sum_{d \mid n} \Lambda(d), \]
which can be seen as an additive translation of prime factorisation. Hence $\Lambda$ seems to contain all information about factorisation, while behaving additively. The function has proven to be a convenient way of keeping track of primes in the above formulations of the prime number theorem. We are hence convinced that the right approach to the Goldbach problem, if one exists, might lie in the von Mangoldt function.

Alternatives. Since we are initially only interested in adding prime numbers, the fact that $\Lambda$ is nonzero at every proper prime power might seem problematic. We will explain in Chapter 2 how this can be solved.
However, at this point, one could choose many different functions to get a grip on prime numbers. The results we obtain here with $\Lambda$ can sometimes be reformulated using a different function. An example of this phenomenon is an equivalent formulation of the prime number theorem in terms of $\pi(x) = \sum_{p \le x} 1$ instead of $\psi(x) = \sum_{n \le x} \Lambda(n)$. The function $\pi(x)$ has the advantage of having a clear interpretation, and the disadvantage of being hard to grasp in computations. However, we can often freely switch between $\pi$ and $\psi$ with the aid of straightforward estimates. Our motivation makes us choose $\Lambda$, which gives more computational comfort and, I dare say, a more natural approach.

Natural approach. Back to the Goldbach problem. According to our general principle, we attempt to find good estimates for
\[ T(\alpha) = \sum_{n \le x} \Lambda(n)e^{2\pi i n\alpha}. \]
We expect that $T(\alpha)$ contains crucial information on the additive properties of primes. This is indeed the case, and if we restrict our interest to the Goldbach problem, we can find an easy way of applying an estimate of $T(\alpha)$. Indeed, once we have a sufficiently sharp estimate, we consider ourselves brave enough to investigate
\[ S_2(n) := \sum_{k_1+k_2=n} \Lambda(k_1)\Lambda(k_2) \quad\text{and}\quad S_3(n) := \sum_{k_1+k_2+k_3=n} \Lambda(k_1)\Lambda(k_2)\Lambda(k_3). \]
These functions count the number of representations of $n$ as the sum of two (resp. three) prime powers in a weighted manner. Proving that they are positive for the desired values of $n$ would settle the Goldbach problems. Moreover, a decent understanding of them gives us knowledge of the number of representations, a much deeper question. So how do we estimate these quantities using our estimate of the exponential sum above? Note that a good estimate for $S_2(n)$ implies a good estimate for
\[ S_3(n) = \sum_{m_1+m_2=n} S_2(m_1)\Lambda(m_2). \]
This is in accordance with the fact that the binary Goldbach problem is clearly the stronger statement. Furthermore, we have that
\[ \sum_{\substack{k_1+k_2=n \\ k_1,k_2 \le x}} \Lambda(k_1)\Lambda(k_2) = \int_0^1 T(\alpha)^2 e^{-2\pi i n\alpha}\,d\alpha. \]
This is in essence the idea of Fourier inversion, and it gives us a very concrete approach to the problem. In fact, $T(\alpha)$ can indeed be bounded sufficiently sharply, allowing a full proof of Vinogradov's theorem. Full details can be found in [9] and [4, Chapter 26].

Our approach. The above approach is exactly what we set out to motivate in this essay, and it works fine in obtaining our desired results. However, we shall present a different approach that relies on Theorem 1. The intention is to show how the 'randomness' of the Möbius function is in a certain sense the fundamental issue. We already mentioned that the prime number theorem can be derived from it (see Appendix B), and we will base our proof of Vinogradov's theorem on it.

The strategy we will follow is a direct one. We will deduce the necessary information about $\Lambda$ from our estimate in Theorem 1, both functions being connected by
\[ \Lambda(n) = \sum_{d \mid n} \mu(d)\log\frac{n}{d}. \]
In this way, we will directly estimate $S_2(n)$. These estimates will finally be used to mount a serious attack on the Goldbach problem in Chapter 2.

1.4 Sums involving $\Lambda$

Let us reflect on how to estimate
\[ S_2(n) := \sum_{k_1+k_2=n} \Lambda(k_1)\Lambda(k_2). \]
We remember that $\Lambda(n) = -\sum_{d \mid n} \mu(d)\log d$. Notice that when $d$ is large, we expect the randomness of the Möbius function to cancel out many of the terms, so that the terms with small $d$ will have the most influence. If we are to have a chance of successfully estimating $S_2(n)$ through our result on the Möbius function, we had better split up the defining sum for $\Lambda(n)$. We now come to formal definitions. For any $k \ge 2$, define
\[ \Lambda^0(n) := -\sum_{\substack{d \mid n \\ d \le k}} \mu(d)\log d \quad\text{and}\quad \Lambda^\infty(n) := -\sum_{\substack{d \mid n \\ d > k}} \mu(d)\log d, \]
respectively. We have chosen not to mention $k$ explicitly in this notation, so the reader is warned that the choice of a fixed $k$ is implied in these quantities.
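Both the splitting $\Lambda = \Lambda^0 + \Lambda^\infty$ and the Fourier identity for $\sum \Lambda(k_1)\Lambda(k_2)$ can be checked numerically on small inputs. The sketch below uses our own helper names and one assumption worth flagging: the integral over $[0,1]$ is replaced by an average over $N = 2x+1$ equally spaced points, which is exact here because $T(\alpha)^2 e^{-2\pi i n\alpha}$ is a trigonometric polynomial whose frequencies are all smaller than $N$ in absolute value.

```python
import cmath
from math import log


def mobius(n: int) -> int:
    """Möbius function by trial division (fine for small n)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def lambda_split(n: int, k: int) -> tuple[float, float]:
    """Return (Lambda^0(n), Lambda^infinity(n)) for the cutoff k."""
    small = large = 0.0
    for d in range(1, n + 1):
        if n % d == 0:
            term = -mobius(d) * log(d)
            if d <= k:
                small += term
            else:
                large += term
    return small, large


def von_mangoldt(n: int) -> float:
    """Lambda(n) recovered as Lambda^0(n) + Lambda^infinity(n)."""
    return sum(lambda_split(n, 1))


def s2_direct(n: int, x: int) -> float:
    """Sum of Lambda(k1)*Lambda(k2) over k1 + k2 = n with k1, k2 <= x."""
    lam = [0.0] + [von_mangoldt(m) for m in range(1, x + 1)]
    return sum(lam[k] * lam[n - k] for k in range(1, n) if k <= x and n - k <= x)


def s2_fourier(n: int, x: int) -> float:
    """The same quantity via the average of T(a)^2 e(-n a) over a = j/N."""
    lam = [0.0] + [von_mangoldt(m) for m in range(1, x + 1)]
    big_n = 2 * x + 1  # number of sample points (assumption: see lead-in)
    total = 0j
    for j in range(big_n):
        alpha = j / big_n
        t = sum(lam[m] * cmath.exp(2j * cmath.pi * m * alpha) for m in range(1, x + 1))
        total += t * t * cmath.exp(-2j * cmath.pi * n * alpha)
    return total.real / big_n


if __name__ == "__main__":
    print(lambda_split(30, 4), von_mangoldt(30))
    print(s2_direct(20, 15), s2_fourier(20, 15))
```

The two printed values in the last line should agree up to floating-point error, which is the orthogonality of the exponentials at work.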
Since our goal is to estimate
\[ S_2(n) = \sum_{k_1+k_2=n} \Lambda^0(k_1)\Lambda^0(k_2) + 2\sum_{k_1+k_2=n} \Lambda^0(k_1)\Lambda^\infty(k_2) + \sum_{k_1+k_2=n} \Lambda^\infty(k_1)\Lambda^\infty(k_2), \]
we will try to obtain good estimates for the individual terms of this sum. Notice that we expect our estimates involving $\Lambda^\infty$ to be more accurate and general, since we expect Theorem 1 to bound things sharply.

1.4.1 An estimate for sums of $\Lambda^\infty$

Following our general motto, we try to obtain information about $\Lambda^\infty$ by considering the sum
\[ S^\infty(\alpha) = \sum_{n \le x} \Lambda^\infty(n)e^{2\pi i n\alpha}. \]
The next lemma gives a good estimate of this quantity.

Lemma 1. For any $k \ge 2$, $\alpha \in \mathbb{R}$ and $A > 0$, we have
\[ |S^\infty(\alpha)| = O\!\left(x\log x\,(\log k)^{-A}\right), \]
the implied constant depending only on $A$.

Remark. Note that $S^\infty(\alpha)$ depends on $k$, although this is not reflected in the notation.

Proof. We have
\[ S^\infty(\alpha) = -\sum_{n \le x}\Bigg(\sum_{\substack{d \mid n \\ d > k}} \mu(d)\log d\Bigg)e^{2\pi i n\alpha}. \]
Writing $n = dd'$, we get
\[ S^\infty(\alpha) = -\sum_{d > k}\ \sum_{d' \le \frac{x}{d}} \mu(d)\log d\; e^{2\pi i dd'\alpha}. \]
Interchanging the order of summation now gives us
\[ S^\infty(\alpha) = -\sum_{d' < \frac{x}{k}}\ \sum_{k < d \le \frac{x}{d'}} \mu(d)\log d\; e^{2\pi i dd'\alpha}. \]
Now by the inequality $\left|\sum_i a_i\right| \le \sum_i |a_i|$, we obtain using Corollary 1,
\[ |S^\infty(\alpha)| = O\Bigg(\sum_{d' < \frac{x}{k}} \frac{x}{d'}\left(\log\frac{x}{d'}\right)^{-A}\Bigg), \]
the implied constant depending only on $A > 0$. Note that we make use of the uniformity in $\alpha$ here. The use of Corollary 1 is justified because letting the sum run over $k < d \le \frac{x}{d'}$ gives the same estimate. Now using $\log\frac{x}{d'} > \log k$ and $1 + \frac{1}{2} + \ldots + \frac{1}{m} = O(\log m)$, this is easily seen to be the desired result.

Remark. The step of finding a good estimate for the exponential sum is done in the previous lemma. Now the challenge consists of finding a suitable application of this estimate that gives us the information we need about the function $\Lambda^\infty$.

Lemma 2. For any vectors $u, v \in \mathbb{C}^n$ and fixed $A \ge 0$, we have
\[ \sum_{a+b+c=n} u_a v_b \Lambda^\infty(c) = O\!\left(\|u\|\|v\|\,n\log n\,(\log k)^{-A}\right). \]
Furthermore, the implied constant can be chosen to depend only on $A$.

Remark.
The proof is straightforward estimation, cunningly invoking the orthogonality relations $\int_0^1 e^{2\pi i n x}\,dx = 1$ if $n = 0$ and $0$ otherwise. This is exactly what we expect, since this step corresponds to the Fourier inversion in our general strategy. Closely investigating the proof will make this apparent, and remind us that we are in fact applying our attack using Fourier analysis.

Proof of Lemma 2. Using the orthogonality relations, we obtain that the sum equals
\[ \int_0^1 \Bigg(\sum_{m_1 \le n} u_{m_1}e^{2\pi i \alpha m_1}\Bigg)\Bigg(\sum_{m_2 \le n} v_{m_2}e^{2\pi i \alpha m_2}\Bigg) S^\infty(\alpha)\,e^{-2\pi i n\alpha}\,d\alpha, \]
where $S^\infty(\alpha)$ is taken with $x = n$. The uniformity in $\alpha$ of our estimate for $S^\infty(\alpha)$ allows us to integrate it. We hence obtain the upper bound
\[ \int_0^1 \Bigg|\sum_{m_1 \le n} u_{m_1}e^{2\pi i \alpha m_1}\sum_{m_2 \le n} v_{m_2}e^{2\pi i \alpha m_2}\,e^{-2\pi i n\alpha}\Bigg|\,d\alpha \cdot O\!\left(n\log n\,(\log k)^{-A}\right), \]
for any $A > 0$, the implied constant depending only on $A$. By the Cauchy-Schwarz inequality, we have the upper bound
\[ \Bigg(\int_0^1 \Bigg|\sum_{m_1 \le n} u_{m_1}e^{2\pi i \alpha m_1}\Bigg|^2 d\alpha\Bigg)^{\frac{1}{2}} \cdot \Bigg(\int_0^1 \Bigg|\sum_{m_2 \le n} v_{m_2}e^{2\pi i \alpha m_2}\Bigg|^2 d\alpha\Bigg)^{\frac{1}{2}} \cdot O\!\left(n\log n\,(\log k)^{-A}\right). \]
The statement now follows from Parseval's identity (see Appendix A), or immediately from the orthogonality relations.

1.4.2 An estimate for a double sum with $\Lambda^0$

Having obtained a very general result estimating sums involving $\Lambda^\infty$, the task remains to do the same for $\Lambda^0$. Because for large values of $n$ the expression for $\Lambda^0$ contains relatively few terms, we do not in general expect a lot of 'cancellation', hence a general result with a sharp bound as we obtained for $\Lambda^\infty$ is far from trivial. Therefore we will relax the generality and focus on a more specific sum which is of fundamental importance to the Goldbach problem. Recall from the introduction that an attractive idea to attack this problem is to consider the sum
\[ \sum_{k_1+k_2=n} \Lambda(k_1)\Lambda(k_2). \]
The main result in this section is

Lemma 3. For $k \ge 2$ and fixed $A \ge 0$,
\[ \sum_{k_1+k_2=n} \Lambda^0(k_1)\Lambda^0(k_2) = \mathfrak{S}_2(n)\,n + O\!\left(n(\log k)^{-A} + \tau(n)\,n\,k^{-\frac{1}{3}} + k^3\right), \]
the implied constant depending only on $A$.

Remark. We defined in the introduction
\[ \mathfrak{S}_2(n) = \prod_{p \mid n}\left(1 + \frac{1}{p-1}\right)\prod_{p \nmid n}\left(1 - \frac{1}{(p-1)^2}\right). \]
This function might seem rather hard to interpret. Note that plugging in an odd number yields 0, because of the factor at $p = 2$. However, the growth of this function for even $n$ is easily estimated from the equivalent form
\[ 2\prod_{\substack{p \mid n \\ p > 2}} \frac{p-1}{p-2}\ \prod_{p > 2}\left(1 - \frac{1}{(p-1)^2}\right) = C\prod_{\substack{p \mid n \\ p > 2}} \frac{p-1}{p-2}. \]
This infinite product is clearly convergent, since the sum of the squared reciprocals of the natural numbers is convergent, justifying the constant $C$.

To estimate the product appearing here, some trivial estimates suffice to see that
\[ 1 \le \prod_{\substack{p \mid n \\ p > 2}} \frac{p-1}{p-2} \le n, \]
hence $\mathfrak{S}_2(n)$, however badly behaved, is controlled by two basic functions. Finally, the reader verifies easily that
\[ \mathfrak{S}_2(n) = \sum_{d \mid n} \frac{\mu^2(d)\,d}{\varphi^2(d)} \sum_{(c,d)=1} \frac{\mu(c)}{\varphi^2(c)}, \]
which is the form in which this function will naturally arise in later estimates.

Remark. Notice that, compared with the estimate for $\Lambda^\infty$, by giving up some generality a sharp bound is still obtained for our sum in $\Lambda^0$. The error term might seem rather frightening, but with a suitable choice of $k$, things simplify greatly. The proof of this estimate is rather long and technical. It seems to consist of little more than clever algebraic manipulations that allow one to apply the prime number theorem.

Proof of Lemma 3. Let us try to rewrite the left hand side in a form which is easier to estimate. We have
\[ \sum_{k_1+k_2=n} \Lambda^0(k_1)\Lambda^0(k_2) = \sum_{k_1+k_2=n} \Bigg(\sum_{\substack{d_1 \mid k_1 \\ d_1 \le k}} \mu(d_1)\log d_1\Bigg)\Bigg(\sum_{\substack{d_2 \mid k_2 \\ d_2 \le k}} \mu(d_2)\log d_2\Bigg). \]
Writing $d_1 d_1' = k_1$ and $d_2 d_2' = k_2$, we obtain
\[ \sum_{d_1, d_2 \le k} \mu(d_1)\mu(d_2)\log d_1\log d_2 \sum_{d_1 d_1' + d_2 d_2' = n} 1. \]
The inner sum is $\frac{n}{[d_1,d_2]} + O(1)$ whenever $(d_1, d_2) \mid n$ (and it is empty otherwise), according to our very first example. So if we set
\[ \mathfrak{S}_2(n, k) = \sum_{\substack{d_1, d_2 \le k \\ (d_1,d_2) \mid n}} \frac{\mu(d_1)\mu(d_2)\log d_1\log d_2}{[d_1, d_2]}, \]
we have
\[ \sum_{k_1+k_2=n} \Lambda^0(k_1)\Lambda^0(k_2) = n\,\mathfrak{S}_2(n, k) + O\!\left((k\log k)^2\right). \]
We have now reduced our task to estimating the quantity $\mathfrak{S}_2(n, k)$.
This notation is suggestive: as it turns out, this quantity, although depending on $k$, is a good approximation of the quantity $\mathfrak{S}_2(n)$ defined before. We will first estimate $\mathfrak{S}_2(n, k)$, and then show how it is related to $\mathfrak{S}_2(n)$.

Estimating $\mathfrak{S}_2(n, k)$. We rearrange to
\[ \mathfrak{S}_2(n, k) = \sum_{d \mid n} \frac{\mu(d)}{d} \sum_{c \le \frac{k}{d}} \frac{\mu(cd)}{c^2} \Bigg(\sum_{\substack{m \le \frac{k}{cd} \\ (m, cd) = 1}} \frac{\mu(m)\log cdm}{m}\Bigg)^{2}, \]
since this contains an inner sum which is very convenient to estimate. This identity can be checked by direct verification and some coffee.

For $cd > \sqrt{k}$, the trivial estimates $|\mu(m)| \le 1$ and $\log cdm \le \log k$ show that this contribution is dominated by
\[ \sum_{d \mid n}\ \sum_{cd > \sqrt{k}} \frac{(\log k)^4}{c^2 d} \le \frac{2\tau(n)}{\sqrt{k}}(\log k)^4. \]
For $cd \le \sqrt{k}$, we use $\log cdm = \log cd + \log m$, and on both pieces we apply some trivial estimates and the prime number theorem rewritten with Möbius functions to show that
\[ -\sum_{\substack{m \le \frac{k}{cd} \\ (m, cd) = 1}} \frac{\mu(m)\log cdm}{m} = \frac{cd}{\varphi(cd)} + O\!\left(\frac{\tau(cd)\,cd}{\varphi(cd)}(\log k)^{-A}\right), \]
the implied constant depending solely on the choice of $A$. Combining both, we get the estimate
\[ \mathfrak{S}_2(n, k) = \sum_{d \mid n}\ \sum_{cd \le \sqrt{k}} \frac{\mu(d)\mu(cd)\,d}{\varphi(cd)^2} + O\!\left((\log k)^{4-A} + \frac{\tau(n)}{\sqrt{k}}(\log k)^4\right). \]

Linking $\mathfrak{S}_2(n, k)$ to $\mathfrak{S}_2(n)$. Notice that in the above expression, replacing the sum over $cd \le \sqrt{k}$ with a sum over all $c$, we make only a small error, which is absorbed in the error term already present. Doing this, we approximate $\mathfrak{S}_2(n, k)$ by
\[ \mathfrak{S}_2(n) = \sum_{d \mid n} \frac{\mu^2(d)\,d}{\varphi^2(d)} \sum_{(c,d)=1} \frac{\mu(c)}{\varphi^2(c)}. \]
Now we have obtained the required dominant term. Our error term is now
\[ O\!\left(n(\log k)^{4-A} + \frac{\tau(n)\,n}{\sqrt{k}}(\log k)^4 + (k\log k)^2\right), \]
the implied constant depending only on $A > 0$. Hence we obtain the result, upon simplifying slightly and keeping in mind that $\log k$ is smaller than any positive power of $k$ for $k$ large enough.

1.4.3 A consequence of both estimates

We now close in on the Goldbach problem. Define the quantity
\[ S_2(n) := \sum_{k_1+k_2=n} \Lambda(k_1)\Lambda(k_2). \]
The following theorem will be the core of the proof of all results on the Goldbach problems appearing in the next part.

Theorem 2.
We have for any $A > 0$ and $u_n \in \mathbb{C}$ that

\[
\sum_{n \le x} u_n S_2(n) = \sum_{n \le x} u_n \mathfrak{S}_2(n)\, n + O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{-A} \right),
\]

the implied constant depending only on $A$.

Remark. To obtain this result we will specify $k$ and apply the two estimates we obtained before for $\Lambda^{\infty}$ and $\Lambda^{0}$. The following proof consists of little more, since we have done all the hard work already.

Proof of Theorem 2. We start by recalling

\[
\sum_{n \le x} u_n S_2(n) = \sum_{n \le x} u_n \Bigg( \sum_{k_1 + k_2 = n} \Lambda^{0}(k_1) \Lambda^{0}(k_2) + 2 \sum_{k_1 + k_2 = n} \Lambda^{0}(k_1) \Lambda^{\infty}(k_2) + \sum_{k_1 + k_2 = n} \Lambda^{\infty}(k_1) \Lambda^{\infty}(k_2) \Bigg).
\]

The strategic choice at this point is $k = x^{\frac{1}{4}}$.

Step 1. Note that Lemma 3 allows us to estimate

\[
\sum_{n \le x} u_n \sum_{k_1 + k_2 = n} \Lambda^{0}(k_1) \Lambda^{0}(k_2) = \sum_{n \le x} u_n \mathfrak{S}_2(n)\, n + O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{-A} \right),
\]

for any $A > 0$. This deserves some explanation. The leading term is straightforward, while the error term is in this case a sum over

\[
u_n \cdot O\left( n (\log x)^{-A} + \tau(n)\, n\, x^{-\frac{1}{12}} + x^{\frac{3}{4}} \right).
\]

This is estimated using the discrete Cauchy-Schwarz inequality. The first and last terms can be dealt with by straightforward estimates. For the middle term, we need a slightly more sophisticated result analogous to Dirichlet's formula (see Appendix A). Indeed, it can be proven fairly elementarily (see [8][Chapter 1]) that

\[
\sum_{n \le x} \tau(n)^2 = O\left( x (\log x)^3 \right).
\]

Applying this to the second term we obtain an error term of

\[
O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{-A} + \|u\|\, x^{\frac{3}{2} - \frac{1}{12}} (\log x)^{\frac{3}{2}} + \|u\|\, x^{\frac{5}{4}} \right) = O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{-A} \right).
\]

Step 2. According to the estimate for sums of $\Lambda^{\infty}$ from Section 1.4.1, we have

\[
\sum_{n \le x} u_n \sum_{k_1 + k_2 = n} \Lambda^{0}(k_1) \Lambda^{\infty}(k_2) = O\left( \|u\|\, \|\Lambda^{0}\|\, x (\log x)^{-A} \right).
\]

Now clearly $\|\Lambda^{0}\| = O\big( x^{\frac{1}{2}} \log x \big)$, which can be derived in numerous ways. Most directly, it follows from the prime number theorem that

\[
\sum_{n \le x} \Lambda(n)^2 \le \log x \sum_{n \le x} \Lambda(n) = O(x \log x),
\]

giving us a contribution of

\[
O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{1-A} \right),
\]

for any $A > 0$.

Step 3. The remaining double sum is treated similarly to the previous step, using the estimate $\|\Lambda^{\infty}\| = O\big( x^{\frac{1}{2}} \log x \big)$.

The above three steps show that

\[
\sum_{n \le x} u_n S_2(n) = \sum_{n \le x} u_n \mathfrak{S}_2(n)\, n + O\left( \|u\|\, x^{\frac{3}{2}} (\log x)^{-A} \right).
\]

Remark. Notice how we have the uniformity in $\alpha$ in every step.
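The dominant term in Theorem 2 can be explored numerically for a single value of $n$. The following is a minimal sketch, not part of the argument above: the function names are hypothetical, the singular series is approximated by truncating the defining product over primes at a finite cutoff (an assumption that introduces a small error), and the comparison is purely illustrative since the theorem is a statement on average over $n$.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit (requires limit >= 2)."""
    is_prime = [True] * (limit + 1)
    is_prime[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
    return [p for p, flag in enumerate(is_prime) if flag]


def von_mangoldt_table(limit):
    """lam[m] = log p if m is a power of the prime p, and 0 otherwise."""
    lam = [0.0] * (limit + 1)
    for p in primes_up_to(limit):
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def weighted_count(n, lam):
    """S_2(n): sum of Lambda(k1) * Lambda(k2) over k1 + k2 = n."""
    return sum(lam[k] * lam[n - k] for k in range(1, n))


def singular_series(n, primes):
    """Truncated product: (1 + 1/(p-1)) for p | n, (1 - 1/(p-1)^2) otherwise.

    The factor at p = 2 is 2 for even n and 0 for odd n.
    """
    value = 1.0
    for p in primes:
        if n % p == 0:
            value *= 1.0 + 1.0 / (p - 1)
        else:
            value *= 1.0 - 1.0 / (p - 1) ** 2
    return value


if __name__ == "__main__":
    limit = 10_000
    lam = von_mangoldt_table(limit)
    primes = primes_up_to(limit)
    for n in (1_000, 5_000, 10_000):
        main_term = singular_series(n, primes) * n
        print(n, weighted_count(n, lam), main_term)
```

Truncating the product at the same bound as $n$ is a convenience choice here; the tail of the product converges quickly, so a larger cutoff changes the value only slightly.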
Part 2

The Goldbach problem

In the first part, we found ways of estimating sums involving the von Mangoldt function, in an attempt to gather useful information for proving the Goldbach conjecture. We now turn to investigating the growth of

\[
S_2(n) = \sum_{k_1 + k_2 = n} \Lambda(k_1) \Lambda(k_2) \quad \text{and} \quad S_3(n) = \sum_{k_1 + k_2 + k_3 = n} \Lambda(k_1) \Lambda(k_2) \Lambda(k_3).
\]

These functions count the number of representations of $n$ as the sum of two (resp. three) prime powers in a weighted manner. Proving that they are strictly greater than 0 for all appropriate $n$ would settle the Goldbach problems.

Remarks. The reader might have three immediate objections at this point.

1. The weights seem inconvenient and have no easy interpretation.
2. We are being too ambitious in trying to find the (weighted) number of representations.
3. The fact that we allow all prime powers seems a problem.

For the first objection, we remind the reader of our earlier remarks on the von Mangoldt function being the natural choice in this setting. The weight will turn out to be convenient in the proof, making the prime numbers more flexible in algebraic manipulations.

The second objection might well be justified in the binary case, but in the ternary case it simply underestimates the strength of our estimates, as we will see shortly.

As for the third objection, we remark that the proper prime powers are very sparse compared to the primes, and the inconvenience is not as terrifying as it might seem.

Consider $S_2(n)$. Say $k_1$ is a proper prime power; then we have at most $O(\sqrt{n})$ choices for $k_1$. Once $k_1$ is chosen, $k_2$ is fixed. Each of the choices contributes at most $(\log n)^2$ to $S_2(n)$. We conclude that the proper prime powers contribute $O\big( \sqrt{n} (\log n)^2 \big)$ to $S_2(n)$.

Similarly for $S_3(n)$, except that for fixed $k_1$, we can pick $k_2$, hence fixing $k_3$, in at most $n$ ways. Now the proper prime powers contribute $O\big( n^{\frac{3}{2}} (\log n)^3 \big)$ to $S_3(n)$.

Notice that taking the particular order of the prime powers into account only multiplies the estimate by at most a constant.
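The sparseness of the proper prime powers can be made concrete for a single $n$. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the code simply splits the brute-force value of $S_2(n)$ into the part coming from pairs of primes and the part in which at least one of $k_1, k_2$ is a proper prime power, so that the latter can be compared with $\sqrt{n} (\log n)^2$.

```python
import math


def prime_power_data(limit):
    """Return (lam, is_prime) for 0 <= m <= limit (requires limit >= 1).

    lam[m] is the von Mangoldt function, is_prime[m] flags the primes.
    """
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    lam = [0.0] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for multiple in range(2 * p, limit + 1, p):
            is_prime[multiple] = False
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam, is_prime


def split_weighted_count(n, lam, is_prime):
    """Split S_2(n) into (primes only, at least one proper prime power)."""
    primes_only = 0.0
    with_proper_power = 0.0
    for k1 in range(1, n):
        k2 = n - k1
        term = lam[k1] * lam[k2]
        if term == 0.0:
            continue  # k1 or k2 is not a prime power
        if is_prime[k1] and is_prime[k2]:
            primes_only += term
        else:
            with_proper_power += term
    return primes_only, with_proper_power


if __name__ == "__main__":
    limit = 10_000
    lam, is_prime = prime_power_data(limit)
    for n in (100, 1_000, 10_000):
        primes_only, with_proper_power = split_weighted_count(n, lam, is_prime)
        bound_scale = math.sqrt(n) * math.log(n) ** 2
        print(n, primes_only, with_proper_power, bound_scale)
```

For $n = 10$, for instance, the only pairs involving a proper prime power are $(2, 8)$ and $(8, 2)$, so the second component equals $2 (\log 2)^2$.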
If we succeed in obtaining an estimate for $S_2$ and $S_3$ with a dominant term which is larger, we do not need to worry about the proper prime power contribution. This allows us to benefit from the nice properties of $\Lambda$ in calculations, getting results in which we may as well ignore the proper prime powers.

2.1 The binary Goldbach problem

Definitions. As before, we define

\[
S_2(n) := \sum_{k_1 + k_2 = n} \Lambda(k_1) \Lambda(k_2).
\]

For any $x$, define the exceptional set to be the set of even natural numbers $\le x$ that are not the sum of two primes. The size of the exceptional set is $E(x)$.

In this section, we will try to prove the Goldbach conjecture. Even though this goal will not be achieved, we will prove the conjecture from the introduction for almost all integers, and we will prove that the exceptional set has density zero. First we remind the reader of the main conjecture.

Conjecture 2. For $n \ge 4$ and fixed $A > 0$,

\[
\sum_{k_1 + k_2 = n} \Lambda(k_1) \Lambda(k_2) = \mathfrak{S}_2(n)\, n + O\left( n (\log n)^{-A} \right),
\]

where the implied constant depends only on $A$, and

\[
\mathfrak{S}_2(n) = \prod_{p \mid n} \left( 1 + \frac{1}{p-1} \right) \prod_{p \nmid n} \left( 1 - \frac{1}{(p-1)^2} \right).
\]

The proof of this conjecture is not within reach with our current estimates. However, it follows straightforwardly that it 'almost holds' in the sense that for every possible choice of the implied constant, the set of exceptions has density zero. More precisely, we have

Theorem 3. For fixed $A, C > 0$ and any $x > 0$, the number of integers $4 \le n \le x$ for which

\[
|S_2(n) - \mathfrak{S}_2(n)\, n| > C n (\log n)^{-A}
\]

is of size $O\big( x (\log x)^{-A} \big)$, where the implied constant depends only on $A, C$.

Proof. We apply Theorem 2 with $u_n = S_2(n) - \mathfrak{S}_2(n)\, n$ and the exponent of the log-factor equal to $\frac{3A}{2}$. This gives us

\[
\sum_{n \le x} \left( S_2(n) - \mathfrak{S}_2(n)\, n \right)^2 = O\left( x^3 (\log x)^{-3A} \right),
\]

where the implied constant depends only on $A > 0$. Trivially, every term is $\ge 0$.
The terms corresponding to an $n$ as described in the statement have $\left( S_2(n) - \mathfrak{S}_2(n)\, n \right)^2 > C^2 n^2 (\log n)^{-2A}$, hence if we call $E_{A,C}(x)$ the number of such $n$ we clearly have

\[
C^2 (\log x)^{-2A} \sum_{i=1}^{E_{A,C}(x)} i^2 = O\Bigg( \sum_{n \le x} \left( S_2(n) - \mathfrak{S}_2(n)\, n \right)^2 \Bigg) = O\left( x^3 (\log x)^{-3A} \right),
\]

from which the desired result clearly follows.

We now proceed to proving that the exceptional set has density zero. More precisely, we will prove the following theorem.

Theorem 4. For any $A > 0$, we have

\[
E(x) = O\left( x (\log x)^{-A} \right).
\]

Proof. An integer $n$ is not the sum of two prime powers if and only if $S_2(n) = 0$. The set for which $S_2(n) = 0$ is by no means of size as great as $E(x)$, but due to our previous remark on the sparseness of prime powers, we can apply a similar trick as in the proof of the last theorem, keeping in mind that $S_2(n) = S_2'(n) + O\big( \sqrt{n} (\log n)^2 \big)$, with $S_2'(n)$ defined similarly to $S_2(n)$ while only taking primes into account and neglecting proper prime power contributions. We arrive analogously at

\[
\sum_{i=1}^{E(x)} i^2 = O\left( x^3 (\log x)^{-2A} \right).
\]

The sum on the left hand side is of order $E(x)^3$. This implies the result.

Remark. Notice how easily we got rid of the unwanted prime power contribution in the proof. As long as the error term is large enough, we are free to ignore proper powers as above. The set of proper prime powers is indeed very sparse. So we have the benefit of working with the convenient von Mangoldt function in our calculations, and can easily get rid of its less convenient properties in the given context.

Remark. This result shows that the counterexamples to the binary Goldbach conjecture have natural density zero. Of course the theorem is more precise than this statement. It seems as close as we can get to a proof of the full conjecture. More refined versions are available now, but they would require a lot more machinery. We will discuss later why our proof has no real chance of going all the way.

2.2 The ternary Goldbach problem

As before, define

\[
S_3(n) := \sum_{k_1 + k_2 + k_3 = n} \Lambda(k_1) \Lambda(k_2) \Lambda(k_3).
\]

In this section, we will find the growth of this function, with an error term that is small compared to the dominating term for odd $n$. This implies that for large enough odd $n$, this function is strictly positive, hence proving the asymptotic result for the ternary Goldbach conjecture. Moreover, we establish the growth of the number of weighted representations.

Theorem 5 (Vinogradov, 1937). For any fixed $A > 0$, we have

\[
S_3(n) = \frac{1}{2} \mathfrak{S}_3(n)\, n^2 + O\left( n^2 (\log n)^{-A} \right),
\]

where

\[
\mathfrak{S}_3(n) = \prod_{p \mid n} \left( 1 - \frac{1}{(p-1)^2} \right) \prod_{p \nmid n} \left( 1 + \frac{1}{(p-1)^3} \right).
\]

Remark. As we did for $\mathfrak{S}_2(n)$, we can carry out a similar analysis to find that the rather wildly behaved function $\mathfrak{S}_3(n)$ has growth somewhere between constant and linear. For even $n$, we have $\mathfrak{S}_3(n) = 0$, in which case the dominant term vanishes, giving us a useless result. Furthermore, we have the identity

\[
\mathfrak{S}_3(n) = \sum_{(d, cn) = 1} \frac{\mu^2(d)\, d\, \mu(c)}{\varphi^3(d)\, \varphi^2(c)}.
\]

Proof. By applying Theorem 2 to $u_m = \Lambda(n - m)$, using as before that $\|\Lambda\| = O(\sqrt{n} \log n)$, we obtain:

\[
S_3(n) = \sum_{m < n} \Lambda(m)\, \mathfrak{S}_2(n - m) (n - m) + O\left( n^2 (\log n)^{-A} \right).
\]

We remember that

\[
\sum_{d \mid m} \frac{\mu^2(d)\, d}{\varphi^2(d)} \sum_{(c,d)=1} \frac{\mu(c)}{\varphi^2(c)} = \mathfrak{S}_2(m),
\]

hence the above may be rewritten by changing the order of summation into

\[
S_3(n) = \sum_{(c,d)=1} \frac{\mu^2(d)\, d\, \mu(c)}{\varphi^2(d)\, \varphi^2(c)} \sum_{\substack{m < n \\ m \equiv n \ (\mathrm{mod}\ d)}} \Lambda(m) (n - m) + O\left( n^2 (\log n)^{-A} \right),
\]

the purpose of this rearrangement being that the inner sum can be estimated easily. Indeed, if we use partial summation then by the prime number theorem for arithmetic progressions we have for $(d, n) = 1$ that

\[
\sum_{\substack{m < n \\ m \equiv n \ (\mathrm{mod}\ d)}} \Lambda(m) (n - m) = \frac{n^2}{2 \varphi(d)} + O\left( n^2 (\log n)^{-A} \right).
\]

The factor $\frac{1}{2}$ comes from the integration and is the extra factor in the statement of the theorem. The result now follows from the identity

\[
\mathfrak{S}_3(n) = \sum_{(d, cn) = 1} \frac{\mu^2(d)\, d\, \mu(c)}{\varphi^3(d)\, \varphi^2(c)}.
\]

Remark. As we can restate the prime number theorem in terms of $\pi(x)$ rather than $\psi(x)$, an analogous statement for the number of representations can be deduced.
This is a relatively simple matter, and might be of interest to the reader who is bothered by the weighted way of counting in our approach. However, we hope that the above approach seems more natural after our motivation. The corresponding result is

Theorem 6. Let $s(n)$ be the number of representations of $n$ as the sum of three primes. Then for any fixed $A > 0$, we have

\[
s(n) = \frac{1}{2} \mathfrak{S}_3(n) \frac{n^2}{(\log n)^3} + O\left( n^2 (\log n)^{-4} \right).
\]

2.3 Conclusion

Summary. As we discussed, the fundamental function to consider in the setting of number theory is the Möbius function. A thorough understanding of its properties might be the key to solving some problems which seem hard to access by naive approaches. The need for a good way of obtaining information on the behaviour of the Möbius function announced itself.

The theory of Fourier transformation gives us a way to transform any arithmetic function under consideration to a function on the unit circle. This function can then be studied using analysis, and the result can be translated back using Fourier inversion. The Möbius function is however not well enough behaved for this theory, forcing us to make some adjustments to this ideal strategy.

This leads us to estimating the exponential sum involving $\mu$, which was done by Davenport. He obtained that for any real $\alpha$ and $A > 0$, we have uniformly in $\alpha$ that

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( x (\log x)^{-A} \right).
\]

We deduced crucial information on some sums involving the von Mangoldt function $\Lambda$. This was possible because of the easy expression of the von Mangoldt function as a sum involving $\mu$. Note that an analogous connection would be harder to find, or at least more complicated, if a function different from $\Lambda$ were chosen to approach the Goldbach problem.
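Davenport's bound recalled above lends itself to a small numerical experiment. The following is a minimal sketch, with hypothetical function names and an arbitrary choice of sample values of $\alpha$ (both assumptions made for illustration): it sieves the Möbius function and evaluates the normalised exponential sum $\frac{1}{x} \big| \sum_{m \le x} \mu(m) e^{2\pi i \alpha m} \big|$ directly. Such an experiment can only illustrate the cancellation, not the uniformity in $\alpha$ or the rate $(\log x)^{-A}$.

```python
import cmath
import math


def mobius_table(limit):
    """mu[m] for 0 <= m <= limit, computed with a sieve (mu[0] is set to 0)."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for multiple in range(p, limit + 1, p):
            if multiple > p:
                is_prime[multiple] = False
            mu[multiple] = -mu[multiple]  # one more prime factor
        for multiple in range(p * p, limit + 1, p * p):
            mu[multiple] = 0  # not squarefree
    return mu


def mobius_exponential_sum(mu, x, alpha):
    """Sum of mu(m) * exp(2*pi*i*alpha*m) over 1 <= m <= x."""
    return sum(
        mu[m] * cmath.exp(2j * math.pi * alpha * m) for m in range(1, x + 1)
    )


if __name__ == "__main__":
    x = 20_000
    mu = mobius_table(x)
    sample_alphas = [0.0, 0.5, 1.0 / 3.0, math.sqrt(2.0), (1 + math.sqrt(5.0)) / 2]
    for alpha in sample_alphas:
        normalised = abs(mobius_exponential_sum(mu, x, alpha)) / x
        print(f"alpha = {alpha:.6f}: |sum| / x = {normalised:.5f}")
```

At $\alpha = 0$ the sum reduces to the summatory function of $\mu$, which gives a convenient sanity check on the sieve.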
Having obtained information on various sums involving the von Mangoldt function, we deduced fairly painlessly a proof of Vinogradov's theorem, as well as a zero density result for the exceptional set consisting of the counterexamples to the Goldbach conjecture. We conclude with some final remarks.

Remark. While we focused on the von Mangoldt function here, since our goal was the Goldbach conjecture, it would have been equally possible to investigate other arithmetic functions that have a definition involving $\mu$ (and indeed many have, because of Möbius inversion), hence obtaining other interesting results.

Remark. A full proof of the ternary Goldbach conjecture is still not found. Even though we have established the asymptotic result with Vinogradov's theorem, the implied constant is enormous. Assuming the GRH, Zinoviev et al. [10] have given a full proof. The paper contains some mistakes, but as I heard they can be fixed. An unconditional proof remains unknown to date.

Remark. It might be interesting to reflect on why our method fails in proving the asymptotic result for the binary Goldbach problem. We identified the dominant term in $S_2(n)$ as $\mathfrak{S}_2(n)\, n$, and hence obtaining any error term of smaller growth would solve our problem. The growth of $\mathfrak{S}_2$ is very mysterious, and it is clear that our rough upper bound $n$ previously obtained is not sharp, and is usually even hopelessly inaccurate. Our lower bound is however very sharp, due to the small value taken when $n$ is prime. In any case, an error term solving the problem would have to be of smaller than linear growth. The approach to the problem is complicated enough to obscure the weaker points in our estimates, but a careful analysis shows that both our estimates for the double sum in $\Lambda^{0}$ and $\Lambda^{\infty}$ would have to improve. Obtaining better estimates is certainly a very nontrivial task, since one would almost have to improve Davenport's result of Theorem 1, and the prime number theorem.
Even though we admit that for the latter much better bounds are known, the reader who believes that this is the way out is invited to try it for him/herself. A new approach is needed to prove the conjectured result, if true.

Since we based ourselves on the heuristic involving the Möbius function, we made our cutoff at $k$ accordingly. The nature of this heuristic prescribes a natural cutoff for large values, but dictates no natural way of further subdividing the regions in order to make our estimates sharper. However, an approach based on a different heuristic or inequality might lead to a different subdivision. It is however not clear what this might be.

For the ternary Goldbach problem, we have the luxury of an extra factor $n$ in the dominant term. This is a huge extra freedom which makes the result accessible to our estimates, as we saw. The result is now definitely settled from a certain number $N$ onwards. As quoted before, Borodzkin [2] showed that $N$ can be taken to be $3^{3^{15}}$. There are two ways to proceed towards a full proof. One is to try to obtain better estimates so as to lower $N$ (for example by proving the GRH). Another is to improve our computational methods to check the remaining cases, so as to get $N$ within range. Both approaches are growing towards each other, and it seems the full proof is just a matter of time.

Epilogue. This concludes our discussion of the Goldbach problem. Analytic number theory certainly owes a lot of its development to this problem, and is now a flourishing branch of mathematics with many results and still many open problems. We hope this essay made the reader as enthusiastic as it did the author while writing it.

Appendix A

The toolbox

Some theorems and techniques that have been used or referred to in the main text are presented here. Proofs are omitted, but can be found in most texts on analysis or analytic number theory, such as [1] and [8].

Summation by parts.
Let $f$ be an arithmetic function, and $g : \mathbb{R}^{+} \to \mathbb{R}$ be continuously differentiable. Then we have

\[
\sum_{n \le x} f(n) g(n) = \Bigg( \sum_{n \le x} f(n) \Bigg) g(x) - \int_1^x \Bigg( \sum_{n \le t} f(n) \Bigg) g'(t)\, dt.
\]

This is probably one of the most useful tools in analytic number theory, allowing us to deform obtained results into slightly different ones. The name comes from the analogy with the theorem of integration by parts.

Poisson summation formula. Let $f : \mathbb{R} \to \mathbb{R}$ be monotonic in stretches, then

\[
\frac{f(m) + f(n)}{2} + \sum_{k = m+1}^{n-1} f(k) = \sum_{k = -\infty}^{+\infty} \int_m^n f(x) e^{2\pi i k x}\, dx.
\]

In fact, this is reminiscent of the theory of Fourier transformation, see below. Indeed, this could be restated in more general terms, but we choose to present it thus since it suffices for our applications, and it is one of the most commonly used forms in number theory.

The Cauchy-Schwarz inequality I. Let $u, v \in \mathbb{C}^n$ be two complex vectors, then

\[
\|u\| \cdot \|v\| \ge |u \cdot v|.
\]

This inequality is sometimes called the 'discrete version' of the Cauchy-Schwarz inequality in this essay. This is the version most high school students are acquainted with. The obvious generalisation to arbitrary inner product spaces holds, and in fact the following version is nothing more than another particular case. We have chosen to state both separately here. This allows us to refer merely to the version used, while omitting explicit inequalities. The following version will be referred to as the integral version of Cauchy-Schwarz, or merely Cauchy-Schwarz.

The Cauchy-Schwarz inequality II. Let $f, g : \mathbb{R} \to \mathbb{C}$ be square-integrable, then

\[
\left| \int_a^b f(x) g(x)\, dx \right|^2 \le \int_a^b |f(x)|^2\, dx \cdot \int_a^b |g(x)|^2\, dx.
\]

Dirichlet's formula. We have that

\[
\sum_{n \le x} \tau(n) = x \log x + (2\gamma - 1) x + O(\sqrt{x}).
\]

The proof can be found in most undergraduate texts on analytic number theory, for example [1]. The problem of finding the minimal error term in the above theorem is a fascinating one with many stronger results than the one stated here.
More precisely, finding the infimum of $\theta$ such that the error term in Dirichlet's formula may be replaced with $O(x^{\theta})$ is an unsolved problem.

Fourier analysis on $\mathbb{Z}$ and $\mathbb{T}$. Although the theory of Fourier analysis is a vast one and generalises greatly to arbitrary locally compact abelian groups, we will only state the results for $\mathbb{Z}$ and $\mathbb{T}$. This is because none of the results have been used, and this summary is only included to provide a considerable piece of inspiration that motivates our interest in exponential sums. Since our ultimate goal is doing number theory, we are not surprised that working in the group $\mathbb{Z}$ is most useful to us.

For a function $f : \mathbb{Z} \to \mathbb{C}$, we define the Fourier transform

\[
\hat{f}(\alpha) = \sum_{n \in \mathbb{Z}} f(n) e^{-2\pi i n \alpha},
\]

as a formal series. It would be especially interesting should this define a function suitable for analytic manipulation. Unfortunately, this is not always a convergent series, unless we restrict our original function somewhat. Let $S(\mathbb{Z})$ be the space of functions $f : \mathbb{Z} \to \mathbb{C}$ such that for all $k$ we have $\lim_{|n| \to \infty} |n|^k |f(n)| = 0$. Define $\mathbb{T}$ to be the unit circle, and the space $S(\mathbb{T}) := C^{\infty}(\mathbb{T})$. A famous theorem in Fourier analysis now says that if $f \in S(\mathbb{Z})$, then $\hat{f} \in S(\mathbb{T})$. Hence to any well behaved function on the integers, we can associate a function in $C^{\infty}(\mathbb{T})$.

Not only can we associate such a function, we can also 'go back'. The Fourier inversion theorem states that for any $f \in S(\mathbb{Z})$, we have

\[
f(n) = \int_0^1 \hat{f}(\alpha) e^{2\pi i n \alpha}\, d\alpha.
\]

This could be considered a motivation for the exponential sums we considered in the text. Information about an arithmetic function could be obtained by investigating its Fourier transform with the extensive toolbox of analysis (which is understood to be much more than the tools mentioned in this appendix). One could then transform this information back to the original function by means of the Fourier inversion theorem. Of course, one needs the arithmetic function under consideration to be contained in $S(\mathbb{Z})$ for this to work.
This is unfortunately not always the case; notably, the Möbius function does not behave as desired. There are various ways out, as we discussed and put into practice in the text.

Parseval's identity. This identity might be considered as a general form of Pythagoras' theorem. It applies in the general setting of separable Hilbert spaces, but for our goals the following form will suffice. For any $f \in L^2(\mathbb{T})$, set

\[
c_n := \int_0^1 f(\alpha) e^{-2\pi i n \alpha}\, d\alpha.
\]

Then we have the identity

\[
\sum_{n \in \mathbb{Z}} |c_n|^2 = \int_0^1 |f(\alpha)|^2\, d\alpha.
\]

Appendix B

The Möbius function $\mu$

However well concealed, the ideas of Davenport [3], using ideas of Vinogradov [9], using ideas of Hardy and Littlewood [7], are fundamental. It seems all we did was deduce good estimates from Davenport's result, here labeled as Theorem 1. However plausible this theorem seems given our heuristic about the Möbius function, the proof is not at all easy. The main objective of this appendix is presenting the ideas for a proof of this result of [3]. We use more modern language, and the ideas are taken from [8].

Theorem. For any real $\alpha$, any $A > 0$ and $x \ge 2$, we have

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( x (\log x)^{-A} \right),
\]

uniformly in $\alpha$.

Important remark. Mathematicians knew before the development of Vinogradov's method that a similar result was true for rational $\alpha$. In fact, they knew that for any $x \ge 2$ and $A > 0$, we have for any $(a, q) = 1$ that

\[
\sum_{m \le x} \mu(m) e^{\frac{2\pi i a m}{q}} = O\left( q x (\log x)^{-A} \right), \tag{B.1}
\]

the implied constant only depending on $A$. We will need this result in the proof of Theorem 1, in quite a crucial way. However, to prove it, we need some deep estimates of Dirichlet L-functions. An inclusion of their vast and beautiful theory would certainly double the size of this essay. Therefore, we will merely sketch the proof of (B.1) and direct the interested reader (which is hopefully everyone) to [8][Chapter 5].
One starts from the observation

\[
\sum_{m \le x} \mu(m) e^{\frac{2\pi i a m}{q}} = \sum_{d \mid q} \frac{\mu(d)}{\varphi\big( \frac{q}{d} \big)} \sum_{\chi \ (\mathrm{mod}\ \frac{q}{d})} \tau(\chi) \chi(a) \sum_{n \le x} \mu(n)\, \chi \chi_0^{(d)}(n),
\]

where the inner sum is over all Dirichlet characters, $\tau(\chi) = \sum_{n \ (\mathrm{mod}\ q)} \chi(n) e^{\frac{2\pi i n}{q}}$ is the Gauss sum, and $\chi_0^{(d)}$ is the principal Dirichlet character modulo $d$. We have hence reduced the problem to estimating $\sum_{n \le x} \mu(n) \chi(n)$ for a Dirichlet character modulo $q$.

We will show how an estimate for this last sum can be obtained from corresponding good estimates for the Dirichlet L-function $L(\chi, s)$. Indeed, notice that we have (wherever everything converges, including for example the region $\Re(s) > 1$)

\[
\frac{1}{L(\chi, s)} = \prod_p \left( 1 - \chi(p) p^{-s} \right) = \sum_n \mu(n) \chi(n) n^{-s} = \sum_n \mu(n) \chi(n)\, s \int_n^{+\infty} \frac{dx}{x^{s+1}} = s \int_1^{\infty} \frac{\sum_{n \le x} \mu(n) \chi(n)}{x^{s+1}}\, dx.
\]

While this expresses the L-function in terms of our sum, we can invert the relation using Perron's formula (see [4], [8] for instance) to obtain that for any real $c$ for which $L(\chi, s)$ converges, we have

\[
\sum_{n \le x} \mu(n) \chi(n) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{x^t}{L(\chi, t)\, t}\, dt.
\]

Note that in principle, we need $x$ not to be an integer for this to hold. If $x$ happens to be an integer, the last term in the sum should be multiplied by $\frac{1}{2}$. If one has sufficiently good estimates for the L-function, this is how an estimate for (B.1) is deduced. This estimation can indeed be accomplished sufficiently sharply in a zero-free region of this function. The precise details of the estimates for this L-function can be found in [8][Chapter 5].

B.1 A proof of Davenport's result

Proof sketch. The proof uses the fact that for any integer $\tau$, there exist $a, q$ with $(a, q) = 1$ and $q \le \tau$ such that $\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q\tau}$. This can be proven using elementary properties of continued fractions. We pick some $\tau$ to be specified later. The proof now splits up in two cases, according to the value of $q$ arising from our $\alpha$. This is motivated by noting that the exponential sum will be large when $\alpha$ is close to a rational number with small denominator.
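The rational approximation used at the start of the proof sketch can be computed explicitly. The following is a minimal sketch with a hypothetical function name; it swaps the continued fraction argument mentioned above for a plain search over all denominators $q \le \tau$, which is slower but easy to check, and it uses exact rational arithmetic so that the inequality $\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q\tau}$ is tested without rounding error (a floating point $\alpha$ is converted to the exact rational it represents).

```python
from fractions import Fraction
from math import gcd


def dirichlet_approximation(alpha, tau):
    """Return coprime (a, q) with 1 <= q <= tau and |alpha - a/q| <= 1/(q*tau).

    Brute-force search over q; a is the integer nearest to q * alpha.
    """
    alpha = Fraction(alpha)
    for q in range(1, tau + 1):
        a = round(alpha * q)
        if abs(alpha - Fraction(a, q)) <= Fraction(1, q * tau):
            g = gcd(a, q)  # reduce to lowest terms; the bound only improves
            return a // g, q // g
    # Dirichlet's approximation theorem guarantees the loop returns.
    raise AssertionError("no approximation found")


if __name__ == "__main__":
    alpha = Fraction(314159, 100000)
    for tau in (10, 100, 1000):
        a, q = dirichlet_approximation(alpha, tau)
        print(f"tau = {tau}: a/q = {a}/{q}, q * tau = {q * tau}")
```

The size of the returned $q$ relative to $x / \tau$ is exactly what decides which of the two cases of the proof applies.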
B.1.1 The case $\frac{x}{\tau} < q \le \tau$

Let us start by observing that if we pick any $y, z \ge 1$, then for any $n > \max\{y, z\}$ we have

\[
\mu(n) = - \sum_{\substack{d_1 d_2 \mid n \\ d_1 \le y,\ d_2 \le z}} \mu(d_1) \mu(d_2) + \sum_{\substack{d_1 d_2 \mid n \\ d_1 > y,\ d_2 > z}} \mu(d_1) \mu(d_2).
\]

This identity follows from straightforward manipulations. Indeed, since $\sum_{d \mid n} \mu(d) = 0$ unless $n = 1$, we have

\[
\mu(n) = \sum_{d_1 d_2 \mid n} \mu(d_1) \mu(d_2).
\]

We let the ranges for $d_1, d_2$ be split up by $y, z$ respectively, giving us four summations. Now apply the identity $\sum_{d \mid n} \mu(d) = 0$ unless $n = 1$ again to get

\[
\sum_{\substack{d_1 d_2 \mid n \\ d_1 \le y,\ d_2 > z}} \mu(d_1) \mu(d_2) = - \sum_{\substack{d_1 d_2 \mid n \\ d_1 \le y,\ d_2 \le z}} \mu(d_1) \mu(d_2) = \sum_{\substack{d_1 d_2 \mid n \\ d_1 > y,\ d_2 \le z}} \mu(d_1) \mu(d_2),
\]

from which the required identity follows. It allows us to rewrite the quantity under consideration:

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = - \sum_{d_1 \le y} \sum_{d_2 \le z} \sum_{d_1 d_2 d_3 \le x} \mu(d_1) \mu(d_2) e^{2\pi i \alpha d_1 d_2 d_3} + \sum_{d_1 > y} \sum_{d_2 > z} \sum_{d_1 d_2 d_3 \le x} \mu(d_1) \mu(d_2) e^{2\pi i \alpha d_1 d_2 d_3} + O\left( \max\{y, z\} \right),
\]

where the error term arises since our expression for $\mu(n)$ is only valid for $n > \max\{y, z\}$. It will turn out to be negligible.

As we will see, the sums on the right hand side are both relatively easy to estimate. The reduction to sums of this form is what is commonly referred to as Vinogradov's method. We shall discuss it in more general terms later.

We now proceed to finding estimates for the sums in the above expression. We consider some general exponential sums, forgetting about our Möbius functions for a while and treating the problem slightly more generally. Notice that

\[
\sum_{n \le N} e^{2\pi i \alpha n} = \frac{\sin \pi \alpha N}{\sin \pi \alpha}\, e^{2\pi i \alpha \frac{N+1}{2}}.
\]

If we write $\|\alpha\|$ for the distance from $\alpha$ to the nearest integer, we clearly obtain

\[
\Bigg| \sum_{n \le N} e^{2\pi i \alpha n} \Bigg| \le \min\left( N, \frac{1}{2\|\alpha\|} \right).
\]

By a repeated series of occasionally clever applications of this fact, we obtain the two estimates

\[
\sum_{m \le M} \sum_{n \le \frac{x}{m}} e^{2\pi i \alpha m n} = O\left( \left( M + \frac{x}{q} + q \right) \log 2qx \right),
\]

and

\[
\sum_{\substack{mn \le x \\ m > M,\ n > N}} u_m v_n e^{2\pi i \alpha m n} = O\left( \left( \frac{x}{M} + \frac{x}{N} + \frac{x}{q} + q \right)^{\frac{1}{2}} x^{\frac{1}{2}} (\log x)^2 \right),
\]

where $u, v$ are any complex vectors such that $|u_m|, |v_n| \le 1$.
Details can be found in [8][Chapter 13], but no deep results are needed apart from some elementary combinatorial arguments.

We can apply these two results to the first and second term respectively in our expression for $\sum_{m \le x} \mu(m) e^{2\pi i \alpha m}$, obtaining that

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( \left( q^{\frac{1}{2}} x^{\frac{1}{2}} + q^{-\frac{1}{2}} x + x^{\frac{4}{5}} \right)^{\frac{1}{2}} x^{\frac{1}{2}} (\log x)^4 \right). \tag{B.2}
\]

So far we have made no use of the specific case we are considering. Looking at our obtained error term, we see it will give us the best result when $q$ is relatively large, to keep the term $q^{-\frac{1}{2}} x$ inside the square root under control. Indeed, specifying $q$ to our considered range in this case, we obtain

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( \left( 2 \tau^{\frac{1}{2}} x^{\frac{1}{2}} + x^{\frac{4}{5}} \right)^{\frac{1}{2}} x^{\frac{1}{2}} (\log x)^4 \right).
\]

B.1.2 The case $q \le \frac{x}{\tau}$

In this case $\alpha$ is close to a rational number with small denominator, making the sequence of exponentials conspire with the Möbius function. This gives us the feeling that our heuristic on the Möbius function will not work, making a large contribution possible. We therefore expect this to be the dominant contribution, and will probably need more machinery to give a sharp bound than in the last section, which was essentially a sequence of elementary estimates.

We define

\[
S(x) := \sum_{n \le x} \mu(n) e^{\frac{2\pi i n a}{q}}.
\]

Now after writing $\alpha = \frac{a}{q} + \beta$, we obtain

\[
\sum_{n \le x} \mu(n) e^{2\pi i n \alpha} = \sum_{n \le x} \left( S(n) - S(n-1) \right) e^{2\pi i n \beta}.
\]

Rewriting a sum with an inner difference is usually done when one is hoping to apply partial summation, hence reducing the question to a quantity which is easier to estimate. Indeed, partial summation gives us that

\[
\sum_{n \le x} \mu(n) e^{2\pi i n \alpha} = S(x) e^{2\pi i x \beta} - \int_1^x S(t)\, 2\pi i \beta\, e^{2\pi i t \beta}\, dt.
\]

Calculating the integral explicitly seems hard. However, by picking $1 \le y \le x$ strategically, we can make $|S(y)|$ maximal, hence obtaining

\[
\Bigg| \sum_{n=1}^{x} \mu(n) e^{2\pi i n \alpha} \Bigg| \le \left( 1 + \frac{2\pi x}{q\tau} \right) |S(y)|,
\]

where we simplified using $|\beta| \le \frac{1}{q\tau}$. We have reduced our task to estimating $S(x)$, which is the case we treated in (B.1).

Applying (B.1) to our estimate in the first paragraph, we obtain that

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( \left( q + \frac{x}{\tau} \right) x (\log x)^{-5A} \right),
\]

which simplifies in our case, using $q \le \frac{x}{\tau}$, to

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( \frac{x^2}{\tau} (\log x)^{-5A} \right).
\]

B.2 Discussion of Vinogradov's method

Conclusion of the proof. In both cases, we get the estimate

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( \frac{x^2}{\tau} (\log x)^{-5A} + \tau^{\frac{1}{4}} x^{\frac{3}{4}} (\log x)^4 + x^{\frac{9}{10}} (\log x)^4 \right).
\]

We conclude by picking the value $\tau = x (\log x)^{-4A}$, obtaining that for any real $\alpha$, any $A > 0$ and $x \ge 2$, we have our desired result

\[
\sum_{m \le x} \mu(m) e^{2\pi i \alpha m} = O\left( x (\log x)^{-A} \right).
\]

The uniformity in $\alpha$ is a wonderful feature of this result. Unfortunately, some omitted steps in the above proof obscure this, but full details can be found in [8][Chapter 13].

Remark. Notice the difference in techniques used in both cases. The first case gave us a small contribution, a feature which is reflected in the fact that very elementary and straightforward estimates suffice. Indeed, it was only in the second case, the dominant contribution, that we resorted to using a deeper theorem on L-functions.

Vinogradov's method. It should also be noted that the above method generalises considerably, and the general method is commonly called Vinogradov's method. It is however hard to describe the general line of attack in this method, since its appearance varies greatly when applied to different problems. However, there seems to be a pattern present in many applications which could be described thus: We have a natural interest in prime numbers, and in this branch of mathematics specifically in sums over primes. These sums are closely related to sums over Möbius functions, where the connection goes by the name of the von Mangoldt function. We have seen this process in the main text and will meet it again later when we investigate the prime number theorem. Vinogradov's method could be seen as a way of making our heuristic of the Möbius randomness rigorous.
The method applies successfully to a wide range of functions, showing their orthogonality to the Möbius function. This is often done roughly as follows. Suppose we want to bound

\[
\sum_{n \le x} \mu(n) a_n,
\]

that is, we are trying to find out how orthogonal $a_n$ is to the Möbius function. By combinatorial arguments, often analogous to the ones used above, we split the sum into several smaller sums of the form

\[
S_1 := \sum_{m, n} u_m a_{mn} \quad \text{and} \quad S_2 := \sum_{m, n} v_m w_n a_{mn},
\]

where the sum runs over suitable values of $m$ and $n$. These two types of sums are usually referred to as Type I/II sums. The point of this rearrangement is that sums of these types can be estimated successfully. Typically, one estimates Type II sums using some more general estimates for bilinear forms, much like the ones used in the above proof. The details take on many different forms, but a possibility might be to split the Type II sums further, letting the variables range over, say, dyadic segments (see [6][Chapter 4] or [8][Chapter 13] for two examples). The Cauchy-Schwarz inequality allows one to eliminate coefficients.

The generality of this description is in accordance with its wide applicability. The above proof can be seen as a concrete example of this general description, and another stunning example can be found in [6][Chapter 4].

B.3 Discussion of importance

The importance of Theorem 1 can hardly be overestimated. The randomness of the Möbius function is fundamental to number theory, as we will see from its impressive consequences. As we have seen before, it also produces estimates strong enough for proving Vinogradov's theorem and 'almost' establishing the binary Goldbach conjecture. Another consequence we wish to treat is the prime number theorem, which is essentially equivalent to the previous result with $\alpha = 0$ in the form we will consider now.

Corollary 4. Let $\psi(x) = \sum_{n \le x} \Lambda(n)$, then for any $A > 0$, we have

\[
\psi(x) = x + O\left( x (\log x)^{-A} \right).
\]

Remark.
The reader might object to deducing the prime number theorem from Davenport's result. If one is prepared to use results on Dirichlet L-functions such as the ones quoted in the discussion of estimate (B.1), then we are virtually using the prime number theorem in the proof. Indeed, the situation should be investigated more closely to find out exactly how equivalent the properties of the L-functions are to the prime number theorem. In any case, we have chosen to present it this way, since we feel it underlines the fundamental nature of the Möbius randomness more. Indeed, Theorem 1 for $\alpha = 0$ is essentially equivalent to the prime number theorem, and it makes the mind wonder about what other deep results might correspond to different values of $\alpha$. Moreover, all these values of $\alpha$ give the same error term since we have uniformity.

Proof. Let us call

\[
\gamma = \lim_{n \to +\infty} \left( 1 + \frac{1}{2} + \ldots + \frac{1}{n} - \log n \right)
\]

the Euler constant. Since $\tau(n) = \sum_{d \mid n} 1$, we have (Möbius inversion) that

\[
1 = \sum_{d \mid n} \mu(d)\, \tau\left( \frac{n}{d} \right).
\]

Also remember that $\sum_{d \mid n} \mu(d) = 0$ unless $n = 1$, and that $\Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d}$. This allows us to write

\[
\lfloor x \rfloor - \psi(x) - 2\gamma = \sum_{n \le x} \sum_{d \mid n} \mu(d) \left( \tau\left( \frac{n}{d} \right) - \log \frac{n}{d} - 2\gamma \right) = \sum_{d d' \le x} \mu(d) \left( \tau(d') - \log(d') - 2\gamma \right).
\]

To bound this sum, set $f(n) = \tau(n) - \log(n) - 2\gamma$, so we can restate our reduced problem as finding an estimate for

\[
\sum_{d d' \le x} \mu(d) f(d').
\]

Notice that

\[
\sum_{n \le x} f(n) = \sum_{n \le x} \tau(n) - \sum_{n \le x} \log n - 2\gamma \lfloor x \rfloor = O(\sqrt{x}),
\]

by Dirichlet's formula and the easily verified $\sum_{n \le x} \log n = x \log x - x + O(\log x)$. We have information about the partial sums of $\mu$ (by Theorem 1) and of $f$ by the above, so a rearrangement in terms of partial sums of these functions would allow us to give an estimate. We can in fact rearrange in the following fashion:

\[
\sum_{d d' \le x} \mu(d) f(d') = \sum_{n \le a} \mu(n) \sum_{m \le \frac{x}{n}} f(m) + \sum_{n \le b} f(n) \sum_{m \le \frac{x}{n}} \mu(m) - \sum_{n \le a} \mu(n) \cdot \sum_{n \le b} f(n),
\]

for any positive $a, b$ such that $ab = x$. This is an easily verified rearrangement, which is exactly what we are looking for. The first term is $O\big( \sqrt{xa} (\log a)^{-A} \big)$, the last one $O\big( a \sqrt{b} (\log a)^{-A} \big)$.
To estimate the second term, we note that by partial summation

\[
\sum_{n \le b} \frac{f(n)}{n} = O\left( b^{-\frac{1}{2}} + \int_1^b t^{-\frac{3}{2}}\, dt \right) = O(1).
\]

We see now that $a = b = \sqrt{x}$ seems to give the optimal bound. This shows that

\[
\sum_{d d' \le x} \mu(d) f(d') = O\left( x (\log x)^{-A} \right),
\]

for any $A > 0$, as required.

Bibliography

[1] T. Apostol, Introduction to Analytic Number Theory, Springer-Verlag (1976).

[2] K.G. Borodzkin, On I. M. Vinogradov's constant, Proc. 3rd All-Union Math. Conf., vol. 1, Izdat. Akad. Nauk SSSR, Moscow (1956).

[3] H. Davenport, On some infinite series involving arithmetical functions. II, Quart. J. Math. Oxf. 8 (1937), 313–320.

[4] H. Davenport, Multiplicative number theory, Springer-Verlag, Third edition (2000).

[5] B.J. Green and T. Tao, The Möbius function is strongly orthogonal to nilsequences, Annals of Math., to appear.

[6] B.J. Green, Additive number theory, Part III lecture notes, available at http://www.dpmms.cam.ac.uk/~bjg23/ANT.html

[7] G. H. Hardy and L. E. Littlewood, Some problems of 'Partitio Numerorum'. III: On the expression of a number as a sum of primes, Acta Mathematica 44 (1922), 1–70.

[8] H. Iwaniec and E. Kowalski, Analytic number theory, AMS Colloquium publications, 53 (2004).

[9] I. M. Vinogradov, Representation of an odd number as a sum of three primes, Comptes Rendues (Doklady) de l'Academy des Sciences de l'USSR 15 (1937), 191–294.

[10] Deshouillers, Effinger, Te Riele and Zinoviev, A complete Vinogradov 3-primes theorem under the Riemann hypothesis, Electronic Research Announcements of the American Mathematical Society, 3 (1997), 99–104.