Lecture 1: jet algorithms Lecture 2: jet substructure
Transcription
Lecture 1: jet algorithms Lecture 2: jet substructure
MadGraph School Shanghai - November 2015 Jets Lecture 1: jet algorithms Lecture 2: jet substructure Matteo Cacciari LPTHE Paris Why jets A jet is something that happens in high energy events: a collimated bunch of hadrons flying roughly in the same direction We could eyeball the collimated bunches, but it becomes impractical with millions of events The classification of particles into jets is best done using a clustering algorithm Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 2 The pervasiveness of jets ‣ ATLAS and CMS have each published 400+ papers since 2010 ‣ More than half of these papers make use of jets ‣ 60% of the searches papers makes use of jets Plot by G. Salam (Source: INSPIRE. Results may vary when employing different search keywords) Why are jets so important? Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 3 Taming reality Multileg + PS ?? QCD predictions Real data Jets One purpose of a ‘jet clustering’ algorithm is to reduce the complexity of the final state, simplifying many hadrons to simpler objects that one can hope to calculate Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 4 Jet clustering algorithm A jet algorithm maps the momenta of the final state particles into the momenta of a certain number of jets: {pi} jet algorithm {jk} jets particles, 4-momenta, calorimeter towers, .... Most algorithms contain a resolution parameter, R, which controls the extension of the jet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 5 Jet Definition A jet algorithm + its parameters (e.g. R) + a recombination scheme = a Jet Definition /// define a jet definition JetDefinition jet_def(JetAlgorithm jet_algorithm, double R, RecombinationScheme rec_sch = E_scheme); Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 6 Two main classes of jet algorithms ‣ Sequential recombination algorithms Bottom-up approach: combine particles starting from closest ones How? Choose a distance measure, iterate recombination until few objects left, call them jets Works because of mapping closeness QCD divergence Examples: Jade, kt, Cambridge/Aachen, anti-kt, ….. ‣ Cone algorithms Top-down approach: find coarse regions of energy flow. How? Find stable cones (i.e. their axis coincides with sum of momenta of particles in it) Works because QCD only modifies energy flow on small scales Examples: JetClu, MidPoint, ATLAS cone, CMS cone, SISCone…... Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 7 A little history ‣Cone-type jets were introduced first in QCD in the 1970s (Sterman-Weinberg ’77) ‣In the 1980s cone-type jets were adapted for use in hadron colliders (SppS, Tevatron...) ➙ iterative cone algorithms ‣LEP was a golden era for jets: new algorithms and many relevant calculations during the 1990s ‣ Introduction of the ‘theory-friendly’ kt algorithm ‣ sequential recombination type algorithm, IRC safe ‣ it allows for all order resummation of jet rates ‣ Several accurate calculations in perturbative QCD of jet properties: rates, jet mass, thrust, .... Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 8 + ee kt (Durham) algorithm [Catani, Dokshitzer, Olsson, Turnock, Webber ’91] Distance: In the collinear limit, the numerator reduces to the relative transverse momentum (squared) of the two particles, hence the name of the algorithm ‣ Find the minimum ymin of all yij ‣ If ymin is below some jet resolution threshold ycut, recombine i and j into a single new particle (‘pseudojet’), and repeat ‣ If no ymin < ycut are left, all remaining particles are jets Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 9 + ee kt (Durham) algorithm in action 2-jet Characterise events in terms of number of jets (as a function of ycut) 3-jet 4-jet 5-jet Resummed calculations for distributions of ycut doable with the kt algorithm Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 10 + ee kt (Durham) algorithm v. QCD kt is a sequential recombination type algorithm One key feature of the kt algorithm is its relation to the structure of QCD divergences: The yij distance is the inverse of the emission probability ‣ The kt algorithm roughly inverts the QCD branching sequence (the pair which is recombined first is the one with the largest probability to have branched) ‣ The history of successive clusterings has physical meaning Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 11 Jet challenges at the LHC The LHC environment differs from the LEP one (and even the Tevatron) under many respects ‣ Number of final state particles much larger (order 103) ➙ needs a fast algorithm ‣ Many higher order calculations (NLO, NNLO) available ➙ needs an IRC-safe algorithm ‣ Presence of background (underlying event and pileup) ➙ needs small/known susceptibility and/or ability to subtract background ‣ Jets often initiated by a large-momentum heavy particle ➙ needs capability to distinguish boosted objet jet from QCD jet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 12 IRC safety An observable is infrared and collinear safe if, in the limit of a collinear splitting, or the emission of an infinitely soft particle, the observable remains unchanged: O(X; p1 , . . . , pn , pn+1 O(X; p1 , . . . , pn ⇥ pn+1 ) 0) O(X; p1 , . . . , pn ) O(X; p1 , . . . , pn + pn+1 ) This property ensures cancellation of real and virtual divergences in higher order calculations If we wish to be able to calculate a jet rate in perturbative QCD the jet algorithm that we use must be IRC safe: soft emissions and collinear splittings must not change the hard jets Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 13 Janus Jets From Wikipedia: In ancient Roman religion and myth, Janus (Latin: Ianus, pronounced [ˈiaː.nus]) is the god of beginnings and transitions, and thereby of gates, doors, passages, endings and time. He is usually depicted as having two faces, since he looks to the future and to the past. The Romans named the month of January (Ianuarius) in his honor. Like Janus, jets can serve two purposes: ‣ Observables ‣ Tools to be defined, calculated, measured to be used to extract specific properties of the final state Different clustering algorithms have different properties and characteristics that can make them more or less appropriate for each of these tasks Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 14 Jets as tools Background characterisation and subtraction Mass reconstruction Remove soft contamination from a hard jet Tag heavy objects originating the jet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 15 (Boosted) jet studies at the LHC Lily Asquith, summary talk at BOOST 2015 Essentially none of these tools existed as lately as seven years ago Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 16 Glossary What AKT CA BDRS trimmed pruned HTT N-subjettiness WTA one-pass JVT ρ D2 PUPPI Soft Drop Matteo Cacciari - LPTHE i.e. Anti-kt algorithm Cambridge/Aachen algorithm mass-drop tagger, includes filtering Trimming, tagger/groomer Pruning, tagger/groomer HepTopTagger jet shape function, used in tagging Winner-Take-All (recombination scheme) choice of axis for N-subjettiness Jet Vertex Tagger (used in pileup subtr.) background density (used in pileup subtr.) jet shape function, used in tagging particle-by-particle pileup subtr. tagger/groomer MadGraph School - Shanghai - October 2015 When 2008 1999 2008 2009 2009 2009 2010 2013 2010 2014 2007 2014 2014 2014 Ref. 0802.1189 9907280 0802.2470 0912.1342 0903.5081 0910.5472 1011.2268 1310.7584 0707.1378 1409.6298 1407.6013 1402.2657 17 Outline ‣Algorithms, speed ‣Infrared and collinear safety Lecture 1 ‣Background (pileup) ‣Substructure Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 Lecture 2 18 Outline ‣Algorithms, speed ‣Infrared and collinear safety ‣Background (pileup) ‣Substructure Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 19 The common wisdom circa 2005 ‣Cone algorithms are IRC unsafe ➙ because, to make them reasonably fast, they were usually implemented via approximate methods using seeds ‣Sequential recombination algorithms (i.e. kt) are slow and too susceptible to background contamination ➙ because they scale like N3 ➙ because they tend to collect soft particles up to large distances from centre ➙ because they were often run with R=1 and compared to cones with R=0.5! Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 20 Geometry The solution to the speed problem came from considering the clustering problem from a geometrical rather from a combinatorial point of view ‣ Sequential recombination algorithms could be implemented with O(N2) or even O(NlnN) complexity rather than O(N3) [MC, Salam, 2006] ‣ Cone algorithms could be implemented exactly (and therefore made IRC safe) with O(N2lnN) rather than O(N2N) complexity [Salam, Soyez 2007] Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 21 The kt algorithm and its siblings 2 + Δφ2 Δy 2p 2p di j = min(kti , kt j ) R2 p=1 kt algorithm 2p diB = kti S. Catani,Y. Dokshitzer, M. Seymour and B. Webber, Nucl. Phys. B406 (1993) 187 S.D. Ellis and D.E. Soper, Phys. Rev. D48 (1993) 3160 p = 0 Cambridge/Aachen algorithm Y. Dokshitzer, G. Leder, S.Moretti and B. Webber, JHEP 08 (1997) 001 M. Wobisch and T. Wengler, hep-ph/9907280 p = -1 anti-kt algorithm MC, G. Salam and G. Soyez, arXiv:0802.1189 NB: in anti-kt pairs with a hard particle will cluster first: if no other hard particles are close by, the algorithm will give perfect cones Quite ironically, a sequential recombination algorithm is the ‘perfect’ cone algorithm Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 22 IRC safety of generalised-kt algorithms 2 + Δφ2 Δy 2p 2p di j = min(kti , kt j ) R2 2p diB = kti p>0 New soft particle (kt →0) means that d → 0 clustered first, no effect on jets New collinear particle (Δy2+ΔΦ2 → 0) means that d → 0 clustered first, no effect on jets p=0 New soft particle (kt →0) can be new jet of zero momentum New collinear particle (Δy2+ΔΦ2 → 0) means that d → 0 no effect on hard jets clustered first, no effect on jets p<0 New soft particle (kt →0) means d →∞ clustered last or new zero-jet, no effect on hard jets New collinear particle (Δy2+ΔΦ2 → 0) means that d → 0 Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 clustered first, no effect on jets 23 IRC safe algorithms kt Cambridge/ Aachen anti-kt SISCone SR dij = min(kti2,ktj2)ΔRij2/R2 hierarchical in rel p Catani et al ‘91 Ellis, Soper ‘93 NlnN Dokshitzer et al ‘97 Wengler, Wobish ‘98 NlnN t SR dij = ΔRij2/R2 hierarchical in angle SR MC, Salam, Soyez ’08 2 -2 -2 2 dij = min(kti ,ktj )ΔRij /R (Delsart, Loch) gives perfectly conical hard jets Seedless iterative cone with split-merge Salam, Soyez ‘07 gives ‘economical’ jets N3/2 N2lnN ‘second-generation’ algorithms All are available in FastJet, http://fastjet.fr (As well as many IRC unsafe ones) Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 24 Jet clustering in FastJet /// define a jet definition JetDefinition jet_def(JetAlgorithm jet_algorithm, double R, RecombinationScheme rec_sch = E_scheme); jet_algorithm can be any one of the four IRC safe algorithms, or also most of the old IRC-unsafe ones, for legacy purposes /// create a ClusterSequence, extract the jets ClusterSequence cs(input_particles, jet_def); vector<PseudoJet> jets = sorted_by_pt(cs.inclusive(jets)); ... // pt of hardest jet double pt_hardest = jets[0].pt(); ... // constituents of hardest jet vector<PseudoJet> constits = jets[0].constituents(); Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 25 FastJet speed Time needed to cluster an event with N particles 2 10 kt n e t ch an a A / e g d i r b m Ca t i-k Intel i5 760 FastJet 3.0.1 R=0.7 10 Co ne 1 10 -2 10 -3 10 -4 10 -5 10 -6 PbPb collisions SIS time (s) 10-1 hadron collisions with pileup anti-kt kt C/A SISCone hadron collisions 10-7 1 10 102 103 104 105 106 107 N Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 26 Outline ‣Algorithms, speed ‣Infrared and collinear safety ‣Background (pileup) ‣Substructure Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 27 Hard jets and background In a realistic set-up underlying event (UE) and pile-up (PU) from multiple collisions produce many soft particles which can ‘contaminate’ the hard jet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 28 Pileup 78-vertices event from CMS https://cds.cern.ch/record/1479324 Pileup can deposit several tens of GeV (or even hundreds, in a heavy ion collision) into a medium-sized jet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 29 Hard jets and background How are the hard jets modified by the background? Susceptibility (how much bkgd gets picked up) Resiliency (how much the original jet changes) Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 Jet areas Backreaction 30 Jet areas A jet’s area is defined as the extent of the region where infinitesimally soft particles get clustered into the jet More in details, a jet’s active area is the extent of the region where a distribution of infinitesimally soft particles, that can also cluster among themselves, is clustered into the jet A jet’s active area measures a jet’s sensitivity to contamination from soft particles like underlying event and pileup Matteo Cacciari - LPTHE Jets do not necessarily have cone-shaped profiles MadGraph School - Shanghai - October 2015 31 kt Cam/Aa SISCone anti-kt Resiliency: backreaction “How (much) a jet changes when immersed in a background” Without background With background Backreaction loss Backreaction gain Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 33 Resiliency: backreaction 1/N dN/dpt (GeV-1) pT loss 1 SISCone (f=075) Cam/Aachen kt anti-kt pT gain Pythia 6.4 LHC (high lumi) 2 hardest jets pt,jet> 1 TeV 0.1 |y|<2 R=1 0.01 0.001 -20 -15 -10 -5 0 pt(B) (GeV) 5 10 15 Anti-kt jets are much more resilient to changes from background immersion (NB. Backreaction is a minimal issue in pp background and at large pt. Can be much more important in Heavy Ion collisions) Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 34 Anti-kt jets and background Anti-kt jets maximise resiliency, and their regular shapes makes them easier to correct for detector-related effects Default choice of all LHC collaborations Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 35 The IRC safe algorithms Hierarchical UE/pileup Speed Regularity contamination Backreaction substructure kt ☺☺☺ ☂ ☂☂ ☁☁ ☺☺ Cambridge /Aachen ☺☺☺ ☂ ☂ ☁☁ ☺☺☺ anti-kt ☺☺☺ ☺☺ ☁/☺ ☺☺ ✘ SISCone ☺ ☁ ☺☺ ☁ ✘ Array of tools with different characteristics. Pick the right one for the job Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 36 Hard jets and background Modifications of the hard jet ⇧ pt = A ± (⇥ A + ⇥ A + ⇤A2 ⌅ ⇤A⌅2 ) + BR pt Background momentum density (per unit area) background ‘susceptibility’ Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 back-reaction ‘resiliency’ 37 Background subtraction Two main approaches to background subtraction ‣ Jet-based ‣ Cluster the full event, determine the event-specific (ρ) and jet- specific (A) quantities, and subtract the relevant contamination from a given observable ‣ Pros: largely unbiased subtraction ‣ Cons: slow, potentially large(er) residual uncertainty ‣ Examples: `jet area/median’ in FastJet, GenericSubtractor for jet shapes, JetFFMoments for fragmentation functions, .... ‣ Particle-based ‣ Produce a reduced event, by dropping some of the particles. Cluster this reduced event, and calculate from it the observables ‣ Pros: fast, often small(er) residual uncertainty ‣ Not natively unbiased, can depend on choice of parameters ‣ Examples: ConstituentSubtractor, SoftKiller, PUPPI, .... Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 38 Background subtraction: jet-based Correction of a jet transverse momentum hard jet, corrected pT = hard jet, raw pT ρ ⇥ Areahard jet MC, Salam, 0707.1378 If ρ is measured on an event-by-event basis, and each jet subtracted individually, this procedure will remove many fluctuations and generally improve the resolution of, say, a mass peak ⇧ pt = A ± (⇥ A + ⇥ A + ⇤A2 ⌅ ⇤A⌅2 ) + pBR t Irreducible fluctuations: uncertainty of the subtraction Needs two ingredients: ρ and Ajet Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 39 Jet areas Definition of a specific jet area and its calculation for each jet in FastJEt /// specify where to place the ghosts, their area, how many times /// to repeat the clustering double rapmax = 2.5; int nrepeat = 1; double ghost_area = 0.01; GhostedAreaSpec gas(rapmax, nrepeat, ghost_area); /// use this configuration to define the area /// A sensible default for gas is provided AreaDefinition area_def(active_area, gas); /// construct cluster sequence with areas ClusterSequenceArea(input_particles, jet_def, area_def); .... jets[0].area() Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 40 Background estimation and subtraction // constructor for a background estimator JetMedianBackgroundEstimator bge(Selector sel, JetDefinition jet_def, AreaDefinition area_def); // an alternative (faster) background estimator //GridMedianBackgroundEstimator bge(Selector sel, grid_step); bge.set_particles(input_particles); .... double rho = bge.rho(jet); // extract rho estimation // define a subtractor Subtractor sub(&bge); // apply it to a jet (or a vector of jets) PseudoJet subtracted_jet = sub(jet); Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 41 Numerical jet shape correction Soyez et al. 1211.2811 A generic jet shape (a function of the momenta of all constituents of a jet) is modified by the addition of pileup Correct it by calculating numerically the derivatives that enter its Taylor expansion and subtracting (this generalises the jet area/median subtraction for transverse mom.) Pileup momentum density Matteo Cacciari - LPTHE Numerical derivative w.r.t. ghosts momenta MadGraph School - Shanghai - October 2015 42 Shape subtraction Pilup subtraction from jet shapes using GenericSubtractor from fjcontrib #include “ExampleShapes.hh” #include “GenericSubtractor.hh” // define a specific jet shape contrib::Angularity ang(1.0); // angularity with alpha=1.0 // define a generic subtractor ... construct a background estimator bge_rho.... contrib::GenericSubtractor gen_sub(&bge_rho); // compute the subtracted shape double subtracted_ang = gen_sub(ang, jet); Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 43 Particle-level pilup removal Pilup removal using SoftKiller from fjcontrib (SoftKiller progressively removes soft particles until ρ of event is zero) #include “SoftKiller.hh” // define SoftKiller double grid_size = 0.4; contrib::SoftKiller soft_killer(rapmax, grid_size); // apply it to the full event double pt_thresh //returns threshold for killed particles vector<PseudoJet> soft_killed_event; soft_killer.apply(full_event, soft_killed_event, pt_thresh); // proceed with clustering and calculating shapes with // the reduced soft_killed_event ClusterSequence(soft_killed_event,.....) Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 44 Comparisons of pileup subtractions from the CERN pileup mitigation workshop, https://indico.cern.ch/event/306155/ PRELIMINARY 3 areasub constitsub safeareasub soft killer puppi safenpcsub lin cl.(Rsub=0.3) 2.5 anti-kt R=0.4 30, 60, 100, 140 PU 2 σΔO (GeV) DISPERSION CHS event, jet m, pt=020 WORSE 1.5 BETTER 1 0.5 BIAS 0 -2 -1.5 -1 -0.5 0 <ΔO> (GeV) 0.5 1 1.5 2 BETTER Matteo Cacciari - LPTHE MadGraph School - Shanghai - October 2015 45 Recap of lecture 1 ‣ A number of different IRC-safe jet algorithms exist ‣ They all try to be good proxies for hard partons, but they have different characteristics, especially with respect to soft particles ‣ Jets from all algorithms inevitably suffer from pileup contamination ‣ Techniques exist to subtract it, either at jet-level, or at particle-level ‣ Both the jet algorithms and many pileup subtraction techniques are packaged aither in FastJet or in fjcontrib contributions ‣ Use of standard algorithms and packages (either directly or through interfaces) should be privileged, as it ensures reproducibility http://fastjet.fr Matteo Cacciari - LPTHE http://fastjet.hepforge.org/contrib/ MadGraph School - Shanghai - October 2015 46