LAr operation manual
This document is focused on the operation of the LAr calorimeter in the ATLAS Control Room. For more details on the hardware, you should refer to the original LAr Operation Manual that can be found here: https://edms.cern.ch/document/834898/ . If you are reading this manual in PS or PDF, please realize that it is being updated each week. Feel free to use it as a general reference, but for any particular details (which scripts to use, where to find the data, etc.) refer to the version on the web at P1, http://pc-atlas-www.cern.ch/lar/doc/AtLarOper.html/index.html , or outside P1, https://atlasop.cern.ch/atlas-point1/lar/doc/AtLarOper.html/index.html . In general P1 links are given first, and links outside P1 are identified with the (*) symbol.

Contents

1 Getting Started
1.1 Before you arrive for shifts
1.2 When you first arrive for shift
1.3 During the shift
1.4 End of shift

2 Calibration Runs
2.1 Taking Calibration Runs
2.2 Transferring data to castor at the end of a run
2.2.1 Automatic processing of the calibration runs
2.2.2 Number of events to complete a run
2.3 Monitoring of the Calibration Runs
2.4 Old way to take calibration runs
2.5 Special calibration runs
2.5.1 SCA test runs
2.5.2 Start a trigger calibration run

3 Physics Runs
3.1 What to do at the start of a physics run?
3.2 Monitoring of physics run

4 Environment
4.1 The DAQ panel
4.2 Opening the Monitoring Advanced Panel
4.3 Basic functionalities of the DAQ GUI
4.3.1 LAr H/W control parameters
4.3.2 Complex deadtime and the Central Trigger Processor
4.3.3 Segment and Resource
4.3.4 LAr Crates
4.3.5 The Run parameters
4.4 Description of different processes involved
4.5 OKS database
4.6 DCS - Detector Control and Safety
4.6.1 Check the LVPS Status
4.6.2 Check the ROD Crate Status
4.6.3 Check HV Status
4.6.4 Check the DCS Alarms Screen
4.7 Using the ATLAS e-log (ATLOG)
4.7.1 Access and use of Elog
4.7.2 Information to put in the Elog

5 Monitoring and Data Quality
5.1 Monitoring Displays
5.1.1 DQMD
5.1.2 OHP
5.1.3 Trigger Presenter
5.1.4 Atlantis
5.1.5 Other monitoring tools at P1
5.2 Where to find the monitoring data?
5.2.1 Setting up ROOT
5.2.2 Online Calibration ROOT Files
5.2.3 Online Physics ROOT Files
5.2.4 Offline Physics ROOT files
5.2.5 Using the event dump
5.3 Data Quality Checklists
5.3.1 Online - Calibration Runs
5.3.2 Online - Physics Runs
5.4 Monitoring plots description
5.4.1 Run Parameters
5.4.2 Detector Coverage
5.4.3 Data Integrity DSP
5.4.4 Data Integrity FEB
5.4.5 High Energy Digits
5.4.6 Timing
5.4.7 Energy Flow
5.4.8 Quality Factor
5.4.9 MisBehaving Channels Digits
5.4.10 MisBehaving Channels CaloCells
5.4.11 MisBehaving Channels RawChannels
5.4.12 CaloGlobal

A Tips to work at P1
A.1 Access rights
A.2 Network
A.3 Logout at P1
A.4 Printers
A.5 Phone numbers
A.6 Updating this document and checklists
A.7 Creating graphics (screenshots) at P1
B Hardware memento

C Few hints on events dump

D What to remind from the old discussion forum?
D.1 On the triggers
D.2 Data flow picture

E More info about LAr FEB errors

1 Getting Started

Welcome to Liquid Argon ATLAS shifts! This manual should help you to complete all the tasks given to a LAr shifter. When you find errors, things that are out of date, or sections that are confusing or could be improved, please let us know. During the start-up in 2009, we will have a huge number of new people going through this document for the first time. One of the jobs of those LAr shifters is to help make shifts better, by letting us know what we can do to improve. Please put your comments/questions/additions on the LAr Bug Reports4, where they can be seen and addressed by experts. If you have any questions, you can contact the Run Coordinators: Paolo Iengo, Jessica Leveque, Stephanie Majewski, or Damien Prieur.

1.1 Before you arrive for shifts

There are several things you can do before you arrive for shifts.

1. Get access to the Control Room. You will need a CERN ID with access to the "ATL CR" region to get through the door of the main control room. If you have not taken the CERN safety courses, you need to take the "Basic safety course" in person at CERN, offered twice each day in English and French. After that, take the "level 4" course online. Then request ATL CR access in EDH. More details can be found at https://atlasop.cern.ch/twiki/bin/view/Main/LArShiftSignup5.

2. You will also need access to the LAr satellite control room, next to the main control room. To do this, go with your CERN ID to S. Auerbach (located at 124-R011). Tell him you will take LAr shifts and request access to 3159-R012. He will place your ID into a machine and add the access rights there.

3. You will need accounts for the Point 1 machines and e-log. The LAr Operations crew will request these accounts for you once you have signed up for shifts, and you should receive an email letting you know. The password for your P1 account will be the same as your NICE password for other applications at CERN.

4. You should read the most recent slides under "LAr Shifter Tutorials" found at this page from outside P16 or at P17, and go through this manual.

1.2 When you first arrive for shift

Log into the RunCom tool ("New Shifter" and "Ready") and follow the instructions from the "signin-LAr" checklist that opens. If you do not know how to do this, here is a recipe:

• If the computers at the LAr desk are not already logged in, start the session. The username is "crlar" and you can just hit "Enter" in the password field without filling in a password. When prompted to select a role, choose "LAR:shifter".
4 https://savannah.cern.ch/bugs/?func=additem&group=lar https://atlasop.cern.ch/twiki/bin/view/Main/LArShiftSignup 6 https://atlasop.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#LAr Shifter Tutorials 7 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#LAr Shifter Tutorials 5 4 • The RunCom tool will take you through the actions you need to accomplish as the shift begins. Make sure the RunCom tool is open, if it is not, click on the bottom menu bar: “General” → “RunCom Tool”. Click “new shifter” and then “Ready”. The “signin-LAr” checklist will pop up. Complete this checklist. • If the RunCom tool is broken, or if the checklists are broken, let the shift leader know immediately. You can access LAr documentation by opening a web browser. The ATLAS home page should come up. Select ”Documentation” on the left, and then ”LAr”. 1.3 During the shift You will have to go through all the checklists during your shift (accessible from the “LAr” menu → “LAr Checklists”). When a new run is starting, your first priority is to look at the DQMD monitoring. During the first 3 minutes of a run, it is ESSENTIAL to spot immediately data integrity problems which may need to stop the run and start a new one. When you find some free time during your shift, read the calibration section (section 2) and the Calibration checklist, to be prepared for the calibration taking period which could be in a hurry. The rest of this manual should provide the information necessary to deal with tasks and problems during the shift as they come up. Let us know what you find lacking. When problems come up, the procedure should be to 1. Know the information on the LAr WhiteBoard web page. You should read it when you first arrive for shift, and keep it in mind. It may contain instructions that are new or especially vital for this particular shift. 2. Look at the wiki “Guidelines for LAr errors” which contains procedures for dealing errors on the fly, such as the LAr being “busy”, TDAQ error messages, monitoring PT’s crashing, OHP not showing plots, etc. 3. Next, check this manual ONLINE. Do not rely on a paper copy which will become outdated. The official version is the one on the web. 4. If you cannot find the answer to your question on the WhiteBoard, Guidelines for Errors, or in the manual, call relevant experts and please make a note of the fact that you couldn’t find the documentation you needed in your e-log entry. 1.4 End of shift At the end of the shift, you will need to finish your e-log Shift Summary, and submit it IMPERATIVELY 15 MINUTES BEFORE the end of your shift. Choose Message Type : Shift Summary, ShiftSummary Desk : LArg, System affected : LArg, Status : closed, Subject : “Shift summary for LArg Desk”. You should go through this Shift Summary with the crew of shifters that come after you, to clarify any points. 5 2 Calibrations Runs 2.1 Taking Calibration Runs Please follow these instructions to take standard LAr Calibration Runs in the Main ATLAS Control Room or the LAr Satellite Control Room. You will take three sets of runs, using a script to start each set, and closing the GUI (and DQMD and OHP) fully after each set. If you encounter problems, have a look at the Troubleshooting page8 (*)9 . 1. Inform the shift leader and run control shifter that LAr would like to be “out” for the calibration period (and therefore that LAr should be removed from the ATLAS partition). 
If any LV power supplies have been turned on immediately before calibrations, check the Troubleshooting page to see what actions to take.

2. Define with the Run Coordinator which set of runs you need to take. "Weekly runs" correspond to Pedestal runs in 32 samples mode, Ramp runs in 7 samples and Delay runs in 32 samples. "Daily runs" correspond to Pedestal and Ramp runs, both in 7 samples; there is no Delay run in this case.

3. Start the Calibration Checklist from the LAr Menu, and follow those instructions. The following information is supplemental.

4. Copy and paste the ELOG template10 into a text editor (kedit) to use during the calibration runs.

5. Verify that no one else is using LAr. To do that, look at the ATLAS Data taking Status11. If LAr is still in the ATLAS partition during the period allocated to calibration, it is OK as long as the Root Controller State is NONE. Be especially sure to communicate with the shift leader / run control desk; make sure you are finished with any calibration partitions before they boot the ATLAS partition.

6. With the nominal settings on the DAQ Panel, click on the LAr tab (see Figure 1). Click on "Calibration Runs" to start the calibration script (note that you no longer need a terminal). Choose a partition from the drop-down dialog box, then click OK. Unless otherwise instructed, start with the EM partition, followed by HECFCAL, and finish with the PS partition. This will open the TDAQ GUI, OHP and DQMD. Other partitions (listed in Table 1) are special cases listed here for reference and should not be used unless you were explicitly asked to by the run coordinator.

7. Switch off "Enable ATLOG interface" in the "Settings" menu of the TDAQ GUI, to avoid too many ELOG entries.

8. Open an MRS window (from the button at the top of the TDAQ GUI window, or just use the window at the bottom of the GUI) and change the number of messages to 2000 (in the "Number of visible rows" field).

9. Load the "MasterPanel" inside "LoadPanels" in order to see the "LAr" tab.

8 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArTroubleShootingCalibration
9 https://atlasop.cern.ch/twiki/bin/view/Main/LArTroubleShootingCalibration
10 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary Calibration.txt
11 http://pc-atlas-www.cern.ch/wmi/RunStatus.html

PartitionTag   Description                                               PartitionName
EM             full Barrel and EMEC A and C sides (including EMEC PS)    LArgEm
HECFCAL        HEC and FCAL                                              LArgHecFcal
PS             Barrel PreSampler                                         LArgBarrelPS
EMB            full Barrel                                               LArgBarrel
ALL            All LAr                                                   LArgAll

Table 1: List of available partitions. LArgBarrel is used only if the endcaps are unavailable. LArgAll is not yet used for Calibrations.

10. Go now into the "Shifter" tab, under the "LAr" tab.
• Press the "Daily Run" or "Weekly Run" button ONCE. Do NOT click any other buttons in the Shifter panel or the Run Control panel.
• Select "yes" in the confirmation window that will appear.
• A window "Remember to check the settings" opens at the beginning of each new run; answer "OK".
• At this point, the set of runs for the applicable partition will be taken. Current information relating to the run(s) in progress will be displayed in the text area called "Information" on the panel. For more details, there is another text area, "Log & Thread Information", which contains a log of the completed runs and whether or not the user aborted the runs.
Next to it, are a series of fields which display the status of threads used by the panel - green is for an active thread, red for a thread that is no longer active, and the default gray color means that the thread hasn’t been executed. • At the end of each run, the Data Integrity is automatically checked. If there is an error, two windows pop up : Figure 1: The LAr Tab on the DAQ panel, showing the “Calibration Runs” button. 7 (a) “The run had errors, please check the log and determine if you need to retake the run ”. Answer “YES” or “NO”. (b) the log information. Click “YES” to retake the run; “NO” to continue. Please use your judgement when deciding to retake runs (based on whether you are taking a weekly or daily set, how much time you have before the end of the calibration period, how many FEB errors occurred, etc.). If you are not sure, consult with the experienced shifters or the run coordinator. If the error is related to a known problematic FEB, don’t take the run again (→ the error will stay for all runs). In the “Log & Thread Information” area, you will see which of the runs have failed the Data Integrity check and your decision to retake them or not. • In case of problems, the shifter may stop the runs by pressing the “ABORT” button within the “Shifter” panel. This is done to ensure that all of the threads executed by the panel have fully exited. • Watch for important messages scrolling in the MRS window. If you see any errors, check the Troubleshooting page (*) and the Whiteboard to see if the data are still good, or how to recover from the errors. Copy the messages in your ELOG summary. • Look at the DQMD and OHP windows during the run. If you see any red partitions in DQMD, the complete campain for this partition should be retaken. Make a note of the problem in your ELOG entry. • At the end of the data taking, copy the information about the calibration runs from the Emacs session which will open, into your ELOG summary. Verify the contents of these lines: proper gains, sample numbers, and run numbers! Some information about the data integrity check is also provided. If you encounter problem with this Emacs session, look in your home directory at rf_cal_runs.log (formatted log with data integrity check); note that this file is always overwritten, so they only reflect the last set taken. You can always look into the complete Calibration log file in ~lardaq/LAr-CalibrationRuns.log (if this file has *not* been automatically updated, report the problem to the run coordinator), but in that case, you will have to run yourself the script of section 5.3.1 to access data integrity information. • Remark: You don’t need anymore to copy the runs to CASTOR, it is done automatically. 11. When all of the runs for one partition are finished, click on “SHUTDOWN” in the “Shifter” panel and then close the TDAQ GUI by clicking the EXIT button in the file menu (top lefthand corner). When asked if you want to shut down the partition infrastructure, say “yes”. Also close all DQMD and OHP windows. 12. Start again from the LAr tab of the DAQ Panel for the other partitions (go back to item 4). 13. Post only one ELOG entry for the whole sets of runs. For standard calibrations, please choose Message Type: LArg, LArg EntryType: Calibration Summary, select the appropriate LArg Partitions, and for the Subject line: use the proposed template in the ELOG summary. 2.2 Transferring data to castor at the end of a run The data are normally copied automatically to CASTOR. 
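If you want to spot-check the automatic copy by hand, the minimal sketch below illustrates the idea. It assumes you can ssh from the control room desk to one of the event builder machines (here the Barrel A side one, pc-lar-eb-02; the full list is given below), that the CASTOR listing command nsls is available where you run it, and that the run number appears in the file names; <RunNumber> is a placeholder to replace. When in doubt, prefer the check_calib_run.sh script described just below.

># is the run still sitting in the copy queue of the event builder?
>ssh pc-lar-eb-02 'ls /data/copy | grep <RunNumber>'
># or has it already reached CASTOR?
>nsls /castor/cern.ch/grid/atlas/DAQ/lar/ElecCalib/2009 | grep <RunNumber>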
You can check this with the script /det/lar/project/scripts/check_calib_run.sh. The calibration runs should appear in /data/copy. If that is not the case, they may already appear in CASTOR; check the location /castor/cern.ch/grid/atlas/DAQ/lar/ElecCalib/2009.

If you need to copy the runs yourself, use the script /det/lar/project/scripts/copy_run_to_castor.sh, which takes one argument, the run number. Execute the script once for each run.

If you want to look at things in more detail:

• Log on to the machines where the data are written (pc-lar-eb-02 (A side) and pc-lar-eb-03 (C side) for the Barrel event builders, pc-lar-eb-01 (A side) and pc-lar-eb-04 (C side) for the Endcap event builders). When a run is considered good, the data should have been moved from the /data/check directory to the /data/copy directory of the event builder machine. The data will then be automatically copied to CASTOR by a daemon script. Given the huge amount of acquired data, a regular cleaning of the /data/check directory is performed: any data older than 1-2 days may be erased from it.

• Experts only: If you want to keep data in a safe place for further debugging but do not want them copied to CASTOR and processed, you can copy them to /data/temp, indicating in RunLog.txt why you want to keep these data. This is especially useful for storing runs with many data integrity problems that should be debugged by an expert. The runs and errors concerned should also be reported in the Elog entry so that experts are aware of the problems.

2.2.1 Automatic processing of the calibration runs

A daemon is in charge of handling the Automatic Processing (AP) of the calibration runs. A cron job regularly checks whether new calibration runs have been copied to CASTOR and whether the corresponding information is stored in the database. If so, the AP is launched. You can check the status of electronic calibration runs on the following link12. In general, it takes from 10 to 30 minutes between the end of the data taking and the start of the AP. If all partitions are complete [EM, BarrelPS, HECFCAL], the script waits only 10 minutes after the last run was taken and copied to CASTOR before launching the AP. If a partition is incomplete or missing [for example, only the EM partition was taken], the script waits about 30 minutes after the last run was taken and copied to CASTOR before launching the AP. The purpose of this safety period is to keep the opportunity to gather runs that might be taken again because they appear to be corrupted.

If you discover a problem concerning the running of the AP, log on to the hypernews for electronic calibration13 (only accessible from outside P1). Check whether people have already reacted to the automatic message from the AP. If not, send a message to the ECAL team using this mailing list. If the problem seems to be related to the quality of the data, re-take the complete campaign for that partition as soon as possible.

2.2.2 Number of events to complete a run

In the case of pedestal runs, the maximum number of events is given either by the default value or in the Run parameter panel. In the case of delay/ramp/cabling runs, it is automatically determined by the calibration pattern:

• Ramp: N_events = N_substeps × N_events per substep × N_patterns × N_fine delays
• Delay: N_events = N_substeps × N_events per substep × N_patterns × N_DAC values

For more information on the patterns, the definition files can be found in: ~lardaq/LAr-CalibrationRuns.log.
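As a purely illustrative example of the formula above (the pattern parameters here are invented for the arithmetic only, not taken from any real pattern file): with N_substeps = 1, N_events per substep = 100, N_patterns = 4 and N_DAC values = 16, the run would stop by itself after

N_events = 1 × 100 × 4 × 16 = 6400 events.

Comparing the event counter in the TDAQ GUI with the number expected from the pattern is a quick way to judge whether a run is progressing normally.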
12 http://lar-elec-automatic-processing.web.cern.ch/lar-elec-automatic-processing/
13 https://groups.cern.ch/group/hn-atlas-lar-electronic-calibration/default.aspx

2.3 Monitoring of the Calibration Runs

After typing > source /det/lar/project/scripts/LArCalibRunSetup.sh [PartitionTag] in a P1 terminal, two windows open in addition to the TDAQ GUI: OHP and DQMD. If this does not work, you can also launch the applications with:

• > dqmd -p [PartitionName] for DQMD
• > ohp -p [PartitionName] -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifterCalib.ohp.xml for OHP

where [PartitionTag] and [PartitionName] are defined in Table 1. OHP and DQMD allow you to monitor calibration data during the run. After the runs have been taken, monitoring can be done on ROOT files as mentioned in section 5.3.1.

2.4 Old way to take calibration runs

The complete procedure explained in section 2.1 is still valid, but in place of clicking on the "Weekly" or "Daily" button (item 7), you may want to follow the basic instructions for taking specific calibration runs:

• Go to the Shifter panel (under the LAr panel).
• Choose Gain - High, Medium, or Low.
• Press the CONFIGURE button - this is right above the information field. DO NOT click the buttons to the left in the Run Control panel.
• Choose Monitoring Tools - by default, only FebMon is selected.
• Choose Run Type - Pedestal, Delay, Ramp. You should follow the program in the ELOG template for the run types and number of samples you should use.
• Decide the Number of Samples - generally, you should keep the default as shown, 0 (but be careful for pedestal runs: choose 32 for weekly, keep 0 for daily, as 7 is the present default).
• Press the RUN button.
• The run should be stopped by the shifters only in the case of problems.

In this procedure, no Emacs session will pop up. The information about the calibration runs which have just been taken can be found in the Calibration log file ~lardaq/LAr-CalibrationRuns.log.

2.5 Special calibration runs

1. Before starting a calibration run, make sure that LAr is not in the combined partition.

2. Get the following information from the expert(s) who requested the run. Also check the list of special runs below if it is one that is commonly taken (for example, SCA test runs).
• Partition(s)
• Gain(s)
• Calibration run type (Pedestal, Delay or Ramp)
• H/W settings for the runs
  – Run type
  – Filename tag, if specified
  – Number of samples
  – L1 Latency and First Sample
  – Data format
  – Calibration tag for Ramp runs

3. Launch a new shell, and type the following command to set up the DAQ online environment: > source /det/lar/project/scripts/LArCalibRunSetup.sh [PartitionTag], where the [PartitionTag] depends on which part of the detector you want to run (see Table 1).

4. To set the run parameters in the TDAQ GUI:
• In the left panel, hit Boot.
• Open the "LAr H/W Control" tab. Set the parameters in PARAMS GLOBAL according to the expert(s) instructions:
  – nbOfSamples
  – gainType
  – l1 Latency
  – firstSample
  – format
  – runType
  – InhbDelay
  – Do not forget to load the new values by clicking the button on the bottom-right with the green arrow + disk as shown in Figure 2.
• Go to the "Run Info & Settings" small window on the left, and in the "Settings" tab, set the run parameters according to the expert(s) instructions:
  – Run Type: "LArPedestal", or "LArCalibration" for ramp and delay runs, other for other runs
  – Tier0 Project Name: use "dataXX calib" (with XX for the year, like 09)
  – Filename Tag: include Type and gain in the file name (e.g.
Filename=”PedestalHigh”). Pay attention, it should not be too long! – Recording: “Enable” – do not forget to hit Set Values at the bottom of the window once the set up is done 5. In the left panel of the TDAQ GUI, hit Initialize. 6. Wait for all segments to be set as INITIALIZE in black on blue in the Root Controller. Then hit Config. 7. Once all segments show up as CONNECTED in black on yellow, hit Start. 8. Once the run is finished hit Unconfig and Terminate. Restart the same procedure for the next calibration runs. 9. Once all calibration runs are taken, before exiting the DAQ GUI you have to hit Unconfig, Terminate and Shutdown. Then exit the GUI cleanly by using the Exit button in the file menu, located in the left right corner (NOT the red CROSS!). 11 10. Write info about the calibration runs in a dedicated ELOG entry. Information about the runs are found on the event building machines (pc-lar-eb-01,pc-lar-eb-02,pc-lar-eb-03,pc-lar-eb-04). To extract this info, open a terminal in the Control Room, ssh on the events builders and look at the file ~lardaq/LAr-CalibrationRuns.log (by reading it with “more” “less” “tail” etc.) Copy and paste the relevant lines into the ELOG entry about all your calibration runs. If this file has *not* been automatically updated, report the problem to the run coordinator. 2.5.1 SCA test runs For SCA test runs, follow this pattern: • In Step 4, for Run Parameters: – Run Type: SCATest – Recording: Enabled – Filename tag: scaleak – Max Nb events: 0 • In Step 4, for the H/W control panel info, write down the intial values for these parameters (you will need them later), then change them to: – nbOfSamples = 7 – gainType = H, M, or L depending on the run – l1 Latency = 17 – firstSample = 3 – format = transparent – runType = RawData – InhbDelay = 72 – Do not forget to load the new values by clicking the button on the bottom-right with the green arrow + disk as shown in Figure 2 • For the LAr Calibration Manager Panel: – Sequence type = Test – Tag = HighSCATest (or MediumSCATest or LowSCATest, depending on the gain), set for each subdetector if there is more than one – Do not forget to click the bottom right hand button with the blue gear to save the settings as shown in Figure 3. • Now follow the instructions above to go through Initialize, Config, Start, Unconfig, Terminate and Shutdown. • Reset the values in the H/W control panel and exit the Gui. • Transfer the data to Castor, using the link above. 12 2.5.2 Start a trigger calibration run For experts only. Partitions, etc, in this section are outdated. The acquisition of trigger calibration runs is not yet automatized. Moreover, only the barrel case is for the moment fully implemented. To start a trigger calibration run, first go in the Segment and Resource panel and enable the following segment : Larg_L1Mon_[01,02] in the LARG_EMB[A,C] segment. Be careful to only enable the segment and NOT the sub tree. There is one trigger crate per half barrel, each crate being connected to 2 ADCs of 8 channels that allow to test only one front end crate per run and per half barrel. The choice of the tested front end crate is made by configuring the USB controller in the Global Params of the Larg L1Mon [01,02] object defined in the LAr H/W control panel (see section 4.3.1). Check that the number of samples acquired by the ADC is set to 27. Then if you modified something, do not forget to reload the databases by clicking the icon on the bottom right. 
To avoid acquiring data for all the configured front end crates, it is recommended to switch off all the GLinks by running the script: >~lardaq/bin/stop_otx.csh. With all GLinks switched off, only the relevant ones are reconfigured and therefore send their data to the RODs.

Finally go to the Trigger panel (see Figure 4) and modify the different properties:

• Detector type;
• Connectivity or Calibration: choose Calibration;
• TBB (EMB and EMEC) or TDB (HEC or FCAL);
• Gain: choose medium;
• nSamples: 12 (this is the number of samples sent by the front end boards);
• nTriggers: 100.

Then click on run.

Figure 2: The red arrow points to the button to click to upload the configuration changes.
Figure 3: The red arrow points to the button to click to upload these changes.
Figure 4: The Run Control panel (need to get a better picture).

3 Physics Runs

3.1 What to do at the start of a physics run?

1. Launch the DAQ panel (see 4.1) if not already open.

2. Check the parameters for the DAQ panel. You may need to browse for the configuration, and click into the box where "Browse" is highlighted.
• Setup Script: /det/lar/project/scripts/LArShiftSetup.sh
• Part Name: ATLAS
• Database file: /atlas/oks/tdaq-XXX/combined/partitions/ATLAS.data.xml (replace XXX by the latest version of tdaq; to find it, start browsing /atlas/oks/ and you will see all the available tdaq versions)
• Setup Opt: -newgui
• MRS Filter: LAR
• OHP Opt: -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifter.ohp.xml (the variable $OHPSEARCHPATH will be automatically defined by the Setup Script)
• TriP Opt: -c $OHPSEARCHPATH/trigger/trp/trp_gui_conf.xml

3. Click on "Read Info" to get the information specific to the chosen partition. It will take a minute, please be patient. A box should pop up and say "Information read out. You can proceed." Click OK.

4. Open several windows from the DAQ panel by clicking on the following buttons:
• Monitor Partition: will open the "DAQ GUI"
• Busy: will open the "busy panel" to show the state of the LAr
• MRS
• OHP
• DQMD

5. Before the start of the run, make sure you have the instructions from the Run Coordinator (look at the WhiteBoard or call him). You need to know:
• In which mode the data will be taken: Physics mode?
• The number of samples
• The mode of readout: RawData? Which format?
• The L1 latency and the first sample parameter
• Which parts of the detector have to be included

6. When the LAr shifter is asked by the ATLAS shift leader if LAr can go back into the combined partition: if all LAr standalone work is completed (calibration data taking, expert work on the LAr system...) and LAr is ready to go back, the LAr shifter answers: "Yes, we are ready to go in. Please re-enable us." It may happen that the LAr segment (in the Segment & Resource panel) is already enabled, for example if LAr was in a calibration time slot (doing standalone work) and no combined partition was restarted during that time slot. But it is better in any case to ask for it.

7. THEN: ONLY after the LAr segment has been re-enabled, the LAr shifter can see/check/modify the run parameters in the online H/W panel (such as the number of samples, l1 latency...). More generally, if the LAr shifter needs to look at any LAr panel in the combined partition, (s)he needs to start the ATLAS combined DAQ GUI (clicking on "Monitor Partition" from the DAQ panel). Then (s)he can see those panels and proceed, BUT ONLY if the LAr segment has been enabled in the combined partition.

8.
If not already done, copy and paste the e-log template14 (*)15 to a text editor. 9. Open the “startrun-LAr” checklist form the “LAr” menu of the computer. You will have to check the DCS status and start the monitoring. 3.2 Monitoring of physics run 1. Open OHP through the DAQ Panel. 2. If it does not work or if you want to open many OHPs at the same type, you can use the commands : >source /det/lar/project/scripts/LArShiftSetup.sh >ohp -p [PartitionName] -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifter.ohp.xml where [PartitionName] is the one in the 3rd column of Table 1. 3. Open DQMD through the DAQ Panel. 4. Open the “LAr-online-Monitoring” checklist The monitoring histograms are detailed in sections 5.4. You can also have a look at the LAr DQ Policy Description16 (*)17 . 14 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary.txt https://atlasop.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary.txt 16 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=LArDQPolicy.pdf 17 https://atlasop.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=LArDQPolicy.pdf 15 17 4 Environment 4.1 The DAQ panel The DAQ Panel provides access to most of the tasks you will need to complete as a LAr shifter. To launch the DAQ Panel in the control room, go to the TDAQ menu at the bottom of the screen and click on DAQPanel. The panel shown in Figure 5 should show up. You will need to configure the TDAQ environment. See for example section 3.1. (New shifters: What is a partition? A partition is a collection of hardware and software that is given a single name, so that someone can control it. One or many partitions can be running at once – each defined to include, for example, the readout crates of individual systems being tested. For a combined run, the partition includes most of ATLAS. Each piece of ATLAS should only belong to one partition at a time. So, you may hear people saying they want particular things included in the partition, to run with all the systems, or removed from the partition, to test them on their own. People working on LAr may also refer to a part of the detector as a partition. They may call the LAr BarrelA or HECA a partition since each one is a partition for calibrations, but the partition for the cosmics is usually all of ATLAS.) It is then possible to access and click a wide collection of applications/functionalities. Monitor Partition (the old Spy IGUI), OHP and DQMF are the most useful for the standard shifter : • Monitor Partition : This starts a DAQ GUI in spy mode. This is an useful mode to see from the LAr desk what’s happening at the Run Control Desk, and especially to check that the LAr parameters are correctly set. The refreshing of the display seems to be sometimes slow or broken. It may be worth to kill the DAQ GUI and launch again Monitor Partition, if there are some doubts on the reliability of the display. • OHP : Start an Online Histogram Presenter (OHP) with the configuration file specified in the left part of the DAQ panel. The OHP application is devoted to the online display of Figure 5: The DAQ panel (the Main panel) 18 monitoring histograms (see section 5). • DQMF : Start the Data Quality Monitoring Display. Other buttons you may need: • Start Partition : Start a DAQ GUI in expert mode WARNING : you must never start a DAQ GUI in expert mode for an ATLAS partition, if a partition is already existing, as this may confuse and crash the existing one. 
It should be up to the run control to start the run. This should be fixed soon, but why risk it? • Busy : Display the busy status of the whole data acquisition. • MRS : pops up a window where the error messages are displayed (see Section 4.3). How To Recover Lost Log Messages? If an important log message you wanted to post on the elog flies by in the MRS Monitor before you can copy/paste it into the elog you can still access the log message by hitting the Log Manager button from the DAQ Panel. Navigate to the partition you want to see and the user who is controlling the partition (ATLAS and crrc for combined running). That will open up a list of all runs taken by that user in that partition. Select the run the log message came from and in the top right select the type of FATAL/ERROR/WARNING/etc you are looking for. • OKS : used to access the OKS database 4.2 Opening the Monitoring Advanced Panel To open the Mon Advanced Panel, first open the DAQ Panel as described in the last section, then click on the tab called “Mon Advanced”. You should see a panel similar to the one below, but not exactly the same. In the the Mon Advanced panel, you’ll find some tools for the Information Server (IS) and others. 4.3 Basic functionalities of the DAQ GUI A collection of basic facilities can be accessed from the DAQ GUI presented on figure 7 by: • clicking on MRS opens a new enlarged window to display the different messages. This new window especially allows to perform some selection on the displayed messages, change the number of displayed messages, etc. • clicking on IS allows to access to all informations of Information Service (IS) • in “Commands” scrolling menu, it is possible to clear the small MRS window located in the DAQ GUI • you can load specific panels which do not appear in the DAQ GUI, through the “LoadPanels” menu. 4.3.1 LAr H/W control parameters To check these parameters when running inside the ATLAS partition, load “OnlinePanel” through the “LoadPanels” menu. The “LAr H/W Control” tab will appear, click on it and choose PARAMS GLOBAL (see Fig. 8). For the calibration runs, when running on specific LAr partitions, the “LAr H/W Control” tab is accessible under the “LAr” tab, to see it load “MasterPanel” through the “LoadPanels” menu. 19 The most recent values for the l1aLatency, 1st sample, etc, will be on the WhiteBoard. The way that data are formatted is determined by a combination of the properties Run type and Format. This is detailed in table 2. Some additionnal notes on various parameters : • The choice of physic mode (“Result + format1”) requires to enable the Online DB in the Run Control. It is located under LArg - LArg Plugins - OnlineDB. The database must be started first. To switch from Transparent to Format 1, this database should be started by contacting the Run Coordinator. • If EMBA and EMBC are enabled, EMBA should be SLAVE and EMBC LAST SLAVE but if EMBC is disabled, then EMBA should have the LAST SLAVE setting. The LAST SLAVE prevents the BUSY to come from thdownstream LTP’s. • The inhbDelay should equal 52 + l1aLatency. The “52” clock cycles comes from the time it takes to send the command + cable lengths. 50 and 51 are sometimes used, but 52 gives the best pulse placement. The inhbDelay is the time between the BG02 command and the L1A. 4.3.2 Complex deadtime and the Central Trigger Processor The LAr Front End Boards (FEBs) hold the data coming from the calorimeter in a buffer of 144 cells, called the SCA (Switched Capacitor Array). 
If the trigger rate is too high, events can’t be read in and written out fast enough. Therefore, a parameter is set at the Central Trigger Processor (CTP) to control the number of triggers sent to the FEBs. For 5 samples, the settings are maximum 9 events in the system and 400 clock cycles to send out an event. For 32 samples, the settings are 1 event and 4000 clock cycles. Figure 6: The DAQ panel for (the Monitoring panel) 20 If the complex deadtime is incorrectly set, you can see it in FebMon – you’ll get many errors, especially in the “Wrong SCAC status” plot. You should notify the shift leader and the L1 trigger desk. 4.3.3 Segment and Resource Go in the Segment and Resource panel tab of the GUI (see figure 9) and check that the expected partitions are properly enabled. For combined cosmic running, all the detector parts should of course be enabled under LArg: Larg EMBA, Larg EMBC, Larg EMECA, Larg EMECC, Larg HECFCALA and Larg HECFCALC. In contrary, some segments should be disabled in in LAr plugins : Calibration, EventCounterReset, RunLogger, Action inspector; and in LTPIC: LArg TTCRCD TTC2LAN. The segments called Larg L1MON A and Larg L1MON C should never be enabled with ”No beam” configuration If you see a different configuration under LArg, consult the WhiteBoard and then the Run Coordinator. In combined running mode, it is worth to note that a partition not readout may be however included in the segment and resources for trigger distribution purposes. Figure 7: The DAQ GUI. 21 For the curious: A resource can be identified by the icon made of shapes (a cube, sphere, cone). It’s a piece of C++ code which can be enabled or disabled. They are also called applications. A segment is identified by the puzzle piece icon. It’s a collection of applications, which can have subsegments. PT is a “Processing Task.” 4.3.4 LAr Crates To check the status or temperatures of the crates that are taking data matches what you saw in the DCS, click on the “LAr Panels Manager” panel in the DAQ GUI, and then on “Crates”. Crates that are “On” in DCS are physically on, while crates that are “On” in this panel are the ones that are expected to be read out by DAQ. To see which crates are being read out, click “Refresh” once. Then click through the End Cap A, Barrel A, Barrel C, and End Cap C. If you see any differences (a crate is ON or OFF in DCS, but different here) check the WhiteBoard and then inform the Run Coordinator. Temperatures for the half crates are displayed below the status of each crate and are separated into minimum and maximum text fields. Minimum temperatures appear in the left field, maximum on the right. The temperatures are separated by a — and are crate ordered left, right, and in a few cases, special. These “special” temperatures correspond to HECA and HECC. Figure 8: the HW Control panel. 22 LAr H/W control parameters Run Format type Raw Transparent data Result Format1 Result Transparent Raw Data + Result Transparent Raw data Format2 Format1 Description Common usage All the digits are simply transferred to the ROS without any processing in DSP Physics run only at low trigger rate - Useful to study pulse shape, noise... Physics run at high rate. Only in 5 or 7 samples mode. Energy are computed for all cells by DSP and sent to the ROS. If the cell energy is greater than a threshold T1 , time and Q factor are also computed; if the cell energy is greater than a threshold T2 , digits are also transferred. 
The digits are averaged over a certain number of events (typically 100) by the DSP and only the average is sent to the ROS. In case of pedestal runs, auto correlation coefficient are also computed . Combined “Raw data + Transparent” and “Result + Transparent” Not implemented Meaningless Pedestal, delay ramp runs and Pedestal run. Will be used until validation of autocorrelation matrix computation in DSP. Should not be used. Should not be used. Table 2: Determination of data format provided by the LAr H/W control parameters Run type and Format. Number of Samples (thresholds) 32 16 10 7 (0/0) 7 (3σ/5σ) 5 (0/0) 5 (3σ/5σ) Event Size ∼ 14MB ∼ 8MB ∼ 6MB ∼ 4.2MB ∼ 1.2MB ∼ 3.2MB ∼ 1.2MB Deadtime Setting (simple, complex) 2500, –/– 1250, –/– 10, 3/830 7, 5/570 7, 5/570 5, 7/415 5, 7/415 Maximum L1 Rate ∼ 10kHz 25kHz ∼ 60kHz ∼ 60kHz 90kHz 90kHz Table 3: Deadtime settings, L1 trigger rates, and event sizes for different numbers of LAr samples and different threshold settings. (Taken from the ATLAS Shift Leader training slides.) 4.3.5 The Run parameters In this panel, are given very general parameters : • Run number : unique for the run, automatically retrieved from a db • Number of events : useful for stopping the pedestal runs; in this case, taking at least 20003000 events is recommanded. Taking much more is not necessary for standard studies. See 2.2.2 for more informations on ramp/delay/cabling runs. 23 4.4 Description of different processes involved The DAQ is performed by different processes (PMG = Process ManaGer) running on different machines (see table 4). All processes should be either in UP state or in RUNNING state to have an efficient running. If such processes crash or are stuck one can remove it from the DAQ by clicking on out in the Run Control panel (in the Membership subpanel - see figure 11). It also possible to try to restart the process on the fly. This should only be done at the Run Control desk when authorized by an expert. 4.5 OKS database The OKS database contains the status of all hardware parts. It should only be edited by experts, or with an expert on the phone. Changes here can break the entire ATLAS partition, so proceed with caution. To edit it, one has to start the OKS editor from the DAQ Panel. Then, two new windows pop up. In the largest one, look in the left column for the type of object, that you want to modify: • An half front end crate : LARG HFEC Figure 9: the Segment and resource panel. Surrounded are the segments associated to the ROD crates. 24 • A single Front End Card (FEB, TBB, calibration...) : LARG FEModule To enable/disable an object, change its state to true/false and close the window (see figure 13). When all wished objects are modified, exit from the OKS editor, confirm the changes (as many questions as modified objects) and the run control shifter should reload the database by clicking in the appropriate button (see figure 12) in the DAQ GUI. When doing this, you should see restarting all the PMG agents as at the DAQ startup. Figure 10: the LAr Crates panel. 
25 Process EB-[PART] disk usage monitoring onasic-setconditions ROS-LAR[PART]-[N] RODC[PART][N] TTCC[PART] LArPT-[N] LArGatherer LArArchive AndRemove Machine pc-lar-eb0[N] pc-lar-eb0[N] Gui machine pc-lar-rosemba-00 sbc-lar-rcc[PART]0[N] sbc-lar-tccemb-01 pc-tdqmon-14 pc-tdqmon-18 pc-tdqmon-14 Description Event building Monitors the occupancy of /data disk Runs for a short time, usually shows up ABSENT ROS control ROD crate control Trigger control Monitoring process - May be connected to EB or SFI Merging monitoring data from different LArPT Save histograms at the end of the run (in absent state during the run, up at the end) Table 4: Description of different processes involved in DAQ - [PART] is a partition (EMBA, EMBC, EMECA, EMECC...) - [N] is an integer 26 Figure 11: the Run Control panel. The circled button is the one to remove a PMG from the DAQ. Figure 12: Upper toolbar. Surrounded are the buttons to edit OKS database(green) and reload it (blue). 27 Figure 13: OKS editor windows. A : list of object types - B : list of objects for a given type - C : list of property of an object. 28 4.6 DCS - Detector Control and Safety To open the DCS Panel in the control room, go to the LAr menu on the bottom of the screen. Click : LAr → DCS → LAr DCS FSM . Make sure you open this from the LAr button, not the DCS button, as the settings of the program will be different. From outside the control room, log on pcatlgcslin through the Atlas gateway (with the lardaq account for example - see A.2) and type: >/scratch/pvss/bin/viewlarfsm You should see the following graphic: Figure 14: DCS FSM screen for LAr. 4.6.1 Check the LVPS Status To check the subsystems, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first (15 for example) and then you can click further down the tree to see each set of crates (LV, HV, ROD). Starting with the EMB A as an example, first click “EMB A”. If all five sub-sub-systems on the left-hand side say READY and OK, and everything in the global picture is green, you can go on to the next subsystem. This means the FEC LVPS, HV, and ROD crates are all on, or any problems below are known and do not propagate upwards. The run coordinators can mask known problems so LAr may still say “OK” even when some pieces are off.s If anything in the global picture is gray, yellow, or red, or if there are any FATAL or ERROR conditions in the left-hand panel, you should follow this procedure: 1. Click to see the FEC status (for EMB A, click EMBA FEC). For each set of FEC’s, a panel 29 similar to the one shown in Figure 16 will pop up. 2. Note the components which are off or have errors. 3. Check the LAr WhiteBoard18 (*)19 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 4. If the off/error components are NOT on the WhiteBoard, notify the LAr Run Coordinator by phone. Check the LVPS for the rest of the subsystems by clicking “LAR” at the top of the tree to return you to the main page. Repeat the procedure with the EMB C (EMBC FEC), EMEC A (EMEC A FEC), EMEC C (EMEC C FEC), HEC FCAL A (HEC A LV), and HEC FCAL C (HEC C LV). If you want to know more about one crate you can click on the crate itself. (For the Barrel and EMEC, this will work, not for the HEC.) A new panel will show the actual voltage on the O.C.E.M. 
power supply in USA15 (around 280V), the current and the voltages on the output of the LVPS (DC-DC converter) on the detector. 4.6.2 Check the ROD Crate Status To check the ReadOut Driver (ROD) crates, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first and then you can click further down the tree to see each set of crates. 18 19 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 15: Compact view of DCS for a half barrel. From inner circle to outer one, are displayed : ROD crates, Front End Crates (FECs), cooling loops, High Voltage (HV) for Power Supplies, High Voltage. 30 1. Starting with the EMB A as an example, first click “EMB A”. 2. Click to see the ROD status (for EMB A, click “EMB A ROD”). A panel similar to the one shown in Figure 17 should pop up. 3. All the crates should say ON and OK. Note the components which are off or have errors. 4. Check the LAr WhiteBoard20 (*)21 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 5. If the off/error components are NOT on the WhiteBoard, notify the LAr Run Coordinator by phone. Check the RODs for the rest of the subsystems by clicking “LAR” at the top of the tree to return you to the main page. Repeat the procedure with the EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. 4.6.3 Check HV Status To check the High Voltage (HV) status, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first and then you can click further down the tree to see each set of crates. 1. Starting with the EMB A as an example, first click “EMB A”. 2. Click to see the HV status (for EMB A, click “EMB A HV”). A panel similar to the one shown in Figure 18 should pop up. 20 21 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 16: DCS FSM screen for the LAr low voltage power supplies (LVPS). 31 3. All the crates should say ON and OK. Note the components which are off or have errors. 4. Check the LAr WhiteBoard22 (*)23 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 5. If the off/error components are NOT on the WhiteBoard, double-check to make sure you are reading the PHI numbers from the list and not the graphic, which is sometimes wrong. If it is really a new problem, notify the LAr Run Coordinator. For the HV, there are many sub-subsystems to check. For the EMB, you need to look at both “EMB HV A” and “LAR EMBPSA HV” for the PreSampler. The same goes for EMB C. The EMEC A and EMEC C have only one HV to check, while HEC FCAL A has two again (HEC A HV & FCAL A HV), and the same for HEC FCAL C (HEC C HV & FCAL C HV). 4.6.4 Check the DCS Alarms Screen To open the DCS Alarm Screen, go to the LAr menu (NOT the DCS menu) on the bottom of the screen. Click : LAr → DCS → LAr DCS Alarms. This will show all of the DCS alarms that apply to the LAr systems – temperature, voltages, etc. Alarms have a color code, and a letter in the first column. 
• FATAL - F - red • ERROR - E - orange • WARNING - W - yellow 22 23 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 17: DCS FSM screen for the LAr ROD crates in USA15. 32 If you see any red/(F)/FATAL errors, check the WhiteBoard first and then call the Run Coordinator. For each other alarm, right-click and select “Trend”. The orange line is the level for an ERROR, the yellow line is for WARNING. The blue line is the quantity as measured. Next, put the mouse cursor on the x-axis (time), and then use the scroll wheel to zoom out. You can see if the value is constant over the last few hours/days or if it is rapidly changing. For any WARNINGs or ERRORs that are not rapidly changing, check the WhiteBoard to see if they are known. If they are not known, make an entry in the e-log with the “complete” alarm information (the full line). They will be evaluated by an expert, and if the value is OK, the thresholds can be changed later. Keep this screen open and visible throughout the shift, reacting as above for each new alarm. * If you see ALL the alarms for ATLAS, you opened DCS → DCS Alarms, not LAr → DCS → LAr DCS Alarms. * If there are over 1000 alarms, maybe the HECLV is still in the filter. Look for new HECLV problems (within the past day), and check for them on the WhiteBoard or with an expert. Once you have dealt with the HEC problems, you want to filter out all the old messages. Double-click in the white box on the bottom left underneath the checkbox “Systems”. Click on ATLLARCLVTEMP, then hold down control and click on the other ATLLAR* systems. Do NOT include ATLLARHECLV. Next click the button that says “Apply filter” on the bottom right. Most of the messages should be gone. Figure 18: DCS FSM screen for the barrel HV. 33 4.7 4.7.1 Using the ATLAS e-log (ATLOG) Access and use of Elog The ATLAS e-log is now the main problem-reporting tool used by Liquid Argon. We are still refining the reporting procedure, so we appreciate your feedback. New categories for LAr have been defined, explained here: This menu appears when choosing ”Message Type = LAr” 1. Observation : intended to cover all observations with a limited time scale. Examples: problem of data integrity in one run, HV trip, excess in one histogram. The sub-menus mainly recall the 5 different fields identified in DQ, used in webdisplay/OHP . It seems that it can cover most of usual aspects (apart from DP perhaps). 1.1 Online environment - refers to a general problem (ex : monitoring PT crashes) 1.2 DCS 1.3 DAQ / Data integrity 1.4 Misbehaving channels 1.5 Signal 1.6 Physics 1.7 Other 2. Development / maintenance : this also covers a documentation for experts. Examples: replacement of a FEB, installation of a new HV module, upgrade of software 2.1 Hardware 2.2 Software 2.3 Other In DQ, we can imagine using mainly the field ”Observation”, leaving the status open, and defining which error it is (menu 2) and where it was observed. Regularly, coordinator of the 5 different sub tasks and the biweekly DQ coordinator (to be confirmed) go through the logbook and close the case if this is not a problem and assign it to someone for debugging if it is. We also would like to request to ATLOG team to be able to edit the messages (at least their status : such that we can change it from ”open” to ”closed” depending on its status). 
It is useful to note that in the new ATLOG version, the shift summary is posted by choosing ”Shift summary” as the message type, with a submenu giving the system type.
4.7.2 Information to put in the Elog
There is a template for your shift log summary in the start-of-shift (signin-LAr) checklist.
34
5 Monitoring and Data Quality
5.1 Monitoring Displays
5.1.1 DQMD
See the information on DQMD on the twiki24 (*)25 .
5.1.2 OHP
See the information on OHP on the twiki26 (*)27 .
5.1.3 Trigger Presenter
See the information on the Trigger Presenter on the twiki28 (*)29 . The option to launch the Trigger Presenter from the DAQ panel is described in section 4.1.
5.1.4 Atlantis
See the information on Atlantis on the twiki30 (*)31 .
5.1.5 Other monitoring tools at P1
See the twiki page32 (*)33 .
5.2 Where to find the monitoring data?
5.2.1 Setting up ROOT
To use ROOT in order to browse plots in the Control Room, set up ROOT with the following command:
>source /det/lar/project/scripts/setup_root.sh
5.2.2 Online Calibration ROOT Files
The online histograms of calibration runs are stored in: /det/lar/project/Histogramming/PARTITION/ where PARTITION corresponds to the partition name used for the calibration set (see table 1). The ROOT filenames contain the partition name, the run type and the run number.
5.2.3 Online Physics ROOT Files
Online histograms of the global runs are stored on pc-tdq-mon-09. To retrieve them, simply execute the following script:
/det/lar/project/scripts/get_online_root_file.sh <RunNumber>
24 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/DataQualityMonitoringDisplay
25 https://atlasop.cern.ch/twiki/bin/view/Main/DataQualityMonitoringDisplay
26 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/OnlineHistogrammingPresenter
27 https://atlasop.cern.ch/twiki/bin/view/Main/OnlineHistogrammingPresenter
28 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/TriggerPresenter
29 https://atlasop.cern.ch/twiki/bin/view/Main/TriggerPresenter
30 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/AtlantisEventDisplay
31 https://atlasop.cern.ch/twiki/bin/view/Main/AtlantisEventDisplay
32 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/MonitoringShifterOperationManual
33 https://atlasop.cern.ch/twiki/bin/view/Main/MonitoringShifterOperationManual
35
The file will be copied to the scratch area of your local machine. The filename includes the partition name, the run number and the lumi block number. The FEBMon histograms are saved at the end of each lumiblock. All histograms are saved at the end of a run (lumi block called l_EoR). You can also retrieve the online monitoring plots34 , as well as the online monitoring root files35 , from the web display.
5.2.4 Offline Physics ROOT files
If you want to look at more plots offline, you can retrieve the monitoring histograms produced at Tier0. (This is not possible from the P1 computers; you have to use your own laptop.) The monitoring files location is given on this wiki page: https://twiki.cern.ch/twiki/bin/view/Atlas/CosmicCommissioningReconstructionStatus36 (reachable from outside P1 only).
5.2.5 Using the event dump
The event dump may be useful if the online monitoring is not available or if you want to look at something specific that is not available in the online monitoring. Log on to the machine where the data are written (for Physics runs, the data are written by the SFI machines pc-tdq-sfi-00x directly on their disk /localdisk/data; to know which SFI machines are involved, check the Run Control panel).
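As a purely illustrative sketch (the SFI node number below is hypothetical; take the actual node names from the Run Control panel of the ongoing run), reaching the raw data files typically amounts to:
>ssh pc-tdq-sfi-001
>cd /localdisk/data
>ls -ltr
The most recently written .data file then corresponds to the ongoing run.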
You can have a look at raw data by typing: > /atlas-home/1/hwilkens/dumpeformat/dumpeformat -n NbOfEvts -v VerbLevel -f FileName.data Using the program dumpeformatroot instead of dumpeformat allows to produce a simple root file named output.root where are stored pedestal and noise for all FEBs. Look at appendix C for some help for reading the event dump. 5.3 Data Quality Checklists Section under construction. 5.3.1 Online - Calibration Runs The data integrity of the calibration runs can be checked using the following command lines : > cd /det/lar/project/scripts/ > source LArShiftSetup.sh > python CheckDataIntegrityCalibRuns.py [Run1] [Run2] The run range to look at is defined between [Run1] and [Run2]. For information, short calibration runs as Pedestals runs for HECFCAL and all PS runs have no associated monitoring files (because the run ends before the monitoring starts). Remark: With the DAILY and WEEKLY procedures, the data integrity check is automatically done at the end of the campaign for one partition. The summary of the check appears in the window that pops up. If it is not the case, have a look in your home directory at rf_cal_runs.log for the last set of runs. 5.3.2 Online - Physics Runs In the desktop toolbar, in the “LAr” Menu, open the “LAr-online-Monitoring” checklist. 34 http://atlasdqm.cern.ch:8080/webdisplay/online https://atlasdqm.cern.ch/tier0/Cosmics08 online root 36 https://twiki.cern.ch/twiki/bin/view/Atlas/CosmicCommissioningReconstructionStatus 35 36 5.4 Monitoring plots description The following subsections are describing the most important LAr monitoring plots. The monitoring histograms are produced with 2 different packages : • LArMonTools : To monitor basic information like DSPs, FEBs, digits, noise, calibration constants, detector coverage, cell masking... • CaloMonitoring : To monitor the energy reconstruction and noise at the cell and cluster level The plots are organized into 5 categories, used both in OHP (online) and on the DQ web display (offline). The “Run Info” and “Timing” categories contains histograms related to the full liquid argon calorimeter, while all the other categories are split by partition : EMBA, EMBC, EMECA, EMACC, HECA, HECC, FCALA, FCALC. • Run Info : summary plots about detector coverage, number of events, trigger type... • Data Integrity : contains fundamental checks about the ROD, DSP and FEB readout. Any plot with a suspicious behavior found in this tab is a sufficient reason to call the LAr run coordinator and possibly STOP the ongoing run. The recorded data will very probably be corrupted and useless. • High Energy Digits : plots used to monitor the detector timing and the pulse shape. • Timing: plots showing the collisions candidates • Energy Flow: total energy deposited in the calorimeter during a run. • MisBehaving Channels : plots used to spot hot cells or noisy detector regions. • CaloGlobal : to monitor higher level quantities like clusters, jets, EM objects. 37 5.4.1 Run Parameters Event Type • One 1d histogram for the full detector. • Description: Type of recorded data (Raw, physics, Calibration...) • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning the run. • Expected status: Depends on the run plans. • DQMF checks: none Number of Samples • One 1d histogram for the full detector. • Description: Number of readout samples • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning the run. • Expected status: No expected status. Depends on the run plans. 
• DQMF checks: none Nb Of Events per Minute • One 1d histogram for the full detector. • Description: Number of recorded events per 60 seconds block. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: Flat distribution. Spike would indicate a problem with the trigger rate • DQMF checks: none Nb Of Rejected Events per Minute • One 1d histogram for the full detector. • Description: Number of events with at least one FEB showing error, per 60 seconds block. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: Empty. Too many entries will indicate serious data integrity issues, and requires to look at the data integrity plots 5.4.4 • DQMF checks: none 38 ADC Threshold in DSP • One 1d histogram for the full detector. • Description: Threshold (in ADC count) above which the digits + energy computed in DSPs are transferred in the dataflow. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. This plot is aimed at helping us to quickly retrieve useful information for offline analysis. • DQMF checks: none DSP Threshold - Qfactor+time • One 1d histogram for the full detector. • Description: Threshold (in ADC count) above which the energy+time+quality factor computed in DSPs are transferred in the dataflow. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. This plot is aimed at helping us to quickly retrieve useful information for offline analysis. • DQMF checks: none Number of Events per L1 trigger bit • One 1d histogram for the full detector. • Description: Number of events passing L1 trigger terms. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. • DQMF checks: none Number of readout FEBs • One 2d histogram for the full detector. • Description: Number of readout FEB per partition. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning of the Run. • Expected status: To be compared with expected numbers written on the Whiteboard. • DQMF checks: none 39 Number of events per Stream • One 1d histogram • Description: list of Streams available in the runs. • OHP tab: [Run Info][Run Parameters][ATLAS] • When to check it? Anytime during the run. • Expected status: No expected status • DQMF checks: none Raw Stream correlation • One 2d histogram • Description: events overlap between streams. • OHP tab: [Run Info][Run Parameters][ATLAS] • When to check it? Anytime during the Run. • Expected status: No expected status • DQMF checks: none 5.4.2 Detector Coverage Coverage - Sampling - Partition • One 2d histogram per sampling (S0=PreSampler, S1=Front, S2=Middle, S3=Back). • Description: missing cell/missing FEB (0 - White), KNOWN missing FEB (1 - Purple), masked cells (2 - Blue), good cells (3 - Green), FEB supposedly missing, but actually readout (4 - Red) • OHP tab: [Run Info][Detector Coverage][PARTITION] • When to check it? At the beginning of a run. • Expected status: Most of the cells should be green, except the known missing FEBs documented on the WhiteBoard. In case of holes not documented on the WhiteBoard or in case of red FEBs, report to the LAr run Coordinator immediately. 
• DQMF checks : none 40 5.4.3 Data Integrity DSP Checks that computations of Energy (E), Time (T) and Quality Factor (Q) in the DSP (online) are made properly by comparing to the same quantities computed offline. Number of errors per partition and per gain • One 2d histogram for the whole detector • Description: Number of times where the difference between offline and online quantities is above a tolerance threshold: • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it ? All along the run • Expected status: Empty – If errors are general (many partitions, many gains), call the run coordinator – If error in a specific partition, open the corresponding tab • DQMF checks: none E(DSP) - E(offline) distribution • One 2d histogram for the whole detector • Description: – X-axis : E(DSP) - E(offline) – Y-axis : energy range (the tolerance on the energy computation depends on the energy range) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: – – – – Range Range Range Range 0 1 2 3 : : : : E E E E < 213 < 216 < 219 < 222 : : : : entries entries entries entries between between between between ±1 MeV. ±8 MeV. ±64 MeV. ±512 MeV. • DQMF checks: none T(DSP) - T(offline) distribution • One 1d Plot for the whole detector • Description: T(DSP) - T(offline) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: distribution centered on 0, between ±10 picoseconds. Some outliers up to a few 10 picoseconds can be seen. The RMS should not exceed 10 picoseconds. • DQMF checks: none 41 Q(DSP) - Q(offline) distribution • One 1d Plot for the whole detector • Description: (Q(DSP) - Q(offline))/Sqrt(Q(offline)) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: Distribution centered on 0 (large peak), between ±1. • DQMF checks: none Errors number per FEB • One 2d histogram per partition in FT/Slot plane • Description: Number of times where the difference between offline and online quantities is above a tolerance threshold. • OHP tab: [Data Integrity][DSP Physics][PARTITION] • When to check it? If the summary plots show errors • Expected status: Empty – If a few entries : check if the involved FEB is known to be unhappy ... – If all partition in error : problem of constants loaded in the DSP, call the run coordinator • DQMF checks: Yellow if 1 entry, Red if more than 1 entry Correlation between E(DSP) and E(offline)[respectively T and Q] • 1d Scatter Plot per partition and per quantity (E,T,Q) • Description: Correlation between online and offline quantities • OHP tab: [Data Integrity][DSP Physics][PARTITION] • When to check it? If the summary plots show errors • Expected status: Perfect correlation (slope = 1) • DQMF checks: none 42 5.4.4 Data Integrity FEB Offline Rejection Yield • One 1d histogram for the full detector. • Description: % of events rejected because of data integrity problems. “Whole event corrupted”: the number of readout FEB was not constant during the run “Single FEB corrupted”: the data integrity problem is localized in specific FEBs. • OHP tab: [Data Integrity][FEB Errors][Global] • DQMF check: Fraction of rejected Events < 1% • Expected status: 100% accepted events Number of Readout FEBs • One single 1d histogram per half barrel/endcap • Description: Compact view of number of readout FEBs per event. Only a check of DSP header is performed (the data block can be empty!). 
• OHP tab: [Data Integrity][FEB Errors][Global] • When to check it? systematically. • Expected status : The distribution must be a dirac. Check on the White Board37 which partitions are in the readout to determine how many FEBs are expected. The total number of FEBs for a completed partition can be found in table 6. For half of the barrel, there should be 448 boards. For each of the endcaps, there should be 314. The entire dectector should have 1524 boards. • DQMF check: none Number Of LArFEBMon Errors • One 2d histogram for the full detector. • Description : Number of FEBMon errors for each partition. • OHP tab: [Data Integrity][FEB Errors][Global] • When to check it? Anytime during the run. • Expected status : Empty if everything runs fine • DQMF checks : none • For more info about FEB errors, check Appendix E 37 https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard 43 Number of Events DSP header • One 2d histogram per half barrel/endcap • Description: Number of events acquired per FEB. Only a check of DSP header is performed (the data block can be empty!). • OHP tab: [Data Integrity][FEB Data][Barrel/Endcap] • When to check it? systematically. • Expected status: the number of events must be uniform among all FEBs of a partition and contains no unexpected holes. The barrel should have a board in each bin, while the endcap has a more interesting structure seen below. Do not worry that it’s a different plot below, showing errors instead of events, the structure is the same. This will eventually be checked against a reference histogram instead of by eye. • DQMF check : none. Average Number of cells above DSP thresholds • One 2d histogram per half barrel/endcap • Description: Number of events where the digits/time+quality factor are sent by DSP. • OHP tab: [Data Integrity][FEB Data][Barrel/Endcap] • When to check it? Systematically • Expected status: depends on the thresholds. See plots in section 5.4.1 • DQMF check : none. 44 5.4.5 High Energy Digits Type of run what to look at. • Cosmics runs : look at High Energy Digits CosmicCalo tab. • Circulating beams runs, no collisions : look at High Energy Digits CosmicCalo tab. • Circulating beams runs, splashes : look at High Energy Digits L1Calo tab. • Collisions runs : look at High Energy Digits L1Calo tab. • For each of the previous case you also should look at High energy Digits Timing tab. • note (a) : Most of the histograms are filled for a particular stream. This stream is written in the histogram title. If the histogram is filled for all the streams, it’s also written in the histogram title. • note (b) : L1Calo events are selected on filled bunches of LHC, while CosmicCalo events are selected on empty bunches of LHC. • note (c) : Some other information are written in the histogram title: expected sample max, range, selection cut... High Energy Digits Summary • One 2d histogram for the whole detector. • Description : Summary of errors per partition • OHP tab : [High energy Digits][High energy Digits][Global] • When to check it? Any time during the run. • First Bin : Number of Channel with max sample outside the expected range at least 0.5% of the run. The percentage is computed dynamically during the run, that’s the reason why the number of error increase and decrease. • Second Bin : Number of Channel having been saturated at least once in the run. (ie: max=4095 ADC count). • Third Bin : Number of Channel with the min sample = 0 ADC count at least once during the run. • Fourth Bin : Mean time of the sub-partition. 
• Expected status : Ideally the first 3 bin should be empty. The last bin should be filled as long as the partition as triggered an event, and the average time should be close to the expected one given in the title of the histogram. • DQMF checks : none 45 Normalized Signal Shape • One 1d histograms per partition. • Description : Average signal pulse for high energy cells. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a run. • Expected status : Nice pulse shape if only cells with cosmics signal enters in the plot. Flat or distorted shape if noise is dominant. • Stream monitored : written in the title. • DQMF checks: none Energy vs Sample Number • One 1d histograms per partition. • Description : Energy (in ADC) of the highest sample vs sample number. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a run. • Expected status: Mostly peaked. Peak value should be close as the expected one given in the title. If noisy cells are selected, the distribution should be more flat (max sample is random) and with low energy. • Stream monitored : written in the title. • DQMF checks: none Max Sample vs Time • One 1d histograms per partition. • Description: Energy (in ADC) of the highest sample vs time. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Flat, value close to expected sample max given in the title. • Stream monitored : written in the title. • DQMF checks : none 46 Average Position Max Digit • One 2d histograms per partition. • Description: Average position of the sample max for each FEB in the partition. Select only events passing a 5 sigma cut. Each FEB should be close to the expected sample max given in the title. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Flat. • Stream monitored : written in the title. • DQMF checks : none Out Of Range • One 2d histograms per partition. • Description: Yield of events with max sample outside the temporal range given in the title. Select events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If one bin starts to grow up, most of the time means there is one or more noisy channels in the FEB, could then check out of range at channel level. If noise is dominant in the run this histogram will be filled with high values. • Stream monitored : written in the title. • DQMF checks : none Out of Range at Channel level • One 2d histograms per partition. • Description: Same histogram than Out Of Range, but information displayed here for each channels. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. • Stream monitored : written in the title. • DQMF checks : none 47 Null Digit • One 2d histograms per partition. • Description: Yield of events with at least one digit null (ie content=0, without subtracting the pedestal).Could be correlated with the DSP monitoring. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with DSP monitoring, or Q factor monitoring (noise burst event). • Stream monitored : All streams. 
• DQMF checks : none Null Digit at Channel level • One 2d histograms per partition. • Description: Same as Null Digit but at channel level. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with DSP monitoring, or Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none Saturation • One 2d histograms per partition. • Description: Yield of events with at least one saturated sample (ie content=4095). • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none 48 Saturation at Channel level • One 2d histograms per partition. • Description: Same as saturation but at channel level. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none Max Sample per Stream • One 1d histograms per partition. • Description: Average position of the sample max per Stream. Select only events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits Timing][Max Sample per Stream] • When to check it? systematically during a cosmic run. • Expected status: Should be peaked for each streams at the expected sample given in the title. • Stream monitored : All streams. • DQMF checks : none Trigger Word • One 1d histograms per partition. • Description: Average position of the sample max per L1 trigger word. Select only events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits Timing][Trigger Word] • When to check it? systematically during a cosmic run. • Expected status: Flat. • Stream monitored : All streams. • DQMF checks : none 49 5.4.6 Timing Difference in time between C and A sides • One 1d histogram for the full detector. One entry per event. • Description: Difference ofn average particle arrival time between C and A sides. • OPH tab: [Timing][Run] • When to check? Any time during the run • Expected Status: Collisions candidates events should be centered around zero. • DQMF check:ONLINE. In DQMD, important background from the beams will turn the “Beam Background” DQ regions yellow or red. this is not a problem for LAr, it’s only an indication of the beam quality. Difference in time between C and A sides vs Lumiblock number • One 2d histogram for the full detector. One entry per event. • Description: Difference ofn average particle arrival time between C and A sides, vs lumiblock number • OPH tab: [Timing][Run] • When to check? Any time during the run • Expected Status: Collisions candidates events should be centered around zero. • DQMF check: NONE 5.4.7 Energy Flow Total Cell Energy vs (η,φ) for <sampling> - no Threshold, rndm trigger • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA ...) • Description : Distribution in (η,φ) of the total accumulated energy in a given cell • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? 
• Expected status: The distribution is expected to be uniform in φ strips at fixed eta. In the min bias stream, holes/depressed areas can indicate dead cells.
• DQMF check: OFFLINE: tested in DQ offline Web Display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX.
50
5.4.8 Quality Factor
The quality factor Q is calculated as the quadratic difference between the measured pulse shape (in ADC counts, pedestal subtracted) and the expected pulse shape. For cosmics and for the initial collision data, the calculation is performed by iterating on the relative phase between data and prediction independently for each cell above a certain threshold (the digit must have been written in the bytestream, so this depends on the DSP settings). The quality factor is not normalised to the amplitude or energy, so it is expected to increase with energy for a given gain selection. Studies on cosmics and early collisions show that cutting on Q > 4000 is a safe cut to select noisy channels. This is the definition of a noisy channel in the following. A preamplifier is linked to four channels; it is considered to have noise if three or four of its channels are noisy. Most FEBs have 128 channels connected; a FEB is considered as having noise if more than 30 channels have noise. Because of the cross-talk that alters the pulse shape, it is possible for some preamplifiers to be wrongly flagged as noisy in a given event; that is why one has to judge on the rate of occurrence over the run. The cut to define a FEB as noisy is safe, and one would not expect real energy deposits to fake a noisy FEB. Known bad FEBs are not declared noisy and thus will not appear in the histograms. In “free running”, LAr-triggered events, as in cosmics, the noise can trigger the event, and noisy FEBs are regularly detected, in particular from the “partial ring events” in the outer EMEC and, less often, from coherent noise in the barrel presampler. The probability that such events happen in coincidence with a beam crossing is small. So far, events with several noisy FEBs in collision runs were out-of-time cosmics. Since we did not yet have enough experience with cosmics and early collision runs, there are no histograms for the FCAL or the HEC.
Number of noisy FEB
• One 1d histogram for the whole detector
• Description: histogram of the number of FEBs declared bad per event
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : less than 5; events with more than 10 most probably have problems
• DQMF check: OFFLINE:none. ONLINE:none.
Time of noisy FEB
• One 1d histogram for the whole detector
• Description: Time (hours in the day) when a noisy FEB was detected
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : watch for noise bursts
• DQMF check: OFFLINE:none. ONLINE:none.
51
Number of noisy preamplifiers
• One 1d histogram for the whole detector
• Description: histogram of the number of preamplifiers declared bad per event
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : to be defined
• DQMF check: OFFLINE:none. ONLINE:none.
Time of noisy preamplifier
• One 1d histogram for the whole detector
• Description: Time (hours in the day) when a noisy preamplifier was detected
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : watch for noise bursts
• DQMF check: OFFLINE:none. ONLINE:none.
Percentage of events with FEB noisy (was: Noisy FEB fraction) • One 2d histogram per partition • Description: feedthrough vs slot histogram of the fraction of event in which the FEB was declared noisy • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : less than 1% • DQMF check: OFFLINE:none. ONLINE:none. Number of noisy FEB per LBN • One 1d histogram per partition • Description: histogram of the LBN in which a FEB was declared bad (there could be more than one entry per event) • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : allows to identify LBNs where e.g. external noise could have been injected in the LAr. • DQMF check: OFFLINE:none. ONLINE:none. 52 Percentage of events with PA noisy (was: Noisy PA fraction) • One 2d histogram per partition • Description: preamplifier number (arbitrary, in increasing channel number order) vs feedthrough/slot histogram of the fraction of event in which the preamplifier was declared noisy • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : less than 1% • DQMF check: OFFLINE:none. ONLINE:none. Number of noisy PA per LBN • One 1d histogram per partition • Description: histogram of the LBN in which a preamplifier was declared bad (there could be more than one entry per event) • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : allows to identify LBNs where e.g. external noise could have been injected in the LAr. • DQMF check: OFFLINE:none. ONLINE:none. 53 5.4.9 MisBehaving Channels Digits Number of monitored channels • One single 1d histogram for the whole LAr. • Description: Number of monitored channels per partition. With respect to the readout channels, the channels flagged in the BadChannelDB and the channels without reference pedestal/noise (if retrieved from COOL) are removed. • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? at the beginning of a run (if reference pedestals/noise are retrieved from COOL) or after some time (if reference are computed from first events of the run). • Expected status : The typical numbers of channels for a given partition are summarised in table 6. The Endcap is the sum of the Standard EMEC + Special EMEC + HEC + FCAL = 39,800 channels. When the whole barrel/endcap is readout, the number of channels should be close. If it is not the case, this may be due to some missing conditions (if the reference pedestals/noise are read from COOL) : in this case, check the plots in the Detector Coverage tab). • DQMF check: none Odd events yield • One single 1d histogram for the whole LAr. • Description: yield of odd events in the whole detector. One entry is made for each “ATLAS event”. • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? systematically. • Expected status : the yield should be around the expected gaussian behaviour. For a 3 sigma cut and both tails (only negative), one expects a yield around 0.27% (0.13%). If the histogram peaks at 0, that means that no reference pedestals/noise are available. If it is empty, a major overflow may be suspected : this is usually the case when the references are not reliable. • DQMF check: none Odd events temporal distribution • One single 1d histogram for the whole LAr. 
• Description: number of odd events (i.e cells which are 3 sigma away from the reference) as a function of time (event id or basic event counter) • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? systematically. • Expected status : The number of odd events as a funtion of time should be flat. • DQMF check: none 54 Proportion of odd events per channel • One 2d histogram per partition. • Description : number of odd events per channel. On the X axis, one can find all the FEBs of a given partition (ordered first by half crate and inside the crates by increasing slot • see table 5). On the Y axis, one can find the 128 channels of each FEB. The empty bins correspond to channel not monitored (either not connected, with missing conditions or flagged as bad in the DB). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found • Expected status: the yield for all channels should be around the expected gaussian value 0.27%. To determine which channels were spotted by the summary plot, double click on the histogram to have access to all root options and redefine the minimum value in Z to exhibit the channels that are above the threshold. First check quantitatively the observed increase of noise. Then identifiy whether all channels are widespread in all detectors or grouped per FEB (probably a problem of bad references) or by shaper (probably a hardware problem). • DQMF check: none Proportion per FEB of odd events • One 2d histogram per partition. • Description: number of odd events per FEB.On the X axis, one can find all the half crates of a given partition. On the Y axis, one can find all the FEB of the half crates (see table 5). The empty bins correspond to FEBs not monitored (this is especially the case for the endcaps where a lot of slot are not populated). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found • Expected status: in each bin (corresponding to one FEB), is filled the yield of odd events in this given FEB. As in this case, the odd events for all channels, this quantity is less easy to interpret in term of noise increase. With a much larger statistics, the gaussian behaviour should be observed much faster than in the case of individual channels. • DQMF check: none Odd sums per FEB • One 2d histogram per partition. • Description: Fraction of events where the sum of the cell energy per FEB is above 3 sigma. On the Y axis, one can find all the FEB of the half crates (see table 5). The empty bins correspond to FEBs not monitored (this is especially the case for the endcaps where a lot of slot are not populated). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found 55 • Expected status: distibuted around 0.27% in each bin (corresponding to one FEB). FEBs with higher value are showing coherent noise. • DQMF check: none Odd Channels Yield per event • One 1d histogram per partition. • Description: number of channels above 3 sigma • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When L1 Calo trigger bursts are observed. • Expected status: Centered on 0.27%. Tails will indicate evenst with large coherent noise bursts. • DQMF check: OFFLINE. Flag turns yellow if tails are found. Time of bursty events • One 1d histogram per partition. 
• Description: number of channels above 3 sigma • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When L1 Calo trigger bursts are observed. • Expected status: Empty. If burst are observed, they might also be visible in the timing plots of 5.4.1 • DQMF check: none 56 5.4.10 MisBehaving Channels CaloCells Percentage of events in (η,φ) for <sampling> - Ecell < 3σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description: Percentage occupancy as a function of η,φ i.e fraction of the events where Ecell < 3σ. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least when 1000 events have been processed. • Expected status : the distribution should be uniform with a bin value around 0.135. • DQMF check: OFFLINE: tested in DQ offline web display. Online:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Percentage of events in (η,φ) for <sampling> - |Ecell | > 4σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description : Percentage occupancy as a function of η,φ i.e. fraction of the events where |Ecell | ¡ 4 σ. CaloTopoCluster seeds. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least when 100000 events have been processed. • Expected status : in RNDM stream the distribution should be uniform with a bin value around 0.63×10−2 . • DQMF check: OFFLINE:tested in DQ offline web display. Online:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Percentage Deviation: (Energy RMS - DBNoise)/DBNoise vs (η,φ) for < sampling> rndm stream • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC). • Description : Distribution in (η,φ) of the values: (measured energy RMS - Noise in the database)/Noise in the database. • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] 57 • When to check it? After at least 10 events have been recorded in the rndm stream. • Expected status: The distribution should be centered at zero with variations that are less than 10%. • DQMF check: OFFLINE:tested in DQ offline web display. ONLINE:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Cell Energy/Noise(DB) - <sampling> • One 1d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC) • Description : 1d distribution of the ratio of cell energy to the database noise for a given cell. • OHP tab: [MisBehaving Channels][CaloCel-RNDM][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? systematically. • Expected status : the distribution should be a Gaussian centered at zero with an RMS of 1. Check if the mean shifts significantly from zero. Report if the RMS is different from by more than 2 to 4 %. • DQMF check: OFFLINE:tested in DQ offline web display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. 
Average Cell Energy vs (η,φ) for <sampling> - no Threshold, rndm trigger • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC) • Description : Distribution in (η,φ) of the average cell energy • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least after 100 events have been processed. • Expected status : in random stream the distribution should be uniform with value around zero. Search for outstanding channels where the average is non zero. • DQMF check: OFFLINE: tested in DQ offline Web Display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Cell Energy • One 1d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC). • Description : Energy distribution of all the cells in the sampling. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION] and [MisBehaving Channels][CaloCells-BPTX][PARTITION] 58 • When to check it? systematically • Expected status : Check for the presence of very large tails • DQMF check: OFFLINE.none. ONLINE. None. Percentage of events in (η,φ) for <sampling> Ecell > 5σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description : Percentage occupancy as a function of η,φ i.e. fraction of the events where |Ecell | ¡ 4 σ. CaloTopoCluster seeds. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? • Expected status: The distribution is expected to be uniform in φ strips at fixed eta. In min bias stream holes/depressed areas can indicate dead cells. In RNDM stream the distribution should be uniform with a bin value around 5.7×10−5 (only after at least ten million events). • DQMF check: OFFLINE:tested in DQ offline web display. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. 59 5.4.11 MisBehaving Channels RawChannels Mean Energy (MeV) • One 2d histogram per partition. One entry per cell. • Description: Average cell energy vs eta and phi, in MeV. • OHP tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check it? systematically during a run. • Expected status: The average energy should be centered around 0. Regions with significant deviation from 0 indicate hot cells or cells with wrong calibration constants • DQMF check: OFFLINE. Find cells with ABS(Mean Energy) > 50 MeV. Percentage of events above 3 sigma • One 2d histogram per partition. One entry per cell. • Description: Fraction of events with energy greater than 3 times the noise stored in database. • OPH tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check? After at least 1000 events have been processed. • Expected Status: The bin values should be around 0.27%. • DQMF check:OFFLINE. Look for channels over 1.5 % Percentage of events below 3 sigma • One 2d histogram per partition. One entry per cell. • Description: Fraction of events with energy lower than -3 times the noise stored in database. • OPH tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check? After at least 1000 events have been processed. • Expected Status: The bin values should be around 0.27%. • DQMF check:OFFLINE. 
Look for channels over 1.5 % 60 5.4.12 CaloGlobal Hit map of cells with E/Ecluster >0.9 • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π). • Description: distribution in η,φ of the number of clusters whose energy is accounted for by 90% or more by one cell. • OHP tab: [Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : The distribution should be φ symmetric. Search for oustanding bins at fixed eta. • DQMF check: OFFLINE: tested in DQ offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters. Average number of cells in clusters • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description: distribution in η,φ of the average number of cells in a cluster. • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution should be φ symmetric. • DQMF check: OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters. Avg energy of cluster with Energy > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description: distribution in η,φ of the average energy of positive energy clusters • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution is expected to be uniform in φ strips at fixed eta. Highly populated rehoins indicate possible coherent noise effects. • DQMF check:OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters 61 Hit Map of Cluster with Energy > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π). • Description: distribution in η,φ of number of positive energy clusters. • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : The distribution is expected to be uniform in φ strips at fixed eta. • DQMF check: OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters Eta Energy > 0.0 GeV • One 1d histogram with with η in (-4.9,4.9) • Description: distribution in η of number of positive energy clusters (integrated over φ) • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : Depending on trigger distribution. Expect a parabula-like shape for min bias events (low occupancy in the central, high occupancy in forward region). Following the granularity for random stream (high flat occupancy in central region, a sharp decrease after |η| = 2.5 and peaks at |η| = 3.2 and 4.2) • DQMF check:OFFLINE: shown in DQ Offline web display. 
ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CaloTopoClusters Phi Energy > 0.0 GeV • One 1d histogram with with φ in (-π,π) • Description: distribution in φ of number of positive energy clusters (integrated over eta) • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution is expected to be uniform. Consult DQMD under the CaloGlobal label (momentarily) • DQMF check:OFFLINE: shown in DQ Offline web display. ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CaloTopoClusters 62 Tower Occupancy vs η and φ with E > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : 2d distribution in (η,φ) of number of positive energy towers • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The distribution is expected to be uniform in φ strips at fixed eta. Consult DQMD under the CaloGlobal label (momnetarily) • DQMF check:OFFLINE: shown in DQ Offline web display. ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CombinedTowers Energy in Most Energetic Tower • One 1d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : Energy distribution for the most energetic tower • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The energy distribution of the most energetic tower • DQMF check: OFFLINE:none. ONLINE:none EtaPhi of Most Energetic Tower • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : 2d distribution in (η,φ) of the position of the most energetic tower • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The occupancy is expected to be uniform in φ strips at fixed eta. • DQMF check: OFFLINE:none. ONLINE:none 63 Tower Occupancy Vs Phi with E > 0.0GeV • One 1d histogram with full detector coverage: φ in (-π,π) • Description: 1d distribution in (φ) of the position of positive energy tower (integrated over all η) • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The occupany is expected to be uniform. • DQMF check: OFFLINE:none. ONLINE:none. Tower Occupancy vs Vs Eta with E > 0.0 GeV • One 1d histogram with full detector coverage: η in (-4.9,4.9) • Description : 1d distribution in (η) of the position of positive energy tower (integrated over all φ) • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : caloTopoClusters) The occupany is expected to follow the granularity (similar to • DQMF check: OFFLINE:none. ONLINE:none. 64 A Tips to work at P1 A.1 Access rights To access the control room, you need to have a CERN ID, and take the basic safety training courses 1-4 (http://safety-commission.web.cern.ch/safetycommission/SC-site/sc pages/training/basic.html38 ) and request “ATL CR” through EDH. To access the LAr satellite control room (3159-R012), you have to go with your CERN ID to S. Auerbach (located at 124-R011). Tell them you will take LAr shifts and request access to 3159-R012. To access the underground area (including USA15), you now need a token and dosimeter. This is not required for most LAr shifters, who will remain in the control room only. To get this access, you need a medical evaluation and a radiation training course. 
If you do go into the underground area, it is also mandatory to wear safety shoes, a hard hat and lamp. A.2 Network To connect from/to P1 network, one has to go through the gateway called atlasgw as a user with an account on it (example lardaq). Not all shifters have permissions to use atlasgw, but all should be able to use atlasgw-exp. • Connect to P1 network from outside world : >ssh atlasgw.cern.ch (possible targets include pc-lar-scr-01, to 05, and pc-atlas-cr-03, 04, and 20) • Connect to outside world from P1 network : >ssh atlasgw-exp.cern.ch (from here, at Hostname you can type “lxplus” to open a browser, etc.) Very few web pages are accessible from within the P1 network. And, the P1 webserver (pc-atlaswww.cern.ch) is viewable only from P1, or with a proxy server. The P1 webserver is, however, mirrored (atlasop.cern.ch) for the outside world. More details on the connection of P1 to the outer world can be found at the following address: https://atlasop.cern.ch/FAQ/point1/39 . To find out if someone has a P1 account, you can use either of the following commands on a P1 machine: ldapsearch -xLLL ’gecos=*firstname*lastname*’ /daq_area/tools/bin/lfinger -U ’*firstname*lastname*’ To see what roles are enabled, you can use the second command, “lfinger” with the username of the person, without the “-U”. A.3 logout at P1 If you need to logout, and nothing works, press “Ctrl, Alt, Back space”. A.4 Printers A printer named 3162-1C01-HP is located at the 1st floor above the Atlas control room. 38 39 http://safety-commission.web.cern.ch/safety-commission/SC-site/sc pages/training/basic.html https://atlasop.cern.ch/FAQ/point1/ 65 A.5 Phone numbers LAr desks Atlas Control Room LAr satellite Control Room Tile desk Atlas Control Room LAr Run Coordinator 71346 70949 71446 162582 The list of expert phone numbers can be found here from the LArOperationManualShifter page40 (*)41 . A.6 Updating this document and checklists This document is created in Latex; the source files can be found in a SVN repositery. To get all the sources files, follow the standard SVN procedure: >export SVNGRP=svn+ssh://svn.cern.ch/reps/atlasgrp >svn co $SVNGRP/Detectors/LAr/AtLarOper/trunk AtLarOper The whole tree can be browsed on the web42 . The structure is very simple with a core tex file LArOperation.tex that includes all the sections. Please try to keep this overall architecture and try to use the new commands defined in LArOperation.tex to write down button, panel... names. Be careful about changing names of sections (the html files are named after the sections; changing a section name will break all links and require the checklists to be updated.) Once your modifications are commited in SVN, contact J. Leveque or S. Majewski in order to update the web. (They’ll update pc-atlas-www-1:/www/web files/html/lar). The checklists should be edited by experts only – and very carefully. One problem in one checklist can break all of the checklists in the Atlas Control Room! They can be found in /det/tdaq/ACR/XMLdata at Point 1. To make a new checklist: • Create the new XML file in the same directory as the others, for example, LAr-ex.xml • Copy the structure of the other xml checklists, giving the same title to your checklist as the name of the file. • Edit CheckList.xml in the same directory. Add your checklist in two places: into the top set with the \!ENTITY tag, typing the name of your checklist twice, and then below in the list of checklists once. The position in the bottom list will set the position of the checklist. 
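To illustrate the first two steps above (the template file name chosen here, signin-LAr.xml, is only an assumption; any existing checklist in the directory can serve as a model), one could start from a terminal at P1:
>cd /det/tdaq/ACR/XMLdata
>cp signin-LAr.xml LAr-ex.xml
Then edit LAr-ex.xml so that its title matches the file name, and register it in CheckList.xml as described above.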
Status of checklists: • signin-LAr replaces the old “LAr-Start-of-Shift”. It contains the tasks which shifters should complete when they first arrive. It is linked to the RunCom tool. Completing this checklist changes the Runcom state to “ready.” • startrun-LAr should be completed at the beginning of the shift, and at the beginning of a run. (startrun-LAr replaces the “desk” checklist.) • DQcheck-LAr is the checklist attached to the RunCom tool for the DQ status. It is short, to make sure that OHP is open and some quality control is done. It links to the longer LArOnline-Monitoring checklist. Completing these actions does not mean that the Data Quality 40 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#Phone Numbers https://atlasop.cern.ch/atlas-point1/twiki/bin/view/Main/LArOperationManualShifter#Phone Numbers 42 https://svnweb.cern.ch/cern/wsvn/atlasgrp/Detectors/LAr/AtLarOper/#path Detectors LAr AtLarOper 41 66 itself is OK, it just means that the checks are being performed. Finishing this checklist changes the DQ status from “unchecked” to “checked”. (This used to be the “shifter” checklist.) • injection-LAr contains tasks that should be performed before giving the OK for beam injection • stablebeam-LAr will be used to bring the FCAL voltages back up, once there is stable beam. Right now, it is just a placeholder. • LAr-Shift-Tutorial is a short tutorial for new shifters, to check their credentials and introduce them to the tools. • LAr-Online-Monitoring and LAr-Offline-Monitoring are meant to guide the shifters through the most important monitoring checks. They are still works in progress. • LAr-Calibration.xml and LArg-Calibration.xmlOLD are obsolete, replaced by the detailed instructions in this manual. They have been removed from the CheckList tool, but remain in the directory. Tips for latex2html: • To add a link to this document where ”link name” is a hyperlink to the URL ”link-URL” in the hypertext version of the document, but no indication of this is included in the printed version, use \htmladdnormallink {link name} {link-URL} • To include the original link as \htmladdnormallinkfoot {link name} {link-URL} a footnote, use • Using the LATEX package “graphicx”, you can include a picture without defining its extension (eps or png). It is the LATEX compilation which will choose the correct version. So latex will use eps format and latex2html will use png. It resquests to have different version of the picture in your directory. See the file “Appendix/hwMemento.tex” for example. A.7 Creating graphics (screenshots) at P1 1. Open KSnapshot from the General Menu (in the bottom of your left screen) or Open a terminal and Type ‘ksnapshot” 2. A panel will open, and you should select Capture mode: “Window under cursor” 3. Click “New Snapshot” 4. Click the window you want to capture, and it will show up on its own in the ksnapshot program. 5. Click “Save As...” and you can save the file. 6. Now, you need to copy it from P1. Quit knapshot, if you like. 7. Back in the terminal, in the directory with the file, you can copy it to your own home area with cp <filename> /atlas-home/<0 or 1>/<yourusername>/. Now you can go to a terminal on your own machine and do scp atlasgw:/atlas-home/<0 or 1>/<yourusername>/<filename>. 
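As a concrete (hypothetical) example of the last two steps, with home-area number 0, user name myuser and file name snapshot.png, all of which should be replaced by your own values:
On the P1 machine:
>cp snapshot.png /atlas-home/0/myuser/
On your own machine, outside P1:
>scp atlasgw:/atlas-home/0/myuser/snapshot.png .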
B Hardware memento
Table 5: Correspondence between slot and FEB type for all types of crates (FEB types listed in increasing slot order, starting from slot 1):
• EMB : PS, F0, F1, F2, F3, F4, F5, F6, B0, B1, M0, M1, M2, M3, -
• Std EMEC : PS, F0, F1, F2, F3, F4, F5, B0, B1, M0, M1, M2, M3, -
• Spe EMEC : PS, F0, M0, M1, F1, F2, F3, F4, B0, M2, M3, F5, B1, M4, M5
• HEC : I1 (Emec inner wheel), I2 (Emec inner wheel), HEC-L1, HEC-L2, HEC-M1, HEC-M2, HEC-H1, HEC-H2
• FCAL : F1 00, F1 01, F1 02, F1 03, F1 04, F1 05, F1 06, F1 07, F2 00, F2 01, F2 02, F2 03, F3 00, F3 01
Table 6: Total number of FEBs of a given type in a half barrel / endcap:
• EMB : 448 FEBs, ~57k channels
• Std EMEC : 208 FEBs, ~27k channels
• Spe EMEC : 68 FEBs, ~8k channels
• HEC : 24 FEBs, ~3k channels
• FCAL : 14 FEBs, ~1800 channels
C Few hints on events dump
If you want to further investigate a data integrity problem, it may be very useful to use the event dump program directly on the machine on which the data are written (see 5.2.5). For more details on the meaning of the Ctrl words, please refer to the documentation by J. Prast, py6414cyclone v29.doc (v2.9), available on the website http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html43 . One should finally note that most of the errors should be spotted by the LArFEBMon algorithm (running either online or offline). Detailed list to come.
An example dump starts with the ROD header and the DSP block:
ROD marker ee1234ee
ROD Hdrsize 9
ROD Eformat 3000008
ROD source 410110
ROD runnb 12046
ROD evtid 0 : 01
ROD bcid 63
ROD trigger 0
ROD evt type Transparent (coding of event types: 2 : Calibration / 4 : Transparent / 7 : Physics format)
DSP block Size 2343
DSP FebId 0x39300000 Feb Side EMBA 04L PS
DSP FebSer 000001020
DSP Off Energy 0 Size 0
DSP Off Chi2 0 Size 0
DSP Off RawData 26 Size 2330
DSP Status 0x00000000
DSP Nb gain 1
DSP Nb samp 32
DSP Feb config 0x00000003
DSP Unknown 0x00000000
The offsets and sizes of the different blocks depend on both settings (RawData/Results and Format) in the LAr H/W control panel.
The Ctrl0 word contains 16 blocks (1 for each gain selector). The first 4 bits correspond to a parity flag; the following 4 bits identify the gain selector (therefore all different, between 0 and f); the last 8 bits (here : a0) correspond to the EVTID and should be equal for all gain selectors.
Ctrl0 8a0 40a0 49a0 1a0 4aa0 2a0 ba0 43a0 4ca0 4a0 da0 45a0 ea0 46a0 4fa0 7a0
43 http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html
The Ctrl1 word contains the BCID for the 16 gain selectors. It should be the same for all gain selectors.
Ctrl1 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62
The number following the Hdr block corresponds to the sample number; it must be between 0 and Nb samp (see above in the DSP block). The 16 following numbers (here 59) correspond to the number of the SCA cell (in hexadecimal) that is read out (1 per gain selector): they must all be equal.
Hdr 00 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59
The number following the Samp block corresponds to the sample number again; then the raw data for the 16 gain selectors are written, together with the gain (here h for High). This type of line is repeated 8 times to complete the 128 channels of the FEB.
The number following the Samp block corresponds to the sample number again; then the raw data for the 16 gain selectors are written, together with the gain (here h = High). This type of line is repeated 8 times to cover the 128 channels of the FEB.

  Samp00 h 1000 h 955 h 980 h 971 h 994 h 1003 h 915 h 972 h 975 h 966 h 983 h 1014 h 985 h 994 h 1009 h 986
  Samp00 h 987 h 946 h 962 h 954 h 973 h 993 h 987 h 997 h 990 h 944 h 971 h 1021 h 1000 h 966 h 965 h 996
  Samp00 h 1010 h 936 h 957 h 943 h 1027 h 1000 h 957 h 973 h 993 h 964 h 943 h 1032 h 995 h 963 h 991 h 987
  Samp00 h 989 h 922 h 989 h 993 h 957 h 964 h 936 h 953 h 985 h 960 h 953 h 1024 h 1009 h 981 h 1013 h 1006
  Samp00 h 953 h 940 h 988 h 970 h 1004 h 995 h 959 h 966 h 991 h 922 h 948 h 996 h 980 h 988 h 978 h 986
  Samp00 h 947 h 941 h 972 h 980 h 970 h 998 h 925 h 958 h 975 h 956 h 971 h 1015 h 995 h 991 h 1005 h 1020
  Samp00 h 950 h 965 h 957 h 986 h 1006 h 991 h 988 h 950 h 997 h 973 h 970 h 1025 h 962 h 982 h 1004 h 993
  Samp00 h 936 h 927 h 963 h 972 h 990 h 997 h 955 h 947 h 987 h 925 h 941 h 979 h 1012 h 971 h 967 h 1022

The same pattern repeats for the following samples (Hdr 01 followed by eight Samp01 lines, Hdr 02 followed by eight Samp02 lines, etc.). It is worth noting that, for reasons of time/memory optimisation, the SCA cell numbers are encoded in such a way that they are not consecutive from one sample to another (here 59 for sample 00, then 58, 48 and 49 for samples 01 to 03). This is normal.

  [Hdr 01 / Samp01 ... Hdr 03 / Samp03 blocks: same format as above]
Here the blocks of data for samples between 04 and 31 are skipped ...

The Ctrl3 word contains 16 blocks coded in hexadecimal (one for each gain selector). The 4 first bits are related to parity information; the 4 following bits to bit errors and SEU: 8 means OK. The 8 last bits contain the SPAC status: it should be equal to 07 for the first event and 05 for the following ones. A SPAC error should normally be propagated to the INFPGA status.

  Ctrl3  4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807

The INFPGA status contains several checks regarding BCID, EVTID and SCAC performed at the FEB level. It should normally be equal to 0x0.

  INFPGA status 0x0  Unknown 0  samples 32  gains 1

D What to remember from the old discussion forum?

D.1 On the triggers

How to find which triggers are included in a trigger stream (e.g. L1Calo)?
• Open a terminal and type: /det/tdaq/scripts/start_trigger_tool
• Log in as user.
• Double-click on the correct SUPER MASTER KEY (you can find this in the top field of the trigger tab in the tdaq panel). Wait for this to load. Ignore the window which opens automatically and use the original window which opened.
• Click on L1 Streaming and a new window will open.
• Double-click the stream you want (e.g. L1Calo) and a list of the triggers comes up.
You can also use Streams instead of L1 Streaming for L2 and EF info.

How can I learn specifics about the definitions of triggers being run right now?
• From a terminal:
  > /det/tdaq/scripts/setup_TDAQ_15.2.0.sh
  > /det/tdaq/scripts/start_trigger_tool
• The trigger GUI will pop up. Log in as ”User”.
• Click Search. You will get a list of all SuperMaster keys.
• How to find the SuperMasterKey currently used in the run:
  – On the AtlasOperations wiki page, click on the Run Control WhiteBoard (at the top of the page).
  – The trigger menu is given in the ”basic Run parameters” section.
• Choose the corresponding masterKey from the list in the trigger GUI. You will see all of the L1 trigger bits. Double-click on the ID number and a window will pop up showing the L1 triggers. One can expand any of them to see the logical definition and the different L1 conditions. Keep expanding down the tree to see thresholds, etc...

D.2 Data flow picture

E More info about LAr FEB errors

The LArFEBMon algorithm performs different checks on the data integrity. The different errors reported are briefly detailed here; for more details on the bit significance, you should refer to table 1 of the reference document pu6414cyclone v29.doc, downloadable at the address http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html44
The data sent by one Front End Board (FEB) are always accompanied by a StatusWord describing the status of several internal FEB components. This status word, coded on 12 bits, is encapsulated in the Digital Signal Processor (DSP) header for further decoding.
From this StatusWord, the algorithm derives 6 types of errors, depending on the observed error bits (a small decoding sketch is given at the end of this appendix):
• Bit 6: parity error;
• Bits 2 or 7: BCID mismatch between the 2 halves or within one half;
• Bits 3 or 8: sample header mismatch between the 2 halves or within one half;
• Bits 1 or 9: EVTID mismatch between the 2 halves or within one half;
• Bits 4, 11 or 12: wrong SCAC status within one half of the FEB;
• Bit 5: gain mismatch within time samples.

In addition to these 6 types of errors, 4 further checks are performed45:
• Type mismatch: the data blocks of several FEBs are of different types (Raw data, Physics data, or Calibration data). The first readout data block is taken as reference.
• SCA out of range: the decoded SCA address (the SCA is the analog pipeline located between the shaper and the ADC in the readout chain of the FEB) is outside the physical range [0;144]. This is probably due to a more severe data corruption or a bad bit stream conversion.
• Non-uniform number of samples: the data blocks of several FEBs have a different number of samples. The first readout data block is taken as reference.
• Empty FEB data block: one FEB does not send any data (but the presence of a DSP header proves that it is included in the readout and therefore should send data).

44 http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html
45 In the case of calibration runs or physics runs in physics format, the SCA block is not available, which prevents us from performing the 3 last checks detailed hereafter.
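As a rough illustration of the bit assignments listed above, the shell lines below flag which error categories are set in a given StatusWord. This is only a sketch: the 1-to-12 bit numbering (bit 1 = least significant bit) and the example value are assumptions made here for illustration, not taken from the LArFEBMon code.

  status=0x044   # hypothetical example value, here with bits 3 and 7 set
  # bit n (n = 1..12) is tested with (status >> (n-1)) & 1
  (( (status >> 5) & 1 ))                                      && echo "bit 6: parity error"
  (( ((status >> 1) | (status >> 6)) & 1 ))                    && echo "bits 2/7: BCID mismatch"
  (( ((status >> 2) | (status >> 7)) & 1 ))                    && echo "bits 3/8: sample header mismatch"
  ((  (status       | (status >> 8)) & 1 ))                    && echo "bits 1/9: EVTID mismatch"
  (( ((status >> 3) | (status >> 10) | (status >> 11)) & 1 ))  && echo "bits 4/11/12: wrong SCAC status"
  (( (status >> 4) & 1 ))                                      && echo "bit 5: gain mismatch within samples"
  # with 0x044 this prints the BCID mismatch and sample header mismatch lines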