LAr operation manual
This document is focused on the operation of the LAr calorimeter in the ATLAS Control Room. For more details on the hardware, you should refer to the original LAr Operation Manual that can be found here: https://edms.cern.ch/document/834898/ . If you are reading this manual in PS or PDF, please realize that it is being updated each week. Feel free to use it as a general reference, but for any particular details (which scripts to use, where to find the data, etc.) refer to the version on the web at P1, http://pc-atlas-www.cern.ch/lar/doc/AtLarOper.html/index.html , or outside P1, https://atlasop.cern.ch/atlas-point1/lar/doc/AtLarOper.html/index.html . In general P1 links are given first, and links outside P1 are identified with the (*) symbol.

Contents

1 Getting Started
1.1 Before you arrive for shifts
1.2 When you first arrive for shift
1.3 During the shift
1.4 End of shift

2 Calibration Runs
2.1 Taking Calibration Runs
2.2 Transferring data to castor at the end of a run
2.2.1 Automatic processing of the calibration runs
2.2.2 Number of events to complete a run
2.3 Monitoring of the Calibration Runs
2.4 Old way to take calibration runs
2.5 Special calibration runs
2.5.1 SCA test runs
2.5.2 Start a trigger calibration run

3 Physics Runs
3.1 What to do at the start of a physics run?
3.2 Monitoring of physics run

4 Environment
4.1 The DAQ panel
4.2 Opening the Monitoring Advanced Panel
4.3 Basic functionalities of the DAQ GUI
4.3.1 LAr H/W control parameters
4.3.2 Complex deadtime and the Central Trigger Processor
4.3.3 Segment and Resource
4.3.4 LAr Crates
4.3.5 The Run parameters
4.4 Description of different processes involved
4.5 OKS database
4.6 DCS - Detector Control and Safety
4.6.1 Check the LVPS Status
4.6.2 Check the ROD Crate Status
4.6.3 Check HV Status
4.6.4 Check the DCS Alarms Screen
4.7 Using the ATLAS e-log (ATLOG)
4.7.1 Access and use of Elog
4.7.2 Information to put in the Elog

5 Monitoring and Data Quality
5.1 Monitoring Displays
5.1.1 DQMD
5.1.2 OHP
5.1.3 Trigger Presenter
5.1.4 Atlantis
5.1.5 Other monitoring tools at P1
5.2 Where to find the monitoring data?
5.2.1 Setting up ROOT
5.2.2 Online Calibration ROOT Files
5.2.3 Online Physics ROOT Files
5.2.4 Offline Physics ROOT files
5.2.5 Using the event dump
5.3 Data Quality Checklists
5.3.1 Online - Calibration Runs
5.3.2 Online - Physics Runs
5.4 Monitoring plots description
5.4.1 Run Parameters
5.4.2 Detector Coverage
5.4.3 Data Integrity DSP
5.4.4 Data Integrity FEB
5.4.5 High Energy Digits
5.4.6 Timing
5.4.7 Energy Flow
5.4.8 Quality Factor
5.4.9 MisBehaving Channels Digits
5.4.10 MisBehaving Channels CaloCells
5.4.11 MisBehaving Channels RawChannels
5.4.12 CaloGlobal

A Tips to work at P1
A.1 Access rights
A.2 Network
A.3 Logout at P1
A.4 Printers
A.5 Phone numbers
A.6 Updating this document and checklists
A.7 Creating graphics (screenshots) at P1
B Hardware memento

C Few hints on events dump

D What to remind from the old discussion forum?
D.1 On the triggers
D.2 Data flow picture

E More info about LAr FEB errors

1 Getting Started

Welcome to Liquid Argon ATLAS shifts! This manual should help you to complete all the tasks given to a LAr shifter. When you find errors, things that are out of date, or sections that are confusing or could be improved, please let us know. During the start-up in 2009, we will have a huge number of new people going through this document for the first time. One of the jobs of those LAr shifters is to help make shifts better, by letting us know what we can do to improve. Please put your comments/questions/additions on the LAr Bug Reports4, where they can be seen and addressed by experts. If you have any questions, you can contact the Run Coordinators: Paolo Iengo, Jessica Leveque, Stephanie Majewski, or Damien Prieur.

1.1 Before you arrive for shifts

There are several things you can do before you arrive for shifts.

1. Get access to the Control Room. You will need a CERN ID with access to the "ATL CR" region to get through the door of the main control room. If you have not taken the CERN safety courses, you need to take the "Basic safety course" in person at CERN, offered twice each day in English and French. After that, take the "level 4" course online. Then request ATL CR access in EDH. More details can be found at https://atlasop.cern.ch/twiki/bin/view/Main/LArShiftSignup5.

2. You will also need access to the LAr satellite control room, next to the main control room. To do this, go with your CERN ID to S. Auerbach (located at 124-R011). Tell him you will take LAr shifts and request access to 3159-R012. He will place your ID into a machine and add the access rights there.

3. You will need accounts for the Point 1 machines and e-log. The LAr Operations crew will request these accounts for you once you have signed up for shifts, and you should receive an email letting you know. The password for your P1 account will be the same as your NICE password for other applications at CERN.

4. You should read the most recent slides under "LAr Shifter Tutorials" found at this page from outside P16 or at P17, and go through this manual.

1.2 When you first arrive for shift

Log into the RunCom tool ("New Shifter" and "Ready") and follow the instructions from the "signin-LAr" checklist that opens. If you do not know how to do this, here is a recipe:

• If the computers at the LAr desk are not already logged in, start the session. The username is "crlar" and you can just hit "Enter" in the password field without filling in a password. When prompted to select a role, choose "LAR:shifter".
4 https://savannah.cern.ch/bugs/?func=additem&group=lar https://atlasop.cern.ch/twiki/bin/view/Main/LArShiftSignup 6 https://atlasop.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#LAr Shifter Tutorials 7 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#LAr Shifter Tutorials 5 4 • The RunCom tool will take you through the actions you need to accomplish as the shift begins. Make sure the RunCom tool is open, if it is not, click on the bottom menu bar: “General” → “RunCom Tool”. Click “new shifter” and then “Ready”. The “signin-LAr” checklist will pop up. Complete this checklist. • If the RunCom tool is broken, or if the checklists are broken, let the shift leader know immediately. You can access LAr documentation by opening a web browser. The ATLAS home page should come up. Select ”Documentation” on the left, and then ”LAr”. 1.3 During the shift You will have to go through all the checklists during your shift (accessible from the “LAr” menu → “LAr Checklists”). When a new run is starting, your first priority is to look at the DQMD monitoring. During the first 3 minutes of a run, it is ESSENTIAL to spot immediately data integrity problems which may need to stop the run and start a new one. When you find some free time during your shift, read the calibration section (section 2) and the Calibration checklist, to be prepared for the calibration taking period which could be in a hurry. The rest of this manual should provide the information necessary to deal with tasks and problems during the shift as they come up. Let us know what you find lacking. When problems come up, the procedure should be to 1. Know the information on the LAr WhiteBoard web page. You should read it when you first arrive for shift, and keep it in mind. It may contain instructions that are new or especially vital for this particular shift. 2. Look at the wiki “Guidelines for LAr errors” which contains procedures for dealing errors on the fly, such as the LAr being “busy”, TDAQ error messages, monitoring PT’s crashing, OHP not showing plots, etc. 3. Next, check this manual ONLINE. Do not rely on a paper copy which will become outdated. The official version is the one on the web. 4. If you cannot find the answer to your question on the WhiteBoard, Guidelines for Errors, or in the manual, call relevant experts and please make a note of the fact that you couldn’t find the documentation you needed in your e-log entry. 1.4 End of shift At the end of the shift, you will need to finish your e-log Shift Summary, and submit it IMPERATIVELY 15 MINUTES BEFORE the end of your shift. Choose Message Type : Shift Summary, ShiftSummary Desk : LArg, System affected : LArg, Status : closed, Subject : “Shift summary for LArg Desk”. You should go through this Shift Summary with the crew of shifters that come after you, to clarify any points. 5 2 Calibrations Runs 2.1 Taking Calibration Runs Please follow these instructions to take standard LAr Calibration Runs in the Main ATLAS Control Room or the LAr Satellite Control Room. You will take three sets of runs, using a script to start each set, and closing the GUI (and DQMD and OHP) fully after each set. If you encounter problems, have a look at the Troubleshooting page8 (*)9 . 1. Inform the shift leader and run control shifter that LAr would like to be “out” for the calibration period (and therefore that LAr should be removed from the ATLAS partition). 
If any LV power supplies have been turned on immediately before calibrations, check the Troubleshooting page to see what actions to take.

2. Define with the Run Coordinator which set of runs you need to take. "Weekly runs" correspond to Pedestal runs in 32 samples mode, Ramp runs in 7 samples and Delay runs in 32 samples. "Daily runs" correspond to Pedestal and Ramp runs, both in 7 samples; there is no Delay run in this case.

3. Start the Calibration Checklist from the LAr Menu, and follow those instructions. The following information is supplemental.

4. Copy and paste the ELOG template10 into a text editor (kedit) to use during the calibration runs.

5. Verify that no one else is using LAr. To do that, look at the ATLAS Data taking Status11. If LAr is still in the ATLAS partition during the period allocated to calibration, it is OK as long as the Root Controller State is NONE. Be especially sure to communicate with the shift leader / run control desk; make sure you are finished with any calibration partitions before they boot the ATLAS partition.

6. With the nominal settings on the DAQ Panel, click on the LAr tab (see Figure 1). Click on "Calibration Runs" to start the calibration script (note that you no longer need a terminal). Choose a partition from the drop-down dialog box, then click OK. Unless otherwise instructed, start with the EM partition, followed by HECFCAL, and finish with the PS partition. This will open the TDAQ GUI, OHP and DQMD. Other partitions (listed in Table 1) are special cases listed here for reference and should not be used unless you were explicitly asked to by the run coordinator.

7. Switch off "Enable ATLOG interface" in the "Settings" menu of the TDAQ GUI, to avoid too many ELOG entries.

8. Open an MRS window (from the button at the top of the TDAQ GUI window, or just use the window at the bottom of the GUI) and change the number of messages to 2000 (in the "Number of visible rows" field).

9. Load the "MasterPanel" inside "LoadPanels" in order to see the "LAr" tab.

8 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArTroubleShootingCalibration
9 https://atlasop.cern.ch/twiki/bin/view/Main/LArTroubleShootingCalibration
10 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary Calibration.txt
11 http://pc-atlas-www.cern.ch/wmi/RunStatus.html

PartitionTag   Description                                               PartitionName
EM             full Barrel and EMEC A and C sides (including EMEC PS)    LArgEm
HECFCAL        HEC and FCAL                                              LArgHecFcal
PS             Barrel PreSampler                                         LArgBarrelPS
EMB            full Barrel                                               LArgBarrel
ALL            All LAr                                                   LArgAll

Table 1: List of available partitions. LArgBarrel is used only if the endcaps are unavailable. LArgAll is not yet used for Calibrations.

10. Go now into the "Shifter" tab, under the "LAr" tab.
• Press the "Daily Run" or "Weekly Run" button ONCE. Do NOT click any other buttons in the Shifter panel or the Run Control panel.
• Select "yes" in the confirmation window that will appear.
• A window "Remember to check the settings" opens at the beginning of each new run; answer "OK".
• At this point, the set of runs for the applicable partition will be taken. Current information relating to the run(s) in progress will be displayed in the text area called "Information" on the panel. For more details, there is another text area, "Log & Thread Information", which contains a log of the completed runs and whether or not the user aborted the runs.
Next to it, are a series of fields which display the status of threads used by the panel - green is for an active thread, red for a thread that is no longer active, and the default gray color means that the thread hasn’t been executed. • At the end of each run, the Data Integrity is automatically checked. If there is an error, two windows pop up : Figure 1: The LAr Tab on the DAQ panel, showing the “Calibration Runs” button. 7 (a) “The run had errors, please check the log and determine if you need to retake the run ”. Answer “YES” or “NO”. (b) the log information. Click “YES” to retake the run; “NO” to continue. Please use your judgement when deciding to retake runs (based on whether you are taking a weekly or daily set, how much time you have before the end of the calibration period, how many FEB errors occurred, etc.). If you are not sure, consult with the experienced shifters or the run coordinator. If the error is related to a known problematic FEB, don’t take the run again (→ the error will stay for all runs). In the “Log & Thread Information” area, you will see which of the runs have failed the Data Integrity check and your decision to retake them or not. • In case of problems, the shifter may stop the runs by pressing the “ABORT” button within the “Shifter” panel. This is done to ensure that all of the threads executed by the panel have fully exited. • Watch for important messages scrolling in the MRS window. If you see any errors, check the Troubleshooting page (*) and the Whiteboard to see if the data are still good, or how to recover from the errors. Copy the messages in your ELOG summary. • Look at the DQMD and OHP windows during the run. If you see any red partitions in DQMD, the complete campain for this partition should be retaken. Make a note of the problem in your ELOG entry. • At the end of the data taking, copy the information about the calibration runs from the Emacs session which will open, into your ELOG summary. Verify the contents of these lines: proper gains, sample numbers, and run numbers! Some information about the data integrity check is also provided. If you encounter problem with this Emacs session, look in your home directory at rf_cal_runs.log (formatted log with data integrity check); note that this file is always overwritten, so they only reflect the last set taken. You can always look into the complete Calibration log file in ~lardaq/LAr-CalibrationRuns.log (if this file has *not* been automatically updated, report the problem to the run coordinator), but in that case, you will have to run yourself the script of section 5.3.1 to access data integrity information. • Remark: You don’t need anymore to copy the runs to CASTOR, it is done automatically. 11. When all of the runs for one partition are finished, click on “SHUTDOWN” in the “Shifter” panel and then close the TDAQ GUI by clicking the EXIT button in the file menu (top lefthand corner). When asked if you want to shut down the partition infrastructure, say “yes”. Also close all DQMD and OHP windows. 12. Start again from the LAr tab of the DAQ Panel for the other partitions (go back to item 4). 13. Post only one ELOG entry for the whole sets of runs. For standard calibrations, please choose Message Type: LArg, LArg EntryType: Calibration Summary, select the appropriate LArg Partitions, and for the Subject line: use the proposed template in the ELOG summary. 2.2 Transferring data to castor at the end of a run The data are normally copied automatically to CASTOR. 
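If you want to spot-check the automatic copy by hand, the minimal sketch below illustrates the idea. It assumes you can ssh from the control room desk to one of the event builder machines (here the Barrel A side one, pc-lar-eb-02; the full list is given below), that the CASTOR listing command nsls is available where you run it, and that the run number appears in the file names; <RunNumber> is a placeholder to replace. When in doubt, prefer the check_calib_run.sh script described just below.

># is the run still sitting in the copy queue of the event builder?
>ssh pc-lar-eb-02 'ls /data/copy | grep <RunNumber>'
># or has it already reached CASTOR?
>nsls /castor/cern.ch/grid/atlas/DAQ/lar/ElecCalib/2009 | grep <RunNumber>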
You can check this with the script /det/lar/project/scripts/check_calib_run.sh. The calibration runs should appear in /data/copy. If that is not the case, they may already appear in CASTOR; check the location /castor/cern.ch/grid/atlas/DAQ/lar/ElecCalib/2009.

If you need to copy the runs yourself, use the script /det/lar/project/scripts/copy_run_to_castor.sh, which takes one argument, the run number. Execute the script once for each run.

If you want to look at things in more detail:

• Log on to the machines where the data are written (pc-lar-eb-02 (A side) and pc-lar-eb-03 (C side) for the Barrel event builders, pc-lar-eb-01 (A side) and pc-lar-eb-04 (C side) for the Endcap event builders). When a run is considered good, the data should have been moved from the /data/check directory to the /data/copy directory of the event builder machine. The data will then be automatically copied to CASTOR by a daemon script. Given the huge amount of acquired data, a regular cleaning of the /data/check directory is performed: any data older than 1-2 days may be erased from it.

• Experts only: If you want to keep data in a safe place for further debugging but do not want them copied to CASTOR and processed, you can copy them to /data/temp, indicating in RunLog.txt why you want to keep these data. This is especially useful for storing runs with many data integrity problems that should be debugged by an expert. The runs and errors concerned should also be reported in the Elog entry so that experts are aware of the problems.

2.2.1 Automatic processing of the calibration runs

A daemon is in charge of handling the Automatic Processing (AP) of the calibration runs. A cron job regularly checks whether new calibration runs have been copied to CASTOR and whether the corresponding information is stored in the database. If so, the AP is launched. You can check the status of electronic calibration runs on the following link12. In general, it takes from 10 to 30 minutes between the end of the data taking and the start of the AP. If all partitions are complete [EM, BarrelPS, HECFCAL], the script waits only 10 minutes after the last run was taken and copied to CASTOR before launching the AP. If a partition is incomplete or missing [for example, only the EM partition was taken], the script waits about 30 minutes after the last run was taken and copied to CASTOR before launching the AP. The purpose of this safety period is to keep the opportunity to gather runs that might be taken again because they appear to be corrupted.

If you discover a problem concerning the running of the AP, log on to the hypernews for electronic calibration13 (only accessible from outside P1). Check whether people have already reacted to the automatic message from the AP. If not, send a message to the ECAL team using this mailing list. If the problem seems to be related to the quality of the data, re-take the complete campaign for that partition as soon as possible.

2.2.2 Number of events to complete a run

In the case of pedestal runs, the maximum number of events is given either by the default value or in the Run parameter panel. In the case of delay/ramp/cabling runs, it is automatically determined by the calibration pattern:

• Ramp: N_events = N_substeps × N_events per substep × N_patterns × N_fine delays
• Delay: N_events = N_substeps × N_events per substep × N_patterns × N_DAC values

For more information on the patterns, the definition files can be found in: ~lardaq/LAr-CalibrationRuns.log.
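As a purely illustrative example of the formula above (the pattern parameters here are invented for the arithmetic only, not taken from any real pattern file): with N_substeps = 1, N_events per substep = 100, N_patterns = 4 and N_DAC values = 16, the run would stop by itself after

N_events = 1 × 100 × 4 × 16 = 6400 events.

Comparing the event counter in the TDAQ GUI with the number expected from the pattern is a quick way to judge whether a run is progressing normally.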
12 http://lar-elec-automatic-processing.web.cern.ch/lar-elec-automatic-processing/
13 https://groups.cern.ch/group/hn-atlas-lar-electronic-calibration/default.aspx

2.3 Monitoring of the Calibration Runs

After typing > source /det/lar/project/scripts/LArCalibRunSetup.sh [PartitionTag] in a P1 terminal, two windows open in addition to the TDAQ GUI: OHP and DQMD. If this does not work, you can also launch the applications with:

• > dqmd -p [PartitionName] for DQMD
• > ohp -p [PartitionName] -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifterCalib.ohp.xml for OHP

where [PartitionTag] and [PartitionName] are defined in Table 1. OHP and DQMD allow you to monitor calibration data during the run. After the runs have been taken, monitoring can be done on ROOT files as mentioned in section 5.3.1.

2.4 Old way to take calibration runs

The complete procedure explained in section 2.1 is still valid, but in place of clicking on the "Weekly" or "Daily" button (item 7), you may want to follow the basic instructions for taking specific calibration runs:

• Go to the Shifter panel (under the LAr panel).
• Choose Gain - High, Medium, or Low.
• Press the CONFIGURE button - this is right above the information field. DO NOT click the buttons to the left in the Run Control panel.
• Choose Monitoring Tools - by default, only FebMon is selected.
• Choose Run Type - Pedestal, Delay, Ramp. You should follow the program in the ELOG template for the run types and number of samples you should use.
• Decide the Number of Samples - generally, you should keep the default as shown, 0 (but be careful for pedestal runs: choose 32 for weekly, keep 0 for daily, as 7 is the present default).
• Press the RUN button.
• The run should be stopped by the shifters only in the case of problems.

In this procedure, no Emacs session will pop up. The information about the calibration runs which have just been taken can be found in the Calibration log file ~lardaq/LAr-CalibrationRuns.log.

2.5 Special calibration runs

1. Before starting a calibration run, make sure that LAr is not in the combined partition.

2. Get the following information from the expert(s) who requested the run. Also check the list of special runs below if it is one that is commonly taken (for example, SCA test runs).
• Partition(s)
• Gain(s)
• Calibration run type (Pedestal, Delay or Ramp)
• H/W settings for the runs
  – Run type
  – Filename tag, if specified
  – Number of samples
  – L1 Latency and First Sample
  – Data format
  – Calibration tag for Ramp runs

3. Launch a new shell, and type the following command to set up the DAQ online environment: > source /det/lar/project/scripts/LArCalibRunSetup.sh [PartitionTag], where the [PartitionTag] depends on which part of the detector you want to run (see Table 1).

4. To set the run parameters in the TDAQ GUI:
• In the left panel, hit Boot.
• Open the "LAr H/W Control" tab. Set the parameters in PARAMS GLOBAL according to the expert(s) instructions:
  – nbOfSamples
  – gainType
  – l1 Latency
  – firstSample
  – format
  – runType
  – InhbDelay
  – Do not forget to load the new values by clicking the button on the bottom-right with the green arrow + disk as shown in Figure 2.
• Go to the "Run Info & Settings" small window on the left, and in the "Settings" tab, set the run parameters according to the expert(s) instructions:
  – Run Type: "LArPedestal", or "LArCalibration" for ramp and delay runs, other for other runs
  – Tier0 Project Name: use "dataXX calib" (with XX for the year, like 09)
  – Filename Tag: include Type and gain in the file name (e.g.
Filename=”PedestalHigh”). Pay attention, it should not be too long! – Recording: “Enable” – do not forget to hit Set Values at the bottom of the window once the set up is done 5. In the left panel of the TDAQ GUI, hit Initialize. 6. Wait for all segments to be set as INITIALIZE in black on blue in the Root Controller. Then hit Config. 7. Once all segments show up as CONNECTED in black on yellow, hit Start. 8. Once the run is finished hit Unconfig and Terminate. Restart the same procedure for the next calibration runs. 9. Once all calibration runs are taken, before exiting the DAQ GUI you have to hit Unconfig, Terminate and Shutdown. Then exit the GUI cleanly by using the Exit button in the file menu, located in the left right corner (NOT the red CROSS!). 11 10. Write info about the calibration runs in a dedicated ELOG entry. Information about the runs are found on the event building machines (pc-lar-eb-01,pc-lar-eb-02,pc-lar-eb-03,pc-lar-eb-04). To extract this info, open a terminal in the Control Room, ssh on the events builders and look at the file ~lardaq/LAr-CalibrationRuns.log (by reading it with “more” “less” “tail” etc.) Copy and paste the relevant lines into the ELOG entry about all your calibration runs. If this file has *not* been automatically updated, report the problem to the run coordinator. 2.5.1 SCA test runs For SCA test runs, follow this pattern: • In Step 4, for Run Parameters: – Run Type: SCATest – Recording: Enabled – Filename tag: scaleak – Max Nb events: 0 • In Step 4, for the H/W control panel info, write down the intial values for these parameters (you will need them later), then change them to: – nbOfSamples = 7 – gainType = H, M, or L depending on the run – l1 Latency = 17 – firstSample = 3 – format = transparent – runType = RawData – InhbDelay = 72 – Do not forget to load the new values by clicking the button on the bottom-right with the green arrow + disk as shown in Figure 2 • For the LAr Calibration Manager Panel: – Sequence type = Test – Tag = HighSCATest (or MediumSCATest or LowSCATest, depending on the gain), set for each subdetector if there is more than one – Do not forget to click the bottom right hand button with the blue gear to save the settings as shown in Figure 3. • Now follow the instructions above to go through Initialize, Config, Start, Unconfig, Terminate and Shutdown. • Reset the values in the H/W control panel and exit the Gui. • Transfer the data to Castor, using the link above. 12 2.5.2 Start a trigger calibration run For experts only. Partitions, etc, in this section are outdated. The acquisition of trigger calibration runs is not yet automatized. Moreover, only the barrel case is for the moment fully implemented. To start a trigger calibration run, first go in the Segment and Resource panel and enable the following segment : Larg_L1Mon_[01,02] in the LARG_EMB[A,C] segment. Be careful to only enable the segment and NOT the sub tree. There is one trigger crate per half barrel, each crate being connected to 2 ADCs of 8 channels that allow to test only one front end crate per run and per half barrel. The choice of the tested front end crate is made by configuring the USB controller in the Global Params of the Larg L1Mon [01,02] object defined in the LAr H/W control panel (see section 4.3.1). Check that the number of samples acquired by the ADC is set to 27. Then if you modified something, do not forget to reload the databases by clicking the icon on the bottom right. 
To avoid acquiring data for all the configured front end crates, it is recommended to switch off all the GLinks by running the script: >~lardaq/bin/stop_otx.csh. With all GLinks switched off, only the relevant ones are reconfigured and therefore send their data to the RODs.

Finally go to the Trigger panel (see Figure 4) and modify the different properties:

• Detector type;
• Connectivity or Calibration: choose Calibration;
• TBB (EMB and EMEC) or TDB (HEC or FCAL);
• Gain: choose medium;
• nSamples: 12 (this is the number of samples sent by the front end boards);
• nTriggers: 100.

Then click on run.

Figure 2: The red arrow points to the button to click to upload the configuration changes.
Figure 3: The red arrow points to the button to click to upload these changes.
Figure 4: The Run Control panel (need to get a better picture).

3 Physics Runs

3.1 What to do at the start of a physics run?

1. Launch the DAQ panel (see 4.1) if not already open.

2. Check the parameters for the DAQ panel. You may need to browse for the configuration, and click into the box where "Browse" is highlighted.
• Setup Script: /det/lar/project/scripts/LArShiftSetup.sh
• Part Name: ATLAS
• Database file: /atlas/oks/tdaq-XXX/combined/partitions/ATLAS.data.xml (replace XXX by the latest version of tdaq; to find it, start browsing /atlas/oks/ and you will see all the available tdaq versions)
• Setup Opt: -newgui
• MRS Filter: LAR
• OHP Opt: -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifter.ohp.xml (the variable $OHPSEARCHPATH will be automatically defined by the Setup Script)
• TriP Opt: -c $OHPSEARCHPATH/trigger/trp/trp_gui_conf.xml

3. Click on "Read Info" to get the information specific to the chosen partition. It will take a minute, please be patient. A box should pop up and say "Information read out. You can proceed." Click OK.

4. Open several windows from the DAQ panel by clicking on the following buttons:
• Monitor Partition: will open the "DAQ GUI"
• Busy: will open the "busy panel" to show the state of the LAr
• MRS
• OHP
• DQMD

5. Before the start of the run, make sure you have the instructions from the Run Coordinator (look at the WhiteBoard or call him). You need to know:
• In which mode the data will be taken: Physics mode?
• The number of samples
• The mode of readout: RawData? Which format?
• The L1 latency and the first sample parameter
• Which parts of the detector have to be included

6. When the LAr shifter is asked by the ATLAS shift leader if LAr can go back into the combined partition: if all LAr standalone work is completed (calibration data taking, expert work on the LAr system...) and LAr is ready to go back, the LAr shifter answers: "Yes, we are ready to go in. Please re-enable us." It may happen that the LAr segment (in the Segment & Resource panel) is already enabled, for example if LAr was in a calibration time slot (doing standalone work) and no combined partition was restarted during that time slot. But it is better in any case to ask for it.

7. THEN: ONLY after the LAr segment has been re-enabled, the LAr shifter can see/check/modify the run parameters in the online H/W panel (such as the number of samples, l1 latency...). More generally, if the LAr shifter needs to look at any LAr panel in the combined partition, (s)he needs to start the ATLAS combined DAQ GUI (clicking on "Monitor Partition" from the DAQ panel). Then (s)he can see those panels and proceed, BUT ONLY if the LAr segment has been enabled in the combined partition.

8.
If not already done, copy and paste the e-log template14 (*)15 to a text editor. 9. Open the “startrun-LAr” checklist form the “LAr” menu of the computer. You will have to check the DCS status and start the monitoring. 3.2 Monitoring of physics run 1. Open OHP through the DAQ Panel. 2. If it does not work or if you want to open many OHPs at the same type, you can use the commands : >source /det/lar/project/scripts/LArShiftSetup.sh >ohp -p [PartitionName] -c $OHPSEARCHPATH/lar/ohp/LArMonitoringShifter.ohp.xml where [PartitionName] is the one in the 3rd column of Table 1. 3. Open DQMD through the DAQ Panel. 4. Open the “LAr-online-Monitoring” checklist The monitoring histograms are detailed in sections 5.4. You can also have a look at the LAr DQ Policy Description16 (*)17 . 14 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary.txt https://atlasop.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=ELOG Summary.txt 16 http://pc-atlas-www.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=LArDQPolicy.pdf 17 https://atlasop.cern.ch/twiki/bin/viewfile/Main/LArOperationManualShifter?filename=LArDQPolicy.pdf 15 17 4 Environment 4.1 The DAQ panel The DAQ Panel provides access to most of the tasks you will need to complete as a LAr shifter. To launch the DAQ Panel in the control room, go to the TDAQ menu at the bottom of the screen and click on DAQPanel. The panel shown in Figure 5 should show up. You will need to configure the TDAQ environment. See for example section 3.1. (New shifters: What is a partition? A partition is a collection of hardware and software that is given a single name, so that someone can control it. One or many partitions can be running at once – each defined to include, for example, the readout crates of individual systems being tested. For a combined run, the partition includes most of ATLAS. Each piece of ATLAS should only belong to one partition at a time. So, you may hear people saying they want particular things included in the partition, to run with all the systems, or removed from the partition, to test them on their own. People working on LAr may also refer to a part of the detector as a partition. They may call the LAr BarrelA or HECA a partition since each one is a partition for calibrations, but the partition for the cosmics is usually all of ATLAS.) It is then possible to access and click a wide collection of applications/functionalities. Monitor Partition (the old Spy IGUI), OHP and DQMF are the most useful for the standard shifter : • Monitor Partition : This starts a DAQ GUI in spy mode. This is an useful mode to see from the LAr desk what’s happening at the Run Control Desk, and especially to check that the LAr parameters are correctly set. The refreshing of the display seems to be sometimes slow or broken. It may be worth to kill the DAQ GUI and launch again Monitor Partition, if there are some doubts on the reliability of the display. • OHP : Start an Online Histogram Presenter (OHP) with the configuration file specified in the left part of the DAQ panel. The OHP application is devoted to the online display of Figure 5: The DAQ panel (the Main panel) 18 monitoring histograms (see section 5). • DQMF : Start the Data Quality Monitoring Display. Other buttons you may need: • Start Partition : Start a DAQ GUI in expert mode WARNING : you must never start a DAQ GUI in expert mode for an ATLAS partition, if a partition is already existing, as this may confuse and crash the existing one. 
It should be up to the run control to start the run. This should be fixed soon, but why risk it? • Busy : Display the busy status of the whole data acquisition. • MRS : pops up a window where the error messages are displayed (see Section 4.3). How To Recover Lost Log Messages? If an important log message you wanted to post on the elog flies by in the MRS Monitor before you can copy/paste it into the elog you can still access the log message by hitting the Log Manager button from the DAQ Panel. Navigate to the partition you want to see and the user who is controlling the partition (ATLAS and crrc for combined running). That will open up a list of all runs taken by that user in that partition. Select the run the log message came from and in the top right select the type of FATAL/ERROR/WARNING/etc you are looking for. • OKS : used to access the OKS database 4.2 Opening the Monitoring Advanced Panel To open the Mon Advanced Panel, first open the DAQ Panel as described in the last section, then click on the tab called “Mon Advanced”. You should see a panel similar to the one below, but not exactly the same. In the the Mon Advanced panel, you’ll find some tools for the Information Server (IS) and others. 4.3 Basic functionalities of the DAQ GUI A collection of basic facilities can be accessed from the DAQ GUI presented on figure 7 by: • clicking on MRS opens a new enlarged window to display the different messages. This new window especially allows to perform some selection on the displayed messages, change the number of displayed messages, etc. • clicking on IS allows to access to all informations of Information Service (IS) • in “Commands” scrolling menu, it is possible to clear the small MRS window located in the DAQ GUI • you can load specific panels which do not appear in the DAQ GUI, through the “LoadPanels” menu. 4.3.1 LAr H/W control parameters To check these parameters when running inside the ATLAS partition, load “OnlinePanel” through the “LoadPanels” menu. The “LAr H/W Control” tab will appear, click on it and choose PARAMS GLOBAL (see Fig. 8). For the calibration runs, when running on specific LAr partitions, the “LAr H/W Control” tab is accessible under the “LAr” tab, to see it load “MasterPanel” through the “LoadPanels” menu. 19 The most recent values for the l1aLatency, 1st sample, etc, will be on the WhiteBoard. The way that data are formatted is determined by a combination of the properties Run type and Format. This is detailed in table 2. Some additionnal notes on various parameters : • The choice of physic mode (“Result + format1”) requires to enable the Online DB in the Run Control. It is located under LArg - LArg Plugins - OnlineDB. The database must be started first. To switch from Transparent to Format 1, this database should be started by contacting the Run Coordinator. • If EMBA and EMBC are enabled, EMBA should be SLAVE and EMBC LAST SLAVE but if EMBC is disabled, then EMBA should have the LAST SLAVE setting. The LAST SLAVE prevents the BUSY to come from thdownstream LTP’s. • The inhbDelay should equal 52 + l1aLatency. The “52” clock cycles comes from the time it takes to send the command + cable lengths. 50 and 51 are sometimes used, but 52 gives the best pulse placement. The inhbDelay is the time between the BG02 command and the L1A. 4.3.2 Complex deadtime and the Central Trigger Processor The LAr Front End Boards (FEBs) hold the data coming from the calorimeter in a buffer of 144 cells, called the SCA (Switched Capacitor Array). 
If the trigger rate is too high, events can’t be read in and written out fast enough. Therefore, a parameter is set at the Central Trigger Processor (CTP) to control the number of triggers sent to the FEBs. For 5 samples, the settings are maximum 9 events in the system and 400 clock cycles to send out an event. For 32 samples, the settings are 1 event and 4000 clock cycles. Figure 6: The DAQ panel for (the Monitoring panel) 20 If the complex deadtime is incorrectly set, you can see it in FebMon – you’ll get many errors, especially in the “Wrong SCAC status” plot. You should notify the shift leader and the L1 trigger desk. 4.3.3 Segment and Resource Go in the Segment and Resource panel tab of the GUI (see figure 9) and check that the expected partitions are properly enabled. For combined cosmic running, all the detector parts should of course be enabled under LArg: Larg EMBA, Larg EMBC, Larg EMECA, Larg EMECC, Larg HECFCALA and Larg HECFCALC. In contrary, some segments should be disabled in in LAr plugins : Calibration, EventCounterReset, RunLogger, Action inspector; and in LTPIC: LArg TTCRCD TTC2LAN. The segments called Larg L1MON A and Larg L1MON C should never be enabled with ”No beam” configuration If you see a different configuration under LArg, consult the WhiteBoard and then the Run Coordinator. In combined running mode, it is worth to note that a partition not readout may be however included in the segment and resources for trigger distribution purposes. Figure 7: The DAQ GUI. 21 For the curious: A resource can be identified by the icon made of shapes (a cube, sphere, cone). It’s a piece of C++ code which can be enabled or disabled. They are also called applications. A segment is identified by the puzzle piece icon. It’s a collection of applications, which can have subsegments. PT is a “Processing Task.” 4.3.4 LAr Crates To check the status or temperatures of the crates that are taking data matches what you saw in the DCS, click on the “LAr Panels Manager” panel in the DAQ GUI, and then on “Crates”. Crates that are “On” in DCS are physically on, while crates that are “On” in this panel are the ones that are expected to be read out by DAQ. To see which crates are being read out, click “Refresh” once. Then click through the End Cap A, Barrel A, Barrel C, and End Cap C. If you see any differences (a crate is ON or OFF in DCS, but different here) check the WhiteBoard and then inform the Run Coordinator. Temperatures for the half crates are displayed below the status of each crate and are separated into minimum and maximum text fields. Minimum temperatures appear in the left field, maximum on the right. The temperatures are separated by a — and are crate ordered left, right, and in a few cases, special. These “special” temperatures correspond to HECA and HECC. Figure 8: the HW Control panel. 22 LAr H/W control parameters Run Format type Raw Transparent data Result Format1 Result Transparent Raw Data + Result Transparent Raw data Format2 Format1 Description Common usage All the digits are simply transferred to the ROS without any processing in DSP Physics run only at low trigger rate - Useful to study pulse shape, noise... Physics run at high rate. Only in 5 or 7 samples mode. Energy are computed for all cells by DSP and sent to the ROS. If the cell energy is greater than a threshold T1 , time and Q factor are also computed; if the cell energy is greater than a threshold T2 , digits are also transferred. 
The digits are averaged over a certain number of events (typically 100) by the DSP and only the average is sent to the ROS. In case of pedestal runs, auto correlation coefficient are also computed . Combined “Raw data + Transparent” and “Result + Transparent” Not implemented Meaningless Pedestal, delay ramp runs and Pedestal run. Will be used until validation of autocorrelation matrix computation in DSP. Should not be used. Should not be used. Table 2: Determination of data format provided by the LAr H/W control parameters Run type and Format. Number of Samples (thresholds) 32 16 10 7 (0/0) 7 (3σ/5σ) 5 (0/0) 5 (3σ/5σ) Event Size ∼ 14MB ∼ 8MB ∼ 6MB ∼ 4.2MB ∼ 1.2MB ∼ 3.2MB ∼ 1.2MB Deadtime Setting (simple, complex) 2500, –/– 1250, –/– 10, 3/830 7, 5/570 7, 5/570 5, 7/415 5, 7/415 Maximum L1 Rate ∼ 10kHz 25kHz ∼ 60kHz ∼ 60kHz 90kHz 90kHz Table 3: Deadtime settings, L1 trigger rates, and event sizes for different numbers of LAr samples and different threshold settings. (Taken from the ATLAS Shift Leader training slides.) 4.3.5 The Run parameters In this panel, are given very general parameters : • Run number : unique for the run, automatically retrieved from a db • Number of events : useful for stopping the pedestal runs; in this case, taking at least 20003000 events is recommanded. Taking much more is not necessary for standard studies. See 2.2.2 for more informations on ramp/delay/cabling runs. 23 4.4 Description of different processes involved The DAQ is performed by different processes (PMG = Process ManaGer) running on different machines (see table 4). All processes should be either in UP state or in RUNNING state to have an efficient running. If such processes crash or are stuck one can remove it from the DAQ by clicking on out in the Run Control panel (in the Membership subpanel - see figure 11). It also possible to try to restart the process on the fly. This should only be done at the Run Control desk when authorized by an expert. 4.5 OKS database The OKS database contains the status of all hardware parts. It should only be edited by experts, or with an expert on the phone. Changes here can break the entire ATLAS partition, so proceed with caution. To edit it, one has to start the OKS editor from the DAQ Panel. Then, two new windows pop up. In the largest one, look in the left column for the type of object, that you want to modify: • An half front end crate : LARG HFEC Figure 9: the Segment and resource panel. Surrounded are the segments associated to the ROD crates. 24 • A single Front End Card (FEB, TBB, calibration...) : LARG FEModule To enable/disable an object, change its state to true/false and close the window (see figure 13). When all wished objects are modified, exit from the OKS editor, confirm the changes (as many questions as modified objects) and the run control shifter should reload the database by clicking in the appropriate button (see figure 12) in the DAQ GUI. When doing this, you should see restarting all the PMG agents as at the DAQ startup. Figure 10: the LAr Crates panel. 
25 Process EB-[PART] disk usage monitoring onasic-setconditions ROS-LAR[PART]-[N] RODC[PART][N] TTCC[PART] LArPT-[N] LArGatherer LArArchive AndRemove Machine pc-lar-eb0[N] pc-lar-eb0[N] Gui machine pc-lar-rosemba-00 sbc-lar-rcc[PART]0[N] sbc-lar-tccemb-01 pc-tdqmon-14 pc-tdqmon-18 pc-tdqmon-14 Description Event building Monitors the occupancy of /data disk Runs for a short time, usually shows up ABSENT ROS control ROD crate control Trigger control Monitoring process - May be connected to EB or SFI Merging monitoring data from different LArPT Save histograms at the end of the run (in absent state during the run, up at the end) Table 4: Description of different processes involved in DAQ - [PART] is a partition (EMBA, EMBC, EMECA, EMECC...) - [N] is an integer 26 Figure 11: the Run Control panel. The circled button is the one to remove a PMG from the DAQ. Figure 12: Upper toolbar. Surrounded are the buttons to edit OKS database(green) and reload it (blue). 27 Figure 13: OKS editor windows. A : list of object types - B : list of objects for a given type - C : list of property of an object. 28 4.6 DCS - Detector Control and Safety To open the DCS Panel in the control room, go to the LAr menu on the bottom of the screen. Click : LAr → DCS → LAr DCS FSM . Make sure you open this from the LAr button, not the DCS button, as the settings of the program will be different. From outside the control room, log on pcatlgcslin through the Atlas gateway (with the lardaq account for example - see A.2) and type: >/scratch/pvss/bin/viewlarfsm You should see the following graphic: Figure 14: DCS FSM screen for LAr. 4.6.1 Check the LVPS Status To check the subsystems, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first (15 for example) and then you can click further down the tree to see each set of crates (LV, HV, ROD). Starting with the EMB A as an example, first click “EMB A”. If all five sub-sub-systems on the left-hand side say READY and OK, and everything in the global picture is green, you can go on to the next subsystem. This means the FEC LVPS, HV, and ROD crates are all on, or any problems below are known and do not propagate upwards. The run coordinators can mask known problems so LAr may still say “OK” even when some pieces are off.s If anything in the global picture is gray, yellow, or red, or if there are any FATAL or ERROR conditions in the left-hand panel, you should follow this procedure: 1. Click to see the FEC status (for EMB A, click EMBA FEC). For each set of FEC’s, a panel 29 similar to the one shown in Figure 16 will pop up. 2. Note the components which are off or have errors. 3. Check the LAr WhiteBoard18 (*)19 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 4. If the off/error components are NOT on the WhiteBoard, notify the LAr Run Coordinator by phone. Check the LVPS for the rest of the subsystems by clicking “LAR” at the top of the tree to return you to the main page. Repeat the procedure with the EMB C (EMBC FEC), EMEC A (EMEC A FEC), EMEC C (EMEC C FEC), HEC FCAL A (HEC A LV), and HEC FCAL C (HEC C LV). If you want to know more about one crate you can click on the crate itself. (For the Barrel and EMEC, this will work, not for the HEC.) A new panel will show the actual voltage on the O.C.E.M. 
power supply in USA15 (around 280V), the current and the voltages on the output of the LVPS (DC-DC converter) on the detector. 4.6.2 Check the ROD Crate Status To check the ReadOut Driver (ROD) crates, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first and then you can click further down the tree to see each set of crates. 18 19 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 15: Compact view of DCS for a half barrel. From inner circle to outer one, are displayed : ROD crates, Front End Crates (FECs), cooling loops, High Voltage (HV) for Power Supplies, High Voltage. 30 1. Starting with the EMB A as an example, first click “EMB A”. 2. Click to see the ROD status (for EMB A, click “EMB A ROD”). A panel similar to the one shown in Figure 17 should pop up. 3. All the crates should say ON and OK. Note the components which are off or have errors. 4. Check the LAr WhiteBoard20 (*)21 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 5. If the off/error components are NOT on the WhiteBoard, notify the LAr Run Coordinator by phone. Check the RODs for the rest of the subsystems by clicking “LAR” at the top of the tree to return you to the main page. Repeat the procedure with the EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. 4.6.3 Check HV Status To check the High Voltage (HV) status, go through each of the six detector parts: EMB A, EMB C, EMEC A, EMEC C, HEC FCAL A, and HEC FCAL C. For each part, you will get a global picture first and then you can click further down the tree to see each set of crates. 1. Starting with the EMB A as an example, first click “EMB A”. 2. Click to see the HV status (for EMB A, click “EMB A HV”). A panel similar to the one shown in Figure 18 should pop up. 20 21 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 16: DCS FSM screen for the LAr low voltage power supplies (LVPS). 31 3. All the crates should say ON and OK. Note the components which are off or have errors. 4. Check the LAr WhiteBoard22 (*)23 to see if the components which are off or in error are already known problems. If they are known, you do not need to notify an expert. 5. If the off/error components are NOT on the WhiteBoard, double-check to make sure you are reading the PHI numbers from the list and not the graphic, which is sometimes wrong. If it is really a new problem, notify the LAr Run Coordinator. For the HV, there are many sub-subsystems to check. For the EMB, you need to look at both “EMB HV A” and “LAR EMBPSA HV” for the PreSampler. The same goes for EMB C. The EMEC A and EMEC C have only one HV to check, while HEC FCAL A has two again (HEC A HV & FCAL A HV), and the same for HEC FCAL C (HEC C HV & FCAL C HV). 4.6.4 Check the DCS Alarms Screen To open the DCS Alarm Screen, go to the LAr menu (NOT the DCS menu) on the bottom of the screen. Click : LAr → DCS → LAr DCS Alarms. This will show all of the DCS alarms that apply to the LAr systems – temperature, voltages, etc. Alarms have a color code, and a letter in the first column. 
• FATAL - F - red • ERROR - E - orange • WARNING - W - yellow 22 23 http://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArWhiteBoard https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard Figure 17: DCS FSM screen for the LAr ROD crates in USA15. 32 If you see any red/(F)/FATAL errors, check the WhiteBoard first and then call the Run Coordinator. For each other alarm, right-click and select “Trend”. The orange line is the level for an ERROR, the yellow line is for WARNING. The blue line is the quantity as measured. Next, put the mouse cursor on the x-axis (time), and then use the scroll wheel to zoom out. You can see if the value is constant over the last few hours/days or if it is rapidly changing. For any WARNINGs or ERRORs that are not rapidly changing, check the WhiteBoard to see if they are known. If they are not known, make an entry in the e-log with the “complete” alarm information (the full line). They will be evaluated by an expert, and if the value is OK, the thresholds can be changed later. Keep this screen open and visible throughout the shift, reacting as above for each new alarm. * If you see ALL the alarms for ATLAS, you opened DCS → DCS Alarms, not LAr → DCS → LAr DCS Alarms. * If there are over 1000 alarms, maybe the HECLV is still in the filter. Look for new HECLV problems (within the past day), and check for them on the WhiteBoard or with an expert. Once you have dealt with the HEC problems, you want to filter out all the old messages. Double-click in the white box on the bottom left underneath the checkbox “Systems”. Click on ATLLARCLVTEMP, then hold down control and click on the other ATLLAR* systems. Do NOT include ATLLARHECLV. Next click the button that says “Apply filter” on the bottom right. Most of the messages should be gone. Figure 18: DCS FSM screen for the barrel HV. 33 4.7 4.7.1 Using the ATLAS e-log (ATLOG) Access and use of Elog The ATLAS e-log is now the main problem-reporting tool used by Liquid Argon. We are still refining the reporting procedure, so we appreciate your feedback. New categories for LAr have been defined, explained here: This menu appears when choosing ”Message Type = LAr” 1. Observation : intended to cover all observations with a limited time scale. Examples: problem of data integrity in one run, HV trip, excess in one histogram. The sub-menus mainly recall the 5 different fields identified in DQ, used in webdisplay/OHP . It seems that it can cover most of usual aspects (apart from DP perhaps). 1.1 Online environment - refers to a general problem (ex : monitoring PT crashes) 1.2 DCS 1.3 DAQ / Data integrity 1.4 Misbehaving channels 1.5 Signal 1.6 Physics 1.7 Other 2. Development / maintenance : this also covers a documentation for experts. Examples: replacement of a FEB, installation of a new HV module, upgrade of software 2.1 Hardware 2.2 Software 2.3 Other In DQ, we can imagine using mainly the field ”Observation”, leaving the status open, and defining which error it is (menu 2) and where it was observed. Regularly, coordinator of the 5 different sub tasks and the biweekly DQ coordinator (to be confirmed) go through the logbook and close the case if this is not a problem and assign it to someone for debugging if it is. We also would like to request to ATLOG team to be able to edit the messages (at least their status : such that we can change it from ”open” to ”closed” depending on its status). 
It is useful to note that in the new ATLOG version, the shift summary is posted by choosing ”Shift summary” as the message type, with a submenu giving the system type.
4.7.2 Information to put in the Elog
There is a template for your shift log summary in the start-of-shift (signin-LAr) checklist.
34
5 Monitoring and Data Quality
5.1 Monitoring Displays
5.1.1 DQMD
See the information on DQMD on the twiki24 (*)25 .
5.1.2 OHP
See the information on OHP on the twiki26 (*)27 .
5.1.3 Trigger Presenter
See the information on the Trigger Presenter on the twiki28 (*)29 . The option to launch the Trigger Presenter from the DAQ panel is described in section 4.1.
5.1.4 Atlantis
See the information on Atlantis on the twiki30 (*)31 .
5.1.5 Other monitoring tools at P1
See the twiki page32 (*)33 .
5.2 Where to find the monitoring data?
5.2.1 Setting up ROOT
To use ROOT in order to browse plots in the Control Room, set up ROOT with the following command:
>source /det/lar/project/scripts/setup_root.sh
5.2.2 Online Calibration ROOT Files
The online histograms of calibration runs are stored in: /det/lar/project/Histogramming/PARTITION/ where PARTITION corresponds to the partition name used for the calibration set (see table 1). The ROOT filenames contain the partition name, the run type and the run number.
5.2.3 Online Physics ROOT Files
Online histograms of the global runs are stored on pc-tdq-mon-09. To retrieve them, simply execute the following script:
/det/lar/project/scripts/get_online_root_file.sh <RunNumber>
24 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/DataQualityMonitoringDisplay
25 https://atlasop.cern.ch/twiki/bin/view/Main/DataQualityMonitoringDisplay
26 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/OnlineHistogrammingPresenter
27 https://atlasop.cern.ch/twiki/bin/view/Main/OnlineHistogrammingPresenter
28 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/TriggerPresenter
29 https://atlasop.cern.ch/twiki/bin/view/Main/TriggerPresenter
30 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/AtlantisEventDisplay
31 https://atlasop.cern.ch/twiki/bin/view/Main/AtlantisEventDisplay
32 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/MonitoringShifterOperationManual
33 https://atlasop.cern.ch/twiki/bin/view/Main/MonitoringShifterOperationManual
35
The file will be copied to the scratch area of your local machine. The filename includes the partition name, the run number and the lumi block number. The FEBMon histograms are saved at the end of each lumiblock. All histograms are saved at the end of a run (lumi block called l_EoR). You can also retrieve the online monitoring plots34 , as well as the online monitoring root files35 , from the web display.
5.2.4 Offline Physics ROOT files
If you want to look at more plots offline, you can retrieve the monitoring histograms produced at Tier0. (This is not possible from the P1 computers; you have to use your own laptop.) The monitoring files location is given on this wiki page: https://twiki.cern.ch/twiki/bin/view/Atlas/CosmicCommissioningReconstructionStatus36 (reachable from outside P1 only).
5.2.5 Using the event dump
The event dump may be useful if the online monitoring is not available or if you want to look at something specific that is not available in the online monitoring. Log on to the machine where the data are written (for Physics runs, the data are written by the SFI machines pc-tdq-sfi-00x directly on their disk /localdisk/data; to know which SFI machines are involved, check the Run Control panel).
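As a purely illustrative sketch (the SFI node number below is hypothetical; take the actual node names from the Run Control panel of the ongoing run), reaching the raw data files typically amounts to:
>ssh pc-tdq-sfi-001
>cd /localdisk/data
>ls -ltr
The most recently written .data file then corresponds to the ongoing run.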
You can have a look at raw data by typing: > /atlas-home/1/hwilkens/dumpeformat/dumpeformat -n NbOfEvts -v VerbLevel -f FileName.data Using the program dumpeformatroot instead of dumpeformat allows to produce a simple root file named output.root where are stored pedestal and noise for all FEBs. Look at appendix C for some help for reading the event dump. 5.3 Data Quality Checklists Section under construction. 5.3.1 Online - Calibration Runs The data integrity of the calibration runs can be checked using the following command lines : > cd /det/lar/project/scripts/ > source LArShiftSetup.sh > python CheckDataIntegrityCalibRuns.py [Run1] [Run2] The run range to look at is defined between [Run1] and [Run2]. For information, short calibration runs as Pedestals runs for HECFCAL and all PS runs have no associated monitoring files (because the run ends before the monitoring starts). Remark: With the DAILY and WEEKLY procedures, the data integrity check is automatically done at the end of the campaign for one partition. The summary of the check appears in the window that pops up. If it is not the case, have a look in your home directory at rf_cal_runs.log for the last set of runs. 5.3.2 Online - Physics Runs In the desktop toolbar, in the “LAr” Menu, open the “LAr-online-Monitoring” checklist. 34 http://atlasdqm.cern.ch:8080/webdisplay/online https://atlasdqm.cern.ch/tier0/Cosmics08 online root 36 https://twiki.cern.ch/twiki/bin/view/Atlas/CosmicCommissioningReconstructionStatus 35 36 5.4 Monitoring plots description The following subsections are describing the most important LAr monitoring plots. The monitoring histograms are produced with 2 different packages : • LArMonTools : To monitor basic information like DSPs, FEBs, digits, noise, calibration constants, detector coverage, cell masking... • CaloMonitoring : To monitor the energy reconstruction and noise at the cell and cluster level The plots are organized into 5 categories, used both in OHP (online) and on the DQ web display (offline). The “Run Info” and “Timing” categories contains histograms related to the full liquid argon calorimeter, while all the other categories are split by partition : EMBA, EMBC, EMECA, EMACC, HECA, HECC, FCALA, FCALC. • Run Info : summary plots about detector coverage, number of events, trigger type... • Data Integrity : contains fundamental checks about the ROD, DSP and FEB readout. Any plot with a suspicious behavior found in this tab is a sufficient reason to call the LAr run coordinator and possibly STOP the ongoing run. The recorded data will very probably be corrupted and useless. • High Energy Digits : plots used to monitor the detector timing and the pulse shape. • Timing: plots showing the collisions candidates • Energy Flow: total energy deposited in the calorimeter during a run. • MisBehaving Channels : plots used to spot hot cells or noisy detector regions. • CaloGlobal : to monitor higher level quantities like clusters, jets, EM objects. 37 5.4.1 Run Parameters Event Type • One 1d histogram for the full detector. • Description: Type of recorded data (Raw, physics, Calibration...) • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning the run. • Expected status: Depends on the run plans. • DQMF checks: none Number of Samples • One 1d histogram for the full detector. • Description: Number of readout samples • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning the run. • Expected status: No expected status. Depends on the run plans. 
• DQMF checks: none Nb Of Events per Minute • One 1d histogram for the full detector. • Description: Number of recorded events per 60 seconds block. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: Flat distribution. Spike would indicate a problem with the trigger rate • DQMF checks: none Nb Of Rejected Events per Minute • One 1d histogram for the full detector. • Description: Number of events with at least one FEB showing error, per 60 seconds block. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: Empty. Too many entries will indicate serious data integrity issues, and requires to look at the data integrity plots 5.4.4 • DQMF checks: none 38 ADC Threshold in DSP • One 1d histogram for the full detector. • Description: Threshold (in ADC count) above which the digits + energy computed in DSPs are transferred in the dataflow. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. This plot is aimed at helping us to quickly retrieve useful information for offline analysis. • DQMF checks: none DSP Threshold - Qfactor+time • One 1d histogram for the full detector. • Description: Threshold (in ADC count) above which the energy+time+quality factor computed in DSPs are transferred in the dataflow. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. This plot is aimed at helping us to quickly retrieve useful information for offline analysis. • DQMF checks: none Number of Events per L1 trigger bit • One 1d histogram for the full detector. • Description: Number of events passing L1 trigger terms. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? Anytime during the run. • Expected status: No expected status. • DQMF checks: none Number of readout FEBs • One 2d histogram for the full detector. • Description: Number of readout FEB per partition. • OHP tab: [Run Info][Run Parameters][LAr] • When to check it? At the beginning of the Run. • Expected status: To be compared with expected numbers written on the Whiteboard. • DQMF checks: none 39 Number of events per Stream • One 1d histogram • Description: list of Streams available in the runs. • OHP tab: [Run Info][Run Parameters][ATLAS] • When to check it? Anytime during the run. • Expected status: No expected status • DQMF checks: none Raw Stream correlation • One 2d histogram • Description: events overlap between streams. • OHP tab: [Run Info][Run Parameters][ATLAS] • When to check it? Anytime during the Run. • Expected status: No expected status • DQMF checks: none 5.4.2 Detector Coverage Coverage - Sampling - Partition • One 2d histogram per sampling (S0=PreSampler, S1=Front, S2=Middle, S3=Back). • Description: missing cell/missing FEB (0 - White), KNOWN missing FEB (1 - Purple), masked cells (2 - Blue), good cells (3 - Green), FEB supposedly missing, but actually readout (4 - Red) • OHP tab: [Run Info][Detector Coverage][PARTITION] • When to check it? At the beginning of a run. • Expected status: Most of the cells should be green, except the known missing FEBs documented on the WhiteBoard. In case of holes not documented on the WhiteBoard or in case of red FEBs, report to the LAr run Coordinator immediately. 
• DQMF checks : none 40 5.4.3 Data Integrity DSP Checks that computations of Energy (E), Time (T) and Quality Factor (Q) in the DSP (online) are made properly by comparing to the same quantities computed offline. Number of errors per partition and per gain • One 2d histogram for the whole detector • Description: Number of times where the difference between offline and online quantities is above a tolerance threshold: • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it ? All along the run • Expected status: Empty – If errors are general (many partitions, many gains), call the run coordinator – If error in a specific partition, open the corresponding tab • DQMF checks: none E(DSP) - E(offline) distribution • One 2d histogram for the whole detector • Description: – X-axis : E(DSP) - E(offline) – Y-axis : energy range (the tolerance on the energy computation depends on the energy range) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: – – – – Range Range Range Range 0 1 2 3 : : : : E E E E < 213 < 216 < 219 < 222 : : : : entries entries entries entries between between between between ±1 MeV. ±8 MeV. ±64 MeV. ±512 MeV. • DQMF checks: none T(DSP) - T(offline) distribution • One 1d Plot for the whole detector • Description: T(DSP) - T(offline) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: distribution centered on 0, between ±10 picoseconds. Some outliers up to a few 10 picoseconds can be seen. The RMS should not exceed 10 picoseconds. • DQMF checks: none 41 Q(DSP) - Q(offline) distribution • One 1d Plot for the whole detector • Description: (Q(DSP) - Q(offline))/Sqrt(Q(offline)) • OHP tab: [Data Integrity][DSP Physics][Summary] • When to check it? All along the run • Expected status: Distribution centered on 0 (large peak), between ±1. • DQMF checks: none Errors number per FEB • One 2d histogram per partition in FT/Slot plane • Description: Number of times where the difference between offline and online quantities is above a tolerance threshold. • OHP tab: [Data Integrity][DSP Physics][PARTITION] • When to check it? If the summary plots show errors • Expected status: Empty – If a few entries : check if the involved FEB is known to be unhappy ... – If all partition in error : problem of constants loaded in the DSP, call the run coordinator • DQMF checks: Yellow if 1 entry, Red if more than 1 entry Correlation between E(DSP) and E(offline)[respectively T and Q] • 1d Scatter Plot per partition and per quantity (E,T,Q) • Description: Correlation between online and offline quantities • OHP tab: [Data Integrity][DSP Physics][PARTITION] • When to check it? If the summary plots show errors • Expected status: Perfect correlation (slope = 1) • DQMF checks: none 42 5.4.4 Data Integrity FEB Offline Rejection Yield • One 1d histogram for the full detector. • Description: % of events rejected because of data integrity problems. “Whole event corrupted”: the number of readout FEB was not constant during the run “Single FEB corrupted”: the data integrity problem is localized in specific FEBs. • OHP tab: [Data Integrity][FEB Errors][Global] • DQMF check: Fraction of rejected Events < 1% • Expected status: 100% accepted events Number of Readout FEBs • One single 1d histogram per half barrel/endcap • Description: Compact view of number of readout FEBs per event. Only a check of DSP header is performed (the data block can be empty!). 
• OHP tab: [Data Integrity][FEB Errors][Global] • When to check it? systematically. • Expected status : The distribution must be a dirac. Check on the White Board37 which partitions are in the readout to determine how many FEBs are expected. The total number of FEBs for a completed partition can be found in table 6. For half of the barrel, there should be 448 boards. For each of the endcaps, there should be 314. The entire dectector should have 1524 boards. • DQMF check: none Number Of LArFEBMon Errors • One 2d histogram for the full detector. • Description : Number of FEBMon errors for each partition. • OHP tab: [Data Integrity][FEB Errors][Global] • When to check it? Anytime during the run. • Expected status : Empty if everything runs fine • DQMF checks : none • For more info about FEB errors, check Appendix E 37 https://atlasop.cern.ch/twiki/bin/view/Main/LArWhiteBoard 43 Number of Events DSP header • One 2d histogram per half barrel/endcap • Description: Number of events acquired per FEB. Only a check of DSP header is performed (the data block can be empty!). • OHP tab: [Data Integrity][FEB Data][Barrel/Endcap] • When to check it? systematically. • Expected status: the number of events must be uniform among all FEBs of a partition and contains no unexpected holes. The barrel should have a board in each bin, while the endcap has a more interesting structure seen below. Do not worry that it’s a different plot below, showing errors instead of events, the structure is the same. This will eventually be checked against a reference histogram instead of by eye. • DQMF check : none. Average Number of cells above DSP thresholds • One 2d histogram per half barrel/endcap • Description: Number of events where the digits/time+quality factor are sent by DSP. • OHP tab: [Data Integrity][FEB Data][Barrel/Endcap] • When to check it? Systematically • Expected status: depends on the thresholds. See plots in section 5.4.1 • DQMF check : none. 44 5.4.5 High Energy Digits Type of run what to look at. • Cosmics runs : look at High Energy Digits CosmicCalo tab. • Circulating beams runs, no collisions : look at High Energy Digits CosmicCalo tab. • Circulating beams runs, splashes : look at High Energy Digits L1Calo tab. • Collisions runs : look at High Energy Digits L1Calo tab. • For each of the previous case you also should look at High energy Digits Timing tab. • note (a) : Most of the histograms are filled for a particular stream. This stream is written in the histogram title. If the histogram is filled for all the streams, it’s also written in the histogram title. • note (b) : L1Calo events are selected on filled bunches of LHC, while CosmicCalo events are selected on empty bunches of LHC. • note (c) : Some other information are written in the histogram title: expected sample max, range, selection cut... High Energy Digits Summary • One 2d histogram for the whole detector. • Description : Summary of errors per partition • OHP tab : [High energy Digits][High energy Digits][Global] • When to check it? Any time during the run. • First Bin : Number of Channel with max sample outside the expected range at least 0.5% of the run. The percentage is computed dynamically during the run, that’s the reason why the number of error increase and decrease. • Second Bin : Number of Channel having been saturated at least once in the run. (ie: max=4095 ADC count). • Third Bin : Number of Channel with the min sample = 0 ADC count at least once during the run. • Fourth Bin : Mean time of the sub-partition. 
• Expected status : Ideally the first 3 bin should be empty. The last bin should be filled as long as the partition as triggered an event, and the average time should be close to the expected one given in the title of the histogram. • DQMF checks : none 45 Normalized Signal Shape • One 1d histograms per partition. • Description : Average signal pulse for high energy cells. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a run. • Expected status : Nice pulse shape if only cells with cosmics signal enters in the plot. Flat or distorted shape if noise is dominant. • Stream monitored : written in the title. • DQMF checks: none Energy vs Sample Number • One 1d histograms per partition. • Description : Energy (in ADC) of the highest sample vs sample number. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a run. • Expected status: Mostly peaked. Peak value should be close as the expected one given in the title. If noisy cells are selected, the distribution should be more flat (max sample is random) and with low energy. • Stream monitored : written in the title. • DQMF checks: none Max Sample vs Time • One 1d histograms per partition. • Description: Energy (in ADC) of the highest sample vs time. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Flat, value close to expected sample max given in the title. • Stream monitored : written in the title. • DQMF checks : none 46 Average Position Max Digit • One 2d histograms per partition. • Description: Average position of the sample max for each FEB in the partition. Select only events passing a 5 sigma cut. Each FEB should be close to the expected sample max given in the title. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Flat. • Stream monitored : written in the title. • DQMF checks : none Out Of Range • One 2d histograms per partition. • Description: Yield of events with max sample outside the temporal range given in the title. Select events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If one bin starts to grow up, most of the time means there is one or more noisy channels in the FEB, could then check out of range at channel level. If noise is dominant in the run this histogram will be filled with high values. • Stream monitored : written in the title. • DQMF checks : none Out of Range at Channel level • One 2d histograms per partition. • Description: Same histogram than Out Of Range, but information displayed here for each channels. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. • Stream monitored : written in the title. • DQMF checks : none 47 Null Digit • One 2d histograms per partition. • Description: Yield of events with at least one digit null (ie content=0, without subtracting the pedestal).Could be correlated with the DSP monitoring. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with DSP monitoring, or Q factor monitoring (noise burst event). • Stream monitored : All streams. 
• DQMF checks : none Null Digit at Channel level • One 2d histograms per partition. • Description: Same as Null Digit but at channel level. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with DSP monitoring, or Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none Saturation • One 2d histograms per partition. • Description: Yield of events with at least one saturated sample (ie content=4095). • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none 48 Saturation at Channel level • One 2d histograms per partition. • Description: Same as saturation but at channel level. • OHP tab: [High energy Digits][High energy Digits][PARTITION] • When to check it? systematically during a cosmic run. • Expected status: Empty. If not, the yield should be low, try to correlate with Q factor monitoring (noise burst event). • Stream monitored : All streams. • DQMF checks : none Max Sample per Stream • One 1d histograms per partition. • Description: Average position of the sample max per Stream. Select only events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits Timing][Max Sample per Stream] • When to check it? systematically during a cosmic run. • Expected status: Should be peaked for each streams at the expected sample given in the title. • Stream monitored : All streams. • DQMF checks : none Trigger Word • One 1d histograms per partition. • Description: Average position of the sample max per L1 trigger word. Select only events passing a 5 sigma cut. • OHP tab: [High energy Digits][High energy Digits Timing][Trigger Word] • When to check it? systematically during a cosmic run. • Expected status: Flat. • Stream monitored : All streams. • DQMF checks : none 49 5.4.6 Timing Difference in time between C and A sides • One 1d histogram for the full detector. One entry per event. • Description: Difference ofn average particle arrival time between C and A sides. • OPH tab: [Timing][Run] • When to check? Any time during the run • Expected Status: Collisions candidates events should be centered around zero. • DQMF check:ONLINE. In DQMD, important background from the beams will turn the “Beam Background” DQ regions yellow or red. this is not a problem for LAr, it’s only an indication of the beam quality. Difference in time between C and A sides vs Lumiblock number • One 2d histogram for the full detector. One entry per event. • Description: Difference ofn average particle arrival time between C and A sides, vs lumiblock number • OPH tab: [Timing][Run] • When to check? Any time during the run • Expected Status: Collisions candidates events should be centered around zero. • DQMF check: NONE 5.4.7 Energy Flow Total Cell Energy vs (η,φ) for <sampling> - no Threshold, rndm trigger • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA ...) • Description : Distribution in (η,φ) of the total accumulated energy in a given cell • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? 
• Expected status: The distribution is expected to be uniform in φ strips at fixed eta. In the min bias stream, holes/depressed areas can indicate dead cells.
• DQMF check: OFFLINE: tested in DQ offline Web Display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX.
50
5.4.8 Quality Factor
The quality factor Q is calculated as the quadratic difference between the measured pulse shape (in ADC counts, pedestal subtracted) and the expected pulse shape. For cosmics and for the initial collision data, the calculation is performed by iterating on the relative phase between data and prediction independently for each cell above a certain threshold (the digit must have been written in the bytestream, so this depends on the DSP settings). The quality factor is not normalised to the amplitude or energy, so it is expected to increase with energy for a given gain selection. Studies on cosmics and early collisions show that cutting on Q > 4000 is a safe cut to select noisy channels. This is the definition of a noisy channel in the following. A preamplifier is linked to four channels; it is considered to have noise if three or four of its channels are noisy. Most FEBs have 128 channels connected; a FEB is considered as having noise if more than 30 channels have noise. Because of the cross-talk that alters the pulse shape, it is possible for some preamplifiers to be wrongly flagged as noisy in a given event; that is why one has to judge on the rate of occurrence over the run. The cut to define a FEB as noisy is safe, and one would not expect real energy deposits to fake a noisy FEB. Known bad FEBs are not declared noisy and thus will not appear in the histograms. In “free running”, LAr-triggered events, as in cosmics, the noise can trigger the event, and noisy FEBs are regularly detected, in particular from the “partial ring events” in the outer EMEC and, less often, from coherent noise in the barrel presampler. The probability that such events happen in coincidence with a beam crossing is small. So far, events with several noisy FEBs in collision runs were out-of-time cosmics. Since we did not yet have enough experience with cosmics and early collision runs, there are no histograms for the FCAL or the HEC.
Number of noisy FEB
• One 1d histogram for the whole detector
• Description: histogram of the number of FEBs declared bad per event
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : less than 5; events with more than 10 most probably have problems
• DQMF check: OFFLINE:none. ONLINE:none.
Time of noisy FEB
• One 1d histogram for the whole detector
• Description: Time (hours in the day) when a noisy FEB was detected
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : watch for noise bursts
• DQMF check: OFFLINE:none. ONLINE:none.
51
Number of noisy preamplifiers
• One 1d histogram for the whole detector
• Description: histogram of the number of preamplifiers declared bad per event
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : to be defined
• DQMF check: OFFLINE:none. ONLINE:none.
Time of noisy preamplifier
• One 1d histogram for the whole detector
• Description: Time (hours in the day) when a noisy preamplifier was detected
• OHP tab: [MisBehaving Channels][Quality Factor]
• When to check it? Anytime during the run
• Expected status : watch for noise bursts
• DQMF check: OFFLINE:none. ONLINE:none.
Percentage of events with FEB noisy (was: Noisy FEB fraction) • One 2d histogram per partition • Description: feedthrough vs slot histogram of the fraction of event in which the FEB was declared noisy • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : less than 1% • DQMF check: OFFLINE:none. ONLINE:none. Number of noisy FEB per LBN • One 1d histogram per partition • Description: histogram of the LBN in which a FEB was declared bad (there could be more than one entry per event) • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : allows to identify LBNs where e.g. external noise could have been injected in the LAr. • DQMF check: OFFLINE:none. ONLINE:none. 52 Percentage of events with PA noisy (was: Noisy PA fraction) • One 2d histogram per partition • Description: preamplifier number (arbitrary, in increasing channel number order) vs feedthrough/slot histogram of the fraction of event in which the preamplifier was declared noisy • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : less than 1% • DQMF check: OFFLINE:none. ONLINE:none. Number of noisy PA per LBN • One 1d histogram per partition • Description: histogram of the LBN in which a preamplifier was declared bad (there could be more than one entry per event) • OHP tab:[MisBehaving Channels][Quality Factor] • When to check it? Anytime during the run • Expected status : allows to identify LBNs where e.g. external noise could have been injected in the LAr. • DQMF check: OFFLINE:none. ONLINE:none. 53 5.4.9 MisBehaving Channels Digits Number of monitored channels • One single 1d histogram for the whole LAr. • Description: Number of monitored channels per partition. With respect to the readout channels, the channels flagged in the BadChannelDB and the channels without reference pedestal/noise (if retrieved from COOL) are removed. • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? at the beginning of a run (if reference pedestals/noise are retrieved from COOL) or after some time (if reference are computed from first events of the run). • Expected status : The typical numbers of channels for a given partition are summarised in table 6. The Endcap is the sum of the Standard EMEC + Special EMEC + HEC + FCAL = 39,800 channels. When the whole barrel/endcap is readout, the number of channels should be close. If it is not the case, this may be due to some missing conditions (if the reference pedestals/noise are read from COOL) : in this case, check the plots in the Detector Coverage tab). • DQMF check: none Odd events yield • One single 1d histogram for the whole LAr. • Description: yield of odd events in the whole detector. One entry is made for each “ATLAS event”. • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? systematically. • Expected status : the yield should be around the expected gaussian behaviour. For a 3 sigma cut and both tails (only negative), one expects a yield around 0.27% (0.13%). If the histogram peaks at 0, that means that no reference pedestals/noise are available. If it is empty, a major overflow may be suspected : this is usually the case when the references are not reliable. • DQMF check: none Odd events temporal distribution • One single 1d histogram for the whole LAr. 
• Description: number of odd events (i.e cells which are 3 sigma away from the reference) as a function of time (event id or basic event counter) • OHP tab: [MisBehaving Channels][Digits][Global] • When to check it? systematically. • Expected status : The number of odd events as a funtion of time should be flat. • DQMF check: none 54 Proportion of odd events per channel • One 2d histogram per partition. • Description : number of odd events per channel. On the X axis, one can find all the FEBs of a given partition (ordered first by half crate and inside the crates by increasing slot • see table 5). On the Y axis, one can find the 128 channels of each FEB. The empty bins correspond to channel not monitored (either not connected, with missing conditions or flagged as bad in the DB). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found • Expected status: the yield for all channels should be around the expected gaussian value 0.27%. To determine which channels were spotted by the summary plot, double click on the histogram to have access to all root options and redefine the minimum value in Z to exhibit the channels that are above the threshold. First check quantitatively the observed increase of noise. Then identifiy whether all channels are widespread in all detectors or grouped per FEB (probably a problem of bad references) or by shaper (probably a hardware problem). • DQMF check: none Proportion per FEB of odd events • One 2d histogram per partition. • Description: number of odd events per FEB.On the X axis, one can find all the half crates of a given partition. On the Y axis, one can find all the FEB of the half crates (see table 5). The empty bins correspond to FEBs not monitored (this is especially the case for the endcaps where a lot of slot are not populated). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found • Expected status: in each bin (corresponding to one FEB), is filled the yield of odd events in this given FEB. As in this case, the odd events for all channels, this quantity is less easy to interpret in term of noise increase. With a much larger statistics, the gaussian behaviour should be observed much faster than in the case of individual channels. • DQMF check: none Odd sums per FEB • One 2d histogram per partition. • Description: Fraction of events where the sum of the cell energy per FEB is above 3 sigma. On the Y axis, one can find all the FEB of the half crates (see table 5). The empty bins correspond to FEBs not monitored (this is especially the case for the endcaps where a lot of slot are not populated). • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When hot trigger towers or accumulations of clusters are found 55 • Expected status: distibuted around 0.27% in each bin (corresponding to one FEB). FEBs with higher value are showing coherent noise. • DQMF check: none Odd Channels Yield per event • One 1d histogram per partition. • Description: number of channels above 3 sigma • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When L1 Calo trigger bursts are observed. • Expected status: Centered on 0.27%. Tails will indicate evenst with large coherent noise bursts. • DQMF check: OFFLINE. Flag turns yellow if tails are found. Time of bursty events • One 1d histogram per partition. 
• Description: number of channels above 3 sigma • OHP tab: [MisBehaving Channels][Digits][PARTITION] • When to check it? When L1 Calo trigger bursts are observed. • Expected status: Empty. If burst are observed, they might also be visible in the timing plots of 5.4.1 • DQMF check: none 56 5.4.10 MisBehaving Channels CaloCells Percentage of events in (η,φ) for <sampling> - Ecell < 3σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description: Percentage occupancy as a function of η,φ i.e fraction of the events where Ecell < 3σ. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least when 1000 events have been processed. • Expected status : the distribution should be uniform with a bin value around 0.135. • DQMF check: OFFLINE: tested in DQ offline web display. Online:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Percentage of events in (η,φ) for <sampling> - |Ecell | > 4σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description : Percentage occupancy as a function of η,φ i.e. fraction of the events where |Ecell | ¡ 4 σ. CaloTopoCluster seeds. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least when 100000 events have been processed. • Expected status : in RNDM stream the distribution should be uniform with a bin value around 0.63×10−2 . • DQMF check: OFFLINE:tested in DQ offline web display. Online:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Percentage Deviation: (Energy RMS - DBNoise)/DBNoise vs (η,φ) for < sampling> rndm stream • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC). • Description : Distribution in (η,φ) of the values: (measured energy RMS - Noise in the database)/Noise in the database. • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] 57 • When to check it? After at least 10 events have been recorded in the rndm stream. • Expected status: The distribution should be centered at zero with variations that are less than 10%. • DQMF check: OFFLINE:tested in DQ offline web display. ONLINE:Search for deviations. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Cell Energy/Noise(DB) - <sampling> • One 1d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC) • Description : 1d distribution of the ratio of cell energy to the database noise for a given cell. • OHP tab: [MisBehaving Channels][CaloCel-RNDM][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? systematically. • Expected status : the distribution should be a Gaussian centered at zero with an RMS of 1. Check if the mean shifts significantly from zero. Report if the RMS is different from by more than 2 to 4 %. • DQMF check: OFFLINE:tested in DQ offline web display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. 
Average Cell Energy vs (η,φ) for <sampling> - no Threshold, rndm trigger • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC) • Description : Distribution in (η,φ) of the average cell energy • OHP tab: [MisBehaving Channels][CaloCells-RNDM][PARTITION] and [MisBehaving Channels][CaloCells][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? At least after 100 events have been processed. • Expected status : in random stream the distribution should be uniform with value around zero. Search for outstanding channels where the average is non zero. • DQMF check: OFFLINE: tested in DQ offline Web Display. ONLINE: DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. Cell Energy • One 1d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA,FCALC). • Description : Energy distribution of all the cells in the sampling. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION] and [MisBehaving Channels][CaloCells-BPTX][PARTITION] 58 • When to check it? systematically • Expected status : Check for the presence of very large tails • DQMF check: OFFLINE.none. ONLINE. None. Percentage of events in (η,φ) for <sampling> Ecell > 5σ • One 2d histogram per Sampling (PS, S1,S2,S3) and per partition (EMBA, EMBC,EMECA, HECA, HECC, FCALA, FCALC). • Description : Percentage occupancy as a function of η,φ i.e. fraction of the events where |Ecell | ¡ 4 σ. CaloTopoCluster seeds. • OHP tab: [MisBehaving Channels][CaloCells-L1Calo][PARTITION], [MisBehaving Channels][CaloCells][PARTITION], [MisBehaving Channels][CaloCells-BPTX][PARTITION] • DQMD tab: [CaloGlobal] • When to check it? • Expected status: The distribution is expected to be uniform in φ strips at fixed eta. In min bias stream holes/depressed areas can indicate dead cells. In RNDM stream the distribution should be uniform with a bin value around 5.7×10−5 (only after at least ten million events). • DQMF check: OFFLINE:tested in DQ offline web display. DQMD check available under CaloGlobal: navigate to CaloCells/SamplingX. 59 5.4.11 MisBehaving Channels RawChannels Mean Energy (MeV) • One 2d histogram per partition. One entry per cell. • Description: Average cell energy vs eta and phi, in MeV. • OHP tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check it? systematically during a run. • Expected status: The average energy should be centered around 0. Regions with significant deviation from 0 indicate hot cells or cells with wrong calibration constants • DQMF check: OFFLINE. Find cells with ABS(Mean Energy) > 50 MeV. Percentage of events above 3 sigma • One 2d histogram per partition. One entry per cell. • Description: Fraction of events with energy greater than 3 times the noise stored in database. • OPH tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check? After at least 1000 events have been processed. • Expected Status: The bin values should be around 0.27%. • DQMF check:OFFLINE. Look for channels over 1.5 % Percentage of events below 3 sigma • One 2d histogram per partition. One entry per cell. • Description: Fraction of events with energy lower than -3 times the noise stored in database. • OPH tab: [MisBehaving Channels][RawChannels][PARTITION] • When to check? After at least 1000 events have been processed. • Expected Status: The bin values should be around 0.27%. • DQMF check:OFFLINE. 
Look for channels over 1.5 % 60 5.4.12 CaloGlobal Hit map of cells with E/Ecluster >0.9 • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π). • Description: distribution in η,φ of the number of clusters whose energy is accounted for by 90% or more by one cell. • OHP tab: [Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : The distribution should be φ symmetric. Search for oustanding bins at fixed eta. • DQMF check: OFFLINE: tested in DQ offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters. Average number of cells in clusters • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description: distribution in η,φ of the average number of cells in a cluster. • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution should be φ symmetric. • DQMF check: OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters. Avg energy of cluster with Energy > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description: distribution in η,φ of the average energy of positive energy clusters • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution is expected to be uniform in φ strips at fixed eta. Highly populated rehoins indicate possible coherent noise effects. • DQMF check:OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters 61 Hit Map of Cluster with Energy > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π). • Description: distribution in η,φ of number of positive energy clusters. • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : The distribution is expected to be uniform in φ strips at fixed eta. • DQMF check: OFFLINE: tested in DQ Offline web display. ONLINE: present and tested in DQMD check available under CaloGlobal: navigate to CaloMon/CaloTopoClusters Eta Energy > 0.0 GeV • One 1d histogram with with η in (-4.9,4.9) • Description: distribution in η of number of positive energy clusters (integrated over φ) • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status : Depending on trigger distribution. Expect a parabula-like shape for min bias events (low occupancy in the central, high occupancy in forward region). Following the granularity for random stream (high flat occupancy in central region, a sharp decrease after |η| = 2.5 and peaks at |η| = 3.2 and 4.2) • DQMF check:OFFLINE: shown in DQ Offline web display. 
ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CaloTopoClusters Phi Energy > 0.0 GeV • One 1d histogram with with φ in (-π,π) • Description: distribution in φ of number of positive energy clusters (integrated over eta) • OHP tab:[Clusters][CaloTopoClusters] and [Clusters][EMTopoClusters] • DQMD tab: [CaloGlobal] • When to check it? After 100 events at first then certainly systematically beyond 10000 events. • Expected status: The distribution is expected to be uniform. Consult DQMD under the CaloGlobal label (momentarily) • DQMF check:OFFLINE: shown in DQ Offline web display. ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CaloTopoClusters 62 Tower Occupancy vs η and φ with E > 0.0 GeV • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : 2d distribution in (η,φ) of number of positive energy towers • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The distribution is expected to be uniform in φ strips at fixed eta. Consult DQMD under the CaloGlobal label (momnetarily) • DQMF check:OFFLINE: shown in DQ Offline web display. ONLINE: shown in DQMD, under CaloGlobal: navigate to CaloMon/CombinedTowers Energy in Most Energetic Tower • One 1d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : Energy distribution for the most energetic tower • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The energy distribution of the most energetic tower • DQMF check: OFFLINE:none. ONLINE:none EtaPhi of Most Energetic Tower • One 2d histogram with full detector coverage: η in (-4.9,4.9) and φ in (-π,π) • Description : 2d distribution in (η,φ) of the position of the most energetic tower • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The occupancy is expected to be uniform in φ strips at fixed eta. • DQMF check: OFFLINE:none. ONLINE:none 63 Tower Occupancy Vs Phi with E > 0.0GeV • One 1d histogram with full detector coverage: φ in (-π,π) • Description: 1d distribution in (φ) of the position of positive energy tower (integrated over all η) • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : The occupany is expected to be uniform. • DQMF check: OFFLINE:none. ONLINE:none. Tower Occupancy vs Vs Eta with E > 0.0 GeV • One 1d histogram with full detector coverage: η in (-4.9,4.9) • Description : 1d distribution in (η) of the position of positive energy tower (integrated over all φ) • OHP tab:[CombinedTowers] • DQMD tab: [CaloGlobal] • When to check it? Not yet. • Expected status : caloTopoClusters) The occupany is expected to follow the granularity (similar to • DQMF check: OFFLINE:none. ONLINE:none. 64 A Tips to work at P1 A.1 Access rights To access the control room, you need to have a CERN ID, and take the basic safety training courses 1-4 (http://safety-commission.web.cern.ch/safetycommission/SC-site/sc pages/training/basic.html38 ) and request “ATL CR” through EDH. To access the LAr satellite control room (3159-R012), you have to go with your CERN ID to S. Auerbach (located at 124-R011). Tell them you will take LAr shifts and request access to 3159-R012. To access the underground area (including USA15), you now need a token and dosimeter. This is not required for most LAr shifters, who will remain in the control room only. To get this access, you need a medical evaluation and a radiation training course. 
If you do go into the underground area, it is also mandatory to wear safety shoes, a hard hat and lamp. A.2 Network To connect from/to P1 network, one has to go through the gateway called atlasgw as a user with an account on it (example lardaq). Not all shifters have permissions to use atlasgw, but all should be able to use atlasgw-exp. • Connect to P1 network from outside world : >ssh atlasgw.cern.ch (possible targets include pc-lar-scr-01, to 05, and pc-atlas-cr-03, 04, and 20) • Connect to outside world from P1 network : >ssh atlasgw-exp.cern.ch (from here, at Hostname you can type “lxplus” to open a browser, etc.) Very few web pages are accessible from within the P1 network. And, the P1 webserver (pc-atlaswww.cern.ch) is viewable only from P1, or with a proxy server. The P1 webserver is, however, mirrored (atlasop.cern.ch) for the outside world. More details on the connection of P1 to the outer world can be found at the following address: https://atlasop.cern.ch/FAQ/point1/39 . To find out if someone has a P1 account, you can use either of the following commands on a P1 machine: ldapsearch -xLLL ’gecos=*firstname*lastname*’ /daq_area/tools/bin/lfinger -U ’*firstname*lastname*’ To see what roles are enabled, you can use the second command, “lfinger” with the username of the person, without the “-U”. A.3 logout at P1 If you need to logout, and nothing works, press “Ctrl, Alt, Back space”. A.4 Printers A printer named 3162-1C01-HP is located at the 1st floor above the Atlas control room. 38 39 http://safety-commission.web.cern.ch/safety-commission/SC-site/sc pages/training/basic.html https://atlasop.cern.ch/FAQ/point1/ 65 A.5 Phone numbers LAr desks Atlas Control Room LAr satellite Control Room Tile desk Atlas Control Room LAr Run Coordinator 71346 70949 71446 162582 The list of expert phone numbers can be found here from the LArOperationManualShifter page40 (*)41 . A.6 Updating this document and checklists This document is created in Latex; the source files can be found in a SVN repositery. To get all the sources files, follow the standard SVN procedure: >export SVNGRP=svn+ssh://svn.cern.ch/reps/atlasgrp >svn co $SVNGRP/Detectors/LAr/AtLarOper/trunk AtLarOper The whole tree can be browsed on the web42 . The structure is very simple with a core tex file LArOperation.tex that includes all the sections. Please try to keep this overall architecture and try to use the new commands defined in LArOperation.tex to write down button, panel... names. Be careful about changing names of sections (the html files are named after the sections; changing a section name will break all links and require the checklists to be updated.) Once your modifications are commited in SVN, contact J. Leveque or S. Majewski in order to update the web. (They’ll update pc-atlas-www-1:/www/web files/html/lar). The checklists should be edited by experts only – and very carefully. One problem in one checklist can break all of the checklists in the Atlas Control Room! They can be found in /det/tdaq/ACR/XMLdata at Point 1. To make a new checklist: • Create the new XML file in the same directory as the others, for example, LAr-ex.xml • Copy the structure of the other xml checklists, giving the same title to your checklist as the name of the file. • Edit CheckList.xml in the same directory. Add your checklist in two places: into the top set with the \!ENTITY tag, typing the name of your checklist twice, and then below in the list of checklists once. The position in the bottom list will set the position of the checklist. 
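To illustrate the first two steps above (the template file name chosen here, signin-LAr.xml, is only an assumption; any existing checklist in the directory can serve as a model), one could start from a terminal at P1:
>cd /det/tdaq/ACR/XMLdata
>cp signin-LAr.xml LAr-ex.xml
Then edit LAr-ex.xml so that its title matches the file name, and register it in CheckList.xml as described above.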
Status of checklists: • signin-LAr replaces the old “LAr-Start-of-Shift”. It contains the tasks which shifters should complete when they first arrive. It is linked to the RunCom tool. Completing this checklist changes the Runcom state to “ready.” • startrun-LAr should be completed at the beginning of the shift, and at the beginning of a run. (startrun-LAr replaces the “desk” checklist.) • DQcheck-LAr is the checklist attached to the RunCom tool for the DQ status. It is short, to make sure that OHP is open and some quality control is done. It links to the longer LArOnline-Monitoring checklist. Completing these actions does not mean that the Data Quality 40 https://pc-atlas-www.cern.ch/twiki/bin/view/Main/LArOperationManualShifter#Phone Numbers https://atlasop.cern.ch/atlas-point1/twiki/bin/view/Main/LArOperationManualShifter#Phone Numbers 42 https://svnweb.cern.ch/cern/wsvn/atlasgrp/Detectors/LAr/AtLarOper/#path Detectors LAr AtLarOper 41 66 itself is OK, it just means that the checks are being performed. Finishing this checklist changes the DQ status from “unchecked” to “checked”. (This used to be the “shifter” checklist.) • injection-LAr contains tasks that should be performed before giving the OK for beam injection • stablebeam-LAr will be used to bring the FCAL voltages back up, once there is stable beam. Right now, it is just a placeholder. • LAr-Shift-Tutorial is a short tutorial for new shifters, to check their credentials and introduce them to the tools. • LAr-Online-Monitoring and LAr-Offline-Monitoring are meant to guide the shifters through the most important monitoring checks. They are still works in progress. • LAr-Calibration.xml and LArg-Calibration.xmlOLD are obsolete, replaced by the detailed instructions in this manual. They have been removed from the CheckList tool, but remain in the directory. Tips for latex2html: • To add a link to this document where ”link name” is a hyperlink to the URL ”link-URL” in the hypertext version of the document, but no indication of this is included in the printed version, use \htmladdnormallink {link name} {link-URL} • To include the original link as \htmladdnormallinkfoot {link name} {link-URL} a footnote, use • Using the LATEX package “graphicx”, you can include a picture without defining its extension (eps or png). It is the LATEX compilation which will choose the correct version. So latex will use eps format and latex2html will use png. It resquests to have different version of the picture in your directory. See the file “Appendix/hwMemento.tex” for example. A.7 Creating graphics (screenshots) at P1 1. Open KSnapshot from the General Menu (in the bottom of your left screen) or Open a terminal and Type ‘ksnapshot” 2. A panel will open, and you should select Capture mode: “Window under cursor” 3. Click “New Snapshot” 4. Click the window you want to capture, and it will show up on its own in the ksnapshot program. 5. Click “Save As...” and you can save the file. 6. Now, you need to copy it from P1. Quit knapshot, if you like. 7. Back in the terminal, in the directory with the file, you can copy it to your own home area with cp <filename> /atlas-home/<0 or 1>/<yourusername>/. Now you can go to a terminal on your own machine and do scp atlasgw:/atlas-home/<0 or 1>/<yourusername>/<filename>. 
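As a concrete (hypothetical) example of the last two steps, with home-area number 0, user name myuser and file name snapshot.png, all of which should be replaced by your own values:
On the P1 machine:
>cp snapshot.png /atlas-home/0/myuser/
On your own machine, outside P1:
>scp atlasgw:/atlas-home/0/myuser/snapshot.png .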
B Hardware memento
Table 5: Correspondence between slot and FEB type for all types of crates (FEB types listed in increasing slot order, starting from slot 1):
• EMB : PS, F0, F1, F2, F3, F4, F5, F6, B0, B1, M0, M1, M2, M3, -
• Std EMEC : PS, F0, F1, F2, F3, F4, F5, B0, B1, M0, M1, M2, M3, -
• Spe EMEC : PS, F0, M0, M1, F1, F2, F3, F4, B0, M2, M3, F5, B1, M4, M5
• HEC : I1 (Emec inner wheel), I2 (Emec inner wheel), HEC-L1, HEC-L2, HEC-M1, HEC-M2, HEC-H1, HEC-H2
• FCAL : F1 00, F1 01, F1 02, F1 03, F1 04, F1 05, F1 06, F1 07, F2 00, F2 01, F2 02, F2 03, F3 00, F3 01
Table 6: Total number of FEBs of a given type in a half barrel / endcap:
• EMB : 448 FEBs, ~57k channels
• Std EMEC : 208 FEBs, ~27k channels
• Spe EMEC : 68 FEBs, ~8k channels
• HEC : 24 FEBs, ~3k channels
• FCAL : 14 FEBs, ~1800 channels
C Few hints on events dump
If you want to further investigate a data integrity problem, it may be very useful to use the event dump program directly on the machine on which the data are written (see 5.2.5). For more details on the meaning of the Ctrl words, please refer to the documentation by J. Prast, py6414cyclone v29.doc (v2.9), available on the website http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html43 . One should finally note that most of the errors should be spotted by the LArFEBMon algorithm (running either online or offline). Detailed list to come.
An example dump starts with the ROD header and the DSP block:
ROD marker ee1234ee
ROD Hdrsize 9
ROD Eformat 3000008
ROD source 410110
ROD runnb 12046
ROD evtid 0 : 01
ROD bcid 63
ROD trigger 0
ROD evt type Transparent (coding of event types: 2 : Calibration / 4 : Transparent / 7 : Physics format)
DSP block Size 2343
DSP FebId 0x39300000 Feb Side EMBA 04L PS
DSP FebSer 000001020
DSP Off Energy 0 Size 0
DSP Off Chi2 0 Size 0
DSP Off RawData 26 Size 2330
DSP Status 0x00000000
DSP Nb gain 1
DSP Nb samp 32
DSP Feb config 0x00000003
DSP Unknown 0x00000000
The offsets and sizes of the different blocks depend on both settings (RawData/Results and Format) in the LAr H/W control panel.
The Ctrl0 word contains 16 blocks (1 for each gain selector). The first 4 bits correspond to a parity flag; the following 4 bits identify the gain selector (therefore all different, between 0 and f); the last 8 bits (here : a0) correspond to the EVTID and should be equal for all gain selectors.
Ctrl0 8a0 40a0 49a0 1a0 4aa0 2a0 ba0 43a0 4ca0 4a0 da0 45a0 ea0 46a0 4fa0 7a0
43 http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html
The Ctrl1 word contains the BCID for the 16 gain selectors. It should be the same for all gain selectors.
Ctrl1 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62 62
The number following the Hdr block corresponds to the sample number; it must be between 0 and Nb samp (see above in the DSP block). The 16 following numbers (here 59) correspond to the number of the SCA cell (in hexadecimal) that is read out (1 per gain selector): they must all be equal.
Hdr 00 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59 59
The number following the Samp block corresponds to the sample number again; then the raw data for the 16 gain selectors are written, together with the gain (here h for High). This type of line is repeated 8 times to complete the 128 channels of the FEB.
The number following the Samp block corresponds to the sample number again; then the raw data for the 16 gain selectors are written, together with the gain (here h = High). This type of line is repeated 8 times to cover the 128 channels of the FEB.

  Samp00 h 1000 h 955 h 980 h 971 h 994 h 1003 h 915 h 972 h 975 h 966 h 983 h 1014 h 985 h 994 h 1009 h 986
  Samp00 h 987 h 946 h 962 h 954 h 973 h 993 h 987 h 997 h 990 h 944 h 971 h 1021 h 1000 h 966 h 965 h 996
  Samp00 h 1010 h 936 h 957 h 943 h 1027 h 1000 h 957 h 973 h 993 h 964 h 943 h 1032 h 995 h 963 h 991 h 987
  Samp00 h 989 h 922 h 989 h 993 h 957 h 964 h 936 h 953 h 985 h 960 h 953 h 1024 h 1009 h 981 h 1013 h 1006
  Samp00 h 953 h 940 h 988 h 970 h 1004 h 995 h 959 h 966 h 991 h 922 h 948 h 996 h 980 h 988 h 978 h 986
  Samp00 h 947 h 941 h 972 h 980 h 970 h 998 h 925 h 958 h 975 h 956 h 971 h 1015 h 995 h 991 h 1005 h 1020
  Samp00 h 950 h 965 h 957 h 986 h 1006 h 991 h 988 h 950 h 997 h 973 h 970 h 1025 h 962 h 982 h 1004 h 993
  Samp00 h 936 h 927 h 963 h 972 h 990 h 997 h 955 h 947 h 987 h 925 h 941 h 979 h 1012 h 971 h 967 h 1022

The same pattern repeats for the following samples (Hdr 01 followed by eight Samp01 lines, Hdr 02 followed by eight Samp02 lines, etc.). It is worth noting that, for reasons of time/memory optimisation, the SCA cell numbers are encoded in such a way that they are not consecutive from one sample to another (here 59 for sample 00, then 58, 48 and 49 for samples 01 to 03). This is normal.

  [Hdr 01 / Samp01 ... Hdr 03 / Samp03 blocks: same format as above]
Here the blocks of data for samples between 04 and 31 are skipped ...

The Ctrl3 word contains 16 blocks coded in hexadecimal (one for each gain selector). The 4 first bits are related to parity information; the 4 following bits to bit errors and SEU: 8 means OK. The 8 last bits contain the SPAC status: it should be equal to 07 for the first event and 05 for the following ones. A SPAC error should normally be propagated to the INFPGA status.

  Ctrl3  4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807 4807

The INFPGA status contains several checks regarding BCID, EVTID and SCAC performed at the FEB level. It should normally be equal to 0x0.

  INFPGA status 0x0  Unknown 0  samples 32  gains 1

D What to remember from the old discussion forum?

D.1 On the triggers

How to find which triggers are included in a trigger stream (e.g. L1Calo)?
• Open a terminal and type: /det/tdaq/scripts/start_trigger_tool
• Log in as user.
• Double-click on the correct SUPER MASTER KEY (you can find this in the top field of the trigger tab in the tdaq panel). Wait for this to load. Ignore the window which opens automatically and use the original window which opened.
• Click on L1 Streaming and a new window will open.
• Double-click the stream you want (e.g. L1Calo) and a list of the triggers comes up.
You can also use Streams instead of L1 Streaming for L2 and EF info.

How can I learn specifics about the definitions of triggers being run right now?
• From a terminal:
  > /det/tdaq/scripts/setup_TDAQ_15.2.0.sh
  > /det/tdaq/scripts/start_trigger_tool
• The trigger GUI will pop up. Log in as ”User”.
• Click Search. You will get a list of all SuperMaster keys.
• How to find the SuperMasterKey currently used in the run:
  – On the AtlasOperations wiki page, click on the Run Control WhiteBoard (at the top of the page).
  – The trigger menu is given in the ”basic Run parameters” section.
• Choose the corresponding masterKey from the list in the trigger GUI. You will see all of the L1 trigger bits. Double-click on the ID number and a window will pop up showing the L1 triggers. One can expand any of them to see the logical definition and the different L1 conditions. Keep expanding down the tree to see thresholds, etc...

D.2 Data flow picture

E More info about LAr FEB errors

The LArFEBMon algorithm performs different checks on the data integrity. The different errors reported are briefly detailed here; for more details on the bit significance, you should refer to table 1 of the reference document pu6414cyclone v29.doc, downloadable at the address http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html44
The data sent by one Front End Board (FEB) are always accompanied by a StatusWord describing the status of several internal FEB components. This status word, coded on 12 bits, is encapsulated in the Digital Signal Processor (DSP) header for further decoding.
From this StatusWord, the algorithm derives 6 types of errors, depending on the observed error bits (a small decoding sketch is given at the end of this appendix):
• Bit 6: parity error;
• Bits 2 or 7: BCID mismatch between the 2 halves or within one half;
• Bits 3 or 8: sample header mismatch between the 2 halves or within one half;
• Bits 1 or 9: EVTID mismatch between the 2 halves or within one half;
• Bits 4, 11 or 12: wrong SCAC status within one half of the FEB;
• Bit 5: gain mismatch within time samples.

In addition to these 6 types of errors, 4 further checks are performed45:
• Type mismatch: the data blocks of several FEBs are of different types (Raw data, Physics data, or Calibration data). The first readout data block is taken as reference.
• SCA out of range: the decoded SCA address (the SCA is the analog pipeline located between the shaper and the ADC in the readout chain of the FEB) is outside the physical range [0;144]. This is probably due to a more severe data corruption or a bad bit stream conversion.
• Non-uniform number of samples: the data blocks of several FEBs have a different number of samples. The first readout data block is taken as reference.
• Empty FEB data block: one FEB does not send any data (but the presence of a DSP header proves that it is included in the readout and therefore should send data).

44 http://wwwlapp.in2p3.fr/atlas/Electronique/RODs/index.html
45 In the case of calibration runs or physics runs in physics format, the SCA block is not available, which prevents us from performing the 3 last checks detailed hereafter.
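As a rough illustration of the bit assignments listed above, the shell lines below flag which error categories are set in a given StatusWord. This is only a sketch: the 1-to-12 bit numbering (bit 1 = least significant bit) and the example value are assumptions made here for illustration, not taken from the LArFEBMon code.

  status=0x044   # hypothetical example value, here with bits 3 and 7 set
  # bit n (n = 1..12) is tested with (status >> (n-1)) & 1
  (( (status >> 5) & 1 ))                                      && echo "bit 6: parity error"
  (( ((status >> 1) | (status >> 6)) & 1 ))                    && echo "bits 2/7: BCID mismatch"
  (( ((status >> 2) | (status >> 7)) & 1 ))                    && echo "bits 3/8: sample header mismatch"
  ((  (status       | (status >> 8)) & 1 ))                    && echo "bits 1/9: EVTID mismatch"
  (( ((status >> 3) | (status >> 10) | (status >> 11)) & 1 ))  && echo "bits 4/11/12: wrong SCAC status"
  (( (status >> 4) & 1 ))                                      && echo "bit 5: gain mismatch within samples"
  # with 0x044 this prints the BCID mismatch and sample header mismatch lines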