Monte Carlo (MC) event generation in ATLAS uses AthGeneration, which provides
a variety of generators and the detector simulation. While the generator software
is available independently of the ATLAS software, it is important to use it through
the ATLAS interfaces to ensure consistent usage of these tools, reproducibility
for other collaboration members, and consistent settings of program parameters.
There are numerous generators available for use in ATLAS, each with its own
advantages and disadvantages. In many cases, new physics samples use the
MadGraph5_aMC@NLO event generator (or just “MadGraph” for short).
The MC generation part of the tutorial consists of two parts: generation
and validation. These need to be done in separate directories, using
different release setups. To keep a sensible directory structure, use
the following commands from your tutorial
directory:
mkdir MCTutorial
cd MCTutorial
mkdir MCGeneration
For this tutorial, we’re working in an area that will be saved so that you have everything preserved in front of you and can play around for as long as you like. Normally, if you are running test jobs or a small production, you should use a responsive locally-mounted disk, like /tmp/$USER on lxplus or the scratch space available to you in the batch system.
The general idea you should follow is to preserve what you need in a stable area like AFS or EOS, copy job inputs to the local area, run, and then copy outputs back to stable storage (preferring EOS for storing large outputs). Generally, AFS will be more responsive for storing software that you want to compile and run against, and EOS will be more performant (and has more space) for storing large input and output files. If you are running in that mode, don’t forget to also copy back any log files you need, and to check carefully for job failures. Incidentally, this is basically what grid jobs do as well! When in doubt, you should check if $TMPDIR is defined and use it for your jobs.
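The stage-in, run, stage-out pattern described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `pick_scratch_base` and `run_staged` are hypothetical (they are not part of any ATLAS tool), and the `job` argument stands in for whatever you actually run in the scratch directory.

```python
import os
import shutil
import tempfile
from pathlib import Path


def pick_scratch_base(env=None):
    """Return $TMPDIR if it is defined, otherwise fall back to /tmp/$USER."""
    env = os.environ if env is None else env
    if env.get("TMPDIR"):
        return Path(env["TMPDIR"])
    return Path("/tmp") / env.get("USER", "unknown")


def run_staged(inputs, stable_out, job, scratch_base):
    """Copy inputs to a fresh scratch directory, run the job there,
    then copy every new file (outputs and logs) back to stable storage."""
    scratch_base = Path(scratch_base)
    scratch_base.mkdir(parents=True, exist_ok=True)
    stable_out = Path(stable_out)
    stable_out.mkdir(parents=True, exist_ok=True)

    # A fresh working directory per job avoids clashes between jobs.
    workdir = Path(tempfile.mkdtemp(prefix="job_", dir=scratch_base))
    try:
        staged = set()
        for src in inputs:
            shutil.copy2(src, workdir)
            staged.add(Path(src).name)

        job(workdir)  # hypothetical callable that runs in workdir

        copied = []
        for item in sorted(workdir.iterdir()):
            # Anything that was not an input is treated as an output or log.
            if item.is_file() and item.name not in staged:
                shutil.copy2(item, stable_out / item.name)
                copied.append(item.name)
        return copied
    finally:
        # Clean up the scratch area whether or not the job succeeded.
        shutil.rmtree(workdir, ignore_errors=True)
```

Note that in this sketch a job that raises an exception leaves nothing behind in stable storage, so a real workflow would probably want to copy the log files back before cleaning up.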
Begin by moving to the MCGeneration directory and setting up a recent
AthGeneration release. Recent releases are in the 23.6 series.
tutorial/MCTutorial/MCGeneration
cd MCGeneration
setupATLAS
This will start a new shell with the container. When the new shell is open, use the following command to set up the release.
asetup AthGeneration,23.6.39
The MC generation tools make use of a jobOptions file (often referred to
as JOs) that uses Python to define commands for AthGeneration to execute.
JOs are used for many different tasks using ATLAS software. The format and common methods used in JOs are very procedure-specific. If you need to write JOs for a task, it is helpful to look at existing examples.
We will create the JOs to produce a pair of leptoquarks with a mass of 1000 GeV that decay to first and second generation leptons/quarks with a final state containing either two electrons or two muons.
In your MC generation work area, create a directory called 100000 and
the file 100000/mc.MGPy8EG_A14N23LO_LO_LQ_S1_PairProd_SameFlav_m1000.py.
mkdir 100000
touch 100000/mc.MGPy8EG_A14N23LO_LO_LQ_S1_PairProd_SameFlav_m1000.py
The 6- or 7-digit directory name (100000) is known as a DSID (Dataset Identifier). This is used as a unique numerical identifier for the specific JOs. For local testing, you can use dummy 6- or 7-digit numbers, placing exactly one JO file in each DSID directory. A unique DSID is assigned to each of your JOs in the central sample production procedure. Smaller numbers (below 500000) are reserved already, so new production generally will use larger numbers.
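As a quick local sanity check of the layout described above, a small Python sketch could verify that a directory name has 6 or 7 digits and holds exactly one JO file. The function name `check_dsid_dir` is hypothetical, and the `mc.*.py` glob is an assumption based on the filename used in this tutorial.

```python
import re
from pathlib import Path

# A DSID directory name is 6 or 7 digits, as described in the text.
DSID_PATTERN = re.compile(r"^\d{6,7}$")


def check_dsid_dir(path):
    """Return a list of problems found in a local DSID directory (empty if OK)."""
    path = Path(path)
    problems = []
    if not DSID_PATTERN.match(path.name):
        problems.append(f"'{path.name}' is not a 6- or 7-digit DSID")
    # Assumption: JO files are the ones named mc.<something>.py
    jo_files = sorted(p.name for p in path.glob("mc.*.py"))
    if len(jo_files) != 1:
        problems.append(f"expected exactly one JO file, found {len(jo_files)}")
    return problems
```

Calling `check_dsid_dir("100000")` from the MCGeneration directory should return an empty list once the directory and JO file have been created.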
The JOs filename is required to follow a certain format. The string between
mc. and .py is known as the DID or “physics short” and provides a succinct
description of the process described by the JOs. It is required to contain
information about the generator(s) and PDF(s) used and should contain other
information such as the model and other properties related to the production
and decay mechanism. It is also required to contain no more than 50 characters
and should be unique to the sample if possible (e.g. each signal point in a scan
should have a different physics short, and differently configured top-quark
production samples should have different physics shorts).
The first part (MGPy8EG) indicates which tools are used in the event generation.
MG refers to LO MadGraph, which is used for the matrix element calculation
(NLO MadGraph is denoted as aMC). Py8 indicates that Pythia8 is used
for the parton shower and hadronization step. EG refers to EvtGen, which is
an afterburner that ensures the decays of B hadrons are correctly modeled.
A14N23LO refers to the LO NNPDF2.3 PDF set used with the A14 tune. LO_LQ_S1
is the MadGraph model that is used. See if you
can parse the rest to understand what physics process (production mode, decay
mode, and BSM particle masses) is being simulated by these JOs.
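The naming rules above can be checked with a few lines of Python. The following is a minimal sketch, not an official validator: the function names are hypothetical, and the tag table only contains the four generator abbreviations mentioned in this section.

```python
MAX_PHYSICS_SHORT_LENGTH = 50

# Only the abbreviations explained in the text; real JOs may use others.
GENERATOR_TAGS = {
    "MG": "MadGraph (LO)",
    "aMC": "MadGraph (NLO)",
    "Py8": "Pythia8",
    "EG": "EvtGen",
}


def physics_short(filename):
    """Extract the string between 'mc.' and '.py' and check its length."""
    if not (filename.startswith("mc.") and filename.endswith(".py")):
        raise ValueError("JO filename must look like mc.<physics short>.py")
    short = filename[len("mc."):-len(".py")]
    if len(short) > MAX_PHYSICS_SHORT_LENGTH:
        raise ValueError(f"physics short has {len(short)} characters (max 50)")
    return short


def decode_generators(tag):
    """Split a generator tag such as 'MGPy8EG' into the tools it names."""
    found = []
    pos = 0
    while pos < len(tag):
        # Try longer abbreviations first so 'aMC' is not misread.
        for key in sorted(GENERATOR_TAGS, key=len, reverse=True):
            if tag.startswith(key, pos):
                found.append(GENERATOR_TAGS[key])
                pos += len(key)
                break
        else:
            raise ValueError(f"unknown generator tag at '{tag[pos:]}'")
    return found


short = physics_short("mc.MGPy8EG_A14N23LO_LO_LQ_S1_PairProd_SameFlav_m1000.py")
fields = short.split("_")
print(len(short), decode_generators(fields[0]), fields[1:])
```

For the filename used in this tutorial the physics short is 49 characters long, just inside the 50-character limit.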
Copy the code below into your JOs file:
# Import all of the necessary methods to use MadGraph
from MadGraphControl.MadGraphUtils import *
from MadGraphControl.MadGraph_NNPDF30NLO_Base_Fragment import *
# Some includes that are necessary to interface MadGraph with Pythia
include("Pythia8_i/Pythia8_A14_NNPDF23LO_EvtGen_Common.py")
include("Pythia8_i/Pythia8_MadGraph.py")
# Mass of the leptoquark
lq_mass = 500
# Number of events to produce
safety = 1.1 # safety factor to account for filter efficiency
nevents = runArgs.maxEvents * safety
# Make sure LQ PDG IDs are known to TestHepMC:
with open("pdgid_extras.txt", "w+") as pdgfile:
    pdgfile.write("""
-9000005
9000005
""")
# Here is where we define the commands that will be passed to MadGraph
# Import the LQ model
process = """
import model LO_LQ_S1
"""
# Define some multi-particle representations
process += """
define charm = c c~
define up = u u~
define q = u u~ d d~ c c~ s s~
define e = e- e+
define mu = mu- mu+
"""
# Define the physics process to be simulated
process += """
generate g g > e mu up charm
"""
# This defines the MadGraph outputs
process += """
output -f
"""
# Define the process and create the run card from a template
process_dir = new_process(process)
settings = {'ickkw': 0, 'nevents':nevents}
modify_run_card(process_dir=process_dir,runArgs=runArgs,settings=settings)
# Set some values in the param card
# BSM particle masses
masses={'9000005':lq_mass, #S1
'1000021':1000000. } # chi10 - needed because of a bug in the model
# Leptoquark width
# This is hard-coded here, but could be calculated on the fly with a function
lq_width = 39.7887
decays={'9000005':"""DECAY 9000005 %g #leptoquark decay""" % lq_width}
# These are the couplings of the leptoquarks to first and second
# generation fermions
yuks1ll={'1 1':"""0.000000e-00 # yll1x1"""}
# Both entries go in one dictionary; a second assignment would overwrite the first
yuks1rr={'1 1':"""1.000000e-01 # yRR1x1""",
         '2 2':"""1.000000e-01 # yRR2x2"""}
# Create the param card and modify some parameters from their default values
modify_param_card(process_dir=process_dir,params={'MASS':masses,'DECAY':decays,'YUKS1LL':yuks1ll,'YUKS1RR':yuks1rr})
# Do the event generation
generate(process_dir=process_dir,runArgs=runArgs)
# These details are important information about the JOs
evgenConfig.description = 'Single Leptoquark coupling lam1122. m_S1 = %s GeV' % (lq_mass)
evgenConfig.contact = [ "Jason Veatch <jason.veatch@cern.ch>" ]
evgenConfig.keywords += ['BSM','exotic', 'scalar', 'leptoquark']
arrange_output(process_dir=process_dir, runArgs=runArgs)
Note the lines that include the variable safety. The production system expects maxEvents events to come out of the generation step before they are passed down the simulation chain. This is not always equal to the number of events the generator is instructed to produce. There can be multiple reasons for this, but the main one is filters, which are discussed later. The MadGraph step produces nevents events, and the events then pass through Pythia8 and might pass through additional filters before being written into an output file. If the number of events in the output file is less than maxEvents, the job will produce an error. The safety factor makes sure that MadGraph produces enough events to account for any losses due to filters and other inefficiencies.
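To make the arithmetic concrete, here is a small standalone sketch of how a safety factor translates into the number of events requested from MadGraph. The helper names are hypothetical and this is not how the JOs above compute nevents (they simply multiply). Rounding before taking the ceiling is a choice made here to avoid floating-point artefacts such as 100 * 1.1 evaluating to slightly more than 110.

```python
import math


def events_to_request(max_events, safety=1.1):
    """Number of events to ask the generator for, rounded up to an integer."""
    if max_events <= 0:
        raise ValueError("max_events must be positive")
    if safety < 1.0:
        raise ValueError("a safety factor below 1 would under-produce events")
    # Round first so that 110.00000000000001 becomes 110 rather than 111.
    return math.ceil(round(max_events * safety, 6))


def safety_from_efficiency(filter_efficiency, margin=1.1):
    """Hypothetical helper: scale by 1/efficiency, plus an extra margin."""
    if not 0.0 < filter_efficiency <= 1.0:
        raise ValueError("filter efficiency must be in (0, 1]")
    return margin / filter_efficiency


print(events_to_request(100, 1.1))                          # no filter, 10% margin
print(events_to_request(100, safety_from_efficiency(0.5)))  # 50% efficient filter
```

With maxEvents=100 and the safety factor of 1.1 used in the JOs, this requests 110 events. A filter that keeps only half of the events would need roughly twice as many.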
In the normal containers provided by setupATLAS, some text editors like emacs, nano, or pico will not work. It is generally recommended to separate the terminal in which you do your file editing from that in which you run athena. The editor vim will still work in the container.
Running a Gen_tf.py command will use the JOs to produce an events file. This
is a truth-level description of the event before detector effects are taken into
account.
tutorial/MCTutorial/MCGeneration
Use the following command:
Gen_tf.py --ecmEnergy=13600. \
--maxEvents=100 \
--randomSeed=123456 \
--outputEVNTFile=evgen.root \
--jobConfig=100000
Gen_tf.py is one of several transforms that Athena provides. A transform is a high-level configuration that combines multiple job option files for your job to build on. These transforms are used heavily on the grid for official samples, and they are designed to always be run from an empty directory. That means if you run two jobs in the same directory at the same time, you might see crashes.
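One way to respect the "empty directory" rule when scripting several test jobs is to create a fresh run directory for each one. The sketch below only builds the command line and the directory and does not invoke Gen_tf.py itself. The helper names are hypothetical, the flags are the ones used in the command above, and passing an absolute path to --jobConfig is an assumption you should verify in your release.

```python
import tempfile
from pathlib import Path


def build_gen_tf_command(job_config, max_events=100, ecm_energy=13600.0,
                         random_seed=123456, output="evgen.root"):
    """Assemble the Gen_tf.py argument list used in this tutorial."""
    return [
        "Gen_tf.py",
        f"--ecmEnergy={ecm_energy}",
        f"--maxEvents={max_events}",
        f"--randomSeed={random_seed}",
        f"--outputEVNTFile={output}",
        f"--jobConfig={job_config}",
    ]


def fresh_run_dir(base):
    """Create and return a new, empty directory under base for one job."""
    base = Path(base)
    base.mkdir(parents=True, exist_ok=True)
    return Path(tempfile.mkdtemp(prefix="run_", dir=base))


# Example: two jobs with different seeds, each in its own empty directory.
# Inside a shell where the release is set up, each command could be run with
# subprocess.run(cmd, cwd=run_dir, check=True).
jo_dir = Path("100000").resolve()  # assumption: absolute path is accepted
for seed in (123456, 123457):
    run_dir = fresh_run_dir(Path(tempfile.gettempdir()) / "gen_tf_runs")
    cmd = build_gen_tf_command(jo_dir, random_seed=seed)
    print(run_dir, " ".join(cmd))
```

Giving each job a different random seed, as in the loop above, also ensures the jobs do not generate identical events.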
This will take a few minutes to run and will use the JOs in 100000 to generate
100 events at a center-of-mass energy of 13.6 TeV. A large amount of text will be
written to your screen as well as to log.generate. If the process runs successfully,
you should see a message at the end saying:
INFO leaving with code 0: "successful run"
and a file called evgen.root in the EVNT format. You can also find the LHE-format
events produced by MadGraph in a file called events.lhe.
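If you want to check the outcome from a script rather than by eye, a short Python sketch like the following could look for the success message in the log and count the events in the LHE file. The function names are hypothetical. The check relies only on the "leaving with code 0" text quoted above and on the fact that each event in an LHE file starts with an `<event>` tag.

```python
import re
from pathlib import Path

# Matches '<event>' or '<event ...>' but not other tags such as '<eventgroup>'.
EVENT_TAG = re.compile(r"<event[\s>]")


def job_succeeded(log_path):
    """True if the transform log contains the 'leaving with code 0' message."""
    text = Path(log_path).read_text(errors="replace")
    return "leaving with code 0" in text


def count_lhe_events(lhe_path):
    """Count the <event> blocks in an uncompressed LHE file."""
    count = 0
    with open(lhe_path) as lhe_file:
        for line in lhe_file:
            if EVENT_TAG.match(line.strip()):
                count += 1
    return count
```

For the job above you would expect `job_succeeded("log.generate")` to be true and `count_lhe_events("events.lhe")` to be at least maxEvents, since MadGraph is asked for extra events through the safety factor.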
If you rerun the Gen_tf.py command in the same directory, it will overwrite
all of the outputs.
Now let’s look at the output to see if we are generating the expected signal.