At some point in your analysis you will have to deal with systematics. There is no way around it, even if it is just to verify that you are statistics limited and therefore don’t have to worry about systematics.
However, systematics is also quite literally an advanced topic, in the sense that you have to have an analysis before you can evaluate its systematics. As such, if this is your first time through the tutorial and you haven’t added any CP tools to your algorithm, you should skip this section for now and move on to using some CP tools in your analysis. Once you have done that, you can come back here and learn how to evaluate your systematics.
The way systematics evaluation works is that you evaluate your analysis at different points in nuisance parameter space and pass the results to your statistics program, which then combines them into an overall result with an associated systematic error. How that combination happens will not be covered here; instead we’ll focus on how to prepare the intermediate results you need as inputs for the statistics program.
One classical example of a systematic is the calorimeter energy scale: we are pretty certain of what it is, but our assumed energy scale is almost guaranteed to be slightly off (as our calibration has limited accuracy). And if the energy scale is slightly off, the energies of measured objects will be off as well, impacting our result. To characterize this we introduce a nuisance parameter, which represents the energy scale and which we can vary to see the effect it has on the result. We also have an external constraint on that parameter, i.e. we calibrated the calorimeter and know how consistent a given nuisance parameter value is with that calibration.
For our purposes a nuisance parameter is an experimental parameter we don’t know for certain and which is not part of our result, but which affects our result. By convention it is scaled so that the external constraint can be represented by a unit Gaussian, i.e. 0 corresponds to the nominal value and +/-1 correspond to +/-1 sigma variations.
In our code we represent nuisance parameter values via SystematicVariation objects, which contain both the name of the nuisance parameter and the value. An actual point in nuisance parameter space is represented by a SystematicSet object, which can contain multiple SystematicVariation objects (though in most cases it is empty or contains only one of them).
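To make this concrete, here is a minimal sketch of constructing these objects by hand (the systematic name used is purely illustrative; in practice the names come from the CP tools themselves):

```cpp
// A single nuisance parameter at +1 sigma.  The name "EG_SCALE_ALL"
// is just an illustrative example, not a recommendation.
CP::SystematicVariation variation ("EG_SCALE_ALL", 1);

// A point in nuisance parameter space.  An empty set is the nominal point.
CP::SystematicSet point;
point.insert (variation);
```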
To incorporate systematics into an analysis it is typically sufficient to reconfigure the CP tools for the different points in nuisance parameter space. To that end, CP tools affected by systematics implement the ISystematicsTool interface. It is typically best to collect all your systematics tools into a single vector and loop over them. For the rest of this discussion we’ll assume you have such a vector defined in your algorithm class:
std::vector<CP::ISystematicsTool*> m_systematicsTools;
In general you will have to generate a list of all the nuisance parameter points you want to evaluate. Before doing that you need the list of all nuisance parameters themselves. For that ISystematicsTool has two functions: recommendedSystematics() and affectingSystematics(). By and large you want to stick with the recommended systematics, which are the systematics the CP group actually recommends you use. Affecting systematics are all the systematics that affect the tool at all, which can also include cross-checks, overly pessimistic systematics, etc.
To get the recommended systematics do:
CP::SystematicSet recommendedSystematics;
for (auto tool : m_systematicsTools)
recommendedSystematics.insert (tool->recommendedSystematics());
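If you want to check what you collected, it can be useful to print the names of the recommended systematics once during initialization (a sketch; ATH_MSG_INFO assumes you are in an Athena-style algorithm, otherwise substitute your framework’s logging):

```cpp
// Print each collected nuisance parameter once, for bookkeeping.
for (const auto& variation : recommendedSystematics)
  ATH_MSG_INFO ("collected systematic: " << variation.name());
```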
Now you should convert this into a list of nuisance parameter points to evaluate. The simplest thing you can do (and which is sufficient for many analyses) is a +/-1 sigma variation for each systematic. There is a helper function for just that:
std::vector<CP::SystematicSet> systematicsList
= CP::make_systematics_vector(recommendedSystematics);
Though for practical use you probably want to make this a member variable in your algorithm:
std::vector<CP::SystematicSet> m_systematicsList;
As to where you actually set that variable: if you just call make_systematics_vector on the recommended systematics, you are probably fine doing this inside initialize() of your algorithm. Please note that you need to get the list of systematics after you configure your tools. Some tools have multiple possible systematics configurations, which are selected using configuration parameters; if you query them before configuring your tools you will get the wrong answer.
Please note that this is really just the most basic way of generating your nuisance parameter points. For more advanced features, take a look at the MakeSystematicsVector class, which offers a number of additional options.
Note that ideally you would pass the list of nuisance parameters into your statistics tool and it would give you back the best list of nuisance parameter points to evaluate. However, currently (09 Jul 17) none of our tools support that.
Actually applying your systematics is very simple. Let’s assume that your execute function currently looks like this:
execute ()
{
// do something
}
Then with systematics it will look like this:
execute ()
{
for (auto& systematic : m_systematicsList)
{
for (auto tool : m_systematicsTools)
tool->applySystematicVariation (systematic);
// do something
}
}
Or almost: you will also have to make sure that you use unique names for each systematic. E.g. if you currently fill a histogram like this:
hist("h_met")->Fill (met);
it may then look like this:
hist("h_met_" + systematic.name())->Fill (met);
This is not the most efficient way of handling this, as it involves a fair amount of string operations. Ideally hist() would take the SystematicVariation object as a second argument to avoid that. Though unless you have to fill a lot of histograms, this is probably just fine.
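If the string operations ever do become a concern, one simple workaround is to build the per-systematic histogram names (or, better, cache the histogram pointers themselves) once up front, instead of concatenating strings for every event. A self-contained sketch of the naming part, using only the standard library; the function name is made up for illustration:

```cpp
#include <map>
#include <string>
#include <vector>

// Build all per-systematic histogram names once, so the per-event code
// can do a single map lookup instead of a string concatenation.
std::map<std::string, std::string> makeHistNames
  (const std::string& base, const std::vector<std::string>& systematicNames)
{
  std::map<std::string, std::string> names;
  for (const auto& sys : systematicNames)
    names[sys] = base + "_" + sys;
  return names;
}
```

In a real algorithm you would fill this from m_systematicsList (using SystematicSet::name() as the key) and store the histogram pointers rather than the names.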
You may have noticed that we simply reconfigure and rerun all the tools for all the systematics. This is the simplest way to do it, as it ensures that we have a consistent view of the event even if multiple CP tools are affected by the same systematic. It is also perfectly safe, as CP tools will just ignore any systematics that don’t affect them. However, it is not the most efficient approach, as we rerun all the tools even when they are not affected by the current systematic. So there is some room for optimization (and some analysis frameworks do this), but it comes with potential for mistakes; unless a lot of people will run your code it is probably not worth it.
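For reference, the ISystematicsTool interface does provide the building block for such an optimization: isAffectedBySystematic() tells you whether a given variation affects a tool. A hedged sketch of the check (actually skipping the rerun of unaffected tools is deliberately left out, as that is where the mistakes tend to happen):

```cpp
// Sketch: determine whether a tool needs to be rerun for a given
// nuisance parameter point.  A tool is affected if any of the
// variations in the set affects it.
bool toolIsAffected (const CP::ISystematicsTool& tool,
                     const CP::SystematicSet& point)
{
  for (const auto& variation : point)
    if (tool.isAffectedBySystematic (variation))
      return true;
  return false;
}
```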