WARNING: This section may no longer be up-to-date
This section should not be relevant to the typical user. It documents the details of how to implement a new driver to make the EventLoop package work in a new environment. If this is what you are trying to do, it may be a good idea to contact me up-front with the details of what you are trying to do, so I can give you some additional guidance.
The first decision you have to make is whether your driver should be a part of the EventLoop package or live in a separate package. That’s really up to you, but so far I am keeping everything in one package for simplicity. However, even if you keep your driver in a separate package, there are probably some changes that need to be made to the EventLoop package anyway.
The basic driver design will consist of three or four components, described in turn below: the Driver class that runs on the submission node, the interaction with the SampleHandler package, the Worker class that runs on the worker node, and the steering code specific to your system.
This part of the EventLoop design is still very fluid. As we add more drivers some of the interfaces may change to accommodate their needs. That is one of the reasons why it is better if I know which drivers are out there, so that I can go and fix them if I break things. Anyway, this also means that you can request changes to the way EventLoop works behind the scenes to make your driver implementation easier.
When designing your driver, you have the choice of storing additional information inside the unique submission directory, as long as it doesn’t collide with any “official” files put there. If your files are fairly large you may consider removing them after the job has finished in order to save space.
The Driver class provides an interface for code that runs on the
submission node. As such your class needs to derive from that class and
override its virtual functions. So far the only virtual function is
doSubmit, which is called when submitting a new job.
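As a sketch of this pattern, the following shows a driver class overriding doSubmit. The base class here is a stub stand-in (the real one lives in the EventLoop headers), and the exact doSubmit signature is an assumption, so treat this purely as an illustration of the overriding pattern:

```cpp
#include <cassert>
#include <string>

// Stub stand-ins for the EventLoop classes; the real ones live in
// EventLoop/Job.h and EventLoop/Driver.h.
struct Job {};

struct Driver
{
  virtual ~Driver () = default;
  // called when submitting a new job; the exact signature is an assumption
  virtual void doSubmit (const Job& job, const std::string& location) const = 0;
};

// A hypothetical driver for some batch system
struct MyBatchDriver : Driver
{
  mutable std::string lastLocation; // recorded for illustration only

  void doSubmit (const Job& job, const std::string& location) const override
  {
    // here you would: save the Job into the submission directory,
    // generate your batch scripts, and submit them to your system
    (void) job;
    lastLocation = location;
  }
};

// small usage helper: submit a job into a submission directory
inline std::string demoSubmit ()
{
  MyBatchDriver driver;
  driver.doSubmit (Job{}, "submitDir");
  return driver.lastLocation;
}
```

The user-facing pattern stays the same as for the existing drivers: construct your driver, then hand it a Job and a submission directory.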
Depending on the nature of your driver, you may also want to add further
configuration options. These can go either into your Driver class
itself, or into the driver-independent Job class. Which of the two is
preferable mostly depends on whether this is something you expect the
user to set on a job-by-job basis, or something that they would want to
keep the same for all their jobs. A combination is also possible, with a
field in the Driver class that can be overridden by a field in the Job
class. Configuration options that affect output datasets have the additional option of going into the OutputStream object that describes the dataset.
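The combination mentioned above, a driver-level default overridden by a per-job field, can be sketched like this. The shellInit option name and the plain-field storage are invented for illustration; real EventLoop configuration goes through its own options mechanism:

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: a per-driver default that a per-job setting can
// override. Plain string fields keep the illustration self-contained.
struct Job
{
  // empty string means "not set for this job"
  std::string shellInit;
};

struct MyBatchDriver
{
  // driver-wide default, typically set once by the user
  std::string shellInit = "source setup.sh";

  // the job-level setting, if present, wins over the driver-level one
  std::string effectiveShellInit (const Job& job) const
  {
    return job.shellInit.empty() ? shellInit : job.shellInit;
  }
};
```

The design choice is the one described in the text: put an option on the Driver when it is the same for all jobs, on the Job when it varies job by job, and resolve job-level settings with priority when both are present.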
Eventually the doSubmit method will be split up, adding a doGather method that allows drivers to disconnect from a running job and then reconnect at a later stage.
The basic functionality of the doSubmit method can be summarized like this: save the Job object that will be needed on the worker node. This is mostly the list of algorithms, but also the actual samples being run over (for meta-data access), and potentially the list of output datasets.
For each sample, the merged output histograms go into location/hist-sample.root. The ROOT tool hadd can be used for merging.
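In practice you would merge with hadd (or ROOT's TFileMerger class). Purely to illustrate what merging means for histogram files, here is a toy version that sums matching histograms bin by bin across sub-job outputs; none of the types below are EventLoop or ROOT classes:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a file of named histograms: name -> bin contents.
using HistFile = std::map<std::string, std::vector<double>>;

// Merge several sub-job outputs into one file by adding matching
// histograms bin by bin -- conceptually what hadd does.
inline HistFile mergeHistFiles (const std::vector<HistFile>& inputs)
{
  HistFile out;
  for (const HistFile& file : inputs)
  {
    for (const auto& [name, bins] : file)
    {
      std::vector<double>& target = out[name];
      if (target.size() < bins.size())
        target.resize (bins.size(), 0);
      for (std::size_t i = 0; i != bins.size(); ++i)
        target[i] += bins[i];
    }
  }
  return out;
}
```

Because histogram addition is associative, a driver is free to merge all sub-job files at the end or to fold them in one at a time as they arrive.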
When writing your driver you will have to interact fairly heavily with
the SampleHandler package. This package is still fairly new and not heavily used, which means that we can still fix things that seem broken or impractical. The first thing you have to decide is how your samples will be represented. If what you need is a list of files, call
makeTDSet to get the list. If on the other hand your system is
aware of datasets, you may want to use
SampleGrid objects and store
the information in the meta-data. You may have to define the appropriate
meta-data fields if they don’t exist already.
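The dataset-aware case could look roughly like the sketch below. The MetaData store and the SampleGridLike type are stubs standing in for SampleHandler's real classes, and the nc_dataset field name is made up; the point is only that a driver can define its own meta-data fields on a dataset-aware sample:

```cpp
#include <cassert>
#include <map>
#include <string>

// Stub of a meta-data store, standing in for SampleHandler's meta-data
// mechanism; the real interface differs.
struct MetaData
{
  std::map<std::string, std::string> fields;

  void setString (const std::string& key, const std::string& value)
  { fields[key] = value; }

  std::string getString (const std::string& key) const
  {
    auto it = fields.find (key);
    return it == fields.end() ? std::string() : it->second;
  }
};

// Hypothetical dataset-aware sample, in the spirit of SampleGrid
struct SampleGridLike
{
  std::string name;
  MetaData meta;
};

inline SampleGridLike makeGridSample ()
{
  SampleGridLike sample;
  sample.name = "mySample";
  // a driver-defined meta-data field; both the key and the value
  // are invented for this illustration
  sample.meta.setString ("nc_dataset", "some.dataset.name");
  return sample;
}
```

The worker side then reads the same field back to locate its input dataset, which is why agreeing on the field names up front matters.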
For your output datasets the preferred method is not to copy them back
to the submission node, but directly to a storage element (see reasons
in the section on output datasets above). For you that means that you
should try to figure out how to do that. Once you have done that, you
need to figure out how to access those files and create a new Sample object to do so. In most cases this will be a
SampleLocal or a
SampleGrid object, but if your storage element is sufficiently special
you may need a whole new
Sample class. If that is the case, I can help you with that.
Now your output histograms have to go a separate way from your output
datasets. How this works depends on your batch system. Most batch
systems send some information back to the submission node, so you can
just include the histograms there. Once all of them have arrived, you can use hadd to combine the output histogram files into a single one. Or, if you want, you can also try to add the histogram files together as they arrive; the latter saves some time when running with a large number of sub-jobs.
There is also SampleComposite, which needs to be supported but is not supported right now. From a practical perspective, a SampleComposite holds an entire SampleHandler that you then need to run over and combine. Not too much changes for you, except that you may have to combine histogram and output files over multiple datasets. I hope to address this issue soon.
The worker class contains the code that actually runs on the worker
node. As such, it is both in control of running the job as well as
providing all the hooks the user algorithms need to access their inputs
and outputs. To facilitate that, the
Worker base class contains a fair
amount of functionality itself and does some translating between the
algorithms and the implementation of the derived classes.
When initializing the
Worker object you have to do a couple of things:
Then when actually running you have to do a couple of things per event, and in this order:
- Call Worker::tree(tree) with the new input tree.
- Call Worker::treeEntry(entry) with the index of the tree entry currently processed.
- Call Worker::algsChangeInput(), which will notify the algorithms that a new input file is available. It is important that this happens after you register both the tree and the next entry to process.
- Call Worker::algsExecute to tell the algorithms to do the actual processing of the event.
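The per-event sequence above could look roughly like this inside a derived worker. The base class here is a stand-in that only records the call order (the real Worker provides tree, treeEntry, algsChangeInput, and algsExecute as described in the text), and the processFile loop structure is an assumption about how a driver might iterate over one input file:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for a ROOT TTree, just enough for the sketch.
struct Tree { long long entries = 0; };

// Stand-in base class; logs each call so the ordering can be checked.
struct WorkerBase
{
  std::vector<std::string> log;

  void tree (Tree* /*tree*/)          { log.push_back ("tree"); }
  void treeEntry (long long /*entry*/){ log.push_back ("treeEntry"); }
  void algsChangeInput ()             { log.push_back ("algsChangeInput"); }
  void algsExecute ()                 { log.push_back ("algsExecute"); }
};

struct MyWorker : WorkerBase
{
  // hypothetical per-file loop a driver's worker code might run
  void processFile (Tree& input)
  {
    for (long long entry = 0; entry != input.entries; ++entry)
    {
      if (entry == 0)
      {
        // on a new file: register the tree and the first entry *before*
        // notifying the algorithms of the new input
        tree (&input);
        treeEntry (entry);
        algsChangeInput ();
      } else
        treeEntry (entry);
      algsExecute ();
    }
  }
};
```

The essential invariant is the one the text stresses: both the tree and the entry must be registered before algsChangeInput fires, so that algorithms reacting to the new file already see a valid input state.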
Once the Worker object has finished processing events, it needs to do a couple more things:
- Call Worker::algsFinalize to tell all algorithms that they are finished processing and need to perform any final work left.
- Save the outputs using Driver::saveOutput. This is a public static function, so it can be called by either the Driver or the Worker.
At some point this may get folded into algsExecute, simplifying the process by one step.
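The wrap-up sequence can be sketched as follows. The types are again stand-ins, and the saveOutput signature and the output file name are invented; what the sketch preserves from the text is the ordering (finalize the algorithms first, then save) and the fact that saveOutput is static, so either side can call it:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in worker; logs the finalize call for illustration.
struct Worker
{
  std::vector<std::string> log;
  void algsFinalize () { log.push_back ("algsFinalize"); }
};

// Stand-in driver with a static save function, mirroring the text's
// statement that Driver::saveOutput is public and static.
struct Driver
{
  static std::string saveOutput (const std::string& location)
  {
    // hypothetical: report where the merged histograms would end up
    return location + "/hist-sample.root";
  }
};

// hypothetical wrap-up a driver's worker code might perform
inline std::string finishJob (Worker& worker, const std::string& location)
{
  worker.algsFinalize ();               // algorithms do their final work
  return Driver::saveOutput (location); // then the outputs get persisted
}
```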
What you need for your steering code will be highly system dependent.
You probably need a shell script that runs on the worker and a binary
that creates your
Worker object. For creating that binary you can just
add another if clause to the
util/event_loop_worker.cxx source file.
Please don’t add a lot of code to that file, just call a function that
does everything for your particular driver. Or you can add a completely
new binary if you choose. I prefer not to do that, since I don’t want
to have a large number of binaries sitting in my path.
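The extra if clause could be structured along these lines. The driver names and the runMyDriverWorker helper are hypothetical; the point, as requested above, is that the shared binary only dispatches, while everything driver-specific lives in a separate function:

```cpp
#include <cassert>
#include <string>

// Hypothetical per-driver entry points, each implemented in its own
// source file rather than in the shared worker binary.
inline std::string runDirectWorker ()   { return "direct"; }
inline std::string runMyDriverWorker () { return "mydriver"; }

// Sketch of the dispatch a shared worker binary might perform: pick
// the driver implementation from a name passed in by the submission
// side, keeping each branch to a single function call.
inline std::string dispatchWorker (const std::string& driver)
{
  if (driver == "direct")
    return runDirectWorker ();
  if (driver == "mydriver")   // the new clause for your driver
    return runMyDriverWorker ();
  return "unknown";
}
```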
I’m still working out how best to do the unit test. For now, take a look at test/ut_driver_direct.cxx, which shows how I do it now. However, I am not really happy with it, so it is probably going to change.