### Python interface

HERDOS uses the Python 3 scripting language as its user interface. All job configuration and steering is done via Python, and in principle simple application code can also be implemented directly in Python. This is mainly achieved through Python binding techniques based on Boost.Python.

The typical data processing chain in HERDOS consists of a linear arrangement of smaller processing blocks, called Algorithms. The tasks of Algorithms range from simple ones, such as skimming data from input files, to complicated ones, such as performing full detector simulation. The specific selection and arrangement of Algorithms depends entirely on the user's requirements. When processing data, HERDOS calls all Algorithms defined in the job in order, starting with the first one and proceeding to the next.

![Fig. Processing sequence of HERDOS](figs/sequence.png)

The event data to be processed by the Algorithms is kept in a common storage shared by all Algorithms, named DataStore. Each Algorithm has read and write access to this storage, as shown in the figure above.

The following simple Python script calls HERDOS to execute a simple data processing task. In this example, an Algorithm named "HelloAlg" is configured. During the event loop, HERDOS will call `HelloAlg::execute()` for each event. There are a few other modules, such as `Sniper` and `Task`; they are introduced in detail in the developer documentation.
```python
#!/usr/bin/env python
# -*- coding:utf-8 -*-

import Sniper
task = Sniper.Task("task")
task.setLogLevel(2)

import HelloWorld
alg = task.createAlg("HelloAlg")

task.setEvtMax(5)
task.run()
```

There are a few examples defined under the `offline/Example` folder, including:

- `offline/Example/AnalysisExample`: example of analyzing a HERDOS data file
- `offline/Example/EDMtests`: example of reading/writing event data
- `offline/Example/HelloHist`: example of calling `RootWriter` to create histograms
- `offline/Example/HelloTree`: example of calling `RootWriter` to create TTrees
- `offline/Example/HelloWorld`: example of a simple Algorithm, Service and Tool

### Run official simulation

There are a few pre-defined Python scripts to run detector simulation and reconstruction jobs. After HERDOS is compiled, they are placed under the `offline/install/scripts` folder, including:

- Full detector simulation: `offline/install/scripts/SimpleSimConfig/devrun.py`

Example of running this script:

1. Set up the external environment:

   ```
   source /cvmfs/herd.ihep.ac.cn/HERDOS/SLC7/Pre-release/ExternalLibs-GCC8.5.0/bashrc.sh
   ```

2. Set up the offline environment:

   ```
   source /path/to/install/setup.sh
   ```

3. With all the related environment ready, run the script:

   ```
   python offline/install/scripts/SimpleSimConfig/devrun.py --seed 123 \
       --g4mac comp_vertical-full.g4mac --particle proton --energy 60 \
       -o ./sim.root -N 3000
   ```

   Options:

   - `--seed`: integer random number seed passed to Geant4
   - `--g4mac`: Geant4 macro to use; examples can be found in `/path/to/install/g4macro/*.g4mac`, for example `comp_vertical-full.g4mac`, `comp_iso-R1.8m.g4mac`, `comp_powerlaw.g4mac`
   - `--particle`: particle type; supported: electron, proton, helium etc. For other nuclei you can use e.g. `ion_3_6_3` for Lithium
   - `--energy`: MC energy in GeV; `--energy 0.3 150` specifies an energy range of 0.3-150 GeV
   - `-o`: MC output file name
   - `-N`: number of events to generate

Other options: for a power-law spectrum you can apply

```
/gps/ene/type Pow
/gps/ene/alpha -1
/gps/ene/min 0.3
/gps/ene/max 150
```

[All other Geant4 GPS macro parameters are supported in `--g4mac`](https://www.fe.infn.it/u/paterno/Geant4_tutorial/slides_further/GPS/GPS_manual.pdf)

#### MC simulation results

![Fig. MC Simulation tree list](figs/mctreelist.png)

![Fig. MC Simulation branches](figs/mcbranch.png)

### Run official digitization & reconstruction

In the simulation example above, we generated 3000 MC events of 60 GeV protons with vertical direction. Proper digitization and reconstruction steps are required next. Like the simulation, these algorithms are treated as packages/modules and can be imported separately.

```
python offline/install/scripts/GlobalTrack/run_recon.py
usage: run_recon.py --input INPUT --output OUTPUT [--geometry GEOMETRY]
                    [--calopca] [--calodigi] [--psddigi] [--fitdigi]
                    [--fitcluster] [--fittrack] [--scddigi] [--scdcluster]
                    [--scdtrack] [--globaltrack]

the following arguments are required: --input, --output
```

1. FIT reconstruction example with digitization, cluster finding and FIT track finding:

   ```
   python run_recon.py --input sim.root --output out.root \
       --fitdigi --fitcluster --fittrack
   ```

2. SCD reconstruction example with digitization, cluster finding and SCD track finding:

   ```
   python run_recon.py --input sim.root --output out.root \
       --calopca --scddigi --scdcluster --scdtrack
   ```

3. Global track finding with FIT & SCD cluster information: track reconstruction example with SCD & FIT digitization, cluster finding and global track finding:

   ```
   python run_recon.py --input sim.root --output out.root \
       --scddigi --scdcluster --fitdigi --fitcluster --globaltrack
   ```

The script imports the packages available in HERDOS (not all of them), including digitization and reconstruction. The order in which the modules are imported is not fixed.

#### Digitization & reconstruction results

A detailed explanation of the trees/branches can be found in the corresponding section.
![Fig. Reconstruction tree list](figs/retreelist.png)

![Fig. Reconstruction branches](figs/rebranch.png)

### Analyse data files

The data files generated by HERDOS are in ROOT format. The data I/O and the event data model are developed based on `podio`.

For complex analyses, where you need to read correlations between EDM objects (e.g. analyzing MCParticle objects together with MC hits), we recommend writing your own HERDOS algorithm. A simple example is provided in the `AnalysisAlg` algorithm under `Example/AnalysisExample`, as shown below.

```c++
bool AnalysisAlg::initialize() {
    LogInfo << " initialized successfully" << std::endl;
    hist_edep = new TH1D("hist_edep", "hist_edep", 50, 0, 0.2);
    return true;
}

bool AnalysisAlg::execute() {
    LogDebug << "Processing event " << m_iEvt << std::endl;
    ++m_iEvt;

    auto TrackSimHits = getROColl(TrackingSimHitCollection, "scdhits");
    auto CaloSimHits  = getROColl(CaloSimCellCollection, "calohits");

    if (TrackSimHits) {
        // Do some analysis here.
    }

    if (CaloSimHits) {
        for (size_t i = 0; i < CaloSimHits->size(); ++i) {
            auto edep = CaloSimHits->at(i).getEdep();
            hist_edep->Fill(edep);
        }
    }
    return true;
}
```

The code example above reads the simulated SCD and calorimeter hits from the detector simulation output file using the C++ macro `getROColl`, then loops through the collections of hits and fills a histogram. The full code and Python script can be found in `offline/Examples/AnalysisExample/src/AnalysisAlg.cc` and `offline/Examples/AnalysisExample/scripts/AnalysisAlg.py`. A detailed introduction to the EDM classes and the data management services is given in the developer documentation.

If you would only like to perform a quick and simple analysis, writing ROOT macro scripts is also possible. But you still need to set up the HERDOS runtime environment before running the analysis, since ROOT needs to load the dictionaries of the EDM classes.
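The `AnalysisAlg` shown above is a C++ algorithm, but like every HERDOS Algorithm it is steered from Python. A minimal steering sketch might look like the following, assuming the AnalysisExample package follows the same pattern as the HelloWorld script shown earlier. The module name and the comment about input configuration are assumptions, not verified API; see `offline/Examples/AnalysisExample/scripts/AnalysisAlg.py` for the actual script. It must be run inside a HERDOS environment.

```python
#!/usr/bin/env python
# Hypothetical steering sketch for AnalysisAlg, modeled on the
# HelloWorld example above; not the verified official script.
import Sniper

task = Sniper.Task("task")
task.setLogLevel(2)

import AnalysisExample          # assumed module name providing AnalysisAlg
alg = task.createAlg("AnalysisAlg")

# An input service reading e.g. sim.root would normally be configured here.

task.setEvtMax(10)
task.run()
```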
A simple ROOT script is provided as an example:

```c++
#include "TFile.h"
#include "TChain.h"
#include "TCanvas.h"
#include "TH1D.h"
#include <vector>
#include <iostream>

using namespace edm;
using namespace std;

void ReadEDM() {
    TChain* data = new TChain("events");
    data->AddFile("data_1.root");
    data->AddFile("data_2.root");

    vector<CaloSimCellData>* calohits = 0;
    data->SetBranchAddress("calohits", &calohits);

    TCanvas* c = new TCanvas("c", "c", 800, 600);
    c->SetGrid();
    TH1D* h1 = new TH1D("h1", "energy deposit", 50, 0, 0.2);

    int nEntries = data->GetEntries();
    for (int i = 0; i < nEntries; i++) {
        data->GetEntry(i);
        std::cout << "size: " << calohits->size() << std::endl;
        for (size_t j = 0; j < calohits->size(); j++) {
            h1->Fill(calohits->at(j).edep);
        }
    }
    h1->Draw();
}
```

The above script opens the ROOT files and reads the EDM objects directly from the TTrees. Podio stores EDM objects as an `std::vector` of plain data objects for persistency (the element type is written here as `CaloSimCellData`, following the podio naming convention for the collection read above). The concrete EDM classes are defined in the `offline/EventDataModel/DataModel/data_layout.yml` file, while the C++ code is generated automatically. Please refer to the Doxygen documentation, or check the code directly under `offline/EventDataModel/DataModel`.

### Profile your code using jMonitor

Sometimes it is necessary to profile the performance of your application software, such as measuring the CPU usage, memory usage or disk I/O activity as a function of time. In HERDOS we provide the `jMonitor` command to profile software performance. The basic usage is listed below.

```
usage: jMonitor [-h] [-T MAXTIME] [-M MAXVIR] [--enable-parser]
                [-p FATALPATTERN] [--enable-monitor] [--enable-perf]
                [--enable-io-profile] [-i INTERVAL] [-l {root,matplotlib}]
                [-b {ps,prmon}] [--enable-plotref] [-f PLOTREFFILE]
                [-o PLOTREFOUTPUT] [-m {Kolmogorov,Chi2}] [-c HISTTESTCUT]
                [--gen-log] [-n NAME]
                command [command ...]

positional arguments:
  command               job to be monitored

optional arguments:
  -h, --help            show this help message and exit
  -T MAXTIME, --time-limit MAXTIME
                        Time limit of the process (s); if exceeded, the process will be killed
  -M MAXVIR, --memory-limit MAXVIR
                        Memory limit of the process (MB); if exceeded, the process will be killed
  --enable-parser       If enabled, the log of the process will be parsed
  -p FATALPATTERN, --pattern FATALPATTERN
                        Python re patterns for the parser
  --enable-monitor      If enabled, the process will be monitored (memory, CPU)
  --enable-perf         If enabled, the process will be profiled via perf
  --enable-io-profile   If enabled, the process disk I/O will be profiled
  -i INTERVAL, --interval INTERVAL
                        Time interval for the monitor
  -l {root,matplotlib}, --plotting-backend {root,matplotlib}
                        Backend for plotting monitoring figures
  -b {ps,prmon}, --monitor-backend {ps,prmon}
                        Backend for performance profiling
  --enable-plotref      If enabled, results of the process will be compared
  -f PLOTREFFILE, --plotref-files PLOTREFFILE
                        Reference file for plot testing
  -o PLOTREFOUTPUT, --plotref-output PLOTREFOUTPUT
                        Output ROOT file for plot comparison
  -m {Kolmogorov,Chi2}, --histtest-method {Kolmogorov,Chi2}
                        Method of histogram testing
  -c HISTTESTCUT, --histtest-cut HISTTESTCUT
                        P-value cut for histogram testing
  --gen-log             Whether to generate a log file
  -n NAME, --name NAME  Name of the job
```
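Since jMonitor wraps an arbitrary command, it can be combined with the simulation job described earlier. The sketch below assembles such an invocation from Python; the option names come from the usage text above, while the concrete values and the job name are illustrative, and the run is skipped unless jMonitor is actually available in the environment.

```python
import shutil
import subprocess

# Monitor memory/CPU and disk I/O of a HERDOS simulation job.
# Option names are taken from the jMonitor usage text; the values,
# paths and job name are illustrative assumptions.
cmd = [
    "jMonitor",
    "--enable-monitor",        # record memory and CPU over time
    "--enable-io-profile",     # also profile disk I/O
    "-i", "5",                 # sampling interval for the monitor
    "-n", "sim_proton_60GeV",  # name of the job
    "python", "offline/install/scripts/SimpleSimConfig/devrun.py",
    "--seed", "123", "--particle", "proton", "--energy", "60",
    "-o", "./sim.root", "-N", "100",
]

# Run only inside a HERDOS environment where jMonitor is on the PATH.
if shutil.which("jMonitor"):
    subprocess.run(cmd, check=True)
```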