This package helps users of ROOT Ntuples run a DNN test with Keras.
README : https://jaehoonlim.github.io/SimpleDNNTest
- ROOT : https://root.cern.ch
- Keras : https://keras.io
This work was supported by the Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI).
- ROOT >= 5.34 (including ROOT 6)
- Python >= 2.7 (including Python 3)
- TensorFlow >= 1.4.0 rc0
- h5py, matplotlib
usage: convertROOTtoNumpy.py [-h] [-t T] [-b B [B ...]] I
positional arguments:
I 'I'nput root file or path
optional arguments:
-h, --help show this help message and exit
-t 'T'ree name
-b 'B'ranch name
ex)
python mkSampleRootFile.py
python convertROOTtoNumpy.py samples/Signal.root -t TEST_tree -b TEST_val1 TEST_val2 TEST_val3
TensorFlow cannot read ROOT Ntuple files directly, so the first step is to convert the ROOT Ntuple file to a NumPy array file.
convertROOTtoNumpy.py takes a ROOT Ntuple file as input. When it finishes, it writes 2 kinds of output files. The first, a NumPy array file (.npy), contains the variables you selected from the ROOT Ntuple for TensorFlow. The second, a Python pickle file (.pkl), contains the variable names with their column indices. You can use this pickle file to look up which column of the NumPy array holds which variable.
Without the '-t' option, convertROOTtoNumpy.py will automatically find a tree in the ROOT Ntuple.
Without the '-b' option, all branches in the tree will be converted.
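To illustrate how the two output files fit together, the sketch below writes a small stand-in array and name-to-index mapping to a temporary directory and reads them back. The file names, the stand-in values, and in particular the assumption that the pickle holds a dict mapping each variable name to its column index are illustrative only; check the files actually produced by convertROOTtoNumpy.py.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in data: 4 events with 3 variables each.
# The real files would come from convertROOTtoNumpy.py.
values = np.arange(12, dtype=float).reshape(4, 3)
# ASSUMPTION: the pickle stores a dict of {variable name: column index}.
columns = {"TEST_val1": 0, "TEST_val2": 1, "TEST_val3": 2}

workdir = tempfile.mkdtemp()
npy_path = os.path.join(workdir, "Signal.npy")
pkl_path = os.path.join(workdir, "Signal.pkl")
np.save(npy_path, values)
with open(pkl_path, "wb") as handle:
    pickle.dump(columns, handle)

# Read both files back and select one variable by name.
loaded = np.load(npy_path)
with open(pkl_path, "rb") as handle:
    name_to_index = pickle.load(handle)

val2 = loaded[:, name_to_index["TEST_val2"]]
print(loaded.shape)
print(val2)
```

If the real pickle turns out to use a different layout (for example a list of names), only the lookup line needs to change.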
usage: Train.py [-h] [-b B] [-f F] [-a A [A ...]] [-p P] [-w W] [-v V [V ...]] [-e E] [-r R] I
positional arguments:
I 'I'nput (signal) numpy file path
optional arguments:
-h, --help show this help message and exit
-b 'B'ackground numpy file path
-f boolean 'F'lag variable for signal & background (signal=true)
-a NN 'A'rchitecture (default=[50,10])
-p 'P'ersent of validation sample (default=25.0)
-w 'W'eight file path (default='./TrainResult/')
-v 'V'ariable list (default=all variables)
-e 'E'poch (default=100)
-r 'R'andom seed number (default=11111111)
ex)
python Train.py samples/Signal.npy -b samples/Background.npy -a 20 10 -p 40.0 -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3
or
python Train.py samples/Allsample.npy -f isSignal -a 20 10 -p 40.0 -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3
Train.py learns the characteristics of the signal (the events we want) and the background (the events we don't want). As the examples show, the input can be either 2 NumPy files, one for signal and one for background, or 1 NumPy file with a boolean flag variable that is True for signal.
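The two input modes lead to the same kind of training data: a feature matrix plus a signal/background label per event. The following is a minimal sketch of that idea, not the code in Train.py. The function names and the label layout (column 0 = background, column 1 = signal) are assumptions made for this example.

```python
import numpy as np


def one_hot(is_signal):
    """One-hot labels; column 0 = background, column 1 = signal (arbitrary choice)."""
    labels = np.zeros((len(is_signal), 2))
    labels[np.arange(len(is_signal)), is_signal.astype(int)] = 1.0
    return labels


def from_two_files(signal, background):
    """Mode 1: separate signal and background arrays."""
    features = np.concatenate([signal, background], axis=0)
    is_signal = np.concatenate([np.ones(len(signal), dtype=bool),
                                np.zeros(len(background), dtype=bool)])
    return features, one_hot(is_signal)


def from_flag_column(events, flag_index):
    """Mode 2: one array with a boolean flag column (signal=True)."""
    is_signal = events[:, flag_index].astype(bool)
    features = np.delete(events, flag_index, axis=1)
    return features, one_hot(is_signal)


# Stand-in arrays; real ones would be loaded with np.load().
signal = np.array([[1.0, 2.0], [3.0, 4.0]])
background = np.array([[5.0, 6.0]])
x, y = from_two_files(signal, background)

all_events = np.array([[1.0, 2.0, 1.0], [5.0, 6.0, 0.0]])
x_flag, y_flag = from_flag_column(all_events, flag_index=2)
print(x.shape, y.tolist())
print(x_flag.tolist(), y_flag.tolist())
```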
Train.py uses a Multi-Layer Perceptron (MLP) model. You can set the number of hidden layers and the number of nodes in each layer with the '-a' option. For example, '-a 20 10' means 2 layers with 20 nodes in the first layer and 10 nodes in the second.
Details of MLP Model
- Activation for hidden layer : ReLU
- Activation for output layer : softmax
- Loss function : categorical_crossentropy
- Optimizer : AdaGrad
- Metrics : Accuracy
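To make the listed ingredients concrete, here is a small NumPy illustration of an MLP forward pass with ReLU hidden layers and a softmax output, the categorical cross-entropy loss, and a textbook AdaGrad step. It is not the Keras code used by Train.py. The two output nodes (background, signal), the weight initialisation, and all function names are assumptions made for this sketch.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    # Subtract the row maximum for numerical stability.
    shifted = x - x.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def init_mlp(n_inputs, hidden=(20, 10), n_outputs=2, seed=11111111):
    """Create (weights, bias) pairs; hidden=(20, 10) corresponds to '-a 20 10'."""
    rng = np.random.RandomState(seed)
    sizes = [n_inputs] + list(hidden) + [n_outputs]
    return [(rng.normal(scale=0.1, size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]


def forward(layers, x):
    """ReLU on every hidden layer, softmax on the output layer."""
    for weights, bias in layers[:-1]:
        x = relu(np.dot(x, weights) + bias)
    weights, bias = layers[-1]
    return softmax(np.dot(x, weights) + bias)


def categorical_crossentropy(probabilities, one_hot_labels, eps=1e-12):
    return float(-np.mean(np.sum(one_hot_labels * np.log(probabilities + eps), axis=1)))


def adagrad_update(parameter, gradient, cache, learning_rate=0.01, eps=1e-7):
    """Textbook AdaGrad: divide each step by the root of the summed squared gradients."""
    cache = cache + gradient ** 2
    parameter = parameter - learning_rate * gradient / (np.sqrt(cache) + eps)
    return parameter, cache


layers = init_mlp(n_inputs=3)
probabilities = forward(layers, np.ones((5, 3)))
print([weights.shape for weights, _ in layers])
print(probabilities.sum(axis=1))
```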
Before training starts, Train.py divides the samples into a test sample and a validation sample. The test sample is used for training, and the validation sample is used to check for over-training. The '-p' option sets the percentage of events that go into the validation sample. Based on the validation sample's accuracy, training is stopped automatically to prevent over-training.
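The split and the automatic stop can be sketched as follows. This is only an illustration of the idea: the shuffling method, the stopping rule, and the `patience` parameter are assumptions for this example, and Train.py may use different criteria.

```python
import random


def split_indices(n_events, validation_percent=25.0, seed=11111111):
    """Shuffle event indices and reserve a percentage of them for validation."""
    indices = list(range(n_events))
    random.Random(seed).shuffle(indices)
    n_validation = int(round(n_events * validation_percent / 100.0))
    # First part of the shuffled list is validation, the rest is used for training.
    return indices[n_validation:], indices[:n_validation]


def stopping_epoch(validation_accuracy, patience=3):
    """Return the 0-based epoch at which training would stop.

    Training stops once the validation accuracy has not improved for
    `patience` consecutive epochs (hypothetical rule for this sketch).
    If that never happens, the last epoch is returned.
    """
    best = float("-inf")
    epochs_without_improvement = 0
    for epoch, accuracy in enumerate(validation_accuracy):
        if accuracy > best:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(validation_accuracy) - 1


train_idx, validation_idx = split_indices(100, validation_percent=40.0)
print(len(train_idx), len(validation_idx))
print(stopping_epoch([0.60, 0.70, 0.75, 0.74, 0.75, 0.73, 0.72]))
```

Using a fixed seed, as with the '-r' option, makes the same events land in the validation sample on every run.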
After training, Train.py will give you 3 plots and a trained weight file.
With the Loss plot, you can check the loss function values of the test sample and the validation sample.
With the Over-Train Check plot, you can compare the DNN discriminator values of each sample.
With the Receiver Operating Characteristic (ROC) Curve plot, you can check the signal efficiency and the background rejection rate. The numbers on the plot give the rates at the points where the signal efficiency or the background rejection rate is 90 % and 95 %.
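As a worked example of how such a number can be obtained, the sketch below computes the background rejection rate at a chosen signal efficiency from two lists of discriminator values. The function name, the cut convention (an event passes if its value is at or above the threshold), and the stand-in scores are assumptions for this example, not the procedure used by Train.py.

```python
import math


def background_rejection_at(signal_scores, background_scores, signal_efficiency):
    """Background rejection at the cut that keeps the requested signal fraction."""
    ranked = sorted(signal_scores, reverse=True)
    # Number of signal events that must pass the cut (small offset guards
    # against floating-point round-off before taking the ceiling).
    n_keep = max(int(math.ceil(signal_efficiency * len(ranked) - 1e-9)), 1)
    threshold = ranked[n_keep - 1]
    rejected = sum(1 for score in background_scores if score < threshold)
    return rejected / float(len(background_scores))


# Stand-in discriminator values for 10 signal and 10 background events.
signal_scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.30]
background_scores = [0.10, 0.20, 0.25, 0.35, 0.40, 0.50, 0.58, 0.62, 0.70, 0.20]

rejection_90 = background_rejection_at(signal_scores, background_scores, 0.90)
rejection_100 = background_rejection_at(signal_scores, background_scores, 1.00)
print(rejection_90, rejection_100)
```

Swapping the roles of the two lists (and the direction of the cut) gives the signal efficiency at a fixed background rejection in the same way.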
usage: Apply.py [-h] [-o O] [-w W] [-t T] I
positional arguments:
I 'I'nput root file or path
optional arguments:
-h, --help show this help message and exit
-o 'O'uput file path (default: ./Output/)
-w 'W'eight file path (default: ./TrainResult/)
-t 'T'ree name (default: None)
ex)
python Apply.py samples/Signal.root -o TEST_output -w TEST_weight -t TEST_tree
With the trained weight file from Train.py, Apply.py runs the last phase of the DNN test.
Based on the training result, Apply.py writes a ROOT Ntuple file with the same content as your input plus the DNN test result, which is stored in the variable 'DNNValue' in the output ROOT file.
usage: runDNN.py [-h] [-i I] [-b B] [-f F] [-t T] [-v V [V ...]] [-o O] [-w W] [-a A [A ...]] [-p P] [-e E] [-r R]
Run without 'input root file (-i option)' will read arguments in script
optional arguments:
-h, --help show this help message and exit
-i 'I'nput (signal) root file path
-b 'B'ackground root file path
-f boolean 'F'lag variable for signal & background (signal=true)
-t 'T'ree name
-v 'V'ariable list (default=all variables)
-o 'O'uput file path (default='./Output/')
-w 'W'eight file path (default='./TrainResult/')
-a NN 'A'rchitecture (default=[50,10])
-p 'P'ersent of validation sample (default=25.0)
-e 'E'poch (default=100)
-r 'R'andom seed number (default=11111111)
ex) with ‘-i’ option
python runDNN.py -i samples/Signal.root -b samples/Background.root -a 20 10 -p 40.0 -o TEST_output -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3
without ‘-i’ option
python runDNN.py
runDNN.py lets you run all of the test steps above at once. It runs convertROOTtoNumpy.py, Train.py, and Apply.py sequentially.
As the example with the '-i' option shows, you have to give all options at once.
Without options, runDNN.py will read the options from the TestDNN function inside runDNN.py.
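For orientation, the three manual steps that runDNN.py chains together can be written out as command lines. The sketch below only builds and prints those commands using the options documented above; it does not run them and is not how runDNN.py is implemented. The helper name and the assumption that each .npy file sits next to its .root file with the same base name are hypothetical.

```python
import os


def build_commands(signal_root, background_root, tree, variables,
                   architecture=(50, 10), validation_percent=25.0,
                   weight_path="./TrainResult/", output_path="./Output/"):
    """Return the convert, train, and apply commands as argument lists."""
    branches = list(variables)
    commands = []
    # Step 1: convert each ROOT Ntuple file to a NumPy array file.
    for root_file in (signal_root, background_root):
        commands.append(["python", "convertROOTtoNumpy.py", root_file,
                         "-t", tree, "-b"] + branches)
    # ASSUMPTION: the .npy file is written next to the .root file.
    signal_npy = os.path.splitext(signal_root)[0] + ".npy"
    background_npy = os.path.splitext(background_root)[0] + ".npy"
    # Step 2: train on the converted files.
    commands.append(["python", "Train.py", signal_npy, "-b", background_npy, "-a"]
                    + [str(n) for n in architecture]
                    + ["-p", str(validation_percent), "-w", weight_path, "-v"]
                    + branches)
    # Step 3: apply the trained weights to the signal ROOT file.
    commands.append(["python", "Apply.py", signal_root,
                     "-o", output_path, "-w", weight_path, "-t", tree])
    return commands


commands = build_commands(
    "samples/Signal.root", "samples/Background.root", "TEST_tree",
    ["TEST_val1", "TEST_val2", "TEST_val3"],
    architecture=(20, 10), validation_percent=40.0,
    weight_path="TEST_weight", output_path="TEST_output")
for command in commands:
    print(" ".join(command))
```

Each printed line matches one of the manual examples above, so you can also run the steps one by one when you want to inspect the intermediate files.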


