JaehoonLim/SimpleDNNTest

SimpleDNNTest

This package helps users of ROOT Ntuples run a DNN test with Keras.
README : https://jaehoonlim.github.io/SimpleDNNTest

Acknowledgement

This work was supported by the Global Science experimental Data hub Center (GSDC) at the Korea Institute of Science and Technology Information (KISTI).

0. Requirement

  • ROOT >= 5.34 (including ROOT 6)
  • Python >= 2.7 (including Python 3)
  • TensorFlow >= 1.4.0 rc0

h5py and matplotlib are also required.

1. convertROOTtoNumpy.py

usage: convertROOTtoNumpy.py [-h] [-t T] [-b B [B ...]] I  
  
positional arguments:  
   I           'I'nput root file or path  
  
optional arguments:  
  -h, --help    show this help message and exit  
  -t           'T'ree name  
  -b           'B'ranch name  

ex)

python mkSampleRootFile.py  
python convertROOTtoNumpy.py samples/Signal.root -t TEST_tree -b TEST_val1 TEST_val2 TEST_val3  

TensorFlow cannot read ROOT Ntuple files directly, so the first step is to convert the ROOT Ntuple file to a NumPy array file.

convertROOTtoNumpy.py takes a ROOT Ntuple file as input. When it finishes, it produces two kinds of output files. The first, a NumPy array file (.npy), contains the variables you selected from the ROOT Ntuple for TensorFlow. The second, a Python pickle file (.pkl), contains the variable names with their column indices. You can use this pickle file to check which variable each column of the NumPy array holds.

Without the '-t' option, convertROOTtoNumpy.py automatically finds a tree in the ROOT Ntuple.
Without the '-b' option, all branches in the tree are converted.
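The two output files can be read back with NumPy and pickle. The sketch below is only an illustration: it writes small stand-in files first so that it runs on its own, and it assumes the pickle holds a dict that maps each variable name to its column index. The real layout of the .pkl file may differ, so inspect it before relying on this.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in files so the sketch is self-contained; in practice these
# would be the outputs of convertROOTtoNumpy.py.
# ASSUMPTION: the pickle stores a dict {variable name: column index}.
workdir = tempfile.mkdtemp()
npy_path = os.path.join(workdir, "Signal.npy")
pkl_path = os.path.join(workdir, "Signal.pkl")

toy_array = np.arange(12, dtype=np.float32).reshape(4, 3)  # 4 events, 3 variables
toy_columns = {"TEST_val1": 0, "TEST_val2": 1, "TEST_val3": 2}
np.save(npy_path, toy_array)
with open(pkl_path, "wb") as handle:
    pickle.dump(toy_columns, handle)

# Load the array and the name-to-column map, then select one variable by name.
events = np.load(npy_path)
with open(pkl_path, "rb") as handle:
    columns = pickle.load(handle)

val2 = events[:, columns["TEST_val2"]]
print(events.shape)
print(val2)
```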

2. Train.py

usage: Train.py [-h] [-b B] [-f F] [-a A [A ...]] [-p P] [-w W] [-v V [V ...]] [-e E] [-r R] I  
  
positional arguments:  
   I           'I'nput (signal) numpy file path  
  
optional arguments:  
  -h, --help   show this help message and exit  
  -b           'B'ackground numpy file path  
  -f           boolean 'F'lag variable for signal & background (signal=true)  
  -a           NN 'A'rchitecture (default=[50,10])  
  -p           'P'ersent of validation sample (default=25.0)  
  -w           'W'eight file path (default='./TrainResult/')  
  -v           'V'ariable list (default=all variables)  
  -e           'E'poch (default=100)  
  -r           'R'andom seed number (default=11111111)  

ex)

python Train.py samples/Signal.npy -b samples/Background.npy -a 20 10 -p 40.0 -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3  

or

python Train.py samples/Allsample.npy -f isSignal -a 20 10 -p 40.0 -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3  

Train.py learns the characteristics of the signal (what we want) and the background (what we do not want). As the examples show, the input can be either two NumPy files, one for signal and one for background, or a single NumPy file with a boolean flag variable that is True for signal.

Train.py uses a Multi-Layer Perceptron (MLP) model. The '-a' option sets the number of hidden layers and the number of nodes in each layer. For example, '-a 20 10' means two layers with 20 nodes in the first layer and 10 nodes in the second.
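Both input modes end up as the same thing: a feature matrix and a signal/background label per event. The NumPy sketch below shows one way to build them. The function names are hypothetical and are not taken from Train.py, and placing signal in class 1 of the one-hot labels is an assumption.

```python
import numpy as np


def from_two_files(signal, background):
    """Stack signal and background events; label signal 1 and background 0."""
    features = np.concatenate([signal, background], axis=0)
    labels = np.concatenate([
        np.ones(len(signal), dtype=int),
        np.zeros(len(background), dtype=int),
    ])
    return features, labels


def from_flag_column(events, flag_index):
    """Use a boolean flag column (signal = True) as the label and drop it
    from the features."""
    labels = events[:, flag_index].astype(bool).astype(int)
    features = np.delete(events, flag_index, axis=1)
    return features, labels


def one_hot(labels, n_classes=2):
    """Turn integer labels into one-hot rows, as a softmax output expects."""
    return np.eye(n_classes)[labels]


# Two-file mode: 2 signal events and 1 background event, 2 variables each.
signal = np.array([[1.0, 2.0], [3.0, 4.0]])
background = np.array([[5.0, 6.0]])
x_two, y_two = from_two_files(signal, background)

# Single-file mode: the last column plays the role of an 'isSignal' flag.
all_events = np.array([[1.0, 2.0, 1.0], [5.0, 6.0, 0.0]])
x_flag, y_flag = from_flag_column(all_events, flag_index=2)
y_flag_one_hot = one_hot(y_flag)
```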

Details of MLP Model

  • Activation for hidden layer : ReLU
  • Activation for output layer : softmax
  • Loss function : categorical_crossentropy
  • Optimizer : AdaGrad
  • Metrics : Accuracy
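To make the model details above concrete, here is a NumPy-only sketch of the forward pass of such an MLP: ReLU hidden layers, a softmax output, and the categorical cross-entropy loss. It is not the Keras code used by Train.py. The weight initialisation, the two-class output, and all function names are assumptions for illustration, and the AdaGrad update step is left out.

```python
import numpy as np


def init_mlp(n_inputs, hidden_nodes, n_classes=2, seed=11111111):
    """Create (weights, bias) pairs for the layer sizes given by '-a'."""
    rng = np.random.default_rng(seed)
    sizes = [n_inputs] + list(hidden_nodes) + [n_classes]
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(params, x):
    """ReLU on every hidden layer, softmax on the output layer."""
    for weights, bias in params[:-1]:
        x = np.maximum(0.0, x @ weights + bias)
    weights, bias = params[-1]
    logits = x @ weights + bias
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def categorical_crossentropy(probs, one_hot_labels):
    """Mean cross-entropy between predicted probabilities and one-hot labels."""
    return float(-np.mean(np.sum(one_hot_labels * np.log(probs + 1e-12), axis=1)))


# '-a 20 10' with 3 input variables: layer shapes (3, 20), (20, 10), (10, 2).
params = init_mlp(n_inputs=3, hidden_nodes=[20, 10])
layer_shapes = [weights.shape for weights, _ in params]

probs = forward(params, np.ones((5, 3)))
labels = np.eye(2)[[1, 0, 1, 0, 1]]
loss = categorical_crossentropy(probs, labels)
```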

Before training starts, Train.py divides the samples into a test sample and a validation sample. The test sample is used for training, and the validation sample is used to check for over-training. Based on the validation sample's accuracy, training is stopped automatically to prevent over-training.

After training, Train.py produces three plots and a train weight file.

With the Loss plot, you can check the loss function values for the test sample and the validation sample.

With the Over-Train Check plot, you can check the DNN discriminator values of each sample.

With the Receiver Operating Characteristic (ROC) Curve plot, you can check the signal efficiency and the background rejection rate. Each number on the plot is the rate at the point where the signal efficiency or the background rejection rate is 90 % or 95 %.
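The split and the automatic stop can be sketched as follows. This is an illustration under assumptions, not the code in Train.py: the shuffling scheme, the function names, and the `patience` parameter (how many epochs without improvement are tolerated) are all hypothetical. Only the validation percentage ('-p') and the random seed ('-r') correspond to documented options.

```python
import numpy as np


def split_samples(features, labels, validation_percent=25.0, seed=11111111):
    """Shuffle the events and hold out a percentage of them for validation."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(features))
    n_val = int(round(len(features) * validation_percent / 100.0))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (features[train_idx], labels[train_idx]), (features[val_idx], labels[val_idx])


def should_stop(val_accuracy_history, patience=5):
    """Return True when validation accuracy has not improved during the
    last `patience` epochs (hypothetical stopping rule)."""
    if len(val_accuracy_history) <= patience:
        return False
    best_before = max(val_accuracy_history[:-patience])
    return max(val_accuracy_history[-patience:]) <= best_before


# 100 toy events with 2 variables, split as with '-p 40.0'.
features = np.arange(200).reshape(100, 2)
labels = np.arange(100) % 2
(train_x, train_y), (val_x, val_y) = split_samples(features, labels, validation_percent=40.0)

stalled = should_stop([0.5, 0.6, 0.7, 0.7, 0.69, 0.7, 0.68, 0.7])
improving = should_stop([0.5, 0.6, 0.7, 0.8, 0.9, 0.95])
```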
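The numbers quoted on a ROC plot can be reproduced from the discriminator values alone. The sketch below scans every observed score as a threshold and reads off the background rejection at a target signal efficiency, and the signal efficiency at a target background rejection. The function names and the threshold convention (an event passes when its score is at or above the threshold) are assumptions, not taken from Train.py.

```python
import numpy as np


def roc_points(signal_scores, background_scores):
    """Signal efficiency and background rejection at every observed score."""
    thresholds = np.sort(np.concatenate([signal_scores, background_scores]))
    sig_eff = np.array([(signal_scores >= t).mean() for t in thresholds])
    bkg_rej = np.array([(background_scores < t).mean() for t in thresholds])
    return thresholds, sig_eff, bkg_rej


def rejection_at_efficiency(signal_scores, background_scores, target_eff):
    """Best background rejection with signal efficiency >= target_eff."""
    _, sig_eff, bkg_rej = roc_points(signal_scores, background_scores)
    mask = sig_eff >= target_eff
    return float(bkg_rej[mask].max()) if mask.any() else 0.0


def efficiency_at_rejection(signal_scores, background_scores, target_rej):
    """Best signal efficiency with background rejection >= target_rej."""
    _, sig_eff, bkg_rej = roc_points(signal_scores, background_scores)
    mask = bkg_rej >= target_rej
    return float(sig_eff[mask].max()) if mask.any() else 0.0


# Toy discriminator values: signal tends to score high, background low.
signal_scores = np.array([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0])
background_scores = np.array([0.05, 0.15, 0.25, 0.35, 0.95])

rej_90 = rejection_at_efficiency(signal_scores, background_scores, 0.9)
eff_80 = efficiency_at_rejection(signal_scores, background_scores, 0.8)
```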

3. Apply.py

usage: Apply.py [-h] [-o O] [-w W] [-t T] I  
  
positional arguments:  
   I         'I'nput root file or path  
  
optional arguments:  
  -h, --help show this help message and exit  
  -o         'O'uput file path (default: ./Output/)  
  -w         'W'eight file path (default: ./TrainResult/)  
  -t         'T'ree name (default: None)  

ex)

python Apply.py samples/Signal.root -o TEST_output -w TEST_weight -t TEST_tree  

Using the train weight file from Train.py, Apply.py runs the last phase of the DNN test.

Based on the training result, Apply.py produces a ROOT Ntuple file that has the same content as your input plus the DNN test result. The result is stored in the variable 'DNNValue' in the output ROOT file.

4. runDNN.py

usage: runDNN.py [-h] [-i I] [-b B] [-f F] [-t T] [-v V [V ...]] [-o O] [-w W] [-a A [A ...]] [-p P] [-e E] [-r R]  
  
Run without 'input root file (-i option)' will read arguments in script  
  
optional arguments:  
  -h, --help   show this help message and exit  
  -i           'I'nput (signal) root file path   
  -b           'B'ackground root file path  
  -f           boolean 'F'lag variable for signal & background (signal=true)  
  -t           'T'ree name  
  -v           'V'ariable list (default=all variables)  
  -o           'O'uput file path (default='./Output/')  
  -w           'W'eight file path (default='./TrainResult/')  
  -a           NN 'A'rchitecture (default=[50,10])  
  -p           'P'ersent of validation sample (default=25.0)  
  -e           'E'poch (default=100)  
  -r           'R'andom seed number (default=11111111)  

ex) with ‘-i’ option

python runDNN.py -i samples/Signal.root -b samples/Background.root -a 20 10 -p 40.0 -o TEST_output -w TEST_weight -v TEST_val1 TEST_val2 TEST_val3  

without ‘-i’ option

python runDNN.py  

runDNN.py runs all of the test steps above at once: it runs convertROOTtoNumpy.py, Train.py, and Apply.py sequentially.

As the example with the '-i' option shows, you have to give all options at once.
Without any option, runDNN.py reads the options from the TestDNN function inside runDNN.py.
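For reference, the same chain can be spelled out by hand. The sketch below only builds the command lines for the three documented scripts and does not run them. It is not how runDNN.py is implemented. The helper name is hypothetical, the .npy file is assumed to be written next to the .root input with the same base name, and Apply.py is applied to the signal file only, as in the Apply.py example.

```python
import os


def build_pipeline_commands(signal_root, background_root, tree=None, variables=(),
                            architecture=(50, 10), validation_percent=25.0,
                            weight_path="./TrainResult/", output_path="./Output/"):
    """Return the convert, train, and apply commands as argument lists."""

    def to_npy(path):
        # ASSUMPTION: Signal.root is converted to Signal.npy in the same folder.
        return os.path.splitext(path)[0] + ".npy"

    commands = []

    # Step 1: convert each ROOT Ntuple to a NumPy array file.
    for root_file in (signal_root, background_root):
        cmd = ["python", "convertROOTtoNumpy.py", root_file]
        if tree:
            cmd += ["-t", tree]
        if variables:
            cmd += ["-b"] + list(variables)
        commands.append(cmd)

    # Step 2: train on the converted signal and background files.
    train = ["python", "Train.py", to_npy(signal_root), "-b", to_npy(background_root)]
    train += ["-a"] + [str(nodes) for nodes in architecture]
    train += ["-p", str(validation_percent), "-w", weight_path]
    if variables:
        train += ["-v"] + list(variables)
    commands.append(train)

    # Step 3: apply the trained weights to the original ROOT file.
    apply_cmd = ["python", "Apply.py", signal_root, "-o", output_path, "-w", weight_path]
    if tree:
        apply_cmd += ["-t", tree]
    commands.append(apply_cmd)

    return commands


commands = build_pipeline_commands(
    "samples/Signal.root", "samples/Background.root",
    tree="TEST_tree", variables=["TEST_val1", "TEST_val2", "TEST_val3"],
    architecture=(20, 10), validation_percent=40.0,
    weight_path="TEST_weight", output_path="TEST_output",
)
for command in commands:
    print(" ".join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` in order, so that a failing step stops the chain.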
