This repository contains the components of the Jet SSD study. Feel free to reproduce the results and report any issues. Check the slides here and read more here.
The environment is specified in the `misc/environment.yml` file. It can be set up like this:
conda env create -f misc/environment.yml
conda activate jetssd
The environment is quite large; depending on which part of the study you want to reproduce, some of the packages may be unnecessary.
Also download the color palette with `sh scripts/get-color-palette.sh`.
For the fastest and simplest setup, download files from Zenodo.
Depending on your setup, you may be able to access the files from CERN EOS; they are located in /eos/project/d/dshep/CEVA/. If these files are inaccessible, run your own ROOT simulation with the card from here.
To generate HDF5 files for all the available ROOT data, replace <src> with /eos/project/d/dshep/CEVA/ or with the destination directory from the previous step. First, run
mkdir data
python scripts/generate-configuration-file.py <src> data/file-configuration.json
to generate information about the file contents (files may be irregular if there were failures during simulation). The configuration file will be stored in the `data` folder.
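The exact schema of `file-configuration.json` depends on the generator script; purely as an illustration, here is a minimal sketch of loading and summarizing such a file (the keys and values shown are hypothetical, not the real output of `scripts/generate-configuration-file.py`):

```python
import json

# Hypothetical example of a file-configuration entry; the real schema
# is whatever scripts/generate-configuration-file.py produces.
sample = '{"RSGraviton_WW_NARROW": {"files": 2, "events": 2000}}'

config = json.loads(sample)
for dataset, info in config.items():
    # One-line summary per dataset directory.
    print(f"{dataset}: {info}")
```

In practice you would `json.load()` the generated `data/file-configuration.json` instead of the inline sample.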
To generate the HDF5 files, replace <src> with the source directory, e.g. /eos/project/d/dshep/CEVA/RSGraviton_WW_NARROW, and <dst> with the output directory. This line will output one file for each <src> directory.
python hdf5-generator.py <src> -o <dst> -v
Depending on your setup, you may be able to access the files from CERN EOS; they are located in /eos/project/d/dshep/CEVA-hdf5/mix. If not accessible, please run the optional steps above first.
To run, replace the `dataset_misc` section in `ssd-config.yml` with the proper source directory name (<dst> from the previous step or /eos/project/d/dshep/CEVA-hdf5) and the target sizes. Lastly, change the `dataset` section in `ssd-config.yml` to the proper target directory paths for the HDF5 files, e.g. replace `/path/to/train0` with `./foo/my-train-0.h5`. The following line will generate the target files.
python hdf5-generator-mix.py
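If you prefer to patch the placeholder paths in `ssd-config.yml` programmatically rather than by hand, a minimal sketch using plain string replacement (the placeholder `/path/to/train0` comes from the step above; the replacement filename is just an example, and the temp-dir copy is only so the sketch is self-contained — point it at your real config instead):

```python
import tempfile
from pathlib import Path

# Throwaway copy for illustration; use Path("ssd-config.yml") in practice.
config = Path(tempfile.mkdtemp()) / "ssd-config.yml"
config.write_text("dataset:\n  train: /path/to/train0\n")

# Map README placeholders to actual HDF5 paths; extend as needed.
replacements = {"/path/to/train0": "./foo/my-train-0.h5"}

text = config.read_text()
for placeholder, actual in replacements.items():
    text = text.replace(placeholder, actual)
config.write_text(text)
```

A plain text replace avoids pulling in a YAML parser and preserves the file's comments and formatting.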
Two notebooks help with understanding and verifying the data.
- `a-walk-through` gives a general introduction.
- `profile-dataset` shows dataset details.
Prerequisite:
- Change the `ssd-config.yml` file paths in the `dataset` section to point to your HDF5 files.
Run training with:
- `python jet-ssd-train.py <name_fpn> -v` for full precision,
- `python jet-ssd-train.py <name_int8> -m <path_fpn> -v8` for INT8 precision,
- `python jet-ssd-train.py <name_twn> -m <path_fpn> -vt` for ternary weights.
Run evaluation with (all plots will be stored in the `plots` directory):
python jet-ssd-eval.py <name_fpn> <name_twn> <name_int8> -v
Other notebooks:
- Verify inference by running the `ssd-inference` notebook.
- Check ternary filters by running the `show-filters` notebook.
Prerequisite:
- Change `learning_rate` and `max_epochs` in `ssd-config.yml`.
To run training with the regularizer, add the `-r` flag:
python jet-ssd-train.py <name_fpn> -rv -s <path_to_net-config> -m <path_to_previous_iteration>
Uncomment the target `max_channels` in `jet-ssd-prune.py` and run:
python jet-ssd-prune.py <name_fpn> -rv -s <path_to_net-config>

This will produce a new `net-config.yml` file and overwrite the input model. You may want to make copies of your models before running the pruning procedure.
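Since pruning overwrites the input model, backing the checkpoint up first is cheap insurance. A minimal sketch (the checkpoint path and naming scheme are hypothetical, and the temp dir is only so the sketch runs standalone):

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical checkpoint location; adjust to where your models live.
model = Path(tempfile.mkdtemp()) / "my-model.pth"
model.write_bytes(b"fake checkpoint contents")  # stand-in for a real file

# Keep a simple backup next to the original before pruning.
backup = model.with_name(model.name + ".bak")
shutil.copy2(model, backup)

print(backup.exists())  # True
```

`shutil.copy2` also preserves file metadata such as modification times, which helps keep track of which iteration a backup came from.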
- Convert the model to ONNX with `python jet-ssd-onnx-export.py <name> -v`.
- Measure Jet SSD inference time on CPUs and GPUs with `python jet-ssd-benchmark-inference.py <name> -b <batch_size> -v`. Add the `--onnx` flag for ONNX Runtime or `--trt` for TensorRT; without a flag, the tests will run with PyTorch.
- To get the final inference plot, change the data in `jet-ssd-inference.py` and run it.
All plots will be saved in the `plots` folder unless otherwise specified in `ssd-config.yml`.