Scikit-HEP project - welcome!

The Scikit-HEP project is a community-driven and community-oriented project with the aim of providing Particle Physics at large with an ecosystem for data analysis in Python. The project started in Autumn 2016 and is under active development.

The project is not just about providing core and common tools for the community. It also aims to improve the interoperability between high energy physics (HEP) tools and the scientific Python ecosystem, and to make utility packages and projects easier to discover.

As for its overall structure, the project should be seen as a toolset rather than a toolkit. It defines five pillars, intended to cover all major topics involved in a physicist's work. These are:

  • Datasets: data from various sources, such as ROOT, NumPy/pandas, and databases, wrapped in a common interface.
  • Aggregations: e.g. histograms that summarize or project a dataset.
  • Modeling: data models and fitting utilities.
  • Simulation: wrappers for Monte Carlo engines and other generators of simulated data.
  • Visualization: interfaces to graphics engines, including ROOT, Matplotlib, and others.
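
To make the "Aggregations" pillar concrete, here is a minimal sketch of a histogram summarizing a dataset. It uses only the Python standard library and none of the toolset packages; the function name `fill_histogram` and the toy values are hypothetical, chosen for illustration only.

```python
def fill_histogram(values, n_bins, low, high):
    """Count how many values fall into each of n_bins equal-width bins on [low, high)."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for value in values:
        if low <= value < high:
            index = int((value - low) / width)
            # Guard against floating-point round-up near the upper edge.
            counts[min(index, n_bins - 1)] += 1
    return counts


# Hypothetical toy dataset; values outside [0, 10) are ignored.
dataset = [0.5, 1.5, 2.5, 3.0, 7.2, 9.9, 10.0, -1.0]
counts = fill_histogram(dataset, n_bins=5, low=0.0, high=10.0)
print(counts)
```

The eight input values are reduced to five bin counts, which is the sense in which an aggregation summarizes a dataset. The histogramming packages listed below provide far more complete and faster implementations of this idea.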

Toolset packages

To get started, have a look at our GitHub repository.

The currently available packages are listed below, each with a very short description of its goals:

Interoperability and data manipulation:

  • formulate: Easy conversions between different styles of expressions.
  • root_numpy: Interface between ROOT and NumPy.
  • uproot: Minimalist ROOT I/O in pure Python and NumPy.
  • root_pandas: Module for conveniently loading/saving ROOT files as pandas DataFrames.
  • uproot-methods: Pythonic behaviours for non-I/O related ROOT classes.

Interface to HEP libraries:

  • numpythia: Interface between Pythia and NumPy.
  • pyjet: Interface between FastJet and NumPy.

Event processing:

  • awkward-array: Manipulate arrays of complex data structures as easily as NumPy.

Particles and decays:

  • DecayLanguage: Describe and convert particle decays between digital representations.
  • Particle: PDG particle data and identification codes.

Histogramming:

  • histbook: Versatile, high-performance histogram toolkit for NumPy.
  • boost-histogram: Python bindings for the C++14 Boost::Histogram library.

Fitting:

  • probfit: Cost function builder for fitting distributions.
  • iminuit: MINUIT from Python - Fitting like a boss.

Simulation:

  • pyhepmc: Next generation Python bindings for HepMC3.

Machine Learning:

  • NNDrone: Collection of tools and algorithms to enable conversion of HEP ML to mass usage model.

Visualization:

  • vegascope: View Vega/Vega-Lite plots in your web browser from local or remote Python processes.

Units and constants:

  • hepunits: Units and constants in the HEP system of units.
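
As a rough illustration of what "the HEP system of units" means, the sketch below assumes the usual convention in which MeV, millimetre and nanosecond are base units with value 1, and other units are derived from them. All names are local variables defined here for illustration; nothing is imported from hepunits, and this is not a description of its API.

```python
# Base units of the HEP system are set to 1 (assumed convention).
MeV = 1.0
mm = 1.0
ns = 1.0

# Derived units are expressed in terms of the base units.
GeV = 1.0e3 * MeV
m = 1.0e3 * mm
s = 1.0e9 * ns

# Speed of light, 299 792 458 m/s, expressed in mm/ns.
c_light = 299_792_458 * m / s

# Multiply by a unit to attach it, divide by a unit to read a value back.
beam_energy = 6.5e3 * GeV
beam_energy_in_gev = beam_energy / GeV

print(c_light, beam_energy_in_gev)
```

Multiplying by a unit when storing a quantity and dividing by a unit when reading it back keeps every stored number in the same internal system, whatever units appear in the input.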

Miscellaneous:

  • scikit-hep-testdata: Common package to provide example files (e.g., ROOT) for testing and developing packages against.
  • scikit-hep: Toolset of interfaces and tools for Particle Physics, planned to become a metapackage.

Some of these packages serve as bridges between different technologies and/or popular packages from the Python scientific software stack.