jpata / api
Description and discussion of the HEP API
How should we represent ROOT objects in Julia?
I quite like the approach of just using the bare Cxx types https://github.com/JuliaHEP/ROOTFramework.jl/blob/master/src/tdirectory.jl#L38
TFile() = @cxx TFile()
Pros:
Cons:
How to best access (r/w) a TTree? There are numerous proposed models out there, and we should learn from them. I would adopt something that's very close to raw ROOT, and then add something on top with Julian AbstractDataFrame semantics.
via __getattr__:
objs = tree.myBranch1
print objs
The object schema is generated on the fly, i.e. if a branch contains a complex class (std::vector, pat::Electron), it will be loaded with Cling.
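A minimal sketch of the `__getattr__` access pattern described above, with no ROOT dependency. `LazyTree`, `get_entry`, and the `reads` counter are hypothetical names invented for illustration; a real implementation would call into ROOT instead of indexing a Python list.

```python
class LazyTree:
    """Toy stand-in for a TTree: a branch is read only when it is accessed."""

    def __init__(self, columns):
        # columns: dict mapping branch name -> list of per-entry values
        self._columns = columns
        self._entry = 0
        self._cache = {}
        self.reads = 0  # counts simulated branch reads

    def get_entry(self, i):
        # Moving to another entry invalidates the values loaded so far.
        self._entry = i
        self._cache.clear()

    def __getattr__(self, name):
        # Python calls this only when normal lookup fails, i.e. for branch names.
        columns = self.__dict__.get("_columns", {})
        if name not in columns:
            raise AttributeError(name)
        cache = self.__dict__["_cache"]
        if name not in cache:
            self.reads += 1  # a real tree would read the branch from disk here
            cache[name] = columns[name][self._entry]
        return cache[name]


tree = LazyTree({"myBranch1": [1.5, 2.5], "myBranch2": [10, 20]})
tree.get_entry(1)
objs = tree.myBranch1   # first access reads the branch
objs = tree.myBranch1   # second access is served from the cache
print(objs, tree.reads)
```

Only the branches that are actually touched get read, which is the main appeal of attribute-style access for wide trees.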
http://www.rootpy.org/auto_examples/tree/model_simple.html
tree.myBranch1 only loads the branch using TBranch.GetEntry, and only once
tree.__setattr__, fill row-by-row as usual in ROOT
https://github.com/cbernet/heppy
tree.myBranch1
AutoFillTreeProducer, which knows how to translate a complex event model into a "flat ntuple" structure like
event.leptons = [Lepton(pt=120, eta=0.5, phi=0.2, mass=12), ...]
=>
tree.nleptons # ::Int32, variable per row
tree.leptons_pt # (NTuple{NMAX, Float32}) with some predefined NMAX, each row has values up to tree.nleptons
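The flattening step above can be sketched in a few lines of plain Python. This is not heppy's implementation: `flatten_collection`, the `Lepton` dataclass, the padding value, and `NMAX = 8` are all assumptions made for illustration.

```python
from dataclasses import dataclass

NMAX = 8  # assumed fixed capacity per row


@dataclass
class Lepton:
    pt: float
    eta: float
    phi: float
    mass: float


def flatten_collection(name, objects, fields, nmax=NMAX, pad=0.0):
    """Turn a list of objects into a counter plus fixed-length columns."""
    kept = objects[:nmax]  # objects beyond nmax are dropped
    row = {"n" + name: len(kept)}
    for field in fields:
        values = [getattr(obj, field) for obj in kept]
        # Pad so that every row has exactly nmax slots per column.
        row[f"{name}_{field}"] = values + [pad] * (nmax - len(kept))
    return row


leptons = [Lepton(pt=120, eta=0.5, phi=0.2, mass=12)]
row = flatten_collection("leptons", leptons, ["pt", "eta"])
n = row["nleptons"]
print(n, row["leptons_pt"][:n])
```

Readers of such a row must slice each column with the counter, as in `row["leptons_pt"][:n]`, because the remaining slots only hold padding.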
Example of scheduling:
# example of how to save an object (with derived characteristics)
leptonTypeVHbb = NTupleObjectType("leptonTypeVHbb", baseObjectTypes = [ leptonType ],
    variables = [
        NTupleVariable("looseIdSusy", lambda x : x.looseIdSusy if hasattr(x, 'looseIdSusy') else -1, int, help="Loose ID for Susy ntuples (always true on selected leptons)"),
        NTupleVariable("looseIdPOG", lambda x : x.muonID("POG_ID_Loose") if abs(x.pdgId()) == 13 else -1, int, help="Loose ID for Susy ntuples (always true on selected leptons)"),
        ...
    ]
)
# putting it all together into a tree
treeProducer = cfg.Analyzer(
    class_object=AutoFillTreeProducer,
    defaultFloatType = "F",
    verbose=False,
    vectorTree = True,
    globalVariables = [
        NTupleVariable("puWeightUp", lambda ev : getattr(ev,"puWeightPlus",1.), help="Pileup up variation",mcOnly=True),
        NTupleVariable("puWeightDown", lambda ev : getattr(ev,"puWeightMinus",1.), help="Pileup down variation",mcOnly=True),
        ...
    ],
    globalObjects = {
        "met" : NTupleObject("met", metType, help="PF E_{T}^{miss}, after default type 1 corrections"),
        ....
    },
    collections = {
        "selectedLeptons" : NTupleCollection("selLeptons", leptonTypeVHbb, 8, help="Leptons after the preselection"),
        ...
    }
)
tdf[:myBranch1] => Vector{Float32} transforms to in-memory column
writetree(df::DataFrame)
Example of row-by-row access
df = TreeDataFrame(["file1.root"]; treename="tree")
for i=1:nrow(df)
    load_row(df, i) # load all branches using TTree::GetEntry
    n__jet = df.row.n__jet() # without load_row, only this branch would actually call TBranch::GetEntry(i - 1)
    jet__pt = df.row.jet__pt()[1:n__jet]
end
@oschulz I'm continuing the discussion on gitter here.
How to best write objects to TFile or TDirectory? ROOT has the concept of a current working directory, creating tons of confusion. I would propose not to deal with that part of the API at all (i.e. TObject::SetDirectory, TDirectory::Add), but rather make a sane version of it.
HDF5.jl might be an inspiration: https://github.com/JuliaIO/HDF5.jl#quickstart
h5write("/tmp/test2.h5", "mygroup2/A", A)
#or
h5open("mydata.h5", "w") do file
    write(file, "A", A) # alternatively, say "@write file A"
end
#or
using HDF5
h5open("test.h5", "w") do file
    g = g_create(file, "mygroup") # create a group
    g["dset1"] = 3.2 # create a scalar dataset inside the group
    attrs(g)["Description"] = "This group contains only a single dataset" # an attribute
end
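To make the proposal concrete, here is a rough sketch of a write API in which every object goes to an explicit path and no "current directory" state exists. It is written in Python against an in-memory stand-in; `ToyFile` and `open_toy_file` are hypothetical names, not part of ROOT or HDF5.jl.

```python
from contextlib import contextmanager


class ToyFile:
    """In-memory stand-in for a file with nested directories."""

    def __init__(self):
        self.root = {}

    def write(self, path, obj):
        # The destination is always given explicitly, e.g. "mygroup/hist".
        *dirs, key = path.strip("/").split("/")
        node = self.root
        for d in dirs:
            node = node.setdefault(d, {})  # create directories on demand
        node[key] = obj

    def read(self, path):
        node = self.root
        for part in path.strip("/").split("/"):
            node = node[part]
        return node


@contextmanager
def open_toy_file():
    f = ToyFile()
    yield f  # a real implementation would flush and close the file afterwards


with open_toy_file() as f:
    f.write("mygroup/hist", [1, 2, 3])
    f.write("top_level", 3.2)
    print(f.read("mygroup/hist"))
```

Because the target path is an argument of `write`, the object itself never needs to know which directory it belongs to, which is the part of ROOT's behavior the proposal wants to avoid.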