straxen's Issues

Lone_hits sticking out of a chunk

When processing runs with the latest strax + straxen, I get these errors in 4/~450 runs (7769, 7770, 7775, 7795):

ValueError: Attempt to create chunk [007769.lone_hits: 1589849417sec 999999000 ns - 1589849423sec 499999000 ns, 912904 items, 6.6 MB/s] whose data ends late at 1589849423499999280
ValueError: Attempt to create chunk [007775.lone_hits: 1589870842sec 999999000 ns - 1589870848sec 499999000 ns, 922094 items, 6.7 MB/s] whose data ends late at 1589870848499999130

They appear at some seemingly random time in the run, and given how rare they are, this could be an edge case in one of the algorithms. It would be surprising though; in strax, hits shouldn't be able to extend beyond the record they are part of. (Peak(let)s could, but for peaks PulseProcessing explicitly clips them to chunk boundaries.)

Since three of the failing runs are close together, maybe there is simply something wrong with the records for these runs, and the error will disappear in future reprocessings?
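A hedged diagnostic sketch (not part of straxen) for inspecting which hits stick out, assuming the standard strax chunk/hit interface (chunk.data, chunk.end and the time/length/dt fields):

import numpy as np

def hits_ending_late(chunk):
    # chunk: a strax.Chunk holding lone_hits
    hits = chunk.data
    ends = hits['time'] + hits['length'] * hits['dt']
    late = ends > chunk.end
    if np.any(late):
        print(f"{late.sum()} lone_hits end after the chunk end ({chunk.end})")
    return hits[late]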

Lower number of cores and max messages while increasing timeout for failing runs

If bootstrax fails a run, it retries it several times. If it continues to fail for a given run, we change the target to raw_records to prevent repeatedly trying to use a broken event_basics. Additionally, we should give it more time and fewer cores/max mailbox messages (as that makes it more reliable).
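A minimal sketch of such an escalation; the parameter names and values here are illustrative, not the actual bootstrax configuration:

def retry_settings(n_failures, base_cores=8, base_messages=4, base_timeout=600):
    # Reduce parallelism and extend the timeout as a run keeps failing
    cores = max(1, base_cores // (n_failures + 1))
    max_messages = max(1, base_messages // (n_failures + 1))
    timeout = base_timeout * (n_failures + 1)
    # After a few failures, fall back to only storing raw_records
    target = 'raw_records' if n_failures >= 3 else 'event_info'
    return dict(cores=cores, max_messages=max_messages,
                timeout=timeout, target=target)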

Targeting in bootstrax for processing not working for records

Due to some of the updates on peak processing / clustering, targeting records (e.g. in bootstrax) raises an error (added below). I suspect that there is a cross-reference somewhere to a peaks-like structure, since processing up to raw_records or peaks does work; processing up to records, however, does not.

In summary, this works:

python bootstrax.py --target peaks --cores -1 --process 6331
python bootstrax.py --target raw_records --cores -1 --process 6331 

This does not work:

python bootstrax.py --target records --cores -1 --process 6331

The error:

(py36) xedaq@eb3:~/joran$ cat last_bootstrax_exception.txt
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/xedaq/miniconda/envs/py36/lib/python3.6/concurrent/futures/process.py", line 175, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
  File "/home/xedaq/miniconda/envs/py36/lib/python3.6/site-packages/npshmex.py", line 141, in shm_wrap_f
    result = f(*args, **kwargs)
  File "/home/xedaq/software/strax/strax/plugin.py", line 639, in do_compute
    return self._fix_output(results)
  File "/home/xedaq/software/strax/strax/plugin.py", line 312, in _fix_output
    self._check_dtype(result)
  File "/home/xedaq/software/strax/strax/plugin.py", line 276, in _check_dtype
    raise ValueError(f"Plugin {pname} expects {expect} as dtype??")
ValueError: Plugin ParallelSourcePlugin expects {'records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'diagnostic_records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'aqmon_records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'veto_regions': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Index of sample in record in which hit starts', 'left'), '<i2'), (('Index of first sample in record just beyond hit (exclusive bound)', 'right'), '<i2'), (('Internal (temporary) index of fragment in which hit was found', 'record_i'), '<i4')]), 'pulse_counts': dtype([(('Lowest start time observed in the chunk', 'time'), '<i8'), (('Highest endt ime observed in the chunk', 'endtime'), '<i8'), (('Number of pulses', 'pulse_count'), '<i8', (248,)), (('Number of lone pulses', 'lone_pulse_count'), '<i8', (248,)), (('Integral of all pulses in ADC_count x samples', 'pulse_area'), '<i8', (248,)), (('Integral of lone pulses in ADC_count x samples', 'lone_pulse_area'), '<i8', (248,))])} as dtype??
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "bootstrax.py", line 628, in run_strax
    max_workers=args.cores)
  File "/home/xedaq/software/strax/strax/context.py", line 885, in make
    save=save, max_workers=max_workers, **kwargs):
  File "/home/xedaq/software/strax/strax/context.py", line 811, in get_iter
    allow_rechunk=self.context_config['allow_rechunk']).iter():
  File "/home/xedaq/software/strax/strax/processor.py", line 254, in iter
    raise exc.with_traceback(traceback)
  File "/home/xedaq/software/strax/strax/processor.py", line 196, in iter
    yield from final_generator
  File "/home/xedaq/software/strax/strax/mailbox.py", line 316, in _read
    res = msg.result(timeout=self.timeout)
  File "/home/xedaq/miniconda/envs/py36/lib/python3.6/concurrent/futures/_base.py", line 432, in result
    return self.__get_result()
  File "/home/xedaq/miniconda/envs/py36/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
    raise self._exception
ValueError: Plugin ParallelSourcePlugin expects {'records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'diagnostic_records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'aqmon_records': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Length of pulse to which the record belongs (without zero-padding)', 'pulse_length'), '<i4'), (('Fragment number in the pulse', 'record_i'), '<i2'), (('Baseline in ADC counts. data = int(baseline) - data_orig', 'baseline'), '<f4'), (('Level of data reduction applied (strax.ReductionLevel enum)', 'reduction_level'), 'u1'), (('Waveform data in ADC counts above baseline', 'data'), '<i2', (110,))]), 'veto_regions': dtype([(('Channel/PMT number', 'channel'), '<i2'), (('Time resolution in ns', 'dt'), '<i2'), (('Start time of the interval (ns since unix epoch)', 'time'), '<i8'), (('Length of the interval in samples', 'length'), '<i4'), (('Integral in ADC x samples', 'area'), '<i4'), (('Index of sample in record in which hit starts', 'left'), '<i2'), (('Index of first sample in record just beyond hit (exclusive bound)', 'right'), '<i2'), (('Internal (temporary) index of fragment in which hit was found', 'record_i'), '<i4')]), 'pulse_counts': dtype([(('Lowest start time observed in the chunk', 'time'), '<i8'), (('Highest endt ime observed in the chunk', 'endtime'), '<i8'), (('Number of pulses', 'pulse_count'), '<i8', (248,)), (('Number of lone pulses', 'lone_pulse_count'), '<i8', (248,)), (('Integral of all pulses in ADC_count x samples', 'pulse_area'), '<i8', (248,)), (('Integral of lone pulses in ADC_count x samples', 'lone_pulse_area'), '<i8', (248,))])} as dtype??

Waveform display gives errors depending on seconds_range

You can get either:

  • ValueError: No results returned?
  • an error from array merging; it appears to be passed one array with data and two arrays with 0 elements and float dtype (probably an artifact from chunk_array)

Other mini-analyses, such as event_scatter, seem to be working fine. No doubt there is some underlying strax issue, perhaps related to AxFoundation/strax#181.

setup.py

Does it make sense to make this installable as a package?
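A minimal setup.py sketch of what that could look like; the version and dependency list below are placeholders, not the real requirements:

from setuptools import setup, find_packages

setup(
    name='straxen',
    version='0.0.1',                       # placeholder version
    description='Streaming analysis for XENON(nT)',
    packages=find_packages(),
    install_requires=['strax', 'numpy'],   # illustrative, not the full list
)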

Multitude of plugins: when to save intermediate plugins?

As in #174, there are downsides to having many plugins for processing (all of) our sub-detectors. With e.g. the high-energy #161, neutron-veto #86 and muon-veto #173 plugins, there is a trade-off between having many small intermediate steps between plugins (e.g. peaklets -> peaks), saving all of those, and the computational/storage effort of keeping all of that data somewhere.

We may need to, at some point, carefully think about which datatypes we want to store/compute. As a start, some plugins could be better off not being saved, e.g. by setting https://github.com/AxFoundation/strax/blob/master/strax/plugin.py#L24, or by merging them with another plugin.
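A minimal sketch of the "don't save this intermediate" option; the plugin name, dependency and dtype below are illustrative only:

import numpy as np
import strax

class IntermediateExample(strax.Plugin):
    provides = 'intermediate_example'
    depends_on = ('records',)
    dtype = strax.time_fields            # only time/endtime, for illustration
    save_when = strax.SaveWhen.NEVER     # never written to storage

    def compute(self, records):
        # Placeholder compute: return an empty array of the declared dtype
        return np.zeros(0, dtype=self.dtype)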

Rucio does not check options

Issue:

  • Loading data from the rucio catalogue does not check if the versions and options match the strax configuration.

More specifically, the function st.is_stored(run_id, data_type) will always return True if there is any data of that data_type and run in the rucio database. The lineage is not checked.

Example:

import straxen
st = straxen.contexts.xenonnt_online()
st.set_config(dict(s1_max_rise_time=50000,
                  peak_split_filter_wing_width=1000000))
st.is_stored('009149','peak_basics')
st.get_array('009149','peak_basics')

The data belonging to this configuration does not exist. However, since is_stored returns True anyway, get_array will retrieve data from rucio even though it was produced with different settings.
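A hedged sketch of what an option-aware check could compare, continuing the example above (assuming st.key_for and DataKey.lineage_hash behave as in current strax):

key = st.key_for('009149', 'peak_basics')
print(key.lineage_hash)
# is_stored should only return True if a rucio entry exists for this exact
# lineage hash, not merely for the (run_id, data_type) combination.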

Can change chunksize per plugin

Since AxFoundation/strax#277 it is possible to optimize the chunk size per plugin. That change raises the default to 200 MB (uncompressed), but we could also use a different value depending on the type of plugin.

For example, we can make a diagnostic plugin like pulse_counts very small.
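A sketch of what that could look like, assuming the per-plugin attribute introduced in AxFoundation/strax#277 is called chunk_target_size_mb; the plugin body is illustrative only:

import numpy as np
import strax

class PulseCountsExample(strax.Plugin):
    provides = 'pulse_counts_example'    # illustrative name
    depends_on = ('records',)
    dtype = strax.time_fields
    chunk_target_size_mb = 5             # much smaller than the 200 MB default

    def compute(self, records):
        return np.zeros(0, dtype=self.dtype)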

ZeroDivisionError in holoviews display

As mentioned in #48 (comment), you currently get a ZeroDivisionError when trying to use the holoviews display. We should post the full error trace here and then try to figure out what (upstream?) change caused this to appear.

Bootstrax targets for many detectors

When processing data we need to specify what our targets are (https://github.com/XENONnT/straxen/blob/master/bin/bootstrax#L1064).

Up to now our life has been easy: we only consider processing the TPC data and the targets involved therein. However, this changes as soon as we also include the high energy #161, neutron veto #86 and muon veto #173 plugins. What if, for some reason, one of these plugins does not process correctly or does not finish? Do we, as a last resort, not process any of the plugins? That does not seem preferable.

To this end I propose the following schema:

At first we start processing all of the (sub)detectors up to their latest plugins (note that these cannot be set in a single st.make call as they are of different datakinds). If this fails we change the targets (e.g. by replacing this line: https://github.com/XENONnT/straxen/blob/master/bin/bootstrax#L1174). The downside is that if any of these plugins has an error somewhere, it crashes bootstrax. A second try would lower the requirements to the TPC's latest plugin, e.g. providing event_info. Alternatively we might argue that live processing should only deal with the TPC and not care about the other sub-detectors (which is not preferred; think e.g. of raw_records_prenv).
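A sketch of what such a fallback schema could look like; the target names per detector are illustrative and would have to match the real straxen datatypes:

TARGET_FALLBACKS = [
    # First try: process every (sub)detector to its latest plugin
    # (one st.make call per datakind).
    dict(tpc='event_info', nv='events_nv', mv='peaks_mv', he='peaks_he'),
    # Second try: only require the TPC to reach its latest plugin.
    dict(tpc='event_info'),
    # Last resort: just make sure the raw data of all detectors is stored.
    dict(tpc='raw_records', nv='raw_records_nv', mv='raw_records_mv',
         he='raw_records_he'),
]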

We could also make eb2 process the data for the NV, MV and HE.
Vanilla solutions are obviously also an option (and probably preferred).

Data management on eventbuilders

Yesterday runs 9236, 9237 and 9238 were irrecoverably lost from the event builders. The big question is how we could at some point have lost the data stored on /data/xenonnt_processed, or how we somehow got past line 586 in ajax.

The reason turned out to be faulty logic in this line, which effectively said that data less than two hours old should be deleted.

RECONSTRUCTION
These are the events that happened (focusing on run 9236).

"2020-08-31T18:05:47.776Z" - Processing finished
Bootstrax successfully processed the run and then deleted the live_data at this time. This is only done after all the data has been successfully stored (there is a check in set_status_finished that makes sure the data has been written to disk). Furthermore, one can see from the deleted entries what we had saved (see image below).
[image]

"2020-08-31T19:.." - Ajax deletes the 'unregistered' data
The bug was here

7208899 MainThread root clean_unregistered::    found 398 runs stored on/data/xenonnt_processed/. Checking that each is in the runs-database
7209768 MainThread root remove_if_unregistered::        run 9236 is NOT registered in the runDB
7209768 MainThread root No data for 009236 found! Double checking /data/xenonnt_processed/!
7209769 MainThread root Cleaning /data/xenonnt_processed/009236-raw_records_nv-rfzvpzj4mf
7209770 MainThread root Cleaning /data/xenonnt_processed/009236-raw_records_aqmon-rfzvpzj4mf

"2020-08-31T20:18:53.934Z" - Ajax removes entries from runs-database
In the clean_database routine, ajax notices that this run is stored for >2 h and that processing has finished. For this we check if the data is actually stored on this host on line 586. The corresponding output from ajax is added below:

10812030 MainThread root Loop finished, take a 3600 s nap
14412139 MainThread root clean_unregistered::   found 396 runs stored on/data/xenonnt_processed/. Checking that each is in the runs-database
14412978 MainThread root clean_abandoned::      No more matches in rundoc
14413442 MainThread root clean_database::       delete entry of data from 9236 at /data/xenonnt_processed/009236-raw_records_aqmon-rfzvpzj4mf as it does not exist
14413442 MainThread root deleting /data/xenonnt_processed/009236-raw_records_aqmon-rfzvpzj4mf finished
14413442 MainThread root changing data field in rundoc
14413442 MainThread root update with {'host': 'eb5.xenon.local', 'type': 'raw_records_aqmon', 'file_count': 36, 'at': datetime.datetime(2020, 8, 31, 20, 18, 53, 934849, tzinfo=<UTC>), 'by': 'eb5.xenon.local.ajax'}
...

"2020-08-31T20:19:05" - Bootstrax notices that all processed data is now removed and fails the run
Please note that this further substantiates that the processing did occur as needed.

[image]

Processing muon veto data

At the moment we stop at raw_records_mv. If we want to reproduce the 1T-like muon veto, the bare minimum is to build some muon-veto regions within which timestamps are vetoed. This should be possible with the existing peak finding.

check readable folders for microstrax

If an NFS mount is not correctly configured, microstrax may not be able to read from the data directory where the latest data is. We just noticed that remounting the disk solved the issue.

When starting microstrax we should check that all registered folders are actually accessible to microstrax; otherwise it complains that the metadata is not available for the run in question.
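A minimal sketch of such a startup check; where the list of registered folders comes from is left open here:

import os

def check_readable(folders):
    for folder in folders:
        if not os.access(folder, os.R_OK):
            raise RuntimeError(
                f'{folder} is registered but not readable; '
                'check the NFS mount before starting microstrax.')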

dt field DAQreader

The proposed 'one DAQ, one DAQReader' solution for linked mode requires a change to the DAQReader. Currently the dt field is specified as an option, which won't work any longer.

Lone hit area only includes samples above threshold

In strax, the left and right boundaries of a hit are set by the region that actually crossed the threshold. To ensure we include the full area in integrations, we extend the boundary of peaks outwards by some amount.

For lone hits, we do not do this, but instead report the hit area directly. Thus, unless the pulse compression filter is activated, the lone hit areas will be biased downwards very significantly compared to actual 1 PE areas. With the pulse compression filter active, the lone hit area is instead biased slightly upwards, since the filter can cause 1PE pulses to become slightly negative around their maxima.

We could accept this, change the hit definition, or compute lone hit integrals with the left/right extension. The latter should probably be a separate function rather than changing the hitfinder, unless we want to apply the extensions without regard to neighboring peaks/hits or record breaks.
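A hedged sketch of the last option (computing lone hit integrals with a left/right extension, ignoring neighbouring peaks/hits); the extension lengths are illustrative and the extension is simply clipped to the containing record:

import numpy as np

def lone_hit_area_extended(lone_hits, records, left_ext=30, right_ext=200):
    areas = np.zeros(len(lone_hits), dtype=np.float32)
    for i, h in enumerate(lone_hits):
        r = records[h['record_i']]
        # Extend the hit window, clipped to the record boundaries
        left = max(0, h['left'] - left_ext)
        right = min(r['length'], h['right'] + right_ext)
        areas[i] = r['data'][left:right].sum()
    return areas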

Reduce file count in pulse_counts and veto_regions

Currently pulse_counts and veto_regions are not rechunked on saving. That means we get a lot of small files, which is problematic for data storage.

If we were to rechunk them, we couldn't use them for online monitoring of the pulse rate anymore (though the website doesn't support this yet), since only one chunk would be written to disk at the end of the run.

The easiest solution seems to be to have two savers: one without rechunking (for use in monitoring) and one with (for storage). The alternative would be to re-pack the data after writing it but before transferring it.

Bootstrax omits veto_regions

Currently bootstrax doesn't compute the veto_regions that have been available since #207.

There are two ways to do this:

  • Add it in the same way we did for the HE and NV plugins.
  • Let bootstrax create a new plugin that takes all of the end-point plugins (e.g. PeaksHe + PeaksNv + event_info_double) and aggregates them in a plugin with SaveWhen.NEVER (see the sketch below). This would simplify strax, and only a single st.make call would be needed.
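A hedged sketch of the aggregator idea from the second bullet; the dependencies (and the corresponding compute argument names, which in strax must match the data kinds of the dependencies) are illustrative:

import numpy as np
import strax

class ProcessEverything(strax.Plugin):
    provides = 'process_everything'
    depends_on = ('event_info', 'events_nv', 'peaks_he')   # illustrative
    dtype = strax.time_fields
    save_when = strax.SaveWhen.NEVER     # nothing is ever written to disk

    def compute(self, events, events_nv, peaks_he):
        # Nothing to compute; this plugin only exists so that a single
        # st.make call pulls in all end-point datatypes.
        return np.zeros(0, dtype=self.dtype)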

Peak processing plugin has insufficient condition for TF2 checking

Hello,

In attempting to walk through beginning tutorial steps, I came across a tensorflow failure to find graph elements at straxen/plugins/peak_processing.py", line 222. E.g.,

ValueError: Tensor Tensor("dense_6/BiasAdd:0", shape=(?, 2), dtype=float32) is not an element of this graph.

Upon investigation, I see that special cases were put into place for the peak_processing plugin to use tensorflow v2. Unfortunately, it looks like the condition for checking for v2 is insufficient.

self.has_tf2 = parse_version(tf.__version__) > parse_version('1.9.9')

Perhaps one could use parse_version(tf.__version__) >= parse_version('2.0.'), since 2.0 is still pre-release?
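A sketch of a stricter check (assuming packaging/pkg_resources-style version parsing): treat any version whose major release number is at least 2 as TF2, which also covers 2.0 pre-releases:

from packaging.version import parse as parse_version

def is_tf2(version_string):
    return parse_version(version_string).release[0] >= 2

assert is_tf2('2.0.0rc1')        # a 2.0 pre-release counts as TF2
assert not is_tf2('1.14.0')      # a late TF1 release does not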

(Improvement) Add baseline_value and baseline_rms to pulse_counts

pulse_counts is very useful for monitoring the PMT count rate in our TPC. To be able to cross-check whether a higher PMT rate is caused by an increased signal rate or by a change in noise, I propose adding two additional fields to pulse_counts: the first stores the average baseline value per PMT, the second the average baseline RMS per PMT. Just as a comparison:

Loading pulse_counts of a single 1h nitrogen run takes 1.10 s while loading the baseline and baseline_rms values stored in records of the very same run takes about 10 min.
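A sketch of the two proposed fields; the field names are illustrative, and 248 matches the per-PMT array length used in the existing pulse_counts dtype:

import numpy as np

extra_pulse_count_fields = [
    (('Average baseline per PMT in ADC counts', 'baseline_mean'),
     np.float32, (248,)),
    (('Average baseline RMS per PMT in ADC counts', 'baseline_rms_mean'),
     np.float32, (248,)),
]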

Bootstrax - warning integration

Bootstrax writes warnings and messages to the daq database such that they can be displayed on the website. Change the following:

  1. infermode: .... lowering mode to ... should be a message not a warning
  2. We should not write the full traceback to the website
  3. Make runid field an integer

Feature Idea: Units

The data frames come with "comments" (see st.data_info('event_info')). If we had a separate column "units", scripts could automatically pull the correct units for axis labels.
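A hedged sketch of how such a column could be consumed; the 'Units' column is the proposed addition, and the 'Field name' column name is assumed from the current data_info output:

import straxen

st = straxen.contexts.xenon1t_dali()
info = st.data_info('event_info')
units = dict(zip(info['Field name'],
                 info['Units'] if 'Units' in info else [''] * len(info)))

def axis_label(field):
    unit = units.get(field, '')
    return f'{field} [{unit}]' if unit else field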

Convert 'daq times' to epoch times

The DAQ provides timestamps in ns since run start with a resolution of the ADC sample size. These need to be converted to 'ns since unix epoch'. This is hard to do in the DAQ, but maybe it can be done in the DAQReader plugin at the stage where all the sub-files from each readout thread are combined.

waveform_display does not seem to work

In the current straxen master, if I run:

import strax
import straxen
print('strax', strax.__version__)
print('straxen', straxen.__version__)
st = straxen.contexts.xenon1t_dali()
run_id = '170204_1410'
st.waveform_display(run_id, seconds_range=(0, 0.15))

I get the following error:

---------------------------------------------------------------------------
DataNotAvailable                          Traceback (most recent call last)
<ipython-input-3-95868e942342> in <module>
----> 1 st.waveform_display(run_id, seconds_range=(0, 0.15))

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/straxen-0.9.0-py3.6.egg/straxen/mini_analysis.py in wrapped_f(context, run_id, **kwargs)
    113                             config=kwargs.get('config'),
    114                             register=kwargs.get('register'),
--> 115                             storage=kwargs.get('storage', tuple()))
    116 
    117                 # If user did not give time kwargs, but the function expects

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in get_array(self, run_id, targets, save, max_workers, **kwargs)
    905                 max_workers=max_workers,
    906                 **kwargs)
--> 907             results = [x.data for x in source]
    908         return np.concatenate(results)
    909 

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in <listcomp>(.0)
    905                 max_workers=max_workers,
    906                 **kwargs)
--> 907             results = [x.data for x in source]
    908         return np.concatenate(results)
    909 

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in get_iter(self, run_id, targets, save, max_workers, time_range, seconds_range, time_within, time_selection, selection_str, keep_columns, _chunk_number, **kwargs)
    755                                          save=save,
    756                                          time_range=time_range,
--> 757                                          chunk_number=_chunk_number)
    758 
    759         # Cleanup the temp plugins

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in get_components(self, run_id, targets, save, time_range, chunk_number)
    631 
    632         for d in targets:
--> 633             check_cache(d)
    634         plugins = to_compute
    635 

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in check_cache(d)
    551                 to_compute[d] = p
    552                 for dep_d in p.depends_on:
--> 553                     check_cache(dep_d)
    554 
    555             # Should we save this data? If not, return.

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in check_cache(d)
    551                 to_compute[d] = p
    552                 for dep_d in p.depends_on:
--> 553                     check_cache(dep_d)
    554 
    555             # Should we save this data? If not, return.

/opt/conda/envs/strax-dev/lib/python3.6/site-packages/strax/context.py in check_cache(d)
    539                     # other requested data types is not.
    540                     raise strax.DataNotAvailable(
--> 541                         f"Time range selection assumes data is already "
    542                         f"available, but {d} for {run_id} is not.")
    543                 if '*' in self.context_config['forbid_creation_of']:

DataNotAvailable: Time range selection assumes data is already available, but peaklets for 170204_1410 is not.

Make eb0-2 only run if ebs3-5 are lagging

An idea by @darrylmasson is to have eb0-2 only start processing when eb3-5 are handling extremely high data rates. To this end the older ebs could query the bootstrax collection in the runs-database to see whether the newer ebs (eb3-5) have already been working on runs for some time; if that is the case, the older ebs join in. In other words, the older ebs only contribute to processing in case of extreme data rates and otherwise don't do much (as they are slower at processing runs, see https://xe1t-wiki.lngs.infn.it/doku.php?id=xenon:xenonnt:dsg:daq:eb_speed_tests_update#conclusion).

Could be added to #74.

Idea: straxen.clear_all_pycache

Maybe it would be a nice feature to add a function clear_all_pycache or the like to delete all __pycache__ folders. Sometimes (especially after doing a git pull) Numba throws errors that can simply be solved by removing everything stored in several __pycache__ folders.

On the other hand I'm not sure if this is a feature that strax/straxen should have. Even better, of course, would be if Numba didn't fail in the first place (then this monkey-patch would be obsolete).

If others see merit in this too, I'll come up with a simple function to add to straxen.
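A minimal sketch of what such a helper could look like, assuming we only want to clear caches under a given root directory:

import os
import shutil

def clear_all_pycache(root='.'):
    # Recursively delete all __pycache__ folders under root
    for dirpath, dirnames, _ in os.walk(root):
        if '__pycache__' in dirnames:
            target = os.path.join(dirpath, '__pycache__')
            print(f'Removing {target}')
            shutil.rmtree(target)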

Delete empty records

After hit finding it might be possible to end up with empty records (data fields completely zeroed by ZLE). Removing these records would gain us a bit in terms of speed and performance.
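A hedged sketch of such a cleanup step; field names follow the records dtype shown elsewhere in these issues:

import numpy as np

def drop_empty_records(records):
    # Keep only records whose waveform is not entirely zero
    keep = np.any(records['data'] != 0, axis=1)
    return records[keep]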

Throw a warning if fried rice is down instead of crashing everything

In the current implementation of rundb.py, straxen raises an error when fried rice is down:

~/mymodules/straxen/straxen/rundb.py in _find(self, key, write, allow_incomplete, fuzzy_for, fuzzy_for_options)
    153                         'protocol': 'rucio'}}}
    154             doc = self.collection.find_one({**run_query, **dq},
--> 155                                            projection=dq)
    156             if doc is not None:
    157                 datum = doc['data'][0]

# (some long mongo trace back)

ServerSelectionTimeoutError: fried.rice.edu:27017: timed out

This issue was introduced with https://github.com/XENONnT/straxen/pull/164/files.

In most cases analyzers work with locally stored data and hence do not need to access anything via the rundb storage system. I therefore propose to throw a warning and drop the rundb storage frontend from the registered storage, rather than raising an error and stopping any ongoing analysis.
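A hedged sketch of that behaviour; the probe query and the assumption that the RunDB frontend is the first entry in st.storage are illustrative:

import warnings
import straxen

def xenonnt_online_with_optional_rundb():
    st = straxen.contexts.xenonnt_online()
    try:
        # Probe the runs-database with a cheap query
        st.storage[0].collection.find_one({}, projection={'_id': 1})
    except Exception as e:
        warnings.warn(f'RunDB not reachable ({e}); '
                      'dropping the rundb storage frontend.')
        st.storage = st.storage[1:]
    return st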

st.waveform_display requires removed peak_classification; update demo

In #36, peak_classification was removed. This seems to be used in st.waveform_display but I am not sure of the appropriate fix.

Additionally, the demos should be updated to reflect that peak_classification is missing. For example: https://github.com/XENONnT/straxen/blob/master/notebooks/tutorials/strax_demo.ipynb

See:

st.waveform_display('170204_2111', seconds_range=(0, 0.15))

KeyError                                  Traceback (most recent call last)
<ipython-input-11-3f3d31936807> in <module>
----> 1 st.waveform_display('170204_2111', seconds_range=(0, 0.15))

/dali/lgrandi/strax/straxen/straxen/mini_analysis.py in wrapped_f(context, run_id, **kwargs)
     91             if len(requires):
     92                 deps_by_kind = strax.group_by_kind(
---> 93                     requires, context=context, require_time=False)
     94                 for dkind, dtypes in deps_by_kind.items():
     95                     if dkind in kwargs:

/dali/lgrandi/strax/strax/strax/utils.py in group_by_kind(dtypes, plugins, context, require_time)
    472         if context is None:
    473             raise RuntimeError("group_by_kind requires plugins or context")
--> 474         plugins = context._get_plugins(targets=dtypes, run_id='0')
    475 
    476     if require_time is None:

/dali/lgrandi/strax/strax/strax/context.py in _get_plugins(self, targets, run_id)
    413         plugins = collections.defaultdict(get_plugin)
    414         for t in targets:
--> 415             p = get_plugin(t)
    416             # This assignment is actually unnecessary due to defaultdict,
    417             # but just for clarity:

/dali/lgrandi/strax/strax/strax/context.py in get_plugin(d)
    359 
    360             if d not in self._plugin_class_registry:
--> 361                 raise KeyError(f"No plugin class registered that provides {d}")
    362 
    363             p = self._plugin_class_registry[d]()

KeyError: 'No plugin class registered that provides peak_classification'

Written with @tunnell 

y-axis of time_v_channel graph in waveform_display not correct

#128 loads data and shows the graph. But the waveform_display graph has a little problem: notice that the PMT number goes from 0 to 1, when it should go from 0 to the number of channels.
[image: bokeh_plot]
The code to reproduce the plot is:

import straxen    
st = straxen.contexts.xenon1t_dali(build_lowlevel=True)
run_id = '170204_1710'
df = st.get_array(run_id, "event_info")
event = df[4]
st.waveform_display(run_id, time_within=event)

(boot)strax handling of last steps

Luca pointed out a problem with run 8675: the number of files was uploaded to the rundb by eb5, while eb4 claimed to be the one that correctly processed the run but didn't include the filecount.

Reconstruction of events
It's really a conglomerate of bad screw-ups and quite unlikely events. I'm going to make corresponding issues on bootstrax/strax. Let me summarize what happened:

  • Eb5 finished, but on the very last chunk it failed saving because rsync was hammering it.
  • The exception was caught just while strax was renaming the _temp folders to their final names. This means the folders have the appropriate names.
  • In fact the processing got so far that the call was made in bootstrax to count the files and update the database.
  • Eb3 tried and failed like eb5 because of rsync.
  • Eb4 then started and completed processing. However, before it made the call to count the number of files, bootstrax was killed as we ran out of disk space.

All of these things seem very unlikely but somehow it all happened to this one run.

Bottom line: it was bootstrax's fault that the filecount was not updated on eb4.

Issues to fix

  • Strax: don't start renaming the _temp folders while savers are not finished.
  • bootstrax:
    • Check that the filecount is in the data after stop (crash first, recover later)
    • Don't crash this hard if we don't have space to process data. Also wait longer.
    • Can we check that all the savers succeeded? We need to do so somehow.

Avoid C / C++ compilations in test build

Currently, Travis builds spend a long time compiling some C++ modules, according to a long stream of warnings like

cc1plus: warning: command line option ‘-std=gnu99’ is valid for C/ObjC but not for C++

It's not super clear what the main culprit is, maybe grpcio? Probably we can install some things via conda to avoid this.

Better rundb registration

When strax registers new rundb entries, the host field

'host': self.hostname,
should probably refer to the hostname alias ('dali') rather than the full hostname ('dali-login1.etc.etc').

We should also include the lineage hash of the data (e.g. in the meta field under lineage_hash) rather than just the lineage. This will make searching a lot easier.
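A sketch of what the registered document could carry, assuming strax.deterministic_hash is the appropriate hashing helper; the lineage below is illustrative:

import strax

lineage = {'records': ('Records', '0.1.0', {})}      # illustrative lineage
doc_update = {
    'host': 'dali',                                   # alias, not the full hostname
    'meta': {'lineage': lineage,
             'lineage_hash': strax.deterministic_hash(lineage)},
}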

Delete old high level data and lineage on ebs

For online monitoring we do want to have the high-level plugins. These will not be pulled by admix and preferably also not deleted by admix. As such, we need to do our own bookkeeping. We would have to delete data that (see the sketch after this list):

  • is so old that it cannot be seen as 'online' anymore. Looking back 4 weeks seems reasonable; for longer periods it might be better to refer to the offline monitor.
  • has a changed plugin lineage. When plugins are updated the lineage might change, and it doesn't make much sense to keep data that cannot be opened by the strax(en) version on the event builders. To this end, we should try to make only essential changes to the xenonnt_online context: https://github.com/XENONnT/straxen/blob/master/straxen/contexts.py#L66.
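A hedged sketch of combining both criteria; the rundoc field names ('at', 'lineage_hash') and the way the current lineage hash is obtained are illustrative:

import datetime

def should_delete(data_entry, current_lineage_hash,
                  max_age=datetime.timedelta(weeks=4)):
    now = datetime.datetime.now(datetime.timezone.utc)
    too_old = (now - data_entry['at']) > max_age          # older than ~4 weeks
    lineage_changed = data_entry.get('lineage_hash') != current_lineage_hash
    return too_old or lineage_changed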
