alphatwirl / alphatwirl
A Python library for summarizing event data into multivariate categorical data
License: BSD 3-Clause "New" or "Revised" License
At 8db62fc, a few commits after v0.24.2:
Traceback (most recent call last):
File "./bdphi-scripts/bdphiROC/twirl_mktbl_heppy.py", line 721, in <module>
main()
File "./bdphi-scripts/bdphiROC/twirl_mktbl_heppy.py", line 62, in main
run(reader_collector_pairs)
File "./bdphi-scripts/bdphiROC/twirl_mktbl_heppy.py", line 609, in run
treeName='tree'
File "./bdphi-scripts/external/atheppy/atheppy/fw.py", line 110, in run
self._run(loop)
File "./bdphi-scripts/external/atheppy/atheppy/fw.py", line 285, in _run
loop()
File "./bdphi-scripts/external/alphatwirl/alphatwirl/datasetloop/loop.py", line 60, in __call__
pickle.dump(self.reader, f, protocol=pickle.HIGHEST_PROTOCOL)
TypeError: can't pickle _thread.lock objects
The error occurs in ResumableDatasetLoop:
alphatwirl/alphatwirl/datasetloop/loop.py, lines 52 to 61 in 8db62fc
It occurs with subprocess, not with htcondor. The proc in SubprocessRunner might not be picklable in Python 3:
alphatwirl/alphatwirl/concurrently/SubprocessRunner.py, lines 44 to 50 in 8db62fc
self.reader.read(dataset)
or self.reader.begin()
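If the unpicklable attribute is indeed the cause, one common workaround is to drop it in __getstate__ and recreate it on unpickling. A minimal sketch, not alphatwirl's actual code, with a threading.Lock standing in for the subprocess handle:

```python
import pickle
import threading

class Runner(object):
    """Sketch: an object holding an unpicklable attribute."""
    def __init__(self):
        self.lock = threading.Lock()  # stands in for the unpicklable proc

    def __getstate__(self):
        state = dict(self.__dict__)
        state.pop('lock', None)  # drop the unpicklable attribute
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.lock = threading.Lock()  # recreate it after unpickling

restored = pickle.loads(pickle.dumps(Runner()))
```

Whether dropping the handle is acceptable depends on whether the resumed reader needs it; the sketch only shows the pickling mechanics.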
Related to #25: propagate the log levels to resume.py as well.
alphatwirl/tests/unit/summary/test_Summarizer.py, lines 67 to 71 in 6106bde
This test is skipped because int and str cannot be compared in Python 3.
valOutColumnNames is a misnomer; rename it to summaryColumnNames or something similar.

alphatwirl/alphatwirl/loop/EventLoop.py, lines 36 to 39 in fcf553e
self.reader.begin(events) can fail if events are built from a TChain with no files. This could happen if no files are verified here:
alphatwirl/alphatwirl/roottree/build.py, lines 24 to 28 in fcf553e
Update the test for run.py:
alphatwirl/alphatwirl/concurrently/run.py, line 23 in 6c8580b
alphatwirl/alphatwirl/concurrently/run.py, lines 85 to 99 in 611214e
Make concurrently an independent package. Related to the comment at https://github.com/alphatwirl/alphatwirl-interface/pull/7#issuecomment-369357995; these will be surgical changes.
It will help to solve #10.
Steps:
- add + (or a method merge()) to Reader; readers for the same data set are merged at the end
- add merge() to EventDatasetReader
- add merge() to ReaderComposite
- add merge() to AllwCount, AnywCount, and NotwCount
- return in end() in EventDatasetReader
(pointed out in #30 (comment))
Need at least three separate lists of requirements. They can be specified in install_requires in setup.py and in requirements files for pip.
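One possible shape for the split, sketched here as a suggestion (this is not alphatwirl's actual setup.py; the test packages are the ones named elsewhere in these issues, the other entries are placeholders):

```python
# setup.py (sketch): runtime deps in install_requires, optional groups
# in extras_require; pip requirements files can mirror these lists.
from setuptools import setup

setup(
    name='alphatwirl',
    install_requires=[],  # hard runtime dependencies
    extras_require={
        'tests': ['pytest', 'pytest-mock', 'pytest-cov',
                  'pytest-console-scripts'],
        'batch': [],      # e.g. optional batch-system helpers
    },
)
```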
When trying to run on lxplus, with my working directory under /eos
I observed the following traceback:
Traceback (most recent call last):
File "/afs/cern.ch/user/b/bkrikler/.local/bin/fast_carpenter", line 11, in <module>
load_entry_point('fast-carpenter', 'console_scripts', 'fast_carpenter')()
File "/eos/user/b/bkrikler/CHIP/fast-carpenter/fast_carpenter/__main__.py", line 67, in main
return process.run(datasets, sequence)
File "/afs/cern.ch/user/b/bkrikler/.local/lib/python2.7/site-packages/atuproot/atuproot_main.py", line 49, in run
return self._run(loop)
File "/afs/cern.ch/user/b/bkrikler/.local/lib/python2.7/site-packages/atuproot/atuproot_main.py", line 101, in _run
result = loop()
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/datasetloop/loop.py", line 55, in __call__
self.reader.read(dataset)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/datasetloop/reader.py", line 27, in read
reader.read(dataset)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/loop/EventDatasetReader.py", line 66, in read
runids = self.eventLoopRunner.run_multiple(eventLoops)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/loop/MPEventLoopRunner.py", line 93, in run_multiple
return self.communicationChannel.put_multiple(eventLoops)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/concurrently/CommunicationChannel.py", line 131, in put_multiple
return self.dropbox.put_multiple(packages)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/concurrently/TaskPackageDropbox.py", line 60, in put_multiple
runids = self.dispatcher.run_multiple(self.workingArea, pkgidxs)
File "/eos/user/b/bkrikler/CHIP/alphatwirl/alphatwirl/concurrently/HTCondorJobSubmitter.py", line 129, in run_multiple
njobs = int(regex.search(stdout).groups()[0])
AttributeError: 'NoneType' object has no attribute 'groups'
The underlying problem in this case was that htcondor submission on lxplus doesn't work under /eos, only under /afs; however, the traceback hid this somewhat. It might be helpful if the match object from the regular expression on line 127 were first checked to be valid and not None. If it is None, the stdout and stderr could be printed and some exception raised. That would help a user understand more immediately why the jobs weren't submitted, I suspect.
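A sketch of the suggested guard (the regex and error text here are illustrative, not alphatwirl's actual code):

```python
import re

# condor_submit typically prints e.g. "3 job(s) submitted to cluster 1234."
regex = re.compile(r'(\d+) job\(s\) submitted to cluster (\d+)')

def parse_njobs(stdout, stderr=''):
    match = regex.search(stdout)
    if match is None:
        # surface the raw output so the user can see why submission failed
        raise RuntimeError(
            'could not parse condor_submit output\n'
            'stdout:\n{}\nstderr:\n{}'.format(stdout, stderr))
    return int(match.groups()[0])

print(parse_njobs('3 job(s) submitted to cluster 1234.'))  # -> 3
```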
At the moment, a progressReporter is given to a task as an argument:
alphatwirl/alphatwirl/concurrently/Worker.py, lines 24 to 29 in dd0a7a3
A task might want to use a progressReporter for a different purpose. Instead of Worker giving the reporter to a task, the task should find the reporter if the task wants a reporter.
A possible solution: a worker registers a reporter in a DB, e.g., ProgressReporterAgency or ProgressReporterOffice, defined in the module progressbar. A task gets the reporter from the DB.
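A minimal sketch of that idea; ProgressReporterOffice is one of the names suggested above, and everything else here is hypothetical:

```python
class ProgressReporterOffice(object):
    """Hypothetical registry: workers register a reporter, tasks look it up."""
    _reporter = None

    @classmethod
    def register(cls, reporter):
        cls._reporter = reporter  # a worker registers its reporter here

    @classmethod
    def find(cls):
        return cls._reporter      # None if no worker registered a reporter

class PrintReporter(object):
    def report(self, msg):
        print(msg)

# a worker registers its reporter...
ProgressReporterOffice.register(PrintReporter())
# ...and a task fetches it only if it wants one
reporter = ProgressReporterOffice.find()
if reporter is not None:
    reporter.report('task started')
```

This decouples Worker from the task signature: tasks that don't care about progress reporting never see the reporter.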
- within one line
- use str for long text
- for all classes
Collect results from jobs as they finish instead of collecting them after all jobs finish.
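The stdlib concurrent.futures illustrates the pattern; alphatwirl's concurrently package would do the equivalent with its own workers (this sketch is not its actual API):

```python
import concurrent.futures

def work(i):
    return i * i

# as_completed yields each future as soon as it finishes, so results can
# be handled incrementally instead of after all jobs are done
with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(work, i) for i in range(5)]
    results = [f.result() for f in concurrent.futures.as_completed(futures)]

print(sorted(results))  # -> [0, 1, 4, 9, 16]
```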
When _deprecated_class_method_option() or _deprecated_func_option() is used multiple times on the same method or function, module_name of the outer decorator will be the module name of the inner decorator, that is, deprecation. Fix: set module_name to the module name of the original function.

Allow an element of binnings of KeyValueComposer to be None. binnings itself can be None, but when binnings is a list, its elements are currently not allowed to be None.
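For the stacked _deprecated_*_option() decorators, one fix is to unwrap to the original function before reading __module__. A sketch with an illustrative decorator, not the actual alphatwirl code:

```python
import functools
import inspect

def deprecated_option(msg):
    def decorate(func):
        # unwrap inner decorators (functools.wraps sets __wrapped__) so
        # module_name is that of the original function, not of the module
        # where the inner wrapper was defined
        module_name = inspect.unwrap(func).__module__
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            print('{}: {}'.format(module_name, msg))
            return func(*args, **kwargs)
        return wrapper
    return decorate

@deprecated_option('option a is deprecated')
@deprecated_option('option b is deprecated')
def f():
    return 1
```

With this, both warnings report the module of f, even though the outer decorator only ever sees the inner wrapper.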
I've no idea how much work this would involve, but to bring alphatwirl to more sites, it would be nice to support batch systems beyond HTCondor for running components in parallel. In particular, Sun Grid Engine batch systems, although somewhat old-fashioned, are still common, for example on lxplus or the Imperial College batch system.
Line 74 of HTCondorJobSubmitter calls a function and then immediately gets the first element of what it returns. However, two of the three return statements in that function return empty lists, so this can lead to the code crashing.
At the commit 7405535, the test gets stuck in some circumstances.
In the top directory of alphatwirl, run
pytest -s tests
The test starts normally.
============================================ test session starts =============================================
platform darwin -- Python 3.6.3, pytest-3.4.0, py-1.5.2, pluggy-0.6.0
rootdir: #######, inifile:
plugins: mock-1.6.3, cov-2.5.1, console-scripts-0.1.4, hypothesis-3.38.5
collected 619 items
and all tests pass.
tests/unit/summary/test_Summarizer.py .....s.... [ 0%]
tests/unit/summary/test_WeightCalculatorOne.py . [ 0%]
tests/unit/summary/test_convert_key_vals_dict_to_tuple_list.py ...... [ 0%]
tests/unit/summary/test_parse_indices_config.py .
=================================== 579 passed, 40 skipped in 5.09 seconds =================================
But the test doesn't finish; the shell prompt won't appear. Hitting ctrl-c:
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/Users/sakuma/anaconda3/envs/py3_6-pd0_20/lib/python3.6/multiprocessing/popen_fork.py", line 29, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
$
The same problem doesn't happen, for example, if the -s option is not used or if the directory is given as tests/ with a trailing slash:
pytest tests
pytest -s tests/
The same problem doesn't happen in Python 2.7:
platform darwin -- Python 2.7.12, pytest-3.3.2, py-1.5.2, pluggy-0.6.0
rootdir: #######, inifile: pytest.ini
plugins: mock-1.6.3, cov-2.5.1, console-scripts-0.1.3
I was trying to understand why the unit tests were failing when I run pytest, and it turns out it's because I didn't have the pytest-console-scripts package installed. I realised these were needed by looking in the Travis CI config file, but it would be good to either define pytest, pytest-mock, pytest-cov, and pytest-console-scripts in the requirements.txt file or mention them explicitly in the README.
This is based on a series of posts first made to Slack. But for the TL;DR, the request is to add outFileName to ret:
ret['outFileName'] = self.createOutFileName(ret)
For the meantime, I have a hacky solution to my problem, which can be used instead of TableFileNameComposer:
class WithInsertTableFileNameComposer():
    def __init__(self, composer, inserts):
        self.inserts = inserts
        self.composer = composer
        self.frame_idx = 0

    def __call__(self, columnNames, indices, **kwargs):
        this_insert = self.inserts[self.frame_idx]
        suffix = kwargs.get("suffix", self.composer.default_suffix)
        kwargs["suffix"] = "--{}.{}".format(this_insert, suffix)
        self.frame_idx += 1
        return self.composer(columnNames, indices, **kwargs)
propagate the config for logging to batch jobs
Currently, the package concurrently uses pickle to send the results from the worker nodes. Pickle files with data frames tend to become large, and loading large pickles is slow. Develop an alternative implementation specifically optimized for data frames, for example by using dask, so that loading the results at the interactive node is fast.
In 0.16.0 (expected to appear in 0.20.0 as well), a ValueError is raised at loop/merge.py#L7 when there are duplicate datasets (two or more dataset entries with the same name attribute).
Duplicate datasets can appear by user error. The final duplicate dataset will overwrite all the others in loop/EventDatasetReader.py#L74, but loop/EventDatasetReader.py#L68 is filled with runids from all dataset duplicates. When the code reaches loop/merge.py#L7, it tries to access runids from duplicate datasets that have been overwritten in loop/EventDatasetReader.py#L74.
Everything works fine if there are no duplicate datasets. However, this could be checked or enforced somewhere earlier.
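An early check could be as simple as this sketch (the helper and the stand-in Dataset type are hypothetical, not part of alphatwirl):

```python
import collections

def assert_unique_dataset_names(datasets):
    """Raise ValueError if two dataset entries share the same name."""
    seen = set()
    for dataset in datasets:
        if dataset.name in seen:
            raise ValueError('duplicate dataset name: {!r}'.format(dataset.name))
        seen.add(dataset.name)

# example with a stand-in dataset type
Dataset = collections.namedtuple('Dataset', ['name'])
assert_unique_dataset_names([Dataset('QCD'), Dataset('TTJets')])  # passes
```

Running such a check when the dataset list is first built would turn the confusing ValueError in loop/merge.py into an immediate, explicit error.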
In version 0.18.3, the following code produces awkward results:
from qtwirl import qtwirl
from alphatwirl.binning import RoundLog

filepath = 'https://github.com/alphatwirl/qtwirl/raw/v0.03.1/tests/data/sample_chain_01.root'
results = qtwirl(
    file=filepath, tree_name='tree',
    reader_cfg=dict(keyAttrNames='met', binnings=RoundLog(0.2, 100, min=10, underflow_bin=0)))
print(results)
met | n | nvar |
---|---|---|
0.000000 | 102 | 102 |
6.309573 | 0 | 0 |
10.000000 | 53 | 53 |
15.848932 | 77 | 77 |
25.118864 | 100 | 100 |
39.810717 | 141 | 141 |
63.095734 | 153 | 153 |
100.000000 | 166 | 166 |
158.489319 | 114 | 114 |
251.188643 | 78 | 78 |
398.107171 | 14 | 14 |
630.957344 | 2 | 2 |
1000.000000 | 0 | 0 |
There shouldn't be a 2nd row, i.e., the row with met = 6.309573. The row is even wrong: it suggests there are no events with 6.309573 <= met < 10.0, which is not the case.
This happens because, in RoundLog (and Round as well), the bin next to the underflow bin is the bin for the min.
An example
>>> b = RoundLog(0.2, 100, min=10, underflow_bin=0)
>>> b(10)
6.309573444801936
>>> b(8)
0
Here is why this happens: because 10 is greater than or equal to min (in fact exactly the min), it returns the lower edge of the bin. However, because of float rounding, 10 doesn't fall into the interval [10, 10^1.2) but into the one below, [10^0.8, 10). So it returns 10^0.8 = 6.30957. On the other hand, 8 is below min, so it returns underflow_bin.
This is a feature but is very uncomfortable. 8 is greater than 6.3; if there is a bin [10^0.8, 10), 8 should fall in that bin.
As long as the bin is determined after the value is compared with the min, this can happen. This uncomfortable situation can be avoided if the minimum value is the lower edge of the bin the argument min falls in.
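The proposed rule can be sketched on a simplified log binning (this is not the actual RoundLog implementation; it only illustrates lowering the effective minimum to the lower edge of the bin that min falls in):

```python
import math

def round_log(x, width=0.2, aboundary=100, min=10, underflow_bin=0):
    def lower_edge(v):
        # index of the bin containing v, counted from aboundary
        k = math.floor((math.log10(v) - math.log10(aboundary)) / width)
        return aboundary * 10 ** (k * width)

    # proposed fix: compare x with the lower edge of the bin that `min`
    # falls in, not with `min` itself, so a value cannot end up in the
    # underflow bin while sharing a bin with an in-range value
    effective_min = lower_edge(min)
    if x < effective_min:
        return underflow_bin
    return lower_edge(x)
```

With this rule, whichever bin min lands in due to float rounding, every value in that bin is treated consistently: 8 and 10 either share a lower edge or 8 is genuinely below the lowest bin, never the inconsistent mixture shown above.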
See the comment in #51.
Localize the concept of data sets to certain sub-packages.

Make concurrently an independent package.

Currently, it is impossible for a task in concurrently to include a lambda because lambdas aren't picklable. There are several solutions for serializing lambdas, including dill. Note that, unlike results, tasks are generally small; serializing and deserializing them doesn't need to be very fast.
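A quick illustration: the stdlib pickle rejects a lambda, while dill (shown only in a comment to keep this sketch stdlib-only) can serialize it:

```python
import pickle

task = lambda x: x * 2

try:
    pickle.dumps(task)  # stdlib pickle stores functions by name, so this fails
except Exception as e:
    print('cannot pickle lambda:', e)

# with dill (third-party), this would work instead:
#   import dill
#   restored = dill.loads(dill.dumps(task))
#   restored(21)
```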
$ find ./ -name \*.py | xargs grep 'def' | grep '\[ *\]'
./parallel/build.py:def build_parallel(parallel_mode, quiet=True, processes=4, user_modules=[ ],
./concurrently/HTCondorJobSubmitter.py: def __init__(self, job_desc_extra=[ ], job_desc_dict={}):
./selection/modules/with_count.py: def __init__(self, name='All', selections=[ ]):
./selection/modules/with_count.py: def __init__(self, name='Any', selections=[ ]):
./selection/modules/basic.py: def __init__(self, name='All', selections=[ ]):
./selection/modules/basic.py: def __init__(self, name='Any', selections=[ ]):
./loop/ReaderComposite.py: def __init__(self, readers=[]):
$ find ./ -name \*.py | xargs grep 'def' | grep '{ *}'
./parallel/build.py: logger.warning('unknown parallel_mode "{}", use default "{}"'.format(
./concurrently/HTCondorJobSubmitter.py: def __init__(self, job_desc_extra=[ ], job_desc_dict={}):
./selection/factories/expand.py:def expand_path_cfg(path_cfg, alias_dict={ }, overriding_kargs={ }):
Possibly more: arguments can be listed over multiple lines, which this grep would miss.
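The usual fix is to default to None and build the list inside __init__, sketched here on ReaderComposite (simplified, not the full class):

```python
class ReaderComposite(object):
    def __init__(self, readers=None):
        # a default of readers=[] would be the same list object shared by
        # every instance created without the argument; None avoids that
        self.readers = [] if readers is None else list(readers)

a = ReaderComposite()
b = ReaderComposite()
a.readers.append('reader1')
print(b.readers)  # -> [] (no longer shared between instances)
```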
not sure how to make it picklable.
At either:
alphatwirl/alphatwirl/configure/build_counter_collector_pair.py, lines 27 to 29 in 705bcce
build_counter_collector_pair can become a class so that datasetColumnName can be changed by an option to __init__().
use deque for self.boundaries:
https://github.com/TaiSakuma/AlphaTwirl/blob/766483220649396bb3a6106059f08cd1791f2f65/alphatwirl/binning/Round.py#L25
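For reference, the relevant deque operations; this sketch only shows why deque helps when boundaries grow and shrink at the ends:

```python
import collections

# list.pop(0) and list.insert(0, x) are O(n); deque does both ends in O(1)
boundaries = collections.deque([10.0, 20.0, 30.0])
boundaries.appendleft(5.0)   # extend the lower end
boundaries.append(40.0)      # extend the upper end
boundaries.popleft()         # drop the lowest boundary
print(list(boundaries))      # -> [10.0, 20.0, 30.0, 40.0]
```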
take over https://github.com/CMSRA1/AlphaTwirl-old/issues/4
continue from https://github.com/CMSRA1/AlphaTwirl-old/issues/3
Rewrite this construction:
if key not in counts:
    counts[key] = <<initial value>>
with
counts.setdefault(key, <<initial value>>)
for example in
https://github.com/TaiSakuma/AlphaTwirl/blob/v0.9.x/AlphaTwirl/Summary/Count.py#L16-L17
It will run faster because it searches for the key once instead of twice.
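A small demonstration that the two forms are equivalent (the key and initial value are illustrative):

```python
counts = {}
counts2 = {}
key = ('bin1',)

# before: the key is looked up twice
if key not in counts:
    counts[key] = [0, 0]
counts[key][0] += 1

# after: setdefault looks the key up once and returns the stored value
counts2.setdefault(key, [0, 0])[0] += 1

print(counts == counts2)  # -> True
```

Note that setdefault still constructs the initial value on every call; collections.defaultdict would avoid even that.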
[0, 0, 0, 0, 0], [0, 0, 2, 0, 0], [0, 1, 1, 0, 1] if the number of tasks is 5.

The code in the EventLoop class handles the reader (typically a composite reader) and provides it with events from the tree(s) by:
- calling the reader.begin() method
- calling reader.event() for each event
- calling reader.end()
In each case, the return value of each of reader's methods is ignored. This makes it awkward for a reader to tell the event loop to terminate early, or not to run at all in the case of begin(). begin() can raise an exception, of course, but in some cases that is more extreme than really needed.
Have the return of each call to the reader's begin, event, and end methods checked, and execution halted if the return does not evaluate to true.
I'm running over a large heppy dataset, including the existing RA1 AlphaTools sequence as a single Reader. The heppy dataset is being split up using AlphaTwirl and submitted to the Bristol htcondor batch system. One of the datasets in this input is not supposed to be handled by AlphaTools, and AlphaTools knows this, so it would normally just ignore it; however, this causes an exception to be raised, crashing the batch job, which is then re-submitted. To enable things to finish elegantly, I would prefer to be able to signal to the EventLoop that this job should not proceed, in the begin() method of the reader that wraps AlphaTools.
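A sketch of the proposal (this is not alphatwirl's actual EventLoop; checking `is False` rather than general falsiness keeps existing readers, which return None, unaffected):

```python
class EventLoop(object):
    def __init__(self, events, reader):
        self.events = events
        self.reader = reader

    def __call__(self):
        if self.reader.begin(self.events) is False:
            return self.reader.end()  # reader asked not to run at all
        for event in self.events:
            if self.reader.event(event) is False:
                break                 # reader asked to terminate early
        return self.reader.end()

class StopAfterTwo(object):
    """Toy reader that asks to stop after two events."""
    def __init__(self):
        self.n = 0
    def begin(self, events):
        pass
    def event(self, event):
        self.n += 1
        if self.n >= 2:
            return False
    def end(self):
        return self.n

print(EventLoop(range(10), StopAfterTwo())())  # -> 2
```

A reader wrapping AlphaTools could then return False from begin() for the dataset it is not supposed to handle, and the job would finish cleanly instead of crashing.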