UCL / TLOmodel
Epidemiology modelling framework for the Thanzi la Onse project
Home Page: https://www.tlomodel.org/
License: MIT License
Please can we have some basic functionality for dependency handling,
i.e. the simulation fails to run unless a required module is already registered.
As part of developing a module it would be helpful to see a log of the changes that occur for one particular person. This would encompass:
The way I can think of doing this would be to have a Logging Event that occurs each day (the last event each day, determined by setting the timestamp to one microsecond to midnight) that has a 'self.person_id_to_track' property. It would store the 'sim.population.props' dataframe each time-step so that it can compare the df from "yesterday" with that of "today" for just that one person_id. It could then identify any changes and output these to the log. It could also scan the log from the HealthSystem to identify all HSI that have occurred involving that person.
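The day-to-day comparison described above could be sketched as follows. This is a minimal illustration only (the helper name `diff_person_row` and the toy dataframes are hypothetical, standing in for consecutive snapshots of `sim.population.props`):

```python
import pandas as pd

def diff_person_row(yesterday: pd.DataFrame, today: pd.DataFrame, person_id: int) -> dict:
    """Return {column: (old, new)} for every property that changed for one person."""
    old = yesterday.loc[person_id]
    new = today.loc[person_id]
    changed = old.ne(new)  # element-wise inequality, per column
    return {col: (old[col], new[col]) for col in old.index[changed]}

# Toy snapshots standing in for yesterday's and today's population dataframe:
yesterday = pd.DataFrame({'is_alive': [True, True], 'age_years': [30, 40]})
today = pd.DataFrame({'is_alive': [True, False], 'age_years': [30, 40]})
print(diff_person_row(yesterday, today, person_id=1))  # {'is_alive': (True, False)}
```

The Logging Event would call something like this once per day and write the resulting dict to the log for the tracked person.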
So that we use the GitHub versioning for the excel datafiles and all sets of files remain in one place.
Add more detailed steps to get started from nothing.
Researchers develop their unit tests with relatively small population sizes, but it would be good to stress-test rare events with large population sizes locally.
Researchers and Travis would still use small populations, but before merging into master this could be used to test very rare events.
On GitHub, it's not obvious that there is a wiki.
Let's add a link to https://github.com/UCL/TLOmodel/wiki from the README.rst file
The following line
Line 167 in 82758d0
creates a copy of the dataframe, adding columns for each external variable. The benefit is that the external variable is indistinguishable from other properties of the population, so can be operated on in the same way. The downside is... it creates a copy of the dataframe.
We need to monitor how external variables are used, and then determine what to do to avoid copying the dataframe.
On the Implementing a disease module page, code for test file, replace
from tlo.methods import demography
with
from tlo.methods import demography, contraception
Also, how does one know the test is running as expected? I get copious amounts of output (if run as a standalone Python script), but no idea if it is correct. Although when run using pytest, it says it passed.
In checking disease modules, I am noticing that it is easy for discrepancies to arise regarding parameters. This can be symptomatic of a deeper issue with the code (e.g. typos, or changes from multiple revisions of the code leading to deprecation of some features).
It would be good to have a check as follows that would:
Each item in each of the following places must map perfectly 1:1
And that each of those parameters must be used somewhere (either in the module itself or referred to from another place).
I can see how internal consistency can be established between PARAMETERS, self.parameters and the resource file. However, checking for 'use' of the parameters would require a "cold read" of the files and a recognition of actual usage (as opposed to a comment or the initial declaration).
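The internal-consistency part of the check could be sketched like this. The function name and arguments are illustrative (in reality the inputs would come from the module's PARAMETERS dict, module.parameters after read_parameters(), and the parameter names read from the resource file):

```python
def check_parameter_consistency(declared, loaded, resource_names):
    """Report where the three places that parameters live fail to map 1:1.

    Returns a list of problem descriptions; an empty list means all agree.
    """
    declared, loaded, resource_names = set(declared), set(loaded), set(resource_names)
    problems = []
    if declared != loaded:
        # symmetric difference: names present in one place but not the other
        problems.append(f"PARAMETERS vs self.parameters mismatch: {declared ^ loaded}")
    if declared != resource_names:
        problems.append(f"PARAMETERS vs resource file mismatch: {declared ^ resource_names}")
    return problems

print(check_parameter_consistency(['p_cure'], ['p_cure'], ['p_cure']))  # []
```

Checking actual 'use' of each parameter would still need the "cold read" of the source files described above.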
Add to “module design”:
List of naming conventions
Things to check prior to PR (merge from master; other tests work, plus more specific tests; adherence to naming conventions)
A common pattern is for parameter values to be defined in an Excel worksheet with two columns, "name" and "value". The value can be any valid Parameter type. Currently, each one has to be loaded by hand (for example, this block of code).
We want to write a utility function that, given a name/value dataframe (the module would still be responsible for loading the workbook etc.) and the module's PARAMETERS definitions, (i) gets the corresponding parameter value from the dataframe, (ii) checks the types match (doing any necessary conversion), and (iii) assigns it to the corresponding entry in module.parameters.
If types don't match, fail with error.
If parameter with given name doesn't exist in Excel sheet, fail with error.
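A minimal sketch of such a utility might look like this. It is an assumption-laden illustration: `expected_types` stands in for the type information declared in the module's PARAMETERS (the real Parameter class would carry this), and the only conversion allowed is int to float:

```python
import pandas as pd

def load_parameters_from_dataframe(parameters: dict, expected_types: dict,
                                   df: pd.DataFrame) -> None:
    """Fill 'parameters' from a two-column ('name', 'value') dataframe,
    failing with an error on missing names or mismatched types."""
    values = df.set_index('name')['value']
    for name, expected in expected_types.items():
        if name not in values.index:
            raise KeyError(f"parameter '{name}' missing from resource sheet")
        value = values[name]
        if expected is float and isinstance(value, int):
            value = float(value)  # the one conversion we allow: int -> float
        if not isinstance(value, expected):
            raise TypeError(f"parameter '{name}': expected {expected.__name__}, "
                            f"got {type(value).__name__}")
        parameters[name] = value

# Usage sketch (dtype=object keeps each cell's own Python type, as Excel reads often do):
df = pd.DataFrame({'name': ['p_infection', 'n_repeats'],
                   'value': pd.Series([0.05, 3], dtype=object)})
params = {}
load_parameters_from_dataframe(params, {'p_infection': float, 'n_repeats': int}, df)
print(params)  # {'p_infection': 0.05, 'n_repeats': 3}
```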
Current issues with logging:
Aims:
We have a common use case as follows:
A solution to this would be the ability to save the simulation at a certain point to a file, then load the file and resume the simulation under the same or different parametric conditions.
I thought this might be relatively straightforward using pickle (i.e. pickle the sim, which contains sim.population.props and the event_queue, and all the modules and their internal contents). Then, unpickle the sim, manipulate any parameters in the modules, and restart the sim using sim.simulate(end_date=end_of_part_two_date). (See script below.)
However, I tried this and the unpickling failed with a RecursionError. Stack Overflow suggested this is a common error when pickling complex classes and suggested increasing the recursion limit, but this led to the console crashing for me.
Do you have any thoughts on this?
Short-term:
Medium-term:
import pickle
from pathlib import Path

from tlo import Date, Simulation
from tlo.methods import contraception, demography

outputpath = Path("./outputs")
resourcefilepath = Path("./resources")

start_date = Date(2010, 1, 1)
end_date_part_one = Date(2011, 1, 2)
popsize = 1000

# Run part one of the simulation
sim = Simulation(start_date=start_date)
sim.register(demography.Demography(resourcefilepath=resourcefilepath))
sim.register(contraception.Contraception(resourcefilepath=resourcefilepath))
sim.seed_rngs(1)
sim.make_initial_population(n=popsize)
sim.simulate(end_date=end_date_part_one)

# Pickling: a basic object round-trips fine; the sim and its event queue do not
with open(outputpath / 'pickled_basic_object', 'wb') as f:
    pickle.dump({'1': 1, '2': 2}, f)
with open(outputpath / 'pickled_sim', 'wb') as f:
    pickle.dump(sim, f)
with open(outputpath / 'pickled_event_queue', 'wb') as f:
    pickle.dump(sim.event_queue, f)

with open(outputpath / 'pickled_basic_object', 'rb') as f:
    x = pickle.load(f)
with open(outputpath / 'pickled_sim', 'rb') as f:
    x = pickle.load(f)  # fails with RecursionError
with open(outputpath / 'pickled_event_queue', 'rb') as f:
    x = pickle.load(f)  # fails with RecursionError

# Increasing recursion limits -- didn't help!
# https://stackoverflow.com/questions/3323001/what-is-the-maximum-recursion-depth-in-python-and-how-to-increase-it
# import sys
# sys.getrecursionlimit()
# sys.setrecursionlimit(90000)
Only testing!
Most use cases of 'request_consumables' in the HealthSystem involve getting one particular item_code or package_code. Despite this, the current implementation requires an elaborate 'cons_req_as_footprint' dict() to be created for each request.
Therefore, create a helper function that accepts a single item_code or package_code and returns a bool (for availability), to make this easier.
i.e. the usage in that simple case changes from:
item_code = self.module.parameters['anti_depressant_medication_item_code']
result_of_cons_request = self.sim.modules['HealthSystem'].request_consumables(
    hsi_event=self,
    cons_req_as_footprint={'Intervention_Package_Code': dict(), 'Item_Code': {item_code: 1}}
)['Item_Code'][item_code]
to:
item_code = self.module.parameters['anti_depressant_medication_item_code']
result_of_cons_request = self.sim.modules['HealthSystem'].request_consumables_as_item_code(self, item_code)
What is the best way for us to capture how uncertainty in model parameters is propagated to model outputs?
We would like all (or many) input parameters to be associated with several credible values, and for it to be easy to run the model with each set and have the results bound together, so that summaries can be made that cut across the runs induced by each set of parameter values.
The system we have now would provide a work flow as follows:
..., but perhaps this can be streamlined, e.g.
It is most common for the usage of this to be:
self.sim.modules['HealthSystem'].schedule_hsi_event(
    hsi_event=hsi_event,
    priority=0,
    topen=self.sim.date,
    tclose=None
)
Therefore, add in defaults such that:
priority=1
topen=self.sim.date
tclose=None
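A sketch of how those defaults might be wired up (the class and names here are a minimal stand-in for the real HealthSystem, not its actual implementation; note that `topen` needs a `None` sentinel so "today" is evaluated at call time, not at function-definition time):

```python
from datetime import date

class HealthSystemSketch:
    """Minimal stand-in illustrating default arguments for schedule_hsi_event."""
    def __init__(self, today):
        self.date = today
        self.queue = []

    def schedule_hsi_event(self, hsi_event, priority=0, topen=None, tclose=None):
        # None is the sentinel: the default 'topen' is whatever today's date
        # is when the method is called.
        if topen is None:
            topen = self.date
        self.queue.append((priority, topen, tclose, hsi_event))

hs = HealthSystemSketch(today=date(2010, 1, 1))
hs.schedule_hsi_event('dummy_event')
print(hs.queue[0])  # (0, datetime.date(2010, 1, 1), None, 'dummy_event')
```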
i.e. assert isinstance(date, Date) in schedule_event
When you declare a categorical-type variable, you should also have to specify categories=["b", "c", "d"], ordered=False or similar. The list of categories should probably be compulsory to specify; ordered could default to True.
If a user tries to assign a value not in the list, NaN is used instead. If ordered, the order given in the list is used as the sort order for the property.
See also https://pandas.pydata.org/pandas-docs/stable/categorical.html
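The pandas behaviour described above can be seen in a couple of lines (a value outside the declared category list becomes NaN at construction, and `ordered=True` fixes the sort order):

```python
import pandas as pd

# 'z' is not in the declared categories, so it silently becomes NaN:
s = pd.Series(pd.Categorical(['b', 'z'], categories=['b', 'c', 'd'], ordered=True))
print(s.tolist())      # ['b', nan]
print(s.cat.ordered)   # True: 'b' < 'c' < 'd' is the sort order for this property
```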
I think the following can now be added to the wiki:
Management
File locations (e.g. resource file)
Running of analyses
Use of the logger in the disease module
Use of flake8
Substantive
New definitions of skeleton.py which includes the healthsystem
Use of pytests
Use of assert statements to check things on the fly
And for the cookbook:
Syntax for establishing and using pytests.
Naming/file conventions are as described on the wiki at module design
Excel and pandas can do some strange things together with boolean and date fields.
Overall it would probably work as expected, and this is fairly low priority, but if we start using these types for disease modules then we should put in some error checking.
It may be useful to include age in the population.props dataframe rather than merging population.age into props in many functions. It would need to be constantly updated in the background, I think.
I am noticing that the tests in master for several of the disease modules fail when running at small population sizes with the logger in use (due to the outputting of inf in several cases).
Going forward all disease module authors should confirm that their module works at arbitrarily small sample sizes (this is included in the checklist for PR).
But, we need to go back to the disease modules so that a test of this format works:
def test_all_modules_running_at_small_population_size():
    # Get ready for temporary log-file
    f = tempfile.NamedTemporaryFile(dir='.')
    fh = logging.FileHandler(f.name)
    fr = logging.Formatter("%(levelname)s|%(name)s|%(message)s")
    fh.setFormatter(fr)
    logging.getLogger().addHandler(fh)

    # Establish the simulation object
    sim = Simulation(start_date=start_date)
    sim.seed_rngs(0)

    # Define the service availability
    service_availability = ['*']

    # Register the appropriate modules
    sim.register(demography.Demography(resourcefilepath=resourcefilepath))
    sim.register(contraception.Contraception(resourcefilepath=resourcefilepath))
    sim.register(lifestyle.Lifestyle())
    sim.register(healthsystem.HealthSystem(resourcefilepath=resourcefilepath,
                                           service_availability=service_availability))
    sim.register(healthburden.HealthBurden(resourcefilepath=resourcefilepath))
    sim.register(oesophageal_cancer.Oesophageal_Cancer(resourcefilepath=resourcefilepath))
    sim.register(depression.Depression(resourcefilepath=resourcefilepath))
    sim.register(epilepsy.Epilepsy(resourcefilepath=resourcefilepath))
    sim.register(hiv.hiv(resourcefilepath=resourcefilepath))
    sim.register(tb.tb(resourcefilepath=resourcefilepath))
    sim.register(male_circumcision.male_circumcision(resourcefilepath=resourcefilepath))

    # Run the simulation and flush the logger
    sim.make_initial_population(n=100)
    sim.simulate(end_date=end_date)
    check_dtypes(sim)

    # Read the results
    fh.flush()
    output = parse_log_file(f.name)
    f.close()

    # Do the checks:
    # correctly configured index (outputs on 31st December in each year of the
    # simulation for each age/sex group)
    dalys = output['tlo.methods.healthburden']['DALYS']
    age_index = sim.modules['Demography'].AGE_RANGE_CATEGORIES
    sex_index = ['M', 'F']
    year_index = list(range(start_date.year, end_date.year + 1))
    correct_multi_index = pd.MultiIndex.from_product([sex_index, age_index, year_index],
                                                     names=['sex', 'age_range', 'year'])
    dalys['year'] = pd.to_datetime(dalys['date']).dt.year
    assert (pd.to_datetime(dalys['date']).dt.month == 12).all()
    assert (pd.to_datetime(dalys['date']).dt.day == 31).all()
    output_multi_index = dalys.set_index(['sex', 'age_range', 'year']).index
    assert output_multi_index.equals(correct_multi_index)

    # check that there is a YLD for each module registered
    yld_colnames = [colname for colname in dalys.columns if 'YLD' in colname]
    module_names_in_output = set()
    for yld_colname in yld_colnames:
        module_names_in_output.add(yld_colname.split('_', 2)[1])
    assert module_names_in_output == {'Epilepsy', 'Depression', 'Oesophageal cancer'}
@ihawryluk and @jwr42 have experienced two reasons for the unit tests failing when running on Windows: a default integer type of int32 instead of int64, and the documented behaviour of NamedTemporaryFile:
tempfile.NamedTemporaryFile([mode='w+b'[, bufsize=-1[, suffix=''[, prefix='tmp'[, dir=None[, delete=True]]]]]])
... (it can be so used on Unix; it cannot on Windows NT or later) ...
The LinearModel is almost always used with an intercept of 1.0 and LinearModelType.MULTIPLICATIVE. It **may** be useful to make this the default so that this common case can be written more concisely.
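One way to provide the shortcut is an alternative constructor. The sketch below uses illustrative names, not the actual tlo LinearModel API:

```python
class LinearModelSketch:
    """Toy stand-in for LinearModel, to show the proposed concise constructor."""
    def __init__(self, model_type='multiplicative', intercept=1.0, *predictors):
        self.model_type = model_type
        self.intercept = intercept
        self.predictors = predictors

    @classmethod
    def multiplicative(cls, *predictors):
        """Concise form for the common case: intercept 1.0, multiplicative type."""
        return cls('multiplicative', 1.0, *predictors)

lm = LinearModelSketch.multiplicative()
print(lm.model_type, lm.intercept)  # multiplicative 1.0
```

A classmethod keeps the verbose general constructor available while making the common case a one-liner.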
Hi Asif,
Have just started working on a new computer and went through all the startup steps again, installing and pycharm configurations etc. I think I got there but it was a bit tricky. Do you think it would be possible to do a screencast of doing it, from a completely blank machine to make it easier?
Thanks very much
Tim
In the course of implementing changes to improve performance (see #63), @stefpiatek noticed that repeated runs of the analysis_hiv1 script do not yield the same final population dataframe at the end of the simulation. Further debugging shows the events are not running in the same order (never mind the different events).
The way TLOmodel is designed, setting the seed for the simulation (e.g. sim.set_seed(0)) and then using the rng supplied by the module should always reproduce the same run. That means either:
We're checking both, but please can modellers ensure all random calls use the RandomState object supplied by the module (i.e. self.rng inside the module or self.module.rng inside events).
1: not picking on this module, just a use case!
Can the code that reads the Excel file automatically "know" which properties of the person need to be interrogated in order to determine the appropriate probability of that event occurring, so that the influence of variables on the probability can be manipulated by editing the Excel sheet, without editing the code?
Add a short tutorial/guide to using LinearModel helper to the wiki. The tests have some detached examples, but probably not enough for real world use.
Initialises the following properties of individuals in the population using demographic data:
For instance, we record date_of_birth as an actual property, but many processes depend on age, which should be computed on the fly from DOB and the current date.
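Computing age on the fly is a small calculation. A scalar sketch (the function name is illustrative; the same logic can be vectorised over the population dataframe):

```python
from datetime import date

def age_in_years(date_of_birth: date, today: date) -> int:
    """Whole years of age, computed from date_of_birth and the current date."""
    # Has this year's birthday happened yet?
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

print(age_in_years(date(1980, 6, 15), date(2010, 1, 1)))  # 29
print(age_in_years(date(2005, 1, 1), date(2010, 1, 1)))   # 5
```

Comparing (month, day) tuples avoids the off-by-one errors that dividing a day count by 365.25 can introduce at exact birthdays.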
During development of disease modules, it would be useful to be able to dump entire population/affected individuals pre- and/or post-event. Part of a 'debug' mode that can be turned off.
When creating a tox configuration from running within PyCharm, the py36 pytest run fails.
This seems to be because the newest version of pytest (installed in the tox environment) is not compatible with the pytest_runner that PyCharm uses.
Error message given:
py36 runtests: commands[0] | /Users/stef/UCL/TLOmodel/.tox/py36/bin/python /Applications/PyCharm.app/Contents/helpers/pycharm/pytestrunner.py -p pytest_teamcity --cov --cov-report=term-missing -vv tests
Traceback (most recent call last):
File "/Applications/PyCharm.app/Contents/helpers/pycharm/pytestrunner.py", line 60, in <module>
main()
File "/Applications/PyCharm.app/Contents/helpers/pycharm/pytestrunner.py", line 34, in main
pluginmanager=_pluginmanager, args=args)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/hooks.py", line 289, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/manager.py", line 87, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/manager.py", line 81, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/callers.py", line 203, in _multicall
gen.send(outcome)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/_pytest/helpconfig.py", line 89, in pytest_cmdline_parse
config = outcome.get_result()
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/callers.py", line 80, in get_result
raise ex[1].with_traceback(ex[2])
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/_pytest/config/__init__.py", line 720, in pytest_cmdline_parse
self.parse(args)
File "/Users/stef/UCL/TLOmodel/.tox/py36/lib/python3.6/site-packages/_pytest/config/__init__.py", line 924, in parse
assert self.invocation_params.args == args
AssertionError
If the tox environment's pytest version is pinned to 5.0.1 it runs fine; it is broken in 5.1.0+. We might need to think about how we want to pin the version of pytest, or see if PyCharm is updated to fix this.
Probably AppVeyor
This is the operation that is done by merge with the reset_index() modifier.
It is used to assign a value (often a probability) to individuals in the model based on their properties (age, sex, marital status, for example) by looking up in a long-form data frame that has been imported to the module and is contained within Parameters{}.
As this is a tricky operation it would be good if a specialised helper function could do it. It would perhaps:
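A minimal sketch of such a helper (the name `lookup_probability` and the toy data are illustrative; the real table would come from the module's parameters):

```python
import pandas as pd

def lookup_probability(people: pd.DataFrame, table: pd.DataFrame,
                       keys: list, value_col: str = 'prob') -> pd.Series:
    """Look up a per-person value from a long-form parameter table,
    merging on 'keys' while preserving the person index."""
    merged = people.reset_index().merge(table, on=keys, how='left').set_index('index')
    return merged[value_col]

# Toy population (index = person_id) and long-form lookup table:
people = pd.DataFrame({'sex': ['M', 'F'], 'age_group': ['0-4', '5-9']}, index=[10, 11])
table = pd.DataFrame({'sex': ['M', 'F'],
                      'age_group': ['0-4', '5-9'],
                      'prob': [0.1, 0.2]})
print(lookup_probability(people, table, keys=['sex', 'age_group']))
```

Because `merge` discards the left index, the reset_index/set_index dance is exactly the tricky part the helper would hide; it could also assert that no person failed to match (no NaN in the result).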
Some errors are appearing on Windows machines due to a default setting returning int32 instead of int64.
demography lines 191 and 276 could be changed to:
df.loc[df.is_alive, 'age_days'] = age_in_days.dt.days
to resolve this temporarily
Whilst testing the Simulation saving/restoring (issue #86), I've noticed this problem.
Currently, logging is set up after the Simulation instance is created, i.e.:
sim = Simulation(start_date=...)
sim.set_seed(123)
sim.configure_logging(...)
However, this means the invocation of set_seed and configure_logging can change behaviour based on when they are called, e.g.: if disease modules are registered before sim.configure_logging(), logging from the __init__ of the disease objects is lost; if set_seed is called before methods are registered, disease modules don't have their RNG set properly.
Possible solutions:
sim = Simulation(start_date=...)
# sim.register() disease methods here
sim.configure_logging(...)
sim.set_seed(123)
This can still lead to non-reproducible behaviour because the seed is not recorded for simulation and disease method instances. If there is any randomness in their __init__, it can't be reproduced.
a) Add 'filename' and 'seed' options to the Simulation constructor, which set up the logging output file and the simulation's RNG right at the start.
b) Set the seed of each module's RNG in register. Although any randomness in __init__ will still be lost, and I can't think of a way around this. We need to enforce only having minimal setup in __init__ and any meat in read_parameters() (which is called immediately by Simulation on register).
Welcome comments!
It would be useful to have a summary of the number of deaths each year (or month) by cause of death in the demography module. Currently this is tracked separately in the HIV and TB modules, but summary outputs would be better.
There are two pages with set-up instructions: the README.rst file at the top-level directory, and the wiki page at https://github.com/UCL/TLOmodel/wiki/Installation. A user could end up following instructions which aren't necessarily the better ones for them, without realising the other instructions exist.
We need to add a link on both pages pointing to the other, with suitable text to put them in context, as they are for different audiences: the README instructions are sufficient for a software developer, but those on the wiki are for the epi modellers, who are not software developers.
The modellers are using PyCharm.
This is a parent issue to track other issues related to framework and/or model performance.
To explore:
df.at calls in a given on_birth()
Notes:
In our current workflow there are multiple places where we have to document the same thing. This invites error and discrepancies, and it would be better to agree on a standard way of documenting things and to use tools to pull together summary tables etc. for the write-up.
What I propose is a pattern that is used in #107
The declarations of PARAMETERS and PROPERTIES (both the name and the description) are considered the single source of truth. The description of each parameter should be a full description, such as would make sense outside the context of the code itself.
The proposed value of each parameter is provided in the resourcefile. This can be a .csv file. However, it will often be useful to have this in .xlsx file in order that each parameter value can be associated with:
The word document write-up only provides:
This would entail an update to the checklist [https://github.com/UCL/TLOmodel/wiki/Checklist-For-Developing-A-Disease-Module]
This then requires a helper function that creates tables that can be pasted into word documents:
Create a guidelines document that stores record of agreed best practices in the design of modules.
Tara's HIV model outputs summary statistics. The parameters of the module can be optimised by, for example, minimising the sum of squared errors of those statistics with respect to historical data.
Rather than having to implement the model in two places, explore wrapping the module in a function that can be passed to an optimiser.
(This might be related to: #98 (comment))
The order in which modules are registered with a simulation object matters;
Similarly, there is now a standard set of modules that are required for the most basic simulation, i.e. Demography, Contraception, Enhanced_Lifestyle, SymptomManager (and Labour, ... soon)
And, if the healthsystem is being used:
HealthSystem, HealthSeekingBehaviour and DxManager.
The list of essential modules is changing, and bugs are developing as analysis scripts fail to keep track. Errors that arise from a module not being registered are hard to track down because it might not be obvious what is missing (i.e. that births are not happening or that health seeking is not occurring).
It would be good if we could find a way to:
let the simulation module re-order the modules that are registered with it (before doing anything else), according to rules we set. This would allow an explicit logic to be relied upon without demanding any consistency in the 'analysis scripts' etc.
let the simulation issue a warning if one of those recognised 'fundamental' modules is not registered with it before the simulation is run.
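Both suggestions can be sketched together. The function, rule lists, and module names below are illustrative assumptions, not the framework's actual API:

```python
import warnings

def order_and_check_modules(registered, required, canonical_order):
    """Re-order registered module names per framework-owned rules and warn
    when a fundamental module is missing."""
    missing = [name for name in required if name not in registered]
    if missing:
        warnings.warn(f"fundamental modules not registered: {missing}")
    # Sort by canonical position; modules not in the canonical list keep their
    # relative order at the end (sorted() is stable).
    return sorted(registered,
                  key=lambda m: canonical_order.index(m)
                  if m in canonical_order else len(canonical_order))

ORDER = ['Demography', 'Contraception', 'Enhanced_Lifestyle',
         'SymptomManager', 'HealthSystem']
result = order_and_check_modules(['HealthSystem', 'Demography'],
                                 required=['Demography', 'Contraception'],
                                 canonical_order=ORDER)
print(result)  # ['Demography', 'HealthSystem'] (plus a warning about Contraception)
```

The simulation would apply this once, before running anything, so analysis scripts need not register modules in any particular order.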