
snews_publishing_tools's People

Contributors

joesmolsky, karamelih, storreslara, sybenzvi


Forkers

dallaval5u

snews_publishing_tools's Issues

Refactor required for hop v0.5.0

Thanks to feedback from Ricardo, I noticed that there is a new hop release with some important changes.

As documented in the most recent release, v0.5.0, there are a few things in our scripts that need to be adapted. The most important is that the flag for the Stream instance has changed from --persist to --until-eos.

This requires us to make changes in the subscription classes. Alternatively, for the current version, we can pin hop to version 0.4.0 in the requirements and adapt to the changes in a later release.
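A minimal sketch of what the updated subscription code might look like under hop >= 0.5.0, assuming the keyword until_eos=False corresponds to the old persist=True behaviour (keep listening instead of stopping at end of stream):

from hop import Stream

# Assumption: until_eos=False replaces the old persist=True (listen indefinitely).
stream = Stream(until_eos=False)
with stream.open("kafka://kafka.scimma.org/snews.experiments-firedrill", "r") as s:
    for message in s:
        print(message)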

no-auth option

For publishing we have an auth=True option.

Should we check whether the user has hop credentials (via hop auth locate; we can even grab the username if the authentication file exists)? If they do not, we could raise a warning first and default to auth=False in the publisher. Or do we only want people with credentials to test the tools (i.e., they know they don't have credentials and manually set auth=False)?

For subscription, we do not have an auth option at all. Maybe at least there we should check this and tell the user: "you don't have permission to subscribe, go check your hop credentials".
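A minimal sketch of such a check, assuming hop.auth.load_auth() reads the default credential file and raises if it is missing or unreadable (the exact exception type is an assumption, so it is caught broadly here):

from hop import auth

def has_hop_credentials():
    """Return True if a usable hop credential file is found."""
    try:
        # Assumption: load_auth() raises if no credential file is configured.
        auth.load_auth()
        return True
    except Exception:
        return False

# Publisher: warn and fall back to auth=False when this returns False.
# Subscriber: refuse to subscribe and point the user to their hop credentials.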

improve object encapsulation in `snews_pt`

While making a minor update to the snews_pub module I noticed the encapsulation of the main data class SNEWSTiersPublisher is not enforced well. There are several utility classes in snews_pt_utils and snews_format_checker that must know all about the internal structure of SNEWSTiersPublisher. For example, the format checker includes lengthy conditional statements that depend on the published message type, which strongly couples the submodules and breaks the usual OOP design patterns. I foresee this becoming a potential maintenance headache.

This can be fixed with some refactoring and potentially the use of an inheritance hierarchy for different message types, where each type knows how to internally check its own message format for consistency. I have some thoughts, but we should discuss whether this is a good use of limited time. A rough sketch follows below.
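A minimal, hypothetical sketch of the idea (class and field names are illustrative, not the current snews_pt API): each message type validates itself, so snews_format_checker no longer needs long type-dependent conditionals.

from abc import ABC, abstractmethod

class SNEWSMessage(ABC):
    """Base class holding the raw message fields."""
    def __init__(self, **fields):
        self.fields = fields

    @abstractmethod
    def is_valid(self) -> bool:
        """Each subclass knows how to check its own format."""

class SignificanceTierMessage(SNEWSMessage):
    def is_valid(self):
        p_values = self.fields.get("p_values", [])
        return all(isinstance(p, (int, float)) and 0 <= p <= 1 for p in p_values)

class CoincidenceTierMessage(SNEWSMessage):
    def is_valid(self):
        return "neutrino_time" in self.fields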

Store hop authentication as repository secret?

Publishing and subscribing to SNEWS topics is part of the unit testing. Hop authentication is required for this.

To add unit tests to a GitHub workflow, the authentication information could be stored as a repository secret:
https://docs.github.com/en/actions/security-guides/encrypted-secrets#creating-encrypted-secrets-for-a-repository

The authentication could be added each time if the username and password were secrets. Another option may be to make the whole auth.toml file a secret. Admin access is required for either option.
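A minimal sketch of the second option, assuming the full auth.toml contents are stored in a single repository secret and exposed to the workflow as an environment variable (HOP_AUTH_TOML is an illustrative name, not an existing secret):

import os
from pathlib import Path

# Recreate auth.toml from the secret before running the publish/subscribe tests.
auth_contents = os.environ["HOP_AUTH_TOML"]  # illustrative secret/variable name
Path("auth.toml").write_text(auth_contents)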

Observation messages time resolution

I cannot fully recall the consensus on the time resolution. Currently, we are using "%H:%M:%S:%f", which gives up to 6 digits of fractional seconds, i.e. microsecond resolution. There were discussions about using 64 bits, or two time strings where the second one gives the nanoseconds after the second.

Let's keep this issue open to track it.
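For reference, a quick illustration of what the current format actually resolves (standard datetime behaviour, nothing snews_pt-specific):

from datetime import datetime

t = datetime(2022, 3, 11, 12, 34, 56, 789012)
print(t.strftime("%H:%M:%S:%f"))  # -> 12:34:56:789012, i.e. microsecond resolution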

yield message

Here

we can yield the alert message, so it can be picked up by other scripts. yield allows returning a value without terminating the command.

We can read it here

and allow passing the output to other scripts, i.e. it would be something like:

import os

import click

from snews_pt.snews_sub import Subscriber  # assuming Subscriber lives in snews_sub


@main.command()
@click.option('--plugin', '-p', type=str, default="None")
@click.pass_context
def subscribe(ctx, plugin):
    """ Subscribe to Alert topic
    """
    sub = Subscriber(ctx.obj['env'])
    try:
        # the yielded alert message can be handed to a user-supplied plugin script
        alert_message = sub.subscribe()
        if plugin != "None":
            os.system(f"python {plugin} {alert_message}")
    except KeyboardInterrupt:
        pass

This way, any plugin script can be given, and the alert content is redirected to it.

Dumped Alert Message Content

Following Riccardo's feedback:
the alert content is not accurately saved in the dumped JSON; the date fields seem to be mixed up.
Also, we are currently not dumping the sublist number, which should be added as well.

Dependencies for SNEWS_PT at IceCube

IceCube is in the process of upgrading the machines where SN alerts are handled. At the time of writing, the following packages are in conflict with SNEWS_PT's requirements. The highest package versions currently available on that machine are also listed (in short, Python 3.6-compatible versions).

Package     SNEWS_PT Requires    Available Version
click       ~8.1.2               8.0.4
ipython     ~7.32.0              7.16.2
pandas      ~1.4.2               1.1.5
setuptools  ~62.1.0              57.0.0
inquirer    2.9.1                2.8.0

This list is subject to change, so I will update this issue accordingly.

documentation & unit tests & deployment

We need to write documentation for all the functions & classes, and also a README for the repo, preferably with a detailed tutorial. We also need to write a readthedocs configuration; this can be copied from the SN_alert_app repo and modified.

Simple unit tests, and slightly fancier connection tests (e.g., is the server alive?), need to be written; a rough sketch of the latter is below.

We need to deploy somewhere like PyPI.
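A minimal sketch of a "server alive" connection test, assuming the broker is kafka.scimma.org and reachable on the default Kafka port 9092 (both the hostname and the port are assumptions here, not something snews_pt currently exposes):

import socket

def test_broker_is_reachable():
    # Fails the test with an OSError if the broker cannot be reached within 10 s.
    with socket.create_connection(("kafka.scimma.org", 9092), timeout=10):
        pass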

get rid of internal 'detector' variables

With commit 7ee0e30, the detector name can now be fetched from the environment. Thus, we can clean up the code a bit and ask users to put their detector name in the environment file once they have installed the package.

In principle, we can also prompt the user for this if the existing detector name is 'TEST'.
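A minimal sketch of that prompt, assuming the detector name lives in an environment variable (DETECTOR_NAME is an illustrative name) and that snews_pt_utils.set_name() asks the user and stores the result:

import os
from snews_pt import snews_pt_utils

# DETECTOR_NAME is an assumed variable name; adjust to whatever the env file defines.
if os.getenv("DETECTOR_NAME", "TEST") == "TEST":
    snews_pt_utils.set_name()  # prompt once and persist the user's detector name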

Test-connection echo

When I run snews_pt test-connection I get the following echo:

/Users/joe/src/gitjoe/SNEWS_Publishing_Tools/snews_pt/__init__.py:23: UserWarning: You are using default detector name "TEST"
Please change this by snews_pt.snews_pt_utils.set_name()
  warnings.warn(warning_text, UserWarning)
Testing your connection to kafka://kafka.scimma.org/snews.experiments-firedrill.
 Should take 4-5 seconds...

The test takes about 4-5 seconds, but I do not receive an echo indicating the success or failure of the test.

Alert Subscription is missing

We plan to make this repo a front end for the user. While a simple interface for publication is ready, we are still missing the alert subscription.

flesh out the SNEWS 2.0 alert system technical paper

Fill in the missing plots and text in the technical paper draft being prepared for submission to JINST.

Current working title is: The SNEWS 2.0 Alert System for the Coincident Detection of Neutrinos from Core-Collapse Supernovae.

@habig can give write access if needed.

What to test

Our garbage checkers are checking a few things, but some are still missing;

  • Should it require 'neutrino_time' if the message is meant for SigTier and there are 'neutrino_times'?
  • Check the p_values content.
    Right now it accepts [0.5, "bla", 0.45].
  • Check the t_bin_width type; it is not checked yet (a sketch of both missing checks follows this list).
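A minimal sketch of the two missing checks (function names are illustrative; the [0, 1] range for p-values and the positivity of t_bin_width are assumptions):

def p_values_are_valid(p_values):
    """Reject entries like [0.5, "bla", 0.45]: every p-value must be a number in [0, 1]."""
    return all(isinstance(p, (int, float)) and 0.0 <= p <= 1.0 for p in p_values)

def t_bin_width_is_valid(t_bin_width):
    """t_bin_width should be a positive number (its type is currently not checked)."""
    return isinstance(t_bin_width, (int, float)) and t_bin_width > 0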

review project documentation

Review snews_pt project documentation, after @KaraMelih's suggestion in the LNGS hackathon coordination doc. Preliminary checklist:

  • Check top-level READMEs to ensure consistent instructions.
  • Check readthedocs to make sure installation and usage examples are consistent and useful.
  • Ensure notebooks and example scripts are properly documented.
  • Revisit docstrings and make sure they are consistent with module contents.

publishing messages

I just pushed an updated version of the CLI. I'd like to raise two issues.

First: right now the publish method does not allow extra arguments.
While this might be the desired behaviour, I think it should not fail to publish; rather,

  • either mark the extra columns and publish, or
  • split these columns, publish the fixed template part, and report the extra fields back to the user (see the sketch after this list).
    Manipulations in the publish class can be made; see here.
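A minimal sketch of the second option (TEMPLATE_KEYS and the field names are illustrative, not the actual snews_pt template):

TEMPLATE_KEYS = {"detector_name", "neutrino_time", "p_value"}  # illustrative subset

def split_extra_fields(message, template_keys=TEMPLATE_KEYS):
    """Split a message into the fixed template part and any extra fields."""
    template = {k: v for k, v in message.items() if k in template_keys}
    extra = {k: v for k, v in message.items() if k not in template_keys}
    return template, extra

msg = {"detector_name": "XENONnT", "neutrino_time": "12:34:56:789012", "my_note": 42}
template, extra = split_extra_fields(msg)
# publish `template`; report `extra` back to the user instead of failing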

Second, I think snews_pt is meant for users to interact with the system without having to deal with any backend tools. The publishing tools are already nice and usable; however, we should also provide the means to subscribe to the alert channel, and I think this subscription should also allow downloading the received alerts in JSON or some other format. I don't think users will monitor the terminal all the time.

I would also like to suggest starting to use Python config files for the broker, channel, token, etc. (basically replacing the env file). This way, later on, we can either ask users to put their experiment once in the config file or try to fetch it from hop.auth. I can work on that if agreed.

Detectors with gold alerts

Where should detectors with gold alerts publish messages?

As an example, if KamLAND sees two events within 10 seconds, there should be a message sent to the alert topic. Right now, a single detector cannot trigger an alert.

Should detectors publish directly to the alert topic or do we want to change SNEWS_PT to allow detectors to send two neutrino times and trigger an alert?

Problem with Kafka

I am getting this Kafka error when trying to subscribe with snews_pt or run the hop connection with snewpdag:

Traceback (most recent call last):
  File "/home/marta/miniconda3/envs/snewpy_env/bin/snews_pt", line 8, in <module>
    sys.exit(main())
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/snews_pt/main.py", line 79, in subscribe
    sub.subscribe()
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/snews_pt/snews_sub.py", line 93, in subscribe
    with stream.open(self.alert_topic, "r") as s:
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/hop/io.py", line 119, in open
    return Consumer(
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/hop/io.py", line 299, in __init__
    self._consumer.subscribe(topics)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/adc/consumer.py", line 41, in subscribe
    topic_meta = self.describe_topic(topic, timeout)
  File "/home/marta/miniconda3/envs/snewpy_env/lib/python3.8/site-packages/adc/consumer.py", line 63, in describe_topic
    cluster_meta = self._consumer.list_topics(timeout=timeout.total_seconds())
cimpl.KafkaException: KafkaError{code=_TRANSPORT,val=-195,str="Failed to get metadata: Local: Broker transport failure"}

If anyone has a hint on this issue and can help, I would appreciate it.

allow for interacting with the alert message

The current version saves the alerts to a JSON file with the format

{current_date: data_dictionary}

and appends to it when more alerts are received.

We can provide a tool to fetch this and read it in again as a dictionary, so that the user can access the fields and data (a sketch is below).
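A minimal sketch of such a tool (the function name and the example file name are illustrative):

import json

def load_alerts(path):
    """Read a dumped alert file back in as a dictionary keyed by date."""
    with open(path) as f:
        return json.load(f)

alerts = load_alerts("0_2022-03-11_ALERT.json")  # file naming follows the issue on storing alerts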

fake obs rate or local storage option

To give some feedback to the user, we can periodically calculate their false observation rates and report back every week/month.

And/or we can allow a store_locally=True option when publishing their observations, letting users keep their own messages on their local machines.

Suggested interface update

As a user I would like something like this:

from SNEWS_PT import Publisher, ObservationMessage

# Using a context manager makes sure the connection is closed if an exception happens.
with Publisher(address=kafka_address) as pub:
    msg = ObservationMessage(detector=my_detector,
                             neutrino_time=my_neutrino_time,
                             pvalue=my_pvalue)
    # everything else is filled automatically; maybe even detector_id can be filled from an env file
    pub.publish(msg)  # this will set msg.send_time
    # oops, that was not an observation, retract
    pub.retract(msg)  # this can actually publish a RetractMessage under the hood, using msg.send_time to identify the message to retract

properly storing the alerts

In the latest modification, I added an option to specify where to dump the received alerts in the subscribe method.
By default it is None, and in that case it is fetched from test_config.env here.

Then, there is a helper function here which builds a name from the current date, e.g. '0_2022-03-11_ALERT.json'. The leading zero is the index, i.e. it is the first alert message on 2022-03-11, and my idea was to increase this index if there are more messages within the same day. The same name is also passed to the CLI's --plugin option, so we call any given custom script with that name (e.g. on the 1st of April we receive an alert, and you have a follow-up script named 'customscript.py'; as soon as the alert comes, we save the JSON and call python customscript.py 0_2022-04-01_ALERT.json, so the script always finds the correct alert and does its work).

Now there are two open issues:
1 - The counter is not yet properly implemented (I could use help; see the sketch below).
2 - We should decide on how to handle the UPDATE alerts.
- We can make a new JSON each time, e.g. '0_<date>_ALERT-UPDATED.json',
- or we can overwrite the same JSON.

For the updates, I'd go for the former, as the files do not take up much space, and people might want to go back and say "oh, at first there was this message and I did that, and then this one with the updates caused this", etc.
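For point 1, a minimal sketch of the counter, assuming all of a given day's alerts are written as '<index>_<YYYY-MM-DD>_ALERT.json' into a single output directory:

import os
from datetime import date
from glob import glob

def next_alert_filename(output_dir="."):
    """Return '<index>_<YYYY-MM-DD>_ALERT.json', counting today's existing alert files."""
    today = date.today().isoformat()
    existing = glob(os.path.join(output_dir, f"*_{today}_ALERT.json"))
    return f"{len(existing)}_{today}_ALERT.json"

Note that this restarts the count if old files are moved away; a persistent counter file would be the alternative.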
