spacetelescope / jwql

The James Webb Space Telescope Quicklook Application

License: BSD 3-Clause "New" or "Revised" License

Jupyter Notebook 3.94% Python 85.55% CSS 0.85% JavaScript 2.88% HTML 6.65% Shell 0.13%

python html css javascript django sphinx jenkins pytest jupyter-notebook conda

jwql's Issues

Create wiki page describing how to make API docs for code

Now that we have sphinx all set up, it would be useful to have instructions on how to add API docs for any future code that we write.

Build sphinx documentation

Now that we have some actual code with docstrings in our jwql package, we can start adding the tools to build official API documentation. I have done this before with sphinx and it has worked well, so unless anybody has a better suggestion I think we should use that.

The documentation that we build could then be hosted on Read the Docs.

Explore using django for building a web application

For the wfc3ql and acsql projects, we used Flask as the web framework for building web applications. However, I'm not sure the lightweight nature of Flask is going to cut it for jwql, since jwql will be scaled up by ~5 compared to wfc3ql and acsql.

For this reason, I would like to explore using django as an alternative to Flask. django appears to be an 'industry standard' for web frameworks in python. (There are even entire conferences dedicated to it, so people must use it!)

It looks like the django website has decent documentation and basic tutorials on how to use it. Perhaps this is a good starting point.

Build filename parser utility function

It would be useful to have a function in the utils.py module that returned the individual elements of a given filename, for example:

from jwql.utils.utils import parse_filename
filename_dict = parse_filename('jw94015001001_02102_00001_nrcb1_uncal.fits')

where filename_dict is:

{
    'program_id' : '94015',
    'observation' : '001',
    'visit' : '001',
    'visit_group' : '02',
    'parallel_seq_id' : '1',
    'activity' : '02',
    'exposure_id' : '00001',
    'detector' : 'nrcb1',
    'suffix' : 'uncal'
}
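
A minimal sketch of how such a function might look, assuming the field widths can be inferred from the single example filename above. The regular expression and the error handling are illustrative assumptions, not an official description of the JWST naming scheme:

```python
import os
import re

# Field widths are inferred from the one example filename in this issue;
# they are an assumption, not an official naming specification.
FILENAME_PATTERN = re.compile(
    r"^jw"
    r"(?P<program_id>\d{5})"
    r"(?P<observation>\d{3})"
    r"(?P<visit>\d{3})"
    r"_(?P<visit_group>\d{2})"
    r"(?P<parallel_seq_id>\d)"
    r"(?P<activity>[0-9A-Za-z]{2})"
    r"_(?P<exposure_id>\d{5})"
    r"_(?P<detector>[0-9A-Za-z]+)"
    r"_(?P<suffix>[0-9A-Za-z]+)"
    r"\.fits$"
)


def parse_filename(filename):
    """Return the individual elements of a JWST-style filename as a dict."""
    match = FILENAME_PATTERN.match(os.path.basename(filename))
    if match is None:
        raise ValueError("Unrecognized filename: {}".format(filename))
    return match.groupdict()
```

Taking the basename first means the function accepts either a bare filename or a full path.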

Ask DMS/archives how proprietary data will be accessed and authenticated

Now that we have support from the JWST mission office to access JWST proprietary data from the same location as the DMS/archive teams will use, we should figure out the implementation details of how these data are accessed and how the access is authenticated.

Build ginga plugin for web application

It would be awesome if we could interact with JWST FITS files via ginga straight through the web application. Through some quick research, @gkanarek pointed out that this should be possible.

Build filesystem monitor

We should build a script that will monitor and gather statistics about the filesystem. We should be able to easily answer questions like:

How many files are there in total?
How much disk space is used by all of the files?
How many <some_filetype> files are there?
Other questions I can't think of right now.

I envision the script will create a series of bokeh or matplotlib plots that show these statistics over time. These plots can eventually be hosted on our web application. Currently our filesystem is just some static test filesystem, so the plots will be quite boring for now. But this will become more useful after launch.

This script will become one of the monitors that get run by cron once a day on our virtual machine. Output products should be directed to a specific directory (to be determined).
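
The statistics-gathering part could be sketched roughly as follows. The function name `gather_statistics` and the returned dictionary keys are hypothetical, and "filetype" is approximated here by the file extension, which may not be the grouping the monitor eventually uses. Plotting and output locations are left out:

```python
import os
from collections import Counter


def gather_statistics(top):
    """Walk ``top`` and return the file count, total size, and counts per extension."""
    n_files = 0
    total_bytes = 0
    by_extension = Counter()
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total_bytes += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; leave it out of the totals
                continue
            n_files += 1
            by_extension[os.path.splitext(name)[1].lower()] += 1
    return {
        "n_files": n_files,
        "total_bytes": total_bytes,
        "by_extension": dict(by_extension),
    }
```

Running this once a day and appending the result to a file with a timestamp would give the time series that the plots need.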

Draft a site map for the web application

A good starting point for designing the structure of the web application is to build a site map. We can build a draft site map for now and iterate on it until we are happy.

Make Wiki page describing why and how to use the Logging function

Ask ITSD to build a test and production server

Vera Gibbs suggests that we request a test server and environment sooner rather than later, as it helps ITSD plan their work better. The test environment should mimic the production environment, so we should do some thinking on what we would like our production environment to be first.

Build database monitor

Similar to #47, we should also build a monitoring script that gathers information about the MAST database. Some things that would be good to know:

Number of files in the database
Number of files in the database for a given instrument and/or observing mode
Number of header keywords stored in database

Output should be bokeh or matplotlib plots that track these things over time.

I'm working on securing central storage disk space for the jwql project. Eventually outputs should be stored there.

Determine which skipped keywords are important for jwql

For various reasons, the MAST API/database purposefully skips the storage of certain JWST header keywords (see attached file). However, Kim DuPrie and Lisa Gardner suggested that skipped keywords could be added to the database/API if there is a use case. We should identify the keywords that we think will be important for our application and ask that they add these.

skipped_jwst_keyword.txt

Create README for contributing to this project

It may be helpful to others if there were some instructions on how to get started on contributing code to this project (i.e. cloning the repo, submitting pull requests, etc.)

Update README with new environment installation

With #30, there will be some extra steps to install the jwql-dev environment, namely:

conda update conda
source activate root
conda env create -f environment.yml

We should update the README to reflect this.

Incorporate OWASP Top 10 into Workflow

https://www.owasp.org/images/7/72/OWASP_Top_10-2017_%28en%29.pdf.pdf

Add sphinx docs for db_monitor

With the merge of #52, we should add the db_monitor API docs to the sphinx build.

Investigate JWST Data Analysis Tools

Reproducing here from the Slack channel:

"During the NIRSpec team meeting on Tuesday, we had a demo of three of the tools the JWST Data Analysis Development Team have been working on - specviz, mosviz, and cubeviz. One of my team members asked whether JWQL would be able to use the tools for QL, but I said no, since they’re stand-alone tools and wouldn’t work in a browser. However, I did say that we might be able to learn from them in terms of choices they’ve made & issues they’ve faced when visualizing various JWST data products. We could also think about whether we want to try and be consistent with those tools in terms of theme/aesthetic/whatever."

So, once we move towards implementing the JWQL visualizations, we should make sure to investigate these tools, and see what we can adapt.

Make dev conda environment more general

We should make our dev conda environment more generalized so that it can be used on the new test server.

Build script that will generate preview images for all files in filesystem

Now that we have preview_image.py, we can build a wrapper around this module to create preview images for all files in the filesystem. The preview images should be stored under the jwql project directory on central storage in some sort of organized filesystem.

Determine if a public and/or proprietary filesystem will be available through MAST

Ideally, we would like to avoid having to maintain a separate copy of JWST data to support the jwql application. Through informal conversations with @tomdonaldson, I've been told that we will have access to a centrally-located filesystem of both public and proprietary JWST data, akin to the MAST public cache. We should confirm whether this is actually true, and if so, what the organization of this filesystem will look like.

Test the MAST API for JWQL needs

Now that we have been pointed to the MAST API, we should test it out with the needs of the JWQL application in mind and gather a list of any capabilities that we would like but that don't exist in the API.

Ask OPO if they have ST-specific css themes

@gkanarek pointed out that at least one other web application at ST was using some css that has the look and feel of the STScI branding. I should ask Chad Smith if OPO already has existing material for this.

Overview of instrument-specific calibration and monitoring software

We can use this thread to discuss instrument-specific calibration and monitoring software and identify what work needs to be done. A good first step would be to consolidate the tables provided in the Phase A proposal, which I will do.

Write tests for generate_preview_image

One test that we could write is to pick a program ID, make sure a preview_image and thumbnail directory exists, and make sure all expected preview images and thumbnails exist.

Acquire some JWST FITS files to develop with

The infrastructure of the jwql system will be heavily dependent on the FITS file structure of JWST data, specifically their extensions, header contents, and available filetypes. It would be helpful if we can get our hands on some files that are at least anticipated to be close to the actual data products to come out of the telescope after launch.

Create style guide for jwql software development

To ensure best software development practices and principles, we should make a style guide for software development on this project. We can then check pull requests against this style guide to ensure all collaborators are coding to standard.

Create an interface to the jwqldb database

Akin to database_interface.py for acsql, we need to build a module that will serve as an interface to the jwqldb database. This module will hold the classes and functions for creating the tables and connecting to the database.

Create module that generates a preview image for a given JWST file

Estimate the size of the database

It would be useful for our work in creating the database schema (issue #6) if we had a rough idea of how big we think the database will eventually be. It may tell us if we need to be concerned about disk space and/or memory issues, which could dictate how we structure the schema.

Make elevator line/pitch for JWQL

Task list

This issue serves as a place to list and discuss the various tasks that are to be completed for the project. Each task will eventually become its own issue to allow for further discussion (to be created by @bourque in the near future).

Create text files that list each header keyword and its data type for each instrument/detector/filetype combination (@bhilbert4)
Build module(s) for creating and connecting to the database (i.e. database_interface.py) (@Ogaz, @bhilbert4, @bourque)
Build module(s) that logs the execution of a script (i.e. logging.py) (@cmartlinSTScI )
Build module(s) that inserts records into the database (i.e. update_database.py) (@Johannes-Sahlmann )
Figure out how to connect Jenkins CI to repository (@hover2pi, @SaraOgaz )
Start building a web app with django (@laurenmarietta )

I've tentatively assigned individuals to tasks, but those can definitely change based on people's interests and availability. Please feel free to comment below any thoughts/feelings you have.

Build module that logs the execution of a script

One feature that has been particularly handy for both wfc3ql and acsql has been decorator functions that log the execution of a script. We should build something similar for jwql. Perhaps the logging_functions module in wfc3ql is a good reference/starting point.
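
As a rough illustration of the idea, a decorator along these lines would log the start, end, duration, and any exception of the function it wraps. The name `log_execution` is hypothetical and this is not the wfc3ql implementation. Configuring handlers and log file locations is left to the calling script:

```python
import functools
import logging
import time


def log_execution(func):
    """Log the start, end, elapsed time, and any exception of ``func``."""
    logger = logging.getLogger(func.__module__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("Starting %s", func.__name__)
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # Record the traceback, then let the caller see the error
            logger.exception("%s failed", func.__name__)
            raise
        elapsed = time.perf_counter() - start
        logger.info("Finished %s in %.2f s", func.__name__, elapsed)
        return result

    return wrapper
```

A script would then only need to put `@log_execution` above its main function to get consistent log entries.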

github workflow

Here are the github workflow recommendations that DATB/SCSB will be using for our repos. I like this workflow a lot, but if anyone has any different preferences we can always make adjustments for this project:

https://confluence.stsci.edu/display/DATB/Git+Development+and+Release+Workflows

Figure out what to do for config unit test

Marked as xfail for now so jenkins runs, but we should decide if we want to keep that test or not.

Decide on Continuous Integration and testing solutions

There are two main options used at the institute for continuous integration: Travis and Jenkins. IT has written up a policy about this, although it is still a draft: https://confluence.stsci.edu/pages/viewpage.action?pageId=99327010

I'm not sure of the specifics on writing tests that will need to interact with a database. This may determine which CI we will use. AFAIK, as relates to our project, Jenkins's strength is the short wait time to run the test suite. We have private servers for this, whereas with Travis sometimes there can be a wait time of a few hours. Travis's strength is that anyone contributing a PR can start/stop a Travis run. For Jenkins, if a test suite run needs to be stopped, restarted, etc., that has to be done by someone internal with the correct permissions.

We may want to split unit testing out into a separate issue later on.

Make wiki page describing how to use the config file

Build tests for preview_image.py

We should probably have some tests for preview_image.py, like we do for permissions.py.

One obvious test would be to use a test file as input and check to see that the code successfully generates a preview image file.

Assigning to @Johannes-Sahlmann

Build module that sets the appropriate permissions for a given file

When the jwql project eventually has a filesystem containing preview images, proprietary files (still uncertain), and output products generated by automated calibration and monitoring scripts, we will need to make sure these files have appropriate unix permissions so as not to let anyone outside of the jwql "group" see them.

As such, it would be convenient for scripts to be able to import a module that takes care of this without having to worry what permissions to set things to. So we should build this module.

The module should take as input a path to a file, check to see if the owner of the file is the jwql admin account, and if so, (1) set the permissions appropriately, and (2) set the group membership appropriately.

This module should also come with nosetest(s) to test if the function(s) within the module are working properly.
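
One possible shape for the core function, as a sketch only. The default mode (owner read/write, group read), the parameter names, and the choice to identify the admin account by numeric uid are all assumptions, since the actual permissions and accounts have not been decided. It relies on Unix-only calls:

```python
import os
import stat

# Assumed default: owner read/write, group read, no access for others (0o640)
DEFAULT_MODE = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP


def set_permissions(path, mode=DEFAULT_MODE, owner_uid=None, group_gid=None):
    """Set mode and group on ``path`` if it is owned by the expected account.

    ``owner_uid`` stands in for the jwql admin account and defaults to the
    uid of the current process. Returns True if the file was updated.
    """
    if owner_uid is None:
        owner_uid = os.getuid()
    if os.stat(path).st_uid != owner_uid:
        # Not owned by the admin account; leave the file untouched
        return False
    os.chmod(path, mode)
    if group_gid is not None:
        # -1 leaves the owner unchanged and only updates the group
        os.chown(path, -1, group_gid)
    return True
```

Returning a boolean rather than raising lets calling scripts loop over many files without special-casing the ones they do not own.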

Make web style guide for the JWQL web app

We should decide on a theme/brand for our web application and document it in a 'web style guide'. Something like this example:

Build in file download capability through web application

One feature of our web application should be the ability to download a JWST FITS file through the web app to the user's machine. I envision that, when looking at a webpage that displays a preview image of a particular observation, there are buttons to download each available filetype for that observation.

Make updated schedule for project

Now that we have passed the proposal stage and are actively developing on this project, it is a good time to refine our initial schedule to more accurately reflect the work we are actually doing and come up with more accurate deadlines.

Get access to/test the engineering database

There exists a JWST 'engineering database'. We should see if we can use this and if it could be helpful for our application.

Enable "advanced" preview images

During the meeting with the NIRCam team, they expressed interest in having preview images that include a mosaic with all detectors that were observed during a particular observation. Perhaps this could be an option for a preview image to view on the web application.

documentation - readthedocs?

I strongly suggest we use readthedocs to autogenerate our documentation (with the help of sphinx). We even have a STScI readthedocs style sheet (ex: http://stak-notebooks.readthedocs.io/en/latest/). Not shown in this example is the inclusion of the sphinx-generated API documentation pulled from docstrings in the code.

Determine which database technology to use

According to Vera Gibbs, ITSD supports MySQL, PostgreSQL, and MSS. We need to decide which of these to use for the jwql database.

Build database table(s) for tracking anomalies

One feature of our web application will be to tag images as having specific anomalies (e.g. satellite trail, various detector artifacts, etc.). Though we don't know which anomalies the instrument teams may want to track (nor will we probably know for sure until operations), we should build the framework for a database that stores anomaly information.

We already have a postgresql database built for us, so we can use that one for now.

Request development web server

Now that we have a prototype for a web application, it would be useful to test its efficiency on an internal server.

Add sphinx docs for filesystem monitor

Now that the filesystem monitor is merged in (#69), we should add the sphinx documentation.

Develop Database Schema

We need to develop a schema for the jwql database. I think a decent starting point is something like the schema I used for ACS Quicklook:

In this schema, we have a master table that keeps track of each rootname that is in the database and when it was ingested. The datasets table keeps track of which filetypes exist for a given rootname. Then there is a table for each detector/extension/filetype combination which is basically a dump of the headers (columns are header keys and values are header values).

To construct this for jwql, we will need to know the following for each instrument:

What are all of the possible filetypes and what purpose do they serve?
What is the data structure for each filetype (i.e. number of extensions, what purpose each extension serves, what datatype each one is)?
What are the header keywords for each filetype/extension combination?
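
To make the three-level layout described above concrete, here is a small sketch using an in-memory SQLite database. The table and column names, the `nrcb1_uncal_0` header table, and the two header columns are placeholders chosen for illustration. The real columns depend on the answers to the questions above, and the production database technology may differ:

```python
import sqlite3

# All names below are placeholders; the real header tables would get one
# column per header keyword for each detector/extension/filetype combination.
SCHEMA = """
CREATE TABLE master (
    id INTEGER PRIMARY KEY,
    rootname TEXT NOT NULL UNIQUE,
    ingest_date TEXT NOT NULL
);
CREATE TABLE datasets (
    rootname TEXT PRIMARY KEY REFERENCES master (rootname),
    uncal INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE nrcb1_uncal_0 (
    rootname TEXT PRIMARY KEY REFERENCES master (rootname),
    "INSTRUME" TEXT,
    "EXPTIME" REAL
);
"""


def create_schema(connection):
    """Create the master, datasets, and example header tables."""
    connection.executescript(SCHEMA)


connection = sqlite3.connect(":memory:")
create_schema(connection)
```

Each header table is keyed on the rootname, so the master table remains the single place that records what has been ingested and when.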

Generate text files that contain header keywords and their data type

In order to implement the database schema, we will need lists of each header keyword for each instrument/detector/filetype combination, akin to the acsql examples. These lists will eventually be used by database_interface to create the header tables of the database.

The code that generates these for the acsql example is here, perhaps it can be adapted for jwql use.

It would be ideal if we could generate these lists programmatically, with very little or no hard coding/file creation/file editing involved (since there sure are a lot of detectors and filetypes!).
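
A sketch of the programmatic approach, assuming each header can be treated as a mapping of keyword to value (a plain dict is used here so the example needs no third-party packages). The function name and the type labels "Boolean", "Integer", "Float", and "String" are hypothetical:

```python
def keyword_types(header):
    """Return sorted ``(keyword, type_name)`` pairs for a header mapping."""
    pairs = []
    for keyword, value in header.items():
        if not keyword:
            # Skip blank keywords, which carry no column information
            continue
        # bool is checked before int because bool is a subclass of int
        if isinstance(value, bool):
            type_name = "Boolean"
        elif isinstance(value, int):
            type_name = "Integer"
        elif isinstance(value, float):
            type_name = "Float"
        else:
            type_name = "String"
        pairs.append((keyword, type_name))
    return sorted(set(pairs))


def write_keyword_file(pairs, path):
    """Write one ``keyword, type`` line per pair to ``path``."""
    with open(path, "w") as f:
        for keyword, type_name in pairs:
            f.write("{}, {}\n".format(keyword, type_name))
```

Looping this over one representative file per instrument/detector/filetype combination would produce the lists without hand editing.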
