spacetelescope / jwql
The James Webb Space Telescope Quicklook Application
License: BSD 3-Clause "New" or "Revised" License
Now that we have sphinx all set up, it would be useful to have instructions on how to add API docs for any future code that we write.
Now that we have some actual code with docstrings in our jwql package, we can start adding the tools to build official API documentation. I have done this before with sphinx and it has worked well, so unless anybody has a better suggestion I think we should use that.
The documentation that we build could then be hosted on Read the Docs.
For the wfc3ql and acsql projects, we used Flask as the web framework for building web applications; however, I'm not sure the lightweight nature of Flask is going to cut it for jwql, since jwql will be scaled up by a factor of ~5 compared to wfc3ql and acsql.
For this reason, I would like to explore using django as an alternative to Flask. django appears to be an 'industry standard' for web frameworks in Python. (There are even entire conferences dedicated to it, so people must use it!)
It looks like the django website has decent documentation and basic tutorials on how to use it. Perhaps this is a good starting point.
It would be useful to have a function in the utils.py module that returns the individual elements of a given filename, for example:
from jwql.utils.utils import parse_filename
filename_dict = parse_filename('jw94015001001_02102_00001_nrcb1_uncal.fits')
where filename_dict is:
{
'program_id' : '94015',
'observation' : '001',
'visit' : '001',
'visit_group' : '02',
'parallel_seq_id' : '1',
'activity' : '02',
'exposure_id' : '00001',
'detector' : 'nrcb1',
'suffix' : 'uncal'
}
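A minimal sketch of how such a function might look, assuming every filename follows the same layout as the single example above. The regular expression is inferred from that one example only, so real JWST filenames may include variants it does not handle, and the field widths are assumptions.

```python
import os
import re

# Pattern inferred from the single example filename above (an assumption);
# it is not taken from an official JWST filename specification.
FILENAME_PATTERN = re.compile(
    r"jw"
    r"(?P<program_id>\d{5})"
    r"(?P<observation>\d{3})"
    r"(?P<visit>\d{3})"
    r"_(?P<visit_group>\d{2})"
    r"(?P<parallel_seq_id>\d)"
    r"(?P<activity>\d{2})"
    r"_(?P<exposure_id>\d{5})"
    r"_(?P<detector>[a-z0-9]+)"
    r"_(?P<suffix>[a-z0-9]+)"
    r"\.fits"
)


def parse_filename(filename):
    """Return a dict of the individual elements of a JWST-style filename."""
    basename = os.path.basename(filename)
    match = FILENAME_PATTERN.fullmatch(basename)
    if match is None:
        raise ValueError(f"Unrecognized filename format: {basename}")
    return match.groupdict()


filename_dict = parse_filename("jw94015001001_02102_00001_nrcb1_uncal.fits")
```

Raising a ValueError on an unrecognized name (rather than returning a partial dict) is a design choice made here so that callers notice unexpected filenames early.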
Now that we have support from the JWST mission office to access JWST proprietary data from the same location as the DMS/archive teams will use, we should figure out the implementation details of how these data are accessed and how the access is authenticated.
It would be awesome if we could interact with JWST FITS files via ginga straight through the web application. Through some quick research, @gkanarek pointed out that this should be possible.
We should build a script that will monitor and gather statistics about the filesystem. We should be able to easily answer questions like:
I envision the script will create a series of bokeh or matplotlib plots that show these statistics over time. These plots can eventually be hosted on our web application. Currently our filesystem is just some static test filesystem, so the plots will be quite boring for now. But this will become more useful after launch.
This script will become one of the monitors that get run by cron once a day on our virtual machine. Output products should be directed to a specific directory (to be determined).
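As a rough sketch of the statistics-gathering step, the following walks a directory tree and tallies file counts and sizes. The function name, the choice of statistics, and the returned dictionary layout are all assumptions made for illustration; the plotting and cron pieces are left out.

```python
import os
from collections import Counter


def gather_filesystem_stats(root):
    """Walk ``root`` and return simple file statistics (hypothetical helper)."""
    file_count = 0
    total_bytes = 0
    counts_by_extension = Counter()

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            file_count += 1
            total_bytes += os.path.getsize(path)
            # Group files by extension, e.g. ".fits"; files without one
            # are counted under "(none)".
            extension = os.path.splitext(name)[1].lower() or "(none)"
            counts_by_extension[extension] += 1

    return {
        "file_count": file_count,
        "total_bytes": total_bytes,
        "counts_by_extension": dict(counts_by_extension),
    }
```

Running this once a day and appending the result, along with a timestamp, to a file would give the time series that the plots need.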
A good starting point for designing the structure of the web application is to build a site map. We can build a draft site map for now and iterate on it until we are happy.
Vera Gibbs suggests that we request a test server and environment sooner rather than later, as it helps ITSD plan their work better. The test environment should mimic the production environment, so we should do some thinking on what we would like our production environment to be first.
Similar to #47, we should also build a monitoring script that gathers information about the MAST database. Some things that would be good to know:
Output should be bokeh or matplotlib plots that track these things over time.
I'm working on securing central storage disk space for the jwql project. Eventually outputs should be stored there.
For various reasons, the MAST API/database purposefully skips the storage of certain JWST header keywords (see attached file). However, Kim DuPrie and Lisa Gardner suggested that skipped keywords could be added to the database/API if there is a use case. We should identify the keywords that we think will be important for our application and ask that they add these.
It may be helpful to others if there were some instructions on how to get started on contributing code to this project (e.g. cloning the repo, submitting pull requests, etc.).
With #30, there will be some extra steps to install the jwql-dev environment, namely:
conda update conda
source activate root
conda env create -f environment.yml
We should update the README to reflect this.
With the merge of #52, we should add the db_monitor API docs to the sphinx build.
Reproducing here from the Slack channel:
"During the NIRSpec team meeting on Tuesday, we had a demo of three of the tools the JWST Data Analysis Development Team have been working on - specviz, mosviz, and cubeviz. One of my team members asked whether JWQL would be able to use the tools for QL, but I said no, since they’re stand-alone tools and wouldn’t work in a browser. However, I did say that we might be able to learn from them in terms of choices they’ve made & issues they’ve faced when visualizing various JWST data products. we could also think about whether we want to try and be consistent with those tools in terms of theme/aesthetic/whatever."
So, once we move towards implementing the JWQL visualizations, we should make sure to investigate these tools, and see what we can adapt.
We should make our dev conda environment more generalized so that it can be used on the new test server.
Now that we have preview_image.py, we can build a wrapper around this module to create preview images for all files in the filesystem. The preview images should be stored under the jwql project directory on central storage in some sort of organized filesystem.
Ideally, we would like to avoid having to maintain a separate copy of JWST data to support the jwql application. Through informal conversations with @tomdonaldson, I've been told that we will have access to a centrally-located filesystem of both public and proprietary JWST data, akin to the MAST public cache. We should confirm that this is actually true, and if so, find out what the organization of this filesystem will look like.
Now that we have been pointed to the MAST API, we should test it out with the needs of the JWQL application in mind and gather a list of any capabilities that we would like but that don't exist in the API.
@gkanarek pointed out that at least one other web application at ST was using some CSS that has the look and feel of the STScI branding. I should ask Chad Smith if OPO already has existing material for this.
We can use this thread to discuss instrument-specific calibration and monitoring software and identify what work needs to be done. A good first step would be to consolidate the tables provided in the Phase A proposal, which I will do.
One test that we could write is to pick a program ID, make sure a preview_image and thumbnail directory exists, and make sure all expected preview images and thumbnails exist.
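A sketch of the check such a test could rely on. The helper name, the directory arguments, and especially the naming convention (one `<rootname>.jpg` preview and one `<rootname>.thumb` thumbnail per file) are assumptions for illustration, not the project's actual layout.

```python
import os


def find_missing_products(rootnames, preview_dir, thumbnail_dir,
                          preview_ext=".jpg", thumbnail_ext=".thumb"):
    """Return the paths of expected preview images/thumbnails that are absent.

    The file extensions are hypothetical defaults; adjust them to whatever
    the preview image generator actually writes.
    """
    for directory in (preview_dir, thumbnail_dir):
        if not os.path.isdir(directory):
            raise FileNotFoundError(f"Missing directory: {directory}")

    missing = []
    for rootname in rootnames:
        expected = (
            os.path.join(preview_dir, rootname + preview_ext),
            os.path.join(thumbnail_dir, rootname + thumbnail_ext),
        )
        missing.extend(path for path in expected if not os.path.isfile(path))
    return missing
```

A test for a given program ID would then assert that the returned list is empty.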
The infrastructure of the jwql system will be heavily dependent on the FITS file structure of JWST data, specifically their extensions, header contents, and available filetypes. It would be helpful if we can get our hands on some files that are at least anticipated to be close to the actual data products to come out of the telescope after launch.
To ensure best software development practices and principles, we should make a style guide for software development on this project. We can then check pull requests against this style guide to ensure all collaborators are coding to standard.
Akin to database_interface.py for acsql, we need to build a module that will serve as an interface to the jwqldb database. This module will hold the classes and functions for creating the tables and connecting to the database.
It would be useful for our work in creating the database schema (issue #6) if we had a rough idea of how big we think the database will eventually be. It may tell us if we need to be concerned about disk space and/or memory issues, which could dictate how we structure the schema.
This issue serves as a place to list and discuss the various tasks that are to be completed for the project. Each task will eventually become its own issue to allow for further discussion (to be created by @bourque in the near future).
instrument/detector/filetype combination (@bhilbert4)
(database_interface.py) (@Ogaz, @bhilbert4, @bourque)
(logging.py) (@cmartlinSTScI)
(update_database.py) (@Johannes-Sahlmann)
django (@laurenmarietta)
I've tentatively assigned individuals to tasks, but those can definitely change based on people's interests and availability. Please feel free to comment below with any thoughts/feelings you have.
One feature that has been particularly handy for both wfc3ql and acsql has been decorator functions that log the execution of a script. We should build something similar for jwql. Perhaps the logging_functions module in wfc3ql is a good reference/starting point.
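For reference, a logging decorator of this kind could be sketched as below. This is not the wfc3ql implementation; the decorator name `log_execution` and the exact messages are assumptions, and it relies only on the standard library.

```python
import functools
import logging
import time


def log_execution(func):
    """Log the start, end, duration, and any failure of ``func``."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        logger = logging.getLogger(func.__module__)
        logger.info("Starting %s", func.__name__)
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback; the error is re-raised
            # so the caller still sees the failure.
            logger.exception("%s failed", func.__name__)
            raise
        elapsed = time.perf_counter() - start
        logger.info("Finished %s in %.3f s", func.__name__, elapsed)
        return result

    return wrapper


@log_execution
def example_monitor():
    """Hypothetical stand-in for a monitoring script's main function."""
    return "done"
```

A script would typically call `logging.basicConfig(...)` once at startup so that these messages end up in a log file.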
Here are the GitHub workflow recommendations that DATB/SCSB will be using for our repos. I like this workflow a lot, but if anyone has any different preferences we can always make adjustments for this project:
https://confluence.stsci.edu/display/DATB/Git+Development+and+Release+Workflows
Marked as xfail for now so Jenkins runs, but we should decide if we want to keep that test or not.
There are two main options used at the institute for continuous integration: Travis and Jenkins. IT has written up a policy about this, although it is still a draft: https://confluence.stsci.edu/pages/viewpage.action?pageId=99327010
I'm not sure of the specifics on writing tests that will need to interact with a database. This may determine which CI we will use. AFAIK, as it relates to our project, Jenkins's strength is the short wait time to run the test suite. We have private servers for this, whereas with Travis there can sometimes be a wait time of a few hours. Travis's strength is that anyone contributing a PR can start/stop a Travis run. For Jenkins, if a test suite run needs to be stopped, restarted, etc., that has to be done by someone internal with the correct permissions.
We may want to split unit testing out into a separate issue later on.
We should probably have some tests for preview_image.py, like we do for permissions.py.
One obvious test would be to use a test file as input and check to see that the code successfully generates a preview image file.
Assigning to @Johannes-Sahlmann
When the jwql project eventually has a filesystem containing preview images, proprietary files (still uncertain), and output products generated by automated calibration and monitoring scripts, we will need to make sure these files have appropriate unix permissions so that no one outside of the jwql "group" can see them.
As such, it would be convenient for scripts to be able to import a module that takes care of this without having to worry what permissions to set things to. So we should build this module.
The module should take as input a path to a file, check to see if the owner of the file is the jwql admin account, and if so, (1) set the permissions appropriately, and (2) set the group membership appropriately.
This module should also come with nosetest(s) to test if the function(s) within the module are working properly.
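The behavior described above could be sketched roughly as follows. The function name, the default mode (owner read/write, group read, no access for others), and passing the admin uid and group gid as arguments are all assumptions; the real module may choose differently. It uses Unix-only calls.

```python
import os
import stat

# Hypothetical default: rw for the owner, r for the group, nothing for others.
DEFAULT_MODE = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP  # 0o640


def set_permissions(path, owner_uid, group_gid, mode=DEFAULT_MODE):
    """Set mode and group on ``path`` if it is owned by ``owner_uid``.

    Returns True if the file was updated, False if it was left alone
    because it belongs to a different owner.
    """
    if os.stat(path).st_uid != owner_uid:
        return False

    os.chmod(path, mode)
    # A uid of -1 tells os.chown to leave the owner unchanged and only
    # update the group.
    os.chown(path, -1, group_gid)
    return True
```

Returning a boolean instead of raising makes it cheap for monitoring scripts to call this on every output file without special-casing files they do not own.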
One feature of our web application should be the ability to download a JWST FITS file through the web app to the user's machine. I envision that, when looking at a webpage that displays a preview image of a particular observation, there are buttons to download each available filetype for that observation.
Now that we have passed the proposal stage and are actively developing on this project, it is a good time to refine our initial schedule to more accurately reflect the work we are actually doing and come up with more accurate deadlines.
There exists a JWST 'engineering database'. We should see if we can use this and if it could be helpful for our application.
During the meeting with the NIRCam team, they expressed interest in having preview images that include a mosaic of all detectors that were used during a particular observation. Perhaps this could be an option for a preview image to view on the web application.
I strongly suggest we use readthedocs to autogenerate our documentation (with the help of sphinx). We even have a STScI readthedocs style sheet (ex: http://stak-notebooks.readthedocs.io/en/latest/). Not shown in this example is the inclusion of the sphinx generated API documentation pulled from doc strings in the code.
According to Vera Gibbs, ITSD supports MySQL, PostgreSQL, and MSS. We need to decide which of these to use for the jwql database.
One feature of our web application will be to tag images as having specific anomalies (e.g. satellite trail, various detector artifacts, etc.). Though we don't know which anomalies the instrument teams may want to track (nor will we probably know for sure until operations), we should build the framework for a database that stores anomaly information.
We already have a postgresql database built for us, so we can use that one for now.
Now that we have a prototype for a web application, it would be useful to test its efficiency on an internal server.
Now that the filesystem monitor is merged in (#69), we should add the sphinx documentation.
We need to develop a schema for the jwql database. I think a decent starting point is something like the schema I used for ACS Quicklook:
In this schema, we have a master table that keeps track of each rootname that is in the database and when it was ingested. The datasets table keeps track of which filetypes exist for a given rootname. Then there is a table for each detector/extension/filetype combination, which is basically a dump of the headers (columns are header keys and values are header values).
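To make the master/datasets relationship concrete, here is a small sketch using Python's built-in sqlite3 module so it runs without a database server (the project itself would target one of the ITSD-supported databases). The column names and the one-row-per-filetype layout of datasets are assumptions; the ACS Quicklook schema may organize this differently, and the per-header tables are omitted.

```python
import sqlite3

# Hypothetical column layout, for illustration only.
SCHEMA = """
CREATE TABLE master (
    id INTEGER PRIMARY KEY,
    rootname TEXT NOT NULL UNIQUE,
    ingest_date TEXT NOT NULL
);
CREATE TABLE datasets (
    rootname TEXT NOT NULL REFERENCES master (rootname),
    filetype TEXT NOT NULL,
    PRIMARY KEY (rootname, filetype)
);
"""


def create_schema(connection):
    """Create the master and datasets tables."""
    connection.executescript(SCHEMA)


def ingest(connection, rootname, filetypes, ingest_date):
    """Record a rootname and the filetypes that exist for it."""
    with connection:  # commit on success, roll back on error
        connection.execute(
            "INSERT INTO master (rootname, ingest_date) VALUES (?, ?)",
            (rootname, ingest_date),
        )
        connection.executemany(
            "INSERT INTO datasets (rootname, filetype) VALUES (?, ?)",
            [(rootname, filetype) for filetype in filetypes],
        )


connection = sqlite3.connect(":memory:")
create_schema(connection)
ingest(connection, "jw94015001001_02102_00001_nrcb1", ["uncal"], "2018-01-01")
```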
To construct this for jwql, we will need to know the following for each instrument:
In order to implement the database schema, we will need lists of each header keyword for each instrument/detector/filetype combination, akin to the acsql examples. These lists will eventually be used by database_interface to create the header tables of the database.
The code that generates these for the acsql example is here; perhaps it can be adapted for jwql use.
It would be ideal if we could generate these lists programmatically, with very little or no hard coding/file creation/file editing involved (since there sure are a lot of detectors and filetypes!).
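One possible way to build such a list without hand editing is to read the keywords straight from a representative file. The sketch below is not the acsql code; it is a standard-library-only illustration that reads just the primary header, relying on the FITS layout of 2880-byte header blocks made of 80-character cards with the keyword in the first 8 characters. The function name is hypothetical, and in practice a FITS library would likely be used so that every extension is covered.

```python
CARD_LENGTH = 80     # bytes per FITS header card
BLOCK_LENGTH = 2880  # bytes per FITS header block


def read_primary_header_keywords(path):
    """Return the keyword names found in the primary header of a FITS file."""
    keywords = []
    with open(path, "rb") as fits_file:
        while True:
            block = fits_file.read(BLOCK_LENGTH)
            if len(block) < BLOCK_LENGTH:
                raise ValueError("Reached end of file before an END card")
            for start in range(0, BLOCK_LENGTH, CARD_LENGTH):
                keyword = block[start:start + 8].decode("ascii").strip()
                if keyword == "END":
                    return keywords
                # Skip blank cards and free-text commentary cards, which
                # would not map onto database columns.
                if keyword and keyword not in ("COMMENT", "HISTORY"):
                    keywords.append(keyword)
```

Looping this over one example file per instrument/detector/filetype combination would produce the keyword lists without any hard-coded files.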