
jwql's Issues

Build sphinx documentation

Now that we have some actual code with docstrings in our jwql package, we can start adding the tools to build official API documentation. I have done this before with sphinx and it has worked well, so unless anybody has a better suggestion I think we should use that.

The documentation that we build could then be hosted on Read the Docs.

Explore using django for building a web application

For the wfc3ql and acsql projects, we used Flask as the web framework for building web applications. However, I'm not sure the lightweight nature of Flask is going to cut it for jwql, since jwql will be scaled up by a factor of ~5 compared to wfc3ql and acsql.

For this reason, I would like to explore using django as an alternative to Flask. django appears to be an 'industry standard' for web frameworks in python. (There are even entire conferences dedicated to it, so people must use it!)

It looks like the django website has decent documentation and basic tutorials on how to use it. Perhaps this is a good starting point.

Build filename parser utility function

It would be useful to have a function in the utils.py module that returns the individual elements of a given filename, for example:

from jwql.utils.utils import parse_filename
filename_dict = parse_filename('jw94015001001_02102_00001_nrcb1_uncal.fits')

where filename_dict is:

{
    'program_id' : '94015',
    'observation' : '001',
    'visit' : '001',
    'visit_group' : '02',
    'parallel_seq_id' : '1',
    'activity' : '02',
    'exposure_id' : '00001',
    'detector' : 'nrcb1',
    'suffix' : 'uncal'
}
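
A minimal sketch of how such a function might work, assuming the filename layout implied by the example above (jw<program><observation><visit>_<visit group><parallel sequence><activity>_<exposure>_<detector>_<suffix>.fits). The regular expression and the error handling are assumptions for illustration, not an agreed-upon implementation:

```python
import os
import re

# Field widths are inferred from the example filename in this issue;
# they are an assumption and may not cover every JWST filename variant.
FILENAME_PATTERN = re.compile(
    r"^jw"
    r"(?P<program_id>\d{5})"
    r"(?P<observation>\d{3})"
    r"(?P<visit>\d{3})"
    r"_(?P<visit_group>\d{2})"
    r"(?P<parallel_seq_id>\d)"
    r"(?P<activity>[0-9a-z]{2})"
    r"_(?P<exposure_id>\d{5})"
    r"_(?P<detector>[0-9a-z]+)"
    r"_(?P<suffix>[0-9a-z]+)"
    r"\.fits$"
)


def parse_filename(filename):
    """Return a dict of the individual elements of a JWST filename.

    Any leading directory path is ignored. Raises ValueError if the
    basename does not match the assumed pattern.
    """
    basename = os.path.basename(filename)
    match = FILENAME_PATTERN.match(basename)
    if match is None:
        raise ValueError("Unrecognized filename: {}".format(basename))
    return match.groupdict()
```

Raising an exception on unrecognized filenames (rather than returning an empty dict) is one possible design choice; it makes malformed input fail loudly in calling scripts.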

Build filesystem monitor

We should build a script that will monitor and gather statistics about the filesystem. We should be able to easily answer questions like:

  • How many files are there in total?
  • How much disk space is used by all of the files?
  • How many <some_filetype> files are there?
  • Other questions I can't think of right now.

I envision the script will create a series of bokeh or matplotlib plots that show these statistics over time. These plots can eventually be hosted on our web application. Currently our filesystem is just some static test filesystem, so the plots will be quite boring for now. But this will become more useful after launch.

This script will become one of the monitors that get run by cron once a day on our virtual machine. Output products should be directed to a specific directory (to be determined).
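
As a starting point, the statistics-gathering part could look something like the sketch below. The function name `gather_filesystem_stats` and the shape of the returned dictionary are hypothetical; plotting and writing output products are left out:

```python
import os
from collections import Counter


def gather_filesystem_stats(root):
    """Walk ``root`` and return basic statistics about the files in it.

    Returns a dict with the total number of files, the total size in
    bytes, and a count of files per extension (hypothetical layout).
    """
    total_files = 0
    total_bytes = 0
    files_by_type = Counter()

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # Skip files that disappear or cannot be read mid-walk.
                continue
            total_files += 1
            total_bytes += size
            extension = os.path.splitext(name)[1].lower() or "<none>"
            files_by_type[extension] += 1

    return {
        "total_files": total_files,
        "total_bytes": total_bytes,
        "files_by_type": dict(files_by_type),
    }
```

Running this once a day and appending the result (with a timestamp) to a log would give the time series needed for the plots.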

Draft a site map for the web application

A good starting point for designing the structure of the web application is to build a site map. We can build a draft site map for now and iterate on it until we are happy.

Ask ITSD to build a test and production server

Vera Gibbs suggests that we request a test server and environment sooner rather than later, as it helps ITSD plan their work better. The test environment should mimic the production environment, so we should do some thinking on what we would like our production environment to be first.

Build database monitor

Similar to #47, we should also build a monitoring script that gathers information about the MAST database. Some things that would be good to know:

  • Number of files in the database
  • Number of files in the database for a given instrument and/or observing mode
  • Number of header keywords stored in database

Output should be bokeh or matplotlib plots that track these things over time.

I'm working on securing central storage disk space for the jwql project. Eventually outputs should be stored there.

Determine which skipped keywords are important for jwql

For various reasons, the MAST API/database purposefully skips the storage of certain JWST header keywords (see attached file). However, Kim DuPrie and Lisa Gardner suggested that skipped keywords could be added to the database/API if there is a use case. We should identify the keywords that we think will be important for our application and ask that they add these.

skipped_jwst_keyword.txt

Update README with new environment installation

With #30, there will be some extra steps to install the jwql-dev environment, namely:

conda update conda
source activate root
conda env create -f environment.yml

We should update the README to reflect this.

Investigate JWST Data Analysis Tools

Reproducing here from the Slack channel:

"During the NIRSpec team meeting on Tuesday, we had a demo of three of the tools the JWST Data Analysis Development Team have been working on - specviz, mosviz, and cubeviz. One of my team members asked whether JWQL would be able to use the tools for QL, but I said no, since they’re stand-alone tools and wouldn’t work in a browser. However, I did say that we might be able to learn from them in terms of choices they’ve made & issues they’ve faced when visualizing various JWST data products. We could also think about whether we want to try and be consistent with those tools in terms of theme/aesthetic/whatever."

So, once we move towards implementing the JWQL visualizations, we should make sure to investigate these tools, and see what we can adapt.

Determine if a public and/or proprietary filesystem will be available through MAST

Ideally, we would like to avoid having to maintain a separate copy of JWST data to support the jwql application. Through informal conversations with @tomdonaldson, I've been told that we will have access to a centrally-located filesystem of both public and proprietary JWST data, akin to the MAST public cache. We should confirm whether this is actually true, and if so, what the organization of this filesystem will look like.

Test the MAST API for JWQL needs

Now that we have been pointed to the MAST API, we should test it out with the needs of the JWQL application in mind and gather a list of any capabilities that we would like but that don't exist in the API.

Write tests for generate_preview_image

One test that we could write is to pick a program ID, make sure a preview_image and thumbnail directory exists, and make sure all expected preview images and thumbnails exist.

Acquire some JWST FITS files to develop with

The infrastructure of the jwql system will be heavily dependent on the FITS file structure of JWST data, specifically their extensions, header contents, and available filetypes. It would be helpful if we could get our hands on some files that are at least anticipated to be close to the actual data products that will come out of the telescope after launch.

Create style guide for jwql software development

To ensure best software development practices and principles, we should make a style guide for software development on this project. We can then check pull requests against this style guide to ensure all collaborators are coding to standard.

Estimate the size of the database

It would be useful for our work in creating the database schema (issue #6) if we had a rough idea of how big we think the database will eventually be. It may tell us if we need to be concerned about disk space and/or memory issues, which could dictate how we structure the schema.

Task list

This issue serves as a place to list and discuss the various tasks that are to be completed for the project. Each task will eventually become its own issue to allow for further discussion (to be created by @bourque in the near future).

  • Create text files that list each header keyword and its data type for each instrument/detector/filetype combination (@bhilbert4)
  • Build module(s) for creating and connecting to the database (i.e. database_interface.py) (@Ogaz, @bhilbert4, @bourque)
  • Build module(s) that logs the execution of a script (i.e. logging.py) (@cmartlinSTScI )
  • Build module(s) that inserts records into the database (i.e. update_database.py) (@Johannes-Sahlmann )
  • Figure out how to connect Jenkins CI to repository (@hover2pi, @SaraOgaz )
  • Start building a web app with django (@laurenmarietta )

I've tentatively assigned individuals to tasks, but those can definitely change based on people's interests and availability. Please feel free to comment below any thoughts/feelings you have.

Decide on Continuous Integration and testing solutions

There are two main options used at the institute for continuous integration: Travis and Jenkins. IT has written up a policy about this, although it is still a draft: https://confluence.stsci.edu/pages/viewpage.action?pageId=99327010

I'm not sure of the specifics on writing tests that will need to interact with a database. This may determine which CI we will use. AFAIK, as relates to our project, Jenkins's strength is the short wait time to run the test suite. We have private servers for this, whereas with Travis there can sometimes be a wait time of a few hours. Travis's strength is that anyone contributing a PR can start/stop a Travis run. For Jenkins, if a test suite run needs to be stopped, restarted, etc., that has to be done by someone internal with the correct permissions.

We may want to split unit testing out into a separate issue later on.

Build tests for preview_image.py

We should probably have some tests for preview_image.py, like we do for permissions.py.

One obvious test would be to use a test file as input and check to see that the code successfully generates a preview image file.

Assigning to @Johannes-Sahlmann

Build module that sets the appropriate permissions for a given file

When the jwql project eventually has a filesystem containing preview images, proprietary files (still uncertain), and output products generated by automated calibration and monitoring scripts, we will need to make sure these files have appropriate unix permissions so that no one outside of the jwql "group" can see them.

As such, it would be convenient for scripts to be able to import a module that takes care of this without having to worry what permissions to set things to. So we should build this module.

The module should take as input a path to a file, check to see if the owner of the file is the jwql admin account, and if so, (1) set the permissions appropriately, and (2) set the group membership appropriately.

This module should also come with nosetest(s) to test if the function(s) within the module are working properly.
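
A rough sketch of the core function, under a few assumptions: the name `set_permissions`, the default mode of `0o640`, and the choice to identify the admin account and group by numeric uid/gid are all hypothetical and only meant to illustrate the steps described above (Unix only):

```python
import os
import stat


def set_permissions(path, owner_uid, group_gid, mode=0o640):
    """Set permissions and group on ``path`` if it is owned by ``owner_uid``.

    Returns True if the file was updated, False if it was left alone
    because it is owned by a different account.
    """
    file_stat = os.stat(path)

    # Only touch files that belong to the (assumed) jwql admin account.
    if file_stat.st_uid != owner_uid:
        return False

    # (1) Set the permissions, e.g. 0o640 = owner rw, group r, others none.
    os.chmod(path, mode)

    # (2) Set the group membership; -1 leaves the owner unchanged.
    os.chown(path, -1, group_gid)
    return True


def has_permissions(path, mode=0o640):
    """Return True if the permission bits of ``path`` equal ``mode``."""
    return stat.S_IMODE(os.stat(path).st_mode) == mode
```

In practice the numeric ids could be looked up from account and group names with `pwd.getpwnam(name).pw_uid` and `grp.getgrnam(name).gr_gid` from the standard library. A helper like `has_permissions` is the kind of check a test for this module might use.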

Build in file download capability through web application

One feature of our web application should be to be able to download a JWST FITS file through the web app to the user's machine. I envision that, when looking at a webpage that displays a preview image of a particular observation, there are buttons to download each available filetype for that observation.

Make updated schedule for project

Now that we have passed the proposal stage and are actively developing on this project, it is a good time to refine our initial schedule to more accurately reflect the work we are actually doing and come up with more accurate deadlines.

Enable "advanced" preview images

During the meeting with the NIRCam team, they expressed interest in having preview images that include a mosaic of all detectors that were used during a particular observation. Perhaps this could be an option for a preview image to view on the web application.

Build database table(s) for tracking anomalies

One feature of our web application will be to tag images as having specific anomalies (e.g. satellite trail, various detector artifacts, etc.). Though we don't know which anomalies the instrument teams may want to track (nor will we probably know for sure until operations), we should build the framework for a database that stores anomaly information.

We already have a postgresql database built for us; we can use that one for now.

Develop Database Schema

We need to develop a schema for the jwql database. I think a decent starting point is something like the schema I used for ACS Quicklook:

schema

In this schema, we have a master table that keeps track of each rootname that is in the database and when it was ingested. The datasets table keeps track of which filetypes exist for a given rootname. Then there is a table for each detector/extension/filetype combination which is basically a dump of the headers (columns are header keys and values are header values).

To construct this for jwql, we will need to know the following for each instrument:

  1. What are all of the possible filetypes and what purpose do they serve?
  2. What is the data structure for each filetype (i.e. number of extensions, what purpose each extension serves, what datatype each one is)?
  3. What are the header keywords for each filetype/extension combination?
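
To make the master/datasets part of the layout concrete, here is a small sketch using an in-memory SQLite database purely for illustration (the project database itself would differ). All table and column names, and the choice to store one row per rootname/filetype pair, are assumptions; the per-detector header tables are omitted because they depend on the answers to the questions above:

```python
import sqlite3

# Hypothetical schema: one row per rootname in ``master``, and one row
# per (rootname, filetype) pair in ``datasets``.
SCHEMA = """
CREATE TABLE master (
    id INTEGER PRIMARY KEY,
    rootname TEXT NOT NULL UNIQUE,
    ingest_date TEXT NOT NULL
);
CREATE TABLE datasets (
    rootname TEXT NOT NULL REFERENCES master (rootname),
    filetype TEXT NOT NULL,
    PRIMARY KEY (rootname, filetype)
);
"""


def create_schema(connection):
    """Create the master and datasets tables on ``connection``."""
    connection.executescript(SCHEMA)


def filetypes_for(connection, rootname):
    """Return the sorted list of filetypes recorded for ``rootname``."""
    rows = connection.execute(
        "SELECT filetype FROM datasets WHERE rootname = ? ORDER BY filetype",
        (rootname,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    root = "jw94015001001_02102_00001_nrcb1"
    conn.execute(
        "INSERT INTO master (rootname, ingest_date) VALUES (?, ?)",
        (root, "2018-01-01"),
    )
    conn.execute("INSERT INTO datasets VALUES (?, ?)", (root, "uncal"))
    print(filetypes_for(conn, root))
```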

Generate text files that contain header keywords and their data type

In order to implement the database schema, we will need lists of each header keyword for each instrument/detector/filetype combination, akin to the acsql examples. These lists will eventually be used by database_interface to create the header tables of the database.

The code that generates these for the acsql example is here; perhaps it can be adapted for jwql use.

It would be ideal if we could generate these lists programmatically, with very little or no hard coding/file creation/file editing involved (since there sure are a lot of detectors and filetypes!).
