braindump's People

Contributors

arfon, eteq, larrybradley, nden, pllim, stscicrawford, stscieisenhamer

braindump's Issues

Workflow playground

Do you want to use this repo as a workflow playground? That is, should we all fork this to our own accounts and update with pull requests?

Blind leading the blind: Chase the memory leak

Upvote to listen to the mind-numbing story of resolving, or rather doing an end-around of, the Spec2Pipeline memory leak. Punchline: the reason is still unknown, but one of the beast's heads has been removed.

Just testing tasklist

https://docs.github.com/en/issues/tracking-your-work-with-issues/about-tasklists

Tasks

JWQL demo/talk

@bourque would like to present the JWST Quick Look tool and get some perspective from outside of INS.

Moving JWST to github

May 19th, noon, S322

A suggested list of topics to discuss. Please add other topics as you see necessary.

  • What to move from SVN to github (we don't have to move everything)

  • How to organise the code

    The current organization uses namespaces. IIRC this was necessary because of a requirement that
    steps are to be installed as separate packages as well as one package. Is this still a good idea?

  • What is the current nightly SSBDEV procedure, i.e. after the code is moved to github what should we
    expect from the build scripts?

  • What workflow to adopt?

    Confluence notes

    astropy

  • Testing

  • Documentation

  • Releases

Refactoring regression tests

Currently, SSB regression tests live in a private and hard to access SVN repository. Cron jobs are set up on certain machines (Mac OSX, RHEL6, and RHEL5) to update test scripts from that repo once a day and run with pandokia, compare results with "truth" images/tables, and generate HTML reports. Developers then look at the reports and can choose to "okify" failed tests (i.e., make the new results the new "truth").
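The compare-against-"truth" and "okify" cycle described above can be sketched in a few lines. This is only an illustration of the idea, not pandokia's actual implementation or API: the function names `matches_truth` and `okify` are hypothetical, and a byte-for-byte comparison is assumed, whereas real image or table comparisons would likely need tolerances.

```python
import filecmp
import shutil
from pathlib import Path


def matches_truth(result: Path, truth: Path) -> bool:
    """Return True when the new result is byte-identical to the stored truth."""
    # A missing truth file counts as a failure that a developer must review.
    return truth.is_file() and filecmp.cmp(result, truth, shallow=False)


def okify(result: Path, truth: Path) -> None:
    """Accept a changed result by making it the new truth (hypothetical helper)."""
    truth.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(result, truth)
```

A nightly run would call `matches_truth` for each output and report the failures; a developer who decides the new output is correct would then call `okify` on it.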

Pros

  • It is already working.
  • It is familiar to us.
  • pandokia is developed in-house (free, no licensing issues).

Cons

  • pandokia has limited documentation. Some internal wiki instructions but that's about it.
  • pandokia lacks Python 3 support, as it was written about 10 years ago.
  • pandokia is not flexible in letting tests read in big data from an arbitrary path. For example, CALWF3 input data are required to be on the current machine in the current directory.
  • The SVN repository is not always in sync across machines. For example, a test updated on one machine and pushed to the SVN repository might not get picked up by another machine. The suspected cause is that the SVN version is too old on one or more of the machines (but upgrading the SVN version would cause a different kind of problem).
  • This system has a single point of failure, in the sense that only Joe or Christine (and perhaps Vicki if she has time) knows how to fix it. Okay, up to three points of failure.
  • Not all exceptions are caught properly by pandokia. For example, a syntax error will result in the test being omitted from the final report entirely (not to be confused with being reported as missing in the report).
  • Tests are disconnected from the actual codes that they are testing, in the sense that they are not under the same version control.
  • A test messing up can affect unrelated tests. That is, the whole regression test system can fail because of a single test.
  • Regression test codes are invisible to anyone outside SSB.
  • It is difficult to access the current SVN (e.g., no Trac site).

Proposed Changes

  • Move test codes out of the hidden and hard-to-access SVN repository back to the respective code repositories. For example, pysynphot tests go back to the pysynphot GitHub repo. An exception can be made for HSTCAL (because it's written in C but its tests are in Python) and legacy codes (e.g., PyRAF, pytools). I am willing to absorb CALACS tests into acstools, but CALWF3 has an opposing view. There is no reason why tests for all the different packages need to be in the same repository. It is trivial to skip/xfail a test requiring big data that is not present.
  • Replace pandokia with a new test system. For example, a private version of Travis CI or Jenkins CI.
  • Replace SVN with git, to be consistent with our recent move to GitHub. Also, this way, we do not need a special global account to modify the tests. Ideally, we can open pull requests and merge like we do with "real" codes.
  • Switch from nose to pytest. As we move forward with more and more codes depending on Astropy or its template, this is unavoidable.
  • Create a new test workflow. There is no reason for tests to run every night for all the packages. Tests should only be run when there are changes relevant to the codes being tested (e.g., a new commit or a new data file). Also, they must be able to be run easily manually as needed. For example, when there is a new pull request, we can checkout the codes from the pull request and just type py.test packagename [args] or python setup.py test packagename [args].
  • Figure out a cleaner way to store input and output test data. While smaller data can live in the code repository (see first point above), big data should be somewhere else. Also we should address things like: Do we need to version control input data? Do we need to keep old "truths" once we "okify" new ones? How will the "okify" process work with Travis or Jenkins CI?
  • While we're at it, we can review the affected tests and discard those that are outdated and do not make sense anymore. This will reduce maintenance costs going forward and lessen total run time.
  • Replace RHEL5 and RHEL6 with RHEL7. Or at the very least, get rid of RHEL5. Also, if applicable, upgrade Mac OSX test machine.
  • Do we even want to consider a Windows test machine? Maybe Windows 10?
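To illustrate the skip/xfail point in the first proposed change, here is a minimal pytest sketch. Everything named here is an assumption made for illustration: the environment variable `BIG_TEST_DATA`, the helper names, and the file `example_raw.fits` do not come from any existing SSB test suite.

```python
import os
from pathlib import Path

import pytest

# Hypothetical environment variable pointing at the directory of large inputs.
BIG_DATA_ENV = "BIG_TEST_DATA"


def big_data_file(name: str) -> Path:
    """Build the path of a large input file; the file may or may not exist."""
    return Path(os.environ.get(BIG_DATA_ENV, "")) / name


def requires_big_data(name: str):
    """Return a marker that skips the test when the named file is absent."""
    path = big_data_file(name)
    return pytest.mark.skipif(
        not path.is_file(), reason=f"{path} is not available on this machine"
    )


@requires_big_data("example_raw.fits")
def test_big_input_is_not_empty():
    # Runs only where the big input file is actually present.
    assert big_data_file("example_raw.fits").stat().st_size > 0
```

On a machine without the data the test is reported as skipped rather than failed, so the same test file can live in the code repository and run everywhere.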

c/c: @sosey @nden @cdsontag @jhunkeler @vglaidler @stsci-hack @justincely and whoever else in SSB that is interested

p/s: This is how I envision the change -- https://www.youtube.com/watch?v=mZ6_0wGGsuY

Celery (not the relatively useless vegetable)

I have been playing with celery (distributed processing Python package) a bit and find it an interesting package. I am very new to it but would be happy to come up with something to show what it can do.

Regular Expressions I Have Known and Loathed

Whenever I mention regular expressions, I get the impression that some people are uncomfortable with them. Perhaps this is too basic, but I can give a talk on regular expressions, using as an example parsing words out of a line of text, building up from a simple regular expression to something that is hideous to contemplate.
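As a small taste of the "building up" idea, here is a sketch using Python's `re` module. The sample sentence and both patterns are illustrative choices, not material from the proposed talk.

```python
import re

line = "The pipeline's step-2 output isn't worse."

# First attempt: a word is a run of "word" characters.
# This splits "pipeline's" and "isn't" at the apostrophe and "step-2" at the hyphen.
simple = re.findall(r"\w+", line)

# Second attempt: allow apostrophes and hyphens, but only between alphanumerics.
better = re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", line)

print(simple)  # ['The', 'pipeline', 's', 'step', '2', 'output', 'isn', 't', 'worse']
print(better)  # ['The', "pipeline's", 'step-2', 'output', "isn't", 'worse']
```

Each further requirement (Unicode letters, abbreviations with periods, numbers with decimal points) adds another clause, which is how the expression grows toward the hideous end of the scale.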

Project boards?

So many choices. Trello, Airtable, Waffle.io, Emacs... Which one to use? There's JIRA too but apparently we can't use it to track arbitrary things.

(Feel free to close this if it is not a good braindump topic.)

c/c @hcferguson

Python type hinting discussion

Be great to have a discussion of the if/how/when/why of using the new Python type hinting. Even if it's the short "nope".

JWST data models

Any interest in discussing data models?
Any issues that need to be discussed?

Present/talk about code coverage

I've been learning a ton about how code coverage is calculated and all the graph math that's happening on the innards of things like coveralls in my Software Testing class, and it seems like it would make an interesting brain dump. At least, I find it super interesting and fun.
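For anyone curious before the talk, the most basic form of coverage (which lines ran) can be sketched with the standard library's `sys.settrace`. This is a toy under stated assumptions, not how coveralls computes its reports; the helper `lines_hit` and the sample function `classify` are made up for illustration, and real tools also handle branches, multiple files, and reporting.

```python
import sys


def lines_hit(func, *args):
    """Run func(*args); return executed line offsets relative to its def line."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # ignore frames that belong to other functions
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active
    return hit


def classify(x):
    if x < 0:
        return "negative"
    return "non-negative"


# One input exercises only one branch; both inputs together touch every body line.
positive_only = lines_hit(classify, 5)
both = positive_only | lines_hit(classify, -1)
print(len(positive_only), "of", len(both), "body lines covered by one input")
```

The gap between the two sets is exactly what a coverage report flags: a test suite that only ever passes non-negative values never executes the "negative" branch.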

Jenkins

There is interest from INS Software Engineers to learn about how to use Jenkins for their work.

Update the help

Tyler Desjardins mentions that we should consider moving emails from help[at]stsci.edu to point to the web portal where possible and appropriate. For HST (or any non-JWST), it is https://hsthelp.stsci.edu . For JWST, it is https://jwsthelp.stsci.edu . Please update info in setup.py, setup.cfg, documentation, etc as appropriate.

Please close this issue if it is irrelevant to your repository. This is an automated issue. If this is opened in error, please let pllim know!

xref spacetelescope/hstcal#317

Jenkins examples

It'd be nice to hear how to set up Jenkins on a repo and see some examples of converting pandokia tests to Jenkins.

Advanced git workflow

I see this topic as distinct from github workflows, although there will probably be some overlap. I think it would be useful to talk about how git can be used locally to aid a development workflow, and some of the more 'advanced' git features that not all users may be aware of, including:

  • git gui
  • git grep
  • git reset
  • git rebase
  • git reflog
  • git rerere

It might also be useful to talk about git integration with shell environments, including plugins that allow for tab completion of git commands and commits.

What's new in Python and Numpy

Now that build 7 is done (is it?) it looks like there's interest in a session (or two) on the above topic.
Is (Thu) Dec 15 a good day for this (please show 👍 or 👎 )?

Also we need two volunteers to prepare the two topics. Please volunteer here or send me an email.

Diagnosing memory issues in Python extensions

In case it is of interest, I have prepared short demos on the following topics:

  • detecting memory leaks in Python extensions using valgrind
  • detecting buffer overruns in Python extensions using address sanitizer

Let me know if there are other topics that would be of interest. These might include using gdb to debug Python extensions, detecting undefined C behavior in extensions, etc.

Development Tools

Things people use to help with their development:

flake8 (pyflakes + pep8)
pylint
autopep8
Valgrind

Some of these can be incorporated into your editor for on-the-fly style checking.

Discuss ASDF

Apr 14, 2016

asdf-standard

Python implementation

ASDF versioning is documented here:

https://github.com/spacetelescope/asdf-standard/blob/master/source/versioning.rst

The #ASDF line refers to the file format -- how blocks are laid out, how offsets are calculated, etc. Basically anything a reader would need to know to separate all of the blocks in the file, but not necessarily the meaning of the tags in the YAML portion.

The #ASDF_STANDARD refers to the ASDF standard, including all of the YAML tags and their meanings. While each YAML tag is individually versioned, the #ASDF_STANDARD groups those up into a single version that can be easily checked such that a reader could say "I don't understand this version of the spec, but I'll do my best to load it anyway...".
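To make the two header lines concrete, here is a minimal sketch that pulls both versions out of the start of a file. It is not the API of the Python asdf package; the function name `read_header_versions` and the sample version numbers are assumptions for illustration only.

```python
import re

# Matches "#ASDF X.Y.Z" or "#ASDF_STANDARD X.Y.Z" on a line of its own.
HEADER_RE = re.compile(rb"^#(ASDF|ASDF_STANDARD) (\d+)\.(\d+)\.(\d+)\s*$")


def read_header_versions(data: bytes) -> dict:
    """Return {'ASDF': (major, minor, patch), 'ASDF_STANDARD': (...)} if present."""
    versions = {}
    for raw_line in data.splitlines()[:2]:  # only the first two lines are inspected
        match = HEADER_RE.match(raw_line)
        if match:
            name = match.group(1).decode("ascii")
            versions[name] = tuple(int(part) for part in match.groups()[1:])
    return versions


# Illustrative header bytes; the version numbers are made up for the example.
sample = b"#ASDF 1.0.0\n#ASDF_STANDARD 1.5.0\n%YAML 1.1\n"
print(read_header_versions(sample))
```

A reader could compare the `ASDF` tuple against the file layouts it can parse and treat an unfamiliar `ASDF_STANDARD` as a warning rather than an error, in the "do my best to load it anyway" spirit described above.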

Also there is some discussion about versioning in this link:

asdf-format/asdf-standard#90

New pytest plugins showcase

@drdavella turned some astropy test helpers into actual pytest plugins outside of astropy. I would be interested to learn how to use these for my own projects!

Glupyter feedback

I asked about these during the demo (https://github.com/spacetelescope/braindump/tree/master/glupyter_20180101) and was asked to post them as an issue, so here they are:

  • Glue API could use a to_table method (instead of re-indexing)
  • Expose irregular brush or lines API (e.g., for user to arbitrarily draw a line on the CMD and select stars in/near the line)
  • Bug: Duplicate tab when linking im to obj with subset already created before

(I hope these still make sense.)

Using custom STScI template locally and on RTD - works!

I got the custom template for our docs working locally and on RTD. Here's an example of what it looks like:

http://wfc3tools.readthedocs.io/

If you want to use it with the repo that you manage, edit your conf.py to include:

html_theme = 'sphinx_rtd_theme'
html_theme_options = {
    "collapse_navigation": True
}
html_logo = '_static/stsci_pri_combo_mark_white.png'
html_static_path = ['_static']
html_context = {
    'css_files': [
        '_static/css/custom.css',
    ],
}

html_last_updated_fmt = '%b %d, %Y'
html_sidebars = {'**': ['globaltoc.html', 'relations.html', 'searchbox.html']}

You can copy the corresponding logo and custom.css files from the wfc3tools package.

Recent developments on de-blending astronomical sources

Tools like SExtractor and DAOphot have been the industry standard for source detection and image segmentation. Big-survey projects like LSST and WFIRST and Euclid are investigating ways to move beyond these and in particular to fold in multi-wavelength information. A year ago I would have said LSST was barking up the wrong tree on what they were pursuing. But within the last few months they have made real progress on using the color information and Non-negative Matrix Factorization to separate overlapping objects. I can review what I heard at the last LSST all-hands meeting. If we wait a few months, I might get an update on this from a series of telecons that are just starting.
