temuller / hostphot Goto Github PK

View Code? Open in Web Editor NEW

4.0 2.0 2.0 46.17 MB

HostPhot: global and local photometry of galaxies hosting supernovae or other transients

Home Page: https://hostphot.readthedocs.io/en/latest/

License: MIT License

Python 87.28% TeX 12.72%

cosmology supernova galaxy python photometry sep sextractor transients cutout image

hostphot's Introduction

Hola, I'm Tomás

👨🏻‍💻 About Me

🎓 I am a postdoc at the Institute of Space Sciences (ICE-CSIC) in Barcelona, Spain. I have previously studied at the University of Southampton (UoS), Pontificia Universidad Católica de Chile (PUC) and Universidad de Chile.
📄 My research is focused on cosmology with type Ia supernovae, but I am also interested in the physics of stellar transients in general, coding and machine learning.
🌱 I love music and playing the drums. I also love doing sports in general, specially playing football.

Check my personal website for more information about me!

🛠 Tech Stack

⚙️ GitHub Analytics

🧑‍🚀 Open Source Projects

💻 Projects	🌟 Stars	🍴 Forks	🐛 Issues	🔔 Pull Requests	👨‍💻 Language
HostPhot
PISCOLA
PESSTO pipeline
SN II fitting code
Astro Visualization

hostphot's People

Contributors

Stargazers

Watchers

Forkers

hostflows wolffem

hostphot's Issues

[JOSS] Queries for review

Following the checklist in the review, I have some minor suggestions and questions that I hope will improve the package.

It might be good somewhere in the documentation -- in the README, the readthedocs, or both -- to get an explicit list of the dependencies the software has, so that an advanced user could pre-generate (or re-use) a conda environment and then simply clone your repository. Since this is a more "advanced" method (as opposed to the very clean "pip install" option you already provide), it likely could be a separate page/section in the readthedocs. This would also have the added bonus of more clearly highlighting the packages you build on. It's a little more complex than your case, but as an example the astropy installation page lists the simple "pip install" option, but then has a "requirements" section further down the page. I think I found
astropy, reproject, numpy, matplotlib, sep, photutils, pandas, astroquery, pyvo, ipywidgets, sfdmap, extinction, sphinx_rtd_theme
as packages you import (plus Python as a whole), most (I think all but matplotlib and sphinx) being in requirements.txt, so the options could mirror those going forward as the project evolves.
Along a similar line, you cite some of these packages in your paper, but left others out. I noticed at least numpy, pandas, and matplotlib have literature citations you could provide, but there may be others, and it would be good to get as many citations as possible into the paper.
Also for the paper, the checklist mentions a comparison with the state-of-the-field. Are there any other similar software packages that perform this kind of work, or does hostphot fill a unique niche? Is there anything you could compare to?
requirements.txt does not list pytest, so pip install hostphot did not install it, and your README instructions therefore produce an error message quite quickly. Either add a quick line to the README telling the reader that if they want to run the test suite they should pip install pytest, or add it to requirements.txt so it's picked up automatically.
While your test suite covers one of each kind of process in the software, it only checks to see if the code passes from start to finish. Is there any way you could verify the outputs from those tests to check for consistency and correctness of the output? Otherwise you might be at risk of -- as a simple example -- an upstream bug in a photutils calculation that, say, accidentally doubles all of the fluxes from aperture_photometry, and would have no way of verifying this had happened to you at present. You can use both assert (assert some_calculated_flux == 1.5, if we knew the galaxy had a flux of exactly 1.5 in your test image, unchangingly) or if using from numpy.testing import assert_allclose tests can be run to check that arrays are similar enough to within relative and absolute tolerances specified.
Again in the test suite, there is only test one example from each item, giving a low line coverage. To test this I ran pip install pytest-cov and pytest -v --cov hostphot giving an extra pytest output of

---------- coverage: platform darwin, python 3.10.4-final-0 ----------
Name                                   Stmts   Miss  Cover
----------------------------------------------------------
src/hostphot/__init__.py                  24      8    67%
src/hostphot/_constants.py                 3      1    67%
src/hostphot/_version.py                   1      0   100%
src/hostphot/coadd.py                     23      1    96%
src/hostphot/cutouts.py                  170     90    47%
src/hostphot/dust.py                      59     15    75%
src/hostphot/global_photometry.py        149     51    66%
src/hostphot/image_cleaning.py            39     14    64%
src/hostphot/image_masking.py            105     18    83%
src/hostphot/interactive_aperture.py     167     54    68%
src/hostphot/local_photometry.py         114      8    93%
src/hostphot/objects_detect.py            79     39    51%
src/hostphot/rgb_images.py               159    142    11%
src/hostphot/utils.py                     87     10    89%
----------------------------------------------------------
TOTAL                                   1179    451    62%

For example, in test_cutouts.py there is a comment mentioning a failure in one of the examples -- what is the failure, and is it specifically for that ra-dec combination or can you run a different bit of sky for SDSS to PS1? Have you also verified the DES cutouts work? If a full test is too complex to run end-to-end, perhaps try breaking the tests up into smaller parts -- for rgb_images.py for example tests could be run on sky_mean_sig_clip as an individual function instead of as a larger part of a galaxy analysis.

In the README, under "Contributing", there is a mention that users can contact you directly, but no contact details have been provided. Would you be willing/able to provide something like an email address (or other contact service, if you had something else in mind) in the README there, so it was a little easier for users to email you with questions? Perhaps a more generic "[email protected]" so it can be shared with multiple maintainers in the future, and if you didn't want to provide a personal contact?
For some of the larger, common-use functions (ones the end-user is actually likely to want to call, at least), it might be good to include your real-world examples of usage (from the README/readthedocs) as part of the documentation/docstring. These would then automatically be included with sphinx and be in the "right" place in the API documentation as well. As an example see for instance numpy's savetxt documentation and source code. Note that these do not need to run themselves, but would just be examples of what the inputs and outputs are expected to look like.
I don't think it's necessary to duplicate the _choose_workdir functions within each .py file. From a quick test, I think you can keep __workdir__ solely within _constants.py and reference it from that file's version directly. I made a minor change to delete the version of the function in __init__.py, and remove the underscore before the function in _constants.py:

__workdir__ = "images"


def choose_workdir(workdir):
    """Updates the work directory.

    Parameters
    ----------
    workdir: str
        Path to the work directory.
    """
    global __workdir__
    __workdir__ = workdir

and then in the terminal I ran

Python 3.10.4 (main, Mar 31 2022, 03:38:35) [Clang 12.0.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from hostphot._constants import choose_workdir, __workdir__
>>> import hostphot
>>> print(__workdir__, hostphot._constants.__workdir__)
images images
>>> choose_workdir('other_folder')
>>> print(__workdir__, hostphot._constants.__workdir__)
images other_folder

which didn't update the cached version of __workdir__ we imported previously, but did change the version in ._constants, so I think everywhere you use __workdir__ in the script you can instead call hostphot._constants.__workdir__ and keep all changes made in a hostphot.choose_workdir call. This allows for the deletion of the version of the function in __init__.py and every non-_constants file, and then you also wouldn't need the from hostphot._constants import __workdir__ calls either, since they would be called directly when needed.

Is it possible to use non-PS1/SDSS/DES data within hostphot at the moment? From the README "Adding other surveys" section at first glance I thought perhaps there was (reduced, basic) functionality to (generically) call another survey database and extract cutouts, convert them, extract fluxes/magnitudes etc. but from the source code it appears to be a little more involved and survey-specific, and that there is only the option to use the three listed surveys at the moment. Does this mean that the README's suggestion is to open a PR to include more options in e.g. hostphot.cutouts.download_images? Maybe the language could be tweaked slightly to make it a bit more obvious that "adding other surveys" means "please open a pull request to add new functions!" if this is the case.

[JOSS] Initial comments about code and repository

Basing this on commit 011333f, using Python 3.10.5 (main, Jun 6 2022, 18:49:26) [GCC 12.1.0] on linux

A few things are missing from your readme:
- A Contributing section, with expected code style and possible adaptations of pep8 and ideally a description how the software came to be and how/if you plan to develop it further (e.g. "I plan on adding feature X", "I would be really glad if someone could help with Y")
- How you're supposed to run the tests (and ideally linters)
Data originally checked into and subsequently removed is still visible/available in the history, so every clone of the repository will still download the files. If you didn't plan on publishing it, you should re-write the repository history with something like an interactive rebase.
There are a bunch of issues reported by pylint, flake8 and check-manifest. I'd suggest running something like autopep8 or black on the code, that should fix the majority of issues. If you really want to deviate from the pep8 style, you should include the relevant linter config files as part of the repository (they seem to be manually selected in the gitub-ci config?), so the checks can be run locally. A tool like tox would be a helpful addition so that one can check their work with a single command before they commit/push.
running the tests with python -m unittest tests/* creates a load of warnings relating to the fits-file handling. It fails with the following message

ERROR: tests/__pycache__ (unittest.loader._FailedTest)
----------------------------------------------------------------------
ImportError: Failed to import test module: tests/__pycache__
Traceback (most recent call last):
  File "/usr/lib/python3.10/unittest/loader.py", line 154, in loadTestsFromName
    module = __import__(module_name)
ModuleNotFoundError: No module named 'tests/__pycache__'

when running tests with pytest (as I think it was intended?), some warnings remain. I suggest silencing the warnings you expect to happen so a new warning introduced by future additions will actually be spotted.

[JOSS] comments about paper, documentation

Could you mention what the difference between local vs global photometry is and what that implies for your code? It feels like one of the central parts of the problem to differentiate a Supernova's flux from the host galaxy so a few remarks about what that entails would be a nice starting point for someone reading your code.
in "HostPhot is fast at calculating photometry (up to a few seconds per object)" It sounds a bit like you're impyling that you implement the photometry algorithms yourself. I feel you could make the relationship between your code and the used libraries a bit clearer, i.e. saying something like "after cleaning the image, photometry is calculated with photutils".

[JOSS] More subjective, possibly nitpicky comments.

I'll also open this as a place for comments that aren't really relevant to the submission criteria but which I nevertheless hope to be constructive criticism more in line with a general code review.

As a general remark, be aware that most code will be read more often than it is written and reducing "mental load" you place on your audience (which includes future you) is a big factor to the time and nerves it will cost them to understand what is going on. I'd wager a guess and say you and others are also a lot more likely to contribute if they feel "at home". There's a few things that help with that
- Stick to established style conventions (pep8) and explain your reasoning when deviating
- I tend to make a pep8-exception when it comes to line-length for more verbose variable names (e.g. size->image_size, pos->host_galaxy_coordinates). But keep in mind that at 4-5 layers of indentation, things become really hard to follow, so still set a reasonable limit like 120 chars.
- Aim to reduce the number of surprises in general: The same or very similar parameters should have the same naming scheme and type when used in multiple functions.
- The number of things one can simultaneously keep in their head is limited, so keep the number of function-parameters manageable. If possible, bundle them together into logical groups, i.e. use a singleastropy.SkyCoord instead of ra and dec or a combine multiple flags into a config- dataclass.
Comments like this that don't add any information beyond what the function name already contains should be more about the bigger picture and why something is done or just omitted.

test_preprocessing.py:
    # coadd
    coadd_images(name, coadd_filters, survey)

type-hints would be a nice addition, even though the types are already mentioned in the docstring. It allows for automatic checks and shows users of your library squiggly lines in their IDE for some of the mistakes they might make.
I'm a bit confused by the working dir mechanism. Why not just let the user use os.chdir()? In case you want to be more flexible, you could have a global configuration object that has something like working_directory: pathlib.Path and then use it whenever you want to write something (like hdu.writeto(config.working_directory/'foo.fits')). In my understanding "dunder" varriables (__foo__) are reserved for file metadata or to inspect internal state but should not be required for the program to run.
Stuff like filter names that are restricted to certain values should ideally be constants or enums. Or at least their possible values should be listed in the docstring with the parameter explanation.

Recommend Projects

React

A declarative, efficient, and flexible JavaScript library for building user interfaces.
Vue.js

🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
Typescript

TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
TensorFlow

An Open Source Machine Learning Framework for Everyone
Django

The Web framework for perfectionists with deadlines.
Laravel

A PHP framework for web artisans
D3

Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

javascript

JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
web

Some thing interesting about web. New door for the world.
server

A server is a program made to process requests and deliver data to clients.
Machine learning

Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Visualization

Some thing interesting about visualization, use data art
Game

Some thing interesting about game, make everyone happy.

Recommend Org

Facebook

We are working to build community through open source technology. NB: members must have two-factor auth.
Microsoft

Open source projects and samples from Microsoft.
Google

Google ❤️ Open Source for everyone.
Alibaba

Alibaba Open Source for everyone
D3

Data-Driven Documents codes.
Tencent

China tencent open source team.