Giter Club home page Giter Club logo

Comments (2)

temuller avatar temuller commented on July 19, 2024

Thanks for these suggestions. I have made some changes (under the dev branch) following these comments. I'll merge this once I have finished with the update.

Following the checklist in the review, I have some minor suggestions and questions that I hope will improve the package.

1. It might be good somewhere in the documentation -- in the README, the readthedocs, or both -- to get an explicit list of the dependencies the software has, so that an advanced user could pre-generate (or re-use) a conda environment and then simply clone your repository. Since this is a more "advanced" method (as opposed to the very clean "pip install" option you already provide), it likely could be a separate page/section in the readthedocs. This would also have the added bonus of more clearly highlighting the packages you build on. It's a little more complex than your case, but as an example the [astropy](https://docs.astropy.org/en/stable/install.html#requirements) installation page lists the simple "pip install" option, but then has a "requirements" section further down the page. I think I found
   astropy, reproject, numpy, matplotlib, sep, photutils, pandas, astroquery, pyvo, ipywidgets, sfdmap, extinction, sphinx_rtd_theme
   as packages you import (plus Python as a whole), most (I think all but matplotlib and sphinx) being in `requirements.txt`, so the options could mirror those going forward as the project evolves.

I have added an explicit list with the required packages in the README and readthedocs.

2. Along a similar line, you cite some of these packages in your paper, but left others out. I noticed at least numpy, pandas, and matplotlib have literature citations you could provide, but there may be others, and it would be good to get as many citations as possible into the paper.

I have added either a proper citation or a link (if no citation was found) to the packages used.

3. Also for the paper, the checklist mentions a comparison with the state-of-the-field. Are there any other similar software packages that perform this kind of work, or does `hostphot` fill a unique niche? Is there anything you could compare to?

I have not found other public codes that do the same tasks as hostphot. Past works have relied on custom codes that are survey specific, not public and more narrow on their scope. In addition, these have mainly been used for distant galaxies, where there are no problems with foreground star/objects. Therefore, hostphot provides a unique opportunity to the community to perform local and global photometry of different surveys, masking foreground objects. The community can also contribute and improve this code as it is public.

4. `requirements.txt` does not list pytest, so `pip install hostphot` did not install it, and your README instructions therefore produce an error message quite quickly. Either add a quick line to the README telling the reader that if they want to run the test suite they should `pip install pytest`, or add it to `requirements.txt` so it's picked up automatically.

pytest has been added to the requirements

5. While your test suite covers one of each kind of process in the software, it only checks to see if the code passes from start to finish. Is there any way you could verify the outputs from those tests to check for consistency and correctness of the output? Otherwise you might be at risk of -- as a simple example -- an upstream bug in a `photutils` calculation that, say, accidentally doubles all of the fluxes from `aperture_photometry`, and would have no way of verifying this had happened to you at present. You can use both `assert` (`assert some_calculated_flux == 1.5`, if we knew the galaxy had a flux of exactly 1.5 in your test image, unchangingly) or if using `from numpy.testing import assert_allclose` tests can be run to check that arrays are similar enough to within relative and absolute tolerances specified.

I now test the outputs of the photometry with numpy.testing import assert_allclose as was suggested.

6. Again in the test suite, there is only test one example from each item, giving a low line coverage. To test this I ran `pip install pytest-cov` and `pytest -v --cov hostphot` giving an extra pytest output of
---------- coverage: platform darwin, python 3.10.4-final-0 ----------
Name                                   Stmts   Miss  Cover
----------------------------------------------------------
src/hostphot/__init__.py                  24      8    67%
src/hostphot/_constants.py                 3      1    67%
src/hostphot/_version.py                   1      0   100%
src/hostphot/coadd.py                     23      1    96%
src/hostphot/cutouts.py                  170     90    47%
src/hostphot/dust.py                      59     15    75%
src/hostphot/global_photometry.py        149     51    66%
src/hostphot/image_cleaning.py            39     14    64%
src/hostphot/image_masking.py            105     18    83%
src/hostphot/interactive_aperture.py     167     54    68%
src/hostphot/local_photometry.py         114      8    93%
src/hostphot/objects_detect.py            79     39    51%
src/hostphot/rgb_images.py               159    142    11%
src/hostphot/utils.py                     87     10    89%
----------------------------------------------------------
TOTAL                                   1179    451    62%

For example, in test_cutouts.py there is a comment mentioning a failure in one of the examples -- what is the failure, and is it specifically for that ra-dec combination or can you run a different bit of sky for SDSS to PS1? Have you also verified the DES cutouts work? If a full test is too complex to run end-to-end, perhaps try breaking the tests up into smaller parts -- for rgb_images.py for example tests could be run on sky_mean_sig_clip as an individual function instead of as a larger part of a galaxy analysis.

It is hard to test steps in between as there are no reference values apart from the actual magnitudes (end product). For test.cutouts I now successfully test all the implemented surveys. I now also test rgb_images.py which was previously untested and removed some functions that were not been used.

7. In the README, under "Contributing", there is a mention that users can contact you directly, but no contact details have been provided. Would you be willing/able to provide something like an email address (or other contact service, if you had something else in mind) in the README there, so it was a little easier for users to email you with questions? Perhaps a more generic "[[email protected]](mailto:[email protected])" so it can be shared with multiple maintainers in the future, and if you didn't want to provide a personal contact?

I now redirect the users to my profile so they can get my e-mail from there. This way, if I change my e-mail in the future, it should be updated in my profile.

8. For some of the larger, common-use functions (ones the end-user is actually likely to want to call, at least), it might be good to include your real-world examples of usage (from the README/readthedocs) as part of the documentation/docstring. These would then automatically be included with `sphinx` and be in the "right" place in the API documentation as well. As an example see for instance `numpy`'s savetxt [documentation](https://numpy.org/doc/stable/reference/generated/numpy.savetxt.html) and [source code](https://github.com/numpy/numpy/blob/v1.23.0/numpy/lib/npyio.py#L1434=). Note that these do not need to run themselves, but would just be examples of what the inputs and outputs are expected to look like.

I have added some examples in the docstrings of the main functions.

9. I don't think it's necessary to duplicate the `_choose_workdir` functions within each `.py` file. From a quick test, I think you can keep `__workdir__` solely within `_constants.py` and reference it from that file's version directly. I made a minor change to delete the version of the function in `__init__.py`, and remove the underscore before the function in `_constants.py`:
__workdir__ = "images"


def choose_workdir(workdir):
    """Updates the work directory.

    Parameters
    ----------
    workdir: str
        Path to the work directory.
    """
    global __workdir__
    __workdir__ = workdir

and then in the terminal I ran

Python 3.10.4 (main, Mar 31 2022, 03:38:35) [Clang 12.0.0 ] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from hostphot._constants import choose_workdir, __workdir__
>>> import hostphot
>>> print(__workdir__, hostphot._constants.__workdir__)
images images
>>> choose_workdir('other_folder')
>>> print(__workdir__, hostphot._constants.__workdir__)
images other_folder

which didn't update the cached version of __workdir__ we imported previously, but did change the version in ._constants, so I think everywhere you use __workdir__ in the script you can instead call hostphot._constants.__workdir__ and keep all changes made in a hostphot.choose_workdir call. This allows for the deletion of the version of the function in __init__.py and every non-_constants file, and then you also wouldn't need the from hostphot._constants import __workdir__ calls either, since they would be called directly when needed.

I am starting to think that I could just fix the working directory to be 'images' as most user will probably never change it. Also, making the changes you suggest does not change the working directory everywhere, only the one from hostphot._constants (not the one from hostphot.cutouts, for example) in the current python "session".

10. Is it possible to use non-PS1/SDSS/DES data within `hostphot` at the moment? From the README "Adding other surveys" section at first glance I thought perhaps there was (reduced, basic) functionality to (generically) call another survey database and extract cutouts, convert them, extract fluxes/magnitudes etc. but from the source code it appears to be a little more involved and survey-specific, and that there is only the option to use the three listed surveys at the moment. Does this mean that the README's suggestion is to open a PR to include more options in e.g. `hostphot.cutouts.download_images`? Maybe the language could be tweaked slightly to make it a bit more obvious that "adding other surveys" means "please open a pull request to add new functions!" if this is the case.

The idea is that if the user wants a new survey to be added, they can do a PR or give me some basic information so I can add them myself. This has been rephrased. Hopefully, it is more clear now.

from hostphot.

Onoddil avatar Onoddil commented on July 19, 2024

To avoid quote replying to your quote reply and it being a bit of a mess I'll just number my replies to your replies.

  1. Great, thanks!
  2. Perfect
  3. That's fine, I just had to check as per the demands of the review list. If there's no direct comparison then no comparison can be made.
  4. Thanks
  5. It would be good to expand the tests as time goes on and new features are added, but I agree that the nature of quite a lot of the functionality limits the verification step somewhat (how do you check for "I downloaded an image?" in a way that verifies the image is exactly the same, or how do you check that an interactive GUI works in an automated script?), so it's good that you've added bits where possible.
  6. Coverage is now at 80% as you mentioned, which is much more complete than previously. With such a small codebase small bits of untestable code will have a large impact on coverage, so I'm not too sure how you'd get much higher without having to re-write a lot of the code or spending a lot of time. Again, it would be good to make sure coverage remains high as time goes on, but that's outside of the scope of the review!
  7. That's a good way to handle contact details, and thanks for adding the redirect.
  8. Thank you for adding examples to the docstrings.
  9. Just leaving the directory as a fixed folder name is probably fine to be honest, and avoids all of this complication. But yes, the idea would be to not have a hostphot.cutouts.__workdir__ (or have one per individual .py file anymore, as you have now), you'd always use the one in hostphot._constants instead. But the way you have it set up now does work, so I was just being nitpicky in trying to tidy up the code; either keeping it as it is, seeing if you like my suggestion, or leaving the directory as a fixed value would all be fine, none of that is relevant for my review at the moment so I'll just leave it as-is.
  10. The new wording is much clearer, thanks for clarifying that.

This answers all of my questions from the review so I'll close this issue!

from hostphot.

Related Issues (4)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.