Comments (27)
I've copied the data from /project/projectdirs/lsst/jamesp/extracted
to /home/jchiang/DC1/extracted
on lsst-dev. @laurenam let me know if you have problems accessing those files.
from ssim_dc1.
@laurenam I've changed the permissions. Please try again.
Good point! Take the above as useful information for future, multiband, releases :)
I basically need the outputs of a processing run in a properly butlerized repo (I usually think of this as a given processing as a rerun). For the visit-level analysis, a single visit will suffice to test out the code and at the coadd-level, a single patch should do (I don't need all the visit-level inputs that went into the coadd). We have a colorAnalysis script to look at the stellar locus on color-color plots, but I'm under the impression you only have one band in this DC, so I won't be able to test those scripts yet.
I should also add that, in order to do this work, I will first need to talk with my pointy-haired superiors to see if they can prioritize this and get it into a sprint...but I am willing!
Thanks!
I basically need the outputs of a processing run in a properly butlerized repo (I usually think of this as a given processing as a rerun). For the visit-level analysis, a single visit will suffice to test out the code and at the coadd-level, a single patch should do (I don't need all the visit-level inputs that went into the coadd).
The thing that I am a bit confused by (and it is probably trivial) is that we have a properly butlerized repo. But it has 75 TB in it! So I'm not sure how to copy just the pieces of it that you need. Is that obvious?
We have a colorAnalysis script to look at the stellar locus on color-color plots, but I'm under the impression you only have one band in this DC, so I won't be able to test those scripts yet.
Yes, DC1 is 40 sq degrees in r-band only. DC2 will likely be 300 sq degrees in all 6 bands.
Sorry for the confusion. The qa scripts are run as CommandLineTasks, so they are very similar to running any other CommandLineTask (e.g. processCcd.py). As an example, on lsst-dev, I used the command:
hscVisitAnalysis.py /datasets/hsc/repo/ --rerun RC/w_2017_28/DM-11184/:private/lauren/DM-11090/w_2017_28/ --id visit=1166 ccd=0..8^10..103 --tract=9813 --config doApplyUberCal=False
to produce the plots for the processed single-frame visit data that lives in /datasets/hsc/repo/rerun/RC/w_2017_28/DM-11184 (and the tract information is actually required for the visit-level analysis, so your skymap is required here as well). The output goes to /datasets/hsc/repo/rerun/private/lauren/DM-11090/w_2017_28/ in this case. While that repo contains ~8 TB, to run the above I only need the outputs relevant to that particular visit (including the schema, config, metadata, and, of course, the calexps and catalogs).
The same goes for the coadds, where the data is identified by tract/patch/filter, e.g. --id tract=9813 patch=4,5 filter=HSC-I. So there I would only need all the deepCoadd output relevant to patch 4,5.
So you would only need to copy over the content of the repo relevant to a single visit and a single patch...does that make sense?
I will need to add some datasets to obs_lsstSim since they are not part of the common list in obs_base.
So you would only need to copy over the content of the repo relevant to a single visit and a single patch...does that make sense?
Right. So this is the part I am asking for help on.
We have 75 TB of stuff in these directories.
cori08:DC1-imsim-dithered % ls
_mapper deep_assembleCoadd_metadata/ ref_cats@
background_values/ deep_makeCoaddTempExp_metadata/ ref_cats_orig/
calexp/ eimage/ registry.sqlite3
config/ icExp/ schema/
deepCoadd/ icSrc/ src/
deepCoadd-results/ processEimage_metadata/ srcMatch/
Each of the directories has all of the visits in them. So, is there a straightforward way for us to extract what you need?
Ok, looking at LsstSimMapper.yaml, I think I need:
- _mapper
- calexp/v${VISIT}-f${FILTER}/*
- config/*
- deepCoadd/skyMap.pickle
- deepCoadd-results/${FILTER}/${TRACT}/${PATCH}/*
- deep_assembleCoadd_metadata/${FILTER}/${TRACT}/${PATCH}.boost
- registry.sqlite3
- schema/*
- src/v${VISIT}-f${FILTER}/*
- srcMatch/v${VISIT}-f${FILTER}/*
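For reference, a subset like the one listed above could be pulled out with a short script along these lines. This is a minimal sketch only: the extract_subset helper is illustrative (not part of any LSST package), and the path templates simply follow the list in this thread; substitute your own visit/filter/tract/patch values and adjust if the actual mapper layout differs.

```python
import shutil
from pathlib import Path

def extract_subset(src, dst, visit, filt, tract, patch):
    """Copy only the repo pieces needed for one visit and one patch."""
    src, dst = Path(src), Path(dst)
    # Single files, following the list in this thread.
    files = [
        "_mapper",
        "registry.sqlite3",
        "deepCoadd/skyMap.pickle",
        f"deep_assembleCoadd_metadata/{filt}/{tract}/{patch}.boost",
    ]
    # Whole directories to copy recursively.
    dirs = [
        "config",
        "schema",
        f"calexp/v{visit}-f{filt}",
        f"src/v{visit}-f{filt}",
        f"srcMatch/v{visit}-f{filt}",
        f"deepCoadd-results/{filt}/{tract}/{patch}",
    ]
    for rel in files:
        target = dst / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src / rel, target)
    for rel in dirs:
        shutil.copytree(src / rel, dst / rel, dirs_exist_ok=True)
```

e.g. extract_subset("DC1-imsim-dithered", "extracted", 1993939, "r", 0, "18,13") would gather the files for the visit/patch discussed below into a small standalone tree that can then be tarred up and copied elsewhere.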
Also, I will need to be able to setup and use the same reference catalogs you used...how does that work for sims?
Also, I will need to be able to setup and use the same reference catalogs you used...how does that work for sims?
I'm going to let @SimonKrughoff answer or point you to the correct person (either @danielsf or @jchiang87 may also know).
Is there someone interested in working with Lauren to copy these files for her and then later using her DM QA scripts to run on our output? @jamesp-epcc Would you be interested in this as a way to start to get familiar with the output?
Yes, I am happy to give this a go. It sounds like a good way to get familiar with the data.
Sorry for asking such a basic question, but what is the full path to the "DC1-imsim-dithered" directory on Cori? I've had a look in a few places but can't find it. It may be in a location that I don't currently have permission to access...
/global/cscratch1/sd/descdm/DC1/DC1-imsim-dithered
You can find the DC1 reference catalog at
/project/projectdirs/lsst/danielsf/dc1_reference_catalog_8deg_radius.txt
Thanks. I am able to access the data now. I have written a script that extracts the files Lauren listed above for a single visit and single patch. This comes to about 5-6 GB for the ones I have tried so far. What would be the best way to transfer this data to Lauren?
Lauren needs this data on lsst-dev. This is a machine used by the LSST developers at the NCSA center in Illinois. I don't have access to the machine but there are many who do. We need to find someone who can copy the data there (since Lauren doesn't have access to NERSC). Simon is out for a few days since he is in the process of moving to Tucson.
Who is a good candidate for this?
I don't have access to lsst-dev myself, but I have put the extracted data in /project/projectdirs/lsst/jamesp/extracted on Cori, where I believe someone else on the project should be able to read it. (This data is for visit 1993939 and patch 18,13 but I can easily rerun the script if a different visit or patch is preferable).
I can do it.
I can do it.
Great.
lauren@lsst-dev01:~ $ ls /home/jchiang/DC1/extracted
ls: cannot access /home/jchiang/DC1/extracted: Permission denied
@jchiang87 I think you need to give me read permission on your home dir.
Have you run the forced measurements on the coadds as part of DC1? I don't see the forced source files in the directory @jchiang87 created on lsst-dev, /home/jchiang/DC1/extracted. Of potential note, forcedPhotCoadd.py gets run when using the multibandDriver.py driver script in pipe_drivers, but not when running the non-driver multiband.py command-line version. My plotting scripts make great use of the forced output (it's an extremely important dataset for many science cases)...would it be possible for you to run that on (at least) tract=0 patch=18,13 and add the output to the above directory?
I might be able to run it myself, but I would need the contents of the deepCoadd-results/merged/ directory for tract=0 patch=18,13.
I'm fairly certain that we don't run forcedPhotCoadd.py in our version of the Level 2 pipeline, but @SimonKrughoff or @tony-johnson would be able to say more definitively. In case you want to try to run it yourself, I copied the contents of that directory from NERSC to /home/jchiang/DC1/extracted/deepCoadd-results/merged/0/18,13 on lsst-dev. Otherwise, if you send me the full command line, I could try to run it at NERSC.
Thanks...I think I've got it running now. For reference, the command line looks something like
forcedPhotCoadd.py /home/jchiang/DC1/extracted/ --output /datasets/hsc/repo/rerun/private/lauren/DM-11452/ --id tract=0 patch=18,13 filter=r
Just for my education:
I thought forced photometry was run on all of the warped exposures in order to get the flux we see for cmodel and psf_flux. Is that something else?
We are running forcedPhotCcd.py on the warped images (?) for each visit for the lightcurves in Twinkles (but not for the DC1 data). I think the cmodel and psf_flux measurements for DC1 are obtained from the measureCoaddSources.py task. Not sure how those results would differ from the output of forcedPhotCoadd.py.
Some info on forced measurements from Jim Bosch's HSC pipeline paper (https://arxiv.org/abs/1705.06766):
The final step is another run of the source measurement suite, but this time in forced mode: we hold all position and shape parameters fixed to the values from the previous measurement in the reference band. This ensures that the forced measurements are consistent across bands and use a well-constrained position and shape, which is particularly important for computing colors from differences between magnitudes in different bands.
Ah, so if this is forced across bands for the merged exposures, is it relevant for us since we are only using r-band in DC1?