Comments (14)
You'll also want to include a local linearized world coordinate system.
I think we may need to treat large galaxies differently in more ways than subsampling. Take Andromeda, for example, and imagine hypothetically that we had enough computing power to actually fit the ELBO jointly for it and all the celestial objects in front of it. Our galaxy model is pretty primitive and would not capture most of the variation in light from Andromeda. The ELBO would then try to explain the deviations between our primitive model of Andromeda and the actual light in part with the parameters of the objects in front of it.
In other words, by trying to fit the residuals from Andromeda, which will be highly structured, Celeste will bias the objects in front of it.
For truly large galaxies like this, we might be better off treating the large galaxy as background noise, which we would have to estimate ourselves, possibly iteratively. One way to think of this is that our "model" of Andromeda is just a rasterized pixel map, which we would then fit jointly with all the objects in front of it. Andromeda's catalog entry would then be derived from this pixel map, rather than fit jointly with everything else. I'd be interested to hear other ideas, though. How does the current catalog handle this problem?
from celeste.jl.
I added linearized world coordinates to the list.
It's hard to predict exactly how large galaxies are going to give us trouble. How about we first try to optimize each object independently, with the data encoded as described in this issue, to get familiar with running at scale and to see where the model makes mistakes? @rcthomas, is encoding the data this way something you might be up for working on, along with me and Ryan?
from celeste.jl.
Oops, sorry I missed the mention. Yeah, I think I can help with this; let's talk about it at the meeting tomorrow.
from celeste.jl.
I put the current version of the relevant files in bin/staging.
from celeste.jl.
Hi @kbarbary, I've been thinking a bit more about this issue. You might start by consolidating a lot of the scripts in the bin directory into a single script that supports the following kind of sample usage:
./run_celeste.jl --stage preprocess --run 3900 --camcol 4 --field 829
Stages might include "preprocess" (generates a JLD file of Task objects), "infer" (runs OptimizeElbo for each Task and outputs a FITS catalog), and "score" (compares the output catalog to a catalog built for "coadd" images). Each stage reads the previous stage's output from disk and writes its own output to disk, in a file named after its stage, run, camcol, and field.
Operating on run-camcol-field triples, rather than bricks/bulkans, should get us started generating big catalogs more quickly.
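A minimal sketch of how the staged interface proposed above could be wired up. It is written in Python for brevity rather than as the eventual run_celeste.jl. The file-name pattern, the file extensions, the `--outdir` flag, and the function names are all assumptions made for illustration; the proposal only says that each file is named after its stage, run, camcol, and field.

```python
import argparse
from pathlib import Path

# Stage order as proposed above; each stage reads the previous stage's output.
STAGES = ["preprocess", "infer", "score"]

# Hypothetical extensions: the proposal names a JLD file and a FITS catalog,
# but does not say what format the "score" stage writes.
EXTENSIONS = {"preprocess": "jld", "infer": "fits", "score": "csv"}


def stage_path(outdir, stage, run, camcol, field):
    """Build the output path for one stage (the naming pattern is assumed)."""
    name = f"celeste-{stage}-{run:06d}-{camcol}-{field:04d}.{EXTENSIONS[stage]}"
    return Path(outdir) / name


def input_path(outdir, stage, run, camcol, field):
    """Return the previous stage's output path, or None for the first stage."""
    idx = STAGES.index(stage)
    if idx == 0:
        return None
    return stage_path(outdir, STAGES[idx - 1], run, camcol, field)


def parse_args(argv=None):
    """Parse the flags shown in the sample usage, plus a hypothetical --outdir."""
    parser = argparse.ArgumentParser(description="Run one stage on one field.")
    parser.add_argument("--stage", choices=STAGES, required=True)
    parser.add_argument("--run", type=int, required=True)
    # SDSS has six camera columns, numbered 1 through 6.
    parser.add_argument("--camcol", type=int, choices=range(1, 7), required=True)
    parser.add_argument("--field", type=int, required=True)
    parser.add_argument("--outdir", default=".")
    return parser.parse_args(argv)
```

With this layout, `parse_args(["--stage", "infer", "--run", "3900", "--camcol", "4", "--field", "829"])` identifies the field, `input_path` points at the "preprocess" output for the same triple, and `stage_path` gives the file the "infer" stage would write. Because stages only communicate through files on disk, any one of them can be rerun on its own.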
from celeste.jl.
To reiterate a point that Rollin made, now that Kyle is in the room: running tractor at NERSC is tricky, so we should plan on something like download_fits_files.py having been run to get all the necessary files in one place before Julia gets invoked.
from celeste.jl.
The FITS files are mostly (all?) already in the cosmo repository at NERSC, so we shouldn't need to download anything.
from celeste.jl.
I have mapped the files that you are getting via download_fits_files.py to where they are on the NERSC global file system. See e.g. from Cori:
/global/projecta/projectdirs/sdss/data/sdss/dr8/boss/photoObj/301/1000/1
This corresponds to looking here:
http://data.sdss3.org/sas/dr8/groups/boss/photoObj/301/1000/1/
Note also that data.sdss3.org is sdss3data.lbl.gov, so this is the server the data comes from when you run download_fits_files.py.
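The correspondence above could be captured in a small helper along these lines. This is only a sketch: the two prefixes are read off the single example given, and assuming they hold for every file under that tree is a guess, not something confirmed here. The function name `nersc_path` is hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Prefixes taken from the one example above; treating them as a general rule
# for the whole directory tree is an assumption.
URL_PREFIX = "/sas/dr8/groups/boss/"
NERSC_PREFIX = "/global/projecta/projectdirs/sdss/data/sdss/dr8/boss/"

# Both host names refer to the same server, per the note above.
KNOWN_HOSTS = {"data.sdss3.org", "sdss3data.lbl.gov"}


def nersc_path(url):
    """Map an SDSS data URL to its assumed location on the NERSC file system."""
    parsed = urlparse(url)
    if parsed.hostname not in KNOWN_HOSTS:
        raise ValueError(f"unexpected host: {parsed.hostname!r}")
    if not parsed.path.startswith(URL_PREFIX):
        raise ValueError(f"unexpected URL path: {parsed.path!r}")
    # Keep everything after the URL prefix and graft it onto the NERSC prefix.
    relative = parsed.path[len(URL_PREFIX):]
    return PurePosixPath(NERSC_PREFIX) / relative
```

Applied to the URL above, the helper yields the Cori directory quoted above, so a job could read files in place instead of downloading them first.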
from celeste.jl.
Thanks for the additional pointers.
Off-topic: It's not the highest priority, but I started working on a Julia replacement for the tractor bits, based on the Python astroquery.sdss module. This module is BSD-licensed, so vastly preferred over porting tractor, which is GPL (and therefore incompatible with the Celeste and SloanDigitalSkySurvey licenses). It's not much code so should be pretty quick.
from celeste.jl.
Ryan's point is that tractor doesn't have to be a "dependency" at all. You generate the inputs once as CSV and you are done. That simplifies the problem a lot. There is a setup that works on Cori under hpcports; you run the setup once and store the result as file data.
from celeste.jl.
Were we using tractor for something other than downloading FITS files? Because that's a pretty heavyweight way to go, to put it mildly, for downloading FITS files; you can just fetch those files with wget from sdss3.org, right? Or was tractor doing something else?
from celeste.jl.
It's a bit more than wget but it's a limited subset of functionality.
from celeste.jl.
This issue is getting unwieldy, so I broke it up into multiple issues (#114, #115, #116, and #117).
from celeste.jl.
Thanks, those smaller more detailed issues are helpful. Now that the WCS dust has settled, I can move on to working on those.
from celeste.jl.