zooniverse / theia
Building the next-generation Floating Forests pipeline
Floating Forests only cares about coastline images. Detecting coastlines is hard, but the answer may lie in (a) pixel_qa data that estimates water vapor or (b) a simple neural net that looks at the RGB histograms from #16.
Images with no water, or images that are only water, shouldn't be uploaded.
The service should do more logging in general so we can look at the kube logs to see what it's doing.
Locate potential scenes for download according to user criteria.
Needs to describe an image acquisition request, including search criteria, target project, and configuration options.
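A minimal sketch of that filter, assuming a per-pixel water mask is already available for each tile (how the mask is derived is left open). The function names and the 5% / 95% thresholds are illustrative assumptions, not values from the project.

```python
def water_fraction(water_mask):
    """Return the share of pixels flagged as water in a 2-D boolean mask."""
    total = sum(len(row) for row in water_mask)
    if total == 0:
        return 0.0
    water = sum(1 for row in water_mask for pixel in row if pixel)
    return water / total


def looks_like_coastline(water_mask, min_water=0.05, max_water=0.95):
    """Keep only tiles that contain both water and land.

    min_water and max_water are hypothetical thresholds, not tuned values.
    """
    return min_water <= water_fraction(water_mask) <= max_water


all_land = [[False, False], [False, False]]
all_water = [[True, True], [True, True]]
mixed = [[True, False], [True, False]]

upload_candidates = [
    name
    for name, mask in [("land", all_land), ("water", all_water), ("mixed", mixed)]
    if looks_like_coastline(mask)
]
```

Only the mixed tile survives the filter here; the thresholds would need tuning against real imagery.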
Desired Behavior: When visiting the Theia homepage, I can click the "Connect to Zooniverse" button to login/authenticate.
Current Behavior: The footer component of the webpage obscures the "Connect to Zooniverse" button and I cannot click it to follow the login link. Note: if I use dev tools to hide the footer component, I'm able to click the button and follow the link.
Device Info: Chrome 89 on Mac OSX 10.14; Firefox 87 shows the same behavior.
Related to #13
Mocking parts of the system has been difficult; read this
Currently celery tasks are retried forever, which can generate a ton of noise. It's pretty easy to specify how many retries a task should get and what kind of back-off period there should be before retrying.
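Celery exposes task options such as `max_retries`, `retry_backoff`, and `retry_backoff_max` for this. The sketch below is plain Python rather than Celery code: it only shows the capped exponential schedule those options describe (a factor times a power of two, limited by a maximum), with jitter left out. The function name and default values are assumptions for illustration.

```python
def backoff_delays(max_retries=5, factor=2, maximum=600):
    """Seconds to wait before each retry: factor * 2**n, capped at maximum.

    Illustrative only; the defaults are hypothetical, not the project's settings.
    """
    return [min(maximum, factor * 2 ** attempt) for attempt in range(max_retries)]


short_schedule = backoff_delays()                # five retries, then give up
long_schedule = backoff_delays(max_retries=10)   # later delays hit the cap
```

With a bounded retry count the task fails for good after the last delay instead of retrying forever, so a persistent failure shows up once in the logs rather than as endless noise.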
I had to create an issue because you can't upload files to the wiki
They should be pretty much done but there may be a little additional work needed in the adapter to get them looking right.
Since tiles are an entire directory full of images, we need a way to process all of them in a stage, especially for uploading them all.
When using OAuth login links we have to be careful not to allow the OAuth login process to be activated without ensuring the request originated from a known logged-in user.
This is a recent exploit that was raised in Rails land via omniauth/omniauth#809.
A mitigation would be CSRF validation via a POST method to the social auth routes before redirecting to the upstream social auth provider. Depending on what your application does with the upstream user data, it may be a vector for account takeover. I assume in this app it won't be (most likely changing the logged-in user at worst), but it's something to keep in mind with OAuth flows.
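One way this could look, as a rough sketch: a helper that walks the tile directory and hands every image to a per-tile callable. `process_tiles`, `handle_tile`, and the `*.png` pattern are hypothetical names and defaults, not part of the existing adapter code.

```python
from pathlib import Path


def process_tiles(tile_dir, handle_tile, pattern="*.png"):
    """Apply handle_tile to every matching file in tile_dir, in sorted order.

    handle_tile is any callable taking a pathlib.Path (for example an upload
    function); the results are collected so the stage can report on them.
    """
    results = []
    for tile_path in sorted(Path(tile_dir).glob(pattern)):
        results.append(handle_tile(tile_path))
    return results
```

A stage could then call `process_tiles(output_dir, upload_one_tile)`, where `upload_one_tile` is whatever single-image upload the stage already knows how to do.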
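The check described above can be sketched framework-free as follows. In a Django app the built-in CSRF protection would normally perform this comparison; the function names here are hypothetical and only illustrate the rule "POST only, and the submitted token must match the one stored in the session".

```python
import hmac
import secrets


def new_csrf_token():
    """Generate a random token to store in the session and embed in the form."""
    return secrets.token_urlsafe(32)


def may_start_oauth(method, session_token, submitted_token):
    """Allow the redirect to the OAuth provider only for a valid POST.

    Rejects GET requests (e.g. a link or image tag on another site) and any
    request whose token is missing or does not match the session's token.
    """
    if method != "POST":
        return False
    if not session_token or not submitted_token:
        return False
    # Constant-time comparison avoids leaking the token through timing.
    return hmac.compare_digest(session_token, submitted_token)


token = new_csrf_token()
allowed = may_start_oauth("POST", token, token)
blocked_get = may_start_oauth("GET", token, token)
blocked_forged = may_start_oauth("POST", token, "forged-value")
```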
If we want to be able to deploy this in EC2 then it should be containerized so it can run on kubernetes.
Build a config file to load theia containers into the kube.
Current behavior: The current app pulls static images (for logos, etc) via Wordpress upload URLs (e.g., https://chelseatroy.com/wp-content/uploads/2020/07/nasa-partner.png).
Desired behavior: The preference would be to add these images as static assets bundled with the app (e.g., in /static dir?), but they could also be moved to Zoo-hosted blob storage (e.g., https://static.zooniverse.org/assets/zooniverse-icon-web-black.png).
Right now anyone who can access the /api routes can do configuration for any project without any kind of authentication or authorization. It'd be ideal to get django-rest-framework to play nice with django-social-auth, but this has been quite a struggle.
need to integrate with:
Since we're going to tentatively build this pipeline out in Python, we need to select Python versions of our familiar tools:
As well as tools for handling some novel challenges:
Necessary operations:
Create kubernetes secrets for staging and production API keys
The plan for the first iteration of the pipeline is to have users create a KML file in something like Google Earth and then upload it to search for imagery. It'd probably be nice if we could figure out how to parse that format.
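As a starting point, KML is XML, so the standard library can pull out the coordinate lists. This is a minimal sketch assuming KML 2.2 files where each `<coordinates>` element holds whitespace-separated `lon,lat[,alt]` tuples; the function name and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET

KML_NS = {"kml": "http://www.opengis.net/kml/2.2"}

SAMPLE_KML = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <Polygon><outerBoundaryIs><LinearRing>
      <coordinates>
        -120.0,34.0,0 -119.5,34.0,0 -119.5,34.5,0 -120.0,34.0,0
      </coordinates>
    </LinearRing></outerBoundaryIs></Polygon>
  </Placemark>
</kml>"""


def parse_kml_coordinates(kml_text):
    """Return one list of (lon, lat) tuples per <coordinates> element."""
    root = ET.fromstring(kml_text)
    shapes = []
    for node in root.iterfind(".//kml:coordinates", KML_NS):
        points = []
        for chunk in (node.text or "").split():
            # Each chunk is "lon,lat" or "lon,lat,alt"; altitude is ignored.
            lon, lat = chunk.split(",")[:2]
            points.append((float(lon), float(lat)))
        shapes.append(points)
    return shapes


search_area = parse_kml_coordinates(SAMPLE_KML)
```

Files exported by other tools may use a different namespace or wrap shapes in extra folders, so a dedicated KML library may still be the better choice once real uploads arrive.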
possibly flower:
The package is already installed for doing this; there's just some configuration work to be done.
Wrapper around USGS ESPA web services to allow ordering of scenes etc
Users of the pipeline will need to be authenticated with their panoptes credentials to ensure they can access the project they want to add images to
A manifest file relating image filenames to their geo-coordinates should be generated using a gis_operation, and we should then use it when uploading subjects to panoptes.
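A rough sketch of what such a manifest might look like as CSV. The column names (`filename`, `lon_min`, and so on) and the helper name are assumptions; the actual gis_operation and the metadata fields expected at upload time may differ.

```python
import csv
import io

MANIFEST_FIELDS = ["filename", "lon_min", "lat_min", "lon_max", "lat_max"]


def write_manifest(rows, out_file):
    """Write one CSV row per tile, mapping its filename to a bounding box."""
    writer = csv.DictWriter(out_file, fieldnames=MANIFEST_FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


tiles = [
    {"filename": "tile_0001.png", "lon_min": -120.0, "lat_min": 34.0,
     "lon_max": -119.9, "lat_max": 34.1},
]
buffer = io.StringIO()
write_manifest(tiles, buffer)
manifest_lines = buffer.getvalue().splitlines()
```

At upload time each row could be read back and attached to the matching subject as metadata.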
There is a small web form app that can authenticate users with panoptes. Fetch a list of projects they can see from panoptes, then check the local db to see which pipelines are associated with that list of projects.
There is a pixel_qa channel that estimates which pixels are likely to be clouds; tiles that are mostly cloud should be skipped and not uploaded to panoptes.
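A minimal sketch of the cloud check, assuming pixel_qa is a bit-packed integer band in which bit 5 marks cloud. That bit position should be verified against the product guide for the sensor and collection in use. The 50% threshold and the function names are illustrative assumptions.

```python
CLOUD_BIT = 5  # assumption: verify against the pixel_qa documentation


def cloud_fraction(pixel_qa, cloud_bit=CLOUD_BIT):
    """Return the share of pixels whose cloud bit is set in a 2-D QA band."""
    values = [value for row in pixel_qa for value in row]
    if not values:
        return 0.0
    cloudy = sum(1 for value in values if (value >> cloud_bit) & 1)
    return cloudy / len(values)


def should_skip_tile(pixel_qa, max_cloud=0.5):
    """Skip tiles where more than max_cloud of the pixels are cloud."""
    return cloud_fraction(pixel_qa) > max_cloud


mostly_cloud = [[32, 32], [32, 2]]   # three of four pixels have bit 5 set
mostly_clear = [[2, 2], [2, 32]]     # one of four pixels has bit 5 set
```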