Comments (8)

Bridg109 commented on August 20, 2024

Right now all dependencies are imported through sagerx.py and user_defined_macros, which should allow us to move things outside of the dynamic DAG pretty easily.

Another consideration here is using JSON files to store this data. JSON would allow us to store functions in it (as strings) to pass to dataset_dicts if needed as one-offs. There are pros and cons (I think mostly cons), but it's an alternative:

https://stackoverflow.com/questions/51936785/store-python-function-in-json/51938459

from sagerx.

Bridg109 commented on August 20, 2024

Could also consider moving the dynamic DAG into its own folder, with the settings files for each DS stored in the same folder. That way, moving them out of the dynamic DAG later will be easy.

yevgenybulochnik commented on August 20, 2024

Sorry, totally missed these comments earlier. I'm still somewhat torn about the best way to handle this. The JSON alternative makes sense, but I do think the cons outweigh the pros if we are using eval and functions as strings. One of the comments kind of points out that this is dangerous. I'm not sure how risky it actually is for our use case from a security perspective, but everywhere else I have read says to be extremely cautious about this.
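One way to keep the JSON idea without eval might be a name-based registry: the JSON file only holds function names, and the Python side decides which names are allowed. This is just a rough sketch; the registry, the transform names, and the config keys are all made up for illustration and are not from the sagerx code.

```python
import json

# Hypothetical registry: the only functions a JSON config may reference.
TRANSFORMS = {}


def register(name):
    """Decorator that records a function under a name usable from JSON."""
    def decorator(func):
        TRANSFORMS[name] = func
        return func
    return decorator


@register("strip_whitespace")
def strip_whitespace(value):
    return value.strip()


@register("uppercase")
def uppercase(value):
    return value.upper()


def load_dataset_config(raw_json):
    """Parse a config and resolve transform names through the registry."""
    config = json.loads(raw_json)
    names = config.get("transforms", [])
    unknown = [n for n in names if n not in TRANSFORMS]
    if unknown:
        # Anything not registered is rejected instead of being evaluated.
        raise KeyError(f"Unknown transform(s) in config: {unknown}")
    config["transforms"] = [TRANSFORMS[n] for n in names]
    return config


def apply_transforms(config, value):
    """Run each resolved transform in order."""
    for func in config["transforms"]:
        value = func(value)
    return value


# Hypothetical config content; in practice this would be read from a file.
example = '{"name": "example_ds", "transforms": ["strip_whitespace", "uppercase"]}'
config = load_dataset_config(example)
result = apply_transforms(config, "  ndc  ")
```

The tradeoff is that one-off functions still have to live in Python, but nothing read from the JSON file is ever executed as code.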

Bridg109 commented on August 20, 2024

Alternatively, consider making independent DAGs for each data source. Then, using either a functional or an object-oriented paradigm, write a generate-tasks code block that can be dropped in as a single line of code to generate all tasks for each DAG. This will modularize and abstract the DAG functions and keep everything organized.
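A rough sketch of what the functional version could look like, in plain Python with no Airflow dependency. The `Task` class is only a stand-in for a real operator, and `generate_tasks`, the step names, and the URL are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """Stand-in for an Airflow task; a real DAG would use operators instead."""
    task_id: str
    run: Callable[[], str]


def generate_tasks(ds_name: str, url: str) -> List[Task]:
    """Build the shared download -> extract -> load steps for one data source."""
    steps = ["download", "extract", "load"]
    return [
        # step=step binds the current value, avoiding the late-binding lambda trap.
        Task(task_id=f"{step}_{ds_name}", run=lambda step=step: f"{step} {url}")
        for step in steps
    ]


# In a per-source DAG file, the shared logic collapses to one line:
tasks = generate_tasks("nadac", "https://example.com/nadac.csv")

task_ids = [t.task_id for t in tasks]
outputs = [t.run() for t in tasks]
```

A data source that needs something special (historical files, for example) could build its own task list and still reuse the shared steps where they fit.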

Looking at the datasets we currently have, a few (CMS, RxNorm) would benefit from adjusting our process to grab historical files. So I feel like in the near future we will be pulling a lot of these out of the dynamic DAG anyway. My question is how we do that while keeping the code maintainable and able to share a common set of processes.

This is pretty complex and might be above my software engineering skills at the moment.

Bridg109 commented on August 20, 2024

As an example, Download_dataset should be a class in its own module, with subclasses and methods that can be adjusted from a dataset info class.

Then a SQL class should be formed in its own module.

Then a DS class should call both of these as part of composition.
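A minimal sketch of that composition, assuming made-up class names, paths, and table names (none of these come from the existing code). It only builds strings describing the work, so it runs without a database or network access.

```python
from dataclasses import dataclass


@dataclass
class DatasetInfo:
    """Hypothetical per-dataset settings."""
    name: str
    url: str
    table: str


class Downloader:
    """Base download behaviour; subclasses adjust it per file type."""

    def __init__(self, info):
        self.info = info

    def target_path(self):
        filename = self.info.url.rsplit("/", 1)[-1]
        return f"/tmp/{self.info.name}/{filename}"


class ZipDownloader(Downloader):
    """Variant for zipped sources: the target is the unzipped location."""

    def target_path(self):
        path = super().target_path()
        return path[:-4] if path.endswith(".zip") else path


class SqlLoader:
    """Builds the load statement for a dataset (does not execute it)."""

    def __init__(self, info):
        self.info = info

    def copy_statement(self, path):
        return f"COPY {self.info.table} FROM '{path}' CSV HEADER;"


class DataSource:
    """Composes a downloader and a SQL loader instead of inheriting from them."""

    def __init__(self, info, downloader=None, loader=None):
        self.info = info
        self.downloader = downloader or Downloader(info)
        self.loader = loader or SqlLoader(info)

    def plan(self):
        path = self.downloader.target_path()
        return [
            f"download {self.info.url} -> {path}",
            self.loader.copy_statement(path),
        ]


info = DatasetInfo("rxnorm", "https://example.com/rxnorm_full.zip", "datasource.rxnorm")
rxnorm = DataSource(info, downloader=ZipDownloader(info))
steps = rxnorm.plan()
```

Because the DS class receives its downloader and loader, a source with unusual needs can swap in a different piece without touching the others.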

jrlegrand commented on August 20, 2024

@Bridg109 / @yevgenybulochnik - does this issue represent the huge topic of restructuring the dynamic DAG into something like functional programming or a class system? What would be a better name for this issue than "DS Dict" to capture that? Feel free to rename it.

jrlegrand commented on August 20, 2024

See notes about RxNorm and NADAC on this PR when we tackle this issue: #158

jrlegrand commented on August 20, 2024

Fixed by #215
