Comments (6)
Thanks for the well thought out comments @SKRohit. Here are my answers:
- Yes I believe that would be enough.
- Every datasource is connected to at least one data pipeline, so if you delete all pipelines you will, in essence, delete all datasources.
- See above: if you delete all pipelines then there should be no datasources left, as each datasource produces a data pipeline per commit.
In general, we are internally preparing a big change for the next month that will rewrite a lot of this logic and make things easier. For now, please implement the simplest possible logic that goes through the pipelines and deletes their artifact and metadata stores. Please try to keep the functions decoupled, as they might still be useful after the refactor! Thanks!
from zenml.
Hi, I'd like to work on this issue.
Please help me out with the details.
Thank you @harshasridhar for the contribution, it is greatly appreciated!
Here are a few pointers:
When the user runs `zenml clean`, the following needs to happen. For each pipeline in the `pipeline_store` specified in the `zenml_config`, you need to delete the `metadata_store` and the `artifact_store`. Here is how:
- Deletion of the artifact store: this can be remote or local, so using `path_utils` is important here. This should be simple.
- Deletion of the metadata store: this can also be remote or local. Locally it is just a SQLite file, so using `path_utils` works, but remotely it is a MySQL database, in which case a SQL `DROP` statement needs to be issued on the specific database.
Finally, the `pipeline_store` needs to be deleted.
For each concept above, the docs go into some detail: https://docs.zenml.io. I hope that's good as a starting point, but it might require more discussion. Please feel free to join the Slack to chat directly. Thanks again for your effort!
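As a rough illustration of the two deletion paths described above, here is a minimal sketch. It is not ZenML code: the function names are hypothetical, it uses the Python standard library rather than ZenML's `path_utils`, and it only covers the local case. For the remote MySQL case it only builds the statement text, and it assumes that "a `DROP` statement on the specific database" means dropping the whole database.

```python
import shutil
from pathlib import Path


def delete_local_artifact_store(path: str) -> bool:
    """Remove a local artifact store directory (hypothetical helper).

    Returns True if a directory was deleted, False if nothing was there.
    """
    store = Path(path)
    if store.is_dir():
        shutil.rmtree(store)
        return True
    return False


def delete_local_metadata_store(path: str) -> bool:
    """Remove a local SQLite metadata file (hypothetical helper).

    Returns True if a file was deleted, False if nothing was there.
    """
    db_file = Path(path)
    if db_file.is_file():
        db_file.unlink()
        return True
    return False


def mysql_drop_statement(database: str) -> str:
    """Build the DROP statement for a remote MySQL metadata store.

    Assumption: the whole database is dropped. The statement is only
    built here, not executed.
    """
    # Backticks quote the identifier; embedded backticks are doubled.
    escaped = database.replace("`", "``")
    return f"DROP DATABASE IF EXISTS `{escaped}`;"
```

Keeping each store type in its own function follows the request to decouple the logic, so the pieces can be reused or replaced independently after the planned refactor.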
@htahir1 I am looking into this issue. Here is what I understood and what I have doubts about:
- Every `BasePipeline` object has `metadata_store` and `artifact_store` attributes, so would deleting those for each pipeline be enough?
- Also, every `BasePipeline` object has a `datasource` attribute, which is a `BaseDatasource` object with its own `metadata_store` and `artifact_store`. Should we consider them for deletion as well? In my opinion, they should be deleted separately, since the `artifact_store` and `metadata_store` of datasources and pipelines could be different. Let me know your thoughts.
- Also, should `zenml clean` also delete datasources which are not related to any pipeline?
#540 is addressing this now in a simpler way
This issue has been implemented now in #540 so I'm going to close this.