
Comments (14)

HumairAK avatar HumairAK commented on June 23, 2024 1

But I'm fine with the #13 (comment) suggestion. Try to keep the number of logical groups low and if the separation creates a problem, we'll report it and merge the components back into a single namespace so that operations can resume.

@tumido @anishasthana @4n4nd does this work for you guys?

from apps.

HumairAK avatar HumairAK commented on June 23, 2024 1

I think we're pretty settled on multiple namespaces, and all our current work makes this assumption. Our talks with upstream also suggest that they recognize multi-namespace deployment as something they need to support. We don't foresee this changing any time soon, so I'm closing this issue again; if the question comes up again, we can revisit. Feel free to keep linking issues/threads here to track any discussions surrounding this topic.


anishasthana avatar anishasthana commented on June 23, 2024

I'm not convinced that starting with a single namespace for all components makes sense. Per some discussions offline, it makes more sense to deploy groups of components per namespace, i.e. JupyterHub + Spark in one namespace, Superset + data catalog (once it's in) in another.

This way we can separate concerns per namespace. This will give us better control over quotas for namespaces/applications. It would also simplify management of ODH (logs are split across namespaces). It would be very painful to debug issues if everything was in one namespace. Catastrophic failures could also snowball into everything being down (e.g. an admin accidentally deletes all routes as opposed to just the routes for Superset).


HumairAK avatar HumairAK commented on June 23, 2024

I think the biggest concern has been to stick close to upstream, then should we find problems/issues, we relay them back upstream.

But we already know from past experience (e.g. from idh) that grouping all components in a single namespace can be counter-productive. So in line with our operate-first goals and influencing upstream, may I suggest we do the following:

  • we list out the issues with single namespace deployment (Anish has already done some of this above)
  • we go ahead with our initial plan to use multi-namespace, with a logical grouping of components per namespace
  • we make an issue upstream linking this proposed structure and this particular thread as the motivation, and air out concerns with the upstream proposed kfdef with all components (citing reasons for why we feel this is not particularly useful) in 1 ns.

let me know what you guys think!


tumido avatar tumido commented on June 23, 2024

Let's take this a bit broadly:

The kfdef resource is a namespaced resource. It's easy to put everything in it and have it deployed into a single namespace and a single network. However, it's not a realistic expectation to have a customer do it.

Deploying full ODH + Kubeflow in a single kfdef would result in many pods (~50 or more) running at the same time in a single namespace, all of that without any workload on top of it. I also expect many clashing resource names as a result (like configmaps for ODH Argo and Kubeflow's Argo, or database pods, because all of them are named postgres, etc.).
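To make the name-clash concern concrete, here is a minimal sketch that checks which resources would collide for a given component-to-namespace placement. The component inventory, resource names, and namespace names are all hypothetical and only illustrate the idea; they are not taken from the actual ODH or Kubeflow manifests.

```python
from collections import defaultdict

# Hypothetical inventory: component -> (kind, name) pairs it would create.
# These names are illustrative assumptions, not read from real manifests.
COMPONENTS = {
    "odh-argo": [("ConfigMap", "workflow-controller-configmap")],
    "kubeflow-argo": [("ConfigMap", "workflow-controller-configmap")],
    "airflow": [("Deployment", "postgres")],
    "thriftserver": [("Deployment", "postgres")],
}


def find_clashes(placement):
    """Return resources claimed by more than one component.

    placement maps component name -> namespace. The result maps
    (namespace, kind, name) -> sorted list of components claiming it.
    """
    owners = defaultdict(list)
    for component, namespace in placement.items():
        for kind, name in COMPONENTS[component]:
            owners[(namespace, kind, name)].append(component)
    return {key: sorted(comps) for key, comps in owners.items() if len(comps) > 1}


# Everything in one namespace versus one logical group per namespace.
single_ns = {component: "opendatahub" for component in COMPONENTS}
multi_ns = {
    "odh-argo": "odh-argo",
    "kubeflow-argo": "kubeflow",
    "airflow": "airflow",
    "thriftserver": "data-catalog",
}

single_clashes = find_clashes(single_ns)
multi_clashes = find_clashes(multi_ns)
```

With this made-up inventory, the single-namespace placement reports the shared configmap and the duplicate postgres deployment, while the grouped placement reports nothing.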

On the other hand, the kfdef resource is flexible enough that you can define any renderable manifests as an application in it. Or any set of applications. And this only defines what the kfctl operator will try to deploy into your single namespace.

I don't think that anybody in ODH upstream expects customers to deploy it all into a single namespace. The customer is probably expected to pick what they want and need and modify it. In the same way, nobody is expecting customers to run a Superset instance with sqlite as a production service, yet that's what the ODH manifests suggest. It's a showcase that is expected to be further modified to be operational.

And part of that is to scope the resources into manageable pieces. With this comes the project/namespace differentiation. Let me illustrate this with a couple of examples:

  1. Do I want my Spark cluster to deplete all the namespace quota while I also have Superset and Argo running in there? Definitely not. My SRE's responsibility is to ensure that if I have too many Spark instances running, it blocks Spark only. Therefore I need to have it running in a separate project with an appropriate resource quota on that namespace.
  2. Let's say I'm deploying ODH Argo and Kubeflow's Argo in the same namespace. Now I have 2 Argo instances of different versions in the very same namespace. That's not gonna behave.
  3. Let's say I deploy Airflow and the data catalog into the same namespace. Now I have a namespace which is deploying multiple database clusters, and even 2 databases of the same kind. One postgres instance for Airflow, one mysql for Hue, another postgres for Thriftserver... (this example is a bit artificial, since the data catalog is not in the new ODH yet, though the original implementation has 2 different databases)
  4. Argo users are usually expected to have permissions for low-level Kubernetes resource interactions: viewing pods, logs, pvcs... Do I want all my Argo users to have access to these resources and to be messing around with them when I'm running my personalized JupyterHub clusters and databases?
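The quota point in the first example can be sketched as follows: one Kubernetes ResourceQuota object per namespace, so that a runaway Spark workload only exhausts its own budget. The namespace names and limit values below are assumptions made up for illustration; only the `ResourceQuota` object shape (`apiVersion: v1`, `spec.hard`) follows the Kubernetes API.

```python
import json

# Hypothetical namespace groups and limits, chosen only for illustration.
GROUP_QUOTAS = {
    "odh-spark": {"requests.cpu": "16", "requests.memory": "64Gi", "pods": "40"},
    "odh-superset": {"requests.cpu": "2", "requests.memory": "8Gi", "pods": "10"},
}


def resource_quota(namespace, hard):
    """Build a ResourceQuota manifest (as a dict) scoped to one namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {"hard": dict(hard)},
    }


# One manifest per namespace; each could be serialized and applied separately.
manifests = [resource_quota(ns, hard) for ns, hard in GROUP_QUOTAS.items()]

if __name__ == "__main__":
    print(json.dumps(manifests, indent=2))
```

Because each quota is namespaced, the Superset budget is untouched no matter how many Spark pods get requested in the other namespace.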

To sum it up, we shouldn't be deploying everything into the same namespace only because we can and it's easy to do. It's not a reasonable and realistic expectation. We need to provide a realistic service as an outcome of this project.

Let's do things the right way and deploy it in the way we see it being managed for the best SRE experience. And if we encounter issues with that? That's expected. Let's report it or even better fix it. We can provide overrides for our local setting or bundle them into overlays which can be applied on top of ODH to allow for proper inter-namespace networking...


tumido avatar tumido commented on June 23, 2024

To me, this whole kfdef thing in a single namespace sounds like a buffet, a nice little Smörgåsbord. You can pick whatever you want and fix yourself up a nice delicious plate. But nobody ever expected you to eat it all at once. And nobody's pleased you did it.


4n4nd avatar 4n4nd commented on June 23, 2024

I agree with keeping multiple namespaces, it seems like the logical way to do it. We should create an issue on upstream ODH with a list of groupings that we think are logical and have the ODH team approve it.
This could be used as documentation by ODH "customers".


anishasthana avatar anishasthana commented on June 23, 2024

We could also have upstream ODH point to Operate-first as a good, opinionated way to do things.


durandom avatar durandom commented on June 23, 2024

My suggestion would be to deploy everything into one large NS and then separate it out piece by piece.
Does separating things out pose any problems? Like leftovers and garbage?

But I'm fine with the #13 (comment) suggestion. Try to keep the number of logical groups low and if the separation creates a problem, we'll report it and merge the components back into a single namespace so that operations can resume.

And btw, if a conflict of resources exists because of a single NS deploy, this is also a bug to be reported.


HumairAK avatar HumairAK commented on June 23, 2024

Great write-up @tumido , I agree with pretty much everything. For this bit right here:

I don't think that anybody in ODH upstream expects customers to deploy it all into a single namespace. The customer is probably expected to pick what they want and need and modify it. In the same way, nobody is expecting customers to run a Superset instance with sqlite as a production service, yet that's what the ODH manifests suggest. It's a showcase that is expected to be further modified to be operational.

There are a lot of assumptions here, I think. This may be true, but I'd argue that it is not clear. I wonder if there are docs that explain this level of flexibility as being intended and encouraged for such a use case. If there are, we should find them and confirm our thoughts. If there are not, we should either make an issue upstream asking for docs or submit a PR ourselves in the appropriate location.


anishasthana avatar anishasthana commented on June 23, 2024

I'm happy with that proposal


HumairAK avatar HumairAK commented on June 23, 2024

Cool, closing this issue, we'll go with the aforementioned proposal.


durandom avatar durandom commented on June 23, 2024

@HumairAK @anishasthana looks like upstream is not really clear on single-ns vs multi-ns.

opendatahub-io/odh-dashboard#26 (comment)

I'm reopening this issue until we get clearer guidance


HumairAK avatar HumairAK commented on June 23, 2024

We're also continuing further discussions upstream with regards to monitoring, follow that discussion here

Let's use this issue to track the various issues being spawned upstream from the multi/single namespace discussions. Please link any other issues you are all starting here, thanks.

