Sure thing. To make it easier to pull in everything, I propose making a metapackage `odl` on PyPI which pulls in all subpackages as namespace packages and thus installs `odl` with all the rest in its namespace. To get just the core part, there could be an `odl-core` package.
---
Sure, we only need a way for `odl` not to "crash" if users don't have CUDA, for example by simply excluding `odl-cuda` in a nice way.
---
## The namespace package issue

I played around a bit with the namespace package idea yesterday, but to be honest I couldn't get it to work. Unfortunately the documentation on this topic is quite sparse and does not really cover possible problems very well. Implementation-wise, my impression is that support for this feature is a bit fragile. And, worst of all, it does not seem to work properly with the `pip install -e` option, which is so useful that I don't want to miss it.
## What one is supposed to do

There are two slightly different ways of creating a namespace package in Py2 and Py3 < 3.3, plus one for 3.3 and later.

In general, `odl` would be a namespace package and `odl.core`, `odl.solvers` etc. the subpackages. In any implementation, we would have a directory structure as follows (the folders `odl.core` and `odl.solvers` need not be in the same parent folder, and the names are also irrelevant):
```
├── odl.core
│   └── odl
│       └── core
│           └── <core modules>
├── odl.solvers
│   └── odl
│       └── solvers
│           └── <solvers modules>
└── ...
```
### 1. Using setuptools

According to the setuptools doc on namespace packages, each `odl` directory would have to contain an `__init__.py` file with only the line

```python
__import__('pkg_resources').declare_namespace(__name__)
```

In addition, each subpackage would have to pass the option `namespace_packages=['odl']` in the respective `setup.py` call to `setup()`.
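A subpackage's `setup.py` could then look roughly like this. This is only a sketch: the distribution name, version and dependency below are assumptions, not decided values.

```python
# Hypothetical setup.py for an odl.solvers subpackage using the
# setuptools namespace-package mechanism described above.
from setuptools import setup, find_packages

setup(
    name='odl.solvers',             # assumed PyPI distribution name
    version='0.1.0',                # placeholder version
    packages=find_packages(),       # picks up odl and odl.solvers
    namespace_packages=['odl'],     # declare 'odl' as a shared namespace
    install_requires=['odl.core'],  # assumed dependency on the core part
)
```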
### 2. Using pkgutil (standard library module)

The recommended way of marking a package as a namespace package is via `pkgutil.extend_path` (documentation here). In that case, one adds an `__init__.py` to each `odl` directory in the tree containing the lines

```python
from pkgutil import extend_path
__path__ = extend_path(__path__, __name__)
```
Side note from PEP 420:

> Every distribution needs to provide the same contents in its `__init__.py`, so that `extend_path` is invoked independent of which portion of the package gets imported first. As a consequence, the package's `__init__.py` cannot practically define any names as it depends on the order of the package fragments on `sys.path` to determine which portion is imported first.
As I read it, we would no longer be able to propagate names to the top-level `odl` namespace.
### 3. Implicit namespace packages (Python 3.3 and later)

This feature is triggered by a subpackage *not* having an `__init__.py` in its `odl` directory. There is no need to do anything else, but I don't know exactly what the consequences are and how this feature interacts with `pip`.
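A minimal sketch of the mechanism (Python 3.3+, with made-up module names; it assumes no regular `odl` package shadows the namespace on `sys.path`):

```python
# Two separate directories each contribute one portion of the 'odl'
# namespace; note that no __init__.py is created anywhere.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
for portion, module in [('core', 'space.py'), ('solvers', 'cg.py')]:
    pkg_dir = os.path.join(root, 'odl.' + portion, 'odl', portion)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module), 'w') as f:
        f.write('NAME = %r\n' % portion)
    # Each distribution root goes on sys.path independently.
    sys.path.append(os.path.join(root, 'odl.' + portion))

# Both portions merge into one 'odl' namespace package.
import odl.core.space
import odl.solvers.cg
print(odl.core.space.NAME, odl.solvers.cg.NAME)
```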
## Pros and Cons of namespace packages

### Pros

- Nice import statements: `import odl.solvers`, just as if `solvers` were a submodule in the large `odl` distribution. ~~`import odl` would make the namespace of the submodules immediately accessible.~~ Apparently, that's not what happens. One still needs to explicitly import the submodule. It just looks like it is part of the "big blob" `odl`.
- Each subpackage could be maintained separately and have its own (isolated) dependencies.
- Distro and import package names would be nicely aligned.
### Cons

- Since it seems impossible to have a "parent" package and several "child" or "plugin" packages using the parent's namespace (at least with this mechanism), what is now `odl` would have to be moved to `odl.core`, for example.
- The nice "propagate to top level" imports would only reach the next-to-top level, i.e. `odl.Rn` would then be `odl.core.Rn`. This counteracts the advantage of a unified namespace to some extent.
## Alternative

Instead of making a namespace package `odl` and several subpackages (including `odl.core`, which is now `odl`), we could keep `odl` as a proper package and use a common naming scheme for the packages which are based on `odl` and extend its functionality.

To make distribution simpler for users, the core part could get the distro name `odl-core` on PyPI, while `odl` would pull in all dependencies it can get (possibly excluding CUDA and other non-PyPI packages, e.g. ASTRA).
## Conclusion

Before thinking further about how to get namespace packages working for our environment, we should decide if it provides the functionality we want to have (see Pros/Cons). What it boils down to is: do we want

- `odl.core.Rn` and nice namespace packages like `odl.solvers` with nicely aligned names in distro and import, or
- `odl.Rn` and proper packages like `odl-solvers` imported as `odl_solvers`, where we need to tailor a solution to distribute the whole bunch?
## Useful Links

Here's a collection of links where I got my information from:

- PEP 420 (2012), dealing with implicit namespace packages. This PEP has two predecessors, PEP 382 (2009) and PEP 402 (2011), which were both rejected in favor of this one. It has been implemented since Python 3.3, but not backported to Python 2. The situation for Py2 and Py3 pre-3.3 is described in a section of the PEP.
- A Stack Overflow question with some good answers, some however from before or between the PEPs, so statements about concepts being future-proof or not have to be read in that context.
- Another useful Stack Overflow discussion.
- Yet another one relevant for us, describing the problem rather than a solution.
- zope.interface is a namespace package, so by simply copying what they did we should succeed, too.
- oslo, an OpenStack-related project which apparently dropped namespace packages altogether due to poor usability. The issue description particularly mentions the trouble with the `pip install -e` option, which ~~I have not been able to make work at all~~ has a workaround. They opted for a naming scheme with underscores, `oslo_foo` instead of the namespace `oslo.foo`, which we could consider, too. It would be easy to go back to namespaces once a good solution is available (or we can just forget about Python < 3.3).
Edit: Added conclusion.
Edit 2: Added a pro (the most important one) and complemented a con.
Edit 3: Added alternative.
Edit 4: Corrected the alleged extra pro added in edit 2 and added choices in the conclusion.
---
Working namespace package implemented in a branch.
---
Regarding Git repo layout, a possible solution would be to make a meta-repository `odl` which simply includes all existing extensions as Git submodules. This would still require manual installation of all submodules, but at least the fetching process would be much simpler if one wants everything. It may also be easier to sync revisions of subpackages such that the latest `master` of the meta-package would always be working (hopefully), possibly backed by CI.
---
Excellent writeup. I can certainly see the issues this causes. I'll try to do some further research myself as well.
---
Another alternative would be to have a single "main" package, let's call it `odl`, which could contain the current code (or the code could be moved to `odl-core`). Anyway, we then make a bunch of sub-libraries that we name as you suggested, `odl-solvers` etc.
We then add these as optional dependencies in `setup.py`, so users should be able to install `odl-cuda` with something like (in the odl folder)

```
python setup.py cuda
```

which should then try to install the `odl-cuda` package (also hosted on PyPI).
If we finally do some magic in the `__init__` files of the `odl` package, we should be able to do

```python
import odl_cuda as cuda  # 'odl-cuda' contains a hyphen, so the import name must differ
from odl_cuda import *
__all__ += cuda.__all__
```

and then users could still do everything they can do at the moment.
If we finally put the doc inside the `odl` library, we should be able to have a unified doc somewhere.
---
For your proposed method, we could use the setuptools extras, similar to what we do with `'testing'` right now. For example, for CUDA we would add

```python
setup(
    ...
    extras_require={
        ...
        'cuda': ['odl-cuda']})
```

to the `setup.py` of `odl`. This option would then be triggered by the command `pip install odl[cuda]`. See example 6 here.
---
That should be quite fine, no? I assume a user could do `pip install odl[cuda,solvers]` etc. then?

The main issue then would be to keep the documentation organized. It would be nice if we had a single doc.
---
I hope that's how it works. In any case, we could add an `all` extras target, so that calling `pip install odl[all]` would get you everything.
---
And yes, I think we could go for that solution.
---
My opinion is that we don't make an `odl-core` package and leave that stuff in the `odl` package.
---
I fully agree.
---
So we had a good discussion on this yesterday, which I'll summarize here.
At the core of this problem, there are two options. Having a monolithic package, or a core package with "add-ons".
As discussed above, namespace packages seem to have a bunch of issues themselves, and do not seem fit for this package.
A branch integrating solvers with `odl`, using a structure where `odl_solvers` is a subpackage of `odl`, was attempted. This also turned out to have several issues: one is that we would not be able to `import odl.solvers` as one would expect. Another is that the doc becomes fragmented (we would need to have the `solvers` doc inside the main package).
Another method would be to have no formal connection between the libraries, instead letting `odl_solvers` have `odl` as a dependency. This is unfavorable since users would then need to import and manage several packages, causing a more complicated install process and workflow.
A final method is the one used by py.test, which has a loosely coupled add-on structure: a plugin architecture using dynamic discovery. This would be useful if we had a very tightly specified interface and wanted users to be able to add plugins. For example, plugins could register `FnBase` spaces, and we could import them dynamically in `uniform_discr`. The issue with this is that it seems you cannot directly access the plugins as an external user, nor is the doc automatically merged.
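For reference, py.test-style dynamic discovery is usually built on entry points. A rough sketch, where the `'odl.spaces'` group name and the registration idea are purely hypothetical:

```python
# Discover space implementations that installed plugins registered
# under a hypothetical 'odl.spaces' entry-point group in their setup.py.
try:
    from importlib.metadata import entry_points  # Python 3.8+
except ImportError:
    from importlib_metadata import entry_points  # backport for older Pythons

def discover_space_plugins():
    """Map plugin names to their entry points (empty if none installed)."""
    try:
        eps = entry_points(group='odl.spaces')      # Python 3.10+ API
    except TypeError:
        eps = entry_points().get('odl.spaces', [])  # 3.8/3.9 dict-style API
    return {ep.name: ep for ep in eps}

plugins = discover_space_plugins()
print(sorted(plugins))  # empty unless some plugin registers 'odl.spaces'
```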
## Proposal

In light of this (and unless we find another method), I propose that we re-merge all subpackages into the main odl package. Each of them should live inside `odl/SUB_PACKAGE`, and they should not be imported together with `odl`, i.e. one needs to write `odl.solvers.conjugate_gradient` instead of simply `odl.conjugate_gradient`.
---
The proposal sounds like a good idea; however, I'm not completely sure what you mean. Will one have to do imports like

```python
import odl
import odl.solvers
```

or is it simply that, when calling, one has to do `odl.solvers.whatever_method`?
---
What I mean is that users can either do

```python
from odl.solvers import whatever_method
whatever_method
```

or

```python
import odl
odl.solvers.whatever_method
```

or

```python
import odl.solvers
odl.solvers.whatever_method
```

They won't be able to do

```python
from odl import whatever_method
```

or

```python
import odl
odl.whatever_method
```
---
```python
import odl
import odl.solvers
```

@aringh This is how it would look with namespace packages. `odl.solvers` would be an independent package which only looks like it belongs to `odl`.
So in an ideal world, there would be a solution which allows us to

1. have a main package `odl`,
2. have several add-ons like the solvers as separate packages with isolated dependencies,
3. install everything with one command only,
4. get everything with one import, `import odl`, e.g. as subpackages like `odl.solvers`,
5. maintain continuous integration and documentation in one place (the main repo).

Since we're not living in an ideal world, we don't get all of that:

- Namespace packages: probably impossible to get 1. and 4., with a big question mark for 5.
- Separate packages: 4. is not possible, especially not with the given syntax. We would have to settle for something like `odl_solvers`, and each import must be explicit. Question mark for 5.
- Monolithic package: naturally no 2., but the rest is possible. However, we have some experience with compartmentalizing dependencies (like CUDA), so this should be doable.
I agree to merging back to main.
---
Do you also agree on assigning the task to yourself? :)

Final note: we could add labels "solvers", "core", etc. to make working with issues easier.
---
Yeah, what the heck ;-)
---
Actually putting `odl` on PyPI is the last of the major "making ODL a serious package" things we have to do; IMO this should be the priority now.
---
I have an account on PyPI, and the procedure seems to be very simple. The question is rather when we dare to attach a version number to it. Right now it's a bit arbitrary, 0.9-ish. Hopefully we only change things under the hood, but since the library is quite heterogeneous in that respect, it's hard to get right.

What we could do is call it 0.9 and mark the shakier parts as experimental and likely to undergo large changes.
---
My suggestion is that we simply make a 0.10 tag about now (just making sure all tests and the doc work), and then release it with a big disclaimer.
---
Okay, good. 0.1 it would then be. People probably expect 1.0 after 0.9, not 0.10
---
No, 0.10 comes after 0.9; see for example the examples in PEP 440. See also Wikipedia on the issue:

> Most free and open-source software packages, including MediaWiki, treat versions as a series of individual numbers, separated by periods, with a progression such as 1.7.0, 1.8.0, 1.8.1, 1.9.0, 1.10.0, 1.11.0, 1.11.1, 1.11.2, and so on.

If `0.1` came after `0.9`, version checks like `version >= (0, 9)` would fail.
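A quick sketch of that tuple-based version check (the `parse_version` helper here is illustrative, not an odl function):

```python
def parse_version(s):
    """Parse a simple dotted version string into a tuple of ints."""
    return tuple(int(part) for part in s.split('.'))

# Tuples compare element-wise, so 0.10 correctly sorts after 0.9 ...
assert parse_version('0.10') >= (0, 9)
assert parse_version('0.9') < parse_version('0.10')
# ... while a hypothetical "0.1 after 0.9" numbering would fail the check.
assert not parse_version('0.1') >= (0, 9)
```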
---
Okay, I'm convinced. Maybe that's also a way to indicate that the 1.0 release may come at any point, for example after 89 further releases until 0.99. So users don't expect 1.0 to come soon when current status is 0.9. Let's go for 0.10 then.
---
There seems to be a nice way to integrate Python package index testing into our upcoming Jenkins environment, namely with devpi. It's a simple index server implementation with some interesting features:

- It caches packages installed from PyPI when installing with the `devpi` command.
- One can upload a package to it and then test-install with `pip install -i <devpi server>`.
- It can trigger a Jenkins test every time a package is uploaded, so installing with `pip` becomes part of the CI.
- For compiled packages, one can upload a wheel, which speeds up installation significantly but also solves issues with virtualenv, where I had a hard time installing SciPy.
- Possibly one can even combine those tools plus tox to locally run a mini-CI.

I suggest we have a look into this and consider integrating it into our CI workflow.
---
Oh my goooood! *does happy dance*

I guess odlpp is missing in that for now?
---
Yes, sorry :-) We can try to build a wheel from the source. I don't think it's that hard, we just need to add an option to CMake.
---