Comments (16)
Vinny,
You're right that out.params is a reference to params. That could be viewed as useful, and my initial view was that making a copy was likely to cause more confusion. But I think you're right that the outputs should be a copy -- in your script, for example, "params" is still around.
I think this should be a relatively straightforward fix.
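For concreteness, a minimal sketch of that aliasing (the residual, data, and parameter names below are invented; the final check reflects the pre-fix behavior described above, where the returned params object is the passed-in one):

    import numpy as np
    from lmfit import Parameters, minimize

    # Toy residual for an exponential decay; names are made up for illustration.
    def residual(params, x, data):
        amp = params['amp'].value
        decay = params['decay'].value
        return amp * np.exp(-x / decay) - data

    x = np.linspace(0, 10, 101)
    data = 3.0 * np.exp(-x / 2.0) + np.random.normal(scale=0.05, size=x.size)

    params = Parameters()
    params.add('amp', value=1.0)
    params.add('decay', value=1.0)

    out = minimize(residual, params, args=(x, data))

    # With out.params being a reference to params (the behavior discussed here),
    # this prints True and "params" now holds the best-fit values, not the guesses.
    print(out.params is params)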
Thanks for the quick response, Matt. Having minimize() return a copy of the parameters would be great, and would solve issue (2). Regarding issue (1), I understand why it is useful to have minimize() modify the passed Parameters object, and why you might not want to change this behavior. But would it also be possible to add a Parameters.reset() method (or something like that) that would reset each Parameter to user_value and remove any output information from the fit, such as the standard error? That would be convenient for doing sequential fits where one wants to start with the same initial guess.
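Purely as an illustration of the idea, a sketch of such a reset written as a standalone helper (the helper name is made up; user_value is the attribute mentioned above, while stderr and correl are assumed names for the fit output to clear):

    def reset_params(params):
        """Hypothetical helper: restore every Parameter to its starting value
        and drop the output of the previous fit."""
        for par in params.values():
            par.value = par.user_value   # back to the user-supplied initial guess
            par.stderr = None            # clear the standard error from the fit
            par.correl = None            # clear correlations from the fit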
Thanks,
Vinny
Right, it's a question of whether the copy is made on input or at output.
There is code that relies on (and most examples show) the passed in params being updated by the fit. There is also the reasonable use case (and I know of people who are doing this) of first fitting with one method (say, Nelder-Mead), and then doing leastsq() to refine the values and get error bars. Doing such a 2-step fit with "brute" would be really nice too (and a recent request).
Currently, the usage for that is:
out1 = minimize(residual, params, args=(x, y1), method='nelder')
out2 = minimize(residual, params, args=(x, y1), method='leastsq')
and with "copy params on input", this would become
out1 = minimize(residual, params, args=(x, y1), method='nelder')
out2 = minimize(residual, out1.params, args=(x, y1), method='leastsq')
report_fit(out2.params)
which would leave params as initial values, out1.params as intermediate, and out2.params as final values. And it would allow your case of fitting several data sets sequentially:
out1 = minimize(residual, params, args=(x, y1), method='leastsq')
out2 = minimize(residual, params, args=(x, y2), method='leastsq')
to start with the same initial values for each data set.
I agree, this is the better approach, even if it breaks current code. We'll make it so that params is copied on input.
Since this will break some current usage, I think we have to bump the version number, and I think we should document this behavior very well for both cases. A nice side effect is that report_fit() could take an 'initial' argument as a Parameters dict from which initial values are read. I'd like to get a few other things in line for a version 0.8, including a) removing anneal(), b) adding brute(), c) coming up with a better way to support user-supplied Jacobians, and d) more examples in the docs.
I'm not sure 0.8 will happen very fast. But I started a branch for this change on github master. There are broken examples, and confidence_intervals and 1D models will need some tweaking to accommodate this change....
I think I would prefer an initial attribute of the parameter instead, plus a reset method added to params.
This will break less code. Maybe providing a simple copy method for the Parameters class is
also helpful.
Hi Till, Vinny,
Yes, as I look into and think about this more, I think that:
a) copy-on-input will break a lot of code.
b) a simple copy.deepcopy(Parameters) works fine.
c) Python's pass-by-object does sort of suggest that params (an ordered dict or a list of Parameter() objects) is mutable, and should be expected to change.
I would be willing to say that, being version 0.something, it's OK to break code to fix an API problem. But I'm not currently convinced it is the right thing to do. To me, right now, copy-on-output seems the most reasonable option, so that after
fitter = Minimizer(objfunc, params, ....)
fitter.fit()
params is altered to have the best values, correlations, and so forth, and fitter.params is a copy of this, so that later work with params does not alter the fitter results. I think this will break a lot less code.
As Till says, a Parameter does have an initial value. Giving Parameter() and Parameters() both reset() methods to set the value to the initial value (or some passed in value) seems like a fine idea. Adding a copy() method would also be fine. So, I think the current favorite option is
- copy params on output (after the passed in params has been updated).
- add reset() and copy() methods to Parameter() and Parameters().
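As a purely illustrative reading of the two points above, here is a rough sketch of copy-on-output (the bodies below are stand-ins, not lmfit code):

    import copy

    class Minimizer(object):
        """Sketch of copy-on-output: the caller's params are still updated in
        place by the fit, but the stored result is a deep copy, so later edits
        to the caller's params cannot change the fitter's results."""

        def __init__(self, objfunc, params, **kws):
            self.objfunc = objfunc
            self.params = params          # reference to the caller's Parameters
            self.kws = kws

        def fit(self):
            # ... run the minimization here, updating self.params in place ...
            self.params = copy.deepcopy(self.params)   # snapshot the final values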
Sorry for holding so many opinions on this in a single day! Suggestions or opinions on any of this would be most welcome.
That solution sounds reasonable. For the record (and in case the feedback is helpful) I like the copy-on-input solution better, because I found the modification of the passed Parameters object led to some unexpected results. To me it's intuitive to do something like this:
fit_results = [minimize(function, params, args=dataset) for dataset in datasets]
and have that fit all my datasets sequentially using the same initial guesses, but when I did something like this I found that the results of the fits weren't what I thought they would be (because the initial guesses are modified after each fit). Of course the modification of params is documented in the manual, but there are more complex fitting schemes where one has to keep careful track of how params is being modified. I guess my expectation is that python functions will not in general modify mutable arguments, even if they can, unless the semantics make it clear that they will. For example, if fit() were a method of Parameters, such that one would write "params.fit()" and have it modify params and return a Minimizer object, I would expect that params might be modified (I'm not suggesting adding a method to Parameters, just giving an example).
Anyway, as I see it the tradeoff is that copy-on-input will break a lot of existing code, but it may also prevent a lot of future code from being broken. I could be in the minority in what I consider "intuitive," though.
Vinny
Chiming in as a user (but so far not a contributor) I wouldn't like to see existing code broken. I would definitely use a reset() method if it were added.
Hi Vinny, Till, danielballan,
Thanks. The feedback is definitely helpful, and your use cases reasonable. It's true that most functions don't modify mutable arguments. As a relevant example, scipy.optimize.leastsq() takes a list of values as starting values, but does not modify these. The solution is part of the output, not changed in place. And writing your example as
fit_results = [minimize(function, copy(params), args=dataset) for dataset in datasets]
seems a little strange, as does
params = [copy(params) for i in range(len(datasets))]
fit_results = [minimize(function, pars, args=dataset) for dataset, pars in zip(datasets, params)]
Then again, I might actually make a DataSet class with a params attribute, and so do
fit_results = [minimize(function, ds.params, kw_args={'ds': ds}) for ds in datasets]
which would be fine. And, no one can rely on a function not modifying mutable input.
The principal objection seems to be "it will break existing code", which is fair, but not the only consideration. Still, I do get the feeling we're going to regret having params modified.
I wonder if it might be reasonable to implement copy-on-output (and reset() and copy()) for 0.8.0, and change all the examples and docs to use Minimizer.params instead of the params argument, and deprecate this usage so that with 0.9.0 params becomes copy-on-input? I've never thought that far in advance about API changes or version numbers, but perhaps this is the right approach.
Opinions welcome! At this point, I think we shouldn't rush into any decision about copy-on-input or not. Fixing to be copy-on-output will be the first step.
I have a possible solution that would not break code. I've written a class called Model that takes an input function, determines the parameters automatically using inspect.getargspec() (this was before I knew about wrap.wrap_function(), which does the same thing, though Model is encapsulated), and constructs a residual function. I also wrote a function called fit(), which is a wrapper around minimize(). Together the two allow me to write analysis code like this:
from numpy import exp

# create model (residual function is automatically constructed)
def corrfn(tau, base, beta, gamma):
    return base + beta*exp(-2.*gamma*tau)
my_model = Model(corrfn, independent_vars='tau')
# extract parameters from Model function, and assign initial guesses
params = my_model.params()
params['base'].value = 1.0
# ... (more parameter assignments)
fit_result = fit(my_model, data, params)
The last line makes sense semantically to me: I fit() a Model to the data by varying the params.
If this is of interest I could clean it up and submit, time permitting. The reason I say this provides a solution is that fit() can use copy-on-input for params, so you could leave the behavior of minimize() untouched, apart from a change to copy-on-output. Model could use some sane defaults to construct the residual function, so that together fit() and Model would provide an even higher-level interface to leastsq().
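To make the proposal a little more concrete, a stripped-down sketch of how such a Model and fit() could work (method names and defaults here are hypothetical, not the code being offered):

    import copy
    import inspect

    from lmfit import Parameters, minimize

    class Model(object):
        """Sketch: wrap a model function, inferring parameter names from its signature."""

        def __init__(self, func, independent_vars='x'):
            self.func = func
            if isinstance(independent_vars, str):
                independent_vars = [independent_vars]
            self.independent_vars = list(independent_vars)
            # parameter names = function arguments that are not independent variables
            argnames = inspect.getargspec(func).args   # inspect.signature in modern Python
            self.param_names = [a for a in argnames if a not in self.independent_vars]

        def params(self):
            # one Parameter per model argument, with a placeholder starting value
            pars = Parameters()
            for name in self.param_names:
                pars.add(name, value=1.0)
            return pars

        def residual(self, params, data, **indep):
            # default residual: model - data
            kwargs = {name: params[name].value for name in self.param_names}
            kwargs.update(indep)
            return self.func(**kwargs) - data

    def fit(model, data, params, **indep):
        """Sketch of the proposed fit(): copy-on-input, so the caller's params
        are never modified by the minimization."""
        pars = copy.deepcopy(params)
        return minimize(model.residual, pars, kws=dict(data=data, **indep))

With corrfn above, Model(corrfn, independent_vars='tau') would then treat base, beta, and gamma as the fit parameters.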
Vinny
@vnmanoharan I really like that corrfn can be defined without incorporating anything specific to lmfit. This is convenient if you then want to use the model for plotting, etc.
This higher-level interface is similar to the one used by statsmodels, which may adopt lmfit as a dependency for nonlinear fits. I've been working on something similar over at statsmodels/statsmodels#860, but the way you use inspect brings it home.
Yes, that looks very nice, and a reasonable approach to doing many fits. This is similar to what wrap_function() and make_paras_and_func() do: try to make easy fits for relatively simple functions even easier. And yes, having something like this in lmfit would be very helpful.
Somewhat related, models1d.py is a start at trying to provide very easy fits to common built-in lineshapes (Gaussian, Lorentzian, Exponential, etc).
These (and much in lmfit) are very under-documented. Really, any ideas, code, examples, and docs for how to do these things better would be most welcome. I think anything in the area of "let's make modeling data easier, more robust, and more Pythonic" would be fabulous.
I don't use statsmodels or pandas regularly, but merging lmfit into either of those (if it is a reasonable fit) would be fine with me. I'm an experimentalist and even among software projects this is something of a side project for me. I'm happy that others find it a useful approach to fitting data, and I'd be happy to be part of a team working on this. I'm very much open to suggestions such as a) making a github organization for lmfit, b) having a mailing list, or c) moving it into scipy.optimize or statsmodels or similar project.
I got really excited about the possibilities of the Model class and tried implementing something myself in the PR linked above. But I am happy to defer to other efforts. I just threw this together because I will start using it immediately. Python really needs concise curve fitting!
Dan, thanks for putting this together so quickly. It looks nice, nicer than what I had. I like the use of keyword arguments to pass fit parameters, which will make the code easy to use for newcomers. A few comments:
- In your ipython notebook you show how to pass Parameters as keywords:

      result = model.fit(data, t=t,
                         N=Parameter('N', value=7, vary=False),
                         tau=Parameter('tau', value=1, min=0))

  This is a great feature. I don't think you need to specify the name, though, when you construct a Parameter. The user can omit it and just write N=Parameter(value=7, vary=False), and it looks like your code will still assign the name correctly. If not, this would be a handy feature to add.
- I would have Model build the residual upon construction rather than on each call to fit(). Since data is passed from minimize() to the residual, I don't think you need to have the data argument in _build_residual().
- Some models may have implicit independent variables (e.g. an image, where the indices are the independent variables), so maybe independent_vars could be an optional argument to Model.__init__().
- I would like to see fit() also take an optional parameter sigma, and the generated residual function return (model-data)/sigma if sigma is specified (as shown in the examples in the documentation), so that the user can easily do a weighted fit. This seems like a sane default that would be useful to a lot of people.
- There might still be an issue with the passed Parameters being modified by the fit. What would happen if I did the following?

      N = Parameter(value=7, vary=False)
      tau = Parameter(value=1, min=0)
      results = [model.fit(dataset, t=t, N=N, tau=tau) for dataset in datasets]

  I could be wrong, but it seems that this would modify N and tau after each fit because these are passed by reference. Can you make the function so that it explicitly copies any Parameter objects that are input, either in **kwargs or params= (a sketch of what I mean follows below)? That would reduce unexpected side effects (and I could close this issue).
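Purely as an illustration of the copying I mean (the helper name and signature are invented):

    import copy

    from lmfit import Parameter, Parameters

    def copy_input_params(params=None, **kwargs):
        """Hypothetical helper: deep-copy any Parameters/Parameter passed in,
        so repeated fits never start from values mutated by an earlier fit."""
        pars = copy.deepcopy(params) if params is not None else Parameters()
        for name, val in kwargs.items():
            if isinstance(val, Parameter):
                par = copy.deepcopy(val)
                par.name = name             # fill in the name from the keyword
                pars[name] = par
            else:
                pars.add(name, value=val)   # plain numbers become Parameters
        return pars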
Thanks,
Vinny
Great. Thank you.
- Now passing N=Parameter(value=7) is equivalent to N=Parameter(name='N', value=7).
- Good point. I fixed it so the residual is built once when Model is instantiated.
- I made independent_vars optional.
- OK, sigma is added.
- You are correct. I wasn't quite clear on the desired behavior, but now I understand, and I followed your suggestion. Parameters passed in params or by keyword argument are copied on input using copy.deepcopy.

Also:
- I also added missing data handling designed like statsmodels. See notebook for demo.
- I did some minor refactoring so that pandas Series work out of the box -- no need to unpack Series.values.
I will add some unit tests before I hang this up, but I'm pretty happy with it.
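To illustrate the net effect of those changes as described in this thread (the model function, data, and import path below are invented or assumed):

    import numpy as np
    from lmfit import Parameter
    from lmfit.model import Model   # import path is an assumption

    def decay(t, N, tau):
        return N * np.exp(-t / tau)

    model = Model(decay, independent_vars=['t'])

    t = np.linspace(0, 5, 50)
    datasets = [7.0 * np.exp(-t / 1.2) + np.random.normal(scale=0.1, size=t.size)
                for _ in range(3)]

    # Because keyword Parameters are now deep-copied on input, every fit below
    # starts from the same initial guesses; N and tau themselves are not mutated.
    N = Parameter(value=7, vary=False)
    tau = Parameter(value=1, min=0)
    results = [model.fit(data, t=t, N=N, tau=tau) for data in datasets]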
Woah, #57 was already merged! Did not expect that to happen so fast. These refinements are in a new PR, #58.
Thanks again, Dan. This resolves my issues. Sorry I did not contribute any code, but your solution was better than anything I would have written. And Matt and Tillsten, thanks for your work on lmfit -- it's made fitting much easier for me.
Vinny