Comments (8)
Okay. There are a couple of ways to use your GA implementation.
In the _opt objective function (and throughout my other code) I assume the first and last breakpoints are at x.min() and x.max(). Since the first and last points aren't actually connected to other lines, this shouldn't make much difference. If you don't mind having breakpoints at x.min() and x.max(), then you can use
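import numpy as np
import matplotlib.pyplot as plt
import pwlf
# genetic_algorithm is assumed to come from pwlf/ga.py on the expga branch;
# x and y are your data arrays
number_of_line_segments = 2
degree = 3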
my_pwlf = pwlf.PiecewiseLinFit(x, y, degree=degree, disp_res=False)
my_pwlf.use_custom_opt(number_of_line_segments)
total_set = set(np.floor(my_pwlf.x_data))
pop, hof, stats = genetic_algorithm(total_set, my_pwlf.nVar,
                                    my_pwlf.fit_with_breaks_opt, ngen=20,
                                    mu=125, lam=250, cxpb=0.7, mutpb=0.2,
                                    tournsize=5, verbose=True)
print(hof[0])
x_opt = [my_pwlf.x_data.min()]
x_opt += list(hof[0])
x_opt.append(my_pwlf.x_data.max())
ssr = my_pwlf.fit_with_breaks(x_opt)
plt.figure()
plt.plot(x,y, label='data')
# predict
xHat = np.linspace(min(x), max(x), num=10000)
yHat = my_pwlf.predict(xHat)
plt.plot(xHat, yHat, label='predict')
plt.legend()
plt.show()
print('ssr: ', ssr)
print('breaks: ', my_pwlf.fit_breaks)
Which would give you something like
ssr: 104503.32784991479
breaks: [ 32.90733248 40. 748.29923958]
However, if x.min() and x.max() must indeed be in your set, then we need to make a couple of changes.
number_of_line_segments = 2
degree = 3
my_pwlf = pwlf.PiecewiseLinFit(x, y, degree=degree, disp_res=False)
my_pwlf.use_custom_opt(number_of_line_segments)
total_set = set(np.floor(my_pwlf.x_data))
pop, hof, stats = genetic_algorithm(total_set, number_of_line_segments+1,
                                    my_pwlf.fit_with_breaks, ngen=20,
                                    mu=125, lam=250, cxpb=0.7, mutpb=0.2,
                                    tournsize=5, verbose=True)
print(hof[0])
ssr = my_pwlf.fit_with_breaks(list(hof[0]))
plt.figure()
plt.plot(x,y, label='data')
# predict
xHat = np.linspace(min(x), max(x), num=10000)
yHat = my_pwlf.predict(xHat)
plt.plot(xHat, yHat, label='predict')
plt.legend()
plt.show()
print('ssr: ', ssr)
print('breaks: ', my_pwlf.fit_breaks)
which gave something like
ssr: 123888.40048433887
breaks: [ 34. 40. 187.]
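To make the difference between the two variants concrete, here is a minimal sketch in plain Python. It assumes (as I understand pwlf) that the _opt objective only sees the interior breakpoints, so the GA searches over number_of_line_segments - 1 values, while fit_with_breaks expects the full list of number_of_line_segments + 1 breakpoints. The helper name full_breaks is hypothetical and not part of pwlf.

```python
def full_breaks(interior, x_min, x_max):
    """Hypothetical helper: wrap interior breakpoints with the data end points."""
    return [x_min] + sorted(interior) + [x_max]


number_of_line_segments = 2

# Variant 1: the GA picks only the interior breakpoints,
# and x.min() / x.max() are added afterwards.
interior = [40.0]
breaks_v1 = full_breaks(interior, 32.9, 748.3)

# Variant 2: the GA picks every breakpoint, end points included,
# so all of them come from the integer set.
breaks_v2 = [34, 40, 187]

print(len(interior), len(breaks_v1), len(breaks_v2))
```

In the first variant only the middle value is guaranteed to be an integer; in the second all three are.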
Hopefully this helps.
from piecewise_linear_fit_py.
You might need to use a different optimization algorithm for this. GA or EGO work well with integers, though they may not actually be faster. The advantage in your case would be that the optimum will always be an integer.
I had an experimental branch called expga, where I was using DEAP to build a custom GA over a discrete set.
If I understand correctly, the possible breakpoints will be integers in [0, 1023]?
Here is my Genetic algorithm:
https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/pwlf/ga.py
And here is how I called such algorithm (focus between t2 = time() and t3 = time()):
https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/examples/compare_fitfast_and_ga.py
You would change total_set = set(my_pwlf.x_data) to be the set of discrete integers for possible breakpoints.
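As a rough sketch of what that set could look like for a 10-bit range, assuming the end points 0 and 1023 are kept as fixed first and last breakpoints and therefore left out of the candidates (the variable names here are only for illustration):

```python
# Assumed ADC range for a 10-bit unsigned integer.
x_min, x_max = 0, 1023

# Candidate interior breakpoints: every integer strictly between the end points.
total_set = set(range(x_min + 1, x_max))

print(len(total_set), min(total_set), max(total_set))
```

If the end points should also be chosen by the GA, you would use set(range(x_min, x_max + 1)) instead.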
This was fairly experimental, and I didn't play too much with the hyperparameters of the GA.
Let me also point you in the direction of EGO, but I'll need some time to come up with the code.
from piecewise_linear_fit_py.
The code in https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/examples/compare_fitfast_and_ga.py is using the incorrect objective function. It should be my_pwlf.fit_with_breaks_opt in line 30. You could then use something like the following example to populate the my_pwlf parameters correctly.
I've added an example of using EGO to do this. https://github.com/cjekel/piecewise_linear_fit_py/blob/master/examples/EGO_integer_only.ipynb
from piecewise_linear_fit_py.
Hi,
To give more details, my dataset is actually all float because it is data collected from a digital oscilloscope. For example, for a particular current sensor I might have values from [0, 150] amperes for Y and [0, 5] volts for X. The microcontroller transforms these X values into the digital domain, representing them as a 10-bit unsigned integer ([0, 1023], as you correctly understood).
To be able to linearize it and generate a polynomial that transforms the [0, 1023] values into [0, 150], I apply a scalar transformation to my X so that [0, 5] becomes [0, 1023].
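A minimal sketch of that scaling, assuming a plain linear map with no offset (the function name volts_to_adc and its defaults are made up for illustration):

```python
def volts_to_adc(volts, v_max=5.0, adc_max=1023):
    """Map a voltage in [0, v_max] to the ADC scale [0, adc_max], keeping float precision."""
    return volts * adc_max / v_max


samples_volts = [0.0, 2.5, 5.0]
samples_adc = [volts_to_adc(v) for v in samples_volts]
print(samples_adc)
```

The scaled X values stay floats (2.5 V lands between two ADC codes), which is why the data are not integers even though the breakpoints have to be.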
One option would be to restrict the final precision by representing the resulting range with unsigned integers, but that doesn't perform very well.
Instead, I found that preserving the float precision in this transformation leads to a better result in the linearization process, so my X is actually a float value, but the breaks should occur at integers only. You can see the application in detail in this notebook.
About GA:
Following your GA example (but using fit_with_breaks_opt at lines 29 and 33, as you mentioned), I was able to apply it to my dataset, but I had to delete the following lines:
total_set.remove(x.min())
total_set.remove(x.max())
It works, but only for a PiecewiseLinFit of first degree. Is this a limitation, or did I do something wrong? Here is the code.
About EGO: I will try to apply your example to my dataset, at first glance it seems to work very well :)
from piecewise_linear_fit_py.
Also, here is the documentation on the GA scheme: https://deap.readthedocs.io/en/master/api/algo.html#deap.algorithms.eaMuPlusLambda
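For anyone who wants the gist of the (mu + lambda) scheme without installing DEAP, here is a rough standard-library sketch. It is not the code from ga.py: the crossover, the mutation, the toy objective, and the plain "keep the best mu" selection are all simplifications I made up for illustration (the calls in this thread pass tournsize, which suggests tournament selection instead).

```python
import random


def mu_plus_lambda(candidates, k, objective, mu=20, lam=40,
                   cxpb=0.7, mutpb=0.2, ngen=20, seed=0):
    """Toy (mu + lambda) search over k-element subsets of a discrete set."""
    rng = random.Random(seed)
    pool = sorted(candidates)

    def crossover(a, b):
        # Child draws k distinct values from the union of both parents.
        union = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(union, k)))

    def mutate(a):
        # Replace one value with a candidate not already in the individual.
        unused = [c for c in pool if c not in a]
        child = list(a)
        child[rng.randrange(k)] = rng.choice(unused)
        return tuple(sorted(child))

    population = [tuple(sorted(rng.sample(pool, k))) for _ in range(mu)]
    history = [min(objective(ind) for ind in population)]
    for _ in range(ngen):
        offspring = []
        for _ in range(lam):
            r = rng.random()
            if r < cxpb:
                offspring.append(crossover(*rng.sample(population, 2)))
            elif r < cxpb + mutpb:
                offspring.append(mutate(rng.choice(population)))
            else:
                offspring.append(rng.choice(population))  # plain copy
        # "Plus" selection: parents and offspring compete, best mu survive.
        population = sorted(population + offspring, key=objective)[:mu]
        history.append(objective(population[0]))
    return population[0], history


def toy_objective(ind):
    # Hypothetical stand-in for the fit error: distance to made-up targets.
    return sum((b - t) ** 2 for b, t in zip(ind, (34, 187)))


best, history = mu_plus_lambda(range(1, 1023), 2, toy_objective)
print(best, history[0], history[-1])
```

Because the parents stay in the selection pool, the best objective value can never get worse from one generation to the next in this sketch.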
from piecewise_linear_fit_py.
Thank you so much, now I understand, and I was able to adapt it to my needs. Check it again, I think it really works well in this case.
I still want to look at EGO anyway, do you think it would perform better?
In this example the dataset is downsampled (using a simple average) by a factor of 1000. Do you think that more data leads to more accurate results?
from piecewise_linear_fit_py.
If you can afford to perform the fit with more data, you can check whether the results differ from the down-sampled results. I wouldn't know without trying.
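If it helps with that comparison, here is a small sketch of averaging over non-overlapping blocks, which is my guess at what "simple average" down-sampling means here. The function name downsample_mean is hypothetical, and the factor is a parameter so you can try 1000, 100, 10 and compare the resulting fits.

```python
def downsample_mean(values, factor):
    """Average consecutive non-overlapping blocks of `factor` samples.

    A final shorter block, if any, is averaged over the samples it has.
    """
    out = []
    for start in range(0, len(values), factor):
        block = values[start:start + factor]
        out.append(sum(block) / len(block))
    return out


print(downsample_mean([1, 2, 3, 4, 5, 6], 2))
print(downsample_mean([1, 2, 3, 4, 5], 2))
```

Apply the same factor to both x and y so the pairs stay aligned.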
from piecewise_linear_fit_py.
I still want to look at EGO anyway, do you think it would perform better?
Either should work; it's hard to say which will be better. If I had to pick one, it would probably be the GA.
from piecewise_linear_fit_py.