
Comments (8)

cjekel commented on May 22, 2024

Okay. There are a couple of ways to use your GA implementation.

In the _opt objective function (and throughout my other code), I assume the first and last breakpoints are at x.min() and x.max(). Since the first and last breakpoints aren't connected to other line segments, this shouldn't make much difference. If you don't mind having breakpoints at x.min() and x.max(), then you can use

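import numpy as np
import matplotlib.pyplot as plt
import pwlf
# genetic_algorithm is the function defined in pwlf/ga.py on the expga branch;
# x and y are your 1-D data arrays.

number_of_line_segments = 2
degree = 3

my_pwlf = pwlf.PiecewiseLinFit(x, y, degree=degree, disp_res=False)
my_pwlf.use_custom_opt(number_of_line_segments)
# candidate breakpoints: the integer floor of every x value
total_set = set(np.floor(my_pwlf.x_data))
# the GA only searches for the interior breakpoints (my_pwlf.nVar of them)
pop, hof, stats = genetic_algorithm(total_set, my_pwlf.nVar,
                                    my_pwlf.fit_with_breaks_opt, ngen=20,
                                    mu=125, lam=250, cxpb=0.7, mutpb=0.2,
                                    tournsize=5, verbose=True)
print(hof[0])
# add the fixed end breakpoints before performing the final fit
x_opt = [my_pwlf.x_data.min()]
x_opt += list(hof[0])
x_opt.append(my_pwlf.x_data.max())
ssr = my_pwlf.fit_with_breaks(x_opt)

plt.figure()
plt.plot(x, y, label='data')
# predict
xHat = np.linspace(min(x), max(x), num=10000)
yHat = my_pwlf.predict(xHat)
plt.plot(xHat, yHat, label='predict')
plt.legend()
plt.show()

print('ssr: ', ssr)
print('breaks: ', my_pwlf.fit_breaks)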

Which would give you something like
ssr: 104503.32784991479
breaks: [ 32.90733248 40. 748.29923958]
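
The difference between the two kinds of objective function comes down to which breakpoints the optimizer supplies. Here is a minimal sketch of that idea in plain Python. The names make_interior_objective and toy_fit are hypothetical and not part of pwlf; the wrapper pins the end breakpoints at the data limits and lets the optimizer vary only the interior ones, which is the convention described above for the _opt objective.

```python
def make_interior_objective(fit_with_breaks, x_min, x_max):
    """Wrap a full-breakpoint objective so it accepts interior breakpoints only.

    fit_with_breaks is any callable that takes the complete, ordered list of
    breakpoints and returns a scalar cost (a hypothetical stand-in for a fit).
    """
    def objective(interior_breaks):
        # End breakpoints are fixed at the data limits; only the interior
        # breakpoints are free variables for the optimizer.
        breaks = [x_min] + sorted(interior_breaks) + [x_max]
        return fit_with_breaks(breaks)
    return objective


# Toy stand-in for a fit: it only records the breakpoints it was given.
seen = []

def toy_fit(breaks):
    seen.append(list(breaks))
    return 0.0

objective = make_interior_objective(toy_fit, x_min=32.9, x_max=748.3)
cost = objective({40})
```

With one interior breakpoint, toy_fit receives three breakpoints in total: the two fixed ends plus the one chosen by the optimizer.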

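However, if the first and last breakpoints must also be drawn from your set, rather than fixed at x.min() and x.max(), then we need to make a couple of changes: the GA searches for all number_of_line_segments+1 breakpoints, and the objective becomes fit_with_breaks.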

import numpy as np
import matplotlib.pyplot as plt
import pwlf
# genetic_algorithm is the function defined in pwlf/ga.py on the expga branch;
# x and y are your 1-D data arrays.

number_of_line_segments = 2
degree = 3

my_pwlf = pwlf.PiecewiseLinFit(x, y, degree=degree, disp_res=False)
my_pwlf.use_custom_opt(number_of_line_segments)
total_set = set(np.floor(my_pwlf.x_data))
# the GA now searches for every breakpoint, including the first and last
pop, hof, stats = genetic_algorithm(total_set, number_of_line_segments+1,
                                    my_pwlf.fit_with_breaks, ngen=20,
                                    mu=125, lam=250, cxpb=0.7, mutpb=0.2,
                                    tournsize=5, verbose=True)
print(hof[0])
ssr = my_pwlf.fit_with_breaks(list(hof[0]))

plt.figure()
plt.plot(x, y, label='data')
# predict
xHat = np.linspace(min(x), max(x), num=10000)
yHat = my_pwlf.predict(xHat)
plt.plot(xHat, yHat, label='predict')
plt.legend()
plt.show()
print('ssr: ', ssr)
print('breaks: ', my_pwlf.fit_breaks)

which gave something like
ssr: 123888.40048433887
breaks: [ 34. 40. 187.]

Hopefully this helps.

from piecewise_linear_fit_py.

cjekel commented on May 22, 2024

You might need to use a different optimization algorithm for this. A GA or EGO works well with integers, though neither is necessarily faster than the existing optimizations. The advantage in your case would be that the optimum will always be an integer.

I had an experimental branch called expga, where I was using DEAP to build a custom GA that searches a discrete set.

If I understand correctly, the possible breakpoints are integers in [0, 1023]?


Here is my genetic algorithm:
https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/pwlf/ga.py

And here is how I called it (focus on the code between t2 = time() and t3 = time()):
https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/examples/compare_fitfast_and_ga.py

You would change total_set = set(my_pwlf.x_data) to be the set of discrete integers for possible breakpoints.
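
For a 10-bit range, such a set could be built roughly as follows. This is only a sketch: integer_candidates is a hypothetical helper, not part of pwlf, and it assumes the candidate breakpoints should lie strictly inside the data range, with the ends handled separately.

```python
import math

def integer_candidates(x_values, lo=0, hi=1023):
    """Return the integers in [lo, hi] that lie strictly inside the range of x_values."""
    x_min, x_max = min(x_values), max(x_values)
    first = max(lo, math.floor(x_min) + 1)   # smallest integer greater than x_min
    last = min(hi, math.ceil(x_max) - 1)     # largest integer smaller than x_max
    return set(range(first, last + 1))

# Example with float data spanning the full 10-bit axis
total_set = integer_candidates([0.0, 12.7, 511.5, 1023.0])
```

If the end breakpoints are also allowed to come from the set, the strict inequalities would need to be relaxed.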

This was fairly experimental, and I didn't play much with the hyperparameters of the GA.


Let me also point you in the direction of EGO, but I'll need some time to come up with the code.


cjekel commented on May 22, 2024

The code in https://github.com/cjekel/piecewise_linear_fit_py/blob/expga/examples/compare_fitfast_and_ga.py is using the incorrect objective function. It should be my_pwlf.fit_with_breaks_opt in line 30. You could then use something like the following example to populate the my_pwlf parameters correctly.

I've added an example of using EGO to do this. https://github.com/cjekel/piecewise_linear_fit_py/blob/master/examples/EGO_integer_only.ipynb


joaoantoniocardoso commented on May 22, 2024

Hi,

To give more details: my dataset is actually all floats because it is data collected from a digital oscilloscope. For example, for a particular current sensor I might have values in [0, 150] amperes for Y and [0, 5] volts for X. The microcontroller converts these X values into the digital domain, representing them as a 10-bit unsigned integer ([0, 1023], as you correctly understood).
To linearize it and generate a polynomial that maps the [0, 1023] values onto [0, 150], I apply a scalar transformation to my X so that [0, 5] becomes [0, 1023].
One option would be to restrict the final precision by representing the resulting range with unsigned integers, but that doesn't perform very well.
Instead, I found that preserving float precision in this transformation leads to a better result in the linearization process, so my X is actually a float value, but the breaks should occur at integers only. You can see the application in detail in this notebook.
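
The scaling described above can be sketched as follows. This assumes a purely linear mapping in which 5 V corresponds to the full-scale code 1023; volts_to_adc_scale is a hypothetical name used only for illustration.

```python
def volts_to_adc_scale(volts, v_ref=5.0, bits=10):
    """Map a voltage in [0, v_ref] onto the ADC code axis [0, 2**bits - 1] as a float."""
    full_scale = 2 ** bits - 1   # 1023 for a 10-bit converter
    return volts * full_scale / v_ref

x_volts = [0.0, 1.25, 2.5, 5.0]
# The result stays a float on purpose: no rounding to integer codes here.
x_scaled = [volts_to_adc_scale(v) for v in x_volts]
```

Only the breakpoints are restricted to integers; the scaled X values keep their fractional part.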


About GA:
Following your GA example (but using fit_with_breaks_opt at lines 29 and 33, as you mentioned), I was able to apply it to my dataset, but I had to delete the following lines:

total_set.remove(x.min())
total_set.remove(x.max())

It works, but only for a first-degree PiecewiseLinFit. Is this a limitation, or did I do something wrong? Here is the code.


About EGO: I will try to apply your example to my dataset. At first glance it seems to work very well :)


cjekel commented on May 22, 2024

Also, here is the documentation on the GA scheme: https://deap.readthedocs.io/en/master/api/algo.html#deap.algorithms.eaMuPlusLambda
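
To give a feel for the scheme, here is a toy (mu + lambda) loop in plain Python. It is a simplified sketch and not the DEAP implementation or the code in ga.py: it uses truncation selection instead of tournament selection, and the mutation, crossover, and cost function are all made up for illustration.

```python
import random

def mu_plus_lambda(candidates, n_breaks, cost, mu=20, lam=40,
                   cxpb=0.7, mutpb=0.2, ngen=30, seed=0):
    """Toy (mu + lambda) search for n_breaks distinct values from candidates."""
    rng = random.Random(seed)
    pool = sorted(candidates)

    def random_individual():
        return tuple(sorted(rng.sample(pool, n_breaks)))

    def mutate(ind):
        # Replace one gene with a candidate that is not already in the individual.
        genes = set(ind)
        genes.remove(rng.choice(ind))
        genes.add(rng.choice([c for c in pool if c not in ind]))
        return tuple(sorted(genes))

    def crossover(a, b):
        # Draw the child's genes from the union of both parents.
        union = sorted(set(a) | set(b))
        return tuple(sorted(rng.sample(union, n_breaks)))

    population = [random_individual() for _ in range(mu)]
    history = []
    for _ in range(ngen):
        offspring = []
        for _ in range(lam):
            r = rng.random()
            if r < cxpb:
                offspring.append(crossover(*rng.sample(population, 2)))
            elif r < cxpb + mutpb:
                offspring.append(mutate(rng.choice(population)))
            else:
                offspring.append(rng.choice(population))  # plain reproduction
        # "Plus" selection: parents compete with offspring for the mu slots.
        population = sorted(population + offspring, key=cost)[:mu]
        history.append(cost(population[0]))
    return population[0], history


target = (100, 400, 900)   # arbitrary toy optimum, not real data

def toy_cost(ind):
    return sum((a - b) ** 2 for a, b in zip(ind, target))

best, history = mu_plus_lambda(range(1024), 3, toy_cost)
```

Because the parents stay in the selection pool, the best cost can never get worse from one generation to the next, which is the main practical difference from a (mu, lambda) scheme.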


joaoantoniocardoso commented on May 22, 2024

Thank you so much. Now I understand, and I was able to adapt it to my needs. Check it again; I think it works really well in this case.

I still want to look at EGO anyway, do you think it would perform better?

In this example the dataset is downsampled (using a simple average) by a factor of 1000. Do you think that more data leads to more accurate results?


cjekel commented on May 22, 2024

If you can afford to perform the fit with more data, you can see whether the results differ from the down-sampled results. I wouldn't know without trying.
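
One way to run that comparison is to down-sample by several factors and fit each version. Here is a minimal sketch of block averaging, assuming "simple average" means the mean of consecutive, non-overlapping blocks; downsample_mean is a hypothetical helper, and the same factor would be applied to both x and y.

```python
def downsample_mean(values, factor):
    """Average consecutive blocks of `factor` samples; a trailing partial block is dropped."""
    n_blocks = len(values) // factor
    return [sum(values[i * factor:(i + 1) * factor]) / factor
            for i in range(n_blocks)]

raw = [float(i) for i in range(10)]
coarse = downsample_mean(raw, factor=5)   # averages samples 0..4 and 5..9
```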


cjekel commented on May 22, 2024

"I still want to look at EGO anyway, do you think it would perform better?"

Either should work, it's hard to say which will be better. If I had to prefer one, it would probably be the GA.

