
Graphmastur
Advanced Member
Joined: 25 Mar 2009 Posts: 360

Posted: 31 Dec 2009 09:40:40 pm Post subject: 


First off, look here. It is a program that, given a set of data, will find the equation that best fits it. It uses an algorithm called symbolic regression, which is further described here.
The last link shows a basic tutorial of how it works, but it uses the integral to find its error. Well, I understand the concept of the integral, and kinda understand symbolic regression, but could anybody help with how I actually use integration? Like, if I were to use it in a program?
Basically, to make this easier: say I have the functions Y_{1}=X^{2} and Y_{2}=X, and I want to use integration to find the area (it is area, right?) between the two curves, in the range between -1 and 1. How would I do that?
Thank you! 



Graphmastur
Advanced Member
Joined: 25 Mar 2009 Posts: 360

Posted: 05 Jan 2010 10:07:50 pm Post subject: 


Sorry to double-post, but will anyone help with this integration? I am at a loss as to how to do it in a program. Will someone help me?



DarkerLine ceci n'est pas une 
Super Elite (Last Title)
Joined: 04 Nov 2003 Posts: 8328

Posted: 06 Jan 2010 09:48:45 am Post subject: 


To find the area between these two curves over -1..1, you would calculate fnInt(abs(Y_{1}-Y_{2}),X,-1,1). In general, if you don't have access to an "fnInt"-like function, you would estimate it just like any other integral (probably using Riemann sums or something like that; I don't actually know the specifics).
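For a program without an fnInt-style built-in, a Riemann sum is short to write. The following is only a minimal sketch, taking the range as -1 to 1; the class name AreaBetweenCurves, the method integrate, and the strip count of 1000 are illustrative choices, not anything from the linked program.

```java
import java.util.function.DoubleUnaryOperator;

class AreaBetweenCurves {
    // Midpoint Riemann sum: split [a, b] into n equal strips and add up
    // (strip width) * (function value at the middle of each strip).
    static double integrate(DoubleUnaryOperator f, double a, double b, int n) {
        double width = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            double mid = a + (i + 0.5) * width;
            sum += f.applyAsDouble(mid);
        }
        return sum * width;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator y1 = x -> x * x; // Y1 = X^2
        DoubleUnaryOperator y2 = x -> x;     // Y2 = X
        // abs() makes the gap count as positive area on both sides of a crossing.
        DoubleUnaryOperator gap = x -> Math.abs(y1.applyAsDouble(x) - y2.applyAsDouble(x));
        System.out.println(integrate(gap, -1.0, 1.0, 1000));
    }
}
```

Done by hand, the exact area is 5/6 on [-1, 0] plus 1/6 on [0, 1], so 1 in total, which gives something to check the estimate against. More strips means a closer estimate at the cost of more function evaluations.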
I must point out that taking (or estimating) integrals is probably not the best fitness measure when you're trying to fit a function to a set of points. First of all, if there are places where the points are more spread out, an integral will count those points more for no good reason at all. Second, adding up the squares of the differences is actually more intuitive and useful than the absolute values, when doing any sort of fit.
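Adding up the squares of the differences needs no integration at all, only a loop over the data points. Here is a minimal sketch in Java; the class name SquaredErrorFitness, the method sumSquaredError, and the sample points are all made up for illustration.

```java
import java.util.function.DoubleUnaryOperator;

class SquaredErrorFitness {
    // Adds up (candidate(x_i) - y_i)^2 over every data point.
    // A lower total means the candidate function fits the points better.
    static double sumSquaredError(DoubleUnaryOperator candidate, double[] xs, double[] ys) {
        if (xs.length != ys.length) {
            throw new IllegalArgumentException("xs and ys must have the same length");
        }
        double total = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double diff = candidate.applyAsDouble(xs[i]) - ys[i];
            total += diff * diff;
        }
        return total;
    }

    public static void main(String[] args) {
        // Made-up sample points that lie exactly on y = x^2.
        double[] xs = {-1.0, 0.0, 1.0, 2.0};
        double[] ys = { 1.0, 0.0, 1.0, 4.0};
        System.out.println(sumSquaredError(x -> x * x, xs, ys)); // perfect fit: 0.0
        System.out.println(sumSquaredError(x -> x, xs, ys));     // worse fit: 8.0
    }
}
```

Because every data point contributes exactly one term, widely spaced points are not weighted more heavily than closely spaced ones.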
This next bit is just a guess, so I'm hoping more experienced mathematicians will correct me here: since the usual inner product on functions on an interval is the integral of their product (over the interval), would there be any point to using ∫(f-g)^{2} as a fitness measure?
Last edited by Guest on 01 Jul 2010 09:56:09 am; edited 1 time in total 



Graphmastur
Advanced Member
Joined: 25 Mar 2009 Posts: 360

Posted: 06 Jan 2010 05:43:38 pm Post subject: 


If squares of the differences would be better, how would I do that? Note that I am going to do this in Java, so any way you can "dumb it down" for me would be awesome, because technically, I don't even know how to do integrals. Thanks for responding, though!



thornahawk μολών λαβέ
Active Member
Joined: 27 Mar 2005 Posts: 569

Posted: 01 Feb 2010 09:48:02 am Post subject: 


I seem to be late to this particular party, but I shall compensate by noting a few things.
DarkerLine was looking at the problem the right way; this is essentially finding the best-fit function by minimizing the (squared) norm, that is, the inner product of a function with itself with respect to a preset weight function, of the difference between the function you're trying to approximate and the approximating function (in the example you gave, a linear function). Usually, the Euclidean norm or 2-norm (based on the integral of the square of a function) is used.
For example, to find the best approximating (in the Euclidean sense) linear function to e^x would require you to find the values a and b such that
∫(e^x - a - bx)^{2} dx
is minimized (upper and lower limits of the integral depending on your interval of interest). Optimization algorithms essentially amount to systematically finding values of a and b that cause a decrease in the norm, eventually hitting the point where all further steps are increasing.
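As a rough illustration of that stepping process, here is a sketch in Java that uses plain gradient descent on the integral of the squared difference between e^x and a + bx. Several things here are assumptions made for the example: the interval [0, 1], the step size of 0.5, the iteration count, the midpoint-rule integrator, and the names LinearFitToExp, integrate and fit. A real optimizer would pick its steps more cleverly.

```java
import java.util.function.DoubleUnaryOperator;

class LinearFitToExp {
    // Midpoint-rule estimate of the integral of f over [lo, hi] using n strips.
    static double integrate(DoubleUnaryOperator f, double lo, double hi, int n) {
        double width = (hi - lo) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += f.applyAsDouble(lo + (i + 0.5) * width);
        }
        return sum * width;
    }

    // Returns {a, b} that approximately minimize the integral over [0, 1]
    // of (e^x - a - b*x)^2, found by repeatedly stepping downhill.
    static double[] fit() {
        double a = 0.0, b = 0.0;
        double step = 0.5; // assumed step size, small enough to converge on [0, 1]
        for (int iter = 0; iter < 5000; iter++) {
            final double ca = a, cb = b;
            DoubleUnaryOperator residual = x -> Math.exp(x) - ca - cb * x;
            // Partial derivatives of the squared-error integral with respect to a and b.
            double gradA = -2.0 * integrate(residual, 0.0, 1.0, 1000);
            double gradB = -2.0 * integrate(x -> x * residual.applyAsDouble(x), 0.0, 1.0, 1000);
            a -= step * gradA; // move against the gradient, which decreases the error
            b -= step * gradB;
        }
        return new double[] {a, b};
    }

    public static void main(String[] args) {
        double[] ab = fit();
        System.out.println("a = " + ab[0] + ", b = " + ab[1]);
    }
}
```

On [0, 1] this particular problem can also be solved by hand: setting both derivatives to zero gives a = 4e - 10 (about 0.873) and b = 18 - 6e (about 1.690), which is a handy check on what the loop converges to.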
As to integration, Riemann sums would work, though they are notoriously slow to converge. Gaussian integration (fnInt( uses a fancy version of it, called Gauss-Kronrod quadrature) would give more accuracy for the same number of function evaluations that a Riemann sum would require.
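To give a feel for the difference, the sketch below compares a left Riemann sum with a 3-point Gauss-Legendre rule, each using exactly three function evaluations. Note that this is the plain Gaussian rule, not the Gauss-Kronrod variant, and it makes no claim about how fnInt( is implemented internally. The class and method names, and the test integrand e^x on [0, 1], are chosen only for illustration.

```java
import java.util.function.DoubleUnaryOperator;

class QuadratureComparison {
    // Left Riemann sum: n strips, sampling f at the left edge of each strip.
    static double leftRiemann(DoubleUnaryOperator f, double a, double b, int n) {
        double width = (b - a) / n;
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += f.applyAsDouble(a + i * width);
        }
        return sum * width;
    }

    // 3-point Gauss-Legendre rule, rescaled from [-1, 1] to [a, b].
    // Nodes are 0 and +/- sqrt(3/5); weights are 8/9 and 5/9.
    static double gauss3(DoubleUnaryOperator f, double a, double b) {
        double half = (b - a) / 2.0;
        double center = (a + b) / 2.0;
        double offset = half * Math.sqrt(3.0 / 5.0);
        return half * ((5.0 / 9.0) * f.applyAsDouble(center - offset)
                     + (8.0 / 9.0) * f.applyAsDouble(center)
                     + (5.0 / 9.0) * f.applyAsDouble(center + offset));
    }

    public static void main(String[] args) {
        DoubleUnaryOperator f = Math::exp;
        double exact = Math.E - 1.0; // integral of e^x over [0, 1]
        System.out.println("Riemann error: " + Math.abs(leftRiemann(f, 0.0, 1.0, 3) - exact));
        System.out.println("Gauss error:   " + Math.abs(gauss3(f, 0.0, 1.0) - exact));
    }
}
```

The 3-point Gaussian rule integrates polynomials up to degree 5 exactly, which is why it does so well on smooth functions; for functions with kinks (such as an absolute value) the advantage shrinks unless the interval is split at the kink.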
thornahawk 



