> Check your math book ("linear regression") - it is so simple that some kind of API is not necessary.
Only if you're working in exact arithmetic. Most statistics require forming sums, which can be an easy way of losing precision. A library which handles the numerical analysis aspects is certainly worth using.
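To make the point about sums concrete, here is a minimal Python sketch (the language is my choice, nobody in the thread named one). It adds 0.1 ten times with a plain loop and compares that with `math.fsum` from the standard library, which tracks the rounding error of the partial sums.

```python
import math

# In exact arithmetic, ten copies of 0.1 sum to exactly 1.0.
values = [0.1] * 10

naive = 0.0
for v in values:
    naive += v  # every addition rounds to the nearest double

accurate = math.fsum(values)  # compensates for the rounding of partial sums

print(naive == 1.0)     # False
print(accurate == 1.0)  # True
```

The error here is tiny, but it grows with the number of terms and with the spread in their magnitudes.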
(...)
> Only if you're working in exact arithmetic. Most
> statistics require forming sums, which can be an
> easy way of losing precision.
(...)
You are right, but it is a rare case when a 64-bit double is not enough. To lose precision with doubles, you would need a sum about 2^40 times bigger than the elements you add.
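A rough way to check where the limit sits, sketched in Python and assuming IEEE 754 doubles with a 53-bit significand: an addend vanishes completely once the sum is about 2^53 times larger, and it already loses its low-order bits at smaller ratios such as 2^40.

```python
big = 2.0 ** 53
print(big + 1.0 == big)  # True: the 1.0 is absorbed entirely

mid = 2.0 ** 40
print(mid + 1.0 == mid)  # False: 1.0 still registers at this magnitude

# Doubles near 2^40 are spaced 2^-12 apart, so finer detail in the
# addend is rounded away when it joins the sum.
small = 1.0 + 2.0 ** -20
print((mid + small) - mid == small)  # False: the 2^-20 part was lost
```

So a ratio of 2^40 does not destroy the addend, but it leaves only about 13 of its 53 significant bits.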
Well,
The calculation is quite easy to implement. You would have point objects holding (x, y) coordinates. From then on, it's just simple S1 stats. Your line is given by the equation y = mx + b, where m is the gradient and b is the intercept on the y-axis. You can calculate m using
m = ( n*E(xy) - E(x)*E(y) ) / ( n*E(x^2) - (E(x))^2 )
and b using
b = ( E(y) - m*E(x) ) / n
where n is the number of point objects, x and y are the respective coordinates, and E stands for "the sum of". Hope that helps.
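The formulas above can be sketched in Python as follows. The function name `fit_line` and the use of plain (x, y) tuples instead of point objects are assumptions made for illustration. The sums go through `math.fsum` to limit rounding error.

```python
import math

def fit_line(points):
    """Return (m, b) of the least-squares line y = m*x + b.

    points is a sequence of (x, y) pairs (hypothetical representation).
    """
    n = len(points)
    sum_x = math.fsum(x for x, _ in points)
    sum_y = math.fsum(y for _, y in points)
    sum_xy = math.fsum(x * y for x, y in points)
    sum_xx = math.fsum(x * x for x, _ in points)

    denom = n * sum_xx - sum_x ** 2
    if denom == 0:
        # Happens with no points or when every x is the same.
        raise ValueError("need at least two distinct x values")

    m = (n * sum_xy - sum_x * sum_y) / denom
    b = (sum_y - m * sum_x) / n
    return m, b

# Points lying exactly on y = 2x + 1.
m, b = fit_line([(0, 1), (1, 3), (2, 5), (3, 7)])
print(m, b)  # 2.0 1.0
```

Note that the denominator subtracts two numbers of similar size, which is exactly where precision can be lost if the x values are large and close together.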
Regards,
gamehack
There is a general approach which is often used in real
world time series forecasting.
1. Plot the series and examine the features of the graph.
Look for trends, seasonal components, apparent sharp
changes in behavior, and outlying points.
2. Remove the seasonal components and the trend to get
the residuals.
3. Choose a model to fit the residuals in order to
determine how likely the assumptions are.
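One simple way to carry out step 2 is differencing, shown here as a Python sketch. It is only one of several possible techniques and assumes that the seasonal period is known (4 in this made-up series) and that the trend is linear. The helper name `difference` is hypothetical.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t >= lag."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

# Made-up data: linear trend 2*t plus a seasonal pattern of period 4.
pattern = [0, 5, -5, 0]
series = [2 * t + pattern[t % 4] for t in range(12)]

deseasonalized = difference(series, lag=4)      # removes the period-4 component
residuals = difference(deseasonalized, lag=1)   # removes what is left of the trend

print(deseasonalized)  # [8, 8, 8, 8, 8, 8, 8, 8]
print(residuals)       # [0, 0, 0, 0, 0, 0, 0]
```

With real data the residuals would not be zero, and they are what the model in step 3 is fitted to.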
The trend can be a polynomial such as
f(t) = a0 + a1*t + ... + an*t^n, or a sum of harmonic terms
such as f(t) = a0 + sum(ai*cos(ki*t) + bi*sin(ki*t)), where
ai and bi are unknown parameters and ki are fixed
frequencies. If the variables {X1, ..., Xn} are
independent and identically distributed, you can compute
the parameters above by minimizing sum((xt - f(t))^2).
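As a rough illustration of that minimization, here is a Python sketch that fits f(t) = a0 + a1*t + a*cos(k*t) + b*sin(k*t) by solving the normal equations. The names `solve` and `fit_trend`, the frequency k = 2*pi/12 and the noise-free test data are all assumptions made for the example.

```python
import math

def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_trend(times, values, basis):
    """Coefficients minimizing sum((x_t - f(t))^2), f = sum_j c_j * basis_j."""
    rows = [[g(t) for g in basis] for t in times]
    k = len(basis)
    # Normal equations: (A^T A) c = A^T x
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    atx = [sum(r[i] * x for r, x in zip(rows, values)) for i in range(k)]
    return solve(ata, atx)

k1 = 2 * math.pi / 12  # fixed frequency (assumed: period of 12 samples)
basis = [
    lambda t: 1.0,
    lambda t: float(t),
    lambda t: math.cos(k1 * t),
    lambda t: math.sin(k1 * t),
]

# Synthetic, noise-free series with known parameters 3.0, 0.5, 2.0, -1.0.
times = list(range(48))
values = [3.0 + 0.5 * t + 2.0 * math.cos(k1 * t) - math.sin(k1 * t) for t in times]

coef = fit_trend(times, values, basis)
residuals = [x - sum(c * g(t) for c, g in zip(coef, basis))
             for t, x in zip(times, values)]
print([round(c, 6) for c in coef])
```

Forming the normal equations squares the condition number of the problem, so for high-degree polynomials or long series a library routine based on QR or SVD is the safer choice. That is the same numerical-analysis concern raised earlier in this thread.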