Saturday, June 22, 2024

Curve Fitting: Adjusting Data Points (feat. HP 14B and HP Prime)



Graphs are generated with the HP Prime emulator software.



A Curve Fitting Problem


Problem:


We have an investment project, which at the beginning will cost us 50,000 (insert the currency of your choice). Through analysis, the projected cash flows at the beginning of each term are shown here:



Period (beginning of)    Cash Flow
1                        -50000
2                        -3000
3                        9000
4                        16000
5                        18000
6                        21000


What is the projected cash flow at the beginning of period 7?


We could try linear regression, but the data does not fit a line well. So we can just use another regression model and that’s simple, right? Not quite, since two of the cash flows are negative.


Approach


Most calculators have at least the following curve fitting regressions:


Model          Equation              Transformed                  X’ =     Y’ =     True B =
Linear         Y = B + M * X         Y = B + M * X                X        Y        B
Exponential    Y = B * exp(M * X)    ln(Y) = ln(B) + M * X        X        ln(Y)    exp(B)
Logarithmic    Y = B + M * ln(X)     Y = B + M * ln(X)            ln(X)    Y        B
Power          Y = B * X^M           ln(Y) = ln(B) + M * ln(X)    ln(X)    ln(Y)    exp(B)


Most of them rely on a transformation of the data so that an ordinary linear regression can be run on the transformed values. After the slope (M) and y-intercept (B) of that line are calculated, any needed adjustment (the True B column) is made to recover the parameters of the original model.
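
To make the table concrete, here is a minimal Python sketch of the idea: transform the data, run an ordinary least squares line, then convert the intercept back. The names linear_fit, fit_model, and MODELS are made up for this illustration, and the test data is hypothetical. It is not how any particular calculator implements its regressions.

```python
import math


def linear_fit(xs, ys):
    """Least squares line y = b + m*x. Returns (m, b, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return m, b, r


def identity(v):
    return v


# model name: (transform for X, transform for Y, fitted intercept -> true B)
MODELS = {
    "linear": (identity, identity, identity),
    "exponential": (identity, math.log, math.exp),
    "logarithmic": (math.log, identity, identity),
    "power": (math.log, math.log, math.exp),
}


def fit_model(name, xs, ys):
    """Fit one of the four models through its linearized form."""
    fx, fy, recover_b = MODELS[name]
    m, b, r = linear_fit([fx(x) for x in xs], [fy(y) for y in ys])
    return m, recover_b(b), r


# Hypothetical data generated from Y = 2 * exp(0.5 * X)
xs = [1, 2, 3, 4, 5]
ys = [2 * math.exp(0.5 * x) for x in xs]
m, b, r = fit_model("exponential", xs, ys)
print(m, b, r)  # should come out close to 0.5, 2, and 1
```

Note that math.log raises an error as soon as any transformed value is zero or negative, which is exactly the issue discussed next.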


The statistical calculations operate on real numbers. If we try an exponential, logarithmic, or power fit on data with non-positive values, we are going to have a problem: the natural logarithm used in the transformation does not return a real number for zero or negative arguments.
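
A quick way to see the problem, sketched in Python (the helper name try_ln is made up for this illustration): the real logarithm in the math module rejects zero and negative inputs, while the complex logarithm in cmath returns a non-real result.

```python
import cmath
import math


def try_ln(value):
    """Return ln(value), or None when the real logarithm is undefined."""
    try:
        return math.log(value)
    except ValueError:  # math.log raises this for zero and negative inputs
        return None


print(try_ln(21000))       # a positive cash flow works fine
print(try_ln(-50000))      # None: no real logarithm exists
print(try_ln(0))           # None: ln(0) is undefined
print(cmath.log(-50000))   # complex value whose imaginary part is pi
```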


Still, the data does not fit a line as well as it fits the other regression curves. This calls for an adjustment of the data. In essence, whenever the x data or the y data (or both) contains a non-positive value, we subtract that data set’s minimum value “minus one” from every point in it. Conceptually:


If min(X) ≤ 0, then:

P = min(X) -1

X_adj = X – P


If min(Y) ≤ 0, then:

Q = min(Y) -1

Y_adj = Y – Q


Why the “minus 1”?


If we only subtract the minimum, the lowest adjusted amount would be 0. But ln(0) is undefined (ln(x) heads toward negative infinity as x approaches 0), which most calculators would not work with.


If we subtract the minimum minus one, the lowest adjusted amount would be 1. ln(1) = 0. Beautiful!
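
The adjustment rule can be sketched in a few lines of Python. The function names shift_amount and adjust are made up for this illustration, and the toy data is hypothetical (it is not the cash flow data from this post).

```python
def shift_amount(values):
    """Return min(values) - 1 if the minimum is non-positive, else 0."""
    lowest = min(values)
    return lowest - 1 if lowest <= 0 else 0


def adjust(values):
    """Shift the data so its smallest value becomes 1 when a shift is needed."""
    p = shift_amount(values)
    return [v - p for v in values], p


# Hypothetical toy data with a negative value and a zero
adjusted, p = adjust([-4, 0, 3, 10])
print(p)         # -5
print(adjusted)  # [1, 5, 8, 15]

# All-positive data is left alone
unchanged, p2 = adjust([2, 5, 9])
print(p2)        # 0
```

Returning the shift along with the adjusted data matters, because the same amount has to be added back after the curve is fitted.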


Example: Adjusting the Raw Data


Raw data:

X    Y
1    -50000
2    -3000
3    9000
4    16000
5    18000
6    21000


min(X) = 1 > 0. No adjustment is needed.

min(Y) = -50000 ≤ 0. An adjustment is needed.


Let Q = min(Y) – 1 = -50000 – 1 = -50001


Adjusted data:


X    Y         Y’ = Y – Q    ln(Y’) (fix 3)
1    -50000    1             0.000
2    -3000     47001         10.758
3    9000      59001         10.985
4    16000     66001         11.097
5    18000     68001         11.127
6    21000     71001         11.170
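
For anyone who wants to double check the table without a calculator, this short Python sketch rebuilds it from the raw data. The variable names are only for illustration.

```python
import math

xs = [1, 2, 3, 4, 5, 6]
ys = [-50000, -3000, 9000, 16000, 18000, 21000]

# Q = min(Y) - 1 when min(Y) <= 0, otherwise no shift is applied
q = min(ys) - 1 if min(ys) <= 0 else 0
ys_adj = [y - q for y in ys]
ln_ys_adj = [round(math.log(v), 3) for v in ys_adj]

print("Q =", q)
for row in zip(xs, ys, ys_adj, ln_ys_adj):
    print("{:>2} {:>7} {:>6} {:>7.3f}".format(*row))
```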


Let’s fit the adjusted curve.


Keep in mind that:


Y’ = Y – Q

Y’ = Y – (-50001)

Y’ = Y + 50001

Y = Y’ – 50001





Fitting the Curve


Determine which one of the four regressions (linear, exponential, logarithmic, or power) best fits the adjusted data points. Then predict the actual cash flow (Y) for period 7.


Here is the data. We are going to use the data sets X and Y’.



X    Y         Y’ = Y – Q (Q = -50001)
1    -50000    1
2    -3000     47001
3    9000      59001
4    16000     66001
5    18000     68001
6    21000     71001


Correlation Coefficients:


Model          Correlation (fix 4)
Linear         0.8477
Exponential    0.6772
Logarithmic    0.9520
Power          0.8290


Calculations were made with an HP 14B calculator. The best fit is the one whose correlation has the absolute value closest to 1. Correlations near -1 or +1 are better than correlations near 0.
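
The correlations can also be checked off the calculator. Below is a minimal Python sketch that computes the correlation coefficient of each linearized model on X and Y’. The function name correlation and the dictionary layout are made up for this illustration; the results should agree with the HP 14B values to about four decimal places.

```python
import math


def correlation(us, vs):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)


xs = [1, 2, 3, 4, 5, 6]
ys_adj = [1, 47001, 59001, 66001, 68001, 71001]  # Y' = Y + 50001
ln_x = [math.log(x) for x in xs]
ln_y = [math.log(y) for y in ys_adj]

# Each model is a straight line in its transformed variables
correlations = {
    "linear": correlation(xs, ys_adj),
    "exponential": correlation(xs, ln_y),
    "logarithmic": correlation(ln_x, ys_adj),
    "power": correlation(ln_x, ln_y),
}
best = max(correlations, key=lambda name: abs(correlations[name]))

for name, r in correlations.items():
    print("{:<12} {:.4f}".format(name, r))
print("best:", best)
```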


The best fit is the logarithmic fit. Hence the curve fit is:


Y’ ≈ 9618.6738 + 38498.9034 * ln(X)



Remember that Y’ = Y + 50001. Then:


Y + 50001 ≈ 9618.6738 + 38498.9034 * ln(X)

Y ≈ -40382.3262 + 38498.9034 * ln(X)


To predict for X = 7:

Y ≈ -40382.3262 + 38498.9034 * ln(7)

Y ≈ 34533.0807


The beginning of period 7 will have a projected cash flow of about 34,533 (in the currency of your choice).
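
The whole process (adjust, fit the logarithmic curve, undo the adjustment, predict) can be sketched in Python as follows. This is only a cross-check written with plain least squares formulas and made-up variable names, not the HP 14B's internal method.

```python
import math

xs = [1, 2, 3, 4, 5, 6]
ys = [-50000, -3000, 9000, 16000, 18000, 21000]

# Adjust Y so every value is at least 1: Y' = Y - Q with Q = min(Y) - 1
q = min(ys) - 1
ys_adj = [y - q for y in ys]

# Logarithmic fit Y' = B' + M * ln(X) is a straight line in ln(X)
ln_x = [math.log(x) for x in xs]
n = len(xs)
mean_u = sum(ln_x) / n
mean_v = sum(ys_adj) / n
m = (sum((u - mean_u) * (v - mean_v) for u, v in zip(ln_x, ys_adj))
     / sum((u - mean_u) ** 2 for u in ln_x))
b_adj = mean_v - m * mean_u

# Undo the adjustment: Y = Y' + Q, so only the intercept changes
b = b_adj + q


def predict(x):
    """Projected cash flow Y at the beginning of period x."""
    return b + m * math.log(x)


print("M  =", round(m, 4))
print("B' =", round(b_adj, 4))
print("B  =", round(b, 4))
print("Y(7) =", round(predict(7), 2))
```

Because the shift only moves the curve up or down, the slope M is the same for Y’ and Y, and only the intercept needs correcting.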



Eddie


All original content copyright, © 2011-2024. Edward Shore. Unauthorized use and/or unauthorized distribution for commercial purposes without express and written permission from the author is strictly prohibited. This blog entry may be distributed for noncommercial purposes, provided that full credit is given to the author.
