-
Notifications
You must be signed in to change notification settings - Fork 16
SHRIMP: Sub WeightedLinearCorr
[The following is a generalised function, used and reused by SQUID 2.50 on at least a couple of different occasions during a 'typical' data-reduction. As such, the inputs and outputs of the functions have generalised names. The identities of the specific inputs and outputs are defined when the function is called.]
The function calculates a Y-error weighted linear regression, where the errors of index-adjacent Y values are correlated. X-error values (usually time-stamps) are assumed zero.
WeightedLinearCorr(X, Y, N, SigmaRhoY, Slope, Inter, MSWD, Prob, SlopeSig, InterSig, SlopeInterCov, Bad)
X: Vector of length N containing X-values of a single variable, for which the errors are assumed zero.
Y: Vector of length N containing Y-values, for which the errors of index-adjacent values are correlated.
N: Integer defining the number of data-points submitted to the calculation.
SigmaRhoY: Array of size N x N defining the sigma-rho matrix associated with Y.
Slope: The slope calculated by error-weighted linear regression of Y against X.
Intercept: The Y-intercept (i.e. X = 0) calculed by error-weighted linear regression of Y against X.
MSWD: The calculated Mean Square of Weighted Deviates associated with the error-weighted linear regression of Y against X.
Prob: The calculated probability of fit associated with the MSWD and (N - 2) degrees of freedom.
SlopeSig: The 1-sigma absolute error of the Slope.
InterSig: The 1-sigma absolute error of the Inter.
SlopeInterCov: The covariance of Slope and Inter.
Bad: Boolean for the purpose of error-trapping during calculations.
Values of type Boolean
Bad
Values of type Integer
i, j, N
Values of type Double
Slope, Inter, MSWD, Prob, SlopeSig, InterSig, SlopeInterCov, SumSqWtdResids, Mx, Px, Py, Pxy, w, InvOm
Variables of type Variant (produced by matrix inversion, transposition, or multiplication)
FischerInv, InvOmega, Mm, ResidT
Array of size N x 1 (addressed [1 to N, 1 to 1] ) comprising values of type Double
Resid
Array of size 2 x 2 (addressed [1 to 2, 1 to 2] ) comprising values of type Double
Fischer
Array of size N x N (addressed [1 to N, 1 to N] ) comprising values of type Double
Omega
The first step is to construct the variance-covariance matrix Omega from the input sigma-rho matrix SigmaRhoY.
For i = 1 To N
For j = 1 To N
If i = j
Omega[i, i] = SigmaRhoY[i, j] ^ 2
ElseIf abs(i - j) > 1
Omega[i, j] = 0
ElseIf i < N
Omega[i, j] = SigmaRhoY[i, j] * SigmaRhoY[i, i] * SigmaRhoY[j, j]
Omega[j, i] = Omega[i, j]
End If
Next j
Next i
InvOmega is generated by inverting the matrix Omega. Ludwig does this using Excel's MINVERSE function (https://msdn.microsoft.com/en-us/library/microsoft.office.interop.excel.worksheetfunction.minverse.aspx).
InvOmega = MINVERSE(Omega)
Next, use the elements of InvOmega to calculate the sums for Slope and Inter:
Mx = 0
Px = 0
Py = 0
Pxy = 0
w = 0
For i = 1 To N
For j = 1 To N
InvOm = InvOmega[i, j]
w = w + InvOm
Px = Px + ( InvOm * ( X[i] + X[j] ) )
Py = Py + ( InvOm * ( Y[i] + Y[j] ) )
Pxy = Pxy + ( InvOm * ( ( (X[i] * Y[j] ) + ( X[j] * Y[i] ) ) )
Mx = Mx + ( InvOm * X[i] * X[j] )
Next j
Next i
Slope = ( ( 2 * Pxy * w ) - ( Px * Py ) ) / ( ( 4 * Mx * w ) - ( Px ^ 2 ) )
Inter = ( Py - ( Slope * Px ) ) / ( 2 * w )
Slope = (2 * Pxy * w - Px * Py) / (4 * Mx * w - Px ^ 2)
Inter = (Py - Slope * Px) / (2 * w)
Now construct and invert the Fischer matrix to calculate SlopeSig, InterSig and SlopeInterCov, and generate the column-vector Resid for the purpose of calculating MSWD and Prob:
Fischer[1, 1] = Mx
Fischer[1, 2] = Px / 2
Fischer[2, 1] = Px / 2
Fischer[2, 2] = w
FischerInv = MINVERSE(Fischer)
SlopeSig = sqrt( FischerInv[1, 1] )
InterSig = sqrt( FischerInv[2, 2] )
SlopeInterCov = FischerInv[1, 2]
For i = 1 To N
Resid[i, 1] = Y[i] - ( Slope * X[i] ) - Inter
Next i
At this point, the VBA code calls a generalised function aimed at calculating the sum of squared residuals. The function seems to cater for the specification of the one-dimensional array Resid as either a row or a column... but WeightedLinearCorr is the only subroutine that ever calls the function, and WeightedLinearCorr specifies that Resid is a column. So I've simplified the function (see below) to deal solely with the case where the maximum possible subscript of the second dimension of Resid (i.e. the number of columns) is 1, because I believe no other case is possible.
Arrays ResidT and Mm are generated from the arrays Resid and InvOmega via matrix transposition and multiplication, reminiscent of subroutine WtdAvCorr. Ludwig used the Excel-implemented functions TRANSPOSE (trivial: https://msdn.microsoft.com/en-us/library/office/ff196261.aspx) and MMULT (https://msdn.microsoft.com/en-us/library/office/ff835849.aspx), as well as the (legacy) Excel function FDIST (https://support.office.com/en-US/article/FDIST-function-ECF76FBA-B3F1-4E7D-A57E-6A5B7460B786) to calculate Prob:
ResidT = TRANSPOSE(Resid)
Mm = MMULT( MMULT(ResidT, InvOmega), Resid )
SumSqWtdResids = Mm[1]
MSWD = SumSqWtdResids / (N - 2)
Prob = FDIST(MSWD, (N - 2), 1E+9)
End Sub