-
Notifications
You must be signed in to change notification settings - Fork 16
SHRIMP: Sub FormulaEval
[The following is a generalised subroutine, called within Sub EqnInterp by SQUID 2.50 with respect to several different equations during a 'typical' data-reduction. As such, the inputs and outputs of the subroutine have generalised names. The identities of the specific inputs and outputs are defined when the subroutine is called.]
The subroutine takes a text-string written as an 'Excel-formatted' equation, including a prescribed format for referencing 'ratios of interest' (square brackets surrounding double-quotes: e.g. ["206/270"] if the specified equation intends to make use of that ratio). This algebraic expression is then evaluated numerically, with the Value obtained by re-interpolating the Dodson-interpolated values of each of the constituent isotopic ratios to the time-stamp corresponding to the grand mean of all the isotopic ratios involved in the equation. But of broader interest to the SHRIMP community is the generation of an associated fractional Error (expressed to users as a percentage), derived by perturbing in turn each of the fractional error of the individual (unduplicated) peak-heights used in the equation. The arithmetic of this peak-perturbation process has never previously been elucidated publicly.
Note that this routine appears within the 'WorksheetUtils' group of modules in SQUID, and much of its workings are specific and peculiar to Excel and VBA. To a significant extent, I have documented this code anyway, to try and give a sense of the behaviour that might require replication in Java. However, where the VBA code invokes additional subroutines and functions that are complex and (in my view) peripheral to what we are trying to do, I have skipped over them. These 'skip-overs' are flagged at each occurrence in the explanation below, so that they can be pursued and documented in future should it be necessary. In the meantime, note that the next effect is 'messiness' relating to apparently-unused variables: you won't know whether they are genuinely obsolete, or just 'skipped over' in my treatment below.
FormulaEval(Formula, EqNum, ScanNum, PkInterp, PkInterpFerr, EqValTmp, EqFerr)
Evaluate numerically an algebraic expression. Arrays PkInterp, PkInterpFerr must be indexed in order of scanning.
Formula: This is a text-string specifying the equation to be evaluated, in 'Excel format' and with specific syntax wherever a 'ratio of interest' is to be employed (e.g. ["206/238"] when the isotopic ratio 206/238 is included in the equation). Specific examples of text-strings representing Excel-format equations to be evaluated via FormulaEval include:
ln(["254/238"])
ln(["206/238"])
["206/238"]/["254/238"]^2
["238/196"]/["254/238"]^0.66
(0.03446*["254/238"]+0.868)*["248/254"]
EqNum: Integer defining the index-number of the equation to be evaluated, as an input. In 'U-Pb geochronology' mode, the candidate values span -4 to -1 ('Special U-Pb equations'), and then 1 to 50 ('General equations'), as per the originating subroutine EqnInterp. For context, the five specific examples above are re-cited below with their real EqNum values:
ln(["254/238"]) has EqNum = 1
ln(["206/238"]) has EqNum = 2
["206/238"]/["254/238"]^2 has EqNum = -1
["238/196"]/["254/238"]^0.66 has EqNum = -4
(0.03446*["254/238"]+0.868)*["248/254"] has EqNum = -3
ScanNum: Integer between 1 and sIndx (as defined in the originating subroutine EqnInterp), defining the index-number of the scan (for use in addressing the associated arrays PkInterp and PkInterpFerr), as an input.
PkInterp: Array of size sIndx x Nspecies (addressed [1 to sIndx, 1 to Nspecies]), as defined by the originating subroutine EqnInterp, corresponding to interpolated ion-count peak-height values (counts/second), as an input.
PkInterpFerr: Array of size sIndx x Nspecies (addressed [1 to sIndx, 1 to Nspecies]), as defined by the originating subroutine EqnInterp, corresponding to the fractional errors of the interpolated ion-count peak-height values (counts/second), as an input.
EqValTmp: The scan-specific calculated value of the numerically-evaluated algebraic expression, as an output.
EqFerr: The scan-specific calculated 1-sigma fractional error of EqValTmp, as an output.
Values of type Boolean
b, GotUnduped, IsEq, MTcell
Values of type Integer
Col1, Col2, d1, d2, j, k, MaxConsts, NdpPk, Nr, Nrn, Nrt, nt, Num1Denom2, PkOrd, q, RatCt, RatNum, Rct, RefEqNum, UndupPkOrd, VarLen
Values of type String
CellAddr, EqRef, f0, Form, LastPertEq, PertEq, PertF, s, tForm, tmp
Values of type Double
fDelt, fVar, PertVal
Arrays comprising values of type Boolean
ErColRef
BypassEmptyCell[-4 to 99] as Static, which apparently means that this particular local variable is to continue to exist (and retain its latest value) after termination of the procedure in which it is declared (https://msdn.microsoft.com/en-us/library/z2cty7t8.aspx).
Arrays comprising values of type String
Denom, ErColRefAddr, Numer, Rat (I have added 'Rat' here as Ludwig seems not to have declared it specifically)
Arrays comprising values of type Double
Num, Den, PratVal, RatVal[50]
Variables of type Variant
NumRes, tA, tB, tc, TD
FormulaEval starts with a subroutine aimed at finding any Named Excel cell-ranges which have been "mistakenly" enclosed inside the delimiters ["RangeName"]. But I am uncertain what Ludwig means by mistakenly, and this seems a sufficiently Excel-specific quirk that it can probably be ignored, at least to begin with. I have included the call, for completeness:
f0 = Formula
--StripBracketedRangeNames f0, Nrn
--Formula = f0
The next step is to audit the equation of interest, to see how many ratios it contains in total (i.e. counting all duplicates of a ratio in an expression, I think). As far as I can tell, there are two main types of 'bracketed' expression:
- isotope ratios, which can be specified either literally (e.g. ["206/238"] as per all of our documentation thus far) or via index (e.g. [d] where ratio [d] in the related Task window is specified as 206/238), and
- 'non'-isotope ratios, which span (a) equation-indexes (i.e. the output of foregoing equations used in subsequent equations), (b) Named cell-ranges in Excel, which include, but are not restricted to, 'literal' equation-indexes, and (c) data-column headings used elsewhere in SQUID's output.
So counting the cited ratios is a two-step process. In the below code-block, EqnRats is a vector array with length equal to the total number of equations defined for the Task, and for each those equations, EqnRats contains an integer value corresponding to the number of isotope ratios (i.e. type (1) above). I believe that NeqnTerms is the analogous vector array counting instead the number of equation-indexes, Named cell-ranges in Excel, and data-column headings (i.e. type (2) above). Note that, at least to begin with, nt = 0 for all of the example equations.
Nr = EqnRats[EqNum] --number of isotope ratios used (1 or 2 in my examples)
nt = NeqnTerms[EqNum] --number of equation-indices/Named ranges/column-headers used (0 in my examples)
Nrt = Nr + nt
The value of Nrt defines most of the arrays which are vectors of length Nrt (each addressed [1 to Nrt]):
ReDim Num[1 To Nrt], Den[1 To Nrt], PratVal[1 To Nrt] --Double
ReDim Numer[1 To Nrt], Denom[1 To Nrt], Rat[1 To Nrt], ErColRefAddr[1 To Nrt] --String
ReDim ErColRef[1 To Nrt] --Boolean
EqValTmp = 0
EqFerr = 0
RatCt = 0
Next comes a long 'For' loop identifying the positions of the delimiters in the strings, and their nature. Note in the code-block below, the following string constants ('sc') are used:
scBrQL to represent the character-pair [" --(i.e.'bracket quote left')
scBrQR to represent the character-pair "] --(i.e.'bracket quote right')
scPm to represent the character ± --(i.e. 'plus-minus')
scBrQL and scBrQR cannot be defined explicitly because the double-quote character is used independently to define string limits.
BrakType[RatNum, EqNum] contains integer values which identify the content-type of each bracketed expression (RatNum) within each equation (EqNum), according to the type of delimiter used. Candidate content-types span the range of types (1) and (2) above. We may not ever cover all of these candidate values: at present, I know that BrakType = 1 everywhere in my five example equations, from which I infer that BrakType = 1 for isotope ratios, and BrakType = {something else} for equation-indexes, Named cell-ranges, and column-headers.
Indices for each species in a ratio are then extracted from EqPkOrd, which is a three-dimensional array with a 'depth' of 2, with one row for each equation defined for the Task, and one column for each occurrence of a 'ratio of interest' used in that equation (i.e. all duplicate occurrences of a 'ratio of interest' are counted separately). The third dimension of the array corresponds to numerators when 1, and denominators when 2. All values within this array are integers, corresponding to the indices of the numerators and denominators (within the list [1 to Nspecies]: see text at the beginning of Step 3 for a reminder).
SqBrakExtract is one of Ludwig's enormous number of string-related utilities, which I really don't want to delve into unless we have no choice. Suffice it to say that it first argument is the formula-string including square-brackets, and its second argument is a modified string with the square brackets removed (but double-quotes retained, I think).
Ludwig permits specific reference to error-columns associated with isotope ratios simply by using the plus-minus character ahead of the isotope-ratio citation. For example, to use the %err of ["206/238"] in an equation, a user would write [±"206/238"], and SQUID checks to see whether the first character in the square-bracket-stripped equation is ±, or not, in order to populate the Boolean array ErColRef. (Note that ErColRef = FALSE everywhere in my example equations, so I intend to pass over this if-clause until we have done more work on Task- and Equation-auditing.)
Likewise, I intend to initially pass over the Else clause dealing with equation-indexes (IsEq = TRUE, pertaining to the use of prior equation-outputs as subsequent equation-inputs), which is powerful, but not immediately relevant to our arithmetic.
For RatNum = 1 To Nrt
Col1 = InStr(Formula, scBrQL) --position of character in string, counted from left
Col2 = InStr(Formula, scBrQR)
d1 = InStr(Formula, "[")
d2 = InStr(Formula, "]")
If BrakType[RatNum, EqNum] = 1 --isotope ratio [type (1) above], main case
RatCt = 1 + RatCt
j = EqPkOrd[EqNum, RatNum, 1] --numerator
k = EqPkOrd[EqNum, RatNum, 2] --denominator
SqBrakExtract Formula, s --modify string Formula by removing square brackets
If LeftString[s, 1] = scPm --if leftmost 1 character of String s is ±
ErColRef[RatNum] = TRUE
Else
ErColRef[RatNum] = FALSE
End If
IsEq = FALSE
If ErColRef[RatNum] = TRUE --scenario not immediately relevant
--code to be added
Else
Num[RatCt] = PkInterp[ScanNum, j]
Den[RatCt] = PkInterp[ScanNum, k]
If Den[RatCt] = 0
EqValTmp = 1E+32
EqFerr = 0
Exit Sub
End If
RatVal[RatCt] = Num[RatCt] / Den[RatCt]
--Bodorkos 2017-04-04: next line inserted to force VBA-Java match:
RatVal[RatCt] = SigDigStrDbl( RatVal[RatCt], 12 ) --12 sig. digits
Rat[RatCt] = StR( RatVal[RatCt] )
s = Rat[RatCt]
End If
Else --equation-index/Named cell-range/column-header [type (2) above], consider later
IsEq = TRUE
--code to be added
End If
Formula = LeftString(Formula, d1 - 1) & s & MidString(Formula, d2 + 1)
--i.e. construct revised Formula string by concatenating, in sequence:
--1. All characters in string Formula, on the left-hand side of d1
--2. String s
--3. All characters in string Formula on the right-hand side of d2
Next RatNum
The next step is to evaluate the formula, with error-trapping:
NumRes = Evaluate(Formula)
If ( IsError(NumRes) = TRUE ) Or ( IsNum(NumRes) = FALSE )
EqValTmp = _[SQUID_error_value]_
Exit Sub
End If
EqValTmp = NumRes
If EqValTmp = 0
Exit Sub
Else --Bodorkos 2017-04-04 inserted Else to force VBA-Java match:
NumericRes = SigDigStrDbl( NumericRes, 12 ) --12 sig. digits
End If
The final step is the numerical perturbation procedure, to generate an uncertainty associated with the value. NoDupePkN is (I believe) a vector array with length equal to the total number of equations defined for the Task, and for each equation, NoDupePkN contains an integer value corresponding to the number of distinct species used in the 'ratios of interest' invoked in the equation. Herein, that integer is designated NdpPk (= 2 for my first two equations, and 3 for each of the final three). Note that the array EqPkUndupeOrd[EqNum, NdpPk] is a two-dimensional analogue of EqPkOrd, with one row for each equation defined for the Task, and one column for distinct species used in that equation, containing integer values corresponding to the indices of those species, in order of measurement. Values for my example equations are given below:
- ln(["254/238"]), which is EqNum = 1, has NdpPk = 2, and EqPkUndupeOrd = 7 9 (i.e. 238 and 254)
- ln(["206/238"]), which is EqNum = 2, has NdpPk = 2, and EqPkUndupeOrd = 4 7 (i.e. 206 and 238)
- ["206/238"] / ["254/238"] ^ 2, which is EqNum = -1, has NdpPk = 3, and EqPkUndupeOrd = 4 7 9 (i.e. 206, 238 and 254)
- ["238/196"] / ["254/238"] ^ 0.66, which is EqNum = -4, has NdpPk = 3, and EqPkUndupeOrd = 1 7 9 (i.e. 196, 238 and 254)
- ( 0.03446 * ["254/238"] + 0.868 ) * ["248/254"], which is EqNum = -3, has NdpPk = 3, and EqPkUndupeOrd = 7 8 9 (i.e. 238, 248 and 254)
The code proceeds as follows:
fVar = 0
PertF = Formula
LastPertEq = "" --i.e. an empty string
For NdpPk = 1 To NoDupePkN[EqNum]
GotUnduped = FALSE
PertEq = f0
UnDupPkOrd = EqPkUndupeOrd[EqNum, NdpPk]
For RatNum = 1 To Nr
If ErColRef[RatNum] = FALSE
PratVal[RatNum] = RatVal[RatNum]
End If
Next RatNum
For RatNum = 1 To Nr
If ErColRef[RatNum] = FALSE
For Num1Denom2 = 1 To 2
PkOrd = EqPkOrd[EqNum, RatNum, Num1Denom2]
If PkOrd = UnDupPkOrd
GotUnduped = True
If Num1Denom2 = 1 --numerator
PratVal[RatNum] = Num[RatNum] * 1.0001 / Den[RatNum]
Else
PratVal[RatNum] = Num[RatNum] / ( Den[RatNum] * 1.0001 )
End If
Exit For -- because numerator <> denominator
End If
Next Num1Denom2
End If
Next RatNum
For RatNum = 1 To Nr
d1 = InStr(PertEq, "[")
d2 = InStr(PertEq, "]")
If ErColRef[RatNum] = TRUE
s = Range( ErColRefAddr[RatNum] ) --not immediately relevant
Else
--Bodorkos 2017-04-04: next line inserted to force VBA-Java match:
PratVal[RatNum] = SigDigStrDbl( PratVal[RatNum], 12 ) --12 sig. digits
s = StR( PratVal[RatNum] )
End If
PertEq = Left$(PertEq, d1 - 1) & s & Mid$(PertEq, d2 + 1)
--three-part concatenation as above
Next RatNum
If GotUnduped = TRUE
--The following lines deals with the (not immediately relevant) possibility
--that a named user-equation is present: ignore it for the moment
--Subst PertEq, scBrQL, , scBrQR
PertVal = Evaluate(PertEq) --numeric result of perturbing one peak
--Bodorkos 2017-04-04: next line inserted to force VBA-Java match:
PertVal = SigDigStrDbl( PertVal, 12 ) --12 sig. digits
If EqValTmp = 0
EqValTmp = 1E-32
End If
fDelt = PertVal / EqValTmp - 1 --fractional difference
--In Excel, I got better results with fDelt = (PertVal - EqValTmp) / EqValTmp
--Ludwig: "Kludge to evade an Excel2001 bug" here - line above or lines below?
--tA = InterpFerr[ScanNum, PkOrd] --from Ludwig, but this line is WRONG, because PkOrd
--was (re)set above when RatNum = Nr (i.e. final RatNum). So if NdpPk (i.e. UndupPkOrd)
--is NOT part of that final ratio, the expression for tA CANNOT select the correct
--InterpFerr value. The bug affects all equations with Nr > 1 that use a species NOT
--appearing in the final ratio. All three of my example equations where Nr = 2 are
--affected, as follows:
--206 is assigned InterpFerr from 238 in formula ["206/238"]/["254/238"]^2
--196 is assigned InterpFerr from 238 in formula ["238/196"]/["254/238"]^0.66
--238 is assigned InterpFerr from 254 in formula (0.3446*["254/238"]+0.868)*["248/254"]
--I believe the following line implements the calculation correctly:
tA = InterpFerr[ScanNum, UnDupPkOrd] --this line is RIGHT
tB = 1.0001 - 1 --note that Excel 16-bit floating binary gives 9.9999999999989E-05
--The following two Ludwig lines yield poorly scaled arithmetic, and are
--commented out (05 Apr 2017):
--tc = fDelt * fDelt
--TD = (tA / tB) ^ 2 * tc
--Bodorkos (05 Apr 2017) implemented alternative lines that are
--algebraically equivalent (for TD) and better scaled:
tc = fDelt * tA / tB
tD = tc * tc
--Now the Ludwig code resumes:
fVar = fVar + TD --fractional internal variance
LastPertEq = PertEq
End If
Next NdpPk
EqFerr = sqrt(fVar)
End Sub