Specification of Information

Parameter types
Definition of time-scale
Parameters
Date functions
Radiocarbon calibration
Sapwood estimates for dendrochronology
Parameterisation of date information
Cross referencing

Parameter types

The parameters used in models can be of three different types:

Number - any general parameter
Date - parameter relating directly to time
Interval - difference between date parameters

The program attempts to keep track of types through the calculation process. The functions Number() and Date() can be used to force parameters to be of a specific type.

Definition of time-scale

Because of the many different applications of OxCal, it is important that there is a well-defined internal time-scale. For this purpose cal BP is not suitable as it is an integer time-scale and refers only to whole years - not to specific dates and times. The internal time-scale in OxCal is therefore based on the Gregorian calendar:

Intervals are given in Gregorian years (that is 365.2425 days)
Dates are given as one plus the number of Gregorian years after 00:00:00 UT on Monday 0001-01-01 (ISO-8601 definition; this corresponds to Julian Day 1721425.5, Julian Date 0001-01-03 and is defined as the start of the Gregorian Epoch)

This gives a continuous real number time-scale (G), which can be directly related to BC/AD dates. Note however that G-1 corresponds to the start of 2 BC and G1 corresponds to the start of AD 1, since there is no year zero in the BC/AD scheme.

Given the widespread use of cal BP as a time-scale, this is defined here as a real time-scale from the middle of AD 1950 (i.e. G1950.5 using the internal date scale).

Special pre-processor functions are included for entering mid-year dates (see pre-processor calculations). The following examples show their use:

AD(1066) gives G1066.5 which is the middle of AD 1066
BC(12) gives G-10.5 which is the middle of 12 BC
CE(1812) gives G1812.5 which is the middle of 1812 CE
BCE(79) gives G-77.5 which is the middle of 79 BCE
CE(-78) gives G-77.5 which is the middle of 79 BCE or ISO-8601 year -78
calBP(100) gives G1850.5 which is the middle of AD 1850

Further details and conversions between this format and other time-scales are given in the section on the calendar definitions.
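As a rough illustration of the arithmetic behind these examples, the following Python sketch maps calendar years onto the G scale and back to a BC/AD label. The Python function names are hypothetical stand-ins chosen for this example and are not part of OxCal; only the mapping itself is taken from the examples above.

```python
import math

def ce(year):
    """Middle of an ISO-8601 year (which has a year zero) on the G scale."""
    return year + 0.5

def ad(year):
    """AD years coincide with positive ISO-8601 years."""
    return ce(year)

def bce(year):
    """N BCE is ISO-8601 year 1-N, because BC/AD has no year zero."""
    return ce(1 - year)

def bc(year):
    return bce(year)

def cal_bp(age):
    """cal BP counts back from the middle of AD 1950 (G1950.5)."""
    return 1950.5 - age

def bc_ad_label(g):
    """Name the BC/AD year in which a G-scale date falls."""
    iso_year = math.floor(g)
    return f"AD {iso_year}" if iso_year >= 1 else f"{1 - iso_year} BC"

print(ad(1066), bc(12), cal_bp(100), bc_ad_label(-10.5))
```

The label function reflects the rule that G0 to G1 is 1 BC and G1 to G2 is AD 1.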

On plot axes OxCal shows the start of the year. When reporting in BC/AD the transition between 1BC and 1AD is shown as '1BC/1AD'. The spacing between the labels is regular. Thus the axis on a plot with labels every century will show:

     .     .     .     .     .     .     .
    300   200   100 1BC/1AD 101   201   301

For a one year spacing in the labels the axis is shown as:

     .     .     .     .     .     .     .
     3     2     1  1BC/1AD  2     3     4

which effectively means:

     .     .     .     .     .     .     .
     | 3BC | 2BC | 1BC | AD1 | AD2 | AD3 |

If the same were plotted using fractional Gregorian years (±CE or OxCal internal format) it would be shown as:

     .     .     .     .     .     .     .
    -2    -1     0     1     2     3     4

and when reporting the same dates as BP the centre of the year is given - thus the equivalent axis would be:

        .     .     .     .     .     .   
       1952  1951  1950  1949  1948  1947

When reporting dates and ranges in the BC/AD or BCE/CE format OxCal will give the year in which the event occurs (without a year zero). If ISO-8601 is chosen all dates will be reported CE but a year zero will be used (as is defined in this standard and as is the practice in astronomy) and negative year numbers before that. If the Gregorian year format is chosen fractional years are given in the internal format described above. All files and data-sets use the fractional format.

For most purposes involving radiocarbon this is irrelevant but future versions of OxCal may give alternative date format outputs and they will be based on the definitions laid out here and in the section on calendar definitions.

Parameters

Parameters can be introduced into models by assigning them a parameter name. The main functions that are used are:

The type function Number() allows a general numerical parameter to be defined by an expression. This is the equivalent of the Date() function described in the next section. The other functions given here allow parameters to be assigned specific probability distribution functions, usually to describe the likelihood for specific functions: the N() function for normal distributions, the U() and Top_Hat() functions for uniform distributions and P() for more general functions. Finally Prior() allows distributions to be defined numerically using some prior information.

For any of the methods described here, or below, two different formats of parameter definition are allowed. OxCal will also try to work out the type from the context if you do not specify Date() or Number(). The following are all equivalent:

    a=200;
    a=Number(200);
    Number("a",200);

The format based on the equality operator is better suited to complex mathematical calculations but the parameter names have to be simple strings and cannot contain operators like + - / * ( ). Examples of parameter definitions are:

// Simple numbers
n1=200;                  n2=166.66*1.2;

// Normal likelihood with a mean of 200 and a sigma of 20
a1=N(200,20);            N("a2",200,20); 

// Uniform likelihood between 180 and 220
b1=U(180,220);           U("b2",180,220); 
b3=Top_Hat(200,20);      Top_Hat("b4",200,20); 

// Poisson distribution with a mean of 10
p1=Pois(10);             Pois("p2",10);

// Poisson distribution but scaled by a factor of 3
p3=Pois(10,3);           Pois("p4",10,3);

// log-normal distribution with a mean of about 1000
l1=LnN(ln(1000),ln(1.1)); LnN("l2",ln(1000),ln(1.1));

// Exponentially falling likelihood with time constant 10
d1=P(0,100,exp(-d1/10)); P("d2",0,100,exp(-d2/10)); 

// Rapidly falling distribution expressed as an inline array
d3=P(-1,11,[0,1024,512,256,128,64,32,16,8,4,2,1,0]); 
P("d4",-1,11,[0,1024,512,256,128,64,32,16,8,4,2,1,0]);

The reason that you might wish to introduce general parameters of this form is that they can be used in subsequent calculations (see operations on probability distributions).

Maths

The parameters of any Bayesian model are all treated in a similar way. Each parameter has a value t_i and might also have observations or other information about it which are conceptually denoted as y_i. Each parameter introduced is assumed to have a uniform uninformed prior p(t_i). Where there is other information, this can be used to define a likelihood function p(y_i|t_i) for that parameter.

The functions N(), U(), Top_Hat(), P() and Prior() can be used to directly assign such likelihoods:

Function	Likelihood
N(r_i,s_i)	p(y_i|t_i) ∝ (1/(s_i√(2π))) exp(-(t_i - r_i)²/(2 s_i²))
U(r_i,s_i)	p(y_i|t_i) ∝ H(t_i-r_i)H(s_i-t_i)
Top_Hat(r_i,s_i)	p(y_i|t_i) ∝ H(t_i-r_i+s_i)H(r_i+s_i-t_i)

where H(x) is the Heaviside step function which is 0 when x<0, 1/2 when x=0 and 1 when x>0. The P() function defines a likelihood function directly, either as an expression, or as a literal array. The Prior() function can be used to provide such a function in numerical form as saved in a file. Usually the Prior() function is used to define a non-uniform prior but it can equally be used to provide a particular functional form for a likelihood where the prior is defined in some other way. Assuming no other prior information, a likelihood can be viewed as an informed prior, since for constant p(t):

p(t_i|y_i) ∝ p(y_i|t_i)p(t_i) ∝ p(y_i|t_i)
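The likelihoods assigned by N(), U() and Top_Hat() can be sketched in Python as follows. This is only an illustration of the formulae given here, not OxCal's implementation, and the Python function names are invented for the example.

```python
import math

def heaviside(x):
    """Step function: 0 for x<0, 1/2 at x=0, 1 for x>0."""
    if x < 0:
        return 0.0
    if x == 0:
        return 0.5
    return 1.0

def normal_likelihood(t, r, s):
    """N(r,s): normal likelihood with mean r and standard deviation s."""
    return math.exp(-(t - r) ** 2 / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

def uniform_likelihood(t, r, s):
    """U(r,s): flat between the limits r and s."""
    return heaviside(t - r) * heaviside(s - t)

def top_hat_likelihood(t, r, s):
    """Top_Hat(r,s): flat between r-s and r+s."""
    return heaviside(t - r + s) * heaviside(r + s - t)

# U(180,220) and Top_Hat(200,20) describe the same range
for t in (170, 180, 200, 220, 230):
    print(t, uniform_likelihood(t, 180, 220), top_hat_likelihood(t, 200, 20))
```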

Date functions

Special functions are provided for specifying information about the time of events - these have the type of Date. The main functions currently implemented are:

The type function Date() is used to specify a date - the internal date format should be used (see calendar definition) for numerical values; the likelihood functions described above can also be used. The Age() function performs the same task but is for expressing dates before some specific year. That year is assumed to be the middle of AD1950 (G1950.5) if no other value is given. Using the Year= statement you can set this to some other date (again using the format described in calendar definition) either for a specific function or on a more general basis.

The R_Date(), R_Simulate() and R_Combine() functions will be covered in the next section.

C_Date() is a legacy from previous versions of OxCal and specifies a normally distributed likelihood; depending on the options set the C_Date() function can use either BP or the internal Gregorian fractional years (G). C_Simulate() can be used for simulating dates for a method that produces normally distributed likelihoods. The Sapwood() function will be dealt with below.

The function C_Combine() can be used to directly combine dates of the form C_Date() or C_Simulate().

The following examples illustrate the range of different methods of expressing dates.

    a1=Date(1066.5);       a2=Date(AD(1066));        a3=Age(884);       a4=Age(934){Year=AD(2000);};
    b1=Date(N(1066.5,10)); b2=Date(N(AD(1066),10));  b3=Age(N(884,10)); b4=C_Date(AD(1066),10);
    c1=Date(U(AD(1066),AD(1093))); Date("c2",U(AD(1066),AD(1093)));
    d2=C_Simulate(AD(1066),10);    d3=Date(N(AD(1066)+randN()*10,10));

Maths

Date parameters are treated in exactly the same way as any other parameters and mathematically the Date() and Number() functions are identical. The Age() function assumes that any value or likelihood distribution relates to the time before present (defined by the Year attribute), Y.

The following show the effect of the Age() function on the likelihood:

Function	Likelihood
Age(N(r_i,s_i))	p(y_i|t_i) ∝ (1/(s_i√(2π))) exp(-(Y-t_i - r_i)²/(2 s_i²))
Age(U(r_i,s_i))	p(y_i|t_i) ∝ H(Y-t_i-r_i)H(s_i-Y+t_i)
Age(Top_Hat(r_i,s_i))	p(y_i|t_i) ∝ H(Y-t_i-r_i+s_i)H(r_i+s_i-Y+t_i)
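A short Python sketch of the Age() transformation, assuming only the relation described above (the age is measured back from the reference year Y). The function names are hypothetical; the numbers reproduce the a3, a4 and b3 examples given earlier.

```python
import math

DEFAULT_YEAR = 1950.5  # middle of AD 1950 on the G scale

def age_to_date(age, year=DEFAULT_YEAR):
    """Convert an age before the reference year into a G-scale date."""
    return year - age

def age_normal_likelihood(t, r, s, year=DEFAULT_YEAR):
    """Likelihood of date t for Age(N(r,s)) with reference year Y."""
    return math.exp(-(year - t - r) ** 2 / (2 * s ** 2)) / (s * math.sqrt(2 * math.pi))

print(age_to_date(884))          # Age(884) with the default year
print(age_to_date(934, 2000.5))  # Age(934){Year=AD(2000);}
```

Both calls give the same date, the middle of AD 1066, which is why the a3 and a4 examples are equivalent.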

The function C_Simulate() calculates a likelihood as:

Function	Likelihood
C_Simulate(r_i,s_i)	p(y_i|t_i) ∝ (1/(s_i√(2π))) exp(-(t_i - r_i-ε)²/(2 s_i²)) where ε is sampled from N(0,s_i)

The function C_Combine() evaluates a combined error weighted mean and standard error following the method of Ward and Wilson 1978. If we have several determinations r_i with standard errors s_i we calculate a combined determination r_c with standard error s_c:

r_c = (Σ r_i/s_i²)/(Σ 1/s_i²)
s_c = (Σ 1/s_i²)^(-1/2)
T = Σ (r_i-r_c)²/s_i²

The likelihood distribution is then as for the C_Date() function. The T value has a χ² distribution on n-1 degrees of freedom as discussed in Ward and Wilson 1978.

If there is an additional component of uncertainty s_a, this is added as:

s_c = √[(Σ 1/s_i²)^(-1)+s_a²]

Radiocarbon calibration

The likelihood distribution for calibrated radiocarbon dates is more complicated than that for most dating methods because it is based on several different types of information:

The radiocarbon measurement for the sample (r_m) and its associated uncertainty (s_m)
You may wish to combine several of these measurements together for one sample.
The array of radiocarbon measurements (r) and their uncertainties (s) that form the data for the radiocarbon calibration curve. This curve is normally formulated so that the radiocarbon content and its uncertainty are functions of time (r(t) ± s(t))
In some cases the radiocarbon measurements are expected to be offset from this curve by ΔR, with a normally distributed likelihood.
Finally the sample may draw on more than one calibration curve if the carbon comes from a mixed reservoir (as for example in the case of humans with a mixed diet).

OxCal provides functions for dealing with all of these scenarios. The functions are:

Calibration likelihoods

The R_Date() function provides the normal calibration against the default curve.

R_F14C() allows you to enter the radiocarbon concentration in terms of F14C as defined by Reimer et al. 2004. R_Simulate() allows you to sample the date you would expect to get from a radiocarbon lab for a sample of a particular date; the 'Cal Date' parameter is either in Gregorian fractional years (default) or cal BP if that option is set.

R_Combine() allows you to combine dates and add in an additional component of systematic uncertainty (for example if you are dealing with short-lived material you may wish to add an additional uncertainty of about 8 ¹⁴C years - see Stuiver et al. 1998). A χ² test (Ward and Wilson 1978) will also be performed.

  R_Date("OxA-2000",2000,20);
  R_Combine("Jar 1a")
  {
   R_Date("OxA-2001",2010,20);
   R_Date("OxA-2002",1970,20);
   R_Date("OxA-2003",2020,20);
  };
  R_Simulate("Mid AD10",AD(10),20);

Here is an example of the use of the functions for post-bomb calibration.

 Options()
 {
  Resolution=0.2;
  Curve="bomb21nh1.14c";
 };
 Plot()
 {
  Curve("Bomb21NH1","bomb21nh1.14c")
  {
   Reservoir(0.25,0.1);
  };
  R_F14C("OxA-8000",1.345,0.003);
  R_Combine("Jar 1b")
  {
   R_F14C("OxA-8001",1.345,0.003);
   R_F14C("OxA-8002",1.347,0.003);
   R_F14C("OxA-8003",1.341,0.003);
  };
  R_Simulate("Mid 1980",AD(1980),20);
 };

Maths

If the curve function is defined as a radiocarbon concentration r(t) ± s(t) then the calibrated likelihood distribution for a radiocarbon determination of r_i ± s_i is proportional to:

p(y_i|t_i) ∝ exp[-(r_i-r(t_i))²/(2(s_i²+s²(t_i)))]/√(s_i²+s²(t_i))

In practice this calculation is performed at increments defined by the resolution (which is every five years by default). If we have a ΔR defined to be r_d ± s_d then the likelihood for the reservoir offset is given by:

p(y_d|t_d) ∝ exp[-(r_d-t_d)²/(2 s_d²)]/s_d

and the likelihood for the calibrated date r_i ± s_i is given by:

p(y_i|t_i) ∝ exp[-(r_i-r(t_i)-t_d)²/(2(s_i²+s²(t_i)))]/√(s_i²+s²(t_i))

This is the likelihood that is used in OxCal during all MCMC analysis in line with the recommendations of Jones and Nicholls 2002. If the ΔR applies to only one calibration we can combine the two distributions to arrive at:

p(y_i|t_i,y_d) ∝ exp[-(r_i-r(t_i)-r_d)²/(2(s_i²+s²(t_i)+s_d²))]/√(s_i²+s²(t_i)+s_d²)

This is the distribution evaluated in the calculation stage of the program, and it is the one reported when no MCMC analysis is required.
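The calibration likelihood, including the form with ΔR folded in, can be illustrated with a small Python sketch evaluated on a grid at the default five year resolution. Everything here is an assumption made for illustration: the straight-line toy_curve is hypothetical and stands in for a real calibration data-set, the function names are invented, and the sketch works in radiocarbon years for readability rather than in F14C.

```python
import math

def toy_curve(t):
    """Hypothetical curve: radiocarbon age and uncertainty at G-scale date t."""
    return 1950.5 - t, 15.0

def calibration_likelihood(t, r_i, s_i, curve=toy_curve, r_d=0.0, s_d=0.0):
    """Likelihood of date t for a determination r_i ± s_i, offset by ΔR = r_d ± s_d."""
    r_t, s_t = curve(t)
    variance = s_i ** 2 + s_t ** 2 + s_d ** 2
    return math.exp(-(r_i - r_t - r_d) ** 2 / (2 * variance)) / math.sqrt(variance)

def calibrate(r_i, s_i, start, end, resolution=5.0, **kwargs):
    """Evaluate the likelihood on a regular grid and normalise it to sum to 1."""
    steps = int(round((end - start) / resolution))
    grid = [start + k * resolution for k in range(steps + 1)]
    values = [calibration_likelihood(t, r_i, s_i, **kwargs) for t in grid]
    total = sum(values)
    return grid, [v / total for v in values]

grid, dist = calibrate(2000, 20, -300, 200)
print(grid[dist.index(max(dist))])  # most likely grid point without an offset
grid, dist = calibrate(2000, 20, -300, 200, r_d=200, s_d=30)
print(grid[dist.index(max(dist))])  # most likely grid point with ΔR = 200 ± 30
```

With a real curve the result is typically multi-modal, which is why the distribution is kept as an array rather than summarised by a mean and standard deviation.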

The function R_Simulate(r_i,s_i) calculates a likelihood as:

p(y_i|t_i) ∝ exp[-(r(r_i)-r(t_i)-ε)²/(2(s_i²+s²(t_i)))]/√(s_i²+s²(t_i))
where ε is sampled from N(0,s_i)

The function R_Combine() evaluates a combined error weighted mean and standard error following the method of Ward and Wilson 1978. If we have several determinations r_i with standard errors s_i we calculate a combined determination r_c with standard error s_c:

r_c = (Σ r_i/s_i²)/(Σ 1/s_i²)
s_c = (Σ 1/s_i²)^(-1/2)
T = Σ (r_i-r_c)²/s_i²

The likelihood distribution is then as for the R_Date() function. The T value has a χ² distribution on n-1 degrees of freedom as discussed in Ward and Wilson 1978.

If there is an additional component of uncertainty s_a, this is added as:

s_c = √[(Σ 1/s_i²)^(-1)+s_a²]
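As a worked example of the combination formulae, the Python sketch below applies them to the three determinations in the "Jar 1a" example. It applies the arithmetic directly to the numbers as entered; since OxCal works in terms of radiocarbon concentration, its own output may differ slightly. The function name combine is hypothetical.

```python
import math

def combine(determinations, s_a=0.0):
    """Error weighted mean of (r_i, s_i) pairs, with optional extra uncertainty s_a.

    Returns (r_c, s_c, T, degrees_of_freedom).
    """
    weights = [1.0 / s ** 2 for _, s in determinations]
    weight_sum = sum(weights)
    r_c = sum(w * r for w, (r, _) in zip(weights, determinations)) / weight_sum
    s_c = math.sqrt(1.0 / weight_sum + s_a ** 2)
    t_stat = sum((r - r_c) ** 2 / s ** 2 for r, s in determinations)
    return r_c, s_c, t_stat, len(determinations) - 1

jar_1a = [(2010, 20), (1970, 20), (2020, 20)]
r_c, s_c, t_stat, dof = combine(jar_1a)
print(round(r_c, 1), round(s_c, 2), round(t_stat, 2), dof)
```

For these values the combined determination is 2000 with a standard error of about 11.5, and T is 3.5 on 2 degrees of freedom, below the 5% critical value of 5.99.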

Calibration curves

The Curve() command allows you to specify the curve to use. See the section on Calibration curves for details of the data-sets. This can either be a calibration curve or a comparison data-set (in which case a comparison curve will be generated). Each point in the data-set has three values:

t_i: the calendar date
r_i: the radiocarbon concentration
s_i: the uncertainty in the concentration

The default resolution is the same as the resolution of the IntCal curves (5 years) and so no interpolation or binning is needed. However, if the resolution is set to less than 5 the curve will be interpolated by a cubic (or linear if that option is set) function - the same is true for comparison data-sets where the resolution is usually coarser. If the data-points are closer together than the resolution (as is the case with much of the bomb data) the data points are binned together before the curve is generated.

The Delta_R() function is primarily intended for marine applications (with the marine curve). What it does is to offset the measurements before calibration. When used with Bayesian models the ΔR offset is treated as a parameter (common to all relevant samples) and with a Normal likelihood distribution. Given this, the function can also be used to allow for correlated uncertainties between samples. If the parameter associated with the Delta_R() function is d, then the r_m is offset so that the likelihood is defined by R_Date(r_m-d,s_m).

The Mix_Curves() function allows two curves to be mixed together with an uncertainty in the mixing ratio. This allows for mixed reservoir dating. In principle this function can be used to mix mixed curves allowing any number of reservoirs to be mixed together. The use of this function always results in an MCMC analysis - in this the proportion for the second curve is not allowed to go out of the range 0-100% (even if for example the proportion is set at 10±10%).

Note that all calculations are performed in terms of radiocarbon concentration (rather than BP date). This includes calibration. If you wish to override this you can do this by setting the option 'Use F14C space' to off. This only affects the calibration itself; all other calculations, such as reservoir mixing, will still be performed in terms of radiocarbon concentration F14C.

For most purposes it is easiest to use the curve dialogue ([Options > Curve] for single calibrations or [Tools > Curves] for the main input window) to set the required curve. The following show some examples of use of these functions:

    Curve("Atmospheric","intcal20.14c");
    R_Date(2000,20);
    Curve("Oceanic","marine20.14c");
    R_Date(2000,20);
    Delta_R("Local Marine",200,30);
    R_Date(2000,20);
    Curve("Decadal Turnover","intcal20.14c")
    {
     Reservoir(10,5);
    };
    R_Date(2000,20);
    Mix_Curves("Mixed","Atmospheric","Local Marine",40,10);
    R_Date(2000,20);

For MCMC analysis a non-normal distribution can be used for Delta_R() in place of the usual mean and standard deviation. For example, to allow ΔR to take a value anywhere between 0 and 800 you could use the command:

    Delta_R("uniform",U(0,800));

The same is true for Mix_Curves() and so, for example, to allow for any mixture of marine and atmospheric diet you could use:

    Mix_Curves("Mixed","Atmospheric","Local Marine",U(0,100));

Uniform distributions of this kind act as uninformative priors for ΔR and diet, so that the model's estimates of these parameters are not biased towards a preset value.

Maths

Curve construction

Where binning is required the method is as follows:

All of the data-points are sorted into chronological order
The curve is required at a defined resolution, δ (typically 5 for pre-1950, or 0.2 for post-1950) and for each point (labelled i) on the curve n_i data-points (labelled ij) within ±δ/2 are identified and allocated to a bin

We need to take account of the fact that there is often much real natural variability not reflected in the measurement uncertainty. We have the individual measurements r_ij with measurement uncertainties s_ij. The error weighted estimate of the mean is:

μ'_i=(Σ_j r_ij/s_ij²)/(Σ_j (1/s_ij²))

And the average weighted variance (Bevington and Robinson 1992) is:

σ_i²= [(Σ_j r_ij²/s_ij²)/(Σ_j (1/s_ij²)) - μ'_i²] [n_i/(n_i-1)]

The uncertainty of the mean is:

σ_μi² = 1/(Σ_j (1/s_ij²))

However, given that we do not in general know if the uncertainty in the underlying measurement or reported points in the curve is correlated, it is not safe to use this. This is especially true where the points are not raw data-points but the points of a compiled calibration curve. It is safer to assume that the uncertainty is never lower than the lowest uncertainty in the points within the bin:

σ_{min i}² = min_j(s_ij²)

So we take the variance s_i² in the bin to be the larger of σ_{min i}² and σ_i². Note that prior to v4.4, where the calibration curve was only supplied at 5 year increments (the default resolution), the minimum of σ_μi² and σ_i² was used; this was not suitable for IntCal20, which was reported at 1 year increments. Overall for each bin we have three values which are:

t_i = (Σ_j t_ij/s_ij²)/(Σ_j (1/s_ij²))
r_i = (Σ_j r_ij/s_ij²)/(Σ_j (1/s_ij²))
s_i = √max([(Σ_j r_ij²/s_ij²)/(Σ_j (1/s_ij²)) - r_i²] [n_i/(n_i-1)] , min_j(s_ij²))

The revised data-set is then interpolated onto the required points as described below.
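The binning step can be sketched in Python for a single bin as below. This follows the weighted mean and the n_i/(n_i-1) weighted variance defined above and is not OxCal's code; the function name is hypothetical, and treating the scatter as zero for a bin with a single point is an assumption made here to avoid dividing by zero.

```python
import math

def bin_points(points):
    """Reduce the (t, r, s) points of one bin to a single (t_i, r_i, s_i)."""
    weights = [1.0 / s ** 2 for _, _, s in points]
    weight_sum = sum(weights)
    t_i = sum(w * t for w, (t, _, _) in zip(weights, points)) / weight_sum
    r_i = sum(w * r for w, (_, r, _) in zip(weights, points)) / weight_sum
    n = len(points)
    if n > 1:
        mean_square = sum(w * r * r for w, (_, r, _) in zip(weights, points)) / weight_sum
        # weighted variance of the scatter, guarded against rounding below zero
        scatter = max(mean_square - r_i ** 2, 0.0) * n / (n - 1)
    else:
        scatter = 0.0  # assumption: a single point carries no scatter information
    floor = min(s ** 2 for _, _, s in points)  # never below the smallest s_ij²
    return t_i, r_i, math.sqrt(max(scatter, floor))

print(bin_points([(0, 100, 5), (1, 110, 5)]))  # scatter dominates
print(bin_points([(0, 100, 5), (1, 101, 5)]))  # measurement uncertainty dominates
```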

If the data-points are more widely spread than the resolution requires, or they do not lie on the correct points, interpolation is performed. If linear interpolation is chosen (or if the interpolation is only for one point) then the function r(t) between two points (t_i,r_i,s_i) and (t_i+1,r_i+1,s_i+1) is given by:

a_i = (r_i+1 - r_i)/(t_i+1-t_i)
r(t) = r_i + a_i(t-t_i)

If cubic interpolation is used then the function is defined to pass through all data points and have a continuous value and first derivative. The interpolation is given by:

a_i = (r_i+1 - r_i-1)/(t_i+1-t_i-1)
a_i+1 = (r_i+2 - r_i)/(t_i+2-t_i)
δr_i = (r_i+1 - r_i)
δt_i = (t_i+1 - t_i)
c_i = (3 δr_i - δt_i(2a_i + a_i+1))/δt_i²
d_i = ((a_i+1 - a_i) - (2 c_i δt_i))/(3 δt_i²)
r(t) = r_i + a_i(t-t_i) + c_i(t-t_i)² + d_i(t-t_i)³

s(t) is interpolated in exactly the same way.
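A minimal Python sketch of the two interpolation schemes follows, written for an interior segment that has a neighbouring point on each side. The constant term is taken as r_i so that the curve passes through the data points, and c_i and d_i are used as the quadratic and cubic coefficients. The function names are hypothetical and the sample data are invented.

```python
def interpolate_linear(t, ts, rs, i):
    """Linear interpolation between points i and i+1."""
    a_i = (rs[i + 1] - rs[i]) / (ts[i + 1] - ts[i])
    return rs[i] + a_i * (t - ts[i])

def interpolate_cubic(t, ts, rs, i):
    """Cubic interpolation between points i and i+1 (needs points i-1 and i+2)."""
    a_i = (rs[i + 1] - rs[i - 1]) / (ts[i + 1] - ts[i - 1])   # slope at point i
    a_next = (rs[i + 2] - rs[i]) / (ts[i + 2] - ts[i])        # slope at point i+1
    dr = rs[i + 1] - rs[i]
    dt = ts[i + 1] - ts[i]
    c_i = (3 * dr - dt * (2 * a_i + a_next)) / dt ** 2
    d_i = ((a_next - a_i) - 2 * c_i * dt) / (3 * dt ** 2)
    h = t - ts[i]
    return rs[i] + a_i * h + c_i * h ** 2 + d_i * h ** 3

ts = [0.0, 1.0, 2.0, 3.0]
rs = [0.0, 1.0, 4.0, 9.0]   # invented data lying on r = t²
print(interpolate_linear(1.5, ts, rs, 1), interpolate_cubic(1.5, ts, rs, 1))
```

The cubic form matches both the values and the estimated slopes at the two ends of the segment, which is what gives a continuous first derivative across segments.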

Curve mixing

The mixing ratio is treated as a parameter t_m in Bayesian models. Where the mixing ratio is defined as r_m ± s_m this parameter is given a likelihood which is:

p(y_m|t_m) ∝ H(t_m)H(1-t_m)exp(-(t_m-r_m)²/(2s_m²))

in other words it is a normal distribution truncated at 0 and 1. A posterior distribution for the mixing ratio will be generated which is based on the global model. The combined curve is given by:

r'(t) = (1-t_m) r₁(t) + t_m r₂(t)
s'(t) = (1-t_m) s₁(t) + t_m s₂(t)
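The mixing relations can be written out in Python as a small sketch. Here the mixing ratio is expressed as a fraction between 0 and 1, the curve values passed in are invented numbers for a single date t, and the function names are hypothetical.

```python
import math

def heaviside(x):
    """Step function: 0 for x<0, 1/2 at x=0, 1 for x>0."""
    return 0.0 if x < 0 else (0.5 if x == 0 else 1.0)

def mixing_likelihood(t_m, r_m, s_m):
    """Normal likelihood for the mixing ratio, truncated to the range 0 to 1."""
    return heaviside(t_m) * heaviside(1 - t_m) * math.exp(-(t_m - r_m) ** 2 / (2 * s_m ** 2))

def mix(t_m, curve1, curve2):
    """Combine (r, s) values of two curves at one date for mixing ratio t_m."""
    r1, s1 = curve1
    r2, s2 = curve2
    return (1 - t_m) * r1 + t_m * r2, (1 - t_m) * s1 + t_m * s2

# 40% of the carbon from the second curve (values are illustrative only)
print(mix(0.4, (2000.0, 20.0), (2400.0, 30.0)))
print(mixing_likelihood(-0.05, 0.1, 0.1))  # outside 0 to 1, so excluded
```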

Curve reservoir

If a reservoir time constant τ is set it is assumed that the reservoir contains carbon from a range of ages exponentially distributed with an average age of τ. If we assume that the atmospheric radiocarbon curve is given by r(t) and the reservoir curve by r'(t), we can see that:

r'(t) = (1/τ) ∫_{-∞}^{t} r(u) exp(-(t-u)/τ) du

We also define the same relationship for the uncertainty:

s'(t) = (1/τ) ∫_{-∞}^{t} s(u) exp(-(t-u)/τ) du

By numerical integration it is possible to calculate these functions. We assume a linear extrapolation of the curve to arrive at a starting point but with errors that are ten times higher than the first point in the curve. This approximation means that the resultant curve should not be relied upon within a few time constants of the start of the curve r(t).
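One way to carry out such a numerical integration is sketched in Python below. It is a simple midpoint-rule sum under several assumptions that are not taken from OxCal: the integral is truncated 40 time constants back, the weights are renormalised so that a constant curve is reproduced exactly, and the input curve is supplied as a function defined for all earlier times (so the extrapolation at the start of the curve is not modelled). The function name is hypothetical.

```python
import math

def reservoir_curve(r, t, tau, step_fraction=0.01, span=40.0):
    """Exponentially weighted average of r(u) over u <= t with time constant tau."""
    step = tau * step_fraction
    n_steps = int(span / step_fraction)
    total = 0.0
    weight_sum = 0.0
    for k in range(n_steps):
        lag = (k + 0.5) * step           # midpoint of the k-th interval of t-u
        weight = math.exp(-lag / tau)
        total += weight * r(t - lag)
        weight_sum += weight
    return total / weight_sum

# For a straight-line curve the reservoir simply lags the input by tau
print(reservoir_curve(lambda u: u, 100.0, 10.0))
```

The same function can be applied to s(t) to obtain s'(t).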

Where there is an uncertainty in τ of δ_τ the additional uncertainty is calculated by numerically determining dr'(t)/dτ.

In order to give a representative result even when the errors are non-gaussian (as is the case with long time constants in the post-bomb period) we numerically evaluate the greater of:

dr'(t)/dτ ≈ [r'(t,τ+δ_τ)-r'(t,τ-δ_τ)]/(2δ_τ)
dr'(t)/dτ ≈ [r'(t,τ)-r'(t,τ-δ_τ)]/δ_τ
dr'(t)/dτ ≈ [r'(t,τ+δ_τ)-r'(t,τ)]/δ_τ

and r''(t) is found by averaging:

r''(t) ≈ [r'(t,τ+δ_τ) + r'(t,τ) + r'(t,τ-δ_τ)]/3

and adding the additional uncertainty in quadrature to s'(t) to give a revised curve r''(t) and s''(t) such that:

r''(t)=r'(t)
s''(t)= √((s'(t))² + (δ_τdr'(t)/dτ)²)

Note that this is a slightly different way of calculating the uncertainty than in versions of OxCal prior to version 4. The differences are significant where the curve has rapid changes (such as with the post-1950 calibration curves).

Sapwood estimates for dendrochronology

Methods are included in OxCal for the estimation of the number of sapwood rings (S) present in a tree where the heartwood/sapwood boundary is present but where the bark edge is not. The underlying model is based on research reported in the thesis of Dan Miles (Miles 2005). The factors on which the estimate is based are:

R - the number of heartwood rings
M - the mean ring width of the heartwood

You can find a linear dependency of ln(S) on ln(M) and ln(R). Roughly speaking S increases with increasing R and S decreases with increasing M. The sapwood tool enables you to perform a linear regression to define a model for a region.

Calculating sapwood estimates

Once a model has been worked out, this can be used to estimate sapwood for wood samples where the number of sapwood rings is unknown, but where we still retain the heartwood/sapwood transition. The two OxCal commands required for this are:

The Sapwood_Model() function defines the parameters to be used for the model. The Sapwood() function takes four parameters after the optional name. These are:

Hw/Sw date - the date of the last heartwood ring before the heartwood/sapwood transition (OxCal date format)
Hw rings - the number of heartwood rings (R)
Sw rings - the number of sapwood rings actually present (this is used as a minimum constraint on S)
MRW - the mean ring width of the heartwood (M)

The program then calculates a likelihood distribution based on the information provided.

Dan Miles has calculated a model for post-Roman mainland Britain (Miles 2005), excluding Scotland, which can be used for dating oak in this area and period. It can be entered as in the example below:

  Sapwood_Model("Mainland Britain", 2.77292, 0.100001, -0.275445,0.314286377);
  Sapwood("wa21", 1329, 243, 0, 1.06);
  Sapwood("wa22", 1354, 58, 6, 2.74);

The resolution of the program (found in [Tools > Options]) should be set to 1 for this sort of analysis as five years (the default) is too coarse.

Maths

We define variables based on the logarithms of S, M and R:

s=ln(S), m=ln(M) and r=ln(R)

You can find a linear dependency of s on m and r. The residuals are normally distributed in s. This linear regression can be performed using the sapwood tool provided with this package, or any statistical package. The likelihood for S is given by:

p(y_i|S_i,a,b_r,b_m) ∝ H(S) exp(-(a + b_rln(R_i) + b_mln(M_i) - ln(S_i))²/(2σ²))/S_i

or if we define q_i as the heartwood/sapwood boundary date for a sample, the likelihood for the felling date is given by:

p(y_i|t_i,a,b_r,b_m) ∝ H(t_i-q_i) exp(-(a + b_rln(R_i) + b_mln(M_i) - ln(t_i-q_i))²/(2σ²))/(t_i-q_i)
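The felling date likelihood can be sketched in Python using the "Mainland Britain" values from the example. Two things here are assumptions rather than documented behaviour: the four numbers in Sapwood_Model() are read as a, b_r, b_m and σ in that order (inferred from the signs, since S rises with R and falls with M), and the sapwood rings actually present are applied as a hard lower limit on S. The function names are hypothetical.

```python
import math

# Assumed reading of Sapwood_Model("Mainland Britain", ...): a, b_r, b_m, sigma
A, B_R, B_M, SIGMA = 2.77292, 0.100001, -0.275445, 0.314286377

def log_sapwood_estimate(hw_rings, mrw):
    """Regression estimate of ln(S) from heartwood rings R and mean ring width M."""
    return A + B_R * math.log(hw_rings) + B_M * math.log(mrw)

def median_sapwood(hw_rings, mrw):
    return math.exp(log_sapwood_estimate(hw_rings, mrw))

def felling_likelihood(t, q, hw_rings, mrw, sw_present=0):
    """Likelihood of felling date t given heartwood/sapwood boundary date q."""
    s = t - q
    if s <= 0 or s < sw_present:
        return 0.0   # before the boundary, or fewer rings than are present
    mu = log_sapwood_estimate(hw_rings, mrw)
    return math.exp(-(mu - math.log(s)) ** 2 / (2 * SIGMA ** 2)) / s

# Sample "wa21": boundary at 1329, 243 heartwood rings, mean ring width 1.06
print(round(median_sapwood(243, 1.06), 1))
print(felling_likelihood(1354, 1329, 243, 1.06))
```

Under these assumptions the median sapwood estimate for "wa21" is roughly 27 rings.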

Parameterisation of date information

Specific functions are provided for radiocarbon dating and sapwood estimates because the functional forms of the likelihood distributions are unusual. For many dating methods, however, normally distributed errors are appropriate, so the generic date functions can be used.

For some methods, however, it is worth introducing extra common parameters to ensure that correlated uncertainties are correctly treated in any Bayesian analysis. An obvious example of this kind is luminescence dating.

If we consider a simple case we might have four samples and two environmental dose rate readings. We might also estimate that the dose in the past was lower due to variations in water content.

We then introduce seven parameters to cover this situation:

DE1 - DE4 - estimated dose for samples 1-4
R1 - measured dose rate in vicinity of samples 1 and 2
R2 - measured dose rate in vicinity of samples 3 and 4
F - factor in dose rate estimated for variation in water content

We can put all of this into a model within OxCal in the following way:

  Year=2006;
  // define the independent parameters and set the likelihoods
  DE1=N(403.0,5);
  DE2=N(447.5,3);
  DE3=N(433.7,6);
  DE4=N(462.3,8);
  R1=N(0.08345,0.0009);
  R2=N(0.08467,0.0012);
  F=N(1.100,0.05);
  // calculate the dependent parameters
  D1=Age(DE1/(R1*F));
  D2=Age(DE2/(R1*F));
  D3=Age(DE3/(R2*F));
  D4=Age(DE4/(R2*F));

This shows how the general parameters can be used to define chronological information.

Note that the independent variables (in this case DE1...DE4, R1, R2 and F) all have uniform priors. The dependent parameters will not necessarily have uniform priors - though in this particular case this will not be significant unless the uncertainties are very large.
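To see why the shared parameters matter, the Python sketch below computes the central dates implied by the model and then draws random values from the stated normal distributions to show that D1 and D2, which share R1 and F, are strongly correlated. This simple forward simulation only illustrates the correlation; it is not the MCMC analysis that OxCal performs, and the function names, sample size and seed are arbitrary choices for the example.

```python
import math
import random

YEAR = 2006.0  # reference year set by Year=2006

def date_from_dose(dose, rate, factor, year=YEAR):
    """Date corresponding to Age(dose/(rate*factor))."""
    return year - dose / (rate * factor)

def simulate(n=2000, seed=1):
    """Draw D1 and D2 from the likelihoods, sharing R1 and F between them."""
    rng = random.Random(seed)
    d1, d2 = [], []
    for _ in range(n):
        r1 = rng.gauss(0.08345, 0.0009)
        f = rng.gauss(1.100, 0.05)
        d1.append(date_from_dose(rng.gauss(403.0, 5), r1, f))
        d2.append(date_from_dose(rng.gauss(447.5, 3), r1, f))
    return d1, d2

def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

print(round(date_from_dose(403.0, 0.08345, 1.100), 1))  # central value for D1
d1, d2 = simulate()
print(correlation(d1, d2) > 0.5)
```

If D1 and D2 were instead entered as independent normal dates, this correlation would be lost and the combined precision of the model would be overstated.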

Cross referencing

Versions of OxCal before v4 provided the XReference function, which had limited capabilities, allowing an event to be in two phases. There are now much more general ways of dealing with cross references.

The previous section gives an example of a parameter being used explicitly more than once in a model. However, there are times when you need to be able to specify cross references in other ways. You might, for example, wish to define that two events in a model are synchronous, or to use the same ΔR parameter in two different places within an analysis as in:

    Curve("Marine04","Marine04.14c");
    Delta_R("Region 1",200,30);
    R_Date("a",3000,30);
    Delta_R("Region 2",100,30);
    R_Date("b",3000,30);
    Delta_R("=Region 1");
    R_Date("c",3000,30);
    Delta_R("=Region 2");
    R_Date("d",3000,30);

The general principle for cross referencing parameters is that the parameter should have one primary specification and then as many cross references to it as you wish. As in the example above the cross references are denoted by using the same name but starting with the '=' character. You can also use the &= operator as in the final example in this section. The following example shows one way of specifying a model for two phases that both start at the same time, but end at different times:

  Phase()
  {
   Sequence()
   {
    // first instance of the parameter "Start 1"
    Boundary("Start 1");
    Phase("1")
    {
     R_Date("a", 2000, 20);
     R_Date("b", 2010, 20);
     R_Date("c", 1920, 20);
     R_Date("d", 1900, 20);
     R_Date("e", 1903, 20);
    };
    Boundary("End 1");
   };
   Sequence()
   {
    // cross reference to the parameter "Start 1"
    Boundary("=Start 1");
    Phase("2")
    {
     R_Date("f", 2040, 20);
     R_Date("g", 2020, 20);
     R_Date("h", 1960, 20);
     R_Date("i", 2000, 20);
     R_Date("j", 1933, 20);
    };
    Boundary("End 2");
   };
  };

The following is equivalent (see the use of the s1=Boundary() to define the parameter and then the expression s1&=Boundary() to cross reference to it):

  // define the likelihoods
  a=R_Date(2000, 20);
  b=R_Date(2010, 20);
  c=R_Date(1920, 20);
  d=R_Date(1900, 20);
  e=R_Date(1903, 20);
  f=R_Date(2040, 20);
  g=R_Date(2020, 20);
  h=R_Date(1960, 20);
  i=R_Date(2000, 20);
  j=R_Date(1933, 20);
  // define the model
  (s1=Boundary()) < (a | b | c | d | e) < (e1=Boundary());
  (s1&=Boundary()) < (f | g | h | i | j) < (e2=Boundary());

By using cross referencing you ensure that only one independent parameter is created in a Bayesian model which ensures that correlated uncertainties are treated correctly. In some instances the program will generate a second dependent parameter which is equal to the first one.