---
title: "Southern higher-latitude lamniform sharks track mid-Cretaceous environmental change"
author: |
  | Mohamad Bazzi, Benjamin Kear, Mikael Siversson
  | Department of Organismal Biology, Uppsala University.
  | Museum of Evolution, Uppsala University.
  | Department of Earth and Planetary Sciences, Western Australian Museum.
  | 
  | Mohamad.Bazzi@ebc.uu.se and mikael.siversson@museum.wa.gov.au
geometry: margin=1in
output:
  pdf_document: default
  html_document:
    df_print: paged
editor_options:
  chunk_output_type: console
  number_sections: TRUE
---

Code compiled and maintained by Mohamad Bazzi.
Contact: Mohamad.Bazzi@ebc.uu.se

Compiled: 2020-06-16
Last updated: 2020-12-15

All analyses were done using R version 3.6.3 (R Core Team 2020).
This document is an annotated version of the R code used to run geometric morphometric analyses on shark tooth shapes.

***
# Setup
```{r,include=FALSE}
options(tinytex.verbose = TRUE)
Sys.setenv(RSTUDIO_PDFLATEX = "latexmk")
```

# Study description, Access & import/export, Disturbance
The present study applies 2D geometric morphometrics and evolutionary modelling to track changes in lamniform shark (eco)-morphological disparity across the Cenomanian-Turonian boundary (which also brackets Oceanic Anoxic Event 2) at a geographically local scale. The statistical analyses relied principally on a MANOVA model design. Within the context of our time-series assessment of morphology and disparity, we considered the following parameters: Procrustes shape coordinates (response variable) and geological age (independent categorical variable). We also statistically evaluated the effect of heterodonty on time-scaled disparity (treated as a nominal factor). The unit of analysis throughout was the **specimen**; specimens were assigned to specific chronostratigraphic ages: middle Albian (N=11), late Albian (N=65), middle Cenomanian (N=73), late Cenomanian (N=56), and early Turonian (N=84).

The material from Western Australia was collected by Mikael Siversson in his capacity as Curator of Palaeontology at the Department of Earth & Planetary Sciences, WA Museum Boola Bardip. WA Museum Boola Bardip staff are exempt from the requirement to obtain a fossicking permit for fossil collecting on crown and pastoral lands in WA (these permits are otherwise issued by the WA Department of Mines and Petroleum). Access to the Murchison House Station was granted by the owners, the Calumn family. Access to the Giralia Station was granted by the former owner Mr Rich French and later by the Baiyungu Aboriginal Corporation. The elasmobranch material from both areas was either surface collected or obtained by bulk sampling (samples typically comprised about 10-50 kg of sediment). The material from Richmond, Queensland, was collected by an amateur collector in one of the local shale quarries (known as the Council Quarry). Most of the elasmobranch material will ultimately be deposited with the local council museum in Richmond, the Kronosaurus Korner. The museum manages collecting at the Council Quarry. If further information is required regarding arrangements between the WA Museum Boola Bardip and Kronosaurus Korner, please contact Curator Michelle Johnston (curator@kronosauruskorner.com.au).

The environmental impact of bulk sampling at the WA sites is negligible because of the small sample sizes and the very short life span of the excavation pits. The collecting site in the lower Murchison River area is a relatively steep slope subject to small to very large landslides during the wet winter season, which quickly cover excavation pits. Apart from the excavation of an associated dentition of a large shark, collecting in the Giralia Range consisted of surface collecting with minimal impact on the environment. Access to sites in both areas is by 4WD vehicles only. Existing tracks limit the need for off-track driving (a few hundred meters in the Giralia Range and a similar distance in the lower Murchison River area).

**Localities** include: Murchison House Station, lower Murchison River area, Western Australia: S27°36'01'', E114°14'00''; elevation c. 160 m above sea level. Giralia Range, Western Australia: S22°53'24'', E114°08'40''; elevation c. 125 m above sea level. Council Quarry, Richmond, Queensland: S20°38'41'', E143°06'03''; elevation c. 210 m above sea level.

**Background**: Sharks are obligate water-breathers with high absolute oxygen demands. Changes in oxygen concentration during times of ocean anoxia could therefore influence various aspects of shark biology (e.g., metabolic stress) (Laffoley and Baxter, 2019). The short- to long-term effects of oxygen deprivation on modern shark populations remain, however, poorly understood (Heithaus et al., 2009; Laffoley and Baxter, 2019). Similarly, how different groups of sharks responded to ocean deoxygenation in the past (e.g., the mid-Cretaceous OAE2 event) has rarely been addressed.

### Required Libraries
```{r Required Libraries, message=FALSE, warning=FALSE}
library(easypackages)
packages("diptest","moments","knitr","xlsx","paleoTS","plotrix",
         "Momocs","Morpho","geomorph","RRPP","tidyverse",
         "reshape2","viridis","ggplot2","Hmisc","gridExtra",
         "plyr","ozmaps","sf","scatterpie","sp","colorspace",
         "ggpubr","ggdendro","car","dispRity","ggrepel",
         "harrypotter")
```

### Required Functions
```{r}
source(file = "Functions/Read points.R")
source(file = "Functions/Confidence and Prediction Intervals.r")
source(file = "Functions/Morphological Disparity with Bootstrap.R")
# Function to compute the modal value.
estimate_mode <- function(x) {
  d <- hist(x, plot = FALSE)
  mode <- d$mids[which.max(d$counts)]
  return(mode)
}
# Disparity with permutation.
disparity.calc <- function(gpa, data, ages) {
  nas <- which(is.na(data[,ages]))
  if(length(nas) > 0) gm.data <- geomorph.data.frame(coords = gpa$coords[,,rownames(data)[-nas]],ages = data[-nas,ages])
  else gm.data <- geomorph.data.frame(coords = gpa$coords[,,rownames(data)],ages = data[,ages])
  disp <- morphol.disparity(coords~ages,groups = ~ages,iter = 999, data = gm.data)
  return(disp)
}
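# Paste two strings together, separated by a single space.
# Note: defining '%%' here masks base R's modulo operator for code run in the global environment.
'%%' <- function(x,y) {paste(x,y, sep = " ")}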

# Function to compute numerical descriptive statistics.
gms <- function(scores) {
  stats <- c("N","Mean","Median","Shapiro.W","Shapiro.p","Dip.Test.D","Dip.Test.p","Skewness",
             "IQR","Kurtosis","Minimum Range","Maximum Range")
  gms.res <- matrix(nrow = ncol(scores),ncol = length(stats))
  for(i in 1:ncol(scores)) {
    n <- nrow(scores)
    mean <- mean(scores[,i])
    median <- median(scores[,i])
    shap.test <- shapiro.test(scores[,i])
    dip.test <- dip.test(scores[,i])
    skew <- skewness(scores[,i])
    iqr <- IQR(scores[,i])
    kurt <- kurtosis(scores[,i])
    min.rg <- min(range(scores[,i]))
    max.rg <- max(range(scores[,i]))
    gms.res[i,] <- c(n,mean,median,shap.test$statistic,shap.test$p.value,dip.test$statistic,
                     dip.test$p.value,skew,iqr,kurt,min.rg,max.rg)
  }
  colnames(gms.res) <- stats
  rownames(gms.res) <- colnames(scores)
  return(gms.res)
}
```
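
The histogram-based mode estimator defined above can be tried on a small vector. The following is a minimal sketch, assuming made-up scores (`toy_scores`) and a renamed copy of the function (`estimate_mode_demo`) so that nothing in the analysis is overwritten.

```r
# Toy re-statement of the histogram-based mode estimator.
estimate_mode_demo <- function(x) {
  d <- hist(x, plot = FALSE)       # default (Sturges) binning
  d$mids[which.max(d$counts)]      # midpoint of the fullest bin
}

# Hypothetical scores: most values cluster between 2 and 3.
toy_scores <- c(0.1, 0.2, 0.3, 2.5, 2.6, 2.7, 2.8, 2.9, 4.5)
toy_mode <- estimate_mode_demo(toy_scores)
toy_mode
```

The estimate is a bin midpoint, so it depends on the binning that `hist()` chooses, and `which.max()` returns the first bin when several bins tie for the highest count.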

### Local Data
```{r Data}
# 1. Occurrence dataset.
Df <- read.xlsx(file = "Tooth Dataset.xlsx",sheetIndex = 1)
# 2. Select complete and scaled specimens.
Df <- subset(x = Df,subset = Df$Scale == "x")
# 3. Provide rownames to data frame.
rownames(Df) <- paste(Df$File.Name,Df$File.Type,sep = "")
# 4. Landmark dataset.
LMs <- read2Dtps.noLMs(file = "Tooth Image Data/WamLMs.TPS",ncurve = 2,divide.curve = TRUE,
                       curve.1.pts = 79, curve.2.pts = 81)
```

### Time binning scheme
```{r}
# 1. Unite multiple columns into one by pasting strings together.
Df <- mutate(Df, subAge = as.factor(paste(Df$Sub, Df$Age)))
# 2. Drop unused levels.
Df <- Df %>%
  filter(subAge != "middle/late Cenomanian" & subAge != "NA Albian/Cenomanian" & subAge != "NA Cenomanian") %>%
  group_by(subAge, .drop = TRUE)

Df$subAge <- factor(Df$subAge)
# 3. Force rownames back.
class(Df) <- "data.frame"
rownames(Df) <- paste(Df$File.Name, Df$File.Type, sep = "")
```

```{r}
# Chronostratigraphically rearrange levels.
Df$subAge <- factor(x = Df$subAge,levels = c("middle Albian","late Albian",
                                             "middle Cenomanian","late Cenomanian",
                                             "early Turonian"))
```
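
As a small illustration of why the levels are set explicitly, the sketch below (using a hypothetical `ages` vector rather than the real data frame) contrasts the default alphabetical level order with the stratigraphic one, and shows that any value missing from `levels` becomes `NA`.

```r
# Hypothetical age labels in arbitrary order.
ages <- c("early Turonian", "late Albian", "middle Cenomanian",
          "middle Albian", "late Cenomanian")

# Default: levels are sorted alphabetically.
default_order <- levels(factor(ages))

# Explicit levels, oldest to youngest.
strat_levels <- c("middle Albian", "late Albian", "middle Cenomanian",
                  "late Cenomanian", "early Turonian")
strat_order <- levels(factor(ages, levels = strat_levels))

# A label that is not listed in `levels` is silently converted to NA.
unlisted <- factor("NA Cenomanian", levels = strat_levels)
is.na(unlisted)
```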

### Geographic data distribution
```{r fig.cap = "Figure 1. Provenance and sample size"}
# 1. Local data.
Basin.Df <- read.xlsx(file = "Geographic Graph.xlsx",sheetIndex = 2)
Basin.Df <- Basin.Df[,-1]
# 2. Define the basic characteristics of the map.
oz_states <- ozmaps::ozmap_states
# 3. Plot.
ggplot() + 
  geom_sf(data = oz_states, colour = "black", fill = NA) +
  geom_scatterpie(aes(x = lon, y = lat, group = region,r = sqrt(N)/2),
                    data = Basin.Df,cols = LETTERS[1:5],
                    alpha = 1,sorted_by_radius = T) +
  coord_sf() + 
  scale_fill_hp(discrete = TRUE, option = "hufflepuff") + 
  theme(legend.position = "none")
```

### Digitization scheme using ggplot
```{r, warning=FALSE}
# 1. Prepare labels for sequential landmark points.
ldlab <- paste("LM",1:160, sep = "")
# 2. Convert array to data frame.
x.sp <- as.data.frame(LMs$coords[,,"Squalicorax-mutabilis-sp-nov_Siversson-et-al-2018_10G.jpg"])
# 3. Plot.
ggplot(x.sp, aes(x = V1, y = V2)) + 
  geom_point(shape = 21, size = 3, fill = "grey90", colour = "black") + 
  geom_text(aes(label = ldlab),hjust = 0, vjust = 0, cex = 1.5, colour = "purple", fontface = 2) + 
  labs(title = "Resampled and Scaled Points", x = "", y = "", subtitle = "") + 
  coord_fixed()
```

### Plot contingency table of sample sizes
```{r}
sample.df <- as.data.frame(table(Df$subAge,Df$Family))
sp.B <- which(sample.df$Var2 == "Incertae sedis B")
sample.df <- sample.df[-sp.B, ]
# Plot.
ggplot(sample.df, aes(x = Var1, y = Var2, fill = Freq)) + 
  geom_tile() +
  scale_fill_gradient(low = "white",high = "#338080E6") +
  theme_linedraw() + xlab("sub-Ages") + ylab("Families") +
  geom_text(data = sample.df,mapping = aes(label = Freq)) +
  theme(axis.text.x = element_text(angle = 45,vjust = .8,hjust = .8,size = 7),
        axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 0, size = 7),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 2/3)
```

# Analyses
### Procrustes Superimposition
```{r}
GPA <- gpagen(A = LMs$coords,curves = as.matrix(LMs$sliders[-79,]),
              ProcD = FALSE,print.progress = FALSE)

saveRDS(object = GPA,file = "GPA.rds",compress = T)
```
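
The core of Procrustes superimposition can be sketched in base R for a single pair of configurations. This is a simplified illustration, not the `gpagen` algorithm: it aligns one shape to a fixed reference, whereas a generalized analysis iterates against the sample mean and, as in the call above, also slides semilandmarks. The helper names `standardise` and `align_to` and the rectangle coordinates are assumptions made for the example.

```r
# Centre a p x 2 configuration and scale it to unit centroid size.
standardise <- function(M) {
  M <- sweep(M, 2, colMeans(M))
  M / sqrt(sum(M^2))
}

# Least-squares rotation of Y onto X (both already standardised).
align_to <- function(Y, X) {
  s <- svd(crossprod(Y, X))   # t(Y) %*% X = U D V'
  Y %*% s$u %*% t(s$v)        # the optimal orthogonal matrix is U V'
}

# Hypothetical 4-landmark reference shape.
ref <- rbind(c(0, 0), c(2, 0), c(2, 1), c(0, 1))

# A rotated (30 degrees), enlarged (x3) and shifted (+5) copy of it.
theta <- pi / 6
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
target <- 3 * ref %*% rot + 5

X <- standardise(ref)
Y <- align_to(standardise(target), X)
max_diff <- max(abs(Y - X))
max_diff
```

Because `target` differs from `ref` only in position, scale and orientation, the aligned coordinates coincide with the reference up to floating-point error.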

### Repeatability
```{r}
# 1. Landmark data.
ME.Lms <- read2Dtps.noLMs(file = "Measurement Error/ME Data.TPS",ncurve = 2,divide.curve = TRUE,
                          curve.1.pts = 79, curve.2.pts = 81)
# 2. Perform GPA on the ME landmark data.
ME.gpa <- gpagen(A = ME.Lms$coords,curves = as.matrix(ME.Lms$sliders[-79,]),ProcD = F,print.progress = F)
# 3. Sub-set original landmark file.
Or.Lms <- GPA$coords[,,intersect(dimnames(ME.Lms$coords)[[3]],dimnames(LMs$coords)[[3]])]
# 4. "Measurement error" for shapes using two-block partial least squares.
PLS <- two.b.pls(Or.Lms,ME.gpa$coords,iter = 999,print.progress = FALSE)
# 5. Plot.
# layout(matrix(c(1,1)),respect = T)
plot(PLS, lwd = 2, col = "black", pch = 19, font.lab = 2,cex.lab = .7, cex.axis = .7)
text(x=0,y=0.1,cex = .8,labels = "r-PLS: 0.996, P-value: 0.001, Effect size: 7.77")
```

### Morphospace
```{r}
PCA <- gm.prcomp(A = GPA$coords[,,rownames(Df)])
# Proportion of Variance.
exp <- PCA$sdev^2/sum(PCA$sdev^2)
round(exp[1:10],4)*100
# Cumulative Proportion.
cum <- cumsum(PCA$sdev^2/sum(PCA$sdev^2))
```
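
The variance calculations above follow the usual PCA convention: each squared standard deviation is divided by their sum. A minimal sketch with base `prcomp()` on simulated data (the matrix `toy` is hypothetical) shows the same arithmetic.

```r
set.seed(1)
# Hypothetical data: 50 observations, 4 variables, first variable inflated.
toy <- matrix(rnorm(200), nrow = 50, ncol = 4)
toy[, 1] <- toy[, 1] * 3

pca <- prcomp(toy)
prop_var <- pca$sdev^2 / sum(pca$sdev^2)   # proportion of variance per axis
cum_var <- cumsum(prop_var)                # cumulative proportion

round(prop_var * 100, 2)                   # expressed as percentages
```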

### Test for normality using a univariate approach
```{r results="hide"}
options(scipen = 999)
apply(X = PCA$x[,1:4],2,shapiro.test)
```

### Proportion of Variance & Cumulative Proportion
```{r}
# 1. Graphic arrangement.
layout.matrix <- matrix(c(1,1,2,2), nrow = 2, ncol = 2,byrow = FALSE)
layout(mat = layout.matrix,heights = c(2.5,2.5),
       widths = c(4,3),respect = TRUE)
# 2. Plot.
barplot(exp[1:10]*100,xlab = "PCs", ylab = "Variance (%)",
        font.lab = 2,cex.lab = .7, cex.axis = .7)
plot(cum[1:10],type = "b",
     ylab = "",xlab = "",cex.lab = .7, cex.axis = .7)
dev.off()
```

### Centroid size object
- Throughout, centroid size (*CS*) is treated as a continuous independent variable.
- In the absence of allometry, *CS* is uncorrelated with shape variation.
```{r}
# Subset Procrustes coordinate data to match the current data frame.
size.names <- intersect(names(GPA$Csize),rownames(Df))
Csize <- GPA$Csize[size.names]
```
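
Centroid size is the square root of the summed squared distances of all landmarks from their centroid. The sketch below computes it for a hypothetical unit square; the function name `centroid_size` is an assumption for this example and is not part of geomorph.

```r
# Centroid size of a p x k landmark configuration.
centroid_size <- function(M) {
  centred <- sweep(M, 2, colMeans(M))   # subtract the centroid
  sqrt(sum(centred^2))
}

unit_square <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
cs <- centroid_size(unit_square)              # equals sqrt(2)
cs_doubled <- centroid_size(2 * unit_square)  # doubling the shape doubles CS
c(cs, cs_doubled)
```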

### High-dimensional visualization of shapes
```{r}
# 1. New data frame.
ggPCA <- data.frame(scores = PCA$x, groups = Df$Order,size = log(Csize))
# 2. Graphic.
ggplot(ggPCA,aes(x = scores.Comp1,y = scores.Comp2,colour = size)) +
  geom_point(size = 2, aes(colour = size)) +
  scale_fill_hp(house = "LunaLovegood",direction = -1) +
  scale_color_hp(house = "LunaLovegood") +
  geom_density2d(col = "grey50") +
  coord_equal() +
  theme_bw() +
  geom_hline(yintercept = 0,col = "black") + 
  xlab('Principal Component 1 (55.84%)') + ylab('Principal Component 2 (23.95%)') +
  geom_vline(xintercept=0,col="black") + 
  theme(legend.position = "bottom",
        axis.title = element_text(size = 7),
        axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7))
```

### Thin Plate Spline Deformation Grids
- The TPS function is a smooth function that maps points in one shape to corresponding points in another configuration.
- The shape objects are stored in a list. To visualize, convert the list into a matrix and apply a loop to display each configuration.
```{r}
# Min and max on PC1 to PC3.
layout(mat=matrix(c(1:6), ncol=2, byrow=F),respect = FALSE); par(mar=c(3,4,2,2))
for (i in 1:6) {
  u <- PCA$shapes[1:3]
  vi <- do.call(rbind,u)
  v <- vi[[i]]
  plotRefToTarget(GPA$consensus,v,method = "TPS")
}
dev.off()
# Min and max on PC4 and PC5.
layout(mat=matrix(c(1:4), ncol=2, byrow=F),respect = FALSE); par(mar=c(3,4,2,2))
for (i in 1:4) {
  u <- PCA$shapes[4:5]
  ti <- do.call(rbind,u)
  qv <- ti[[i]]
  plotRefToTarget(GPA$consensus,qv,method = "TPS") 
}
dev.off()
```
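
The list-to-matrix step used in the loops above can be illustrated with a toy list. In this sketch the element names (`shapes.comp1` and so on) and the character placeholders are assumptions standing in for the real shape matrices; the point is that single-index extraction from the bound list runs column by column, so all minima are drawn before the maxima.

```r
# Hypothetical stand-in for a list of per-axis extreme shapes.
toy_shapes <- list(
  shapes.comp1 = list(min = "PC1 min", max = "PC1 max"),
  shapes.comp2 = list(min = "PC2 min", max = "PC2 max"),
  shapes.comp3 = list(min = "PC3 min", max = "PC3 max")
)

# Bind the sub-lists into a 3 x 2 list-matrix (rows = axes, columns = min/max).
vi <- do.call(rbind, toy_shapes)
dim(vi)

# Single-index extraction walks down column 1, then column 2.
order_drawn <- vapply(1:6, function(i) vi[[i]], character(1))
order_drawn
```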

### Tooth shapes along PC1 and PC2
- Morphological characterization of PC-gradients. 
- The following grids correspond to variability along PC1 and PC2.
```{r}
# Variation on PC1.
pc1_shape_a <- GPA$consensus + matrix(-0.4873457*(PCA$rotation[,1]),byrow=T,160,2)
pc1_shape_b <- GPA$consensus + matrix(-0.25*(PCA$rotation[,1]),byrow=T,160,2)
pc1_shape_c <- GPA$consensus + matrix(0.0*(PCA$rotation[,1]),byrow=T,160,2)
pc1_shape_d <- GPA$consensus + matrix(0.25*(PCA$rotation[,1]),byrow=T,160,2)
pc1_shape_e <- GPA$consensus + matrix(0.3996614*(PCA$rotation[,1]),byrow=T,160,2)
# TPS-grids.
par(mfrow = c(1,5))
tps_grid(pc1_shape_a,pc1_shape_a,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc1_shape_b,pc1_shape_b,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc1_shape_c,pc1_shape_c,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc1_shape_d,pc1_shape_d,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc1_shape_e,pc1_shape_e,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
dev.off()
```

```{r}
# Variation on PC2.
pc2_shape_a <- GPA$consensus + matrix(-0.2710644*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_b <- GPA$consensus + matrix(-0.2*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_c <- GPA$consensus + matrix(-0.1*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_d <- GPA$consensus + matrix(0.0*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_e <- GPA$consensus + matrix(0.1*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_f <- GPA$consensus + matrix(0.2*(PCA$rotation[,2]),byrow=T,160,2)
pc2_shape_g <- GPA$consensus + matrix(0.2766948*(PCA$rotation[,2]),byrow=T,160,2)
# TPS-grids.
par(mfrow = c(1,7))
tps_grid(pc2_shape_a,pc2_shape_a,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_b,pc2_shape_b,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_c,pc2_shape_c,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_d,pc2_shape_d,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_e,pc2_shape_e,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_f,pc2_shape_f,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
tps_grid(pc2_shape_g,pc2_shape_g,grid.size = 10,shp = T,legend = T,shp.lwd = 6)
dev.off()
```
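
The shapes above are built as the mean shape plus a PC score multiplied by the corresponding eigenvector, reshaped row-wise into a landmark matrix. The following sketch reproduces that arithmetic with base `prcomp()` on simulated coordinates; `flat`, `shape_at` and the dimensions are hypothetical, and the extreme score is read from the data rather than typed in.

```r
set.seed(2)
p <- 5; k <- 2; n <- 20
# Hypothetical aligned shapes: one row per specimen, stored as x1, y1, x2, y2, ...
flat <- matrix(rnorm(n * p * k), nrow = n)

pca <- prcomp(flat)
mean_shape <- matrix(pca$center, nrow = p, ncol = k, byrow = TRUE)

# Shape predicted at a given score along one PC axis.
shape_at <- function(score, axis) {
  mean_shape + matrix(score * pca$rotation[, axis],
                      nrow = p, ncol = k, byrow = TRUE)
}

at_zero <- shape_at(0, 1)                  # a score of 0 returns the mean shape
pc1_min <- shape_at(min(pca$x[, 1]), 1)    # shape at the observed PC1 minimum

# Using every axis recovers the first specimen exactly.
recon <- as.vector(pca$center + pca$rotation %*% pca$x[1, ])
```

Reading the extremes with `min()` and `max()` avoids having to update typed-in scores if the sample changes.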

### Tooth shapes along PC3 and PC4
```{r}
# Minimum and maximum variation on PC3 and PC4.
pc3min <- GPA$consensus + matrix(-0.2260478*(PCA$rotation[,3]),byrow=T,160,2)
pc3max <- GPA$consensus + matrix(0.2095156*(PCA$rotation[,3]),byrow=T,160,2)
pc4min <- GPA$consensus + matrix(-0.2460726*(PCA$rotation[,4]),byrow=T,160,2)
pc4max <- GPA$consensus + matrix(0.1093302*(PCA$rotation[,4]),byrow=T,160,2)
# TPS-grids.
par(mfrow = c(1,4))
tps_grid(pc3min,pc3min,grid.size = 10,shp = T,legend = T,shp.lwd = 6,legend.text = "PC3 min")
tps_grid(pc3max,pc3max,grid.size = 10,shp = T,legend = T,shp.lwd = 6,legend.text = "PC3 max")
tps_grid(pc4min,pc4min,grid.size = 10,shp = T,legend = T,shp.lwd = 6,legend.text = "PC4 min")
tps_grid(pc4max,pc4max,grid.size = 10,shp = T,legend = T,shp.lwd = 6,legend.text = "PC4 max")
dev.off()
```

### Modal computation and visualization
- This section explores differences in the most frequent morphology between ages.
```{r}
# 1. Estimate the mode.
pc.axes <- c("PC1","PC2","PC3","PC4")
ages <- c(1:5)
mode.values <- matrix(nrow = 5,ncol = 4,dimnames = list(ages,pc.axes),byrow = T)
  for(i in 1:length(ages)) {
    axes <- 1:4
    for(j in 1:length(axes)) {
      res <- estimate_mode(PCA$x[Df$subAge == levels(Df$subAge)[ages[i]],axes[j]])
      mode.values[[i,j]] <- res; rownames(mode.values) <- levels(Df$subAge)
    }
  }

# 2. Break it down by axis.
pc1.x <- mode.values[,1]
pc2.x <- mode.values[,2]
pc3.x <- mode.values[,3]
pc4.x <- mode.values[,4]

# 3. Arrange shape configurations into an array. Multiply by the corresponding rotation matrix.
mode.LMs <- array(dim = c(160,2,5))
for (i in 1:length(pc1.x)) {
  shapes <- shape.names <- NULL
  shapes <- GPA$consensus + matrix(as.vector(pc1.x)[i]*(PCA$rotation[,1]),byrow = T,nrow = 160,ncol = 2)
  pc.names <- paste("PC",1,sep = "")
  shape.names <- paste(rep(pc.names,each = 1),rep(rownames(mode.values),1),sep = "")
  mode.LMs[,,i] <- shapes
  dimnames(mode.LMs)[[3]] <- shape.names
}
# 4. Plot results.
layout(matrix(c(1:5),nrow = 5,ncol = 1,byrow = T))
colnames <- dimnames(mode.LMs)[[3]]
for (i in 1:5) {
  z <- mode.LMs[,,i]
  tps_grid(z,z,grid.size = 1,shp = T,
           legend = F,shp.lwd = 6,legend.text = colnames[i])
}
dev.off()
```

### Shape difference due to allometry
```{r message=FALSE, results = "hide"}
# 1. Data frame.
form.df <- geomorph.data.frame(coords = GPA$coords[,,rownames(Df)],
                               size = log(Csize),
                               fam = factor(Df$Family),
                               age = factor(Df$subAge))
# 2. Multivariate regression.
#    Result: Shape does vary with size (P=0.001), but R2 is very low.
fit <- procD.lm(coords ~ size, RRPP = TRUE,data = form.df,iter=999)
anova(fit)
# 3. Allometric visualization
plotAllometry(fit, size = form.df$size, logsz = TRUE, method = "RegScore", pch = 19,cex.lab = .7,cex.axis = .7)
dev.off()
# 4. Percentage of total shape variation accounted for by size (SS size / SS total).
round((fit$aov.table$SS[[1]]/fit$aov.table$SS[[3]])*100,2)

# 5. Group allometries.
#    Result: Different patterns of allometry. If the result is non-significant, the slopes are homogeneous.
fit2 <- procD.lm(coords~size+fam, data=form.df, iter=999, print.progress = FALSE)
fit3 <- procD.lm(coords~size+age, data=form.df, iter=999, print.progress = FALSE)
fit4 <- procD.lm(coords~size*age, data=form.df, iter=999, print.progress = FALSE)
anova(fit2)
anova(fit3)
anova(fit4)
```

```{r}
# Check for homogeneity of regression slopes.
par(mar=c(5.1, 4.1, 4.1, 8.1), xpd=TRUE)
plotAllometry(fit, size = form.df$size, logsz = TRUE,method = "RegScore",pch = 19, col = form.df$fam)
legend("topright", inset=c(-0.6,0),pch = 19,legend=levels(form.df$fam),cex = .8)
dev.off()
```
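
The percentage computed from the ANOVA table is the ratio of the size sum of squares to the total sum of squares. As a rough sketch of that ratio, the example below fits an ordinary multivariate regression in base R on simulated data (`size` and `Y` are hypothetical) and sums the squares over all response columns; it does not use RRPP permutations.

```r
set.seed(3)
n <- 40
size <- rnorm(n)                                   # hypothetical log centroid size
Y <- cbind(0.5 * size + rnorm(n),                  # three hypothetical shape variables
           -0.2 * size + rnorm(n),
           rnorm(n))

fit_toy <- lm(Y ~ size)

ss_total <- sum(sweep(Y, 2, colMeans(Y))^2)        # total SS around the mean
ss_resid <- sum(residuals(fit_toy)^2)              # residual SS
ss_size  <- ss_total - ss_resid                    # SS explained by size

pct_size <- round(100 * ss_size / ss_total, 2)
pct_size
```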

### Allometry-free shapes
```{r message=FALSE}
# 1.Remove allometry.
res.x <- procD.lm(coords ~ size, data = form.df, iter = 999,print.progress = F)$residuals
# 2.Overall mean shape.
mean.x <- procD.lm(coords ~ 1, data = form.df, iter = 999,print.progress = F)$fitted
# 3.Obtain shapes with allometry removed.
shape.adj <- mean.x + res.x
Adjusted.Df <- arrayspecs(A = shape.adj, p = 160, k = 2)
```
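
The size correction above adds the regression residuals back onto the overall mean. A base R sketch of the same idea on simulated data (all names here are hypothetical) confirms that the adjusted values keep the original mean but no longer covary linearly with size.

```r
set.seed(4)
n <- 30
size <- rnorm(n)
# Two hypothetical shape variables with a built-in size effect.
Y <- cbind(1 + 0.8 * size + rnorm(n, sd = 0.1),
           2 - 0.5 * size + rnorm(n, sd = 0.1))

res <- residuals(lm(Y ~ size))                     # size-free residuals
grand_mean <- matrix(colMeans(Y), nrow = n, ncol = ncol(Y), byrow = TRUE)
Y_adj <- grand_mean + res                          # mean + residuals

# Slopes of the adjusted data on size are zero up to rounding error.
slopes_after <- coef(lm(Y_adj ~ size))["size", ]
slopes_after
```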

### Compare shape spaces
```{r}
# 1.Graphic parameters.
layout(mat = layout.matrix,heights = c(2.5,2.5),widths = c(4,4),respect = TRUE)
# 2.Original
plot(gm.prcomp(A = GPA$coords))
# 3.Adjusted
plot(gm.prcomp(A = Adjusted.Df))
# 4.Mean shapes.
par(mfrow = c(1,2))
tps_grid(GPA$consensus,GPA$consensus,legend = T,legend.text = "Before")
tps_grid(mshape(Adjusted.Df),mshape(Adjusted.Df),legend = T,legend.text = "After")
dev.off()
```

### Morphospace through time
```{r warning=FALSE}
# 1.Data frame.
morphospaceTime <- data.frame(pc1 = PCA$x[,1],pc2 = PCA$x[,2], 
                              pc3 = PCA$x[,3],pc4 = PCA$x[,4],
                              age = Df$subAge,size = log(Csize))

# 2.Convert object into a molten data frame.
time_Melt <- melt(morphospaceTime,id.vars = c("age","size"))
```

### Local morphospace time-series plot
- Some morphotypes and size classes might either be underrepresented or completely missing for some time bins.
- We assume that the observed differences in tooth shape and size between samples of these ages (i.e., late Albian, late Cenomanian, and early Turonian) in part reflect genuine evolutionary change rather than the random effects of sampling and environmental shifts.
- Finally, because of the small sample size, we decided to refrain from interpreting the middle Albian in any detail.
```{r}
# 1. Compute the mean and median morphology of the time-binned distributions along each PC axis.
#    Summary statistics.
st.dat <- time_Melt %>% 
  group_by(age,variable) %>% 
  summarise_each(list(mean,median),pc.axis = value)

# 2. Reorder level.
time_Melt$age = factor(time_Melt$age,levels=rev(c("middle Albian","late Albian",
                                                  "middle Cenomanian","late Cenomanian",
                                                  "early Turonian")))

# 3. Time-series using frequency histograms.
ggplot(time_Melt, aes(x=value)) +
  geom_histogram(bins = 35, col = "black",fill ="black") +
  scale_y_continuous(labels = scales::number_format(accuracy = 1)) +
  ylab("Frequency") +
  geom_vline(data = st.dat, aes(xintercept = pc.axis_fn1), lwd = .5,
             linetype = "solid", color = "#CC3380") +
  geom_vline(data = st.dat, aes(xintercept = pc.axis_fn2),lwd = .5,
             linetype = "dashed", color = "#61A375") +
  stat_summary(aes(x = value, y = 1),fun.data = mean_cl_boot, geom = "errorbar",
               width = .10, lwd = .5,linetype = 1, col = "#CC3380") +
  facet_grid(vars(age),vars(variable),scales = "free",margins = FALSE) +
  theme(axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7),
        legend.position = "none",
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1.5/3)
```

### Family-levels morphospace
```{r warning=FALSE}
# Make an asymmetric color ramp, centered on 1.
ramp1 = colorRampPalette(c("blue","grey"))
ramp2 = colorRampPalette(c("grey","gold","tomato","red", "darkred", "#3D0404"))
# Now specify the range and the center value
min.value=0; mid.value=1;max.value=10

col_per_step=10/max.value
max.breaks = max.value - mid.value
min.breaks = mid.value - min.value
       
low.ramp = ramp1(min.breaks*(col_per_step))
high.ramp = ramp2(max.breaks*(col_per_step))
       
myColors= c(low.ramp, high.ramp)  
# Create breaks corresponding to the color ramp
breaks = 0:length(myColors)/length(myColors) * (max.value - min.value) + min.value

fam.df <- data.frame(pc1 = PCA$x[,1], 
                     pc2 = PCA$x[,2], 
                     pc3 = PCA$x[,3],
                     pc4 = PCA$x[,4],
                     age = Df$subAge,family = Df$Family)

molten.fam <- melt(fam.df,id.vars = c("age","family"))
molten.fam$age <- with(molten.fam,factor(molten.fam$age,levels = rev(levels(molten.fam$age))))
# Plot.
ggplot(molten.fam, aes(value,colour = family, fill = family)) +
  geom_histogram(bins = 35,alpha = 0.2, na.rm = TRUE) +
  scale_y_continuous(labels = scales::number_format(accuracy = 1)) +
  # scale_fill_manual(values = c(myColors),breaks = breaks) +
  # scale_color_manual(values = c(myColors)) +
  scale_fill_viridis(discrete=TRUE,alpha=1) +
  scale_color_viridis(discrete=TRUE,alpha=1) +
  facet_grid(vars(age),vars(variable),scales = "free",margins = FALSE) +
    theme(axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7,angle = 45),
        legend.position = "top",
        legend.text = element_text(size = 7),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1.5/3)
```

### Monognathic tooth heterodonty
The effect of heterodonty is expressed on our main axis of variation (i.e., PC1), which separates anterior from lateroposteriorly situated tooth positions.
```{r}
# Correct spelling.
levels(Df$Relative.Position)[5] <- "lateral"
# Drop empty levels.
pos.df <- Df %>% drop_na(Relative.Position)
pos.df$Relative.Position <- factor(pos.df$Relative.Position)
# Re-level.
pos.df$Relative.Position <- factor(pos.df$Relative.Position,
                                   levels = c("commissural","anterior","anterolateral",
                                              "lateroposterior","lateral","posterior","symphyseal"))
# Data frame.
positions.df <- data.frame(pc1 = PCA$x[rownames(pos.df),1],
                           pc2 = PCA$x[rownames(pos.df),2],
                           pc3 = PCA$x[rownames(pos.df),3],
                           pc4 = PCA$x[rownames(pos.df),4],
                           positions = factor(pos.df$Relative.Position))

pos.melt <- melt(positions.df)
# Colors.
col.d = alpha("deepskyblue4",0.2)
# Plots.
ggplot(na.omit(pos.melt), aes(x = positions,y = value)) +
  geom_boxplot(outlier.color = col.d) + facet_wrap(.~variable,ncol = 2) +
  theme(axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7,angle = 45,vjust = .5),
        legend.position = "top",
        panel.spacing = unit(1, "lines"),
        aspect.ratio = 1.5/3)

# Fit model.
heterodonty.gdf <- geomorph.data.frame(pc1 = PCA$x[rownames(pos.df),1],
                                       pc2 = PCA$x[rownames(pos.df),2],
                                       pc3 = PCA$x[rownames(pos.df),3],
                                       pc4 = PCA$x[rownames(pos.df),4],
                                       positions = factor(pos.df$Relative.Position))
```

```{r message=FALSE, results = "hide"}
# Lm.
he.a <- procD.lm(pc1~positions,iter = 999,RRPP = T,data = heterodonty.gdf)
he.b <- procD.lm(pc2~positions,iter = 999,RRPP = T,data = heterodonty.gdf)
he.c <- procD.lm(pc3~positions,iter = 999,RRPP = T,data = heterodonty.gdf)
he.d <- procD.lm(pc4~positions,iter = 999,RRPP = T,data = heterodonty.gdf)
anova(he.a)
```

### Statistical analyses
**MANOVA**
```{r message=FALSE, results = "hide"}
# RRPP data frame.
rrpp.df <- rrpp.data.frame(pcs = PCA$x,pc1 = PCA$x[,1],
                           pc2 = PCA$x[,2],pc3 = PCA$x[,3],
                           pc4 = PCA$x[,4],
                           size = log(Csize),
                           gp = Df$Order, age = Df$subAge, family = Df$Family)
# Model design (as written, the response is the PC2 scores only).
age.model <- lm.rrpp(f1 = pc2 ~ age,iter = 999,RRPP = TRUE,SS.type = "I",
                     data = rrpp.df)
# Results.
bootID <- (t(simplify2array(age.model$PermInfo$perm.schedule)))[-1,]
anova(age.model, bootID = bootID, cor.type = "shrink")

# Pairwise comparisons of LS-means.
pW <- pairwise(fit = age.model,groups = interaction(Df$subAge))
# Distances between LS-means.
age.names <- rownames(pW$vars)
sum <- summary(pW, confidence = 0.95, test.type = "dist",formula = TRUE,print.progress = FALSE)
# FDR adjustments.
round(matrix(p.adjust(p = sum$pairwise.tables$P,method = "fdr"),5,5,
             dimnames = list(age.names,age.names)),3)
```
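
For reference, a false discovery rate correction can also be applied to the unique pairwise comparisons only. The sketch below is an alternative arrangement, not the procedure used above: it takes a hypothetical symmetric matrix of p-values (`P`, three groups), adjusts the upper triangle with `p.adjust()`, and mirrors the result. This differs from adjusting the full matrix, where the diagonal cells also enter the correction.

```r
groups <- c("A", "B", "C")
# Hypothetical symmetric matrix of pairwise p-values (diagonal set to 1).
P <- matrix(c(1,    0.01, 0.04,
              0.01, 1,    0.03,
              0.04, 0.03, 1),
            nrow = 3, byrow = TRUE, dimnames = list(groups, groups))

ut <- upper.tri(P)
P_adj <- P
P_adj[ut] <- p.adjust(P[ut], method = "fdr")              # adjust each comparison once
P_adj[lower.tri(P_adj)] <- t(P_adj)[lower.tri(P_adj)]     # mirror to the lower triangle
round(P_adj, 3)
```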

**MANCOVA**
```{r results="hide"}
# MANCOVA model design:
ancova <- lm.rrpp(f1 = pcs ~ size+age + size:age,iter = 999,
                  RRPP = TRUE,SS.type = "III",
                  data = rrpp.df)

reveal.model.designs(ancova)

# Define a null model that excludes the factor-covariate interaction.
# This accounts for allometry.
null.model <- lm.rrpp(f1 = pcs ~ size+age,iter = 999,RRPP = TRUE,
                      SS.type = "III",data = rrpp.df)

# Univariate anova statistics for multivariate data.
anova(ancova, effect.type = "F")
# Largest effect.
cf <- coef(ancova, test = TRUE); plot(cf$stat.table$Zd,type = "o")
# Visualize model predictions.
sizeDF <- data.frame(age = age.names)
rownames(sizeDF) <- age.names
sizePreds <- predict(ancova, sizeDF)
plot(sizePreds, pch = 21, cex = 3, bg = c(2,4,1,3,5), lwd = 2)
# Pairwise comparisons of slopes.
pW.av <- pairwise(fit = ancova,fit.null = null.model,
                  groups = interaction(Df$subAge),
                  covariate = log(Csize))

# Distances between slopes.
summary(pW.av, confidence = 0.95, test.type = "dist",formula = TRUE,print.progress = FALSE)
```

### Summary statistics
```{r cache=TRUE,eval=FALSE,results="hide"}
gms(PCA$x[Df$subAge == "middle Albian",1:4]) %>% round(digits = 3)
gms(PCA$x[Df$subAge == "late Albian",1:4]) %>% round(digits = 3)
gms(PCA$x[Df$subAge == "middle Cenomanian",1:4]) %>% round(digits = 3)
gms(PCA$x[Df$subAge == "late Cenomanian",1:4]) %>% round(digits = 3)
gms(PCA$x[Df$subAge == "early Turonian",1:4]) %>% round(digits = 3)
```

```{r}
# Plot results.
sum.stat <- read.xlsx(file = "Supplementary Tables.xlsx",sheetIndex = 6)
sum.stat <- sum.stat[-c(2,11)]
melt.s <- melt(sum.stat, id.vars = c("Axes","Age"))
melt.s$Age <- factor(melt.s$Age, levels = c("middle Albian","late Albian",
                                            "middle Cenomanian","late Cenomanian","early Turonian"))
melt.s %>%
  ggplot(aes(x = variable, y = value, fill = variable)) +
  geom_bar(stat = "identity") + theme_bw() +
  theme(axis.text.x = element_blank(),aspect.ratio = 4*2) + 
  facet_grid(vars(Age),vars(Axes),margins = FALSE,scales="free_y", space="free_x")
```

### KS-test
```{r cache=TRUE,eval=FALSE}
ks.test(PCA$x[Df$subAge == "middle Cenomanian",1],PCA$x[Df$subAge == "late Albian",1])
ks.test(PCA$x[Df$subAge == "middle Cenomanian",1],PCA$x[Df$subAge == "late Cenomanian",1])
ks.test(PCA$x[Df$subAge == "middle Cenomanian",1],PCA$x[Df$subAge == "early Turonian",1])
ks.test(PCA$x[Df$subAge == "late Albian",1],PCA$x[Df$subAge == "late Cenomanian",1])
ks.test(PCA$x[Df$subAge == "late Albian",1],PCA$x[Df$subAge == "early Turonian",1])
ks.test(PCA$x[Df$subAge == "late Cenomanian",1],PCA$x[Df$subAge == "early Turonian",1])
```
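
The six comparisons above cover every pair of the four better-sampled ages. As a sketch of how the same set of tests could be generated programmatically, the example below loops over `combn()` pairs on simulated scores; the data frame `toy` and its values are hypothetical.

```r
set.seed(5)
ages <- c("late Albian", "middle Cenomanian", "late Cenomanian", "early Turonian")
# Hypothetical PC1 scores, 30 per age.
toy <- data.frame(score = c(rnorm(30, 0), rnorm(30, 0.2),
                            rnorm(30, 1), rnorm(30, 1.1)),
                  age = rep(ages, each = 30))

pairs <- combn(ages, 2)                      # 2 x 6 matrix of age pairs
ks_p <- apply(pairs, 2, function(pr) {
  ks.test(toy$score[toy$age == pr[1]],
          toy$score[toy$age == pr[2]])$p.value
})
names(ks_p) <- apply(pairs, 2, paste, collapse = " vs ")
round(ks_p, 4)
```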

### Plot log centroid size (LCS) profile
```{r}
# Data frame.
centroid.df <- data.frame(size = log(Csize), age = Df$subAge, family = Df$Family)
# Plot.
ggplot(centroid.df, aes(x = age, y = size)) +
  geom_point(position = position_jitter(width = .1),alpha = 0.4) +
  geom_boxplot(outlier.size = 3, alpha = 0.8, fill = "white",
               colour = "black") +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar",width = .10, lwd = 1,
               linetype = 1, col = "black") +
  stat_summary(fun = mean, geom = "smooth", aes(group = 1), lwd = 1, col = "grey") +
  stat_summary(fun = "mean", geom = "point",lwd = 2,
               position = position_dodge(width = 1), color = "black", pch = 21) +
  labs(x = "Age",y = "Log centroid size") +
  theme_bw()+
  theme(axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7)) +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1.5/3)
```

### Perform two-sample t-tests on age-specific LCS
```{r cache=TRUE,eval=FALSE,results="hide"}
t.test(centroid.df$size[Df$subAge == "late Albian"],
       centroid.df$size[Df$subAge == "middle Cenomanian"])

t.test(centroid.df$size[Df$subAge == "late Albian"],
       centroid.df$size[Df$subAge == "late Cenomanian"])

t.test(centroid.df$size[Df$subAge == "late Albian"],
       centroid.df$size[Df$subAge == "early Turonian"])

t.test(centroid.df$size[Df$subAge == "middle Cenomanian"],
       centroid.df$size[Df$subAge == "late Cenomanian"])

t.test(centroid.df$size[Df$subAge == "middle Cenomanian"],
       centroid.df$size[Df$subAge == "early Turonian"])

t.test(centroid.df$size[Df$subAge == "late Cenomanian"],
       centroid.df$size[Df$subAge == "early Turonian"])
```
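
The same set of Welch two-sample tests can be obtained in one call with `pairwise.t.test()`. This is a sketch on simulated log centroid sizes (`lcs`, hypothetical values); with `pool.sd = FALSE` and no p-value adjustment, each cell should match the corresponding `t.test()` call.

```r
# Simulated stand-ins (hypothetical values, not the study's measurements).
set.seed(2)
bins <- c("late Albian", "middle Cenomanian", "late Cenomanian", "early Turonian")
age <- factor(rep(bins, times = c(65, 73, 56, 84)), levels = bins)
lcs <- rnorm(length(age), mean = 3, sd = 0.5)

# pool.sd = FALSE runs a separate Welch t-test for every pair of bins.
pw <- pairwise.t.test(lcs, age, p.adjust.method = "none", pool.sd = FALSE)
round(pw$p.value, 3)
```

Rows and columns of `pw$p.value` are labelled by age bin, so a single comparison can be read off by name.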

### Family-level centroid size evaluation
```{r}
ggplot(centroid.df, aes(x = age, y = size, fill = factor(family))) +
  geom_boxplot() + facet_wrap(~family,scales = "free") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  scale_fill_viridis(discrete=TRUE,alpha=0.6) +
  scale_color_viridis(discrete=TRUE,alpha=0.6)
```

```{r}
# Data frame.
Df$Family <- factor(Df$Family)
pie.frame <- as.data.frame(table(Df$Family,Df$subAge))
names(pie.frame)[1] <- "Family"
names(pie.frame)[2] <- "Age"
names(pie.frame)[3] <- "Count"
# Plot.
pie.families <- ggplot(pie.frame, aes(x = "", y = Count,fill = Family)) +
  geom_bar(stat = "identity",position = "fill") + theme_bw() + facet_wrap(.~Age,nrow = 1)
# Pie chart.
pie.families + coord_polar(theta = "y") +
  theme(axis.title = element_text(size = 7),
        strip.text.x = element_text(size = 7, face = "bold"),
        axis.text.y = element_text(size = 7),
        axis.text.x = element_text(size = 7),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1) + 
  scale_fill_viridis(discrete=TRUE,alpha=0.6) +
  scale_color_viridis(discrete=TRUE,alpha=0.6)
```
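
The `position = "fill"` argument above rescales the counts within each age bin to proportions. A minimal sketch of that calculation with `prop.table()`, using a small made-up set of family and age labels (hypothetical data):

```r
# Toy specimen table (hypothetical labels and counts).
family <- c("Anacoracidae", "Anacoracidae", "Odontaspididae",
            "Anacoracidae", "Carchariidae", "Odontaspididae")
age <- c("late Cenomanian", "late Cenomanian", "late Cenomanian",
         "early Turonian", "early Turonian", "early Turonian")

# Cross-tabulate families by age, then convert each column to proportions.
counts <- table(Family = family, Age = age)
props <- prop.table(counts, margin = 2)
round(props, 2)
```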

### Procrustes variance analysis with rarefaction
- Our results reveal both environmental and ontogenetic controls on the increase in (eco)-morphological disparity from the late Albian to the late Cenomanian.
- At the order level, lamniform sharks experienced no significant change in disparity across the late Cenomanian–early Turonian transition.
- Estimates of partial disparity nevertheless revealed a substantial increase among small-sized carchariids in the early Turonian, coinciding with reduced bottom-water anoxia.

- Additive modelling provided no evidence that collection method (i.e., surface vs. bulk sampling) significantly affects disparity over time (P = 0.133).
- To test the effect of size on disparity, centroid size (CS) was included in the model as a covariate.
```{r message=FALSE, warning=FALSE}
# Disparity data frame.
gdf <- rrpp.data.frame(coords = GPA$coords[,,rownames(Df)],
                       age = factor(Df$subAge),
                       size = log(Csize),
                       families = factor(Df$Family),
                       collection = factor(Df$Collection),
                       deposition = factor(Df$Deposition),
                       oxygen = factor(Df$Oxygen.condition))
# Linear Model.
f <- lm.rrpp(f1 = two.d.array(coords) ~ size,iter = 999,RRPP = T,data = gdf,print.progress = F)
# The size coefficient is significant; this does not mean, however, that the covariate needs to be dropped.
anova(f)
# Models.
a.disp.a <- morphol.disparity(coords ~ age,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.b <- morphol.disparity(coords ~ size*age,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.c <- morphol.disparity(coords ~ size+age + size:age,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.d <- morphol.disparity(coords ~ age+collection,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.e <- morphol.disparity(coords ~ age+collection,groups = ~age+collection,iter = 999,data = gdf,print.progress = FALSE)
a.disp.f <- morphol.disparity(coords ~ collection,groups = ~collection,iter = 999,data = gdf,print.progress = FALSE)
a.disp.g <- morphol.disparity(coords ~ age+families,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.h <- morphol.disparity(coords ~ age+size+families,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.i <- morphol.disparity(coords ~ deposition,groups = ~deposition,iter = 999,data = gdf,print.progress = FALSE)
a.disp.j <- morphol.disparity(coords ~ age+deposition,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.k <- morphol.disparity(coords ~ age+oxygen,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)
a.disp.l <- morphol.disparity(coords ~ age+deposition+collection,groups = ~age,iter = 999,data = gdf,print.progress = FALSE)

# Compare models.
layout(matrix(1,1,1),respect = T)
plot(a.disp.a$Procrustes.var[c(4,2,5,3,1)], type = "o", ylim = c(0,0.1),
     ylab = "PV",main = "Model Comparisons", cex = 1)
lines(a.disp.b$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "red")
lines(a.disp.c$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "blue")
lines(a.disp.d$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "green", lty = 2)
lines(a.disp.g$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "brown")
lines(a.disp.h$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "purple")
lines(a.disp.j$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "gold2")
lines(a.disp.k$Procrustes.var[c(4,2,5,3,1)], type = "o",col = "grey80")

legend(x = 1, y = 0.025, 
       legend = c("Age","Age*Size","Age+Size + Size:Age","Age+Collection","Age+Families","Age+Size+Families","Age+Deposition","Age+Oxygen"),
       cex = .5,lty = c(1,1,1,2,1,1,1,1),pch = 1, col = c("black","red","blue","green","brown","purple","gold2","grey80"))

# Determine the best fit of disparity models.
a1 <- lm.rrpp(f1 = two.d.array(coords) ~ age,iter = 999,RRPP = T,data = gdf,print.progress = F)
a2 <- lm.rrpp(f1 = two.d.array(coords) ~ age+size,iter = 999,RRPP = T,data = gdf,print.progress = F)
a3 <- lm.rrpp(f1 = two.d.array(coords) ~ age+size + age:size,iter = 999,RRPP = T,data = gdf,print.progress = F)
a4 <- lm.rrpp(f1 = two.d.array(coords) ~ age+collection,iter = 999,RRPP = T,data = gdf,print.progress = F)
a5 <- lm.rrpp(f1 = two.d.array(coords) ~ collection,iter = 999,RRPP = T,data = gdf,print.progress = F)
a6 <- lm.rrpp(f1 = two.d.array(coords) ~ age+families,iter = 999,RRPP = T,data = gdf,print.progress = F)
a7 <- lm.rrpp(f1 = two.d.array(coords) ~ age+size+families,iter = 999,RRPP = T,data = gdf,print.progress = F)
a8 <- lm.rrpp(f1 = two.d.array(coords) ~ age+deposition,iter = 999,RRPP = T,data = gdf,print.progress = F)
```

```{r message=FALSE, warning=FALSE,cache=TRUE, eval=FALSE}
# ANOVA tables for each model (the AIC and log-likelihood notes refer to the model comparison below).
anova(a1) # Lowest AIC.
anova(a2)
anova(a3) # Highest log-likelihood.
anova(a4) # Two-factor model: collection differences do not explain variation in disparity over time.
anova(a5)
anova(a6)
anova(a7)
anova(a8)

anova(a1,a2,a3,a4,a6,a7,a8)

# Model Comparisons, in terms of the log-likelihood or covariance trace
modComp <- model.comparison(a1,a2,a3,a4,a6,a7,type = "logLik")
summary(modComp)
plot(modComp$table[,4], ylab = "AICc",ylim = c(-117139.1,-111759.8))
# Only factorial models.
modComp.select <- model.comparison(a1,a4,type = "logLik")
summary(modComp.select)

# Bootstrapped disparity (without a covariate) and rarefaction.
Disp.model <- error.plot(gpa.coords = GPA$coords[,,rownames(Df)],blank = FALSE,
                         groups = Df$subAge,
                         order = c(1:5),replicates = 999,
                         rarefy.par = list(min.N = 11,reps = 999))
# Permutation statistics.
disparity.calc(gpa = GPA,ages = "subAge",data = Df)

# Export rarefaction results.
lapply(names(Disp.model$rarefaction.results), 
       function(x) write.xlsx(Disp.model$rarefaction.results[[x]],
                              'output.xlsx', sheetName=x, append=TRUE))
```
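
For readers without access to the helper functions used above, the following base-R sketch illustrates the general idea of Procrustes variance and sample rarefaction. It assumes that disparity is the summed squared distance to the group mean divided by group size, and that rarefaction subsamples without replacement down to the smallest bin (N = 11). The data, `proc.var()` and `rarefy()` are hypothetical and are not the functions used in this study.

```r
set.seed(3)
# Simulated stand-in: 40 specimens x 6 shape variables (hypothetical).
Y <- matrix(rnorm(40 * 6, sd = 0.05), nrow = 40)
grp <- rep(c("A", "B"), times = c(11, 29))

# Procrustes variance: summed squared distances to the group mean, divided by n.
proc.var <- function(M) sum(scale(M, center = TRUE, scale = FALSE)^2) / nrow(M)

# Observed disparity per group.
obs <- sapply(split(as.data.frame(Y), grp), function(d) proc.var(as.matrix(d)))

# Rarefaction: repeatedly subsample a group down to min.N specimens.
rarefy <- function(M, min.N, reps = 999) {
  replicate(reps, proc.var(M[sample(nrow(M), min.N), , drop = FALSE]))
}
rare.B <- rarefy(Y[grp == "B", ], min.N = 11, reps = 999)

# Mean rarefied disparity with a 95% percentile interval.
c(mean = mean(rare.B), quantile(rare.B, c(0.025, 0.975)))
```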

### First-order sensitivity analysis
```{r cache=TRUE, eval=FALSE}
# Data exclusion of C. ricki from the middle Cenomanian.
table(Df$Species,Df$subAge)
table(Df$subAge)
subData <- Df[Df$Locale != "C-Y Creek, Giralia Anticline", ]
# Data frame.
ex.Disp <- geomorph.data.frame(coords = GPA$coords[,,rownames(subData)],
                               age = factor(subData$subAge))
# Disparity.
morphol.disparity(coords ~ age,groups = ~age,iter = 999,data = ex.Disp,print.progress = FALSE)$Procrustes.var
```

### Second-order sensitivity analysis
Compared with odontaspidid and carchariid teeth, anacoracid teeth are less prone to post-mortem breakage of the crown and are therefore somewhat over-represented. 
```{r cache=TRUE, eval=FALSE}
# Exclude the over-represented taxon (i.e., Squalicorax mutabilis: N=61) to assess its effect on patterns of disparity through time.
# Late Cenomanian (N=28) and early Turonian (N=52).
S.mutabilis <- which(Df$Species == "mutabilis")
ex.Df <- Df[-S.mutabilis, ]
# Frame.
ex.gm <- geomorph.data.frame(coords = GPA$coords[,,rownames(ex.Df)],
                             age = factor(ex.Df$subAge),
                             size = log(Csize[rownames(ex.Df)]))
# Disparity.
dip.sm <- morphol.disparity(coords ~ age,groups = ~age,
                            iter = 999,data = ex.gm,
                            print.progress = FALSE)
```

### Third-order sensitivity analysis
```{r cache=TRUE, eval=FALSE}
# Exclude all juvenile specimens from the dataset.
juvenile <- which(Df$Ontogeny == "juvenile")
juven.df <- Df[-juvenile, ]
# Data frame.
ex.juv <- geomorph.data.frame(coords = GPA$coords[,,rownames(juven.df)],
                              age = factor(juven.df$subAge))
# Disparity.
dip.ju <- morphol.disparity(coords ~ age,groups = ~age,iter = 999,data = ex.juv,
                            print.progress = FALSE)
```

### Plot main tooth disparity and rarefaction results
```{r warning=FALSE}
# Import results.
disp.res <- read.xlsx(file = "Supplementary Tables.xlsx",sheetIndex = 2)
disp.res$Age <- factor(disp.res$Age,levels = c("middle Albian","late Albian",
                                               "middle Cenomanian",
                                               "late Cenomanian",
                                               "early Turonian"))
# Create space between levels.
pd <- position_dodge(width = 2.5)
# Plot.
ggplot(disp.res, aes(x = Age, y = Disparity, group = Clade)) +
  scale_x_discrete() +
  # Raw disparity.
  geom_line(stat = "identity", size = 1,color = "black") +
  geom_point(size = 3, shape = 21, fill = "white", color = "black") +
  geom_errorbar(aes(ymin = LowerBoot.PI ,ymax = UpperBoot.PI), width = .2,
                position = pd, show.legend = FALSE,size = .3) +
  # Exclude juvenile.
  geom_point(mapping = aes(x = Age,y = Ex.Juvenile),
             data = disp.res, size = 3, shape = 21,
             fill = "orange", color = "orange") +
  # Rarefied disparity.
  geom_point(mapping = aes(x = Age,y = Rarefied),
             data = disp.res, size = 3, shape = 17,
             fill = "white", color = "#CC3380E6") +
  geom_errorbar(aes(ymin = lower.PI ,ymax = upper.PI), width = .2,
                show.legend = FALSE,size = .3,
                color = "#CC3380E6") +
  ylab("Procrustes Variance") + xlab("") +
  theme_bw() +
  theme(axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7)) +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 2/3)
```

### Sample rarefaction profiles
```{r}
# Import results.
rare.res <- read.xlsx(file = "Supplementary Tables.xlsx",sheetIndex = 5)
rare.res$Age <- factor(rare.res$Age, levels = c("middle Albian","late Albian",
                                                "middle Cenomanian","late Cenomanian",
                                                "early Turonian"))
# Plot.
ggplot(data = rare.res,mapping = aes(x = N,y = rare.disp,colour = Age,group = Age)) + 
  geom_line(alpha = .8) + geom_errorbar(aes(ymin = lower.PI, ymax =  upper.PI)) +
  geom_point(shape = 21, fill = "white") +
  scale_x_continuous(breaks = seq(from = 11,to = 84,by = 5)) +
  scale_colour_hp_d(option = "ronweasley2",direction = -1) +
  labs(title = "Sample rarefaction") +
  theme_cleveland() +
  theme(legend.position = "right",
        strip.text.x = element_text(size = 7, face = "bold"),
        axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7)) +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1.5/1.5)
```

### Fourth-order sensitivity analysis
```{r message=FALSE,warning=FALSE,cache=TRUE,eval=FALSE}
# Re-import and prepare data frame again.
Sen.Df <- read.xlsx(file = "Tooth Dataset.xlsx",sheetIndex = 1)
Sen.Df <- subset(x = Sen.Df,subset = Sen.Df$Scale == "x")
rownames(Sen.Df) <- paste(Sen.Df$File.Name,Sen.Df$File.Type,sep = "")
# Coarser time binning scheme.
# Combine the middle and late Albian specimens into one coarser time-bin.
# Similarly, do the same for the Cenomanian specimens.

# Remove unused level.
mlC <- which(Sen.Df$Age == "Albian/Cenomanian")
Sen.Df <- Sen.Df[-mlC, ]
# Disparity model.
sen.disp.a <- error.plot(gpa.coords = GPA$coords[,,rownames(Sen.Df)],blank = FALSE,
                         groups = Sen.Df$Age,
                         order = c(1:3),replicates = 999,
                         rarefy.par = list(min.N = 76,reps = 999))
# Permutation statistics.
disparity.calc(gpa = GPA,ages = "Age",data = Sen.Df)
```

### Fifth-order sensitivity analysis
```{r message=FALSE, results="hide"}
# Subset data frame: 56 specimens.
late.Cenomanian <- Df[Df$subAge == "late Cenomanian", ]
# Subset by ontogeny: 32 specimens.
lc.ontogeny <- late.Cenomanian[complete.cases(late.Cenomanian[,29]),]
# Geomorph data frame.
tos.a <- geomorph.data.frame(coords = GPA$coords[,,rownames(lc.ontogeny)],
                             ontogeny = factor(lc.ontogeny$Ontogeny),
                             position = factor(lc.ontogeny$Relative.Position),
                             size = log(Csize[rownames(lc.ontogeny)]),
                             family = factor(lc.ontogeny$Family),
                             genera = factor(lc.ontogeny$Genus),
                             species = factor(lc.ontogeny$Species))

# Is size related to tooth position? NO.
m0 <- procD.lm(coords ~ 1,data = tos.a)
m1 <- procD.lm(coords ~ size+position,data = tos.a)
anova(m1)
# Alt. models.
m2 <- procD.lm(coords ~ size+ontogeny+position+family,iter = 999,RRPP = T,data = tos.a)
m3 <- procD.lm(size ~ position,data = tos.a)

model.comparison(m0,m1,m2,type = "logLik")

# PV in the full dataspace: n = 56.
m4 <- morphol.disparity(f1 = GPA$coords[,,rownames(late.Cenomanian)]~1,iter = 999)
# PV in reduced dataspace: n= 32.
m5 <- morphol.disparity(f1 = coords~1,iter = 999,data = tos.a)
# PV in reduced dataspace (with covariates and factors).
m6 <- morphol.disparity(f1 = coords~ontogeny,iter = 999,data = tos.a)
m7 <- morphol.disparity(f1 = coords~position,iter = 999,data = tos.a)
m8 <- morphol.disparity(f1 = coords~family,iter = 999,data = tos.a)
m9 <- morphol.disparity(f1 = coords~ontogeny+position,iter = 999,data = tos.a)
m10 <- morphol.disparity(f1 = coords~ontogeny+position+family,iter = 999,data = tos.a)
# Partial disparity of late Cenomanian adult and juvenile specimens.
m11 <- morphol.disparity(f1 = coords~1,groups = ~ontogeny,iter = 999,partial = T,data = tos.a)
```

```{r}
# Assemble disparity results.
res.df <- data.frame(Models = c("Full data","Reduced data","Ontogeny","Position",
                                "Family","Ontogeny+Position","Ontogeny+Position+Family"),
                     PV = c(0.06638706,0.06119887,0.06000987,
                            0.0256451,0.04606875,0.02522685,
                            0.01165321))

res.df$Models <- factor(res.df$Models, levels = c("Full data","Reduced data","Ontogeny","Position",
                                                  "Family","Ontogeny+Position","Ontogeny+Position+Family"))
# Plot.
ggplot(res.df,aes(x = Models, y = PV, fill = Models)) + 
  geom_bar(stat="identity", color = "grey") +
  labs(title = "Effect of nominal factors on tooth disparity",subtitle = "late Cenomanian") +
  ylab("Procrustes Variance") +
  scale_color_discrete_sequential(palette = "Red-Blue",nmax = 7) +
  scale_fill_discrete_sequential(palette = "Red-Blue",nmax = 7) +
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        aspect.ratio = 1)
```

### Plot alternative disparity results
```{r message=FALSE,warning=FALSE,cache=TRUE,eval=FALSE}
# Import results.
s1.Df <- read.xlsx(file = "Supplementary Tables.xlsx",sheetIndex = 3)
s1.Df$Age <- factor(s1.Df$Age,levels = c("Albian","Cenomanian","Turonian"))
# Plot.
ggplot(s1.Df, aes(x = Age, y = Disparity, group = Clade)) +
  scale_x_discrete() +
  # Raw disparity
  geom_line(stat = "identity", size = 1, color = "black") +
  geom_point(size = 3, shape = 21, fill = "white", color = "black") +
  geom_errorbar(aes(ymin = LowerBoot.PI ,ymax = UpperBoot.PI), width = .2,
                position = pd, show.legend = FALSE,size = .3) +
  # Rarefied disparity
  geom_point(mapping = aes(x = Age,y = Rarefied),
             data = s1.Df, size = 3, shape = 17,
             fill = "white", color = "#CC3380E6") +
  geom_errorbar(aes(ymin = lower.PI ,ymax = upper.PI), width = .2,
                show.legend = FALSE,size = .3,
                color = "#CC3380E6") +
  labs(title = "Lamniformes dental disparity",subtitle = "Three-stage binning scheme") +
  ylab("Procrustes Variance") + xlab("") +
  theme(axis.text.y = element_text(angle = 90, size = 7),
        axis.text.x = element_text(size = 7)) +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 1.5/3)

# Taxonomic composition by age.
Sen.Df$Family <- factor(Sen.Df$Family)
tax.frame <- as.data.frame(table(Sen.Df$Family,Sen.Df$Age))
names(tax.frame)[1] <- "Family"
names(tax.frame)[2] <- "Age"
names(tax.frame)[3] <- "Count"
# Plot.
pie.fam <- ggplot(tax.frame, aes(x = "", y = Count,fill = Family)) +
  geom_bar(stat = "identity",position = "fill") + theme_bw() + facet_wrap(.~Age,nrow = 1)
# Pie chart.
pie.fam + coord_polar(theta = "y") +
  theme(axis.title = element_text(size = 7),
        strip.text.x = element_text(size = 7, face = "bold"),
        axis.text.y = element_text(size = 7),
        axis.text.x = element_text(size = 7),
        panel.spacing = unit(0, "lines"),
        aspect.ratio = NULL,
        legend.position = "none") + 
  scale_fill_viridis(discrete=TRUE,alpha=0.6) +
  scale_color_viridis(discrete=TRUE,alpha=0.6)
```

```{r cache=TRUE, eval=FALSE}
# Does tooth shape differ between families? Yes.
frame.x <- geomorph.data.frame(coords = GPA$coords[,,rownames(Df)],families = factor(Df$Family))
model.xx <- procD.lm(coords ~ families, iter = 999, RRPP = T,data=frame.x)
anova(model.xx)
```

### Partial disparity
```{r message=FALSE,eval=FALSE}
# 1. Split data frame by Age.
split.ls <- split(Df, factor(Df$subAge))
# 2. Partial disparity.
partial.disps <- vector(mode = "list", length = length(split.ls))
for(i in seq_along(split.ls)) {
  levs <- droplevels(split.ls[[i]][,10])
  gdf.split <- geomorph.data.frame(coords = GPA$coords[,,rownames(split.ls[[i]])],
                                   size = log(Csize[rownames(split.ls[[i]])]),
                                   grp = levs)
  # Partial disparity needs at least two groups; otherwise compute the overall disparity.
  if(length(levels(levs)) < 2) {
    disp <- morphol.disparity(coords ~ 1, data = gdf.split)
  } else {
    disp <- morphol.disparity(coords ~ 1, groups = ~grp, data = gdf.split, partial = TRUE)
  }
  partial.disps[[i]] <- disp
}
```
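
As an illustration of what a partial disparity measures, the sketch below computes it by hand on simulated data. It assumes the Foote-style definition, in which each group's summed squared distances from the grand mean are divided by the total sample size minus one, so that the group contributions add up to the total variance. All object names and values are hypothetical.

```r
set.seed(4)
# Simulated stand-in: 30 specimens x 4 shape variables, three families (hypothetical).
Y <- matrix(rnorm(30 * 4), nrow = 30)
grp <- factor(rep(c("fam1", "fam2", "fam3"), times = c(5, 10, 15)))

# Squared distance of every specimen from the grand mean.
d2 <- rowSums(scale(Y, center = TRUE, scale = FALSE)^2)

# Partial disparity: each group's summed squared distances over (N - 1).
partial <- tapply(d2, grp, sum) / (nrow(Y) - 1)

# Total disparity: trace of the sample covariance matrix.
total <- sum(diag(var(Y)))

partial
sum(partial) - total  # zero up to rounding error
```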

### Plot family-level disparity results
```{r}
# Import results.
family.disp <- read.xlsx(file = "Supplementary Tables.xlsx",sheetIndex = 4)
# Plot.
family.disp$Time <- factor(x = family.disp$Time,
                           levels=rev(c("early Turonian","late Cenomanian",
                                        "middle Cenomanian","late Albian",
                                        "middle Albian")))
# Plot.
ggplot(data = family.disp, mapping = aes(x = Time, y = Disparity,color = Family,fill = Family)) + 
  geom_bar(stat="identity") +
  coord_flip() +
  xlab("Age") +
  scale_fill_viridis(discrete=TRUE,alpha=0.6) +
  scale_color_viridis(discrete=TRUE,alpha=0.6) +
  theme(axis.title = element_text(color = "#666666", face = "bold", size = 7),
        axis.text.y = element_text(angle = 0, size = 7),
        axis.text.x = element_text(size = 7),
        legend.position = "right",
        panel.spacing = unit(0, "lines"),
        aspect.ratio = 2/3)
```

### Alternative disparity metrics
```{r message=FALSE,warning=FALSE,cache=TRUE,eval=FALSE}
# From geomorph to dispRity.
disp.gdf <- geomorph.data.frame(coords = GPA$coords[,,rownames(Df)],age = factor(Df$subAge))
disp.obj <- geomorph.ordination(disp.gdf)
# Displacements.
disp.range <- dispRity(boot.matrix(disp.obj,rarefaction = TRUE,bootstraps = 999),metric = displacements)
# Median pairwise euclidean distance.
disp.pairwise <- dispRity(boot.matrix(disp.obj,rarefaction = TRUE,bootstraps = 999),metric = c(median, pairwise.dist))
# Plot results.
layout(matrix(c(1,2),nrow = 1,ncol = 2,byrow = F))
plot(disp.range, las=2,type = "line",rarefaction = 5, xlab = "",ylab = "Displacements")
plot(disp.pairwise, las=2,type = "line",rarefaction = 5, xlab = "",ylab = "Median pairwise euclidean distance",cent.tend = mean)
dev.off()
# Rarefaction graphs.
plot(disp.range,rarefaction = T)
plot(disp.pairwise,rarefaction = T)
```
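
The median pairwise Euclidean distance used above can be reproduced in base R. This is a minimal sketch with simulated ordination scores (`scores`, hypothetical) and a simple non-parametric bootstrap; it is not the dispRity implementation.

```r
set.seed(5)
# Simulated stand-in: 25 specimens x 3 ordination axes (hypothetical).
scores <- matrix(rnorm(25 * 3), nrow = 25)

# Median of all pairwise Euclidean distances between specimens.
med.pair <- function(M) median(as.vector(dist(M, method = "euclidean")))
observed <- med.pair(scores)

# Bootstrap: resample specimens with replacement and recompute the metric.
boot <- replicate(500, med.pair(scores[sample(nrow(scores), replace = TRUE), , drop = FALSE]))
quantile(boot, c(0.025, 0.5, 0.975))
```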

# Maximum likelihood models of evolution
```{r}
sub <- Df$Sub
age <- Df$Age
# New column: combine the sub-stage and stage labels (e.g., "late" and "Albian").
Df$subAge.Num <- NULL
Df$subAge.Num <- factor(paste(sub, age),levels = c("middle Albian","late Albian",
                                                   "middle Cenomanian",
                                                   "late Cenomanian",
                                                   "early Turonian"))
# Relabel the five bins with ordinal values (oldest to youngest).
# Note: as.numeric() on this factor returns the integer codes 1-5, which are used as time steps below.
midPoint.value <- c(0:4)
levels(Df$subAge.Num) <- midPoint.value

# Combine centroid size and age data.
size.frame <- data.frame(species=names(Csize), value=log(Csize), row.names=NULL)
withfactors <- cbind(size.frame,Df$subAge.Num)
colnames(withfactors)[3] <- "Age"
# Data frame.
paleoTSdata <- as.data.frame(withfactors[,c(1,2,3)])
# Sub-split the data for time-series analyses.
calc.obj <- ddply(paleoTSdata,~Age, summarise,mean = mean(value),var = var(value))
# Count number of specimens in each time bin.
counts <- count(paleoTSdata,c("Age")) 
# All the data needed for paleoTS input.
dataforpaleoTS <- cbind(calc.obj,counts$freq) 
# Change names.
colnames(dataforpaleoTS) <- c('AGE','mean','var','freq')
# Make paleots object.
paleotTSanalyses <- as.paleoTS(mm = dataforpaleoTS$mean, 
                               vv = dataforpaleoTS$var, 
                               nn = dataforpaleoTS$freq, 
                               tt = as.numeric(dataforpaleoTS$AGE))

# Fit a set of standard evolutionary models.
evo.models <- fit4models(y = paleotTSanalyses,silent = TRUE,method = c("Joint"), pool = F)
evo.models$modelFits # Best-supported model: stasis.
```
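
The per-bin inputs required for the time series (mean, variance and sample size) can also be assembled with base R only. A minimal sketch on a toy data set (`toy`, hypothetical values):

```r
# Toy log centroid sizes for three time bins (hypothetical values).
toy <- data.frame(Age = factor(c(1, 1, 1, 2, 2, 3, 3, 3, 3)),
                  value = c(2.0, 2.2, 2.4, 3.0, 3.4, 1.0, 1.2, 1.4, 1.6))

# Mean, variance and number of specimens per bin.
bins <- data.frame(AGE  = levels(toy$Age),
                   mean = as.vector(tapply(toy$value, toy$Age, mean)),
                   var  = as.vector(tapply(toy$value, toy$Age, var)),
                   freq = as.vector(table(toy$Age)))
bins
```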

```{r}
# Data frame.
AIC.df <- data.frame(models = rownames(evo.models$modelFits)[-4],
                     AICs = evo.models$modelFits$AICc[-4])
# Plot.
ggplot(data=AIC.df, aes(x=models, y=AICs)) + 
  geom_bar(stat="identity") +
  theme(panel.spacing = unit(0, "lines"),
        aspect.ratio = 2/3)
```
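
AICc values are easier to compare once converted to Akaike weights. The sketch below shows the arithmetic on made-up AICc values; the model labels and numbers are hypothetical and are not the results of this study.

```r
# Hypothetical AICc values for four candidate models.
aicc <- c(GRW = 12.4, URW = 10.1, Stasis = 6.3, OU = 14.0)

# Difference from the best (lowest) AICc.
delta <- aicc - min(aicc)

# Akaike weights: relative likelihoods rescaled to sum to one.
w <- exp(-0.5 * delta) / sum(exp(-0.5 * delta))
round(w, 3)
```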

### Plot mean morphology through time
```{r Plot mean morphology through time}
# Compute manually and plot with ggplot.
x <- pool.var(paleotTSanalyses, ret.paleoTS = TRUE)
se <- sqrt(x$vv/x$nn)
lci <- x$mm - (1 * se)
uci <- x$mm + (1 * se)
# Data frame.
modelFit.df <- data.frame(time = 1:5,mean = x$mm,se = se,lci = lci, uci = uci)
# Plot.
ggplot(modelFit.df, aes(x=time, y=mean)) + 
  geom_errorbar(aes(ymin = lci ,ymax = uci), width = .3,
                show.legend = FALSE,size = .3) +
  geom_line() +
  geom_point() +
  theme(panel.spacing = unit(0, "lines"),
        aspect.ratio = 1) +
  coord_flip()
```
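
The error bars above are one standard error wide and rely on a variance pooled across time bins. The sketch below spells out that arithmetic, assuming the pooled variance is the average of the within-bin variances weighted by their degrees of freedom (n - 1). The helper `pooled()` and the input values are hypothetical.

```r
# Hypothetical per-bin variances and sample sizes (not the study's values).
vv <- c(0.20, 0.30, 0.25)
nn <- c(11, 65, 73)

# Pooled variance: within-bin variances weighted by their degrees of freedom.
pooled <- function(v, n) sum(v * (n - 1)) / sum(n - 1)
vp <- pooled(vv, nn)

# Standard error of each bin mean under the pooled variance.
se <- sqrt(vp / nn)
round(se, 3)
```

Because the variance is shared across bins, differences in error-bar width in this sketch reflect sample size only.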

### Taxon specific size trajectories
- We test for dwarfism in lamniform species as an evolutionary response to local anoxia across the late Cenomanian-early Turonian boundary.
- Species considered are *Johnlongia allocotodon* and *Squalicorax mutabilis*. 
```{r}
# Subset.
CT.Df <- Df[paste(Df$Genus, Df$Species) %in% c("Johnlongia allocotodon","Squalicorax mutabilis"), ]
CT.Size <- data.frame(genera = factor(CT.Df$Genus),time = factor(CT.Df$subAge), size = log(Csize[rownames(CT.Df)]))
# Plot.
ggplot(CT.Size,aes(time,size, fill = genera)) +
  geom_point(alpha = 0.5) +
  geom_boxplot(outlier.size = 3, alpha = 0.2, fill = "white", colour = "black") +
  stat_summary(fun = "mean", geom = "point",lwd = 2,
               position = position_dodge(width = 1),pch = 21) +
  stat_summary(fun.data = mean_cl_boot, geom = "errorbar",width = .10, lwd = 1,
               linetype = 1, col = "black") +
  facet_wrap(~genera, scales="free") +
  theme(aspect.ratio = 1,
        legend.position = "none")

# Statistical test.
aov.size <- aov(size~genera/time,data = CT.Size)
TukeyHSD(aov.size)
```

### Generic Richness
```{r}
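# Number of unique genera per age bin.
genera <- tapply(Df$Genus,Df$subAge,FUN = function(x) length(unique(x)))
# Order the bins chronologically by name rather than by position.
age.order <- c("middle Albian","late Albian","middle Cenomanian","late Cenomanian","early Turonian")
plot(genera <- as.vector(genera[age.order]),type = "l",ylab = "Genera",xlab = "Ages")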
```

### Computing an average tooth shape for each family
```{r eval=FALSE, results="hide"}
Y.means <- rowsum(two.d.array(GPA$coords[,,rownames(Df)]), Df$Family)/as.vector(table(Df$Family))
F.array <- arrayspecs(A = Y.means,p = 160,k = 2)
```
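
The `rowsum()`/`table()` idiom above computes group means in one step. A small worked example on a toy matrix (hypothetical values) shows how it behaves; note that both functions return groups in sorted order, which is what makes the row-wise division line up.

```r
# Toy data: four specimens, two variables, two families (hypothetical).
Y <- matrix(c(1, 2,
              3, 4,
              5, 6,
              7, 8), ncol = 2, byrow = TRUE)
fam <- c("b", "a", "b", "a")

# Column sums per family divided by the number of specimens per family.
means <- rowsum(Y, fam) / as.vector(table(fam))
means
```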

```{r, eval=FALSE, results="hide"}
# Plot mean TPS-grids.
par(mfrow = c(4,3))
# Family names (renamed to avoid masking base::colnames).
fam.names <- dimnames(F.array)[[3]]
for (i in 1:10) {
  x <- F.array[,,i]
  tps_grid(x,x,grid.size = 1,shp = T,
           legend = F,shp.lwd = 6,legend.text = fam.names[i]) 
}
```

### Hierarchical Clustering
```{r eval=FALSE, results="hide"}
# Euclidean distance.
fam.dist <- dist(two.d.array(F.array),method = "euclidean")
# Dendrogram.
hc <- hclust(fam.dist,method = "average") %>% as.dendrogram()
# Plot.
ggdendrogram(hc,rotate = T)
```
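
To show what average-linkage clustering on Euclidean distances returns, here is a minimal sketch on four made-up mean shapes (`shapes`, hypothetical coordinates reduced to two variables each).

```r
# Toy mean shapes for four families (hypothetical values).
shapes <- rbind(fam1 = c(0, 0), fam2 = c(0, 1), fam3 = c(5, 5), fam4 = c(5, 6))

# Average-linkage (UPGMA) clustering on Euclidean distances.
hc.toy <- hclust(dist(shapes, method = "euclidean"), method = "average")

# Cut the tree into two clusters and inspect the membership.
cl <- cutree(hc.toy, k = 2)
cl
```

In this toy case the two nearby pairs of families fall into separate clusters, which is the pattern a dendrogram of the same object would display.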

***References***


### Session Information
```{r}
devtools::session_info()
```