Disparity analysis involves manipulating many matrices (especially when bootstrapping), which can be impractical to visualise and will quickly overwhelm your R console.
Even the simple Beck and Lee 2014 example above produces an object with > 72 lines of lists of lists of matrices!
dispRity uses a specific class of object called a dispRity object.
These objects allow users to use S3 method functions such as summary.dispRity, plot.dispRity and print.dispRity.
dispRity also contains various utility functions that manipulate the dispRity object (e.g. extract.dispRity; see the full list in the next section).
These functions modify the dispRity object without having to delve into its complex structure!
The full structure of a dispRity object is detailed here.
## Loading the example data
data(disparity)

## What is the class of the median_centroids object?
class(disparity)
## [1] "dispRity"

## What does the object contain?
names(disparity)
## [1] "matrix"    "tree"      "call"      "subsets"   "disparity"

## Summarising it using the S3 method print.dispRity
disparity
##  ---- dispRity object ----
## 7 continuous (acctran) time subsets for 99 elements in one matrix with 97 dimensions with 1 phylogenetic tree
##     90, 80, 70, 60, 50 ...
## Data was bootstrapped 100 times (method:"full") and rarefied to 20, 15, 10, 5 elements.
## Disparity was calculated as: c(median, centroids).
Note that it is always possible to display the full object using the argument all = TRUE in print.dispRity:
## Display the full object
print(disparity, all = TRUE)
## This is nearly 5000 lines on my 13 inch laptop screen!
The package also provides some utility functions to facilitate multidimensional analysis.
The first set of utilities are functions for manipulating dispRity objects.
make.dispRity creates empty dispRity objects.
## Creating an empty dispRity object
make.dispRity()
## Empty dispRity object.

## Creating an "empty" dispRity object with a matrix
(disparity_obj <- make.dispRity(matrix(rnorm(20), 5, 4)))
##  ---- dispRity object ----
## Contains a matrix 5x4.
fill.dispRity initialises a dispRity object and generates its call properties.
## The dispRity object's call is indeed empty
disparity_obj$call

## Filling an empty disparity object (that needs to contain at least a matrix)
(disparity_obj <- fill.dispRity(disparity_obj))
## Warning in check.dispRity.data(data$matrix): Row names have been automatically
## added to data$matrix.
##  ---- dispRity object ----
## 5 elements in one matrix with 4 dimensions.

## The dispRity object now has the correct minimal attributes
disparity_obj$call
## $dimensions
## [1] 1 2 3 4
get.matrix extracts a specific matrix from a dispRity object. The matrix can be one of the bootstrapped matrices and/or a rarefied matrix.
## Extracting the matrix containing the coordinates of the elements at time 50
str(get.matrix(disparity, "50"))
##  num [1:18, 1:97] -0.1036 0.4318 0.3371 0.0501 0.685 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:18] "Leptictis" "Dasypodidae" "n24" "Potamogalinae" ...
##   ..$ : NULL

## Extracting the 3rd bootstrapped matrix with the 2nd rarefaction level
## (15 elements) from the first subset (90 Mya)
str(get.matrix(disparity, subsets = 1, bootstrap = 3, rarefaction = 2))
##  num [1:15, 1:97] -0.12948 -0.57973 0.00361 0.27123 0.27123 ...
##  - attr(*, "dimnames")=List of 2
##   ..$ : chr [1:15] "n15" "Maelestes" "n20" "n34" ...
##   ..$ : NULL
n.subsets simply counts the number of subsets in a dispRity object.
## How many subsets are in this object?
n.subsets(disparity)
## [1] 7
size.subsets gives the number of elements in each subset of a dispRity object.
## How many elements are there in each subset?
size.subsets(disparity)
## 90 80 70 60 50 40 30
## 18 22 23 21 18 15 10
get.subsets creates a dispRity object that contains only the elements from one specific subset.
## Extracting all the data for the crown mammals
(crown_mammals <- get.subsets(disp_crown_stemBS, "Group.crown"))

## The object keeps the properties of the parent object but is composed of only one subset
length(crown_mammals$subsets)
combine.subsets merges different subsets.
## Combine the two first subsets in the dispRity data example
combine.subsets(disparity, c(1,2))
Note that the computed values (bootstrapped data + disparity metric) are not merged.
get.disparity extracts the calculated disparity values of a specific matrix.
## Extracting the observed disparity (default)
get.disparity(disparity)

## Extracting the disparity from the bootstrapped values from the
## rarefaction level of 10 elements from the second subset (80 Mya)
get.disparity(disparity, observed = FALSE, subsets = 2, rarefaction = 10)
rescale.dispRity is the modified S3 method for scale (scaling and/or centring). It can be applied to the disparity data of a dispRity object and can take optional arguments (for example, rescaling by dividing by a maximum value).
## Getting the disparity values of the time subsets
head(summary(disparity))

## Scaling the same disparity values
head(summary(rescale.dispRity(disparity, scale = TRUE)))

## Scaling and centering:
head(summary(rescale.dispRity(disparity, scale = TRUE, center = TRUE)))

## Rescaling the value by dividing by a maximum value
head(summary(rescale.dispRity(disparity, max = 10)))
sort.dispRity is the S3 method of sort for sorting the subsets alphabetically (default) or following a specific pattern.
## Sorting the disparity subsets in inverse alphabetic order
head(summary(sort(disparity, decreasing = TRUE)))

## Customised sorting
head(summary(sort(disparity, sort = c(7, 1, 3, 4, 5, 2, 6))))
get.tree, remove.tree and add.tree manipulate the optional tree component of dispRity objects.
## Getting the tree component of a dispRity object
get.tree(disparity)

## Removing the tree
remove.tree(disparity)

## Adding a tree
add.tree(disparity, tree = BeckLee_tree)
The functions above are utilities to easily and safely access different elements in the dispRity object.
Alternatively, of course, each element can be accessed manually.
Here is an explanation of how it works.
The dispRity object is a list of two to four elements, each of which is detailed below:
$matrix: an object of class list that contains at least one object of class matrix: the full multidimensional space.
$call: an object of class list containing information on the dispRity object content.
$subsets: an object of class list containing the subsets of the multidimensional space.
$disparity: an object of class list containing the disparity values.
The dispRity object is loosely based on C structure objects.
In fact, it is composed of one unique instance of a matrix (the multidimensional space) upon which the metric function is called via “pointers” to only a certain number of elements and/or dimensions of this matrix.
This allows for: (1) faster and easily tractable execution time: the metric functions are called through apply family functions and can be parallelised; and (2) a really low memory footprint: at any time, only one matrix (or list of matrices) is present in the R environment rather than multiple copies of it for each subset.
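The “pointer” design described above can be mimicked in a few lines of base R. This is only a minimal sketch under stated assumptions, not the package's internal code: the names space, subsets and metric are hypothetical, and the metric (sum of the variances of each dimension) is an arbitrary stand-in.

```r
## A single multidimensional space (10 elements, 3 dimensions), stored once
set.seed(1)
space <- matrix(rnorm(30), nrow = 10, ncol = 3,
                dimnames = list(paste0("sp", 1:10), NULL))

## Subsets only store row indices ("pointers"), never copies of the space
subsets <- list(sub1 = c(5, 4, 6, 7), sub2 = 1:3)

## Arbitrary stand-in metric (an assumption): sum of the variances of each dimension
metric <- function(m) sum(apply(m, 2, var))

## The metric is applied to the rows that each subset points to
disparity_values <- sapply(subsets, function(rows) {
    metric(space[rows, , drop = FALSE])
})
disparity_values
```

Because each subset is processed independently, the sapply call could in principle be swapped for a parallel equivalent such as parallel::mclapply.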
$matrix is the multidimensional space, stored in the R environment as a list object containing one or more matrix objects.
Each matrix requires row names but not column names (optional).
By default, if the row names are missing, the dispRity function will arbitrarily generate them in numeric order (i.e. rownames(matrix) <- 1:nrow(matrix)).
This element of the dispRity object is never modified.
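The default row naming quoted above can be reproduced by hand in base R. The sketch below only illustrates that expression on a hypothetical matrix called space; it does not call any package function.

```r
## A hypothetical matrix without row names
space <- matrix(rnorm(20), nrow = 5, ncol = 4)
is.null(rownames(space))

## Assign numeric row names in order, as in the expression quoted above
if (is.null(rownames(space))) {
    rownames(space) <- 1:nrow(space)
}
rownames(space)
```

Note that R stores row names as character strings, so the numeric indices are converted on assignment.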
$call contains the information on the dispRity object content.
It is a list that can contain the following:
$call$subsets: a vector of character with information on the subsets type (e.g. "continuous" or "custom"), their model, if any (e.g. "acctran" or "gradual.split"), and, where relevant, information about the trees and matrices used through chrono.subsets. This element is generated only once, when the subsets are created (e.g. via chrono.subsets).
$call$dimensions: either a single numeric value indicating how many dimensions to use or a vector of numeric values indicating which specific dimensions to use. This element is by default the number of columns in $matrix but can be modified through boot.matrix or dispRity.
$call$bootstrap: this is a list containing three elements:
[[1]]: the number of bootstrap replicates (numeric)
[[2]]: the bootstrap method (character)
[[3]]: the rarefaction levels (numeric vector)
$call$disparity: this is a list containing one element, $metric, that is a list containing the different functions passed to dispRity. These are call elements and get modified each time the dispRity function is used (the first element is the first metric(s), the second, the second metric(s), etc.).
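To make this layout concrete, here is a hand-built mock of such a list in base R. It is an illustration only: the object mock_call is hypothetical and is not produced by the package, the exact content of the real element may differ, and the values are simply borrowed from the printed summary of the example object earlier in this section.

```r
## Hand-built mock of a $call element (illustration only, not package output)
mock_call <- list(
    subsets    = c("continuous", "acctran"),           # subsets type and model
    dimensions = 1:97,                                 # which dimensions to use
    bootstrap  = list(100, "full", c(20, 15, 10, 5)),  # replicates, method, rarefaction
    disparity  = list(metric = list(quote(c(median, centroids))))
)

## Accessing the pieces manually
mock_call$bootstrap[[1]]        # number of bootstrap replicates
mock_call$bootstrap[[2]]        # bootstrap method
mock_call$bootstrap[[3]]        # rarefaction levels
mock_call$disparity$metric[[1]] # first metric(s), stored as a call
```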
$subsets contains the optional subsets of the multidimensional space.
It is a list of subset names.
Each subset name is in turn a list of at least one element called elements, which is in turn a matrix.
This elements matrix contains the raw (observed) elements in the subset.
The elements matrix is composed of numeric values in one column and n rows (the number of elements in the subset).
Each of these values is a “pointer” (C inspired) to an element of $matrix.
For example, let's assume a dispRity object called disparity, composed of at least one subset called sub1:
disparity$subsets$sub1$elements
     [,1]
[1,]    5
[2,]    4
[3,]    6
[4,]    7
The values in the matrix “point” to the elements in $matrix: here, the multidimensional space with only the 4th, 5th, 6th and 7th elements.
The following elements in disparity$subsets$sub1 correspond to the same “pointers” but drawn from the bootstrap replicates.
The columns correspond to different bootstrap replicates.
disparity$subsets$sub1[[2]]
     [,1] [,2] [,3] [,4]
[1,]   57   43   70    4
[2,]   43   44    4    4
[3,]   42   84   44    1
[4,]   84    7    2   10
This signifies that we have four bootstrap pseudo-replicates, each pointing to four elements in $matrix.
The next element ([[3]]) has the same structure for the first rarefaction level, if any (i.e. the resulting bootstrap matrix will have m rows, where m is the number of elements for this rarefaction level).
The next element after that ([[4]]) has the same structure for another rarefaction level, and so forth.
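The layout described above (observed pointers, then bootstrapped pointers, then rarefied pointers) can be sketched by hand in base R. This is a simplified illustration under assumptions, not the package's own procedure: the bootstrap here is plain resampling with replacement, the rarefaction level m = 3 is arbitrary, and the object sub1 is built manually.

```r
set.seed(42)

## Observed elements of a hypothetical subset: one column of row indices
elements <- matrix(c(5, 4, 6, 7), ncol = 1)

## Four bootstrap pseudo-replicates: resample the pointers with replacement,
## one replicate per column
bootstraps <- replicate(4, sample(elements[, 1], replace = TRUE))

## One rarefaction level with m = 3 elements: same idea, but only m rows
rarefied <- replicate(4, sample(elements[, 1], size = 3, replace = TRUE))

## Assemble the three matrices in the order described above
sub1 <- list(elements = elements, bootstraps, rarefied)
str(sub1)
```

Every value in the second and third matrices is still a row index into the single multidimensional space, so no coordinates are duplicated.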
When a probabilistic model was used to select the elements (models that have the "split" suffix, e.g. chrono.subsets(..., model = "gradual.split")), $elements is a matrix in which each row contains a pair of elements of $matrix and the probability of sampling the first element of that pair:
disparity$subsets$sub1$elements
     [,1] [,2]       [,3]
[1,]   73   36 0.01871893
[2,]   74   37 0.02555876
[3,]   33   38 0.85679821
In this example, you can read the table row by row as: “there is a probability of 0.018 of sampling element 73 and a probability of (1 - 0.018) of sampling element 36”.
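Reading such a table row by row amounts to one weighted coin flip per row. The base R sketch below illustrates that reading using the values from the table above; the way the draw is implemented (comparing a uniform random number to the probability) is an assumption made for illustration, not the package's code.

```r
set.seed(1)

## Pairs of candidate elements and the probability of sampling the first one
elements <- matrix(c(73, 36, 0.01871893,
                     74, 37, 0.02555876,
                     33, 38, 0.85679821),
                   ncol = 3, byrow = TRUE)

## For each row, pick the first element with probability p, else the second
pick_first <- runif(nrow(elements)) < elements[, 3]
sampled <- ifelse(pick_first, elements[, 1], elements[, 2])
sampled
```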
The $disparity element has the same structure as the $subsets element (a list of list(s) containing matrices), but instead of “pointers” to $matrix, the matrices contain the result of the disparity metric applied to the “pointers”.
For example, in our first example ($elements) from above, if the disparity metric is of dimension level 1, we would have:
disparity$disparity$sub1$elements
     [,1]
[1,] 1.82
This is the observed disparity (1.82) for the subset called sub1.
If the disparity metric is of dimension level 2 (say the function range that outputs two values), we would have:
disparity$disparity$sub1$elements
     [,1]
[1,] 0.82
[2,] 2.82
The following elements in the list follow the same logic as before: rows are disparity values (one row for a dimension level 1 metric, multiple rows for a dimension level 2 metric) and columns are the bootstrap replicates (the bootstrap with all elements, followed by the rarefaction levels, if any). For example, for the bootstrap without rarefaction (second element of the list):
disparity$disparity$sub1[[2]]
         [,1]     [,2]     [,3]     [,4]
[1,] 1.744668 1.777418 1.781624 1.739679
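The correspondence between pointer matrices and disparity matrices can be sketched in base R as follows. This is a minimal illustration under assumptions: space, level1 and level2 are hypothetical names, the two metrics are arbitrary stand-ins for a dimension level 1 and a dimension level 2 metric, and the bootstrap is plain resampling with replacement.

```r
set.seed(1)

## One multidimensional space and the pointers of a hypothetical subset
space <- matrix(rnorm(40), nrow = 10, ncol = 4)
elements <- matrix(c(5, 4, 6, 7), ncol = 1)
bootstraps <- replicate(4, sample(elements[, 1], replace = TRUE))

## Stand-in level 1 metric: a single value per set of pointers
level1 <- function(rows) sum(apply(space[rows, , drop = FALSE], 2, var))
## Stand-in level 2 metric: two values (range of the first dimension)
level2 <- function(rows) range(space[rows, 1])

## Observed disparity: one column
obs1 <- matrix(apply(elements, 2, level1), nrow = 1)
obs2 <- apply(elements, 2, level2)

## Bootstrapped disparity: one column per bootstrap replicate
boot1 <- matrix(apply(bootstraps, 2, level1), nrow = 1)
boot2 <- apply(bootstraps, 2, level2)

dim(obs1); dim(obs2); dim(boot1); dim(boot2)
```

The level 1 metric yields one row per matrix and the level 2 metric yields two, while the number of columns always matches the number of columns of the pointer matrix it was computed from.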