| Title: | Branch-Level Inference Framework for Recognizing Optimal Shifts in Traits |
| Version: | 0.1.3 |
| Description: | Methods for detecting and visualizing cladogenic shifts in multivariate trait data on phylogenies. Implements penalized-likelihood multivariate generalized least squares models, enabling analyses of high-dimensional trait datasets and large trees via searchOptimalConfiguration(). Includes a greedy step-wise shift-search algorithm following approaches developed in Smith et al. (2023) <doi:10.1111/nph.19099> and Berv et al. (2024) <doi:10.1126/sciadv.adp0114>. Methods build on multivariate GLS approaches described in Clavel et al. (2019) <doi:10.1093/sysbio/syy045> and implemented in the mvgls() function from the 'mvMORPH' package. Documentation and vignettes are available at https://jakeberv.com/bifrost/, including the introductory vignette at https://jakeberv.com/bifrost/articles/jaw-shape-vignette.html. |
| License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| URL: | https://jakeberv.com/bifrost/, https://github.com/jakeberv/bifrost |
| BugReports: | https://github.com/jakeberv/bifrost/issues |
| Depends: | R (≥ 4.1) |
| Imports: | ape, future, future.apply, phytools, grDevices, stats, mvMORPH, viridis, txtplot |
| Suggests: | HDInterval, RColorBrewer, boot, classInt, evd, fitdistrplus, knitr, palaeoverse, parallel, patchwork, pbmcapply, phylolm, readxl, rmarkdown, scales, testthat (≥ 3.0.0), univariateML, spelling, htmltools |
| VignetteBuilder: | knitr |
| Config/testthat/edition: | 3 |
| Config/testthat/parallel: | false |
| Language: | en-US |
| NeedsCompilation: | no |
| Packaged: | 2026-01-15 21:51:32 UTC; cotinga |
| Author: | Jacob S. Berv |
| Maintainer: | Jacob S. Berv <jacob.berv@gmail.com> |
| Repository: | CRAN |
| Date/Publication: | 2026-01-21 19:10:19 UTC |
Generate Scaled Viridis Color Palette for Rate Parameters
Description
Creates a named color mapping for a set of numeric parameters (e.g., evolutionary rates) using the viridis color palette. Parameters are first sorted in ascending order and normalized to the range [0, 1], then mapped to evenly spaced viridis colors for intuitive visualization.
Usage
generateViridisColorScale(params)
Arguments
params |
A named numeric vector of parameter values (e.g., rates). The names will be preserved and used to label the resulting color mapping. |
Details
This function is useful for plotting results where parameters should be visually distinguished based on their magnitude (e.g., rate shifts across a phylogeny). By using the perceptually uniform viridis palette, it avoids misleading color interpretations common with rainbow scales.
Value
A named list with two elements:
NamedColorsA named character vector of hex color codes, with names corresponding to the input parameter names, ordered by increasing parameter value.
ParamColorMappingA named numeric vector of the sorted parameter values, maintaining the same order and names as
NamedColors.
See Also
viridis::viridis() for details on the color palette.
Examples
if (requireNamespace("viridis", quietly = TRUE)) {
library(viridis)
set.seed(1)
rates <- c(A = 0.1, B = 0.5, C = 0.9)
color_scale <- generateViridisColorScale(rates)
# View the color assignments
color_scale$NamedColors
# Plot with colors
barplot(color_scale$ParamColorMapping,
col = color_scale$NamedColors,
main = "Rates with Viridis Colors")
}
Plot IC Acceptance Matrix with Optional Rate-of-Improvement Overlay
Description
Create a two-layer base R plot that visualizes information criterion (IC) scores across a sequence of sub-model evaluations, highlighting which steps were accepted vs rejected. Optionally, a secondary y-axis overlays the rate of improvement (first difference of IC scores) as a line with markers.
Usage
plot_ic_acceptance_matrix(
matrix_data,
plot_title = "IC Acceptance Matrix Scatter Plot",
plot_rate_of_improvement = TRUE,
rate_limits = c(-400, 150),
baseline_ic = NULL
)
Arguments
matrix_data |
A two-column |
plot_title |
|
plot_rate_of_improvement |
|
rate_limits |
|
baseline_ic |
Optional |
Details
The function expects a two-column object where:
Column 1 contains the IC score at each step (numeric; lower is better).
Column 2 contains an indicator for acceptance (0 = rejected, 1 = accepted).
The first IC value is treated as the baseline and is plotted as a larger
black point with a numeric label. If baseline_ic is supplied, it is used as
the baseline IC score (step 1) in place of matrix_data[1, 1] for both the
baseline annotation and the rate-of-improvement series (diff(IC)). This is
useful because matrix_data begins with the first evaluated shift model (rather
than the true no-shift baseline). To achieve this behavior, pass the true baseline via
baseline_ic to avoid labeling the first evaluated model as the baseline.
Accepted steps are drawn as blue filled points connected by a thin line; rejected
steps are drawn as small red crosses. When plot_rate_of_improvement = TRUE,
the function overlays a secondary y-axis on the right that shows diff(IC) values
(the per-step change in IC; more negative implies improvement).
The function uses only base graphics. It sets plot margins and mgp via
par(), and (when overlaying) uses par(new = TRUE) to layer the IC plot over the
rate-of-improvement axes. Initial user par is reset on exit.
Axes and scaling. Tick marks for the primary (IC) x/y axes are computed with
pretty() to give clean bounds. The secondary axis for the rate of improvement
uses rate_limits (default c(-400, 150)); adjust via the argument if your
expected diff(IC) range differs substantially.
Value
Invisibly returns NULL. Called for its plotting side effects.
See Also
par, plot, axis,
lines, points, legend,
mtext, title
Examples
ic <- c(-1000, -1012, -1008, -1025, -1020, -1030)
accepted <- c(1, 0, 1, 0, 1) # steps 2..6 relative to baseline
mat <- cbind(ic, c(1, accepted)) # mark baseline as accepted for plotting
plot_ic_acceptance_matrix(mat, plot_title = "IC Path")
# Avoid non-ASCII glyphs in titles on CRAN/CI:
plot_ic_acceptance_matrix(mat, plot_rate_of_improvement = TRUE)
# Override baseline IC:
plot_ic_acceptance_matrix(mat, baseline_ic = -995)
Print method for bifrost search results
Description
Prints a compact summary of a completed Bifrost search, including the baseline and optimal information criterion (IC) values, the inferred shift node set, key search settings, and (when present) optional diagnostics such as IC-history and IC-weight support.
Usage
## S3 method for class 'bifrost_search'
print(x, ...)
Arguments
x |
A |
... |
Unused (S3 compatibility). |
Value
Invisibly returns x. Called for its printing side effects.
Search for an Optimal Multi-Regime (Shift) Configuration on a Phylogeny
Description
Greedy, stepwise search for evolutionary regime shifts on a SIMMAP-style phylogeny
using multivariate mvgls fits from mvMORPH. The routine:
builds one-shift candidate trees for all internal nodes meeting a tip-size threshold (via
generatePaintedTrees),fits each candidate in parallel and ranks them by improvement in the chosen information criterion (IC;
GICorBIC),iteratively adds shifts that pass a user-defined acceptance threshold,
optionally revisits accepted shifts to prune overfitting using a small IC tolerance window,
optionally computes per-shift IC weights by refitting the model with each shift removed.
Models are fitted directly in multivariate trait space (no PCA), assuming a multi-rate
Brownian Motion with proportional VCV scaling across regimes. Extra arguments in ...
are forwarded to mvgls (e.g., method = "LL" or
method = "PL-LOOCV", penalty, error = TRUE, etc.).
Usage
searchOptimalConfiguration(
baseline_tree,
trait_data,
formula = "trait_data ~ 1",
min_descendant_tips,
num_cores = 2,
ic_uncertainty_threshold = 1,
shift_acceptance_threshold = 1,
uncertaintyweights = FALSE,
uncertaintyweights_par = FALSE,
plot = FALSE,
IC = "GIC",
store_model_fit_history = TRUE,
verbose = FALSE,
...
)
Arguments
baseline_tree |
A rooted SIMMAP/ |
trait_data |
A |
formula |
Character formula passed to |
min_descendant_tips |
Integer ( |
num_cores |
Integer. Number of workers for parallel candidate scoring. Uses
|
ic_uncertainty_threshold |
Numeric ( |
shift_acceptance_threshold |
Numeric ( |
uncertaintyweights |
Logical. If |
uncertaintyweights_par |
Logical. As above, but compute per-shift IC weights in parallel
using future.apply. Exactly one of |
plot |
Logical. If |
IC |
Character. Which information criterion to use, one of |
store_model_fit_history |
Logical. If |
verbose |
Logical. If |
... |
Additional arguments passed to |
Details
Input requirements.
-
Tree:
baseline_treeshould be a rootedphylo(or SIMMAP-style) tree with branch lengths interpreted in units of time. An ultrametric tree is not required. -
Trait data alignment:
rownames(trait_data)must matchbaseline_tree$tip.labelin both names and order; any tips without data should be pruned beforehand. -
Data type:
trait_datais typically a numeric matrix of continuous traits; high-dimensional settings (p\gen) are supported via penalized-likelihoodmvgls()fits.
Search outline.
-
Baseline: Fit
mvglson the baseline tree (single regime) to obtain the baseline IC. -
Candidates: Build one-shift trees for eligible internal nodes (
generatePaintedTrees); fit each withfitMvglsAndExtractGIC.formulaorfitMvglsAndExtractBIC.formula(internal helpers; not exported) and rank by\DeltaIC. -
Greedy add: Add the top candidate, refit, and accept if
\DeltaIC\geshift_acceptance_threshold; continue down the ranked list. -
Optional IC weights: If
uncertaintyweights(oruncertaintyweights_par) isTRUE, compute an IC weight for each accepted shift by refitting the final model with that shift removed and comparing the two ICs viaaicw.
Parallelization. Candidate sub-model fits are distributed with future + future.apply.
On Unix, multicore is used; on Windows, multisession. A sequential plan is restored afterward.
Plotting. If plot = TRUE, trees are rendered with
plotSimmap(); shift IDs are labeled with nodelabels().
Regime VCVs. The returned $VCVs are extracted from the fitted multi-regime model via
extractRegimeVCVs and reflect regime-specific covariance
estimates (when mvgls is fitted under a PL/ML method).
For high-dimensional trait datasets (p \ge n), penalized-likelihood settings in
mvgls() are often required for stable estimation. In practice, methods such as
method = "LL" or method = "H&L" combined with appropriate penalties (e.g.,
ridge-type penalties) have proven effective for intercept-only multivariate Brownian
motion models, as illustrated in the package vignettes. Users should consult the
mvMORPH documentation for details on available methods and penalties and
tune these choices to the structure of their data.
Value
A named list with (at minimum):
-
user_input: captured call (as a list) for reproducibility. -
tree_no_uncertainty_transformed: SIMMAP tree from the optimal (no-uncertainty) model on the transformed scale used internally bymvgls. -
tree_no_uncertainty_untransformed: same topology with original edge lengths restored. -
model_no_uncertainty: the finalmvglsmodel object. -
shift_nodes_no_uncertainty: integer vector of accepted shift nodes. -
optimal_ic: final IC value;baseline_ic: baseline IC. -
IC_used:"GIC"or"BIC";num_candidates: count of candidate one-shift models evaluated. -
model_fit_history: ifstore_model_fit_history = TRUE, a list of per-iteration fits (loaded from temporary files written during the run) and anic_acceptance_matrix(IC value and acceptance flag per step). -
VCVs: named list of regime-specific VCV matrices extracted from the final model (penalized-likelihood estimates if PL was used).
Additional components appear conditionally:
-
ic_weights: adata.frameof per-shift IC weights and evidence ratios whenuncertaintyweightsoruncertaintyweights_parisTRUE. -
warnings: character vector of warnings/errors encountered during fitting (if any).
Convergence and robustness
The search is greedy and may converge to a local optimum. Use a stricter
shift_acceptance_threshold to reduce overfitting, and re-run the search
with different min_descendant_tips and IC choices ("GIC" vs "BIC")
to assess stability of the inferred shifts. For a given run, the optional IC-weight
calculations (uncertaintyweights or uncertaintyweights_par) can be used
to quantify support for individual shifts. It is often helpful to repeat the analysis
under slightly different settings (e.g., thresholds or candidate-size constraints) and
compare the resulting sets of inferred shifts.
Note
Internally, this routine coordinates multiple unexported helper functions:
generatePaintedTrees, fitMvglsAndExtractGIC.formula,
fitMvglsAndExtractBIC.formula, addShiftToModel,
removeShiftFromTree, and extractRegimeVCVs. Through these,
it may also invoke lower-level utilities such as paintSubTree_mod
and paintSubTree_removeShift. These helpers are internal
implementation details and are not part of the public API.
See Also
mvgls, GIC, BIC,
plot_ic_acceptance_matrix for visualizing IC trajectories and shift
acceptance decisions, and generateViridisColorScale for mapping
regime-specific rates or parameters to a viridis color scale when plotting trees;
packages: mvMORPH, future, future.apply, phytools, ape.
Examples
library(ape)
library(phytools)
library(mvMORPH)
set.seed(1)
# Simulate a tree
tr <- pbtree(n = 50, scale = 1)
# Define two regimes: "0" (baseline) and "1" (high-rate) on a subset of tips
states <- setNames(rep("0", Ntip(tr)), tr$tip.label)
high_clade_tips <- tr$tip.label[1:20]
states[high_clade_tips] <- "1"
# Make a SIMMAP tree for the BMM simulation
simmap <- phytools::make.simmap(tr, states, model = "ER", nsim = 1)
# Simulate traits under a BMM model with ~10x higher rate in regime "1"
sigma <- list(
"0" = diag(0.1, 2),
"1" = diag(1.0, 2)
)
theta <- c(0, 0)
sim <- mvMORPH::mvSIM(
tree = simmap,
nsim = 1,
model = "BMM",
param = list(
ntraits = 2,
sigma = sigma,
theta = theta
)
)
# mvSIM returns either a matrix or a list of matrices depending on mvMORPH version
X <- if (is.list(sim)) sim[[1]] else sim
rownames(X) <- simmap$tip.label
# Run the search on the unpainted tree (single baseline regime)
res <- searchOptimalConfiguration(
baseline_tree = as.phylo(simmap),
trait_data = X,
formula = "trait_data ~ 1",
min_descendant_tips = 10,
num_cores = 1, # keep it simple / CRAN-safe
shift_acceptance_threshold = 20, # conservative GIC threshold
IC = "GIC",
plot = FALSE,
store_model_fit_history = FALSE,
verbose = FALSE
)
res$shift_nodes_no_uncertainty
res$optimal_ic - res$baseline_ic
str(res$VCVs)