Title: Classification Measures when Subclasses are Involved
Version: 1.0.0
Description: Accuracy metrics are commonly used to assess the discriminating ability of diagnostic tests or biomarkers. Among them, metrics based on the ROC framework are particularly popular. When classification involves subclasses, the package 'CompClassMetrics' includes functions that can provide the point estimate, confidence interval as well as true values if a parametric setting is known. For more details see Nan and Tian (2025) <doi:10.1177/09622802251343600>, Nan and Tian (2023) <doi:10.1002/sim.9908>, Feng and Tian (2020) <doi:10.1177/0962280220938077> and Wang et al (2016) <doi:10.1002/sim.6843>.
License: MIT + file LICENSE
Encoding: UTF-8
RoxygenNote: 7.3.2
Imports: plot3D, pracma, cubature, stats
NeedsCompilation: no
Packaged: 2026-01-18 19:34:19 UTC; nnan3
Author: Nan Nan [aut, cre]
Maintainer: Nan Nan <nannan@buffalo.edu>
Depends: R (≥ 3.5.0)
Repository: CRAN
Date/Publication: 2026-01-18 23:30:18 UTC

R function that calculates percentile confidence interval given an array of estimates

Description

This function returns the percentile confidence interval computed from an array of estimates

Usage

CI.func(x)

Arguments

x

an array of calculated estimates

Value

The percentile confidence interval of given values
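
As a rough illustration of what a percentile interval is, the sketch below takes the lower and upper empirical quantiles of a vector of estimates. The helper name percentile_ci and the 95% level are assumptions made for this example; the documentation above does not state which level CI.func uses, and this is not the package's implementation.

```r
# Hypothetical stand-in for CI.func(); the 95% level is an assumption,
# since the documentation does not state which level is used.
percentile_ci <- function(x, level = 0.95) {
  alpha <- 1 - level
  # Lower and upper empirical quantiles of the estimates
  stats::quantile(x, probs = c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(1)
est <- rnorm(200, mean = 0.8, sd = 0.05)  # simulated estimates, e.g. bootstrap replicates
ci <- percentile_ci(est)
print(ci)
```

The interval always lies within the range of the supplied estimates, which is one reason percentile intervals are convenient for bounded accuracy measures.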


adni2

Description

A subset of the ADNI2 dataset containing participant IDs, disease class labels and five biomarker measurements.

Format

A data frame with 317 rows and 7 columns:

RID

Participant ID

DX.bl

The disease class label

FDG

Numeric, value of FDG

AV45

Numeric, value of AV45

ABETA

Numeric, value of ABETA

TAU.x

Numeric, value of TAU from CSF

PTAU

Numeric, value of PTAU from CSF

Source

This is a subset of ADNI2 dataset, available at https://adni.loni.usc.edu


R function that calculates the true values of AUCo when distribution is known

Description

R function that calculates the true values of AUCo when distribution is known

Usage

auco_func(k1, k2, distribution, arg1, arg2)

Arguments

k1

number of subclasses in main class-1

k2

number of subclasses in main class-2

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The true value of AUCo under given distribution and parameters


R function that calculates the conditional probability that the minimum is greater than y_min given that the maximum equals y_max (upper tail probability of the minimum given the maximum)

Description

R function that calculates the conditional probability that the minimum is greater than y_min given that the maximum equals y_max (upper tail probability of the minimum given the maximum)

Usage

cdf_min_given_max_partial_upper(y_min, y_max, distribution, arg1, arg2)

Arguments

y_min

the value of y_min

y_max

the value of y_max

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The conditional probability of minimum given maximum of random variables
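
The quantity described here can be sketched for independent, non-identically distributed normal variables. For y_min < y_max, the conditional probability is the ratio of two sums: the numerator places variable i exactly at y_max and requires every other variable to fall in (y_min, y_max], and the denominator is the density of the maximum at y_max. The helper p_min_gt_given_max and its argument names are hypothetical, and independence of the variables is an assumption of this sketch rather than something the documentation states.

```r
# Hypothetical helper, not the package implementation.
# mean, var: per-variable normal parameters (var is a variance, so sd = sqrt(var)).
p_min_gt_given_max <- function(y_min, y_max, mean, var) {
  if (y_min >= y_max) return(0)  # the minimum cannot exceed the maximum
  sd <- sqrt(var)
  dens <- stats::dnorm(y_max, mean, sd)
  cdf_max <- stats::pnorm(y_max, mean, sd)
  cdf_min <- stats::pnorm(y_min, mean, sd)
  idx <- seq_along(mean)
  # Numerator: variable i sits at y_max, all others fall in (y_min, y_max]
  num <- sum(sapply(idx, function(i) dens[i] * prod((cdf_max - cdf_min)[-i])))
  # Denominator: density of the maximum at y_max
  den <- sum(sapply(idx, function(i) dens[i] * prod(cdf_max[-i])))
  num / den
}

mu <- c(0, 1, 2)
v <- c(1, 1, 1)
print(p_min_gt_given_max(-0.5, 2.5, mu, v))
```

Two sanity checks follow from the definition: the result is 1 when y_min is -Inf and decreases as y_min moves toward y_max.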


R function that calculates the partial of joint probability of min and max over max of NIND random variables

Description

R function that calculates the partial of joint probability of min and max over max of NIND random variables

Usage

cdf_min_max_partial(y_min, y_max, distribution, arg1, arg2)

Arguments

y_min

the value of y_min

y_max

the value of y_max

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The partial derivative of the joint probability of min and max with respect to max


R function that calculates the probability of r-th order statistics of normal random variables (CDF of r-th order statistics)

Description

R function that calculates the probability of r-th order statistics of normal random variables (CDF of r-th order statistics)

Usage

cdf_order_r(x, r, distribution, arg1, arg2)

Arguments

x

the value of x

r

r-th order statistics

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The probability that the r-th order statistic of the random variables is less than or equal to x
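
For independent variables that are not identically distributed, the event that the r-th order statistic is at most x is the event that at least r of the variables are at most x. The sketch below computes that probability for the normal case with a simple recursion over the variables. The name cdf_order_r_sketch and its arguments are hypothetical, and independence is an assumption of this example; it is not the package's code.

```r
# Hypothetical sketch for the normal case; not the package implementation.
cdf_order_r_sketch <- function(x, r, mean, var) {
  p <- stats::pnorm(x, mean = mean, sd = sqrt(var))  # P(X_i <= x) for each variable
  # dist[j + 1] = P(exactly j of the variables processed so far are <= x)
  dist <- 1
  for (p_i in p) {
    dist <- c(dist * (1 - p_i), 0) + c(0, dist * p_i)
  }
  # P(at least r variables are <= x)
  sum(dist[(r + 1):length(dist)])
}

mu <- c(0, 1, 2)
v <- c(1, 1, 1)
print(cdf_order_r_sketch(0.5, 2, mu, v))
```

With r equal to the number of variables this reduces to the CDF of the maximum (the product of the individual CDFs), and with r = 1 to the CDF of the minimum.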


R function that calculates the true values of VUSc when distribution is known

Description

R function that calculates the true values of VUSc when distribution is known

Usage

cvus_func(k1, k2, k3, distribution, arg1, arg2)

Arguments

k1

number of subclasses in main class-1

k2

number of subclasses in main class-2

k3

number of subclasses in main class-3

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The true value of VUSc under given distribution and parameters


R function that calculates the probability density of maximum of NIND random variables (PDF)

Description

R function that calculates the probability density of maximum of NIND random variables (PDF)

Usage

f_order_max(y_max, distribution, arg1, arg2)

Arguments

y_max

the value of y_max

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The probability density of maximum of random variables
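
The density of the maximum of independent, non-identically distributed variables is the sum, over each variable i, of its density at the point times the CDFs of all other variables at that point. The following sketch implements this for the normal case and checks numerically that the result integrates to one. The helper f_max_sketch is hypothetical and assumes independence; it only illustrates the formula.

```r
# Hypothetical sketch for the normal case; not the package implementation.
f_max_sketch <- function(y, mean, var) {
  sd <- sqrt(var)
  sapply(y, function(yy) {
    dens <- stats::dnorm(yy, mean, sd)
    cdf <- stats::pnorm(yy, mean, sd)
    # Variable i attains the maximum at yy, all others are <= yy
    sum(sapply(seq_along(mean), function(i) dens[i] * prod(cdf[-i])))
  })
}

mu <- c(0, 1, 2)
v <- c(1, 1, 1)
# A valid density should integrate to 1 over the real line
total_max <- stats::integrate(function(y) f_max_sketch(y, mu, v), -Inf, Inf)$value
print(total_max)
```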


R function that calculates the probability density of minimum of NIND random variables (PDF)

Description

R function that calculates the probability density of minimum of NIND random variables (PDF)

Usage

f_order_min(y_min, distribution, arg1, arg2)

Arguments

y_min

the value of y_min

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The probability density of minimum of NIND random variables
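
For the minimum, the analogous formula replaces each CDF with a survival function: variable i sits at the point while every other variable exceeds it. The sketch below uses the gamma case, with shape and rate vectors as described in the arguments above, and checks that the density integrates to one. The helper f_min_sketch is hypothetical and assumes independent variables.

```r
# Hypothetical sketch for the gamma case; not the package implementation.
f_min_sketch <- function(y, shape, rate) {
  sapply(y, function(yy) {
    dens <- stats::dgamma(yy, shape = shape, rate = rate)
    surv <- stats::pgamma(yy, shape = shape, rate = rate, lower.tail = FALSE)
    # Variable i attains the minimum at yy, all others are > yy
    sum(sapply(seq_along(shape), function(i) dens[i] * prod(surv[-i])))
  })
}

shape <- c(2, 3, 4)
rate <- c(1, 1, 1)
# Gamma variables are positive, so integrate over (0, Inf)
total_min <- stats::integrate(function(y) f_min_sketch(y, shape, rate), 0, Inf)$value
print(total_min)
```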


R function for obtaining all combinations of maximum and minimum of a given dataset

Description

R function for obtaining all combinations of maximum and minimum of a given dataset

Usage

get_max_min_permutations(df)

Arguments

df

the given dataset, as a list

Value

A list of all combinations of maximum and minimum of df


R function that calculates empirical estimates of HUMcm

Description

This function provides empirical estimates of HUMcm

Usage

humc_dynamic(dat, num_sub)

Arguments

dat

test values as a list; each element holds the biomarker values for one disease group, with groups ordered by ascending severity

num_sub

a vector giving the number of subclasses in each main class

Value

The empirical estimate of HUMcm based on given data and num_sub

Examples

# Create a list of example data
Y1 <- c(0.9316, 0.9670, 1.3856, 1.3505, 1.0316, 1.1764, 0.7435, 0.5813, 0.4695, 0.3249)
Y2 <- c(1.63950, 1.36535, 1.79859, 0.47961, 1.50978, 1.36525,0.13515, 2.11275, 0.45659)
Y3 <- c(1.89856, 1.30920, 2.38615, 2.34785, 2.92493, 2.71615, 2.75243, 0.95060, 0.38964)
Y4 <- c(2.580,2.570,2.143,3.079,1.765,3.081,2.175,2.306,2.918,2.507,4.261,3.033,1.836,2.321)
Y5 <- c(3.969,3.044,3.318,2.862,3.655,1.523,3.722,4.074,3.662,3.571,5.177,6.321,4.932,4.129)
Y.dat <- list(Y1,Y2,Y3,Y4,Y5)
num_sub <- c(1,3,1)
# calculate HUMcm of Y.dat and num_sub
humc_dynamic(Y.dat,num_sub)

R function that calculates the true values of HUMcm when distribution is known

Description

R function that calculates the true values of HUMcm when distribution is known

Usage

humc_fourclass(distribution, arg1, arg2, num_sub)

Arguments

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

num_sub

the vector of number of subclasses in each main class

Value

The true value of HUMcm under given distribution and parameters


R function that calculates the minimum of HUMcm under given structure

Description

R function that calculates the minimum of HUMcm under given structure

Usage

humc_min(num_sub)

Arguments

num_sub

the vector of number of subclasses in each main class

Value

The minimum of HUMcm


R function that calculates non-parametric bootstrap percentile confidence interval

Description

This function provides non-parametric bootstrap percentile confidence interval of HUMcm

Usage

humc_npci(dat, num_sub, B)

Arguments

dat

test values as a list; each element holds the biomarker values for one disease group, with groups ordered by ascending severity

num_sub

a vector giving the number of subclasses in each main class

B

the number of bootstrap iterations

Value

The non-parametric bootstrap percentile confidence interval of HUMcm

Examples

# Create a list of example data
Y1 <- c(0.9316, 0.9670, 1.3856, 1.3505, 1.0316, 1.1764, 0.7435, 0.5813, 0.4695, 0.3249)
Y2 <- c(1.63950, 1.36535, 1.79859, 0.47961, 1.50978, 1.36525,0.13515, 2.11275, 0.45659)
Y3 <- c(1.89856, 1.30920, 2.38615, 2.34785, 2.92493, 2.71615, 2.75243, 0.95060, 0.38964)
Y4 <- c(2.580,2.570,2.143,3.079,1.765,3.081,2.175,2.306,2.918,2.507,4.261,3.033,1.836,2.321)
Y5 <- c(3.969,3.044,3.318,2.862,3.655,1.523,3.722,4.074,3.662,3.571,5.177,6.321,4.932,4.129)
Y.dat <- list(Y1,Y2,Y3,Y4,Y5)
num_sub <- c(1,3,1)
# calculate the non-parametric bootstrap percentile confidence interval
humc_npci(Y.dat,num_sub,50)
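
To make the bootstrap step concrete, here is a generic sketch of a non-parametric percentile bootstrap for list-structured data. It resamples each group with replacement, recomputes a statistic, and takes empirical quantiles of the replicates. The resampling scheme (within each group), the 95% level, the helper names boot_percentile_ci and auc_first_last, and the placeholder statistic are all assumptions for illustration; the internals of humc_npci are not documented here and may differ.

```r
# Hypothetical generic bootstrap; not the internals of humc_npci().
boot_percentile_ci <- function(dat, stat_fun, B = 200, level = 0.95) {
  boot_est <- replicate(B, {
    # Resample each group separately, keeping its original size (an assumption)
    resampled <- lapply(dat, function(y) sample(y, length(y), replace = TRUE))
    stat_fun(resampled)
  })
  alpha <- 1 - level
  stats::quantile(boot_est, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

# Placeholder statistic: empirical AUC between the first and last groups.
# It stands in for HUMcm only to keep the example self-contained.
auc_first_last <- function(dat) {
  lo <- dat[[1]]
  hi <- dat[[length(dat)]]
  mean(outer(lo, hi, "<") + 0.5 * outer(lo, hi, "=="))
}

set.seed(123)
sim_dat <- list(rnorm(20, 0), rnorm(20, 1), rnorm(20, 2))
boot_ci <- boot_percentile_ci(sim_dat, auc_first_last, B = 200)
print(boot_ci)
```

A larger B gives more stable interval endpoints at the cost of run time; the value 50 in the example above keeps the run short.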

R function to calculate the standardized HUMcm under given structure

Description

R function to calculate the standardized HUMcm under given structure

Usage

humc_standard(value, num_sub)

Arguments

value

the value of HUMcm

num_sub

the vector of number of subclasses in each main class

Value

The standardized HUMcm


PLCO

Description

A subset of the PLCO dataset containing participant IDs, disease class labels and five biomarker measurements.

Format

A data frame with 239 rows and 7 columns:

ID

Participant ID

Group

The disease class label

CA125

Numeric, value of CA125

CA153

Numeric, value of CA153

CA199

Numeric, value of CA199

KLK6

Numeric, value of KLK6

CA724

Numeric, value of CA724

Source

This is a subset of PLCO dataset, available at https://edrn.nci.nih.gov.


R function for plotting the overall ROC curve and chance curve

Description

R function for plotting the overall ROC curve and chance curve

Usage

rocc_curve(k1, k2, distribution, arg1, arg2)

Arguments

k1

number of subclasses in main class-1

k2

number of subclasses in main class-2

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The overall ROC curve and chance curve


R function for plotting the empirical compound ROC curve and chance curve

Description

R function for plotting the empirical compound ROC curve and chance curve

Usage

rocc_curve_emp(dat, num_sub)

Arguments

dat

values as a list; each element holds the biomarker values for one disease group, with groups ordered by ascending severity

num_sub

a vector giving the number of subclasses in each main class

Value

The empirical compound ROC curve and chance curve
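
For orientation, the sketch below draws an ordinary two-class empirical ROC curve with its chance line in base R. This is the familiar special case with a single group per class, where the chance curve is the diagonal; how rocc_curve_emp handles several subclasses, and the shape of its chance curve in that setting, is not covered by this example. The helper empirical_roc and the simulated data are hypothetical.

```r
# Hypothetical two-class sketch; not the compound ROC computed by the package.
empirical_roc <- function(neg, pos) {
  cuts <- sort(unique(c(neg, pos)), decreasing = TRUE)
  # Classify as positive when the marker value is >= the cut-off
  fpr <- c(0, sapply(cuts, function(ct) mean(neg >= ct)))
  tpr <- c(0, sapply(cuts, function(ct) mean(pos >= ct)))
  data.frame(fpr = fpr, tpr = tpr)
}

set.seed(42)
neg <- rnorm(30, mean = 0)    # simulated marker values, lower-severity class
pos <- rnorm(30, mean = 1.5)  # simulated marker values, higher-severity class
roc <- empirical_roc(neg, pos)

plot(roc$fpr, roc$tpr, type = "s",
     xlab = "False positive rate", ylab = "True positive rate")
abline(0, 1, lty = 2)  # chance line for the two-class case
```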


R function for plotting the compound ROC surface and chance surface

Description

R function for plotting the compound ROC surface and chance surface

Usage

rocc_surface(k1, k2, k3, distribution, arg1, arg2)

Arguments

k1

number of subclasses in main class-1

k2

number of subclasses in main class-2

k3

number of subclasses in main class-3

distribution

the distribution of the marker values, either Normal or Gamma

arg1

if the distribution is normal, a vector of the mean parameters of all subclasses; if gamma, the shape parameters

arg2

if the distribution is normal, the variance parameter; if gamma, the rate parameters

Value

The compound ROC surface and chance surface


R function for plotting the empirical compound ROC surface and chance surface

Description

R function for plotting the empirical compound ROC surface and chance surface

Usage

rocc_surface_emp(dat, num_sub)

Arguments

dat

values as a list; each element holds the biomarker values for one disease group, with groups ordered by ascending severity

num_sub

a vector giving the number of subclasses in each main class

Value

The empirical compound ROC surface and chance surface