
Data Helpers

kriterion.data

Correction

Bases: StrEnum

Method for correcting extreme proportions (0s and 1s) in ROC data.

A proportion of exactly 0 or 1 has an infinite \(z\)-score, so it must be adjusted before conversion to \(z\)-space. See Stanislaw & Todorov (1999) for a discussion of these methods. The default is INCREMENTAL.
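
To illustrate why the adjustment is needed, the sketch below applies the inverse standard-normal CDF (the \(z\)-transform) to a few proportions using Python's standard-library `statistics.NormalDist`. It is an illustration only and not necessarily how kriterion computes \(z\)-scores; the helper name `z_score` is hypothetical. `NormalDist.inv_cdf` raises an error for exactly 0 or 1, whereas SciPy's `norm.ppf` returns -inf and inf for the same inputs.

```python
from statistics import NormalDist, StatisticsError

_STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_score(p: float) -> float:
    """Inverse standard-normal CDF of a proportion strictly between 0 and 1."""
    return _STANDARD_NORMAL.inv_cdf(p)


# The magnitude of z grows without bound as the proportion approaches 0 (or 1).
for p in (0.5, 0.1, 0.001, 1e-9):
    print(f"p = {p:<8g} z = {z_score(p):.3f}")

# An exact 0 or 1 has no finite z-score; the standard library raises instead.
try:
    z_score(0.0)
except StatisticsError as err:
    print("p = 0 rejected:", err)
```

Any of the correction methods listed below keeps the proportions strictly inside (0, 1), so the transform stays finite.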

ATTRIBUTE DESCRIPTION
NONE

No correction applied.

INCREMENTAL

Adds i/k to the i-th cumulative frequency and 1 to the total, where i is the bin index and k is the number of bins. This is the default.

LOGLINEAR

Adds 0.5 to all frequencies and 1 to the total (Hautus, 1995).

EXTREME

Corrects only proportions of exactly 0 or 1: 0 becomes 0.5/n and 1 becomes (n - 0.5)/n, where n is the number of trials (Macmillan & Kaplan, 1985).
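
The formulas above can be sketched in plain Python as follows. This is a reading of the descriptions, not kriterion's implementation: the function name and the `method` strings are hypothetical, the bin index i is assumed to be 1-based, the LOGLINEAR 0.5 is assumed to be added to each cumulative frequency, and n in the EXTREME formula is taken to be the total number of trials.

```python
from itertools import accumulate


def corrected_cumulative_proportions(counts, method="incremental"):
    """Cumulative proportions (final bin dropped) under one correction.

    Hypothetical helper: the `method` strings are illustrative labels,
    not kriterion's enum values.
    """
    n = sum(counts)  # total number of trials
    k = len(counts)  # number of rating bins
    cumulative = list(accumulate(counts))[:-1]  # the dropped last value is n

    if method == "none":
        return [c / n for c in cumulative]
    if method == "incremental":
        # Assumes a 1-based bin index i.
        return [(c + i / k) / (n + 1) for i, c in enumerate(cumulative, start=1)]
    if method == "loglinear":
        # Assumes the 0.5 is added to each cumulative frequency.
        return [(c + 0.5) / (n + 1) for c in cumulative]
    if method == "extreme":
        # Only exact 0s and 1s are moved; other proportions are unchanged.
        return [
            0.5 / n if c == 0 else (n - 0.5) / n if c == n else c / n
            for c in cumulative
        ]
    raise ValueError(f"unknown method: {method!r}")


counts = [0, 5, 5]  # the first bin is empty, so the first raw proportion is 0
for method in ("none", "incremental", "loglinear", "extreme"):
    print(method, corrected_cumulative_proportions(counts, method))
```

With these counts, only the uncorrected variant returns a 0; the other three move it into the open interval (0, 1), each by a different amount.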

ROCData

ROCData(
    signal: list[int],
    noise: list[int],
    condition: str | None = None,
    correction: Correction = INCREMENTAL,
)

Observed frequency counts for a rating-scale ROC experiment.

PARAMETER DESCRIPTION
signal

Frequency counts per rating bin for signal trials.

TYPE: list[int]

noise

Frequency counts per rating bin for noise trials.

TYPE: list[int]

condition

Label for the experimental condition, by default None.

TYPE: str | None DEFAULT: None

correction

Correction method for extreme proportions (0s and 1s), by default Correction.INCREMENTAL.

TYPE: Correction DEFAULT: INCREMENTAL

ATTRIBUTE DESCRIPTION
signal

Frequency counts per rating bin for signal trials.

TYPE: ndarray

noise

Frequency counts per rating bin for noise trials.

TYPE: ndarray

condition

Label for the experimental condition.

TYPE: str or None

correction

Correction method applied when computing proportions.

TYPE: Correction

n_signal

Total number of signal trials.

TYPE: int

n_noise

Total number of noise trials.

TYPE: int

signal_proportions

Cumulative proportions for signal trials at each rating threshold.

TYPE: ndarray

noise_proportions

Cumulative proportions for noise trials at each rating threshold.

TYPE: ndarray

z_signal

\(z\)-transformed cumulative signal proportions.

TYPE: ndarray

z_noise

\(z\)-transformed cumulative noise proportions.

TYPE: ndarray

RAISES DESCRIPTION
ValueError

If signal and noise have different numbers of bins.

Source code in src/kriterion/data.py
def __init__(
    self,
    signal: list[int],
    noise: list[int],
    condition: str | None = None,
    correction: Correction = Correction.INCREMENTAL,
) -> None:
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same number of bins")
    self.signal = np.asarray(signal)
    self.noise = np.asarray(noise)
    self.condition = condition
    self.correction = correction
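
As a rough illustration of how the derived attributes relate to the raw counts, the sketch below pairs uncorrected cumulative noise and signal proportions into ROC points, one per rating threshold. The helper `roc_points` and the example counts are hypothetical and use only the standard library; this is not kriterion's own code, and it applies no correction.

```python
from itertools import accumulate


def roc_points(signal, noise):
    """Return (false-alarm rate, hit rate) pairs, one per rating threshold."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same number of bins")
    n_signal, n_noise = sum(signal), sum(noise)
    # Accumulate from the strongest-signal bin and drop the final value (1.0).
    hit_rates = [c / n_signal for c in accumulate(signal)][:-1]
    false_alarm_rates = [c / n_noise for c in accumulate(noise)][:-1]
    return list(zip(false_alarm_rates, hit_rates))


# Hypothetical three-bin data, ordered from strongest signal to strongest noise.
points = roc_points(signal=[30, 20, 10], noise=[10, 20, 30])
for false_alarm_rate, hit_rate in points:
    print(f"false alarms = {false_alarm_rate:.3f}, hits = {hit_rate:.3f}")
```

Three rating bins yield two thresholds, so the result has two points; mismatched list lengths raise the same kind of `ValueError` that the constructor documents.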

compute_proportions

compute_proportions(
    arr: ndarray, correction: Correction = INCREMENTAL
) -> ndarray

Compute cumulative proportions from frequency counts.

PARAMETER DESCRIPTION
arr

Frequency counts per rating bin, ordered from strongest signal to strongest noise.

TYPE: ndarray

correction

Correction method to apply to avoid 0s and 1s in the output, by default Correction.INCREMENTAL.

TYPE: Correction DEFAULT: INCREMENTAL

RETURNS DESCRIPTION
ndarray

Cumulative proportions with the final 1.0 omitted (n - 1 values for n rating bins).

Source code in src/kriterion/data.py
def compute_proportions(
    arr: np.ndarray,
    correction: Correction = Correction.INCREMENTAL,
) -> np.ndarray:
    """Compute cumulative proportions from frequency counts.

    Parameters
    ----------
    arr : np.ndarray
        Frequency counts per rating bin, ordered from strongest signal
        to strongest noise.
    correction : Correction, optional
        Correction method to apply to avoid 0s and 1s in the output,
        by default `Correction.INCREMENTAL`.

    Returns
    -------
    np.ndarray
        Cumulative proportions with the final 1.0 omitted (n - 1 values).
    """
    match correction:
        case Correction.NONE:
            return _proportions_uncorrected(arr)
        case Correction.INCREMENTAL:
            return _proportions_incremental(arr)
        case Correction.LOGLINEAR:
            return _proportions_loglinear(arr)
        case Correction.EXTREME:
            return _proportions_extreme(arr)