Objective Functions

Several objective functions are supported for fitting models: \(\chi^2\), \(G^2\), log-likelihood, and \(\text{SSE}\).

The \(\chi^2\) and \(G^2\) statistics are also available in `_cumulative` variants. The standard variants compute the statistic directly from the cell counts. This is conventional, but it makes the fit sensitive to sparse cells, e.g. when an observer makes few or no responses at a given criterion.

The `_cumulative` variants instead evaluate the fit with a binary split at each criterion threshold (responses at or above the threshold versus those below) and sum the statistic across all thresholds. This is robust to sparse cells and is useful for obtaining a reliable fit during optimisation. However, each threshold's split shares responses with the splits at all preceding thresholds, which violates the independence assumption, so these variants are not recommended for model comparison.
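
As a rough illustration of why the binary splits are less affected by sparse cells, the sketch below converts made-up cell counts containing one empty bin into cumulative "at or above" proportions. The counts, the bin ordering (lowest rating first), and the variable names are assumptions for this example only; how kriterion itself derives these proportions is not shown on this page.

```python
import numpy as np

# Hypothetical counts for K = 5 rating bins, ordered from lowest to highest
# rating. The second bin is empty, i.e. a sparse cell.
counts = np.array([12, 0, 3, 25, 60])
n = int(counts.sum())

# Reverse cumulative sum: at_or_above[i] counts responses in bin i or higher.
at_or_above = np.cumsum(counts[::-1])[::-1]

# The K - 1 thresholds sit between adjacent bins, so drop the first entry
# (which always equals n) and convert the rest to proportions.
above_counts = at_or_above[1:]
below_counts = n - above_counts
obs_prop = above_counts / n

print(obs_prop)
print(above_counts, below_counts)
```

In this example every binary split has two non-empty cells even though one of the original bins is empty, and the same responses appear in several splits, which is the source of the dependence noted above.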

See Cressie and Read (1984)¹ for reference.

kriterion.objectives

chi_squared

chi_squared(obs: ndarray, exp: ndarray) -> float

The \(\chi^2\) statistic for a single response class.

\[ \chi^2 = \sum_{k=1}^{K} \frac{(O_k - E_k)^2}{E_k} \]

where \(O_k\) and \(E_k\) are the observed and expected cell counts in bin \(k\).

PARAMETER	DESCRIPTION
`obs`	Observed cell counts \(O_k\) per rating bin. TYPE: `ndarray`
`exp`	Expected cell counts \(E_k\) from the model. TYPE: `ndarray`

Source code in src/kriterion/objectives.py

def chi_squared(obs: np.ndarray, exp: np.ndarray) -> float:
    """The $\\chi^2$ statistic for a single response class.

    $$
    \\chi^2 = \\sum_{k=1}^{K} \\frac{(O_k - E_k)^2}{E_k}
    $$

    where $O_k$ and $E_k$ are the observed and expected cell counts in bin $k$.

    Parameters
    ----------
    obs :
        Observed cell counts $O_k$ per rating bin.
    exp :
        Expected cell counts $E_k$ from the model.
    """
    return float(np.sum((obs - exp) ** 2 / exp))
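
A minimal usage sketch follows. The function body is copied from the listing above so that the snippet runs without the package installed; the counts are invented for illustration.

```python
import numpy as np


def chi_squared(obs: np.ndarray, exp: np.ndarray) -> float:
    # Body copied from the source listing above.
    return float(np.sum((obs - exp) ** 2 / exp))


# Hypothetical observed and expected counts for K = 4 rating bins.
obs = np.array([18.0, 22.0, 30.0, 30.0])
exp = np.array([20.0, 20.0, 30.0, 30.0])

# By hand: 4/20 + 4/20 + 0 + 0 = 0.4
stat = chi_squared(obs, exp)
print(stat)
```

Because each term divides by \(E_k\), an expected count of exactly zero makes the statistic undefined, so the expected counts passed in need to be strictly positive.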

chi_squared_cumulative

chi_squared_cumulative(
    obs_prop: ndarray, exp_prop: ndarray, n: int
) -> float

The \(\chi^2\) over binary (above/below) splits at each criterion threshold.

This variant is provided as an alternative objective for optimisation that may result in a better fit; however, it is not recommended for statistical analysis or model comparison, because the binary splits are not independent.

Each of the \(K-1\) thresholds divides the \(n\) trials into two cells: those rated at or above threshold \(k\), and those below. The \(\chi^2\) statistic is summed over both cells of every threshold:

\[ \chi^2 = \sum_{k=1}^{K-1} \sum_{j \in \{a,\,b\}} \frac{(O_{kj} - E_{kj})^2}{E_{kj}} \]

where at threshold \(k\) the above-cell is \((O_{ka}, E_{ka}) = (O_k, E_k)\) and the below-cell is \((O_{kb}, E_{kb}) = (n - O_k,\, n - E_k)\). Expanding the inner sum over both cells:

\[ \chi^2 = \sum_{k=1}^{K-1} \left[ \frac{(O_k - E_k)^2}{E_k} + \frac{((n - O_k) - (n - E_k))^2}{n - E_k} \right] \]

where \(O_k = n\hat{p}_k\) and \(E_k = n p_k\) are the observed and expected cumulative counts at threshold \(k\), and \(n\) is the total trial count.

PARAMETER	DESCRIPTION
`obs_prop`	Observed cumulative proportions \(\hat{p}_k\) at each threshold. TYPE: `ndarray`
`exp_prop`	Expected cumulative proportions \(p_k\) from the model. TYPE: `ndarray`
`n`	Total trial count for this response class. TYPE: `int`

Source code in src/kriterion/objectives.py

def chi_squared_cumulative(obs_prop: np.ndarray, exp_prop: np.ndarray, n: int) -> float:
    """The $\\chi^2$ over binary (above/below) splits at each criterion threshold.

    This variant is provided as an alternative objective for optimisation that may
    result in a better fit; however, it is not recommended for statistical analysis or
    model comparison, because the binary splits are not independent.

    Each of the $K-1$ thresholds divides the $n$ trials into two cells: those
    rated at or above threshold $k$, and those below. The $\\chi^2$
    statistic is summed over both cells of every threshold:

    $$
    \\chi^2 = \\sum_{k=1}^{K-1} \\sum_{j \\in \\{a,\\,b\\}}
        \\frac{(O_{kj} - E_{kj})^2}{E_{kj}}
    $$

    where at threshold $k$ the above-cell is $(O_{ka}, E_{ka}) = (O_k, E_k)$ and
    the below-cell is $(O_{kb}, E_{kb}) = (n - O_k,\\, n - E_k)$. Expanding the
    inner sum over both cells:

    $$
    \\chi^2 = \\sum_{k=1}^{K-1} \\left[
        \\frac{(O_k - E_k)^2}{E_k}
        + \\frac{((n - O_k) - (n - E_k))^2}{n - E_k}
    \\right]
    $$

    where $O_k = n\\hat{p}_k$ and $E_k = n p_k$ are the observed and expected
    cumulative counts at threshold $k$, and $n$ is the total trial count.

    Parameters
    ----------
    obs_prop :
        Observed cumulative proportions $\\hat{p}_k$ at each threshold.
    exp_prop :
        Expected cumulative proportions $p_k$ from the model.
    n :
        Total trial count for this response class.
    """
    obs_count = obs_prop * n
    exp_count = exp_prop * n
    delta_sq = (obs_count - exp_count) ** 2
    above = delta_sq / exp_count
    below = delta_sq / (n - exp_count)
    return float(np.sum(above + below))
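
Both cells at a threshold share the numerator \((O_k - E_k)^2\), so each threshold's contribution simplifies to \(n(\hat{p}_k - p_k)^2 / (p_k(1 - p_k))\). The sketch below checks this numerically. The function body is copied from the listing above, and the proportions and trial count are made up for illustration.

```python
import numpy as np


def chi_squared_cumulative(obs_prop: np.ndarray, exp_prop: np.ndarray, n: int) -> float:
    # Body copied from the source listing above.
    obs_count = obs_prop * n
    exp_count = exp_prop * n
    delta_sq = (obs_count - exp_count) ** 2
    above = delta_sq / exp_count
    below = delta_sq / (n - exp_count)
    return float(np.sum(above + below))


# Hypothetical cumulative proportions at K - 1 = 3 thresholds.
obs_prop = np.array([0.88, 0.85, 0.60])
exp_prop = np.array([0.90, 0.80, 0.60])
n = 100

direct = chi_squared_cumulative(obs_prop, exp_prop, n)

# Equivalent per-threshold form: n * (p_hat - p)^2 / (p * (1 - p)).
closed_form = float(
    np.sum(n * (obs_prop - exp_prop) ** 2 / (exp_prop * (1 - exp_prop)))
)

print(direct, closed_form)
```

The closed form also makes the failure cases visible: an expected cumulative proportion of exactly 0 or 1 leads to a division by zero.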

g_squared

g_squared(obs: ndarray, exp: ndarray) -> float

The \(G^2\) (likelihood-ratio) statistic for a single response class¹:

\[ G^2 = 2 \sum_{k=1}^{K} O_k \ln \left( \frac{O_k}{E_k} \right) \]

where \(O_k\) and \(E_k\) are the observed and expected cell counts in bin \(k\).

¹ Cressie, N., & Read, T. R. C. (1984). Multinomial Goodness-of-Fit Tests. Journal of the Royal Statistical Society. Series B (Methodological), 46(3), 440–464.

PARAMETER	DESCRIPTION
`obs`	Observed cell counts \(O_k\) per rating bin. TYPE: `ndarray`
`exp`	Expected cell counts \(E_k\) from the model. TYPE: `ndarray`

Source code in src/kriterion/objectives.py

def g_squared(obs: np.ndarray, exp: np.ndarray) -> float:
    """The $G^2$ (likelihood-ratio) statistic for a single response class[^cressie_read_1984]:

    $$
    G^2 = 2 \\sum_{k=1}^{K} O_k \\ln \\left( \\frac{O_k}{E_k} \\right)
    $$

    where $O_k$ and $E_k$ are the observed and expected cell counts in bin $k$.

    [^cressie_read_1984]: [Cressie, N., & Timothy R. C. Read. (1984). Multinomial
    Goodness-of-Fit Tests. Journal of the Royal Statistical Society. Series B
    (Methodological), 46(3), 440–464](http://www.jstor.org/stable/2345686).

    Parameters
    ----------
    obs :
        Observed cell counts $O_k$ per rating bin.
    exp :
        Expected cell counts $E_k$ from the model.
    """
    return 2 * float(np.sum(xlogy(obs, obs / exp)))
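
The listing relies on `scipy.special.xlogy`, which returns 0 when its first argument is 0, so an empty observed cell contributes nothing instead of producing `nan`. The sketch below illustrates this with invented counts; the function body is copied from the listing above, and the comparison values are computed only for this example.

```python
import numpy as np
from scipy.special import xlogy


def g_squared(obs: np.ndarray, exp: np.ndarray) -> float:
    # Body copied from the source listing above.
    return 2 * float(np.sum(xlogy(obs, obs / exp)))


# Hypothetical counts; the first observed cell is empty.
obs = np.array([0.0, 10.0, 30.0, 60.0])
exp = np.array([2.0, 12.0, 28.0, 58.0])

stat = g_squared(obs, exp)

# Reference value: sum only over the non-empty observed cells.
nonzero = obs > 0
by_hand = 2 * float(np.sum(obs[nonzero] * np.log(obs[nonzero] / exp[nonzero])))

# A naive O * ln(O / E) evaluates 0 * ln(0) = 0 * (-inf), which is nan.
with np.errstate(divide="ignore", invalid="ignore"):
    naive = 2 * float(np.sum(obs * np.log(obs / exp)))

print(stat, by_hand, naive)
```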

g_squared_cumulative

g_squared_cumulative(
    obs_prop: ndarray, exp_prop: ndarray, n: int
) -> float

The \(G^2\) (likelihood-ratio) statistic over binary (above/below) splits at each criterion threshold.

This variant is provided as an alternative objective for optimisation that may result in a better fit; however, it is not recommended for statistical analysis or model comparison, because the binary splits are not independent.

Each of the \(K-1\) thresholds divides the \(n\) trials into two cells: those rated at or above threshold \(k\), and those below. The \(G^2\) statistic is summed over both cells of every threshold:

\[ G^2 = 2n \sum_{k=1}^{K-1} \sum_{j \in \{a,\,b\}} \hat{p}_{kj} \ln \left( \frac{\hat{p}_{kj}}{p_{kj}} \right) \]

where at threshold \(k\) the above-cell is \((\hat{p}_{ka}, p_{ka}) = (\hat{p}_k, p_k)\) and the below-cell is \((\hat{p}_{kb}, p_{kb}) = (1 - \hat{p}_k,\, 1 - p_k)\). Expanding the inner sum over both cells:

\[ G^2 = 2n \sum_{k=1}^{K-1} \left[ \hat{p}_k \ln \left( \frac{\hat{p}_k}{p_k} \right) + (1 - \hat{p}_k) \ln \left( \frac{1 - \hat{p}_k}{1 - p_k} \right) \right] \]

where \(\hat{p}_k\) and \(p_k\) are the observed and expected cumulative proportions at threshold \(k\), and \(n\) is the total trial count.

PARAMETER	DESCRIPTION
`obs_prop`	Observed cumulative proportions \(\hat{p}_k\) at each threshold. TYPE: `ndarray`
`exp_prop`	Expected cumulative proportions \(p_k\) from the model. TYPE: `ndarray`
`n`	Total trial count for this response class. TYPE: `int`

Source code in src/kriterion/objectives.py

def g_squared_cumulative(obs_prop: np.ndarray, exp_prop: np.ndarray, n: int) -> float:
    """The $G^2$ (likelihood-ratio) statistic over binary (above/below) splits at each criterion threshold.

    This variant is provided as an alternative objective for optimisation that may
    result in a better fit; however, it is not recommended for statistical analysis or
    model comparison, because the binary splits are not independent.

    Each of the $K-1$ thresholds divides the $n$ trials into two cells: those
    rated at or above threshold $k$, and those below. The $G^2$
    statistic is summed over both cells of every threshold:

    $$
    G^2 = 2n \\sum_{k=1}^{K-1} \\sum_{j \\in \\{a,\\,b\\}}
        \\hat{p}_{kj} \\ln \\left( \\frac{\\hat{p}_{kj}}{p_{kj}} \\right)
    $$

    where at threshold $k$ the above-cell is $(\\hat{p}_{ka}, p_{ka}) = (\\hat{p}_k, p_k)$ and
    the below-cell is $(\\hat{p}_{kb}, p_{kb}) = (1 - \\hat{p}_k,\\, 1 - p_k)$. Expanding the
    inner sum over both cells:

    $$
    G^2 = 2n \\sum_{k=1}^{K-1} \\left[
        \\hat{p}_k \\ln \\left( \\frac{\\hat{p}_k}{p_k} \\right) + (1 - \\hat{p}_k) \\ln \\left( \\frac{1 - \\hat{p}_k}{1 - p_k} \\right)
    \\right]
    $$

    where $\\hat{p}_k$ and $p_k$ are the observed and expected cumulative proportions
    at threshold $k$, and $n$ is the total trial count.

    Parameters
    ----------
    obs_prop :
        Observed cumulative proportions $\\hat{p}_k$ at each threshold.
    exp_prop :
        Expected cumulative proportions $p_k$ from the model.
    n :
        Total trial count for this response class.
    """
    above = xlogy(obs_prop, obs_prop / exp_prop)
    below = xlogy((1 - obs_prop), (1 - obs_prop) / (1 - exp_prop))
    return 2 * n * float(np.sum(above + below))
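
A short sketch of two properties that follow from the formula: the statistic is zero when observed and expected proportions coincide, and it scales linearly with the trial count \(n\) for fixed proportions. The function body is copied from the listing above; the proportions and trial counts are hypothetical.

```python
import numpy as np
from scipy.special import xlogy


def g_squared_cumulative(obs_prop: np.ndarray, exp_prop: np.ndarray, n: int) -> float:
    # Body copied from the source listing above.
    above = xlogy(obs_prop, obs_prop / exp_prop)
    below = xlogy((1 - obs_prop), (1 - obs_prop) / (1 - exp_prop))
    return 2 * n * float(np.sum(above + below))


# Hypothetical cumulative proportions at K - 1 = 3 thresholds.
obs_prop = np.array([0.88, 0.85, 0.60])
exp_prop = np.array([0.90, 0.80, 0.60])

# Identical proportions: every log term is ln(1) = 0.
perfect = g_squared_cumulative(exp_prop, exp_prop, 100)

# Same proportions with 100 and 400 trials: the statistic quadruples.
small = g_squared_cumulative(obs_prop, exp_prop, 100)
large = g_squared_cumulative(obs_prop, exp_prop, 400)

print(perfect, small, large)
```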

log_likelihood

log_likelihood(obs: ndarray, exp_prop: ndarray) -> float

Log-likelihood for a single response class.

The log-likelihood of the observed cell counts given predicted cell probabilities is:

\[ \ell = \sum_{k=1}^{K} O_k \ln p_k \]

where \(O_k\) are observed cell counts and \(p_k\) are predicted cell probabilities. The result is non-positive because each \(\ln p_k \le 0\).

PARAMETER	DESCRIPTION
`obs`	Observed cell counts \(O_k\) per rating bin. TYPE: `ndarray`
`exp_prop`	Predicted cell probabilities \(p_k\) from the model. TYPE: `ndarray`

Source code in src/kriterion/objectives.py

def log_likelihood(obs: np.ndarray, exp_prop: np.ndarray) -> float:
    """Log-likelihood for a single response class.

    The log-likelihood of the observed cell counts given predicted
    cell probabilities is:

    $$
    \\ell = \\sum_{k=1}^{K} O_k \\ln p_k
    $$

    where $O_k$ are observed cell counts and $p_k$ are predicted cell
    probabilities. The result is non-positive because each $\\ln p_k \\le 0$.

    Parameters
    ----------
    obs :
        Observed cell counts $O_k$ per rating bin.
    exp_prop :
        Predicted cell probabilities $p_k$ from the model.
    """
    return float(np.sum(xlogy(obs, exp_prop)))
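
The log-likelihood and \(G^2\) are closely related: with \(E_k = n p_k\), \(G^2\) equals twice the gap between the log-likelihood of the saturated model (which predicts \(p_k = O_k / n\)) and that of the fitted model. The sketch below checks this identity numerically. The function body is copied from the listing above, and the counts and probabilities are invented for illustration.

```python
import numpy as np
from scipy.special import xlogy


def log_likelihood(obs: np.ndarray, exp_prop: np.ndarray) -> float:
    # Body copied from the source listing above.
    return float(np.sum(xlogy(obs, exp_prop)))


# Hypothetical counts and model probabilities for K = 4 rating bins.
obs = np.array([0.0, 10.0, 30.0, 60.0])
n = obs.sum()
exp_prop = np.array([0.02, 0.12, 0.28, 0.58])

ll_model = log_likelihood(obs, exp_prop)
# Saturated model: predicted probabilities equal the observed proportions.
ll_saturated = log_likelihood(obs, obs / n)

g2_from_ll = 2 * (ll_saturated - ll_model)
# G^2 computed directly from observed and expected counts, E_k = n * p_k.
g2_direct = 2 * float(np.sum(xlogy(obs, obs / (n * exp_prop))))

print(ll_model, g2_from_ll, g2_direct)
```

Maximising the log-likelihood and minimising \(G^2\) therefore select the same parameters for a given data set, since the saturated term does not depend on the model.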

sse

sse(obs_prop: ndarray, exp_prop: ndarray) -> float

Sum of squared errors for a single response class.

Calculates the \(\text{SSE}\) between the observed and expected cumulative proportions.

PARAMETER	DESCRIPTION
`obs_prop`	Observed cumulative proportions at each threshold. TYPE: `ndarray`
`exp_prop`	Expected cumulative proportions from the model. TYPE: `ndarray`

Source code in src/kriterion/objectives.py

def sse(obs_prop: np.ndarray, exp_prop: np.ndarray) -> float:
    """Sum of squared errors for a single response class.

    Calculates the $\\text{SSE}$ between the observed and expected cumulative proportions.

    Parameters
    ----------
    obs_prop :
        Observed cumulative proportions at each threshold.
    exp_prop :
        Expected cumulative proportions from the model.
    """
    return float(np.sum((obs_prop - exp_prop) ** 2))
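
A minimal usage sketch, with the function body copied from the listing above and made-up proportions:

```python
import numpy as np


def sse(obs_prop: np.ndarray, exp_prop: np.ndarray) -> float:
    # Body copied from the source listing above.
    return float(np.sum((obs_prop - exp_prop) ** 2))


# Hypothetical cumulative proportions at K - 1 = 3 thresholds.
obs_prop = np.array([0.88, 0.85, 0.60])
exp_prop = np.array([0.90, 0.80, 0.60])

# By hand: 0.02^2 + 0.05^2 + 0^2 = 0.0004 + 0.0025 + 0 = 0.0029
value = sse(obs_prop, exp_prop)
print(value)
```

Unlike the \(\chi^2\) and \(G^2\) variants, `sse` takes no trial count, so every threshold is weighted equally regardless of how many trials underlie it.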

¹ Cressie, N., & Read, T. R. C. (1984). Multinomial Goodness-of-Fit Tests. Journal of the Royal Statistical Society. Series B (Methodological), 46(3), 440–464.