Objective Functions
Several objective functions are supported for fitting models: \(\chi^2\), \(G^2\), log-likelihood, and \(\text{SSE}\).
The \(\chi^2\) and \(G^2\) statistics are also available in _cumulative variants. The standard variants compute the statistic directly from cell counts. This is the conventional approach, but it makes the procedure sensitive to sparse cells, e.g. when an observer makes few or no responses at a given criterion.
The _cumulative variants instead evaluate the fit with a binary split at each criterion threshold (responses at or above the threshold versus those below) and sum the statistic across all thresholds. This is robust to sparse cells and is useful for obtaining a reliable fit during optimisation. However, the split at each threshold shares responses with the splits at all preceding thresholds, which violates the independence assumption, so the _cumulative variants are not recommended for model comparison.
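The binary splits described above can be illustrated with a small NumPy sketch. The counts are made up, and the bin ordering (lowest rating first, with threshold \(k\) separating the bins below it from those at or above it) is an assumption made for illustration, not something taken from the library.

```python
import numpy as np

# Hypothetical rating counts for one response class, K = 4 bins,
# ordered from the lowest to the highest rating. Bin 2 is empty.
cell_counts = np.array([2, 0, 13, 25])
n = cell_counts.sum()

# Responses at or above each of the K - 1 thresholds: a reverse
# cumulative sum, dropping the first entry (which is always n).
above = np.cumsum(cell_counts[::-1])[::-1][1:]
below = n - above

# Observed cumulative proportions at each threshold.
obs_prop = above / n
```

Here one of the four cells is empty, yet none of the six binary cells is. The same responses are counted at several thresholds, which is the overlap that breaks independence.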
See [1] for reference.
kriterion.objectives
chi_squared
The \(\chi^2\) statistic for a single response class:
\[
\chi^2 = \sum_{k} \frac{(O_k - E_k)^2}{E_k}
\]
where \(O_k\) and \(E_k\) are the observed and expected cell counts in bin \(k\).
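As a rough illustration, the Pearson statistic \(\sum_k (O_k - E_k)^2 / E_k\) can be written in a few lines of NumPy. The function name `chi_squared_sketch` and the counts are hypothetical; this is a sketch of the formula, not the code in src/kriterion/objectives.py.

```python
import numpy as np

def chi_squared_sketch(obs, exp):
    """Pearson chi-squared over cells (hypothetical helper, not the library code)."""
    obs = np.asarray(obs, dtype=float)
    exp = np.asarray(exp, dtype=float)
    # Each cell contributes its squared residual scaled by the expected count.
    return float(np.sum((obs - exp) ** 2 / exp))

# Made-up observed and expected counts for four rating bins.
observed = [10, 20, 30, 40]
expected = [12.0, 18.0, 33.0, 37.0]
stat = chi_squared_sketch(observed, expected)
```

Because each term divides by \(E_k\), a cell with a very small expected count can dominate the sum, which is the sensitivity to sparse cells mentioned in the overview.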
| PARAMETER | DESCRIPTION |
|---|---|
| obs | Observed cell counts \(O_k\) per rating bin. |
| exp | Expected cell counts \(E_k\) from the model. |
Source code in src/kriterion/objectives.py
chi_squared_cumulative
The \(\chi^2\) over binary (above/below) splits at each criterion threshold.
This implementation is provided as an alternative objective for optimisation that may result in a better fit; however, it is not recommended for statistical analysis or model comparison, because the binary splits violate the assumption of independence.
Each of the \(K-1\) thresholds divides the \(n\) trials into two cells: those rated at or above threshold \(k\), and those below. The \(\chi^2\) statistic is summed over both cells of every threshold:
\[
\chi^2 = \sum_{k=1}^{K-1} \sum_{c \in \{a, b\}} \frac{(O_{kc} - E_{kc})^2}{E_{kc}}
\]
where at threshold \(k\) the above-cell is \((O_{ka}, E_{ka}) = (O_k, E_k)\) and the below-cell is \((O_{kb}, E_{kb}) = (n - O_k,\, n - E_k)\). Expanding the inner sum over both cells:
\[
\chi^2 = \sum_{k=1}^{K-1} \left[ \frac{(O_k - E_k)^2}{E_k} + \frac{(O_k - E_k)^2}{n - E_k} \right]
\]
where \(O_k = n\hat{p}_k\) and \(E_k = n p_k\) are the observed and expected cumulative counts at threshold \(k\), and \(n\) is the total trial count.
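The two-cell sum can be sketched as follows, assuming the cumulative counts are recovered from the proportions by multiplying by the total trial count. The name `chi_squared_cumulative_sketch` and the example values are hypothetical, and the code is an illustration of the formula rather than the library's implementation.

```python
import numpy as np

def chi_squared_cumulative_sketch(obs_prop, exp_prop, n):
    """Chi-squared over above/below splits (hypothetical helper)."""
    obs_prop = np.asarray(obs_prop, dtype=float)
    exp_prop = np.asarray(exp_prop, dtype=float)
    o = n * obs_prop  # observed cumulative counts at each threshold
    e = n * exp_prop  # expected cumulative counts at each threshold
    above = (o - e) ** 2 / e
    # The below-cell residual (n - o) - (n - e) has the same square as (o - e).
    below = ((n - o) - (n - e)) ** 2 / (n - e)
    return float(np.sum(above + below))

# Made-up cumulative proportions at K - 1 = 3 thresholds, 40 trials.
stat = chi_squared_cumulative_sketch([0.95, 0.95, 0.625], [0.9, 0.8, 0.6], 40)
```

Both cells at a threshold share the same squared residual, so the per-threshold term simplifies to \(n(\hat{p}_k - p_k)^2 / (p_k(1 - p_k))\), which is a convenient check on any implementation.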
| PARAMETER | DESCRIPTION |
|---|---|
| obs_prop | Observed cumulative proportions \(\hat{p}_k\) at each threshold. |
| exp_prop | Expected cumulative proportions \(p_k\) from the model. |
| n | Total trial count for this response class. |
g_squared
The \(G^2\) (likelihood-ratio) statistic for a single response class [1]:
\[
G^2 = 2 \sum_{k} O_k \ln \frac{O_k}{E_k}
\]
where \(O_k\) and \(E_k\) are the observed and expected cell counts in bin \(k\).
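A minimal NumPy sketch of the likelihood-ratio statistic \(2 \sum_k O_k \ln(O_k / E_k)\) is given below. The name `g_squared_sketch` is hypothetical, and skipping cells with \(O_k = 0\) (the usual convention that \(0 \ln 0 = 0\)) is an assumption; the library may handle empty cells differently.

```python
import numpy as np

def g_squared_sketch(obs, exp):
    """Likelihood-ratio G^2 over cells (hypothetical helper, not the library code)."""
    obs = np.asarray(obs, dtype=float)
    exp = np.asarray(exp, dtype=float)
    # Assumption: empty observed cells contribute nothing (0 * ln 0 -> 0).
    nonzero = obs > 0
    return float(2.0 * np.sum(obs[nonzero] * np.log(obs[nonzero] / exp[nonzero])))

# Made-up counts with one empty observed cell.
stat = g_squared_sketch([0, 30], [3.0, 27.0])
```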
| PARAMETER | DESCRIPTION |
|---|---|
| obs | Observed cell counts \(O_k\) per rating bin. |
| exp | Expected cell counts \(E_k\) from the model. |
g_squared_cumulative
The \(G^2\) (likelihood-ratio) statistic over binary (above/below) splits at each criterion threshold.
This implementation is provided as an alternative objective for optimisation that may result in a better fit; however, it is not recommended for statistical analysis or model comparison, because the binary splits violate the assumption of independence.
Each of the \(K-1\) thresholds divides the \(n\) trials into two cells: those rated at or above threshold \(k\), and those below. The \(G^2\) statistic is summed over both cells of every threshold:
\[
G^2 = 2n \sum_{k=1}^{K-1} \sum_{c \in \{a, b\}} \hat{p}_{kc} \ln \frac{\hat{p}_{kc}}{p_{kc}}
\]
where at threshold \(k\) the above-cell is \((\hat{p}_{ka}, p_{ka}) = (\hat{p}_k, p_k)\) and the below-cell is \((\hat{p}_{kb}, p_{kb}) = (1 - \hat{p}_k,\, 1 - p_k)\). Expanding the inner sum over both cells:
\[
G^2 = 2n \sum_{k=1}^{K-1} \left[ \hat{p}_k \ln \frac{\hat{p}_k}{p_k} + (1 - \hat{p}_k) \ln \frac{1 - \hat{p}_k}{1 - p_k} \right]
\]
where \(\hat{p}_k\) and \(p_k\) are the observed and expected cumulative proportions at threshold \(k\), and \(n\) is the total trial count.
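The above/below sum can be sketched by looping over the two cells at each threshold. The name `g_squared_cumulative_sketch` is hypothetical, and dropping cells whose observed proportion is zero is an assumed convention rather than documented library behaviour.

```python
import numpy as np

def g_squared_cumulative_sketch(obs_prop, exp_prop, n):
    """G^2 over above/below splits (hypothetical helper, not the library code)."""
    p_hat = np.asarray(obs_prop, dtype=float)
    p = np.asarray(exp_prop, dtype=float)
    total = 0.0
    # First pair: the above-cell; second pair: the below-cell.
    for obs_cell, exp_cell in ((p_hat, p), (1.0 - p_hat, 1.0 - p)):
        nonzero = obs_cell > 0  # assumption: 0 * ln 0 is treated as 0
        total += np.sum(obs_cell[nonzero] * np.log(obs_cell[nonzero] / exp_cell[nonzero]))
    return float(2.0 * n * total)

# One threshold, 10 trials: observed proportion 0.5, expected 0.25.
stat = g_squared_cumulative_sketch([0.5], [0.25], 10)
```

For this single threshold the statistic reduces to \(10 \ln(4/3)\), since the two cells contribute \(0.5 \ln 2\) and \(0.5 \ln(2/3)\) before scaling by \(2n = 20\).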
| PARAMETER | DESCRIPTION |
|---|---|
| obs_prop | Observed cumulative proportions \(\hat{p}_k\) at each threshold. |
| exp_prop | Expected cumulative proportions \(p_k\) from the model. |
| n | Total trial count for this response class. |
log_likelihood
Log-likelihood for a single response class.
The log-likelihood of the observed cell counts given predicted cell probabilities is:
\[
\ln L = \sum_{k} O_k \ln p_k
\]
where \(O_k\) are observed cell counts and \(p_k\) are predicted cell probabilities. Returns a negative value.
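A sketch of the count-weighted log-probability sum \(\sum_k O_k \ln p_k\) follows. It assumes the multinomial coefficient, which does not depend on the model, is left out, and that empty cells are skipped; both are assumptions about the convention, and `log_likelihood_sketch` is a hypothetical name rather than the library function.

```python
import numpy as np

def log_likelihood_sketch(obs, exp_prop):
    """Sum of O_k * ln(p_k) over cells (hypothetical helper, not the library code)."""
    obs = np.asarray(obs, dtype=float)
    exp_prop = np.asarray(exp_prop, dtype=float)
    # Assumption: cells with no observations add nothing to the sum.
    nonzero = obs > 0
    return float(np.sum(obs[nonzero] * np.log(exp_prop[nonzero])))

# Made-up counts and predicted cell probabilities (which sum to 1).
ll = log_likelihood_sketch([2, 0, 13, 25], [0.05, 0.05, 0.3, 0.6])
```

Since every \(p_k\) is below 1, each term is negative. An optimiser that minimises its objective would therefore typically be given the negated value.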
| PARAMETER | DESCRIPTION |
|---|---|
| obs | Observed cell counts \(O_k\) per rating bin. |
| exp_prop | Predicted cell probabilities \(p_k\) from the model. |
sse
Sum of squared errors for a single response class.
Calculates the \(\text{SSE}\) between the observed and expected cumulative proportions, \(\sum_k (\hat{p}_k - p_k)^2\).
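For completeness, the sum of squared differences between the two sets of cumulative proportions can be sketched as below. The name `sse_sketch` and the values are hypothetical, and the code is illustrative rather than the library's implementation.

```python
import numpy as np

def sse_sketch(obs_prop, exp_prop):
    """Sum of squared errors between cumulative proportions (hypothetical helper)."""
    obs_prop = np.asarray(obs_prop, dtype=float)
    exp_prop = np.asarray(exp_prop, dtype=float)
    return float(np.sum((obs_prop - exp_prop) ** 2))

# Made-up cumulative proportions at three thresholds.
stat = sse_sketch([0.95, 0.95, 0.625], [0.9, 0.8, 0.6])
```

Unlike the \(\chi^2\) and \(G^2\) sketches, nothing here is weighted by the trial count or the expected proportion, so every threshold counts equally regardless of how many trials inform it.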
| PARAMETER | DESCRIPTION |
|---|---|
| obs_prop | Observed cumulative proportions at each threshold. |
| exp_prop | Expected cumulative proportions from the model. |