
ConfigLlm

ByJG\TextClassifier\Llm\ConfigLlm controls when the LLM is consulted and whether its decision is fed back as training data. All setters return $this for fluent chaining.

Parameters

Parameter | Setter | Default | Applies to | Description
lowerBound | setLowerBound(float) | 0.35 | BinaryClassifier | Escalate when score ≥ this value…
upperBound | setUpperBound(float) | 0.65 | BinaryClassifier | …and score ≤ this value. Scores outside [lowerBound, upperBound] are considered certain.
minConfidence | setMinConfidence(float) | 0.65 | NaiveBayes | Escalate when the top category score is below this threshold.
minMargin | setMinMargin(float) | 0.15 | NaiveBayes | Escalate when the gap between the top and second category score is below this threshold.
autoLearn | setAutoLearn(bool) | true | Both | When true, the LLM decision is fed back to the classifier as training data and the text is re-classified.

Usage

use ByJG\TextClassifier\Llm\ConfigLlm;

// Wider uncertainty zone than the default [0.35, 0.65] → more LLM calls, faster learning
$config = (new ConfigLlm())
    ->setLowerBound(0.30)
    ->setUpperBound(0.70);

// Stricter NaiveBayes thresholds than the defaults → escalate unless the classifier is clearly confident
$config = (new ConfigLlm())
    ->setMinConfidence(0.80)
    ->setMinMargin(0.20);

// Disable active learning (LLM decides but classifier is not updated)
$config = (new ConfigLlm())
    ->setAutoLearn(false);

Getters

Method | Returns
getLowerBound() | float
getUpperBound() | float
getMinConfidence() | float
getMinMargin() | float
isAutoLearn() | bool

Tuning guidance

Binary classifier (lowerBound / upperBound)

The classifier score represents the spam probability, so a score of 0.5 means "completely uncertain". The uncertainty zone [lowerBound, upperBound] is the range of scores in which the statistical model lacks confidence and the LLM is consulted:

  • Narrowing the zone (e.g. [0.45, 0.55]) → fewer LLM calls, but borderline cases just outside the zone are decided by the classifier alone.
  • Widening the zone (e.g. [0.30, 0.70]) → more LLM calls, faster model improvement.
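
The range check described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the library's implementation: shouldEscalateBinary is a hypothetical helper, and the bounds are passed as plain floats (in real code they would presumably come from getLowerBound() and getUpperBound()).

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): returns true when the score
 * lies inside the inclusive uncertainty zone [lowerBound, upperBound].
 */
function shouldEscalateBinary(float $score, float $lowerBound, float $upperBound): bool
{
    return $score >= $lowerBound && $score <= $upperBound;
}

// Same score, different zones: widening the zone turns a "certain" score
// into one that is escalated to the LLM.
var_dump(shouldEscalateBinary(0.50, 0.35, 0.65)); // inside the default zone
var_dump(shouldEscalateBinary(0.68, 0.35, 0.65)); // outside the default zone
var_dump(shouldEscalateBinary(0.68, 0.30, 0.70)); // inside the widened zone
```

A score of 0.68 is treated as certain with the defaults but is escalated once the zone is widened to [0.30, 0.70], which is why a wider zone costs more LLM calls.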

Multi-class classifier (minConfidence / minMargin)

  • Raise minConfidence to require a stronger winning signal before trusting the classifier.
  • Raise minMargin to require a bigger lead over the second-best category.
  • The two thresholds are checked independently; failing either one triggers escalation.
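
The two multi-class checks can be sketched as follows. This is a hedged illustration rather than the library's code: shouldEscalateMultiClass is a hypothetical helper, the scores are assumed to arrive as a category => score array, and the handling of empty or single-category input is an assumption made only for this example.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library).
 *
 * @param array<string, float> $scores category name => score
 */
function shouldEscalateMultiClass(array $scores, float $minConfidence, float $minMargin): bool
{
    if ($scores === []) {
        return true; // assumption: with no scores there is nothing to trust
    }

    rsort($scores); // highest score first; keys are discarded
    $top = $scores[0];
    $second = $scores[1] ?? 0.0; // assumption: a lone category has no runner-up

    // Either condition on its own is enough to escalate.
    return $top < $minConfidence || ($top - $second) < $minMargin;
}

// Clear winner under the default thresholds (0.65 / 0.15).
var_dump(shouldEscalateMultiClass(['spam' => 0.70, 'ham' => 0.20, 'promo' => 0.10], 0.65, 0.15));

// Top score below minConfidence.
var_dump(shouldEscalateMultiClass(['spam' => 0.48, 'ham' => 0.42, 'promo' => 0.10], 0.65, 0.15));

// Confidence passes, but the lead over the runner-up is below minMargin.
var_dump(shouldEscalateMultiClass(['spam' => 0.55, 'ham' => 0.40, 'promo' => 0.05], 0.50, 0.20));
```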

autoLearn

Set to false when you want the LLM to act as a fallback but not modify the training data (e.g. read-only mode or when the LLM is unreliable).
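
To make the effect of the flag concrete, here is a minimal sketch of the fallback flow. Everything in it is hypothetical: classifyWithFallback and the three callables are stand-ins rather than library APIs, and returning the re-classified label when autoLearn is on is an assumption about behavior the documentation does not spell out.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical flow, not the library's implementation.
 *
 * $classify: fn(string): array{label: string, uncertain: bool}
 * $askLlm:   fn(string): string              (returns a label)
 * $train:    fn(string $text, string $label): void
 */
function classifyWithFallback(
    string $text,
    callable $classify,
    callable $askLlm,
    callable $train,
    bool $autoLearn
): string {
    $result = $classify($text);
    if (!$result['uncertain']) {
        return $result['label']; // classifier is confident, no LLM call
    }

    $label = $askLlm($text);
    if (!$autoLearn) {
        return $label; // LLM decides, training data stays untouched
    }

    $train($text, $label);            // feed the LLM decision back
    return $classify($text)['label']; // assumption: return the re-classified label
}

// In-memory stand-ins for the classifier and the LLM.
$memory = [];
$classify = function (string $text) use (&$memory): array {
    return isset($memory[$text])
        ? ['label' => $memory[$text], 'uncertain' => false]
        : ['label' => 'ham', 'uncertain' => true];
};
$askLlm = fn(string $text): string => 'spam';
$train = function (string $text, string $label) use (&$memory): void {
    $memory[$text] = $label;
};

// autoLearn off: the LLM label is returned and $memory stays empty.
$readOnlyLabel = classifyWithFallback('win a prize', $classify, $askLlm, $train, false);
$memoryAfterReadOnly = $memory;

// autoLearn on: the LLM label is stored before the text is re-classified.
$learnedLabel = classifyWithFallback('win a prize', $classify, $askLlm, $train, true);
```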