BinaryClassifier Class

ByJG\TextClassifier\BinaryClassifier is the binary Robinson-Fisher Bayesian spam filter.

Constructor

new BinaryClassifier(
    ConfigBinaryClassifier $config,
    StorageInterface $storage,
    LexerInterface $lexer,
    ?LlmInterface $llm = null,
    ?ConfigLlm $configLlm = null,
)

| Parameter | Type | Description |
|---|---|---|
| $config | ConfigBinaryClassifier | Tuning parameters |
| $storage | ByJG\TextClassifier\Storage\StorageInterface | Persistence backend |
| $lexer | ByJG\TextClassifier\Lexer\LexerInterface | Tokeniser |
| $llm | ?LlmInterface | Optional LLM for uncertain-score escalation |
| $configLlm | ?ConfigLlm | LLM escalation thresholds (defaults apply when null) |

Methods

classify()

public function classify(string|null $text): ClassificationResult|string

Classifies $text and returns a ClassificationResult on success, or a string error code on failure.

| Return value | Meaning |
|---|---|
| ClassificationResult | Classification succeeded; inspect ->choice, ->score, etc. |
| BinaryClassifier::CLASSIFYER_TEXT_MISSING | $text was null |
| StandardLexer::LEXER_TEXT_NOT_STRING | $text is not a string |
| StandardLexer::LEXER_TEXT_EMPTY | $text is an empty string |

use ByJG\TextClassifier\ClassificationResult;

$result = $b8->classify($text);
if (!($result instanceof ClassificationResult)) {
    // $result is an error code string here; stop before reading result fields
    throw new \RuntimeException("Classification failed: {$result}");
}

echo $result->choice; // 'spam' or 'ham'
echo $result->score; // float 0.0–1.0
var_dump($result->escalated); // true if the LLM was consulted

When an LLM is injected and autoLearn=true, the statistical model is retrained on the LLM decision before returning the final result. statScores always reflects the raw score before any LLM involvement.
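
Because classify() returns either an object or an error code string, callers may prefer to convert the string case into an exception once and work with a single return type afterwards. The following is a minimal sketch of that idea, not part of the library: the helper name classifyOrFail is hypothetical, and the classifier is typed as a plain object so the snippet runs on its own.

```php
<?php

/**
 * Call classify() and turn an error code string into an exception.
 * classifyOrFail() is a hypothetical helper, not a library function.
 */
function classifyOrFail(object $classifier, ?string $text): object
{
    $result = $classifier->classify($text);
    if (is_string($result)) {
        // classify() returns a string only on failure; the string is the error code.
        throw new RuntimeException("Classification failed: {$result}");
    }
    return $result;
}
```

With a real BinaryClassifier instance, the returned object would be the ClassificationResult described above, so ->choice and ->score can be read without a further type check.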

learn()

public function learn(string|null $text, string|null $category): null|string

Trains the classifier with $text as a known example of $category.

| Parameter | Accepted values |
|---|---|
| $category | BinaryClassifier::SPAM or BinaryClassifier::HAM |

Returns null on success or an error code string on failure. See Error Codes.
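
Since learn() signals success with null, training on a batch of labelled samples can be sketched as a loop that keeps only the non-null outcomes. The helper name learnAll and the [text, category] pair layout are assumptions made for illustration; the classifier is typed as a plain object so the snippet runs without the library.

```php
<?php

/**
 * Train on labelled samples and collect the failures.
 * learnAll() is a hypothetical helper, not a library function.
 *
 * @param object $classifier Anything exposing learn(?string, ?string): ?string
 * @param array<int, array{0: ?string, 1: ?string}> $samples [text, category] pairs
 * @return array<int, string> Error codes keyed by the index of the failing sample
 */
function learnAll(object $classifier, array $samples): array
{
    $errors = [];
    foreach ($samples as $index => [$text, $category]) {
        $outcome = $classifier->learn($text, $category);
        if ($outcome !== null) {
            // learn() returns null on success, so any string is an error code.
            $errors[$index] = $outcome;
        }
    }
    return $errors;
}
```

An empty array from learnAll() means every sample was accepted. In real use the category of each pair would be BinaryClassifier::SPAM or BinaryClassifier::HAM.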

unlearn()

public function unlearn(string|null $text, string|null $category): null|string

Reverses a previous learn() call. Parameters and return values are identical to those of learn().
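
One plausible use of unlearn() is correcting a sample that was trained under the wrong category: unlearn it from the wrong category, then learn it under the right one. The sketch below assumes the text was previously learned under $wrong; the helper name relabel is hypothetical, and the classifier is typed as a plain object so the snippet runs on its own.

```php
<?php

/**
 * Move a previously learned sample from one category to the other.
 * relabel() is a hypothetical helper, not a library function.
 * Returns null on success or the first error code encountered.
 */
function relabel(object $classifier, string $text, string $wrong, string $right): ?string
{
    // Reverse the earlier learn() call made with the wrong category.
    $error = $classifier->unlearn($text, $wrong);
    if ($error !== null) {
        return $error; // stop here; nothing is re-learned
    }
    // Train the same text as an example of the right category.
    return $classifier->learn($text, $right);
}
```

Returning early after a failed unlearn() keeps the sample from being counted under both categories.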

ClassificationResult fields

| Field | Type | Description |
|---|---|---|
| choice | string | 'spam' or 'ham' |
| score | float | Final spam probability, 0.0–1.0 |
| scores | array<string, float> | ['spam' => …, 'ham' => …] final scores |
| statScores | array<string, float> | Raw statistical scores before any LLM escalation |
| llmDecision | ?string | LLM's label if consulted, otherwise null |
| escalated | bool | true when the LLM was invoked |
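
To illustrate how these fields fit together, here is a small sketch that builds a one-line log summary and, for escalated results, shows the raw statistical score next to the LLM's label. The helper name summarize is hypothetical, and the sketch assumes statScores uses the same 'spam' and 'ham' keys as scores, which this page does not state explicitly.

```php
<?php

/**
 * Build a one-line summary from the documented result fields.
 * summarize() is a hypothetical helper; $result is typed as object
 * so the sketch runs without the library.
 */
function summarize(object $result): string
{
    $line = sprintf('%s (score %.2F)', $result->choice, $result->score);
    if ($result->escalated) {
        // Assumption: statScores is keyed by 'spam' and 'ham', like scores.
        $line .= sprintf(
            ', escalated: LLM said %s, statistical spam score was %.2F',
            $result->llmDecision ?? 'nothing',
            $result->statScores['spam']
        );
    }
    return $line;
}
```

Logging both values makes it easier to see how far the final result moved from the statistical score when the LLM was consulted.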

Constants

Category constants

| Constant | Value |
|---|---|
| BinaryClassifier::SPAM | 'spam' |
| BinaryClassifier::HAM | 'ham' |

Action constants (internal)

| Constant | Value |
|---|---|
| BinaryClassifier::LEARN | 'learn' |
| BinaryClassifier::UNLEARN | 'unlearn' |

Error code constants

| Constant | Returned by |
|---|---|
| BinaryClassifier::CLASSIFYER_TEXT_MISSING | classify() |
| BinaryClassifier::TRAINER_TEXT_MISSING | learn(), unlearn() |
| BinaryClassifier::TRAINER_CATEGORY_MISSING | learn(), unlearn() |
| BinaryClassifier::TRAINER_CATEGORY_FAIL | learn(), unlearn() |

Internal

| Constant | Value | Purpose |
|---|---|---|
| BinaryClassifier::DBVERSION | 3 | Expected database schema version |