Day 11: Kernel Anomaly Scoring Validation & Interpretation
// Context
Section titled “// Context”By the end of Day 10, Sentinel Sandbox successfully produced stable anomaly scores using a temporally bucketed syscall representation and a CPU-only Dynamic Weightless Network (DWN).
Day 11 focused on validation and interpretation, rather than adding new architecture.
The goal was to confirm that the observed anomaly scores were:
- structurally sound
- repeatable
- meaningful in the context of kernel behavior
// What Was Done
Section titled “// What Was Done”- Re-ran training and scoring pipelines on live
ptracesyscall traces - Verified preprocessing guarantees:
- Fixed input dimensionality (43,008 bits)
- No shape drift across windows
- No histogram expansion from unknown syscalls
- Scored syscall windows generated from:
- Normal interactive shell sessions
- Abnormal / syscall-noisy executions
// Observed Behavior
Section titled “// Observed Behavior”Normal Discriminator
Section titled “Normal Discriminator”- Produces consistently positive mean responses
- Indicates effective memorization of benign kernel behavior patterns
Attack Discriminator
Section titled “Attack Discriminator”- Produces suppressed or negative responses
- Expected outcome due to normal-only training
Anomaly Score (Normal − Attack)
Section titled “Anomaly Score (Normal − Attack)”- Strongly positive mean for benign traces
- High variance, expected at prototype scale
- Distribution shift observed between normal and abnormal executions
// Interpretation
Section titled “// Interpretation”The system currently behaves as a statistical anomaly detector, not a binary classifier.
- No decision threshold is enforced
- Scores must be interpreted as relative deviations from learned benign behavior
- This aligns with classical intrusion detection system (IDS) design
// Key Insight
Section titled “// Key Insight”Anomaly detection systems should be evaluated by score distributions, not accuracy metrics.
At this stage:
- Distribution separation matters more than precision/recall
- Stability and interpretability take priority over classification performance
// What Was Intentionally Not Done
Section titled “// What Was Intentionally Not Done”- No threshold calibration
- No ROC or false-positive tuning
- No claims of production readiness
These steps are deferred to preserve scientific correctness.
// Outcome
Section titled “// Outcome”Sentinel Sandbox now demonstrates:
- End-to-end kernel-level anomaly scoring
- Representation-stable preprocessing
- Interpretable discriminator behavior
The system is ready for the next phase: threshold calibration and false-positive analysis.
// Status
Section titled “// Status”Completed