Skip to content

Day 11: Kernel Anomaly Scoring Validation & Interpretation

By the end of Day 10, Sentinel Sandbox successfully produced stable anomaly scores using a temporally bucketed syscall representation and a CPU-only Dynamic Weightless Network (DWN).

Day 11 focused on validation and interpretation, rather than adding new architecture.
The goal was to confirm that the observed anomaly scores were:

  • structurally sound
  • repeatable
  • meaningful in the context of kernel behavior

  • Re-ran training and scoring pipelines on live ptrace syscall traces
  • Verified preprocessing guarantees:
    • Fixed input dimensionality (43,008 bits)
    • No shape drift across windows
    • No histogram expansion from unknown syscalls
  • Scored syscall windows generated from:
    • Normal interactive shell sessions
    • Abnormal / syscall-noisy executions

  • Produces consistently positive mean responses
  • Indicates effective memorization of benign kernel behavior patterns
  • Produces suppressed or negative responses
  • Expected outcome due to normal-only training
  • Strongly positive mean for benign traces
  • High variance, expected at prototype scale
  • Distribution shift observed between normal and abnormal executions

The system currently behaves as a statistical anomaly detector, not a binary classifier.

  • No decision threshold is enforced
  • Scores must be interpreted as relative deviations from learned benign behavior
  • This aligns with classical intrusion detection system (IDS) design

Anomaly detection systems should be evaluated by score distributions, not accuracy metrics.

At this stage:

  • Distribution separation matters more than precision/recall
  • Stability and interpretability take priority over classification performance

  • No threshold calibration
  • No ROC or false-positive tuning
  • No claims of production readiness

These steps are deferred to preserve scientific correctness.


Sentinel Sandbox now demonstrates:

  • End-to-end kernel-level anomaly scoring
  • Representation-stable preprocessing
  • Interpretable discriminator behavior

The system is ready for the next phase: threshold calibration and false-positive analysis.


Completed