Confidence-Weighted Scoring Engine
Evidence-based, probabilistic classifier with algorithmic enhancements for security finding confidence analysis
Overview
Spyda 2.0 uses a sophisticated Confidence-Weighted Scoring Engine that goes beyond simple heuristics. Our system employs a 5-factor probabilistic classifier with dynamic adjustments based on project context, scanner reliability, and real-time threat intelligence.
Evidence-Based
Probabilistic classification using corroborated evidence from multiple sources
Dynamic Weights
Context-aware adjustment based on project type and security domains
A. Confidence Formula
Each metric in our 5-factor model is part of an evidence-based, probabilistic classifier with UI explainability:
Confidence Score = (w₁ × Corroboration) + (w₂ × Clarity) + (w₃ × Source Credibility) + (w₄ × Exploitability) - (w₅ × Contradiction)

Default weights: w₁ = 30%, w₂ = 20%, w₃ = 20%, w₄ = 20%, w₅ = 10%
Corroboration (30%)
Number of independent sources confirming the finding. Multiple tools = higher confidence. Tied to UI panels showing which scanners detected the issue.
Clarity (20%)
Specificity of evidence including file paths, line numbers, and rule matches. Provides explainability for security teams.
Source Credibility (20%)
Average credibility of scanning tools, calibrated over time based on false positive rates. High-accuracy tools weighted higher.
Exploitability (20%)
Real-world exploitability based on CISA KEV, EPSS scores, and vulnerability databases. Dynamically boosted when threat intel indicates active exploitation.
Contradiction (-10%)
Penalty applied when scanners provide conflicting severity assessments. Higher penalties for severe conflicts (e.g., SAST says CRITICAL but DAST says INFO).
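As a sketch, the five factors above combine linearly under the default weights. The `confidence_score` helper below (the function name and the clamping behaviour are illustrative assumptions, not Spyda's actual API) takes each factor normalised to [0, 1]:

```python
def confidence_score(corroboration, clarity, credibility, exploitability,
                     contradiction, weights=(0.30, 0.20, 0.20, 0.20, 0.10)):
    """Combine the five factors (each normalised to [0, 1]) with the
    default weights w1..w5; contradiction is subtracted as a penalty."""
    w1, w2, w3, w4, w5 = weights
    score = (w1 * corroboration + w2 * clarity + w3 * credibility
             + w4 * exploitability - w5 * contradiction)
    return max(0.0, min(1.0, score))  # clamp so the score stays a probability

# Two agreeing scanners, clear evidence, moderate exploitability:
confidence_score(0.8, 0.9, 0.85, 0.5, 0.0)  # ≈ 0.69
```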
B. Dynamic Weight Adjustment Layer
Spyda 2.0 adjusts confidence weights based on project characteristics and security domain context.
Supply Chain Projects
For dependency-heavy projects, exploitability gets prioritized (35%) to emphasize CVE/EPSS data.
AI/ML Projects
For AI systems, clarity gets boosted (35%) to capture model behavior specifics and training data risks.
Compliance-Focused Projects
For regulatory compliance, source credibility is critical (40%) to ensure audit-grade tool quality.
Integration: Uses GitHub metadata (repo topics, language stats, dependency graphs) to auto-detect project type.
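A minimal sketch of how these profiles could be represented. Only the boosted percentages (35%, 35%, 40%) come from the descriptions above; the way the remaining weight is redistributed is an illustrative assumption:

```python
# Default 5-factor weights from the confidence formula.
DEFAULT_WEIGHTS = {"corroboration": 0.30, "clarity": 0.20, "credibility": 0.20,
                   "exploitability": 0.20, "contradiction": 0.10}

# Per-project-type overrides; corroboration absorbs the difference so each
# profile still sums to 100% (an assumption, not documented behaviour).
PROFILES = {
    "supply_chain": {**DEFAULT_WEIGHTS, "exploitability": 0.35, "corroboration": 0.15},
    "ai_ml":        {**DEFAULT_WEIGHTS, "clarity": 0.35, "corroboration": 0.15},
    "compliance":   {**DEFAULT_WEIGHTS, "credibility": 0.40, "corroboration": 0.10},
}

def weights_for(project_type):
    """Fall back to the defaults when auto-detection finds no known type."""
    return PROFILES.get(project_type, DEFAULT_WEIGHTS)
```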
C. Scanner Reliability Calibration Loop
Spyda learns scanner reliability over time, dynamically down-weighting high false-positive tools.
How It Works
1. Initial State: All scanners start with default credibility (e.g., Snyk = 90%, unknown tools = 50%)
2. Learning Phase: When findings are marked as false positives, the scanner's reliability score decreases by 5%
3. Blended Scoring: Final credibility = 70% learned + 30% base credibility (floor at 30%)
4. Per-Repo & Per-Team: Calibration is context-specific, so a scanner may be reliable for one team but not another
Example
If "Scanner X" produces 10 false positives in your repo, its credibility drops from 85% to 35% over time (10 × 5% per false positive), reducing its influence on confidence scores.
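The calibration steps can be sketched as follows. Note that under the 70/30 blend, the 35% figure in the example is the learned component; the final blended credibility lands somewhat higher. Function and parameter names are illustrative:

```python
def blended_credibility(base, false_positives, step=0.05, floor=0.30):
    """Learned score drops 5% per confirmed false positive (floored at 30%),
    then blends 70% learned with 30% base credibility."""
    learned = max(floor, base - step * false_positives)
    return 0.7 * learned + 0.3 * base

# "Scanner X", base credibility 85%, after 10 confirmed false positives:
blended_credibility(0.85, 10)  # learned component = 0.35, blended ≈ 0.50
```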
D. Exploitability Boost Layer
Real-time threat intelligence integration boosts confidence when vulnerabilities are actively exploited.
Boost Triggers
CISA Known Exploited Vulnerabilities
If the CVE is in the CISA KEV catalog → Exploitability = 100%
EPSS Score ≥ 0.3
If EPSS (Exploit Prediction Scoring System) ≥ 30% → +20% boost
GitHub Security Advisories
If the vulnerability is trending in GitHub advisories → dynamic boost based on severity
Example Calculation
Scenario: Log4Shell (CVE-2021-44228) detected in your application
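The boost triggers above can be sketched as a single function. The trigger values come from this section; the function shape, trigger ordering, and the `base` starting score are assumptions. Log4Shell is in the CISA KEV catalog, so the KEV rule alone drives exploitability to 100%:

```python
def exploitability_score(base, in_cisa_kev, epss, advisory_boost=0.0):
    """Apply the boost triggers in priority order (ordering is an assumption)."""
    if in_cisa_kev:
        return 1.0                 # CISA KEV catalog -> exploitability = 100%
    score = base
    if epss >= 0.3:
        score += 0.20              # EPSS >= 30% -> +20% boost
    score += advisory_boost        # trending GitHub advisory (dynamic)
    return min(score, 1.0)

# Log4Shell (CVE-2021-44228) is in the KEV catalog:
exploitability_score(base=0.6, in_cisa_kev=True, epss=0.97)  # → 1.0
```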
E. Contradiction Penalty 2.0
Dynamic penalty based on severity conflicts between scanners, preventing overconfidence.
Penalty Tiers
Severe Conflict
SAST: CRITICAL vs DAST: INFO/LOW
Minor Conflict
Different severities within 1-2 levels
Agreement
All scanners report same severity
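A sketch of the tier logic, with severities mapped onto an ordered scale. The gap thresholds follow the tiers above; the concrete penalty values (1.0 / 0.5 / 0.0) are illustrative, since the docs do not state them:

```python
SEVERITY_ORDER = ["INFO", "LOW", "MEDIUM", "HIGH", "CRITICAL"]

def contradiction_penalty(reported_severities):
    """Return the contradiction factor from the widest severity gap."""
    levels = [SEVERITY_ORDER.index(s) for s in reported_severities]
    gap = max(levels) - min(levels)
    if gap >= 3:
        return 1.0   # severe conflict, e.g. CRITICAL vs INFO/LOW
    if gap >= 1:
        return 0.5   # minor conflict: severities within 1-2 levels
    return 0.0       # agreement: all scanners report the same severity

contradiction_penalty(["CRITICAL", "INFO"])  # → 1.0 (severe conflict)
```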
Why This Matters
Contradictory evidence suggests uncertainty. If one tool flags a SQL injection as CRITICAL but another tool (testing the same endpoint) says it's safe, the finding's confidence should be reduced until the team investigates further.
Advanced Algorithms (NEW)
Spyda 2.0 introduces four cutting-edge algorithms that enhance finding correlation, temporal analysis, AI detection, and quantum readiness assessment.
F. Multi-Scanner Correlation Graph
NEW: Graph-based correlation linking scanner findings with GitHub dependency and code-owner maps for improved corroboration accuracy.
How It Works
1. Node Creation: Each finding from each scanner becomes a node with file, component, and dependency metadata
2. Edge Calculation: Edges connect findings based on same file (90%), same component (85%), dependency chain (60%), or code owner (30%)
3. Corroboration Boost: Uses graph structure to calculate enhanced confidence scores for correlated findings
Example
If Snyk detects a SQL injection in login.py and Semgrep flags the same file, the graph creates a strong correlation edge (weight: 0.9), boosting confidence to 95%+.
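The edge rules can be sketched as below. The four edge weights come from step 2 above; the finding fields (`file`, `component`, `owner`, `dependencies`) are illustrative:

```python
EDGE_WEIGHTS = {"same_file": 0.90, "same_component": 0.85,
                "dependency_chain": 0.60, "code_owner": 0.30}

def correlation_edge(a, b):
    """Return the strongest applicable edge weight between two findings."""
    if a["file"] == b["file"]:
        return EDGE_WEIGHTS["same_file"]
    if a["component"] == b["component"]:
        return EDGE_WEIGHTS["same_component"]
    if b["component"] in a.get("dependencies", []):
        return EDGE_WEIGHTS["dependency_chain"]
    if a["owner"] == b["owner"]:
        return EDGE_WEIGHTS["code_owner"]
    return 0.0  # unrelated findings: no edge

snyk = {"file": "login.py", "component": "auth", "owner": "@sec-team"}
semgrep = {"file": "login.py", "component": "auth", "owner": "@sec-team"}
correlation_edge(snyk, semgrep)  # → 0.9
```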
G. Temporal Drift Algorithm
NEW: Detects stale risk in unchanged files with old vulnerabilities, applying decay-weighting to credibility.
Drift Scoring Rules
High Drift (Stale)
File unchanged for 180+ days with a 90+ day-old vulnerability
Moderate Drift
File unchanged for 90+ days with a 30+ day-old vulnerability
No Drift
Recent code or newly detected vulnerability
Use Case
Legacy codebases with 3+ year old SQL injection vulnerabilities get down-weighted since they may be mitigated by other controls or represent accepted technical debt.
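A sketch of the drift rules plus an assumed decay weighting. The tier thresholds are from this section; the 0.5 / 0.8 multipliers are illustrative, as the docs do not quantify the decay:

```python
def drift_level(file_unchanged_days, vuln_age_days):
    """Classify drift using the thresholds above."""
    if file_unchanged_days >= 180 and vuln_age_days >= 90:
        return "high"        # stale risk
    if file_unchanged_days >= 90 and vuln_age_days >= 30:
        return "moderate"
    return "none"            # recent code or newly detected vulnerability

def decayed_credibility(credibility, level):
    """Down-weight credibility for stale findings (factors are assumptions)."""
    return credibility * {"high": 0.5, "moderate": 0.8, "none": 1.0}[level]

# A 3-year-old SQL injection in a file untouched for two years:
drift_level(730, 1095)  # → "high"
```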
H. AI-Signature Detection Engine
NEW: Uses token patterns, syntax embedding clusters, and training-data fingerprints to detect AI-generated code.
Detection Signals
Suspicious Comment Patterns
Generic TODO/FIXME comments, "Example usage:", "Helper function to"
Generic Naming
Functions named handle*, process*, execute*, perform*, do*
Excessive Comments
>30% of lines are inline comments
No File History
Large file (500+ lines) with zero prior commits
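The four signals can be sketched as a collector. The thresholds come from this section; the function shape and the Python-style `#` comment detection are assumptions:

```python
import re

GENERIC_NAME_RE = re.compile(r"\bdef (?:handle|process|execute|perform|do)\w*\(")
GENERIC_COMMENTS = ("TODO", "FIXME", "Example usage:", "Helper function to")

def ai_signature_signals(source, prior_commits):
    """Return the list of AI-signature signals triggered by a source file."""
    lines = source.splitlines()
    signals = []
    if any(marker in source for marker in GENERIC_COMMENTS):
        signals.append("suspicious_comments")
    if GENERIC_NAME_RE.search(source):
        signals.append("generic_naming")
    comment_lines = sum(1 for ln in lines if ln.lstrip().startswith("#"))
    if lines and comment_lines / len(lines) > 0.30:
        signals.append("excessive_comments")
    if len(lines) >= 500 and prior_commits == 0:
        signals.append("no_file_history")
    return signals
```

For example, a file opening with `# Helper function to ...` and defining `process_data()` would trigger both the comment-pattern and generic-naming signals.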
Model Fingerprinting
Identifies the likely AI model based on code patterns.
⚠️ Why This Matters: AI-generated code may contain subtle security flaws from training data or lack proper input validation. Flagging it ensures human security review.
I. PQC Readiness Classifier 2.0
NEW: Classifies repositories into PQC (Post-Quantum Cryptography) risk tiers with GitHub secret scanning integration.
Risk Tiers
Tier 1: Critical PQC Risk
CRITICAL: 5+ vulnerable crypto operations (RSA, ECDSA, DH) with no PQC-ready libraries
Action: Immediate migration to Kyber, Dilithium required
Tier 2: High PQC Risk
HIGH: Vulnerable crypto usage detected, no PQC primitives
Action: Begin PQC readiness assessment
Tier 3: Moderate PQC Risk
MEDIUM: Mixed crypto usage with some PQC-ready libraries
Action: Expand PQC coverage to all operations
Tier 4: Low PQC Risk
LOW: PQC-ready or minimal crypto exposure
Status: Repository is quantum-safe
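The tiering can be sketched as follows. The tier 1 threshold (5+ vulnerable operations) and the vulnerable algorithm names come from this section; the exact boundaries between tiers 2-4 are assumptions where the docs are qualitative:

```python
QUANTUM_VULNERABLE = {"RSA", "ECDSA", "DH"}

def pqc_tier(crypto_ops, pqc_libs_present):
    """Map detected crypto operations to one of the four PQC risk tiers."""
    vulnerable = sum(1 for op in crypto_ops if op in QUANTUM_VULNERABLE)
    if vulnerable >= 5 and not pqc_libs_present:
        return 1   # CRITICAL: migrate to Kyber/Dilithium immediately
    if vulnerable > 0 and not pqc_libs_present:
        return 2   # HIGH: begin PQC readiness assessment
    if vulnerable > 0:
        return 3   # MEDIUM: expand PQC coverage to all operations
    return 4       # LOW: PQC-ready or minimal crypto exposure

pqc_tier(["RSA"] * 6, pqc_libs_present=False)  # → 1 (Critical)
```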
PQC-Ready Libraries Detected
🔗 GitHub Integration: Automatically scans dependency graphs and secret scanning results to detect hardcoded crypto keys that need rotation.