Spyda 2.0

Confidence-Weighted Scoring Engine

Evidence-based, probabilistic classifier with algorithmic enhancements for security finding confidence analysis

Overview

Spyda 2.0 uses a sophisticated Confidence-Weighted Scoring Engine that goes beyond simple heuristics. Our system employs a 5-factor probabilistic classifier with dynamic adjustments based on project context, scanner reliability, and real-time threat intelligence.

Evidence-Based

Probabilistic classification using corroborated evidence from multiple sources

Dynamic Weights

Context-aware adjustment based on project type and security domains

A. Confidence Formula

Our 5-factor model treats each metric as evidence in a probabilistic classifier, with every factor surfaced in the UI for explainability:

Confidence Score = (w₁ × Corroboration) + (w₂ × Clarity) + (w₃ × Source Credibility) + (w₄ × Exploitability) - (w₅ × Contradiction)

Default weights: w₁=30%, w₂=20%, w₃=20%, w₄=20%, w₅=10%

Corroboration (30%)

Number of independent sources confirming the finding. Multiple tools = higher confidence. Tied to UI panels showing which scanners detected the issue.

1 source = 50% • 2 sources = 75% • 3+ sources = 100%

Clarity (20%)

Specificity of evidence including file paths, line numbers, and rule matches. Provides explainability for security teams.

Source Credibility (20%)

Average credibility of scanning tools, calibrated over time based on false positive rates. High-accuracy tools weighted higher.

Exploitability (20%)

Real-world exploitability based on CISA KEV, EPSS scores, and vulnerability databases. Dynamically boosted when threat intel indicates active exploitation.

Contradiction (-10%)

Penalty applied when scanners provide conflicting severity assessments. Higher penalties for severe conflicts (e.g., SAST says CRITICAL but DAST says INFO).
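A minimal sketch of the formula above, assuming each factor has already been normalized to the 0–1 range; the function and dictionary names are illustrative, not part of the Spyda API.

```python
# Illustrative sketch of the 5-factor confidence formula with the default
# weights (w1..w5). Factor values are assumed to be normalized to 0.0-1.0.
DEFAULT_WEIGHTS = {
    "corroboration": 0.30,
    "clarity": 0.20,
    "credibility": 0.20,
    "exploitability": 0.20,
    "contradiction": 0.10,  # applied as a penalty
}

def confidence_score(factors: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted sum of the positive factors minus the contradiction penalty."""
    score = (
        weights["corroboration"] * factors["corroboration"]
        + weights["clarity"] * factors["clarity"]
        + weights["credibility"] * factors["credibility"]
        + weights["exploitability"] * factors["exploitability"]
        - weights["contradiction"] * factors["contradiction"]
    )
    return max(0.0, min(1.0, score))  # clamp to [0, 1]

# Example: 2 corroborating sources (0.75), clear evidence, credible tools,
# maximum exploitability, no contradictions -> approx. 0.775
print(confidence_score({
    "corroboration": 0.75, "clarity": 0.9, "credibility": 0.85,
    "exploitability": 1.0, "contradiction": 0.0,
}))
```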

B. Dynamic Weight Adjustment Layer

Spyda 2.0 adjusts confidence weights based on project characteristics and security domain context.

Supply Chain Projects

For dependency-heavy projects, exploitability gets prioritized (35%) to emphasize CVE/EPSS data.

Corr: 25% • Clarity: 15% • Credibility: 20% • Exploitability: 35% • Contradiction: 5%

AI/ML Projects

For AI systems, clarity gets boosted (35%) to capture model behavior specifics and training data risks.

Corr: 25% • Clarity: 35% • Credibility: 20% • Exploitability: 15% • Contradiction: 5%

Compliance-Focused Projects

For regulatory compliance, source credibility is critical (40%) to ensure audit-grade tool quality.

Corr: 20% • Clarity: 20% • Credibility: 40% • Exploitability: 15% • Contradiction: 5%

Integration: Uses GitHub metadata (repo topics, language stats, dependency graphs) to auto-detect project type.
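A sketch of how the profiles above could be selected, assuming the project type has already been detected from GitHub metadata and is passed in as a plain label; the profile keys and fallback behavior are illustrative.

```python
# Illustrative weight profiles from the tables above. Project-type detection
# from GitHub metadata (topics, languages, dependency graphs) is not shown.
DEFAULT_WEIGHTS = {"corroboration": 0.30, "clarity": 0.20, "credibility": 0.20,
                   "exploitability": 0.20, "contradiction": 0.10}

WEIGHT_PROFILES = {
    "supply_chain": {"corroboration": 0.25, "clarity": 0.15, "credibility": 0.20,
                     "exploitability": 0.35, "contradiction": 0.05},
    "ai_ml":        {"corroboration": 0.25, "clarity": 0.35, "credibility": 0.20,
                     "exploitability": 0.15, "contradiction": 0.05},
    "compliance":   {"corroboration": 0.20, "clarity": 0.20, "credibility": 0.40,
                     "exploitability": 0.15, "contradiction": 0.05},
}

def select_weights(project_type: str) -> dict:
    """Fall back to the default weights when the project type is unrecognized."""
    return WEIGHT_PROFILES.get(project_type, DEFAULT_WEIGHTS)

print(select_weights("supply_chain")["exploitability"])  # 0.35
```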

C. Scanner Reliability Calibration Loop

Spyda learns scanner reliability over time, dynamically down-weighting high false-positive tools.

How It Works

  1. Initial State: All scanners start with default credibility (e.g., Snyk = 90%, unknown tools = 50%)
  2. Learning Phase: When findings are marked as false positives, the scanner's reliability score decreases by 5%
  3. Blended Scoring: Final credibility = 70% learned + 30% base credibility (floor at 30%)
  4. Per-Repo & Per-Team: Calibration is context-specific, so a scanner may be reliable for one team but not another

Example

If "Scanner X" produces 10 false positives in your repo, its credibility drops from 85% → 35% over time, reducing its influence on confidence scores.

Initial: 0.85 → After 10 FPs: 0.70 * 0.35 + 0.30 * 0.85 = 0.50 (50% credibility)
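A sketch of the calibration math under the rules above; the 5% step, the 70/30 blend, and the 30% floor come from the steps listed, while the function signature itself is illustrative.

```python
# Illustrative calibration loop: each confirmed false positive lowers the
# learned reliability by 5%, and the final credibility blends 70% learned
# with 30% base, floored at 30%.
def blended_credibility(base: float, false_positives: int,
                        step: float = 0.05, floor: float = 0.30) -> float:
    learned = max(floor, base - step * false_positives)
    return max(floor, 0.70 * learned + 0.30 * base)

print(blended_credibility(0.85, 0))   # 0.85 -> no learning yet
print(blended_credibility(0.85, 10))  # learned 0.35 -> approx. 0.50 blended
```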

D. Exploitability Boost Layer

Real-time threat intelligence integration boosts confidence when vulnerabilities are actively exploited.

Boost Triggers

KEV

CISA Known Exploited Vulnerabilities

If CVE is in CISA KEV catalog → Exploitability = 100%

EPSS

EPSS Score ≥ 0.3

If EPSS (Exploit Prediction Scoring System) ≥ 30% → Add +20% boost

GH

GitHub Security Advisories

If vulnerability is trending in GitHub advisories → Dynamic boost based on severity

Example Calculation

Scenario: Log4Shell (CVE-2021-44228) detected in your application

Base EPSS: 0.97 (97%)
KEV Status: TRUE → Exploitability = 100%
Result: Maximum confidence boost applied
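A sketch of the boost logic, assuming KEV membership and the EPSS score have already been fetched from the threat-intel feeds; the GitHub-advisory boost is omitted here for brevity.

```python
# Illustrative exploitability boost: KEV membership maxes the factor out,
# a high EPSS score adds a flat +20%, and the result is clamped to 1.0.
def exploitability(base: float, in_kev: bool, epss: float) -> float:
    if in_kev:
        return 1.0        # CISA KEV listing -> exploitability = 100%
    score = base
    if epss >= 0.30:
        score += 0.20     # EPSS >= 30% -> +20% boost
    return min(1.0, score)

# Log4Shell (CVE-2021-44228): EPSS 0.97 and listed in KEV -> 1.0
print(exploitability(base=0.97, in_kev=True, epss=0.97))
```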

E. Contradiction Penalty 2.0

Dynamic penalty based on severity conflicts between scanners, preventing overconfidence.

Penalty Tiers

Severe Conflict

SAST: CRITICAL vs DAST: INFO/LOW

-40%

Minor Conflict

Different severities within 1-2 levels

-20%

Agreement

All scanners report same severity

0%

Why This Matters

Contradictory evidence suggests uncertainty. If one tool flags a SQL injection as CRITICAL but another tool (testing the same endpoint) says it's safe, the finding's confidence should be reduced until the team investigates further.
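A sketch of the penalty tiers, assuming severities map onto a simple ordinal scale; the exact gap thresholds used here are an interpretation of the tiers above.

```python
# Illustrative contradiction penalty: map reported severities to an ordinal
# scale and penalize based on the widest gap between scanners.
SEVERITY_RANK = {"INFO": 0, "LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

def contradiction_penalty(severities: list[str]) -> float:
    ranks = [SEVERITY_RANK[s.upper()] for s in severities]
    gap = max(ranks) - min(ranks)
    if gap >= 3:
        return 0.40   # severe conflict, e.g. CRITICAL vs INFO/LOW
    if gap >= 1:
        return 0.20   # minor conflict within 1-2 levels
    return 0.0        # full agreement

print(contradiction_penalty(["CRITICAL", "INFO"]))        # 0.40
print(contradiction_penalty(["HIGH", "MEDIUM", "HIGH"]))  # 0.20
```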

Advanced Algorithms (NEW)

Spyda 2.0 introduces four cutting-edge algorithms that enhance finding correlation, temporal analysis, AI detection, and quantum readiness assessment.

F. Multi-Scanner Correlation Graph

NEW

Graph-based correlation linking scanner findings with GitHub dependency and code-owner maps for improved corroboration accuracy.

How It Works

  1. Node Creation: Each finding from each scanner becomes a node with file, component, and dependency metadata
  2. Edge Calculation: Edges connect findings based on same file (90%), same component (85%), dependency chain (60%), or code owner (30%)
  3. Corroboration Boost: Uses graph structure to calculate enhanced confidence scores for correlated findings

Example

If Snyk detects a SQL injection in login.py and Semgrep flags the same file, the graph creates a strong correlation edge (weight: 0.9), boosting confidence to 95%+.
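A sketch of the edge-weighting step, with findings reduced to plain dictionaries; the real graph also carries dependency and code-owner data pulled from GitHub.

```python
# Illustrative correlation edge between two findings, using the edge weights
# listed in the steps above. Finding objects are simplified for the sketch.
def edge_weight(a: dict, b: dict) -> float:
    if a["file"] == b["file"]:
        return 0.90                                   # same file
    if a["component"] == b["component"]:
        return 0.85                                   # same component
    if b["component"] in a.get("dependencies", []):
        return 0.60                                   # dependency chain
    if a.get("code_owner") == b.get("code_owner"):
        return 0.30                                   # shared code owner
    return 0.0

snyk    = {"scanner": "snyk",    "file": "login.py", "component": "auth", "code_owner": "team-a"}
semgrep = {"scanner": "semgrep", "file": "login.py", "component": "auth", "code_owner": "team-a"}
print(edge_weight(snyk, semgrep))  # 0.9 -> strong correlation edge
```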

G. Temporal Drift Algorithm

NEW

Detects stale risk in unchanged files with old vulnerabilities, applying decay-weighting to credibility.

Drift Scoring Rules

High Drift (Stale)

File unchanged for 180+ days with 90+ day old vulnerability

-30%

Moderate Drift

File unchanged for 90+ days with 30+ day old vulnerability

-15%

No Drift

Recent code or newly detected vulnerability

0%

Use Case

Legacy codebases with 3+ year old SQL injection vulnerabilities get down-weighted since they may be mitigated by other controls or represent accepted technical debt.
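A sketch of the drift rules, assuming file age and vulnerability age are supplied in days from commit history and scanner metadata.

```python
# Illustrative drift scoring from the rules above: penalize findings in files
# that have not changed for a long time and whose vulnerability is old.
def drift_penalty(days_since_file_change: int, days_since_detection: int) -> float:
    if days_since_file_change >= 180 and days_since_detection >= 90:
        return 0.30   # high drift (stale)
    if days_since_file_change >= 90 and days_since_detection >= 30:
        return 0.15   # moderate drift
    return 0.0        # recent code or newly detected vulnerability

print(drift_penalty(400, 1200))  # 0.30 -> legacy finding, down-weighted
print(drift_penalty(10, 2))      # 0.0  -> fresh code, full weight
```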

H. AI-Signature Detection Engine

NEW

Uses token patterns, syntax embedding clusters, and training-data fingerprints to detect AI-generated code.

Detection Signals

Suspicious Comment Patterns

Generic TODO/FIXME comments, "Example usage:", "Helper function to"

Generic Naming

Functions named handle*, process*, execute*, perform*, do*

Excessive Comments

>30% of lines are inline comments

No File History

Large file (500+ lines) with zero prior commits

Model Fingerprinting

Identifies likely AI model based on code patterns:

GPT-4/Copilot: Contains "// Example:" or "// Usage:" patterns
Unknown: AI signals detected but no clear model fingerprint
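A sketch of how the signals above might be combined; the regexes, thresholds, and the assumption that the scanned file is Python source are illustrative rather than the production detector.

```python
# Illustrative AI-signature checks combining the heuristics listed above.
import re

GENERIC_NAMES = re.compile(r"\bdef (handle|process|execute|perform|do)\w*\(")
SUSPICIOUS_COMMENTS = ("Example usage:", "Helper function to", "TODO", "FIXME")

def ai_signals(source: str, prior_commits: int) -> list[str]:
    lines = source.splitlines()
    signals = []
    if any(marker in source for marker in SUSPICIOUS_COMMENTS):
        signals.append("suspicious_comment_patterns")
    if GENERIC_NAMES.search(source):
        signals.append("generic_naming")
    comment_lines = sum(1 for line in lines if line.lstrip().startswith("#"))
    if lines and comment_lines / len(lines) > 0.30:       # >30% inline comments
        signals.append("excessive_comments")
    if len(lines) >= 500 and prior_commits == 0:          # large file, no history
        signals.append("no_file_history")
    return signals
```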

⚠️ Why This Matters: AI-generated code may contain subtle security flaws from training data or lack proper input validation. Flagging it ensures human security review.

I. PQC Readiness Classifier 2.0

NEW

Classifies repositories into PQC (Post-Quantum Cryptography) risk tiers with GitHub secret scanning integration.

Risk Tiers

Tier 1: Critical PQC Risk

CRITICAL

5+ vulnerable crypto operations (RSA, ECDSA, DH) with no PQC-ready libraries

Action: Immediate migration to Kyber, Dilithium required

Tier 2: High PQC Risk

HIGH

Vulnerable crypto usage detected, no PQC primitives

Action: Begin PQC readiness assessment

Tier 3: Moderate PQC Risk

MEDIUM

Mixed crypto usage with some PQC-ready libraries

Action: Expand PQC coverage to all operations

Tier 4: Low PQC Risk

LOW

PQC-ready or minimal crypto exposure

Status: Repository is quantum-safe

PQC-Ready Libraries Detected

liboqs • kyber • dilithium • falcon • sphincs • ntru
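A sketch of the tier rules, assuming the scan has already counted vulnerable crypto operations (RSA, ECDSA, DH) and collected the repository's library names; the classification cut-offs follow the tiers above.

```python
# Illustrative PQC tier classification from the rules above.
PQC_READY = {"liboqs", "kyber", "dilithium", "falcon", "sphincs", "ntru"}

def pqc_tier(vulnerable_ops: int, libraries: set[str]) -> str:
    has_pqc = bool(PQC_READY & {lib.lower() for lib in libraries})
    if vulnerable_ops >= 5 and not has_pqc:
        return "Tier 1: Critical PQC Risk"
    if vulnerable_ops > 0 and not has_pqc:
        return "Tier 2: High PQC Risk"
    if vulnerable_ops > 0 and has_pqc:
        return "Tier 3: Moderate PQC Risk"
    return "Tier 4: Low PQC Risk"

print(pqc_tier(7, {"pycryptodome"}))            # Tier 1: Critical PQC Risk
print(pqc_tier(2, {"liboqs", "cryptography"}))  # Tier 3: Moderate PQC Risk
```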

🔗 GitHub Integration: Automatically scans dependency graphs and secret scanning results to detect hardcoded crypto keys that need rotation.