v2.0.0 Pilot-Ready MVP

PATAS

Pattern-Adaptive Transmodal Anti-Spam System

An offline pattern discovery engine that turns historical chaos into transparent, production-grade signals.

What is PATAS?

PATAS is a signal engine, not an enforcement system. It sits alongside your pipeline, analyzing history to inform the future.

Historical Logs

Analyzes message batches (10k-50k+) offline.

Signal Engine

Pattern Discovery & Rule Management

Mining...DONE
Clustering...DONE
Eval...98% Precision

Transparent Rules

Outputs SQL-like rules & metrics for your existing engine.

Does NOT block messages directly

The Problem

Manual rule creation is slow & error-prone
High cost of running LLMs on 100% of traffic
Black-box ML decisions lack explainability
Reaction lag to new spam campaigns

The Scope

"Given historical logs, automatically discover patterns, generate transparent rules, and evaluate them offline with clear safety thresholds."
No Real-time
Offline Batch
Explainable

Two-Stage Pattern Mining

Optimized for cost and precision

Stage 1: Fast Scan

Deterministic & Cheap

URLs, Domains, Keywords

~2-4 mins / 500k msgs

TOP 2-3% SUSPICIOUS ONLY
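
A minimal sketch of what a Stage 1 style scan could look like, assuming a simple additive score over keyword hits and repeated URL domains. The keyword list, the scoring scheme, and the function name fast_scan are illustrative assumptions, not PATAS internals.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://([^/\s]+)", re.IGNORECASE)

# Hypothetical keyword list, for illustration only.
SUSPICIOUS_KEYWORDS = {"free", "winner", "crypto", "bonus"}


def fast_scan(messages: list[str], top_fraction: float = 0.03) -> list[int]:
    """Return indices of the highest-scoring messages, deterministically."""
    # Extract the set of URL domains found in each message.
    domains_per_msg = [{d.lower() for d in URL_RE.findall(m)} for m in messages]
    # Count in how many messages each domain appears.
    domain_counts = Counter(d for ds in domains_per_msg for d in ds)

    scores = []
    for i, msg in enumerate(messages):
        words = set(re.findall(r"[a-z0-9]+", msg.lower()))
        keyword_hits = len(words & SUSPICIOUS_KEYWORDS)
        # A domain shared by many messages hints at a campaign (assumed heuristic).
        repeated = sum(domain_counts[d] - 1 for d in domains_per_msg[i])
        scores.append((keyword_hits + repeated, i))

    keep = max(1, int(len(messages) * top_fraction))
    # Highest score first; ties broken by original position for stable output.
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    # Messages with a zero score are never forwarded, even inside the top slice.
    return sorted(i for score, i in ranked[:keep] if score > 0)
```

Only the indices returned here would be handed to the more expensive second stage; everything else is skipped.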

Stage 2: Deep Analysis

Semantic & Focused

Embeddings + Clustering + LLM

70-90% Cost Reduction

Rule Lifecycle Management

From discovery to enforcement with safety guarantees

Candidate

Generated, not evaluated

Shadow

Evaluated on history, inactive

Active

Safe for production enforcement

Deprecated

Disabled due to low precision
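
The four states above can be sketched as a small state machine. The state names come from the lifecycle itself; the allowed transitions are an assumption, since the source lists the states but not the edges between them.

```python
from enum import Enum


class RuleStatus(Enum):
    CANDIDATE = "candidate"    # generated, not evaluated
    SHADOW = "shadow"          # evaluated on history, inactive
    ACTIVE = "active"          # safe for production enforcement
    DEPRECATED = "deprecated"  # disabled due to low precision


# Assumed transition table: rules only move forward or get deprecated.
ALLOWED_TRANSITIONS = {
    RuleStatus.CANDIDATE: {RuleStatus.SHADOW, RuleStatus.DEPRECATED},
    RuleStatus.SHADOW: {RuleStatus.ACTIVE, RuleStatus.DEPRECATED},
    RuleStatus.ACTIVE: {RuleStatus.DEPRECATED},
    RuleStatus.DEPRECATED: set(),
}


def transition(current: RuleStatus, target: RuleStatus) -> RuleStatus:
    """Return the new status, or raise if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move rule from {current.value} to {target.value}")
    return target
```

Under this assumed table a candidate cannot jump straight to active; it has to pass through shadow evaluation first.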

Conservative Profile

Precision ≥ 0.95
False Positive Budget ≤ 5 hits

Recommended for high-risk automated blocking

Balanced Profile

Precision ≥ 0.90
Coverage: High

For shadow mode & controlled experiments
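
As a rough illustration, the two profiles could be expressed as threshold checks over a rule's offline evaluation counts. The Profile class and the passes function are hypothetical names; the thresholds are the ones listed above. The Balanced profile's "High" coverage target has no number in the source, so it is not modelled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Profile:
    name: str
    min_precision: float
    max_false_positives: Optional[int]  # None means no explicit budget


CONSERVATIVE = Profile("conservative", min_precision=0.95, max_false_positives=5)
BALANCED = Profile("balanced", min_precision=0.90, max_false_positives=None)


def passes(profile: Profile, true_positives: int, false_positives: int) -> bool:
    """Check a rule's offline evaluation counts against a profile."""
    hits = true_positives + false_positives
    if hits == 0:
        return False  # a rule that matched nothing cannot be judged safe
    precision = true_positives / hits
    if precision < profile.min_precision:
        return False
    if (profile.max_false_positives is not None
            and false_positives > profile.max_false_positives):
        return False
    return True
```

A rule with 92 true positives and 8 false positives would clear the Balanced bar (precision 0.92) but not the Conservative one.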

Architecture Snapshot

Layered design for on-premise deployment

API Layer

FastAPI (app/api/)

REST

Service Layer

Business Logic (app/v2_*.py)

Python 3.10+

Repository Layer

Data Access (SQLAlchemy 2.0)

PostgreSQL: Production DB
Redis: Locks & Cache
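
The layering above can be sketched in plain Python, without FastAPI or SQLAlchemy, to show how the three layers depend on each other. Every class, function, and the sample rule expression below are hypothetical stand-ins, not code from the PATAS repository.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Rule:
    rule_id: int
    expression: str  # SQL-like rule text (made-up example format)
    status: str


class RuleRepository(Protocol):
    """Repository layer contract: data access only."""
    def list_by_status(self, status: str) -> list[Rule]: ...


class InMemoryRuleRepository:
    """Stand-in for a database-backed repository."""
    def __init__(self, rules: list[Rule]) -> None:
        self._rules = list(rules)

    def list_by_status(self, status: str) -> list[Rule]:
        return [r for r in self._rules if r.status == status]


class RuleService:
    """Service layer: business logic, no knowledge of storage or HTTP."""
    def __init__(self, repo: RuleRepository) -> None:
        self._repo = repo

    def exportable_rules(self) -> list[str]:
        return [r.expression for r in self._repo.list_by_status("active")]


def get_active_rules(service: RuleService) -> dict:
    """Stand-in for an API-layer handler that returns a JSON-like payload."""
    return {"rules": service.exportable_rules()}
```

Because the service depends only on the repository contract, the in-memory version can be swapped for a real database implementation without touching the handler.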

Internal Benchmarks

Pattern Mining Speed (500k msgs)

Stage 1 (Fast Scan): 2 mins
Stage 2 (Deep Analysis): 3.5 hrs
Precision (Conservative): 0.93 - 0.97
False Positive Rate: < 0.15%
Coverage: 5 - 8% of spam messages

*Benchmarks run on a 500k-message dataset. Intended for offline overnight runs.
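
For readers who want to reproduce these metrics on their own logs, here is a small sketch of how they are commonly computed from labeled counts. It assumes coverage means the share of all spam messages matched by a rule (as the "of spam messages" label suggests) and that false positive rate is measured against non-spam traffic. The function name is hypothetical.

```python
def rule_metrics(true_positives: int, false_positives: int,
                 total_spam: int, total_ham: int) -> dict:
    """Compute precision, false positive rate, and coverage for a rule set."""
    hits = true_positives + false_positives
    return {
        # Share of matched messages that really were spam.
        "precision": true_positives / hits if hits else 0.0,
        # Share of legitimate messages that were wrongly matched.
        "false_positive_rate": false_positives / total_ham if total_ham else 0.0,
        # Share of all spam messages that the rules matched.
        "coverage": true_positives / total_spam if total_spam else 0.0,
    }
```

With made-up counts of 3,000 true positives and 150 false positives over 50,000 spam and 450,000 legitimate messages, coverage comes out at 6% and precision just above 0.95. These counts are illustrative only and are not taken from the benchmark run.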

Pilot Execution Plan

Data Selection

Select 7-30 days of logs for a specific region.

Deployment

Deploy PATAS on-prem with read-only access.

Mining Run

Execute the two-stage pipeline offline.

Review & Export

Review metrics, export 'SAFE_AUTO' rules.

Maintenance

Weekly runs for new patterns & deprecation.