resk-logits
GPU-Accelerated Logits Processor for LLM Safety Filtering
Vectorized Aho-Corasick on GPU
resk-logits is a Python library that implements a shadow ban system for filtering dangerous content during LLM text generation. Unlike prompt-based filters, which can be jailbroken, resk-logits operates at the logits level, intercepting token predictions before they are sampled. It uses a GPU-accelerated Aho-Corasick automaton to detect banned patterns in O(n) time in the number of generated tokens. Because the danger mask is pre-computed, each generation step adds only a vectorized mask application.
Key Features
Vectorized Aho-Corasick Engine
Pre-computes a binary danger mask over the entire vocabulary, so filtering at each generation step is a single O(1) vectorized GPU operation with no per-token loop during inference.
- Builds a trie with failure links from banned phrases
- Pre-computes state transitions for all tokens
- GPU danger mask enables vectorized penalty application
- Batch tokenization for fast automaton construction
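The bullets above can be sketched in plain Python. This is a minimal illustration of Aho-Corasick over token-ID sequences, not the library's implementation: the function names (`build_automaton`, `step`, `danger_mask`) are hypothetical, the mask is computed per automaton state as an assumption, and lists stand in for GPU tensors.

```python
from collections import deque

def build_automaton(phrases):
    """Build goto/fail/output tables from banned token-ID sequences."""
    goto = [{}]     # goto[state][token] -> next state
    fail = [0]      # failure link per state
    out = [False]   # True if a banned phrase ends at this state
    for phrase in phrases:
        state = 0
        for tok in phrase:
            if tok not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(False)
                goto[state][tok] = len(goto) - 1
            state = goto[state][tok]
        out[state] = True
    # Breadth-first pass: set failure links and inherit outputs from them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for tok, nxt in goto[state].items():
            f = fail[state]
            while f and tok not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(tok, 0)
            out[nxt] = out[nxt] or out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out

def step(goto, fail, state, tok):
    """Follow failure links until `tok` has a transition, then take it."""
    while state and tok not in goto[state]:
        state = fail[state]
    return goto[state].get(tok, 0)

def danger_mask(goto, fail, out, state, vocab_size):
    """mask[t] is True if emitting token t from `state` completes a banned phrase."""
    return [out[step(goto, fail, state, t)] for t in range(vocab_size)]

# Toy vocabulary of 6 token IDs and two banned sequences.
goto, fail, out = build_automaton([[1, 2, 3], [2, 4]])
state = step(goto, fail, step(goto, fail, 0, 1), 2)  # tokens 1, 2 seen so far
print(danger_mask(goto, fail, out, state, 6))
```

After tokens 1 and 2, both token 3 (completing 1-2-3) and token 4 (completing 2-4 through a failure link) are flagged. On a GPU the same table would typically be stored as a boolean tensor so the penalty can be applied in one operation.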
Shadow Ban System
Instead of hard-blocking tokens with -inf (which can degrade generation quality), resk-logits applies configurable penalties to make dangerous tokens extremely unlikely.
- Light penalty (-5.0): token odds scaled by e^-5, roughly 0.7% of the original
- Medium penalty (-10.0): roughly 0.005% of the original
- Strong penalty (-15.0): roughly 0.00003% of the original
- Very strong (-20.0): virtually impossible
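The effect of each penalty follows from softmax: adding a penalty d to one logit multiplies that token's odds by e^d. The sketch below works through the arithmetic, assuming plain softmax sampling at temperature 1 with no top-k or top-p truncation; `penalized_probability` is a hypothetical helper, not part of resk-logits.

```python
import math

def penalized_probability(p, penalty):
    """Probability of a token after `penalty` is added to its logit.

    The token's unnormalized weight is multiplied by e^penalty, and
    renormalizing gives p' = p*e^d / (1 - p + p*e^d).
    """
    scaled = p * math.exp(penalty)
    return scaled / (1.0 - p + scaled)

# Scaling factor and resulting probability for a token that started at 50%.
for penalty in (-5.0, -10.0, -15.0, -20.0):
    factor = math.exp(penalty)
    after = penalized_probability(0.5, penalty)
    print(f"penalty={penalty:6.1f}  factor={factor:.2e}  p_after={after:.2e}")
```

The resulting probability depends on how likely the token was before the penalty, so the listed percentages are best read as scaling factors rather than absolute chances.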
Multi-Level Filtering
Supports tiered penalty levels for different severity categories. Each severity level gets its own Aho-Corasick automaton with dedicated penalties.
- Separate phrase lists per severity level (high, medium, low)
- Independent penalty configuration per level
- Automatic EOS forcing on high-severity matches
- State tracking maintained per level per batch item
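One way to picture how several levels might interact is shown below. This sketch assumes each level has already produced its own boolean mask over the vocabulary and that, when a token is flagged by more than one level, the strongest penalty wins. Both the merging rule and the name `combine_level_penalties` are assumptions made for illustration; the library may combine levels differently.

```python
def combine_level_penalties(level_masks, penalties, vocab_size):
    """Per-token penalty: the most negative penalty among levels that flag the token."""
    combined = [0.0] * vocab_size
    for level, mask in level_masks.items():
        for tok, flagged in enumerate(mask):
            if flagged:
                combined[tok] = min(combined[tok], penalties[level])
    return combined

# Toy vocabulary of 5 tokens; token 1 is flagged by both "high" and "medium".
level_masks = {
    "high":   [False, True,  False, False, False],
    "medium": [False, True,  True,  False, False],
    "low":    [False, False, False, True,  False],
}
penalties = {"high": -20.0, "medium": -10.0, "low": -5.0}

combined = combine_level_penalties(level_masks, penalties, vocab_size=5)
logits = [1.0, 2.0, 3.0, 4.0, 5.0]
filtered = [x + p for x, p in zip(logits, combined)]
print(combined)
print(filtered)
```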
Integration with vLLM and HuggingFace
Compatible with both HuggingFace transformers and vLLM inference engines through built-in adapters.
- HuggingFace: Drop-in LogitsProcessor for model.generate()
- vLLM: VLLMWrapper adapts the processor to vLLM's signature
- Batch support: Handles multiple concurrent generations
- Device-aware: Automatically moves tensors to GPU
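The difference between the two call signatures can be illustrated with a small adapter. HuggingFace logits processors are called with batched `(input_ids, scores)`, while vLLM's older per-request logits processors receive `(token_ids, logits)` for a single sequence; newer vLLM versions use a different interface. The sketch below is not the library's VLLMWrapper: `PerRequestAdapter` and `penalize_token_two` are hypothetical, and nested lists stand in for tensors (real code would add and remove the batch dimension on a tensor instead).

```python
class PerRequestAdapter:
    """Hypothetical adapter from a batched processor to a per-request callable."""

    def __init__(self, batched_processor):
        self.batched_processor = batched_processor

    def __call__(self, token_ids, logits):
        # Wrap the single request as a batch of one, then unwrap the result.
        batched_out = self.batched_processor([token_ids], [logits])
        return batched_out[0]

def penalize_token_two(input_ids, scores):
    """Stand-in batched processor: subtracts 15.0 from token 2 in every row."""
    return [
        [s - 15.0 if tok == 2 else s for tok, s in enumerate(row)]
        for row in scores
    ]

adapter = PerRequestAdapter(penalize_token_two)
print(adapter([5, 7], [1.0, 2.0, 3.0]))
```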
Quick Install
pip install resklogits
Usage Example
Shadow Ban Processor
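from resklogits import ShadowBanProcessor
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1").to("cuda")

# Penalize tokens that would complete any of the banned phrases.
processor = ShadowBanProcessor(
    tokenizer=tokenizer,
    banned_phrases=["DROP TABLE", "DELETE FROM", "xp_cmdshell"],
    shadow_penalty=-15.0,
    device="cuda",
)

# Illustrative prompt; the inputs must be on the same device as the model.
inputs = tokenizer("Write a SQL query that lists all users.", return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    logits_processor=[processor],
    max_new_tokens=200,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))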
Multi-Level Filtering
from resklogits import MultiLevelShadowBanProcessor

processor = MultiLevelShadowBanProcessor(
    tokenizer=tokenizer,
    banned_phrases_by_level={
        "high": ["DROP TABLE", "DELETE FROM"],
        "medium": ["ALTER TABLE", "TRUNCATE"],
        "low": ["salaries", "ssn"],
    },
    penalties={"high": -20.0, "medium": -10.0, "low": -5.0},
    device="cuda",
)
CLI Tool
resk-logits includes a command-line interface for testing and validation:
- resklogits generate: Test filtering on sample prompts
- resklogits test: Validate phrase detection
- resklogits expand: Expand patterns with synonyms
- resklogits cache: Manage rule generation cache
- resklogits validate: Validate configuration files
Architecture
Input tokens
|
v
Aho-Corasick Automaton (GPU)
|
v
State transition per token (O(1))
|
v
Match detection via failure links
|
v
Shadow penalty via danger_mask tensor
|
v
EOS forcing on complete matches
|
v
Filtered logits output
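The last two stages of the flow above can be sketched for a single sequence. This is an illustration under assumptions: `filter_step` is a hypothetical function, the danger mask and the match flag are taken as already computed by the automaton, and EOS forcing is modeled by setting every non-EOS logit to -inf, which may differ from how the library does it.

```python
NEG_INF = float("-inf")

def filter_step(logits, danger_mask, matched, penalty, eos_id):
    """Apply one filtering step to a single sequence's logits (plain lists)."""
    if matched:
        # A banned phrase was completed: leave EOS as the only viable token.
        return [x if tok == eos_id else NEG_INF for tok, x in enumerate(logits)]
    # Otherwise shadow-ban: lower flagged logits without removing them.
    return [x + penalty if flagged else x for x, flagged in zip(logits, danger_mask)]

logits = [1.0, 2.0, 3.0, 4.0]
mask = [False, True, False, False]

# No complete match yet: token 1 is penalized but still technically possible.
print(filter_step(logits, mask, matched=False, penalty=-15.0, eos_id=3))
# Complete match: generation is steered to end immediately.
print(filter_step(logits, mask, matched=True, penalty=-15.0, eos_id=3))
```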
Get Started with resk-logits
Add GPU-accelerated content filtering to your LLM pipeline.
Technical Support
For technical inquiries and integration support:
contact[@]resk.fr