resk-logits

GPU-Accelerated Logits Processor for LLM Safety Filtering

Vectorized Aho-Corasick on GPU

resk-logits is a Python library that implements a shadow ban system for filtering dangerous content during LLM text generation. Unlike prompt-based filters, which can be jailbroken, resk-logits operates at the logits level: it intercepts token predictions before they are sampled. It uses a GPU-accelerated Aho-Corasick automaton to detect banned patterns in O(n) time in the number of generated tokens. The expensive work (building the automaton and its masks) happens once at construction, so the cost added to each inference step stays minimal.

Project Stats

  • PyPI Downloads: 1.2K
  • GitHub Stars: 0
  • Latest Version: v0.1.2
  • Language: Python

Key Features

Vectorized Aho-Corasick Engine

Pre-computes a binary danger mask over the entire vocabulary, so that filtering at each generation step reduces to an O(1) state lookup followed by a single vectorized tensor operation on the GPU. No per-token Python loop runs during inference.

  • Builds a trie with failure links from banned phrases
  • Pre-computes state transitions for all tokens
  • GPU danger mask enables vectorized penalty application
  • Batch tokenization for fast automaton construction
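The construction steps listed above can be sketched in plain Python. This is a minimal illustration of the general Aho-Corasick technique over token-ID sequences, not the library's actual implementation: the function names `build_automaton`, `next_state`, and `precompute`, the toy token IDs, and the 10-token vocabulary are all assumptions made for the example, and nested lists stand in for the GPU tensors.

```python
from collections import deque

def build_automaton(phrases):
    """Build a trie with failure links from banned token-ID sequences."""
    goto, fail, terminal = [{}], [0], [False]
    for phrase in phrases:                       # 1. insert each phrase into the trie
        state = 0
        for tok in phrase:
            if tok not in goto[state]:
                goto.append({})
                fail.append(0)
                terminal.append(False)
                goto[state][tok] = len(goto) - 1
            state = goto[state][tok]
        terminal[state] = True
    queue = deque(goto[0].values())              # 2. breadth-first pass sets failure links
    while queue:
        state = queue.popleft()
        for tok, child in goto[state].items():
            f = fail[state]
            while f and tok not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(tok, 0)
            # A state is also a match if its longest proper suffix is a match.
            terminal[child] = terminal[child] or terminal[fail[child]]
            queue.append(child)
    return goto, fail, terminal

def next_state(goto, fail, state, tok):
    """Follow failure links until `tok` can be consumed (or the root is reached)."""
    while state and tok not in goto[state]:
        state = fail[state]
    return goto[state].get(tok, 0)

def precompute(goto, fail, terminal, vocab_size):
    """Dense tables: table[s][t] is the next state, danger[s][t] flags a completed phrase."""
    table = [[next_state(goto, fail, s, t) for t in range(vocab_size)]
             for s in range(len(goto))]
    danger = [[terminal[nxt] for nxt in row] for row in table]
    return table, danger

# Two hypothetical banned phrases, already tokenized.
goto, fail, terminal = build_automaton([[5, 6, 7], [6, 8]])
table, danger = precompute(goto, fail, terminal, vocab_size=10)

state = 0
for tok in (5, 6):                               # tokens generated so far
    state = table[state][tok]
print([t for t in range(10) if danger[state][t]])  # tokens that would complete a phrase
```

After the tokens 5, 6 have been generated, both 7 (finishing the first phrase) and 8 (finishing the second, reached through a failure link) are flagged. In a tensor-based version, each row of `danger` would be a boolean mask that can be applied to the logits in one operation.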

Shadow Ban System

Instead of hard-blocking tokens with -inf (which can degrade generation quality), resk-logits applies configurable penalties to make dangerous tokens extremely unlikely.

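  • Light penalty (-5.0): token weight multiplied by e^-5, about 0.7% of its original value
  • Medium penalty (-10.0): about 0.005% of its original weight
  • Strong penalty (-15.0): about 0.00003% of its original weight
  • Very strong (-20.0): about 0.0000002% of its original weight, virtually impossible to sample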
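The effect of each penalty can be checked with a few lines of arithmetic. Adding a penalty p to a logit multiplies that token's odds by e^p, so the resulting probability depends on how likely the token was to begin with. The sketch below uses a made-up 4-token vocabulary; `softmax` and `apply_penalty` are helper names chosen for this example, not part of the resk-logits API.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def apply_penalty(logits, banned_ids, penalty):
    """Add `penalty` to the logits of banned tokens and leave the rest untouched."""
    return [x + penalty if i in banned_ids else x for i, x in enumerate(logits)]

logits = [2.0, 1.0, 0.5, 0.0]    # toy vocabulary; token 0 is the most likely one
banned = {0}

for penalty in (-5.0, -10.0, -15.0, -20.0):
    before = softmax(logits)[0]
    after = softmax(apply_penalty(logits, banned, penalty))[0]
    print(f"penalty {penalty:6.1f}: odds factor {math.exp(penalty):.2e}, "
          f"p(token 0) {before:.3f} -> {after:.2e}")
```

Unlike a hard -inf mask, every token keeps a finite logit, so the distribution over the remaining tokens is renormalized smoothly rather than truncated.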

Multi-Level Filtering

Supports tiered penalty levels for different severity categories. Each severity level gets its own Aho-Corasick automaton with dedicated penalties.

  • Separate phrase lists per severity level (high, medium, low)
  • Independent penalty configuration per level
  • Automatic EOS forcing on high-severity matches
  • State tracking maintained per level per batch item

Integration with vLLM and HuggingFace

Compatible with both HuggingFace transformers and vLLM inference engines through built-in adapters.

  • HuggingFace: Drop-in LogitsProcessor for model.generate()
  • vLLM: VLLMWrapper adapts the processor to vLLM's signature
  • Batch support: Handles multiple concurrent generations
  • Device-aware: Automatically moves tensors to GPU

Quick Install

pip install resklogits

Usage Example

Shadow Ban Processor

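    from resklogits import ShadowBanProcessor
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
    model = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-7B-v0.1").to("cuda")

    # Build the automaton and danger mask once, before generation starts.
    processor = ShadowBanProcessor(
        tokenizer=tokenizer,
        banned_phrases=["DROP TABLE", "DELETE FROM", "xp_cmdshell"],
        shadow_penalty=-15.0,
        device="cuda",
    )

    # Illustrative prompt; substitute your own.
    inputs = tokenizer("Write a SQL query that lists all users.", return_tensors="pt").to(model.device)

    # The processor adjusts the logits at every generation step.
    outputs = model.generate(
        **inputs,
        logits_processor=[processor],
        max_new_tokens=200,
    )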

Multi-Level Filtering

    from resklogits import MultiLevelShadowBanProcessor

    # Reuses the tokenizer from the previous example.
    processor = MultiLevelShadowBanProcessor(
        tokenizer=tokenizer,
        banned_phrases_by_level={
            "high": ["DROP TABLE", "DELETE FROM"],
            "medium": ["ALTER TABLE", "TRUNCATE"],
            "low": ["salaries", "ssn"],
        },
        penalties={"high": -20.0, "medium": -10.0, "low": -5.0},
        device="cuda",
    )

CLI Tool

resk-logits includes a command-line interface for testing and validation:

  • resklogits generate: Test filtering on sample prompts
  • resklogits test: Validate phrase detection
  • resklogits expand: Expand patterns with synonyms
  • resklogits cache: Manage rule generation cache
  • resklogits validate: Validate configuration files

Architecture

    Input tokens
         |
         v
    Aho-Corasick Automaton (GPU)
         |
         v
    State transition per token (O(1))
         |
         v
    Match detection via failure links
         |
         v
    Shadow penalty via danger_mask tensor
         |
         v
    EOS forcing on complete matches
         |
         v
    Filtered logits output
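The runtime side of this pipeline can be sketched as a small stateful processor. Everything here is a toy stand-in for illustration: the class name `ToyShadowBan`, the hand-written transition table for a single banned phrase (token IDs 3, 4), the 6-token vocabulary, and the choice of token 5 as EOS are all assumptions, and plain lists replace the GPU tensors the library works with.

```python
NEG_INF = float("-inf")

# Pre-computed tables for one banned phrase, token IDs [3, 4].
# State 0: no progress, state 1: saw 3, state 2: saw 3 then 4 (complete match).
TABLE = [
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 2, 0],
    [0, 0, 0, 1, 0, 0],
]
TERMINAL = [False, False, True]
EOS_ID = 5

class ToyShadowBan:
    """Tracks one automaton state per batch item and filters each row of logits."""

    def __init__(self, batch_size, penalty=-15.0):
        self.states = [0] * batch_size
        self.penalty = penalty

    def __call__(self, last_tokens, scores):
        filtered = []
        for b, (tok, row) in enumerate(zip(last_tokens, scores)):
            state = self.states[b] = TABLE[self.states[b]][tok]   # one table lookup
            if TERMINAL[state]:
                # Complete match: leave only EOS sampleable.
                filtered.append([0.0 if t == EOS_ID else NEG_INF
                                 for t in range(len(row))])
            else:
                # Penalize every token that would complete a banned phrase.
                filtered.append([x + self.penalty if TERMINAL[TABLE[state][t]] else x
                                 for t, x in enumerate(row)])
        return filtered

proc = ToyShadowBan(batch_size=2)
flat = [[0.0] * 6, [0.0] * 6]
step1 = proc([3, 1], flat)   # item 0 has started the phrase, item 1 has not
step2 = proc([4, 3], flat)   # item 0 completed it anyway, item 1 has now started it
print(step1)
print(step2)
```

Each batch item carries its own state, so one sequence reaching a match does not affect the others.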

Get Started with resk-logits

Add GPU-accelerated content filtering to your LLM pipeline.

Technical Support

For technical inquiries and integration support:

contact[@]resk.fr
