AI API gateway for protocol translation, inline moderation, streaming parity, and local moderation training.
TransformVetter transforms AI API requests and responses across OpenAI, Claude, and Gemini-compatible formats, then vets request content with keyword rules, smart moderation, local models, and optional LLM review before forwarding upstream.
Note: TransformVetter was previously published as PrismGuard and, earlier, GuardianBraidge.
It is built for production proxy work, not only raw forwarding. The codebase keeps protocol behavior explicit across HTTP routing, request format detection, JSON response conversion, SSE event conversion, moderation error envelopes, history reuse, and training scheduling.
Use it when you need one front door for mixed LLM clients, but still want request moderation before traffic reaches the upstream provider.
- Why TransformVetter
- Fit
- Install
- How It Works
- Supported Protocols
- Quick Start
- Publishing
- Proxy Configuration
- Moderation Profiles
- Runtime Features
- HTTP Endpoints
- Testing
- Repository Map
- Operational Notes
- Contributing
- Security
- Acknowledgements
- License
Most LLM proxy setups solve either API compatibility or moderation. TransformVetter does both in the same request path:
| Need | What TransformVetter does |
|---|---|
| Mixed client protocols | Accepts OpenAI, Claude, Gemini, and OpenAI Responses-style traffic. |
| Upstream format conversion | Rewrites supported requests and responses into the target API shape. |
| Streaming behavior | Preserves SSE-style streaming while translating deltas and terminal events. |
| Inline safety checks | Runs keyword moderation, profile-based smart moderation, and optional LLM review before forwarding. |
| Local review models | Uses profile-selected local runtimes for confident decisions and falls back when uncertain. |
| Training feedback loop | Stores moderation history and can retrain profile-local models from collected samples. |
- Protocol bridge: OpenAI Chat Completions, OpenAI Responses, Claude Messages, and Gemini GenerateContent request families.
- Streaming parity: SSE decoding and re-emission with text deltas, tool-call deltas, final events, usage metadata, and single [DONE] handling.
- Inline moderation: keyword moderation plus smart moderation with cache/history reuse, local model decisions, LLM fallback, retry logic, and concurrency limiting.
- Local runtimes: hashlinear, bow, and fasttext moderation paths selected per profile.
- Training loop: profile-aware sample RPC, training subprocess mode, cooldown decisions, and scheduler support.
- Debug surface: health, OpenAPI stub, URL config parsing, profile inspection, model metrics, and storage inspection routes.
TransformVetter is a good fit if you:
- route multiple AI client formats through one service;
- need moderation before an upstream LLM API receives the request;
- care about stream-compatible protocol conversion, not only non-stream JSON;
- want profile-specific moderation behavior and locally trained review models;
- prefer explicit proxy configuration over hidden global routing rules.
It is not designed to be a full API management platform with billing, tenant dashboards, or hosted policy authoring UI.
Use the published container image when you only want to try the proxy:
docker run --rm -p 8000:8000 --env-file .env ghcr.io/cassiopeiacode/transformvetter:latest
curl -s http://127.0.0.1:8000/healthz
Use the release binary when you want a single executable:
gh release download --repo CassiopeiaCode/TransformVetter v1.0.0 --pattern 'TransformVetter-linux-amd64.tar.gz'
tar -xzf TransformVetter-linux-amd64.tar.gz
chmod +x TransformVetter-linux-amd64
./TransformVetter-linux-amd64
The default image is published to GitHub Container Registry:
docker pull ghcr.io/cassiopeiacode/transformvetter:latest
docker run --rm -p 8000:8000 --env-file .env ghcr.io/cassiopeiacode/transformvetter:latest
Common tags:
| Tag | Meaning |
|---|---|
| latest | Latest successful build from the default branch. |
| master | Latest successful build from master. |
| sha-<commit> | Immutable image for a specific commit. |
| <version> | Release image from a v* Git tag, for example 1.2.3. |
Package page:
https://github.com/CassiopeiaCode/TransformVetter/pkgs/container/transformvetter
Tagged releases publish a Linux amd64 binary and checksum:
gh release download --repo CassiopeiaCode/TransformVetter --pattern 'TransformVetter-linux-amd64.tar.gz'
gh release download --repo CassiopeiaCode/TransformVetter --pattern 'TransformVetter-linux-amd64.tar.gz.sha256'
sha256sum -c TransformVetter-linux-amd64.tar.gz.sha256
tar -xzf TransformVetter-linux-amd64.tar.gz
chmod +x TransformVetter-linux-amd64
./TransformVetter-linux-amd64
Release page:
https://github.com/CassiopeiaCode/TransformVetter/releases
git clone https://github.com/CassiopeiaCode/TransformVetter.git
cd TransformVetter
cargo build --release
./target/release/TransformVetter
flowchart LR
Client[Client request] --> Route[Axum route]
Route --> Config[URL or env proxy config]
Config --> Format[format.rs request plan]
Format --> Extract[moderation text extraction]
Extract --> Basic[basic moderation]
Basic --> Smart[smart moderation]
Smart --> Upstream[Upstream LLM API]
Upstream --> Response[JSON response transform]
Upstream --> Stream[SSE stream transcoder]
Response --> Client
Stream --> Client
Smart --> History[(history.rocks)]
History --> Train[train-profile subprocess]
Train --> Model[local model artifacts]
Model --> Smart
Request flow:
- The catch-all proxy route receives a request whose path contains {config}${upstream} or !ENV_KEY${upstream}.
- src/routes.rs parses the proxy configuration and real upstream URL.
- src/format.rs optionally detects and transforms the request into a target API format.
- src/moderation/extract.rs derives the text that should be reviewed.
- basic_moderation can block immediately from keywords.
- smart_moderation can reuse history, run a local model, or call an LLM reviewer.
- src/proxy.rs forwards the request to the upstream API only after moderation passes.
- src/response.rs transforms non-stream JSON responses when needed.
- src/streaming.rs transcodes SSE streams when the source and target protocols differ.
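To make the final streaming step more concrete, here is a minimal sketch of SSE transcoding from an OpenAI-style chat stream to Claude-style text deltas. It is not the code in src/streaming.rs: the function name transcode_openai_sse is hypothetical, only text deltas are handled (tool calls, usage metadata, and the opening Claude events are left out), and the event shapes are assumed from the public OpenAI and Anthropic streaming formats.

```python
import json
from typing import Iterable, Iterator


def transcode_openai_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield Claude-style SSE frames for OpenAI-style `data:` lines.

    Hypothetical helper: text deltas only, and a single terminal event
    even if the upstream repeats [DONE].
    """
    finished = False
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            if not finished:
                finished = True
                stop = json.dumps({"type": "message_stop"})
                yield f"event: message_stop\ndata: {stop}\n\n"
            continue
        if finished:
            continue  # ignore anything after the terminal marker
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                event = {
                    "type": "content_block_delta",
                    "index": 0,
                    "delta": {"type": "text_delta", "text": text},
                }
                yield f"event: content_block_delta\ndata: {json.dumps(event)}\n\n"
```

The `finished` flag is what keeps the terminal event single: a repeated [DONE] marker from the upstream produces no second stop frame.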
format_transform supports these request format names:
| Format name | API family |
|---|---|
| openai_chat | OpenAI-compatible Chat Completions |
| openai_responses | OpenAI-compatible Responses API |
| claude_chat | Anthropic Claude Messages |
| gemini_chat | Gemini GenerateContent |
The transformation layer handles normal messages, system/instruction fields, multimodal content blocks, tool declarations, tool calls, tool results, stream flags, and path rewrites for target APIs. The response layer includes both JSON conversion and SSE conversion paths, with the broadest coverage in the HTTP proxy and stream tests.
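As an illustration of what a request transformation involves, the sketch below maps a text-only Claude Messages request to a Chat Completions request. It is a simplified assumption, not the logic in src/format.rs: the function name claude_to_openai_chat is hypothetical, tools and non-text content blocks are ignored, and the model name is copied through unchanged.

```python
from typing import Any, Dict, List


def claude_to_openai_chat(body: Dict[str, Any]) -> Dict[str, Any]:
    """Map a text-only Claude Messages request to a Chat Completions request."""
    messages: List[Dict[str, Any]] = []
    system = body.get("system")
    if isinstance(system, str) and system:
        # Claude carries the system prompt as a top-level field,
        # while Chat Completions expects it as the first message.
        messages.append({"role": "system", "content": system})
    for message in body.get("messages", []):
        content = message["content"]
        if isinstance(content, list):
            # Flatten text blocks; other block types are dropped in this sketch.
            content = "".join(
                block.get("text", "")
                for block in content
                if block.get("type") == "text"
            )
        messages.append({"role": message["role"], "content": content})
    converted: Dict[str, Any] = {"model": body["model"], "messages": messages}
    # Copy the options that share a name in both request shapes.
    for key in ("max_tokens", "temperature", "stream"):
        if key in body:
            converted[key] = body[key]
    return converted
```

A real conversion also has to handle tool declarations, tool results, and multimodal blocks, which is where most of the edge cases live.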
- Linux or another Unix-like environment.
- Rust toolchain compatible with edition 2021. The Dockerfile currently installs Rust 1.89.0.
- clang and libclang when building dependencies that need native compilation.
- Optional: systemd-run and CPU affinity tools for the scheduler-style workflows.
cd TransformVetter
cargo run
By default the service reads .env from the repository root, listens on 0.0.0.0:8000, and exposes:
curl -s http://127.0.0.1:8000/healthz
Expected shape:
{
"ok": true,
"service": "TransformVetter",
"host": "0.0.0.0",
"port": 8000,
"debug": true
}
cargo build --release
./target/release/TransformVetter
The bundled start.sh expects a .env file and a release binary:
nice -n 19 cargo build --release -j 1
./start.sh
docker build -t transformvetter .
docker run --rm -p 8080:8080 transformvetter
The Docker image builds the default feature set. Storage debug routes and sample RPC service code require the explicit storage-debug feature, described below.
The repository uses two publishing workflows.
.github/workflows/ghcr.yml publishes the container image. On every push to master, it builds and pushes:
- ghcr.io/cassiopeiacode/transformvetter:latest
- ghcr.io/cassiopeiacode/transformvetter:master
- ghcr.io/cassiopeiacode/transformvetter:sha-<commit>
On v* tags, the same workflow also publishes semver image tags such as :1.2.3, :1.2, and :1.
.github/workflows/release.yml publishes GitHub Releases. On v* tags, it builds a Linux amd64 binary, creates or updates the release, and attaches:
- TransformVetter-linux-amd64
- TransformVetter-linux-amd64.tar.gz
- TransformVetter-linux-amd64.tar.gz.sha256
Manual publishing is available from GitHub Actions:
gh workflow run ghcr.yml --repo CassiopeiaCode/TransformVetter --ref master
gh workflow run release.yml --repo CassiopeiaCode/TransformVetter --ref master -f tag=v1.2.3
TransformVetter does not use a single static upstream route. The proxy route is encoded as:
/{config}${upstream-url}
or, preferably for real deployments:
/!ENV_KEY${upstream-url}
The env form keeps long JSON configs out of client URLs. ENV_KEY must contain JSON.
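The two route forms can be sketched as a small parser. This is an illustration only, not the parser in src/routes.rs: the function name split_proxy_route is hypothetical, the env form is assumed to split at the first $, and the inline form is assumed to split at the first $http marker so that a $ inside the JSON does not end the config early. URL decoding is left out.

```python
import json
import os
from typing import Any, Dict, Mapping, Tuple


def split_proxy_route(
    path: str, env: Mapping[str, str] = os.environ
) -> Tuple[Dict[str, Any], str]:
    """Split '/{config}${upstream}' or '/!ENV_KEY${upstream}' into (config, upstream)."""
    value = path.lstrip("/")
    if value.startswith("!"):
        # Env form: the key name runs up to the first '$'.
        key, sep, upstream = value[1:].partition("$")
        if not sep or key not in env:
            raise ValueError("missing '$' separator or unknown env key")
        return json.loads(env[key]), upstream
    # Inline form: assume the upstream starts at the first '$http' marker.
    marker = value.find("$http")
    if marker < 0:
        raise ValueError("missing '$' separator before the upstream URL")
    return json.loads(value[:marker]), value[marker + 1:]
```

Passing `env` explicitly keeps the sketch easy to test; by default it reads the process environment, which matches the requirement that ENV_KEY holds JSON.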
Example .env entry:
CLAUDE_TO_OPENAI='{"format_transform":{"enabled":true,"strict_parse":true,"from":"claude_chat","to":"openai_chat","delay_stream_header":true},"basic_moderation":{"enabled":true,"keywords_file":"configs/keywords.txt","error_code":"BASIC_MODERATION_BLOCKED"},"smart_moderation":{"enabled":true,"profile":"default"}}'
Example request shape:
curl -sS \
-H 'content-type: application/json' \
-H "authorization: Bearer $UPSTREAM_API_KEY" \
--data '{"model":"claude-3-5-sonnet-latest","max_tokens":64,"messages":[{"role":"user","content":"Hello"}]}' \
  'http://127.0.0.1:8000/!CLAUDE_TO_OPENAI$https://api.openai.com/v1/chat/completions'
You can validate config parsing without calling an upstream:
curl -G http://127.0.0.1:8000/debug/url-config \
  --data-urlencode 'value=!CLAUDE_TO_OPENAI$https://api.openai.com/v1/chat/completions'
{
"format_transform": {
"enabled": true,
"strict_parse": true,
"from": "claude_chat",
"to": "openai_chat",
"delay_stream_header": true
}
}
Important fields:
| Field | Meaning |
|---|---|
| enabled | Enables request planning and format conversion. |
| strict_parse | Returns a structured moderation-style error when the source format cannot be detected or is disallowed. |
| from | Source format, array of formats, or auto-style detection when omitted. |
| to | Target format. Use pass_through to keep the source protocol while still applying moderation. |
| disable_tools | Removes tool declarations and tool call content during transformation. |
| delay_stream_header | Lets the proxy delay committing stream headers so pre-stream errors can be returned as normal JSON. |
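Hand-writing a single-line JSON value for .env is easy to get wrong, so one option is to generate the entry. The sketch below is only a convenience suggestion: the helper name env_entry and the key CLAUDE_MODERATION_ONLY are hypothetical, and it assumes the .env loader accepts single-quoted values as in the example entry in this README.

```python
import json
import shlex


def env_entry(key: str, config: dict) -> str:
    """Render KEY='<compact JSON>' for a .env file."""
    compact = json.dumps(config, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes because it contains
    # characters such as '{' and '"'.
    return f"{key}={shlex.quote(compact)}"


# Moderation-only config: keep the Claude protocol, still run smart moderation.
config = {
    "format_transform": {
        "enabled": True,
        "from": "claude_chat",
        "to": "pass_through",
    },
    "smart_moderation": {"enabled": True, "profile": "default"},
}
print(env_entry("CLAUDE_MODERATION_ONLY", config))
```

The generated line can be pasted into .env and then checked with the /debug/url-config route before any upstream call is made.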
{
"basic_moderation": {
"enabled": true,
"keywords_file": "configs/keywords.txt",
"error_code": "BASIC_MODERATION_BLOCKED"
}
}
The basic layer loads keywords from a file, matches case-insensitively, and refreshes when the file changes.
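The behavior of the basic layer can be sketched as follows. This is a minimal illustration, not the code in src/moderation/basic.rs: the class name KeywordFilter is hypothetical, and it assumes one keyword per line, substring matching, and change detection based on file modification time and size.

```python
import os
from typing import List, Optional, Tuple


class KeywordFilter:
    """Case-insensitive substring matcher that reloads when the file changes."""

    def __init__(self, path: str) -> None:
        self.path = path
        self._stamp: Optional[Tuple[int, int]] = None
        self._keywords: List[str] = []

    def _refresh(self) -> None:
        stat = os.stat(self.path)
        stamp = (stat.st_mtime_ns, stat.st_size)
        if stamp == self._stamp:
            return  # file unchanged since the last load
        with open(self.path, encoding="utf-8") as handle:
            # Lowercase once at load time so matching stays cheap.
            self._keywords = [
                line.strip().lower() for line in handle if line.strip()
            ]
        self._stamp = stamp

    def first_match(self, text: str) -> Optional[str]:
        """Return the first keyword found in `text`, or None."""
        self._refresh()
        lowered = text.lower()
        return next((kw for kw in self._keywords if kw in lowered), None)
```

Checking the file stamp on every call is the simplest way to pick up edits without a restart; a production implementation might watch the file or throttle the check instead.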
{
"smart_moderation": {
"enabled": true,
"profile": "default"
}
}
The smart layer uses configs/mod_profiles/<profile>/profile.json to choose the AI reviewer, local model runtime, thresholds, training limits, and sample-loading strategy.
Profiles live under:
configs/mod_profiles/<profile>/
Example profile names:
- default
- strict-review
Each profile can contain:
| File or directory | Purpose |
|---|---|
| profile.json | AI reviewer settings, thresholds, local model type, and training parameters. |
| ai_prompt.txt | Prompt template for LLM moderation fallback. |
| keywords.txt | Profile-local keyword list. |
| history.rocks/ | RocksDB moderation history and training samples. |
| .train_status.json | Training status written by the training subsystem. |
The profile field local_model_type selects one of:
- hashlinear
- bow
- fasttext
When local confidence is clearly below the low-risk threshold or above the high-risk threshold, the local model can decide without an LLM call. Uncertain scores fall back to LLM review unless concurrency fallback logic applies.
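The threshold rule can be written down as a small decision function. This is a sketch under assumptions, not the code in src/moderation/smart.rs: the names Decision and route_by_score are hypothetical, the score is assumed to be the local model's probability that the text is unsafe, and the concurrency fallback path is not modeled.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"            # local model is confident the text is safe
    BLOCK = "block"            # local model is confident the text is unsafe
    LLM_REVIEW = "llm_review"  # uncertain band, defer to the LLM reviewer


def route_by_score(score: float, low: float, high: float) -> Decision:
    """Pick a decision path from a local risk score and two thresholds."""
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= low <= high <= 1")
    if score < low:
        return Decision.ALLOW
    if score > high:
        return Decision.BLOCK
    # Scores inside [low, high] are treated as uncertain.
    return Decision.LLM_REVIEW
```

With thresholds of 0.60 and 0.80, a score of 0.10 is allowed locally, 0.95 is blocked locally, and 0.70 goes to LLM review. Widening the gap between the thresholds sends more traffic to the reviewer; narrowing it trusts the local model more.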
Moderation is configured in two layers.
The proxy request config decides whether moderation runs for a specific upstream call:
{
"basic_moderation": {
"enabled": true,
"keywords_file": "configs/keywords.txt",
"error_code": "BASIC_MODERATION_BLOCKED"
},
"smart_moderation": {
"enabled": true,
"profile": "default"
}
}
The profile file decides how smart moderation behaves:
configs/mod_profiles/default/profile.json
A compact profile can look like this:
{
"ai": {
"provider": "openai",
"base_url": "https://api.example.com/v1",
"model": "moderation-model-a,moderation-model-b",
"api_key_env": "MODERATION_API_KEY",
"timeout": 50,
"max_retries": 1
},
"prompt": {
"template_file": "ai_prompt.txt",
"max_text_length": 50000
},
"probability": {
"ai_review_rate": 0.01,
"random_seed": 42,
"low_risk_threshold": 0.60,
"high_risk_threshold": 0.80,
"enable_concurrency_limit_fallback": true
},
"local_model_type": "hashlinear",
"hashlinear_training": {
"min_samples": 30,
"retrain_interval_minutes": 600,
"max_samples": 10000,
"max_db_items": 100000,
"sample_loading": "random_duplicate",
"analyzer": "char",
"ngram_range": [2, 4],
"n_features": 1048576,
"alpha": 0.00001,
"epochs": 2,
"batch_size": 2048,
"max_seconds": 300
}
}
Profile fields:
| Field | Meaning |
|---|---|
| ai.provider | Reviewer provider label. The current smart path is OpenAI-compatible. |
| ai.base_url | Base URL for the reviewer API. |
| ai.model | Reviewer model name. Comma-separated values are treated as retry candidates. |
| ai.api_key_env | Environment variable that holds the reviewer API key. |
| ai.timeout | Reviewer HTTP timeout in seconds. |
| ai.max_retries | Number of retry attempts for reviewer failures. |
| prompt.template_file | Prompt template file relative to the profile directory. |
| prompt.max_text_length | Maximum moderation text length passed to the reviewer prompt. |
| probability.ai_review_rate | Fraction of requests forced through AI review even when a local model can decide. |
| probability.low_risk_threshold | Scores below this threshold can be treated as locally safe. |
| probability.high_risk_threshold | Scores above this threshold can be treated as locally unsafe. |
| probability.enable_concurrency_limit_fallback | Lets the proxy use local confidence fallback when reviewer concurrency is exhausted. |
| local_model_type | Selects hashlinear, bow, or fasttext. |
| *_training.min_samples | Minimum stored samples before a training run is considered. |
| *_training.retrain_interval_minutes | Cooldown between training runs for the profile. |
| *_training.max_samples | Maximum samples loaded into a training run. |
| *_training.max_db_items | Maximum history items considered when sampling. |
| *_training.sample_loading | Sample strategy, for example random_full or random_duplicate. |
Training blocks are model-specific. Use hashlinear_training when local_model_type is hashlinear, bow_training when it is bow, and fasttext_training when it is fasttext. Extra blocks can remain in the profile; the active runtime only uses the block that matches local_model_type.
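When auditing a profile it can help to see which settings are actually in effect. The sketch below is a hypothetical helper, not part of the codebase: active_settings splits ai.model into its retry candidates and returns only the training block that matches local_model_type, assuming the `<type>_training` naming described above.

```python
from typing import Any, Dict, List, Tuple


def active_settings(profile: Dict[str, Any]) -> Tuple[List[str], Dict[str, Any]]:
    """Return (reviewer model candidates, active training block)."""
    # Comma-separated model names are treated as an ordered candidate list.
    models = [
        name.strip()
        for name in profile["ai"]["model"].split(",")
        if name.strip()
    ]
    # Only the block named after the selected runtime is used.
    block_name = f"{profile['local_model_type']}_training"
    return models, profile.get(block_name, {})


profile = {
    "ai": {"model": "moderation-model-a,moderation-model-b"},
    "local_model_type": "hashlinear",
    "hashlinear_training": {"min_samples": 30},
    "bow_training": {"min_samples": 500},  # present but inactive
}
models, training = active_settings(profile)
```

Here `models` holds the two reviewer candidates in order and `training` is the hashlinear_training block; the bow_training block is ignored because the profile selects hashlinear.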
Useful checks while editing a profile:
curl -s http://127.0.0.1:8000/debug/profile/default
curl -s 'http://127.0.0.1:8000/debug/profile/default/metrics?sample_size=1000&threshold=0.5'
Cargo.toml defines one optional feature:
storage-debug
Default builds do not enable it.
cargo run
cargo run --features storage-debug
With storage-debug enabled, the service can compile storage inspection routes and the sample RPC server used by the training loop. Without it, storage-dependent debug routes are not attached, and startup logs warn if sample RPC is configured but unavailable in the build.
The configured sample RPC transport defaults to Unix sockets:
TRAINING_DATA_RPC_ENABLED=true
TRAINING_DATA_RPC_TRANSPORT=unix
TRAINING_DATA_RPC_UNIX_SOCKET=run/sample-store.sock
TRAINING_DATA_RPC_TRANSPORT=tcp is recognized by configuration parsing but is not implemented by the server.
Always available:
| Endpoint | Method | Purpose |
|---|---|---|
| /healthz | GET, HEAD | Health check. |
| /openapi.json | GET | Minimal OpenAPI document for the proxy service. |
| /docs | GET | Simple Swagger UI placeholder page. |
| /redoc | GET | Simple ReDoc placeholder page. |
| /debug/settings | GET | Loaded settings, excluding hidden fields. |
| /debug/proxy-config/:key | GET | Parse a JSON proxy config from an environment variable. |
| /debug/profile/:profile | GET | Inspect profile paths, model presence, status, and training decision. |
| /debug/url-config?value=... | GET | Parse a full {config}${upstream} or !KEY${upstream} route value. |
| /*cfg_and_upstream | GET, POST, PUT, DELETE | Main proxy entry. |
Available only with --features storage-debug:
| Endpoint | Method | Purpose |
|---|---|---|
| /debug/profile/:profile/metrics | GET | Evaluate local model metrics over stored samples. |
| /debug/storage/:profile/meta | GET | Inspect RocksDB metadata and sample preview. |
| /debug/storage/:profile/sample/:id | GET | Read one stored sample by id. |
| /debug/storage/:profile/find-by-text | POST | Find a sample by exact text. |
Server mode starts the scheduler loop when TRAINING_SCHEDULER_ENABLED=true. The scheduler scans profile directories, evaluates sample counts and cooldowns, then launches profile training through the binary's subprocess mode.
Manual training entry:
cargo run --features storage-debug -- train-profile default
The training code can produce or refresh runtime artifacts for the configured local model type. It uses the sample RPC layer for balanced/latest/random sample loading and writes profile status for later inspection.
Useful scheduler environment variables:
| Variable | Default | Meaning |
|---|---|---|
| TRAINING_SCHEDULER_ENABLED | true | Starts the scheduler in server mode. |
| TRAINING_SCHEDULER_INTERVAL_MINUTES | 10 | Scheduler polling interval. |
| TRAINING_SCHEDULER_FAILURE_COOLDOWN_MINUTES | 30 | Cooldown after failed training. |
| TRAINING_SUBPROCESS_ALLOWED_CPUS | 0 | CPU set used when launching training with systemd-run. |
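The scheduler's per-profile decision can be sketched roughly as below. This is an assumption about how sample counts and cooldowns combine, not the logic in src/scheduler.rs or src/training.rs: the function should_train and its parameter names are hypothetical, with the 30-minute default taken from the failure cooldown in the table above.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_train(
    now: datetime,
    sample_count: int,
    min_samples: int,
    retrain_interval_minutes: int,
    last_success: Optional[datetime] = None,
    last_failure: Optional[datetime] = None,
    failure_cooldown_minutes: int = 30,
) -> bool:
    """Decide whether a profile is due for a training run."""
    if sample_count < min_samples:
        return False  # not enough stored samples yet
    if last_failure is not None:
        if now - last_failure < timedelta(minutes=failure_cooldown_minutes):
            return False  # still backing off after a failed run
    if last_success is not None:
        if now - last_success < timedelta(minutes=retrain_interval_minutes):
            return False  # retrain cooldown has not elapsed
    return True
```

Each polling tick would evaluate this once per profile directory and launch the train-profile subprocess only when it returns true.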
Run the full integration-focused suite:
cargo test --tests -- --nocapture
Run with storage-dependent tests enabled:
cargo test --features storage-debug --tests -- --nocapture
Lower-impact local runs:
CARGO_BUILD_JOBS=1 cargo test --test format_process_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test http_proxy_request_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test http_proxy_response_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test http_proxy_stream_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test moderation_runtime_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test scheduler_tests -- --nocapture
CARGO_BUILD_JOBS=1 cargo test --test training_tests -- --nocapture
On shared hosts, keep Rust builds and tests pinned to fewer CPUs:
taskset -c 0 env CARGO_BUILD_JOBS=1 cargo test --test http_proxy_stream_tests -- --nocapture
.
├── src/
│ ├── main.rs # startup mode, server boot, scheduler and RPC wiring
│ ├── routes.rs # Axum routes, debug endpoints, URL config parser
│ ├── proxy.rs # proxy pipeline, moderation calls, upstream forwarding
│ ├── format.rs # request detection and request format transformation
│ ├── response.rs # non-stream JSON response transformation
│ ├── streaming.rs # SSE decode/transcode/re-emit layer
│ ├── profile.rs # moderation profile config and artifact paths
│ ├── sample_rpc.rs # sample storage RPC over Unix sockets
│ ├── storage.rs # RocksDB sample/history storage, feature-gated
│ ├── training.rs # train-profile subprocess and model training decisions
│ ├── scheduler.rs # periodic profile training scheduler
│ └── moderation/
│ ├── basic.rs # keyword moderation
│ ├── extract.rs # protocol-aware text extraction
│ ├── smart.rs # smart moderation orchestration
│ ├── hashlinear.rs # local hashlinear runtime
│ ├── bow.rs # local bag-of-words runtime
│ └── fasttext.rs # local fastText-compatible runtime
├── configs/
│ ├── keywords.txt
│ ├── strict_moderation_example.py
│ └── mod_profiles/
├── tests/ # protocol, proxy, moderation, storage, scheduler, training tests
├── py-scripts/ # evaluation and benchmark helper scripts
├── docs/superpowers/ # implementation plans and design notes
├── Dockerfile
├── start.sh
├── Cargo.toml
├── LICENSE
└── README.md
- The binary name in Cargo.toml is TransformVetter; older builds used Prismguand-Rust.
- The process attempts to lower its own priority to nice=19 at startup. Failure is logged as a warning and does not stop the server.
- Request bodies with supported compression encodings are decoded before JSON parsing in the proxy pipeline.
- Several debugging and training paths assume a stable repository root because profile paths are resolved relative to the process working directory.
- The checked-in .env and profile files may contain deployment-specific values. Review them before using this repository in a different environment.
Focused issues and pull requests are welcome. The most useful contributions are usually:
- protocol fixtures for real OpenAI, Claude, Gemini, or OpenAI Responses edge cases;
- focused tests for request conversion, SSE conversion, moderation extraction, and error envelopes;
- profile examples that make moderation configuration easier to audit;
- small fixes that preserve existing proxy behavior and keep release artifacts reproducible.
Before opening a broad refactor, start with an issue that describes the behavior change and the compatibility impact.
Do not publish real upstream API keys, moderation reviewer keys, private profile data, or RocksDB history files in issues. If a report involves sensitive examples, reduce it to a minimal synthetic request body that still reproduces the behavior.
For deployment, keep reviewer API keys in environment variables, prefer !ENV_KEY proxy configs over raw JSON URLs, and review profile files before sharing them.
TransformVetter is grateful for the support, feedback, and technical discussions from the linux.do community.
TransformVetter is licensed under the Apache License 2.0.