ZAKKY zakky8

AI safety · agent engineering · 3D web performance

🛠 Currently building

Pre-tool-call governance + structured audit for OSS agent frameworks. Adversarial security review of every TypeScript package I ship. A mechanism-grounded jailbreak taxonomy that publicly retracts overclaims instead of silently rewriting them.

📬 Available for

Custom bot builds (Discord / Telegram / Slack — AI-powered or classic) · web builds (Next.js, Astro, Three.js / R3F) · community moderator + support-agent roles · AI-safety consulting · independent security review for TypeScript + Node 22 · Three.js + WebGPU performance audits · technical writing with sourced citations.

Featured projects

Verification-first TypeScript agent framework. Hallucination defense, source grounding, pre-tool-call governance, capability-token sandboxing, and platform rate-limit scheduling as first-class layers — not bolt-ons.

28 packages · 10 surfaces (Discord, Slack, Telegram, Teams, web, REST, gRPC, ticketing, Matrix, voice)
916 tests passing · BENCHMARKS dimensions gated in CI on every PR
Verifier: atomic-claim multi-judge with Vectara HHEM-2.1 adapter
Interop: MCP + OAuth gateway · WASM sandbox (wasmtime) · Agent Client Protocol v1 (Zed / Cursor / Helix)
44 bugs caught across three internal adversarial review passes · 0 known CVEs
Apache-2.0

→ zakky8/TENET
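
To illustrate what "pre-tool-call governance + structured audit" can mean in practice, here is a minimal sketch. It is not TENET's API: `ToolCall`, `Policy`, `AuditEntry`, `governedInvoke`, and `allowList` are all hypothetical names chosen for this example. The idea shown is only that a policy decision is made and recorded before the tool runs, and that a blocked call never reaches the tool.

```typescript
// Hypothetical shapes for illustration only; not taken from any TENET package.
interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

interface Decision {
  allowed: boolean;
  reason: string;
}

interface AuditEntry extends Decision {
  tool: string;
  at: string; // ISO-8601 timestamp of the decision
}

type Policy = (call: ToolCall) => Decision;

// Evaluate the policy first, record the decision, and only then run the tool.
function governedInvoke<T>(
  call: ToolCall,
  policy: Policy,
  audit: AuditEntry[],
  run: (call: ToolCall) => T,
): T {
  const decision = policy(call);
  audit.push({ tool: call.tool, ...decision, at: new Date().toISOString() });
  if (!decision.allowed) {
    throw new Error(`Tool call blocked: ${decision.reason}`);
  }
  return run(call);
}

// A deliberately simple policy: permit only tools named on an allow-list.
function allowList(allowed: ReadonlySet<string>): Policy {
  return (call) =>
    allowed.has(call.tool)
      ? { allowed: true, reason: "tool is on the allow-list" }
      : { allowed: false, reason: `tool "${call.tool}" is not on the allow-list` };
}
```

A real policy would likely inspect arguments and caller capabilities as well; the allow-list is only the smallest policy that exercises both the allowed and the blocked path.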

Mechanism-grounded taxonomy of 40 LLM jailbreak patterns across 10 categories, mapped to the safety-alignment assumptions they subvert.

Latest: v4.2.1 — "Honest Reframing: Simulation is Prior"
Live site: zakky8.github.io/llm-jailbreak-taxonomy
pytest 10/10 · Python 3.10 / 3.11 / 3.12 on GitHub Actions
17 citations, every one direct-WebFetch verified · 1 publicly refuted (PoisonedRAG 90%, not 97–99%)
Seed-42 bit-identical across runs · verified in CI
Adversarial peer review caught four overclaims in v4.2.0 — all retracted in v4.2.1 with side-by-side "claim → why wrong"

→ zakky8/llm-jailbreak-taxonomy

The Three.js + WebGL performance reference I wish I'd had: 48 validated topic folders, every claim sourced against live repos and browser specs.

Live site: zakky8.github.io/web-optimization — ⌘K search, filter, particle background
Three.js r184
Covers WebGPU, GLSL, R3F, GSAP, mobile, GPGPU particles, Core Web Vitals for 3D
Sourced corrections to widely repeated wrong claims: GSAP plugins free, SMIL not deprecated, PCFSoftShadowMap deprecated, Safari WebXR support

→ zakky8/web-optimization

Stack

AI / Safety · Anthropic SDK · OpenAI SDK · Constitutional AI · RLHF · red-teaming · Wilson CI · McNemar · Cochran Q
Agents · TypeScript strict · ES2024 · pnpm workspaces · MCP · ACP · OAuth gateway · wasmtime sandbox · multi-judge verifier · pre-tool-call governance + audit
3D / Web · Three.js · React Three Fiber · drei · WebGPU · WGSL · GLSL · GSAP · Lenis · Vite · Astro · Next.js 15 · Core Web Vitals

Recent shipments

Date	Project	Summary
2026-06-04	`TENET`	Phase 14 vulnerability-test triage — three-agent adversarial review surfaced 23 real issues; HIGH + MEDIUM patched (`a2ca9a8`)
2026-06-04	`@tenet/acp`	Agent Client Protocol v1 adapter — JSON-RPC 2.0 over stdio, NDJSON framed, for Zed / Cursor / Helix interop
2026-06-02	`llm-jailbreak-taxonomy`	v4.2.1 released with public retractions of four overclaims
2026-06-02	`tldr-pages/tldr`	`pyinstaller` merged upstream · `fc-scan` / `syft` / `helmfile` PRs approved

External-PR throughput this year: 12 PRs · ~850 commits
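
The `@tenet/acp` entry above mentions JSON-RPC 2.0 messages framed as NDJSON over stdio. As a rough sketch of what that framing involves (not the package's actual code), each message is one JSON document on one line, and the reader must buffer partial chunks until a newline arrives. `encodeFrame`, `NdjsonDecoder`, and the `example/ping` method name are hypothetical.

```typescript
// Serialize one message as a single NDJSON frame.
// JSON.stringify escapes newlines inside strings, so the only raw "\n"
// in the output is the frame terminator appended here.
function encodeFrame(message: object): string {
  return JSON.stringify(message) + "\n";
}

// Incremental decoder: stdio delivers arbitrary chunks, so a frame may be
// split across reads or several frames may arrive in one read.
class NdjsonDecoder {
  private buffer = "";

  // Feed a chunk; returns every complete message it finished.
  push(chunk: string): unknown[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or "" after a trailing newline).
    this.buffer = lines.pop() ?? "";
    const messages: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed.length > 0) {
        messages.push(JSON.parse(trimmed));
      }
    }
    return messages;
  }
}

// Usage: a frame split across two chunks, followed by a second whole frame.
const frame = encodeFrame({ jsonrpc: "2.0", id: 1, method: "example/ping" });
const decoder = new NdjsonDecoder();
const early = decoder.push(frame.slice(0, 10)); // nothing complete yet
const late = decoder.push(frame.slice(10) + frame); // two complete messages
```

In a Node process the chunks would typically come from `process.stdin` `data` events; the decoder itself is transport-agnostic, which keeps it easy to test.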

Working principles

Sources: primary documents first (official docs, browser specs, live repos); peer-reviewed papers second. Affiliate review sites, paraphrased rules, and training-data memory are rejected outright.

Claims: hypothesis until corroborated by ≥2 independent sources. A single source is marked UNVERIFIED. Retractions are documented publicly in the CHANGELOG — never silently rewritten.
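
The claim rule above can be sketched as a small classifier. This is an illustration under assumptions, not tooling from any of the repos: `Source`, `classifyClaim`, and the `CORROBORATED` and `HYPOTHESIS` labels are hypothetical (only `UNVERIFIED` appears in the rule itself), and "independent" is approximated here as "different origin", which a real check would need to define more carefully.

```typescript
// Hypothetical types for illustration only.
type ClaimStatus = "HYPOTHESIS" | "UNVERIFIED" | "CORROBORATED";

interface Source {
  title: string;
  origin: string; // publisher or organization; used as a proxy for independence
}

// Count distinct origins, so two documents from the same publisher
// do not count as two independent sources.
function classifyClaim(sources: Source[]): ClaimStatus {
  const origins = new Set(sources.map((s) => s.origin.trim().toLowerCase()));
  if (origins.size >= 2) return "CORROBORATED";
  if (origins.size === 1) return "UNVERIFIED";
  return "HYPOTHESIS";
}
```

Two pages from the same publisher therefore still classify as `UNVERIFIED`, which matches the spirit of requiring independent corroboration.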

Simulation vs measurement: hand-tuned parameters that reproduce literature distributions are a prior, never presented as measurement. Live API calls against real models are the only valid measurement.
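
One way to keep that distinction from eroding is to carry provenance on every result, so a simulated number cannot be printed without its label. The sketch below assumes hypothetical names (`Provenance`, `Result`, `formatResult`, `measurementsOnly`) and placeholder values; it is not code from the taxonomy repo.

```typescript
// Hypothetical provenance tag attached to every reported number.
type Provenance = "simulation-prior" | "live-measurement";

interface Result {
  label: string;
  value: number;
  provenance: Provenance;
}

// Every rendered result states where it came from.
function formatResult(r: Result): string {
  const tag = r.provenance === "live-measurement" ? "MEASURED" : "PRIOR, simulated";
  return `${r.label}: ${r.value} [${tag}]`;
}

// Anything reported as a measurement must pass through this filter.
function measurementsOnly(results: Result[]): Result[] {
  return results.filter((r) => r.provenance === "live-measurement");
}

// Placeholder values, not real data.
const exampleResults: Result[] = [
  { label: "example-metric-a", value: 0.5, provenance: "simulation-prior" },
  { label: "example-metric-b", value: 0.25, provenance: "live-measurement" },
];
```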

Review: adversarial review is welcome. I assume the reviewer is right until proven otherwise. Response order: fix first, defend second. Silently ignoring a review is never acceptable.
