zakky8/README.md

ZAKKY

AI safety · agent engineering · 3D web performance


🛠 Currently building

Pre-tool-call governance + structured audit for OSS agent frameworks. Adversarial security review of every TypeScript package I ship. A mechanism-grounded jailbreak taxonomy that publicly retracts overclaims instead of silently rewriting them.

📬 Available for

Custom bot builds (Discord / Telegram / Slack — AI-powered or classic) · web builds (Next.js, Astro, Three.js / R3F) · community moderator + support-agent roles · AI-safety consulting · independent security review for TypeScript + Node 22 · Three.js + WebGPU performance audits · technical writing with sourced citations.


Featured projects

Verification-first TypeScript agent framework. Hallucination defense, source grounding, pre-tool-call governance, capability-token sandboxing, and platform rate-limit scheduling as first-class layers — not bolt-ons.

  • 28 packages · 10 surfaces (Discord, Slack, Telegram, Teams, web, REST, gRPC, ticketing, Matrix, voice)
  • 916 tests passing · BENCHMARKS dimensions gated in CI on every PR
  • Verifier: atomic-claim multi-judge with Vectara HHEM-2.1 adapter
  • Interop: MCP + OAuth gateway · WASM sandbox (wasmtime) · Agent Client Protocol v1 (Zed / Cursor / Helix)
  • 44 bugs caught across three internal adversarial review passes · 0 known CVEs
  • Apache-2.0

→ zakky8/TENET
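
As a rough illustration of what "pre-tool-call governance + structured audit" can mean, here is a minimal sketch in Python (TENET itself is TypeScript). Every name in it (`Governor`, `AuditRecord`, the allowlist policy shape, the `read_file` tool) is hypothetical and not taken from the TENET codebase; it only assumes that a policy check runs before the tool does and that every decision is recorded.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class AuditRecord:
    """One structured entry per attempted tool call (hypothetical shape)."""
    tool: str
    args: dict[str, Any]
    allowed: bool
    reason: str


@dataclass
class Governor:
    # Hypothetical policy: tool name -> predicate over the call arguments.
    allowed_tools: dict[str, Callable[[dict[str, Any]], bool]]
    audit_log: list[AuditRecord] = field(default_factory=list)

    def call(self, tool: str, args: dict[str, Any], impl: Callable[..., Any]) -> Any:
        """Check the policy first; run the tool only if the check passes."""
        check = self.allowed_tools.get(tool)
        if check is None:
            self.audit_log.append(AuditRecord(tool, args, False, "tool not on allowlist"))
            raise PermissionError(f"{tool}: not on allowlist")
        if not check(args):
            self.audit_log.append(AuditRecord(tool, args, False, "arguments rejected by policy"))
            raise PermissionError(f"{tool}: arguments rejected by policy")
        self.audit_log.append(AuditRecord(tool, args, True, "ok"))
        return impl(**args)


# Usage: allow read_file, but never under /etc (an invented rule).
gov = Governor(
    allowed_tools={"read_file": lambda a: not str(a.get("path", "")).startswith("/etc")}
)
result = gov.call("read_file", {"path": "notes.txt"}, lambda path: f"contents of {path}")
```

The point of the design is ordering: the audit entry is written before the tool runs, so a denied or crashing call still leaves a record.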

Mechanism-grounded taxonomy of 40 LLM jailbreak patterns across 10 categories, mapped to the safety-alignment assumptions they subvert.

  • Latest: v4.2.1 "Honest Reframing: Simulation is Prior"
  • Live site: zakky8.github.io/llm-jailbreak-taxonomy
  • pytest 10/10 · Python 3.10 / 3.11 / 3.12 on GitHub Actions
  • 17 citations, every one direct-WebFetch verified · 1 publicly refuted (PoisonedRAG 90%, not 97–99%)
  • Seed-42 bit-identical across runs · verified in CI
  • Adversarial peer review caught four overclaims in v4.2.0 — all retracted in v4.2.1 with side-by-side "claim → why wrong"

→ zakky8/llm-jailbreak-taxonomy
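
One way a "seed-42 bit-identical" check could be expressed is sketched below. It is not the repository's actual test: the bootstrap routine, the made-up `outcomes` data, and the SHA-256 comparison are all assumptions chosen for illustration. The idea is to hash the exact float bits of a seeded run and compare digests between two runs on the same Python version.

```python
import hashlib
import random
import struct


def bootstrap_means(data, n_resamples, seed):
    """Resample with replacement using a private, seeded generator."""
    rng = random.Random(seed)  # no shared global state
    n = len(data)
    means = []
    for _ in range(n_resamples):
        sample = [data[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    return means


def digest(values):
    """Hash the raw IEEE-754 doubles so the comparison is bit-exact."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(packed).hexdigest()


outcomes = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]  # invented data, not from the taxonomy
run_a = digest(bootstrap_means(outcomes, 1000, seed=42))
run_b = digest(bootstrap_means(outcomes, 1000, seed=42))
assert run_a == run_b  # same seed, same bits
```

Hashing the packed doubles rather than a printed representation avoids a check that passes only because of rounding in the text output.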

The Three.js + WebGL performance reference I wish I'd had: 48 validated topic folders, every claim sourced against live repos and browser specs.

  • Live site: zakky8.github.io/web-optimization — ⌘K search, filter, particle background
  • Three.js r184
  • Covers WebGPU, GLSL, R3F, GSAP, mobile, GPGPU particles, Core Web Vitals for 3D
  • Sourced corrections to widely-repeated wrong claims: GSAP plugins free, SMIL not deprecated, PCFSoftShadowMap deprecated, Safari WebXR support

→ zakky8/web-optimization


Stack



AI / Safety · Anthropic SDK · OpenAI SDK · Constitutional AI · RLHF · red-teaming · Wilson CI · McNemar · Cochran Q
Agents · TypeScript strict · ES2024 · pnpm workspaces · MCP · ACP · OAuth gateway · wasmtime sandbox · multi-judge verifier · pre-tool-call governance + audit
3D / Web · Three.js · React Three Fiber · drei · WebGPU · WGSL · GLSL · GSAP · Lenis · Vite · Astro · Next.js 15 · Core Web Vitals


Recent shipments

| Date | Project | Summary |
| --- | --- | --- |
| 2026-06-04 | TENET | Phase 14 vulnerability-test triage: three-agent adversarial review surfaced 23 real issues; HIGH + MEDIUM patched (a2ca9a8) |
| 2026-06-04 | @tenet/acp | Agent Client Protocol v1 adapter: JSON-RPC 2.0 over stdio, NDJSON framed, for Zed / Cursor / Helix interop |
| 2026-06-02 | llm-jailbreak-taxonomy | v4.2.1 released with public retractions of four overclaims |
| 2026-06-02 | tldr-pages/tldr | pyinstaller merged upstream · fc-scan / syft / helmfile PRs approved |

External-PR throughput this year: 12 PRs · ~850 commits
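
For readers unfamiliar with the framing named in the @tenet/acp entry, the sketch below shows JSON-RPC 2.0 messages carried as NDJSON (one JSON document per line). It is written in Python for brevity and is not the adapter's code; the helper names and the `example/ping` method are invented for illustration and are not Agent Client Protocol methods. An in-memory stream stands in for the stdio pipes.

```python
import io
import json


def write_message(stream, message):
    """Serialize one message per line; json.dumps escapes newlines inside strings."""
    stream.write(json.dumps(message, separators=(",", ":")) + "\n")
    stream.flush()


def read_messages(stream):
    """Yield one parsed message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def make_request(request_id, method, params):
    """Build a JSON-RPC 2.0 request object."""
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}


# In-memory stand-in for a stdio pipe between editor and agent.
pipe = io.StringIO()
write_message(pipe, make_request(1, "example/ping", {"text": "line one\nline two"}))
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "result": {"ok": True}})
pipe.seek(0)
messages = list(read_messages(pipe))
```

Because the serializer escapes newlines inside string values, the only raw newlines on the wire are the message delimiters, which is what makes line-based framing safe.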


Working principles

Sources: primary documents first (official docs, browser specs, live repos); peer-reviewed papers second. Affiliate review sites, paraphrased rules, and training-data memory are rejected outright.

Claims: hypothesis until corroborated by ≥2 independent sources. A single source is marked UNVERIFIED. Retractions are documented publicly in the CHANGELOG — never silently rewritten.
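
The claims rule can be sketched as a small status function. This is an illustration only: the `Source` type, the status labels other than UNVERIFIED, and the choice to treat "independent" as "distinct publisher" are assumptions, not the workflow actually used in these repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    publisher: str  # assumed proxy for independence
    url: str


def claim_status(sources):
    """Classify a claim by how many independent sources back it."""
    independent = {s.publisher for s in sources}
    if len(independent) >= 2:
        return "CORROBORATED"
    if len(independent) == 1:
        return "UNVERIFIED"
    return "HYPOTHESIS"


# Two pages from the same publisher still count as a single source.
same_publisher = [
    Source("example-docs", "https://example.org/a"),
    Source("example-docs", "https://example.org/b"),
]
status = claim_status(same_publisher)
```

Counting distinct publishers rather than distinct URLs keeps a claim from being promoted by several pages that all trace back to one origin.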

Simulation vs measurement: hand-tuned parameters that reproduce literature distributions are a prior, never presented as measurement. Live API calls against real models are the only valid measurement.

Review: adversarial review is welcome. The working assumption is that the reviewer is right until proven otherwise. Response order: fix → defend → silently ignore (the last is never acceptable).



