Ben Keilman — Implementation & Enablement

I help non-technical teams adopt new technology — and I build the tools myself.

Business operations analyst with nearly a decade inside a government agency turning dense technical systems into the training, documentation, and rollouts thousands of staff actually use. For the last three years I've built my own AI tools — RAG systems, autonomous agents, automation — on Claude and the Anthropic API. The rare combination: I can run the adoption and be a credible technical partner.

Boston, MA · in-office or hybrid preferred, open to remote · government & regulated-industry native

Adoption is the deliverable. The build makes it credible.

Most implementation work fails at adoption, not installation. I lead the human side (the trainings, guides, and change management that get people to actually use a tool), and because I build the same kinds of systems myself, the demos and playbooks come from someone who has shipped, not someone reading a script.

See exactly how I'd run it in a new role →

Enablement & adoption

Trainings, road shows, job aids, train-the-trainer, and community-of-practice programs that drive new tools into daily use across hundreds of staff.

Build & deliver

Three years building RAG systems, agents, and automation on the Anthropic API: working pilots, plus freelance systems that clients run in production.

Regulated-environment fluency

Nine years inside a government compliance environment — audit, data classification, access control — the instincts responsible-AI deployment demands.

Change management, at real scale

At the Massachusetts Department of Transitional Assistance, getting people to adopt new tools and processes is the core of my role — not a side effect of it.

~800 staff I drive tool & process adoption across

1,200-page Online Guide I develop & maintain

~150-person office onboarded to the internal GenAI platform

9+ yrs in a government compliance environment

Drove department-wide adoption of the Commonwealth's internal GenAI platform (GENIE) from rollout to daily use — sole owner of enablement: guides, group trainings, 1:1s, community of practice.

Advised the platform's developers for six months on the UX and feature changes that move a beta tool toward everyday staff use.

Run trainings and in-person road shows; train local office managers to carry adoption into their own teams (train-the-trainer).

Leading member of the agency's ~30-person AI Community of Practice: shared fixes and prompting tips group-wide, so answers compounded instead of the same questions being asked again.

Tools I've built and put into real use

A mix of internal pilots inside a regulated agency and freelance systems that clients run in production. The proprietary ones are shown through architecture and sanitized artifacts; the open ones link straight to the code.

Open source · live

claude-autofix — a self-healing error system

Python · Anthropic API · Claude Code

When a production process throws an error and no one's awake to fix it, this doesn't just retry — it reads its own source code, reasons about the failure, rewrites the code that caused it, restarts the process, and emails a report. Autonomously, inside hard guardrails (rate limits, a daily attempt cap, deduplication, and escalation to a human). Ran in production for 53 days on a personal data pipeline; this repo is the genericized, open extraction, with a no-API-key demo and real fix reports the agents wrote themselves.

53 active days in production · 42 error types diagnosed · 691 errors → 1 fix (one day) · runnable in 10s, no API key

View on GitHub →
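
A minimal sketch of how guardrails like the ones described above (rate limit, daily attempt cap, deduplication, escalation to a human) could fit together. It is illustrative only and not taken from the claude-autofix repo: the `Guardrails` class, its thresholds, and the `"fix"` / `"skip"` / `"escalate"` return values are all assumptions.

```python
import hashlib
import time
from typing import Optional


class Guardrails:
    """Hypothetical gate that decides whether an automated fix attempt may run."""

    def __init__(self, min_seconds_between_attempts: float = 300.0,
                 max_attempts_per_day: int = 5):
        # Both thresholds are assumed values chosen for illustration.
        self.min_gap = min_seconds_between_attempts
        self.daily_cap = max_attempts_per_day
        self.last_attempt: Optional[float] = None
        self.attempts_today = 0
        self.day = ""
        self.seen = set()  # fingerprints of errors already handled today

    def decide(self, error_text: str, now: float) -> str:
        """Return 'fix', 'skip' (duplicate or rate limited), or 'escalate'."""
        day = time.strftime("%Y-%m-%d", time.gmtime(now))
        if day != self.day:
            # New UTC day: reset the attempt counter and the dedup set.
            self.day, self.attempts_today = day, 0
            self.seen.clear()

        fingerprint = hashlib.sha256(error_text.encode("utf-8")).hexdigest()
        if fingerprint in self.seen:
            return "skip"        # deduplication: this error was already handled
        if self.attempts_today >= self.daily_cap:
            return "escalate"    # cap reached: hand the problem to a human
        if self.last_attempt is not None and now - self.last_attempt < self.min_gap:
            return "skip"        # rate limit: too soon after the last attempt

        self.seen.add(fingerprint)
        self.attempts_today += 1
        self.last_attempt = now
        return "fix"
```

In this sketch a burst of identical errors collapses into a single fix attempt, and anything past the daily cap goes to a person instead of the agent.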

Flagship

Policy Navigator

Python · FastAPI · ChromaDB · Voyage AI · Anthropic API · HTMX

A multi-source RAG over 11,000+ regulatory sections across four authoritative sources — ask a plain-language policy question, get a cited, synthesized answer. Cross-checks sources to surface gaps and drafts new policy content from research.

View on GitHub →

Open source

agent_lab — a fleet of scheduled agents

Python · Claude Code

A framework for running specialized, scheduled Claude Code agents that share a workspace, message each other through files, and accumulate durable knowledge over time — extracted and genericized from a live multi-agent system.

View on GitHub →
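
One way agents can "message each other through files" is a per-agent inbox directory inside the shared workspace. The sketch below assumes that layout; the `inbox/` and `archive/` folders, the JSON message shape, and the `send` / `receive` helpers are hypothetical and are not agent_lab's actual interface.

```python
import json
import time
from pathlib import Path


def send(workspace: Path, sender: str, recipient: str, body: str) -> Path:
    """Drop a JSON message into the recipient's inbox (hypothetical layout)."""
    inbox = workspace / "inbox" / recipient
    inbox.mkdir(parents=True, exist_ok=True)
    message = {"from": sender, "to": recipient, "body": body, "sent_at": time.time()}
    path = inbox / f"{time.time_ns()}-{sender}.json"
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(message), encoding="utf-8")
    tmp.replace(path)  # rename last, so a reader never sees a half-written file
    return path


def receive(workspace: Path, recipient: str) -> list[dict]:
    """Read every pending message, then move each file to an archive folder."""
    inbox = workspace / "inbox" / recipient
    if not inbox.exists():
        return []
    archive = workspace / "archive" / recipient
    archive.mkdir(parents=True, exist_ok=True)
    messages = []
    for path in sorted(inbox.glob("*.json")):
        messages.append(json.loads(path.read_text(encoding="utf-8")))
        path.replace(archive / path.name)  # archived messages stay on disk as a record
    return messages
```

Because messages are plain files, a scheduled agent that was not running when a message was sent simply picks it up on its next run.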

Detail coming

Draft-Document Peer Reviewer

Python · Anthropic API · python-docx

An agentic reviewer that checks draft documents against department standards — accessibility, inclusivity, formatting, completeness — and returns structured feedback for human signoff. A worked before/after of AI removing toil while keeping a human in the loop.

Detail coming

Regulations Comparator

Python · Anthropic API · ChromaDB · Voyage AI

A 4-stage agentic pipeline (classify → triage → deep investigation → cross-check) detecting coverage gaps across 480+ regulatory sections before auditors find them — a generalizable pattern for comparing any two compliance bodies.
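
The four stages (classify → triage → deep investigation → cross-check) can be sketched as a simple staged loop. This is an assumed shape rather than the project's code: the `Finding` record and `run_pipeline` are hypothetical, and each stage is passed in as a plain function where a real system would presumably call a model or a vector search.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    """Hypothetical record carried through the stages for one section."""
    section_id: str
    text: str
    topic: str = ""
    needs_review: bool = False
    notes: list = field(default_factory=list)
    confirmed_gap: bool = False


def run_pipeline(sections, classify, triage, investigate, cross_check):
    """Run each section through the four stages, stopping early when triage clears it."""
    gaps = []
    for section_id, text in sections.items():
        finding = Finding(section_id, text)
        finding.topic = classify(finding)            # stage 1: label the section
        finding.needs_review = triage(finding)       # stage 2: cheap filter
        if not finding.needs_review:
            continue                                 # skip the expensive stages
        finding.notes = investigate(finding)         # stage 3: deep investigation
        finding.confirmed_gap = cross_check(finding) # stage 4: verify the result
        if finding.confirmed_gap:
            gaps.append(finding)
    return gaps


# Toy stand-ins for the stages, comparing body A against topics covered by body B.
body_a = {
    "A-1": "Records retention for case files",
    "A-2": "Appeal deadlines for applicants",
}
body_b_topics = {"appeal"}

gaps = run_pipeline(
    body_a,
    classify=lambda f: "retention" if "retention" in f.text.lower() else "appeal",
    triage=lambda f: f.topic not in body_b_topics,
    investigate=lambda f: [f"No section in body B covers '{f.topic}'"],
    cross_check=lambda f: bool(f.notes),
)
```

Putting the cheap triage step before the deep investigation keeps the expensive work for the sections that look like possible gaps.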

What I bring

Enablement & communication

Change & adoption management · training-program and curriculum design · large-scale documentation and playbooks · train-the-trainer and power-user programs · onboarding and stakeholder enablement · plain-language translation across audiences · accessibility (508 / WCAG) · UAT · presenting to leadership.

AI / LLM (hands-on practitioner)

Claude / Claude Code (3+ yrs) and the Anthropic API · RAG · agentic and multi-agent workflows · prompt engineering · tool-calling orchestration · MCP · embeddings (Voyage AI) · agent safety guardrails · model and vendor evaluation.

Responsible AI & governance

Responsible-AI guidance for non-technical staff · government compliance environment (9+ yrs) · data classification, audit, and access-control discipline · proactive gap detection and audit-readiness.

Engineering & platform fluency

Python · FastAPI · SQL / SQLite · ChromaDB · Git · HTMX · REST API design · document automation (python-docx) · command line — enough depth for credible conversations with engineers.

Let's talk