Library-First Engineering

Claude-LFE

Making AI reliable, session after session.

by design

Stylianos Chiotis

AN ENGINEER WHO LOVES TO LEARN, EXPERIMENT AND DOCUMENT.

Thirteen years of one rule

Reliability, pointed at a wilder input.

Marine engine rooms

"mostly works" sinks ships

Biotech & genetics

regulated · staged pipelines

Agentic AI

a wilder kind of input

zero-error tolerance — the gauge that rides every station

A full-time dad who can't babysit an AI.

it needs to behave
when I'm not watching

Session one is magic. By session five…

The beast.

Survivable in a weekend hack. Fatal in a product you ship.

Loses your intent — drifts further every session

Re-litigates decisions it already made

Invents logic you never asked for

Sprawls into spaghetti as it grows

One wrong move quietly corrupts everything downstream

POWERFUL · FAST ...AND WILD

What everyone's getting wrong

The bottleneck isn't capability. It's trust.

Every AI coding tool is racing on speed. Nobody's optimizing for the thing that actually blocks shipping agent-built software — trusting it to behave when you're not watching.

And trust isn't asserted — it's measured.

Speed: where everyone's racing

MAXED

Trust: the actual bottleneck · wide open

What it's built on

Three commitments.

01 — JUDGMENT

Thinking

in the Human

the human decides

02 — VOLUME

Processing

in the AI

the AI executes

03 — SOURCE OF TRUTH

Truth

in the Documentation

the single source of truth

If the code disagrees with the docs — the code is what's broken.

It takes three

The layers.

Protocol: routes the work through specialist roles

the route

Enforcement: physically constrains what can go wrong

the leash

Provenance: the memory, and the truth it's anchored to

memory + truth

quickly, through each →

Four specialist roles

Work flows down a line.

Architect (plans) → Builder (builds) → Inspector (verifies) → Archivist (records)

↺ max 2 — then escalate

big work → small, demoable slices

A process that can't converge
shouldn't be allowed to spin.
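A minimal sketch of the bounded rework loop, assuming "max 2" means two rework passes after the first attempt (that reading, and the names `run_slice`, `build` and `inspect`, are illustrative assumptions, not part of Claude-LFE):

```python
from typing import Callable

MAX_REWORK_LOOPS = 2  # from the slide: "max 2, then escalate"


def run_slice(build: Callable[[int], str],
              inspect: Callable[[str], bool]) -> str:
    """Build and inspect, allowing at most MAX_REWORK_LOOPS rework passes."""
    # One initial attempt plus the permitted rework passes.
    for attempt in range(1 + MAX_REWORK_LOOPS):
        artifact = build(attempt)
        if inspect(artifact):
            return f"passed on attempt {attempt + 1}"
    # The loop did not converge: stop spinning and hand over to the human.
    return "escalated to human"
```

The point of the hard cap is that a failing slice ends in a human decision rather than an open-ended retry.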

23 SKILLS · 4 ROLES · 2 EXCEPTIONS · 1 BRAIN · 1 LINE

The whole machine.

🫵 The Brain on the wheel throughout — approve · redirect · break-glass anywhere · persistent oversight

Architect: plans the work · owns both gates

grill → to-prd → to-issues → [mandatory sign-off] → slice → architect → [mandatory sign-off] → plan → plan-critique

Builder: builds the slice

builder → tdd

Inspector: verifies the work

zoom-out → inspector → verify · security · perf · complexity · dep · mutation → diagnose (on fail → builder)

Archivist: records the truth

archivist → cleanup

↺ max 2 → escalate

.plans/ each step writes a file the next step reads

Exceptions — bypass the full line

Scout

branch off the complexity gate: scout → Archivist

≤3 files · no structural change

LFE-FORCE

break-glass from anywhere: patch now → log Protocol Debt

next session reconciles the debt

Entry: lfe-boot → complexity gate (full pipeline vs. minor fix) · plus a scheduled hygiene sweep every 5 sessions · live in VS Code
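The complexity gate can be sketched as a small routing function. This is only an illustration under assumptions: the `ChangeRequest` fields and the returned route strings are hypothetical, and only the "≤3 files · no structural change" threshold comes from the slide.

```python
from dataclasses import dataclass

SCOUT_MAX_FILES = 3  # from the slide: "≤3 files"


@dataclass(frozen=True)
class ChangeRequest:
    files_touched: int
    structural_change: bool


def complexity_gate(change: ChangeRequest) -> str:
    """Route a change to the Scout exception or to the full line."""
    if change.files_touched <= SCOUT_MAX_FILES and not change.structural_change:
        return "scout -> archivist"
    return "architect -> builder -> inspector -> archivist"
```

Both conditions must hold for the shortcut: a one-file change that alters structure still goes down the full line.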

The part most frameworks don't have

A request is a suggestion. A rail is a wall.

Constrains the AI: can't write outside its lane
Constrains you: can't run steps out of order

The AI · write → execution ← step 4 · The Operator (that's you)
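One way to picture the "can't write outside its lane" rail is a path check that runs before any write is allowed. The lane map and the `may_write` helper below are hypothetical, chosen only to illustrate the idea; the deck does not show how Claude-LFE implements its rails.

```python
from pathlib import PurePosixPath

# Hypothetical lane map: the directories each role is allowed to write to.
LANES = {
    "builder": [PurePosixPath("src"), PurePosixPath("tests")],
    "archivist": [PurePosixPath("docs")],
}


def may_write(role: str, target: str) -> bool:
    """Return True only if target sits inside one of the role's lanes."""
    path = PurePosixPath(target)
    # Reject absolute paths and ".." segments so a lane cannot be escaped.
    if path.is_absolute() or ".." in path.parts:
        return False
    return any(path.is_relative_to(lane) for lane in LANES.get(role, []))
```

The check is deterministic code rather than a prompt instruction, which is the difference the slide draws between a request and a wall.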

And the reasoning checks aren't trusted — they're graded. Planted-defect fixtures · measured catch-rate · gated against regression.

~1,100 tests · guarding the rails, the prompt-discipline, and the checks' catch-rate
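Grading a check against planted defects comes down to a ratio and a threshold. A minimal sketch, assuming each fixture is a snippet with one known defect and the check returns `True` when it flags the snippet (function names and the baseline comparison are illustrative assumptions):

```python
from typing import Callable, Sequence


def catch_rate(check: Callable[[str], bool], planted: Sequence[str]) -> float:
    """Fraction of planted-defect fixtures that the check flags."""
    if not planted:
        raise ValueError("need at least one planted-defect fixture")
    caught = sum(1 for fixture in planted if check(fixture))
    return caught / len(planted)


def regression_gate(current: float, baseline: float) -> bool:
    """Pass only if the catch-rate has not dropped below the recorded baseline."""
    return current >= baseline
```

A check that used to catch three of four planted defects and now catches two would fail the gate, even though it still "mostly works".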

A structured library

The truth has an address.

Reception — front-desk map

Department · A

Department · B

Department · C

Flight recorder

.plans/ — transaction log

each step writes · the next step reads

✓ step 01 — architect.plan

✓ step 02 — builder.slice

step 03 — inspector.verify

Crash mid-flight? The next session reads the recorder and continues exactly where it stopped.

pipeline_status.md: the live cursor

knows the step, the persona, the mission, and what's next — this is what the status line reads.
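The flight-recorder idea, where each step writes a file and a restarted session resumes from the first missing one, can be sketched as follows. The step file names are modelled on the slide's examples but are assumptions, as are the helper names; the real `.plans/` layout may differ.

```python
from pathlib import Path
from typing import Optional

# Hypothetical step order and file names, modelled on the slide's examples.
STEPS = ["01-architect.plan", "02-builder.slice", "03-inspector.verify"]


def next_step(plans_dir: Path) -> Optional[str]:
    """Return the first step with no file on disk, or None when all are done."""
    for step in STEPS:
        if not (plans_dir / step).exists():
            return step
    return None


def complete_step(plans_dir: Path, step: str, output: str) -> None:
    """Record a finished step by writing the file the next step will read."""
    plans_dir.mkdir(parents=True, exist_ok=True)
    (plans_dir / step).write_text(output, encoding="utf-8")
```

Because progress lives on disk rather than in the conversation, a crashed session loses nothing that was already recorded.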

retention policy · HOT/COLD AUTO-SWEEP EVERY 5 SESSIONS — keeps the library lean, by design

Not one feature — a methodology

The cheaper it is, the more often it runs.

Not one feature — a whole engineering methodology. How it verifies and how it remembers, governed by one law.

Cheap · continuous ← one law · cost ↔ frequency → Expensive · rare

Verify: the reliability discipline · runs the harnesses

~1,100 tests

every run

2 commit gates

every commit

Hygiene sweep

every 5 sessions

AI eval harness

every ~15 sessions
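The session-based cadences above can be expressed as a small schedule. This is a sketch under assumptions: the slide gives the eval-harness interval only approximately ("~15"), so the exact value of 15 and the modulo rule are illustrative. Tests and commit gates are left out because they run on every run and every commit rather than on a session count.

```python
# Cadences in sessions, taken from the slide; 15 is approximate there ("~15").
CADENCE = {
    "hygiene sweep": 5,
    "AI eval harness": 15,
}


def due_checks(session_number: int) -> list[str]:
    """List the periodic checks that fall due at the given session."""
    if session_number < 1:
        raise ValueError("sessions are numbered from 1")
    return [name for name, every in CADENCE.items()
            if session_number % every == 0]
```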

Remember: the memory discipline · runs the library

.plans/ scratch

1 mission

CHANGELOG

7 milestones

Hot tiers

15 sessions

Cold archive

forever · fully indexed
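A hot/cold retention sweep can be sketched as a partition by age in sessions. The 15-session hot window comes from the slide; treating a record as cold once it is 15 or more sessions old, and the function names, are assumptions made for illustration.

```python
HOT_WINDOW_SESSIONS = 15  # from the slide: "Hot tiers · 15 sessions"


def tier(record_session: int, current_session: int) -> str:
    """Classify a record as hot or cold by its age in sessions."""
    age = current_session - record_session
    if age < 0:
        raise ValueError("record is newer than the current session")
    return "hot" if age < HOT_WINDOW_SESSIONS else "cold"


def sweep(records: dict[str, int], current_session: int) -> dict[str, list[str]]:
    """Partition record names into hot and cold; nothing is deleted."""
    result: dict[str, list[str]] = {"hot": [], "cold": []}
    for name, session in records.items():
        result[tier(session, current_session)].append(name)
    return result
```

The sweep only moves records between tiers, which matches the slide's claim that the cold archive is kept forever.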

My two engineering worlds, running as one system — verification and memory, on one law.

Before anything runs

Day 0 — it interviews you.

You

domain language · rules

› what does "batch" mean here? → captured

› what must never happen? → captured

› who signs off? → captured

The Library

source of truth

Every check is itself an AI — just as fallible

The chain ends
with you.

fallible checks · then something that isn't a language model

AI check

YOU: the owner

⚠ EMERGENCY OVERRIDE

Break the glass.

Kill the whole pipeline — but the framework forces the next session to clean up the debt before moving on.
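The break-glass rule, patch now and reconcile before anything else next session, can be sketched with a small ledger. `DebtLedger` and `boot_session` are hypothetical names and an in-memory stand-in; the deck does not describe how Claude-LFE stores Protocol Debt.

```python
from dataclasses import dataclass, field


@dataclass
class DebtLedger:
    """Hypothetical in-memory stand-in for a Protocol Debt log."""
    open_items: list[str] = field(default_factory=list)

    def force_patch(self, description: str) -> None:
        # Break-glass: the patch goes in now, the skipped process is recorded.
        self.open_items.append(description)

    def reconcile(self, description: str) -> None:
        # Raises ValueError if the item was never logged.
        self.open_items.remove(description)


def boot_session(ledger: DebtLedger) -> str:
    """Refuse new work while any debt is still open."""
    if ledger.open_items:
        return f"blocked: reconcile {len(ledger.open_items)} debt item(s) first"
    return "ready"
```

The override stays available at any time, but its cost is made visible and is collected at the next boot.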

Not a reviewer bolted on at the end. You're the owner of the beast.

Not the only way — a convergent one

A data pipeline for LLMs.

Mechanical & Reliability Engineering

marine · where I started

Poka-yoke · FMEA · Defense-in-depth · Stop-the-line (Andon) · RCM / scheduled maintenance

Data & Integration Engineering

what I do now

Pipeline orchestration · Staged steps / child pipelines · Quality gates between stages · Retention / lifecycle policy · Idempotency · Transaction log (WAL) / checkpointing

LFE

where the two worlds meet

a data pipeline for LLMs — reliable at every step.

+ sharpened by Matt Pocock — shaped several skills · Bryan Finster — audited end-to-end, sharpened verification

A history, not a single sitting

Earned, not invented.

Spreadsheets → code

Excel · VBA · Power Query/BI · Python

start

my own financial product · releasing soon

building it with AI broke the old rules — and forced LFE

then

Pipeline discipline

biotech data-engineering, applied to AI

→

LFE v1

shaped by Matt Pocock's skills · shared online

→

The audit

Bryan Finster reviewed it end-to-end · verification sharpened

→

Claude-LFE

runtime guardrails · built natively on Claude · open-sourced

now

Built using itself: every change ran through the exact pipeline — recorded in the commits, decisions, and tests.

→ it's all in the repo: git tags · ADRs · tests

No pitch — the limits, plainly

What LFE is not.

✕ Not a speed boost — it's slower up front, on purpose

✕ Not autonomy — the human stays on the wheel

✕ Not a bigger model — it's discipline around the one you have

✕ Not magic — it's overhead you choose to pay

If I don't name the cost, I'm pitching. So there it is.

Let me be straight about the cost

The trade-off.

Speed

up-front

Reliability

across dozens of sessions

Throwaway weekend prototype? Overkill — don't use it.

Reliability has a price. I'm telling you the price.

Screen capture

Watch me try to break it.

/lfe-boot — session active · PIPELINE ▸ BUILD ▸ SLICE 2/4

.plans/

✓ 01-architect.plan

✓ 02-builder.slice

▸ 03-inspector.run

$ /lfe-boot # status line knows exactly where it is

$ run inspector.verify # out of order

⨯ REFUSED — sequence rail. Not a warning. A wall.

1 boot

2 declare

3 refuse ◂ wow

4 inspect

5 archive

6 kill ▸ resume
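The refusal shown in the demo can be pictured as a cursor that only ever advances one step at a time. This is a sketch, not the framework's implementation: the step names follow the four roles from the deck, while `Pipeline`, `SequenceError` and the message format are assumptions.

```python
class SequenceError(RuntimeError):
    """Raised when a step is requested out of order."""


class Pipeline:
    # Hypothetical step names, following the four roles in the deck.
    ORDER = ("architect", "builder", "inspector", "archivist")

    def __init__(self) -> None:
        self.cursor = 0  # index of the next step allowed to run

    def run(self, step: str) -> str:
        expected = self.ORDER[self.cursor] if self.cursor < len(self.ORDER) else None
        if step != expected:
            # A refusal, not a warning: nothing runs and the cursor stays put.
            raise SequenceError(f"REFUSED: expected {expected!r}, got {step!r}")
        self.cursor += 1
        return f"{step} done"
```

Raising instead of logging is the design choice that matters here: an out-of-order request leaves no partial state behind.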

Foresight, not a promise

What's next — once it earns it.

The framework holds today. Where it goes tomorrow, I'm validating — not promising.

dashed = explored, not built

today
it holds

exploring

Fully-external orchestration

the pipeline run by an engine — not the model's compliance

validating

A Python SDK

LFE as a callable library

investigating

A data-factory engine

n8n · LFE as a literal pipeline

Knowing where it could go is the job.
Chasing it before it's earned is the trap.

And whatever wins — the framework will build it. The same way it built itself.

Stylianos Chiotis

Reshapes systems · Questions the defaults · Writes the playbook

Marine Engineering → Biotech / Genetics → Agentic AI

Bringing the zero-error reliability of marine engine rooms to enterprise data architecture

Reliability

FMEA & ISO 31000 (Risk) · RCM (Reliability Analysis) · Lean Six Sigma

Data

Microsoft | DP600 - PL300 - AZ900 - MO201 · Databricks D.E. · C#13 Development & .NET

+ more on LinkedIn

medium @st.chiotis94 · subscribe to read more
linkedin /stylianos-chiotis · connect and keep in touch
github /StChiotis · follow for what comes next

Reliability is the destination.
Efficiency is how we walk each step.

clone it · try to break it · make it stronger

★ Star & share the repo github.com/StChiotis/claude-lfe

Claude-LFE is open source — built to serve all of us.

Library-first isn't just for AI.
