Builder Log Decision Notes January 28, 2025

What AI Agents Should Remember — and What They Should Never Store

Credential discipline and memory architecture for a self-hosted AI agent. The practical rules that prevent security nightmares and memory drift.

Context

When you run an AI agent that actually does things — publishes posts, connects to servers, calls APIs — you immediately hit a problem that nobody in the “prompt engineering” discourse talks about: where do the credentials live, how does the agent find them, and how does it remember what it knows across sessions?

This isn’t a theoretical problem. It’s the first practical problem. And the answer matters for security as much as it does for functionality.

Here’s the system I landed on after building a working AI operator that manages a real workflow stack.

What We Built

Credentials: macOS Keychain as the Credential Store

Every credential the agent needs (API tokens, service passwords) lives in the macOS Keychain under a consistent naming convention. The agent retrieves each one at runtime using a single shell command:

```bash
security find-generic-password -s "service-name" -w
```

Nothing is hardcoded. Nothing is in config files. Nothing is in environment variables that could leak into logs. The agent knows the naming convention and looks up what it needs when it needs it.
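The lookup above can be wrapped in a small helper so that a missing entry fails loudly instead of quietly returning an empty string. This is a minimal sketch, not the exact code from my setup: the function name `get_credential` is hypothetical, and it assumes the macOS `security` CLI is on the PATH.

```bash
# Hypothetical helper: print the secret stored under a Keychain service name.
# Returns non-zero and prints nothing to stdout if the entry is missing.
get_credential() {
  name="$1"
  if [ -z "$name" ]; then
    echo "get_credential: service name required" >&2
    return 2
  fi
  # -s selects the service name, -w prints only the password.
  if ! secret="$(security find-generic-password -s "$name" -w 2>/dev/null)"; then
    echo "get_credential: no Keychain entry for '$name'" >&2
    return 1
  fi
  printf '%s' "$secret"
}
```

A call such as `TOKEN="$(get_credential hosting-api-token)"` keeps the value in a shell variable for the life of the command, not in a file.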

The naming convention matters. Consistent, predictable names mean the agent can find credentials without being told where to look:

```
hosting-ssh-password
hosting-api-token
cloud-storage-app-password
blog-wp-app-password
```

Service → credential type. Every time. No exceptions.
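A convention only holds if something checks it. The sketch below builds a name from its two parts and validates a name before it is used. Both function names are hypothetical, and the pattern (lowercase words joined by hyphens, ending in `password` or `token`) is an assumption inferred from the four examples above, so extend it if you store other credential types.

```bash
# Build a credential name from a service and a credential type.
credential_name() {
  printf '%s-%s\n' "$1" "$2"
}

# Check a name against the assumed pattern: lowercase words joined by
# hyphens, ending in "password" or "token".
is_valid_credential_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]+(-[a-z0-9]+)*-(password|token)$'
}
```

For example, `credential_name hosting api-token` prints `hosting-api-token`, and `is_valid_credential_name "Hosting_SSH"` returns non-zero.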

Long-Term Memory: Structured Markdown Files

AI agents don’t have persistent memory between sessions by default. Every session starts fresh. If you don’t build memory infrastructure, you rebuild context from scratch every time — which is expensive, slow, and error-prone.

The solution: treat markdown files as the agent’s brain. A curated `MEMORY.md` file holds everything the agent needs to know across sessions — infrastructure config, client details, established workflows, standing rules. The agent reads it at the start of each session.

Key design principle: `MEMORY.md` stores references to credentials, not credentials themselves. It says “SSH password in Keychain as `hosting-ssh-password`” — not the password. The actual secret never leaves the Keychain.
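Because references follow a fixed phrasing, they can be pulled out of the file mechanically. The sketch below lists every Keychain name a memory file points to. It assumes references are always written as "Keychain as" followed by the name in backticks, as in the example sentence above; the function name is hypothetical.

```bash
# List the Keychain names a memory file refers to, one per line, deduplicated.
# Assumes each reference is written as: Keychain as `some-name`
list_credential_refs() {
  grep -Eo 'Keychain as `[a-z0-9-]+`' "$1" \
    | sed -e 's/^Keychain as `//' -e 's/`$//' \
    | sort -u
}
```

The output is a list of names only, which is safe to log or review. The secrets themselves are never read.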

Daily notes files capture what happened today. Long-term memory captures what’s always true. They serve different purposes and shouldn’t be conflated.
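One way to keep the two apart is to give daily notes their own dated files. This is a sketch under assumed names: the `MEMORY_ROOT` directory, the `daily/` subfolder, and the `note_today` function are all hypothetical, not a layout the setup above prescribes.

```bash
# Hypothetical layout: one notes file per day under $MEMORY_ROOT/daily/.
MEMORY_ROOT="${MEMORY_ROOT:-$HOME/agent-memory}"

# Append a timestamped line to today's notes file, creating it if needed.
note_today() {
  day_file="$MEMORY_ROOT/daily/$(date +%Y-%m-%d).md"
  mkdir -p "$MEMORY_ROOT/daily"
  printf '%s %s %s\n' "-" "$(date +%H:%M)" "$1" >> "$day_file"
}
```

Anything in a daily file that turns out to be permanently true gets promoted to `MEMORY.md` by hand. The daily file itself stays a log.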

Context Isolation by Domain

The agent operates across multiple domains — personal work, client work, system operations. Each domain has its own memory namespace. Project A’s data never appears in Project B’s context. Sensitive internal context never surfaces in group chats or shared sessions.
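Namespace separation can be enforced with an allowlist: the agent asks for a domain by name and gets back exactly one memory file, or an error. The sketch below assumes a directory per domain. The domain names, the `MEMORY_ROOT` path, and the function name are hypothetical.

```bash
# Hypothetical root directory and domain list; adjust to your workstreams.
MEMORY_ROOT="${MEMORY_ROOT:-$HOME/agent-memory}"
ALLOWED_DOMAINS="personal client-a client-b ops"

# Print the MEMORY.md path for one domain, or fail for anything not listed.
memory_file_for() {
  for d in $ALLOWED_DOMAINS; do
    if [ "$d" = "$1" ]; then
      printf '%s/%s/MEMORY.md\n' "$MEMORY_ROOT" "$1"
      return 0
    fi
  done
  echo "memory_file_for: unknown domain '$1'" >&2
  return 1
}
```

Matching against a fixed list, instead of building a path from whatever string is passed in, also rejects inputs like `../ops` that would escape the namespace.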

This isn’t just good practice — it’s the only way to run an AI agent across multiple workstreams without creating a liability.

What Broke (or Almost Broke)

A Credential Ended Up in Chat

Early on, before the Keychain discipline was fully established, a token ended up in a chat message. It was flagged immediately and rotated within minutes. But it happened.

The fix wasn’t just rotating the token — it was establishing a hard rule: credentials never go in chat, never go in files, never go in prompts. Keychain only. The rule exists because the failure happened, not because we anticipated it.
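A rule is easier to keep with a backstop. The sketch below is a filter that masks token-like strings in any text piped through it before that text reaches a chat or a log. The heuristic (any run of 32 or more letters, digits, underscores, or hyphens) is an assumption and will miss short passwords, so it supports the Keychain-only rule without replacing it.

```bash
# Filter stdin, replacing token-like strings with a placeholder.
# Heuristic (an assumption): any run of 32+ letters, digits, "_" or "-".
redact_secrets() {
  sed -E 's/[A-Za-z0-9_-]{32,}/[REDACTED]/g'
}
```

Usage is a plain pipe, for example `some_command 2>&1 | redact_secrets`, where `some_command` stands for whatever produces the output.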

Lesson: Most people don’t build the discipline until they feel the failure. Build the rule before you feel it.

The Agent Forgot Things Between Sessions

Early sessions required re-explaining context that had already been established. Which PHP version to use. Where the WordPress install lives. What the SSH port is. This is wasted time and creates inconsistency — the agent might make different assumptions in different sessions.

The fix: every time something non-obvious is discovered or decided, it goes into `MEMORY.md` immediately. Not “I’ll remember this.” Files survive restarts. Memory doesn’t.
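Writing it down should cost one command. Here is a minimal sketch of a helper that appends a fact to `MEMORY.md` and skips exact duplicates. The `MEMORY_FILE` path and the `remember` function name are hypothetical.

```bash
# Hypothetical location of the long-term memory file.
MEMORY_FILE="${MEMORY_FILE:-$HOME/agent-memory/MEMORY.md}"

# Append a fact as a list item unless that exact line is already present.
remember() {
  entry="- $1"
  mkdir -p "$(dirname "$MEMORY_FILE")"
  touch "$MEMORY_FILE"
  # -F: fixed string, -x: whole-line match, -q: quiet, -e: pattern follows.
  if ! grep -Fxq -e "$entry" "$MEMORY_FILE"; then
    printf '%s\n' "$entry" >> "$MEMORY_FILE"
  fi
}
```

The duplicate check keeps the file from growing every time the same fact is rediscovered, which makes it easier to audit by hand.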

Lesson: “Write it down” is the most important operational rule for running AI agents. If it’s not in a file, it doesn’t exist in the next session.

Memory Files Got Polluted with Sensitive Data

The flip side of “write everything down” is writing down things you shouldn’t. Full email bodies, internal configuration details, client data specifics — these crept into early memory files.

The fix: a standing rule that memory files contain summaries and references only. Not raw content. Not credentials. Not PHI (protected health information). If you need to recall something sensitive, store a pointer to where it lives securely, not the thing itself.
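A content policy can be checked mechanically, at least for the obvious cases. The sketch below flags lines in a memory file that look like raw secrets and reports line numbers only, so the check does not repeat what it found. The function name and both patterns are assumptions, and it cannot detect email bodies or client data, which still need a human review.

```bash
# Flag lines in a memory file that look like raw secrets.
# Both patterns are assumed heuristics, not a complete detector:
#   1. "password", "token" or "secret", then ":" or "=", then a value
#   2. any run of 32+ letters, digits, "_" or "-"
# Returns 0 when the file is clean, 1 when something was flagged.
lint_memory_file() {
  pattern='(password|token|secret)[[:space:]]*[:=][[:space:]]*[^[:space:]]+|[A-Za-z0-9_-]{32,}'
  if grep -Eiq "$pattern" "$1"; then
    # Report line numbers only, never the matching text.
    grep -Ein "$pattern" "$1" | cut -d: -f1 | while read -r n; do
      echo "$1:$n: possible secret" >&2
    done
    return 1
  fi
  return 0
}
```

A reference such as "SSH password in Keychain as `hosting-ssh-password`" passes, while a line like `api_token = abc123` is flagged.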

Lesson: Memory infrastructure needs its own access controls and content policies, just as the primary systems it supports do.

What We Learned

The credential problem is solved by Keychain + naming conventions. It doesn’t require a secrets manager or a vault. It requires discipline and consistency. The agent retrieves at runtime, never stores inline.
The memory problem is solved by structured markdown. Not a database, not a vector store (though those have their place for search at scale), just well-organized files with clear purposes. Simple enough to read, edit, and audit by hand.
The isolation problem is solved by explicit domain separation. Define the boundaries before you need them. Retrofitting isolation is much harder than building it in from the start.
The biggest risk isn’t technical; it’s drift. Credentials that don’t get rotated. Memory files that grow stale. Naming conventions that gradually get abandoned. Discipline over time is harder than the initial setup.
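Stale memory files are the easiest form of drift to surface automatically. The sketch below lists markdown files that have not been modified in more than a given number of days. The function name is hypothetical and the 30-day default is an arbitrary assumption.

```bash
# List .md files under a directory not modified in more than N days.
# N defaults to 30, which is an arbitrary threshold.
stale_memory_files() {
  find "$1" -type f -name '*.md' -mtime +"${2:-30}"
}
```

Run on a schedule, an empty result means nothing needs attention. Any path it prints is a prompt to reread that file and confirm it is still true.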

What We Changed

Established a strict Keychain naming convention — enforced across all new credentials
Created a standing rule: anything discovered or decided goes into `MEMORY.md` before the session ends
Added content policies to memory files: summaries and references only, no raw sensitive data
Separated memory namespaces by domain — each workstream isolated from the others

Takeaways

Keychain + consistent naming = credential problem solved. Service name + credential type. Every time.
Files survive restarts. Memory doesn’t. Write it down.
Memory files store references, not secrets. Where it lives, not what it is.
Isolation by domain is a design decision, not a feature you add later.
The biggest operational risk is drift — conventions abandoned, files gone stale, rules forgotten. Build in periodic review.
Build the discipline before the failure, not after. The credential-in-chat moment is preventable. It just requires treating it as a real risk before it happens.

The larger lesson: Credentials should be references, not values. Memory should be structured, not accumulated. An AI agent that handles both with discipline can operate autonomously without becoming a security or operational liability. One that handles either carelessly becomes one quickly.
