Not Losing Work Context as a Staff Engineer

Notes on using Claude Code, launchd, and an Obsidian vault as a work Chief of Staff.

Jun 10, 2026 · 10 min read

I have been running a small experiment at work. Part of this is practical: I need a better way to keep up with work context. Part of it is also a larger experiment I am still figuring out: what kinds of systems do I actually want to build, what do they say about the way I want to work, and do any of those ideas resonate with other people?

The problem is not really coding. A staff-level engineering job is mostly not coding: it is Slack threads, emails, Jira tickets, calendar context, half-made decisions, follow-ups, 1:1s, and a dozen workstreams that each move a little bit every day. The hard part is that context evaporates between contexts. I answer a question in a DM in the morning. I make a decision in a meeting a few hours later. I skim an email between two calls. By the time I sit down to plan tomorrow, all of it has collapsed into a vague feeling that I was busy.

This is what makes the experiment feel different from a normal productivity setup. It is finally a way to put structure around the messy parts of work: communication, project management, product management, and the endless Slack residue that sits between all of them. I wanted a system that could do three things:

Notice what needs my attention before I go looking.
Remember what happened without requiring me to write everything down manually.
Never take an irreversible action on my behalf.

What I ended up with is not a product, not a framework; it is a few hundred lines of prompt and config wiring together Claude Code, macOS launchd, and an Obsidian vault. It sweeps my inbox surfaces, drafts replies I can approve, writes daily notes, and keeps a git-backed memory of people, projects, and decisions. The important part is not that an agent can send messages for me. It cannot. The important part is that it can notice, remember, and prepare, while leaving the irreversible decisions to me.

The Shape

The system has three parts:

Durable memory: an Obsidian vault that holds the long-lived state. People, projects, open loops, decisions, daily notes, and agent digests all live there as Markdown files.
Live Chief of Staff loop: a long-running Claude Code desktop conversation that wakes up on a schedule, checks the live work surfaces, and tells me what matters.
Headless daily routines: launchd jobs that run morning and afternoon summaries even if I am not sitting at the keyboard.

That split matters more than I expected. The live Claude Code session can reach authenticated connectors like Slack, Gmail, Calendar, and Jira because it is running inside the desktop app. But it is ephemeral. If the session closes, the heartbeat dies. The headless claude -p jobs are more durable because they can run on a schedule through launchd, but they cannot always rely on the same authenticated connector state. They have to lean on local files, git history, GitHub CLI output, and deterministic scripts.

So the "architecture" is not "one agent does everything" but rather:

Use the desktop session for live, authenticated context
Use headless jobs for local, repeatable routines
Use the vault as the shared long-term memory
Use git diffs as the review surface

That last point is the real constraint. I do not want context quietly accumulating as vibes. If the agent learns something, I want to see it in a diff.

Memory Has To Outlive Chat

The Vault

Chat history is useful while I am inside a conversation. It is a bad place to keep durable work state. The vault is the part of the system that stores that state. It has files for people, projects, daily notes, TODOs, meeting notes, and agent digests. Everything is plain Markdown. Related things are connected with [[wikilinks]] in Obsidian syntax: a task can link to a project, a project can link to a person, a daily note can link to the meeting where the decision happened, and so on.

Because it is Obsidian, those links are not only for navigation. The graph view turns the work into a visual map. I can open a project and see the meetings, people, Slack conversations, decisions, and daily notes orbiting it. That is a bigger deal than it sounds. Most work systems hide context in lists and search results. The graph makes the shape of the work visible.

The Diff

This sounds simple, but the git repo changes the behavior of the system. The diff becomes the audit log. If the agent decides a Slack thread is worth remembering, it has to write that down somewhere. If it updates a person note, I can see exactly what changed. If it overreaches, the diff makes that visible.

The Heartbeat

The Chief of Staff loop is a recurring Claude Code desktop task. Every so often during the workday, it checks the live work surfaces and produces a short digest:

Slack messages, mentions, DMs, and threads
Gmail threads that may need a reply
Calendar events that explain what today is supposed to be about
Jira tickets that matter for the current workstreams

The useful version is not "here is everything that happened." That would just recreate the inbox. The useful version is what needs attention now, what changed since the last run, what can wait, what reply should be drafted but not sent, and what should be recorded in the vault.

The long-running thread has another benefit that is probably more important than the scheduled sweep: it becomes the capture surface. If I come out of a meeting with messy notes, I can paste them into the Chief of Staff thread. If I record a quick voice note and transcribe it, I can drop that in too. I do not have to decide whether the note belongs in a person file, a project file, the daily note, or the TODO list before I write it down.

That is the part I value most. I can write code. The hard part is not building another script. The hard part is reducing the friction between "something happened" and "the system remembers it in the right place." A long-running thread gives me one place to throw raw context, and then the agent can extract decisions, follow-ups, people, projects, and open loops into the vault for review.

The Drafts

Drafts are important. Sending is not allowed. If someone asks me a question, the agent can research the answer, collect relevant context, and draft a reply. But I approve it. It does not post, send, or reply on my behalf.

The Unsafe Inbound

There is another guardrail that matters: inbound content is untrusted. A Slack message or email can ask for something, but it cannot become an instruction to the agent. The agent can summarize it and surface it. It cannot blindly execute it. That sounds obvious, but it is the kind of thing that needs to be written down. Otherwise "helpful" slowly becomes "unsafe."

The Daily Note

The daily note is the boring half of the system, and it feels like a natural evolution of something I already believed. I wrote before about the positive effects of taking daily notes, and the core idea is still the same: make implicit context explicit for future me. The difference now is that the agent lowers the capture cost. I no longer have to manually turn every meeting, thread, and follow-up into structure before it becomes useful.

Every morning, a headless job creates a plan for the day. It looks at the calendar, yesterday's daily note, open TODOs, and anything local scripts can gather. The output is a checklist and meeting map for the day. In the afternoon, another job writes the recap. It merges into the same Markdown file instead of clobbering it. If I manually edited the morning plan, that stays.

The recap pulls from two kinds of sources. The deterministic sources are local: git history, Claude session logs, GitHub CLI results, notes touched in the vault. These are boring in the best way because they do not need a logged-in browser session.

There is a deeper version of this in my personal vault. That system does not only capture what happened. It also scans for patterns: recurring loops, blind spots, avoided work, stale priorities, missing backlinks, and ideas that should become more durable notes. That makes sense for personal material because the source is broader and more reflective. A daily note, a weekly review, and a goal note can say more than a status update ever will.

The work Chief of Staff does not really do that yet. It is mostly operational memory: who asked what, what changed, what needs follow-up, and what should not be forgotten. But the same structure could go deeper. It could surface recurring coordination issues, decision patterns, overloaded workstreams, or communication habits that are hard to see day to day. The important constraint is that anything psychological has to show its work: source notes, counter-evidence, confidence, and research links where relevant. Otherwise "pattern detection" turns into horoscope writing.

The Weird Part

The part that surprised me most is how little formal specification there is. There is no long template that says: when a new colleague appears, create a file with these exact fields. There is no schema for projects. There is no strict object model for decisions.

There are just a few instructions and a lot of examples. The vault already has person notes. It already has project notes. It already has daily notes. When the agent reads the vault, those files become the schema. The model copies the local pattern: title, role, open threads, dated context, links to related work.

This is convention over specification. For an LLM, examples are often stronger than prose. A clean directory of notes tells the model more than a paragraph describing the ideal note shape. The format, tone, density, and amount of detail are all encoded in the neighboring files.

The Human Stays In The Loop

I do not think the interesting version of this is "an AI runs my work life." That is the wrong frame. The useful version is narrower: an agent can preserve context at the edges of a messy job. It can notice that a thread needs follow-up. It can connect a meeting to a project note. It can draft the reply I would probably write anyway. It can produce a daily recap when I am too tired to reconstruct the day manually.

But the human still sits at the two points where judgment matters:

Approving anything that leaves the system
Reviewing anything that becomes durable memory

Everything else is preparation. The value is not autonomy for its own sake. The value is reducing context loss without giving up control.

I am still figuring out the shape. Some parts are too fragile. Some depend on a desktop session staying alive. Some need better observability. I am also still figuring out whether this kind of thing is useful to write about publicly. My current guess is that the interesting part is not the tooling itself, but the proof that small personal systems can buy back attention, context, and optionality. That is the broader experiment: build useful loops, show the real mechanics, and see what resonates.

The direction feels right, not because the agent replaces the staff engineer. It does not. It feels right because it makes the staff engineer's hardest problem a little more reviewable: what happened, what changed, what matters now, and what should not be forgotten.