<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>CueMarshal on Alfero Chingono</title><link>https://www.chingono.com/tags/cuemarshal/</link><description>Recent content in CueMarshal on Alfero Chingono</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 03 Apr 2026 20:02:39 -0400</lastBuildDate><atom:link href="https://www.chingono.com/tags/cuemarshal/index.xml" rel="self" type="application/rss+xml"/><item><title>Designing Multi-Agent Systems: Lessons from Building an 8-Agent Engineering Orchestra</title><link>https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/</link><pubDate>Thu, 28 Aug 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/</guid><description>&lt;p&gt;A lot of &amp;ldquo;multi-agent&amp;rdquo; demos are really one agent wearing different hats.&lt;/p&gt;
&lt;p&gt;The names change. The prompts change. Sometimes the avatars change. But the authority model, the memory model, and the execution model are all still basically the same. That is fine for a demo. It is much less convincing when you are trying to build a system that can do real engineering work.&lt;/p&gt;
&lt;p&gt;Building &lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal&lt;/a&gt; made that distinction impossible for me to ignore.&lt;/p&gt;
&lt;p&gt;What I wanted was not eight personalities for marketing. I wanted a working system where planning, coding, review, testing, DevOps, documentation, and quality control could be separated cleanly enough to be trustworthy.&lt;/p&gt;
&lt;p&gt;That is how the &amp;ldquo;engineering orchestra&amp;rdquo; idea emerged.&lt;/p&gt;
&lt;h2 id="the-roles-mattered-because-the-boundaries-mattered"&gt;The roles mattered because the boundaries mattered
&lt;/h2&gt;&lt;p&gt;CueMarshal&amp;rsquo;s cast eventually became:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Marshal&lt;/strong&gt; for orchestration&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ava&lt;/strong&gt; for architecture&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dave&lt;/strong&gt; for implementation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reese&lt;/strong&gt; for review&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tess&lt;/strong&gt; for testing&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Devin&lt;/strong&gt; for DevOps&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dot&lt;/strong&gt; for documentation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linton&lt;/strong&gt; for linting&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What made that useful was not the naming. It was that the roles had &lt;strong&gt;different responsibilities, different tool access, and different default model tiers&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;That is the first lesson I would pass on to anyone designing a multi-agent system:&lt;/p&gt;
&lt;h2 id="1-different-roles-need-different-authority"&gt;1. Different roles need different authority
&lt;/h2&gt;&lt;p&gt;If your reviewer can rewrite production code, your reviewer is not really a reviewer.&lt;/p&gt;
&lt;p&gt;In CueMarshal, least privilege is deliberate. The reviewer is configured without write/edit permissions. The docs agent is restricted from shell access. The linter acts like a gate, not a developer with nicer manners.&lt;/p&gt;
&lt;p&gt;That kind of restriction sounds limiting until you realize it is what gives each role meaning. Boundaries are not friction here. They are the mechanism that creates trust.&lt;/p&gt;
&lt;p&gt;A good multi-agent system is not just a cluster of competencies. It is a set of constrained responsibilities.&lt;/p&gt;
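&lt;p&gt;A minimal sketch of what constrained responsibilities can look like in code. The role names, tool names, and the &lt;code&gt;RolePolicy&lt;/code&gt; shape are illustrative assumptions, not CueMarshal&amp;rsquo;s actual configuration schema. The only parts taken from the description above are that the reviewer has no write access and the docs agent has no shell access.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RolePolicy:
    """Hypothetical per-role permission record (not CueMarshal's real schema)."""
    name: str
    allowed_tools: frozenset


# Illustrative policies: the reviewer cannot write, the docs agent has no shell.
# The developer's tool set is an assumption made for the example.
POLICIES = {
    "reviewer": RolePolicy("reviewer", frozenset({"read", "comment", "approve"})),
    "docs": RolePolicy("docs", frozenset({"read", "write"})),
    "developer": RolePolicy("developer", frozenset({"read", "write", "shell"})),
}


def is_allowed(role: str, tool: str) -> bool:
    """Deny by default: unknown roles and unlisted tools are rejected."""
    policy = POLICIES.get(role)
    return policy is not None and tool in policy.allowed_tools
```

&lt;p&gt;The deny-by-default lookup is the design choice that matters here: a role gets a capability only because someone wrote it down.&lt;/p&gt;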
&lt;h2 id="2-coordination-needs-durable-state-outside-the-model"&gt;2. Coordination needs durable state outside the model
&lt;/h2&gt;&lt;p&gt;One of the reasons I anchored CueMarshal in Git is that I did not want coordination to depend on hidden model memory.&lt;/p&gt;
&lt;p&gt;Tasks become issues.
Work becomes branches.
Proposals become pull requests.
Reviews become durable comments and approvals.&lt;/p&gt;
&lt;p&gt;The Conductor receives webhooks, uses Redis and BullMQ to manage asynchronous flow, and dispatches work through Gitea Actions. The runners themselves stay stateless; they rebuild context from the repository, the issue, and the tool layer every time.&lt;/p&gt;
&lt;p&gt;That has been a much better trade than magical continuity.&lt;/p&gt;
&lt;p&gt;Models forget.
Git does not.&lt;/p&gt;
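&lt;p&gt;To make the stateless-runner idea concrete, here is a small sketch of rebuilding context from durable inputs only. The &lt;code&gt;Issue&lt;/code&gt; type and &lt;code&gt;build_context&lt;/code&gt; function are hypothetical names for illustration, and a real runner would read these values from the Git host rather than receive them as arguments.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    """Hypothetical snapshot of an issue as read from the Git host."""
    number: int
    title: str
    body: str
    comments: list = field(default_factory=list)


def build_context(issue: Issue, changed_files: list) -> str:
    """Rebuild a runner's working context purely from durable state.

    Nothing is carried over from earlier runs: the same issue and the same
    file list always produce the same context string.
    """
    lines = [f"Task #{issue.number}: {issue.title}", issue.body]
    lines += [f"Comment: {c}" for c in issue.comments]
    lines += [f"File: {path}" for path in sorted(changed_files)]
    return "\n".join(lines)
```

&lt;p&gt;Because the function depends only on its arguments, two runners given the same issue and repository state start from the same context.&lt;/p&gt;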
&lt;h2 id="3-identity-is-part-of-the-architecture"&gt;3. Identity is part of the architecture
&lt;/h2&gt;&lt;p&gt;Another thing I underestimated early on was how important identity separation would be.&lt;/p&gt;
&lt;p&gt;Each CueMarshal agent has its own account, token, and audit trail. That means the Git history shows who planned, who implemented, who reviewed, and who approved. Even when the &amp;ldquo;who&amp;rdquo; is an AI agent, the distinction still matters.&lt;/p&gt;
&lt;p&gt;This has two benefits.&lt;/p&gt;
&lt;p&gt;First, it improves explainability. The system becomes easier to inspect when actions are attributable.&lt;/p&gt;
&lt;p&gt;Second, it changes how you think about safety. Once every agent has a clear identity and permission scope, you stop designing from a vague &amp;ldquo;assistant&amp;rdquo; mindset and start designing from explicit operational roles.&lt;/p&gt;
&lt;p&gt;That shift is subtle, but it is foundational.&lt;/p&gt;
&lt;h2 id="4-the-tool-layer-is-what-makes-the-orchestra-playable"&gt;4. The tool layer is what makes the orchestra playable
&lt;/h2&gt;&lt;p&gt;This is where MCP became important for CueMarshal.&lt;/p&gt;
&lt;p&gt;All of the agents connect to a structured tool layer instead of improvising raw integrations on the fly. The same Gitea, Conductor, and System capabilities can be used by the runner agents over stdio and by the orchestration layer over HTTP/SSE.&lt;/p&gt;
&lt;p&gt;That matters because multi-agent systems are not only about reasoning. They are about coordination through reliable interfaces.&lt;/p&gt;
&lt;p&gt;If the tools are vague, agents collide.
If the permissions are sloppy, trust collapses.
If the transports are inconsistent, reuse gets expensive.&lt;/p&gt;
&lt;p&gt;The protocol is not the whole story, but it is the difference between a collection of prompts and a real system surface.&lt;/p&gt;
&lt;p&gt;I wrote more about that in &lt;a class="link" href="https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/" &gt;MCP in Practice&lt;/a&gt;, because it deserves its own treatment.&lt;/p&gt;
&lt;h2 id="5-model-routing-is-architecture-not-optimization"&gt;5. Model routing is architecture, not optimization
&lt;/h2&gt;&lt;p&gt;Another lesson I came away with: not every role deserves the same model.&lt;/p&gt;
&lt;p&gt;Architecture work is more expensive and more consequential than documentation cleanup. Review often needs stronger reasoning than linting. Mechanical work should not burn premium tokens if a cheaper tier can do it reliably.&lt;/p&gt;
&lt;p&gt;CueMarshal&amp;rsquo;s tiered routing reflects that reality:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;heavy reasoning for architecture&lt;/li&gt;
&lt;li&gt;balanced capability for implementation, review, testing, and DevOps&lt;/li&gt;
&lt;li&gt;lighter-weight models for docs and linting&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is not just a cost decision. It is part of how the system stays sustainable.&lt;/p&gt;
&lt;p&gt;Too many agent systems treat model choice as an afterthought. I think it belongs in the design doc.&lt;/p&gt;
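&lt;p&gt;Putting routing in the design doc can be as simple as an explicit table. The sketch below mirrors the three tiers listed above. The tier labels and the fallback for unknown roles are assumptions for illustration, not CueMarshal&amp;rsquo;s actual settings.&lt;/p&gt;

```python
# Role-to-tier table mirroring the list above. The tier labels are
# illustrative names, not real model identifiers.
ROLE_TIER = {
    "architecture": "heavy",
    "implementation": "balanced",
    "review": "balanced",
    "testing": "balanced",
    "devops": "balanced",
    "docs": "light",
    "linting": "light",
}


def tier_for(role: str) -> str:
    """Return the default tier for a role.

    Unknown roles fall back to "balanced", which is an assumption made
    for this sketch rather than a documented behavior.
    """
    return ROLE_TIER.get(role, "balanced")
```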
&lt;h2 id="6-closed-loops-beat-hero-agents"&gt;6. Closed loops beat hero agents
&lt;/h2&gt;&lt;p&gt;The more I build these systems, the less I believe in the &amp;ldquo;super-agent&amp;rdquo; story.&lt;/p&gt;
&lt;p&gt;What works better is a closed loop:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;detect work&lt;/li&gt;
&lt;li&gt;route it clearly&lt;/li&gt;
&lt;li&gt;execute with constrained roles&lt;/li&gt;
&lt;li&gt;review it&lt;/li&gt;
&lt;li&gt;merge it with human control&lt;/li&gt;
&lt;li&gt;feed the next signal back into the system&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;CueMarshal&amp;rsquo;s self-improvement workflow made this even clearer to me. Once SonarQube findings, scanners, issues, PRs, and agent roles all started participating in the same loop, the system became more useful than any single agent inside it.&lt;/p&gt;
&lt;p&gt;That is why I think orchestration matters more than agent count.&lt;/p&gt;
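&lt;p&gt;The six steps above can be sketched as a tiny state machine. The stage names and the &lt;code&gt;next_stage&lt;/code&gt; function are hypothetical. The one behavior taken from the loop itself is that work does not move from review to merge without human approval.&lt;/p&gt;

```python
# Stage names follow the six steps of the loop; they are illustrative labels.
STAGES = ["detect", "route", "execute", "review", "merge", "feedback"]


def next_stage(current: str, human_approved: bool = False) -> str:
    """Advance one step around the loop.

    Work waits in "review" until a human approves it, and "feedback"
    wraps around to "detect" so the loop closes on itself.
    """
    index = STAGES.index(current)
    if current == "review" and not human_approved:
        return "review"
    return STAGES[(index + 1) % len(STAGES)]
```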
&lt;h2 id="my-current-takeaway"&gt;My current takeaway
&lt;/h2&gt;&lt;p&gt;If you are building a multi-agent system, start with these questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What roles genuinely need to be different?&lt;/li&gt;
&lt;li&gt;What permissions should each role have?&lt;/li&gt;
&lt;li&gt;Where does coordination state live?&lt;/li&gt;
&lt;li&gt;How are actions attributed?&lt;/li&gt;
&lt;li&gt;What is the closed loop that turns outputs into the next inputs?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you cannot answer those, adding more agents will mostly add more noise.&lt;/p&gt;
&lt;p&gt;If you can answer them, the number of agents becomes much less important than the quality of the structure around them.&lt;/p&gt;
&lt;p&gt;That has been the real lesson for me. The point of the orchestra is not to have more instruments. The point is to make the handoffs musical instead of chaotic.&lt;/p&gt;
&lt;p&gt;If you want the adjacent pieces, &lt;a class="link" href="https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/" &gt;Why I Started Building My Own DevOps Platform&lt;/a&gt; covers the motivation, and &lt;a class="link" href="https://www.chingono.com/blog/2026/03/05/how-i-run-sonarqube-in-my-own-ci-pipeline-and-let-ai-fix-what-it-finds/" &gt;How I Run SonarQube in My Own CI Pipeline (And Let AI Fix What It Finds)&lt;/a&gt; shows what this architecture looks like when the feedback loop closes on itself.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/architecture/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal architecture overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/agents/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal agent profiles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/conductor/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal conductor overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>MCP in Practice: What Anthropic's Model Context Protocol Actually Means for Developers</title><link>https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/</link><pubDate>Thu, 20 Mar 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/</guid><description>&lt;p&gt;When Anthropic announced the &lt;a class="link" href="https://www.anthropic.com/news/model-context-protocol" target="_blank" rel="noopener"
&gt;Model Context Protocol&lt;/a&gt;, the most interesting part to me was not &amp;ldquo;LLMs can call tools.&amp;rdquo; We already knew that. The interesting part was that someone was finally trying to standardize the connection.&lt;/p&gt;
&lt;p&gt;That may sound like a small distinction, but it is the difference between a clever demo and an architecture you can actually build on.&lt;/p&gt;
&lt;p&gt;For developers, MCP matters because it turns tool access into something more portable, more inspectable, and less bespoke. Instead of wiring every model to every internal system in a slightly different way, you get a shared protocol for secure, two-way connections between AI clients and the systems where work actually lives.&lt;/p&gt;
&lt;p&gt;In other words: fewer one-off connectors, fewer weird wrappers, and less glue code pretending to be strategy.&lt;/p&gt;
&lt;h2 id="the-real-problem-mcp-solves"&gt;The real problem MCP solves
&lt;/h2&gt;&lt;p&gt;Without a protocol, most AI integrations end up with the same shape:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;custom JSON formats&lt;/li&gt;
&lt;li&gt;hand-rolled function schemas&lt;/li&gt;
&lt;li&gt;transport logic mixed into business logic&lt;/li&gt;
&lt;li&gt;a different adapter for every new client&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can absolutely ship systems that way. Many people already have. But you pay for it later in duplication, debugging, and lock-in.&lt;/p&gt;
&lt;p&gt;Anthropic&amp;rsquo;s framing resonated with me because it describes a problem I had already been running into while building CueMarshal. I did not need agents that could merely &amp;ldquo;use tools.&amp;rdquo; I needed a stable way for different parts of the system to use the &lt;strong&gt;same tools&lt;/strong&gt; in different contexts.&lt;/p&gt;
&lt;p&gt;That is where MCP becomes practical.&lt;/p&gt;
&lt;h2 id="what-it-changed-in-my-own-thinking"&gt;What it changed in my own thinking
&lt;/h2&gt;&lt;p&gt;In CueMarshal, I ended up with three MCP servers:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;Gitea MCP server&lt;/strong&gt; for issues, pull requests, repositories, workflows, and search&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;Conductor MCP server&lt;/strong&gt; for task coordination and agent state&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;System MCP server&lt;/strong&gt; for costs, runners, and health&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That split was not arbitrary. It reflected a design choice: organize tool access around bounded responsibilities instead of dumping everything into one giant catch-all toolbox.&lt;/p&gt;
&lt;p&gt;Even more important, the same MCP server code supports &lt;strong&gt;two transports&lt;/strong&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;stdio&lt;/strong&gt; for agent runners&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;HTTP/SSE&lt;/strong&gt; for the long-running chat/orchestration layer&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is the part I think many developers will underestimate. The value is not just that the model can invoke a tool. The value is that your tool layer stops being trapped inside one execution model.&lt;/p&gt;
&lt;p&gt;The CueMarshal runners can spawn MCP servers directly as child processes. The Conductor can hold long-lived connections to those same tool surfaces over the network. Same capability, different runtime, no duplicated tool logic.&lt;/p&gt;
&lt;p&gt;That is not just elegant. It is operationally useful.&lt;/p&gt;
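&lt;p&gt;A rough sketch of the &amp;ldquo;same capability, different runtime&amp;rdquo; idea. It deliberately does not use an MCP SDK or the real MCP wire format: the line-delimited JSON and the function names are simplified stand-ins chosen for illustration. What it shows is one tool registry with two thin transport adapters around it.&lt;/p&gt;

```python
import io
import json


def list_open_issues(repo: str) -> list:
    """Placeholder tool; a real one would query the Git host's API."""
    return [f"{repo}#1", f"{repo}#2"]


# One registry of tool functions, shared by every transport.
TOOLS = {"list_open_issues": list_open_issues}


def call_tool(name: str, arguments: dict):
    """Transport-independent dispatch into the shared registry."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)


def serve_stdio(stdin, stdout) -> None:
    """Line-delimited JSON adapter, standing in for a stdio transport."""
    for line in stdin:
        request = json.loads(line)
        result = call_tool(request["name"], request["arguments"])
        stdout.write(json.dumps({"result": result}) + "\n")


def handle_http(body: str) -> str:
    """Request/response adapter, standing in for an HTTP transport."""
    request = json.loads(body)
    result = call_tool(request["name"], request["arguments"])
    return json.dumps({"result": result})


# Usage: the same request goes through both adapters.
request = json.dumps({"name": "list_open_issues", "arguments": {"repo": "demo"}})
buffer = io.StringIO()
serve_stdio(io.StringIO(request + "\n"), buffer)
```

&lt;p&gt;Both adapters call the same &lt;code&gt;call_tool&lt;/code&gt; function, so adding a transport does not mean reimplementing a tool.&lt;/p&gt;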
&lt;h2 id="mcp-is-really-about-interface-discipline"&gt;MCP is really about interface discipline
&lt;/h2&gt;&lt;p&gt;Building AI systems teaches you very quickly that &amp;ldquo;prompting&amp;rdquo; is too often treated as the cause of problems that are really interface problems.&lt;/p&gt;
&lt;p&gt;If the tool schema is vague, the model will behave vaguely.&lt;/p&gt;
&lt;p&gt;If the permissions are broad, the behavior will feel risky.&lt;/p&gt;
&lt;p&gt;If the transport is brittle, the whole system looks flaky even when the reasoning is fine.&lt;/p&gt;
&lt;p&gt;What I like about MCP is that it nudges teams toward better engineering habits:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Typed tools instead of implied behavior&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Separation between protocol and implementation&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reusable tool layers across multiple clients&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Clearer permission boundaries&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;That discipline matters even if you never use Anthropic&amp;rsquo;s stack directly.&lt;/p&gt;
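&lt;p&gt;As an illustration of typed tools instead of implied behavior, here is a small sketch of validating a tool&amp;rsquo;s input before anything runs. The &lt;code&gt;CreateIssueInput&lt;/code&gt; type, its fields, and the validation rules are hypothetical examples, not part of MCP or of CueMarshal.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateIssueInput:
    """Hypothetical typed input for an issue-creation tool."""
    repo: str
    title: str
    labels: tuple = ()

    def __post_init__(self):
        if "/" not in self.repo:
            raise ValueError("repo must look like 'owner/name'")
        if not self.title.strip():
            raise ValueError("title must not be empty")


def parse_create_issue(raw: dict) -> CreateIssueInput:
    """Validate a raw payload, rejecting missing and unknown fields."""
    required = {"repo", "title"}
    allowed = required | {"labels"}
    missing = required - set(raw)
    unknown = set(raw) - allowed
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return CreateIssueInput(
        repo=raw["repo"],
        title=raw["title"],
        labels=tuple(raw.get("labels", ())),
    )
```

&lt;p&gt;Rejecting unknown fields is a deliberate choice in this sketch: a vague payload fails loudly instead of being interpreted loosely.&lt;/p&gt;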
&lt;h2 id="what-developers-should-actually-do-with-it"&gt;What developers should actually do with it
&lt;/h2&gt;&lt;p&gt;My advice is to treat MCP less like a product feature and more like a systems design decision.&lt;/p&gt;
&lt;p&gt;If you are building AI-assisted software delivery, internal automation, or even just richer developer tools, start by asking:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What are the real systems my assistant needs to access?&lt;/li&gt;
&lt;li&gt;Which of those interactions deserve typed, validated interfaces?&lt;/li&gt;
&lt;li&gt;Which capabilities should be shared across chat, automation, and background agents?&lt;/li&gt;
&lt;li&gt;Where do I want auditability and permission scoping to live?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That line of thinking will produce a better architecture whether you adopt MCP tomorrow or not.&lt;/p&gt;
&lt;p&gt;In my own work, it pushed me away from raw &lt;code&gt;curl&lt;/code&gt;-driven integration and toward a universal tool layer. Once I made that shift, a lot of downstream problems became easier: orchestration, reuse, security boundaries, and even explanation. It is easier to trust a system when you can say, very plainly, &amp;ldquo;here are the tools it has, here is what they do, and here is how they are invoked.&amp;rdquo;&lt;/p&gt;
&lt;h2 id="what-mcp-does-not-solve"&gt;What MCP does &lt;strong&gt;not&lt;/strong&gt; solve
&lt;/h2&gt;&lt;p&gt;MCP does not magically make an agent reliable.&lt;/p&gt;
&lt;p&gt;It does not fix poor workflow design.&lt;/p&gt;
&lt;p&gt;It does not remove the need for human review.&lt;/p&gt;
&lt;p&gt;And it definitely does not turn vague prompts into good engineering.&lt;/p&gt;
&lt;p&gt;What it does is give you a cleaner control plane for connecting models to real systems. That is already a meaningful improvement.&lt;/p&gt;
&lt;p&gt;For me, that is why MCP feels important. Not because it adds more AI theater, but because it reduces architectural friction in a place where friction compounds very fast.&lt;/p&gt;
&lt;p&gt;If you are curious how that idea plays out in a larger system, I wrote more about the broader coordination problem in &lt;a class="link" href="https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/" &gt;Why I Started Building My Own DevOps Platform&lt;/a&gt; and the orchestration lessons in &lt;a class="link" href="https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/" &gt;Designing Multi-Agent Systems: Lessons from Building an 8-Agent Engineering Orchestra&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.anthropic.com/news/model-context-protocol" target="_blank" rel="noopener"
&gt;Introducing the Model Context Protocol&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://modelcontextprotocol.io/quickstart" target="_blank" rel="noopener"
&gt;MCP quickstart and specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/mcp-servers/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal MCP server overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/architecture/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal architecture overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Why I Started Building My Own DevOps Platform (And What I Learned)</title><link>https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/</link><pubDate>Sat, 15 Feb 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/</guid><description>&lt;p&gt;For a while, I had the same reaction to most AI-for-software-delivery demos: impressive in a narrow way, but not something I would trust with real work. One tool could write code. Another could summarize a diff. Another could review a pull request. But the hard part of software delivery is rarely one isolated step. It is the handoff between steps.&lt;/p&gt;
&lt;p&gt;That was the itch that eventually pushed me to start building &lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I did not start with the ambition to build &amp;ldquo;an AI company&amp;rdquo; or some abstract autonomous future. I started because I wanted a more coherent delivery system: one place where a task could move from idea to issue to branch to pull request to review without losing context every time responsibility changed hands.&lt;/p&gt;
&lt;h2 id="the-problem-i-actually-wanted-to-solve"&gt;The problem I actually wanted to solve
&lt;/h2&gt;&lt;p&gt;CI/CD was never the whole problem. In many teams, the pipeline is the most deterministic part of the process. The mess usually lives around it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the design decision that only exists in a chat thread&lt;/li&gt;
&lt;li&gt;the issue that says too little&lt;/li&gt;
&lt;li&gt;the reviewer who has to reconstruct intent from commit history&lt;/li&gt;
&lt;li&gt;the documentation that is always &amp;ldquo;we&amp;rsquo;ll do it after&amp;rdquo;&lt;/li&gt;
&lt;li&gt;the growing pile of tools that all know a little, but none of them own the workflow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What I wanted was not another dashboard. I wanted a delivery surface that respected how engineering work already happens.&lt;/p&gt;
&lt;p&gt;That led me to a simple conviction: &lt;strong&gt;Git should be the source of truth, not just the storage layer.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If work already becomes legible through issues, branches, pull requests, labels, and reviews, then the orchestration layer should live there too. Not beside it. Not behind it. Inside it.&lt;/p&gt;
&lt;h2 id="why-i-built-it-myself"&gt;Why I built it myself
&lt;/h2&gt;&lt;p&gt;There were three constraints that mattered to me from day one.&lt;/p&gt;
&lt;p&gt;First, I wanted the system to be &lt;strong&gt;self-hosted&lt;/strong&gt;. A lot of AI tooling assumes you are comfortable sending your code, your process, and your delivery metadata into someone else&amp;rsquo;s black box. Many teams are not. I wanted an approach that made data sovereignty a feature, not an apology.&lt;/p&gt;
&lt;p&gt;Second, I wanted the system to be &lt;strong&gt;role-aware&lt;/strong&gt;. Real software delivery is not &amp;ldquo;one super-agent with a clever prompt.&amp;rdquo; Design, implementation, review, testing, DevOps, and documentation are different jobs. Sometimes one person does multiple jobs, but the jobs are still different. That distinction matters.&lt;/p&gt;
&lt;p&gt;Third, I wanted &lt;strong&gt;human control to remain the final gate&lt;/strong&gt;. I am interested in automation, not surrender. If an AI system cannot work inside a reviewable pull-request workflow, I do not think it is mature enough for serious engineering work.&lt;/p&gt;
&lt;p&gt;Those constraints eventually turned into the shape CueMarshal has now: a conductor service in TypeScript; specialized agents for architecture, development, review, testing, DevOps, docs, and linting; a Git-native workflow in Gitea; and a tool layer built around MCP, so the system can reason over structured interfaces instead of raw shell scripts and ad-hoc API calls.&lt;/p&gt;
&lt;h2 id="the-architecture-came-later-the-principles-came-first"&gt;The architecture came later. The principles came first.
&lt;/h2&gt;&lt;p&gt;Long before the implementation solidified, the design principles were already obvious to me.&lt;/p&gt;
&lt;h3 id="1-git-is-a-better-coordination-layer-than-most-agent-uis"&gt;1. Git is a better coordination layer than most agent UIs
&lt;/h3&gt;&lt;p&gt;An issue is a task. A branch is a workstream. A pull request is a proposal. A review is a decision record. A merge is a controlled state change.&lt;/p&gt;
&lt;p&gt;That sounds almost too obvious to say out loud, but it changed how I thought about the whole problem. Once I stopped treating Git as the place where code merely ends up, and started treating it as the place where engineering decisions become inspectable, the rest of the architecture got much simpler.&lt;/p&gt;
&lt;h3 id="2-specialization-beats-a-do-everything-agent"&gt;2. Specialization beats a &amp;ldquo;do everything&amp;rdquo; agent
&lt;/h3&gt;&lt;p&gt;In CueMarshal, the system is intentionally split into named roles: Marshal for orchestration, Ava for architecture, Dave for implementation, Reese for review, Tess for testing, Devin for DevOps, Dot for docs, and Linton for linting.&lt;/p&gt;
&lt;p&gt;That is not branding for its own sake. It is an operational choice.&lt;/p&gt;
&lt;p&gt;The moment one agent tries to be planner, coder, reviewer, tester, and documentarian all at once, you lose clarity. You also lose accountability. Specialization makes prompts sharper, tool permissions narrower, and outputs easier to judge.&lt;/p&gt;
&lt;h3 id="3-tool-contracts-matter-more-than-prompt-cleverness"&gt;3. Tool contracts matter more than prompt cleverness
&lt;/h3&gt;&lt;p&gt;One of the biggest lessons from building CueMarshal is that the quality of an agentic system is heavily constrained by the quality of its interfaces.&lt;/p&gt;
&lt;p&gt;If an agent is forced to improvise around loosely structured APIs, fragile shell commands, or browser automation for tasks that should be typed and validated, the system becomes harder to trust. This is one reason MCP clicked for me so quickly later on: it gave a clean shape to something I already knew was essential.&lt;/p&gt;
&lt;p&gt;Good tool contracts do not just help the model. They help the human operator understand what the system is even allowed to do.&lt;/p&gt;
&lt;h3 id="4-stateless-workers-are-a-feature-not-a-bug"&gt;4. Stateless workers are a feature, not a bug
&lt;/h3&gt;&lt;p&gt;CueMarshal&amp;rsquo;s runners are intentionally stateless. They reconstruct context from the repository, the issue, the pull request, and the tool layer every time.&lt;/p&gt;
&lt;p&gt;That may sound less magical than the &amp;ldquo;persistent AI teammate&amp;rdquo; narrative, but it is much easier to reason about. It scales better. It fails more cleanly. And it produces a better audit trail.&lt;/p&gt;
&lt;p&gt;In practice, that has made me more skeptical of systems that depend on hidden memory to feel smart.&lt;/p&gt;
&lt;h3 id="5-human-control-is-product-design"&gt;5. Human control is product design
&lt;/h3&gt;&lt;p&gt;The more I worked on this, the more convinced I became that &amp;ldquo;human in the loop&amp;rdquo; is not enough as a slogan. It has to be built into the workflow itself.&lt;/p&gt;
&lt;p&gt;That is why I prefer issue-driven execution, reviewable pull requests, typed tools, explicit handoffs, and merge control. Those are not bureaucratic constraints. They are the difference between a system that can support real engineering and a system that is only good for demos.&lt;/p&gt;
&lt;h2 id="what-i-learned-from-building-in-public"&gt;What I learned from building in public
&lt;/h2&gt;&lt;p&gt;The most useful part of this project has not been proving that agents can write code. We already knew that. The useful part has been learning where coordination breaks, where trust gets earned, and what kinds of structure make AI assistance actually usable.&lt;/p&gt;
&lt;p&gt;It also made one thing clearer for me: the next layer of software delivery is not &amp;ldquo;more CI/CD.&amp;rdquo; It is better orchestration around the work humans and machines are already doing together.&lt;/p&gt;
&lt;p&gt;That is the reason I started building CueMarshal, and it is still the reason I keep working on it.&lt;/p&gt;
&lt;p&gt;If you want the more technical follow-up, I wrote about &lt;a class="link" href="https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/" &gt;what MCP actually changed for developers&lt;/a&gt; and the coordination lessons from &lt;a class="link" href="https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/" &gt;building an eight-agent engineering orchestra&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/architecture/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal architecture overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/agents/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal agent profiles&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>