<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Platform Engineering on Alfero Chingono</title><link>https://www.chingono.com/tags/platform-engineering/</link><description>Recent content in Platform Engineering on Alfero Chingono</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 03 Apr 2026 20:02:39 -0400</lastBuildDate><atom:link href="https://www.chingono.com/tags/platform-engineering/index.xml" rel="self" type="application/rss+xml"/><item><title>The Silent Excellence Trap: Why Good Work Gets Invisible in Big Organizations</title><link>https://www.chingono.com/blog/2026/03/25/the-silent-excellence-trap-why-good-work-gets-invisible-in-big-organizations/</link><pubDate>Wed, 25 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2026/03/25/the-silent-excellence-trap-why-good-work-gets-invisible-in-big-organizations/</guid><description>&lt;p&gt;There is a specific kind of frustration that shows up in large organizations when you are doing good work that makes other work possible.&lt;/p&gt;
&lt;p&gt;You reduce risk.
You simplify delivery.
You create better defaults.
You remove ambiguity.&lt;/p&gt;
&lt;p&gt;And at the end of the quarter, it can still look like nothing happened.&lt;/p&gt;
&lt;p&gt;I have come to think of that as the &lt;strong&gt;silent excellence trap&lt;/strong&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Outcomes matter, but outcomes that are not made legible, timely, and attributable are often discounted.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;That sentence has stayed with me because it explains something I have seen repeatedly, both in myself and in other highly conscientious operators. We assume that if the work is real, the value will be obvious. In many organizations, that is simply not how visibility works.&lt;/p&gt;
&lt;h2 id="the-problem-is-usually-structural-not-personal"&gt;The problem is usually structural, not personal
&lt;/h2&gt;&lt;p&gt;Most big organizations do not formally say they reward performative busyness over outcomes. In fact, they usually say the opposite.&lt;/p&gt;
&lt;p&gt;The disconnect shows up in the operating reality.&lt;/p&gt;
&lt;p&gt;I see three reasons for that.&lt;/p&gt;
&lt;h3 id="1-long-feedback-loops"&gt;1. Long feedback loops
&lt;/h3&gt;&lt;p&gt;Some of the most valuable work in platform engineering, architecture, governance, and operational improvement only becomes visible months later.&lt;/p&gt;
&lt;p&gt;If you reduce configuration drift, improve rollout safety, or standardize an architectural decision, the value often appears as &lt;strong&gt;fewer future problems&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;That is good for the business.
It is not always good for recognition.&lt;/p&gt;
&lt;p&gt;When outcomes take a long time to surface, visible activity starts to become a proxy for progress.&lt;/p&gt;
&lt;h3 id="2-narrative-driven-evaluation"&gt;2. Narrative-driven evaluation
&lt;/h3&gt;&lt;p&gt;A lot of performance assessment is not just about what happened. It is about how what happened gets narrated.&lt;/p&gt;
&lt;p&gt;This creates an uncomfortable truth: two people can generate similar value, but the person who can explain their impact in organizational language is often rewarded more clearly.&lt;/p&gt;
&lt;p&gt;That is not always politics in the cynical sense. Sometimes it is just information design. Leaders are making decisions under ambiguity, and the work that is easiest to interpret tends to travel farther.&lt;/p&gt;
&lt;h3 id="3-cross-team-dependency-density"&gt;3. Cross-team dependency density
&lt;/h3&gt;&lt;p&gt;In big organizations, important work often moves through many hands.&lt;/p&gt;
&lt;p&gt;You create the framework.
Someone else adopts it.
Another team gets the outcome.
A downstream metric improves three months later.&lt;/p&gt;
&lt;p&gt;At that point, attribution is fuzzy. Visibility fills the gap.&lt;/p&gt;
&lt;p&gt;That is why quiet, enabling work is so easy to underrate.&lt;/p&gt;
&lt;h2 id="this-is-not-an-argument-for-theater"&gt;This is not an argument for theater
&lt;/h2&gt;&lt;p&gt;I do not think the answer is to become a full-time self-promoter.&lt;/p&gt;
&lt;p&gt;I also do not think the answer is to &amp;ldquo;look busy.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;The better answer is &lt;strong&gt;intentional visibility&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;That means making outcomes easier to see without fabricating activity.&lt;/p&gt;
&lt;p&gt;There is a huge ethical and psychological difference between:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;performing importance&lt;/li&gt;
&lt;li&gt;and making real impact legible&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I care about the second one.&lt;/p&gt;
&lt;h2 id="what-i-do-instead-now"&gt;What I do instead now
&lt;/h2&gt;&lt;p&gt;Over time, I have found a few habits that help.&lt;/p&gt;
&lt;h3 id="1-express-work-in-a-simple-chain"&gt;1. Express work in a simple chain
&lt;/h3&gt;&lt;p&gt;I try to describe meaningful work as:&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Problem -&amp;gt; Action -&amp;gt; Outcome -&amp;gt; Why it matters&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That structure sounds basic, but it keeps me from describing effort when I should be describing value.&lt;/p&gt;
&lt;h3 id="2-translate-into-enterprise-outcomes"&gt;2. Translate into enterprise outcomes
&lt;/h3&gt;&lt;p&gt;Enabling work often needs a second layer of interpretation.&lt;/p&gt;
&lt;p&gt;It is not enough to say:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;built a feature-flag framework&lt;/li&gt;
&lt;li&gt;standardized observability&lt;/li&gt;
&lt;li&gt;created reusable pipeline templates&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You often have to go one step further:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;reduced deployment risk&lt;/li&gt;
&lt;li&gt;removed support fragmentation&lt;/li&gt;
&lt;li&gt;shortened decision cycles&lt;/li&gt;
&lt;li&gt;improved auditability&lt;/li&gt;
&lt;li&gt;lowered operational variance&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is not spin. It is completion.&lt;/p&gt;
&lt;h3 id="3-publish-checkpoints-not-just-final-wins"&gt;3. Publish checkpoints, not just final wins
&lt;/h3&gt;&lt;p&gt;Long-cycle work disappears when nobody can see intermediate progress.&lt;/p&gt;
&lt;p&gt;I have become much more deliberate about creating artifacts along the way:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;decision logs&lt;/li&gt;
&lt;li&gt;patterns&lt;/li&gt;
&lt;li&gt;working examples&lt;/li&gt;
&lt;li&gt;metrics snapshots&lt;/li&gt;
&lt;li&gt;short summaries of what changed&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That keeps the work reviewable before it becomes historic.&lt;/p&gt;
&lt;h3 id="4-pre-wire-leaders-with-outcomes-not-activity"&gt;4. Pre-wire leaders with outcomes, not activity
&lt;/h3&gt;&lt;p&gt;I have learned not to wait for annual review season to explain what mattered.&lt;/p&gt;
&lt;p&gt;Short outcome-only updates travel better than heroic recaps. They also reduce the chance that your most valuable work gets remembered only when something breaks.&lt;/p&gt;
&lt;h2 id="the-uncomfortable-part"&gt;The uncomfortable part
&lt;/h2&gt;&lt;p&gt;The silent excellence trap is especially common among people who take pride in substance.&lt;/p&gt;
&lt;p&gt;We tell ourselves the work should speak for itself.&lt;/p&gt;
&lt;p&gt;Sometimes it does.
Often it does not.&lt;/p&gt;
&lt;p&gt;And in large systems, what does not become legible often does not become rewarded.&lt;/p&gt;
&lt;p&gt;That is not fair, but it is useful to understand.&lt;/p&gt;
&lt;h2 id="my-current-reframe"&gt;My current reframe
&lt;/h2&gt;&lt;p&gt;I no longer think of visibility as a political side quest. I think of it as part of delivery.&lt;/p&gt;
&lt;p&gt;If I create meaningful change but fail to make it legible, timely, and attributable, then part of the job is still unfinished.&lt;/p&gt;
&lt;p&gt;That has changed how I think about platform work, architecture work, and even writing in public. In some ways, building publicly is itself a response to this problem. The work becomes its own audit trail.&lt;/p&gt;
&lt;p&gt;That is also why this post connects so naturally to &lt;a class="link" href="https://www.chingono.com/blog/2025/04/10/the-dora-report-was-right-idps-improve-team-productivity-by-10-percent-heres-how-ive-seen-it/" &gt;The DORA Report Was Right: IDPs Improve Team Productivity by 10% — Here&amp;rsquo;s How I&amp;rsquo;ve Seen It&lt;/a&gt;. Platform work can produce real leverage. But if the leverage stays invisible, the organization often underestimates both the work and the people doing it.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://www.chingono.com/blog/2025/04/10/the-dora-report-was-right-idps-improve-team-productivity-by-10-percent-heres-how-ive-seen-it/" &gt;The DORA Report Was Right: IDPs Improve Team Productivity by 10% — Here&amp;rsquo;s How I&amp;rsquo;ve Seen It&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/" &gt;Why I Started Building My Own DevOps Platform (And What I Learned)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Designing Multi-Agent Systems: Lessons from Building an 8-Agent Engineering Orchestra</title><link>https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/</link><pubDate>Thu, 28 Aug 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/</guid><description>&lt;p&gt;A lot of &amp;ldquo;multi-agent&amp;rdquo; demos are really one agent wearing different hats.&lt;/p&gt;
&lt;p&gt;The names change. The prompts change. Sometimes the avatars change. But the authority model, the memory model, and the execution model are all still basically the same. That is fine for a demo. It is much less convincing when you are trying to build a system that can do real engineering work.&lt;/p&gt;
&lt;p&gt;Building &lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal&lt;/a&gt; made that distinction impossible for me to ignore.&lt;/p&gt;
&lt;p&gt;What I wanted was not eight personalities for marketing. I wanted a working system where planning, coding, review, testing, DevOps, documentation, and quality control could be separated cleanly enough to be trustworthy.&lt;/p&gt;
&lt;p&gt;That is how the &amp;ldquo;engineering orchestra&amp;rdquo; idea emerged.&lt;/p&gt;
&lt;h2 id="the-roles-mattered-because-the-boundaries-mattered"&gt;The roles mattered because the boundaries mattered
&lt;/h2&gt;&lt;p&gt;CueMarshal&amp;rsquo;s cast eventually became:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Marshal&lt;/strong&gt; for orchestration&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Ava&lt;/strong&gt; for architecture&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dave&lt;/strong&gt; for implementation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reese&lt;/strong&gt; for review&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tess&lt;/strong&gt; for testing&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Devin&lt;/strong&gt; for DevOps&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Dot&lt;/strong&gt; for documentation&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Linton&lt;/strong&gt; for linting&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What made that useful was not the naming. It was the fact that the roles had &lt;strong&gt;different responsibilities, different tool access, and different default model tiers&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;That is the first lesson I would pass on to anyone designing a multi-agent system:&lt;/p&gt;
&lt;h2 id="1-different-roles-need-different-authority"&gt;1. Different roles need different authority
&lt;/h2&gt;&lt;p&gt;If your reviewer can rewrite production code, your reviewer is not really a reviewer.&lt;/p&gt;
&lt;p&gt;In CueMarshal, least privilege is deliberate. The reviewer is configured without write/edit permissions. The docs agent is restricted from shell access. The linter acts like a gate, not a developer with nicer manners.&lt;/p&gt;
&lt;p&gt;That kind of restriction sounds limiting until you realize it is what gives each role meaning. Boundaries are not friction here. They are the mechanism that creates trust.&lt;/p&gt;
&lt;p&gt;A good multi-agent system is not just a cluster of competencies. It is a set of constrained responsibilities.&lt;/p&gt;
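&lt;p&gt;As a rough sketch, a least-privilege boundary like this can be expressed as explicit, deny-by-default policy data. The role names below come from CueMarshal, but the field names and the check function are illustrative, not the actual configuration:&lt;/p&gt;

```typescript
// Illustrative per-role policy: the reviewer cannot write, the docs agent
// has no shell, the linter gates but never approves. Deny by default.
interface RolePolicy {
  name: string;
  canWriteCode: boolean;
  canRunShell: boolean;
  canApproveMerge: boolean;
}

const policies: RolePolicy[] = [
  { name: "reese",  canWriteCode: false, canRunShell: false, canApproveMerge: true  },
  { name: "dot",    canWriteCode: true,  canRunShell: false, canApproveMerge: false },
  { name: "linton", canWriteCode: false, canRunShell: false, canApproveMerge: false },
];

// An unknown role, or an unlisted capability, gets nothing.
function isAllowed(role: string, action: "write" | "shell" | "approve"): boolean {
  const policy = policies.find((p) => p.name === role);
  if (!policy) return false;
  if (action === "write") return policy.canWriteCode;
  if (action === "shell") return policy.canRunShell;
  return policy.canApproveMerge;
}
```

&lt;p&gt;The useful property is that capability becomes data you can audit, not behavior you hope a prompt enforces.&lt;/p&gt;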
&lt;h2 id="2-coordination-needs-durable-state-outside-the-model"&gt;2. Coordination needs durable state outside the model
&lt;/h2&gt;&lt;p&gt;One of the reasons I anchored CueMarshal in Git is that I did not want coordination to depend on hidden model memory.&lt;/p&gt;
&lt;p&gt;Tasks become issues.
Work becomes branches.
Proposals become pull requests.
Reviews become durable comments and approvals.&lt;/p&gt;
&lt;p&gt;The Conductor receives webhooks, uses Redis and BullMQ to manage asynchronous flow, and dispatches work through Gitea Actions. The runners themselves stay stateless; they rebuild context from the repository, the issue, and the tool layer every time.&lt;/p&gt;
&lt;p&gt;That has been a much better trade than magical continuity.&lt;/p&gt;
&lt;p&gt;Models forget.
Git does not.&lt;/p&gt;
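&lt;p&gt;One small way that shows up in practice: anything an agent will need later is derived from, and recorded in, Git artifacts. The naming convention below is hypothetical, but it illustrates how a task can map deterministically to a branch and a pull request, so no hidden memory is required to find the work again:&lt;/p&gt;

```typescript
// Hypothetical convention: derive durable Git artifact names from a task,
// so coordination state lives in the repository rather than in model memory.
interface Task {
  id: number;
  title: string;
}

function branchFor(task: Task): string {
  const slug = task.title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")   // collapse anything non-alphanumeric
    .replace(/^-|-$/g, "");        // trim stray separators
  return "task/" + task.id + "-" + slug;
}

function prTitleFor(task: Task): string {
  return "[#" + task.id + "] " + task.title;
}
```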
&lt;h2 id="3-identity-is-part-of-the-architecture"&gt;3. Identity is part of the architecture
&lt;/h2&gt;&lt;p&gt;Another thing I underestimated early on was how important identity separation would be.&lt;/p&gt;
&lt;p&gt;Each CueMarshal agent has its own account, token, and audit trail. That means the Git history shows who planned, who implemented, who reviewed, and who approved. Even when the &amp;ldquo;who&amp;rdquo; is an AI agent, the distinction still matters.&lt;/p&gt;
&lt;p&gt;This has two benefits.&lt;/p&gt;
&lt;p&gt;First, it improves explainability. The system becomes easier to inspect when actions are attributable.&lt;/p&gt;
&lt;p&gt;Second, it changes how you think about safety. Once every agent has a clear identity and permission scope, you stop designing from a vague &amp;ldquo;assistant&amp;rdquo; mindset and start designing from explicit operational roles.&lt;/p&gt;
&lt;p&gt;That shift is subtle, but it is foundational.&lt;/p&gt;
&lt;h2 id="4-the-tool-layer-is-what-makes-the-orchestra-playable"&gt;4. The tool layer is what makes the orchestra playable
&lt;/h2&gt;&lt;p&gt;This is where MCP became important for CueMarshal.&lt;/p&gt;
&lt;p&gt;All of the agents connect to a structured tool layer instead of improvising raw integrations on the fly. The same Gitea, Conductor, and System capabilities can be used by the runner agents over stdio and by the orchestration layer over HTTP/SSE.&lt;/p&gt;
&lt;p&gt;That matters because multi-agent systems are not only about reasoning. They are about coordination through reliable interfaces.&lt;/p&gt;
&lt;p&gt;If the tools are vague, agents collide.
If the permissions are sloppy, trust collapses.
If the transports are inconsistent, reuse gets expensive.&lt;/p&gt;
&lt;p&gt;The protocol is not the whole story, but it is the difference between a collection of prompts and a real system surface.&lt;/p&gt;
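&lt;p&gt;Independent of any particular protocol SDK, the idea can be sketched as a tool descriptor with a stable name, declared parameters, and explicit transports. The shape below is illustrative, not the actual MCP schema:&lt;/p&gt;

```typescript
// Illustrative tool surface: a stable name, declared required parameters,
// and an explicit transport list. The point is a reliable interface.
interface ToolDescriptor {
  name: string;
  description: string;
  requiredParams: string[];
  transports: string[]; // e.g. "stdio" for runners, "http-sse" for orchestration
}

// Reject calls missing required parameters instead of letting agents improvise.
function validateCall(tool: ToolDescriptor, params: { [k: string]: unknown }): boolean {
  return tool.requiredParams.every((p) => p in params);
}
```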
&lt;p&gt;I wrote more about that in &lt;a class="link" href="https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/" &gt;MCP in Practice&lt;/a&gt;, because it deserves its own treatment.&lt;/p&gt;
&lt;h2 id="5-model-routing-is-architecture-not-optimization"&gt;5. Model routing is architecture, not optimization
&lt;/h2&gt;&lt;p&gt;Another lesson I came away with: not every role deserves the same model.&lt;/p&gt;
&lt;p&gt;Architecture work is more expensive and more consequential than documentation cleanup. Review often needs stronger reasoning than linting. Mechanical work should not burn premium tokens if a cheaper tier can do it reliably.&lt;/p&gt;
&lt;p&gt;CueMarshal&amp;rsquo;s tiered routing reflects that reality:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;heavy reasoning for architecture&lt;/li&gt;
&lt;li&gt;balanced capability for implementation, review, testing, and DevOps&lt;/li&gt;
&lt;li&gt;lighter-weight models for docs and linting&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That is not just a cost decision. It is part of how the system stays sustainable.&lt;/p&gt;
&lt;p&gt;Too many agent systems treat model choice as an afterthought. I think it belongs in the design doc.&lt;/p&gt;
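&lt;p&gt;The routing table itself can be almost boring, which is the point. A sketch of the tiering described above, with placeholder model identifiers rather than real ones:&lt;/p&gt;

```typescript
// Illustrative tier routing; role-to-tier mirrors the list above, but the
// model identifiers are placeholders, not real model names.
type Tier = "heavy" | "balanced" | "light";

const tierForRole: { [role: string]: Tier } = {
  architecture: "heavy",
  implementation: "balanced",
  review: "balanced",
  testing: "balanced",
  devops: "balanced",
  docs: "light",
  linting: "light",
};

function modelFor(role: string): string {
  const tier = tierForRole[role] ?? "balanced"; // sensible default for unknown roles
  const models: { [t in Tier]: string } = {
    heavy: "model-heavy-placeholder",
    balanced: "model-balanced-placeholder",
    light: "model-light-placeholder",
  };
  return models[tier];
}
```

&lt;p&gt;Putting this table in the design doc, rather than hard-coding one model everywhere, is what makes the cost structure a deliberate decision.&lt;/p&gt;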
&lt;h2 id="6-closed-loops-beat-hero-agents"&gt;6. Closed loops beat hero agents
&lt;/h2&gt;&lt;p&gt;The more I build these systems, the less I believe in the &amp;ldquo;super-agent&amp;rdquo; story.&lt;/p&gt;
&lt;p&gt;What works better is a closed loop:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;detect work&lt;/li&gt;
&lt;li&gt;route it clearly&lt;/li&gt;
&lt;li&gt;execute with constrained roles&lt;/li&gt;
&lt;li&gt;review it&lt;/li&gt;
&lt;li&gt;merge it with human control&lt;/li&gt;
&lt;li&gt;feed the next signal back into the system&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;CueMarshal&amp;rsquo;s self-improvement workflow made this even clearer to me. Once SonarQube findings, scanners, issues, PRs, and agent roles all started participating in the same loop, the system became more useful than any single agent inside it.&lt;/p&gt;
&lt;p&gt;That is why I think orchestration matters more than agent count.&lt;/p&gt;
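&lt;p&gt;The loop above can be sketched as explicit states, where the last transition feeds back into the first. The state names are illustrative; in the real system each step corresponds to Git-native events:&lt;/p&gt;

```typescript
// The six-step loop as explicit states. It is closed because "feedback"
// routes back to "detected" rather than terminating.
type LoopState =
  | "detected"
  | "routed"
  | "executing"
  | "in-review"
  | "merged"
  | "feedback";

const next: { [s in LoopState]: LoopState } = {
  detected: "routed",
  routed: "executing",
  executing: "in-review",
  "in-review": "merged",   // merge stays under human control
  merged: "feedback",
  feedback: "detected",    // the output becomes the next input
};

function advance(state: LoopState): LoopState {
  return next[state];
}
```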
&lt;h2 id="my-current-takeaway"&gt;My current takeaway
&lt;/h2&gt;&lt;p&gt;If you are building a multi-agent system, start with these questions:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;What roles genuinely need to be different?&lt;/li&gt;
&lt;li&gt;What permissions should each role have?&lt;/li&gt;
&lt;li&gt;Where does coordination state live?&lt;/li&gt;
&lt;li&gt;How are actions attributed?&lt;/li&gt;
&lt;li&gt;What is the closed loop that turns outputs into the next inputs?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If you cannot answer those, adding more agents will mostly add more noise.&lt;/p&gt;
&lt;p&gt;If you can answer them, the number of agents becomes much less important than the quality of the structure around them.&lt;/p&gt;
&lt;p&gt;That has been the real lesson for me. The point of the orchestra is not to have more instruments. The point is to make the handoffs musical instead of chaotic.&lt;/p&gt;
&lt;p&gt;If you want the adjacent pieces, &lt;a class="link" href="https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/" &gt;Why I Started Building My Own DevOps Platform&lt;/a&gt; covers the motivation, and &lt;a class="link" href="https://www.chingono.com/blog/2026/03/05/how-i-run-sonarqube-in-my-own-ci-pipeline-and-let-ai-fix-what-it-finds/" &gt;How I Run SonarQube in My Own CI Pipeline (And Let AI Fix What It Finds)&lt;/a&gt; shows what this architecture looks like when the feedback loop closes on itself.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/architecture/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal architecture overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/agents/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal agent profiles&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/conductor/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal conductor overview&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>The DORA Report Was Right: IDPs Improve Team Productivity by 10% — Here's How I've Seen It</title><link>https://www.chingono.com/blog/2025/04/10/the-dora-report-was-right-idps-improve-team-productivity-by-10-percent-heres-how-ive-seen-it/</link><pubDate>Thu, 10 Apr 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/04/10/the-dora-report-was-right-idps-improve-team-productivity-by-10-percent-heres-how-ive-seen-it/</guid><description>&lt;p&gt;When the DORA research started surfacing stronger evidence around internal developer platforms, the headline did not surprise me nearly as much as the reactions did.&lt;/p&gt;
&lt;p&gt;Some people still hear &amp;ldquo;platform engineering&amp;rdquo; and imagine more process, more gates, and another internal team inventing obstacles. That risk is real. But it is only one version of the story.&lt;/p&gt;
&lt;p&gt;The version I have seen in practice is much simpler: when you reduce cognitive load, standardize the boring parts well, and make the safe path the easy path, teams move faster.&lt;/p&gt;
&lt;p&gt;That is why the widely shared DORA finding about internal developer platforms improving team performance felt directionally right to me. I have seen that pattern from multiple angles: CI/CD modernization that improved project velocity, reusable delivery templates that removed duplication, feature-flag and configuration patterns that reduced rollout risk, and platform standards that made decision-making less expensive for product teams.&lt;/p&gt;
&lt;p&gt;I would not claim every platform effort automatically produces a neat percentage uplift. But I do believe the mechanism is real.&lt;/p&gt;
&lt;h2 id="what-platform-engineering-is-actually-buying-you"&gt;What platform engineering is actually buying you
&lt;/h2&gt;&lt;p&gt;At its best, an internal developer platform is not a control tower. It is a &lt;strong&gt;friction reducer&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;It helps teams spend less time answering questions like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;How should we structure this pipeline?&lt;/li&gt;
&lt;li&gt;Which security checks are required?&lt;/li&gt;
&lt;li&gt;What is the approved deployment pattern?&lt;/li&gt;
&lt;li&gt;How do we manage runtime configuration safely?&lt;/li&gt;
&lt;li&gt;How do we release gradually without gambling in production?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If every team answers those questions from scratch, you get inconsistency, duplicated effort, and unnecessary risk. If the platform team answers them once, clearly, and with enough flexibility, you get leverage.&lt;/p&gt;
&lt;p&gt;That leverage is where the productivity gain comes from.&lt;/p&gt;
&lt;p&gt;Not from a portal.
Not from a dashboard.
Not from a maturity model.&lt;/p&gt;
&lt;p&gt;From fewer repeated decisions.&lt;/p&gt;
&lt;h2 id="where-i-have-seen-the-gains-show-up"&gt;Where I have seen the gains show up
&lt;/h2&gt;&lt;p&gt;One of the more durable lessons in my career is that developer productivity is usually downstream of environment design.&lt;/p&gt;
&lt;p&gt;At VCA Software, the measurable win was CI/CD modernization. We saw delivery speed improve because the path from code to release became more repeatable and less person-dependent. That did not happen because engineers suddenly became more talented. It happened because the delivery system stopped making them re-solve the same operational problems over and over.&lt;/p&gt;
&lt;p&gt;In more platform-oriented work, I have seen the same principle show up differently:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;a reusable feature-flag framework that makes progressive delivery safer&lt;/li&gt;
&lt;li&gt;standardized pipeline templates that reduce copy-paste infrastructure&lt;/li&gt;
&lt;li&gt;better observability defaults so teams are not blind after deployment&lt;/li&gt;
&lt;li&gt;configuration-management patterns that reduce drift and remove manual setup&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;That last point is one reason I still like the pattern I wrote about in &lt;a class="link" href="https://www.chingono.com/blog/2025/01/10/conditionally-deploying-resources-azure-app-configuration-using-deployment-scripts/" &gt;Conditionally Deploying Resources in Azure App Configuration Using Deployment Scripts&lt;/a&gt;. It is not glamorous, but it is exactly the kind of operational sharp edge a good platform should smooth out.&lt;/p&gt;
&lt;h2 id="the-dora-nuance-matters-too"&gt;The DORA nuance matters too
&lt;/h2&gt;&lt;p&gt;What I appreciate about the DORA research is that it does not treat platform engineering as universally positive in every implementation.&lt;/p&gt;
&lt;p&gt;That matches my experience.&lt;/p&gt;
&lt;p&gt;A platform helps when it gives teams &lt;strong&gt;self-service with sensible defaults&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;A platform hurts when it becomes:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;mandatory ceremony&lt;/li&gt;
&lt;li&gt;an opaque ticket queue&lt;/li&gt;
&lt;li&gt;a rigid abstraction over real team needs&lt;/li&gt;
&lt;li&gt;a place where local context goes to die&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is where some platform efforts go sideways. They optimize for governance theater instead of developer flow. Then leaders conclude that platform engineering is slow, when the real problem is that the platform is not being run like a product.&lt;/p&gt;
&lt;h2 id="what-good-platform-teams-do-differently"&gt;What good platform teams do differently
&lt;/h2&gt;&lt;p&gt;The best platform work I have seen has a few traits in common.&lt;/p&gt;
&lt;h3 id="1-it-starts-with-repeated-pain-not-abstract-ambition"&gt;1. It starts with repeated pain, not abstract ambition
&lt;/h3&gt;&lt;p&gt;Good platform teams do not begin with &amp;ldquo;let&amp;rsquo;s build an internal developer portal.&amp;rdquo; They begin with &amp;ldquo;teams keep tripping over the same deployment, security, configuration, or release problems.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;That difference matters because it keeps the work grounded in actual developer friction.&lt;/p&gt;
&lt;h3 id="2-it-productizes-standards"&gt;2. It productizes standards
&lt;/h3&gt;&lt;p&gt;A standard that lives in a slide deck does almost nothing.&lt;/p&gt;
&lt;p&gt;A standard that shows up as a reusable template, a safe default, a documented flow, a working example, and a supportable paved road actually changes behavior.&lt;/p&gt;
&lt;h3 id="3-it-respects-local-autonomy"&gt;3. It respects local autonomy
&lt;/h3&gt;&lt;p&gt;The best platforms do not remove all choice. They remove the expensive choices that most teams should not have to make repeatedly.&lt;/p&gt;
&lt;p&gt;That is a very different posture from central control.&lt;/p&gt;
&lt;h3 id="4-it-measures-adoption-not-just-existence"&gt;4. It measures adoption, not just existence
&lt;/h3&gt;&lt;p&gt;If nobody uses the platform voluntarily, that is feedback.&lt;/p&gt;
&lt;p&gt;Platform teams need to care about usability the same way product teams do. A paved road that engineers avoid is not a paved road.&lt;/p&gt;
&lt;h2 id="my-working-definition-now"&gt;My working definition now
&lt;/h2&gt;&lt;p&gt;I increasingly think of platform engineering as the discipline of making good engineering behavior easier to repeat.&lt;/p&gt;
&lt;p&gt;That includes technology, of course, but also language, templates, defaults, and trust.&lt;/p&gt;
&lt;p&gt;Done well, it gives teams more autonomy because it lowers the cost of doing the right thing. Done badly, it creates another dependency.&lt;/p&gt;
&lt;p&gt;That is why the DORA finding resonated with me. Not because I am attached to the term, but because I have seen the underlying dynamic up close. When teams have usable internal platforms, they make better decisions faster.&lt;/p&gt;
&lt;p&gt;That is not magic. It is what happens when you turn institutional knowledge into operable systems.&lt;/p&gt;
&lt;p&gt;If this topic interests you, the closest companion piece here is &lt;a class="link" href="https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/" &gt;Why I Started Building My Own DevOps Platform&lt;/a&gt;, which comes at the same problem from the builder side rather than the organizational one.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://dora.dev/research/2024/dora-report/" target="_blank" rel="noopener"
&gt;2024 DORA Report&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.opslevel.com/resources/tl-dr-key-takeaways-from-the-2024-google-cloud-dora-report" target="_blank" rel="noopener"
&gt;OpsLevel summary of the 2024 Google Cloud DORA Report&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://www.chingono.com/blog/2025/01/10/conditionally-deploying-resources-azure-app-configuration-using-deployment-scripts/" &gt;Conditionally Deploying Resources in Azure App Configuration Using Deployment Scripts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item><item><title>Why I Started Building My Own DevOps Platform (And What I Learned)</title><link>https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/</link><pubDate>Sat, 15 Feb 2025 09:00:00 +0000</pubDate><guid>https://www.chingono.com/blog/2025/02/15/why-i-started-building-my-own-devops-platform-and-what-i-learned/</guid><description>&lt;p&gt;For a while, I had the same reaction to most AI-for-software-delivery demos: impressive in a narrow way, but not something I would trust with real work. One tool could write code. Another could summarize a diff. Another could review a pull request. But the hard part of software delivery is rarely one isolated step. It is the handoff between steps.&lt;/p&gt;
&lt;p&gt;That was the itch that eventually pushed me to start building &lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;I did not start with the ambition to build &amp;ldquo;an AI company&amp;rdquo; or some abstract autonomous future. I started because I wanted a more coherent delivery system: one place where a task could move from idea to issue to branch to pull request to review without losing context every time responsibility changed hands.&lt;/p&gt;
&lt;h2 id="the-problem-i-actually-wanted-to-solve"&gt;The problem I actually wanted to solve
&lt;/h2&gt;&lt;p&gt;CI/CD was never the whole problem. In many teams, the pipeline is the most deterministic part of the process. The mess usually lives around it:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;the design decision that only exists in a chat thread&lt;/li&gt;
&lt;li&gt;the issue that says too little&lt;/li&gt;
&lt;li&gt;the reviewer who has to reconstruct intent from commit history&lt;/li&gt;
&lt;li&gt;the documentation that is always &amp;ldquo;we&amp;rsquo;ll do it after&amp;rdquo;&lt;/li&gt;
&lt;li&gt;the growing pile of tools that all know a little, but none of them own the workflow&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What I wanted was not another dashboard. I wanted a delivery surface that respected how engineering work already happens.&lt;/p&gt;
&lt;p&gt;That led me to a simple conviction: &lt;strong&gt;Git should be the source of truth, not just the storage layer.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If work already becomes legible through issues, branches, pull requests, labels, and reviews, then the orchestration layer should live there too. Not beside it. Not behind it. Inside it.&lt;/p&gt;
&lt;h2 id="why-i-built-it-myself"&gt;Why I built it myself
&lt;/h2&gt;&lt;p&gt;There were three constraints that mattered to me from day one.&lt;/p&gt;
&lt;p&gt;First, I wanted the system to be &lt;strong&gt;self-hosted&lt;/strong&gt;. A lot of AI tooling assumes you are comfortable sending your code, your process, and your delivery metadata into someone else&amp;rsquo;s black box. Many teams are not. I wanted an approach that made data sovereignty a feature, not an apology.&lt;/p&gt;
&lt;p&gt;Second, I wanted the system to be &lt;strong&gt;role-aware&lt;/strong&gt;. Real software delivery is not &amp;ldquo;one super-agent with a clever prompt.&amp;rdquo; Design, implementation, review, testing, DevOps, and documentation are different jobs. Sometimes one person does multiple jobs, but the jobs are still different. That distinction matters.&lt;/p&gt;
&lt;p&gt;Third, I wanted &lt;strong&gt;human control to remain the final gate&lt;/strong&gt;. I am interested in automation, not surrender. If an AI system cannot work inside a reviewable pull-request workflow, I do not think it is mature enough for serious engineering work.&lt;/p&gt;
&lt;p&gt;Those constraints eventually turned into the shape CueMarshal has now: a conductor service in TypeScript, specialized agents for architecture, development, review, testing, DevOps, docs, and linting, a Git-native workflow in Gitea, and a tool layer built around MCP so the same system can reason over structured interfaces instead of raw shell scripts and ad-hoc API calls.&lt;/p&gt;
&lt;h2 id="the-architecture-came-later-the-principles-came-first"&gt;The architecture came later. The principles came first.
&lt;/h2&gt;&lt;p&gt;Long before the implementation solidified, the design principles were already obvious to me.&lt;/p&gt;
&lt;h3 id="1-git-is-a-better-coordination-layer-than-most-agent-uis"&gt;1. Git is a better coordination layer than most agent UIs
&lt;/h3&gt;&lt;p&gt;An issue is a task. A branch is a workstream. A pull request is a proposal. A review is a decision record. A merge is a controlled state change.&lt;/p&gt;
&lt;p&gt;That sounds almost too obvious to say out loud, but it changed how I thought about the whole problem. Once I stopped treating Git as the place where code merely ends up, and started treating it as the place where engineering decisions become inspectable, the rest of the architecture got much simpler.&lt;/p&gt;
&lt;h3 id="2-specialization-beats-a-do-everything-agent"&gt;2. Specialization beats a &amp;ldquo;do everything&amp;rdquo; agent
&lt;/h3&gt;&lt;p&gt;In CueMarshal, the system is intentionally split into named roles: Marshal for orchestration, Ava for architecture, Dave for implementation, Reese for review, Tess for testing, Devin for DevOps, Dot for docs, and Linton for linting.&lt;/p&gt;
&lt;p&gt;That is not branding for its own sake. It is an operational choice.&lt;/p&gt;
&lt;p&gt;The moment one agent tries to be planner, coder, reviewer, tester, and documentarian all at once, you lose clarity. You also lose accountability. Specialization makes prompts sharper, tool permissions narrower, and outputs easier to judge.&lt;/p&gt;
&lt;h3 id="3-tool-contracts-matter-more-than-prompt-cleverness"&gt;3. Tool contracts matter more than prompt cleverness
&lt;/h3&gt;&lt;p&gt;One of the biggest lessons from building CueMarshal is that the quality of an agentic system is heavily constrained by the quality of its interfaces.&lt;/p&gt;
&lt;p&gt;If an agent is forced to improvise around loosely structured APIs, fragile shell commands, or browser automation for tasks that should be typed and validated, the system becomes harder to trust. This is one reason MCP clicked for me so quickly later on: it gave a clean shape to something I already knew was essential.&lt;/p&gt;
&lt;p&gt;Good tool contracts do not just help the model. They help the human operator understand what the system is even allowed to do.&lt;/p&gt;
&lt;h3 id="4-stateless-workers-are-a-feature-not-a-bug"&gt;4. Stateless workers are a feature, not a bug
&lt;/h3&gt;&lt;p&gt;CueMarshal&amp;rsquo;s runners are intentionally stateless. They reconstruct context from the repository, the issue, the pull request, and the tool layer every time.&lt;/p&gt;
&lt;p&gt;That may sound less magical than the &amp;ldquo;persistent AI teammate&amp;rdquo; narrative, but it is much easier to reason about. It scales better. It fails more cleanly. And it produces a better audit trail.&lt;/p&gt;
&lt;p&gt;In practice, that has made me more skeptical of systems that depend on hidden memory to feel smart.&lt;/p&gt;
&lt;h3 id="5-human-control-is-product-design"&gt;5. Human control is product design
&lt;/h3&gt;&lt;p&gt;The more I worked on this, the more convinced I became that &amp;ldquo;human in the loop&amp;rdquo; is not enough as a slogan. It has to be built into the workflow itself.&lt;/p&gt;
&lt;p&gt;That is why I prefer issue-driven execution, reviewable pull requests, typed tools, explicit handoffs, and merge control. Those are not bureaucratic constraints. They are the difference between a system that can support real engineering and a system that is only good for demos.&lt;/p&gt;
&lt;h2 id="what-i-learned-from-building-in-public"&gt;What I learned from building in public
&lt;/h2&gt;&lt;p&gt;The most useful part of this project has not been proving that agents can write code. We already knew that. The useful part has been learning where coordination breaks, where trust gets earned, and what kinds of structure make AI assistance actually usable.&lt;/p&gt;
&lt;p&gt;It also made one thing clearer for me: the next layer of software delivery is not &amp;ldquo;more CI/CD.&amp;rdquo; It is better orchestration around the work humans and machines are already doing together.&lt;/p&gt;
&lt;p&gt;That is the reason I started building CueMarshal, and it is still the reason I keep working on it.&lt;/p&gt;
&lt;p&gt;If you want the more technical follow-up, I wrote about &lt;a class="link" href="https://www.chingono.com/blog/2025/03/20/mcp-in-practice-what-anthropics-model-context-protocol-actually-means-for-developers/" &gt;what MCP actually changed for developers&lt;/a&gt; and the coordination lessons from &lt;a class="link" href="https://www.chingono.com/blog/2025/08/28/designing-multi-agent-systems-lessons-from-building-an-8-agent-engineering-orchestra/" &gt;building an eight-agent engineering orchestra&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;References:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal" target="_blank" rel="noopener"
&gt;CueMarshal repository&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/architecture/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal architecture overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a class="link" href="https://github.com/cuemarshal/cuemarshal/blob/main/docs/features/agents/overview.md" target="_blank" rel="noopener"
&gt;CueMarshal agent profiles&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</description></item></channel></rss>