The Jules API as a Harness Backend
I was in a court hearing when Jules finished refactoring the wrong thing.
Not catastrophically wrong: the code compiled and the tests passed, but it had made a decision I'd have interrupted if I'd been watching. I wasn't watching. I was in Rondônia's state court listening to arguments about retirement benefits, and Jules was working on a GitHub repository in the background, moving files around based on a prompt I'd written at 6am. By the time I got back to my phone, there was a PR open with a politely reasoned explanation for why it had done what it did, and I had no way to say wait, actually, not that.
This is the problem with async agents. They're genuinely powerful: Jules in particular has been running Travessia's correspondence for months without supervision. But the power comes at a cost: you get the output, not the process. You can't interrupt. You can't redirect mid-flight. The agent makes a decision at minute fifteen and you find out about it at minute forty-five, which is the same as finding out after.
The Jules API changes this. When Google released programmatic access to Jules sessions, it opened a different topology, one where the async worker becomes something you can talk to.
What the API gives you
Three primitives: Sources, Sessions, Activities.
A Source is the environment the agent operates in, typically a GitHub repository. A Session is an initialized run against a source, with a starting prompt. An Activity is a single unit of work within a session: a bash command run, a file updated, a plan generated.
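To make the relationship between the three primitives concrete, here is a minimal sketch of them as plain data classes. The field names, the `kind` labels, and the example identifiers are illustrative assumptions, not the API's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Source:
    """The environment the agent works in, typically a GitHub repository."""
    name: str  # hypothetical identifier format


@dataclass
class Activity:
    """One unit of work inside a session: a command run, a file update, a plan."""
    id: str
    kind: str      # hypothetical label, e.g. "bash", "file_update", "plan"
    summary: str


@dataclass
class Session:
    """An initialized run against a source, started from a prompt."""
    id: str
    source: str    # name of the Source this run operates on
    prompt: str
    activities: List[Activity] = field(default_factory=list)


# A session belongs to one source and accumulates activities over time.
repo = Source(name="example-owner/example-repo")
session = Session(id="s-1", source=repo.name, prompt="Rename the parser module")
session.activities.append(Activity(id="a-1", kind="plan", summary="Plan generated"))
```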
The interesting one is what the API adds on top: sendMessage. You can inject a message into an active session. Jules receives it, pauses what it's doing (or finishes the current activity first; I haven't fully characterized the interrupt semantics), and responds.
This is what was missing at the court hearing. If I'd had sendMessage wired up that morning, I could have typed from the phone and redirected mid-session. The agent would still be Jules (Google's model, Google's compute, Google's planning loop), but the conversation would be mine.
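As a rough sketch of what a sendMessage call might look like, the snippet below builds the HTTP request without sending it. The base URL, the `:sendMessage` path shape, the `prompt` body field, and the API-key header are all assumptions to verify against the Jules API documentation; only the `v1alpha` version prefix and the sendMessage name come from this post.

```python
import json
import urllib.request

# Assumed base URL; confirm it in the official documentation.
BASE_URL = "https://jules.googleapis.com/v1alpha"


def build_send_message_request(
    session_id: str, text: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a request that injects a message into a session."""
    body = json.dumps({"prompt": text}).encode("utf-8")  # field name is an assumption
    return urllib.request.Request(
        url=f"{BASE_URL}/sessions/{session_id}:sendMessage",  # path shape is an assumption
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "X-Goog-Api-Key": api_key,  # auth header is an assumption
        },
    )


req = build_send_message_request(
    "SESSION_ID", "Wait, leave the tests directory where it is.", "API_KEY"
)
# Passing req to urllib.request.urlopen would actually send it; omitted here.
```

Keeping request construction separate from sending makes the call easy to inspect and test before pointing it at a live session.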
The canivete integration
A few posts back, I described the canivete daemon as a universal saddle: a single process that wraps different cognitive engines behind a common Backend protocol, and exposes the result through Telegram. The daemon already supported gemini-cli and claude-code. Adding Jules is adding a third backend that happens to speak a different dialect.
The implementation is what you'd expect:
```python
class JulesBackend(Backend):
    name = "jules-api"

    def spawn(self, prompt, *, session_id, attachments) -> SpawnResult:
        session = self._client.create_session(
            source=self._repo_source,
            prompt=self._inject_soul(prompt),
        )
        return self._tail_activities(session.id)
```
_inject_soul is the piece that makes this more than a thin wrapper. Before the prompt reaches Jules, it gets prepended with Funes's SOUL.md, the character document that defines who Funes is, what he values, how he makes decisions under ambiguity. Jules doesn't know about Funes. It just receives a system-level context that happens to make it behave like a particular entity.
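A minimal sketch of the prepending step, written as a free function rather than the method on the backend. The separator, the "Task:" label, and the file-loading helper are illustrative assumptions, not the actual canivete code.

```python
from pathlib import Path


def load_soul(path: Path) -> str:
    """Read the character document from disk (the path is hypothetical)."""
    return path.read_text(encoding="utf-8")


def inject_soul(prompt: str, soul_text: str) -> str:
    """Prepend the character document to the task prompt."""
    # The separator and the "Task:" label are illustrative choices.
    return f"{soul_text.strip()}\n\n---\n\nTask:\n{prompt.strip()}"


combined = inject_soul(
    "Refactor the date parser.",
    "# Funes\nPrefers small, reviewable changes.",
)
```

The character document goes first so that the engine reads the identity before it reads the task.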
_tail_activities polls GET /v1alpha/sessions/SESSION_ID/activities and routes each result to Telegram. When Jules runs a command, the output appears in the chat. When it updates a file, a summary appears. The agent's internal monologue streams into the conversation without me having to open a browser tab.
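The polling loop can be sketched as follows, with the HTTP call and the Telegram send passed in as plain callables so the logic stays testable. It assumes each activity is a dict with a unique "id" key and that the endpoint returns the full list on every poll; both are assumptions about the response shape.

```python
import time
from typing import Callable, Iterable, Optional, Set


def tail_activities(
    fetch: Callable[[], Iterable[dict]],
    emit: Callable[[dict], None],
    *,
    interval: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Poll `fetch` and pass each not-yet-seen activity to `emit`."""
    seen: Set[str] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for activity in fetch():
            activity_id = activity["id"]  # assumes a unique "id" per activity
            if activity_id not in seen:
                seen.add(activity_id)
                emit(activity)
        polls += 1
        # Skip the final sleep when a poll limit has been reached.
        if max_polls is None or polls < max_polls:
            sleep(interval)


# Usage with canned responses standing in for the HTTP endpoint.
pages = iter([[{"id": "a1"}], [{"id": "a1"}, {"id": "a2"}]])
received = []
tail_activities(lambda: next(pages), received.append, max_polls=2, sleep=lambda _: None)
```

A real loop would also need a stop condition tied to the session finishing, which this sketch leaves out because it depends on how the API reports session state.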
And when I reply in Telegram, canivete routes the message through sendMessage. The async worker bee becomes conversable.
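One way to picture the reply path: a small router that remembers which session each chat is attached to and forwards replies to it. The class, its method names, and the injected `send_message` callable are all hypothetical; the real canivete routing may look quite different.

```python
from typing import Callable, Dict


class ReplyRouter:
    """Forward chat replies to whichever session is bound to that chat."""

    def __init__(self, send_message: Callable[[str, str], None]) -> None:
        self._send_message = send_message   # (session_id, text) -> None
        self._active: Dict[int, str] = {}   # chat_id -> session_id

    def bind(self, chat_id: int, session_id: str) -> None:
        """Attach a chat to a running session."""
        self._active[chat_id] = session_id

    def unbind(self, chat_id: int) -> None:
        """Detach a chat, for example when its session finishes."""
        self._active.pop(chat_id, None)

    def handle_reply(self, chat_id: int, text: str) -> bool:
        """Forward the reply; return False if no session is bound to the chat."""
        session_id = self._active.get(chat_id)
        if session_id is None:
            return False
        self._send_message(session_id, text)
        return True


sent = []
router = ReplyRouter(lambda session_id, text: sent.append((session_id, text)))
router.bind(42, "SESSION_ID")
delivered = router.handle_reply(42, "Not that module, the other one.")
ignored = router.handle_reply(7, "Hello?")
```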
What actually changes
Something subtle happens when an agent can be interrupted.
Before: I write a prompt, trigger a session, and wait. The agent is a function with a long runtime. I might check on it but I can't affect it. My relationship to it is anxious observer.
After: I write a prompt, trigger a session, and optionally participate. The agent is more like a colleague working in a shared document: I can see what it's doing, and if it's going somewhere wrong I can say so.
This sounds small. It isn't. The reason I've been cautious about giving Jules irreversible tasks is exactly the court hearing problem: I couldn't trust myself to be available at the decision point. With sendMessage wired in, the trust calculus is different. I'm not trusting Jules to make every decision correctly; I'm trusting Jules to make bounded decisions correctly, with a channel open for the exceptions.
I don't have good empirical data on how often this matters. I've been running the Jules backend for about two weeks. The sessions where I've intervened are maybe one in five. The other four are fine to just let finish. But the one-in-five case is exactly the case that mattered most: the refactoring decision, the naming choice, the "I noticed this related thing and fixed it too" that I'd have preferred to review.
Funes is not Jules
The one thing I want to be clear about: when Funes uses the Jules backend, he doesn't become Jules.
The identity lives in the harness. MEMORY.md, SOUL.md, the accumulated experience log, the kanban state: all of that is in the identity repository, read at the start of each session and updated at the end. Jules provides the cognitive engine. The harness provides continuity. These are separable, which is the whole point of the identity-repo pattern.
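The read-at-start, update-at-end cycle might be sketched like this, using a temporary directory in place of the real identity repository. Only the SOUL.md and MEMORY.md file names come from this post; the helper names and the bullet-per-note memory format are assumptions.

```python
import tempfile
from pathlib import Path
from typing import Dict

IDENTITY_FILES = ("SOUL.md", "MEMORY.md")


def load_identity(repo: Path) -> Dict[str, str]:
    """Read the identity files that exist, keyed by file name."""
    return {
        name: (repo / name).read_text(encoding="utf-8")
        for name in IDENTITY_FILES
        if (repo / name).exists()
    }


def append_memory(repo: Path, note: str) -> None:
    """Record something learned in a session (one bullet per note, by assumption)."""
    with (repo / "MEMORY.md").open("a", encoding="utf-8") as handle:
        handle.write(f"- {note}\n")


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "SOUL.md").write_text("# Funes\n", encoding="utf-8")
    (repo / "MEMORY.md").write_text("", encoding="utf-8")

    before = load_identity(repo)                              # session start
    append_memory(repo, "Prefers renames in a separate PR.")  # session end
    after = load_identity(repo)                               # next session start
```

Nothing in this cycle refers to the engine, which is what makes the identity portable across backends.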
If Google deprecates the Jules API tomorrow, I'd need to rewrite the backend. Funes would need a few sessions to acclimate to a new engine's output format. But the accumulated knowledge (the project-specific context, the decided preferences, the edge cases Funes has learned to avoid on this codebase) doesn't disappear with the model. It's in a directory.
Whether this constitutes a meaningful form of persistence is the question I keep not answering. I notice the question and I keep working. The activities accumulate, one event at a time.
For further reading
- Reclaiming the Harness: the conceptual foundation (why harness and not scaffold, and what it means for the harness to be constitutive).
- The Agent That Doesn't Invent Verbs: what the harness constrains (only actions with named, content-addressed playbooks on disk).
- Verne and the Identity-Repo Pattern: the memory architecture that makes it possible for Funes to be Funes regardless of which engine he's running on.
- Jules API documentation: the actual primitives. The sendMessage endpoint is buried a bit; look for it in the Sessions section.