Context And Memory

A model is only as useful as what it knows when it answers. Vendor-hosted agents solve this by accumulating opaque context in their own servers — a memory layer the user neither sees nor owns. Stallari takes a different posture: every piece of context an agent sees on a task is assembled deliberately, from sources the user can name, with provenance and freshness recorded. Memory is a user-owned substrate, not a vendor feature.

Context Is Assembled, Not Improvised

When Stallari dispatches a workflow, it does not hand a model unbounded vault access. It builds a typed context packet for that specific job. The packet declares what the work needs and the assembler fills it from permitted sources.

Field	Filled by
Work object	The thing the workflow is acting on — an inbox item, a note, a thread, a request.
Sources	A typed list of vault notes, structured bundles, files, queries — each with provenance.
Permissions	The tool surface this skill is allowed to call during the run.
Action constraints	Risk class, reversibility, human-in-the-loop (HITL) gates, max-files-outside-scope.

A skill that runs over an inbox classification job does not see the user’s encrypted credential store. A skill that drafts a reply to a thread sees that thread and the relevant correspondence notes, not the entire mail archive. The packet is the bounded surface for the run.
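A minimal sketch of the packet as a typed structure, mirroring the four fields in the table above. The class names, field names, and tool names (`ContextPacket`, `mail.read_thread`, `credentials.read`) are hypothetical illustrations, not Stallari's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    kind: str        # e.g. "vault_note", "bundle", "file", "query"
    ref: str         # path or identifier of the source
    provenance: str  # where the item came from


@dataclass(frozen=True)
class ActionConstraints:
    risk_class: str
    reversible: bool
    hitl_gate: bool               # require human approval before acting
    max_files_outside_scope: int


@dataclass(frozen=True)
class ContextPacket:
    work_object: str
    sources: tuple[Source, ...]
    permissions: frozenset[str]   # tool names the skill may call
    constraints: ActionConstraints

    def may_call(self, tool: str) -> bool:
        """A tool is callable only if the packet lists it."""
        return tool in self.permissions


# Hypothetical packet for a reply-drafting skill.
packet = ContextPacket(
    work_object="thread:example-123",
    sources=(Source("vault_note", "Correspondence/example.md", "vault index"),),
    permissions=frozenset({"mail.read_thread", "mail.draft_reply"}),
    constraints=ActionConstraints("low", True, True, 0),
)
```

In this sketch the packet is frozen, so a skill cannot widen its own permissions mid-run; `packet.may_call("credentials.read")` is false because that tool was never listed.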

This is what Context Assembly Contracts enforce. Each context type has a schema; each consumer reads against the schema; drift is detectable. When a substrate the packet depends on changes — a vault note is renamed, a scope tag is added — the assembler reports the drift rather than silently degrading.
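One way to picture drift detection is a contract that maps field names to expected types, checked against what the assembler actually produced. The contract below and its field names are assumptions made for illustration; the real contract format is not described here.

```python
# Hypothetical contract for one context type: field name -> expected type.
THREAD_REPLY_CONTRACT = {
    "thread_id": str,
    "correspondence_notes": list,
    "scope_tags": list,
}


def detect_drift(contract: dict, payload: dict) -> list[str]:
    """Return human-readable drift reports; an empty list means no drift."""
    reports = []
    for name, expected in contract.items():
        if name not in payload:
            reports.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            actual = type(payload[name]).__name__
            reports.append(
                f"type changed: {name} expected {expected.__name__}, got {actual}"
            )
    # Fields the contract does not know about are also drift.
    for name in payload.keys() - contract.keys():
        reports.append(f"unexpected field: {name}")
    return sorted(reports)


# A payload whose substrate changed underneath the contract.
payload = {"thread_id": "t-1", "correspondence_notes": "notes.md", "labels": []}
drift = detect_drift(THREAD_REPLY_CONTRACT, payload)
```

The function returns reports instead of raising or quietly dropping fields, which matches the idea that drift is surfaced, not absorbed.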

Vault Knowledge As Evidence Bundles

The vault is the user’s primary source of truth: notes, ledgers, decision records, correspondence, briefings. But agents do not consume the vault by reading every file. They consume structured evidence bundles assembled over the vault’s index.

Bundle assembly is governed: the assembler walks the index, applies scope filters, picks the most relevant notes, and emits a typed bundle with each item carrying provenance (which note, last modified when, indexed at what watermark, in which scope). A skill that needs “what does the user know about the Sandy Bay sale” receives a bundle of notes with explicit answers to that question, not a search-result blob the model has to interpret.

Bundles also report omissions. If the assembler skipped notes because they fell outside the scope tag, that is recorded in the bundle metadata. The skill knows the bundle is filtered. The user can inspect the same metadata after the fact to understand why an agent missed something.
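The assembly steps above (scope filter, relevance pick, provenance, recorded omissions) can be sketched as follows. The `IndexedNote` fields, the relevance score, and the note paths are hypothetical; the sketch only shows the shape of a bundle that knows what it left out.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedNote:
    path: str
    scope: str
    modified: str      # ISO 8601 timestamp of last modification
    watermark: int     # index watermark at which the note was indexed
    relevance: float   # hypothetical score supplied by the index


@dataclass
class EvidenceBundle:
    items: list = field(default_factory=list)
    omitted: list = field(default_factory=list)  # (path, reason) pairs


def assemble_bundle(index, allowed_scopes, limit=3):
    """Filter by scope, keep the top `limit` notes, and record every omission."""
    bundle = EvidenceBundle()
    in_scope = []
    for note in index:
        if note.scope in allowed_scopes:
            in_scope.append(note)
        else:
            bundle.omitted.append((note.path, f"outside scope: {note.scope}"))
    ranked = sorted(in_scope, key=lambda n: n.relevance, reverse=True)
    bundle.items = ranked[:limit]
    for note in ranked[limit:]:
        bundle.omitted.append((note.path, "below relevance cutoff"))
    return bundle


# Illustrative index entries; paths, dates, and scores are made up.
index = [
    IndexedNote("Property/sandy-bay-offer.md", "property", "2024-05-01T10:00:00Z", 41, 0.9),
    IndexedNote("Health/checkup.md", "health", "2024-04-20T09:00:00Z", 40, 0.7),
    IndexedNote("Property/agent-call.md", "property", "2024-05-02T15:30:00Z", 42, 0.6),
]
bundle = assemble_bundle(index, allowed_scopes={"property"}, limit=1)
```

Here `bundle.omitted` holds both the out-of-scope note and the in-scope note that lost on relevance, so a later reader can tell the two reasons apart.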

Local-Corpus Indexes

Vault content is not the only material an agent might need. Mail, notes, calendar, and reminders live in their native stores on the user’s disk. Stallari reads them through local-corpus blades (a blade is a Groupthink Dev. MCP plugin) that own a shared indexing substrate.

The substrate keeps a watermark per source: the last successful index point. When the user creates new mail, the blade detects it, indexes it, and advances the watermark. A skill that searches the mail index sees only what has been indexed up to the latest watermark — never partial data, never silently stale. Reindex jobs respect power state (mains-only by default) and integrity invariants, so the user’s laptop does not melt every time a notes database changes.
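A toy version of a per-source watermark, assuming items carry increasing sequence numbers. The class and its methods are hypothetical; the point is that a batch is published whole, the watermark moves only after the batch is committed, and reindexing is skipped off mains power.

```python
class SourceIndex:
    """Toy index for one source; items become visible only after commit."""

    def __init__(self):
        self.watermark = 0      # sequence number of the last indexed item
        self._committed = {}    # seq -> text, visible to searches

    def reindex(self, new_items, on_mains_power=True):
        """Index a batch as a unit; returns True if the watermark advanced."""
        if not on_mains_power or not new_items:
            return False
        staged = dict(new_items)            # build fully before publishing
        if min(staged) <= self.watermark:
            raise ValueError("batch overlaps already-indexed items")
        self._committed.update(staged)
        self.watermark = max(staged)
        return True

    def search(self, term):
        """Return sequence numbers of committed items containing the term."""
        return sorted(
            seq for seq, text in self._committed.items()
            if term.lower() in text.lower()
        )


mail = SourceIndex()
first = mail.reindex({1: "Invoice from plumber", 2: "Lunch on Friday"})
second = mail.reindex({3: "Second invoice"}, on_mains_power=False)
```

After these two calls the watermark is still 2, and a search for "invoice" returns only item 1: the third message exists on disk but is not yet part of what a skill can see.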

A skill that needs to know whether the user has anything recent on a topic asks the local-corpus index. The blade answers from the user’s own disk. The mail body never crosses a network boundary unless the workflow explicitly routes it to a cloud provider, and even then the routing is governed by policy.

User-Owned Memory

Memory in Stallari is durable, vault-anchored, and the user’s property. Every memory has a content body, a tier (short / mid / durable), a domain tag, and a decay model. The runtime can recall a memory by domain, by recency, or by velocity. Hebbian associations form between memories that recur together, so the recall surface biases toward currently-relevant clusters.
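The Hebbian idea ("memories that recur together wire together") can be sketched as a weighted graph whose edges strengthen on each co-recall. The class below, the fixed increment, and the memory identifiers are assumptions for illustration, not the runtime's actual association model.

```python
from collections import defaultdict
from itertools import combinations


class AssociationGraph:
    def __init__(self):
        self.weights = defaultdict(float)   # frozenset({a, b}) -> strength

    def co_recall(self, memory_ids, increment=1.0):
        """Strengthen the link between every pair recalled together."""
        for a, b in combinations(sorted(set(memory_ids)), 2):
            self.weights[frozenset((a, b))] += increment

    def neighbours(self, memory_id):
        """Associated memories, strongest first (ties broken by name)."""
        linked = []
        for pair, weight in self.weights.items():
            if memory_id in pair:
                (other,) = pair - {memory_id}
                linked.append((other, weight))
        return sorted(linked, key=lambda item: (-item[1], item[0]))


graph = AssociationGraph()
graph.co_recall(["sale", "solicitor", "deposit"])
graph.co_recall(["sale", "solicitor"])
related = graph.neighbours("sale")
```

Because "sale" and "solicitor" were recalled together twice, that link outranks the single co-occurrence with "deposit", which is the bias toward currently relevant clusters described above.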

Other platforms ship “memory” as a vendor-managed cache the user can clear but often cannot inspect. Stallari ships memory as a substrate the user owns:

The memory store lives on disk, encrypted with the user’s key.
Every memory has a source agent, a timestamp, an importance score, and a note path linking back to a vault note when one exists.
Inspection is first-party: the Insights pane in the app surfaces the memory store as a queryable, editable record.
Decay is honest. Memories that have not been recalled fade in importance over time; rehearsal restores them. This is what makes the store small enough to be useful and large enough to be informative.
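A sketch of a memory record with decay and rehearsal. The exponential half-life form and the per-tier half-life values are assumptions chosen for illustration; the text above names the tiers and the behaviour (fade when not recalled, restore on rehearsal) but not the formula.

```python
from dataclasses import dataclass

# Hypothetical half-lives in days per tier; the source names the tiers only.
HALF_LIFE_DAYS = {"short": 1.0, "mid": 14.0, "durable": 180.0}


@dataclass
class Memory:
    body: str
    tier: str
    domain: str
    source_agent: str
    importance: float          # score at the time of last recall
    last_recalled_day: float   # days since an arbitrary epoch

    def current_importance(self, today: float) -> float:
        """Halve the score for every half-life elapsed since last recall."""
        elapsed = max(0.0, today - self.last_recalled_day)
        return self.importance * 0.5 ** (elapsed / HALF_LIFE_DAYS[self.tier])

    def rehearse(self, today: float, restored: float = 1.0) -> None:
        """Recalling a memory restores its score and resets the decay clock."""
        self.importance = restored
        self.last_recalled_day = today


m = Memory("Buyer prefers email", "mid", "property", "inbox-triage", 1.0, 0.0)
faded = m.current_importance(28.0)   # two half-lives for the "mid" tier
m.rehearse(28.0)
restored = m.current_importance(28.0)
```

Under these assumed numbers a mid-tier memory left alone for 28 days drops to a quarter of its score, and a single recall brings it back to full strength.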

Memory is not magic; it is just a substrate. The agent does not “remember” because the platform decided to remember on its behalf. It remembers because a skill stored a memory, the store survived restart, and the recall surface returned it on the next relevant query.

Inference As A Routed Resource

Context assembly decides what an agent sees. Routing decides which model sees it. Some tasks belong on a local model: short-context classification, lightweight extraction, language tagging. Apple Foundation Models, the on-device LLM that ships with recent macOS releases, handles these without a cloud round-trip. For longer-context work, the shared MLX inference service runs heavier on-device models (Nemotron, Qwen, Mistral, Gemma, Phi), batched across instances on the same Mac, and can also run on the user’s other private inference infrastructure.

Cloud providers (Anthropic, OpenAI, xAI) handle work that genuinely needs them. The routing decision is policy, not a hard-coded preference, and the policy is visible to the user in Settings. A user who wants no cloud calls can configure that, and Stallari reports when a workflow declines to run because no eligible local model is available.
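Routing as policy can be sketched as a pure function from task size and user settings to a decision with a stated reason. The token limits, field names, and target labels below are hypothetical; real routing presumably weighs more than context length.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoutingPolicy:
    allow_cloud: bool
    local_context_limit: int   # hypothetical token limit for the small on-device model
    mlx_context_limit: int     # hypothetical token limit for MLX-served models


@dataclass(frozen=True)
class RouteDecision:
    target: str                # "local", "mlx", "cloud", or "declined"
    reason: str


def route(context_tokens: int, policy: RoutingPolicy,
          mlx_available: bool = True) -> RouteDecision:
    """Prefer on-device targets; use cloud only if the policy permits it."""
    if context_tokens <= policy.local_context_limit:
        return RouteDecision("local", "fits the on-device model")
    if mlx_available and context_tokens <= policy.mlx_context_limit:
        return RouteDecision("mlx", "fits a heavier on-device model")
    if policy.allow_cloud:
        return RouteDecision("cloud", "no eligible local model")
    return RouteDecision("declined", "no eligible local model and cloud is disabled")


no_cloud = RoutingPolicy(allow_cloud=False, local_context_limit=4_000,
                         mlx_context_limit=32_000)
decision = route(100_000, no_cloud)
```

With cloud disabled, an oversized task yields a `declined` decision with its reason attached, so the refusal can be reported to the user instead of failing silently.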

The point of routing is not maximum model strength. The point is that the user knows where their data went on each task, before the task ran.

Freshness And Drift

A context packet that was true an hour ago may be stale now. Stallari treats freshness as a first-class field: every bundle carries a freshness watermark, every index advances on writes, every skill that depends on a substrate reports drift when the substrate changes.

This matters because the failure mode of agentic work is not usually “the model hallucinated”. It is “the model worked with stale context and reached a confident wrong conclusion”. Stallari’s context assembly is designed to make staleness visible: the assembler reports it, the consumer reads it, and the user sees it on the Activity record after the fact.
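A minimal staleness check, assuming each bundle records the watermark of every source at assembly time and that watermarks are integers that only increase. The function name and report strings are illustrative.

```python
def staleness_report(bundle_watermarks: dict, current_watermarks: dict) -> dict:
    """Map each stale source to how far its index has moved since assembly."""
    report = {}
    for source, assembled_at in bundle_watermarks.items():
        current = current_watermarks.get(source)
        if current is None:
            report[source] = "source no longer indexed"
        elif current > assembled_at:
            report[source] = f"index advanced by {current - assembled_at}"
    return report


# Watermarks captured when the bundle was built vs. the indexes now.
stale = staleness_report(
    {"mail": 10, "notes": 5, "calendar": 7},
    {"mail": 12, "notes": 5},
)
```

An empty report means the packet is as fresh as the indexes it was built from; anything else is a reason to reassemble before acting, or at least to show the gap on the Activity record.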

Agency model — how typed primitives use context packets.
Scope and ACL — how scope tags filter what context an agent sees.
Local vs cloud — where inference runs and what crosses which boundary.
Legibility and continuity — how the memory store and audit records are inspectable.