Memory Integration

To upvote this issue, give it a thumbs up. See [this list](https://github.com/editor-code-assistant/eca/issues?q=is%3Aissue+is%3Aopen+sort%3Areactions-%2B1-desc) for the most upvoted issues.

## Memory
I think one of the biggest features missing from eca is memory integration. Hermes agent, for example, has a memory plugin system that allows adding different memory providers.

I think these are the most notable memory systems that are open source and can run locally:
- [Hindsight](https://github.com/vectorize-io/hindsight)
- [Honcho](https://github.com/plastic-labs/honcho)
- [OpenViking](https://github.com/volcengine/OpenViking)

Out of these, Hindsight stands out to me for these reasons:
- [Hindsight is #1 on BEAM](https://hindsight.vectorize.io/blog/2026/04/02/beam-sota) (a much more difficult benchmark than older ones like locomo and longmemeval)
- hindsight's website is mostly documentation and only has a small link to their cloud option vs. honcho where it's not even immediately obvious there is a non-paid option (also not clear to me if there is any functionality only in the paid option)
- hindsight is extremely easy to setup locally, just run `uvx hindsight-embed@latest` (no complex installation or docker setup)
- nice dashboard

## Memory Best Practices (especially for Hindsight)
I don't have time at the moment to do this for myself (might eventually), but if someone else decides they want to add memory support, I can provide some information on best practices because hermes' implementation is currently flawed.

- Should support ambient store/recall and manual tools for store/recall (it should be configurable which are enabled)
- Ambient store should automatically store every turn in the memory system. Ideally it is configurable when the LLM will actually run (e.g. every n turns or only on session switch/close)
- Ambient recall should be ephemerally injected just before sending the user message to the LLM. It should be given the *current turn*'s user message (up to some char limit). Hermes currently prefetches based on the previous turn's user message, but this can cause irrelevant memories to surface and potentially confuse the agent. Hindsight's recall is low latency compared to LLM API calls, so I do not believe there is a use case for this
- Forking and resuming are edge cases that need to be taken into account immediately. ECA's rollback might be something to consider how to interact with.

Hindsight specific:
- [Hindsight has a best practices guide](https://hindsight.vectorize.io/best-practices#retaining-data)
- Most notable best practices:
  - Don't pre-summarize
  - `context` field should be set to describe the nature and source of the content. The way I'm doing this in pi is by having a configurable prefix and a truncated session title, which will be the manually set title or part of the first message, e.g. "pi: research hindsight"
  - Sessions should use their stable session id as the `document_id`. Hindsight now has an `update_mode=append` that can be used to add the latest turn to an existing document.
  - `timestamp` - can be session start time
- `tags` - used for filtering at recall time; For automatically retained information, I'm currently tagging with `parent:<parent id>`, `session:<session id>`, `harness:pi` (user configurable constants tag), `cwd:<cwd>`
- Forking/resume can be handled if session id is unique for forks by only appending new turns to the document with that session id (if you retain full document of a fork, you will get duplicate information stored)

I have a pi plugin for hindsight that also:
- After every message queues to disk and then sends them as json strings to hindsight (currently hindsight doesn't have a way to delay processing; queueing to disk is also useful if hindsight is down, as can flush later)
- Allows pruning certain fields before sending into hindsight (may or may not be a good idea, but it reduces token usage)
- Allows upserting the entirety of old sessions to hindsight (to seed with existing memories instead of needing to start fresh; also allows changing pruning settings or reseed after changing retain mission)

The configuration for how to prune session messages currently looks like this:
```jsonc
{
  "retainContent": {
    "assistant": ["text", "thinking", "toolCall"],
    "user": ["text"],
    // remove all tool results and only include user messages and assistant reasoning
    "toolResult": []
  },
  "strip": {
    // assumes pi's /tree is not being used so messages are ordered and ids not relevant
    "topLevel": ["type", "id", "parentId"],
    "message": ["api", "provider", "model", "usage", "cost", "stopReason", "timestamp", "responseId"]
  }
}
```

I'm not sure at this point whether or not it makes sense to include full code reads/edits. Hindsight has a "retain mission" which is a prompt you can use to determine what actually gets stored. Currently I am more interested in storing memories about architectural decisions, library/linter/testing preferences, etc. I'm just totally pruning tools for token savings. The alternative would be to just send everything and let the retain mission determine what it needs to extract.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Memory Integration #408

Memory

Memory Best Practices (especially for Hindsight)

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

Memory Integration #408

Description

Memory

Memory Best Practices (especially for Hindsight)

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions