This repository contains the necessary support files and logic to make osmAgent work seamlessly with local reasoning ("thinking") models via Ollama and LM Studio.
When using "Thinking" models (like DeepSeek-R1-Distill-Qwen, QwQ, etc.) with standard OpenAI-compatible tool-calling endpoints, the local servers' internal parsers often break. The model uses its `<think>` block to reason about tool syntax, which prematurely triggers the server's tool-call parser, leading to infinite loops or malformed JSON blocks.
To fix this, we bypass the native tool-calling features (the `tools` array) of Ollama and LM Studio.
Instead, we:
- Inject a custom tool schema into the system prompt.
- Intercept the raw text stream.
- Parse the `<think>` blocks out for the UI.
- Execute any requested `<tool_call>` blocks manually in the app.
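The parsing step above can be sketched as follows. This is an illustrative take on what `parseAgentOutput` in parser.js might do, not the repo's actual implementation; the returned field names other than `isToolCall` are assumptions.

```javascript
// Sketch: separate <think> reasoning and <tool_call> requests from the
// raw model text. Field names (thoughts, toolCall, answer) are illustrative.
function parseAgentOutput(raw) {
  // Pull reasoning out of <think>...</think> so the UI can show it separately.
  const thoughts = [];
  const withoutThoughts = raw.replace(/<think>([\s\S]*?)<\/think>/g, (_, t) => {
    thoughts.push(t.trim());
    return "";
  });

  // Look for a manual <tool_call>{...}</tool_call> block emitted by the model.
  const toolMatch = withoutThoughts.match(/<tool_call>([\s\S]*?)<\/tool_call>/);
  let toolCall = null;
  if (toolMatch) {
    try {
      toolCall = JSON.parse(toolMatch[1]); // e.g. {"name": "...", "arguments": {...}}
    } catch {
      toolCall = null; // malformed JSON: fall back to treating it as plain text
    }
  }

  // Whatever remains is the final answer shown to the user.
  const answer = withoutThoughts
    .replace(/<tool_call>[\s\S]*?<\/tool_call>/g, "")
    .trim();

  return { thoughts, toolCall, isToolCall: toolCall !== null, answer };
}
```

Because the server never sees a `tools` array, the `<think>` block can freely mention tool syntax without tripping any server-side parser; everything is resolved client-side after the stream completes.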
The repository includes:

- `system-prompt.txt`: the system prompt snippet you should append to your agent's instructions, outlining the `<tool_call>` format.
- `parser.js`: the custom text parser that separates thoughts from tool calls and final answers.
- `ollama.js`: example wrapper for calling Ollama without the native tool parser.
- `lmstudio.js`: example wrapper for calling LM Studio without the native tool parser.
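The core idea of the Ollama wrapper can be sketched like this: build a plain `/api/chat` request with the tool schema embedded in the system message and no `tools` key at all. The function and constant names here are hypothetical, not taken from ollama.js.

```javascript
// Illustrative snippet in the spirit of system-prompt.txt (not its actual text).
const TOOL_SCHEMA_SNIPPET =
  'When you need a tool, emit <tool_call>{"name": "...", "arguments": {...}}</tool_call> and nothing else.';

// Sketch: build a request for Ollama's /api/chat endpoint. Crucially, the
// body has no `tools` array, so the server's native tool parser stays out
// of the loop; the schema lives in the prompt instead.
function buildOllamaRequest(model, history) {
  const messages = [
    { role: "system", content: "You are a helpful agent.\n" + TOOL_SCHEMA_SNIPPET },
    ...history,
  ];
  return {
    url: "http://localhost:11434/api/chat",
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}
```

The same shape works for LM Studio's OpenAI-compatible endpoint; only the URL and response framing differ.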
To integrate this into your agent loop:

- Append the instructions from `system-prompt.txt` to the end of your system prompt.
- Pass the raw output of the local model to `parseAgentOutput` in `parser.js`.
- If `isToolCall` is true, pause execution, run the requested tool in your app, format the result as an observation, and append it to the chat history as a new user message before calling the model again.
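The observation step in the last bullet might look like the sketch below. The `<tool_response>` wrapper and the `appendObservation` name are assumptions for illustration; use whatever observation format your system prompt teaches the model.

```javascript
// Sketch: after running the requested tool yourself, feed the result back
// to the model as a new *user* message so the next call sees the observation.
function appendObservation(history, toolName, result) {
  return [
    ...history,
    {
      role: "user",
      content:
        "<tool_response>\n" +
        JSON.stringify({ tool: toolName, result }) +
        "\n</tool_response>",
    },
  ];
}
```

Returning a new array instead of mutating `history` keeps each turn of the loop easy to retry if the model's next reply is malformed.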