junainfinity/The-Final-Cut

The Final Cut - AI Server & UI

Welcome to The Final Cut, a high-performance, locally hosted AI chat application and API server. It runs on Apple Silicon via Apple's MLX framework and is designed from the ground up to offer both a polished web UI and a high-speed, OpenAI-compatible streaming API for external clients such as LM Studio.

🚀 Getting Started on macOS

To start the entire platform (both the backend AI Server and the frontend Chat UI):

  1. Open Finder and navigate to this folder (Project_N2K).
  2. Double-click on the start_tfc.command file.

This script will automatically:

  • Start the Python FastAPI backend on Port 8000.
  • Start the React/Vite development server for the UI on Port 5173.
  • Open your default web browser directly to the Chat UI.
  • Let you shut everything down safely by closing the terminal window or pressing Ctrl+C.
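Under the hood, a launcher like this is typically a few lines of shell. The sketch below illustrates the general pattern only; it is not the actual contents of start_tfc.command, and the uvicorn module path (server.main:app) and UI directory (ui/) are assumptions:

```shell
#!/bin/bash
# Illustrative sketch of a combined launcher (not the shipped start_tfc.command).
# Assumed names: a FastAPI app at server.main:app and a Vite project in ui/.
cd "$(dirname "$0")"

# Start the FastAPI backend on port 8000 in the background.
python3 -m uvicorn server.main:app --port 8000 &
BACKEND_PID=$!

# Start the Vite dev server for the UI on port 5173 in the background.
(cd ui && npm run dev -- --port 5173) &
UI_PID=$!

# Kill both servers when the terminal closes or Ctrl+C is pressed.
trap 'kill "$BACKEND_PID" "$UI_PID" 2>/dev/null' EXIT INT TERM

# Open the chat UI in the default browser, then wait on the servers.
open "http://localhost:5173"
wait
```

The `trap … EXIT INT TERM` line is what makes closing the terminal or pressing Ctrl+C a safe shutdown: both background processes are killed before the script exits.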

🎨 The Chat UI

Once the server is running, the pristine chat interface is accessible at: http://localhost:5173

Features:

  • Thinking Blocks: The UI intelligently captures all internal reasoning emitted by deep-thinking models (like Qwen3.5) until the closing </think> tag. It isolates this monologue in an auto-collapsible dropdown to keep the chat clean while preserving insight into the AI's reasoning.
  • Syntax Highlighting: Full support for markdown rendering and code blocks with 1-click copy-to-clipboard functionality.
  • Telemetry Footer: Every response displays its token speed (tokens/sec), generation time, and total token count.
  • Local Persistence: Your chat history, folder organization, and sidebar settings are saved automatically using your browser's localStorage.
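As a sketch of how a thinking block can be separated from the visible answer on the closing </think> tag (this mirrors the behavior described above; the function name and exact trimming are illustrative assumptions, not the UI's actual code):

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split a raw model response into (reasoning, answer).

    Everything up to the closing </think> tag is treated as the model's
    internal monologue; the remainder is the user-facing answer.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        # No thinking block: the whole response is the answer.
        return "", text.strip()
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


reasoning, answer = split_thinking(
    "<think>The user wants a greeting.</think>Hello! How can I help?"
)
# reasoning -> "The user wants a greeting."
# answer    -> "Hello! How can I help?"
```

In the UI, the `reasoning` half would feed the auto-collapsible dropdown and the `answer` half the chat bubble.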

🔌 Using The Server With LM Studio & Others

The backend server exposes a lightning-fast, fully OpenAI-compatible chat completions endpoint. You can drop it into any application, SDK, or UI that accepts a custom OpenAI Base URL.

Connection Info for LM Studio

  • Base URL: http://localhost:8000/v1 (Note for LM Studio: Add this as a custom OpenAI endpoint)
  • Model Name: "qwen3.5" (Any string works; the server automatically routes to your loaded local model).
  • Context Length: Adjust as needed in LM Studio.
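From any Python client, a request against this server is just a standard OpenAI-style chat-completions JSON body POSTed to the Base URL above. The helper below is an illustrative stdlib-only sketch; only the URL, the model string, and the payload shape come from the connection notes, and it assumes the server is running locally:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt: str, stream: bool = True) -> dict:
    """Build a standard OpenAI-style chat-completions payload."""
    return {
        "model": "qwen3.5",  # any string works; the server routes to the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def post_chat(prompt: str):
    """POST the payload to the local server (requires the server to be running)."""
    req = urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    return urllib.request.urlopen(req)
```

Any OpenAI SDK would work the same way: point its base URL at `http://localhost:8000/v1` and pass an arbitrary API key, since the local server does not check one.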

The 2026 Specification

The /v1/chat/completions endpoint strictly adheres to the March 2026 OpenAI streaming specification. It fully supports:

  • High-speed text/event-stream SSE generation containing delta objects.
  • system_fingerprint identifiers ("fp_finalcut_2026").
  • Exact usage statistics injected into the final chunk (with finish_reason: "stop"). This ensures LM Studio correctly parses and displays your backend's Tokens-Per-Second metrics.
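Concretely, a final streamed chunk in that format can be decoded like this. The sample event below is hand-written to match the fields listed above (fingerprint, finish_reason, usage), not captured from the real server, and the token counts are made up:

```python
import json

# Hand-written sample of a final streamed chunk, shaped per the fields above.
sample_event = (
    'data: {"id":"chatcmpl-1","object":"chat.completion.chunk",'
    '"system_fingerprint":"fp_finalcut_2026",'
    '"choices":[{"index":0,"delta":{},"finish_reason":"stop"}],'
    '"usage":{"prompt_tokens":12,"completion_tokens":48,"total_tokens":60}}'
)


def parse_sse_chunk(event: str):
    """Decode one text/event-stream line into a chunk dict (None for [DONE])."""
    payload = event.removeprefix("data: ").strip()
    if payload == "[DONE]":
        return None  # end-of-stream sentinel
    return json.loads(payload)


chunk = parse_sse_chunk(sample_event)
# Because the final chunk carries exact usage, a client can compute
# tokens/sec as chunk["usage"]["completion_tokens"] / generation_seconds.
```

This is why LM Studio can show accurate tokens-per-second: the counts arrive in-band with the last chunk rather than being estimated client-side.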

Enjoy your absolute control over The Final Cut.
