CREIR MCP

Korean-specialized search MCP server for LLMs

Features

  • 4 MCP Tools: creir_search, creir_local, creir_extract, creir_trends covering comprehensive search needs.
  • Multi-source Korean Search: Aggregates results from Naver (Blog, News, Cafe, Kin, Encyclopedia), Kakao (Web, Local), and Google CSE.
  • 6-Signal Context-Aware Ranking: Ranks results by freshness, source quality, intent match, keyword overlap, location proximity, and weather relevance.
  • Korean NLP Pipeline: Built-in processing for synonym expansion, suffix normalization, stop word filtering, and tokenization (optional lindera integration).
  • Location Intelligence: Native handling of GPS coordinates, dong/sigungu matching, juso.go.kr address API integration, and beopjeongdong codes.
  • Real-time Context: Integrates KMA weather API, AirKorea PM2.5 data, time slots, holidays, and seasonal context for smarter results.
  • L1/L2 Caching: High-performance dual-layer caching with moka (in-memory) and Meilisearch for persistent storage.
  • Rate Limiting: Robust governor-based burst protection (10 req/sec) and per-API daily quota management.
  • Dual Transport Support: Runs over stdio (for Claude Desktop) or Streamable HTTP (for MCP Inspector and web clients).
  • Data-Driven Configuration: 13 JSON data files control behavior with zero hardcoded rules.
  • Docker Optimized: A 4-stage build produces a lightweight ~63MB image.
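
As a rough illustration of the 6-signal ranking listed above, the sketch below combines six normalized signals into one score with a weighted average. The function name, the array layout, and the weighted-average formula itself are assumptions made for this example; the actual weights are read from scoring_weights.json and the real engine may combine signals differently.

```rust
/// Signal order used by this sketch (an assumption, not the project's layout).
pub const SIGNAL_NAMES: [&str; 6] = [
    "freshness",
    "source_quality",
    "intent_match",
    "keyword_overlap",
    "location_proximity",
    "weather_relevance",
];

/// Weighted average of six signals, each clamped to 0.0..=1.0.
/// Returns 0.0 when the weights sum to zero or less.
pub fn composite_score(signals: [f64; 6], weights: [f64; 6]) -> f64 {
    let total: f64 = weights.iter().sum();
    if total <= 0.0 {
        return 0.0;
    }
    let weighted: f64 = signals
        .iter()
        .zip(weights.iter())
        .map(|(s, w)| s.clamp(0.0, 1.0) * w)
        .sum();
    weighted / total
}
```

Dividing by the weight total keeps the score in the 0 to 1 range regardless of how the weights are scaled, which matches the shape of the `score` field in the example outputs further down.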

Quick Start

# Clone
git clone https://github.com/codivery/creir-mcp.git
cd creir-mcp

# Set API keys
export NAVER_CLIENT_ID=your_id
export NAVER_CLIENT_SECRET=your_secret
export KAKAO_REST_API_KEY=your_key

# Run (stdio mode - for Claude Desktop)
cargo run

# Run (HTTP mode - for MCP Inspector)
cargo run -- --http --port 8080

Claude Desktop Configuration

Add this configuration to your Claude Desktop config file:

{
  "mcpServers": {
    "creir": {
      "command": "cargo",
      "args": ["run", "--manifest-path", "/path/to/creir-mcp/Cargo.toml"],
      "env": {
        "NAVER_CLIENT_ID": "your_id",
        "NAVER_CLIENT_SECRET": "your_secret",
        "KAKAO_REST_API_KEY": "your_key"
      }
    }
  }
}

MCP Tools

1. creir_search

Search Korean web content across Naver, Kakao, and Google.

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | String | Yes | Search query |
| sources | String[] | No | Filter sources: naver_blog, naver_news, naver_cafe, naver_kin, naver_encyc, kakao_web, google. Default: auto-routed by intent |
| max_results | Number | No | 1-50, default 10 |
| sort | String | No | "relevance" (default) or "date" |
| time_context | String | No | Override auto-detected time context |
| location | Object | No | { lat?, lng?, dong?, sigungu? } |

Example Input:

{
  "query": "맛집 추천",
  "location": {
    "lat": 37.5665,
    "lng": 126.9780
  }
}

Example Output:

[
  {
    "title": "서울 종로 맛집 베스트 10",
    "link": "https://blog.naver.com/...",
    "snippet": "종로구청 근처에 위치한...",
    "source": "naver_blog",
    "score": 0.95
  }
]
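
The optional creir_search parameters have documented defaults and ranges (max_results 1-50 with default 10, sort "relevance" or "date"). A minimal sketch of resolving them might look like the following. The names `Sort`, `parse_sort`, and `resolve_max_results` are hypothetical, and clamping out-of-range values (rather than rejecting them) is an assumption about the server's behavior.

```rust
/// Sort order accepted by creir_search (enum name is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Sort {
    Relevance,
    Date,
}

/// Maps the optional `sort` string to an enum, defaulting to relevance.
pub fn parse_sort(raw: Option<&str>) -> Result<Sort, String> {
    match raw {
        None | Some("relevance") => Ok(Sort::Relevance),
        Some("date") => Ok(Sort::Date),
        Some(other) => Err(format!("unsupported sort: {other}")),
    }
}

/// Applies the documented default (10) and range (1-50) to `max_results`.
/// Assumption: out-of-range values are clamped instead of rejected.
pub fn resolve_max_results(raw: Option<u32>) -> u32 {
    raw.unwrap_or(10).clamp(1, 50)
}
```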

2. creir_local

Search Korean local businesses and POIs.

| Parameter | Type | Required | Description |
|---|---|---|---|
| query | String | Yes | Business/POI search query |
| location | Object | Yes | { lat: number, lng: number } |
| radius_km | Number | No | 0.1-20.0, default 2.0 |

Example Input:

{
  "query": "24시 카페",
  "location": { "lat": 37.5665, "lng": 126.9780 },
  "radius_km": 1.5
}

Example Output:

[
  {
    "name": "스타벅스 종로점",
    "address": "서울 종로구...",
    "distance": "0.3km",
    "category": "카페"
  }
]
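
To show how a `radius_km` filter around a `location` can work, here is a sketch using the haversine great-circle distance. This is a standard formula chosen for illustration; the README does not say how creir_local computes distance, and the upstream local-search APIs may apply the radius themselves. The function names are hypothetical.

```rust
/// Mean Earth radius in kilometers.
const EARTH_RADIUS_KM: f64 = 6371.0;

/// Great-circle distance between two (lat, lng) points given in degrees.
pub fn haversine_km(lat1: f64, lng1: f64, lat2: f64, lng2: f64) -> f64 {
    let (phi1, phi2) = (lat1.to_radians(), lat2.to_radians());
    let dlat = (lat2 - lat1).to_radians();
    let dlng = (lng2 - lng1).to_radians();
    let a = (dlat / 2.0).sin().powi(2)
        + phi1.cos() * phi2.cos() * (dlng / 2.0).sin().powi(2);
    // Guard against rounding pushing `a` slightly above 1.0.
    2.0 * EARTH_RADIUS_KM * a.min(1.0).sqrt().asin()
}

/// True when `point` lies within `radius_km` of `center`.
/// The radius is clamped to the documented 0.1-20.0 km range.
pub fn within_radius(center: (f64, f64), point: (f64, f64), radius_km: f64) -> bool {
    let radius = radius_km.clamp(0.1, 20.0);
    haversine_km(center.0, center.1, point.0, point.1) <= radius
}
```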

3. creir_extract

Extract and clean content from Korean web pages.

| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | String[] | Yes | URLs to extract |
| extract_mode | String | No | Extraction mode |

Example Input:

{
  "urls": ["https://news.naver.com/main/read.nhn?..."]
}

Example Output:

{
  "results": [
    {
      "url": "https://news.naver.com/...",
      "content": "본문 내용..."
    }
  ]
}

4. creir_trends

Get Korean trends and contextual recommendations.

| Parameter | Type | Required | Description |
|---|---|---|---|
| type | String | Yes | "realtime_search", "weather_context", "seasonal" |
| location | String | No | Location filter |

Example Input:

{
  "type": "realtime_search"
}

Example Output:

{
  "trends": [
    { "keyword": "현재 날씨", "rank": 1 },
    { "keyword": "프로야구 중계", "rank": 2 }
  ]
}

Benchmark

Evaluated against Tavily and Exa on 10 Korean-language queries across restaurants, cafes, travel, lodging, and more. Each query returns up to 10 results, which are scored on 7 weighted metrics.

| Metric | creir-mcp | Tavily | Exa |
|---|---|---|---|
| Composite Score | 0.838 🥇 | 0.694 🥈 | 0.640 🥉 |
| Content Quality | 0.923 | 0.785 | 0.585 |
| Korean Language % | 1.000 | 0.990 | 0.900 |
| Source Originality | 0.990 | 0.567 | 0.560 |
| Freshness | 0.830 | 0.430 | 0.560 |
| Actionable Info | 0.720 | 0.346 | 0.570 |
| Content Depth | 0.636 | 0.892 | 0.570 |
| Result Diversity | 0.410 | 0.768 | 0.620 |

  • 10/10 query wins against both competitors
  • +20.7% composite over Tavily, +30.9% over Exa
  • 878x cache speedup on warm hits · 100% keyword match · 1.48s avg latency
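
The composite percentages above are relative gains computed from the table's composite scores. The short sketch below reproduces that arithmetic; the helper names are made up for this example and are not part of the benchmark code.

```rust
/// Relative gain of `ours` over `theirs`, in percent.
pub fn relative_gain_pct(ours: f64, theirs: f64) -> f64 {
    (ours - theirs) / theirs * 100.0
}

/// Rounds to one decimal place for display.
pub fn round1(x: f64) -> f64 {
    (x * 10.0).round() / 10.0
}
```

With the composite scores from the table, `round1(relative_gain_pct(0.838, 0.694))` gives 20.7 (Tavily) and `round1(relative_gain_pct(0.838, 0.640))` gives 30.9 (Exa).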

Benchmark report screenshots: Overview; Section 1, Quantitative Benchmarks; Section 2, Qualitative Benchmarks; Section 3, IR Metrics; Section 4, Competitive (creir-mcp vs Tavily vs Exa).

Full interactive report: benchmark/index.html

Architecture

┌─────────────────────────────────────────────┐
│              LLM Client                     │
│        (Claude, GPT, etc.)                  │
└──────────────┬──────────────────────────────┘
               │ MCP Protocol (stdio / HTTP)
┌──────────────▼──────────────────────────────┐
│            CREIR MCP Server                 │
├─────────┬─────────┬──────────┬──────────────┤
│ Search  │ Local   │ Extract  │   Trends     │
├─────────┴─────────┴──────────┴──────────────┤
│              Korean NLP Pipeline            │
│     (normalizer → tokenizer → synonyms)     │
├─────────────────────────────────────────────┤
│           Context Engine                    │
│   (location + temporal + weather + intent)  │
├─────────────────────────────────────────────┤
│         6-Signal Ranking Engine             │
├──────┬──────────┬───────────────────────────┤
│ L1   │    L2    │     Search Adapters       │
│ moka │ Meili    │  Naver / Kakao / Google   │
└──────┴──────────┴───────────────────────────┘
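
The bottom layer of the diagram implies a lookup order: L1 first, then L2, then the search adapters. The sketch below illustrates that read-through pattern with two plain `HashMap`s standing in for moka and Meilisearch. All names here (`TieredCache`, `Hit`, `get_or_fetch`, `evict_l1`) are hypothetical, and the promote-on-L2-hit behavior is an assumption rather than a description of the actual implementation.

```rust
use std::collections::HashMap;

/// Where a lookup was satisfied.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Hit {
    L1,
    L2,
    Miss,
}

/// Two-tier cache sketch; HashMaps stand in for moka (L1) and Meilisearch (L2).
pub struct TieredCache {
    l1: HashMap<String, String>,
    l2: HashMap<String, String>,
}

impl TieredCache {
    pub fn new() -> Self {
        Self { l1: HashMap::new(), l2: HashMap::new() }
    }

    /// Checks L1, then L2, then calls `fetch` (the upstream search).
    /// L2 hits are promoted to L1; misses populate both tiers.
    pub fn get_or_fetch<F>(&mut self, key: &str, fetch: F) -> (String, Hit)
    where
        F: FnOnce() -> String,
    {
        if let Some(value) = self.l1.get(key) {
            return (value.clone(), Hit::L1);
        }
        if let Some(value) = self.l2.get(key).cloned() {
            self.l1.insert(key.to_string(), value.clone());
            return (value, Hit::L2);
        }
        let value = fetch();
        self.l2.insert(key.to_string(), value.clone());
        self.l1.insert(key.to_string(), value.clone());
        (value, Hit::Miss)
    }

    /// Drops an entry from L1 only, e.g. to simulate in-memory eviction.
    pub fn evict_l1(&mut self, key: &str) {
        self.l1.remove(key);
    }
}
```

A warm L1 hit never reaches the upstream APIs, which is where a large speedup on repeated queries would come from.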

Environment Variables

| Variable | Required | Description |
|---|---|---|
| NAVER_CLIENT_ID | For Naver search | Naver Open API Client ID |
| NAVER_CLIENT_SECRET | For Naver search | Naver Open API Client Secret |
| KAKAO_REST_API_KEY | For Kakao search | Kakao REST API Key |
| GOOGLE_CSE_API_KEY | For Google search | Google Custom Search API Key |
| GOOGLE_CSE_CX | For Google search | Google Custom Search Engine ID |
| KMA_API_KEY | For weather | Korea Meteorological Administration API Key |
| AIRKOREA_API_KEY | For air quality | AirKorea Open API Key |
| JUSO_API_KEY | For address resolution | juso.go.kr Address API Key |
| MEILISEARCH_KEY | For L2 cache | Meilisearch API Key |
| NAMU_DATA_DIR | No (default: data) | Custom data directory path |
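
Since each key is only required for its own provider, a server can decide at startup which search adapters to enable. The following is a sketch of that check, not the project's actual startup code: the function name and the returned labels are hypothetical, and treating blank values as missing is an assumption. The lookup is passed in as a closure so the logic can be exercised without touching the real process environment.

```rust
/// Returns the search providers whose credentials are all present.
/// `lookup` abstracts over the environment; in a real binary it could be
/// `|name| std::env::var(name).ok()`.
pub fn enabled_sources<F>(lookup: F) -> Vec<&'static str>
where
    F: Fn(&str) -> Option<String>,
{
    // A variable counts as set only if it is present and not blank.
    let has = |name: &str| lookup(name).map_or(false, |v| !v.trim().is_empty());

    let mut sources = Vec::new();
    if has("NAVER_CLIENT_ID") && has("NAVER_CLIENT_SECRET") {
        sources.push("naver");
    }
    if has("KAKAO_REST_API_KEY") {
        sources.push("kakao");
    }
    if has("GOOGLE_CSE_API_KEY") && has("GOOGLE_CSE_CX") {
        sources.push("google");
    }
    sources
}
```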

Configuration

Detailed configuration settings can be found in config/default.toml. This file controls search behaviors, caching policies, and default parameters.

Docker

docker build -t creir-mcp .
docker-compose up -d

The Docker image is built in four stages on a distroless base and comes to approximately 63MB.

Development

cargo check          # Type check
cargo clippy         # Lint
cargo test           # Run all 85 tests
cargo run            # Dev server (stdio)
cargo run -- --http  # Dev server (HTTP)

Data Files

System behavior is driven by 13 JSON data files:

  • synonyms.json: Korean synonym dictionary for query expansion
  • locations.json: Fallback location data (regions, coords)
  • beopjeongdong.json: Korean administrative division codes (17 provinces, districts, neighborhoods)
  • time_slots.json: Time-of-day slot definitions for contextual search
  • weather_rules.json: Weather-based search query adjustment rules
  • weather_stations.json: KMA weather observation stations with grid coordinates
  • intent_keywords.json: Query intent detection keywords
  • intent_sources.json: Intent-to-source routing mappings
  • source_quality.json: Source quality scores for ranking
  • scoring_weights.json: 6-signal ranking weight configuration
  • normalizer_suffixes.json: Korean suffix patterns for query normalization
  • stop_words.json: Korean stop words for filtering
  • holidays.json: Korean public holidays for temporal context
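
To make the role of the NLP-related files concrete, here is a simplified sketch of a query pass that uses suffix patterns, stop words, and synonyms in the way normalizer_suffixes.json, stop_words.json, and synonyms.json suggest. The function names, the whitespace tokenizer, the processing order, and the sample entries in the usage note are all assumptions for illustration; the real pipeline (and the optional lindera tokenizer) may behave differently.

```rust
/// Strips the first matching suffix, unless that would leave nothing.
pub fn normalize_token<'a>(token: &'a str, suffixes: &[&str]) -> &'a str {
    for &suffix in suffixes {
        if let Some(stem) = token.strip_suffix(suffix) {
            if !stem.is_empty() {
                return stem;
            }
        }
    }
    token
}

/// Appends `term` only if it is not already present, preserving order.
fn push_unique(terms: &mut Vec<String>, term: &str) {
    if !terms.iter().any(|t| t.as_str() == term) {
        terms.push(term.to_string());
    }
}

/// Whitespace-tokenizes the query, normalizes suffixes, drops stop words,
/// and expands each remaining term with its (term, synonym) pairs.
pub fn process_query(
    query: &str,
    suffixes: &[&str],
    stop_words: &[&str],
    synonyms: &[(&str, &str)],
) -> Vec<String> {
    let mut terms = Vec::new();
    for raw in query.split_whitespace() {
        let token = normalize_token(raw, suffixes);
        if stop_words.contains(&token) {
            continue;
        }
        push_unique(&mut terms, token);
        for (_, alt) in synonyms.iter().filter(|(key, _)| *key == token) {
            push_unique(&mut terms, alt);
        }
    }
    terms
}
```

For example, with the made-up entries suffix "해줘", stop word "좀", and synonym pair ("맛집", "식당"), the query "강남역 맛집 좀 추천해줘" yields the terms 강남역, 맛집, 식당, 추천.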

Tech Stack

  • Rust + rmcp 0.15 (MCP protocol)
  • axum 0.8 (HTTP transport)
  • reqwest 0.12 (HTTP client)
  • moka 0.12 (in-memory cache)
  • meilisearch-sdk 0.27 (L2 cache)
  • governor 0.8 (rate limiting)
  • lindera 0.38 (optional Korean tokenizer)
  • schemars 1.2 (JSON Schema generation)

License

MIT
