
feat!: migrate MCP server to ScrapeGraph API v2 #16

Open
VinciGit00 wants to merge 3 commits into main from feat/migrate-v2-api

Conversation

@VinciGit00
Member

Summary

Migrates the ScrapeGraph MCP server to API v2, matching scrapegraph-py#82.

Changes

  • Base URL: https://api.scrapegraphai.com/api/v2 (override with SCRAPEGRAPH_API_BASE_URL)
  • Auth: `Authorization: Bearer` and `SGAI-APIKEY` headers, plus `X-SDK-Version: scrapegraph-mcp@2.0.0`
  • Endpoints: /scrape, /extract, /search, /crawl (+ stop/resume), /credits, /history, /monitor (+ lifecycle)
  • New tools: crawl_stop, crawl_resume, credits, sgai_history, monitor_create, monitor_list, monitor_get, monitor_pause, monitor_resume, monitor_delete
  • Updated: markdownify, smartscraper, searchscraper, scrape (format options), smartcrawler_initiate (markdown/html crawl; default extraction_mode=markdown)
  • Removed: sitemap, agentic_scrapper, markdownify_status, smartscraper_status
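The base URL and auth changes above can be sketched as a small helper. This is an illustrative sketch, not the server's actual code: the function names are assumptions, but the header names, default URL, and `SCRAPEGRAPH_API_BASE_URL` override come straight from the change list.

```python
import os

# Defaults from the v2 migration (override via SCRAPEGRAPH_API_BASE_URL).
DEFAULT_BASE_URL = "https://api.scrapegraphai.com/api/v2"
SDK_VERSION = "scrapegraph-mcp@2.0.0"


def build_headers(api_key: str) -> dict:
    """Headers sent on every v2 request (illustrative sketch)."""
    return {
        "Authorization": f"Bearer {api_key}",
        "SGAI-APIKEY": api_key,
        "X-SDK-Version": SDK_VERSION,
    }


def base_url() -> str:
    """Resolve the API base URL, honoring the optional env override."""
    return os.environ.get("SCRAPEGRAPH_API_BASE_URL", DEFAULT_BASE_URL)
```

With this shape, pointing the server at a custom host is a one-variable change: `SCRAPEGRAPH_API_BASE_URL=http://localhost:3002` reproduces the local integration-test setup described below.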

Breaking changes

Clients relying on removed tools or v1-only behavior must migrate per the README and .agent docs.

Test plan

  • ruff check src/ and mypy src/scrapegraph_mcp/server.py
  • Run MCP server with SGAI_API_KEY and smoke-test markdownify, credits (if deployed on target API)

Made with Cursor

VinciGit00 and others added 3 commits April 3, 2026 07:39
Align with scrapegraph-py PR #82: base URL /api/v2, Bearer + SGAI-APIKEY
headers, X-SDK-Version, and v2 endpoints for scrape, extract, search, crawl,
credits, history, and monitor.

BREAKING CHANGE: Removes sitemap, agentic_scrapper, markdownify_status, and
smartscraper_status. Crawl supports markdown/html only (no AI crawl mode).
Adds crawl_stop, crawl_resume, credits, sgai_history, and monitor_* tools.

Optional SCRAPEGRAPH_API_BASE_URL for custom API hosts.

Made-with: Cursor
…Config

Align MCP server with scrapegraph-py PR #82 proxy changes:
- Replace stealth/render_heavy_js/render_js booleans with mode enum
  (auto, fast, js, direct+stealth, js+stealth)
- Rename wait_ms → wait, add timeout, cookies, country params
- Add LlmConfig support (llm_model, llm_temperature, llm_max_tokens)
  to smartscraper, searchscraper, and monitor_create
- Add fetch_config params to smartcrawler_initiate and monitor_create
- Add output_schema support to searchscraper
- Remove deprecated v1 params (website_html, website_markdown, stream,
  total_pages, same_domain_only, time_range, number_of_scrolls)
- Update parameter reference documentation
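The parameter changes in this commit can be illustrated with a hypothetical payload builder. The `mode` values, the `wait`/`timeout`/`cookies`/`country` names, and the dropped booleans match the list above; everything else (the helper itself, the exact wire format) is an assumption for illustration only.

```python
from typing import Any, Optional

# Mode enum replacing the old stealth/render_heavy_js/render_js booleans.
VALID_MODES = {"auto", "fast", "js", "direct+stealth", "js+stealth"}


def scrape_payload(
    url: str,
    mode: str = "auto",
    wait: Optional[int] = None,      # renamed from wait_ms
    timeout: Optional[int] = None,   # new in this commit
    cookies: Optional[dict] = None,  # new in this commit
    country: Optional[str] = None,   # new in this commit
) -> "dict[str, Any]":
    """Assemble a v2 /scrape request body (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    payload: "dict[str, Any]" = {"url": url, "mode": mode}
    # Only include optional params the caller actually set.
    for key, value in (("wait", wait), ("timeout", timeout),
                       ("cookies", cookies), ("country", country)):
        if value is not None:
            payload[key] = value
    return payload
```

Validating the enum client-side fails fast on the removed boolean-era values instead of round-tripping a 4xx from the API.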

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove llm_model, llm_temperature, llm_max_tokens from smartscraper,
searchscraper, and monitor_create tools. Remove _llm_config helper
and llm_config_dict from all client methods.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@VinciGit00
Member Author

✅ Integration Test Results — localhost:3002

Tested all three SDKs/tools against the local dev API (localhost:3002) on 2026-04-09. Everything works end-to-end.

scrapegraph-py v2 (PR #82 — built from feat/migrate-python-sdk-to-api-v2)

| Test | Endpoint | Result |
| --- | --- | --- |
| `client.credits()` | `GET /credits` | `{remaining: 248935, plan: "Pro Plan"}` |
| `client.scrape(url, format="markdown")` | `POST /scrape` | ✅ Returns markdown content |
| `client.extract(url, prompt=...)` | `POST /extract` | ✅ Returns structured JSON |
| `client.search(query, num_results=3)` | `POST /search` | ✅ Returns search results |
| `client.history(limit=3)` | `GET /history` | ✅ Returns request history |

scrapegraph-mcp v2 (this PR — ScapeGraphClient)

| Test | Endpoint | Result |
| --- | --- | --- |
| `credits()` | `GET /credits` | ✅ |
| `scrape_v2(url, "markdown")` | `POST /scrape` | ✅ |
| `scrape_v2(url, "html")` | `POST /scrape` | ✅ |
| `scrape_v2(url, "screenshot")` | `POST /scrape` | ✅ |
| `scrape_v2(url, "branding")` | `POST /scrape` | ✅ |
| `extract(prompt, url)` | `POST /extract` | ✅ |
| `search_api(query, num_results=3)` | `POST /search` | ✅ |
| `history(limit=3)` | `GET /history` | ✅ |
| `crawl_start(url, depth=1)` | `POST /crawl` | ✅ |
| monitor list | `GET /monitor` | ✅ |

just-scrape CLI (scrapegraph-cli + scrapegraph-js v2)

| Command | Result |
| --- | --- |
| `just-scrape credits --json` | ✅ |
| `just-scrape scrape https://example.com --json` | ✅ |
| `just-scrape extract https://example.com -p "..." --json` | ✅ |
| `just-scrape search "ScrapeGraph AI" --json` | ✅ |
| `just-scrape history scrape --json` | ✅ |

All three client implementations (Python SDK, MCP server, JS CLI) are aligned and working against the v2 API.
