Commit abd46bc

fii6 and jackwener authored
feat(gemini): add Gemini web adapter with minimal output (#619)
* feat(gemini): add web adapter with minimal output
* fix(gemini): use defaultFormat for minimal output
* fix(gemini): preserve full transcript responses
* docs(gemini): add browser adapter guide
* review: wire gemini into adapter indexes

---------

Co-authored-by: fii6 <246637913+fii6@users.noreply.github.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
1 parent 098c7f4 commit abd46bc

16 files changed

Lines changed: 931 additions & 3 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -138,6 +138,7 @@ git clone git@github.com:jackwener/opencli.git && cd opencli && npm install && n
 | **twitter** | `trending` `search` `timeline` `bookmarks` `post` `download` `profile` `article` `like` `likes` `notifications` `reply` `reply-dm` `thread` `follow` `unfollow` `followers` `following` `block` `unblock` `bookmark` `unbookmark` `delete` `hide-reply` `accept` |
 | **reddit** | `hot` `frontpage` `popular` `search` `subreddit` `user` `user-posts` `user-comments` `read` `save` `saved` `subscribe` `upvote` `upvoted` `comment` |
 | **amazon** | `bestsellers` `search` `product` `offer` `discussion` |
+| **gemini** | `new` `ask` `image` |
 | **notebooklm** | `status` `list` `open` `select` `current` `get` `metadata` `source-list` `source-get` `source-fulltext` `source-guide` `history` `note-list` `notes-list` `notes-get` `summary` |
 | **spotify** | `auth` `status` `play` `pause` `next` `prev` `volume` `search` `queue` `shuffle` `repeat` |
 

README.zh-CN.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -191,6 +191,7 @@ npx skills add jackwener/opencli --skill opencli-oneshot # 快速命令参
191191
| **facebook** | `feed` `profile` `search` `friends` `groups` `events` `notifications` `memories` `add-friend` `join-group` | 浏览器 |
192192
| **google** | `news` `search` `suggest` `trends` | 公开 |
193193
| **amazon** | `bestsellers` `search` `product` `offer` `discussion` | 浏览器 |
194+
| **gemini** | `new` `ask` `image` | 浏览器 |
194195
| **spotify** | `auth` `status` `play` `pause` `next` `prev` `volume` `search` `queue` `shuffle` `repeat` | OAuth API |
195196
| **notebooklm** | `status` `list` `open` `select` `current` `get` `metadata` `source-list` `source-get` `source-fulltext` `source-guide` `history` `note-list` `notes-list` `notes-get` `summary` | 浏览器 |
196197
| **36kr** | `news` `hot` `search` `article` | 公开 / 浏览器 |

docs/.vitepress/config.mts

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -73,6 +73,7 @@ export default defineConfig({
7373
{ text: 'Chaoxing', link: '/adapters/browser/chaoxing' },
7474
{ text: 'Grok', link: '/adapters/browser/grok' },
7575
{ text: 'Amazon', link: '/adapters/browser/amazon' },
76+
{ text: 'Gemini', link: '/adapters/browser/gemini' },
7677
{ text: 'NotebookLM', link: '/adapters/browser/notebooklm' },
7778
{ text: 'WeRead', link: '/adapters/browser/weread' },
7879
{ text: 'Douban', link: '/adapters/browser/douban' },

docs/adapters/browser/gemini.md

Lines changed: 72 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,72 @@
1+
# Gemini
2+
3+
**Mode**: 🔐 Browser · **Domain**: `gemini.google.com`
4+
5+
## Commands
6+
7+
| Command | Description |
8+
|---------|-------------|
9+
| `opencli gemini new` | Start a new Gemini web chat |
10+
| `opencli gemini ask <prompt>` | Send a prompt and return only the assistant reply |
11+
| `opencli gemini image <prompt>` | Generate images in Gemini and optionally save them locally |
12+
13+
## Usage Examples
14+
15+
```bash
16+
# Start a fresh chat
17+
opencli gemini new
18+
19+
# Ask Gemini and return minimal plain-text output
20+
opencli gemini ask "Reply with exactly: HELLO"
21+
22+
# Ask in a new chat and wait longer
23+
opencli gemini ask "Summarize this design in 3 bullets" --new true --timeout 90
24+
25+
# Generate an icon image with short flags
26+
opencli gemini image "Generate a tiny cyan moon icon" --rt 1:1 --st icon
27+
28+
# Only generate in Gemini and print the page link without downloading files
29+
opencli gemini image "A watercolor sunset over a lake" --sd true
30+
31+
# Save generated images to a custom directory
32+
opencli gemini image "A flat illustration of a robot" --op ~/tmp/gemini-images
33+
```
34+
35+
## Options
36+
37+
### `ask`
38+
39+
| Option | Description |
40+
|--------|-------------|
41+
| `prompt` | Prompt to send (required positional argument) |
42+
| `--timeout` | Max seconds to wait for a reply (default: `60`) |
43+
| `--new` | Start a new chat before sending (default: `false`) |
44+
45+
### `image`
46+
47+
| Option | Description |
48+
|--------|-------------|
49+
| `prompt` | Image prompt to send (required positional argument) |
50+
| `--rt` | Aspect ratio shorthand: `1:1`, `16:9`, `9:16`, `4:3`, `3:4`, `3:2`, `2:3` |
51+
| `--st` | Optional style shorthand, e.g. `icon`, `anime`, `watercolor` |
52+
| `--op` | Output directory for downloaded images (default: `~/tmp/gemini-images`) |
53+
| `--sd` | Skip download and only print the Gemini page link |
54+
55+
## Behavior
56+
57+
- `ask` uses plain minimal output and returns only the assistant response text prefixed with `💬`.
58+
- `image` also uses plain output and prints `status / file / link` instead of a table.
59+
- `image` always starts from a fresh Gemini chat before sending the prompt.
60+
- When `--sd` is enabled, `image` keeps the generation in Gemini and only prints the conversation link.
61+
62+
## Prerequisites
63+
64+
- Chrome is running
65+
- You are already logged into `gemini.google.com`
66+
- [Browser Bridge extension](/guide/browser-bridge) is installed
67+
68+
## Caveats
69+
70+
- This adapter drives the Gemini consumer web UI, not a public API.
71+
- It depends on the current browser session and may fail if Gemini shows login, consent, challenge, quota, or other gating UI.
72+
- DOM or product changes on Gemini can break composer detection, new-chat handling, or image export behavior.
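
Reviewer note: the Behavior section says `ask` prints only the reply text prefixed with `💬`, and `ask.ts` in this commit uses a `[NO RESPONSE]` marker when the wait times out. A script that consumes this output might separate the two cases roughly as follows. This is a minimal sketch, not part of opencli: `parseAskOutput` and `AskResult` are hypothetical names, and it assumes the plain format writes the `response` field to stdout unchanged.

```typescript
// Hypothetical consumer-side helper; not part of opencli.
interface AskResult {
  ok: boolean;   // false when the adapter reported a timeout
  text: string;  // reply text, or the timeout message
}

const REPLY_PREFIX = '💬 ';
const NO_RESPONSE_PREFIX = '[NO RESPONSE]';

function parseAskOutput(stdout: string): AskResult {
  const trimmed = stdout.trim();
  // Drop the emoji prefix if present (assumption: it is always at the very start).
  const body = trimmed.startsWith(REPLY_PREFIX)
    ? trimmed.slice(REPLY_PREFIX.length)
    : trimmed;
  if (body.startsWith(NO_RESPONSE_PREFIX)) {
    return { ok: false, text: body.slice(NO_RESPONSE_PREFIX.length).trim() };
  }
  return { ok: true, text: body };
}

const reply = parseAskOutput('💬 HELLO\n');
const timedOut = parseAskOutput('💬 [NO RESPONSE] No Gemini response within 60s.');
```

Here `reply.ok` is `true` with text `HELLO`, while `timedOut.ok` is `false`, so a caller can retry or fail without string-matching the message itself.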

docs/adapters/index.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -29,6 +29,7 @@ Run `opencli list` for the live registry.
2929
| **[linux-do](./browser/linux-do)** | `feed` `categories` `tags` `search` `topic` `user-topics` `user-posts` | 🔐 Browser |
3030
| **[chaoxing](./browser/chaoxing)** | `assignments` `exams` | 🔐 Browser |
3131
| **[grok](./browser/grok)** | `ask` | 🔐 Browser |
32+
| **[gemini](./browser/gemini)** | `new` `ask` `image` | 🔐 Browser |
3233
| **[notebooklm](./browser/notebooklm)** | `status` `list` `open` `select` `current` `get` `metadata` `source-list` `source-get` `source-fulltext` `source-guide` `history` `note-list` `notes-list` `notes-get` `summary` | 🔐 Browser |
3334
| **[doubao](./browser/doubao)** | `status` `new` `send` `read` `ask` `history` `detail` `meeting-summary` `meeting-transcript` | 🔐 Browser |
3435
| **[weread](./browser/weread)** | `shelf` `search` `book` `ranking` `notebooks` `highlights` `notes` | 🔐 Browser |

src/clis/gemini/ask.ts

Lines changed: 46 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,46 @@
1+
import { cli, Strategy } from '../../registry.js';
2+
import type { IPage } from '../../types.js';
3+
import { GEMINI_DOMAIN, getGeminiTranscriptLines, sendGeminiMessage, startNewGeminiChat, waitForGeminiResponse } from './utils.js';
4+
5+
function normalizeBooleanFlag(value: unknown): boolean {
6+
if (typeof value === 'boolean') return value;
7+
const normalized = String(value ?? '').trim().toLowerCase();
8+
return normalized === 'true' || normalized === '1' || normalized === 'yes' || normalized === 'on';
9+
}
10+
11+
const NO_RESPONSE_PREFIX = '[NO RESPONSE]';
12+
13+
export const askCommand = cli({
14+
site: 'gemini',
15+
name: 'ask',
16+
description: 'Send a prompt to Gemini and return only the assistant response',
17+
domain: GEMINI_DOMAIN,
18+
strategy: Strategy.COOKIE,
19+
browser: true,
20+
navigateBefore: false,
21+
defaultFormat: 'plain',
22+
timeoutSeconds: 180,
23+
args: [
24+
{ name: 'prompt', required: true, positional: true, help: 'Prompt to send' },
25+
{ name: 'timeout', required: false, help: 'Max seconds to wait (default: 60)', default: '60' },
26+
{ name: 'new', required: false, help: 'Start a new chat first (true/false, default: false)', default: 'false' },
27+
],
28+
columns: ['response'],
29+
func: async (page: IPage, kwargs: any) => {
30+
const prompt = kwargs.prompt as string;
31+
const timeout = parseInt(kwargs.timeout as string, 10) || 60;
32+
const startFresh = normalizeBooleanFlag(kwargs.new);
33+
34+
if (startFresh) await startNewGeminiChat(page);
35+
36+
const beforeLines = await getGeminiTranscriptLines(page);
37+
await sendGeminiMessage(page, prompt);
38+
const response = await waitForGeminiResponse(page, beforeLines, prompt, timeout);
39+
40+
if (!response) {
41+
return [{ response: `💬 ${NO_RESPONSE_PREFIX} No Gemini response within ${timeout}s.` }];
42+
}
43+
44+
return [{ response: `💬 ${response}` }];
45+
},
46+
});
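
Reviewer note: the flag handling in `ask.ts` is easier to judge with concrete inputs. The sketch below copies `normalizeBooleanFlag` from the diff and wraps the `parseInt(...) || 60` expression in a helper named `parseTimeoutSketch`, which is a hypothetical name used only for illustration. It suggests that any value outside `true`, `1`, `yes`, `on` reads as false, and that a timeout of `0` falls back to 60 because `0` is falsy.

```typescript
// Copied from ask.ts in this commit.
function normalizeBooleanFlag(value: unknown): boolean {
  if (typeof value === 'boolean') return value;
  const normalized = String(value ?? '').trim().toLowerCase();
  return normalized === 'true' || normalized === '1' || normalized === 'yes' || normalized === 'on';
}

// Hypothetical wrapper mirroring `parseInt(kwargs.timeout as string, 10) || 60`.
function parseTimeoutSketch(raw: unknown): number {
  return parseInt(String(raw), 10) || 60;
}

const flagCases = {
  quotedTrue: normalizeBooleanFlag('true'),      // true
  paddedYes: normalizeBooleanFlag(' YES '),      // true, input is trimmed and lowercased
  quotedFalse: normalizeBooleanFlag('false'),    // false
  missing: normalizeBooleanFlag(undefined),      // false, undefined becomes ''
};

const timeoutCases = {
  explicit: parseTimeoutSketch('90'),   // 90
  garbage: parseTimeoutSketch('abc'),   // 60, NaN is falsy
  zero: parseTimeoutSketch('0'),        // 60, 0 is falsy as well
};
```

One consequence worth noting: with this expression a caller cannot request a zero-second wait, and a negative value such as `-5` passes through unchanged because it is truthy.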

src/clis/gemini/image.ts

Lines changed: 115 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,115 @@
1+
import * as os from 'node:os';
2+
import * as path from 'node:path';
3+
import { cli, Strategy } from '../../registry.js';
4+
import type { IPage } from '../../types.js';
5+
import { saveBase64ToFile } from '../../utils.js';
6+
import { GEMINI_DOMAIN, exportGeminiImages, getGeminiVisibleImageUrls, sendGeminiMessage, startNewGeminiChat, waitForGeminiImages } from './utils.js';
7+
8+
function extFromMime(mime: string): string {
9+
if (mime.includes('png')) return '.png';
10+
if (mime.includes('webp')) return '.webp';
11+
if (mime.includes('gif')) return '.gif';
12+
return '.jpg';
13+
}
14+
15+
function normalizeBooleanFlag(value: unknown): boolean {
16+
if (typeof value === 'boolean') return value;
17+
const normalized = String(value ?? '').trim().toLowerCase();
18+
return normalized === 'true' || normalized === '1' || normalized === 'yes' || normalized === 'on';
19+
}
20+
21+
function displayPath(filePath: string): string {
22+
const home = os.homedir();
23+
return filePath.startsWith(home) ? `~${filePath.slice(home.length)}` : filePath;
24+
}
25+
26+
function buildImagePrompt(prompt: string, options: {
27+
ratio?: string;
28+
style?: string;
29+
}): string {
30+
const extras: string[] = [];
31+
if (options.ratio) extras.push(`aspect ratio ${options.ratio}`);
32+
if (options.style) extras.push(`style ${options.style}`);
33+
if (extras.length === 0) return prompt;
34+
return `${prompt}
35+
36+
Image requirements: ${extras.join(', ')}.`;
37+
}
38+
39+
function normalizeRatio(value: string): string {
40+
const normalized = value.trim();
41+
const allowed = new Set(['1:1', '16:9', '9:16', '4:3', '3:4', '3:2', '2:3']);
42+
return allowed.has(normalized) ? normalized : '1:1';
43+
}
44+
async function currentGeminiLink(page: IPage): Promise<string> {
45+
const url = await page.evaluate('window.location.href').catch(() => '');
46+
return typeof url === 'string' && url ? url : 'https://gemini.google.com/app';
47+
}
48+
49+
export const imageCommand = cli({
50+
site: 'gemini',
51+
name: 'image',
52+
description: 'Generate images with Gemini web and save them locally',
53+
domain: GEMINI_DOMAIN,
54+
strategy: Strategy.COOKIE,
55+
browser: true,
56+
navigateBefore: false,
57+
defaultFormat: 'plain',
58+
timeoutSeconds: 240,
59+
args: [
60+
{ name: 'prompt', positional: true, required: true, help: 'Image prompt to send to Gemini' },
61+
{ name: 'rt', default: '1:1', help: 'Ratio shorthand for aspect ratio (1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3)' },
62+
{ name: 'st', default: '', help: 'Style shorthand, e.g. anime, icon, watercolor' },
63+
{ name: 'op', default: path.join(os.homedir(), 'tmp', 'gemini-images'), help: 'Output directory shorthand' },
64+
{ name: 'sd', type: 'boolean', default: false, help: 'Skip download shorthand; only show Gemini page link' },
65+
],
66+
columns: ['status', 'file', 'link'],
67+
func: async (page: IPage, kwargs: any) => {
68+
const prompt = kwargs.prompt as string;
69+
const ratio = normalizeRatio(String(kwargs.rt ?? '1:1'));
70+
const style = String(kwargs.st ?? '').trim();
71+
const outputDir = (kwargs.op as string) || path.join(os.homedir(), 'tmp', 'gemini-images');
72+
const timeout = 120;
73+
const startFresh = true;
74+
const skipDownloadRaw = kwargs.sd;
75+
const skipDownload = skipDownloadRaw === '' || skipDownloadRaw === true || normalizeBooleanFlag(skipDownloadRaw);
76+
77+
const effectivePrompt = buildImagePrompt(prompt, {
78+
ratio,
79+
style: style || undefined,
80+
});
81+
82+
if (startFresh) await startNewGeminiChat(page);
83+
84+
const beforeUrls = await getGeminiVisibleImageUrls(page);
85+
await sendGeminiMessage(page, effectivePrompt);
86+
const urls = await waitForGeminiImages(page, beforeUrls, timeout);
87+
const link = await currentGeminiLink(page);
88+
89+
if (!urls.length) {
90+
return [{ status: '⚠️ no-images', file: '📁 -', link: `🔗 ${link}` }];
91+
}
92+
93+
if (skipDownload) {
94+
return [{ status: '🎨 generated', file: '📁 -', link: `🔗 ${link}` }];
95+
}
96+
97+
const assets = await exportGeminiImages(page, urls);
98+
if (!assets.length) {
99+
return [{ status: '⚠️ export-failed', file: '📁 -', link: `🔗 ${link}` }];
100+
}
101+
102+
const stamp = Date.now();
103+
const results = [];
104+
for (let index = 0; index < assets.length; index += 1) {
105+
const asset = assets[index];
106+
const base64 = asset.dataUrl.replace(/^data:[^;]+;base64,/, '');
107+
const suffix = assets.length > 1 ? `_${index + 1}` : '';
108+
const filePath = path.join(outputDir, `gemini_${stamp}${suffix}${extFromMime(asset.mimeType)}`);
109+
await saveBase64ToFile(base64, filePath);
110+
results.push({ status: '✅ saved', file: `📁 ${displayPath(filePath)}`, link: `🔗 ${link}` });
111+
}
112+
113+
return results;
114+
},
115+
});
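
Reviewer note: it may help to see the text that actually reaches Gemini once `--rt` and `--st` are folded into the prompt. The sketch below copies `buildImagePrompt` and `normalizeRatio` from the diff (the template literal is written with `\n\n`, which is equivalent) and adds `composePromptSketch`, a hypothetical helper that mirrors how the command body combines them. It is an illustration under those assumptions, not the adapter itself.

```typescript
// Copied from image.ts in this commit.
function buildImagePrompt(prompt: string, options: { ratio?: string; style?: string }): string {
  const extras: string[] = [];
  if (options.ratio) extras.push(`aspect ratio ${options.ratio}`);
  if (options.style) extras.push(`style ${options.style}`);
  if (extras.length === 0) return prompt;
  return `${prompt}\n\nImage requirements: ${extras.join(', ')}.`;
}

// Copied from image.ts in this commit.
function normalizeRatio(value: string): string {
  const normalized = value.trim();
  const allowed = new Set(['1:1', '16:9', '9:16', '4:3', '3:4', '3:2', '2:3']);
  return allowed.has(normalized) ? normalized : '1:1';
}

// Hypothetical helper mirroring the first lines of the command's `func`.
function composePromptSketch(prompt: string, rt?: string, st?: string): string {
  const ratio = normalizeRatio(String(rt ?? '1:1'));
  const style = String(st ?? '').trim();
  return buildImagePrompt(prompt, { ratio, style: style || undefined });
}

const withStyle = composePromptSketch('Generate a tiny cyan moon icon', '1:1', 'icon');
// "Generate a tiny cyan moon icon\n\nImage requirements: aspect ratio 1:1, style icon."

const unsupportedRatio = composePromptSketch('A watercolor sunset over a lake', '21:9');
// "A watercolor sunset over a lake\n\nImage requirements: aspect ratio 1:1."
```

Two things follow from the code as written: an unsupported ratio such as `21:9` is silently replaced by `1:1` rather than rejected, and because `normalizeRatio` always returns a value, the `Image requirements:` suffix appears to be appended on every call made through the command.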

src/clis/gemini/new.ts

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
import { cli, Strategy } from '../../registry.js';
2+
import type { IPage } from '../../types.js';
3+
import { GEMINI_DOMAIN, startNewGeminiChat } from './utils.js';
4+
5+
export const newCommand = cli({
6+
site: 'gemini',
7+
name: 'new',
8+
description: 'Start a new conversation in Gemini web chat',
9+
domain: GEMINI_DOMAIN,
10+
strategy: Strategy.COOKIE,
11+
browser: true,
12+
navigateBefore: false,
13+
args: [],
14+
columns: ['Status', 'Action'],
15+
func: async (page: IPage) => {
16+
const action = await startNewGeminiChat(page);
17+
return [{
18+
Status: 'Success',
19+
Action: action === 'navigate' ? 'Reloaded /app as fallback' : 'Clicked New chat',
20+
}];
21+
},
22+
});

src/clis/gemini/utils.test.ts

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
import { describe, expect, it } from 'vitest';
2+
import { collectGeminiTranscriptAdditions, sanitizeGeminiResponseText } from './utils.js';
3+
4+
describe('sanitizeGeminiResponseText', () => {
5+
it('strips a prompt echo only when it appears as a prefixed block', () => {
6+
const prompt = 'Reply with the word opencli';
7+
const value = `Reply with the word opencli\n\nopencli`;
8+
expect(sanitizeGeminiResponseText(value, prompt)).toBe('opencli');
9+
});
10+
11+
it('does not strip prompt text that appears later in a legitimate answer', () => {
12+
const prompt = 'opencli';
13+
const value = 'You asked about opencli, and opencli is the right keyword here.';
14+
expect(sanitizeGeminiResponseText(value, prompt)).toBe(value);
15+
});
16+
17+
it('removes known Gemini footer noise', () => {
18+
const value = 'Answer body\nGemini can make mistakes.\nGoogle Terms';
19+
expect(sanitizeGeminiResponseText(value, '')).toBe('Answer body');
20+
});
21+
});
22+
23+
describe('collectGeminiTranscriptAdditions', () => {
24+
it('joins multiple new transcript lines instead of keeping only the last line', () => {
25+
const before = ['Older answer'];
26+
const current = ['Older answer', 'First new line', 'Second new line'];
27+
expect(collectGeminiTranscriptAdditions(before, current, '')).toBe('First new line\nSecond new line');
28+
});
29+
30+
it('filters prompt echoes out of transcript additions', () => {
31+
const prompt = 'Tell me a haiku';
32+
const before = ['Previous'];
33+
const current = ['Previous', 'Tell me a haiku', 'Tell me a haiku\n\nSoft spring rain arrives'];
34+
expect(collectGeminiTranscriptAdditions(before, current, prompt)).toBe('Soft spring rain arrives');
35+
});
36+
});
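
Reviewer note: the implementation of `sanitizeGeminiResponseText` and `collectGeminiTranscriptAdditions` in `utils.ts` is not shown in this view, so the following is only a guess at the contract these five tests pin down. The function names ending in `Sketch` and the `FOOTER_PATTERNS` list are hypothetical; the real helpers may differ in structure and likely handle more cases. The sketch strips a prompt echo only when it is a leading block, drops the two footer lines used in the test, and joins every transcript entry that was not present before the prompt was sent.

```typescript
// Hypothetical footer patterns, taken only from the strings used in the tests.
const FOOTER_PATTERNS: RegExp[] = [/^Gemini can make mistakes\b/i, /^Google Terms$/i];

// Hypothetical stand-in for sanitizeGeminiResponseText.
function sanitizeSketch(value: string, prompt: string): string {
  let text = value.trim();
  const echo = prompt.trim();
  if (echo) {
    // An entry that is nothing but the prompt carries no answer.
    if (text === echo) return '';
    // Strip the echo only when it forms a leading block followed by a newline.
    if (text.startsWith(`${echo}\n`)) text = text.slice(echo.length).trim();
  }
  return text
    .split('\n')
    .filter((line) => !FOOTER_PATTERNS.some((pattern) => pattern.test(line.trim())))
    .join('\n')
    .trim();
}

// Hypothetical stand-in for collectGeminiTranscriptAdditions.
function collectAdditionsSketch(before: string[], current: string[], prompt: string): string {
  const seen = new Set(before);
  return current
    .filter((entry) => !seen.has(entry))           // keep only entries added after sending
    .map((entry) => sanitizeSketch(entry, prompt))  // remove echoes and footer noise
    .filter((entry) => entry.length > 0)            // drop entries that were pure echo
    .join('\n');
}

const haiku = collectAdditionsSketch(
  ['Previous'],
  ['Previous', 'Tell me a haiku', 'Tell me a haiku\n\nSoft spring rain arrives'],
  'Tell me a haiku',
);
// haiku === 'Soft spring rain arrives'
```

A set-based comparison like this would drop a new entry whose text is identical to an older one; whether the real helper compares by position or by content cannot be determined from the tests alone.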
