Commit 1d7abc6

Merge pull request #32 from ScrapeGraphAI/docs/add-wait-ms-parameter
docs: add wait_ms parameter documentation
2 parents a7b8bfc + 25e369c

2 files changed: +211 −1 lines changed
docs.json (2 additions & 1 deletion)

```diff
@@ -66,7 +66,8 @@
       "pages": [
         "services/additional-parameters/headers",
         "services/additional-parameters/pagination",
-        "services/additional-parameters/proxy"
+        "services/additional-parameters/proxy",
+        "services/additional-parameters/wait-ms"
       ]
     },
     {
```
New file (209 additions)

---
title: 'Wait Time'
description: 'Control how long the scraper waits before capturing page content'
icon: 'clock'
---

<Frame>
  <img src="/services/images/smartscraper-banner.png" alt="Wait Time Configuration" />
</Frame>

## Overview

The `wait_ms` parameter controls how many milliseconds the scraper waits before capturing page content. This is useful for pages that load content dynamically after the initial page load, such as:

- Single Page Applications (SPAs)
- Pages with lazy-loaded content
- Websites that render content via client-side JavaScript
- Pages with animations or delayed content loading

## Parameter Details

| Field | Value |
|-------|-------|
| **Parameter** | `wait_ms` |
| **Type** | Integer |
| **Required** | No |
| **Default** | `3000` (3 seconds) |
| **Validation** | Must be a positive integer |
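Because the API only accepts positive integers here, it can help to mirror that rule client-side before submitting a job. A minimal sketch; the helper name is hypothetical and not part of the SDK:

```python
def validate_wait_ms(wait_ms):
    """Mirror the API rule: wait_ms must be a positive integer."""
    # bool is a subclass of int in Python, so reject it explicitly
    if isinstance(wait_ms, bool) or not isinstance(wait_ms, int) or wait_ms <= 0:
        raise ValueError(f"wait_ms must be a positive integer, got {wait_ms!r}")
    return wait_ms
```

Failing fast locally avoids spending a request on a job the API would reject anyway.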
## Supported Services

The `wait_ms` parameter is available on the following endpoints:

- **SmartScraper** - AI-powered structured data extraction
- **Scrape** - Raw HTML content extraction
- **Markdownify** - Web content to markdown conversion

## Usage Examples

### Python SDK

```python
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

# SmartScraper with custom wait time
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract product information",
    wait_ms=5000  # Wait 5 seconds before scraping
)

# Scrape with custom wait time
response = client.scrape(
    website_url="https://example.com",
    wait_ms=5000
)

# Markdownify with custom wait time
response = client.markdownify(
    website_url="https://example.com",
    wait_ms=5000
)
```

### JavaScript SDK

```javascript
import { smartScraper, scrape, markdownify } from 'scrapegraph-js';

const apiKey = 'your-api-key';

// SmartScraper with custom wait time
const response = await smartScraper(
  apiKey,
  'https://example.com',
  'Extract product information',
  null, // schema
  null, // numberOfScrolls
  null, // totalPages
  null, // cookies
  { waitMs: 5000 } // Wait 5 seconds before scraping
);

// Scrape with custom wait time
const scrapeResponse = await scrape(apiKey, 'https://example.com', {
  waitMs: 5000
});

// Markdownify with custom wait time
const mdResponse = await markdownify(apiKey, 'https://example.com', {
  waitMs: 5000
});
```

### cURL

```bash
curl -X 'POST' \
  'https://api.scrapegraphai.com/v1/smartscraper' \
  -H 'accept: application/json' \
  -H 'SGAI-APIKEY: your-api-key' \
  -H 'Content-Type: application/json' \
  -d '{
    "website_url": "https://example.com",
    "user_prompt": "Extract product information",
    "wait_ms": 5000
  }'
```

### Async Python SDK

```python
from scrapegraph_py import AsyncClient

async def scrape_with_wait():
    client = AsyncClient(api_key="your-api-key")

    # SmartScraper with custom wait time
    response = await client.smartscraper(
        website_url="https://example.com",
        user_prompt="Extract product information",
        wait_ms=5000
    )

    # Markdownify with custom wait time
    response = await client.markdownify(
        website_url="https://example.com",
        wait_ms=5000
    )
```
## When to Adjust `wait_ms`

### Increase wait time when:

- The target page loads content dynamically via JavaScript
- You're scraping a SPA (React, Vue, Angular) that needs time to hydrate
- The page fetches data from APIs after the initial load
- You're seeing incomplete or empty results with the default wait time

### Decrease wait time when:

- The target page is static HTML with no dynamic content
- You want faster scraping for simple pages
- You're scraping many pages and want to optimize throughput

## Best Practices

1. **Start with the default** - The default value of 3000ms works well for most websites. Only adjust it if you're seeing incomplete results.

2. **Test incrementally** - If the default doesn't capture all content, try increasing in 1000ms increments (4000, 5000, etc.) rather than setting a very high value.
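The incremental approach can be sketched as a small escalation loop. This is illustrative, not part of the SDK: `client` is assumed to be a `scrapegraph_py` `Client` as in the examples above, and the completeness check on the response is a hypothetical placeholder you should adapt to your own result schema:

```python
def wait_schedule(start=3000, step=1000, limit=8000):
    """Yield candidate wait_ms values: 3000, 4000, ..., up to limit."""
    ms = start
    while ms <= limit:
        yield ms
        ms += step

def scrape_with_escalating_wait(client, url, prompt):
    """Retry with a longer wait_ms until the result looks complete."""
    response = None
    for ms in wait_schedule():
        response = client.smartscraper(
            website_url=url,
            user_prompt=prompt,
            wait_ms=ms,
        )
        if response.get("result"):  # hypothetical completeness check
            break
    return response
```

Each retry is a separate billed request, so cap the schedule rather than escalating indefinitely.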
3. **Combine with other parameters** - Use `wait_ms` together with `render_heavy_js` for JavaScript-heavy pages:

   ```python
   response = client.smartscraper(
       website_url="https://heavy-js-site.com",
       user_prompt="Extract all products",
       wait_ms=8000,
       render_heavy_js=True
   )
   ```

4. **Balance speed and completeness** - Higher wait times ensure more content is captured but increase response time and resource usage.

## Troubleshooting

<Accordion title="Content still missing after increasing wait_ms" icon="exclamation-triangle">
If increasing `wait_ms` doesn't capture all content:

- Try enabling `render_heavy_js=True` for JavaScript-heavy pages
- Check whether the content requires user interaction (clicks, scrolls); use `number_of_scrolls` for infinite scroll pages
- Verify the content isn't behind authentication; use custom headers/cookies if it is
</Accordion>

<Accordion title="Scraping is too slow" icon="clock">
If scraping is taking longer than expected:

- Lower the `wait_ms` value for static pages
- Use the default (omit the parameter) unless you specifically need a longer wait
- Consider using async clients for parallel scraping
</Accordion>
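For the throughput case, the async client lets several jobs run concurrently with `asyncio.gather`. A minimal sketch: `client` is assumed to be a `scrapegraph_py` `AsyncClient` as in the async example above (any object with an awaitable `markdownify` method works), and `scrape_many` is a hypothetical helper, not an SDK function:

```python
import asyncio

async def scrape_many(client, urls, wait_ms=1000):
    """Run one markdownify job per URL concurrently, with a short wait_ms."""
    tasks = [
        client.markdownify(website_url=url, wait_ms=wait_ms)
        for url in urls
    ]
    # gather preserves input order in its results
    return await asyncio.gather(*tasks)
```

Compared with a sequential loop, total latency approaches that of the slowest single job rather than the sum of all of them.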
## API Reference

For detailed API documentation, see:

- [SmartScraper Start Job](/api-reference/endpoint/smartscraper/start)
- [Markdownify Start Job](/api-reference/endpoint/markdownify/start)

## Support & Resources

<CardGroup cols={2}>
  <Card title="API Reference" icon="book" href="/api-reference/introduction">
    Detailed API documentation
  </Card>
  <Card title="Dashboard" icon="dashboard" href="/dashboard/overview">
    Monitor your API usage and credits
  </Card>
  <Card title="Community" icon="discord" href="https://discord.gg/uJN7TYcpNa">
    Join our Discord community
  </Card>
  <Card title="GitHub" icon="github" href="https://github.com/ScrapeGraphAI">
    Check out our open-source projects
  </Card>
</CardGroup>

<Card title="Need Help?" icon="question" href="mailto:support@scrapegraphai.com">
  Contact our support team for assistance with wait time configuration or any other questions!
</Card>