---
title: 'Wait Time'
description: 'Control how long the scraper waits before capturing page content'
icon: 'clock'
---

<Frame>
  <img src="/services/images/smartscraper-banner.png" alt="Wait Time Configuration" />
</Frame>

## Overview

The `wait_ms` parameter controls how many milliseconds the scraper waits before capturing page content. This is useful for pages that load content dynamically after the initial page load, such as:

- Single Page Applications (SPAs)
- Pages with lazy-loaded content
- Websites that render content via client-side JavaScript
- Pages with animations or delayed content loading

## Parameter Details

| Field | Value |
|-------|-------|
| **Parameter** | `wait_ms` |
| **Type** | Integer |
| **Required** | No |
| **Default** | `3000` (3 seconds) |
| **Validation** | Must be a positive integer |

## Supported Services

The `wait_ms` parameter is available on the following endpoints:

- **SmartScraper** - AI-powered structured data extraction
- **Scrape** - Raw HTML content extraction
- **Markdownify** - Web content to markdown conversion

## Usage Examples

### Python SDK

```python
from scrapegraph_py import Client

client = Client(api_key="your-api-key")

# SmartScraper with custom wait time
response = client.smartscraper(
    website_url="https://example.com",
    user_prompt="Extract product information",
    wait_ms=5000  # Wait 5 seconds before scraping
)

# Scrape with custom wait time
response = client.scrape(
    website_url="https://example.com",
    wait_ms=5000
)

# Markdownify with custom wait time
response = client.markdownify(
    website_url="https://example.com",
    wait_ms=5000
)
```

### JavaScript SDK

```javascript
import { smartScraper, scrape, markdownify } from 'scrapegraph-js';

const apiKey = 'your-api-key';

// SmartScraper with custom wait time
const response = await smartScraper(
  apiKey,
  'https://example.com',
  'Extract product information',
  null, // schema
  null, // numberOfScrolls
  null, // totalPages
  null, // cookies
  { waitMs: 5000 } // Wait 5 seconds before scraping
);

// Scrape with custom wait time
const scrapeResponse = await scrape(apiKey, 'https://example.com', {
  waitMs: 5000
});

// Markdownify with custom wait time
const mdResponse = await markdownify(apiKey, 'https://example.com', {
  waitMs: 5000
});
```

### cURL

```bash
curl -X 'POST' \
  'https://api.scrapegraphai.com/v1/smartscraper' \
  -H 'accept: application/json' \
  -H 'SGAI-APIKEY: your-api-key' \
  -H 'Content-Type: application/json' \
  -d '{
  "website_url": "https://example.com",
  "user_prompt": "Extract product information",
  "wait_ms": 5000
}'
```

### Async Python SDK

```python
import asyncio

from scrapegraph_py import AsyncClient

async def scrape_with_wait():
    client = AsyncClient(api_key="your-api-key")

    # SmartScraper with custom wait time
    response = await client.smartscraper(
        website_url="https://example.com",
        user_prompt="Extract product information",
        wait_ms=5000
    )

    # Markdownify with custom wait time
    response = await client.markdownify(
        website_url="https://example.com",
        wait_ms=5000
    )

asyncio.run(scrape_with_wait())
```

## When to Adjust `wait_ms`

### Increase wait time when:

- The target page loads content dynamically via JavaScript
- You're scraping a SPA (React, Vue, Angular) that needs time to hydrate
- The page fetches data from APIs after the initial load
- You're seeing incomplete or empty results with the default wait time

### Decrease wait time when:

- The target page is static HTML with no dynamic content
- You want faster scraping for simple pages
- You're scraping many pages and want to optimize throughput

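The guidance above can be condensed into a small helper. This is an illustrative sketch, not part of the SDK: the function name `pick_wait_ms` and the threshold values are assumptions chosen for the example, starting from the documented default of 3000ms.

```python
# Hypothetical helper: pick a wait_ms value from how the target page renders.
# The thresholds below are illustrative, not official recommendations.
def pick_wait_ms(dynamic: bool, heavy_js: bool = False) -> int:
    if heavy_js:
        return 8000  # slow-hydrating SPAs need the most headroom
    if dynamic:
        return 5000  # content fetched after the initial load
    return 1000      # static HTML: capture quickly

# Usage: client.smartscraper(..., wait_ms=pick_wait_ms(dynamic=True))
```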
## Best Practices

1. **Start with the default** - The default value of 3000ms works well for most websites. Only adjust if you're seeing incomplete results.

2. **Test incrementally** - If the default doesn't capture all content, try increasing in 1000ms increments (4000, 5000, etc.) rather than setting a very high value.

3. **Combine with other parameters** - Use `wait_ms` together with `render_heavy_js` for JavaScript-heavy pages:

```python
response = client.smartscraper(
    website_url="https://heavy-js-site.com",
    user_prompt="Extract all products",
    wait_ms=8000,
    render_heavy_js=True
)
```

4. **Balance speed and completeness** - Higher wait times ensure more content is captured but increase response time and resource usage.

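The incremental testing in tip 2 can be sketched as a retry loop. Everything here is hypothetical scaffolding: `fetch` stands in for whichever SDK call you are tuning, and `looks_complete` for your own check on the result (e.g. a non-empty product list).

```python
# Sketch of tip 2: retry with a growing wait_ms until the result looks
# complete, stepping in 1000ms increments from the 3000ms default.
def tune_wait_ms(fetch, looks_complete, start=3000, step=1000, max_ms=10000):
    wait_ms = start
    result = None
    while wait_ms <= max_ms:
        result = fetch(wait_ms)      # e.g. client.scrape(..., wait_ms=wait_ms)
        if looks_complete(result):
            return wait_ms, result   # smallest wait that captured everything
        wait_ms += step
    return max_ms, result            # give up at the ceiling
```

Once you find the smallest value that works, hard-code it for that site rather than running the loop on every request.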
## Troubleshooting

<Accordion title="Content still missing after increasing wait_ms" icon="exclamation-triangle">
If increasing `wait_ms` doesn't capture all content:

- Try enabling `render_heavy_js=True` for JavaScript-heavy pages
- Check whether the content requires user interaction (clicks, scrolls); use `number_of_scrolls` for infinite-scroll pages
- Verify the content isn't behind authentication; use custom headers/cookies
</Accordion>

<Accordion title="Scraping is too slow" icon="clock">
If scraping is taking longer than expected:

- Lower the `wait_ms` value for static pages
- Use the default (omit the parameter) unless you specifically need a longer wait
- Consider using async clients for parallel scraping
</Accordion>
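For infinite-scroll pages, `wait_ms` can be combined with the `number_of_scrolls` option mentioned above. A sketch of the request body for the REST endpoint, assuming `number_of_scrolls` uses the same snake_case form as the other fields in the cURL example:

```python
# Request body combining wait_ms with scrolling for an infinite-scroll feed.
# Field names follow the cURL example; "number_of_scrolls" is assumed here.
payload = {
    "website_url": "https://example.com/feed",
    "user_prompt": "Extract all posts",
    "wait_ms": 5000,          # wait after load, and after each scroll settles
    "number_of_scrolls": 5,   # scroll 5 times to trigger lazy loading
}
```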

## API Reference

For detailed API documentation, see:
- [SmartScraper Start Job](/api-reference/endpoint/smartscraper/start)
- [Markdownify Start Job](/api-reference/endpoint/markdownify/start)

## Support & Resources

<CardGroup cols={2}>
  <Card title="API Reference" icon="book" href="/api-reference/introduction">
    Detailed API documentation
  </Card>
  <Card title="Dashboard" icon="dashboard" href="/dashboard/overview">
    Monitor your API usage and credits
  </Card>
  <Card title="Community" icon="discord" href="https://discord.gg/uJN7TYcpNa">
    Join our Discord community
  </Card>
  <Card title="GitHub" icon="github" href="https://github.com/ScrapeGraphAI">
    Check out our open-source projects
  </Card>
</CardGroup>

<Card title="Need Help?" icon="question" href="mailto:support@scrapegraphai.com">
  Contact our support team for assistance with wait time configuration or any other questions!
</Card>