Python SDK examples for ScrapingBee users
If you are used to ScrapingBee's Python-first code snippets, this guide gives you the nearest Toolkit API equivalent for each one, using our Python SDK.
Install the SDK:
pip install toolkitapi
1. Basic page fetch
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.fetch(url="https://toolkitapi.io", output="html")
    print(result["status_code"])
    print(result["content"][:500])
2. JavaScript rendering
ScrapingBee often enables browser rendering by default. In Toolkit API, turn it on explicitly when you need it.
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.render_page(
        url="https://toolkitapi.io/app",
        wait_until="networkidle",
        output="html",
    )
    print(result["js_rendered"])
    print(result["content"][:500])
3. Wait for a selector
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.render_page(
        url="https://toolkitapi.io/product/123",
        wait_for=".price",
        wait_timeout=15000,
        output="clean",
    )
    print(result["content"])
4. Wait for browser state
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.render_page(
        url="https://toolkitapi.io/dashboard",
        wait_until="load",
        output="text",
    )
    print(result["content"])
Use one of: load, domcontentloaded, networkidle.
5. Wait for a fixed amount of time
For scrape responses, the preferred pattern is waiting for a selector or browser state rather than sleeping blindly. If you need an actual render delay for a visual capture, use the Screenshot SDK:
from toolkitapi import Screenshot

with Screenshot(api_key="tk_...") as shot:
    png = shot.capture(
        url="https://toolkitapi.io",
        delay=3000,
        format="png",
    )
6. Block images, fonts, or stylesheets
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.render_page(
        url="https://toolkitapi.io/news",
        output="markdown",
        block_resources=["image", "stylesheet", "font"],
    )
    print(result["content"])
7. Remove clutter and ad-like noise
Toolkit API does not expose a separate ad-block toggle in the scrape SDK. The closest equivalent is clean extraction plus resource blocking.
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.clean_content(
        url="https://toolkitapi.io/article",
        output="clean",
        remove=[".promo", ".newsletter-box", ".sticky-banner"],
    )
    print(result["content"])
8. Return Markdown content
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.extract_markdown(url="https://toolkitapi.io/blog/post")
    print(result["content"])
9. Return plain text content
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.page_text(url="https://toolkitapi.io/blog/post")
    print(result["content"])
10. JSON response by default
Unlike services that need a special JSON wrapper flag, Toolkit API already returns structured JSON.
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.fetch(
        url="https://toolkitapi.io",
        output="markdown",
        extract={"meta_tags": True, "links": True},
    )
    print(result.keys())
    print(result.get("meta_tags"))
    print(result.get("links"))
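Because the response is plain JSON, downstream filtering is ordinary dict work. A minimal sketch, assuming links comes back as a list of absolute URL strings (the exact response shape may differ), using a stand-in dict in place of a real fetch result:

```python
from urllib.parse import urlparse

def internal_links(result, domain):
    """Filter extracted links down to those hosted on the given domain.

    Assumes result["links"] is a list of absolute URL strings.
    """
    return [
        url
        for url in result.get("links", [])
        if urlparse(url).netloc.endswith(domain)
    ]

# Stand-in for a real scrape.fetch(...) response:
sample = {"links": ["https://toolkitapi.io/docs", "https://example.com/x"]}
print(internal_links(sample, "toolkitapi.io"))  # ['https://toolkitapi.io/docs']
```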
11. Return source HTML
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.fetch(
        url="https://toolkitapi.io",
        output="html",
        render_js=False,
    )
    print(result["content"])
If you need both rendered and unrendered views, make one request with render_js=False and a second one with render_js=True.
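One common use of the two views is checking whether a piece of content only exists after JavaScript runs. The helper below is a hypothetical sketch that works on the two content strings however you fetched them; the stand-in strings take the place of real responses:

```python
def js_only(rendered_html, source_html, marker):
    """Return True if marker appears only in the rendered view,
    i.e. the content was injected by JavaScript."""
    return marker in rendered_html and marker not in source_html

# Stand-ins for the render_js=True and render_js=False results:
source = "<div id='app'></div>"
rendered = "<div id='app'><span class='price'>$9</span></div>"
print(js_only(rendered, source, "class='price'"))  # True
```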
12. CSS selector extraction
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.css_extract(
        url="https://toolkitapi.io/product/123",
        render_js=True,
        selectors={
            "title": "h1",
            "price": ".price",
            "buy_link": {
                "selector": ".buy-now",
                "attr": "href",
            },
        },
    )
    print(result.get("selectors"))
13. AI extraction with a schema
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.ai_extract(
        url="https://toolkitapi.io/product/123",
        render_js=True,
        prompt="Extract the product details shown on the page.",
        schema={
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "price": {"type": "string"},
                "availability": {"type": "string"},
            },
        },
    )
    print(result.get("ai_extract"))
14. Article extraction
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.extract_article(url="https://toolkitapi.io/blog/post")
    print(result.get("article"))
    print(result["content"])
15. Metadata extraction
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    meta = scrape.get_meta_tags("https://toolkitapi.io")
    preview = scrape.link_preview("https://toolkitapi.io")
    links = scrape.get_links("https://toolkitapi.io")
    images = scrape.get_images("https://toolkitapi.io")
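A typical consumer of link_preview is a small display formatter. The sketch below assumes the preview dict carries "title" and "description" keys; that shape is an assumption, so missing keys fall back to empty strings, and the stand-in dict takes the place of a real response:

```python
def preview_card(preview):
    """Flatten a link-preview dict into a short display string.

    The "title" and "description" keys are assumed, not confirmed;
    missing keys fall back to empty strings.
    """
    title = preview.get("title", "").strip()
    desc = preview.get("description", "").strip()
    return f"{title}: {desc}" if desc else title

# Stand-in for a real scrape.link_preview(...) response:
sample = {"title": "Toolkit API", "description": "Scraping made simple."}
print(preview_card(sample))  # Toolkit API: Scraping made simple.
```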
16. Headers and custom cookies
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    result = scrape.fetch(
        url="https://toolkitapi.io/account",
        render_js=True,
        headers={"Accept-Language": "en-GB,en;q=0.9"},
        cookies=[
            {
                "name": "sessionid",
                "value": "abc123",
                "domain": ".toolkitapi.io",
            }
        ],
        extract={"headers": True},
    )
    print(result.get("headers"))
17. Session reuse
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    scrape.render_page(
        url="https://toolkitapi.io/login",
        session_name="shop-session",
    )
    result = scrape.render_page(
        url="https://toolkitapi.io/cart",
        session_name="shop-session",
        output="text",
    )
    print(result["content"])
18. Proxy and geolocation-style usage
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    us_result = scrape.fetch(
        url="https://toolkitapi.io",
        proxy="US",
        output="html",
    )
    dc_result = scrape.fetch(
        url="https://toolkitapi.io",
        proxy="datacenter",
        output="html",
    )
19. Sitemap, robots, and crawl
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    sitemap = scrape.parse_sitemap(
        "https://toolkitapi.io/sitemap.xml",
        limit=100,
        discover_links=True,
    )
    robots = scrape.parse_robots_txt("https://toolkitapi.io")
    crawl_job = scrape.crawl(
        start_url="https://toolkitapi.io/docs",
        max_pages=20,
        max_depth=2,
        output="markdown",
    )
    crawl_result = scrape.get_crawl_job(crawl_job["job_id"])
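Since crawl returns a job handle, most code polls get_crawl_job until the job settles. Here is a generic polling helper, sketched with a plain callable so the pattern stands on its own; the "status" field and its "completed"/"failed" values are assumptions about the job response shape:

```python
import time

def wait_for_job(fetch_status, poll_interval=2.0, timeout=120.0):
    """Poll fetch_status() until the job reaches a terminal state.

    fetch_status: a zero-argument callable returning a job dict, e.g.
        lambda: scrape.get_crawl_job(crawl_job["job_id"])
    The "status" key and its values are assumed, not confirmed.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        job = fetch_status()
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(poll_interval)
    raise TimeoutError("crawl job did not finish in time")
```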
20. PDF extraction
from toolkitapi import Scrape

with Scrape(api_key="tk_...") as scrape:
    pdf = scrape.pdf_extract(
        url="https://toolkitapi.io/report.pdf",
        pages="1-3",
    )
    print(pdf["text"])
21. Screenshot equivalents
Some ScrapingBee examples are really visual-browser tasks. In Toolkit API, those are better served by the Screenshot SDK.
Full-page screenshot
from toolkitapi import Screenshot

with Screenshot(api_key="tk_...") as shot:
    png = shot.capture(
        url="https://toolkitapi.io",
        full_page=True,
        format="png",
    )
Screenshot a specific selector
from toolkitapi import Screenshot

with Screenshot(api_key="tk_...") as shot:
    image = shot.capture_element(
        url="https://toolkitapi.io",
        selector=".pricing-table",
        format="png",
    )
HTML to PDF
from toolkitapi import Screenshot

with Screenshot(api_key="tk_...") as shot:
    pdf_bytes = shot.capture_pdf(
        url="https://toolkitapi.io/invoice/123",
        page_format="A4",
        print_background=True,
    )
22. Common mapping table
| ScrapingBee idea | Toolkit API SDK |
|---|---|
| Basic HTML fetch | Scrape.fetch(..., output="html") |
| JavaScript rendering | Scrape.render_page(...) |
| Wait for selector | wait_for=".selector" |
| Wait for browser load | wait_until="load" or wait_until="networkidle" |
| Markdown output | Scrape.extract_markdown(...) |
| Text output | Scrape.page_text(...) |
| CSS extraction | Scrape.css_extract(...) |
| AI extraction | Scrape.ai_extract(...) |
| Link, image, meta extraction | get_links, get_images, get_meta_tags, link_preview |
| Sitemap / robots | parse_sitemap, parse_robots_txt |
| Crawl | crawl and get_crawl_job |
| Screenshot / PDF rendering | Screenshot.capture, capture_element, capture_pdf |
23. Features that are not one-to-one today
A few ScrapingBee-specific switches do not have a direct public scrape SDK equivalent yet:
- pure header forwarding mode
- custom upstream proxy passthrough
- target-site POST or PUT forwarding
- special Google-only scrape toggle
- explicit transparent-status toggle
- dedicated scrape usage endpoint in the SDK
- viewport width and height controls in the Scrape class itself
When you need visual browser configuration such as viewport size, element capture, or PDF rendering, use the Screenshot SDK alongside the Scrape SDK.