Page Text Extractor

Return readable page text or scoped content via the unified scrape endpoint

POST /v1/scrape

Description

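Returns readable plain text from a page through the unified scrape endpoint. Set `output` to `text`, and optionally pass a `selector` to scope the extraction to one region of the page.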

Parameters

Name	Type	Required	Description
x-api-key	string	optional

How to Use

1. Send `url` and `output: "text"` to `/v1/scrape`.
2. Optionally set `selector` to narrow the extraction target.
3. Use `include_links` if you want hyperlinks preserved in text mode.
4. Read the extracted text from the `content` field.
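
The steps above can be sketched in Python as follows. This is a minimal illustration, not an official client: the host `https://api.example.com` is a placeholder, `build_scrape_request` and `extract_text` are hypothetical helper names, and the sketch assumes a JSON request body, a boolean `include_links`, an `x-api-key` request header, and a JSON response with `content` at the top level.

```python
import json
import urllib.request

# Placeholder host (assumption); substitute the real API base URL.
BASE_URL = "https://api.example.com"


def build_scrape_request(url, selector=None, include_links=False, api_key=None):
    """Build (but do not send) a POST request for text-mode extraction."""
    payload = {"url": url, "output": "text"}
    if selector is not None:
        # Scope extraction to a CSS selector such as "main" or "#content".
        payload["selector"] = selector
    if include_links:
        # Assumed to be a boolean flag; the page does not state its type.
        payload["include_links"] = True
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        BASE_URL + "/v1/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def extract_text(response_body):
    """Read the extracted text from the `content` field of a JSON response."""
    return json.loads(response_body)["content"]


req = build_scrape_request("https://example.com/post", selector="article")
# To send it for real: body = urllib.request.urlopen(req).read()
```

Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.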

About This Tool

Page Text Extractor is now handled through the unified scrape endpoint. Set `output` to `text` to return readable plain text from a page, or combine it with `selector` to scope the extraction to a particular region such as `main`, `article`, or `#content`.

This is useful when you want search- or NLP-friendly content without Markdown or raw HTML.

Why Use This Tool

Full-page indexing — Build search indexes from plain text
NLP workflows — Feed readable content into classification or sentiment pipelines
Scoped extraction — Focus on a page section rather than the whole body
Change monitoring — Compare text output between site versions
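
As one illustration of the change-monitoring use case, two text-mode snapshots of the same page can be compared line by line with Python's standard `difflib`. This is only a sketch: `text_changes` is a hypothetical helper, and the sample strings stand in for the `content` values returned by two separate requests.

```python
import difflib


def text_changes(old_text, new_text):
    """Return only the added and removed lines between two text snapshots."""
    diff = list(
        difflib.unified_diff(
            old_text.splitlines(), new_text.splitlines(), lineterm="", n=0
        )
    )
    # The first two entries are the "---" and "+++" file headers; skip them,
    # then keep changed lines and drop the "@@" hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Made-up sample snapshots for illustration only.
old_snapshot = "Welcome\nVersion one of the page\nFooter"
new_snapshot = "Welcome\nVersion two of the page\nFooter"

for change in text_changes(old_snapshot, new_snapshot):
    print(change)
```

Because text mode removes most formatting, a diff like this tends to surface wording changes rather than markup churn.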

Frequently Asked Questions

How is this different from Markdown mode?

Text mode removes most formatting and is flatter. Markdown mode preserves more structure.

How is this different from clean mode?

`clean` aims for readable content with some structure preserved, while `text` is better for plain-text analysis.

Can I limit it to one area of the page?

Yes — use the `selector` field to scope extraction to a specific CSS selector.

Start using Page Text Extractor now

Get your free API key and make your first request in under a minute.
