Markdown Format

Markdown Support

Extract clean, formatted content from any webpage as markdown - perfect for AI applications, content analysis, and text processing pipelines.

The Web API now supports markdown extraction, providing clean, structured text content without HTML tags or formatting noise. This feature is available in two modes depending on your use case:

  1. Pure Markdown Response - Returns only markdown content

  2. Markdown with Structured Data - Returns markdown alongside HTML and parsed data

Benefits

  • Clean Content for LLMs: Markdown provides optimal formatting for language models and AI applications

  • Simplified Processing: No need to parse HTML - get readable content directly

  • Flexible Output Options: Choose between pure markdown or markdown alongside structured data

  • Reduced Payload Size: Markdown responses are typically 60-80% smaller than the equivalent HTML

Request Options

Parameter
Required
Description

format

Optional (default = JSON)

Enum: JSON | HTML | JSON-LINES | RAW | MARKDOWN. Set to MARKDOWN to receive a pure markdown response (cannot be combined with other parsing options)

markdown

Optional (default = false)

Boolean. Set to true to include a markdown field in the standard JSON response alongside HTML and parsed data

Usage Notes

  • Use format=markdown for text-only workflows where you need just the content

  • Use markdown=true when you need markdown together with the rest of the response (HTML and parsed data)

  • JavaScript-rendered content requires render=true for accurate markdown extraction
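The usage notes above amount to a small decision rule. A minimal sketch of that rule follows, assuming the options are passed as key/value pairs named exactly as in the table (format, markdown, render); the helper name markdown_options is hypothetical and not part of the API.

```python
def markdown_options(need_structured_data: bool, js_rendered: bool) -> dict:
    """Pick request options following the usage notes (hypothetical helper)."""
    options = {}
    if need_structured_data:
        # Keep the standard JSON response and add a markdown field to it.
        options["markdown"] = True
    else:
        # Text-only workflow: ask for a pure markdown response.
        options["format"] = "markdown"
    if js_rendered:
        # JavaScript-rendered pages need rendering for accurate extraction.
        options["render"] = True
    return options


text_only = markdown_options(need_structured_data=False, js_rendered=True)
combined = markdown_options(need_structured_data=True, js_rendered=False)
```

The two modes are kept mutually exclusive here because format=markdown cannot be used with other parsing options.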

Example Request - Pure Markdown
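A request of this kind might look like the sketch below. The endpoint URL, the bearer-token header, the url parameter name, and sending the options as a JSON POST body are all assumptions for illustration; only format=markdown and render=true come from this page. The request is built but not sent.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the real endpoint and credentials.
API_ENDPOINT = "https://api.example.com/web"
API_TOKEN = "YOUR_API_TOKEN"


def build_pure_markdown_request(target_url: str, render: bool = False) -> urllib.request.Request:
    """Build (but do not send) a request asking for a pure markdown response."""
    payload = {
        "url": target_url,      # assumed name of the target-page parameter
        "format": "markdown",   # pure markdown response
    }
    if render:
        payload["render"] = True  # needed for JavaScript-rendered content
    return urllib.request.Request(
        API_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_TOKEN}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
        method="POST",
    )


pure_request = build_pure_markdown_request("https://example.com/article", render=True)
# To send it: urllib.request.urlopen(pure_request).read().decode("utf-8")
```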

Pure Markdown Response

Example Request - Markdown with Structured Data
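For the combined mode, a sketch under the same kind of assumptions: the endpoint, token, url parameter name, and JSON POST body are hypothetical placeholders, while markdown=true and render=true are taken from the options described on this page. Note that format is left at its default so the standard JSON response is returned.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute the real endpoint and credentials.
API_ENDPOINT = "https://api.example.com/web"
API_TOKEN = "YOUR_API_TOKEN"

structured_payload = {
    "url": "https://example.com/product/123",  # assumed parameter name
    "markdown": True,  # add a markdown field to the standard JSON response
    "render": True,    # render JavaScript before extraction
}

# Built but not sent; pass it to urllib.request.urlopen to execute.
structured_request = urllib.request.Request(
    API_ENDPOINT,
    data=json.dumps(structured_payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",  # assumed auth scheme
        "Content-Type": "application/json",
    },
    method="POST",
)
```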

Markdown with Structured Data Response
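Reading the markdown out of such a response could be sketched as follows. The sample body is illustrative and is not real API output: only the markdown field name comes from this page, while html_content and parsing are placeholder keys standing in for the HTML and parsed data.

```python
import json

# Illustrative body only. "html_content" and "parsing" are hypothetical keys.
sample_body = json.dumps({
    "html_content": "<h1>Title</h1><p>Hello</p>",
    "parsing": {"title": "Title"},
    "markdown": "# Title\n\nHello",
})


def extract_markdown(body: str) -> str:
    """Return the markdown field from a JSON response body."""
    data = json.loads(body)
    markdown = data.get("markdown")
    if markdown is None:
        # The field is only included when markdown=true was requested.
        raise KeyError("no markdown field; was markdown=true set on the request?")
    return markdown


print(extract_markdown(sample_body))
```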
