Parsing Types

The Search API supports three parsing types, each suited to different use cases:

plain_text

Converts HTML to clean plain text with preserved line breaks. Scripts, styles, and images are removed. Best for:

  • Text analysis and NLP tasks

  • Content summarization

  • Simple content extraction

Example output:

Latest AI Trends in 2025

The artificial intelligence landscape has evolved dramatically in recent years...
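
The behavior described above can be approximated locally. The following is a minimal sketch, not the Search API's actual implementation: it uses Python's standard `html.parser` module, and the names `PlainTextSketch` and `html_to_plain_text` are hypothetical. It assumes that block-level closing tags and `<br>` mark line breaks, and it drops blank lines for simplicity, so spacing may differ from what the API returns.

```python
from html.parser import HTMLParser


class PlainTextSketch(HTMLParser):
    """Hypothetical helper: collects visible text, skipping script/style content."""

    SKIPPED = {"script", "style"}
    # Assumption: closing one of these block-level tags ends the current line.
    BLOCK = {"p", "div", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag == "br":
            self._parts.append("\n")
        # <img> and similar tags carry no text, so they produce nothing.

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in self.BLOCK:
            self._parts.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self):
        # Trim each line and drop empty ones (a simplification).
        lines = (line.strip() for line in "".join(self._parts).splitlines())
        return "\n".join(line for line in lines if line)


def html_to_plain_text(html: str) -> str:
    parser = PlainTextSketch()
    parser.feed(html)
    parser.close()  # flush any buffered trailing text
    return parser.text()
```

Calling `html_to_plain_text` on a page fragment returns one line per block of text, which is usually enough to preview what plain_text output will look like before feeding it to a text-analysis step.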

markdown

Converts HTML to Markdown, preserving headings, links, and basic formatting. Best for:

  • Documentation extraction

  • Content migration

  • LLM-friendly input

  • Human-readable output

Example output:

# Latest AI Trends in 2025

The artificial intelligence landscape has evolved dramatically...

## Key Developments
- Generative AI advances
- Autonomous systems
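
To illustrate the kind of conversion described above, here is a minimal sketch using Python's standard `html.parser` module. It is not the Search API's implementation; `MarkdownSketch` and `html_to_markdown` are hypothetical names, and the sketch only handles headings, paragraphs, unordered list items, and links, which is an assumption about what "basic formatting" covers.

```python
import re
from html.parser import HTMLParser


class MarkdownSketch(HTMLParser):
    """Hypothetical helper: maps a small subset of HTML to Markdown."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._out = []
        self._href = None
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1
        elif tag in self.HEADINGS:
            self._out.append("\n\n" + self.HEADINGS[tag])
        elif tag == "p":
            self._out.append("\n\n")
        elif tag == "li":
            self._out.append("\n- ")
        elif tag == "a":
            self._href = dict(attrs).get("href")
            self._out.append("[")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag == "a":
            self._out.append(f"]({self._href or ''})")
            self._href = None
        elif tag == "ul":
            self._out.append("\n")

    def handle_data(self, data):
        if self._skip_depth == 0:
            # Collapse whitespace runs but keep spaces around inline links.
            self._out.append(re.sub(r"\s+", " ", data))

    def markdown(self):
        lines = [line.strip() for line in "".join(self._out).split("\n")]
        # Never keep more than one blank line between blocks.
        return re.sub(r"\n{3,}", "\n\n", "\n".join(lines)).strip()


def html_to_markdown(html: str) -> str:
    parser = MarkdownSketch()
    parser.feed(html)
    parser.close()
    return parser.markdown()
```

Run on HTML with an `<h1>`, a paragraph, an `<h2>`, and a two-item `<ul>`, this sketch yields text shaped like the example output above. Ordered lists, emphasis, tables, and code blocks are left out on purpose to keep the sketch short.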

simplified_html

Strips unnecessary attributes and elements while maintaining HTML structure. Best for:

  • Web scraping with structure preservation

  • Custom parsing pipelines

  • Lightweight HTML processing

Example output:

<h1>Latest AI Trends in 2025</h1>
<p>The artificial intelligence landscape...</p>
<ul>
<li>Generative AI advances</li>
<li>Autonomous systems</li>
</ul>
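
The idea of stripping attributes and elements while keeping the tag structure can be sketched as follows. This is an illustration only, built on Python's standard `html.parser` module, and not the Search API's implementation. Which attributes and elements count as "unnecessary" is not specified here, so the sketch assumes that `<script>` and `<style>` elements are dropped and that only `href` is kept. The names `SimplifiedHtmlSketch` and `simplify_html` are hypothetical.

```python
from html import escape
from html.parser import HTMLParser


class SimplifiedHtmlSketch(HTMLParser):
    """Hypothetical helper: re-emits tags without most attributes."""

    DROPPED = {"script", "style"}   # assumption: elements removed entirely
    KEPT_ATTRS = {"href"}           # assumption: the only attribute preserved

    def __init__(self):
        super().__init__()
        self._out = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.DROPPED:
            self._skip_depth += 1
        elif self._skip_depth == 0:
            kept = "".join(
                f' {name}="{escape(value or "", quote=True)}"'
                for name, value in attrs
                if name in self.KEPT_ATTRS
            )
            self._out.append(f"<{tag}{kept}>")

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> get an opening tag only.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.DROPPED:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif self._skip_depth == 0:
            self._out.append(f"</{tag}>")

    def handle_data(self, data):
        if self._skip_depth == 0:
            # The parser decodes entities, so re-escape text on output.
            self._out.append(escape(data, quote=False))

    def html(self):
        return "".join(self._out)


def simplify_html(html: str) -> str:
    parser = SimplifiedHtmlSketch()
    parser.feed(html)
    parser.close()
    return parser.html()
```

Because the output is still HTML, it can be passed straight into a custom parsing pipeline. Comments and the doctype are dropped as a side effect of the parser's default handlers.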
