
Data Parsing

Last updated 7 months ago

What?

Transforming raw HTML into clean, accurate, and usable data is no easy task. Every website has its own layout and changes without warning, so consistent and accurate data extraction calls for a diverse set of powerful tools.

Nimble's Web API comes built-in with three tools to help you effectively extract the key data you need easily, reliably, and at scale.

Let's look at each one in more detail and examine some examples to understand when it's right to use each one.

Parsing Templates

Nimble Parsing Templates are an easy-to-use, surgical parsing tool that offers a high degree of control and specificity. They provide a set of functions (called Types, Extractors, and Objects) that users can harness to parse exactly the web data they want.

Parsing Templates offer similar levels of accuracy and freedom to Beautiful Soup, but with significantly less complexity. Their goal is to help users fill gaps left by automated systems when collecting data from unorthodox or highly-specialized sources.

Parsing Templates also have a gentler learning curve than Beautiful Soup and operate seamlessly alongside AI Parsing Skills, so they can be used in parallel with, or independently of, Nimble's other parsing solutions.

Merge Dynamic Parser

The Merge Dynamics feature enables users to combine Nimble's AI-powered parsing with their own custom parsing logic into a single, unified response.

This allows for a highly customizable and flexible approach to data extraction, where the precision and automation of AI parsing can be enhanced or tailored by incorporating specific user-defined parsing rules.

The result is a comprehensive and cohesive data set that aligns perfectly with your unique requirements.

This feature is particularly useful for scenarios where standard AI parsing might need refinement or additional context provided by custom logic, ensuring that the final output meets your exact needs.


Nimble AI Parsing Skills (Beta)

Nimble AI Parsing Skills empower engineers to easily parse web data from any webpage into accurate, consistent JSON structures. By combining HTML-trained LLMs with classical parsing techniques, AI Parsing Skills make it possible to parse web pages of any quantity and variety in real time and at scale.

  • Automatic mode: In Automatic mode, no user input is needed at all. Simply enable parsing, and our system does the rest. Behind the scenes, Nimble uses its built-in collection of generic parsing skills to extract data from webpages. Results are generally good, but may vary from page to page.

  • Skills mode (coming soon): In Skills mode, the user creates a simple, plain-English schema that guides the creation of custom parsers, also called Skills.

Why?

  1. Enhanced Accuracy: LLMs are adept at understanding the context and structure of web content, enabling them to parse complex web data more accurately than traditional parsing tools. This results in higher-quality data extraction, particularly from sophisticated web pages and across site structure changes.

  2. Scalability: AI models can handle a wide range of website layouts and structures without needing specific rules for each site. This scalability makes it easier to process data from a broad spectrum of sources with minimal setup time.

  3. Continuity: Unlike traditional parsers that require pre-defined schemas and are often brittle to changes in web page design, AI-based parsing adapts to changes in webpage layouts and content schemes, reducing the need for frequent manual updates.

  4. Efficiency: By automating the structuring of data into usable formats, this feature saves significant time and effort that would otherwise be spent on manual data cleaning and organization. This allows users to focus on analysis and insights rather than data preprocessing.

  5. Integration Readiness: The structured data output from AI Parsing is readily integrable into various data analysis tools and applications, enhancing the workflow from data collection to actionable insights.

Which tool is right for me?

Each tool has its own advantages and disadvantages. The table below summarizes the features of each tool to help you decide which is right for you. Remember that these tools can operate in parallel within each request, and we encourage users to try each one and experiment to get the best results.

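| Feature | AI Parsing Skills | Parsing Templates | Merge Dynamic |
| --- | --- | --- | --- |
| Fully-automated | ✅ | ❌ | ❌ |
| Manual control | ✅ | ✅ | ✅ |
| Auto-healing | ✅ | ❌ | ❌ |
| Easy to use | ✅ | ✅ | ✅ |
| CSS Selector targeting | ✅ | ✅ | ✅ |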

Additional Information

  • Supported by realtime (except cloud delivery), asynchronous, and batch requests.

Request Option

Enable Parsing

To run a Nimble API request with data parsing (HTML -> JSON), set the parse parameter to true. Behind the scenes, the Nimble AI Parser dynamically parses the webpage's HTML content into a structured data format (JSON).

Data Formatting

To receive the Nimble API response as JSON (instead of HTML), include the parameter "format": "json" in the body of the request. JSON is the default value of the format parameter, so it does not need to be set manually, but it is configurable.

| Parameter | Required | Description |
| --- | --- | --- |
| parse | Optional (default = false) | Enum: true or false. true: the page's content is parsed and returned in JSON format. false: the response includes page headers and raw data (without parsing). |
| format | Optional (default = JSON) | Enum: JSON or HTML. The data response format. With HTML, an error is still returned as JSON with an error message. |

When parse is set to true, format must be set to JSON (which is the default format).
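
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Nimble library: build_parse_options is a hypothetical helper name, and only the url, parse, and format fields and their defaults come from this page. Lowercase format values are assumed, matching the example request on this page.

```python
import json

# Values documented for the format parameter (lowercased, as in the curl example).
VALID_FORMATS = {"json", "html"}


def build_parse_options(url, parse=False, fmt="json"):
    """Return a request body dict using the documented defaults.

    Hypothetical helper: only the field names url, parse, and format
    are taken from the documentation.
    """
    fmt = fmt.lower()
    if fmt not in VALID_FORMATS:
        raise ValueError(f"format must be one of {sorted(VALID_FORMATS)}, got {fmt!r}")
    if parse and fmt != "json":
        # Documented rule: when parse is true, format must be JSON.
        raise ValueError("parse=True requires format='json'")
    return {"url": url, "parse": parse, "format": fmt}


body = build_parse_options("https://www.google.com", parse=True)
print(json.dumps(body, indent=4))
```

Rejecting an invalid combination locally surfaces the mistake before any request is made; how the API itself responds to such a request is not covered here.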

Example Request

  • The "format": "json" field in the request below is shown for clarity only. JSON is the default value of format, so the field can be omitted.

curl -X POST 'https://api.webit.live/api/v1/realtime/web' \
--header 'Authorization: Basic <credential string>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://www.google.com",
    "parse": true,
    "format": "json"
}'
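
The same request can be sketched in Python using only the standard library. This is an illustrative equivalent of the curl command above, not an official client: make_request is a hypothetical helper name, the username and password are placeholders, and the credential string is assumed to be the standard HTTP Basic encoding (base64 of "username:password"). The request is only built here; the commented lines show how it could be sent.

```python
import base64
import json
import urllib.request

# Endpoint taken from the curl example above.
API_URL = "https://api.webit.live/api/v1/realtime/web"


def make_request(credential, target_url):
    """Build (but do not send) the POST request from the curl example."""
    payload = {"url": target_url, "parse": True, "format": "json"}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credential}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder credentials; assumes the credential string is base64("username:password").
credential = base64.b64encode(b"username:password").decode("ascii")
request = make_request(credential, "https://www.google.com")

# Sending requires valid credentials and network access:
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.loads(response.read())
```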

Next Steps

Dive into the full guides for each of Nimble's parsing solutions:

  • AI Parsing Skills (coming soon)
  • Parsing Templates
  • Merge Dynamics

Supported endpoints: Web, SERP, and Maps.

Not supported endpoints: eCommerce and Social.