Page Interactions

Nimble Labs Beta Feature

Page Interactions let users perform operations on a webpage, such as clicking, typing, and scrolling, before the web data is collected and returned.

This is useful (and sometimes necessary) in a variety of situations, such as single-page applications that use lazy loading and require scrolling to load the desired data. Another example is e-commerce websites that require a button click or zip code input before they display a product price.

Currently, Page Interactions are synchronous: operations run sequentially, one at a time, and the overall sequence is limited to a maximum of 120 seconds. Page Interactions are supported by real-time, asynchronous, and batch requests.

Using Page Interactions

To perform page interactions, set render to true and add a render_flow argument with a JSON list of the operations you’d like to perform, as in the example below:

curl -X POST 'https://api.webit.live/api/v1/realtime/web' \
--header 'Authorization: Basic <credential string>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://www.example.com",
    "render": true,
    "format": "json",
    "country": "US",
    "parse": true,
    "render_flow": [
        {
            "wait": {
                "delay": 500
            }
        }
    ]
}'

Page Interactions will not function if rendering is not enabled in the request. Please be sure to set render:true in order for interactions to function correctly.
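
For readers calling the API from code rather than curl, the same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields are taken from the curl example above; the helper name build_request is an illustrative assumption and not part of the API.

```python
import json
import urllib.request

API_URL = "https://api.webit.live/api/v1/realtime/web"  # endpoint from the curl example


def build_request(url, render_flow, credential):
    """Build (but do not send) a real-time request with Page Interactions.

    `credential` is the same <credential string> used in the curl example.
    `build_request` is a hypothetical helper name, not part of the API.
    """
    payload = {
        "url": url,
        "render": True,  # Page Interactions do not function unless rendering is enabled
        "format": "json",
        "country": "US",
        "parse": True,
        "render_flow": render_flow,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {credential}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_request(
    "https://www.example.com",
    [{"wait": {"delay": 500}}],
    "<credential string>",
)
# To send it, pass `request` to urllib.request.urlopen() and decode the JSON body.
```

Building the request separately from sending it makes it easy to inspect the payload before any call is made.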

Response

When a request with Page Interactions is completed, the response includes an additional property named render_flow that contains a log detailing the result of each interaction. For each interaction, the following fields are present:

  • Name – the interaction type (e.g. scroll, wait_and_click, wait_and_type).

  • Result – some operations may return information. If there is information to return, result will contain the returned value, otherwise result will be true.

  • Error – this field is only present if an error occurs, and details the type of error encountered.

  • Status – the status of the interaction. Interactions can have one of four statuses:

| Status | Description |
| --- | --- |
| no-run | The interaction was not performed. |
| in-progress | The interaction was in progress when the global 120-second timeout was triggered. |
| done | The interaction finished successfully. |
| error | The interaction encountered an error. |

The above request would produce the following response:

{
    "status": "success",
    "html_content":"...",
    "render_flow": {
        "success": true,
        "results": [
            {
                "name": "wait",
                "status": "done",
                "result": true
            }
        ]
    }
}
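
As a rough illustration of reading this log, the Python sketch below walks the results list and reports one line per interaction. The sample is the response shown above, copied verbatim; the function name summarize_render_flow is a hypothetical helper, not part of the API.

```python
import json

# Sample response copied from the example above (html_content elided as in the source).
sample = json.loads("""
{
    "status": "success",
    "html_content": "...",
    "render_flow": {
        "success": true,
        "results": [
            {"name": "wait", "status": "done", "result": true}
        ]
    }
}
""")


def summarize_render_flow(response):
    """Return one (name, status, detail) tuple per interaction in the log."""
    log = response.get("render_flow", {})
    rows = []
    for entry in log.get("results", []):
        # "error" is only present on failed steps; otherwise report "result".
        detail = entry.get("error", entry.get("result"))
        rows.append((entry["name"], entry["status"], detail))
    return rows


for name, status, detail in summarize_render_flow(sample):
    print(f"{name}: {status} ({detail})")  # prints: wait: done (True)
```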

Chaining Page Interactions

Multiple interactions can be chained together and performed sequentially. To do this, simply add additional steps to the render_flow property. When chaining interactions, it’s important to consider the maximum overall execution time of all the requested interactions.

Each request has a maximum timeout of 120 seconds for the overall execution, which includes all render flow interactions such as delays, clicks, scrolls, and timeouts if and when they occur.

Additionally, if any particular interaction encounters an error, interactions that come after it will not be executed. The chain is halted, and the data from the webpage is returned normally.
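
One way to sanity-check a chain against the 120-second limit before sending it is to add up the worst case for each step. The Python sketch below is a rough estimate under stated assumptions: it treats each step as costing at most its timeout plus its delay, uses the 1000 ms timeout default listed in the Operation Reference when no timeout is given, and ignores page load and the time the actions themselves take. It is not a description of how the service accounts for time, and the names used are hypothetical.

```python
MAX_TOTAL_MS = 120_000  # overall limit stated above

# Operations whose Operation Reference entry lists a timeout (default 1000 ms).
DEFAULT_TIMEOUT_MS = {
    "wait_for": 1000,
    "wait_and_click": 1000,
    "wait_and_type": 1000,
    "scroll": 1000,
}


def worst_case_ms(render_flow):
    """Estimate an upper bound: sum of timeout + delay over all steps."""
    total = 0
    for step in render_flow:
        for name, args in step.items():
            # Fall back to the documented default timeout, or 0 if the
            # operation has no timeout parameter (an assumption of this sketch).
            total += args.get("timeout", DEFAULT_TIMEOUT_MS.get(name, 0))
            total += args.get("delay", 0)
    return total


# Timeouts and delays of a five-step chain; selectors are omitted for
# brevity because the estimate does not use them.
flow = [
    {"wait_and_type": {"timeout": 2000}},
    {"wait_and_click": {"timeout": 2000}},
    {"wait_and_click": {"timeout": 5000}},
    {"wait_and_click": {"timeout": 2000}},
    {"wait": {"delay": 5000}},
]

print(worst_case_ms(flow))                  # 16000
print(worst_case_ms(flow) <= MAX_TOTAL_MS)  # True
```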

Below is an example of a request with several interactions chained together in a sequence:

curl -X POST 'https://api.webit.live/api/v1/realtime/web' \
--header 'Authorization: Basic <credential string>' \
--header 'Content-Type: application/json' \
--data-raw '{
    "url": "https://www.example.com",
    "render": true,
    "format": "json",
    "country": "US",
    "parse": true,
    "render_flow": [
        {
            "wait_and_type": {
                "selector": "input[type='\''search'\'']",
                "timeout": 2000,
                "value": "eggplant",
                "click_on_element": true
            }
        },
        {
            "wait_and_click": {
                "selector": "#__next > div:nth-child(1) > div > span > header > form > div > button.absolute.bn.br-100.bg-gold.h3.w3 > i",
                "timeout": 2000
            }
        },
        {
            "wait_and_click": {
                "selector": "#maincontent > main > div > div:nth-child(2) section:nth-child(9) > div > h3 > button",
                "timeout": 5000,
                "scroll": true
            }
        },
        {
            "wait_and_click": {
                "selector": "input[name='\''Organic'\'']",
                "timeout": 2000
            }
        },
        {
            "wait": {
                "delay": 5000
            }
        }
    ]
}'

When chaining interactions, the response lists a status and result for each interaction:

{
    "status": "success",
    "html_content":"...",
    "render_flow": {
        "success": true,
        "results": [
            {
                "name": "wait_and_type",
                "status": "done",
                "result": true
            },
            {
                "name": "wait_and_click",
                "status": "done",
                "result": true
            },
            {
                "name": "wait_and_click",
                "status": "done",
                "result": true
            },
            {
                "name": "wait_and_click",
                "status": "done",
                "result": true
            },
            {
                "name": "wait",
                "status": "done",
                "result": true
            }
        ]
    }
}

Failure Handling

The most common errors are execution errors and timeouts. Most interaction functions accept an optional timeout property that lets users cap their execution time.

If an error is encountered in a chain of interactions, interactions that come after the failed interaction will not be executed. The data is collected from the web resource normally, and returned in accordance with the parameters of the associated request.

Additionally, the error encountered is returned to the user for debugging purposes. In the example below, the first two interactions were executed successfully, followed by a failed interaction, and finally an interaction that was not executed because of the preceding failure:

{
    "status": "success",
    "html_content":"...",
    "render_flow": {
        "success": false,
        "results": [
            {
                "name": "wait_and_type",
                "status": "done",
                "result": true
            },
            {
                "name": "wait_and_click",
                "status": "done",
                "result": true
            },
            {
                "name": "scroll",
                "status": "error",
                "error": "timeout"
            },
            {
                "name": "wait",
                "status": "no-run"
            }
        ]
    }
}
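
A minimal Python sketch of how a client might locate the failure in such a log follows. The sample is the response above, copied verbatim; first_failure and skipped_steps are hypothetical helper names, not part of the API.

```python
import json

# Failure response copied from the example above.
failed = json.loads("""
{
    "status": "success",
    "html_content": "...",
    "render_flow": {
        "success": false,
        "results": [
            {"name": "wait_and_type", "status": "done", "result": true},
            {"name": "wait_and_click", "status": "done", "result": true},
            {"name": "scroll", "status": "error", "error": "timeout"},
            {"name": "wait", "status": "no-run"}
        ]
    }
}
""")


def first_failure(response):
    """Return (index, name, error) of the first failed interaction, or None."""
    results = response.get("render_flow", {}).get("results", [])
    for index, entry in enumerate(results):
        if entry.get("status") == "error":
            return index, entry["name"], entry.get("error")
    return None


def skipped_steps(response):
    """Return the names of interactions that were never executed."""
    results = response.get("render_flow", {}).get("results", [])
    return [entry["name"] for entry in results if entry.get("status") == "no-run"]


print(first_failure(failed))  # (2, 'scroll', 'timeout')
print(skipped_steps(failed))  # ['wait']
```

Note that the top-level status is still "success" here, so a client that cares about the interactions needs to check render_flow rather than rely on the top-level status alone.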

Operation Reference

The table below lists all of the currently supported operations, as well as their types, default values, examples, and more.

| Name | Parameter | Type | Required | Min | Max | Default | Example | Description |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| wait | delay (ms) | number | no | 0 | ∞ | | 1000 | Pause for a period of x ms. |
| wait_for | selectors | array<string> | yes | | | | ["body", "id"] | Wait until the listed selector(s) have loaded. |
| wait_for | timeout (ms) | number | no | 0 | ∞ | 1000 | 2000 | Wait for x ms before throwing a timeout exception. |
| wait_and_click | selector | string | yes | | | | "#element-id" | The selector for the desired element. |
| wait_and_click | timeout (ms) | number | no | 0 | ∞ | 1000 | 2000 | Wait for x ms before throwing a timeout exception. |
| wait_and_click | delay (ms) | number | no | 0 | ∞ | 0 | 1000 | An artificial delay added at the end of the operation. |
| wait_and_click | scroll | bool | no | | | false | false | When true, scroll the selected element into the visible area if it is not already visible. |
| wait_and_click | visible | bool | no | | | true | true | Wait for the element to be present in the DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. |
| wait_and_type | selector | string | yes | | | | "#element-id" | The selector for the desired element. |
| wait_and_type | timeout (ms) | number | no | 0 | ∞ | 1000 | 2000 | Wait for x ms before throwing a timeout exception. |
| wait_and_type | delay (ms) | number | no | 0 | ∞ | 0 | 1000 | An artificial delay added at the end of the operation. |
| wait_and_type | visible | bool | no | | | true | true | Wait for the element to be present in the DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. |
| wait_and_type | click_on_element | bool | no | | | false | true | Click on the element before typing. |
| wait_and_type | value | string | yes | | | | "any text" | Text to input. |
| scroll_to | selector | string | yes | | | | "#element-id" | The selector for the desired element. |
| scroll_to | visible | bool | no | | | true | true | Wait for the element to be present in the DOM and to be visible, i.e. to not have display: none or visibility: hidden CSS properties. |
| scroll | x (px) | number | no | 0 | ∞ | 0 | 50 | x-axis position to scroll to. |
| scroll | y (px) | number | no | 0 | ∞ | 0 | 100 | y-axis position to scroll to. |
| scroll | timeout (ms) | number | no | 0 | ∞ | 1000 | 2000 | Wait for x ms before throwing a timeout exception. |
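
The required parameters in the reference above lend themselves to a quick client-side check before a request is sent. The Python sketch below is one possible approach, assuming the reference is complete: it flags unknown operation names and missing required parameters. The names REQUIRED_ARGS and validate_render_flow are hypothetical, and the API itself may validate requests differently.

```python
# Required parameters per operation, transcribed from the Operation Reference.
REQUIRED_ARGS = {
    "wait": [],
    "wait_for": ["selectors"],
    "wait_and_click": ["selector"],
    "wait_and_type": ["selector", "value"],
    "scroll_to": ["selector"],
    "scroll": [],
}


def validate_render_flow(render_flow):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for index, step in enumerate(render_flow):
        for name, args in step.items():
            if name not in REQUIRED_ARGS:
                problems.append(f"step {index}: unknown operation '{name}'")
                continue
            for arg in REQUIRED_ARGS[name]:
                if arg not in args:
                    problems.append(
                        f"step {index}: '{name}' is missing required '{arg}'"
                    )
    return problems


flow = [
    {"wait_and_type": {"selector": "input[type='search']", "timeout": 2000}},
    {"wait": {"delay": 5000}},
]
print(validate_render_flow(flow))
# ["step 0: 'wait_and_type' is missing required 'value'"]
```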
