
# Browser Agent

**Browser Agent** is an AI-powered node that automates web browser interactions. It can navigate websites, click through processes, and complete tasks just like a human user would, then return the results to your workflow.

> ⚠️ **Premium Feature:** Browser Agent is available only on Pro and Enterprise plans. It is not included in free accounts.

<figure><img src="/files/OaON3pJeMNv5vj9Att2v" alt="" width="375"><figcaption><p>Successfully Ran Browser Agent Action</p></figcaption></figure>

### Key Features

* AI-powered web browser automation
* Real-time interaction with websites
* Human-like navigation and clicking
* Task completion with intelligent problem-solving
* Playback of recorded browser sessions
* Session viewing capabilities

### How to Add a Browser Agent Action

1. Open the Add Action menu using one of three methods:
   * Click the "Add Action" button in the top left corner
   * Click on a connection circle on an existing action card
   * Click and drag from one action to create a connection
2. Navigate to the **Tools** tab in the action menu
3. Find the **AI** subcategory under Tools
4. Click on **Browser Agent** to add it to your canvas

> 💡 **Tip:** Browser Agent actions will be automatically numbered if you add multiple instances (Browser Agent 1, Browser Agent 2, etc.)

### How to Configure the Browser Agent Action Card

1. Locate the Browser Agent card on your canvas. The card includes:
   * Header with the action name (which can be renamed)
   * Variable chip in the top left corner for referencing outputs
   * Controls for expanding/collapsing, running the agent, and additional options
   * Connection points for inputs and outputs
2. Configure the **Task** input field:
   * Describe what you want the browser agent to accomplish
   * Be specific about the website and actions you want performed
   * Use clear, step-by-step language for complex tasks
3. The AI will interpret your task and execute the browser interactions automatically

### How to Write Effective Browser Agent Tasks

1. **Be specific about the target website**:
   * Example: "Go to lleverage.ai and describe what Lleverage does"
   * Example: "Navigate to Amazon.com and search for wireless headphones"
2. **Describe the desired outcome clearly**:
   * What information you want extracted
   * What actions should be completed
   * What format you want the results in
3. **Use action-oriented language**:
   * "Go to \[website]"
   * "Click on \[element]"
   * "Search for \[term]"
   * "Extract \[information]"
   * "Fill out \[form]"
4. **Combine multiple steps in one task**:
   * Example: "Go to LinkedIn, search for companies in the tech industry, and list the top 5 results"
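
If you generate task text programmatically before passing it into the **Task** field (for example, from a variable set earlier in a workflow), the guidelines above can be sketched as a small helper. This is a minimal illustration, not part of Lleverage: the function name `build_task` and its parameters are hypothetical, and it only assembles a string.

```python
def build_task(website: str, steps: list[str], output_format: str = "") -> str:
    """Compose one task description from a target site, ordered steps,
    and an optional result format. Hypothetical helper, not a Lleverage API."""
    if not website.strip():
        raise ValueError("website must not be empty")
    if not steps:
        raise ValueError("at least one step is required")

    # Lead with the target website, then list each action in order.
    parts = [f"Go to {website.strip()}"] + [step.strip() for step in steps]
    if len(parts) == 2:
        sentence = " and ".join(parts)
    else:
        sentence = ", ".join(parts[:-1]) + ", and " + parts[-1]

    task = sentence + "."
    # State the desired result format explicitly when one is given.
    if output_format.strip():
        task += f" Return the results as {output_format.strip()}."
    return task


print(build_task(
    "LinkedIn",
    ["search for companies in the tech industry", "list the top 5 results"],
))
# Go to LinkedIn, search for companies in the tech industry, and list the top 5 results.
```

Keeping the website, the ordered actions, and the output format as separate inputs makes it harder to omit one of them by accident.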

### How to Run and Monitor Browser Agent Tasks

1. Click the play button on the Browser Agent action card to start execution
2. **Be patient during execution**:
   * Browser Agent tasks take longer than typical AI actions
   * The agent works in real-time, simulating human interactions
   * Processing time varies based on task complexity
3. **Monitor the progress**:
   * The agent will work through each step systematically
   * It can handle unexpected situations and adapt as needed
   * Some complex interactions may require additional processing time

> ⚠️ **Warning:** Browser Agent tasks can take significantly longer than other workflow actions due to real-time web interaction requirements.

### How to Review Browser Agent Results

1. **View the Output**:
   * The agent will return the requested information or confirmation of completed actions
   * Results appear in the standard output format for use in subsequent workflow steps
2. **Access Session Recording**:
   * After execution, two new buttons appear on the Browser Agent card:
     * **View Recording** - Opens a popup showing the step-by-step browser interactions
     * **Open Session** - Opens the browser session in a new window for review
3. **Analyze the Process**:
   * Use the recording to understand how the agent navigated the task
   * Review tab structure and interaction patterns
   * Identify any areas for task optimization

### Best Practices

* **Start with simple tasks** to understand how the agent interprets instructions
* **Be explicit about required information** to ensure accurate extraction
* **Test tasks thoroughly** before deploying in production workflows
* **Use clear, unambiguous language** in task descriptions
* **Consider timeout implications** for time-sensitive workflows
* **Review recordings** to optimize future task descriptions

### Troubleshooting

* **If the agent gets stuck**: Review your task description for clarity and specificity
* **For failed executions**: Check the session recording to identify where issues occurred
* **For slow performance**: Consider breaking complex tasks into smaller, focused steps
* **For unexpected results**: Refine your task instructions with more specific requirements

### Output Format

The Browser Agent action outputs:

* Extracted information or data as requested in the task
* Confirmation of completed actions
* Structured results ready for use in subsequent workflow steps
* Access to session recordings for process review
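
When a task asks for results in a specific format such as JSON, a later code step may need to handle both structured and free-text results. The sketch below assumes the result reaches that step as a plain string. This page does not specify the exact output shape, and `parse_agent_output` is a hypothetical name, not a Lleverage function.

```python
import json
from typing import Any


def parse_agent_output(raw: str) -> Any:
    """Return parsed JSON when the text is valid JSON, otherwise the trimmed text.
    Assumes the agent result arrives as a string (an assumption, not documented)."""
    text = raw.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Free-text results (for example, a confirmation message) pass through unchanged.
        return text


companies = parse_agent_output('[{"name": "Acme", "rank": 1}]')
print(companies[0]["name"])  # Acme

message = parse_agent_output("  Form submitted successfully.  ")
print(message)  # Form submitted successfully.
```

Falling back to the raw text keeps the step from failing when the agent returns a confirmation rather than data.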

> 💡 **Tip:** Browser Agent results can be used as variables in other workflow actions, which makes the action well suited to automated data collection, form filling, and web-based research tasks.

