How to Install browser-act CLI

Introduction

If you want to install browser-act and actually use BrowserAct on the same day, the goal is simple: get the CLI installed, confirm the binary works, create or pick a browser, and run one small task end to end. That sounds obvious, but it is where most setup guides go wrong. They stop at "install the package" and never show what to do next. For browser automation, the post-install step matters more than the install step, because the real test is whether the agent can open a browser, keep state, a

📌Key Takeaways

1. Install BrowserAct with uv tool install browser-act-cli --python 3.12, then verify the command before you touch a real workflow.
2. The first useful BrowserAct test is read-only: run stealth-extract, inspect output, and confirm the browser layer works.
3. Use explicit sessions when the task needs login state, repeated actions, or a browser identity that should not leak into another workflow.

Quick Answer

The simplest way to install BrowserAct is:

uv tool upgrade browser-act-cli --python 3.12 || uv tool install browser-act-cli --python 3.12

Then verify it:

browser-act --help
browser-act browser list

If those commands work, you are past the hard part.

What You Need Before You Start

BrowserAct works best when you already know which browser mode your workflow needs.

Typical options include:

a stealth browser for repeatable agent workflows
a Chrome-based session when you want to reuse local login state
a real browser flow when you need a human to take over

If your task is just content extraction, start with stealth-extract. If your task is logged-in, multi-step work, start with a browser session.

Step 1: Install the CLI

The standard install command is:

uv tool upgrade browser-act-cli --python 3.12 || uv tool install browser-act-cli --python 3.12

Why this matters:

uv tool gives you a clean tool install
the upgrade fallback keeps existing installs current
Python 3.12 matches the workspace convention used in the internal docs

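If you are on a machine where browser-act already exists, the upgrade branch is harmless: it brings the existing install up to date, and the install fallback only runs if the upgrade fails.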
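If you run this install on more than one machine, it can help to wrap the command in a small function that first checks for uv. The sketch below is only an illustration: install_browser_act is a hypothetical helper name, and the only real commands in it are the two uv calls shown above.

```shell
# Sketch only: wraps the install command from this guide in a function.
# install_browser_act is a hypothetical helper name, not part of the CLI.
install_browser_act() {
  # Fail early with a readable message if uv itself is missing.
  if ! command -v uv >/dev/null 2>&1; then
    echo "uv not found on PATH; install uv first" >&2
    return 1
  fi
  # Try the upgrade first; fall back to a fresh install if the upgrade
  # fails (for example, because the tool is not installed yet).
  uv tool upgrade browser-act-cli --python 3.12 \
    || uv tool install browser-act-cli --python 3.12
}
```

Call install_browser_act once per machine. The function returns the exit status of whichever uv command ran last.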

Step 2: Verify the Binary

Once the install finishes, check that the command is on your PATH:

browser-act --help

Then confirm the browser registry is visible:

browser-act browser list

If browser-act --help works but browser-act browser list fails, that usually means the CLI is installed but the local environment is not fully configured yet.
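The two checks can be combined into one function that reports which layer failed. This is a minimal sketch, assuming both commands exit non-zero on failure. check_browser_act and the BROWSER_ACT variable are hypothetical names used here for illustration.

```shell
# Hypothetical helper. BROWSER_ACT lets you point at a different binary name.
BROWSER_ACT="${BROWSER_ACT:-browser-act}"

check_browser_act() {
  # Layer 1: is the command reachable at all?
  if ! command -v "$BROWSER_ACT" >/dev/null 2>&1; then
    echo "missing"
    return 1
  fi
  # Layer 2: does the binary start and print help?
  if ! "$BROWSER_ACT" --help >/dev/null 2>&1; then
    echo "broken"
    return 1
  fi
  # Layer 3: can it see the browser registry?
  if ! "$BROWSER_ACT" browser list >/dev/null 2>&1; then
    echo "unconfigured"
    return 1
  fi
  echo "ok"
}
```

A result of "unconfigured" corresponds to the case described above: the CLI is installed, but the local environment is not ready yet.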

Step 3: Do a Read-Only Smoke Test

The best first test is not login or posting. It is a read-only extraction.

browser-act stealth-extract https://example.com --output ./page.md

That tells you three useful things:

the command runs
the browser layer can fetch a page
the output path and permissions are working

If you want a lighter test, try a plain text page before moving to a protected site.
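A smoke test is more useful when it also checks that something was actually written. The sketch below assumes that --output writes the extracted content to the given path, as the command above suggests. smoke_test and BROWSER_ACT are hypothetical names.

```shell
# Hypothetical helper. BROWSER_ACT lets you point at a different binary name.
BROWSER_ACT="${BROWSER_ACT:-browser-act}"

smoke_test() {
  url="$1"
  out="$2"
  # Remove any stale file so an old result cannot mask a failed run.
  rm -f "$out"
  "$BROWSER_ACT" stealth-extract "$url" --output "$out" || {
    echo "command failed"
    return 1
  }
  # -s is true only if the file exists and is not empty.
  if [ ! -s "$out" ]; then
    echo "no output written"
    return 1
  fi
  echo "ok: $(wc -c < "$out" | tr -d ' ') bytes in $out"
}
```

For example, smoke_test https://example.com ./page.md runs the same extraction as above and then reports the size of the result.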

Step 4: Open Your First Browser Session

For interactive work, BrowserAct is more useful when the browser session is explicit.

browser-act browser list
browser-act --session my-task browser open <browser-id> https://example.com

Then inspect the page:

browser-act --session my-task eval "document.title"

That tiny loop is the real starting point for browser automation:

choose the browser
attach the session
open the page
inspect state

Once that works, you can build on top of it.
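The open-then-inspect loop can be sketched as one function that takes the session name, browser ID, and URL as arguments. open_and_inspect and BROWSER_ACT are hypothetical names, and the browser ID is whatever browser-act browser list reports on your machine.

```shell
# Hypothetical helper. BROWSER_ACT lets you point at a different binary name.
BROWSER_ACT="${BROWSER_ACT:-browser-act}"

open_and_inspect() {
  session="$1"
  browser_id="$2"
  url="$3"
  # Open the page inside the named session; stop if that fails.
  "$BROWSER_ACT" --session "$session" browser open "$browser_id" "$url" || return 1
  # Inspect state in the same session.
  "$BROWSER_ACT" --session "$session" eval "document.title"
}
```

Usage would look like open_and_inspect my-task <browser-id> https://example.com, with the placeholder replaced by a real ID.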

BrowserAct Skills

Give your agent a real browser, then turn the workflow into a Skill.

1. Use browser-act when an agent needs to open, click, scroll, extract, or inspect a live site.
2. Use browser-act-skill-forge when the workflow should become reusable across runs and agents.
3. Keep the operational boundary simple: automate what the user can already do in the browser.

Step 5: Learn the Three Commands You Will Use Most

Most new users only need three command families at the beginning:

stealth-extract

Use this when you want page content without building a full interaction flow.

browser-act stealth-extract https://example.com --content-type markdown

browser open

Use this when the browser session itself matters.

browser-act --session shop-a browser open <browser-id> https://shop.example.com

eval

Use this when you need to inspect the current DOM state or pull a value from the page.

browser-act --session shop-a eval "document.querySelector('h1')?.textContent"

These three commands cover most tutorial-level workflows without forcing you into a full framework mindset.

Step 6: Pick the Right Browser Mode

BrowserAct is easier to use once you stop thinking of "the browser" as one thing.

Mode	Best for	Why it helps
Stealth browser	Repeatable automated workflows	Keeps session identity stable
Chrome browser	Reusing existing login state	Lets you work with a known profile
Real browser takeover	Tasks that may need human help	Lets the workflow continue after a handoff
Private browser	Disposable or zero-residue tasks	Keeps the task isolated

If you are not sure, start with a read-only stealth task and move to a login-sensitive workflow later.
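The table above can also be written down as a small lookup, which is handy in a setup script or a team README. This is only a restatement of the table: pick_mode and its keywords (repeatable, login, handoff, disposable) are hypothetical names, not CLI options.

```shell
# Hypothetical helper: maps a task keyword to a mode name from the table.
# The keywords are illustrative and are not browser-act flags.
pick_mode() {
  case "$1" in
    repeatable) echo "Stealth browser" ;;
    login)      echo "Chrome browser" ;;
    handoff)    echo "Real browser takeover" ;;
    disposable) echo "Private browser" ;;
    # Unsure? Default to a read-only stealth task.
    *)          echo "Stealth browser (read-only first)" ;;
  esac
}
```

For example, pick_mode login prints "Chrome browser", and any unrecognized keyword falls back to the read-only stealth recommendation.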

For a broader product overview before you pick a mode, the AI browser guide explains why browser access matters for agents. If you are evaluating tools, the "best browser automation for AI agents" guide gives the higher-level comparison.

Common Setup Mistakes

1. Trying to start with the hardest workflow

Do not begin with a live posting flow, a payment page, or a 2FA-heavy dashboard. Start with a page you can safely read.

2. Skipping the browser inventory step

browser-act browser list is not just a utility command. It tells you what the system can actually open. If you skip it, you end up debugging the wrong layer.

3. Treating install as the finish line

The install is only the first checkpoint. The first successful stealth-extract or browser open is the real milestone.

4. Mixing one browser identity across unrelated tasks

If you are going to do repeated work, keep sessions explicit. Reusing one fuzzy browser context across unrelated workflows is how state leaks happen.
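One way to keep sessions explicit is to derive the session name from the task itself, so that two unrelated tasks never share a name by accident. The sketch below is an assumption-heavy illustration: session_name is a hypothetical helper, and it assumes lowercase, hyphenated names like my-task and shop-a are a reasonable convention for --session.

```shell
# Hypothetical helper: turns a task label into a lowercase, hyphenated
# session name in the style of "my-task" and "shop-a".
session_name() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
```

You could then pass the result to the CLI, for example with --session "$(session_name "Shop A: Checkout")".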

A Good First Workflow

If you want a practical starter path, use this sequence:

1. Install BrowserAct
2. Run browser-act --help
3. Run browser-act browser list
4. Run browser-act stealth-extract on a public page
5. Open one session with browser-act --session my-task browser open
6. Inspect the DOM with eval
7. Move to a login-sensitive workflow only after the first six steps are stable

That sequence gives you a stable foundation before you try anything production-like.
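The post-install part of this sequence can be sketched as a script that runs each check in order and stops at the first failure. It uses only the commands shown in this guide and assumes each one exits non-zero on failure. step, first_workflow, and BROWSER_ACT are hypothetical names, and the install itself is left out.

```shell
# Hypothetical helper. BROWSER_ACT lets you point at a different binary name.
BROWSER_ACT="${BROWSER_ACT:-browser-act}"

# Run one command quietly and report whether it succeeded.
step() {
  label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "ok: $label"
  else
    echo "FAIL: $label"
    return 1
  fi
}

# Chain the checks with && so the first failure stops the sequence.
first_workflow() {
  browser_id="$1"
  step "help" "$BROWSER_ACT" --help &&
  step "browser list" "$BROWSER_ACT" browser list &&
  step "stealth-extract" "$BROWSER_ACT" stealth-extract https://example.com --output ./page.md &&
  step "browser open" "$BROWSER_ACT" --session my-task browser open "$browser_id" https://example.com &&
  step "eval" "$BROWSER_ACT" --session my-task eval "document.title"
}
```

Run it as first_workflow <browser-id>. The last line printed shows how far the setup got, which tells you which layer to debug.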

Where to Go After Install

Once the CLI works, the next step depends on your workflow:

Next task	Good next step	Why
Extract content from a protected page	Try `stealth-extract` with markdown output	It keeps the first workflow read-only
Reuse a logged-in browser	Open a named session against a known browser ID	It makes session state explicit
Compare BrowserAct with lower-level tools	Read the BrowserAct vs Playwright guide	It clarifies when to use a workflow layer
Build reusable agent workflows	Browse ClawHub skills	It helps move from one-off commands to repeatable tasks

Why This Tutorial Matters

The point of installing BrowserAct is not to prove you can install a CLI. The point is to get to a browser workflow that can survive real work:

real login state
real page changes
real browser sessions
real handoff when things get blocked

If your automation still breaks at the first browser boundary, the problem is usually not the model. It is the setup.

Conclusion

If you came here to install browser-act, the shortest path is:

uv tool upgrade browser-act-cli --python 3.12 || uv tool install browser-act-cli --python 3.12
browser-act --help
browser-act browser list
browser-act stealth-extract https://example.com --output ./page.md

Once that works, you have a real starting point for browser automation. From there, the next step is to choose the right browser mode and build one workflow you can repeat.

Agent-ready scraping

Two Skills, One Repeatable Browser Workflow

Start with live browser execution when the agent needs to understand a page. Move to Skill Forge when the same scraper should run again without re-exploring the site.

Step 1

Run once with browser-act

Give Codex, Claude Code, Cursor, Windsurf, or another agent a real browser for rendered pages, clicks, scrolling, screenshots, DOM extraction, and network inspection.

Step 2

Package with Skill Forge

Explore the site once, verify the extraction path, then generate a callable Skill package that other agents can reuse for batch jobs or scheduled workflows.

Discover

Agent opens the target site and learns the working path.

Verify

Fields, pagination, limits, and failure cases are tested.

Reuse

The flow becomes a Skill that future agents can call.