How to Build AI Agents That Can Use Any Website
Connecting AI Agents to the Web: A Developer's Journey and the Rise of Computer Use
One major hurdle in AI agent development over the past two years has been giving agents reliable access to the web. Consider an AI agent designed to send emails: how do you connect it to Gmail or Outlook? Do you go through an API, drive the website itself, or hand the task to an autonomous web agent? This article walks through each option.
APIs and SDKs: A Limited Approach
Many developers start with APIs and SDKs. This route offers low latency and robust authentication, but it has limitations:
- API Unavailability: Not all web services provide APIs.
- Documentation Challenges: Outdated or poorly written documentation is common.
- Feature Gaps: APIs often lack the full functionality of their corresponding websites, hindering specific tasks.
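To make the API route concrete, here is a minimal sketch of how an agent's tool call might be dispatched to a wrapped API function. Everything in it is an assumption for illustration: the `TOOLS` registry, the `tool` decorator, the `dispatch` helper, and the JSON shape `{"name": ..., "arguments": {...}}` are hypothetical, and `send_email` is a stub rather than a real Gmail or Outlook client.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: maps a tool name to a plain Python function.
TOOLS: Dict[str, Callable[..., dict]] = {}


def tool(fn: Callable[..., dict]) -> Callable[..., dict]:
    """Register a function so the agent can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def send_email(to: str, subject: str, body: str) -> dict:
    # Stub standing in for a real provider SDK call. A real implementation
    # would authenticate and send an HTTP request here.
    return {"status": "queued", "to": to, "subject": subject}


def dispatch(tool_call_json: str) -> dict:
    """Run a tool call that the model emitted as JSON."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        # This is where missing APIs and feature gaps show up:
        # the agent asks for something no wrapped endpoint provides.
        return {"status": "error", "reason": f"no tool named {name!r}"}
    return TOOLS[name](**call["arguments"])
```

Calling `dispatch` with a `send_email` request returns the stub's result, while a request for an unregistered tool such as `archive_thread` returns an error. That second case is the feature-gap problem in miniature: the website may support the action, but the agent can only do what has been wrapped.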
Fortunately, several services offer API call libraries:
- Composio: Provides tools for AI agents with strong authentication.
- LangChain tools: Prebuilt tools for LangChain and LangGraph agents.
- Apify: A vast community-driven API library.
However, for universal web service access, we must move beyond APIs.
Website Interaction: The Human Approach
If an AI agent can reliably interact with websites, it can automate any task a human performs on the web. But how?
Many developers initially use browser testing frameworks like Selenium or Playwright. This approach, however, faces challenges:
- Fragility: Website changes (e.g., A/B testing) easily break scripts.
- Detectability: Test browsers are easily identified and blocked.
- Production Deployment: Hosting browsers, managing authentication, and rotating proxies are complex in production.
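The fragility point can be illustrated without a real browser. The sketch below uses only Python's standard-library HTML parser, not Selenium or Playwright, and the two page variants are made up. It compares a lookup by class name with a lookup by visible text when an A/B test changes the markup.

```python
from html.parser import HTMLParser


class ButtonFinder(HTMLParser):
    """Collects (class attribute, visible text) for every <button>."""

    def __init__(self):
        super().__init__()
        self.buttons = []
        self._in_button = False
        self._classes = ""
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "button":
            self._in_button = True
            self._classes = dict(attrs).get("class") or ""
            self._text = []

    def handle_data(self, data):
        if self._in_button:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "button" and self._in_button:
            self.buttons.append((self._classes, "".join(self._text).strip()))
            self._in_button = False


def find_buttons(html):
    parser = ButtonFinder()
    parser.feed(html)
    parser.close()
    return parser.buttons


def by_class(html, cls):
    """Brittle lookup: depends on a class name the site may change."""
    return [b for b in find_buttons(html) if cls in b[0].split()]


def by_text(html, text):
    """Sturdier lookup: depends on what the user actually sees."""
    return [b for b in find_buttons(html) if b[1].lower() == text.lower()]


# Two hypothetical variants of the same login form.
VARIANT_A = '<form><button class="btn btn-login">Log in</button></form>'
VARIANT_B = '<form><button class="cta-primary x91f">Log in</button></form>'
```

Here `by_class(VARIANT_A, "btn-login")` finds the button, but the same call on `VARIANT_B` returns an empty list, while `by_text(VARIANT_B, "log in")` still succeeds. A script tied to the class name breaks as soon as the variant changes, even though nothing changed for a human user.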
To address these issues, we experimented with a Browser SDK that:
- Employs natural language selectors (e.g., get_element("find the login button")) instead of brittle CSS selectors.
- Integrates built-in authentication.
- Offers pre-configured remote hosting with built-in rotating proxies to prevent blocking.
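The idea behind a natural language selector can be sketched as follows. This is not the Dendrite SDK's implementation: `resolve_element`, `Element`, and the keyword-overlap scoring are hypothetical stand-ins, and a real system would more likely ask a language model to rank the candidates. The point is only that the description is resolved to a concrete element at run time instead of being hard-coded.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Element:
    tag: str
    text: str
    selector: str  # concrete selector, discovered at run time


# Filler words that carry no information about the target element.
STOPWORDS = {"find", "the", "a", "an", "to", "on", "of"}


def tokens(s: str) -> set:
    """Lowercase word tokens with filler words removed."""
    return {t for t in re.findall(r"[a-z0-9]+", s.lower()) if t not in STOPWORDS}


def resolve_element(description: str, candidates: List[Element]) -> Optional[Element]:
    """Return the candidate whose tag and text best match the description.

    Keyword overlap is a toy scoring rule used here in place of a model call.
    Returns None when no candidate shares any keyword with the description.
    """
    want = tokens(description)
    best, best_score = None, 0
    for el in candidates:
        score = len(want & tokens(el.tag + " " + el.text))
        if score > best_score:
            best, best_score = el, score
    return best


# Hypothetical snapshot of the interactive elements on a page.
PAGE = [
    Element("a", "Forgot password?", "a.help"),
    Element("button", "Login", "#x91f"),
    Element("input", "Search", "input[name=q]"),
]
```

With this snapshot, `resolve_element("find the login button", PAGE)` picks the button whose selector is `#x91f`. If the site later renames that id, the description still resolves to the right element on the next run, which is the property that makes this approach less brittle than a fixed CSS selector.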
This work, now open-source (Dendrite SDK), is no longer under active development but remains available for study and adaptation. Similar alternatives include:
- AgentQL: A Python library.
- Stagehand: A JavaScript/TypeScript library.
Computer Use: The Future of Web AI Agents?
Rich Sutton's "Bitter Lesson" argues that general methods which scale with compute eventually outperform approaches built on hand-crafted, task-specific knowledge. Anthropic's Computer Use embodies this principle: it lets an LLM control a computer or browser directly through mouse and keyboard input, removing the need for scripts and API calls. The approach favors general computer skills over task-specific tools, which suggests that the most versatile AI agents will interact with the web the way humans do. Early results show high reliability on complex tasks when prompts are well crafted, and Anthropic's prompt improver often helps with that.
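The control flow behind this kind of agent is an observe-act loop. The sketch below is a toy under stated assumptions and does not use Anthropic's API: `FakeDesktop` stands in for a real screen, `scripted_policy` stands in for the model call, and the `Action` type is hypothetical. A real agent would receive screenshots and return mouse and keyboard actions, but the loop has the same shape.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Action:
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


@dataclass
class FakeDesktop:
    """Toy screen: a text field at the top and a Send button below it."""
    focused: bool = False
    typed: str = ""
    sent: List[str] = field(default_factory=list)

    def observe(self) -> str:
        # A real agent would get a screenshot; here it is a text summary.
        return f"field_focused={self.focused} field_text={self.typed!r} sent={len(self.sent)}"

    def execute(self, a: Action) -> None:
        if a.kind == "click" and a.y < 50:   # click near the top: focus the field
            self.focused = True
        elif a.kind == "click":              # click elsewhere: press Send
            self.sent.append(self.typed)
            self.typed, self.focused = "", False
        elif a.kind == "type" and self.focused:
            self.typed += a.text


def scripted_policy(goal: str, observation: str) -> Action:
    """Placeholder for the model: chooses the next action from the observation."""
    if "sent=1" in observation:
        return Action("done")
    if "field_focused=False" in observation:
        return Action("click", x=100, y=20)
    if "field_text=''" in observation:
        return Action("type", text=goal)
    return Action("click", x=100, y=200)


def run_agent(goal: str, env: FakeDesktop,
              policy: Callable[[str, str], Action], max_steps: int = 10) -> int:
    """Observe, ask the policy for an action, execute it, repeat."""
    for step in range(1, max_steps + 1):
        action = policy(goal, env.observe())
        if action.kind == "done":
            return step
        env.execute(action)
    raise RuntimeError("step budget exhausted")
```

Running `run_agent("hello", FakeDesktop(), scripted_policy)` focuses the field, types the message, and clicks Send before the policy reports that it is done. Nothing in `run_agent` is specific to this task, so swapping in a different environment and policy requires no new scripts, which is the generality argument made above.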
Conclusion: Embracing the Future
While APIs remain valuable, the future likely favors Computer Use-like approaches for most AI agents. If an agent can log in, use a website's search function, and draw conclusions from the top results, why would it need access to the entire database through an API? The question for AI developers is whether to embrace this generalizable approach or risk running into the limitations of more specialized methods.
Note: This is my first dev.to post. Feedback on improving future posts is welcome. Questions on AI agents or AI-driven task automation are also encouraged.