I tried OpenAI Operator’s rival Browser Use, and it is impressive but takes some technical skill to use

OpenAI showed off its first AI agent, Operator, last week, but it already has an uncanny competitor in an AI tool called Browser Use that can perform tasks online for you. This computer-use agent (CUA) can type, search, click buttons and copy information from websites without you needing to touch the mouse or keyboard, and without the $200-a-month ChatGPT Pro subscription.

Browser Use is actually free, at least if you are willing and able to spend some time playing with API code. I’m no coding expert, but I naively thought I knew enough about how GitHub works to use the API version. After hours of sifting through documentation, fine-tuning settings and studying examples, I decided that this needs a deeper level of coding knowledge than I have, let alone the average person browsing the internet.
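For a sense of what the API route involves, here is a minimal sketch. It assumes the interface shown in the project’s README for early 2025 releases (an `Agent` that takes a `task` and an `llm`, with the model supplied by `langchain_openai`), that both packages are installed, and that an OpenAI key is available in `OPENAI_API_KEY`. Newer releases may differ, and the helper names `missing_settings`, `run_task` and `main` are illustrative rather than part of the library.

```python
import asyncio
import os


def missing_settings(env=None):
    """Return the names of required settings that are absent or empty."""
    env = os.environ if env is None else env
    required = ["OPENAI_API_KEY"]  # read by langchain_openai's ChatOpenAI
    return [name for name in required if not env.get(name)]


async def run_task(task):
    """Hand one plain-language task to a Browser Use agent."""
    # Imported here so this file loads even before the packages are installed.
    # Assumption: interface as in the project's README for early 2025 releases.
    from browser_use import Agent
    from langchain_openai import ChatOpenAI

    agent = Agent(task=task, llm=ChatOpenAI(model="gpt-4o"))
    return await agent.run()


def main(task):
    """Check the configuration, then run a single task to completion."""
    missing = missing_settings()
    if missing:
        raise SystemExit("Set these first: " + ", ".join(missing))
    return asyncio.run(run_task(task))
```

Calling `main("Find the price of a MacBook Air M2 on Amazon")` would then open a browser and let the agent work. Most of the effort lies in the setup around this snippet rather than in the snippet itself.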

Fortunately for me, Browser Use just debuted a cloud version that uses OpenAI’s own GPT-4o model. It cuts out a lot of the heavy technical lifting and streamlines things into a more familiar chat format without any extra work. It has its limitations and costs $30, but after my failed attempt at the API route it felt like a bargain. And even in this (still obviously unfinished) form, you still need to put some effort into prompt engineering and navigating how the AI works. The most limiting aspect is that you can only issue one prompt before starting a new interaction. Despite the text field, you can’t respond to what the AI is doing and refine your request.

Shopping AI

(Image Credit: Screenshots from Browser Use)

With everything set up, I put Browser Use through a few real-world tests. First up was a price comparison task. I entered the prompt: “Navigate to Amazon, Best Buy and Walmart and search for ‘MacBook Air M2’. Extract the product name, price and stock availability from the first five results on each site. Compare prices and identify the lowest one.”

It did the job well, though it found no hidden discounts or coupons. Being able to automate price tracking across multiple sites was still quite exciting. That said, a persistent problem for any agent like this comes when a site wants to check that you are human. Browser Use has a button that lets you take over whenever you want, and it will also alert you when you need to step in. You can prove your humanity and then hit resume to let the AI take over again.
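The last step of that price comparison prompt, picking the lowest price from what was extracted, is easy to reproduce once the results are in hand. The sketch below assumes the extracted rows have been copied into a simple structure. The `Listing` class, its field names and the helper functions are hypothetical, and prices are assumed to be US dollar strings as they appear on the page.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    site: str
    name: str
    price: str      # raw text as shown on the page, e.g. "$1,049.00"
    in_stock: bool


def parse_price(text):
    """Convert a dollar string such as '$1,049.00' to a float."""
    return float(text.replace("$", "").replace(",", "").strip())


def cheapest(listings, require_stock=True):
    """Return the lowest-priced listing, or None if nothing qualifies."""
    candidates = [
        item for item in listings
        if item.in_stock or not require_stock
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda item: parse_price(item.price))
```

Skipping out-of-stock items by default keeps the “lowest price” answer limited to something you can actually buy.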

(Image Credit: Screenshots from Browser Use)

Flying AI

(Image Credit: Screenshots from Browser Use)

Next came a travel planning task with the prompt: “Search for a round-trip flight from New York to London on December 15, 2025 on British Air. Choose the cheapest option and extract details, including price, airline and departure time.”

Browser Use delivered, pulling up a British Airways flight for $750, complete with departure time and other relevant details. This could be incredibly useful for people who book a lot of travel, especially if you automate it to check for price drops regularly.
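Checking for price drops comes down to comparing each new price with the last one seen. Here is a minimal sketch of that logic, assuming some function of your own returns the current fare as a number. The names `detect_drop`, `check_once` and `fetch_price`, and the 5% default threshold, are illustrative choices and not part of Browser Use.

```python
def detect_drop(previous, current, min_drop=0.05):
    """Return the fractional drop if it meets the threshold, else None."""
    if previous <= 0:
        raise ValueError("previous price must be positive")
    change = (previous - current) / previous
    return change if change >= min_drop else None


def check_once(fetch_price, last_seen, min_drop=0.05):
    """Fetch the current price and compare it with the last one seen.

    fetch_price is any zero-argument callable returning a number, for
    example a wrapper around an agent run. Returns (current, drop), where
    drop is None when there is no earlier price or no sufficient decrease.
    """
    current = fetch_price()
    if last_seen is None:
        return current, None
    return current, detect_drop(last_seen, current, min_drop)
```

Running `check_once` on a schedule and storing the returned price between runs would be enough to flag a fare that falls by the chosen threshold.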

Fair-weather AI friend

(Image Credit: Screenshots from Browser Use)

Finally, I tested weather forecasting and planning with the prompt: “Check the 7-day weather forecast for New York City on Weather.com and summarize temperature trends, rain chances and any severe weather warnings, and then suggest how to dress for it.”

Weather is one of the most popular uses for voice assistants, so I wanted to see how the AI handled a more complex request in this vein. It did very well, not only extracting the information from the forecast but also suggesting which days to wear a light coat and which days to “insulate with a warm coat and scarf as it will be cool with low rain chance.”

Power Trip

The most important difference between Browser Use and Operator is accessibility. Browser Use is like a Swiss Army knife for developers. It has the flexibility to do almost anything in a browser, but you need to know how to use the tools. You can dig into the code, tweak it and shape it to your exact needs. If a feature is missing, nothing prevents you from adding it. Browser Use, which is open source, also has an active developer community that constantly refines it. This means that if you run into problems, there are forums and GitHub discussions where you can probably find answers.

OpenAI’s Operator, on the other hand, is like hiring a butler. It does a lot for you, but within certain limits. Operator’s strength is its integration with OpenAI’s wider AI ecosystem, giving it access to proprietary models that can make more nuanced decisions. However, you are locked into OpenAI’s pricing structure and limited customization options.

Browser Use is not perfect. Even its cloud version requires some patience. You need to craft your prompts carefully, steel yourself for troubleshooting and occasionally start over. The cloud version may fix some of this later, but for now, being unable to edit or respond within the conversation puts harsh limits on its otherwise flexible nature.

The speed can also be frustrating. Check out a video of my second test; it plays at four times the speed of the actual process.

Right now, Browser Use is best suited for people who enjoy tinkering, such as developers, researchers and automation geeks who don’t mind getting their hands dirty. If you are willing to put in the effort, you get a powerful, flexible tool that costs far less than its competition.

But if you would rather not spend your weekend wrestling with configuration files, Operator may be the more forgiving option. Either way, web automation is poised for a boom.
