What is browser automation?
Vision-driven browser automation using Midscene. Operates entirely from screenshots; no DOM or accessibility labels are required. Can interact with all visible elements on screen regardless of technology stack. Opens a new browser tab for each target URL via Puppeteer (headless Chrome).

Use this skill when the user wants to:

- Browse, navigate, or open web pages
- Scrape, extract, or collect data from websites
- Fill out forms, click buttons, or interact with web elements
- Verify, validate, or test frontend UI behavior
- Take screenshots of web pages
- Automate multi-step web workflows
- Run browser automation or check website content

Powered by Midscene.js (https://midscenejs.com). Source: web-infra-dev/midscene-skills.
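
To make the screenshot-only idea above concrete, here is a minimal sketch of one vision-driven step: capture a screenshot, ask a vision model where a described element is, and click the center of the returned box. Every name in it (`Tab`, `Locator`, `clickByDescription`, `fakeTab`, `fakeLocate`) is hypothetical and chosen for illustration; this is not Midscene's API, and the stubs stand in for a real Puppeteer tab and a real model.

```typescript
// Hypothetical types for illustration only; NOT Midscene's or Puppeteer's API.
interface Point { x: number; y: number }
interface Box { left: number; top: number; width: number; height: number }

// A browser tab reduced to the two capabilities a screenshot-driven agent needs.
interface Tab {
  screenshot(): Promise<Uint8Array>; // image bytes of the visible viewport
  clickAt(p: Point): Promise<void>;  // click at pixel coordinates
}

// A vision model that finds a described element in an image, or returns null.
type Locator = (image: Uint8Array, description: string) => Promise<Box | null>;

// Center of a bounding box, used as the click target.
function centerOf(b: Box): Point {
  return { x: b.left + b.width / 2, y: b.top + b.height / 2 };
}

// One step: screenshot -> locate by description -> click by coordinates.
// No DOM selector or accessibility label is consulted at any point.
async function clickByDescription(
  tab: Tab,
  locate: Locator,
  description: string,
): Promise<Point> {
  const image = await tab.screenshot();
  const box = await locate(image, description);
  if (box === null) {
    throw new Error(`No visible element matches: ${description}`);
  }
  const target = centerOf(box);
  await tab.clickAt(target);
  return target;
}

// In-memory stubs (assumptions) so the sketch runs without a browser or model.
const clicks: Point[] = [];
const fakeTab: Tab = {
  async screenshot() { return new Uint8Array(0); },
  async clickAt(p: Point) { clicks.push(p); },
};
const fakeLocate: Locator = async () =>
  ({ left: 100, top: 40, width: 80, height: 20 });
```

Calling `clickByDescription(fakeTab, fakeLocate, "the Submit button")` records a single click in `clicks`. Because the step depends only on pixels and coordinates, the same loop applies to canvas-rendered or otherwise DOM-opaque interfaces, which is the property the description above relies on.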