Magentic-UI: Microsoft’s New Open-Source Tool for Smarter, Safer Web Automation

Microsoft has released Magentic-UI, an open-source research prototype that automates complex web-based tasks while keeping the user in control. Built on AutoGen’s Magentic-One system, Magentic-UI uses a multi-agent framework of five specialized agents that together handle web interactions, code execution, and file operations.

Five-Agent Framework:

At the heart of Magentic-UI is a coordinated system of agents:

  • Orchestrator: Manages the overall workflow and task execution.
  • WebSurfer: Interacts with live websites by performing actions such as clicking, typing, and file uploading.
  • Coder: Executes Python or shell commands for code-related tasks.
  • FileSurfer: Handles file analysis and format conversions.
  • UserProxy: Maintains human-in-the-loop control, allowing collaboration between the system and the user.
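
To make the division of labor concrete, the following is a minimal sketch of how an orchestrator might route plan steps to the agents listed above. It is not Magentic-UI's actual API: the `Step` class, the handler functions, and the `orchestrate` function are all hypothetical names invented for illustration, and each "agent" simply returns a string instead of doing real work.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    agent: str        # name of the agent that should handle this step
    instruction: str  # natural-language instruction for that agent


# Hypothetical stand-ins for the real agents; each one only echoes its input.
def web_surfer(instruction: str) -> str:
    return f"[WebSurfer] browsed: {instruction}"


def coder(instruction: str) -> str:
    return f"[Coder] ran: {instruction}"


def file_surfer(instruction: str) -> str:
    return f"[FileSurfer] read: {instruction}"


def user_proxy(instruction: str) -> str:
    return f"[UserProxy] asked user: {instruction}"


AGENTS: Dict[str, Callable[[str], str]] = {
    "WebSurfer": web_surfer,
    "Coder": coder,
    "FileSurfer": file_surfer,
    "UserProxy": user_proxy,
}


def orchestrate(plan: List[Step]) -> List[str]:
    """Run each step in order by dispatching it to the named agent."""
    results = []
    for step in plan:
        handler = AGENTS.get(step.agent)
        if handler is None:
            raise ValueError(f"unknown agent: {step.agent}")
        results.append(handler(step.instruction))
    return results


if __name__ == "__main__":
    plan = [
        Step("WebSurfer", "open example.com"),
        Step("Coder", "summarize the downloaded CSV"),
    ]
    for line in orchestrate(plan):
        print(line)
```

In the real system the Orchestrator also decides which agent to call and when to replan; the fixed lookup table here only illustrates the routing idea.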

Key Features for Transparent Automation:

One of the defining elements of Magentic-UI is its co-planning interface, which enables users to collaboratively build and adjust step-by-step task plans. Users can easily add, delete, or modify steps and provide follow-up instructions as needed.
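
As a rough illustration of co-planning, a task plan can be thought of as an ordered list of steps that the user edits before execution begins. The `Plan` class below is a hypothetical sketch of that idea, not the data structure Magentic-UI itself uses.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Plan:
    """An editable, ordered list of plan steps (illustrative only)."""
    steps: List[str] = field(default_factory=list)

    def add(self, text: str, index: Optional[int] = None) -> None:
        # Append by default, or insert at a specific position.
        if index is None:
            self.steps.append(text)
        else:
            self.steps.insert(index, text)

    def delete(self, index: int) -> None:
        del self.steps[index]

    def modify(self, index: int, text: str) -> None:
        self.steps[index] = text


if __name__ == "__main__":
    plan = Plan([
        "Open the airline site",
        "Search flights",
        "Book the cheapest flight",
    ])
    plan.modify(2, "Show me the three cheapest flights")  # soften the last step
    plan.add("Filter to nonstop flights", index=2)        # insert before it
    plan.delete(0)                                        # drop the first step
    print(plan.steps)
```

The point of the sketch is that the plan stays a plain, inspectable object until the user approves it, which is what makes step-level edits cheap.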

To maintain safety and transparency, Magentic-UI includes Action Guards, which ensure that sensitive actions are carried out only with user approval. It also uses session indicators to show task progress and to flag when user input is required. For productivity, the platform supports parallel task execution, allowing multiple workflows to run at the same time.
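
The approval gate behind an action guard can be sketched in a few lines. The example below is an assumption-laden illustration, not Magentic-UI's implementation: the set of sensitive action names, the `guarded_execute` function, and the `approve` callback are all hypothetical.

```python
from typing import Callable

# Hypothetical list of actions that should never run without approval.
SENSITIVE_ACTIONS = {"submit_payment", "delete_file", "send_email"}


def guarded_execute(
    action: str,
    run: Callable[[], str],
    approve: Callable[[str], bool],
) -> str:
    """Run `run()` unless the action is sensitive and the user declines."""
    if action in SENSITIVE_ACTIONS and not approve(action):
        return f"blocked: {action}"
    return run()


if __name__ == "__main__":
    def ask_user(action: str) -> bool:
        # In an interactive tool this would be a UI prompt.
        answer = input(f"Allow '{action}'? [y/N] ")
        return answer.strip().lower() == "y"

    print(guarded_execute("click_link", lambda: "clicked", ask_user))
    print(guarded_execute("submit_payment", lambda: "paid", ask_user))
```

Non-sensitive actions pass straight through, so the user is interrupted only when the stakes warrant it.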

Intelligent Learning and Retrieval:

Magentic-UI also offers plan learning and retrieval, which lets it save plans from previous sessions. Users can reuse a saved plan manually, or the system can retrieve one automatically for a similar task, which streamlines repetitive workflows.
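
One simple way to picture plan retrieval is a library of saved plans keyed by task description, searched by text similarity. The sketch below uses Python's standard `difflib.SequenceMatcher` as a stand-in similarity measure; the `PlanLibrary` class and the 0.6 threshold are hypothetical, and how Magentic-UI actually matches tasks to stored plans is not described here.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


class PlanLibrary:
    """Stores plans by task description and retrieves the closest match."""

    def __init__(self, threshold: float = 0.6) -> None:
        self.threshold = threshold            # minimum similarity to reuse a plan
        self._plans: Dict[str, List[str]] = {}

    def save(self, task: str, steps: List[str]) -> None:
        self._plans[task] = list(steps)

    def retrieve(self, task: str) -> Optional[List[str]]:
        best_task, best_score = None, 0.0
        for saved in self._plans:
            score = SequenceMatcher(None, task.lower(), saved.lower()).ratio()
            if score > best_score:
                best_task, best_score = saved, score
        if best_task is not None and best_score >= self.threshold:
            return self._plans[best_task]
        return None  # nothing similar enough; a fresh plan is needed


if __name__ == "__main__":
    library = PlanLibrary()
    library.save(
        "book a flight to Paris",
        ["Open the airline site", "Search flights", "Pick a fare"],
    )
    print(library.retrieve("Book a flight to Rome"))
    print(library.retrieve("xyz"))
```

Character-level similarity is crude, and a production system would more plausibly compare meaning rather than spelling, but the save-then-retrieve flow is the same.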

Cross-Platform and Developer-Friendly:

The system runs on macOS, Linux, and Windows (via WSL2), and is containerized using Docker. It can be installed via pip, and the interface is served on a local port and opened in a browser. It also integrates with Azure services and with models served locally through Ollama.
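
Because the interface is served on a local port, a quick reachability check can confirm that it is up before opening a browser. The helper below is a generic sketch using only Python's standard library; `is_port_open` is a hypothetical name, and the port number 8081 in the usage example is just a placeholder for whichever port the tool was started on.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    port = 8081  # placeholder; use the port the UI was actually started on
    if is_port_open("127.0.0.1", port):
        print(f"UI reachable at http://localhost:{port}")
    else:
        print(f"Nothing is listening on port {port} yet")
```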

Interface Design:

Magentic-UI features a dual-pane interface consisting of a session navigator and a session workspace. The workspace displays the live task plan alongside a real-time browser view, giving users visibility into the automation process and letting them pause or intervene at any point.

A Platform for Research and Interaction:

More than just an automation tool, Magentic-UI serves as a research platform to explore human-agent collaboration. It enables experimentation with intelligent agents in real-world scenarios while ensuring human oversight remains central to the process.

With Magentic-UI, Microsoft is pairing AI-driven automation with human guidance, aiming for web task automation that stays transparent and under the user's control.