
RPA Teacher. 35,000+ YouTube Subscribers. Microsoft MVP. 2 x UiPath MVP.
In a recent YouTube video, Anders Jensen [MVP] demonstrates how Microsoft 365 Copilot now runs GPT and Claude side by side. He frames this as a major shift for enterprise workflows, showing how multiple models can work together inside the same Copilot experience to speed up research and reduce errors. Viewers get a practical walkthrough of the new features, with real-world examples aimed at business use cases like market analysis and automation.
Jensen highlights the new multi-model tools called Critique and Council and explains how they change the way teams verify and refine outputs. The video demonstrates both parallel comparisons and automated review paths, which make it easier to judge each model's strengths and pick the best output. This makes the clip useful for IT leaders, knowledge workers, and anyone exploring AI-assisted research in Microsoft 365.
At the core, the setup embeds multiple large language models into Copilot so they can either work in sequence, with one drafting and the other reviewing, or run simultaneously on the same prompt. For example, Critique uses GPT as the primary drafter while Claude serves as a reviewer that checks for accuracy, citations, and potential hallucinations. Meanwhile, Council runs both models in parallel and then applies a third judge model to compare outputs and highlight agreements and differences.
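The two patterns described here can be sketched in a few lines of Python. This is only an illustration of the control flow, not Copilot's implementation: the function names `critique` and `council`, the `Model` type, and the stand-in models `fake_gpt`, `fake_claude`, and `fake_judge` are all hypothetical, and no real model API is called.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# Hypothetical model interface: takes a prompt string, returns text.
Model = Callable[[str], str]


def critique(prompt: str, drafter: Model, reviewer: Model) -> Dict[str, str]:
    """Sequential pattern: one model drafts, a second model reviews the draft."""
    draft = drafter(prompt)
    review = reviewer(
        "Review this draft for accuracy, citations, and unsupported claims:\n" + draft
    )
    return {"draft": draft, "review": review}


def council(prompt: str, model_a: Model, model_b: Model, judge: Model) -> Dict[str, str]:
    """Parallel pattern: two models answer the same prompt, a third compares them."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_a = pool.submit(model_a, prompt)
        future_b = pool.submit(model_b, prompt)
        answer_a, answer_b = future_a.result(), future_b.result()
    verdict = judge(
        "Compare the answers and list agreements and differences.\n"
        f"A: {answer_a}\nB: {answer_b}"
    )
    return {"a": answer_a, "b": answer_b, "verdict": verdict}


# Stand-in models (assumptions) so the sketch runs without any external service.
def fake_gpt(prompt: str) -> str:
    return "GPT draft for: " + prompt.splitlines()[0]


def fake_claude(prompt: str) -> str:
    return "Claude response to: " + prompt.splitlines()[0]


def fake_judge(prompt: str) -> str:
    return "judge saw %d lines" % len(prompt.splitlines())
```

Calling `critique("market size", fake_gpt, fake_claude)` makes two model calls in sequence, while `council("market size", fake_gpt, fake_claude, fake_judge)` makes two calls in parallel plus one judge call, which is where the extra compute cost of the parallel mode comes from.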
This architecture gives teams clear visibility into where models align and where they diverge, which helps users choose the right passage or refine prompts further. In practice, Jensen shows that one model may produce a concise result while the other offers a more detailed report, so side-by-side views accelerate decision making. Consequently, the system supports more nuanced research than single-model approaches typically allow.
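To make the idea of agreement and divergence concrete, here is a deliberately naive sketch that sorts the sentences of two outputs into shared and unique groups. It assumes exact sentence matching after lowercasing, which a real comparison (and certainly a judge model) would go well beyond; the helper names `split_sentences` and `compare_outputs` are hypothetical.

```python
import re
from typing import Dict, List


def split_sentences(text: str) -> List[str]:
    """Split on whitespace that follows sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def compare_outputs(output_a: str, output_b: str) -> Dict[str, List[str]]:
    """Group sentences into those both outputs share and those unique to each.

    Matching is exact after lowercasing, so paraphrases count as differences.
    """
    sents_a = split_sentences(output_a)
    sents_b = split_sentences(output_b)
    keys_a = {s.lower() for s in sents_a}
    keys_b = {s.lower() for s in sents_b}
    return {
        "agree": [s for s in sents_a if s.lower() in keys_b],
        "only_a": [s for s in sents_a if s.lower() not in keys_b],
        "only_b": [s for s in sents_b if s.lower() not in keys_a],
    }
```

For instance, comparing "Revenue grew 5%. Margins were flat." with "Revenue grew 5%. Costs rose in Q3." puts the revenue sentence under `agree` and one sentence under each of `only_a` and `only_b`, which are the passages a human reviewer would need to reconcile.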
Jensen explains that organizations must enable the Frontier early-access program and activate Anthropic models in Copilot settings to use the multi-model features. Administrators therefore need to configure tenant controls and enable model selection before users can switch between GPT and Claude in the Copilot Researcher interface. Once enabled, users can pick models from a dropdown, run side-by-side comparisons, or engage automated review modes for everyday tasks like financial planning or marketing analysis.
He also shows how Copilot Cowork, which leans on Claude, can execute multi-step tasks across Microsoft 365 apps, such as scheduling, document updates, or compiling reports. The toolchain ties model outputs back to user data through Work IQ, preserving data context while helping teams automate follow-up steps. As a result, the video paints a realistic picture of how these tools fit into daily workflows rather than remaining experimental features.
The multi-model approach brings clear advantages, including reduced hallucinations and improved accuracy when two models cross-check one another. For instance, Jensen notes measurable improvements on research benchmarks when using a critique-style review chain, which can increase trust in outputs for regulated industries. However, teams must weigh these gains against added complexity in management and potential increases in compute costs when running multiple models concurrently.
Furthermore, side-by-side comparisons grant transparency but can also create decision friction when models disagree, requiring human judgement to choose or reconcile outputs. While this adds a valuable layer of scrutiny, it also demands training and governance so users interpret differences correctly and avoid over-reliance on any single answer. Therefore, organizations should balance the desire for accuracy with operational factors such as cost, latency, and user readiness.
Jensen points to practical challenges like deployment complexity, model licensing, and the need for strong admin controls to align AI behavior with enterprise policies. In addition, organizations must plan for compliance, data protection, and audit trails, because multi-model setups can increase the surface area for governance. Consequently, IT teams should prepare clear policies for when to trust automated reviews and how to log decisions for future audits.
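One way to picture the kind of decision log mentioned here is a single JSON record per human choice between model outputs. The sketch below is an assumption about what such a record could contain, not a Microsoft 365 audit format; the function `audit_record` and every field name are hypothetical. It stores hashes rather than raw text so the log itself does not duplicate sensitive content.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, Optional


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_record(
    prompt: str,
    outputs: Dict[str, str],   # model name -> output text
    chosen: str,               # name of the model whose output was accepted
    decided_by: str,           # user who made the call
    reason: str,
    now: Optional[datetime] = None,
) -> str:
    """Return one JSON line describing which model output was accepted and why."""
    if chosen not in outputs:
        raise ValueError(f"chosen model {chosen!r} is not among the outputs")
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    record = {
        "timestamp": timestamp,
        "prompt_sha256": _sha256(prompt),
        "output_sha256": {name: _sha256(text) for name, text in outputs.items()},
        "chosen": chosen,
        "decided_by": decided_by,
        "reason": reason,
    }
    return json.dumps(record, sort_keys=True)
```

Appending each returned line to a write-once log would give auditors a trail of who accepted which output and on what grounds, without relying on users to remember the comparison later.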
Another challenge involves handling disagreements between GPT and Claude; while differences can reveal model strengths, they also require human adjudication and updated prompts to converge on reliable answers. Training users to spot meaningful divergences and to craft follow-up prompts is therefore essential, and leaders should expect an initial learning curve. Overall, Jensen’s video makes it clear that the multi-model path offers powerful benefits, but it requires thoughtful implementation and ongoing governance to realize those gains effectively.