Guides · April 19, 2026 · 7 min read · By Marcus Webb

Monitoring Microsoft 365 Status: Teams, Outlook, SharePoint, and Azure AD

A practical guide to monitoring Microsoft 365 reliability — including Teams, Outlook, SharePoint, OneDrive, and Azure Active Directory. Component-level alerts, outage patterns, and an IT-ops-friendly setup.

Why Microsoft 365 Monitoring Matters Beyond the IT Department

Microsoft 365 sits at the center of most enterprise productivity stacks. A single hour of Teams or Outlook downtime disrupts email, internal meetings, and file collaboration, and an Azure AD failure spreads through SSO into dozens of other applications. For organizations with more than 500 employees, a major Microsoft 365 outage typically impairs the work of 85–95% of the workforce.

Microsoft 365 has also had a run of high-impact incidents in recent years: the January 2023 global Teams outage (7+ hours), the October 2024 Outlook/Exchange incident (6 hours, regional), the July 2025 Azure AD degradation (4 hours, cascading into hundreds of M365-connected apps), and several shorter Teams-specific incidents. Average M365 uptime is strong (99.87% in Q1 2026), but when incidents happen, they are broad.

The Component Tree You Should Actually Monitor

Microsoft publishes status for each M365 service independently via the Microsoft 365 Service Health Dashboard and Admin Center. The components that matter most for the majority of organizations: Exchange Online (email), Microsoft Teams (chat and meetings), SharePoint Online (document libraries, intranet), OneDrive for Business (file storage and sync), Azure Active Directory (authentication; since renamed Microsoft Entra ID), Microsoft 365 Apps (desktop Office applications and activation), and Microsoft Defender for Office 365 (email security, phishing protection).
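If you also pull service health programmatically, the per-component view above reduces to a simple filter. The sketch below assumes a payload shaped like the Microsoft Graph service health overview response (a `value` array whose entries carry `service` and `status` fields). The service display names and the `serviceOperational` status string are assumptions to check against your own tenant's response; `unhealthy_services` is a hypothetical helper, not part of any Microsoft or PulsAPI library.

```python
# Assumption: display names as they might appear in the "service" field.
# Verify these against a real response from your tenant.
WATCHED_SERVICES = {
    "Exchange Online",
    "Microsoft Teams",
    "SharePoint Online",
    "OneDrive for Business",
}

# Assumption: the status string reported for a healthy service.
HEALTHY_STATUS = "serviceOperational"


def unhealthy_services(payload: dict, watched: set[str] = WATCHED_SERVICES) -> dict[str, str]:
    """Return {service: status} for watched services that are not healthy."""
    result = {}
    for entry in payload.get("value", []):
        name = entry.get("service")
        status = entry.get("status")
        # Ignore services we do not track and anything reporting healthy.
        if name in watched and status != HEALTHY_STATUS:
            result[name] = status
    return result


if __name__ == "__main__":
    # Hand-written sample, not real Microsoft data.
    sample = {
        "value": [
            {"service": "Exchange Online", "status": "serviceOperational"},
            {"service": "Microsoft Teams", "status": "serviceDegradation"},
        ]
    }
    print(unhealthy_services(sample))
```

Keeping the watched set explicit means a new Microsoft service appearing in the payload never pages anyone until you decide it should.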

Subscribe to each in PulsAPI. Azure AD deserves special attention: because it is the authentication layer for M365 and for most SSO-connected SaaS tools, an Azure AD outage cascades to almost every other tool your employees use. In the July 2025 incident, Azure AD degradation made Slack, Salesforce, Workday, Zoom, and Notion effectively unavailable for organizations using Azure AD as their identity provider — even though those services themselves were healthy.

Tenant-level status matters too. Microsoft sometimes publishes incident information that's scoped to specific regions (North America, EMEA, APAC) or to specific customer cohorts. If your tenant is on a less common cluster, the global M365 status page may show green while your users experience degradation. Running end-to-end probes against your actual tenant (authenticating via Azure AD, sending a test email, writing to OneDrive) catches these tenant-specific incidents.
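A tenant probe of this kind can be sketched as a small runner that times each step and records failures. This is a minimal illustration, not a finished monitor: the step callables (authenticate, send a test email, write to OneDrive) are placeholders you would replace with real calls against your own tenant, and `ProbeResult`, `run_probe`, `run_all`, and the 10-second slowness threshold are all assumptions made for the example.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    name: str
    ok: bool
    seconds: float
    error: str = ""


def run_probe(name: str, step: Callable[[], None], slow_after: float = 10.0) -> ProbeResult:
    """Run one probe step; a raised exception or a slow run counts as a failure."""
    start = time.monotonic()
    try:
        step()
    except Exception as exc:  # any failure in the step marks the probe as failed
        return ProbeResult(name, False, time.monotonic() - start, repr(exc))
    elapsed = time.monotonic() - start
    if elapsed > slow_after:
        return ProbeResult(name, False, elapsed, f"slower than {slow_after}s")
    return ProbeResult(name, True, elapsed)


def run_all(steps: dict[str, Callable[[], None]], slow_after: float = 10.0) -> list[ProbeResult]:
    """Run every step in insertion order and collect the results."""
    return [run_probe(name, step, slow_after) for name, step in steps.items()]


if __name__ == "__main__":
    def placeholder_step() -> None:
        """Stand-in for a real call against your tenant."""

    # Hypothetical step names mirroring the probes described above.
    steps = {
        "azure_ad_auth": placeholder_step,
        "send_test_email": placeholder_step,
        "onedrive_write": placeholder_step,
    }
    for result in run_all(steps):
        print(result)
```

Treating a slow success as a failure is a deliberate choice here: tenant-specific degradation often shows up as latency before it shows up as errors.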

Alert Routing for IT Operations Teams

The right alert-routing strategy for M365 differs from the typical engineering on-call setup. M365 incidents affect non-technical employees, so the information flow needs to reach the helpdesk and internal communications team, not just an SRE rotation.

A typical PulsAPI configuration for IT ops: Azure AD, Exchange Online, and Teams degradations trigger a high-priority alert to the helpdesk Slack or Teams channel and a PagerDuty page to the IT Operations on-call. SharePoint and OneDrive degradations trigger a standard-priority alert to the helpdesk channel. Microsoft Defender alerts route to the security team. Microsoft 365 Apps activation issues route to the desktop support team.
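Written down as data, the routing above might look like the following sketch. It is not PulsAPI's configuration format; the destination names, the `Route` type, and `route_alert` are hypothetical. The text does not state a priority for Defender or Microsoft 365 Apps alerts, so "standard" is an assumption, as is the fallback route for unlisted components.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    priority: str
    destinations: tuple[str, ...]


# Hypothetical destination identifiers; substitute your own channels.
HIGH = Route("high", ("helpdesk-channel", "pagerduty:it-ops-oncall"))
HELPDESK = Route("standard", ("helpdesk-channel",))

ROUTES = {
    "Azure AD": HIGH,
    "Exchange Online": HIGH,
    "Microsoft Teams": HIGH,
    "SharePoint Online": HELPDESK,
    "OneDrive for Business": HELPDESK,
    # Priority for the next two is an assumption; the routing targets are not.
    "Microsoft Defender for Office 365": Route("standard", ("security-team",)),
    "Microsoft 365 Apps": Route("standard", ("desktop-support",)),
}

# Assumption: unlisted components fall back to the helpdesk channel.
DEFAULT_ROUTE = HELPDESK


def route_alert(component: str) -> Route:
    """Look up where an alert for the given component should go."""
    return ROUTES.get(component, DEFAULT_ROUTE)


if __name__ == "__main__":
    for name in ("Azure AD", "SharePoint Online", "Microsoft 365 Apps"):
        print(name, route_alert(name))
```

Keeping the table in one place makes it easy to review with the helpdesk and security leads, who are the people it actually pages.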

The goal of this routing is that by the time employees start filing helpdesk tickets, which happens within minutes of a major M365 incident, the helpdesk team already has context, a draft internal communication ready to send, and an estimated resolution window from Microsoft's published status. Organizations that have this flow in place consistently see 60–80% fewer inbound tickets during M365 incidents because the internal communication goes out before the tickets do.

Preparing for the Next Major M365 Incident

Three practices significantly improve M365 incident response. First, maintain an internal 'M365 status' page that aggregates Microsoft's status with your own probe data and any tenant-specific information. Host it on infrastructure that is not M365-dependent (a separate domain, not on SharePoint) so it remains accessible during an incident.
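For the internal status page, one simple way to merge the two sources is to show the worse of Microsoft's published status and your own probe result for each service. The sketch below assumes a three-level status vocabulary (`operational`, `degraded`, `outage`) that you would map vendor and probe data onto first; `combine` and `build_status_page` are hypothetical names.

```python
# Assumption: both sources are normalized to these three levels beforehand.
# An unrecognized status string raises KeyError rather than being guessed at.
SEVERITY = {"operational": 0, "degraded": 1, "outage": 2}


def combine(vendor: str, probe: str) -> str:
    """Return whichever of the two statuses is more severe."""
    return max(vendor, probe, key=SEVERITY.__getitem__)


def build_status_page(vendor_status: dict[str, str], probe_status: dict[str, str]) -> dict[str, str]:
    """Merge vendor and probe views; a service missing from one source
    is treated as operational in that source."""
    services = sorted(set(vendor_status) | set(probe_status))
    return {
        service: combine(
            vendor_status.get(service, "operational"),
            probe_status.get(service, "operational"),
        )
        for service in services
    }


if __name__ == "__main__":
    # Made-up inputs: the vendor page is green while a tenant probe is not.
    vendor = {"Microsoft Teams": "operational", "Exchange Online": "operational"}
    probes = {"Microsoft Teams": "degraded"}
    print(build_status_page(vendor, probes))
```

Taking the worse of the two is what lets the page show a tenant-specific problem while the global status is still green.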

Second, maintain an out-of-band communication channel, typically a dedicated Slack workspace, Discord, or SMS list, that your core response team can use when Teams and Outlook are down. Make sure the channel does not depend on Azure AD sign-in, or it will fail during exactly the incidents it exists for. You will not be able to coordinate an M365 incident response inside Teams.

Third, document your SSO dependencies. List every SaaS application in your stack that relies on Azure AD for authentication. When Azure AD degrades, you immediately know which tools are likely affected and can communicate proactively to the business rather than fielding questions one application at a time. PulsAPI's Stack Impact Intelligence maps this automatically for subscribed services.
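Even without tooling, the dependency list can live as a small inventory that answers "what breaks if this identity provider degrades?". The sketch below is illustrative only: the inventory entries reuse the applications named earlier in this article plus one made-up entry, and `affected_apps` is a hypothetical helper.

```python
# Illustrative inventory: application -> identity provider it signs in through.
# Replace with your own stack; "Internal status page" is a made-up entry showing
# an application that deliberately does not depend on Azure AD.
SSO_INVENTORY = {
    "Slack": "Azure AD",
    "Salesforce": "Azure AD",
    "Workday": "Azure AD",
    "Zoom": "Azure AD",
    "Notion": "Azure AD",
    "Internal status page": "local accounts",
}


def affected_apps(identity_provider: str, inventory: dict[str, str] = SSO_INVENTORY) -> list[str]:
    """List applications that authenticate through the given provider."""
    return sorted(app for app, idp in inventory.items() if idp == identity_provider)


if __name__ == "__main__":
    print("Likely affected:", ", ".join(affected_apps("Azure AD")))
```

The output doubles as the first line of the internal communication: one message naming every likely affected tool, instead of one answer per question.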

About the Author

Marcus Webb, Head of Product

Marcus leads product at PulsAPI, where he focuses on making operational awareness effortless for engineering teams. Previously at Datadog and PagerDuty.
