Handling Graph API Throttling in SPFx Solutions

Your SPFx web part works perfectly in your dev tenant with one user.
In production, with 200 people loading the same dashboard at 9am on a Monday, you start seeing intermittent failures — odd 429 responses, some users getting data and others getting errors, components that load on refresh but not on first load.
That is Graph throttling. It is not a bug in your code. It is a feature of Graph's fairness system, and every production SPFx developer needs to know how to handle it.

🗺️ How Graph Throttling Works

Microsoft Graph enforces resource limits at multiple levels:

Level	Scope
Per application	All calls from the same AAD app across the tenant
Per user per app	Calls made by a specific user via a specific app
Per tenant	Total Graph calls originating from the tenant

When any limit is exceeded, Graph returns:

HTTP 429 Too Many Requests
Retry-After: 14

The Retry-After header contains the number of seconds to wait before retrying. Ignoring it and retrying immediately will only extend the throttle window — Graph tracks repeated violations.
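
HTTP allows Retry-After to carry either a delay in seconds or an HTTP date. Graph normally sends seconds, but a defensive parser costs little. The sketch below is one way to normalise both forms to milliseconds; the name parseRetryAfterMs is hypothetical and not part of any SDK.

```typescript
// Hypothetical helper: converts a Retry-After header value to milliseconds.
// Returns undefined when the header is missing or unparseable, so the caller
// can fall back to its own backoff.
function parseRetryAfterMs(
  value: string | null | undefined,
  nowMs: number = Date.now()
): number | undefined {
  if (!value) return undefined;
  const trimmed = value.trim();

  // Form 1: delay in whole seconds, e.g. "14"
  if (/^\d+$/.test(trimmed)) {
    return parseInt(trimmed, 10) * 1000;
  }

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(trimmed);
  if (!isNaN(dateMs)) {
    // A date in the past means "retry now", never a negative wait
    return Math.max(0, dateMs - nowMs);
  }

  return undefined;
}
```

Returning undefined rather than 0 for a missing header keeps "no guidance from the server" distinct from "retry immediately".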

SPFx solutions share the SharePoint Online Client Extensibility Web Application Principal across the tenant. High-traffic periods (morning logins, large team meetings, company-wide announcements) can cause multiple web parts from multiple developers to collectively hit the per-application limit.

🔁 Basic Retry with Respect for Retry-After

The minimum viable throttle handler: catch 429, read the Retry-After header, wait, then retry, up to a fixed number of attempts. The same loop also retries the transient 503 and 504 responses.

import { MSGraphClientV3 } from '@microsoft/sp-http';

export async function graphCallWithRetry<T>(
  client: MSGraphClientV3,
  apiPath: string,
  maxRetries: number = 3
): Promise<T> {
  let attempt = 0;

  while (attempt <= maxRetries) {
    try {
      return await client.api(apiPath).get() as T;
    } catch (err) {
      const status = (err as { statusCode?: number }).statusCode;
      const isThrottled = status === 429;
      const isTransient = status === 503 || status === 504;

      if ((isThrottled || isTransient) && attempt < maxRetries) {
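        // In recent Graph SDK versions GraphError.headers is a Fetch Headers
        // object; a plain record is handled too in case the shape differs
        const headers = (err as { headers?: Headers | Record<string, string> }).headers;
        const retryAfterHeader = headers instanceof Headers
          ? headers.get('retry-after')
          : headers?.['retry-after'];

        // Use Retry-After if it is a valid number; otherwise use exponential backoff
        const retryAfterSeconds = parseInt(retryAfterHeader || '', 10);
        const waitSeconds = isNaN(retryAfterSeconds)
          ? Math.pow(2, attempt) * 2 // 2s, 4s, 8s
          : retryAfterSeconds;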

        console.warn(
          `Graph throttled (${status}). Waiting ${waitSeconds}s before retry ${attempt + 1}/${maxRetries}...`
        );
        await new Promise(resolve => setTimeout(resolve, waitSeconds * 1000));
        attempt++;
      } else {
        throw err;
      }
    }
  }

  throw new Error(`Graph call to ${apiPath} failed after ${maxRetries} retries.`);
}

Usage:

const me = await graphCallWithRetry<{ displayName: string }>(client, '/me?$select=displayName');
console.log(me.displayName);

🧩 A Reusable Graph Throttle Wrapper Class

For a service class pattern, wrapping the retry logic at the method level gets repetitive fast. A higher-order wrapper keeps the service methods clean:

export class ThrottleAwareGraphService {
  private readonly _client: MSGraphClientV3;

  constructor(client: MSGraphClientV3) {
    this._client = client;
  }

  // Generic retry wrapper — use this for every Graph call
  private async _call<T>(
    fn: () => Promise<T>,
    maxRetries: number = 3
  ): Promise<T> {
    let attempt = 0;

    while (attempt <= maxRetries) {
      try {
        return await fn();
      } catch (err) {
        const status = (err as { statusCode?: number }).statusCode;
        const retryable = status === 429 || status === 503 || status === 504;

        if (retryable && attempt < maxRetries) {
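          // headers may be a Fetch Headers object or a plain record
          const headers = (err as { headers?: Headers | Record<string, string> }).headers;
          const retryAfter = headers instanceof Headers
            ? headers.get('retry-after')
            : headers?.['retry-after'];
          const retryAfterSeconds = parseInt(retryAfter || '', 10);
          const waitMs = isNaN(retryAfterSeconds)
            ? Math.pow(2, attempt) * 2000 // fallback: 2s, 4s, 8s
            : retryAfterSeconds * 1000;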

          await new Promise(resolve => setTimeout(resolve, waitMs));
          attempt++;
        } else {
          throw err;
        }
      }
    }

    throw new Error('Max retries exceeded');
  }

  public async getMe(): Promise<{ displayName: string; mail: string }> {
    return this._call(() =>
      this._client.api('/me').select('displayName,mail').get()
    );
  }

  public async getGroupMembers(groupId: string): Promise<unknown[]> {
    return this._call(async () => {
      const res = await this._client
        .api(`/groups/${groupId}/members`)
        .select('displayName,mail')
        .get();
      return res.value;
    });
  }
}

Every public method delegates to _call — retry logic lives in exactly one place.

🚦 Throttle-Aware Request Queue

When multiple Graph calls fire simultaneously (common in web parts that load several data sources on mount), they all compete for the same rate limit budget. A simple request queue serialises calls and prevents the burst pattern that triggers throttling most aggressively:

type QueuedTask = () => Promise<unknown>;

export class GraphRequestQueue {
  private _queue: Array<{ task: QueuedTask; resolve: (v: unknown) => void; reject: (e: unknown) => void }> = [];
  private _running = false;
  private readonly _delayBetweenCallsMs: number;

  constructor(delayBetweenCallsMs: number = 200) {
    // 200ms between calls caps the rate at about 5 calls/second; tune this
    // to the published limits of the Graph services you call
    this._delayBetweenCallsMs = delayBetweenCallsMs;
  }

  public enqueue<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      this._queue.push({ task: task as QueuedTask, resolve: resolve as (v: unknown) => void, reject });
      // _processQueue handles its own errors, so the promise is deliberately not awaited
      if (!this._running) void this._processQueue();
    });
  }

  private async _processQueue(): Promise<void> {
    this._running = true;

    while (this._queue.length > 0) {
      const next = this._queue.shift()!;
      try {
        const result = await next.task();
        next.resolve(result);
      } catch (err) {
        next.reject(err);
      }
      if (this._queue.length > 0) {
        await new Promise(resolve => setTimeout(resolve, this._delayBetweenCallsMs));
      }
    }

    this._running = false;
  }
}

Usage in a web part that loads multiple datasets:

const queue = new GraphRequestQueue(250); // 250ms spacing

const [profile, events, members] = await Promise.all([
  queue.enqueue(() => client.api('/me').get()),
  queue.enqueue(() => client.api('/me/calendarView?startDateTime=...').get()),
  queue.enqueue(() => client.api('/me/memberOf').get())
]);

Promise.all waits for all three to complete, but the queue serialises their execution with 250ms gaps — preventing the simultaneous burst.

📊 Recognising Throttling Patterns

Symptom: Works fine for one user, fails for many.
Classic per-application or per-tenant throttling. The solution stays within limits for a single user but exceeds them under real load. Reduce the number of calls with caching and delta queries, and use request queuing to smooth out bursts.

Symptom: First load fails, refresh works.
The per-user limit was hit on the initial burst of parallel calls. Add retry logic — the second attempt succeeds because the throttle window has cleared.

Symptom: Failures at 9am every day, fine otherwise.
Morning login surge — many users loading Graph-heavy web parts simultaneously. Implement delta queries or caching so subsequent loads within a session do not re-fetch unchanged data.

Symptom: 429 with no Retry-After header.
Some Graph endpoints return 429 without specifying a retry window. In that case fall back to exponential backoff (2s, 4s, 8s, as in the retry helpers above) and cap the wait, for example at 30 seconds.
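
The fallback for a missing Retry-After header can be sketched as exponential backoff with a ceiling. The function name backoffDelayMs, the 2-second base, the 30-second cap, and the 20% jitter are all assumptions chosen for illustration, not values documented by Graph.

```typescript
// Hypothetical helper: exponential backoff with a cap, for use only when
// the server gave no Retry-After value. attempt is zero-based.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 2000,
  capMs: number = 30000,
  random: () => number = Math.random
): number {
  // 2s, 4s, 8s, 16s, ... before jitter
  const exponential = baseMs * Math.pow(2, attempt);
  // Up to 20% jitter so clients throttled together do not retry together
  const jitter = exponential * 0.2 * random();
  // Apply the cap last so jitter can never push the wait past it
  return Math.min(capMs, Math.round(exponential + jitter));
}
```

The random source is injectable so the delay is deterministic in unit tests.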

💡 Prevention Strategies

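Use $select religiously. Trimming the response to the fields you need reduces payload size and the work Graph does per request, and for services that throttle on request cost rather than request count it can lower the cost of each call. Prefer api('/me').select('displayName,mail').get() over a bare api('/me').get().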

Cache aggressively. User profiles, group memberships, and org chart data change rarely. Cache them in component state, session storage, or a PnPjs caching behaviour for the duration of the browser session.
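
A session-scoped cache with a time-to-live can be sketched in a few lines. The helper name getOrLoad, the KeyValueStore interface, and the entry shape are hypothetical; the interface is written so that window.sessionStorage satisfies it, and the values are assumed to be JSON-serialisable.

```typescript
// Minimal storage shape; window.sessionStorage satisfies it
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

interface CacheEntry<T> {
  expiresAt: number; // epoch milliseconds
  value: T;
}

// Hypothetical helper: returns the cached value while it is fresh,
// otherwise runs the loader and stores the result.
async function getOrLoad<T>(
  store: KeyValueStore,
  key: string,
  ttlMs: number,
  loader: () => Promise<T>,
  nowMs: () => number = Date.now
): Promise<T> {
  const raw = store.getItem(key);
  if (raw !== null) {
    try {
      const entry = JSON.parse(raw) as CacheEntry<T>;
      if (entry.expiresAt > nowMs()) return entry.value;
    } catch {
      // Corrupt entry: ignore it and reload
    }
  }

  const value = await loader();
  try {
    store.setItem(key, JSON.stringify({ expiresAt: nowMs() + ttlMs, value }));
  } catch {
    // Storage full or unavailable: return the fresh value uncached
  }
  return value;
}
```

In a web part this might be called as getOrLoad(window.sessionStorage, 'graph:me', 15 * 60 * 1000, () => client.api('/me').select('displayName,mail').get()), where the key and the 15-minute lifetime are illustrative choices.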

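Use $batch to cut round trips. A JSON batch packs up to twenty requests into one HTTP call, which reduces connection overhead and latency. It does not reduce quota usage: Graph evaluates each request in the batch against the throttling limits individually, and a throttled request comes back with status 429 inside an otherwise successful batch response. If you are making more than two Graph calls on component mount, batch them and check the status of each response.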
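
A JSON batch is a POST to /$batch with a requests array, and the reply carries one status per request. The sketch below builds that body and sorts the reply into successes, retryable items, and hard failures. The helper names and the BatchOutcome shape are hypothetical; only the requests/responses envelope and the 20-request ceiling come from the Graph batching format.

```typescript
interface BatchRequestItem { id: string; method: 'GET'; url: string; }

interface BatchResponseItem {
  id: string;
  status: number;
  headers?: Record<string, string>;
  body?: unknown;
}

interface BatchOutcome {
  ok: Record<string, unknown>; // response bodies keyed by request id
  retryIds: string[];          // 429/503/504: worth retrying after a wait
  failedIds: string[];         // any other non-2xx status
}

// Builds the JSON body for POST /$batch from relative URLs such as '/me'.
function buildBatchBody(urls: string[]): { requests: BatchRequestItem[] } {
  if (urls.length > 20) {
    throw new Error('A JSON batch accepts at most 20 requests');
  }
  return {
    requests: urls.map((url, i) => ({ id: String(i + 1), method: 'GET' as const, url }))
  };
}

// Sorts the per-request results; each one carries its own status code.
function splitBatchResponses(responses: BatchResponseItem[]): BatchOutcome {
  const outcome: BatchOutcome = { ok: {}, retryIds: [], failedIds: [] };
  for (const r of responses) {
    if (r.status >= 200 && r.status < 300) {
      outcome.ok[r.id] = r.body;
    } else if (r.status === 429 || r.status === 503 || r.status === 504) {
      outcome.retryIds.push(r.id);
    } else {
      outcome.failedIds.push(r.id);
    }
  }
  return outcome;
}

// Usage inside a web part (client is an MSGraphClientV3):
//   const reply = await client.api('/$batch')
//     .post(buildBatchBody(['/me?$select=displayName', '/me/memberOf']));
//   const { ok, retryIds } = splitBatchResponses(reply.responses);
```

Requests listed in retryIds can be resubmitted in a smaller follow-up batch after waiting for the longest Retry-After value found in their headers.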

Stagger initial loads. If multiple web parts load on the same page, add a small random delay to each web part's onInit to spread the Graph calls over a few hundred milliseconds rather than firing simultaneously.
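
The staggering idea can be sketched as a small random wait before the first Graph call. Both function names and the 300ms default are assumptions for illustration.

```typescript
// Hypothetical helper: picks a whole-millisecond delay in [0, maxMs).
function pickStaggerMs(maxMs: number, random: () => number = Math.random): number {
  return Math.floor(random() * maxMs);
}

// Resolves after a random delay of up to maxMs milliseconds.
function staggerStart(maxMs: number = 300): Promise<void> {
  const delayMs = pickStaggerMs(maxMs);
  return new Promise<void>(resolve => setTimeout(resolve, delayMs));
}
```

Calling await staggerStart() immediately before the first data fetch, rather than at the very top of onInit, keeps the web part's shell rendering promptly while still spreading out the Graph traffic.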

✅ Summary

Graph returns HTTP 429 with a Retry-After header (in seconds) when throttled — always read and respect this header.
503 and 504 responses are also transient and should be retried with backoff.
A generic _call(fn, maxRetries) wrapper keeps retry logic in one place across your entire service layer.
A request queue with 200–300ms spacing between calls prevents the burst patterns that trigger throttling most aggressively.
Prevent throttling before it happens: use $select, cache results, use $batch for parallel calls, and stagger web part initialisation.

Happy coding!