TenantAtlas/specs/051-entra-group-directory-cache/research.md
2026-01-11 22:02:06 +01:00

Research: Entra Group Directory Cache (Groups v1)

Decisions

Decision: Full tenant scope sync (all groups)

  • Rationale: Ensures browse/search is complete and name-resolution has the best chance of resolving IDs across modules.
  • Alternatives considered:
    • Sync only referenced groups (creates coverage gaps that are hard to debug and non-deterministic).

Decision: Start modes = manual + scheduled

  • Rationale: Manual sync supports operator workflows (pre-restore/triage), while scheduled sync keeps cache fresh without relying on user action.
  • Alternatives considered:
    • Manual only (cache freshness depends on operator discipline).
    • Scheduled only (harder to react quickly during ops).

Decision: App-only (service principal) directory reads

  • Rationale: Scheduled runs must not depend on an interactive user session; app-only is simpler to audit/lock down.
  • Alternatives considered:
    • Delegated tokens (breaks scheduled runs, unpredictable auth state).
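
To illustrate why app-only reads need no interactive session, here is a minimal sketch of an OAuth 2.0 client-credentials token request against the Microsoft identity platform. It only builds the request and sends nothing. The function name and the use of a client secret (rather than a certificate) are assumptions for illustration; this note does not specify which credential type the project uses.

```python
from typing import Tuple
from urllib.parse import urlencode


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Tuple[str, str]:
    """Return (url, form_body) for a client-credentials token request.

    Hypothetical helper: a real implementation would POST form_body to url
    with Content-Type application/x-www-form-urlencoded and read
    access_token from the JSON response.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",  # no user, no refresh token
        "client_id": client_id,
        "client_secret": client_secret,  # assumption: secret-based credential
        # .default requests whatever application permissions were granted
        "scope": "https://graph.microsoft.com/.default",
    }
    return url, urlencode(form)
```

Because the request carries only application credentials, a scheduled run can obtain a token at any time without a signed-in user.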

Decision: Cache metadata only (no membership/owners)

  • Rationale: Solves the UX problem (name resolution + browse) with minimal data/PII exposure and lower API volume.
  • Alternatives considered:
    • Cache membership/owners (larger data surface, higher sensitivity, more Graph calls).
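
A minimal sketch of what "metadata only" could look like when projecting a Graph group payload into a cache record. The `CachedGroup` type, its field selection, and `to_cached_group` are hypothetical; `id`, `displayName`, `mailEnabled`, and `securityEnabled` are standard Graph group properties, but the actual cached column set is not defined in this note.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class CachedGroup:
    """Hypothetical cache record: identity and display metadata only."""
    entra_id: str
    display_name: str
    mail_enabled: Optional[bool]
    security_enabled: Optional[bool]


def to_cached_group(payload: Mapping[str, Any]) -> CachedGroup:
    """Copy an explicit allow-list of fields; everything else is dropped.

    Members and owners are never read here, even if a payload happens
    to include them.
    """
    return CachedGroup(
        entra_id=payload["id"],
        display_name=payload.get("displayName") or "",
        mail_enabled=payload.get("mailEnabled"),
        security_enabled=payload.get("securityEnabled"),
    )
```

An explicit allow-list keeps the stored data surface fixed even if the upstream payload grows.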

Decision: Missing groups retention and purge

  • Decision: Retain groups that no longer appear in a sync for 90 days after last_seen_at, then purge them.
  • Rationale: Preserves investigatory value and audit context while bounding DB growth.
  • Alternatives considered:
    • Immediate delete (breaks historical triage; creates sudden “unresolved” labels).
    • Retain forever (unbounded growth).
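
The retention rule can be sketched as a simple cutoff check. The function names are hypothetical, and the boundary behavior (a group is purged only once strictly more than 90 days have elapsed since `last_seen_at`) is an assumption; the note does not say which side of the boundary applies.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List

RETENTION = timedelta(days=90)


def is_purgeable(last_seen_at: datetime, now: datetime) -> bool:
    """True once more than 90 days have passed since the group was last seen."""
    return now - last_seen_at > RETENTION


def select_purgeable(last_seen_by_id: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the IDs of cached groups that are past the retention window."""
    return [
        group_id
        for group_id, last_seen_at in last_seen_by_id.items()
        if is_purgeable(last_seen_at, now)
    ]
```

For example, with `now = datetime(2026, 1, 11, tzinfo=timezone.utc)`, a group last seen 91 days earlier is selected, while one last seen exactly 90 days earlier is retained.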

Decision: Graph access patterns (paging + retry)

  • Decision: Use the Graph list endpoint with paging; apply retry with backoff on HTTP 429/503 responses; persist stable error categories in the run record.
  • Rationale: Full-tenant enumeration can be large; throttling is expected.
  • Alternatives considered:
    • Best-effort without retry (unreliable coverage).
    • UI-time lookups (explicitly disallowed by spec).
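
The paging and retry pattern could be sketched as follows. The HTTP transport is injected as a `fetch` callable so the logic stays testable; that callable, the `SyncError` type, and the category strings are hypothetical. The `value` and `@odata.nextLink` keys and the `Retry-After` header are standard Graph paging and throttling conventions. Header names are assumed to arrive in the casing shown.

```python
import time
from typing import Any, Callable, Dict, Iterator, Tuple

# (status_code, headers, json_body) as returned by the injected transport
Response = Tuple[int, Dict[str, str], Dict[str, Any]]
Fetch = Callable[[str], Response]
RETRYABLE = {429, 503}


class SyncError(Exception):
    """Carries a stable category suitable for persisting in the run record."""

    def __init__(self, category: str, status: int):
        super().__init__(f"{category} (HTTP {status})")
        self.category = category
        self.status = status


def fetch_with_retry(fetch: Fetch, url: str, max_attempts: int = 5,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    last_status = 0
    for attempt in range(max_attempts):
        last_status, headers, body = fetch(url)
        if last_status == 200:
            return body
        if last_status not in RETRYABLE:
            raise SyncError("graph_request_failed", last_status)
        if attempt + 1 < max_attempts:
            # Prefer the server's hint; fall back to exponential backoff.
            delay = base_delay * (2 ** attempt)
            try:
                delay = float(headers["Retry-After"])
            except (KeyError, ValueError):
                pass
            sleep(delay)
    raise SyncError("graph_retry_exhausted", last_status)


def iter_groups(fetch: Fetch, first_url: str, **retry_kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every group, following @odata.nextLink until it is absent."""
    url = first_url
    while url:
        page = fetch_with_retry(fetch, url, **retry_kwargs)
        yield from page.get("value", [])
        url = page.get("@odata.nextLink")
```

A caller would catch `SyncError` and store `category` on the run record, so failures stay comparable across runs even if the underlying message text changes.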

Open Questions (resolved for planning)

  • The specific Graph endpoint and permission names will be documented in the contract registry during implementation.