Tasks: Backend Architecture Pivot

Feature: 005-backend-arch-pivot
Generated: 2025-12-09
Total Tasks: 66 (T001-T066)
Spec: spec.md | Plan: plan.md

Phase 1: Setup (no story label)

  • T001 Confirm Dokploy-provided REDIS_URL and record connection string in specs/005-backend-arch-pivot/notes.md
  • T002 Add REDIS_URL to local .env.example and project .env (if used) (.env.example)
  • T003 Update lib/env.mjs to validate REDIS_URL (lib/env.mjs)
  • T004 [P] Add npm dependencies: bullmq, ioredis, @azure/identity and dev tsx (package.json)
  • T005 [P] Add npm script worker:start to package.json to run tsx ./worker/index.ts (package.json)
  • T006 [P] Create lib/queue/redis.ts - Redis connection wrapper reading process.env.REDIS_URL (lib/queue/redis.ts)
  • T007 [P] Create lib/queue/syncQueue.ts - Export BullMQ Queue('intune-sync-queue') (lib/queue/syncQueue.ts)
  • T008 Test connectivity: add a dummy job from a Node REPL/script and verify connection to provided Redis (scripts/test-queue-connection.js)

Phase 2: Worker Skeleton (no story label)

  • T009 Create worker/index.ts - minimal BullMQ Worker entry point (concurrency:1) (worker/index.ts)
  • T010 Create worker/logging.ts - structured JSON logger used by worker (worker/logging.ts)
  • T011 Create worker/events.ts - job lifecycle event handlers (completed/failed) (worker/events.ts)
  • T012 [P] Add npm run worker:start integration to README.md with run instructions (README.md)
  • T013 Create worker/health.ts - minimal health check handlers (used in docs) (worker/health.ts)
  • T014 Smoke test: start npm run worker:start and verify worker connects and logs idle state (no file)

Phase 3: US1 — Manual Policy Sync via Queue [US1]

  • T015 [US1] Update lib/actions/policySettings.ts → implement triggerPolicySync() to call syncQueue.add(...) and return jobId (lib/actions/policySettings.ts)
  • T016 [US1] Create server action wrapper if needed app/actions/triggerPolicySync.ts (app/actions/triggerPolicySync.ts)
  • T017 [US1] Update /app/search/SyncButton.tsx to call server action and show queued toast with jobId (components/search/SyncButton.tsx)
  • T018 [US1] Add API route /api/policy-sync/status (optional) to report job status using BullMQ Job API (app/api/policy-sync/status/route.ts)
  • T019 [US1] Add simple job payload typing types/syncJob.ts (types/syncJob.ts)
  • T020 [US1] Add unit test for triggerPolicySync() mocking syncQueue.add (tests/unit/triggerPolicySync.test.ts)
  • T021 [US1] End-to-end test: UI → triggerPolicySync → job queued (integration test) (tests/e2e/sync-button.test.ts)
  • T022 [US1] OPTIONAL [P] Document MVP scope for job status endpoint (FR-022) in specs/005-backend-arch-pivot/notes.md (specs/005-backend-arch-pivot/notes.md)

Phase 4: US2 — Microsoft Graph Data Fetching [US2]

  • T023 [US2] Create worker/jobs/graphAuth.ts - getGraphAccessToken() using @azure/identity (worker/jobs/graphAuth.ts)
  • T024 [US2] Create worker/jobs/graphFetch.ts - fetchFromGraph(endpoint) with pagination following @odata.nextLink (worker/jobs/graphFetch.ts)
  • T025 [US2] Implement worker/utils/retry.ts - exponential backoff retry helper (worker/utils/retry.ts)
  • T026 [US2] Create integration tests mocking Graph endpoints for paginated responses (tests/integration/graphFetch.test.ts)
  • T027 [US2] Implement rate limit handling and transient error classification in graphFetch.ts (worker/jobs/graphFetch.ts)
  • T028 [US2] Add logging for Graph fetch metrics (requests, pages, duration) (worker/logging.ts)
  • T029 [US2] Test: run syncPolicies job locally against mocked Graph responses (tests/e2e/sync-with-mock-graph.test.ts)

Phase 5: US3 — Deep Flattening & Transformation [US3]

  • T030 [US3] Create worker/jobs/policyParser.ts - top-level router and parsePolicySettings() (worker/jobs/policyParser.ts)
  • T031 [US3] Implement Settings Catalog parser in policyParser.ts (worker/jobs/policyParser.ts)
  • T032 [US3] Implement OMA-URI parser in policyParser.ts (worker/jobs/policyParser.ts)
  • T033 [US3] Create worker/utils/humanizer.ts - humanizeSettingId() function (worker/utils/humanizer.ts)
  • T034 [US3] Create normalization function worker/jobs/normalizer.ts to produce PolicyInsertData[] (worker/jobs/normalizer.ts)
  • T035 [US3] Unit tests for parsers + humanizer with representative Graph samples (tests/unit/policyParser.test.ts)

Phase 6: US3 — Database Persistence (shared, assign to US3) [US3]

  • T036 [US3] Create worker/jobs/dbUpsert.ts - batch upsert function using Drizzle (worker/jobs/dbUpsert.ts)
  • T037 [US3] Implement transactional upsert logic and ON CONFLICT DO UPDATE behavior (worker/jobs/dbUpsert.ts)
  • T038 [US3] Add performance tuning: batch size config and bulk insert strategy (worker/jobs/dbUpsert.ts)
  • T039 [US3] Add tests for upsert correctness (duplicates / conflict resolution) (tests/integration/dbUpsert.test.ts)
  • T040 [US3] Add lastSyncedAt update on upsert (worker/jobs/dbUpsert.ts)
  • T041 [US3] Load test: upsert 500+ policies and measure duration (scripts/load-tests/upsert-benchmark.js)
  • T042 [US3] Instrument metrics for DB operations (timings, rows inserted/updated) (worker/logging.ts)
  • T043 [US3] Validate data integrity end-to-end (Graph → transform → DB) (tests/e2e/full-sync.test.ts)

Phase 7: US4 — Frontend Integration & Legacy Cleanup [US4]

  • [X] T044 [US4] Update lib/actions/policySettings.ts to remove n8n webhook calls and call triggerPolicySync() (lib/actions/policySettings.ts)
  • [X] T045 [US4] Delete app/api/policy-settings/route.ts or archive its behavior (app/api/policy-settings/route.ts)
  • [X] T046 [US4] Delete app/api/admin/tenants/route.ts (n8n polling) (app/api/admin/tenants/route.ts)
  • [X] T047 [US4] Remove POLICY_API_SECRET and N8N_SYNC_WEBHOOK_URL from .env and lib/env.mjs (.env, lib/env.mjs)
  • [X] T048 [US4] Grep-check: verify no remaining n8n references (repo-wide) (no file)

  • T049 [US4] Update docs: remove n8n setup instructions and add worker notes (docs/worker-deployment.md)
  • T050 [US4] Add migration note to specs/002-manual-policy-sync/README.md marking it superseded (specs/002-manual-policy-sync/README.md)
  • T051 [US4] End-to-end QA: trigger sync from UI and confirm policies saved after cleanup (tests/e2e/post-cleanup-sync.test.ts)

Phase 8: Testing & Validation (no story label)

  • T052 Add unit tests for worker/utils/humanizer.ts and policyParser.ts coverage (tests/unit/*.test.ts)
  • T053 Add integration tests for worker jobs processing (tests/integration/worker.test.ts)
  • T054 Run load tests for large tenant (1000+ policies) and record results (scripts/load-tests/large-tenant.js)
  • T055 Test worker stability (run 1+ hour with multiple jobs) and check memory usage (local script)
  • T056 Validate all Success Criteria (SC-001 to SC-008) and document results (specs/005-backend-arch-pivot/validation.md)

Phase 9: Deployment & Documentation (no story label)

  • T057 Create docs/worker-deployment.md with production steps (docs/worker-deployment.md)
  • T058 Add deployment config for worker (Dockerfile or PM2 config) (deploy/worker/Dockerfile)
  • T059 Ensure REDIS_URL is set in production Dokploy config and documented (deploy/README.md)
  • T060 Add monitoring & alerting for worker failures (Sentry / logs / email) (deploy/monitoring.md)
  • T061 Run canary production sync and verify (scripts/canary-sync.js)
  • T062 Final cleanup: remove unused n8n-related code paths and feature flags (grep and code edits)
  • T063 Update README.md and DEPLOYMENT.md with worker instructions (README.md, DEPLOYMENT.md)
  • T064 Tag release branch 005-backend-arch-pivot and create PR template (.github/)
  • T065 Merge PR after review and monitor first production sync (GitHub workflow)
  • T066 Post-deploy: run post-mortem checklist and close feature ticket (specs/005-backend-arch-pivot/closure.md)

Notes

  • Tasks labeled [P] are safe to run in parallel across different files or developers.
  • Story labels map to spec user stories: US1 = Manual Sync, US2 = Graph Fetching, US3 = Transformation & DB, US4 = Cleanup & Frontend.
  • Each task includes a suggested file path to implement work; adjust as needed to match repo layout.

Tasks: Backend Architecture Pivot

Feature: 005-backend-arch-pivot
Generated: 2025-12-09
Total Tasks: 66 (T001-T066)
Spec: spec.md | Plan: plan.md


Phase 1: Setup & Infrastructure (8 tasks)

Goal: Prepare environment, install dependencies, setup Redis and BullMQ queue infrastructure

Environment Setup

  • T001 Install Redis via Docker Compose (add redis service to docker-compose.yml)
  • T002 [P] Add REDIS_URL to .env file (REDIS_URL=redis://localhost:6379)
  • T003 [P] Update lib/env.mjs - Add REDIS_URL: z.string().url() to server schema
  • T004 [P] Update lib/env.mjs - Add REDIS_URL to runtimeEnv object
  • T005 Install npm packages: bullmq, ioredis, @azure/identity, tsx

BullMQ Queue Infrastructure

  • T006 [P] Create lib/queue/redis.ts - Redis connection wrapper with IORedis
  • T007 [P] Create lib/queue/syncQueue.ts - BullMQ Queue definition for "intune-sync-queue"
  • T008 Test Redis connection and queue creation (add dummy job, verify in Redis CLI)

Phase 2: Worker Process Skeleton (6 tasks)

Goal: Set up worker process entry point and basic job processing infrastructure

Worker Setup

  • T009 Create worker/index.ts - BullMQ Worker entry point with job processor
  • T010 [P] Add worker:start script to package.json ("tsx watch worker/index.ts")
  • T011 [P] Implement worker event handlers (completed, failed, error)
  • T012 [P] Add structured logging for worker events (JSON format)
  • T013 Create worker/jobs/syncPolicies.ts - Main sync orchestration function (empty skeleton)
  • T014 Test worker starts successfully and listens on intune-sync-queue

Phase 3: Microsoft Graph Integration (9 tasks)

Goal: Implement Azure AD authentication and Microsoft Graph API data fetching with pagination

Authentication

  • T015 Create worker/jobs/graphAuth.ts - ClientSecretCredential token acquisition
  • T016 [P] Implement getGraphAccessToken() using @azure/identity
  • T017 Test token acquisition returns valid access token

Graph API Fetching

  • T018 Create worker/jobs/graphFetch.ts - Microsoft Graph API client
  • T019 [P] Implement fetchWithPagination() for handling @odata.nextLink
  • T020 [P] Create fetchAllPolicies() to fetch from 4 endpoints in parallel
  • T021 [P] Add Graph API endpoint constants (deviceConfigurations, compliancePolicies, configurationPolicies, intents)

Error Handling

  • T022 Create worker/utils/retry.ts - Exponential backoff retry logic
  • T023 Test Graph API calls with real tenant, verify pagination works for 100+ policies

Phase 4: Data Transformation (12 tasks)

Goal: Port n8n flattening logic to TypeScript, implement parsers for all policy types

Policy Parser Core

  • T024 Create worker/jobs/policyParser.ts - Main policy parsing router
  • T025 [P] Implement detectPolicyType() based on @odata.type
  • T026 [P] Implement parsePolicySettings() router function

Settings Catalog Parser

  • T027 Implement parseSettingsCatalog() for #microsoft.graph.deviceManagementConfigurationPolicy
  • T028 [P] Implement extractValue() for different value types (simple, choice, group collection)
  • T029 Handle nested settings with dot-notation path building

OMA-URI Parser

  • T030 [P] Implement parseOmaUri() for omaSettings[] arrays
  • T031 [P] Handle valueType mapping (string, int, boolean)

Humanizer & Utilities

  • T032 Create worker/utils/humanizer.ts - Setting ID humanization
  • T033 [P] Implement humanizeSettingId() to remove technical prefixes and format names
  • T034 [P] Implement defaultEmptySetting() for policies with no settings

Validation

  • T035 Test parser with sample Graph API responses, verify >95% extraction rate

Phase 5: Database Persistence (7 tasks)

Goal: Implement Drizzle ORM upsert logic with conflict resolution

Database Operations

  • T036 Create worker/jobs/dbUpsert.ts - Drizzle ORM upsert function
  • T037 [P] Implement upsertPolicySettings() with batch insert
  • T038 [P] Configure onConflictDoUpdate with policy_settings_upsert_unique constraint
  • T039 [P] Update lastSyncedAt timestamp on every sync
  • T040 Map FlattenedSetting[] to PolicySetting insert format

Integration

  • T041 Connect syncPolicies() orchestrator: auth → fetch → parse → upsert
  • T042 Test full sync with real tenant data, verify database updates correctly

Phase 6: Frontend Integration (4 tasks)

Goal: Replace n8n webhook with BullMQ job creation in Server Action

Server Action Update

  • T043 Modify lib/actions/policySettings.ts - triggerPolicySync() function
  • T044 Remove n8n webhook call (fetch to N8N_SYNC_WEBHOOK_URL)
  • T045 Add BullMQ job creation (syncQueue.add('sync-tenant', { tenantId }))
  • T046 Test end-to-end: UI click "Sync Now" → job created → worker processes → database updated

Phase 7: Legacy Cleanup (9 tasks)

Goal: Remove all n8n-related code, files, and environment variables

File Deletion

  • T047 Delete app/api/policy-settings/route.ts (n8n ingestion API)
  • T048 Delete app/api/admin/tenants/route.ts (n8n polling API)

Environment Variable Cleanup

  • T049 Remove POLICY_API_SECRET from .env file
  • T050 Remove N8N_SYNC_WEBHOOK_URL from .env file
  • T051 Remove POLICY_API_SECRET from lib/env.mjs server schema
  • T052 Remove N8N_SYNC_WEBHOOK_URL from lib/env.mjs server schema
  • T053 Remove POLICY_API_SECRET from lib/env.mjs runtimeEnv
  • T054 Remove N8N_SYNC_WEBHOOK_URL from lib/env.mjs runtimeEnv

Verification

  • T055 Run grep search for n8n references: grep -rE "POLICY_API_SECRET|N8N_SYNC_WEBHOOK_URL" --exclude-dir=specs . → should be 0 results

Phase 8: Testing & Validation (6 tasks)

Goal: Comprehensive testing of new architecture

Unit Tests

  • T056 [P] Write unit tests for humanizer.ts
  • T057 [P] Write unit tests for retry.ts
  • T058 [P] Write unit tests for policyParser.ts

Integration Tests

  • T059 Write integration test for full syncPolicies() flow with mocked Graph API
  • T060 Write integration test for database upsert with conflict resolution

End-to-End Test

  • T061 E2E test: Start Redis + Worker, trigger sync from UI, verify database updates

Phase 9: Deployment (5 tasks)

Goal: Deploy worker process to production environment

Docker & Infrastructure

  • T062 Update docker-compose.yml for production (Redis service with persistence)
  • T063 Create Dockerfile for worker process (if separate container)
  • T064 Configure worker as background service (PM2, Systemd, or Docker Compose)

Production Deployment

  • T065 Set REDIS_URL in production environment variables
  • T066 Deploy worker, monitor logs for first production sync

Dependencies Visualization

Phase 1 (Setup)
  ↓
Phase 2 (Worker Skeleton)
  ↓
Phase 3 (Graph Integration) ←─┐
  ↓                            │
Phase 4 (Transformation) ──────┤
  ↓                            │
Phase 5 (Database) ────────────┘
  ↓
Phase 6 (Frontend)
  ↓
Phase 7 (Cleanup)
  ↓
Phase 8 (Testing)
  ↓
Phase 9 (Deployment)

Parallel Opportunities:

  • Phase 3 & 4 can overlap (Graph integration while building parsers)
  • T002-T004 (env var updates) can be done in parallel
  • T006-T007 (Redis & Queue files) can be done in parallel
  • T015-T017 (auth) independent from T018-T021 (fetch)
  • T056-T058 (unit tests) can be done in parallel

Task Details

T001: Install Redis via Docker Compose

File: docker-compose.yml

Action: Add Redis service

services:
  redis:
    image: redis:alpine
    ports:
      - '6379:6379'
    volumes:
      - redis-data:/data
    restart: unless-stopped

volumes:
  redis-data:

Verification: docker-compose up -d redis && redis-cli ping returns PONG


T002-T004: Environment Variable Setup

Files: .env, lib/env.mjs

Changes:

  1. Add REDIS_URL=redis://localhost:6379 to .env
  2. Add REDIS_URL: z.string().url() to server schema
  3. Add REDIS_URL: process.env.REDIS_URL to runtimeEnv

Verification: npm run dev starts without env validation errors


T005: Install npm Dependencies

Command:

npm install bullmq ioredis @azure/identity
npm install -D tsx

Verification: Check package.json for new dependencies


T006: Create Redis Connection Wrapper

File: lib/queue/redis.ts

Implementation: See technical-notes.md section "BullMQ Setup"

Exports: redisConnection
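
The wrapper itself is specified in technical-notes.md, which is not reproduced here. As a rough sketch of the one setting this document does call out (maxRetriesPerRequest: null, see Common Pitfalls), the helper below turns REDIS_URL into a plain options object of the kind BullMQ accepts as its connection. The name buildRedisConnectionOptions and the interface are hypothetical, and TLS (rediss://) is deliberately left out.

```typescript
// Hypothetical helper, not taken from technical-notes.md.
// Builds an ioredis-style options object from a redis:// URL.
export interface RedisConnectionOptions {
  host: string;
  port: number;
  username?: string;
  password?: string;
  db?: number;
  // BullMQ workers expect this to be null (see Common Pitfalls).
  maxRetriesPerRequest: null;
}

export function buildRedisConnectionOptions(redisUrl: string): RedisConnectionOptions {
  const url = new URL(redisUrl); // throws on malformed input
  if (url.protocol !== "redis:") {
    // rediss:// would additionally need TLS options; out of scope for this sketch.
    throw new Error(`Unsupported Redis protocol: ${url.protocol}`);
  }
  const dbPath = url.pathname.replace(/^\//, "");
  return {
    host: url.hostname,
    port: url.port ? Number(url.port) : 6379, // 6379 is the Redis default port
    ...(url.username ? { username: decodeURIComponent(url.username) } : {}),
    ...(url.password ? { password: decodeURIComponent(url.password) } : {}),
    ...(dbPath ? { db: Number(dbPath) } : {}),
    maxRetriesPerRequest: null,
  };
}
```

The returned object could then be passed as the connection for both the queue (T007) and the worker (T009), so the two stay configured identically.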


T007: Create BullMQ Queue

File: lib/queue/syncQueue.ts

Implementation: See technical-notes.md section "BullMQ Setup"

Exports: syncQueue


T009: Create Worker Entry Point

File: worker/index.ts

Implementation: See technical-notes.md section "Worker Implementation"

Features:

  • Worker listens on intune-sync-queue
  • Concurrency: 1 (sequential processing)
  • Event handlers for completed, failed, error
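
The event handlers listed above, combined with the JSON logging from T012, might look roughly like the sketch below. It relies only on the fact that a BullMQ Worker is an event emitter that fires completed, failed, and error; the names attachWorkerEvents, jsonLog, and JobLike are hypothetical, and the log fields are an assumption rather than the project's logging schema.

```typescript
import { EventEmitter } from "node:events";

type LogFn = (entry: Record<string, unknown>) => void;

// Default logger: one JSON object per line on stdout.
export const jsonLog: LogFn = (entry) => {
  console.log(JSON.stringify({ ts: new Date().toISOString(), ...entry }));
};

// Minimal shape of the job fields the handlers read (assumed subset).
interface JobLike {
  id?: string;
  name?: string;
}

// Registers lifecycle handlers on any emitter that fires BullMQ-style events.
export function attachWorkerEvents(worker: EventEmitter, log: LogFn = jsonLog): void {
  worker.on("completed", (job: JobLike) => {
    log({ level: "info", event: "completed", jobId: job.id, jobName: job.name });
  });
  // The job can be undefined on "failed", so guard every access.
  worker.on("failed", (job: JobLike | undefined, err: Error) => {
    log({ level: "error", event: "failed", jobId: job?.id, jobName: job?.name, error: err.message });
  });
  // Connection-level errors; without a listener Node would throw on emit.
  worker.on("error", (err: Error) => {
    log({ level: "error", event: "error", error: err.message });
  });
}
```

Typing the parameter as EventEmitter keeps the handlers testable with a plain emitter, without Redis or a running worker.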

T015-T016: Azure AD Token Acquisition

File: worker/jobs/graphAuth.ts

Implementation: See technical-notes.md section "Authentication"

Function: getGraphAccessToken(): Promise<string>

Uses: @azure/identity ClientSecretCredential


T018-T021: Graph API Fetching

File: worker/jobs/graphFetch.ts

Functions:

  • fetchWithPagination<T>(url, token): Promise<T[]>
  • fetchAllPolicies(token): Promise<Policy[]>

Endpoints:

  • deviceManagement/deviceConfigurations
  • deviceManagement/deviceCompliancePolicies
  • deviceManagement/configurationPolicies
  • deviceManagement/intents
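
A minimal sketch of the two functions, assuming each Graph page is a JSON object with a value array and an optional @odata.nextLink. The fetchImpl parameter and the baseUrl argument are additions for illustration (the signatures above take only url and token), and the base URL is left to the caller because this document does not state which Graph API version each endpoint uses.

```typescript
interface GraphPage<T> {
  value: T[];
  "@odata.nextLink"?: string;
}

// Narrow fetch-like signature so a fake can be injected in tests (assumption).
type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Endpoint paths as listed in this task.
const POLICY_ENDPOINTS = [
  "deviceManagement/deviceConfigurations",
  "deviceManagement/deviceCompliancePolicies",
  "deviceManagement/configurationPolicies",
  "deviceManagement/intents",
];

export async function fetchWithPagination<T>(
  url: string,
  token: string,
  fetchImpl: FetchLike = fetch,
): Promise<T[]> {
  const items: T[] = [];
  let next: string | undefined = url;
  // Keep requesting until the response no longer carries a next link.
  while (next) {
    const res = await fetchImpl(next, { headers: { Authorization: `Bearer ${token}` } });
    if (!res.ok) throw new Error(`Graph request failed with status ${res.status}`);
    const page = (await res.json()) as GraphPage<T>;
    items.push(...page.value);
    next = page["@odata.nextLink"];
  }
  return items;
}

// Fetches all four endpoints concurrently and merges the results.
export async function fetchAllPolicies(
  baseUrl: string,
  token: string,
  fetchImpl: FetchLike = fetch,
): Promise<unknown[]> {
  const lists = await Promise.all(
    POLICY_ENDPOINTS.map((path) => fetchWithPagination<unknown>(`${baseUrl}/${path}`, token, fetchImpl)),
  );
  return lists.flat();
}
```

Because Promise.all rejects as soon as one endpoint fails, a single failing endpoint aborts the whole fetch; whether that is the desired behavior is a design decision this sketch does not settle.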

T024-T034: Policy Parser Implementation

File: worker/jobs/policyParser.ts

Functions:

  • detectPolicyType(odataType: string): string
  • parsePolicySettings(policy: any): FlattenedSetting[]
  • parseSettingsCatalog(policy: any): FlattenedSetting[]
  • parseOmaUri(policy: any): FlattenedSetting[]
  • extractValue(settingInstance: any): any

Reference: See technical-notes.md section "Flattening Strategy"
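
Since the flattening strategy lives in technical-notes.md, the following is only an illustrative sketch of how parseOmaUri and humanizeSettingId (T032-T033) could fit together. The FlattenedSetting fields, the OmaSetting shape, and the humanization rules (strip a device_vendor_msft_ style prefix, keep the last path segment, split camelCase) are assumptions, not the ported n8n logic.

```typescript
// Hypothetical output shape; the real FlattenedSetting type is not shown in this document.
interface FlattenedSetting {
  settingName: string;
  settingValue: string;
}

// Assumed subset of an omaSettings[] entry.
interface OmaSetting {
  displayName?: string;
  omaUri: string;
  value?: unknown;
}

// Assumed rules: keep the last "/" segment, drop a technical prefix,
// split on underscores and camelCase boundaries, capitalize each word.
export function humanizeSettingId(id: string): string {
  const last = id.split("/").filter((s) => s && s !== ".").pop() ?? id;
  return last
    .replace(/^(?:device|user)_vendor_msft_(?:policy_config_)?/i, "")
    .replace(/_/g, " ")
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .trim()
    .split(/\s+/)
    .map((w) => w.charAt(0).toUpperCase() + w.slice(1))
    .join(" ");
}

// Flattens omaSettings[] into name/value rows; values are stringified
// so string, int, and boolean entries share one column type.
export function parseOmaUri(policy: { omaSettings?: OmaSetting[] }): FlattenedSetting[] {
  return (policy.omaSettings ?? []).map((s) => ({
    settingName: s.displayName?.trim() || humanizeSettingId(s.omaUri),
    settingValue: s.value === undefined || s.value === null ? "" : String(s.value),
  }));
}
```

For example, an entry with omaUri ./Device/Vendor/MSFT/Policy/Config/Browser/AllowCookies and no displayName would be named "Allow Cookies" under these rules.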


T036-T040: Database Upsert

File: worker/jobs/dbUpsert.ts

Function: upsertPolicySettings(tenantId: string, settings: FlattenedSetting[])

Features:

  • Batch insert with Drizzle ORM
  • Conflict resolution on policy_settings_upsert_unique
  • Update lastSyncedAt timestamp

Reference: See technical-notes.md section "Database Upsert"
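
The batching half of this task can be sketched independently of Drizzle. In the example below, upsertBatch is a caller-supplied callback standing in for the actual insert with conflict resolution on policy_settings_upsert_unique; chunk and upsertInBatches are hypothetical names, and the batch size is left to the caller.

```typescript
// Splits rows into consecutive slices of at most `size` items.
export function chunk<T>(rows: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < rows.length; i += size) {
    out.push(rows.slice(i, i + size));
  }
  return out;
}

// Runs the supplied upsert once per batch, sequentially, and returns
// the number of rows handed to it. The callback is where the real
// database write (assumed: a Drizzle insert with conflict handling) goes.
export async function upsertInBatches<T>(
  rows: readonly T[],
  batchSize: number,
  upsertBatch: (batch: T[]) => Promise<void>,
): Promise<number> {
  let written = 0;
  for (const batch of chunk(rows, batchSize)) {
    await upsertBatch(batch);
    written += batch.length;
  }
  return written;
}
```

Batches run one after another rather than concurrently, which keeps the sketch consistent with the worker's concurrency: 1 setting and bounds the size of each multi-row statement.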


T043-T045: Frontend Integration

File: lib/actions/policySettings.ts

Function: triggerPolicySync(tenantId: string)

Before:

const response = await fetch(env.N8N_SYNC_WEBHOOK_URL, {
  method: 'POST',
  body: JSON.stringify({ tenantId }),
});

After:

import { syncQueue } from '@/lib/queue/syncQueue';

const job = await syncQueue.add('sync-tenant', { 
  tenantId,
  triggeredAt: new Date().toISOString(), // job data is JSON-serialized, so store an ISO string
});
return { jobId: job.id };

Success Criteria Mapping

| Task(s) | Success Criterion |
|---|---|
| T001-T008 | SC-001: Job creation <200ms |
| T041-T042 | SC-002: Sync 50 policies in <30s |
| T019-T021 | SC-003: Pagination handles 100+ policies |
| T024-T035 | SC-004: >95% setting extraction |
| T022-T023 | SC-005: Automatic retry on 429 |
| T047-T055 | SC-006: Zero n8n references |
| T061, T066 | SC-007: Worker stable 1+ hour |
| T041-T042 | SC-008: No data loss on re-sync |

Estimated Effort

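| Phase | Tasks | Hours | Priority |
|---|---|---|---|
| 1. Setup | 8 | 1-2h | P1 |
| 2. Worker Skeleton | 6 | 2h | P1 |
| 3. Graph Integration | 9 | 4h | P1 |
| 4. Transformation | 12 | 6h | P1 |
| 5. Database | 7 | 3h | P1 |
| 6. Frontend | 4 | 2h | P1 |
| 7. Cleanup | 9 | 2h | P1 |
| 8. Testing | 6 | 4h | P1 |
| 9. Deployment | 5 | 3h | P1 |
| Total | 66 | 27-28h | |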

Implementation Notes

Task Execution Order

Sequential Tasks (blocking):

  • T001 → T002-T004 → T005 (setup before queue)
  • T006-T007 → T008 (Redis before queue test)
  • T009 → T013 (worker before sync skeleton)
  • T041 → T042 (integration before test)
  • T043-T045 → T046 (implementation before E2E test)

Parallel Tasks (can be done simultaneously):

  • T002, T003, T004 (env var updates)
  • T006, T007 (Redis + Queue files)
  • T010, T011, T012 (worker start script, event handlers, logging)
  • T015-T017, T018-T021 (auth independent from fetch)
  • T027-T029, T030-T031 (different parser types)
  • T047, T048 (file deletions)
  • T049-T054 (env var removals)
  • T056, T057, T058 (unit tests)

Common Pitfalls

  1. Redis Connection: Ensure maxRetriesPerRequest: null for BullMQ compatibility
  2. Graph API: Handle 429 rate limiting with exponential backoff
  3. Pagination: Always follow @odata.nextLink until the response no longer includes one
  4. Upsert: Use correct constraint name policy_settings_upsert_unique
  5. Worker Deployment: Don't forget concurrency: 1 for sequential processing
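
Pitfall 2 can be sketched as a small helper along the lines of worker/utils/retry.ts (T022). The names HttpError, isTransient, backoffDelayMs, and withRetry, the default delays, and the rule that only 429 and 5xx responses count as transient are all assumptions. The sketch omits jitter and does not read the Retry-After header that Graph sends with throttled responses.

```typescript
// Hypothetical error type carrying the HTTP status of a failed request.
export class HttpError extends Error {
  readonly status: number;
  constructor(status: number) {
    super(`HTTP ${status}`);
    this.status = status;
  }
}

// Assumed classification: throttling (429) and server errors (5xx) are retryable.
export function isTransient(err: unknown): boolean {
  return err instanceof HttpError && (err.status === 429 || err.status >= 500);
}

// Delay doubles with each attempt (attempt 0 = baseMs) and is capped at maxMs.
export function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

export async function withRetry<T>(
  fn: () => Promise<T>,
  opts: { retries?: number; baseMs?: number; maxMs?: number; sleep?: (ms: number) => Promise<void> } = {},
): Promise<T> {
  const { retries = 5, baseMs = 1000, maxMs = 30000 } = opts;
  // sleep is injectable so tests do not have to wait in real time.
  const sleep = opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on non-transient errors or once the retry budget is spent.
      if (attempt >= retries || !isTransient(err)) throw err;
      await sleep(backoffDelayMs(attempt, baseMs, maxMs));
    }
  }
}
```

A Graph call would then be wrapped as withRetry(() => someGraphCall()), where someGraphCall is whatever function throws an HttpError on a non-2xx response.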

Testing Checkpoints

  • After T008: Redis + Queue working
  • After T014: Worker starts successfully
  • After T017: Token acquisition works
  • After T023: Graph API fetch with pagination works
  • After T035: Parser extracts >95% of settings
  • After T042: Full sync updates database
  • After T046: UI → Worker → DB flow complete
  • After T055: No n8n references remain
  • After T061: E2E test passes

Task Status: Ready for Implementation
Next Action: Start with Phase 1 (T001-T008) - Setup & Infrastructure