the_order/docs/operations/ENTRA_VERIFIEDID_RUNBOOK.md
defiQUG 92cc41d26d Add Legal Office seal and complete Azure CDN deployment
2025-11-12 22:03:42 -08:00


Entra VerifiedID Operational Runbook

This runbook provides operational procedures for managing the Entra VerifiedID integration.

Table of Contents

  1. Daily Operations
  2. Monitoring
  3. Troubleshooting
  4. Common Operations
  5. Emergency Procedures
  6. Diagnostic Commands
  7. Support Escalation
  8. Contact Information

Daily Operations

Health Checks

Check Service Health

curl https://api.theorder.org/health

Check Entra Client Status

# Check logs for Entra client initialization
kubectl logs -n the-order-prod deployment/identity-service | grep -i entra

Verify Metrics Collection

curl https://api.theorder.org/metrics | grep entra
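
The two HTTP checks above can be combined into one pass that fails loudly. The following is a minimal bash sketch, not part of the service tooling: the function name `check_entra_health` is hypothetical, and it assumes that `/health` answers with a 2xx status when healthy and that Entra metrics are exposed with an `entra_` prefix.

```shell
# Sketch (bash): run the HTTP health checks in one pass.
# Assumption: /health returns a 2xx status when the service is healthy.
API_BASE="${API_BASE:-https://api.theorder.org}"

check_entra_health() {
  # -f makes curl exit non-zero on HTTP 4xx/5xx; -sS is quiet but still prints errors.
  if ! curl -fsS --max-time 10 "$API_BASE/health" > /dev/null; then
    echo "FAIL: health endpoint"
    return 1
  fi

  # Count exposed Entra metric lines, ignoring '# HELP' and '# TYPE' comments.
  local count
  count="$(curl -fsS --max-time 10 "$API_BASE/metrics" | grep -c '^entra_')"
  if [ "${count:-0}" -eq 0 ]; then
    echo "FAIL: no entra_ metrics exposed"
    return 1
  fi

  echo "OK: health endpoint up, $count entra_ metric lines"
}
```

Calling `check_entra_health` from a cron job or a shift-start checklist gives a single exit code to act on.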

Key Metrics to Monitor

  1. Issuance Success Rate: Should be >95%

    sum(rate(entra_credentials_issued_total{status="success"}[5m])) /
    sum(rate(entra_credentials_issued_total[5m]))
    
  2. API Latency: p95 should be <5 seconds

    histogram_quantile(0.95, sum by (le) (rate(entra_api_request_duration_seconds_bucket{operation="issueCredential"}[5m])))
    
  3. Error Rate: Should be <5%

    sum(rate(entra_api_errors_total[5m])) / sum(rate(entra_api_requests_total[5m]))
    
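  4. Webhook Processing: The received rate should be non-zero whenever credentials are being issued; a drop to zero during active issuance means webhooks are not arriving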

    rate(entra_webhooks_received_total[5m])
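
When Prometheus is unavailable, the issuance success ratio can be estimated straight from the `/metrics` output. This is a rough bash sketch under stated assumptions: the endpoint serves the Prometheus text exposition format without trailing timestamps, and the counter carries a `status` label as in the query above. The function name `issuance_success_pct` is hypothetical. The counters are cumulative since process start, so the result is a lifetime ratio, not the 5-minute rate.

```shell
# Sketch (bash + awk): cumulative issuance success percentage
# from Prometheus text-format metrics read on stdin.
issuance_success_pct() {
  awk '
    /^entra_credentials_issued_total[{ ]/ {
      value = $NF                        # sample value is the last field (no timestamps assumed)
      total += value
      if ($0 ~ /status="success"/) ok += value
    }
    END {
      if (total == 0) { print "no samples"; exit 1 }
      printf "%.2f\n", 100 * ok / total
    }
  '
}
```

Usage: `curl -s https://api.theorder.org/metrics | issuance_success_pct`.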
    

Monitoring

Grafana Dashboard

Access the Entra VerifiedID dashboard at: https://grafana.theorder.org/d/entra-verifiedid

Key Panels:

  • Issuance Success Rate (gauge)
  • API Request Rate (graph)
  • Error Rate by Operation (graph)
  • Issuance Duration (histogram)
  • Webhook Events (graph)
  • Active Requests (gauge)

Alerts

Critical Alerts:

  • EntraIssuanceErrorRateHigh: Error rate >10%
  • EntraIssuanceLatencyHigh: p95 latency >10 seconds
  • EntraWebhookProcessingFailed: Webhook processing failures
  • EntraAPIDown: No successful API requests in 5 minutes

Warning Alerts:

  • EntraIssuanceErrorRateWarning: Error rate >5%
  • EntraIssuanceLatencyWarning: p95 latency >5 seconds
  • EntraRateLimitApproaching: Rate limit usage >80%

Troubleshooting

Issue: Credential Issuance Failing

Symptoms:

  • High error rate in metrics
  • 500 errors in logs
  • No credentials being issued

Diagnosis:

# Check recent errors
kubectl logs -n the-order-prod deployment/identity-service --tail=100 | grep -i error

# Check Entra API connectivity
curl -X POST https://verifiedid.did.msidentity.com/v1.0/<tenant-id>/verifiableCredentials/createIssuanceRequest \
  -H "Authorization: Bearer <token>"

# Inspect the Entra credentials secret (values are base64-encoded)
kubectl get secret -n the-order-prod entra-credentials -o yaml

Solutions:

  1. Verify Entra credentials are correct
  2. Check API permissions are granted
  3. Verify credential manifest exists
  4. Check network connectivity to Entra API
  5. Review Entra service status in Azure Portal
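
For the first solution, it is often enough to confirm that the expected keys exist in the secret without printing their values. The sketch below assumes bash and a `base64` that accepts `-d`. The function name `check_secret_keys` is hypothetical, and any key names passed to it beyond `ENTRA_CLIENT_SECRET` are assumptions about how the secret is laid out.

```shell
# Sketch (bash): report which keys are present in a Kubernetes secret
# without ever printing the decoded values.
check_secret_keys() {
  local namespace="$1" secret="$2" key value missing=0
  shift 2
  for key in "$@"; do
    value="$(kubectl get secret -n "$namespace" "$secret" -o "jsonpath={.data.$key}" 2>/dev/null)"
    if [ -z "$value" ]; then
      echo "MISSING: $key"
      missing=1
    else
      # Show only the length of the decoded value.
      echo "present: $key ($(printf '%s' "$value" | base64 -d | wc -c | tr -d ' ') bytes)"
    fi
  done
  return "$missing"
}
```

Usage: `check_secret_keys the-order-prod entra-credentials ENTRA_CLIENT_SECRET`. A non-zero exit status means at least one key is missing or empty.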

Issue: Webhooks Not Received

Symptoms:

  • No webhook events in metrics
  • Credentials stuck in "pending" status
  • Database not updated

Diagnosis:

# Check webhook endpoint
curl -X POST https://api.theorder.org/vc/entra/webhook \
  -H "Content-Type: application/json" \
  -d '{"requestId":"test","requestStatus":"issuance_successful"}'

# Check webhook logs
kubectl logs -n the-order-prod deployment/identity-service | grep webhook

# Verify webhook URL in Entra
# Go to Azure Portal → Verified ID → Settings → Webhooks

Solutions:

  1. Verify webhook URL is configured in Entra VerifiedID
  2. Check webhook endpoint is accessible (firewall, ingress rules)
  3. Verify webhook payload format matches expected schema
  4. Check database connectivity
  5. Review webhook processing logs
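
To tell a reachability problem apart from a processing problem, the test POST can be reduced to its HTTP status code. This is a minimal bash sketch; the function name `probe_webhook` is hypothetical, and the interpretation of each status class is an assumption about how the ingress and service behave.

```shell
# Sketch (bash): send the test payload and classify the HTTP status code.
probe_webhook() {
  local url="${1:-https://api.theorder.org/vc/entra/webhook}" code
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    -X POST "$url" \
    -H 'Content-Type: application/json' \
    -d '{"requestId":"test","requestStatus":"issuance_successful"}')"
  case "$code" in
    2??) echo "reachable: HTTP $code" ;;
    # curl reports 000 when no HTTP response was received at all.
    000) echo "unreachable: no HTTP response (DNS, firewall, or TLS problem)"; return 1 ;;
    *)   echo "responded with HTTP $code: check ingress rules, auth, and payload schema"; return 1 ;;
  esac
}
```

Running it from both inside and outside the cluster helps separate ingress issues from service issues.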

Issue: High Latency

Symptoms:

  • Slow credential issuance (>10 seconds)
  • High p95/p99 latency metrics
  • Timeout errors

Diagnosis:

# Check API request duration
kubectl logs -n the-order-prod deployment/identity-service | grep "duration"

# Check network latency to Entra (ICMP may be filtered, so a failed ping alone does not prove an outage)
ping -c 5 verifiedid.did.msidentity.com

# Check retry attempts
kubectl logs -n the-order-prod deployment/identity-service | grep retry

Solutions:

  1. Check network connectivity and latency
  2. Verify Entra API is not experiencing issues
  3. Review retry configuration (may be retrying too many times)
  4. Check if rate limiting is causing delays
  5. Consider increasing timeout values
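
For the first solution, a per-phase timing breakdown over HTTPS is usually more informative than ICMP, because it separates DNS, TCP connect, TLS handshake, and server response time. The sketch below uses curl's `--write-out` timing variables; the function name `time_entra_request` is hypothetical.

```shell
# Sketch (bash): print a timing breakdown for one HTTPS request.
time_entra_request() {
  local url="${1:-https://verifiedid.did.msidentity.com/v1.0/}"
  curl -s -o /dev/null --max-time 15 \
    -w 'dns=%{time_namelookup}s connect=%{time_connect}s tls=%{time_appconnect}s first_byte=%{time_starttransfer}s total=%{time_total}s\n' \
    "$url"
}
```

All values are cumulative from the start of the request. A large gap between `tls` and `first_byte` points at the remote API, while a slow `dns` or `connect` points at the local network path.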

Issue: Rate Limit Errors

Symptoms:

  • 429 errors in logs
  • Rate limit metrics showing violations
  • Requests being rejected

Diagnosis:

# Check rate limit violations
kubectl logs -n the-order-prod deployment/identity-service | grep "429"

# Check current rate limit settings
kubectl get configmap -n the-order-prod identity-service-config -o yaml | grep ENTRA_RATE_LIMIT

Solutions:

  1. Review current rate limit configuration
  2. Check Entra API quota limits
  3. Adjust rate limits if needed
  4. Implement request queuing if necessary
  5. Contact Entra support if quota needs increase
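
For manual or scripted requests made during an incident, a simple client-side backoff avoids adding to the rate limit pressure. This sketch is a swapped-in illustration of exponential backoff, not the service's own queuing logic. The function name `request_with_backoff` and the `MAX_ATTEMPTS` variable are hypothetical.

```shell
# Sketch (bash): retry a curl request with exponential backoff while it returns HTTP 429.
# All arguments are passed through to curl; the final status code is printed on stdout.
request_with_backoff() {
  local max_attempts="${MAX_ATTEMPTS:-5}" delay=1 attempt code
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    code="$(curl -s -o /dev/null -w '%{http_code}' "$@")"
    if [ "$code" != "429" ]; then
      echo "$code"
      return 0
    fi
    echo "attempt $attempt: HTTP 429, sleeping ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))   # 1s, 2s, 4s, ...
  done
  echo "429"
  return 1
}
```

Usage: `request_with_backoff -X POST https://api.theorder.org/vc/issue/entra -H "Authorization: Bearer <token>" ...` with the same headers and body as the manual issuance request.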

Issue: Token Refresh Failures

Symptoms:

  • "Failed to get access token" errors
  • Authentication failures
  • 401 errors

Diagnosis:

# Check token refresh logs
kubectl logs -n the-order-prod deployment/identity-service | grep "token"

# Verify the client secret (prints it in clear text; avoid shared terminals and recorded sessions)
kubectl get secret -n the-order-prod entra-credentials -o jsonpath='{.data.ENTRA_CLIENT_SECRET}' | base64 -d

Solutions:

  1. Verify client secret is correct and not expired
  2. Check API permissions are granted
  3. Verify tenant ID and client ID are correct
  4. Check if client secret needs rotation
  5. Review the app registration status in Microsoft Entra ID (formerly Azure AD)

Common Operations

Issue a Credential Manually

curl -X POST https://api.theorder.org/vc/issue/entra \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <token>" \
  -d '{
    "claims": {
      "email": "user@example.com",
      "name": "John Doe",
      "role": "member"
    },
    "manifestName": "default"
  }'

Check Credential Status

curl https://api.theorder.org/vc/entra/status/<requestId> \
  -H "Authorization: Bearer <token>"
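
After a manual issuance, the status endpoint can be polled until the request leaves the pending state. The response schema is not documented here, so this bash sketch only assumes that the body contains the word `pending` while the request is outstanding. The function name `wait_for_issuance` and its default retry values are hypothetical.

```shell
# Sketch (bash): poll the status endpoint until the body no longer mentions "pending".
# Usage: wait_for_issuance <requestId> <token> [tries] [interval-seconds]
wait_for_issuance() {
  local request_id="$1" token="$2" tries="${3:-10}" interval="${4:-3}" body i
  for ((i = 1; i <= tries; i++)); do
    body="$(curl -fsS "https://api.theorder.org/vc/entra/status/$request_id" \
      -H "Authorization: Bearer $token")" || { echo "status request failed" >&2; return 2; }
    case "$body" in
      *pending*) sleep "$interval" ;;             # still outstanding, try again
      *) printf '%s\n' "$body"; return 0 ;;       # final state reached, print it
    esac
  done
  echo "still pending after $tries checks" >&2
  return 1
}
```

An exit status of 1 after all attempts matches the "stuck in pending" symptom described under Webhooks Not Received.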

Verify a Credential

curl -X POST https://api.theorder.org/vc/verify/entra \
  -H "Content-Type: application/json" \
  -d '{
    "credential": {
      "id": "vc:123",
      "type": ["VerifiableCredential"],
      "issuer": "did:web:...",
      "credentialSubject": {...},
      "proof": {...}
    }
  }'

View Recent Issuances

# Query database (single quotes so $DATABASE_URL expands inside the pod, not in the local shell)
kubectl exec -n the-order-prod deployment/identity-service -- \
  sh -c 'psql "$DATABASE_URL" -c "SELECT * FROM verifiable_credentials ORDER BY created_at DESC LIMIT 10;"'

Check Metrics

# Get all Entra metrics
curl https://api.theorder.org/metrics | grep entra_

# Get specific metric
curl https://api.theorder.org/metrics | grep entra_credentials_issued_total

Rotate Client Secret

  1. Create new client secret in Azure Portal
  2. Update secret in Key Vault:
    az keyvault secret set --vault-name <keyvault> --name "entra-client-secret" --value "<new-secret>"
    
  3. Restart identity service to pick up new secret
  4. Verify service starts correctly
  5. Test credential issuance
  6. Delete old secret after verification
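
Steps 3 and 4 can be scripted with standard `kubectl rollout` commands. This is a sketch assuming the identity service reads the secret at startup and that the Key Vault value is already synced into the pod environment; how that sync happens is not covered here. The function name `restart_identity_service` and the 180-second timeout are hypothetical choices.

```shell
# Sketch (bash): restart the deployment, wait for it to become ready,
# then show Entra-related log lines from the new pods.
restart_identity_service() {
  local namespace="${1:-the-order-prod}" deployment="${2:-identity-service}"
  kubectl rollout restart "deployment/$deployment" -n "$namespace" || return 1
  # Block until the new pods are ready, or give up after the timeout.
  kubectl rollout status "deployment/$deployment" -n "$namespace" --timeout=180s || return 1
  # Look for Entra client initialization messages.
  kubectl logs -n "$namespace" "deployment/$deployment" --tail=200 | grep -i entra
}
```

If the final `grep` finds nothing, the function returns non-zero, which is a prompt to inspect the full logs before deleting the old secret.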

Add New Credential Manifest

  1. Create manifest in Azure Portal → Verified ID
  2. Note the Manifest ID
  3. Update ENTRA_MANIFESTS environment variable:
    ENTRA_MANIFESTS='{"default":"id1","new-manifest":"new-id"}'
    
  4. Restart identity service
  5. Test issuance with new manifest:
    curl -X POST .../vc/issue/entra -d '{"claims": {...}, "manifestName": "new-manifest"}'
    

Emergency Procedures

Disable Entra Integration

If critical issues occur:

  1. Scale down identity service (if using separate deployment):

    kubectl scale deployment identity-service -n the-order-prod --replicas=0
    
  2. Or disable Entra routes by setting:

    ENTRA_TENANT_ID=""
    
  3. Verify routes are disabled:

    curl https://api.theorder.org/vc/issue/entra
    # Should return 503 or route not found
    
  4. Monitor for stability
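
The verification in step 3 can be turned into a check with an exit code. This sketch assumes that "route not found" surfaces as HTTP 404; an ingress in front of a scaled-down deployment may answer with a different 5xx code, in which case the pattern needs adjusting. The function name `verify_entra_disabled` is hypothetical.

```shell
# Sketch (bash): confirm the issuance route no longer serves requests.
verify_entra_disabled() {
  local code
  code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 \
    https://api.theorder.org/vc/issue/entra)"
  case "$code" in
    503|404) echo "disabled: HTTP $code" ;;
    *) echo "still enabled or unexpected response: HTTP $code"; return 1 ;;
  esac
}
```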

Rollback Deployment

  1. Identify previous working version
  2. Rollback deployment:
    kubectl rollout undo deployment/identity-service -n the-order-prod
    
  3. Verify rollback:
    kubectl rollout status deployment/identity-service -n the-order-prod
    
  4. Test critical functionality
  5. Monitor metrics
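
Step 1 can be supported by the deployment's revision history, and the rollback can target a specific revision instead of only the previous one. This is a minimal bash sketch using standard `kubectl rollout` subcommands. The function name `rollback_identity_service` and the timeout value are hypothetical.

```shell
# Sketch (bash): show revision history, roll back, and wait for the result.
# Usage: rollback_identity_service [revision]
rollback_identity_service() {
  local revision="${1:-}" namespace="the-order-prod" target="deployment/identity-service"
  # List recorded revisions so the last known-good one can be identified.
  kubectl rollout history "$target" -n "$namespace" || return 1
  if [ -n "$revision" ]; then
    kubectl rollout undo "$target" -n "$namespace" --to-revision="$revision" || return 1
  else
    # Without a revision, kubectl rolls back to the previous one.
    kubectl rollout undo "$target" -n "$namespace" || return 1
  fi
  kubectl rollout status "$target" -n "$namespace" --timeout=180s
}
```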

Emergency Credential Issuance

If automated issuance fails, use the manual process:

  1. Access Entra VerifiedID portal directly
  2. Issue credential manually
  3. Export credential data
  4. Import into database if needed
  5. Notify affected users

Diagnostic Commands

Check Service Status

kubectl get pods -n the-order-prod -l app=identity-service
kubectl describe pod <pod-name> -n the-order-prod

View Logs

# Recent logs
kubectl logs -n the-order-prod deployment/identity-service --tail=100

# Follow logs
kubectl logs -n the-order-prod deployment/identity-service -f

# Logs with grep
kubectl logs -n the-order-prod deployment/identity-service | grep -i entra

Check Configuration

# Environment variables
kubectl exec -n the-order-prod deployment/identity-service -- env | grep ENTRA

# ConfigMap
kubectl get configmap -n the-order-prod identity-service-config -o yaml

# Secrets (base64 encoded)
kubectl get secret -n the-order-prod entra-credentials -o yaml

Test Connectivity

# Test Entra API
curl -v https://verifiedid.did.msidentity.com/v1.0/

# Test webhook endpoint
curl -X POST https://api.theorder.org/vc/entra/webhook \
  -H "Content-Type: application/json" \
  -d '{"requestId":"test","requestStatus":"issuance_successful"}'

Support Escalation

  1. Level 1: Check logs, metrics, and run diagnostic commands
  2. Level 2: Review configuration and test connectivity
  3. Level 3: Contact Azure support for Entra VerifiedID issues
  4. Level 4: Escalate to engineering team for code issues

Contact Information


Last Updated: [Current Date]
Version: 1.0