DeviceTrackHandler Service
Overview
The DeviceTrackHandler is an event-driven Azure Function microservice that processes user agent strings to extract device and browser information. This service enriches incoming events with detailed device tracking data by leveraging the UserStack API and implements intelligent caching to optimize performance and reduce external API calls.
Business Purpose
This service acts as a device intelligence and tracking system that:
- Processes user agent strings from incoming events to extract device/browser information
- Enriches event data with detailed device characteristics (OS, browser, device type, etc.)
- Implements intelligent caching using DynamoDB to reduce external API calls and improve performance
- Forwards enriched device tracking data to downstream systems via Event Hub
- Provides device analytics capabilities for the Publisher platform
Architecture
Service Type
- Platform: Azure Functions, deployed as a containerized microservice on Kubernetes
- Runtime: Node.js
- Trigger: Event Hub (routerresult)
- Pattern: Event-Driven Processing with Caching
Key Components
graph TD
A[Event Hub: routerresult] --> B[DeviceTrackHandler]
B --> C[Handler.js]
C --> D{User Agent Available?}
D -->|Yes| E[Generate MD5 Hash]
D -->|No| F[Log Error & Skip]
E --> G[Check DynamoDB Cache]
G --> H{Cache Hit?}
H -->|Yes| I[Use Cached Data]
H -->|No| J[Call UserStack API]
J --> K[Store in Cache]
K --> L[Enrich Event Data]
I --> L
L --> M[Add Session Info]
M --> N[Event Hub: device_track]
O[DynamoDB: UACache] --> G
K --> O
P[UserStack API] --> J
F --> Q[Failed Events]
R[Processing Errors] --> Q
Data Flow
Input Processing
- Event Reception: Receives events from the routerresult Event Hub
- User Agent Extraction: Extracts the user agent string from each incoming event
- Cache Lookup: Checks the DynamoDB cache using the MD5 hash of the user agent
- API Call: Calls the UserStack API on a cache miss
- Data Enrichment: Adds device information to the event data
- Output: Sends enriched events to the device_track Event Hub
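The lookup steps above can be sketched roughly as follows. This is a minimal illustration, not the service's actual Handler.js: the function names (`cacheKey`, `resolveDeviceInfo`, `createMemoryCache`) are hypothetical, the cache is an in-memory stand-in for the UACache table, and the UserStack call is passed in as a function. Node's built-in `crypto` module is used for the MD5 hash so that the example has no external dependencies, whereas the service lists the `md5` package.

```javascript
const crypto = require('node:crypto');

// Cache key: MD5 hex digest of the raw user agent string.
function cacheKey(userAgent) {
  return crypto.createHash('md5').update(userAgent, 'utf8').digest('hex');
}

// Hypothetical helper: resolves device info for one event.
// `cache` must expose async get(key) / set(key, value);
// `lookupUserAgent` stands in for the UserStack API call.
async function resolveDeviceInfo(event, cache, lookupUserAgent) {
  const userAgent = event.useragent;
  if (!userAgent) {
    throw new Error('Missing user agent');
  }

  const key = cacheKey(userAgent);
  const cached = await cache.get(key);
  if (cached !== undefined) {
    return { uaResponse: cached, cacheHit: true };
  }

  // Cache miss: call the external API, then store the result for next time.
  const uaResponse = await lookupUserAgent(userAgent);
  await cache.set(key, uaResponse);
  return { uaResponse, cacheHit: false };
}

// In-memory stand-in for the DynamoDB cache (assumption, illustration only).
function createMemoryCache() {
  const store = new Map();
  return {
    get: async (key) => store.get(key),
    set: async (key, value) => { store.set(key, value); },
  };
}
```

Passing the cache and the API lookup in as arguments keeps the flow testable without AWS or UserStack credentials.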
Event Structure
Input Event Format
{
"wizsid": "session-id",
"useragent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36",
"ts": "2023-01-01T00:00:00Z",
"resp": {
"campaign": {
"id": "campaign-id"
}
}
}
Output Event Format
{
"wizsid": "session-id",
"campaign": "campaign-id",
"recordid": "generated-32-char-id",
"ts": "2023-01-01T00:00:00Z",
"sessionTimestamp": "2023-01-01T00:00:00Z",
"ua": "user-agent-string",
"uaResponse": {
"browser": "Chrome",
"version": "91.0.4472.124",
"os": "Windows 10",
"device": "Desktop"
}
}
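As a rough illustration of how the input format maps onto the output format, the sketch below builds an output event from an input event and a device lookup result. The function name `buildDeviceTrackEvent` is hypothetical. Two details are assumptions rather than documented behavior: the record ID is generated here with Node's `crypto` module (the service lists the `idgen` package), and `sessionTimestamp` is copied from `ts` because the two match in the examples above.

```javascript
const crypto = require('node:crypto');

// Hypothetical mapper from an input event to the device_track output shape.
function buildDeviceTrackEvent(input, uaResponse, recordid) {
  return {
    wizsid: input.wizsid,
    // Campaign ID is flattened from resp.campaign.id; undefined if absent.
    campaign: input.resp?.campaign?.id,
    // Assumption: 16 random bytes rendered as 32 hex characters.
    recordid: recordid ?? crypto.randomBytes(16).toString('hex'),
    ts: input.ts,
    // Assumption: session timestamp mirrors the event timestamp.
    sessionTimestamp: input.ts,
    ua: input.useragent,
    uaResponse,
  };
}
```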
Core Functionality
Device Intelligence Processing
- User Agent Analysis: Processes user agent strings to extract device information
- Intelligent Caching: Uses MD5 hashing and DynamoDB for efficient caching
- External API Integration: Integrates with UserStack API for device detection
- Batch Processing: Processes multiple events concurrently with configurable limits
- Error Handling: Comprehensive error handling with failed event tracking
Key Features
- Performance Optimization: 30-day TTL caching reduces API calls by ~90%
- Concurrent Processing: Processes up to 5 events simultaneously
- Fault Tolerance: Failed events are tracked and can be reprocessed
- Timeout Protection: External API calls are bounded by a configurable timeout (60 seconds)
- Data Enrichment: Adds comprehensive device metadata to events
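The concurrency limit can be illustrated with a small promise-based helper. This is a sketch only: the service lists the `async` package, which provides `mapLimit` for this purpose, but whether Handler.js uses it is not stated here. The helper name `mapWithLimit` is hypothetical.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results keep the order of `items`. If any worker rejects, the returned
// promise rejects with that error.
async function mapWithLimit(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  async function runLane() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With a limit of 5, a batch of 10 events would be processed in overlapping groups, with a new event starting as soon as any of the 5 in-flight events finishes.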
Dependencies
External Services
- UserStack API: Third-party service for user agent parsing and device detection
- DynamoDB: AWS NoSQL database for caching user agent responses
- Event Hub: Azure Event Hub for input/output event streaming
Key NPM Packages
- aws-sdk: AWS SDK for DynamoDB operations
- dynogels: DynamoDB ORM for Node.js
- axios: HTTP client for UserStack API calls
- async: Asynchronous flow control
- md5: MD5 hashing for cache keys
- idgen: Unique ID generation
Configuration
Environment-Specific Settings
- Development: Basic logging and development API endpoints
- Integration: Integration testing with staging APIs
- Production: Production APIs with enhanced monitoring
Key Configuration Elements
- UserStack API endpoint and access key
- DynamoDB table configuration (UACache)
- Event Hub connection strings
- Cache TTL settings (30 days default)
- Concurrent processing limits
- API timeout settings (60 seconds)
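To make the numeric settings concrete, the sketch below collects them in one object and shows one way to enforce the API timeout around any promise. The key names in `settings` and the helper `withTimeout` are hypothetical and do not reflect the actual files under config/. axios also accepts a `timeout` option in its request config; which mechanism the service uses is not stated here.

```javascript
// Hypothetical settings object mirroring the documented defaults.
const settings = {
  cacheTtlDays: 30,           // cache TTL (30 days default)
  concurrency: 5,             // concurrent processing limit
  apiTimeoutMs: 60 * 1000,    // API timeout (60 seconds)
};

// Rejects if `promise` has not settled within `ms` milliseconds.
// Note: this does not cancel the underlying work; it only stops waiting.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Clear the timer either way so it does not keep the process alive.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

A call would look like `withTimeout(lookupUserAgent(ua), settings.apiTimeoutMs, 'UserStack lookup')`, where `lookupUserAgent` is a placeholder for the real API call.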
Event Hub Integration
Input Source
- Event Hub: routerresult
- Consumer Group: devicetrackhandler
- Batch Size: 10 events
- Poll Interval: 5 seconds
Output Destination
- Event Hub: device_track
- Message Format: Enriched device tracking events
Caching Strategy
Cache Implementation
- Storage: DynamoDB table UACache
- Key Strategy: MD5 hash of user agent string
- TTL: 30 days (configurable)
- Schema: Flexible schema allowing unknown properties
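A cache item with a 30-day expiry might be built along these lines. The attribute names (`uaHash`, `response`, `expiresAt`) are assumptions for illustration and are not taken from the actual UACache schema. DynamoDB's TTL feature expects the expiry as a Unix timestamp in seconds, which is what the sketch computes.

```javascript
const crypto = require('node:crypto');

const SECONDS_PER_DAY = 24 * 60 * 60;

// Builds a cache item keyed by the MD5 hash of the user agent.
// `nowMs` is injectable so the expiry calculation can be tested.
function buildCacheItem(userAgent, uaResponse, ttlDays = 30, nowMs = Date.now()) {
  return {
    uaHash: crypto.createHash('md5').update(userAgent, 'utf8').digest('hex'),
    response: uaResponse,
    // Expiry as epoch seconds: now + ttlDays.
    expiresAt: Math.floor(nowMs / 1000) + ttlDays * SECONDS_PER_DAY,
  };
}

// Read-side check: treat an item as a miss once its expiry has passed.
function isExpired(item, nowMs = Date.now()) {
  return Math.floor(nowMs / 1000) >= item.expiresAt;
}
```

The read-side check is worth keeping even with TTL enabled on the table, because DynamoDB removes expired items in the background and they can remain readable for some time after they expire.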
Cache Benefits
- Reduces external API calls by ~90%
- Improves per-event lookup time from ~500ms (UserStack API call) to ~50ms (cache hit)
- Reduces costs associated with UserStack API usage
- Provides resilience against API outages
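The combined effect of these figures can be estimated with a weighted average. The sketch below is a back-of-the-envelope calculation using the numbers quoted in this document (~90% hit rate, ~50ms per hit, ~500ms per miss), not a measurement. The function names are hypothetical.

```javascript
// Expected per-event lookup latency given a cache hit rate.
function expectedLatencyMs(hitRate, hitMs, missMs) {
  return hitRate * hitMs + (1 - hitRate) * missMs;
}

// Share of events that still reach the external API, as calls per second.
function apiCallsPerSecond(eventsPerSecond, hitRate) {
  return eventsPerSecond * (1 - hitRate);
}

// With the documented figures: 0.9 * 50 + 0.1 * 500, roughly 95 ms on average.
const averageMs = expectedLatencyMs(0.9, 50, 500);

// At ~100 events per second, roughly 10 lookups per second reach UserStack.
const apiRate = apiCallsPerSecond(100, 0.9);

console.log(averageMs.toFixed(1), apiRate.toFixed(1));
```

The average is dominated by the misses: a drop in hit rate from 90% to 80% would raise the estimate from about 95ms to about 140ms and double the API call volume.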
Performance Characteristics
Processing Metrics
- Throughput: ~100 events per second
- Latency: 50ms (cache hit) / 500ms (cache miss)
- Concurrency: Up to 5 events processed concurrently
- Cache Hit Rate: ~90% in production
Resource Usage
- Memory: ~128MB average
- CPU: Low utilization due to I/O bound operations
- Network: Moderate for API calls and Event Hub communication
Monitoring and Observability
Logging
- Structured logging with configurable levels
- Performance metrics for API calls
- Cache hit/miss statistics
- Failed event tracking
Metrics
- Processing throughput and latency
- Cache performance statistics
- API call success/failure rates
- Error rates and types
Application Insights Integration
- Custom telemetry for device tracking metrics
- Performance counters
- Exception tracking
- Dependency tracking for external services
Error Handling
Error Scenarios
- Missing User Agent: Events without user agent strings are logged and skipped
- API Failures: UserStack API failures are logged, events marked as failed
- Cache Errors: DynamoDB errors are logged but don't stop processing
- Processing Errors: Individual event failures don't affect batch processing
Failed Event Management
- Failed events are collected and returned
- Can be reprocessed through retry mechanisms
- Detailed error information is logged for debugging
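One way to keep individual failures from affecting the rest of a batch, while still collecting the failed events for later retry, is sketched below. The function name `processBatch` and the shape of the returned object are assumptions, not the service's actual interface. For brevity the sketch starts all events at once and does not apply the concurrency limit.

```javascript
// Processes every event, separating successes from failures.
// `processEvent` is a placeholder for the per-event enrichment logic.
async function processBatch(events, processEvent) {
  const outcomes = await Promise.allSettled(
    // Wrapping in a promise chain turns synchronous throws into rejections.
    events.map((event) => Promise.resolve().then(() => processEvent(event)))
  );

  const enriched = [];
  const failed = [];
  outcomes.forEach((outcome, index) => {
    if (outcome.status === 'fulfilled') {
      enriched.push(outcome.value);
    } else {
      const reason = outcome.reason;
      failed.push({
        event: events[index],
        error: reason instanceof Error ? reason.message : String(reason),
      });
    }
  });
  return { enriched, failed };
}
```

The `failed` array pairs each original event with its error message, which is the kind of information a retry mechanism or a debugging session would need.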
Security Considerations
- API Keys: UserStack API keys stored securely in configuration
- Data Privacy: Cache keys are MD5 hashes of user agent strings rather than the raw strings
- Access Control: DynamoDB access controlled via IAM roles
- Network Security: All API calls use HTTPS
Related Services
This service integrates with the broader Publisher ecosystem:
- RouterV2: Provides input events via routerresult Event Hub
- Analytics Services: Consume enriched device data from device_track Event Hub
- SelfHealer: Handles failed events for retry processing
Troubleshooting
Common Issues
- High API Costs: Check cache hit rates and TTL settings
- Processing Delays: Monitor UserStack API response times
- Failed Events: Review user agent format and API connectivity
- Cache Issues: Verify DynamoDB table configuration and permissions
Debug Steps
- Check Application Insights for processing metrics
- Verify UserStack API connectivity and quotas
- Review DynamoDB cache performance
- Monitor Event Hub message flow
Development
Local Development Setup
- Clone repository
- Install dependencies: npm install
- Configure AWS credentials for DynamoDB
- Set UserStack API key in configuration
- Configure Event Hub connection strings
- Run tests: npm test
Code Structure
- src/Handler.js: Main event processing logic
- src/Helper.js: UserStack API integration
- src/Dynamo.js: DynamoDB cache operations
- config/: Environment-specific configurations