Purpose-built smart HVAC controls for light commercial, industrial, and retail spaces. I own the full ML stack: research, training, production deployment, and the data infrastructure that feeds it — across a fleet of hundreds of devices at 2-minute resolution.
Trained a cross-attention diffusion model for 6-hour indoor temperature trajectory forecasting, conditioned on 31-dimensional physics embeddings derived from per-device 2R2C thermal circuit parameters. Achieved 0.44°F median MAE — a 3–4× improvement over baseline — with calibrated ensemble uncertainty bounds via DDIM sampling.
During evaluation I discovered the model had learned a shortcut: in real thermostat data, HVAC call duration perfectly predicts outcome (bang-bang control makes them deterministically correlated), so the SVD of the HVAC response surface was rank 1 and the Spearman ρ between call rate and duration was −1.000. The model was acing every metric while learning the wrong thing. I fixed this with physics-informed counterfactual data augmentation — generating synthetic training samples in which HVAC schedules are decoupled from thermostat logic, breaking the correlation without changing the underlying physics — reducing ρ to −0.4 while preserving MAE on real data. The model now learns that heating rate is a property of the building, not the call.
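The decorrelation effect can be sketched in a few lines of numpy. This is an illustration, not the production pipeline; the `10 / call_rate` relation is an invented stand-in for the bang-bang coupling:

```python
import numpy as np

def spearman(x, y):
    # Spearman rho = Pearson correlation of the ranks.
    rx = np.argsort(np.argsort(x))
    ry = np.argsort(np.argsort(y))
    return float(np.corrcoef(rx, ry)[0, 1])

rng = np.random.default_rng(0)

# Real thermostat data: bang-bang control makes call duration a
# deterministic, monotone function of call rate.
call_rate = rng.uniform(0.1, 0.9, size=200)
duration_real = 10.0 / call_rate            # perfectly anti-correlated

# Counterfactual augmentation: schedules drawn independently of the
# thermostat logic, so duration no longer encodes the outcome.
duration_synth = rng.uniform(1.0, 100.0, size=200)

rho_real = spearman(call_rate, duration_real)    # ≈ −1.0
rho_synth = spearman(call_rate, duration_synth)  # ≈ 0
```

Any monotone coupling produces ρ = −1 regardless of its functional form, which is why the shortcut survived every pointwise accuracy metric.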
Deployed as a self-contained 2.7MB production inference service (model weights embedded, no external dependencies) serving both forecasting and anomaly detection use cases.
Estimated thermal circuit parameters per device — envelope resistance, internal resistance, air capacitance, thermal mass, two-stage HVAC capacity and lag, solar gain, internal gain, infiltration coefficient — using bootstrap uncertainty quantification. Ablation studies identified the 21-parameter subset that provides maximum discriminating power across the fleet.
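A minimal sketch of the bootstrap step, with an invented single-parameter decay fit standing in for the full 2R2C estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic indoor-temperature decay toward ambient: a stand-in for
# estimating one thermal parameter (an envelope time constant tau).
t = np.linspace(0, 6, 180)                  # hours, 2-minute resolution
tau_true = 2.5
temp = 70 * np.exp(-t / tau_true) + rng.normal(0, 0.5, t.size)

def fit_tau(t, y):
    # Log-linear least squares on the decay (assumes y > 0).
    slope, _ = np.polyfit(t, np.log(np.clip(y, 1e-3, None)), 1)
    return -1.0 / slope

# Bootstrap: refit on resampled points to turn one point estimate
# into a parameter distribution with confidence bounds.
boots = []
for _ in range(500):
    idx = rng.integers(0, t.size, t.size)
    boots.append(fit_tau(t[idx], temp[idx]))
lo, hi = np.percentile(boots, [2.5, 97.5])
```

The same resample-and-refit loop applies unchanged to the full parameter vector; the width of each parameter's interval is what separates estimation noise from real variation across the fleet.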
These physics parameters serve double duty as a representation space: stored as 21-dimensional vectors in pgvector, they enable fleet-wide k-NN similarity search — finding buildings with comparable thermal signatures for benchmarking, transfer, and anomaly contextualization. A separate 31-dimensional variant conditions the diffusion model, extending the embedding with regime detection and seasonal coverage features.
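Numerically, the fleet search is plain nearest neighbors in the embedding space; pgvector's `<->` operator performs the same Euclidean ranking server-side. A toy sketch with invented data:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical fleet of 21-d physics embeddings, one row per device.
fleet = rng.normal(size=(500, 21))
device_ids = np.arange(500)

def knn(query, k=5):
    # Same metric as pgvector's `<->` operator: Euclidean distance.
    d = np.linalg.norm(fleet - query, axis=1)
    order = np.argsort(d)[:k]
    return device_ids[order], d[order]

# Buildings with thermal signatures most similar to device 42.
ids, dists = knn(fleet[42], k=5)
```

In production the equivalent query is a one-liner (`ORDER BY embedding <-> $1 LIMIT k`), with the index doing the work.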
Drift detection monitors parameter stability over time to distinguish genuine equipment degradation from estimation noise.
Built a fleet-wide outlier detection pipeline using consensus voting across three complementary methods — HDBSCAN, Isolation Forest, and Local Outlier Factor — applied to the physics embedding space. Devices flagged by 2+ methods surface as candidates for investigation, with per-method scores providing diagnostic context.
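The consensus mechanics can be sketched with three dependency-free stand-in detectors (z-score, IQR, and k-NN distance substituting for HDBSCAN, Isolation Forest, and LOF):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(300, 21))          # invented fleet embeddings
X[:5] += 8.0                            # five planted outliers

def zscore_flags(X, thresh=4.0):
    z = (X - X.mean(0)) / X.std(0)
    return np.abs(z).max(1) > thresh

def iqr_flags(X, k=3.0):
    q1, q3 = np.percentile(X, [25, 75], axis=0)
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return ((X < lo) | (X > hi)).any(1)

def knn_dist_flags(X, k=10, quantile=0.98):
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    kth = np.sort(d, axis=1)[:, k]      # distance to k-th neighbor
    return kth > np.quantile(kth, quantile)

# Consensus vote: devices flagged by 2+ methods become candidates.
votes = (zscore_flags(X).astype(int) + iqr_flags(X).astype(int)
         + knn_dist_flags(X).astype(int))
candidates = np.where(votes >= 2)[0]
```

The value of the vote is that each method's false positives are largely uncorrelated, so requiring agreement cuts the investigation queue without hiding the real anomalies.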
Anomaly detection in production compares observed temperature trajectories against the diffusion model's predicted distribution, flagging deviations beyond a configurable σ threshold with severity and timing metadata.
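A minimal sketch of the flagging step, with a random-walk ensemble standing in for diffusion samples and invented metadata field names:

```python
import numpy as np

rng = np.random.default_rng(4)

# Stand-in for the sampled ensemble: 64 six-hour trajectories
# at 2-minute resolution (180 steps).
ensemble = 70 + np.cumsum(rng.normal(0, 0.05, (64, 180)), axis=1)
mu, sigma = ensemble.mean(0), ensemble.std(0)

observed = mu.copy()
observed[120:] += 6 * sigma[120:]       # inject a late deviation

def flag_anomalies(observed, mu, sigma, threshold=3.0):
    # Deviation in units of predicted spread, per timestep.
    z = (observed - mu) / np.maximum(sigma, 1e-6)
    idx = np.where(np.abs(z) > threshold)[0]
    if idx.size == 0:
        return None
    return {"first_step": int(idx[0]),       # timing
            "n_steps": int(idx.size),
            "peak_sigma": float(np.abs(z).max())}  # severity

alert = flag_anomalies(observed, mu, sigma)
```

Because the threshold is expressed in predicted σ rather than degrees, the same setting adapts to buildings the model is confident about and buildings it is not.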
Change-point detection on raw time series treats historical behavior as the baseline — deviations matter more than absolute values. Distributional features (e.g., kurtosis of the estimation error distribution) identify devices where the physics model is underconfident, surfacing structurally hard-to-model buildings before they generate false alerts.
Built a multi-stage pipeline that translates raw time series into prioritized user-facing recommendations: statistical findings are interpreted as domain insights, ranked, and presented as an attention queue for users arriving at the dashboard without a specific issue in mind. The stack surfaces what needs attention now without requiring HVAC domain expertise from the user.
Rewrote the time series feature store as a Rust-driven query compiler and planner in front of TimescaleDB. The planner maintains a feature registry with category-based grouping, resolves inter-feature dependencies, separates raw fetches across data sources (device telemetry, weather, remote sensors), and applies derived transforms — producing an optimized execution plan before any data is fetched. Serves 2,000 requests per second per core in an EC2 autoscaling group.
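The dependency-resolution half of the planner reduces to a topological sort. A Python sketch with an invented toy registry (the production planner is Rust):

```python
from graphlib import TopologicalSorter

# Hypothetical registry: raw features name a data source; derived
# features name their inputs instead.
REGISTRY = {
    "indoor_temp":     {"source": "telemetry", "deps": []},
    "outdoor_temp":    {"source": "weather",   "deps": []},
    "supply_temp":     {"source": "sensors",   "deps": []},
    "delta_t":         {"source": "derived",
                        "deps": ["indoor_temp", "outdoor_temp"]},
    "hvac_efficiency": {"source": "derived",
                        "deps": ["delta_t", "supply_temp"]},
}

def plan(requested):
    # Walk transitive dependencies, then topo-sort so every derived
    # transform runs after its inputs are available.
    needed, stack = set(), list(requested)
    while stack:
        f = stack.pop()
        if f not in needed:
            needed.add(f)
            stack.extend(REGISTRY[f]["deps"])
    order = list(TopologicalSorter(
        {f: REGISTRY[f]["deps"] for f in needed}).static_order())
    fetches = {}                        # raw fetches grouped by source
    for f in order:
        src = REGISTRY[f]["source"]
        if src != "derived":
            fetches.setdefault(src, []).append(f)
    transforms = [f for f in order if REGISTRY[f]["source"] == "derived"]
    return fetches, transforms

fetches, transforms = plan(["hvac_efficiency"])
```

Grouping the raw fetches per source before execution is what lets the planner batch each backend's I/O into one round trip.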
Built an agentic system and MCP server for free-form fleet diagnostics. The agent's understanding of the fleet is grounded in a knowledge graph derived from the codebase and feature store — devices, zones, equipment, and remote sensors as vertices; relationships (LOCATED_IN, CONTROLS, MONITOR_SUPPLY, MONITOR_RETURN) as edges; traversable via recursive CTEs in PostgreSQL. To answer analytical questions, the agent generates Python and executes it in a WebAssembly sandbox (python.wasm via Wasmtime) — full process isolation with no host filesystem access beyond explicitly scoped directories. Support and engineering teams can now diagnose issues and surface insights without specialized knowledge of HVAC, IoT protocols, or the underlying data schema.
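A toy version of the graph traversal, using SQLite in place of PostgreSQL (the `WITH RECURSIVE` syntax is near-identical) and an invented three-edge graph:

```python
import sqlite3

# Minimal stand-in for the knowledge graph's edge table.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE edges (src TEXT, rel TEXT, dst TEXT);
    INSERT INTO edges VALUES
        ('sensor_7', 'MONITOR_SUPPLY', 'ahu_2'),
        ('ahu_2',    'CONTROLS',       'zone_3'),
        ('zone_3',   'LOCATED_IN',     'building_A');
""")

# Everything reachable from sensor_7, with hop count.
rows = db.execute("""
    WITH RECURSIVE reach(node, hops) AS (
        SELECT 'sensor_7', 0
        UNION ALL
        SELECT e.dst, r.hops + 1
        FROM reach r JOIN edges e ON e.src = r.node
    )
    SELECT node, hops FROM reach ORDER BY hops
""").fetchall()
```

One recursive CTE answers "what does this sensor ultimately affect?" without a graph database, which keeps the whole system in the same PostgreSQL instance as the feature store.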
On-board heat sources (cellular radio, processor, power management IC) bias the thermostat's temperature reading. Designed an ongoing experiment with our electrical engineer to characterize each heat source across locations and power states using reference sensors, then built a compensation model on top of the feature store that corrects the reading in real time.
Privacy compliance software for large organizations subject to GDPR, CCPA, and similar regulations.
Led a team of engineers to build a new product for managing privacy compliance data collection across large organizations — replacing spreadsheet-driven processes with a structured task and information management system. The core challenge was combinatorial: many assets × many regulations × overlapping requirements.
We addressed this with three mechanisms: a template system (lawyers define a compliance report structure; it decomposes into distributed tasks), automation (prefill from known answers, skip irrelevant questions, flag contradictions), and an asset inventory (shared answers across overlapping regulatory requirements eliminate redundant data collection). Business logic and automation run entirely in PostgreSQL via PL/pgSQL. The API layer is Go; the template compiler targets WebAssembly for reuse in the browser; the UI is PureScript.
Flexible rent payment scheduling and budgeting tools for renters and landlords.
Designed a streaming ingestion pipeline for daily renter status and balance updates, supporting multiple vendor APIs and manually uploaded Excel files. Built with change detection, replayable historical input, and support for schema variance across vendors.
Built an identifier translation service mapping vendor-specific renter IDs to internal identifiers, including a fuzzy matching endpoint for new-user enrollment. Implemented in Go with Redis and bloom filters.
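The core idea of the fuzzy-matching endpoint can be sketched with stdlib `difflib` and an invented roster (the production service is Go with Redis and bloom filters):

```python
from difflib import SequenceMatcher

# Hypothetical internal roster: vendor records arrive with noisy
# names; enrollment proposes the closest internal match above a cutoff.
ROSTER = {"r-1001": "Jonathan Smith",
          "r-1002": "Maria Garcia-Lopez",
          "r-1003": "Wei Chen"}

def normalize(name):
    # Case, hyphens, and whitespace are the usual vendor noise.
    return " ".join(name.lower().replace("-", " ").split())

def best_match(vendor_name, cutoff=0.8):
    scored = [(SequenceMatcher(None, normalize(vendor_name),
                               normalize(n)).ratio(), rid)
              for rid, n in ROSTER.items()]
    score, rid = max(scored)
    return rid if score >= cutoff else None
```

The cutoff trades enrollment friction against mis-links: below it, the endpoint returns no match and the record falls back to manual review rather than guessing.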
Business travel booking platform for frequent travelers. (Defunct — business travel and global pandemics don't mix.)
Sabre charges overage fees without warning when a company exceeds its purchased token count. With 180 microservices competing for a fixed token pool, accurate distributed counting was critical. I used TLA+ to design the token-management algorithm, then implemented a distributed session pool in Redis + Lua + Node.js that enforces the limit across the fleet with no false positives.
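The invariant the Lua script enforces (atomic check-and-increment against a fixed limit) can be sketched in Python, with a lock standing in for Redis's single-threaded script execution:

```python
import threading

class TokenPool:
    """Stand-in for the Redis + Lua session pool: the Lua script runs
    atomically inside Redis; a lock plays that role here."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0
        self._lock = threading.Lock()

    def acquire(self):
        # Check-and-increment must be one atomic step, or two services
        # can both observe "49 in use" and push the pool to 51.
        with self._lock:
            if self.in_use < self.limit:
                self.in_use += 1
                return True
            return False                # caller must wait or retry

    def release(self):
        with self._lock:
            self.in_use = max(0, self.in_use - 1)

pool = TokenPool(limit=50)
grants = sum(pool.acquire() for _ in range(180))  # 180 services competing
```

TLA+ was useful precisely because the failure mode is the non-atomic interleaving the comment describes: the model checker finds it in seconds, long before Redis is involved.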
Built the core booking engine service interfacing with Sabre's legacy terminal API — a system that predates ASCII, built around 6-bit integers and a travel-agent keyboard UI. Led implementation across four engineering teams and the data science team, from initial proof of concept to production.
Infrastructure-as-code with emphasis on security, auditability, and automated remediation.
Built a system to enumerate every resource in an AWS account for snapshot and diff operations, implemented in Python with comprehensive property-based testing via Hypothesis. Introduced property-based testing to the broader engineering team.
Created a fuzz testing framework that introduces arbitrary infrastructure changes into an AWS account and measures remediation latency. Perturbation selection was derived from historical analysis of bug reports and feature changes to maximize signal per dollar of AWS spend.
Bot detection and mitigation for websites, mobile applications, and APIs.
Collaborated with the data science team to redesign browser fingerprinting: identified the feature subset with maximum uniqueness, added data validation, and shipped 2.5× more identifiers at equivalent traffic volume. Added HashCash to raise the cost of identifier cycling, producing a 30% increase in bot detection with no customer complaints about false positives. Implemented a stable bloom filter in Lua inside Redis to deduplicate an unbounded token stream in constant space.
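A sketch of the stable-bloom-filter idea (per Deng & Rafiei: random cell decay before each insert keeps space constant on an unbounded stream); the parameters here are illustrative, and the production version is Lua running inside Redis:

```python
import hashlib
import random

class StableBloomFilter:
    """Counting cells that decay over time, so stale entries fade out
    and membership over an unbounded stream fits in fixed space."""

    def __init__(self, m=1000, k=3, cap=3, p=10, seed=0):
        self.cells = [0] * m
        self.m, self.k, self.cap, self.p = m, k, cap, p
        self._rng = random.Random(seed)

    def _cells_for(self, item):
        h = hashlib.sha256(item.encode()).digest()
        return [int.from_bytes(h[4 * i:4 * i + 4], "big") % self.m
                for i in range(self.k)]

    def seen(self, item):
        return all(self.cells[i] > 0 for i in self._cells_for(item))

    def add(self, item):
        for _ in range(self.p):              # decay: P random cells
            j = self._rng.randrange(self.m)
            if self.cells[j] > 0:
                self.cells[j] -= 1
        for i in self._cells_for(item):      # insert: set k cells to max
            self.cells[i] = self.cap

sbf = StableBloomFilter()
duplicate = sbf.seen("token-123")   # check-then-add dedup pattern
sbf.add("token-123")
```

The trade is a bounded false-negative rate on old items in exchange for never growing, which is exactly right for deduplicating a token stream where only recent repeats matter.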
Designed a rule engine to replace a fragile flat settings system, encoding platform knowledge in a format accessible to engineers, support, and analysts. Built as an Nginx module in Rust to meet a 20ms latency SLA — Rust's WebAssembly target allowed the same library to power the browser-based configuration UI. Rules compile to disjunctive normal form for constant-space evaluation; the compiler estimates resource usage and provides user-facing feedback before deployment.
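The constant-space DNF evaluation can be sketched in Python (the production engine is Rust; the rule fields and operators here are invented):

```python
# Hypothetical compiled rule in disjunctive normal form: an OR of
# clauses, each clause an AND of (field, op, value) literals.
RULE_DNF = [
    [("country", "eq", "US"), ("ua_entropy", "lt", 2.0)],
    [("req_rate", "gt", 100.0)],
]

OPS = {
    "eq": lambda a, b: a == b,
    "lt": lambda a, b: a < b,
    "gt": lambda a, b: a > b,
}

def matches(request, dnf):
    # Constant-space evaluation: short-circuit clause by clause,
    # literal by literal; no intermediate expression tree.
    return any(all(OPS[op](request[field], value)
                   for field, op, value in clause)
               for clause in dnf)

bot_like = matches({"country": "US", "ua_entropy": 1.1, "req_rate": 4.0},
                   RULE_DNF)
```

Because a DNF rule is just a flat list of clauses, its memory footprint at evaluation time is knowable at compile time, which is what lets the compiler estimate resource usage before deployment.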
Diagnosed and fixed a reliability inversion: multi-node deployments were less reliable than single-node despite being sold for redundancy. Root cause was error propagation via ZeroMQ topology design. Redesigned data flow for 2-node replication and n/2−1 fault tolerance in larger deployments. Built a fault injection test suite to validate failover behavior.
Joined when the platform team was two people; left with nine engineers across four offices on two continents. Drove recruiting and hiring for all of them. Organized an ongoing Rust meetup as a sourcing pipeline. Led architecture integration for two acquisitions.