Tutorial · Week 2 of 13

detecting sensor anomalies in 100 lines of python

fixed thresholds catch the obvious failures. they miss the sensor drifting 5°C off baseline overnight, and humidity quietly sliding 12 points below where it sat all week. those are the ones that matter.

this tutorial catches them. ~75 lines of python, a 30-second rolling window, alert when a reading sits more than 3 standard deviations from normal for 3 ticks in a row. extends week 1’s ESP32 setup. no ML.

by ann schulte · ~12 min read

What you’ll build
ESP32 + BME280  ──▶  plexus  ──▶  detector.py  ──▶  plexus  ──▶  #your-channel
   (from week 1)    temperature    30s window         monitor on    (slack)
   every 2s         stream         rolling z-score    temp.zscore
                                   3-streak guard

two new pieces. detector.py polls plexus once a second for the latest temperature samples, keeps the most recent 30 seconds in memory, and computes a rolling z-score per new point. every sample posts a temperature.zscore value back to plexus: 0 normally, and |z| once |z| > 3 has held for three samples in a row (about 6 seconds of sustained deviation).

plexus already knows how to fire slack on a max threshold. point one at temperature.zscore with max: 3.0 and week 1’s alert pipeline fires unchanged.

the detector doesn’t touch firmware. it runs anywhere — laptop, vps, raspberry pi — and works for any source and metric in your fleet.

What you need

install:

bash
pip install plexus-python requests

no numpy, no pandas. the math is six lines and we want to see it.

Step 1 of 4

get the data flowing

if you finished week 1, your ESP32 is already pushing temperature, humidity, and pressure to plexus every 2 seconds. that’s the data source for everything below — same source_id, same metric.

no ESP32 yet? point the detector at any temperature stream you have. it’s a SOURCE_ID and METRIC constant at the top of the script.

Step 2 of 4

rolling stats over a 30-second window

start detector.py. the window is a collections.deque capped at 15 samples — 30 seconds at week 1’s 2-second cadence. deque(maxlen=15) drops the oldest sample automatically when the next one is appended. zero math required.
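before the file itself, a quick look at the eviction behavior, as a standalone sketch with made-up integer samples rather than real readings: once a deque with maxlen=15 is full, each append silently drops the oldest entry.

```python
import collections

# same cap as the detector's window; the samples are just 0..19
window = collections.deque(maxlen=15)
for sample in range(20):
    window.append(sample)

# 20 appends into a 15-slot deque: the first five samples are gone
print(len(window))   # 15
print(window[0])     # 5
print(window[-1])    # 19
```

no index bookkeeping and no manual pop, which is why the detector never trims the window itself.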

detector.py
# detector.py
import collections
import math

WINDOW_SAMPLES = 15  # 30 seconds @ one sample per 2 seconds

window = collections.deque(maxlen=WINDOW_SAMPLES)


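def zscore(values):
    """Return the z-score of the latest value vs. the rest of the window."""
    if len(values) < 5:
        return 0.0
    # score the newest sample against the samples before it; if it were part
    # of its own baseline, |z| could never exceed sqrt(n - 1) (3.74 for n=15)
    # and a sustained shift would rarely stay above 3
    *baseline, latest = values
    mean = sum(baseline) / len(baseline)
    var = sum((v - mean) ** 2 for v in baseline) / len(baseline)
    std = math.sqrt(var)
    if std < 0.01:
        return 0.0
    return (latest - mean) / std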

three guards in there:

  • < 5 samples: not enough data to call anything anomalous yet. return 0.
  • std < 0.01: the window is essentially flat, so dividing by std would blow up the score (or divide by zero outright). return 0.
  • otherwise: standard z-score. positive for “above the recent mean,” negative for below.

that’s the math. six lines if you cut the guards.
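a quick sanity check on that math, as a sketch rather than part of detector.py: the variance divides by n, so it is the population standard deviation, and the standard library's statistics.pstdev should agree with it. the readings are made up for illustration.

```python
import math
import statistics

# made-up readings, only for illustration
readings = [22.41, 22.45, 22.38, 22.43, 22.40, 23.18]

# the same arithmetic the tutorial does by hand
mean = sum(readings) / len(readings)
var = sum((v - mean) ** 2 for v in readings) / len(readings)
hand_rolled_std = math.sqrt(var)

# pstdev is the population version (divide by n);
# stdev is the sample version (divide by n - 1) and comes out slightly larger
assert math.isclose(hand_rolled_std, statistics.pstdev(readings))
assert statistics.stdev(readings) > hand_rolled_std
```

for a window this small the two versions differ by a few percent. either works as long as the threshold was tuned against the same one.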

Step 3 of 4

poll, score, and post the spike

now the loop. poll plexus for new temperature samples once a second, push them through the window, and post temperature.zscore back for each one: 0 by default, |z| once 3 anomalous samples have landed in a row.

detector.py (continued)
import os
import time
import requests
from plexus import Plexus

SOURCE_ID = "esp32-bme280"
METRIC = "temperature"
Z_THRESHOLD = 3.0
CONSECUTIVE = 3
POLL_S = 1.0

API_BASE = os.environ.get("PLEXUS_ENDPOINT", "https://app.plexus.company")
API_KEY = os.environ["PLEXUS_API_KEY"]

session = requests.Session()
session.headers["x-api-key"] = API_KEY

px = Plexus(source_id=SOURCE_ID)


def fetch_recent():
    r = session.get(
        f"{API_BASE}/api/v1/telemetry",
        params={"source": SOURCE_ID, "metric": METRIC, "limit": 200},
        timeout=5,
    )
    r.raise_for_status()
    return sorted(r.json()["data"], key=lambda x: x["timestamp"])


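def main():
    last_ts = None     # newest timestamp already scored
    consecutive = 0    # current run of samples with |z| > Z_THRESHOLD
    while True:
        for row in fetch_recent():
            ts = row["timestamp"]
            # each poll returns up to 200 rows; skip the ones already scored
            if last_ts is not None and ts <= last_ts:
                continue
            last_ts = ts

            value = float(row["value"])
            window.append(value)
            z = zscore(list(window))

            # extend the streak on an anomalous sample, reset it otherwise
            consecutive = consecutive + 1 if abs(z) > Z_THRESHOLD else 0
            # post |z| only once the streak is long enough, 0.0 otherwise
            emit = abs(z) if consecutive >= CONSECUTIVE else 0.0
            px.send(f"{METRIC}.zscore", emit)

            tag = "ANOMALY" if consecutive >= CONSECUTIVE else "normal "
            print(f"[{tag}] t={value:6.2f}  z={z:+5.2f}  streak={consecutive}")

        time.sleep(POLL_S)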


if __name__ == "__main__":
    main()

the 3-consecutive guard is the part that matters. without it, a single noisy reading 3.5σ above the mean fires an alert. with it, the alert only fires when something has actually been wrong for ~6 seconds.
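to see the guard in isolation, here is a small sketch that replays the same streak logic over two made-up z-score sequences, one with a lone spike and one with a sustained excursion. fired_flags is a hypothetical helper for this example, not part of detector.py.

```python
def fired_flags(zs, threshold=3.0, needed=3):
    """Replay the streak counter; True where an alert value would be posted."""
    consecutive = 0
    flags = []
    for z in zs:
        consecutive = consecutive + 1 if abs(z) > threshold else 0
        flags.append(consecutive >= needed)
    return flags


single_spike = [0.2, 3.5, 0.1, 0.3]        # one noisy reading
sustained = [0.2, 3.8, 4.5, 5.0, 5.5]      # something is actually wrong

print(any(fired_flags(single_spike)))  # False
print(fired_flags(sustained))          # [False, False, False, True, True]
```

the trade-off is latency: at a 2-second cadence the alert arrives two samples, roughly 4 seconds, after the first anomalous reading.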

run it:

bash
export PLEXUS_API_KEY=plx_xxx
python detector.py

Step 4 of 4

warm the sensor, watch it spike

press the BME280 between your fingers like you did in week 1. the temperature climbs and the z-score climbs with it. in a quiet room, even a 1°C jump is several sigma.

stdout
[normal ] t= 22.41  z=+0.18  streak=0
[normal ] t= 22.45  z=+0.39  streak=0
[normal ] t= 23.18  z=+2.41  streak=0    ← warming up
[normal ] t= 24.05  z=+3.82  streak=1
[normal ] t= 24.92  z=+4.51  streak=2
[ANOMALY] t= 25.78  z=+5.04  streak=3    ← posts zscore=5.04
[ANOMALY] t= 26.43  z=+5.49  streak=4

three things happen on that third anomaly tick:

  1. px.send("temperature.zscore", 5.04) lands at the gateway.
  2. plexus evaluates the rule (registered below), finds 5.04 > 3.0, emits alert.triggered.
  3. slack message arrives: same channel, same integration, week 1’s pipeline unchanged.

the rule, one curl:

bash
curl -X POST https://app.plexus.company/api/monitors \
  -H "x-api-key: $PLEXUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_id": "esp32-bme280",
    "metric": "temperature.zscore",
    "threshold": {
      "max": 3.0,
      "severity": "warning",
      "message": "BME280 deviating from baseline"
    }
  }'
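if you'd rather register the monitor from python, the same request can be built with the standard library. this is only a sketch: it mirrors the url, header, and json body from the curl above, builds the request without sending it, and build_monitor_request is a made-up helper name.

```python
import json
import os
import urllib.request


def build_monitor_request(api_key):
    """Build (but do not send) the POST that the curl command performs."""
    body = {
        "source_id": "esp32-bme280",
        "metric": "temperature.zscore",
        "threshold": {
            "max": 3.0,
            "severity": "warning",
            "message": "BME280 deviating from baseline",
        },
    }
    return urllib.request.Request(
        "https://app.plexus.company/api/monitors",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


# falls back to the placeholder key from the tutorial if the env var is unset
req = build_monitor_request(os.environ.get("PLEXUS_API_KEY", "plx_xxx"))
print(req.get_method(), req.full_url)
# to actually send it: urllib.request.urlopen(req, timeout=5)
```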

let go of the sensor. the temperature drops, the streak resets, the metric goes back to 0, plexus emits alert.resolved. clean.

Going further

three directions to take this

  • score every metric. the detector is parameterized on SOURCE_ID and METRIC. run one process per metric you care about (temperature, humidity, pressure), or fan out inside the script with a list. zscore() doesn’t know what it’s scoring.
  • fleet-level baselines. instead of comparing a sensor to its own recent past, compare it to the fleet’s recent past. GET /api/v1/telemetry?metric=temperature with no source filter returns every device; median + MAD becomes the baseline. one bad sensor in a fleet of 50 stands out immediately.
  • when ML earns its complexity. at this layer — single sensor, single metric, “this is far from recent normal” — z-score is the right answer. ML earns its weight when you have multivariate inputs (vibration + current + temperature on a motor and you need to know which one drifted), seasonality you’d otherwise alert on, or labeled failure data to train against. for “this looks weird,” 75 lines of python beats any model.
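for the fleet-level idea, a minimal sketch of a median + MAD score. assumptions are labeled: the device ids and readings are made up, robust_z is a hypothetical helper, and the 0.6745 factor is the usual scaling that makes MAD comparable to a standard deviation for roughly normal data. fetching the fleet's readings is left out.

```python
import statistics


def robust_z(value, fleet_values):
    """Score one reading against the fleet using median and MAD."""
    med = statistics.median(fleet_values)
    mad = statistics.median([abs(v - med) for v in fleet_values])
    if mad == 0:
        # more than half the fleet reads exactly the same; nothing to scale by
        return 0.0
    return 0.6745 * (value - med) / mad


# hypothetical latest temperature per device
fleet = {
    "esp32-01": 22.4,
    "esp32-02": 22.6,
    "esp32-03": 22.5,
    "esp32-04": 22.3,
    "esp32-05": 27.9,  # the bad sensor
}

values = list(fleet.values())
outliers = [dev for dev, v in fleet.items() if abs(robust_z(v, values)) > 3.5]
print(outliers)  # ['esp32-05']
```

the median and MAD barely move when one device misbehaves, which is the reason to prefer them over mean and std here: a bad sensor can't drag the baseline toward itself. 3.5 is a common cutoff for this score, not something plexus prescribes.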

next week: dashboards from telemetry, automatically.

Get the code

clone, run, file an issue if it breaks.

detector.py, the monitor curl, and a README live in the plexus-tutorials repo on GitHub.
