One bearer token, one HTTPS endpoint, the shape your code already produces. No collector daemon, no proprietary query language, no per-GB markup that doubles your bill the month traffic actually grows.
Ship with curl, an OTel SDK, Vector, or Fluent Bit. Search by time +
substring + service + level. Live-tail over SSE — events land in your terminal
under 100 ms after they hit ingest. Pattern-match a threshold, get paged.
Same auth, audit, and multi-tenancy as the rest of the API.
v1 scope. Ingest + search + live tail + log-based alerts + auto-facet extraction ship now. R2 archival, saved searches, and dashboards are queued for v1.1+.
POST /api/v1/logs/ingest — JSON object, JSON array (≤500), or NDJSON. PAT bearer auth.
GET /api/v1/logs/search — time-range + substring + service + level. Cursor-paginated.
GET /api/v1/logs/tail — Server-Sent Events. Same filters; tails forward from "now."
GET / POST / PATCH / DELETE /api/v1/log-alerts — manage alert rules.
{
"ts": "2026-05-10T12:00:00Z", // optional — defaults to now if absent
"level": "info", // trace | debug | info | warn | error | fatal
"service": "checkout", // any string ≤ 255
"source": "stdout", // optional; e.g. "stdout", "syslog", "file"
"host": "web-1", // optional
"message": "checkout completed", // required, ≤ 8KB
"attrs": { "user_id": 42, "amount": 19.99 } // optional, ≤ 4KB serialized
}
Server stamps org_id from your PAT — agents cannot spoof an origin
org by lying in the body. ingest_pat_id and client_ip are
captured server-side for forensic queries.
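A client-side pre-flight check can catch events the server would reject before they leave the process. The sketch below mirrors the limits in the event shape above. It is not the server's validator: `validate_event` and the constants are hypothetical names, "8KB" and "4KB" are assumed to mean 8192 and 4096 UTF-8 bytes, and `level` and `service` are assumed to be required because the shape does not mark them optional.

```python
import json

LEVELS = {"trace", "debug", "info", "warn", "error", "fatal"}
MAX_MESSAGE_BYTES = 8 * 1024  # assumption: "8KB" = 8192 UTF-8 bytes
MAX_ATTRS_BYTES = 4 * 1024    # assumption: "4KB" = 4096 bytes of compact JSON
MAX_SERVICE_CHARS = 255


def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means the event looks shippable."""
    problems = []

    message = event.get("message")
    if not isinstance(message, str) or not message:
        problems.append("message is required")
    elif len(message.encode("utf-8")) > MAX_MESSAGE_BYTES:
        problems.append("message exceeds 8KB")

    if event.get("level") not in LEVELS:
        problems.append("level must be one of " + ", ".join(sorted(LEVELS)))

    service = event.get("service")
    if not isinstance(service, str) or len(service) > MAX_SERVICE_CHARS:
        problems.append("service must be a string of at most 255 characters")

    attrs = event.get("attrs")
    if attrs is not None:
        # Measure attrs the way it would travel: compact JSON, UTF-8 encoded.
        serialized = json.dumps(attrs, separators=(",", ":"))
        if len(serialized.encode("utf-8")) > MAX_ATTRS_BYTES:
            problems.append("attrs exceeds 4KB serialized")

    return problems
```

Running this before batching lets a shipper drop or truncate oversized events locally instead of finding them in the `rejected` array of the ingest response.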
Ingest requires the logs:write scope; search + tail require logs:read.
A wildcard PAT (scopes: ["*"]) covers both — that's the default for backward-compat.
For agent least-privilege, mint a narrow PAT (see /docs/api):
curl -X POST https://api.24observe.com/api/v1/me/tokens \
-H 'Authorization: Bearer obs_<admin>' \
-H 'Content-Type: application/json' \
-d '{
"name": "aeoniti-log-shipper",
"scopes": ["logs:write"],
"dailyLogBytesLimit": 104857600
}'
# Token only valid for /logs/ingest. Cap = 100 MB/day on this PAT.
# Wrong-scope call returns 403 PAT_SCOPE_INSUFFICIENT.
429 PLAN_LIMIT_LOGS_VOLUME when the plan's log volume is exhausted.
dailyLogBytesLimit: 429 PAT_DAILY_LIMIT_EXCEEDED on trip. Resets UTC midnight.
Set Content-Encoding: gzip, deflate, or
br on the request and send the compressed body. The
ingest endpoint decompresses natively — no per-customer plan
change, no special header. Measured on a 400-event pino batch
against production:
| Encoding | Wire bytes | vs raw |
|---|---|---|
| plain | 153,439 | 1.00× |
| gzip | 7,015 | 21.9× smaller |
| deflate | 7,003 | 21.9× smaller |
| br (brotli) | 4,005 | 38.3× smaller |
For a customer pushing 50 GB/day of structured logs, that's ~2.3 GB/day on the wire instead of 50 — roughly $130/month off the AWS egress bill alone, and the compressed batches arrive ~3× faster end-to-end because there's less to upload.
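For shippers written by hand, compressing a batch takes a few lines of standard library code. This is a minimal sketch: `build_gzip_request` is a hypothetical helper name, and it only prepares the body and headers, so it can be paired with any HTTP client.

```python
import gzip
import json


def build_gzip_request(events: list) -> tuple:
    """Serialize a batch as a JSON array and gzip it for the ingest endpoint."""
    raw = json.dumps(events, separators=(",", ":")).encode("utf-8")
    body = gzip.compress(raw)
    headers = {
        "content-type": "application/json",
        "content-encoding": "gzip",  # tells the endpoint to decompress the body
    }
    return body, headers


def compression_ratio(events: list) -> float:
    """Raw bytes divided by gzipped bytes for the same batch."""
    raw = json.dumps(events, separators=(",", ":")).encode("utf-8")
    return len(raw) / len(gzip.compress(raw))
```

The returned `body` and `headers` can be passed to the `httpx.post` call from the Python example, using `content=body` instead of `json=events`, with the Authorization header added. The ratio you get depends on how repetitive your log lines are, so measure it on your own batches.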
Per-shipper config — one line each.
Vector (TOML):
[sinks.observe24]
type = "http"
inputs = ["my_logs"]
uri = "https://api.24observe.com/api/v1/logs/ingest"
compression = "gzip"

[sinks.observe24.encoding]
codec = "json"

[sinks.observe24.auth]
strategy = "bearer"
token = "obs_..."
Fluent Bit (HTTP output):
[OUTPUT]
Name http
Match *
Host api.24observe.com
Port 443
URI /api/v1/logs/ingest
Format json
compress gzip
Header Authorization Bearer obs_...
tls on
OpenTelemetry Collector (otlphttp exporter):
exporters:
  otlphttp/24observe:
    endpoint: https://api.24observe.com/api/v1/otlp
    encoding: json
    compression: gzip
    headers:
      authorization: "Bearer obs_..."
curl:
gzip -c batch.json | curl -X POST https://api.24observe.com/api/v1/logs/ingest \
  -H 'Authorization: Bearer obs_...' \
  -H 'Content-Type: application/json' \
  -H 'Content-Encoding: gzip' \
  --data-binary @-
If the body is truncated or corrupt, the response is
400 BAD_COMPRESSED_BODY — retry with a fresh
payload. The observe24-collector
ships with compression: gzip enabled by default;
no configuration needed if you're using it.
curl -X POST https://api.24observe.com/api/v1/logs/ingest \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/json' \
-d '{"level":"info","service":"my-app","message":"hello from my agent"}'
# Response (202)
{ "accepted": 1, "rejected": [], "bytesAccounted": 87 }
curl -X POST https://api.24observe.com/api/v1/logs/ingest \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/json' \
-d '[
{"level":"info","service":"web","message":"GET /healthz 200"},
{"level":"warn","service":"web","message":"slow query 1.2s","attrs":{"query":"select_users"}},
{"level":"error","service":"worker","message":"job failed","attrs":{"job_id":"abc123"}}
]'
curl -X POST https://api.24observe.com/api/v1/logs/ingest \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/x-ndjson' \
--data-binary $'{"level":"info","service":"web","message":"line1"}\n{"level":"warn","service":"web","message":"line2"}'
const PAT = process.env.OBSERVE24_PAT!;

async function ship(events: object[]) {
  const r = await fetch('https://api.24observe.com/api/v1/logs/ingest', {
    method: 'POST',
    headers: {
      authorization: `Bearer ${PAT}`,
      'content-type': 'application/json',
    },
    body: JSON.stringify(events),
  });
  if (!r.ok) throw new Error(`ingest ${r.status}: ${await r.text()}`);
  return r.json();
}

await ship([{ level: 'info', service: 'my-app', message: 'started', attrs: { pid: process.pid } }]);
import os, httpx
PAT = os.environ['OBSERVE24_PAT']

def ship(events):
    r = httpx.post(
        'https://api.24observe.com/api/v1/logs/ingest',
        headers={'authorization': f'Bearer {PAT}', 'content-type': 'application/json'},
        json=events,
        timeout=10,
    )
    r.raise_for_status()
    return r.json()

ship([{'level': 'info', 'service': 'my-app', 'message': 'started'}])
Every official OTel SDK can ship logs to us directly — no agent, no translation layer. Point your SDK at the OTLP endpoint and you're done.
# Any OTel-instrumented service (Java / Python / Node / Go / .NET / Ruby / Rust)
export OTEL_EXPORTER_OTLP_ENDPOINT=https://api.24observe.com/api/v1/otlp
# Quoted: the header value contains a space
export OTEL_EXPORTER_OTLP_HEADERS="authorization=Bearer obs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
export OTEL_EXPORTER_OTLP_PROTOCOL=http/json
export OTEL_LOGS_EXPORTER=otlp
POST /api/v1/otlp/v1/logs (the /v1/logs suffix is appended by the SDK).
application/json only in v1. For http/protobuf, run the 24observe collector as a translator.
logs:write scope; same quota + per-PAT byte cap as /logs/ingest.
{} on full success; {"partialSuccess":{"rejectedLogRecords":N,"errorMessage":"..."}} when some records were dropped (e.g. attrs > 4KB). HTTP 200 in both cases per OTLP spec.
severityNumber 1-4 → trace, 5-8 → debug, 9-12 → info, 13-16 → warn, 17-20 → error, 21-24 → fatal. Falls back to severityText if number is missing.
service.name → service, host.name → host, telemetry.sdk.name → source. Everything else lands in attrs.
traceId + spanId flow into attrs.trace_id / attrs.span_id so you can correlate logs with traces (when we ship Tracing v1).
For sources that don't speak OTLP natively — legacy syslog appliances or Fluent Bit / Fluentd pipelines — run our pre-configured collector image. Single Docker container, forwards everything as OTLP to the endpoint above.
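The severityNumber mapping described for the OTLP endpoint is regular enough to express as integer division. The sketch below shows one way to reproduce it client-side, for example to predict which level a record will land under. `level_from_otlp` is a hypothetical name, and the final fallback to `info` and the `warning` alias are assumptions, since the text does not say what happens when neither field is usable.

```python
from typing import Optional

LEVELS = ("trace", "debug", "info", "warn", "error", "fatal")


def level_from_otlp(severity_number: Optional[int],
                    severity_text: Optional[str] = None) -> str:
    """Map an OTLP log record's severity onto the six ingest levels."""
    if severity_number is not None and 1 <= severity_number <= 24:
        # Each level covers four consecutive numbers: 1-4, 5-8, ..., 21-24.
        return LEVELS[(severity_number - 1) // 4]
    if severity_text:
        text = severity_text.strip().lower()
        if text in LEVELS:
            return text
        if text == "warning":  # assumption: common alias for warn
            return "warn"
    return "info"  # assumption: default when neither field is usable
```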
Heroku ships every app log line — dyno output, router events, deploys — to a drain URL of your choice. Add ours to start receiving them; no SDK install, no buildpack changes.
Mint a PAT with logs:write only and a sensible daily byte cap. A drain URL leaking into an ops doc shouldn't grant full account access.
heroku drains:add \
  https://api.24observe.com/api/v1/logs/ingest/heroku/obs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  --app my-heroku-app
Heroku starts forwarding immediately. Inside ~2 seconds you'll see
events arrive with source=heroku, service=<your
heroku app name>, and the dyno identifier
(web.1, worker.2, etc.) in attrs.proc_id.
Heroku's drain format (application/logplex-1) is parsed server-side, no setup on your end.
Pull every Next.js / SvelteKit / Astro deployment's request logs, function logs, edge logs, and build output into 24observe with one URL. No edge function to deploy, no custom hooks.
Mint a PAT with logs:write and a daily byte cap (same advice as Heroku above).
https://api.24observe.com/api/v1/logs/ingest/vercel/obs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Pick which projects + sources (static / lambda / edge / build) the
drain should cover, hit save, and Vercel starts batching events to
you within a minute. Events arrive with source=vercel,
service=<your-project-name>, and the following
Vercel metadata promoted to attrs so you can facet on it:
request_id — Vercel's x-vercel-id per request; correlate function logs with the originating HTTP request.
status_code — for runtime/edge logs that wrap a response.
vercel_source — one of lambda, edge, static, build, external.
deployment_id — pin a log to a specific Vercel deployment.
path — the URL path that produced the log line.
Level inference: status ≥ 500 → error, status ≥ 400
→ warn, message starting with ERROR /
Error / FATAL → error,
everything else → info. To override the server-side inference,
emit structured JSON from your function and parse the level
client-side via a saved KQL search.
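The inference rules above can be read as an ordered list of checks. The following sketch reproduces them so you can predict which level a Vercel log line will receive. `infer_vercel_level` is a hypothetical name, and the precedence (status code first, then message prefix) is an assumption taken from the order in which the rules are listed.

```python
from typing import Optional


def infer_vercel_level(message: str, status_code: Optional[int] = None) -> str:
    """Apply the documented rules in listed order; the first match wins (assumed)."""
    if status_code is not None and status_code >= 500:
        return "error"
    if status_code is not None and status_code >= 400:
        return "warn"
    # Prefix match is case-sensitive on exactly these three spellings.
    if message.startswith(("ERROR", "Error", "FATAL")):
        return "error"
    return "info"
```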
Pipe every Lambda function, ECS task, EKS pod, RDS instance, or anything else logging to CloudWatch directly into 24observe via a Kinesis Firehose subscription. No SDK install, no agent on the instance — CloudWatch already has your logs, Firehose forwards them to a URL.
Mint a PAT with logs:write only and a daily byte cap (CloudWatch can be high-volume; protect the blast radius).
Create a Firehose delivery stream (for example with aws firehose create-delivery-stream) that uses this HTTP endpoint URL:
https://api.24observe.com/api/v1/logs/ingest/aws-firehose/<your-PAT>
aws logs put-subscription-filter \
  --log-group-name "/aws/lambda/my-function" \
  --filter-name "ship-to-24observe" \
  --filter-pattern "" \
  --destination-arn "arn:aws:firehose:us-east-1:<account>:deliverystream/observe24-stream" \
  --role-arn "arn:aws:iam::<account>:role/CloudWatchLogsFirehoseRole"
Events arrive with source=aws-cloudwatch,
service=<your log group name>, the log stream
in host, plus these CloudWatch attrs:
cloudwatch_event_id — CloudWatch's per-event id for dedup.
aws_account_id — owning account.
log_stream — duplicated from host for KQL search ergonomics.
Level inference: messages prefixed with ERROR / WARN /
FATAL / DEBUG (case-insensitive) get the
matching level; everything else lands as info. Lambda
runtime's standard prefixes work out of the box.
Docker's built-in syslog driver ships every container's stdout/stderr to the 24observe collector — no per-container agent, no log file scraping. The collector's RFC5424 receiver listens on TCP 5414.
Daemon-wide (every container on the host): in
/etc/docker/daemon.json:
{
  "log-driver": "syslog",
  "log-opts": {
    "syslog-address": "tcp://collector.local:5414",
    "syslog-format": "rfc5424",
    "tag": "{{.Name}}/{{.ImageName}}"
  }
}
Per-container (selective opt-in, no daemon restart):
docker run --log-driver=syslog \
  --log-opt syslog-address=tcp://collector.local:5414 \
  --log-opt syslog-format=rfc5424 \
  --log-opt tag="my-app" \
  my-app:latest
Container name + image flow through as the syslog tag and surface
in service on the ingested rows. Restart of the Docker
daemon required for the daemon.json variant; the per-container flag
takes effect on the next docker run.
For systemd hosts, the OTel Collector Contrib distribution (which our
collector image is built on) ships with a journald
receiver — point it at /var/log/journal and every unit's
output flows in with severity, unit name, and PID preserved as
structured attrs.
Drop this override config alongside the bundled config:
# /etc/otelcol/journald.yaml
receivers:
  journald:
    directory: /var/log/journal
    # Optional: filter to specific systemd units. Omit to ship everything.
    units:
      - nginx
      - my-app
    priority: info
service:
  pipelines:
    logs:
      receivers: [otlp, syslog/rfc5424, syslog/rfc3164, fluentforward, journald]
Then run the collector with the journal mounted and both configs:
# Pre-create the persistent-queue directory with the collector's UID
# so the file_storage extension can write to it (the container runs
# as non-root; the volume inherits root ownership otherwise).
mkdir -p /var/lib/observe24-collector && chown 10001:10001 /var/lib/observe24-collector
docker run -d --name observe24-collector \
  -v /var/log/journal:/var/log/journal:ro \
  -v /etc/machine-id:/etc/machine-id:ro \
  -v /etc/otelcol/journald.yaml:/etc/otelcol-contrib/override.yaml:ro \
  -v /var/lib/observe24-collector:/var/lib/otelcol/file_storage \
  -e OBSERVE24_PAT=obs_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx \
  -p 5414:5414 \
  observe24/collector:latest \
  --config /etc/otelcol-contrib/config.yaml \
  --config /etc/otelcol-contrib/override.yaml
The file_storage volume keeps a persistent on-disk
queue so a collector restart doesn't lose buffered events. ~512 MiB
is plenty for normal workloads.
The /etc/machine-id mount is required by journald to
identify the host. Read-only mounts are sufficient — the collector
never writes to the journal.
See infra/vector/example.toml in the repo for a working example. Key snippet:
[sources.app_logs]
type = "file"
include = ["/var/log/myapp/*.log"]
[transforms.parse_json]
type = "remap"
inputs = ["app_logs"]
source = """
. = parse_json!(string!(.message))
.service = "myapp"
"""
[sinks.observe24]
type = "http"
inputs = ["parse_json"]
uri = "https://api.24observe.com/api/v1/logs/ingest"
encoding.codec = "ndjson"
request.headers.Authorization = "Bearer ${OBSERVE24_PAT}"
request.headers.Content-Type = "application/x-ndjson"
batch.max_events = 500
batch.timeout_secs = 2
curl 'https://api.24observe.com/api/v1/logs/search?q=timeout&service=web&level=error&limit=200' \
-H 'Authorization: Bearer obs_...'
# Response
{
"events": [
{ "ts": "2026-05-10T12:00:00.123Z", "level": "error", "service": "web",
"host": "web-1", "source": "stdout",
"message": "upstream timeout after 30s",
"attrs": { "request_id": "abc" } }
],
"nextCursor": "1715342400123:web",
"tookMs": 47
}
# Paginate by passing nextCursor as ?cursor=...
For anything beyond a single-string substring, pass ?query=...
instead of (or alongside) q. KQL-lite is a small grammar an
agent can emit programmatically: field:value, AND
/ OR / NOT, parentheses for grouping, wildcards,
and quoted values for terms with spaces. Whitelisted fields:
service, level, host,
source, message.
# All 5xx errors in the checkout service over the last 24h
?query=level:error AND service:checkout&from=2026-05-24T00:00:00Z
# Error OR warn in any "api"-prefixed service, EXCLUDING the health probe
?query=(level:error OR level:warn) AND service:api* NOT message:"health check"
# Bare terms = substring on message (same as ?q=...)
?query=ECONNREFUSED
# Quote values with spaces, escape with \\ inside a CLI
?query=message:"connection refused"
# Wildcards on field values (LIKE on the column)
?query=host:web-*
# Combine with ?service= / ?level= / ?q= — they all AND together
Parse errors return 400 KQL_PARSE_ERROR with a message
indicating the position of the bad token. Unknown fields are rejected at
parse time (you can't query org_id or attrs.*
— those are intentionally outside the language).
Default time window: 1h back from to (or now) when from is
omitted. Hard 5s query timeout. Cursor-paginated to skip the cost of OFFSET.
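KQL-lite expressions contain spaces, colons, quotes, and parentheses, so they need percent-encoding before they go into a URL. A minimal sketch of building a search URL with the standard library follows. `build_search_url` is a hypothetical helper, and it assumes the API accepts standard percent-encoded query strings.

```python
from typing import Optional
from urllib.parse import quote, urlencode

SEARCH_URL = "https://api.24observe.com/api/v1/logs/search"


def build_search_url(query: str, extra: Optional[dict] = None) -> str:
    """Percent-encode a KQL-lite expression plus any extra filters."""
    params = {"query": query, **(extra or {})}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return f"{SEARCH_URL}?{urlencode(params, quote_via=quote)}"


if __name__ == "__main__":
    print(build_search_url(
        '(level:error OR level:warn) AND service:api* NOT message:"health check"',
        {"from": "2026-05-24T00:00:00Z", "limit": "200"},
    ))
```

Extra filters go in a dict rather than keyword arguments because `from` is a reserved word in Python.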
Stack-trace fingerprinting on log events. Stash a million
occurrences of TypeError: x is undefined as one row
with a count, first-seen, last-seen, and a sample message —
Sentry-lite, bundled, no extra meter.
The scheduler scans the last 6 min of logs every 5 min, extracts
a stable signature from each stack trace (normalized header + 1-3
frames, FNV-1a 64-bit hashed), and upserts one row per
(org, signature). Languages auto-detected: JS / Node,
Python tracebacks, Java / Kotlin / JVM, Go panic, and generic
ERROR / FATAL prefixes.
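To make the fingerprinting idea concrete, here is a sketch of the two steps named above: reduce a stack trace to a stable signature, then hash it with FNV-1a 64-bit. The hash uses the standard published FNV constants. The normalization in `js_signature` is a hypothetical simplification for JS / Node traces only (it drops directories and line numbers), not the service's actual extractor.

```python
import re

FNV64_OFFSET = 0xCBF29CE484222325
FNV64_PRIME = 0x100000001B3


def fnv1a_64(data: bytes) -> int:
    """Standard FNV-1a: XOR each byte in, then multiply, modulo 2**64."""
    h = FNV64_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV64_PRIME) & 0xFFFFFFFFFFFFFFFF
    return h


def js_signature(stack: str, max_frames: int = 3) -> str:
    """Build 'ErrorType | frame | frame' from a JS/Node stack (simplified)."""
    lines = [ln.strip() for ln in stack.strip().splitlines() if ln.strip()]
    error_type = lines[0].split(":", 1)[0]
    frames = []
    for ln in lines[1:]:
        # Keep the function name and file name; drop directories and line:col.
        m = re.match(r"at (\S+) \((?:.*/)?([^/:()]+)(?::\d+)*\)", ln)
        if m:
            frames.append(f"at {m.group(1)} ({m.group(2)})")
        if len(frames) == max_frames:
            break
    return " | ".join([error_type, *frames])


if __name__ == "__main__":
    trace = """TypeError: x is undefined
        at processUser (/app/src/user.js:10:5)
        at handle (/app/server.js:42:1)"""
    sig = js_signature(trace)
    print(sig, hex(fnv1a_64(sig.encode("utf-8"))))
```

Because line numbers and directories are stripped before hashing, the same error raised from a redeployed build with shifted line numbers maps to the same signature in this sketch.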
# List open errors
curl 'https://api.24observe.com/api/v1/log-errors?status=open&limit=20' \
-H 'Authorization: Bearer obs_...'
# Response
[
{
"id": 7,
"errorType": "TypeError",
"signature": "TypeError | at processUser (user.js) | at handle (server.js)",
"service": "api",
"sampleMessage": "TypeError: Cannot read properties of undefined ...",
"totalCount": 14821,
"firstSeen": "2026-05-27T08:14:00.000Z",
"lastSeen": "2026-05-28T13:55:12.000Z",
"resolvedAt": null, "resolvedByUserId": null, "ignored": false
},
...
]
# Mark resolved
curl -X PATCH https://api.24observe.com/api/v1/log-errors/7 \
-H 'Authorization: Bearer obs_...' -H 'Content-Type: application/json' \
-d '{"resolved": true}'
# Mark known noise (hides from default Open list)
curl -X PATCH https://api.24observe.com/api/v1/log-errors/7 \
-H 'Authorization: Bearer obs_...' -H 'Content-Type: application/json' \
-d '{"ignored": true}'
Query parameters: ?status=open|resolved|ignored|all, ?service=..., ?limit=... (max 200).
Save any log query as a chartable, alertable time-series. The metric
is just count(matching events per bucket) — computed on
demand against the same store the search endpoint queries, so it
can't lag behind real log data.
# Create a metric
curl -X POST https://api.24observe.com/api/v1/log-metrics \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/json' \
-d '{
"name": "5xx in checkout",
"query": "5xx",
"service": "checkout",
"level": "error",
"bucketSec": 300
}'
# Fetch the series (last 6h by default, zero-filled buckets)
curl 'https://api.24observe.com/api/v1/log-metrics/42/series?from=2026-05-28T00:00:00Z' \
-H 'Authorization: Bearer obs_...'
# Response
{
"metricId": 42,
"bucketSec": 300,
"points": [
{ "ts": "2026-05-28T00:00:00.000Z", "value": 12 },
{ "ts": "2026-05-28T00:05:00.000Z", "value": 8 },
...
],
"tookMs": 41
}
The series is computed on demand on every /series call.
Empty buckets are returned with value: 0 rather than being absent — charts stay flat instead of jagged on quiet windows.
Default window: 6h back from to when from is omitted.
/series is rate-limited at 60/min/IP with a 5s hard timeout. The dashboard auto-refreshes each chart every 60s.
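The zero-filled bucketing behind a series can be sketched in a few lines, which is useful if you post-process search results into your own charts. `zero_filled_series` is a hypothetical function, and the alignment of buckets to the `from` timestamp is an assumption.

```python
from datetime import datetime, timedelta, timezone


def zero_filled_series(event_times, start, end, bucket_sec):
    """Count events per bucket over [start, end); empty buckets get value 0."""
    n_buckets = int((end - start).total_seconds() // bucket_sec)
    counts = [0] * n_buckets
    for ts in event_times:
        idx = int((ts - start).total_seconds() // bucket_sec)
        if 0 <= idx < n_buckets:  # ignore events outside the window
            counts[idx] += 1
    return [
        {"ts": (start + timedelta(seconds=i * bucket_sec)).isoformat(), "value": c}
        for i, c in enumerate(counts)
    ]


if __name__ == "__main__":
    t0 = datetime(2026, 5, 28, tzinfo=timezone.utc)
    hits = [t0 + timedelta(seconds=s) for s in (10, 20, 650)]
    for point in zero_filled_series(hits, t0, t0 + timedelta(minutes=15), 300):
        print(point)
```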
Group log messages by template, not exact string. The dashboard's
Patterns tab (or GET /api/v1/logs/patterns) returns the
top message templates by count after normalizing high-cardinality
tokens — so 10,000 lines that differ only in a timestamp and a UUID
collapse to one row.
curl 'https://api.24observe.com/api/v1/logs/patterns?service=checkout&limit=20' \
-H 'Authorization: Bearer obs_...'
# Response (top-K templates by count, descending)
{
"patterns": [
{
"templateHash": "14782...",
"template": "ERROR DB query took <NUM>ms for user_id=<NUM>",
"sample": "ERROR DB query took 2451ms for user_id=8821",
"service": "checkout",
"level": "error",
"count": 11423,
"firstSeen": "2026-05-27 12:00:00.000",
"lastSeen": "2026-05-28 11:59:31.420"
},
...
],
"tookMs": 1830,
"scannedRows": 14210
}
Normalized tokens:
<TS> — ISO-8601 timestamps
<UUID> — RFC 4122 UUIDs
<IP> — IPv4 addresses
<HEX> — hex blobs (≥ 8 chars, common for ids / signatures)
<STR> — single- or double-quoted strings
<NUM> — plain integers (applied last so it doesn't eat parts of the above)
Same filter ergonomics as /search:
?service=..., ?level=..., ?q=...
(pre-normalization substring match), ?from=... /
?to=.... Default window 1h back. Hard 8s server timeout;
rate limit 30/min/IP (heavier per-call than search).
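Template normalization of this kind is typically an ordered chain of regex substitutions. The sketch below applies the six documented tokens in the documented order. The regular expressions themselves are assumptions (the service's exact patterns are not published here), and `to_template` is a hypothetical name.

```python
import re

# Order matters: <NUM> runs last so it cannot consume digits that belong
# to a timestamp, UUID, IP address, or hex blob.
RULES = [
    ("<TS>", re.compile(
        r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?")),
    ("<UUID>", re.compile(
        r"\b[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}\b")),
    ("<IP>", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
    ("<HEX>", re.compile(r"\b[0-9a-fA-F]{8,}\b")),
    ("<STR>", re.compile(r"\"[^\"]*\"|'[^']*'")),
    ("<NUM>", re.compile(r"\d+")),
]


def to_template(message: str) -> str:
    """Replace high-cardinality tokens so similar lines share one template."""
    for token, pattern in RULES:
        message = pattern.sub(token, message)
    return message


if __name__ == "__main__":
    print(to_template("ERROR DB query took 2451ms for user_id=8821"))
```

Grouping by `to_template(message)` with a counter then yields the per-template counts.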
Every search response now includes a facets object —
top-K key/value/count rollups over the visible result page. Top-level
columns (level, service, host,
source) plus any primitive-typed key in each row's
attrs blob become a facet. No schema declaration, no
index config — ship JSON, the keys auto-surface.
# Response shape (truncated)
{
"events": [...],
"nextCursor": "...",
"tookMs": 47,
"facets": {
"level": [{"value": "info", "count": 142}, {"value": "error", "count": 11}],
"service": [{"value": "checkout", "count": 89}, {"value": "api", "count": 64}],
"user_id": [{"value": "u_a8f", "count": 23}, {"value": "u_b14", "count": 9}, ...],
"method": [{"value": "POST", "count": 88}, {"value": "GET", "count": 65}],
"status": [{"value": "200", "count": 134}, {"value": "500", "count": 11}]
}
}
Rules: top 20 keys (ranked by total occurrences), top 10 values per
key, values capped at 256 chars. Fields where every row has the same
value are suppressed (nothing to facet). Nested objects and arrays
in attrs are skipped — only string / number / boolean
primitives become chips. High-entropy IDs (trace_id,
request_id) are block-listed so they don't pollute the
panel.
Click a chip in the dashboard sidebar to filter by that
key:value; the clause is appended to your KQL
query with AND. Combine multiple chips to drill in.
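The facet rules described in this section can be approximated with two nested counters. This sketch is an illustration of those rules, not the server's implementation. `compute_facets` is a hypothetical name, the block-list contains only the two IDs named above, and a key is treated as "nothing to facet" whenever it has a single distinct value on the page.

```python
from collections import Counter, defaultdict

TOP_LEVEL = ("level", "service", "host", "source")
BLOCKLIST = {"trace_id", "request_id"}  # high-entropy IDs named in the text


def compute_facets(rows, max_keys=20, max_values=10, max_len=256):
    """Top-K key/value/count rollups over one page of search results."""
    counts = defaultdict(Counter)
    for row in rows:
        fields = dict(row.get("attrs") or {})
        fields.update({k: row[k] for k in TOP_LEVEL if k in row})
        for key, value in fields.items():
            if key in BLOCKLIST:
                continue
            if isinstance(value, bool):
                value = "true" if value else "false"
            elif not isinstance(value, (str, int, float)):
                continue  # nested objects, arrays, and nulls are skipped
            counts[key][str(value)[:max_len]] += 1
    # Assumption: a key with one distinct value carries nothing to facet on.
    useful = {k: c for k, c in counts.items() if len(c) > 1}
    ranked = sorted(useful, key=lambda k: sum(useful[k].values()), reverse=True)
    return {
        k: [{"value": v, "count": n} for v, n in useful[k].most_common(max_values)]
        for k in ranked[:max_keys]
    }
```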
# curl
curl -N 'https://api.24observe.com/api/v1/logs/tail?service=web&level=error' \
-H 'Authorization: Bearer obs_...'
# JS / browser EventSource (no Authorization header support; use ?token=)
const es = new EventSource(`https://api.24observe.com/api/v1/logs/tail?token=${PAT}&service=web`);
es.addEventListener('log', (e) => console.log(JSON.parse(e.data)));
es.addEventListener('timeout', () => es.close()); // 30-min server cap
Real-time: events are pushed the moment they're accepted by
/ingest — typically < 100 ms end-to-end. On connect,
the last ~5 s of events are replayed from the event store so you see
recent context immediately. A : hb SSE comment is sent
every 15 s so intermediaries don't idle-close.
Hard 30-minute connection cap; clients reconnect.
Tail filter (q / service / level)
is applied subscriber-side using the same matching as /search.
Requires logs:read scope.
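Outside the browser there is no EventSource, so a tail client has to parse the SSE framing itself. The sketch below handles the pieces mentioned above: named `log` and `timeout` events, and `:` comment lines such as the heartbeat. It assumes each `log` event carries one JSON object in its data field and that the `timeout` event carries a non-empty data field. `parse_sse` and `log_events` are hypothetical names.

```python
import json


def parse_sse(lines):
    """Yield (event, data) pairs from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are dropped.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, e.g. the ": hb" heartbeat
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            data.append(value[1:] if value.startswith(" ") else value)


def log_events(lines):
    """Yield decoded log events until the server signals the 30-minute cap."""
    for event, data in parse_sse(lines):
        if event == "timeout":
            return  # caller should reconnect
        if event == "log":
            yield json.loads(data)
```

With httpx, the input can be `response.iter_lines()` from a streaming GET on /logs/tail. Wrapping `log_events` in a loop that reconnects after it returns covers the 30-minute connection cap.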
Define a rule: "open an incident when N events match (q + service + level) within W seconds." Same alert routing as monitor alerts (email, webhook, Slack, Discord, MS Teams, Telegram). Rules are evaluated on a rolling 1-minute window.
curl -X POST https://api.24observe.com/api/v1/log-alerts \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/json' \
-d '{
"name": "5xx surge in web service",
"query": "5xx",
"service": "web",
"level": "error",
"threshold": 10,
"windowSec": 300,
"alertSlackUrl": "https://hooks.slack.com/services/..."
}'
Set kind: "anomaly" to alert when the rate is N× higher
than a rolling baseline, instead of N events flat. Use when "normal"
varies by service / time-of-day and no single threshold fits — common
for "error rate suddenly spiked" rules where the magic number changes
as traffic grows.
curl -X POST https://api.24observe.com/api/v1/log-alerts \
-H 'Authorization: Bearer obs_...' \
-H 'Content-Type: application/json' \
-d '{
"name": "Checkout error spike",
"kind": "anomaly",
"service": "checkout",
"level": "error",
"windowSec": 300,
"ratioThreshold": 3.0,
"baselineHours": 168,
"minBaselineEvents": 10,
"alertSlackUrl": "https://hooks.slack.com/services/..."
}'
ratioThreshold — fire when current count ≥ N × baseline avg. Default 3.0.
baselineHours — rolling lookback for the baseline. Default 168 (7 days — smooths weekday/weekend traffic differences).
minBaselineEvents — skip evaluation when baseline avg per bucket is below this floor. Prevents "3 vs 1 events" tripping a 3× rule on a tiny baseline. Default 10.
Fires when count(now-window) / avg(count per same-sized bucket over baselineHours, excluding the current window) ≥ ratioThreshold.
log_alert_kind: "anomaly": anomaly_ratio, ratio_threshold, baseline_hours.
LOG_ALERT_KIND_MISMATCH — 400 response when create/update mixes shapes: anomaly-only fields (ratioThreshold / baselineHours / minBaselineEvents) sent with kind: "threshold", or vice versa. Fix: pick a kind and send only its required fields.
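The anomaly rule comes down to one ratio and one guard. The sketch below evaluates it for a list of per-bucket baseline counts, which can help when choosing `ratioThreshold` against historical data. `should_fire` is a hypothetical name, and the handling of an empty baseline (never fire) is an assumption.

```python
def should_fire(current_count, baseline_counts,
                ratio_threshold=3.0, min_baseline_events=10):
    """baseline_counts: counts per same-sized bucket over baselineHours,
    excluding the current window."""
    if not baseline_counts:
        return False  # assumption: no history means no anomaly verdict
    baseline_avg = sum(baseline_counts) / len(baseline_counts)
    if baseline_avg < min_baseline_events:
        return False  # baseline too small to trust a ratio
    return current_count / baseline_avg >= ratio_threshold


if __name__ == "__main__":
    # Baseline averages 15 events per bucket; 45 now is exactly 3x.
    print(should_fire(45, [12, 15, 18]))
```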