Working with Derived Columns¶
Derived columns (also called calculated fields) are virtual columns computed from expressions at query time. They allow you to create new fields from existing data without modifying the underlying events.
Basic Operations¶
List Derived Columns¶
async def list_derived_columns(client: HoneycombClient, dataset: str) -> list[DerivedColumn]:
"""List all derived columns in a dataset.
Args:
client: Authenticated HoneycombClient
dataset: Dataset slug to list columns from
Returns:
List of derived columns
"""
columns = await client.derived_columns.list_async(dataset)
for col in columns:
print(f"{col.alias}: {col.expression}")
return columns
def list_derived_columns_sync(client: HoneycombClient, dataset: str) -> list[DerivedColumn]:
"""List derived columns using sync client.
Args:
client: Authenticated HoneycombClient
dataset: Dataset slug to list columns from
Returns:
List of derived columns
"""
columns = client.derived_columns.list(dataset)
for col in columns:
print(f"{col.alias}: {col.expression}")
return columns
Get a Specific Derived Column¶
dc = await client.derived_columns.get_async("dataset-slug", "column-id")
print(f"Alias: {dc.alias}")
print(f"Expression: {dc.expression}")
Delete a Derived Column¶
Creating Derived Columns¶
Simple Expression with DerivedColumnBuilder¶
async def create_simple_derived_column(client: HoneycombClient, dataset: str) -> str:
"""Create a simple derived column using DerivedColumnBuilder.
Args:
client: Authenticated HoneycombClient
dataset: Dataset slug to create column in
Returns:
The created derived column ID
"""
dc = (
DerivedColumnBuilder("has_trace")
.expression("EXISTS($trace.trace_id)")
.description("True if trace ID exists")
.build()
)
created = await client.derived_columns.create_async(dataset, dc)
return created.id
IF Expression¶
async def create_if_expression_column(client: HoneycombClient, dataset: str) -> str:
"""Create a derived column with IF expression.
Args:
client: Authenticated HoneycombClient
dataset: Dataset slug to create column in
Returns:
The created derived column ID
"""
import time
# Use timestamp to ensure unique column names across test runs
dc = (
DerivedColumnBuilder(f"request_success_{int(time.time())}")
.expression("IF(LT($status_code, 400), 1, 0)")
.description("1 if request succeeded, 0 otherwise")
.build()
)
created = await client.derived_columns.create_async(dataset, dc)
return created.id
Manual Construction¶
async def create_derived_column_manual(client: HoneycombClient, dataset: str) -> str:
"""Create a derived column using manual construction.
Args:
client: Authenticated HoneycombClient
dataset: Dataset slug to create column in
Returns:
The created derived column ID
"""
dc = DerivedColumnCreate(
alias="has_span",
expression="EXISTS($trace.span_id)",
description="True if span ID exists",
)
created = await client.derived_columns.create_async(dataset, dc)
return created.id
Environment-Wide Derived Columns¶
Environment-wide derived columns are available across all datasets in your environment. Use "__all__" as the dataset slug.
Create Environment-Wide Column¶
async def create_environment_wide_column(client: HoneycombClient) -> str:
"""Create an environment-wide derived column.
Environment-wide columns are available in all datasets.
Use "__all__" as the dataset slug.
Returns:
The created derived column ID
"""
dc = (
DerivedColumnBuilder("global_has_trace")
.expression("EXISTS($trace.trace_id)")
.description("Environment-wide trace indicator")
.build()
)
# Use "__all__" for environment-wide columns
created = await client.derived_columns.create_async("__all__", dc)
return created.id
List Environment-Wide Columns¶
async def list_environment_wide_columns(client: HoneycombClient) -> list:
"""List all environment-wide derived columns.
Returns:
List of environment-wide derived columns
"""
columns = await client.derived_columns.list_async("__all__")
for col in columns:
print(f"{col.alias}: {col.expression}")
return columns
Expression Syntax Fundamentals¶
Understanding the syntax rules is essential for writing valid expressions.
Column References¶
- Basic: Prefix column names with
$- e.g.,$status_code,$duration_ms - With spaces/special chars: Use quotes - e.g.,
$"http.status_code",$"user name" - Case sensitive:
$Statusand$statusare different columns
Literals and Quoting¶
| Type | Syntax | Example |
|---|---|---|
| Strings | Double quotes "..." |
"error", "GET" |
| Regex patterns | Backticks `...` |
`^/api/v[0-9]+` |
| Numbers | Plain | 200, 1000.5 |
| Booleans | true, false |
Used in comparisons |
Key Concepts¶
Single-Row Operation
Derived columns operate on one event at a time. They are NOT aggregation functions. You cannot reference other events or compute across multiple rows.
SLI Expressions for SLOs
SLI (Service Level Indicator) expressions must return a boolean (true/false).
Values 1/0 are automatically coerced to true/false.
Example: LT($status_code, 500) or IF(LT($status_code, 500), true, false)
Expression Functions Reference¶
Conditional Functions¶
| Function | Description | Example |
|---|---|---|
IF(cond, then, else) |
Conditional expression | IF(LT($status, 400), "ok", "error") |
SWITCH($field, case1, val1, ..., default) |
Match against string cases | SWITCH($method, "GET", 1, "POST", 2, 0) |
COALESCE(a, b, ...) |
First non-null value | COALESCE($user_id, $session_id, "anonymous") |
Comparison Functions¶
All comparison functions return boolean values.
| Function | Description | Example |
|---|---|---|
LT($a, $b) |
Less than | LT($status_code, 400) |
LTE($a, $b) |
Less than or equal | LTE($duration_ms, 100) |
GT($a, $b) |
Greater than | GT($duration_ms, 1000) |
GTE($a, $b) |
Greater than or equal | GTE($error_count, 1) |
EQUALS($a, $b) |
Equality check | EQUALS($method, "GET") |
IN($field, val1, val2, ...) |
Value in set | IN($status, 200, 201, 204) |
Boolean Functions¶
| Function | Description | Example |
|---|---|---|
EXISTS($field) |
True if field is non-null | EXISTS($trace.parent_id) |
NOT(cond) |
Logical negation | NOT(EXISTS($error)) |
AND(a, b, ...) |
All conditions true | AND(GT($status, 199), LT($status, 300)) |
OR(a, b, ...) |
Any condition true | OR(EQUALS($level, "error"), EQUALS($level, "fatal")) |
Math Functions¶
| Function | Description | Example |
|---|---|---|
SUM(a, b, ...) |
Add values | SUM($request_time, $queue_time) |
SUB(a, b) |
Subtract | SUB($end_time, $start_time) |
MUL(a, b) |
Multiply | MUL($duration_ms, 0.001) |
DIV(a, b) |
Divide | DIV($bytes, 1024) |
MOD(a, b) |
Modulo | MOD($request_id, 100) |
MIN(a, b, ...) |
Minimum value | MIN($timeout, $max_wait) |
MAX(a, b, ...) |
Maximum value | MAX($retries, 0) |
LOG10(a) |
Base-10 logarithm | LOG10($request_count) |
String Functions¶
| Function | Description | Example |
|---|---|---|
CONCAT(a, b, ...) |
Join strings | CONCAT($method, " ", $path) |
STARTS_WITH($str, prefix) |
Check prefix (returns bool) | STARTS_WITH($path, "/api/") |
ENDS_WITH($str, suffix) |
Check suffix (returns bool) | ENDS_WITH($filename, ".json") |
CONTAINS($str, substr) |
Check contains (returns bool) | CONTAINS($error, "timeout") |
TO_LOWER($str) |
Convert to lowercase | TO_LOWER($method) |
LENGTH($str) |
String length | LENGTH($message) |
Regular Expression Functions¶
Use backticks for regex patterns
Regex patterns use backticks, not quotes: `pattern` not "pattern"
| Function | Description | Example |
|---|---|---|
REG_MATCH($str, pattern) |
Test if pattern matches (bool) | REG_MATCH($path, `/api/v[0-9]+`) |
REG_VALUE($str, pattern) |
Extract first capture group | REG_VALUE($path, `/api/([^/]+)`) |
REG_COUNT($str, pattern) |
Count pattern matches | REG_COUNT($log, `error`) |
Time Functions¶
| Function | Description | Example |
|---|---|---|
EVENT_TIMESTAMP() |
Event time as Unix timestamp | EVENT_TIMESTAMP() |
UNIX_TIMESTAMP($field) |
Parse field as timestamp | UNIX_TIMESTAMP($created_at) |
FORMAT_TIME(format, $ts) |
Format with strftime | FORMAT_TIME("%Y-%m-%d", EVENT_TIMESTAMP()) |
Data Transformation¶
| Function | Description | Example |
|---|---|---|
BUCKET($val, size) |
Numeric bucketing | BUCKET($duration_ms, 100) |
Type Conversion¶
| Function | Description | Example |
|---|---|---|
INT($val) |
Convert to integer | INT($string_id) |
FLOAT($val) |
Convert to float | FLOAT($count) |
BOOL($val) |
Convert to boolean | BOOL($enabled) |
STRING($val) |
Convert to string | STRING($status_code) |
DerivedColumnBuilder Methods¶
| Method | Description |
|---|---|
DerivedColumnBuilder(alias) |
Create builder with column alias |
.expression(expr) |
Set the expression |
.description(desc) |
Set optional description |
.build() |
Build the DerivedColumnCreate object |
Updating Derived Columns¶
from honeycomb import DerivedColumnCreate
updated = await client.derived_columns.update_async(
"dataset-slug",
"column-id",
DerivedColumnCreate(
alias="updated_column_name",
expression="IF(GT($duration_ms, 500), 1, 0)",
description="Updated expression"
)
)
Use Cases¶
SLI for SLOs¶
Derived columns are commonly used to define Service Level Indicators.
SLI Must Return Boolean
SLI expressions for SLOs must evaluate to true (good) or false (bad) for each event. Values 1/0 are automatically coerced to true/false.
# Define success/failure for SLO - comparison functions return boolean directly
sli_column = (
DerivedColumnBuilder("request_success")
.expression("LT($status_code, 500)")
.description("true for successful requests, false for server errors")
.build()
)
await client.derived_columns.create_async("api-logs", sli_column)
Latency Bucketing¶
# Categorize latency into buckets
latency_bucket = (
DerivedColumnBuilder("latency_bucket")
.expression(
'IF(LT($duration_ms, 100), "fast", '
'IF(LT($duration_ms, 500), "normal", "slow"))'
)
.description("Categorize requests by latency")
.build()
)
await client.derived_columns.create_async("api-logs", latency_bucket)
Best Practices¶
- Use descriptive aliases: Make column names self-explanatory
- Document expressions: Add descriptions explaining what the column calculates
- Test expressions: Verify expressions work before using in production SLOs
- Environment-wide for shared logic: Use environment-wide columns for cross-dataset consistency
- Keep expressions simple: Complex expressions are harder to debug
- Use correct quoting: Strings use
"...", regex patterns use`...` - Remember case sensitivity: Column names are case-sensitive (
$Status!=$status)