Working with SLOs¶
Service Level Objectives (SLOs) help you track reliability targets for your services. An SLO defines a target success rate over a time period.
Basic SLO Operations¶
List SLOs¶
```python
async def list_slos(client: HoneycombClient, dataset: str) -> list[SLO]:
    """List all SLOs in a dataset.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug to list SLOs from

    Returns:
        List of SLOs
    """
    slos = await client.slos.list_async(dataset)
    for slo in slos:
        target_pct = slo.target_per_million / 10000
        print(f"{slo.name}: {target_pct}% over {slo.time_period_days} days")
    return slos
```
Get a Specific SLO¶
```python
async def get_slo(client: HoneycombClient, dataset: str, slo_id: str) -> SLO:
    """Get a specific SLO by ID.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug containing the SLO
        slo_id: ID of the SLO to retrieve

    Returns:
        The SLO object
    """
    slo = await client.slos.get_async(dataset, slo_id)
    print(f"Name: {slo.name}")
    print(f"Target: {slo.target_per_million / 10000}%")
    print(f"Time Period: {slo.time_period_days} days")
    return slo
```
Update an SLO¶
```python
async def update_slo(
    client: HoneycombClient, dataset: str, slo_id: str, sli_alias: str
) -> SLO:
    """Update an existing SLO.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug containing the SLO
        slo_id: ID of the SLO to update
        sli_alias: SLI alias to use

    Returns:
        The updated SLO
    """
    # Get existing SLO first
    existing = await client.slos.get_async(dataset, slo_id)

    # Update with new values
    updated = await client.slos.update_async(
        dataset,
        slo_id,
        SLOCreate(
            name="Updated API Availability",
            description=existing.description,
            sli=sli_alias,
            time_period_days=existing.time_period_days,
            target_per_million=999900,  # Increase to 99.99%
        ),
    )
    return updated
```
Delete an SLO¶
```python
async def delete_slo(client: HoneycombClient, dataset: str, slo_id: str) -> None:
    """Delete an SLO.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug containing the SLO
        slo_id: ID of the SLO to delete
    """
    await client.slos.delete_async(dataset, slo_id)
```
Creating SLOs with SLOBuilder¶
SLOBuilder provides a fluent interface for creating SLOs with integrated burn alerts and automatic derived column management. It handles the complete orchestration of creating an SLO with all its dependencies.
Simple Example - Using Existing Derived Column¶
```python
async def create_simple_slo(client: HoneycombClient, dataset: str, sli_alias: str) -> str:
    """Create an SLO using SLOBuilder with an existing derived column.

    This example creates a simple SLO using an existing derived column
    without any burn alerts.
    """
    bundle = (
        SLOBuilder("API Availability")
        .description("Track API request success rate")
        .dataset(dataset)
        .target_percentage(99.9)
        .time_period_days(30)
        .sli(alias=sli_alias)  # Use existing derived column
        .build()
    )
    slos = await client.slos.create_from_bundle_async(bundle)
    return slos[dataset].id
```
Moderate Complexity - Creating New Derived Column¶
When you need to create both a derived column and an SLO together:
```python
async def create_slo_with_new_column(client: HoneycombClient, dataset: str) -> str:
    """Create an SLO with a new derived column using SLOBuilder.

    This example creates both a derived column and an SLO in one operation.
    The builder handles creating the derived column first, then the SLO.
    """
    import time

    # Use timestamp to ensure unique column names across test runs
    bundle = (
        SLOBuilder("Request Success Rate")
        .description("Percentage of successful requests")
        .dataset(dataset)
        .target_percentage(99.5)
        .time_period_weeks(4)
        .sli(
            alias=f"request_success_{int(time.time())}",
            expression="IF(LT($status_code, 400), 1, 0)",
            description="1 if request succeeded, 0 otherwise",
        )
        .build()
    )
    slos = await client.slos.create_from_bundle_async(bundle)
    return slos[dataset].id
```
High Complexity - SLO with Burn Alerts¶
Create an SLO with both exhaustion time and budget rate burn alerts:
```python
async def create_slo_with_burn_alerts(
    client: HoneycombClient, dataset: str, sli_alias: str, recipient_id: str
) -> str:
    """Create an SLO with burn alerts using SLOBuilder.

    This example creates an SLO with two burn alerts:

    1. Exhaustion time alert - triggers when budget will be exhausted soon
    2. Budget rate alert - triggers when burn rate exceeds threshold

    The builder handles creating the SLO and all burn alerts in one operation.
    """
    bundle = (
        SLOBuilder("Critical API SLO")
        .description("High-priority API availability tracking")
        .dataset(dataset)
        .target_percentage(99.99)
        .time_period_days(30)
        .sli(alias=sli_alias)
        # Add exhaustion time alert with existing recipient
        .exhaustion_alert(
            BurnAlertBuilder(BurnAlertType.EXHAUSTION_TIME)
            .exhaustion_minutes(60)
            .description("Alert when budget exhausts in 1 hour")
            .recipient_id(recipient_id)
        )
        # Add budget rate alert with existing recipient
        .budget_rate_alert(
            BurnAlertBuilder(BurnAlertType.BUDGET_RATE)
            .window_minutes(60)
            .threshold_percent(2.0)
            .description("Alert when burn rate exceeds 2% per hour")
            .recipient_id(recipient_id)
        )
        .build()
    )
    slos = await client.slos.create_from_bundle_async(bundle)
    return slos[dataset].id
```
SLOs with Tags¶
Organize and categorize SLOs using tags for team, service, environment, or criticality:
```python
async def create_slo_with_tags(client: HoneycombClient, dataset: str, sli_alias: str) -> str:
    """Create an SLO with tags for organization.

    Tags help categorize and filter SLOs by team, service, environment,
    criticality, or any other dimension. Useful for large deployments
    with many SLOs.
    """
    bundle = (
        SLOBuilder("API Availability")
        .description("Track API request success rate")
        .dataset(dataset)
        .target_percentage(99.9)
        .sli(alias=sli_alias)
        .tag("team", "platform")
        .tag("service", "api")
        .tag("environment", "production")
        .tag("criticality", "high")
        .build()
    )
    slos = await client.slos.create_from_bundle_async(bundle)
    return slos[dataset].id
```
Multi-Dataset SLOs¶
Create an SLO across multiple datasets with an environment-wide derived column:
```python
async def create_multi_dataset_slo(
    client: HoneycombClient, datasets: list[str]
) -> dict[str, str]:
    """Create an SLO across multiple datasets using SLOBuilder.

    Multi-dataset SLOs create a SINGLE SLO that spans multiple datasets.
    The builder automatically:

    1. Creates environment-wide derived column (if using inline expression)
    2. Creates ONE SLO via the __all__ endpoint with dataset_slugs
    3. Creates burn alerts in the first dataset

    Returns dict mapping each dataset to the same SLO ID.
    """
    bundle = (
        SLOBuilder("Cross-Service Availability")
        .description("Overall service availability across all APIs")
        .datasets(datasets)  # Multiple datasets
        .target_percentage(99.9)
        .time_period_days(30)
        .sli(
            alias="service_success",
            expression="IF(EQUALS($status, 200), 1, 0)",
            description="1 for success, 0 for failure",
        )
        .tag("team", "platform")
        .tag("criticality", "high")
        .budget_rate_alert(
            BurnAlertBuilder(BurnAlertType.BUDGET_RATE)
            .window_minutes(60)
            .threshold_percent(1.0)
            .email("platform@example.com")
        )
        .build()
    )
    slos = await client.slos.create_from_bundle_async(bundle)
    # Return SLO IDs for each dataset (all point to the same SLO)
    return {dataset: slo.id for dataset, slo in slos.items()}
```
SLOBuilder Reference¶
Target Configuration Methods¶
| Method | Description |
|---|---|
| `.target_percentage(percent)` | Set target as a percentage (e.g., `99.9` for 99.9%) |
| `.target_per_million(value)` | Set target directly as a per-million value (e.g., `999000` for 99.9%) |
Time Period Methods¶
| Method | Description |
|---|---|
| `.time_period_days(days)` | Set time period in days (1-90) |
| `.time_period_weeks(weeks)` | Set time period in weeks |
SLI Definition Methods¶
| Method | Description |
|---|---|
| `.sli(alias)` | Use an existing derived column |
| `.sli(alias, expression, description)` | Create a new derived column |
Dataset Scoping¶
| Method | Description |
|---|---|
| `.dataset(slug)` | Scope SLO to a single dataset |
| `.datasets([slug1, slug2])` | Scope SLO to multiple datasets (creates one SLO via the `__all__` endpoint) |
Organization Methods¶
| Method | Description |
|---|---|
| `.description(desc)` | Set SLO description |
| `.tag(key, value)` | Add a tag key-value pair for organizing/filtering SLOs (max 10 tags) |
Burn Alert Methods¶
| Method | Description |
|---|---|
| `.exhaustion_alert(builder)` | Add an exhaustion time burn alert |
| `.budget_rate_alert(builder)` | Add a budget rate burn alert |
BurnAlertBuilder Reference¶
BurnAlertBuilder is used within SLOBuilder to configure burn alerts with recipients. It composes RecipientMixin for notification management.
Alert Type Configuration¶
| Alert Type | Required Methods | Description |
|---|---|---|
| `EXHAUSTION_TIME` | `.exhaustion_minutes(minutes)` | Alert when budget will be exhausted within timeframe |
| `BUDGET_RATE` | `.window_minutes(minutes)` + `.threshold_percent(percent)` | Alert when burn rate exceeds threshold |
Recipient Methods (from RecipientMixin)¶
See the Recipients documentation for full details on available recipient methods:

- `.email(address)` - Email notification
- `.slack(channel)` - Slack notification
- `.pagerduty(routing_key, severity)` - PagerDuty notification
- `.webhook(url, secret)` - Webhook notification
- `.msteams(workflow_url)` - MS Teams notification
- `.recipient_id(id)` - Reference an existing recipient by ID
Example: Exhaustion Time Alert¶
```python
from honeycomb import BurnAlertBuilder, BurnAlertType

alert = (
    BurnAlertBuilder(BurnAlertType.EXHAUSTION_TIME)
    .exhaustion_minutes(60)
    .description("Alert when budget exhausts in 1 hour")
    .recipient_id("recipient-id-123")  # Reference existing recipient
    .build()
)
```
Example: Budget Rate Alert¶
```python
from honeycomb import BurnAlertBuilder, BurnAlertType

alert = (
    BurnAlertBuilder(BurnAlertType.BUDGET_RATE)
    .window_minutes(60)
    .threshold_percent(2.0)
    .description("Alert when burn rate exceeds 2% per hour")
    .recipient_id("recipient-id-456")  # Reference existing recipient
    .build()
)
```
Note: For integration testing, use .recipient_id() to reference recipients configured in Honeycomb. Email, Slack, PagerDuty, and webhook recipients must be set up in Honeycomb first via the Recipients API or UI.
Creating SLOs Manually¶
For simple cases or when you need fine-grained control, you can create SLOs directly:
```python
async def create_basic_slo(
    client: HoneycombClient, dataset: str, sli_alias: str
) -> str:
    """Create a basic SLO with 99.9% target.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug to create SLO in
        sli_alias: Alias of the SLI derived column

    Returns:
        The created SLO ID
    """
    slo = await client.slos.create_async(
        dataset,
        SLOCreate(
            name="API Availability",
            description="99.9% availability target for API service",
            sli=sli_alias,
            time_period_days=30,
            target_per_million=999000,  # 99.9%
        ),
    )
    return slo.id
```
Create an SLO with Different Target Levels¶
```python
async def create_slo_with_targets(
    client: HoneycombClient, dataset: str, sli_alias: str
) -> str:
    """Create an SLO with different target levels.

    Args:
        client: Authenticated HoneycombClient
        dataset: Dataset slug to create SLO in
        sli_alias: Alias of the SLI derived column

    Returns:
        The created SLO ID

    Common target_per_million values:
        - 999000 = 99.9% (3 nines)
        - 999900 = 99.99% (4 nines)
        - 990000 = 99.0% (2 nines)
        - 950000 = 95.0%
    """
    slo = await client.slos.create_async(
        dataset,
        SLOCreate(
            name="API Request Success",
            description="High availability SLO",
            sli=sli_alias,
            time_period_days=7,  # 7-day rolling window
            target_per_million=995000,  # 99.5%
        ),
    )
    return slo.id
```
Understanding target_per_million¶
The `target_per_million` field represents your success rate as parts per million. Common values:

- `999000` = 99.9% (3 nines): ~43 minutes downtime/month
- `999900` = 99.99% (4 nines): ~4.3 minutes downtime/month
- `999990` = 99.999% (5 nines): ~26 seconds downtime/month
- `990000` = 99.0% (2 nines): ~7.2 hours downtime/month
- `950000` = 95.0%: ~36 hours downtime/month

To convert from a percentage, use `target_per_million = round(percentage * 10000)`. Prefer `round` over `int` here: float truncation makes `int(99.99 * 10000)` come out as `999899`.
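The conversion and the downtime figures above can be checked with a short, library-free sketch (the helper names here are illustrative, not part of the client library):

```python
def to_target_per_million(percentage: float) -> int:
    """Convert a percentage target (e.g. 99.9) to a per-million value."""
    # round() avoids float truncation: int(99.99 * 10000) would give 999899
    return round(percentage * 10000)


def downtime_minutes(percentage: float, period_days: int = 30) -> float:
    """Allowed downtime in minutes for a target over a rolling window."""
    return (1 - percentage / 100) * period_days * 24 * 60


print(to_target_per_million(99.9))        # 999000
print(to_target_per_million(99.99))       # 999900
print(round(downtime_minutes(99.9), 1))   # 43.2 minutes over 30 days
print(round(downtime_minutes(99.99), 1))  # 4.3 minutes over 30 days
```

The same arithmetic reproduces the other entries in the list, e.g. five nines leaves roughly 26 seconds of downtime per 30-day window.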
SLI Configuration¶
The Service Level Indicator (SLI) is typically configured as a derived column in the Honeycomb UI and referenced by alias.
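If you define the SLI inline instead (via `.sli(alias, expression, description)`), the expression is a derived column formula that should evaluate to 1 for a good event and 0 for a bad one. Two illustrative patterns, drawn from the examples above (the `$status_code` and `$duration_ms` column names are assumptions about your schema):

```
IF(LT($status_code, 400), 1, 0)
IF(LT($duration_ms, 500), 1, 0)
```

The first counts any non-error response as a success; the second scores requests against a 500 ms latency budget instead.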
Time Period Options¶
SLOs support rolling windows between 1 and 90 days:

- 7 days: Good for rapidly changing services
- 30 days: Good for most services (recommended)
- 90 days: Good for very stable, critical services
Sync Usage¶
All SLO operations have sync equivalents:
```python
with HoneycombClient(api_key="...", sync=True) as client:
    # List SLOs
    slos = client.slos.list("my-dataset")

    # Create SLO
    slo = client.slos.create("my-dataset", SLOCreate(...))

    # Get SLO
    slo = client.slos.get("my-dataset", slo_id)

    # Update SLO
    updated = client.slos.update("my-dataset", slo_id, SLOCreate(...))

    # Delete SLO
    client.slos.delete("my-dataset", slo_id)
```