
Service group

Definition

A service group is a clustered collection of workloads (pods, containers, processes, hosts) that share a common identity in observability data and are addressable as a single logical service. In the Kubernetes context, a service group typically corresponds to a Service + its backing Deployment/StatefulSet; in the non-K8s case, it corresponds to a coherent named service emitting metrics under a shared service label.

The service group is the granularity at which agent infrastructure memory is extracted and summarised — above individual pods (too narrow) and below entire namespaces or clusters (too broad).

Canonical framing (Grafana Assistant, 2026-05-01)

(Source: sources/2026-05-01-grafana-how-grafana-assistant-learns-your-infrastructure-before-you-even-ask)

"For each discovered service group, agents produce documentation covering five areas: what the service is, its key metrics and labels, how it's deployed, what it depends on, and how its logs are structured."

And:

"You can review what the assistant has learned by navigating to the Assistant settings and browsing the discovered service groups."

Why this granularity is load-bearing

The five-category service-knowledge schema (identity, metrics, topology, dependencies, log structure) is coherent at the service-group level but falls apart at adjacent granularities:

| Granularity | Assessment |
| --- | --- |
| Single pod | Too narrow — metrics, logs, and dependencies are all shared with peers; per-pod memory would be redundant and volatile (pods are ephemeral) |
| Deployment | Close, but a Deployment alone lacks the client-facing service name (the K8s Service) that users mention in queries |
| Namespace | Too broad — a namespace contains N services with N different dependency graphs, log formats, and metric schemas |
| Cluster | Far too broad — clusters contain dozens of namespaces |
| Service group | Matches how engineers think ("checkout-api"), matches metric label conventions (service=checkout-api), has a coherent lifecycle and well-defined dependencies |
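The five-category record the quoted passage describes can be sketched as a simple per-service-group data structure. Field names and example values here are illustrative assumptions — the actual Grafana Assistant schema is not disclosed:

```python
from dataclasses import dataclass, field

@dataclass
class ServiceGroupKnowledge:
    """Sketch of the five-category service-knowledge schema.

    Field names are assumptions; Grafana Assistant's real schema
    is not public.
    """
    name: str                                              # identity: what the service is
    key_metrics: list[str] = field(default_factory=list)   # key metrics and labels
    deployment: str = ""                                   # how it's deployed
    dependencies: list[str] = field(default_factory=list)  # what it depends on
    log_structure: str = ""                                # how its logs are structured

# Hypothetical example entry
checkout = ServiceGroupKnowledge(
    name="checkout-api",
    key_metrics=['http_requests_total{service="checkout-api"}'],
    deployment="K8s Deployment behind Service checkout-api",
    dependencies=["payments-api", "postgres"],
    log_structure="JSON lines with level/msg/trace_id fields",
)
```

Keyed this way, one record per service group is exactly the granularity argued for above: one name, one metric schema, one dependency list.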

How it's detected

The exact detection heuristic used by Grafana Assistant is not disclosed. Plausible signals:

  • Prometheus label clustering. Workloads sharing service=X or app=X label values.
  • K8s API. Service objects and their selector-matched Deployments/StatefulSets.
  • Trace service names. OpenTelemetry trace service.name attribute groupings.
  • Name similarity. Workloads whose metric names share semantic prefixes (http_requests_total{handler=…} scoped by service label).

Most likely a combination of these: K8s API when available, falling back to Prometheus label conventions otherwise.
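As a minimal sketch of the K8s API signal (the real heuristic is undisclosed, and all names below are invented): a Service's selector matched against workload labels yields one candidate service group per Service.

```python
# Hypothetical sketch: pair each K8s Service with the Deployments whose
# pod-template labels satisfy its selector. Selector semantics here are
# simple equality matching, as in K8s label selectors.
def selector_matches(selector: dict[str, str], pod_labels: dict[str, str]) -> bool:
    return all(pod_labels.get(k) == v for k, v in selector.items())

# Invented example objects (name -> labels)
services = {"checkout-api": {"app": "checkout"}}
deployments = {"checkout-deploy": {"app": "checkout", "tier": "web"}}

groups = {
    svc: [d for d, labels in deployments.items() if selector_matches(sel, labels)]
    for svc, sel in services.items()
}
# groups == {"checkout-api": ["checkout-deploy"]}
```

In a real cluster this would read Service and Deployment objects from the K8s API; the fallback path would instead cluster Prometheus series by their service=/app= label values.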

Relationship to adjacent concepts

  • Service (as in K8s Service) — the K8s-native primitive closest to service group, but K8s-only. Service group generalises to non-K8s stacks.
  • Deployment — a specific K8s controller, not a service abstraction on its own. A service group usually corresponds to exactly one Deployment but isn't identical to it.
  • Microservice — overloaded marketing term. "Service group" is more precise because it's defined by the observability data, not by code-organisation rhetoric.
  • concepts/critical-business-operation — CBO is at a different altitude (customer-facing operation across multiple services); service group is a substrate below that.

Failure modes in detection

  1. Label hygiene drift. Teams that don't apply consistent service=/app= labels yield inferred service groups whose names don't match the ones humans use. "What does billing depend on?" fails if billing's workloads carry only team=finance-apis and no service label.
  2. Multi-service K8s Pods. A sidecar-heavy pod may contribute metrics to multiple service groups; grouping purely by pod or Deployment fails.
  3. Batch jobs / CronJobs. Short-lived workloads may appear and disappear between refresh cycles, producing transient service groups.
  4. Monorepo deployments. A single codebase deployed under multiple names with the same metric schema may be conflated into one service group.
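Failure mode 1 can be made concrete with a toy grouping pass (label data and the fallback order are invented for illustration): the billing workload falls through every conventional identity label and lands ungrouped.

```python
from collections import defaultdict

# Hypothetical identity-label precedence for the Prometheus fallback path.
IDENTITY_LABELS = ("service", "app")

def group_workloads(workloads: list[dict[str, str]]) -> dict[str, list[str]]:
    """Group workloads by the first identity label present; the rest
    fall into an <ungrouped> bucket, i.e. detection has failed."""
    groups: dict[str, list[str]] = defaultdict(list)
    for labels in workloads:
        name = next((labels[k] for k in IDENTITY_LABELS if k in labels), None)
        groups[name or "<ungrouped>"].append(labels["pod"])
    return dict(groups)

workloads = [
    {"pod": "billing-x04", "team": "finance-apis"},   # label hygiene drift
    {"pod": "checkout-7f9", "service": "checkout-api"},
]
print(group_workloads(workloads))
# {'<ungrouped>': ['billing-x04'], 'checkout-api': ['checkout-7f9']}
```

The ungrouped bucket is exactly what makes "What does billing depend on?" unanswerable: no service group named billing exists in the extracted memory.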
