How Generali Malaysia optimizes operations with Amazon EKS

Summary

Generali Malaysia, one of Malaysia's largest general insurers and part of the roughly 190-year-old Generali Group, migrated to AWS in 2019 and selected Amazon EKS as the target container platform for its modernised core insurance and digital applications. This customer-retrospective post from the AWS Architecture Blog walks through Generali's integration points after adopting EKS Auto Mode (an AWS-managed K8s data plane with Bottlerocket nodes, auto-upgrades, and managed add-ons), framed against the AWS Well-Architected Framework. The themes covered end-to-end are: managed cluster lifecycle via Auto Mode; defence-in-depth security (GuardDuty Extended Threat Detection plus runtime monitoring, Inspector's ECR-image-to-running-container mapping, AWS Network Firewall SNI-based egress filtering, and Secrets Manager with External Secrets Operator feeding stateless-only pods); cost allocation via AWS Billing's split cost allocation data for EKS, combined with Savings Plans; and observability via Amazon Managed Grafana with CloudWatch as the data source for per-namespace dashboards. This is a reference-architecture-style customer case study: the substance is in the integration topology (which AWS services plug into an EKS cluster, how, and why) and the operational-discipline choices (stateless-only pods, pod/node disruption budgets, SNI allow-list egress) rather than in quantified production numbers (no cluster sizes, RPS, cost deltas, or incident retrospectives are published).

Key takeaways

  1. EKS Auto Mode expands AWS's shared-responsibility line into the K8s data plane. Before: customer managed nodes, add-ons, AMI upgrades, and cluster upgrades. After: AWS manages Bottlerocket OS patching, default add-on upgrades, node lifecycle, and the cluster-upgrade cadence itself (typically a new AMI weekly — nodes are terminated and replaced with upgraded ones). Customer retains node-pool policy (instance-type preferences, zones, scale bounds) and workload concerns. Generali described this as "automated upgrades, so our teams can focus on application development rather than infrastructure complexity." (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)

  2. Auto Mode requires explicit disruption-control configuration to coexist with production workloads. Because Auto Mode actively terminates nodes on a weekly cadence to apply the new AMI, Generali had to add:

    • Maintenance windows pinned to off-peak hours, so rolling node replacement happens when demand is lowest.
    • Pod Disruption Budgets (PDBs), so a node drain never evicts all replicas of a micro-service at the same time.
    • Node Disruption Budgets, bounding how many nodes can be replaced concurrently across the cluster.

     These are the three primitives that convert "the platform will randomly replace your nodes" into a safe property; without them, the managed upgrade amounts to a self-inflicted denial of service against your own workloads. See patterns/disruption-budget-guarded-upgrades. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
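The three guards above can be sketched as simple arithmetic. This is a minimal illustration, not Generali's configuration: the function names, the window hours, and the round-up rule for the percentage-based node budget are all assumptions of this sketch. The pod-level calculation (healthy pods minus the required minimum) follows the standard meaning of a PDB with minAvailable.

```python
def allowed_pod_evictions(healthy_pods: int, min_available: int) -> int:
    """Voluntary evictions a PDB with minAvailable would currently permit."""
    return max(0, healthy_pods - min_available)


def allowed_node_replacements(total_nodes: int, budget_percent: int) -> int:
    """Concurrent node replacements under a percentage-based node budget.

    Rounding up is an assumption of this sketch; check how the actual
    node-pool implementation rounds before relying on it.
    """
    return (total_nodes * budget_percent + 99) // 100


def in_maintenance_window(hour_utc: int, start_hour: int, duration_hours: int) -> bool:
    """True if hour_utc falls inside a window that may wrap past midnight."""
    return (hour_utc - start_hour) % 24 < duration_hours


def can_drain_node(pods_on_node: dict, healthy: dict, min_available: dict) -> bool:
    """A node may be drained only if every service stays at or above its minimum.

    pods_on_node maps service name -> pods of that service on the node;
    healthy and min_available map service name -> cluster-wide counts.
    """
    return all(
        count <= allowed_pod_evictions(healthy[svc], min_available[svc])
        for svc, count in pods_on_node.items()
    )
```

For example, a hypothetical service with 3 healthy replicas and minAvailable of 2 tolerates one eviction, so a node hosting two of its replicas cannot be drained in one step until a replacement pod is healthy elsewhere.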

  3. Generali's K8s operating discipline: stateless-only + immutable pods + Helm + HPA. Four complementary rules, stated as first-class principles:

    • "only allow stateless micro-services": no in-pod persistent state; all persistence is in external managed services. This materially simplifies the rest of the stack (no stateful pod migration during node churn; secrets synced from Secrets Manager can be injected as env vars without volume mounts; Auto Mode's node-replacement cadence is safe).
    • "treat the underlying pods as immutable": no in-place mutation; upgrade by replacement.
    • "Helm charts as a standardized deployment mechanism": one packaging format for apps and OSS add-ons.
    • "Horizontal Pod Autoscaler (HPA) to scale services based on traffic": pod-level elasticity driven by the real demand signal, not static capacity planning.

     These four compound: stateless-only makes pod immutability cheap; pod immutability makes HPA safe; Helm encodes all three as template defaults. It is primarily this design discipline, not the Auto Mode features themselves, that makes Auto Mode viable. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
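To make the HPA rule concrete, the sketch below applies the scaling formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1 and clamped to the configured bounds. The metric values, bounds, and function name are hypothetical; the post does not say which metrics or limits Generali uses.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Replica count an HPA-style controller would aim for."""
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Never scale outside the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

Because the pods are stateless, scaling from 4 to 6 replicas (or back down) needs no data migration, which is why this rule composes cleanly with the other three.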

  4. GuardDuty Extended Threat Detection + runtime monitoring fuses four signal streams into MITRE-ATT&CK-mapped multistage attack chains. Generali enabled both the EKS-protection and runtime monitoring modes of GuardDuty. The service correlates:

    • Amazon EKS audit logs (control-plane API calls)
    • Runtime behaviours (process / network / file activity inside containers)
    • Malware execution detections
    • AWS API activity (CloudTrail)

     These are correlated into consolidated findings for complex patterns: "container exploitation, privilege escalation, and unauthorized movement within their Kubernetes environment, with detailed timelines mapped to MITRE ATT&CK tactics and techniques." The business outcome is framed as reducing investigation time and enabling prioritisation by blast radius rather than by individual-event severity. No quantified numbers for false-positive rate, mean-time-to-detect, or alert volume are disclosed. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)

  5. Inspector's ECR-image-to-running-container mapping is the canonical runtime-vulnerability-prioritisation primitive. Traditional vulnerability scanning finds vulns "in repository images": every image in ECR, regardless of whether it is deployed. Inspector now maps each vuln finding to:

    • Cluster ARNs where the image is deployed
    • Number of EKS pods running the image
    • Last in-use date per vulnerability finding

     "Prioritize remediation efforts based on actual container usage patterns rather than repository events alone." This is the structural fix for "we have 50,000 CVEs; which ones matter?": the ones in images actually running in production. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
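One way to picture usage-based prioritisation is a sort over findings that puts images currently in use ahead of images that only sit in the registry. This is a hedged sketch: the `Finding` record, its field names, and the 30-day staleness cut-off are hypothetical and do not reflect Inspector's actual finding schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Finding:
    cve_id: str                   # placeholder identifier, not a real CVE
    severity_score: float         # e.g. a CVSS-style base score
    running_pods: int             # pods currently running the affected image
    last_in_use: Optional[date]   # None if the image was never seen running


def prioritise(findings: List[Finding], today: date, stale_after_days: int = 30) -> List[Finding]:
    """Order findings: in-use images first, then by severity, then by pod count."""

    def in_use(f: Finding) -> bool:
        return (
            f.running_pods > 0
            and f.last_in_use is not None
            and (today - f.last_in_use).days <= stale_after_days
        )

    return sorted(
        findings,
        key=lambda f: (not in_use(f), -f.severity_score, -f.running_pods),
    )
```

With this ordering, a critical finding in an image that no pod runs drops below a lower-severity finding in an image running across a dozen pods, which matches the intent quoted above.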

  6. AWS Network Firewall for SNI-based egress allow-list is the canonical hostname-stable egress-control pattern. Architecture topology:

    • EKS cluster in private subnets: no direct internet route.
    • Network Firewall endpoints in public subnets: the filter hop.
    • NAT Gateways in protected subnets between Firewall and internet: outbound NAT happens after filtering.

     Filter rule shape: allow only a configured list of hostnames, matched against the TLS ClientHello Server Name Indication (SNI). Two structural benefits:

    • IP-drift immunity: SaaS providers rotate IPs; hostnames are stable, so the allow-list doesn't need continuous IP-range refreshes.
    • CloudWatch traffic analysis: Network Firewall emits alert logs of accessed hostnames into CloudWatch, giving Generali "traffic pattern analysis" and compliance evidence.

     Compliance framing: "applications can only access approved external services." (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
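The allow-list decision itself is a hostname match, which the sketch below models. Two assumptions are labelled in the code: the convention that an entry with a leading dot matches the domain and all its subdomains, and the dictionary shape that mirrors a Network Firewall domain-list rule source as this sketch understands it. Both should be verified against the Network Firewall API reference. The hostnames are reserved example domains, not Generali's actual list.

```python
from typing import Dict, List


def sni_allowed(server_name: str, allow_list: List[str]) -> bool:
    """Return True if the SNI hostname matches an allow-list entry.

    Assumed convention: ".example.com" matches example.com and any
    subdomain; "api.example.com" matches only that exact name.
    """
    name = server_name.lower().rstrip(".")
    for entry in allow_list:
        entry = entry.lower()
        if entry.startswith("."):
            if name == entry[1:] or name.endswith(entry):
                return True
        elif name == entry:
            return True
    return False


def domain_list_rule_source(allow_list: List[str]) -> Dict:
    """Build a dict shaped like a domain-list rule source (assumed shape)."""
    return {
        "RulesSourceList": {
            "Targets": list(allow_list),
            "TargetTypes": ["TLS_SNI"],        # match on the TLS ClientHello SNI
            "GeneratedRulesType": "ALLOWLIST",  # everything else is dropped
        }
    }
```

Matching on the name rather than the address is what gives the IP-drift immunity described above: the list changes only when the set of approved services changes.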

  7. External Secrets Operator + Secrets Manager + stateless pods = env-var-only secret injection without daemonsets. Generali's choice chain:

    • Hard-coded secrets in deployment manifests → "not recommended."
    • Kubernetes native Secret objects managed by hand → manual rotation, no audit trail.
    • CSI driver (secrets as mounted volumes) → conflicts with stateless-only discipline (volume mount) and adds a daemonset.
    • Chosen: External Secrets Operator reads from AWS Secrets Manager on a recurring basis and writes K8s Secret objects, which pods consume as environment variables. No daemonset, no volume mount, no application code changes, automatic sync to reflect Secrets Manager rotation events.

     This is the composition-level consequence of the stateless-only principle: each design choice consistently eliminates the volume-mount dependency. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
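The chosen flow can be sketched as two manifests built as Python dictionaries: an ExternalSecret that tells the operator what to pull from Secrets Manager, and the env entries a pod spec would use to read the resulting K8s Secret. Everything specific here is an assumption: the `apiVersion` string (it differs between operator releases), the store kind, the `1h` refresh interval (the post gives no interval), and all names and keys.

```python
from typing import Dict, List


def external_secret(
    name: str,
    namespace: str,
    store_name: str,
    remote_key: str,
    properties: List[str],
    refresh_interval: str = "1h",  # hypothetical; not stated in the post
) -> Dict:
    """Build an ExternalSecret manifest that syncs selected JSON properties."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # assumed; check your operator version
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": refresh_interval,
            "secretStoreRef": {"name": store_name, "kind": "ClusterSecretStore"},
            "target": {"name": name},  # name of the K8s Secret the operator writes
            "data": [
                {"secretKey": prop, "remoteRef": {"key": remote_key, "property": prop}}
                for prop in properties
            ],
        },
    }


def env_from_secret(secret_name: str, keys: List[str]) -> List[Dict]:
    """Build container env entries that read each key from the K8s Secret."""
    return [
        {
            "name": key.upper(),
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
        }
        for key in keys
    ]
```

Environment variables are resolved when a container starts, so a running pod sees a rotated value only after it is replaced; the post does not describe how Generali triggers that replacement.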

  8. AWS Billing split cost allocation data for EKS maps K8s-native identities to business dimensions. The billing primitive exposes four cluster-scoped cost allocation tags that CUR inherits:

    • aws:eks:cluster-name
    • aws:eks:deployment
    • aws:eks:namespace
    • aws:eks:node

     Generali uses these to map Kubernetes spend back to lines of business and applications "alongside other AWS spend" in one Cost Explorer view, not in a separate K8s-only cost tool. This is the billing-layer realisation of the same idea as namespace-per-tenant observability dashboards: the Kubernetes label schema is the chargeback axis, and AWS services read it natively. Paired with Savings Plans for compute discounts. See patterns/eks-cost-allocation-tags. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)
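As an illustration of the chargeback mapping, the sketch below rolls cost rows up from namespace to business unit. The column names are assumptions about how the tag and the split cost appear in a cost report export and should be checked against the actual report schema; the namespaces, business units, and amounts are invented for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping

# Assumed column names; verify against the real cost report schema.
NAMESPACE_TAG = "resourceTags/aws:eks:namespace"
SPLIT_COST = "splitLineItem/SplitCost"


def cost_by_business_unit(
    rows: Iterable[Mapping[str, str]],
    namespace_to_unit: Mapping[str, str],
) -> Dict[str, float]:
    """Sum split cost per business unit, keyed by the namespace tag."""
    totals: Dict[str, float] = defaultdict(float)
    for row in rows:
        namespace = row.get(NAMESPACE_TAG)
        if not namespace:
            continue  # not an EKS split-cost row
        unit = namespace_to_unit.get(namespace, "unallocated")
        totals[unit] += float(row[SPLIT_COST])
    return dict(totals)
```

The only mapping the customer maintains is namespace to business unit; the per-pod cost split is done by the billing layer before the rows arrive.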

  9. Amazon Managed Grafana + CloudWatch as data source: per-EKS-namespace dashboards without managing Grafana infra. Generali uses the CloudWatch → Amazon Managed Grafana integration to give each business-unit application owner their own dashboards scoped to their EKS namespace, with "unified views of cluster health, node performance, pod resource utilization, and application performance indicators." The tenancy shape is namespace-per-project (same dimension as the cost-allocation tagging), and the observability substrate is fully AWS-managed. See systems/amazon-managed-grafana. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)

  10. Reported outcomes (qualitative only). Five business-level outcomes are enumerated, all qualitative:

    • "Significant reduction in operational overhead with EKS Auto Mode"
    • "Enhanced security with automated threat detection and response"
    • "Reduction in infrastructure costs through optimization"
    • "Improved mean-time-to-resolution"
    • "Accelerated application deployment cycles"

     No percentages, cluster sizes, pod counts, cost deltas, RPS, or incident retrospectives are published, which is typical for the Architecture Blog's customer-case-study format. The article's value is in the integration topology, not the numbers. (Source: sources/2026-03-23-aws-generali-malaysia-eks-auto-mode)

Architecture summary

                                                       Amazon Managed
                                                         Grafana ◄────┐
 Users ──► [Route53/ALB] ──► [Network Firewall public subnet]         │
                                      │  (SNI allow-list)             │
                                      ▼                               │
                                  [NAT (protected)]  ◄── outbound     │
                                      │                               │
                                 ┌────┴───────────────────────┐       │
                                 ▼                            │       │
                      ┌──────── EKS Cluster (Auto Mode) ──────┴──┐    │
                      │                                           │    │
                      │  Private subnets                          │    │
                      │  Bottlerocket nodes (weekly AMI replace)  │    │
                      │  PDBs + Node Disruption Budgets           │    │
                      │                                           │    │
                      │  Namespace A (tenant 1)  ┌───────┐        │    │
                      │    Deployments (HPA)     │ ESO   │────────┼─► Secrets Manager
                      │    Pods (stateless)      │       │        │    │
                      │                          └───────┘        │    │
                      │  Namespace B (tenant 2)  ...              │    │
                      └──────────────┬────────────────────────────┘    │
                                     │                                 │
                                     ▼                                 │
                   CloudWatch  ──────┼───────► Amazon Managed Grafana ─┘
                   GuardDuty runtime │
                   Inspector (ECR ↔ running containers)
                   CloudTrail        │
                   (Cost allocation tags: aws:eks:cluster-name /
                    deployment / namespace / node → Cost Explorer)

Systems introduced / extended

Introduces:

  • systems/eks-auto-mode: AWS-managed K8s data plane variant.
  • systems/amazon-guardduty: threat-detection service with EKS protection + runtime monitoring.
  • systems/amazon-inspector: vulnerability scanner with ECR-image to running-container mapping.
  • systems/aws-network-firewall: managed stateful network firewall with SNI allow-listing.
  • systems/external-secrets-operator: CNCF operator syncing secrets from AWS Secrets Manager into K8s Secret objects.
  • systems/amazon-managed-grafana: AWS-managed Grafana with native CloudWatch data source.
  • systems/bottlerocket: container-optimised Linux distro; the default AMI under EKS Auto Mode.
  • systems/generali-malaysia-eks: the customer-platform system page synthesising the whole integration topology.

Extends:

  • systems/aws-eks: new role, EKS Auto Mode (AWS-managed data plane variant) alongside standard EKS. Canonical integration-surface reference (six peer AWS services).
  • systems/kubernetes: new operational discipline (stateless-only + immutable pods + Helm + HPA as a compound rule set) tied to Auto Mode's managed-upgrade cadence.
  • systems/aws-secrets-manager: new role as source of record behind External Secrets Operator on EKS, the idiomatic K8s-native secret-store pattern.
  • systems/helm: Helm-as-standardised-deployment-format as a stated operating principle at an enterprise customer, not just a per-project tool.
  • systems/aws-iam: IAM integration for EKS pod identity (alluded to; not fully detailed).
  • companies/aws: Recent articles entry for EKS Auto Mode as a distinct managed-data-plane offering.

Concepts introduced / extended

Introduces:

  • concepts/well-architected-framework: AWS's six-pillar architecture-review framework (Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, Sustainability); the organising structure of this post.
  • concepts/shared-responsibility-model: the division of responsibilities between AWS (security/ops of the cloud) and customer (security/ops in the cloud); EKS Auto Mode shifts the line deeper into the customer's former scope.
  • concepts/pod-disruption-budget: K8s primitive bounding simultaneous pod terminations during voluntary disruption; core primitive for coexisting with managed-data-plane rolling upgrades.
  • concepts/egress-sni-filtering: hostname-based outbound allow-listing using the TLS ClientHello SNI, IP-drift-immune, the canonical pattern for egress control in AWS Network Firewall.

Extends:

  • concepts/managed-data-plane: EKS Auto Mode as a new tier on the same spectrum as the App Mesh managed-data-plane example; the platform takes over node lifecycle + OS patching + add-on upgrades on top of the existing control-plane-managed property.
  • concepts/observability: namespace-per-tenant CloudWatch-to-Managed-Grafana integration shape (as distinct from the EKS DevOps Agent's resource-discovery shape).
  • concepts/stateless-compute: stated as enterprise-wide platform discipline (not per-service); paired with env-var secret injection, HPA, immutable pods as a compound rule set.
  • concepts/tenant-isolation: namespace-per-project on shared EKS cluster with per-namespace dashboards + cost-allocation tags; a low-end isolation substrate contrasted with the extreme end of concepts/account-per-tenant-isolation (ProGlove).

Patterns introduced / extended

Introduces:

  • patterns/runtime-vulnerability-prioritization: map each vuln finding in an image registry to the set of currently-running workloads using that image; prioritise by running-pod count + last-use date rather than treating every image equally. Inspector's ECR-image-to-running-container mapping is the canonical instance.
  • patterns/eks-cost-allocation-tags: use K8s-native identity dimensions (cluster / namespace / deployment / node) as cost-allocation-tag axes at the AWS Billing layer, so K8s spend lands in the same Cost Explorer view as the rest of AWS spend.
  • patterns/disruption-budget-guarded-upgrades: when the infrastructure layer actively replaces nodes on a managed cadence, the customer's safety contract is the combination of (a) a maintenance window pinning when the churn happens, (b) Pod Disruption Budgets bounding simultaneous pod terminations, and (c) Node Disruption Budgets bounding concurrent node replacements. Generali is the canonical instance on EKS Auto Mode.

Caveats and gaps

  • No numbers disclosed. Cluster count, node count, pod count, request rates, cost deltas, MTTR before/after, alert volumes, GuardDuty true-positive rates, Inspector finding counts, Network Firewall rule counts — all absent. Qualitative outcomes only.
  • No incident retrospective. The post claims improved MTTR but doesn't walk through an example investigation against the GuardDuty / Managed Grafana / Inspector stack.
  • No details on EKS Auto Mode's node-pool policy shape: "EC2 instance types selected by Generali in the node pools configuration" is the only surface-level mention.
  • External Secrets Operator refresh interval is described only as "automatic secret synchronization on a recurring basis" — no interval or latency bound stated.
  • No discussion of CSI driver alternatives beyond the implicit rejection ("better to retrieve secrets dynamically"); no comparison with AWS Secrets and Configuration Provider (ASCP).
  • No architectural deep-dive on GuardDuty's or Inspector's own internals — this post is about how to use them against EKS, not how they work. See future source pages for internals.
  • No detail on pod identity / IRSA. How IAM is bound to K8s service accounts for the External Secrets Operator and CloudWatch exporters is implied in the post but not detailed.
  • Future scope: "expansion plans to host AI models and upcoming agentic applications" on the same EKS platform — named but not architected.

Raw source

raw/aws/2026-03-23-how-generali-malaysia-optimizes-operations-with-amazon-eks-7db7215f.md
