Curriculum

The 180 lessons of “The Kubernetes Odyssey,” organized sequentially into Beginner, Practitioner, and Advanced tracks. The tracks cover the same topics in the same order; each lesson lists its core learning objective and the lab details specific to its track.

Beginner Track (Lessons 1-60)

  1. Docker Fundamentals

    • Objective: Understand what a container is and build your first image.

    • Details: Compare containers to VMs. Run pre-built images (nginx, postgres). Write a basic Dockerfile for a static website.

  2. Image Optimization

    • Objective: Build small, secure, and efficient container images.

    • Details: Learn about Docker layer caching. Build an image and see how changes affect build time.

  3. Container Networking & Storage

    • Objective: Connect containers and persist data.

    • Details: Create a Docker network. Connect a web container to a database container by name. Understand bind mounts for development.

  4. Multi-Container Apps

    • Objective: Define and run complex applications locally with Docker Compose.

    • Details: Create a simple docker-compose.yml for a frontend and backend. Use docker-compose up to launch the application.

  5. Break-It-Friday: Debug containerized environments

    • Objective: Debug common issues in containerized environments.

    • Details: A container fails to start. Use docker logs to find the incorrect database connection string.

  6. Intro to Kubernetes

    • Objective: Understand the purpose of Kubernetes and set up a local cluster.

    • Details: What is orchestration? Explore a K8s cluster using a UI like Lens. See how a Deployment manages Pods.

  7. Pods: The Atomic Unit

    • Objective: Master the fundamental building block of Kubernetes.

    • Details: Write a YAML manifest for a single Pod. Use kubectl apply and kubectl describe to inspect its state.

  8. Workload Controllers: Deployments

    • Objective: Manage stateless applications and perform rolling updates.

    • Details: Create a Deployment for your web server. Scale the number of replicas up and down using kubectl scale.

  9. Services & Networking

    • Objective: Expose applications within and outside the cluster.

    • Details: Create a ClusterIP service. Expose a service outside the cluster using NodePort.

  10. Break-It-Friday: Debug K8s scheduling and networking

    • Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.

    • Details: A Pod is stuck in Pending. Use kubectl describe pod to find it cannot be scheduled due to insufficient CPU.

  11. ConfigMaps & Secrets

    • Objective: Manage application configuration and sensitive data.

    • Details: Create a ConfigMap and expose its values to a container as environment variables. Understand why secrets do not belong in ConfigMaps.

  12. Health Probes & Lifecycle

    • Objective: Build self-healing and resilient applications.

    • Details: Add livenessProbe and readinessProbe to a Deployment. Observe K8s restarting a failing container.

  13. Resource Management

  • Objective: Define resource requests and limits.

  • Details: Understand the difference between requests and limits. Add requests and limits to your Pods.

  14. Persistent Storage

  • Objective: Manage stateful data.

  • Details: Understand the relationship between PersistentVolume (PV) and PersistentVolumeClaim (PVC). Manually create a PV.

  15. Break-It-Friday: Debug a full multi-tier application

  • Objective: Diagnose and resolve issues in a complex, multi-service application.

  • Details: The frontend pod is in CrashLoopBackOff. Find logs indicating it can’t reach the backend API and fix the Service name.

  16. Ingress Controllers

  • Objective: Expose HTTP/S routes from outside the cluster.

  • Details: Deploy the NGINX Ingress Controller. Create an Ingress object to route traffic to your web service.

  17. Intro to Service Mesh

  • Objective: Understand the “why” behind a service mesh.

  • Details: Discuss challenges of microservice communication. Install Istio and view the control plane components.

  18. Istio Traffic Management

  • Objective: Implement advanced routing, reliability, and security patterns.

  • Details: Use an Istio VirtualService to route all traffic to one version of a service.

  19. Network Policies

  • Objective: Secure pod-to-pod communication within the cluster.

  • Details: Create a “deny-all” NetworkPolicy. Then, create a policy to allow ingress from the frontend to the backend.

  20. Break-It-Friday: Debug advanced networking and service mesh issues

  • Objective: Diagnose complex connectivity and policy issues.

  • Details: An Ingress route is returning a 404. Debug the service name, port, and path in the Ingress manifest.

  21. RBAC (Role-Based Access Control)

  • Objective: Configure role-based access control.

  • Details: Use kubectl auth can-i. Create a Role for read-only access and a RoleBinding to assign it.

  22. Pod Security

  • Objective: Harden workloads using built-in Kubernetes security standards.

  • Details: Understand Pod Security Standards. Apply namespace labels to enforce the baseline policy.

  23. Secrets Management

  • Objective: Securely manage and inject secrets into applications.

  • Details: Understand the risks of storing secrets in Git. Manually create a Secret and mount it into a pod.

  24. Runtime Security

  • Objective: Detect and prevent threats in running containers.

  • Details: Understand runtime security. Analyze a container’s syscalls using strace.

  25. Break-It-Friday: Debug security misconfigurations

  • Objective: Identify and correct common security flaws.

  • Details: A CI/CD pipeline fails with “forbidden” error. Diagnose the missing RBAC permissions for its ServiceAccount.

  26. Advanced Storage

  • Objective: Explore Container Storage Interface (CSI).

  • Details: Understand the role of CSI. Deploy the CSI driver for your cloud provider.

  27. Database Operations

  • Objective: Run production-grade stateful workloads.

  • Details: Understand challenges of running databases in K8s. Deploy a PostgreSQL instance using a StatefulSet.

  28. Backup and Restore

  • Objective: Implement disaster recovery strategies.

  • Details: Understand Velero. Install Velero and perform a backup of a single namespace’s resources.

  29. Data Pipelines

  • Objective: Run message queues and data streaming platforms.

  • Details: Discuss use cases for Kafka. Deploy a single-node Zookeeper and Kafka instance using a Helm chart.

  30. Break-It-Friday: Debug stateful application issues

  • Objective: Troubleshoot problems related to storage, databases, and data services.

  • Details: A PersistentVolumeClaim is stuck in Pending. Debug the StorageClass or resource quotas.

  31. Intro to GitOps

  • Objective: Manage cluster state declaratively.

  • Details: Understand GitOps principles. Install the ArgoCD UI. Manually create an application to sync a Git repository.

  32. Progressive Delivery

  • Objective: Implement safer deployment strategies like canary and blue-green.

  • Details: Discuss risks of standard rolling updates. Manually perform a canary release.

  33. Automated Testing

  • Objective: Run integration and end-to-end tests within the Kubernetes cluster.

  • Details: Run a simple test suite as a Kubernetes Job after a deployment.

  34. DevSecOps

  • Objective: Integrate security scanning and policy enforcement into CI/CD.

  • Details: Add a step to your CI pipeline to scan container images for vulnerabilities using trivy.

  35. Break-It-Friday: Debug CI/CD and GitOps pipelines

  • Objective: Troubleshoot and optimize automated delivery pipelines.

  • Details: ArgoCD shows OutOfSync. Find the manual kubectl change causing drift and revert it.

  36. Metrics with Prometheus

  • Objective: Collect and query time-series metrics.

  • Details: Deploy Prometheus. Explore the Prometheus UI and learn basic PromQL queries.

  37. Logging with ELK/Loki

  • Objective: Aggregate and analyze logs.

  • Details: Understand structured logging. Deploy a logging agent (e.g., Fluentd) to collect logs.

  38. Tracing with Jaeger

  • Objective: Trace requests across microservices.

  • Details: Understand distributed tracing. Deploy the Jaeger all-in-one template.

  39. Alerting with Alertmanager

  • Objective: Define alerts and route them to the correct teams.

  • Details: Write a simple Prometheus alerting rule. View the alert firing in the Prometheus UI.

  40. Break-It-Friday: Debug the observability stack

  • Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.

  • Details: Grafana shows “No Data.” Debug the entire metrics pipeline.

  41. Cluster Autoscaling

  • Objective: Automatically scale the number of nodes.

  • Details: Understand the need for cluster scaling. Manually add a node to your kind cluster.

  42. Custom Controllers

  • Objective: Extend the Kubernetes API.

  • Details: Understand the operator pattern. Explore a simple community operator and its CRD.

  43. Advanced Automation

  • Objective: Automate complex operational tasks.

  • Details: Create a simple script that runs kubectl commands.

  44. Cost Optimization (FinOps)

  • Objective: Monitor, allocate, and optimize Kubernetes cloud costs.

  • Details: Use the kubectl cost plugin (from Kubecost) to view the resource costs of your workloads.

  45. Break-It-Friday: Debug cluster operations

  • Objective: Troubleshoot core cluster-level operations and automation.

  • Details: The cluster autoscaler is not adding new nodes. Check its logs.

  46. Multi-Cluster Architectures

  • Objective: Understand patterns for managing multiple clusters.

  • Details: Discuss reasons for multi-cluster. Explore the concepts of tools like Karmada.

  47. Cross-Cluster Networking

  • Objective: Enable seamless communication between services in different clusters.

  • Details: Manually expose a service in one cluster with a LoadBalancer.

  48. Global Workload Distribution

  • Objective: Distribute traffic and workloads intelligently.

  • Details: Understand how a Global Load Balancer (GLB) works conceptually.

  49. Multi-Cluster Security

  • Objective: Manage identity and policy across a fleet of clusters.

  • Details: Discuss challenges of managing RBAC across many clusters.

  50. Break-It-Friday: Test and debug a multi-cluster failover

  • Objective: Validate and troubleshoot a multi-region disaster recovery setup.

  • Details: Manually simulate a regional outage. Update DNS records to fail traffic over.

  51. Advanced Autoscaling

  • Objective: Implement more sophisticated scaling strategies.

  • Details: Review Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler.

  52. Resource Efficiency

  • Objective: Optimize resource utilization.

  • Details: Use kubectl top to identify pods with high resource usage.

  53. High-Performance Networking

  • Objective: Accelerate network throughput.

  • Details: Discuss concepts like SR-IOV and DPDK.

  54. Chaos Engineering

  • Objective: Proactively test system resilience.

  • Details: Manually run chaos experiments, e.g., kubectl delete pod.

  55. Platform Engineering

  • Objective: Build tools to improve developer experience.

  • Details: Discuss the concept of an Internal Developer Platform (IDP).

  56. Capstone: Scoping

  • Objective: Scope and architect your final project.

  • Details: Choose your specialization. Draw an architecture diagram and create a work plan.

  57. Capstone: Build Day 1

  • Objective: Build the core infrastructure and application components.

  • Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.

  58. Capstone: Build Day 2

  • Objective: Implement advanced, specialized features.

  • Details: Integrate the key technology for your track. Configure advanced networking and security.

  59. Capstone: Testing & Hardening

  • Objective: Test, secure, and document your platform.

  • Details: Perform load testing, run chaos experiments, and write operational runbooks.

  60. Capstone: Demo Day

  • Objective: Present your final project.

  • Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.
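
Illustration for the Beginner Track lab on a Pod stuck in Pending due to insufficient CPU: the sketch below is a deliberately simplified Python model of the CPU-fit check behind that symptom. It assumes a Pod fits on a node only if its CPU request is no larger than the node's allocatable CPU minus the requests already placed there. The real scheduler also weighs memory, taints, affinity, and more. The function names and all node and Pod values are hypothetical.

```python
from typing import Sequence


def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "500m", "2", or "0.5" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)


def fits_on_node(pod_request: str, allocatable: str, placed_requests: Sequence[str]) -> bool:
    """Return True if the Pod's CPU request fits in the node's unreserved CPU.

    Simplified model: only CPU requests are considered (limits play no part
    in scheduling decisions).
    """
    free = parse_cpu(allocatable) - sum(parse_cpu(q) for q in placed_requests)
    return parse_cpu(pod_request) <= free


# Hypothetical node: 2 CPUs allocatable, two Pods already scheduled on it.
node_allocatable = "2"
existing_requests = ["750m", "1"]  # 1750m reserved, 250m still free

print(fits_on_node("200m", node_allocatable, existing_requests))  # True
print(fits_on_node("500m", node_allocatable, existing_requests))  # False: this Pod would stay Pending
```

In the lab itself, kubectl describe pod reports the failed fit as a scheduling event; the sketch only shows why lowering the request or adding capacity resolves it.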

Practitioner Track (Lessons 61-120)

  1. Docker Fundamentals

  • Objective: Understand what a container is and build your first image.

  • Details: Write a Dockerfile for a Node.js REST API. Understand COPY vs. ADD, CMD vs. ENTRYPOINT.

  2. Image Optimization

  • Objective: Build small, secure, and efficient container images.

  • Details: Implement multi-stage builds. Scan images for vulnerabilities using trivy.

  3. Container Networking & Storage

  • Objective: Connect containers and persist data.

  • Details: Differentiate between bind mounts and named volumes. Implement a database backup/restore workflow.

  4. Multi-Container Apps

  • Objective: Define and run complex applications locally with Docker Compose.

  • Details: Build a full docker-compose.yml for an e-commerce app, including depends_on, healthchecks, and environment variables.

  5. Break-It-Friday: Debug containerized environments

  • Objective: Debug common issues in containerized environments.

  • Details: A service cannot resolve another’s DNS name. Find and fix the misconfigured Docker network or service alias.

  6. Intro to Kubernetes

  • Objective: Understand the purpose of Kubernetes and set up a local cluster.

  • Details: Set up a local multi-node cluster using kind. Use kubectl to explore control plane components.

  7. Pods: The Atomic Unit

  • Objective: Master the fundamental building block of Kubernetes.

  • Details: Create a multi-container Pod. Configure Pod lifecycle hooks (postStart, preStop).

  8. Workload Controllers: Deployments

  • Objective: Manage stateless applications and perform rolling updates.

  • Details: Perform a zero-downtime rolling update. Inspect ReplicaSet objects. Configure maxSurge and maxUnavailable.

  9. Services & Networking

  • Objective: Expose applications within and outside the cluster.

  • Details: Use a LoadBalancer service. Understand how Endpoints link Services to Pods.

  10. Break-It-Friday: Debug K8s scheduling and networking

  • Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.

  • Details: A Service has no Endpoints. Find the label selector mismatch between the Service and the Pods.

  11. ConfigMaps & Secrets

  • Objective: Manage application configuration and sensitive data.

  • Details: Mount ConfigMaps and Secrets as files. Implement a rolling update to pick up configuration changes.

  12. Health Probes & Lifecycle

  • Objective: Build self-healing and resilient applications.

  • Details: Implement a startupProbe for slow-starting applications. Configure a terminationGracePeriodSeconds.

  13. Resource Management

  • Objective: Define resource requests and limits.

  • Details: Observe the effects of setting different CPU/Memory values. See how Pods are OOMKilled.

  14. Persistent Storage

  • Objective: Manage stateful data.

  • Details: Use a StorageClass for dynamic PV provisioning. Deploy a simple database that uses a PVC.

  15. Break-It-Friday: Debug a full multi-tier application

  • Objective: Diagnose and resolve issues in a complex, multi-service application.

  • Details: The app returns 503 errors. Diagnose a failing readinessProbe and fix the underlying health check.

  16. Ingress Controllers

  • Objective: Expose HTTP/S routes from outside the cluster.

  • Details: Implement path-based routing. Configure TLS termination on the Ingress controller using cert-manager.

  17. Intro to Service Mesh

  • Objective: Understand the “why” behind a service mesh.

  • Details: Install Istio and automatically inject the Envoy sidecar proxy into your application pods. Use Kiali to visualize your service graph.

  18. Istio Traffic Management

  • Objective: Implement advanced routing, reliability, and security patterns.

  • Details: Implement a canary release, sending 10% of traffic to a new version. Configure request timeouts and retries.

  19. Network Policies

  • Objective: Secure pod-to-pod communication within the cluster.

  • Details: Implement granular policies. Allow ingress only on a specific port. Configure an egress policy.

  20. Break-It-Friday: Debug advanced networking and service mesh issues

  • Objective: Diagnose complex connectivity and policy issues.

  • Details: Services are returning 503s after Istio injection. Debug sidecar connectivity issues.

  21. RBAC (Role-Based Access Control)

  • Objective: Configure role-based access control.

  • Details: Differentiate Role vs. ClusterRole. Create a ServiceAccount for an application and bind it to a Role.

  22. Pod Security

  • Objective: Harden workloads using built-in Kubernetes security standards.

  • Details: Configure a securityContext to run as a non-root user and drop unnecessary capabilities.

  23. Secrets Management

  • Objective: Securely manage and inject secrets into applications.

  • Details: Integrate with HashiCorp Vault using the Vault Secrets Operator or External Secrets Operator.

  24. Runtime Security

  • Objective: Detect and prevent threats in running containers.

  • Details: Deploy Falco. Trigger a suspicious event and observe the Falco alert.

  25. Break-It-Friday: Debug security misconfigurations

  • Objective: Identify and correct common security flaws.

  • Details: A Pod fails to start due to a security policy violation. Read the error message to identify the constraint being violated.

  26. Advanced Storage

  • Objective: Explore Container Storage Interface (CSI).

  • Details: Use the CSI driver to take a snapshot of a PersistentVolume and restore it.

  27. Database Operations

  • Objective: Run production-grade stateful workloads.

  • Details: Deploy a highly available PostgreSQL cluster using a Patroni-based operator. Simulate a node failure.

  28. Backup and Restore

  • Objective: Implement disaster recovery strategies.

  • Details: Perform a full cluster backup, including persistent volumes. Simulate a disaster and restore it using Velero.

  29. Data Pipelines

  • Objective: Run message queues and data streaming platforms.

  • Details: Deploy a scalable Kafka cluster using the Strimzi operator. Create topics and produce/consume messages.

  30. Break-It-Friday: Debug stateful application issues

  • Objective: Troubleshoot problems related to storage, databases, and data services.

  • Details: The database operator is failing to elect a new primary. Inspect the operator’s logs and the database cluster’s state.

  31. Intro to GitOps

  • Objective: Manage cluster state declaratively.

  • Details: Define an ArgoCD Application object as YAML. Implement an “app of apps” pattern.

  32. Progressive Delivery

  • Objective: Implement safer deployment strategies like canary and blue-green.

  • Details: Install Flagger. Configure a Canary custom resource to automate a canary release driven by Prometheus metrics.

  33. Automated Testing

  • Objective: Run integration and end-to-end tests within the Kubernetes cluster.

  • Details: Build a CI pipeline that runs integration tests against services inside the cluster.

  34. DevSecOps

  • Objective: Integrate security scanning and policy enforcement into CI/CD.

  • Details: Integrate static analysis (SAST) for your IaC files using kube-linter or terrascan.

  35. Break-It-Friday: Debug CI/CD and GitOps pipelines

  • Objective: Troubleshoot and optimize automated delivery pipelines.

  • Details: A canary analysis fails and rolls back. Analyze the metrics that caused Flagger to initiate the rollback.

  36. Metrics with Prometheus

  • Objective: Collect and query time-series metrics.

  • Details: Configure Prometheus to scrape custom application metrics. Instrument your application with a Prometheus client library.

  37. Logging with ELK/Loki

  • Objective: Aggregate and analyze logs.

  • Details: Deploy the ELK Stack. Create a Kibana dashboard to search and visualize logs.

  38. Tracing with Jaeger

  • Objective: Trace requests across microservices.

  • Details: Instrument your services using the OpenTelemetry SDK. View the end-to-end trace in Jaeger.

  39. Alerting with Alertmanager

  • Objective: Define alerts and route them to the correct teams.

  • Details: Configure Alertmanager to deduplicate and group alerts. Route alerts to a Slack channel.

  40. Break-It-Friday: Debug the observability stack

  • Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.

  • Details: Logs are not appearing in Kibana. Debug the logging agent’s configuration, permissions, and connection to Elasticsearch.

  41. Cluster Autoscaling

  • Objective: Automatically scale the number of nodes.

  • Details: Deploy the cluster-autoscaler for your cloud provider. Create a load that triggers the provisioning of a new node.

  42. Custom Controllers

  • Objective: Extend the Kubernetes API.

  • Details: Use Kubebuilder or Operator SDK to scaffold a basic operator.

  43. Advanced Automation

  • Objective: Automate complex operational tasks.

  • Details: Write a Kubernetes CronJob that automatically runs a Velero backup.

  44. Cost Optimization (FinOps)

  • Objective: Monitor, allocate, and optimize Kubernetes cloud costs.

  • Details: Install and configure OpenCost to get a detailed breakdown of costs.

  45. Break-It-Friday: Debug cluster operations

  • Objective: Troubleshoot core cluster-level operations and automation.

  • Details: An operator is stuck in a reconciliation loop. Debug the operator’s logs to find the error.

  46. Multi-Cluster Architectures

  • Objective: Understand patterns for managing multiple clusters.

  • Details: Set up a second K8s cluster. Use ArgoCD to deploy applications to both clusters from a single Git repository.

  47. Cross-Cluster Networking

  • Objective: Enable seamless communication between services in different clusters.

  • Details: Implement a multi-cluster service mesh using Istio’s east-west gateway pattern.

  48. Global Workload Distribution

  • Objective: Distribute traffic and workloads intelligently.

  • Details: Configure a Global Load Balancer (GLB) to distribute traffic to Ingress controllers in your two clusters.

  49. Multi-Cluster Security

  • Objective: Manage identity and policy across a fleet of clusters.

  • Details: Use a tool like Open Policy Agent (OPA) to enforce consistent security policies.

  50. Break-It-Friday: Test and debug a multi-cluster failover

  • Objective: Validate and troubleshoot a multi-region disaster recovery setup.

  • Details: Test the automated GLB health checks. Take down the application in one cluster and verify the GLB redirects traffic.

  51. Advanced Autoscaling

  • Objective: Implement more sophisticated scaling strategies.

  • Details: Deploy the Vertical Pod Autoscaler (VPA) to get automatic resource rightsizing recommendations.

  52. Resource Efficiency

  • Objective: Optimize resource utilization.

  • Details: Run a Descheduler to evict and rebalance pods for better bin packing.

  53. High-Performance Networking

  • Objective: Accelerate network throughput.

  • Details: Configure jumbo frames (for example, MTU 9001 on AWS) on your CNI.

  54. Chaos Engineering

  • Objective: Proactively test system resilience.

  • Details: Deploy Chaos Mesh. Use its UI to run controlled chaos experiments.

  55. Platform Engineering

  • Objective: Build tools to improve developer experience.

  • Details: Build a simple software template using Backstage.

  56. Capstone: Scoping

  • Objective: Scope and architect your final project.

  • Details: Choose your specialization. Draw an architecture diagram and create a work plan.

  57. Capstone: Build Day 1

  • Objective: Build the core infrastructure and application components.

  • Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.

  58. Capstone: Build Day 2

  • Objective: Implement advanced, specialized features.

  • Details: Integrate the key technology for your track. Configure advanced networking and security.

  59. Capstone: Testing & Hardening

  • Objective: Test, secure, and document your platform.

  • Details: Perform load testing, run chaos experiments, and write operational runbooks.

  60. Capstone: Demo Day

  • Objective: Present your final project.

  • Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.
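
Illustration for the Practitioner Track lab on a Service with no Endpoints: a minimal Python sketch of equality-based label matching, the rule by which a Service selects Pods. A Pod is selected only if it carries every key/value pair in the Service's selector; extra labels on the Pod are ignored. The function names, Pod names, and labels are all hypothetical, and the sketch models only the matching rule, not the Kubernetes API.

```python
from typing import Dict, List


def selector_matches(selector: Dict[str, str], pod_labels: Dict[str, str]) -> bool:
    """Return True if the Pod carries every key/value pair in the selector."""
    if not selector:
        # A Service without a selector gets no automatically managed endpoints.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def selected_pods(selector: Dict[str, str], pods: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the names of the Pods the selector would pick up."""
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


# Hypothetical Service selector and Pod labels.
service_selector = {"app": "backend", "tier": "api"}
pods = {
    "backend-7d9f": {"app": "backend", "tier": "api", "version": "v2"},
    "backend-typo": {"app": "back-end", "tier": "api"},  # label value misspelled
    "frontend-1a2b": {"app": "frontend"},
}

print(selected_pods(service_selector, pods))  # ['backend-7d9f']
```

If every Pod had the misspelled label, the list would be empty, which is the "no Endpoints" symptom the lab asks you to trace back to the selector.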

Advanced Track (Lessons 121-180)

  1. Docker Fundamentals

  • Objective: Understand what a container is and build your first image.

  • Details: Deep dive into container primitives. Use unshare and nsenter to manually create and inspect kernel namespaces.

  2. Image Optimization

  • Objective: Build small, secure, and efficient container images.

  • Details: Explore advanced BuildKit features. Use --mount=type=cache. Compare image slimming techniques (distroless vs. Alpine vs. scratch).

  3. Container Networking & Storage

  • Objective: Connect containers and persist data.

  • Details: Create and inspect custom Docker network drivers. Benchmark I/O performance of different volume types.

  4. Multi-Container Apps

  • Objective: Define and run complex applications locally with Docker Compose.

  • Details: Use docker-compose override files for different environments. Template a docker-compose.yml using yq or Helm.

  5. Break-It-Friday: Debug containerized environments

  • Objective: Debug common issues in containerized environments.

  • Details: A container has a file permission error. Diagnose the UID/GID mismatch and fix it without running as root.

  6. Intro to Kubernetes

  • Objective: Understand the purpose of Kubernetes and set up a local cluster.

  • Details: Bootstrap a single-node cluster from scratch using kubeadm. Inspect control plane manifests in /etc/kubernetes/manifests.

  7. Pods: The Atomic Unit

  • Objective: Master the fundamental building block of Kubernetes.

  • Details: Explore Pod QoS Classes (Guaranteed, Burstable, BestEffort). Configure static Pods.

  8. Workload Controllers: Deployments

  • Objective: Manage stateless applications and perform rolling updates.

  • Details: Trigger a deployment rollback. Write a script to automate canary analysis.

  9. Services & Networking

  • Objective: Expose applications within and outside the cluster.

  • Details: Create a Headless Service for a StatefulSet. Use nslookup from within a pod to inspect K8s DNS A and SRV records.

  10. Break-It-Friday: Debug K8s scheduling and networking

  • Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.

  • Details: DNS resolution fails intermittently. Debug CoreDNS logs and configurations.

  11. ConfigMaps & Secrets

  • Objective: Manage application configuration and sensitive data.

  • Details: Explore patterns for secret management. Use immutable Secrets. Implement a “reloader” sidecar.

  12. Health Probes & Lifecycle

  • Objective: Build self-healing and resilient applications.

  • Details: Write a custom health check endpoint that checks downstream dependencies. Configure probes to use this endpoint.

  13. Resource Management

  • Objective: Define resource requests and limits.

  • Details: Deep dive into CPU CFS quotas. Configure LimitRange objects to enforce resource constraints per-namespace.

  14. Persistent Storage

  • Objective: Manage stateful data.

  • Details: Explore different volume accessModes. Set up an NFS server and a StorageClass to provision ReadWriteMany volumes.

  15. Break-It-Friday: Debug a full multi-tier application

  • Objective: Diagnose and resolve issues in a complex, multi-service application.

  • Details: The database Pod can’t start after a node failure. Debug a multi-attach error on its volume.

  16. Ingress Controllers

  • Objective: Expose HTTP/S routes from outside the cluster.

  • Details: Explore the Gateway API. Implement advanced traffic manipulation with Ingress annotations.

  17. Intro to Service Mesh

  • Objective: Understand the “why” behind a service mesh.

  • Details: Analyze the resource overhead of the Istio sidecar. Explore alternative service mesh implementations.

  18. Istio Traffic Management

  • Objective: Implement advanced routing, reliability, and security patterns.

  • Details: Implement circuit breaking using a DestinationRule. Inject HTTP faults to test application resilience.

  19. Network Policies

  • Objective: Secure pod-to-pod communication within the cluster.

  • Details: Use a CNI that supports advanced policies. Implement L7 policies that filter traffic.

  20. Break-It-Friday: Debug advanced networking and service mesh issues

  • Objective: Diagnose complex connectivity and policy issues.

  • Details: A NetworkPolicy is blocking legitimate traffic. Use a tool like np-viewer or Cilium’s Hubble to visualize and debug policy rules.

  21. RBAC (Role-Based Access Control)

  • Objective: Configure role-based access control.

  • Details: Design an RBAC strategy for a multi-tenant organization. Use ClusterRole aggregation. Audit RBAC permissions.

  22. Pod Security

  • Objective: Harden workloads using built-in Kubernetes security standards.

  • Details: Explore an external policy engine like OPA Gatekeeper or Kyverno to enforce custom policies.

  23. Secrets Management

  • Objective: Securely manage and inject secrets into applications.

  • Details: Implement dynamic secrets with Vault. Configure automated secret rotation.

  24. Runtime Security

  • Objective: Detect and prevent threats in running containers.

  • Details: Write custom Falco rules. Integrate Falco alerts with a SIEM or Alertmanager.

  25. Break-It-Friday: Debug security misconfigurations

  • Objective: Identify and correct common security flaws.

  • Details: A user has overly broad permissions. Find and fix the ClusterRoleBinding that is granting cluster-admin.

  26. Advanced Storage

  • Objective: Explore Container Storage Interface (CSI).

  • Details: Implement Volume Cloning. Explore advanced features like topology-aware volume provisioning.

  27. Database Operations

  • Objective: Run production-grade stateful workloads.

  • Details: Configure a connection pooler like PgBouncer as a sidecar.

  28. Backup and Restore

  • Objective: Implement disaster recovery strategies.

  • Details: Configure scheduled backups. Set up Velero to back up to a different cloud region.

  29. Data Pipelines

  • Objective: Run message queues and data streaming platforms.

  • Details: Configure Kafka with mTLS for secure communication. Set up monitoring for Kafka brokers and topics.

  30. Break-It-Friday: Debug stateful application issues

  • Objective: Troubleshoot problems related to storage, databases, and data services.

  • Details: Velero volume backup is failing. Debug the volume snapshotter configuration and cloud provider permissions.

  31. Intro to GitOps

  • Objective: Manage cluster state declaratively.

  • Details: Explore advanced ArgoCD features like sync waves and hooks. Implement a custom health check.

  32. Progressive Delivery

  • Objective: Implement safer deployment strategies like canary and blue-green.

  • Details: Implement A/B testing with Flagger. Integrate with a service mesh for L7 traffic shifting.

  33. Automated Testing

  • Objective: Run integration and end-to-end tests within the Kubernetes cluster.

  • Details: Use a tool like Testkube or K6 Operator to run and manage complex test suites.

  34. DevSecOps

  • Objective: Integrate security scanning and policy enforcement into CI/CD.

  • Details: Implement a policy-as-code gate in your pipeline using OPA.

  35. Break-It-Friday: Debug CI/CD and GitOps pipelines

  • Objective: Troubleshoot and optimize automated delivery pipelines.

  • Details: The CI pipeline is slow. Optimize Docker layer caching and run pipeline stages in parallel.

  36. Metrics with Prometheus

  • Objective: Collect and query time-series metrics.

  • Details: Set up a highly available, scalable Prometheus architecture using Thanos or Cortex.

  37. Logging with ELK/Loki

  • Objective: Aggregate and analyze logs.

  • Details: Evaluate and deploy Grafana Loki as a lightweight alternative to ELK. Correlate logs with metrics in Grafana.

  38. Tracing with Jaeger

  • Objective: Trace requests across microservices.

  • Details: Use traces to identify performance bottlenecks. Analyze the critical path of a request.

  39. Alerting with Alertmanager

    • Objective: Define alerts and route them.

    • Details: Implement complex alerting logic. Create escalation policies and notification routing.

  40. Break-It-Friday: Debug the observability stack

    • Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.

    • Details: An alert is flapping. Tune the "for" clause in the alerting rule to make it less sensitive.

  41. Cluster Autoscaling

    • Objective: Automatically scale the number of nodes.

    • Details: Replace cluster-autoscaler with Karpenter. Configure Karpenter provisioners for different workload types.

  42. Custom Controllers

    • Objective: Extend the Kubernetes API.

    • Details: Implement the reconciliation loop for your operator. Add logic to ensure the state of your application always matches the CRD.

  43. Advanced Automation

    • Objective: Automate complex operational tasks.

    • Details: Build a simple “chaos” operator that periodically and randomly deletes pods.

  44. Cost Optimization (FinOps)

    • Objective: Monitor, allocate, and optimize Kubernetes cloud costs.

    • Details: Implement a strategy to run stateless workloads on Spot Instances. Configure automated rightsizing recommendations.

  45. Break-It-Friday: Debug cluster operations

    • Objective: Troubleshoot core cluster-level operations and automation.

    • Details: Cloud costs have spiked unexpectedly. Use cost monitoring tools to identify the workload responsible for the increase.

  46. Multi-Cluster Architectures

    • Objective: Understand patterns for managing multiple clusters.

    • Details: Use a Cluster API (CAPI) provider to provision and manage the lifecycle of multiple clusters declaratively.

  47. Cross-Cluster Networking

    • Objective: Enable seamless communication between services.

    • Details: Deploy Submariner to create a flat L3 network fabric across multiple clusters.

  48. Global Workload Distribution

    • Objective: Distribute traffic and workloads intelligently.

    • Details: Implement advanced traffic policies on the global load balancer (GLB), such as latency-based routing or active-passive failover.

  49. Multi-Cluster Security

    • Objective: Manage identity and policy across a fleet of clusters.

    • Details: Implement a centralized identity management solution (e.g., using Pinniped).

  50. Break-It-Friday: Test and debug a multi-cluster failover

    • Objective: Validate and troubleshoot a multi-region disaster recovery setup.

    • Details: A cross-cluster call is failing. Debug the entire path: DNS, GLB, east-west gateway, network policies, and service mesh config.

  51. Advanced Autoscaling

    • Objective: Implement more sophisticated scaling strategies.

    • Details: Implement KEDA (Kubernetes Event-driven Autoscaling) to scale workloads based on external metrics.

  52. Resource Efficiency

    • Objective: Optimize resource utilization.

    • Details: Write custom scheduling plugins to implement advanced affinity, anti-affinity, or topology spread constraints.

  53. High-Performance Networking

    • Objective: Accelerate network throughput.

    • Details: Enable SR-IOV on a worker node and attach a pod directly to a Virtual Function.

  54. Chaos Engineering

    • Objective: Proactively test system resilience.

    • Details: Automate chaos experiments as part of your CI/CD pipeline.

  55. Platform Engineering

    • Objective: Build tools to improve developer experience.

    • Details: Use Crossplane to create a custom K8s API for provisioning cloud resources.

  56. Capstone: Scoping

    • Objective: Scope and architect your final project.

    • Details: Choose your specialization. Draw an architecture diagram and create a work plan.

  57. Capstone: Build Day 1

    • Objective: Build the core infrastructure and application components.

    • Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.

  58. Capstone: Build Day 2

    • Objective: Implement advanced, specialized features.

    • Details: Integrate the key technology for your track. Configure advanced networking and security.

  59. Capstone: Testing & Hardening

    • Objective: Test, secure, and document your platform.

    • Details: Perform load testing, run chaos experiments, and write operational runbooks.

  60. Capstone: Demo Day

    • Objective: Present your final project.

    • Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.