Curriculum
180 course lessons from “The Kubernetes Odyssey,” organized sequentially for beginners, practitioners, and advanced learners. Each lesson includes its core learning objective and specific, multi-track lab details.
Beginner Track (Lessons 1-60)
Docker Fundamentals
Objective: Understand what a container is and build your first image.
Details: Compare containers to VMs. Run pre-built images (nginx, postgres). Write a basic Dockerfile for a static website.
Image Optimization
Objective: Build small, secure, and efficient container images.
Details: Learn about Docker layer caching. Build an image and see how changes affect build time.
Container Networking & Storage
Objective: Connect containers and persist data.
Details: Create a Docker network. Connect a web container to a database container by name. Understand bind mounts for development.
Multi-Container Apps
Objective: Define and run complex applications locally with Docker Compose.
Details: Create a simple docker-compose.yml for a frontend and backend. Use docker-compose up to launch the application.
Break-It-Friday: Debug containerized environments
Objective: Debug common issues in containerized environments.
Details: A container fails to start. Use docker logs to find the incorrect database connection string.
Intro to Kubernetes
Objective: Understand the purpose of Kubernetes and set up a local cluster.
Details: What is orchestration? Explore a K8s cluster using a UI like Lens. See how a Deployment manages Pods.
Pods: The Atomic Unit
Objective: Master the fundamental building block of Kubernetes.
Details: Write a YAML manifest for a single Pod. Use kubectl apply and kubectl describe to inspect its state.
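As a rough illustration of the kind of manifest this lab asks for, the sketch below builds a single-Pod manifest in Python and prints it as JSON, which kubectl apply -f accepts alongside YAML. The Pod name "web", the "app" label, and the nginx:1.25 image tag are placeholders chosen for this example, not part of the course material.

```python
import json


def pod_manifest(name, image, port=80):
    """Return a minimal v1 Pod manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


# Hypothetical name and image tag, used only for illustration.
manifest = pod_manifest("web", "nginx:1.25")

# Save the output as pod.json, then run: kubectl apply -f pod.json
print(json.dumps(manifest, indent=2))
```

After applying it, kubectl describe pod web would show the Pod's events and container state.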
Workload Controllers: Deployments
Objective: Manage stateless applications and perform rolling updates.
Details: Create a Deployment for your web server. Scale the number of replicas up and down using kubectl scale.
Services & Networking
Objective: Expose applications within and outside the cluster.
Details: Create a ClusterIP service. Expose a service outside the cluster using NodePort.
Break-It-Friday: Debug K8s scheduling and networking
Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.
Details: A Pod is stuck in Pending. Use kubectl describe pod to find it cannot be scheduled due to insufficient CPU.
ConfigMaps & Secrets
Objective: Manage application configuration and sensitive data.
Details: Create a ConfigMap and expose its values to a container as environment variables. Understand why not to put secrets in ConfigMaps.
Health Probes & Lifecycle
Objective: Build self-healing and resilient applications.
Details: Add livenessProbe and readinessProbe to a Deployment. Observe K8s restarting a failing container.
Resource Management
Objective: Define resource requests and limits.
Details: Understand the difference between requests and limits. Add requests and limits to your Pods.
Persistent Storage
Objective: Manage stateful data.
Details: Understand the relationship between PersistentVolume (PV) and PersistentVolumeClaim (PVC). Manually create a PV.
Break-It-Friday: Debug a full multi-tier application
Objective: Diagnose and resolve issues in a complex, multi-service application.
Details: The frontend pod is in CrashLoopBackOff. Find logs indicating it can’t reach the backend API and fix the Service name.
Ingress Controllers
Objective: Expose HTTP/S routes from outside the cluster.
Details: Deploy the NGINX Ingress Controller. Create an Ingress object to route traffic to your web service.
Intro to Service Mesh
Objective: Understand the “why” behind a service mesh.
Details: Discuss challenges of microservice communication. Install Istio and view the control plane components.
Istio Traffic Management
Objective: Implement advanced routing, reliability, and security patterns.
Details: Use an Istio VirtualService to route all traffic to one version of a service.
Network Policies
Objective: Secure pod-to-pod communication within the cluster.
Details: Create a “deny-all” NetworkPolicy. Then, create a policy to allow ingress from the frontend to the backend.
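The interplay between a deny-all policy and a later allow rule can be modeled in a few lines. The sketch below is a deliberately simplified evaluator, not how any CNI implements enforcement: it only understands podSelector.matchLabels within a single namespace and ignores namespaceSelector, ipBlock, ports, and egress. The "app: frontend" and "app: backend" labels are assumptions made for the example.

```python
def matches(selector, labels):
    """An empty selector ({}) matches every pod; otherwise all matchLabels must agree."""
    wanted = selector.get("matchLabels", {})
    return all(labels.get(key) == value for key, value in wanted.items())


def ingress_allowed(policies, src_labels, dst_labels):
    """Simplified ingress check for traffic from one pod to another."""
    selecting = [p for p in policies if matches(p["spec"]["podSelector"], dst_labels)]
    if not selecting:
        return True  # a pod selected by no policy accepts all traffic
    for policy in selecting:
        for rule in policy["spec"].get("ingress", []):
            peers = rule.get("from")
            if not peers:
                return True  # a rule without "from" allows every source
            for peer in peers:
                if "podSelector" in peer and matches(peer["podSelector"], src_labels):
                    return True
    return False  # selected by a policy, but no rule admits this source


deny_all = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "deny-all"},
    "spec": {"podSelector": {}, "policyTypes": ["Ingress"]},
}

allow_frontend = {
    "apiVersion": "networking.k8s.io/v1",
    "kind": "NetworkPolicy",
    "metadata": {"name": "allow-frontend-to-backend"},
    "spec": {
        "podSelector": {"matchLabels": {"app": "backend"}},
        "policyTypes": ["Ingress"],
        "ingress": [{"from": [{"podSelector": {"matchLabels": {"app": "frontend"}}}]}],
    },
}

frontend, backend, other = {"app": "frontend"}, {"app": "backend"}, {"app": "other"}
print(ingress_allowed([deny_all], frontend, backend))
print(ingress_allowed([deny_all, allow_frontend], frontend, backend))
print(ingress_allowed([deny_all, allow_frontend], other, backend))
```

The point of the model is that policies are additive: deny-all selects every pod and admits nothing, and the second policy adds one permitted source without removing the default denial for everyone else.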
Break-It-Friday: Debug advanced networking and service mesh issues
Objective: Diagnose complex connectivity and policy issues.
Details: An Ingress route is returning a 404. Debug the service name, port, and path in the Ingress manifest.
RBAC (Role-Based Access Control)
Objective: Configure role-based access control.
Details: Use kubectl auth can-i. Create a Role for read-only access and a RoleBinding to assign it.
Pod Security
Objective: Harden workloads using built-in Kubernetes security standards.
Details: Understand Pod Security Standards. Apply namespace labels to enforce the baseline policy.
Secrets Management
Objective: Securely manage and inject secrets into applications.
Details: Understand the risks of storing secrets in Git. Manually create a Secret and mount it into a pod.
Runtime Security
Objective: Detect and prevent threats in running containers.
Details: Understand runtime security. Analyze a container’s syscalls using strace.
Break-It-Friday: Debug security misconfigurations
Objective: Identify and correct common security flaws.
Details: A CI/CD pipeline fails with a “forbidden” error. Diagnose the missing RBAC permissions for its ServiceAccount.
Advanced Storage
Objective: Explore Container Storage Interface (CSI).
Details: Understand the role of CSI. Deploy the CSI driver for your cloud provider.
Database Operations
Objective: Run production-grade stateful workloads.
Details: Understand challenges of running databases in K8s. Deploy a PostgreSQL instance using a StatefulSet.
Backup and Restore
Objective: Implement disaster recovery strategies.
Details: Understand Velero. Install Velero and perform a backup of a single namespace’s resources.
Data Pipelines
Objective: Run message queues and data streaming platforms.
Details: Discuss use cases for Kafka. Deploy a single-node ZooKeeper and Kafka instance using a Helm chart.
Break-It-Friday: Debug stateful application issues
Objective: Troubleshoot problems related to storage, databases, and data services.
Details: A PersistentVolumeClaim is stuck in Pending. Debug the StorageClass or resource quotas.
Intro to GitOps
Objective: Manage cluster state declaratively.
Details: Understand GitOps principles. Install ArgoCD and log in to its UI. Manually create an application to sync a Git repository.
Progressive Delivery
Objective: Implement safer deployment strategies like canary and blue-green.
Details: Discuss risks of standard rolling updates. Manually perform a canary release.
Automated Testing
Objective: Run integration and end-to-end tests within the Kubernetes cluster.
Details: Run a simple test suite as a Kubernetes Job after a deployment.
DevSecOps
Objective: Integrate security scanning and policy enforcement into CI/CD.
Details: Add a step to your CI pipeline to scan container images for vulnerabilities using trivy.
Break-It-Friday: Debug CI/CD and GitOps pipelines
Objective: Troubleshoot and optimize automated delivery pipelines.
Details: ArgoCD shows OutOfSync. Find the manual kubectl change causing drift and revert it.
Metrics with Prometheus
Objective: Collect and query time-series metrics.
Details: Deploy Prometheus. Explore the Prometheus UI and learn basic PromQL queries.
Logging with ELK/Loki
Objective: Aggregate and analyze logs.
Details: Understand structured logging. Deploy a logging agent (e.g., Fluentd) to collect logs.
Tracing with Jaeger
Objective: Trace requests across microservices.
Details: Understand distributed tracing. Deploy the Jaeger all-in-one template.
Alerting with Alertmanager
Objective: Define alerts and route them to the correct teams.
Details: Write a simple Prometheus alerting rule. View the alert firing in the Prometheus UI.
Break-It-Friday: Debug the observability stack
Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.
Details: Grafana shows “No Data.” Debug the entire metrics pipeline.
Cluster Autoscaling
Objective: Automatically scale the number of nodes.
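Details: Understand the need for cluster scaling. Manually add a worker node to your kind cluster (kind fixes the node count at creation, so recreate the cluster from a config file with an extra worker).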
Custom Controllers
Objective: Extend the Kubernetes API.
Details: Understand the operator pattern. Explore a simple community operator and its CRD.
Advanced Automation
Objective: Automate complex operational tasks.
Details: Create a simple script that runs kubectl commands.
Cost Optimization (FinOps)
Objective: Monitor, allocate, and optimize Kubernetes cloud costs.
Details: Use the kubectl cost plugin (kubectl-cost, from Kubecost) to view the resource costs of your workloads.
Break-It-Friday: Debug cluster operations
Objective: Troubleshoot core cluster-level operations and automation.
Details: The cluster autoscaler is not adding new nodes. Check its logs.
Multi-Cluster Architectures
Objective: Understand patterns for managing multiple clusters.
Details: Discuss reasons for multi-cluster. Explore the concepts of tools like Karmada.
Cross-Cluster Networking
Objective: Enable seamless communication between services in different clusters.
Details: Manually expose a service in one cluster with a LoadBalancer.
Global Workload Distribution
Objective: Distribute traffic and workloads intelligently.
Details: Understand how a Global Load Balancer (GLB) works conceptually.
Multi-Cluster Security
Objective: Manage identity and policy across a fleet of clusters.
Details: Discuss challenges of managing RBAC across many clusters.
Break-It-Friday: Test and debug a multi-cluster failover
Objective: Validate and troubleshoot a multi-region disaster recovery setup.
Details: Manually simulate a regional outage. Update DNS records to fail traffic over.
Advanced Autoscaling
Objective: Implement more sophisticated scaling strategies.
Details: Review Horizontal Pod Autoscaler (HPA) and Cluster Autoscaler.
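When reviewing the HPA, it helps to work through the scaling formula from the Kubernetes documentation: desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The sketch below applies that formula with the default 10% tolerance band. It leaves out minReplicas/maxReplicas clamping, stabilization windows, and readiness handling, and the numbers are made up for illustration.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Apply the core HPA formula; skip scaling when the ratio is within tolerance."""
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas  # close enough to target, no change
    return math.ceil(current_replicas * ratio)


# 4 replicas averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))
# 4 replicas averaging 30% CPU against a 60% target: scale in.
print(desired_replicas(4, 30, 60))
# 63% against 60% is inside the tolerance band: stay put.
print(desired_replicas(4, 63, 60))
```

The Cluster Autoscaler works at a different level: it reacts to Pods that cannot be scheduled rather than to this metric ratio, so the two are complementary.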
Resource Efficiency
Objective: Optimize resource utilization.
Details: Use kubectl top to identify pods with high resource usage.
High-Performance Networking
Objective: Accelerate network throughput.
Details: Discuss concepts like SR-IOV and DPDK.
Chaos Engineering
Objective: Proactively test system resilience.
Details: Manually run chaos experiments, e.g., kubectl delete pod.
Platform Engineering
Objective: Build tools to improve developer experience.
Details: Discuss the concept of an Internal Developer Platform (IDP).
Capstone: Scoping
Objective: Scope and architect your final project.
Details: Choose your specialization. Draw an architecture diagram and create a work plan.
Capstone: Build Day 1
Objective: Build the core infrastructure and application components.
Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.
Capstone: Build Day 2
Objective: Implement advanced, specialized features.
Details: Integrate the key technology for your track. Configure advanced networking and security.
Capstone: Testing & Hardening
Objective: Test, secure, and document your platform.
Details: Perform load testing, run chaos experiments, and write operational runbooks.
Capstone: Demo Day
Objective: Present your final project.
Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.
Practitioner Track (Lessons 61-120)
Docker Fundamentals
Objective: Understand what a container is and build your first image.
Details: Write a Dockerfile for a Node.js REST API. Understand COPY vs. ADD, CMD vs. ENTRYPOINT.
Image Optimization
Objective: Build small, secure, and efficient container images.
Details: Implement multi-stage builds. Scan images for vulnerabilities using trivy.
Container Networking & Storage
Objective: Connect containers and persist data.
Details: Differentiate between bind mounts and named volumes. Implement a database backup/restore workflow.
Multi-Container Apps
Objective: Define and run complex applications locally with Docker Compose.
Details: Build a full docker-compose.yml for an e-commerce app, including depends_on, healthchecks, and environment variables.
Break-It-Friday: Debug containerized environments
Objective: Debug common issues in containerized environments.
Details: A service cannot resolve another’s DNS name. Find and fix the misconfigured Docker network or service alias.
Intro to Kubernetes
Objective: Understand the purpose of Kubernetes and set up a local cluster.
Details: Set up a local multi-node cluster using kind. Use kubectl to explore control plane components.
Pods: The Atomic Unit
Objective: Master the fundamental building block of Kubernetes.
Details: Create a multi-container Pod. Configure Pod lifecycle hooks (postStart, preStop).
Workload Controllers: Deployments
Objective: Manage stateless applications and perform rolling updates.
Details: Perform a zero-downtime rolling update. Inspect ReplicaSet objects. Configure maxSurge and maxUnavailable.
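To see what maxSurge and maxUnavailable mean in practice, the sketch below computes the upper bound on total Pods and the lower bound on available Pods during a rolling update. It assumes the documented rounding rules (percentages round up for maxSurge and down for maxUnavailable) and the default value of 25% for both. The function names are invented for this example.

```python
import math


def resolve(value, replicas, round_up):
    """Turn an absolute number or a percentage string like '25%' into a Pod count."""
    if isinstance(value, str) and value.endswith("%"):
        exact = replicas * int(value[:-1]) / 100
        return math.ceil(exact) if round_up else math.floor(exact)
    return int(value)


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the Pod-count limits a Deployment respects during a rolling update."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


# Defaults with 10 replicas: up to 13 Pods in total, at least 8 available.
print(rollout_bounds(10))
# A conservative setting: never drop below the desired count.
print(rollout_bounds(10, max_surge=1, max_unavailable=0))
```

Setting maxUnavailable to 0 with a positive maxSurge keeps full capacity throughout the rollout, at the cost of needing spare room for the extra Pod.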
Services & Networking
Objective: Expose applications within and outside the cluster.
Details: Use a LoadBalancer service. Understand how Endpoints link Services to Pods.
Break-It-Friday: Debug K8s scheduling and networking
Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.
Details: A Service has no Endpoints. Find the label selector mismatch between the Service and the Pods.
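A Service selector is a plain key/value map, and a Pod backs the Service only if every selector entry appears in the Pod's labels with the same value. The sketch below reproduces that check and reports which keys disagree. The label values, including the deliberate "front-end" typo, are invented for the example.

```python
def selector_matches(selector, pod_labels):
    """True when every selector key/value pair is present in the Pod's labels."""
    return all(pod_labels.get(key) == value for key, value in selector.items())


def find_mismatches(selector, pod_labels):
    """Map each disagreeing key to (value the Service wants, value the Pod has)."""
    return {
        key: (value, pod_labels.get(key))
        for key, value in selector.items()
        if pod_labels.get(key) != value
    }


service_selector = {"app": "web", "tier": "frontend"}
pod_labels = {"app": "web", "tier": "front-end", "version": "v2"}

print(selector_matches(service_selector, pod_labels))
print(find_mismatches(service_selector, pod_labels))
```

On a real cluster, the same comparison can be made by reading the selector from kubectl describe service and the labels from kubectl get pods --show-labels.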
ConfigMaps & Secrets
Objective: Manage application configuration and sensitive data.
Details: Mount ConfigMaps and Secrets as files. Implement a rolling update to pick up configuration changes.
Health Probes & Lifecycle
Objective: Build self-healing and resilient applications.
Details: Implement a startupProbe for slow-starting applications. Configure terminationGracePeriodSeconds.
Resource Management
Objective: Define resource requests and limits.
Details: Observe the effects of setting different CPU/Memory values. See how Pods are OOMKilled.
Persistent Storage
Objective: Manage stateful data.
Details: Use a StorageClass for dynamic PV provisioning. Deploy a simple database that uses a PVC.
Break-It-Friday: Debug a full multi-tier application
Objective: Diagnose and resolve issues in a complex, multi-service application.
Details: The app returns 503 errors. Diagnose a failing readinessProbe and fix the underlying health check.
Ingress Controllers
Objective: Expose HTTP/S routes from outside the cluster.
Details: Implement path-based routing. Configure TLS termination on the Ingress controller using cert-manager.
Intro to Service Mesh
Objective: Understand the “why” behind a service mesh.
Details: Install Istio and automatically inject the Envoy sidecar proxy into your application pods. Use Kiali to visualize your service graph.
Istio Traffic Management
Objective: Implement advanced routing, reliability, and security patterns.
Details: Implement a canary release, sending 10% of traffic to a new version. Configure request timeouts and retries.
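One possible shape for the 10% canary is sketched below: a small Python helper that builds an Istio VirtualService with weighted routes, a request timeout, and retries, then prints it as JSON for kubectl apply -f. The host "reviews", the subsets "v1" and "v2", and the timeout and retry values are assumptions for illustration. The subsets would also need a matching DestinationRule, which is not shown.

```python
import json


def canary_virtual_service(host, stable_subset, canary_subset, canary_weight):
    """Build a VirtualService that sends canary_weight percent of traffic to the canary."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [
                {
                    "route": [
                        {
                            "destination": {"host": host, "subset": stable_subset},
                            "weight": 100 - canary_weight,
                        },
                        {
                            "destination": {"host": host, "subset": canary_subset},
                            "weight": canary_weight,
                        },
                    ],
                    # Hypothetical values: overall timeout and per-attempt retry budget.
                    "timeout": "5s",
                    "retries": {"attempts": 3, "perTryTimeout": "1s"},
                }
            ],
        },
    }


vs = canary_virtual_service("reviews", "v1", "v2", canary_weight=10)
print(json.dumps(vs, indent=2))
```

Deriving the stable weight as 100 minus the canary weight keeps the two routes summing to 100, which Istio expects of weighted destinations.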
Network Policies
Objective: Secure pod-to-pod communication within the cluster.
Details: Implement granular policies. Allow ingress only on a specific port. Configure an egress policy.
Break-It-Friday: Debug advanced networking and service mesh issues
Objective: Diagnose complex connectivity and policy issues.
Details: Services are returning 503s after Istio injection. Debug sidecar connectivity issues.
RBAC (Role-Based Access Control)
Objective: Configure role-based access control.
Details: Differentiate Role vs. ClusterRole. Create a ServiceAccount for an application and bind it to a Role.
Pod Security
Objective: Harden workloads using built-in Kubernetes security standards.
Details: Configure a securityContext to run as a non-root user and drop unnecessary capabilities.
Secrets Management
Objective: Securely manage and inject secrets into applications.
Details: Integrate with HashiCorp Vault using the Vault Secrets Operator or External Secrets Operator.
Runtime Security
Objective: Detect and prevent threats in running containers.
Details: Deploy Falco. Trigger a suspicious event and observe the Falco alert.
Break-It-Friday: Debug security misconfigurations
Objective: Identify and correct common security flaws.
Details: A Pod fails to start due to a security policy violation. Read the error message to identify the constraint being violated.
Advanced Storage
Objective: Explore Container Storage Interface (CSI).
Details: Use the CSI driver to take a snapshot of a PersistentVolume and restore it.
Database Operations
Objective: Run production-grade stateful workloads.
Details: Deploy a highly available PostgreSQL cluster using a Patroni-based operator (e.g., Zalando's postgres-operator). Simulate a node failure.
Backup and Restore
Objective: Implement disaster recovery strategies.
Details: Perform a full cluster backup, including persistent volumes. Simulate a disaster and restore it using Velero.
Data Pipelines
Objective: Run message queues and data streaming platforms.
Details: Deploy a scalable Kafka cluster using the Strimzi operator. Create topics and produce/consume messages.
Break-It-Friday: Debug stateful application issues
Objective: Troubleshoot problems related to storage, databases, and data services.
Details: The database operator is failing to elect a new primary. Inspect the operator’s logs and the database cluster’s state.
Intro to GitOps
Objective: Manage cluster state declaratively.
Details: Define an ArgoCD Application object as YAML. Implement an “app of apps” pattern.
Progressive Delivery
Objective: Implement safer deployment strategies like canary and blue-green.
Details: Install Flagger. Configure a Canary custom resource to automate a canary release driven by Prometheus metrics.
Automated Testing
Objective: Run integration and end-to-end tests within the Kubernetes cluster.
Details: Build a CI pipeline that runs integration tests against services inside the cluster.
DevSecOps
Objective: Integrate security scanning and policy enforcement into CI/CD.
Details: Integrate static analysis (SAST) for your IaC files using kube-linter or terrascan.
Break-It-Friday: Debug CI/CD and GitOps pipelines
Objective: Troubleshoot and optimize automated delivery pipelines.
Details: A canary analysis fails and rolls back. Analyze the metrics that caused Flagger to initiate the rollback.
Metrics with Prometheus
Objective: Collect and query time-series metrics.
Details: Configure Prometheus to scrape custom application metrics. Instrument your application with a Prometheus client library.
Logging with ELK/Loki
Objective: Aggregate and analyze logs.
Details: Deploy the ELK Stack. Create a Kibana dashboard to search and visualize logs.
Tracing with Jaeger
Objective: Trace requests across microservices.
Details: Instrument your services using the OpenTelemetry SDK. View the end-to-end trace in Jaeger.
Alerting with Alertmanager
Objective: Define alerts and route them.
Details: Configure Alertmanager to deduplicate and group alerts. Route alerts to a Slack channel.
Break-It-Friday: Debug the observability stack
Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.
Details: Logs are not appearing in Kibana. Debug the logging agent’s configuration, permissions, and connection to Elasticsearch.
Cluster Autoscaling
Objective: Automatically scale the number of nodes.
Details: Deploy the cluster-autoscaler for your cloud provider. Create a load that triggers the provisioning of a new node.
Custom Controllers
Objective: Extend the Kubernetes API.
Details: Use Kubebuilder or Operator SDK to scaffold a basic operator.
Advanced Automation
Objective: Automate complex operational tasks.
Details: Write a Kubernetes CronJob that automatically runs a Velero backup.
Cost Optimization (FinOps)
Objective: Monitor, allocate, and optimize Kubernetes cloud costs.
Details: Install and configure OpenCost to get a detailed breakdown of costs.
Break-It-Friday: Debug cluster operations
Objective: Troubleshoot core cluster-level operations and automation.
Details: An operator is stuck in a reconciliation loop. Debug the operator’s logs to find the error.
Multi-Cluster Architectures
Objective: Understand patterns for managing multiple clusters.
Details: Set up a second K8s cluster. Use ArgoCD to deploy applications to both clusters from a single Git repository.
Cross-Cluster Networking
Objective: Enable seamless communication between services.
Details: Implement a multi-cluster service mesh using Istio’s east-west gateway pattern.
Global Workload Distribution
Objective: Distribute traffic and workloads intelligently.
Details: Configure a Global Load Balancer (GLB) to distribute traffic to Ingress controllers in your two clusters.
Multi-Cluster Security
Objective: Manage identity and policy across a fleet of clusters.
Details: Use a tool like Open Policy Agent (OPA) to enforce consistent security policies.
Break-It-Friday: Test and debug a multi-cluster failover
Objective: Validate and troubleshoot a multi-region disaster recovery setup.
Details: Test the automated GLB health checks. Take down the application in one cluster and verify the GLB redirects traffic.
Advanced Autoscaling
Objective: Implement more sophisticated scaling strategies.
Details: Deploy the Vertical Pod Autoscaler (VPA) to get automatic resource rightsizing recommendations.
Resource Efficiency
Objective: Optimize resource utilization.
Details: Run a Descheduler to evict and rebalance pods for better bin packing.
High-Performance Networking
Objective: Accelerate network throughput.
Details: Configure jumbo frames on your CNI (e.g., MTU 9001 on AWS, 9000 on most other networks).
Chaos Engineering
Objective: Proactively test system resilience.
Details: Deploy Chaos Mesh. Use its UI to run controlled chaos experiments.
Platform Engineering
Objective: Build tools to improve developer experience.
Details: Build a simple software template using Backstage.
Capstone: Scoping
Objective: Scope and architect your final project.
Details: Choose your specialization. Draw an architecture diagram and create a work plan.
Capstone: Build Day 1
Objective: Build the core infrastructure and application components.
Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.
Capstone: Build Day 2
Objective: Implement advanced, specialized features.
Details: Integrate the key technology for your track. Configure advanced networking and security.
Capstone: Testing & Hardening
Objective: Test, secure, and document your platform.
Details: Perform load testing, run chaos experiments, and write operational runbooks.
Capstone: Demo Day
Objective: Present your final project.
Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.
Advanced Track (Lessons 121-180)
Docker Fundamentals
Objective: Understand what a container is and build your first image.
Details: Deep dive into container primitives. Use unshare and nsenter to manually create and inspect kernel namespaces.
Image Optimization
Objective: Build small, secure, and efficient container images.
Details: Explore advanced BuildKit features. Use --mount=type=cache. Compare image slimming techniques (distroless vs. Alpine vs. scratch).
Container Networking & Storage
Objective: Connect containers and persist data.
Details: Create and inspect custom Docker network drivers. Benchmark I/O performance of different volume types.
Multi-Container Apps
Objective: Define and run complex applications locally with Docker Compose.
Details: Use docker-compose override files for different environments. Template a docker-compose.yml using yq or Helm.
Break-It-Friday: Debug containerized environments
Objective: Debug common issues in containerized environments.
Details: A container has a file permission error. Diagnose the UID/GID mismatch and fix it without running as root.
Intro to Kubernetes
Objective: Understand the purpose of Kubernetes and set up a local cluster.
Details: Bootstrap a single-node cluster from scratch using kubeadm. Inspect control plane manifests in /etc/kubernetes/manifests.
Pods: The Atomic Unit
Objective: Master the fundamental building block of Kubernetes.
Details: Explore Pod QoS Classes (Guaranteed, Burstable, BestEffort). Configure static Pods.
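The three QoS classes follow from how requests and limits are set on a Pod's containers. The sketch below encodes the documented rules in simplified form: Guaranteed when every container has CPU and memory limits with equal requests, BestEffort when nothing is set at all, and Burstable otherwise. As an assumption of this example, quantities are compared as plain strings, so "500m" and "0.5" would wrongly be treated as different, and init containers are ignored.

```python
def qos_class(containers):
    """containers: list of dicts with optional 'requests' and 'limits' maps."""
    anything_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            anything_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # An omitted request defaults to the limit when a limit is present.
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not anything_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


full = {"cpu": "500m", "memory": "256Mi"}
print(qos_class([{}]))
print(qos_class([{"requests": full, "limits": full}]))
print(qos_class([{"requests": {"cpu": "250m"}}]))
```

The class matters mostly under node memory pressure, where BestEffort Pods are evicted first and Guaranteed Pods last.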
Workload Controllers: Deployments
Objective: Manage stateless applications and perform rolling updates.
Details: Trigger a deployment rollback. Write a script to automate canary analysis.
Services & Networking
Objective: Expose applications within and outside the cluster.
Details: Create a Headless Service for a StatefulSet. Use nslookup from within a pod to inspect K8s DNS A and SRV records.
Break-It-Friday: Debug K8s scheduling and networking
Objective: Diagnose and resolve common Kubernetes scheduling and networking problems.
Details: DNS resolution fails intermittently. Debug CoreDNS logs and configurations.
ConfigMaps & Secrets
Objective: Manage application configuration and sensitive data.
Details: Explore patterns for secret management. Use immutable Secrets. Implement a “reloader” sidecar.
Health Probes & Lifecycle
Objective: Build self-healing and resilient applications.
Details: Write a custom health check endpoint that checks downstream dependencies. Configure probes to use this endpoint.
Resource Management
Objective: Define resource requests and limits.
Details: Deep dive into CPU CFS quotas. Configure LimitRange objects to enforce resource constraints per-namespace.
Persistent Storage
Objective: Manage stateful data.
Details: Explore different volume accessModes. Set up an NFS server and a StorageClass to provision ReadWriteMany volumes.
Break-It-Friday: Debug a full multi-tier application
Objective: Diagnose and resolve issues in a complex, multi-service application.
Details: The database Pod can’t start after a node failure. Debug a multi-attach error on its volume.
Ingress Controllers
Objective: Expose HTTP/S routes from outside the cluster.
Details: Explore the Gateway API. Implement advanced traffic manipulation with Ingress annotations.
Intro to Service Mesh
Objective: Understand the “why” behind a service mesh.
Details: Analyze the resource overhead of the Istio sidecar. Explore alternative service mesh implementations.
Istio Traffic Management
Objective: Implement advanced routing, reliability, and security patterns.
Details: Implement circuit breaking using a DestinationRule. Inject HTTP faults to test application resilience.
Network Policies
Objective: Secure pod-to-pod communication within the cluster.
Details: Use a CNI that supports advanced policies. Implement L7 policies that filter traffic.
Break-It-Friday: Debug advanced networking and service mesh issues
Objective: Diagnose complex connectivity and policy issues.
Details: A NetworkPolicy is blocking legitimate traffic. Use a tool like np-viewer or Cilium’s Hubble to visualize and debug policy rules.
RBAC (Role-Based Access Control)
Objective: Configure role-based access control.
Details: Design an RBAC strategy for a multi-tenant organization. Use ClusterRole aggregation. Audit RBAC permissions.
Pod Security
Objective: Harden workloads using built-in Kubernetes security standards.
Details: Explore an external policy engine like OPA Gatekeeper or Kyverno to enforce custom policies.
Secrets Management
Objective: Securely manage and inject secrets into applications.
Details: Implement dynamic secrets with Vault. Configure automated secret rotation.
Runtime Security
Objective: Detect and prevent threats in running containers.
Details: Write custom Falco rules. Integrate Falco alerts with a SIEM or Alertmanager.
Break-It-Friday: Debug security misconfigurations
Objective: Identify and correct common security flaws.
Details: A user has overly broad permissions. Find and fix the ClusterRoleBinding that is granting cluster-admin.
Advanced Storage
Objective: Explore Container Storage Interface (CSI).
Details: Implement Volume Cloning. Explore advanced features like topology-aware volume provisioning.
Database Operations
Objective: Run production-grade stateful workloads.
Details: Configure a connection pooler like PgBouncer as a sidecar.
Backup and Restore
Objective: Implement disaster recovery strategies.
Details: Configure scheduled backups. Set up Velero to back up to a different cloud region.
Data Pipelines
Objective: Run message queues and data streaming platforms.
Details: Configure Kafka with mTLS for secure communication. Set up monitoring for Kafka brokers and topics.
Break-It-Friday: Debug stateful application issues
Objective: Troubleshoot problems related to storage, databases, and data services.
Details: Velero volume backup is failing. Debug the volume snapshotter configuration and cloud provider permissions.
Intro to GitOps
Objective: Manage cluster state declaratively.
Details: Explore advanced ArgoCD features like sync waves and hooks. Implement a custom health check.
Progressive Delivery
Objective: Implement safer deployment strategies like canary and blue-green.
Details: Implement A/B testing with Flagger. Integrate with a service mesh for L7 traffic shifting.
Automated Testing
Objective: Run integration and end-to-end tests within the Kubernetes cluster.
Details: Use a tool like Testkube or K6 Operator to run and manage complex test suites.
DevSecOps
Objective: Integrate security scanning and policy enforcement into CI/CD.
Details: Implement a policy-as-code gate in your pipeline using OPA.
Break-It-Friday: Debug CI/CD and GitOps pipelines
Objective: Troubleshoot and optimize automated delivery pipelines.
Details: The CI pipeline is slow. Optimize Docker layer caching and run pipeline stages in parallel.
Metrics with Prometheus
Objective: Collect and query time-series metrics.
Details: Set up a highly-available, scalable Prometheus architecture using Thanos or Cortex.
Logging with ELK/Loki
Objective: Aggregate and analyze logs.
Details: Evaluate and deploy Grafana Loki as a lightweight alternative to ELK. Correlate logs with metrics in Grafana.
Tracing with Jaeger
Objective: Trace requests across microservices.
Details: Use traces to identify performance bottlenecks. Analyze the critical path of a request.
Alerting with Alertmanager
Objective: Define alerts and route them.
Details: Implement complex alerting logic. Create escalation policies and notification routing.
Break-It-Friday: Debug the observability stack
Objective: Diagnose and fix issues within the monitoring, logging, and tracing systems.
Details: An alert is flapping. Tune the for clause in the alerting rule to make it less sensitive.
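The effect of the for clause on a flapping alert can be previewed with a small simulation. The sketch below is a simplified model of Prometheus alert states, not its actual rule evaluator: each sample is one rule evaluation, the alert is "pending" while the condition holds, becomes "firing" once it has held continuously for the for duration, and resets to "inactive" the moment the condition clears. The 30-second interval and the sample data are invented.

```python
def alert_states(samples, for_seconds):
    """samples: list of (timestamp_seconds, condition_is_true) in time order."""
    states = []
    active_since = None
    for timestamp, breached in samples:
        if not breached:
            active_since = None  # any clear evaluation resets the timer
            states.append("inactive")
            continue
        if active_since is None:
            active_since = timestamp
        held_for = timestamp - active_since
        states.append("firing" if held_for >= for_seconds else "pending")
    return states


# A flapping condition: true for two evaluations, then clear, repeated.
flapping = [(0, True), (30, True), (60, False), (90, True), (120, True), (150, False)]
# A sustained problem: true for every evaluation.
sustained = [(t, True) for t in range(0, 151, 30)]

print(alert_states(flapping, for_seconds=0))
print(alert_states(flapping, for_seconds=120))
print(alert_states(sustained, for_seconds=120))
```

With a 120-second for duration the flapping series never reaches "firing", while the sustained series still fires after two minutes. That is the trade-off being tuned: less noise in exchange for slower detection.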
Cluster Autoscaling
Objective: Automatically scale the number of nodes.
Details: Replace cluster-autoscaler with Karpenter. Configure Karpenter NodePools (called Provisioners in older releases) for different workload types.
Custom Controllers
Objective: Extend the Kubernetes API.
Details: Implement the reconciliation loop for your operator. Add logic to ensure the state of your application always matches the CRD.
Advanced Automation
Objective: Automate complex operational tasks.
Details: Build a simple “chaos” operator that periodically and randomly deletes pods.
Cost Optimization (FinOps)
Objective: Monitor, allocate, and optimize Kubernetes cloud costs.
Details: Implement a strategy to run stateless workloads on Spot Instances. Configure automated rightsizing recommendations.
Break-It-Friday: Debug cluster operations
Objective: Troubleshoot core cluster-level operations and automation.
Details: Cloud costs have spiked unexpectedly. Use cost monitoring tools to identify the workload responsible for the increase.
Multi-Cluster Architectures
Objective: Understand patterns for managing multiple clusters.
Details: Use a Cluster API (CAPI) provider to provision and manage the lifecycle of multiple clusters declaratively.
Cross-Cluster Networking
Objective: Enable seamless communication between services.
Details: Deploy Submariner to create a flat L3 network fabric across multiple clusters.
Global Workload Distribution
Objective: Distribute traffic and workloads intelligently.
Details: Implement advanced traffic policies on the GLB, such as latency-based routing or active-passive failover.
Multi-Cluster Security
Objective: Manage identity and policy across a fleet of clusters.
Details: Implement a centralized identity management solution (e.g., using Pinniped).
Break-It-Friday: Test and debug a multi-cluster failover
Objective: Validate and troubleshoot a multi-region disaster recovery setup.
Details: A cross-cluster call is failing. Debug the entire path: DNS, GLB, east-west gateway, network policies, and service mesh config.
Advanced Autoscaling
Objective: Implement more sophisticated scaling strategies.
Details: Implement KEDA (Kubernetes Event-driven Autoscaling) to scale workloads based on external metrics.
Resource Efficiency
Objective: Optimize resource utilization.
Details: Write custom scheduling plugins to implement advanced affinity, anti-affinity, or topology spread constraints.
High-Performance Networking
Objective: Accelerate network throughput.
Details: Enable SR-IOV on a worker node and attach a pod directly to a Virtual Function.
Chaos Engineering
Objective: Proactively test system resilience.
Details: Automate chaos experiments as part of your CI/CD pipeline.
Platform Engineering
Objective: Build tools to improve developer experience.
Details: Use Crossplane to create a custom K8s API for provisioning cloud resources.
Capstone: Scoping
Objective: Scope and architect your final project.
Details: Choose your specialization. Draw an architecture diagram and create a work plan.
Capstone: Build Day 1
Objective: Build the core infrastructure and application components.
Details: Set up your K8s cluster(s), Git repos, and CI/CD pipelines.
Capstone: Build Day 2
Objective: Implement advanced, specialized features.
Details: Integrate the key technology for your track. Configure advanced networking and security.
Capstone: Testing & Hardening
Objective: Test, secure, and document your platform.
Details: Perform load testing, run chaos experiments, and write operational runbooks.
Capstone: Demo Day
Objective: Present your final project.
Details: Present your architecture, demo the live platform, and walk through your production readiness checklist.
