🧠 TraceMyPods – DevOps First Microservice AI Platform

TraceMyPods is a Kubernetes-native, DevOps-focused, token-gated AI chat platform (chat with Ollama AI). It is built as a set of microservices running on a secured Istio service mesh with GPU acceleration and a full monitoring stack, and it is packaged with Terraform and Helm for deployment on AWS EKS. The platform covers AI interactions, token management, and analytics.

Visit Site: https://tracemypods.ahmadraza.in

Flow


🛠️ InfraStack

Istio Service Mesh | Jaeger | AWS EKS | GPU Nodes | Grafana | Loki | Prometheus
Helm | Terraform | Trivy | GitHub Actions | Falco (Security) | Kyverno (PSPolicy) | Kube-Hunter (Security Scanning) | Kube-Bench (CIS Benchmark)


☸️ Kubernetes Stack

IstioGateway/VirtualService | Ingress | HPA | VPA | Pod Disruption Budget | Network Policy | Resource Quotas | SecurityContext | PV, Storage Class, PVC, EBS | PodSecurityPolicies | RBAC | ConfigMaps / Secrets | Taints & Tolerations | Readiness & Liveness Probes
Deployments / CronJobs | Affinity / Anti-Affinity | Kiali | Kube-Hunter | Minikube (local)


🔗 Application Stack

Kafka | Redis | MongoDB | Ollama (AI) | Node.js | Python | Go | VectorDB (Qdrant) | Microservices | Payments (Razorpay) | Invoice/Reports | Email | Postman Collection | Load Testing | OpenTelemetry | Gemini AI | S3 (File Browser) | Image Generation API


📸 UI Previews

πŸ“ Landing Page

  • Landing page where users can explore the platform features, chat with AI models, and access various services.

Landing Page

🤖 Chat With Premium Models (Chat Box)

  • Chat interface for users to interact with premium AI models, powered by Ollama.

Chat With Premium Models

🤖 Free AI Assistant (Chat Box)

  • Chat interface for users to interact with free AI models, providing a seamless user experience.

Chat Box

🖼️ Image Generator API

  • Image generation API that allows users to create images based on text prompts, integrated with Cloudflare AI Workers.

Image Generator

📜 Platform Features

  • Features overview showcasing the capabilities of the TraceMyPods platform, including AI interactions, token management, and advanced analytics.

Platform Features

💳 Purchase API (Model Selection)

  • Model purchase interface where users can select models and proceed with payment.

Purchase Model Selection

💸 Payment Success

  • Payment success page confirming the successful transaction and model activation.

Payment Success

💸 RazorPay Checkout (Card Details)

  • Card details page for entering payment information securely.

RazorPay Checkout Card Details

💸 RazorPay Checkout (Confirm Payment)

  • Payment confirmation page for reviewing and confirming payment details.

RazorPay Checkout OTP

💸 RazorPay Checkout (Payment Successful)

  • Payment successful page confirming the completion of the transaction.

RazorPay Checkout Payment Successful

πŸ” Purchase Invoice (S3-Bucket)

  • Preview of the invoice generated after a successful purchase. (sent to user email and stored in S3 bucket)

Invoice Preview

πŸ” Purchase Successful Email

  • Preview of the email sent to users upon successful purchase, containing invoice/token details and confirmation.

Email Preview


Admin/Analytics Dashboard

Admin dashboard for viewing purchase history, managing free and paid tokens, tracking earnings, and exporting business purchase data.

  • View and search invoices stored in the S3 bucket.

podpays3-invoices

  • View order history and earnings, and export business purchase data, including active tokens and free tokens.

Admin Dashboard

📜 OtelAPI and Otel Dashboard

  • OtelAPI for code-level tracing and AI integration debugging, providing insights into application performance.

OtelDashboard OtelAI OtelAPI

πŸ” Kiali Observability

  • To view the service mesh topology, traffic flow, and health status of microservices.

Kiali Dark Mode Kiali Light Mode

📊 Prometheus Metrics in Grafana

  • Visualize metrics collected from the Kubernetes cluster and applications.

Grafana Metrics

📜 Logs from Loki

  • Viewing and searching logs from various microservices using Loki.

Loki Logs

📜 Jaeger Tracing (OTEL + GenAI)

  • Distributed tracing for microservices using Jaeger.

OtelAI

📜 Trivy Vulnerability Scanning

  • Scanning container images and Kubernetes clusters for vulnerabilities using Trivy.

📜 Falco Runtime Security Monitoring

  • Real-time monitoring and detection of security threats in running containers using Falco.

Falco Runtime Security Monitoring
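
As an illustration of the kind of detection Falco performs, here is a minimal sketch of a custom rule that flags an interactive shell started inside a container. The rule name, shell list, and output format are assumptions made for this example, and the rule relies on the `spawned_process` and `container` macros from Falco's default ruleset being loaded first; it is not taken from the project's Falco configuration.

```yaml
# Hypothetical custom Falco rule (illustrative, not the project's rule file).
# Assumes the default ruleset, which defines the macros
# `spawned_process` and `container`, is loaded before this file.
- rule: Shell spawned in a container
  desc: Detect an interactive shell started inside a running container
  condition: >
    spawned_process and container
    and proc.name in (bash, sh, zsh)
    and proc.tty != 0
  output: >
    Shell spawned in container
    (user=%user.name container=%container.name
    image=%container.image.repository cmdline=%proc.cmdline)
  priority: WARNING
  tags: [container, shell]
```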

📜 Kube-Hunter Vulnerability Scanning

  • Active reconnaissance tool for Kubernetes clusters to identify potential security issues.

📜 Kyverno PodSecurityPolicy Enforcement

  • Enforcing security policies for Kubernetes pods using Kyverno.
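
A typical pod security policy in Kyverno looks like the sketch below, which rejects privileged containers cluster-wide. It follows the widely used "disallow privileged containers" pattern and is an assumption about what such a policy could look like here, not the project's actual policy set. Depending on the Kyverno version, the failure action may need to be set per rule rather than at the `spec` level.

```yaml
# Illustrative Kyverno policy: block privileged containers in all Pods.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-privileged-containers
spec:
  validationFailureAction: Enforce   # reject non-compliant Pods instead of only auditing
  background: true                   # also report on existing resources
  rules:
    - name: privileged-containers
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Privileged mode is disallowed."
        pattern:
          spec:
            # "=(...)" means: if the field is present, it must match.
            =(initContainers):
              - =(securityContext):
                  =(privileged): "false"
            containers:
              - =(securityContext):
                  =(privileged): "false"
```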

📜 ArgoCD

  • GitOps deployment of TraceMyPods and Kafka to Kubernetes.
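
In Argo CD, a GitOps deployment of this kind is usually declared as an `Application` resource. The sketch below assumes a placeholder repository URL, a hypothetical chart path, and a `tracemypods` namespace; none of these values come from the project itself.

```yaml
# Illustrative Argo CD Application (repo URL, path, and namespaces are placeholders).
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: tracemypods
  namespace: argocd                 # namespace where Argo CD is installed (assumed)
spec:
  project: default
  source:
    repoURL: https://github.com/example/tracemypods.git   # placeholder
    targetRevision: main
    path: helm/tracemypods                                 # hypothetical chart path
  destination:
    server: https://kubernetes.default.svc                 # the cluster Argo CD runs in
    namespace: tracemypods
  syncPolicy:
    automated:
      prune: true        # delete resources that were removed from Git
      selfHeal: true     # revert manual drift back to the Git state
    syncOptions:
      - CreateNamespace=true
```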

📜 CloudWatch GPU Monitoring

  • Monitoring GPU utilization and performance metrics using CloudWatch.

CloudWatch GPU Monitoring

📜 HashiCorp Vault

  • Secrets management and data protection for sensitive information.

📸 Infra Architecture


☸️ EKS APP Architecture

This architecture diagram shows how TraceMyPods is deployed on AWS EKS: the Istio service mesh, GPU nodes, and the communication paths and workflows between microservices. It highlights Istio for secure service-to-service communication, Jaeger for distributed tracing, and the surrounding monitoring and security components.

EKS APP Architecture

Key Components:

  • EKS Cluster: The Kubernetes cluster where the TraceMyPods application is deployed. This cluster is configured with GPU nodes for AI workloads.
  • Istio Gateway: Manages ingress traffic, routing to different APIs and load balancing. IstioGateway.yaml
  • Istio Service Mesh: Enables secure, observable communication between microservices, with features like mutual TLS, traffic management, and telemetry. Istio
  • Kiali: Visualizes and manages the service mesh.
  • Jaeger: Works with OtelAPI to display distributed traces across microservices, including AI integration calls, which helps identify performance bottlenecks and follow requests between services. Jaeger
  • Kubecost: Integrates for real-time cost monitoring and optimization. (in progress)
  • ALB Ingress Controller: Handles external traffic routing via AWS Application Load Balancer. Ingress.tf
  • HPA & VPA: Horizontal and Vertical Pod Autoscalers for dynamic scaling based on resource usage.
  • PodDisruptionBudget: Maintains application availability during node updates or disruptions.
  • Resource Quotas: Enforces resource limits per namespace.
  • SecurityContext: Applies security settings at the pod and container level.
  • ConfigMaps & Secrets: Manages configuration and sensitive data.
  • Taints & Tolerations: Controls pod scheduling on specific nodes.
  • Readiness & Liveness Probes: Ensures pod health and availability.
  • NetworkPolicy: Restricts and controls pod-to-pod communication.
  • Affinity & Anti-Affinity: Optimizes pod placement for reliability and performance.
  • Persistent Storage: Uses StorageClass to dynamically provision EBS volumes for PVCs.
  • IRSA: Implements IAM Roles for Service Accounts for secure AWS service access (e.g., S3).
  • GPU Nodes: Supports AI workloads with NVIDIA T4 GPUs (g4dn.xlarge).
    tolerations:
      - key: "gpu"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"
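
The toleration above only allows a pod onto the tainted GPU nodes; the pod still has to request a GPU. A minimal sketch of a Deployment that does both is shown below. The Deployment name, labels, and image are assumptions for illustration, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is running on the cluster.

```yaml
# Hypothetical GPU workload (names and image are illustrative).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ollama-gpu
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ollama-gpu
  template:
    metadata:
      labels:
        app: ollama-gpu
    spec:
      # Tolerates a node taint of gpu=true:NoSchedule, as in the snippet above.
      tolerations:
        - key: "gpu"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule"
      # Well-known node label; pins the pod to g4dn.xlarge instances.
      nodeSelector:
        node.kubernetes.io/instance-type: g4dn.xlarge
      containers:
        - name: ollama
          image: ollama/ollama:latest     # assumed image
          resources:
            limits:
              nvidia.com/gpu: 1           # requires the NVIDIA device plugin
```

The toleration and the node selector do different jobs: the taint keeps non-GPU workloads off the expensive nodes, while the selector and GPU limit make sure this workload actually lands on one.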

AWS Architecture

This architecture diagram shows the AWS infrastructure behind TraceMyPods: EKS, a VPC with public and private subnets, a NAT Gateway, S3, SES, worker nodes, and other AWS services. Terraform provisions and manages these resources as Infrastructure as Code (IaC).

VPC

IaC Stack | TF

Terraform provisions the AWS resources for TraceMyPods, including EKS, the VPC, IAM roles, and S3 buckets. The configuration is built from reusable modules for the individual infrastructure components.

Terraform modules include:

  • EKS Cluster with GPU nodes
  • VPC with Public and Private Subnets
  • IAM roles for EKS and IRSA
  • S3 bucket for file storage
  • NAT Gateway for internet access in private subnets
  • Cloudflare for DNS and CDN
  • Security Groups for network access control
  • ALB Ingress Controller for traffic routing
  • Global Accelerator for improved application availability and performance
  • Helm installation of Istio, Kiali, Prometheus, Loki, Grafana, Falco, and Kafka
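
On the Kubernetes side, an IRSA role from the list above is consumed through an annotated ServiceAccount. The sketch below uses a hypothetical ServiceAccount name, namespace, and a placeholder role ARN; the real values would come from the Terraform outputs.

```yaml
# Illustrative ServiceAccount bound to an IAM role via IRSA.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: invoice-writer              # hypothetical name
  namespace: tracemypods            # hypothetical namespace
  annotations:
    # Placeholder ARN; pods using this ServiceAccount receive temporary
    # credentials for this role instead of static AWS keys.
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/tracemypods-s3-access
```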

Monitoring Stack

This stack uses Prometheus for metrics collection, Loki for log aggregation, and Grafana for visualization, providing monitoring and alerting for TraceMyPods.

Monitoring
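
As an example of the alerting side, a Prometheus rule file can fire when any scrape target stays unreachable. The group name, duration, and severity label below are assumptions chosen for illustration, not the project's alert configuration.

```yaml
# Illustrative Prometheus alerting rule file.
groups:
  - name: tracemypods-availability      # hypothetical group name
    rules:
      - alert: TargetDown
        expr: up == 0                    # "up" is 0 when a scrape target fails
        for: 5m                          # must hold for 5 minutes before firing
        labels:
          severity: critical
        annotations:
          summary: "Target {{ $labels.instance }} is down"
          description: "Job {{ $labels.job }} on {{ $labels.instance }} has failed scrapes for 5 minutes."
```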

Security Stack

This stack combines Kube-Hunter for cluster vulnerability scanning, Falco for runtime security monitoring, Kyverno for PodSecurityPolicy enforcement, and Trivy for container image vulnerability scanning and code scanning. Together they help keep TraceMyPods secure and aligned with best practices.

Tools used:

  • Kube-Hunter for vulnerability scanning | docs
  • Falco for runtime security monitoring | docs
  • Kyverno for PodSecurityPolicy enforcement | docs
  • GitHub Actions for CI to automate Trivy for container image vulnerability scanning and code scanning | docs | CI | Code Scanning
  • Kube-Bench for CIS Benchmark compliance | docs
  • Vault and Kubernetes Secrets for sensitive data management | docs
  • Network Policies for pod communication control | NP
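
A common way to apply Network Policies is to deny all ingress by default and then open only the paths that are needed. The sketch below assumes a `tracemypods` namespace, `app: api` and `app: mongodb` pod labels, and a CNI setup that enforces NetworkPolicy on the cluster; these are illustrative choices rather than the project's actual policies.

```yaml
# Deny all ingress to every pod in the namespace by default.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: tracemypods          # hypothetical namespace
spec:
  podSelector: {}                 # empty selector matches all pods
  policyTypes:
    - Ingress
---
# Allow only the API pods to reach MongoDB on its default port.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-to-mongodb
  namespace: tracemypods
spec:
  podSelector:
    matchLabels:
      app: mongodb                # assumed label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: api            # assumed label
      ports:
        - protocol: TCP
          port: 27017
```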

Deployment Strategy

This stack combines ArgoCD for GitOps, Helm for package management, and Terraform for Infrastructure as Code (IaC). It supports canary and blue-green deployments for rolling out TraceMyPods.

Tools used:

  • ArgoCD for GitOps | docs
  • Helm for package management | HELM
  • IAC with Terraform | TF
  • Canary, Rolling, and Blue-Green deployments in YAML files
  • PriorityClass for pod priority scheduling | PriorityClass
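
Since the platform runs on Istio, one way to express a canary rollout is weighted routing between two versions of a service. The sketch below assumes a hypothetical `chat-api` service whose pods carry `version: v1` and `version: v2` labels; the 90/10 split is an arbitrary example.

```yaml
# Define the two versions of the service as subsets.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: chat-api                  # hypothetical service
spec:
  host: chat-api
  subsets:
    - name: stable
      labels:
        version: v1
    - name: canary
      labels:
        version: v2
---
# Send 90% of requests to stable and 10% to the canary.
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: chat-api
spec:
  hosts:
    - chat-api
  http:
    - route:
        - destination:
            host: chat-api
            subset: stable
          weight: 90
        - destination:
            host: chat-api
            subset: canary
          weight: 10
```

Shifting the weights step by step (for example 90/10, then 50/50, then 0/100) promotes the canary, and setting the canary weight back to 0 rolls it back without redeploying.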

DevOps Stack

This stack covers the tooling used across TraceMyPods: Istio for the service mesh, Jaeger for distributed tracing, Prometheus for metrics collection, Loki for log aggregation, and Grafana for visualization.

Tools

Istio Service Mesh & Istio Gateway

  • Service mesh for secure, observable microservices | docs
  • Istio Gateway for traffic routing | IstioConfig.yaml
  • Kiali for service mesh observability
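
The two roles Istio plays here, ingress routing and secure service-to-service traffic, can be sketched with a Gateway and a mesh-wide mutual TLS policy. The Gateway name, the plain HTTP port, and the assumption that `istio-system` is the mesh root namespace are illustrative; the host is the site URL given at the top of this page.

```yaml
# Accept traffic for the public host on the Istio ingress gateway.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: tracemypods-gateway       # hypothetical name
  namespace: istio-system
spec:
  selector:
    istio: ingressgateway         # default label on Istio's ingress gateway pods
  servers:
    - port:
        number: 80
        name: http
        protocol: HTTP
      hosts:
        - "tracemypods.ahmadraza.in"
---
# Require mutual TLS for all workloads in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system         # applies mesh-wide when this is the root namespace
spec:
  mtls:
    mode: STRICT
```

A Gateway only opens the listener; a VirtualService bound to it is still needed to route requests to a specific service.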

Jaeger

Kube-Hunter

  • Cluster security scanning for vulnerabilities

Falco

  • Runtime security monitoring

Kyverno

Trivy

  • Container image vulnerability scanning
  • Cluster security checks
  • GitHub Actions CI for Trivy code scanning
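
A Trivy code scan in GitHub Actions could look like the sketch below, which scans the repository files and uploads the findings to GitHub code scanning. The workflow name, trigger branch, and severity filter are assumptions, and the action versions shown are examples rather than the versions pinned in the project's CI.

```yaml
# Illustrative workflow: Trivy filesystem scan with SARIF upload.
name: trivy-scan
on:
  push:
    branches: [main]
permissions:
  contents: read
  security-events: write          # needed to upload SARIF results
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan repository files with Trivy
        uses: aquasecurity/trivy-action@master
        with:
          scan-type: fs
          scan-ref: .
          format: sarif
          output: trivy-results.sarif
          severity: CRITICAL,HIGH
      - name: Upload results to GitHub code scanning
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: trivy-results.sarif
```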

GitHub Actions

  • CI pipeline to build Docker images (x86 and ARM) and push them to hub.docker.com | CI
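
A multi-architecture build of this kind is commonly done with Docker Buildx and QEMU emulation. The sketch below assumes secrets named `DOCKERHUB_USERNAME` and `DOCKERHUB_TOKEN`, a Dockerfile at the repository root, and a placeholder image name; it is not the project's actual pipeline.

```yaml
# Illustrative workflow: build for x86 and ARM and push to Docker Hub.
name: build-and-push
on:
  push:
    branches: [main]
jobs:
  docker:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-qemu-action@v3      # emulation for non-native architectures
      - uses: docker/setup-buildx-action@v3    # enables multi-platform builds
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}   # assumed secret name
          password: ${{ secrets.DOCKERHUB_TOKEN }}      # assumed secret name
      - uses: docker/build-push-action@v6
        with:
          context: .
          platforms: linux/amd64,linux/arm64
          push: true
          tags: example/tracemypods-api:latest          # placeholder image name
```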

Helm

  • Kubernetes package management for deployments | Helm

Postman Collection

Load Testing (Grafana k6)

In-Progress Features

  • KubeCost for cost monitoring
  • Spot Instances for cost optimization
  • AWS API Gateway for API management
  • EKS RBAC for fine-grained access control
  • Litmus chaos engineering
  • EFK stack (Elasticsearch, Fluent Bit, Kibana) or AWS OpenSearch, replacing Loki for logs

📘 Learning Resources

Topic | Resource
CKA | Kubernetes Official Docs, Killer.sh CKA
CKAD | KodeKloud CKAD, Katacoda Scenarios
CKS | Practical CKS Guide, Sysdig Threat Hunting
Istio | Istio.io Docs
Monitoring | Awesome Prometheus Alerts
GitOps | ArgoCD Docs, Weave GitOps
EKS Best Practices | AWS EKS Workshop

📌 Author

Ahmad Raza
Sr. DevOps Engineer | Cloud Infra Specialist
🔗 ahmadraza.in
🔗 linkedin.com/in/ahmad-raza-devops


For more, visit ahmadraza.in
Detailed commands, manifests, and guides are available on my blog.

