TraceMyPods – Application Workflow (Dev Focused)

TraceMyPods is an AI chatbot with Ollama integration that lets users interact with AI models. The platform is designed for developers and DevOps engineers and covers building, deploying, and managing a microservice architecture.

Tech Stack: Kafka | Redis | MongoDB | Ollama (AI) | Node.js | Python | Go | VectorDB (Qdrant) | Microservices | Payments (Razorpay) | Invoice/Reports | Email | Postman Collection | Load Testing | OpenTelemetry | Gemini AI | S3 (File Browser) | Image Generation API

Landing Page


🤖 AI Assistant (Chat Box)


Workflow Overview

Workflow Architecture


This document outlines the architecture and workflow of the TraceMyPods application, detailing how various components interact to provide a seamless AI chatbot experience.

  • App-level communication
  • Kafka usage
  • Why vectors and OpenTelemetry
  • Why Razorpay for payments
  • Why MongoDB and Redis
  • Why Ollama and free models

Routes, APIs, and POST Requests

  • /api/ask : Handles AI queries and routes them to the appropriate model.
  • /api/redis-data : Provides Redis analytics and active user data.
  • /api/db-data : Displays MongoDB analytics and user data.
  • /api/s3-data : Fetches S3 analytics and file browser data.
  • /api/s3-page : Displays S3 file browser with pagination.
  • /api/s3-analytics : Provides S3 analytics and file statistics.
  • /api/s3-presigned : Generates presigned URLs for S3 files.
  • /api/s3-folders : Lists S3 folders for file organization.
  • /api/s3-search : Implements advanced search capabilities for S3 files.
  • /deliver : Handles invoice delivery via email.
  • /order : Manages order processing and payment verification.
  • /send-otp : Sends OTP for email verification during order processing.
  • /verify-otp : Verifies OTP for email confirmation.
  • /create-order : Creates an order after payment verification.
  • /verify-payment : Verifies payment status with Razorpay.
  • /api/token : Manages token generation and validation.
  • /api/validate-premium-token : Validates premium tokens for model access.
  • /api/services : Lists the services known to Jaeger and their traces.
  • /api/heatmap : Provides a heatmap of user interactions and application performance, built from Jaeger data.
  • /api/traces : Displays trace information for requests and their paths.
  • /api/chat : Manages chat sessions and user interactions in oteldashapi.
  • /api/optimize : Provides optimization suggestions for application performance.
  • /otel (rewritten to /) : Integrates OpenTelemetry for observability.
  • / (default/fallback route) : Serves the main application interface, including the chat box and admin dashboard.
  • Public APIs used:
    • Cloudflare Workers AI for image generation
    • Gemini AI for explaining Jaeger traces
  • Internal routes:
    • /api/embedding : Handles embedding generation for user queries.
    • /api/vector : Manages vector storage and retrieval using Qdrant.
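
The route list above implies path-based routing with a rewrite for `/otel` and a fallback to the frontend. A minimal sketch of longest-prefix routing follows. The route-to-service mapping is an assumption inferred from the component names in this document, not the project's actual ingress configuration, and `resolve` is a hypothetical helper.

```python
# Hypothetical route table: path prefix -> (service name, rewrite target or None).
# The mapping is an assumption for illustration only.
ROUTES = {
    "/api/ask": ("askapi", None),
    "/api/token": ("tokenapi", None),
    "/api/validate-premium-token": ("tokenapi", None),
    "/order": ("orderapi", None),
    "/deliver": ("deliverapi", None),
    "/otel": ("oteldash", "/"),   # "/otel" is rewritten to "/"
    "/": ("ai-frontend", None),   # default/fallback route
}


def resolve(path: str) -> tuple[str, str]:
    """Return (service, upstream_path) using longest-prefix matching."""
    for prefix in sorted(ROUTES, key=len, reverse=True):
        # Match the prefix exactly or as a whole leading path segment.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            service, rewrite = ROUTES[prefix]
            if rewrite is not None:
                # Replace the matched prefix with the rewrite target.
                rest = path[len(prefix):].lstrip("/")
                return service, rewrite.rstrip("/") + "/" + rest
            return service, path
    raise LookupError(path)
```

With this table, `/otel/traces` would reach the dashboard as `/traces`, and any unmatched path such as `/chat.html` falls through to the frontend.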

🧩 Components Overview (ARCH)

1. Frontend Pod: ai-frontend

A simple browser UI that serves as the landing page, built with plain HTML.

  • Tech: HTML, CSS, JavaScript
  • /chat.html : Frontend chat box with prompt & token input
  • /image.html : Image Generation UI with text input
  • /admin.html : Admin dashboard for analytics and data management
    • S3 file browser
    • Active users
    • Redis and MongoDB analytics
    • Bucket invoices
    • S3 search
    • Total orders and revenue
  • /otel : OpenTelemetry dashboard for observability, integrated with Jaeger and Gemini AI
  • /pay.html : Payment page to buy premium model access

2. adminapi microservice

  • Tech: Node.js, Express, MongoDB, Redis, S3
  • Handles admin operations, analytics, and data management.
  • Provides endpoints for viewing Redis, MongoDB, and S3 analytics.
  • Shows active users, total orders, and revenue.
  • Integrates with S3 for invoice preview with presigned URLs.
  • Provides advanced search over S3 files.

3. askapi microservice

  • Tech: Node.js, Express, Redis, MongoDB, Ollama
  • Handles AI queries and validates tokens against MongoDB and Redis.
  • Routes requests to the appropriate AI model.
  • Integrates with vectorapi to create embeddings, store them in the vector DB, and run vector searches for enhanced query handling.
  • Uses OpenTelemetry for tracing and sends traces to Jaeger, where they are later visualized in the OTel dashboard.
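
The askapi responsibilities above can be read as a three-step flow: validate the token, check the vector cache, then fall back to the model. The sketch below shows that ordering in Python for brevity (the service itself is Node.js). All function and parameter names are hypothetical, and the collaborators are passed in as plain callables rather than real Redis, Qdrant, or Ollama clients.

```python
from typing import Callable, Optional


def handle_ask(
    prompt: str,
    token: str,
    is_valid_token: Callable[[str], bool],
    vector_lookup: Callable[[str], Optional[str]],
    query_model: Callable[[str], str],
    vector_store: Callable[[str, str], None],
) -> dict:
    """Validate the token, try the vector cache, then fall back to the model."""
    if not is_valid_token(token):
        return {"status": 401, "error": "invalid or expired token"}

    cached = vector_lookup(prompt)
    if cached is not None:
        return {"status": 200, "answer": cached, "source": "vector-cache"}

    answer = query_model(prompt)
    vector_store(prompt, answer)  # store so a later lookup can hit the cache
    return {"status": 200, "answer": answer, "source": "model"}
```

Passing the collaborators as callables keeps the flow easy to exercise with a dictionary standing in for the vector cache.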

4. deliverapi microservice

  • Tech: Node.js, Express, S3, Kafka
  • Manages invoice generation and email delivery to users.
  • Integrates with S3 for invoice storage.
  • Uses Kafka for event-driven processing of order events.
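
As an illustration of event-driven processing, the sketch below turns one order event into a delivery plan (where to store the invoice and whom to email). The event fields (`order_id`, `email`), the `invoices/` key prefix, and the subject line are assumptions. The real event schema is not described in this document, and the actual service is written in Node.js.

```python
import json
from dataclasses import dataclass


@dataclass
class DeliveryPlan:
    s3_key: str      # where the generated invoice would be stored
    recipient: str   # who receives the invoice email
    subject: str


def plan_delivery(message_value: bytes) -> DeliveryPlan:
    """Turn one order event (JSON bytes, as a Kafka consumer would receive it) into a plan."""
    event = json.loads(message_value.decode("utf-8"))
    order_id = event["order_id"]  # hypothetical field name
    return DeliveryPlan(
        s3_key=f"invoices/{order_id}.pdf",  # hypothetical key layout
        recipient=event["email"],           # hypothetical field name
        subject=f"Your TraceMyPods invoice for order {order_id}",
    )
```

Keeping the planning step free of I/O means the consumer loop only has to deserialize, call `plan_delivery`, and then perform the upload and send.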

5. tokenapi microservice

  • Tech: Node.js, Express, MongoDB, Redis
  • Manages token generation and validation.
  • Issues free tokens for basic model access and stores them in Redis with a 1-hour TTL.
  • Generates premium tokens for paid models, validates them against MongoDB, and caches them in Redis.
  • Integrates with orderapi for premium token issuance after payment.
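
The token rules above (free tokens expire after one hour; premium tokens are checked against the database and then cached) can be sketched with an in-memory stand-in for Redis. `TtlStore`, the `token:` key prefix, and the premium cache TTL are assumptions for illustration. A real deployment would use Redis and MongoDB clients instead.

```python
import secrets
import time

FREE_TOKEN_TTL_SECONDS = 3600     # 1-hour TTL, as stated above
PREMIUM_CACHE_TTL_SECONDS = 3600  # assumption: the cache TTL for premium tokens is not documented


class TtlStore:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._data = {}

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily drop expired entries
            return None
        return value


def issue_free_token(store):
    """Create a random token and store it with the free-tier TTL."""
    token = secrets.token_urlsafe(16)
    store.set(f"token:{token}", "free", FREE_TOKEN_TTL_SECONDS)
    return token


def validate_premium(token, cache, premium_tokens):
    """Check the cache first, then the persistent set (stand-in for MongoDB)."""
    key = f"token:{token}"
    if cache.get(key) == "premium":
        return True
    if token in premium_tokens:
        cache.set(key, "premium", PREMIUM_CACHE_TTL_SECONDS)
        return True
    return False
```

The clock is injectable so expiry can be checked without waiting an hour.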

6. orderapi microservice

  • Tech: Node.js, Express, MongoDB, Redis, Kafka
  • Handles order processing, email verification via OTP, and payment verification.
  • Integrates with paymentapi, which is backed by Razorpay, for payment processing.
  • Integrates with tokenapi to generate premium tokens upon successful payment and stores them in Redis and MongoDB.
  • Sends created orders and tokens to a Kafka topic for further processing, e.g. by deliverapi.
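
The OTP step in the order flow can be sketched as "generate a code, remember it briefly, accept it once". The 6-digit length, the 5-minute validity window, and the `OtpBook` class are assumptions. The document does not state how orderapi stores or expires OTPs.

```python
import hmac
import secrets
import time

OTP_TTL_SECONDS = 300  # hypothetical validity window


def generate_otp(digits: int = 6) -> str:
    """Return a zero-padded random numeric code."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


class OtpBook:
    """Tracks one pending OTP per email address."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._pending = {}

    def send(self, email: str) -> str:
        code = generate_otp()
        self._pending[email] = (code, self._clock() + OTP_TTL_SECONDS)
        return code  # a real service would email this instead of returning it

    def verify(self, email: str, code: str) -> bool:
        entry = self._pending.get(email)
        if entry is None:
            return False
        expected, expires_at = entry
        if self._clock() >= expires_at:
            del self._pending[email]
            return False
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected.encode(), code.encode()):
            del self._pending[email]  # one-time use
            return True
        return False
```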

7. paymentapi microservice

  • Tech: Node.js, Express, Razorpay
  • Manages payment operations using Razorpay with test keys.
  • Handles OTP verification and payment completion.
  • Integrates with orderapi to create premium tokens after payment.
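
Razorpay's documented checkout verification is, to my understanding, an HMAC-SHA256 over `order_id|payment_id` keyed with the account's key secret and compared against the signature returned by the checkout. A minimal sketch using only the standard library follows. The function names and sample values are hypothetical, and this is not necessarily how paymentapi (a Node.js service) implements it.

```python
import hashlib
import hmac


def expected_signature(order_id: str, payment_id: str, key_secret: str) -> str:
    """Hex-encoded HMAC-SHA256 of "order_id|payment_id" keyed with the secret."""
    message = f"{order_id}|{payment_id}".encode("utf-8")
    return hmac.new(key_secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify_payment(order_id: str, payment_id: str, signature: str, key_secret: str) -> bool:
    """Compare the supplied signature with the expected one in constant time."""
    expected = expected_signature(order_id, payment_id, key_secret)
    return hmac.compare_digest(expected.encode(), signature.encode())
```

Only after `verify_payment` returns `True` would the order be marked paid and a premium token requested.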

8. oteldash microservice

  • Tech: React, Gemini AI
  • Provides observability dashboards for monitoring application traces and performance.
  • Integrates with Jaeger for distributed tracing visualization.
  • Displays metrics and traces from various microservices.
  • Dashboard AI features include:
    • Trace visualization with detailed spans and logs.
    • Integration with Gemini AI for enhanced trace explanations.
    • Optimization recommendations based on trace data.
    • Trace explanation using Gemini AI for better understanding of complex traces.

9. otelapi microservice

  • Tech: Go, OpenTelemetry, Jaeger
  • Backend for OpenTelemetry metrics and traces.
  • Fetches traces from Jaeger for visualization in oteldash.
  • Provides APIs for querying and visualizing metrics.
  • Handles the AI explanation of traces using Gemini AI.
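
One way a heatmap could be derived from trace data is to count spans per service and latency bucket. The sketch below assumes spans have already been fetched and flattened into dictionaries with `service` and `duration_ms` keys. Both the span shape and the bucket boundaries are hypothetical, and the actual otelapi is written in Go.

```python
from collections import Counter

BUCKETS_MS = (10, 50, 100, 500)  # hypothetical upper bounds, in milliseconds


def bucket_label(duration_ms: float) -> str:
    """Map a duration to the first bucket whose upper bound exceeds it."""
    for upper in BUCKETS_MS:
        if duration_ms < upper:
            return f"<{upper}ms"
    return f">={BUCKETS_MS[-1]}ms"


def build_heatmap(spans) -> dict:
    """Count spans per (service, latency bucket) cell."""
    counts = Counter()
    for span in spans:
        counts[(span["service"], bucket_label(span["duration_ms"]))] += 1
    return dict(counts)
```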

10. vectorapi microservice

  • Tech: Python, Qdrant, Embeddings
  • Manages vector embeddings and similarity search.
  • Integrates with Qdrant for efficient vector storage and retrieval.
  • Integrates with embeddingapi to create embeddings for user queries and store them in Qdrant.
  • Integrates with askapi to check vector cache before querying AI models.
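
The "vector cache" check amounts to a nearest-neighbour lookup with a similarity threshold: if a stored embedding is close enough to the query embedding, its cached answer is reused. The sketch below uses plain cosine similarity over an in-memory list as a stand-in for Qdrant. The 0.9 threshold and the function names are assumptions.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, entries, threshold: float = 0.9):
    """Return the payload of the most similar entry at or above the threshold, else None."""
    best_payload = None
    best_score = threshold
    for vector, payload in entries:
        score = cosine(query, vector)
        if score >= best_score:
            best_payload, best_score = payload, score
    return best_payload
```

A higher threshold means fewer cache hits but less risk of returning an answer to a different question.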

11. embeddingapi microservice

  • Tech: Python, OpenAI Embeddings, Qdrant
  • Generates embeddings for user queries using OpenAI's embedding models.
  • Handles embedding generation for text queries.
  • Integrates with vectorapi for storing and querying embeddings.

12. Other Services

ollamapods

  • Hosts various AI models using Ollama.
  • Provides endpoints for querying AI models.
  • Supports custom model hosting and management.

Redis

  • Caching layer for token storage and user sessions, with a 1-hour TTL
  • Used for quick lookups and reducing database load

MongoDB

  • Primary database for user data, order details, and premium tokens
  • Stores persistent data with high availability

Qdrant (VectorDB)

  • Specialized database for storing and querying vector embeddings
  • Supports efficient similarity search and retrieval

Kafka (Event Streaming)

  • Used to handle order and delivery events
  • Ensures decoupled communication between services

S3 (AWS)

  • Object storage for invoices and other files
  • Provides scalable storage with presigned URLs for secure access

Cloudflare Workers AI

  • Provides AI capabilities for text/image generation
  • Integrates with the application for enhanced AI features

13. Load Testing

  • Load testing is performed using k6 to ensure the application can handle high traffic.
  • Simulates user interactions and measures performance metrics.

AI Models Overview

Top Models

| Model | Size (Quantized) | RAM (Min) | GPU (Optional) | Notes |
|---|---|---|---|---|
| TinyLlama | ~1.1 GB | 4 GB | None / 2 GB+ | Lightweight |
| Mistral-7B | ~4.2 GB | 8–16 GB | 8 GB+ VRAM | Powerful general-purpose |
| CodeLlama | 4.5–10 GB | 16–24 GB | 8–16 GB+ VRAM | Code-optimized |
| LLaMA 2 | 4.5–40 GB | 16–80 GB | 8–64 GB+ VRAM | Versatile but resource-heavy |
| Phi-2 | ~1.7 GB | 6–8 GB | 4 GB+ VRAM | Efficient and compact |
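
As a quick way to use the figures above, the sketch below filters models by the lower bound of the RAM (Min) column. The dictionary copies those lower bounds from the table, and `models_that_fit` is a hypothetical helper, not part of the project.

```python
# Lower bound of the "RAM (Min)" column above, in GB.
MIN_RAM_GB = {
    "TinyLlama": 4,
    "Mistral-7B": 8,
    "CodeLlama": 16,
    "LLaMA 2": 16,
    "Phi-2": 6,
}


def models_that_fit(available_ram_gb: float) -> list:
    """Return the models whose minimum RAM fits, sorted by name."""
    return sorted(name for name, need in MIN_RAM_GB.items() if need <= available_ram_gb)
```

For example, a node with 8 GB of RAM would be a candidate for Mistral-7B, Phi-2, and TinyLlama under these figures.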

Mini Models

| Model Name | Approx. Size | RAM Required | Description |
|---|---|---|---|
| TinyLlama (1.1B) | ~1.1 GB | 2–3 GB | Extremely lightweight; suitable for simple QA/chat |
| Phi-1.5 / Phi-2 | ~1.5–1.7 GB | 3–4 GB | Compact model from Microsoft optimized for reasoning |
| Gemma-2B (Google) | ~2.1 GB (quantized) | ~4 GB | Lightweight open-source model focused on chat |

📌 Author

Ahmad Raza
Sr. DevOps Engineer | Cloud Infra Specialist
🔗 ahmadraza.in
🔗 linkedin.com/in/ahmad-raza-devops


For more, visit ahmadraza.in
Detailed commands, manifests, and guides are available on my blog.

