Project Architecture Stack Code Contact
Full-Stack Developer & AI Engineer

Building intelligent systems that think.

Cinematch AI is a full-featured movie recommendation platform with an intelligent chatbot, multi-provider AI integration, real-time streaming, and an enterprise-grade admin system — built from the ground up with Flask, PostgreSQL, and Redis.

90% Cost Reduction

Cinematch AI

An AI-powered movie recommendation platform with intelligent chat, multi-provider LLM orchestration, and production-grade safety and admin systems.

Local-First AI Architecture

Intelligent provider routing chain: Ollama (local) → Claude → OpenAI → Gemini → template fallback. Achieves sub-500ms local response times with graceful degradation to cloud providers.

LangChain Ollama Multi-Provider

Multi-Layer Safety

Content filtering, age verification, violence/profanity detection via Guardrails, domain restrictions, and fraud detection systems.

Guardrails CSRF

Couples Mode

Smart recommendations for two people watching together — balances both preferences so neither partner gets stuck with something they hate.

Scoring Engine Preference Matching

Data Pipeline

Automated scraping from TMDB, IMDb, Rotten Tomatoes, Metacritic, and Wikipedia, with content validation and deduplication.

Playwright Scrapy Celery

CineBot Chat

Real-time streaming responses via SSE, conversation memory across sessions, personality engine, and context-aware recommendations.

SSE Flask Redis
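The streaming piece boils down to SSE framing: each model token becomes a `data:` frame terminated by a blank line. A minimal sketch (function names are illustrative, not from the codebase):

```python
import json

def sse_format(event, data):
    """Serialize one Server-Sent Events frame: an optional event name,
    a JSON data payload, and the blank-line terminator."""
    frame = ""
    if event:
        frame += f"event: {event}\n"
    frame += f"data: {json.dumps(data)}\n\n"
    return frame

def stream_tokens(tokens):
    """Yield SSE frames for each model token, then a 'done' frame."""
    for tok in tokens:
        yield sse_format("token", {"text": tok})
    yield sse_format("done", {})
```

In a Flask app, a generator like this would back `Response(stream_tokens(...), mimetype="text/event-stream")`, letting the browser's `EventSource` render tokens as they arrive.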

Enterprise Admin & Monitoring

GUI desktop panel, web admin dashboard, CLI utilities, Prometheus + Grafana integration, health checks, rotating log management, and comprehensive backup systems with integrity verification.

Prometheus Grafana Gunicorn Docker Nginx

How It Was Built

A deep dive into the architectural decisions, design patterns, and engineering challenges behind Cinematch AI.

01

Foundation & Data Layer

Started with PostgreSQL schema design optimized for movie metadata — B-tree indexing on frequently queried fields, full-text search, and pgvector for embedding-based similarity. Built the Flask application with SQLAlchemy ORM and Alembic migrations for schema versioning.

PostgreSQL · pgvector · SQLAlchemy · Alembic
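The embedding-similarity piece can be sketched as a pgvector query ordered by the `<=>` operator, which computes cosine distance. Table and column names here are illustrative assumptions, not the actual schema:

```python
import math

# Hypothetical nearest-neighbor query over the movies table's
# embedding column; pgvector's ivfflat/HNSW index accelerates the sort.
SIMILAR_MOVIES_SQL = """
SELECT title, embedding <=> %(query_vec)s AS distance
FROM movies
ORDER BY embedding <=> %(query_vec)s
LIMIT 10;
"""

def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (na * nb)
```

Identical vectors score 0, orthogonal vectors 1, so `ORDER BY` ascending surfaces the closest matches first.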
02

AI Provider Chain

Designed the intelligent routing system using LangChain as the orchestration layer. The chain attempts local Ollama first (free, fast, private), falls back to Claude for complex queries, then OpenAI and Gemini, with a template-based fallback ensuring the system never fails. Added dynamic parameter controls — temperature, top-k sampling — adjusted automatically based on query context.

LangChain · Ollama · Claude API · OpenAI API · Gemini
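The dynamic parameter tuning could look something like the following. This is a hedged sketch of the idea, with made-up keyword heuristics and thresholds rather than the production logic:

```python
from dataclasses import dataclass

@dataclass
class GenParams:
    temp: float
    top_k: int

def dynamic_params(query: str) -> GenParams:
    """Illustrative context-based tuning: factual lookups get low
    temperature, open-ended asks get more sampling room."""
    q = query.lower()
    if any(w in q for w in ("when", "who", "year", "runtime")):
        return GenParams(temp=0.2, top_k=10)   # factual: stay deterministic
    if any(w in q for w in ("surprise", "random", "anything")):
        return GenParams(temp=1.0, top_k=80)   # exploratory: sample widely
    return GenParams(temp=0.7, top_k=40)       # default recommendation tone
```

The returned parameters would then be passed into the provider chain's invoke call, so the same query style gets consistent behavior regardless of which provider handles it.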
03

Recommendation Engine

Built the couples recommendation feature that balances preferences for two viewers. Designed a scoring system that ensures neither person gets stuck with something they dislike, plus adaptive learning that improves suggestions over time based on user feedback.

Python · NumPy · Pandas · Adaptive Learning
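The "neither person gets stuck" property maps naturally onto a least-misery blend: average both predicted scores, but veto anything below a misery floor for either viewer. A standalone sketch with illustrative weights, not the production values:

```python
def couples_score(score_a, score_b, misery_floor=0.35):
    """Blend two viewers' predicted ratings (0-1). Averaging rewards
    mutual appeal; the floor vetoes titles either partner would hate."""
    if min(score_a, score_b) < misery_floor:
        return 0.0                      # hard veto: someone would hate it
    return 0.7 * (score_a + score_b) / 2 + 0.3 * min(score_a, score_b)

def rank_for_couple(predictions):
    """predictions: {title: (score_a, score_b)} -> titles, best first."""
    return sorted(predictions,
                  key=lambda t: couples_score(*predictions[t]),
                  reverse=True)
```

Weighting the minimum alongside the average means a title both partners mildly like can beat one that only thrills a single viewer.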
04

Safety & Production Hardening

Implemented multi-layer content filtering with age verification guardrails, violence/profanity detection, and fraud detection. Added OAuth 2.0 with Google, CSRF protection, rate limiting with Redis-backed storage, and role-based access control. Containerized the full stack with Docker Compose, Nginx reverse proxy, Prometheus/Grafana monitoring, and automated backup systems.

Docker · Nginx · Redis · OAuth 2.0 · Prometheus
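Flask-Limiter handles the rate limiting in the app itself; the underlying Redis fixed-window pattern (INCR a per-key counter, EXPIRE it at the window boundary) can be sketched as follows. This is a standalone simulation with a dict in place of Redis, not the production code:

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the Redis INCR + EXPIRE
    pattern; the dict stands in for Redis so the sketch runs alone."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.store = {}  # key -> (window_start, count)

    def allow(self, key, now=None):
        now = time.time() if now is None else now
        window_start = now - now % self.window  # bucket the EXPIRE defines
        start, count = self.store.get(key, (window_start, 0))
        if start != window_start:
            start, count = window_start, 0      # previous window expired
        if count >= self.limit:
            return False                        # would surface as HTTP 429
        self.store[key] = (start, count + 1)
        return True
```

Backing the counters with Redis rather than process memory is what lets multiple Gunicorn workers enforce one shared limit.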

Tech Stack

Backend

Flask 3.0
Gunicorn
SQLAlchemy
Celery
Flask-Login
Flask-Limiter

AI / ML

LangChain
OpenAI
Anthropic Claude
Google Gemini
Ollama
spaCy / TextBlob
Sentence Transformers

Data

PostgreSQL
pgvector
Redis
Pandas / NumPy
Playwright
Scrapy

Infrastructure

Docker Compose
Nginx
Prometheus
Grafana
S3 / Boto3
Let's Encrypt SSL

Code Highlights

Key patterns and architectural decisions from the codebase.

services/ai_manager.py
class AIProviderChain:
    """Intelligent multi-provider LLM routing
    with automatic fallback and cost optimization."""

    PROVIDER_ORDER = [
        ("ollama", 0.00),    # Free, local
        ("claude", 0.015),   # Best quality
        ("openai", 0.02),    # Wide capability
        ("gemini", 0.01),    # Cost-effective
        ("template", 0.00),  # Never fails
    ]

    async def get_recommendation(self, query, ctx):
        for provider, cost in self.PROVIDER_ORDER:
            try:
                chain = self._build_chain(provider)
                params = self._dynamic_params(query, ctx)
                response = await chain.ainvoke({
                    "query": query,
                    "temperature": params.temp,
                    "context": ctx.history[-5:]
                })
                self.metrics.record(provider, cost)
                return response
            except ProviderError:
                logger.warning(f"{provider} failed, trying next")
                continue

        # Unreachable in practice: the template provider never raises.
        raise RuntimeError("all providers failed")

Intelligent Provider Routing

The AI provider chain uses a priority-ordered fallback system. Each request attempts the cheapest provider first (local Ollama), cascading through cloud providers only when needed.

  • Local-first for privacy and zero cost
  • Dynamic parameter tuning per query context
  • Automatic metrics tracking per provider
  • Template fallback guarantees 100% uptime
90% API cost reduction vs cloud-only
safety_integration.py
class SafetyPipeline:
    """Multi-layer content safety with
    age-appropriate filtering and guardrails."""

    def validate_response(self, response, user):
        checks = [
            self._check_age_appropriate(response, user),
            self._check_violence_level(response),
            self._check_profanity(response),
            self._check_domain_safety(response),
            self._check_content_filter(response),
        ]

        violations = [c for c in checks if not c.passed]

        if violations:
            logger.warning(
                "Safety violations",
                count=len(violations),
                types=[v.type for v in violations]
            )
            return self._safe_fallback(response, violations)

        return SafetyResult(
            passed=True,
            response=response,
            confidence=self._confidence_score(checks)
        )

Multi-Layer Safety Pipeline

Every AI response passes through 5 independent safety checks before reaching the user. Any violation triggers a safe fallback response.

  • Age-appropriate content filtering
  • Violence and profanity detection
  • Domain restriction enforcement
  • Confidence scoring for transparency
5 independent safety layers