Cinematch AI is a full-featured movie recommendation platform with an intelligent chatbot, multi-provider LLM orchestration, real-time streaming, and production-grade safety and admin systems, built from the ground up with Flask, PostgreSQL, and Redis.
Intelligent provider routing chain: Ollama (local) → Claude → OpenAI → template fallback. Achieves sub-500ms local response times with graceful degradation to cloud providers.
Content filtering, age verification, violence/profanity detection via Guardrails, domain restrictions, and fraud detection systems.
Smart recommendations for two people watching together — balances both preferences so neither partner gets stuck with something they hate.
Automated scraping from TMDB, IMDb, Rotten Tomatoes, Metacritic, and Wikipedia, with content validation and deduplication.
Real-time streaming responses via SSE, conversation memory across sessions, personality engine, and context-aware recommendations.
GUI desktop panel, web admin dashboard, CLI utilities, Prometheus + Grafana integration, health checks, rotating log management, and comprehensive backup systems with integrity verification.
A deep dive into the architectural decisions, design patterns, and engineering challenges behind Cinematch AI.
Started with PostgreSQL schema design optimized for movie metadata — B-tree indexing on frequently queried fields, full-text search, and pgvector for embedding-based similarity. Built the Flask application with SQLAlchemy ORM and Alembic migrations for schema versioning.
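The embedding-based similarity that pgvector provides can be illustrated in plain Python. This is a minimal sketch of the cosine-distance ranking that pgvector's `<=>` operator computes server-side (with `vector_cosine_ops` indexes); the `most_similar` helper and the movie dicts are hypothetical, for illustration only.

```python
from math import sqrt

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity), the metric
    pgvector's <=> operator uses with vector_cosine_ops."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def most_similar(query_vec, movies, k=3):
    """Rank movie embeddings by distance to the query embedding,
    as ORDER BY embedding <=> :query LIMIT :k would in SQL."""
    return sorted(movies, key=lambda m: cosine_distance(query_vec, m["embedding"]))[:k]
```

In production the ranking runs inside PostgreSQL against an indexed `vector` column; the Python version only shows the math.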
Designed the intelligent routing system using LangChain as the orchestration layer. The chain attempts local Ollama first (free, fast, private), falls back to Claude for complex queries, then OpenAI, with a template-based fallback ensuring the system never fails. Added dynamic parameter controls — temperature, top-k sampling — adjusted automatically based on query context.
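The dynamic parameter control could look something like the sketch below. The keyword heuristics and threshold values are illustrative assumptions, not the actual routing rules: factual lookups get a low temperature, open-ended discovery queries get a higher one and a wider top-k.

```python
def dynamic_params(query, history_len):
    """Heuristic sampling-parameter tuning (illustrative thresholds):
    factual lookups stay precise, exploratory queries diversify."""
    q = query.lower()
    if any(w in q for w in ("who", "when", "year", "director", "cast")):
        return {"temperature": 0.2, "top_k": 10}   # factual: stay precise
    if any(w in q for w in ("surprise", "random", "anything", "mood")):
        return {"temperature": 0.9, "top_k": 80}   # exploratory: diversify
    # Default: moderate creativity, slightly tighter with long context
    temp = 0.7 if history_len < 5 else 0.5
    return {"temperature": temp, "top_k": 40}
```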
Built the couples recommendation feature that balances preferences for two viewers. Designed a scoring system that ensures neither person gets stuck with something they dislike, plus adaptive learning that improves suggestions over time based on user feedback.
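One common way to score for two viewers (an assumption here, not necessarily the exact formula in the codebase) is blending "least misery" with the joint average, so a movie one partner hates can never rank highly:

```python
def couple_score(score_a, score_b, balance=0.7):
    """Blend of 'least misery' (min) and average preference.
    balance=1.0 is pure least-misery; lower values favor the average."""
    least_misery = min(score_a, score_b)
    average = (score_a + score_b) / 2
    return balance * least_misery + (1 - balance) * average

def rank_for_couple(movies, prefs_a, prefs_b, k=3):
    """Rank candidate movies by the blended couple score.
    prefs_* map movie id -> predicted rating in [0, 1]."""
    scored = [(couple_score(prefs_a[m], prefs_b[m]), m) for m in movies]
    return [m for _, m in sorted(scored, reverse=True)[:k]]
```

A horror film one partner rates 0.9 and the other 0.1 scores far below a drama both rate 0.7, which is exactly the "neither partner gets stuck" behavior described above.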
Implemented multi-layer content filtering with age verification guardrails, violence/profanity detection, and fraud detection. Added OAuth 2.0 with Google, CSRF protection, rate limiting with Redis-backed storage, and role-based access control. Containerized the full stack with Docker Compose, Nginx reverse proxy, Prometheus/Grafana monitoring, and automated backup systems.
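The Redis-backed rate limiting can be sketched as a fixed-window counter, mirroring the classic Redis `INCR key` + `EXPIRE key window` pattern. A plain dict stands in for Redis below; class name, limits, and window are illustrative.

```python
import time

class FixedWindowLimiter:
    """Fixed-window rate limiter mirroring the Redis pattern
    INCR key; EXPIRE key window (dict stands in for Redis)."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.store = {}  # client_id -> (window_start, count)

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        start, count = self.store.get(client_id, (now, 0))
        if now - start >= self.window:   # window expired: reset
            start, count = now, 0
        if count >= self.limit:
            return False                 # over quota: reject (HTTP 429)
        self.store[client_id] = (start, count + 1)
        return True
```

With real Redis the counter lives server-side, so the limit holds across all Flask workers rather than per-process.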
Key patterns and architectural decisions from the codebase.
class AIProviderChain:
    """Intelligent multi-provider LLM routing
    with automatic fallback and cost optimization."""

    PROVIDER_ORDER = [
        ("ollama", 0.00),    # Free, local
        ("claude", 0.015),   # Best quality
        ("openai", 0.02),    # Wide capability
        ("gemini", 0.01),    # Cost-effective
        ("template", 0.00),  # Never fails
    ]

    async def get_recommendation(self, query, ctx):
        for provider, cost in self.PROVIDER_ORDER:
            try:
                chain = self._build_chain(provider)
                params = self._dynamic_params(query, ctx)
                response = await chain.ainvoke({
                    "query": query,
                    "temperature": params.temp,
                    "context": ctx.history[-5:],
                })
                self.metrics.record(provider, cost)
                return response
            except ProviderError:
                logger.warning(f"{provider} failed, trying next")
                continue
The AI provider chain uses a priority-ordered fallback system. Each request attempts the cheapest provider first (local Ollama), cascading through cloud providers only when needed.
class SafetyPipeline:
    """Multi-layer content safety with
    age-appropriate filtering and guardrails."""

    def validate_response(self, response, user):
        checks = [
            self._check_age_appropriate(response, user),
            self._check_violence_level(response),
            self._check_profanity(response),
            self._check_domain_safety(response),
            self._check_content_filter(response),
        ]
        violations = [c for c in checks if not c.passed]
        if violations:
            logger.warning(
                "Safety violations",
                count=len(violations),
                types=[v.type for v in violations],
            )
            return self._safe_fallback(response, violations)
        return SafetyResult(
            passed=True,
            response=response,
            confidence=self._confidence_score(checks),
        )
Every AI response passes through 5 independent safety checks before reaching the user. Any violation triggers a safe fallback response.