# Privacy Policy Analyzer

A self-hosted web application that analyzes privacy policies using AI and provides easy-to-understand A-E grades.

## Features

- **AI-Powered Analysis**: Uses Ollama (local LLM) with OpenAI fallback to analyze privacy policies
- **Background Processing**: Analysis jobs run asynchronously to prevent timeouts
- **A-E Grading System**: Clear letter grades based on privacy practices
- **Service Management**: Add, edit, and manage services through the admin panel
- **Search**: Full-text search powered by Meilisearch
- **Caching**: Redis caching for fast page loads
- **SEO Optimized**: Sitemap.xml, robots.txt, and Schema.org structured data
- **Accessibility**: WCAG 2.1 AA compliant
- **Security**: OWASP-compliant security headers and best practices

## Tech Stack

- **Runtime**: Bun (JavaScript)
- **Database**: PostgreSQL 15
- **Cache**: Redis 7
- **Search**: Meilisearch 1.6
- **AI**: Ollama (gpt-oss:latest) with OpenAI fallback
- **Templating**: EJS
- **Containerization**: Docker + Docker Compose

## Quick Start

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd privacy-policy-analyzer
   ```

2. **Set up environment variables**:
   ```bash
   cp .env.example .env
   # Edit .env with your settings
   ```

3. **Start all services**:
   ```bash
   docker-compose up -d
   ```

4. **Run database migrations**:
   ```bash
   docker-compose exec app bun run migrate
   ```

5. **Access the application**:
   - Public site: http://localhost:3000
   - Admin panel: http://localhost:3000/admin/login
   - Default credentials: admin / secure_password_here

## Configuration

### Environment Variables

```bash
# Database
DATABASE_URL=postgresql://postgres:changeme@postgres:5432/privacy_analyzer

# Redis
REDIS_URL=redis://redis:6379

# Meilisearch
MEILISEARCH_URL=http://meilisearch:7700
MEILISEARCH_API_KEY=your_secure_master_key

# AI Provider (Ollama - default, no API costs)
USE_OLLAMA=true
OLLAMA_URL=http://ollama:11434
OLLAMA_MODEL=gpt-oss:latest

# AI Provider (OpenAI - optional fallback)
OPENAI_API_KEY=sk-your-openai-api-key
OPENAI_MODEL=gpt-4o

# Admin Credentials
ADMIN_USERNAME=admin
ADMIN_PASSWORD=secure_password_here
SESSION_SECRET=your_random_session_secret

# Base URL for sitemap
BASE_URL=https://yourdomain.com
```

## Usage

### Adding a Service

1. Log in to the admin panel
2. Click "Add New Service"
3. Enter service details:
   - **Name**: Service name (e.g., "Facebook")
   - **Service URL**: Main website URL
   - **Privacy Policy URL**: Direct link to privacy policy
   - **Logo URL**: (Optional) Service logo
4. Click "Add Service"
5. Click "Analyze" to queue analysis

### Viewing Analysis Results

- **Public site**: Browse all analyzed services with grades
- **Service detail**: Click any service for full analysis
- **Filter by grade**: Use grade filters on homepage
- **Search**: Use search bar to find services

### Admin Features

- **Dashboard**: Overview of all services and statistics
- **Background Analysis**: Analysis runs asynchronously
- **Queue Status**: Real-time view of analysis queue
- **Edit/Delete**: Manage existing services

## API Endpoints

### Public Endpoints

```
GET /                             # Homepage with service listing
GET /service/:id                  # Service detail page
GET /search?q=query               # Search services
GET /sitemap.xml                  # XML sitemap
GET /robots.txt                   # Robots.txt
GET /api/health                   # Health check
GET /api/analysis/status/:jobId   # Check analysis job status
```

### Admin Endpoints (Requires authentication)

```
GET/POST /admin/login                  # Login
GET      /admin/logout                 # Logout
GET      /admin/dashboard              # Admin dashboard
GET/POST /admin/services/new           # Add service
GET/POST /admin/services/:id           # Edit service
POST     /admin/services/:id/delete    # Delete service
POST     /admin/services/:id/analyze   # Queue analysis
GET      /api/analysis/queue           # Queue status
```

## Background Analysis

The system uses a background worker for privacy policy analysis:

1. **Queue Job**: When you click "Analyze", the job is added to the Redis queue
2. **Process**: Worker picks up the job and fetches the policy
3. **Analyze**: AI analyzes the policy (Ollama or OpenAI)
4. **Store**: Results are saved to the database
5. **Notify**: Dashboard auto-refreshes with results

### Analysis Timing

- Local Ollama: 2-5 minutes per policy
- OpenAI API: 10-30 seconds per policy

## Caching

Redis caching improves performance:

- **Homepage**: 1 hour cache
- **Service detail**: 2 hour cache
- **Search results**: 5 minute cache
- **API responses**: 1 minute cache

Cache is automatically invalidated when services are created, updated, or deleted.
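The cache durations and invalidation rule above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the key names (`page:home`, `page:service:<id>`, `search:<query>`), the `MemoryCache` class, and `invalidateService` are all hypothetical, and an in-memory `Map` stands in for Redis so the example runs on its own.

```javascript
// TTLs in seconds, taken from the cache durations listed above.
const CACHE_TTL_SECONDS = {
  homepage: 60 * 60,          // 1 hour
  serviceDetail: 2 * 60 * 60, // 2 hours
  search: 5 * 60,             // 5 minutes
  api: 60,                    // 1 minute
};

// Hypothetical key layout; the project's real keys may differ.
const cacheKeys = {
  homepage: () => "page:home",
  serviceDetail: (id) => `page:service:${id}`,
  search: (query) => `search:${query.trim().toLowerCase()}`,
};

// Minimal in-memory stand-in for Redis (assumption, for illustration only).
class MemoryCache {
  constructor(now = () => Date.now()) {
    this.now = now; // injectable clock, in milliseconds
    this.entries = new Map();
  }

  set(key, value, ttlSeconds) {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop lazily on read
      return null;
    }
    return entry.value;
  }

  delete(key) {
    this.entries.delete(key);
  }

  deleteByPrefix(prefix) {
    // Iterate over a copy of the keys so deletion is safe.
    for (const key of [...this.entries.keys()]) {
      if (key.startsWith(prefix)) this.entries.delete(key);
    }
  }
}

// Drop every cached view that could show stale data for a changed service:
// the homepage listing, that service's detail page, and all search results.
function invalidateService(cache, serviceId) {
  cache.delete(cacheKeys.homepage());
  cache.delete(cacheKeys.serviceDetail(serviceId));
  cache.deleteByPrefix("search:");
}
```

With a real Redis instance, the prefix deletion would need a different approach (for example iterating with `SCAN`), since Redis has no built-in delete-by-prefix command.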
## Security

### Implemented Security Features

- Security headers (CSP, HSTS, X-Frame-Options, etc.)
- Session-based authentication with Redis storage
- CSRF protection
- Rate limiting
- Input validation and sanitization
- SQL injection prevention (parameterized queries)
- XSS prevention (EJS auto-escaping)
- Non-root Docker containers

### Security Headers

```
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: default-src 'self'
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin
```

## Performance

### Optimizations

- HTML-to-text conversion reduces input size by 75% before AI analysis
- Smart content truncation (keeps important sections)
- Redis page caching
- Database indexes on frequently queried columns
- Connection pooling (PostgreSQL, Redis)

### Target Metrics

- First Contentful Paint: < 1.0s
- Largest Contentful Paint: < 2.5s
- Time to Interactive: < 3.8s

## Deployment

### Production Deployment

1. **Set production environment**:
   ```bash
   NODE_ENV=production
   BASE_URL=https://yourdomain.com
   ```

2. **Update admin credentials**:
   ```bash
   ADMIN_USERNAME=your_username
   ADMIN_PASSWORD=strong_password_hash
   SESSION_SECRET=random_32_char_string
   ```

3. **Enable HTTPS** (use a reverse proxy like Nginx):
   ```nginx
   # Example Nginx config
   server {
       listen 443 ssl;
       server_name yourdomain.com;

       location / {
           proxy_pass http://localhost:3000;
           proxy_set_header Host $host;
           proxy_set_header X-Real-IP $remote_addr;
       }
   }
   ```

4. **Backup strategy**:
   ```bash
   # Backup database (-T disables the pseudo-TTY so the redirected dump stays clean)
   docker-compose exec -T postgres pg_dump -U postgres privacy_analyzer > backup.sql

   # Backup Redis
   docker-compose exec redis redis-cli SAVE
   ```

## Troubleshooting

### Common Issues

**1. Analysis times out**

- Solution: Analysis runs in the background; check the dashboard for status
- Ollama may take 2-5 minutes for the first analysis

**2. 429 Rate Limit Error**

- You're using OpenAI without sufficient quota
- Solution: Switch to Ollama (default) or add billing to your OpenAI account

**3. Service won't start**

```bash
# Check logs
docker-compose logs app

# Verify environment
docker-compose config

# Restart all services
docker-compose restart
```

**4. Database connection fails**

```bash
# Check PostgreSQL status
docker-compose ps postgres

# Run migrations
docker-compose exec app bun run migrate
```

### Logs

```bash
# App logs
docker-compose logs -f app

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f ollama
```

## Development

### Project Structure

```
privacy-policy-analyzer/
├── docker-compose.yml      # Service orchestration
├── Dockerfile              # Bun app container
├── .env                    # Environment variables
├── src/
│   ├── app.js              # Main application
│   ├── config/             # Database, Redis, etc.
│   ├── models/             # Data models
│   ├── services/           # Business logic
│   ├── middleware/         # Auth, security, cache
│   ├── views/              # EJS templates
│   └── scripts/            # Utility scripts
├── migrations/             # SQL migrations
└── public/                 # Static assets
```

### Useful Commands

```bash
# Start services
docker-compose up -d

# View logs
docker-compose logs -f app

# Run migrations
docker-compose exec app bun run migrate

# Database shell
docker-compose exec postgres psql -U postgres -d privacy_analyzer

# Redis shell
docker-compose exec redis redis-cli

# Test AI integration
docker-compose exec app bun run src/scripts/test-ollama.js
```

## License

MIT License - Private pet project

## Contributing

This is a private project. No external contributions expected.

## Support

For issues or questions, check the logs and ensure all services are healthy:

```bash
docker-compose ps
docker-compose logs
```
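As a complement to the commands above, the `GET /api/health` endpoint listed under Public Endpoints can be polled from a script, for example while waiting for the stack to come up. The sketch below is an assumption-laden illustration: `waitForHealthy` and its options are hypothetical names, and because the response body of `/api/health` is not documented here, only the HTTP status is checked. It relies on the global `fetch` available in Bun.

```javascript
// Polls <baseUrl>/api/health until it answers with a 2xx status
// or the attempts run out. `fetchImpl` is injectable for testing.
async function waitForHealthy(
  baseUrl,
  { attempts = 5, delayMs = 1000, fetchImpl = fetch } = {}
) {
  const url = new URL("/api/health", baseUrl).toString();

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const response = await fetchImpl(url);
      if (response.ok) return { healthy: true, attempt };
    } catch {
      // Connection errors (e.g. the app is still starting) count as "not ready yet".
    }
    // Wait before the next try, but not after the final one.
    if (attempt < attempts) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }

  return { healthy: false, attempt: attempts };
}
```

A call such as `await waitForHealthy("http://localhost:3000")` would then resolve to an object whose `healthy` field indicates whether the app responded in time.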