# Privacy Policy Analyzer

A self-hosted web application that analyzes privacy policies using AI and provides easy-to-understand A-E grades.

## Features

- **AI-Powered Analysis**: Uses Ollama (local LLM) with OpenAI fallback to analyze privacy policies
- **Background Processing**: Analysis jobs run asynchronously to prevent timeouts
- **A-E Grading System**: Clear letter grades based on privacy practices
- **Service Management**: Add, edit, and manage services through admin panel
- **Search**: Full-text search powered by Meilisearch
- **Caching**: Redis caching for fast page loads
- **SEO Optimized**: Sitemap.xml, robots.txt, and Schema.org structured data
- **Accessibility**: WCAG 2.1 AA compliant
- **Security**: OWASP-compliant security headers and best practices

## Tech Stack

- **Runtime**: Bun (JavaScript)
- **Database**: PostgreSQL 15
- **Cache**: Redis 7
- **Search**: Meilisearch 1.6
- **AI**: Ollama (gpt-oss:latest) with OpenAI fallback
- **Templating**: EJS
- **Containerization**: Docker + Docker Compose

## Quick Start

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd privacy-policy-analyzer
   ```

2. **Set up environment variables**:
   ```bash
   cp .env.example .env
   # Edit .env with your settings
   ```

3. **Start all services**:
   ```bash
   docker-compose up -d
   ```

4. **Run database migrations**:
   ```bash
   docker-compose exec app bun run migrate
   ```

5. **Access the application**:
   - Public site: http://localhost:3000
   - Admin panel: http://localhost:3000/admin/login
   - Default credentials: `admin` / `secure_password_here` (set by `ADMIN_USERNAME` and `ADMIN_PASSWORD` in `.env`)

## Configuration

### Environment Variables

```bash
# Database
DATABASE_URL=postgresql://postgres:changeme@postgres:5432/privacy_analyzer

# Redis
REDIS_URL=redis://redis:6379

# Meilisearch
MEILISEARCH_URL=http://meilisearch:7700
MEILISEARCH_API_KEY=your_secure_master_key

# AI Provider (Ollama - default, no API costs)
USE_OLLAMA=true
OLLAMA_URL=http://ollama:11434
OLLAMA_MODEL=gpt-oss:latest

# AI Provider (OpenAI - optional fallback)
OPENAI_API_KEY=sk-your-openai-api-key
OPENAI_MODEL=gpt-4o

# Admin Credentials
ADMIN_USERNAME=admin
ADMIN_PASSWORD=secure_password_here
SESSION_SECRET=your_random_session_secret

# Base URL for sitemap
BASE_URL=https://yourdomain.com
```

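Checking these variables once at startup can catch a misconfigured `.env` before the first request. The sketch below is one way to do that; `loadConfig`, the list of required names, and the returned object shape are illustrative assumptions, not the project's actual code.

```javascript
// Hypothetical startup check for the variables documented above.
// Which variables count as "required" is an assumption of this sketch.
const REQUIRED = ["DATABASE_URL", "REDIS_URL", "MEILISEARCH_URL", "SESSION_SECRET"];

function loadConfig(env) {
  // Collect every missing name so the error reports them all at once.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // At least one AI provider must be usable.
  const useOllama = env.USE_OLLAMA === "true";
  if (!useOllama && !env.OPENAI_API_KEY) {
    throw new Error("Set USE_OLLAMA=true or provide OPENAI_API_KEY");
  }

  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    meilisearchUrl: env.MEILISEARCH_URL,
    ai: useOllama
      ? { provider: "ollama", url: env.OLLAMA_URL, model: env.OLLAMA_MODEL }
      : { provider: "openai", model: env.OPENAI_MODEL },
    hasOpenAiFallback: Boolean(env.OPENAI_API_KEY),
  };
}
```

In the app this would be called as `loadConfig(process.env)`, failing fast with a message that names every missing variable.
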
## Usage

### Adding a Service

1. Log in to the admin panel
2. Click "Add New Service"
3. Enter the service details:
   - **Name**: Service name (e.g., "Facebook")
   - **Service URL**: Main website URL
   - **Privacy Policy URL**: Direct link to the privacy policy
   - **Logo URL**: (Optional) Service logo
4. Click "Add Service"
5. Click "Analyze" to queue the analysis

### Viewing Analysis Results

- **Public site**: Browse all analyzed services with grades
- **Service detail**: Click any service for the full analysis
- **Filter by grade**: Use the grade filters on the homepage
- **Search**: Use the search bar to find services

### Admin Features

- **Dashboard**: Overview of all services and statistics
- **Background Analysis**: Analysis runs asynchronously
- **Queue Status**: Real-time view of the analysis queue
- **Edit/Delete**: Manage existing services

## API Endpoints

### Public Endpoints

```
GET  /                             # Homepage with service listing
GET  /service/:id                  # Service detail page
GET  /search?q=query               # Search services
GET  /sitemap.xml                  # XML sitemap
GET  /robots.txt                   # Robots.txt
GET  /api/health                   # Health check
GET  /api/analysis/status/:jobId   # Check analysis job status
```

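Because analysis runs in the background, a client typically polls `GET /api/analysis/status/:jobId` until the job finishes. The following is a minimal sketch of such a loop. The response format is not documented here, so the `{ status: "completed" | "failed" | ... }` JSON shape, the function name `waitForAnalysis`, and the default interval are all assumptions.

```javascript
// Polls the job status endpoint until the job reaches a terminal state.
// ASSUMPTION: the endpoint returns JSON with a `status` field whose terminal
// values are "completed" and "failed"; adjust to the real response shape.
async function waitForAnalysis(baseUrl, jobId, options = {}) {
  const { fetchImpl = fetch, intervalMs = 5000, maxAttempts = 120 } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const url = `${baseUrl}/api/analysis/status/${encodeURIComponent(jobId)}`;
    const res = await fetchImpl(url);
    if (!res.ok) {
      throw new Error(`Status check failed with HTTP ${res.status}`);
    }

    const job = await res.json();
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }

    // Not finished yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }

  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

With the defaults this checks every 5 seconds for up to 10 minutes, for example `await waitForAnalysis("http://localhost:3000", jobId)`. The `fetchImpl` option exists only so the loop can be exercised without a running server.
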
### Admin Endpoints (Requires authentication)

```
GET/POST  /admin/login                  # Login
GET       /admin/logout                 # Logout
GET       /admin/dashboard              # Admin dashboard
GET/POST  /admin/services/new           # Add service
GET/POST  /admin/services/:id           # Edit service
POST      /admin/services/:id/delete    # Delete service
POST      /admin/services/:id/analyze   # Queue analysis
GET       /api/analysis/queue           # Queue status
```

## Background Analysis

The system uses a background worker for privacy policy analysis:

1. **Queue Job**: When you click "Analyze", the job is added to a Redis queue
2. **Process**: The worker picks up the job and fetches the policy
3. **Analyze**: The AI analyzes the policy (Ollama or OpenAI)
4. **Store**: Results are saved to the database
5. **Notify**: The dashboard auto-refreshes with the results

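The queue, process, analyze, and store steps can be sketched as a single worker pass. This is an illustration only: an in-memory array stands in for the Redis queue, and `createQueue`, `processNext`, and the job fields (`id`, `serviceId`, `policyUrl`) are hypothetical names, not the project's real worker.

```javascript
// In-memory stand-in for the Redis queue so the sketch runs without a server.
function createQueue() {
  const jobs = [];
  return {
    enqueue(job) {
      jobs.push(job);
    },
    dequeue() {
      return jobs.shift(); // undefined when the queue is empty
    },
  };
}

// One worker pass: take a job, fetch the policy, analyze it, store the result.
// `fetchPolicy`, `analyze`, and `store` are injected so each step can be swapped.
async function processNext(queue, { fetchPolicy, analyze, store }) {
  const job = queue.dequeue();
  if (!job) return null; // nothing to do

  try {
    const policyText = await fetchPolicy(job.policyUrl);
    const result = await analyze(policyText);
    await store(job.serviceId, result);
    return { jobId: job.id, status: "completed" };
  } catch (err) {
    // A failed step marks the job as failed instead of crashing the worker.
    return { jobId: job.id, status: "failed", error: err.message };
  }
}
```

A real worker would call `processNext` in a loop and record the returned status so the dashboard and the job status endpoint can report it.
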
### Analysis Timing

- Local Ollama: 2-5 minutes per policy
- OpenAI API: 10-30 seconds per policy

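These timings suggest giving each provider its own timeout and moving to the next provider when one fails. The sketch below shows that pattern under stated assumptions: the timeout budgets are illustrative values with headroom over the ranges above, and `analyzeWithFallback` and the `providers` array shape are hypothetical.

```javascript
// Rejects if `promise` has not settled within `ms` milliseconds.
// Note: the underlying work is not cancelled, only abandoned.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// ASSUMED budgets: 10 minutes for local Ollama, 2 minutes for OpenAI.
const TIMEOUTS_MS = { ollama: 10 * 60 * 1000, openai: 2 * 60 * 1000 };

// Tries each provider in order and returns the first successful result.
async function analyzeWithFallback(policyText, providers) {
  const errors = [];
  for (const { name, analyze } of providers) {
    try {
      const budget = TIMEOUTS_MS[name] ?? 60 * 1000;
      const result = await withTimeout(analyze(policyText), budget, name);
      return { provider: name, result };
    } catch (err) {
      errors.push(`${name}: ${err.message}`);
    }
  }
  throw new Error(`All providers failed (${errors.join("; ")})`);
}
```

Passing `[{ name: "ollama", analyze: ... }, { name: "openai", analyze: ... }]` matches the Ollama-first, OpenAI-fallback order described in this README.
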
## Caching

Redis caching improves performance:

- **Homepage**: 1 hour cache
- **Service detail**: 2 hour cache
- **Search results**: 5 minute cache
- **API responses**: 1 minute cache

Cache is automatically invalidated when services are created, updated, or deleted.

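The TTLs and the invalidation rule above can be illustrated with a small cache wrapper. This sketch uses an in-memory `Map` in place of Redis so it runs standalone; the key names (`page:home`, `page:service:<id>`, `search:`) and the helper names are assumptions made for the example.

```javascript
// TTLs in seconds, taken from the list above.
const CACHE_TTL_SECONDS = { homepage: 3600, serviceDetail: 7200, search: 300, api: 60 };

// In-memory stand-in for Redis. `now` is injectable to make expiry testable.
function createCache(now = () => Date.now()) {
  const entries = new Map();
  return {
    set(key, value, ttlSeconds) {
      entries.set(key, { value, expiresAt: now() + ttlSeconds * 1000 });
    },
    get(key) {
      const entry = entries.get(key);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        entries.delete(key); // expired: drop it and report a miss
        return undefined;
      }
      return entry.value;
    },
    delete(key) {
      entries.delete(key);
    },
    deletePrefix(prefix) {
      for (const key of [...entries.keys()]) {
        if (key.startsWith(prefix)) entries.delete(key);
      }
    },
  };
}

// Called after a service is created, updated, or deleted.
function invalidateServiceCaches(cache, serviceId) {
  cache.delete("page:home");
  cache.delete(`page:service:${serviceId}`);
  cache.deletePrefix("search:"); // any search result may include the service
}
```

For example, the homepage would be stored with `cache.set("page:home", html, CACHE_TTL_SECONDS.homepage)` and dropped by `invalidateServiceCaches` on the next admin change.
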
## Security

### Implemented Security Features

- Security headers (CSP, HSTS, X-Frame-Options, etc.)
- Session-based authentication with Redis storage
- CSRF protection
- Rate limiting
- Input validation and sanitization
- SQL injection prevention (parameterized queries)
- XSS prevention (EJS auto-escaping)
- Non-root Docker containers

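This README does not say how rate limiting is implemented. As an illustration of the general idea, here is a fixed-window counter, one common approach. Everything in it (the function name, the `limit` and `windowMs` options, the in-memory `Map` instead of Redis) is an assumption rather than the project's actual limiter.

```javascript
// Fixed-window rate limiter: at most `limit` requests per `windowMs` per key.
// `now` is injectable so the window logic can be tested without waiting.
function createRateLimiter({ limit, windowMs, now = () => Date.now() }) {
  const windows = new Map(); // key -> { start, count }

  return function isAllowed(key) {
    const t = now();
    const current = windows.get(key);

    // No window yet, or the previous one has ended: start a fresh window.
    if (!current || t - current.start >= windowMs) {
      windows.set(key, { start: t, count: 1 });
      return true;
    }

    current.count += 1;
    return current.count <= limit;
  };
}
```

A login route could keep one limiter keyed by client IP, for example `createRateLimiter({ limit: 5, windowMs: 60000 })`, and answer with HTTP 429 whenever `isAllowed(ip)` returns `false`.
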
### Security Headers

```
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: default-src 'self'
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin
```

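One way to attach these headers is a small wrapper around each outgoing response. The sketch below uses the standard `Response` and `Headers` classes available in Bun; the function name `withSecurityHeaders` is hypothetical and the real middleware may be structured differently.

```javascript
// Header values copied from the list above.
const SECURITY_HEADERS = {
  "Strict-Transport-Security": "max-age=31536000",
  "Content-Security-Policy": "default-src 'self'",
  "X-Frame-Options": "DENY",
  "X-Content-Type-Options": "nosniff",
  "X-XSS-Protection": "1; mode=block",
  "Referrer-Policy": "strict-origin-when-cross-origin",
};

// Returns a new Response with the same body and status plus the security headers.
function withSecurityHeaders(response) {
  const headers = new Headers(response.headers); // keep existing headers
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    headers.set(name, value);
  }
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

Wrapping the handler's result, as in `return withSecurityHeaders(await handle(request))`, applies the headers to every route in one place.
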
## Performance

### Optimizations

- HTML-to-text conversion (75% size reduction) before AI analysis
- Smart content truncation (keeps important sections)
- Redis page caching
- Database indexes on frequently queried columns
- Connection pooling (PostgreSQL, Redis)

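The first two optimizations can be sketched as follows. This is a simplified illustration, not the project's implementation: the regex-based tag stripping is naive compared with a real HTML parser, and the keyword list used to decide which paragraphs count as "important" is a hypothetical choice.

```javascript
// Naive HTML-to-text conversion: drops script/style blocks and tags,
// decodes two common entities, and collapses whitespace.
function htmlToText(html) {
  return html
    .replace(/<(script|style|noscript)\b[^>]*>[\s\S]*?<\/\1>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/&nbsp;/g, " ")
    .replace(/&amp;/g, "&")
    .replace(/\s+/g, " ")
    .trim();
}

// ASSUMED keywords that mark a paragraph as privacy-relevant.
const KEYWORDS = ["collect", "share", "third part", "retain", "delete", "cookie", "sell"];

// Keeps keyword-bearing paragraphs first, then fills the remaining character
// budget with the others. The result preserves the original paragraph order.
function truncateSmart(paragraphs, maxChars) {
  const scored = paragraphs.map((text, index) => ({
    text,
    index,
    important: KEYWORDS.some((word) => text.toLowerCase().includes(word)),
  }));
  const byPriority = [
    ...scored.filter((p) => p.important),
    ...scored.filter((p) => !p.important),
  ];

  const kept = [];
  let used = 0;
  for (const p of byPriority) {
    if (used + p.text.length > maxChars) continue; // does not fit, try the next one
    kept.push(p);
    used += p.text.length;
  }
  return kept.sort((a, b) => a.index - b.index).map((p) => p.text);
}
```

Stripping markup first keeps the character budget for policy text, and prioritizing keyword paragraphs means a tight budget drops navigation and greeting text before clauses about data collection or sharing.
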
### Target Metrics

- First Contentful Paint: < 1.0s
- Largest Contentful Paint: < 2.5s
- Time to Interactive: < 3.8s

## Deployment

### Production Deployment

1. **Set production environment**:
   ```bash
   NODE_ENV=production
   BASE_URL=https://yourdomain.com
   ```

2. **Update admin credentials**:
   ```bash
   ADMIN_USERNAME=your_username
   ADMIN_PASSWORD=strong_password_hash
   SESSION_SECRET=random_32_char_string
   ```

3. **Enable HTTPS** (use a reverse proxy such as Nginx):
   ```nginx
   # Example Nginx config
   server {
       listen 443 ssl;
       server_name yourdomain.com;

       # Placeholder certificate paths; point these at your own files
       ssl_certificate     /path/to/fullchain.pem;
       ssl_certificate_key /path/to/privkey.pem;

       location / {
           proxy_pass http://localhost:3000;
           proxy_set_header Host $host;
           proxy_set_header X-Real-IP $remote_addr;
           proxy_set_header X-Forwarded-Proto $scheme;
       }
   }
   ```

4. **Backup strategy**:
   ```bash
   # Backup database (-T disables TTY allocation so the redirected dump stays clean)
   docker-compose exec -T postgres pg_dump -U postgres privacy_analyzer > backup.sql

   # Backup Redis (writes an RDB snapshot inside the container)
   docker-compose exec redis redis-cli SAVE
   ```

## Troubleshooting

### Common Issues

**1. Analysis times out**
- Solution: Analysis runs in the background; check the dashboard for status
- Ollama may take 2-5 minutes for the first analysis

**2. 429 Rate Limit Error**
- You're using OpenAI without sufficient quota
- Solution: Switch to Ollama (default) or add billing to your OpenAI account

**3. Service won't start**
```bash
# Check logs
docker-compose logs app

# Verify environment
docker-compose config

# Restart all services
docker-compose restart
```

**4. Database connection fails**
```bash
# Check PostgreSQL status
docker-compose ps postgres

# Run migrations
docker-compose exec app bun run migrate
```

### Logs

```bash
# App logs
docker-compose logs -f app

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f ollama
```

## Development

### Project Structure

```
privacy-policy-analyzer/
├── docker-compose.yml      # Service orchestration
├── Dockerfile              # Bun app container
├── .env                    # Environment variables
├── src/
│   ├── app.js              # Main application
│   ├── config/             # Database, Redis, etc.
│   ├── models/             # Data models
│   ├── services/           # Business logic
│   ├── middleware/         # Auth, security, cache
│   ├── views/              # EJS templates
│   └── scripts/            # Utility scripts
├── migrations/             # SQL migrations
└── public/                 # Static assets
```

### Useful Commands

```bash
# Start services
docker-compose up -d

# View logs
docker-compose logs -f app

# Run migrations
docker-compose exec app bun run migrate

# Database shell
docker-compose exec postgres psql -U postgres -d privacy_analyzer

# Redis shell
docker-compose exec redis redis-cli

# Test AI integration
docker-compose exec app bun run src/scripts/test-ollama.js
```

## License

MIT License - Private pet project

## Contributing

This is a private project. No external contributions expected.

## Support

For issues or questions, check the logs and ensure all services are healthy:

```bash
docker-compose ps
docker-compose logs
```