phase 5
355
README.md
Normal file
@@ -0,0 +1,355 @@
# Privacy Policy Analyzer

A self-hosted web application that analyzes privacy policies using AI and provides easy-to-understand A-E grades.

## Features

- **AI-Powered Analysis**: Uses Ollama (local LLM) with OpenAI fallback to analyze privacy policies
- **Background Processing**: Analysis jobs run asynchronously to prevent timeouts
- **A-E Grading System**: Clear letter grades based on privacy practices
- **Service Management**: Add, edit, and manage services through admin panel
- **Search**: Full-text search powered by Meilisearch
- **Caching**: Redis caching for fast page loads
- **SEO Optimized**: Sitemap.xml, robots.txt, and Schema.org structured data
- **Accessibility**: WCAG 2.1 AA compliant
- **Security**: OWASP-compliant security headers and best practices

## Tech Stack

- **Runtime**: Bun (JavaScript)
- **Database**: PostgreSQL 15
- **Cache**: Redis 7
- **Search**: Meilisearch 1.6
- **AI**: Ollama (gpt-oss:latest) with OpenAI fallback
- **Templating**: EJS
- **Containerization**: Docker + Docker Compose

## Quick Start

1. **Clone the repository**:
   ```bash
   git clone <repository-url>
   cd privacy-policy-analyzer
   ```

2. **Set up environment variables**:
   ```bash
   cp .env.example .env
   # Edit .env with your settings
   ```

3. **Start all services**:
   ```bash
   docker-compose up -d
   ```

4. **Run database migrations**:
   ```bash
   docker-compose exec app bun run migrate
   ```

5. **Access the application**:
   - Public site: http://localhost:3000
   - Admin panel: http://localhost:3000/admin/login
   - Default credentials: admin / secure_password_here (set by `ADMIN_USERNAME` and `ADMIN_PASSWORD` in `.env`; change them before exposing the app)

## Configuration

### Environment Variables

```bash
# Database
DATABASE_URL=postgresql://postgres:changeme@postgres:5432/privacy_analyzer

# Redis
REDIS_URL=redis://redis:6379

# Meilisearch
MEILISEARCH_URL=http://meilisearch:7700
MEILISEARCH_API_KEY=your_secure_master_key

# AI Provider (Ollama - default, no API costs)
USE_OLLAMA=true
OLLAMA_URL=http://ollama:11434
OLLAMA_MODEL=gpt-oss:latest

# AI Provider (OpenAI - optional fallback)
OPENAI_API_KEY=sk-your-openai-api-key
OPENAI_MODEL=gpt-4o

# Admin Credentials
ADMIN_USERNAME=admin
ADMIN_PASSWORD=secure_password_here
SESSION_SECRET=your_random_session_secret

# Base URL for sitemap
BASE_URL=https://yourdomain.com
```

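As a rough illustration of how these variables could be read at startup, the sketch below validates the required ones and applies the defaults shown above. The `loadConfig` helper, the choice of which variables count as required, and the shape of the returned object are assumptions made for this example; the project's actual config code may differ.

```javascript
// Hypothetical helper: variable names come from the block above, but the
// validation rules and the returned object shape are illustrative only.
function loadConfig(env) {
  const required = ["DATABASE_URL", "REDIS_URL", "MEILISEARCH_URL", "SESSION_SECRET"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  return {
    databaseUrl: env.DATABASE_URL,
    redisUrl: env.REDIS_URL,
    meilisearch: {
      url: env.MEILISEARCH_URL,
      apiKey: env.MEILISEARCH_API_KEY ?? null,
    },
    ai: {
      // Anything other than the literal string "false" keeps Ollama enabled,
      // so the local model stays the default provider.
      useOllama: env.USE_OLLAMA !== "false",
      ollamaUrl: env.OLLAMA_URL ?? "http://ollama:11434",
      ollamaModel: env.OLLAMA_MODEL ?? "gpt-oss:latest",
      openaiApiKey: env.OPENAI_API_KEY ?? null,
      openaiModel: env.OPENAI_MODEL ?? "gpt-4o",
    },
    sessionSecret: env.SESSION_SECRET,
    baseUrl: env.BASE_URL ?? "http://localhost:3000",
  };
}
```

At startup this would be called once as `loadConfig(process.env)`, so a missing variable fails fast instead of surfacing later as a connection error.
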
## Usage

### Adding a Service

1. Log in to admin panel
2. Click "Add New Service"
3. Enter service details:
   - **Name**: Service name (e.g., "Facebook")
   - **Service URL**: Main website URL
   - **Privacy Policy URL**: Direct link to privacy policy
   - **Logo URL**: (Optional) Service logo
4. Click "Add Service"
5. Click "Analyze" to queue analysis

### Viewing Analysis Results

- **Public site**: Browse all analyzed services with grades
- **Service detail**: Click any service for full analysis
- **Filter by grade**: Use grade filters on homepage
- **Search**: Use search bar to find services

### Admin Features

- **Dashboard**: Overview of all services and statistics
- **Background Analysis**: Analysis runs asynchronously
- **Queue Status**: Real-time view of analysis queue
- **Edit/Delete**: Manage existing services

## API Endpoints

### Public Endpoints

```
GET /                            # Homepage with service listing
GET /service/:id                 # Service detail page
GET /search?q=query              # Search services
GET /sitemap.xml                 # XML sitemap
GET /robots.txt                  # Robots.txt
GET /api/health                  # Health check
GET /api/analysis/status/:jobId  # Check analysis job status
```

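A client could poll the job status endpoint until an analysis finishes, roughly as sketched below. The response format is not documented here, so the JSON body with a `status` field and the terminal values `"completed"` and `"failed"` are assumptions, as are the `waitForAnalysis` name and its options.

```javascript
// Assumption: GET /api/analysis/status/:jobId returns JSON with a `status`
// field that eventually becomes "completed" or "failed". Adjust to the
// real response shape before relying on this.
async function waitForAnalysis(jobId, options = {}) {
  const {
    baseUrl = "http://localhost:3000",
    intervalMs = 5000,
    maxAttempts = 120,
    fetchImpl = fetch, // injectable so the function can be tested offline
  } = options;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const url = `${baseUrl}/api/analysis/status/${encodeURIComponent(jobId)}`;
    const res = await fetchImpl(url);
    if (!res.ok) {
      throw new Error(`Status check failed with HTTP ${res.status}`);
    }
    const job = await res.json();
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }
    // Not finished yet: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

With the defaults, this checks every 5 seconds for up to 10 minutes, which comfortably covers the 2-5 minute Ollama analysis time.
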
### Admin Endpoints (Requires authentication)

```
GET/POST /admin/login                 # Login
GET      /admin/logout                # Logout
GET      /admin/dashboard             # Admin dashboard
GET/POST /admin/services/new          # Add service
GET/POST /admin/services/:id          # Edit service
POST     /admin/services/:id/delete   # Delete service
POST     /admin/services/:id/analyze  # Queue analysis
GET      /api/analysis/queue          # Queue status
```

## Background Analysis

The system uses a background worker for privacy policy analysis:

1. **Queue Job**: When you click "Analyze", a job is added to the Redis queue
2. **Process**: The worker picks up the job and fetches the policy
3. **Analyze**: The AI analyzes the policy (Ollama or OpenAI)
4. **Store**: Results are saved to the database
5. **Notify**: The dashboard auto-refreshes with the results

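The steps above can be sketched in a few lines. This is not the project's worker: a plain array stands in for the Redis queue, and `fetchPolicy`, `providers`, and `saveResult` are hypothetical injected functions standing in for the real HTTP fetch, the Ollama/OpenAI clients, and the database write.

```javascript
// Try each AI provider in order (e.g. Ollama first, then OpenAI) and return
// the first successful result. Providers are hypothetical async functions.
async function analyzeWithFallback(policyText, providers) {
  let lastError;
  for (const provider of providers) {
    try {
      return await provider(policyText);
    } catch (err) {
      lastError = err; // remember the failure and try the next provider
    }
  }
  throw lastError ?? new Error("No AI provider configured");
}

// Process one job from the queue. `queue` is an array standing in for Redis.
async function processNextJob(queue, deps) {
  const job = queue.shift(); // Process: take the oldest queued job
  if (!job) return null;     // nothing to do

  try {
    const policyText = await deps.fetchPolicy(job.policyUrl);
    const result = await analyzeWithFallback(policyText, deps.providers); // Analyze
    await deps.saveResult(job.serviceId, result);                         // Store
    return { jobId: job.id, status: "completed" };
  } catch (err) {
    // A failed job is reported rather than crashing the worker loop.
    return { jobId: job.id, status: "failed", error: err.message };
  }
}
```

Passing the providers as an ordered list keeps the fallback rule in one place: the worker does not need to know which backend produced the result.
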
### Analysis Timing

- Local Ollama: 2-5 minutes per policy
- OpenAI API: 10-30 seconds per policy

## Caching

Redis caching improves performance:

- **Homepage**: 1 hour cache
- **Service detail**: 2 hour cache
- **Search results**: 5 minute cache
- **API responses**: 1 minute cache

Cache is automatically invalidated when services are created, updated, or deleted.

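To make the TTL and invalidation rules concrete, here is a minimal in-memory sketch. A `Map` stands in for Redis, and the `PageCache` class, its method names, and the key format are assumptions for illustration; only the TTL values come from the list above.

```javascript
// TTLs from the list above, in seconds.
const CACHE_TTL_SECONDS = { homepage: 3600, service: 7200, search: 300, api: 60 };

// In-memory stand-in for the Redis cache. The clock is injectable so that
// expiry can be tested without waiting.
class PageCache {
  constructor(now = () => Date.now()) {
    this.now = now;
    this.entries = new Map();
  }

  set(kind, key, value) {
    const ttl = CACHE_TTL_SECONDS[kind];
    if (ttl === undefined) throw new Error(`Unknown cache kind: ${kind}`);
    this.entries.set(`${kind}:${key}`, { value, expiresAt: this.now() + ttl * 1000 });
  }

  get(kind, key) {
    const cacheKey = `${kind}:${key}`;
    const entry = this.entries.get(cacheKey);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(cacheKey); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  // Called after a service is created, updated, or deleted.
  invalidateAll() {
    this.entries.clear();
  }
}
```

Clearing everything on any service change is the simplest correct policy; a real deployment might invalidate only the affected keys.
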
## Security

### Implemented Security Features

- Security headers (CSP, HSTS, X-Frame-Options, etc.)
- Session-based authentication with Redis storage
- CSRF protection
- Rate limiting
- Input validation and sanitization
- SQL injection prevention (parameterized queries)
- XSS prevention (EJS auto-escaping)
- Non-root Docker containers

### Security Headers

```
Strict-Transport-Security: max-age=31536000
Content-Security-Policy: default-src 'self'
X-Frame-Options: DENY
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Referrer-Policy: strict-origin-when-cross-origin
```

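One way such headers might be attached is with a small wrapper around the web-standard `Response` object, which Bun supports. The `withSecurityHeaders` function is a hypothetical name for this sketch; the project's actual middleware may be structured differently.

```javascript
// Header values copied from the list above.
const SECURITY_HEADERS = {
  "Strict-Transport-Security": "max-age=31536000",
  "Content-Security-Policy": "default-src 'self'",
  "X-Frame-Options": "DENY",
  "X-Content-Type-Options": "nosniff",
  "X-XSS-Protection": "1; mode=block",
  "Referrer-Policy": "strict-origin-when-cross-origin",
};

// Hypothetical helper: sets every security header on an outgoing Response
// and returns the same object so it can be used inline in a handler.
function withSecurityHeaders(response) {
  for (const [name, value] of Object.entries(SECURITY_HEADERS)) {
    response.headers.set(name, value);
  }
  return response;
}
```

A route handler could then end with `return withSecurityHeaders(new Response(html))`, keeping the header list in a single place.
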
## Performance

### Optimizations

- HTML-to-text conversion reduces policy size by about 75% before AI analysis
- Smart content truncation (keeps important sections)
- Redis page caching
- Database indexes on frequently queried columns
- Connection pooling (PostgreSQL, Redis)

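The first two optimizations could look something like the sketch below. It is an illustration only: the regex-based tag stripping, the keyword list, and the `htmlToText` and `truncateKeepingImportant` names are assumptions, and the project's real extraction logic is not shown in this README.

```javascript
// Strip markup so the model receives plain text. Regex stripping is a rough
// approximation; a real HTML parser handles edge cases better.
function htmlToText(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, " ") // drop scripts and styles
    .replace(/<[^>]+>/g, " ")                                  // drop remaining tags
    .replace(/\s+/g, " ")                                      // collapse whitespace
    .trim();
}

// Hypothetical keyword list marking paragraphs worth keeping.
const PRIORITY_KEYWORDS = ["collect", "share", "third part", "retention", "delete", "sell"];

// Keep as many paragraphs as fit in maxChars, preferring ones that mention
// a priority keyword, and return them in their original order.
function truncateKeepingImportant(paragraphs, maxChars) {
  const scored = paragraphs.map((text, index) => ({
    text,
    index,
    important: PRIORITY_KEYWORDS.some((k) => text.toLowerCase().includes(k)),
  }));
  const byPriority = [...scored].sort(
    (a, b) => Number(b.important) - Number(a.important) || a.index - b.index
  );

  const kept = [];
  let used = 0;
  for (const paragraph of byPriority) {
    if (used + paragraph.text.length > maxChars) continue; // does not fit
    kept.push(paragraph);
    used += paragraph.text.length;
  }
  return kept.sort((a, b) => a.index - b.index).map((p) => p.text);
}
```

Restoring the original order after selection keeps the truncated policy readable for the model instead of presenting a shuffled set of clauses.
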
### Target Metrics

- First Contentful Paint: < 1.0s
- Largest Contentful Paint: < 2.5s
- Time to Interactive: < 3.8s

## Deployment

### Production Deployment

1. **Set production environment**:
   ```bash
   NODE_ENV=production
   BASE_URL=https://yourdomain.com
   ```

2. **Update admin credentials**:
   ```bash
   ADMIN_USERNAME=your_username
   ADMIN_PASSWORD=strong_password_hash
   SESSION_SECRET=random_32_char_string
   ```

3. **Enable HTTPS** (use a reverse proxy such as Nginx):
   ```nginx
   # Example Nginx config
   server {
       listen 443 ssl;
       server_name yourdomain.com;

       # Placeholder certificate paths (assumed); point these at your own files
       ssl_certificate     /etc/ssl/certs/yourdomain.com.pem;
       ssl_certificate_key /etc/ssl/private/yourdomain.com.key;

       location / {
           proxy_pass http://localhost:3000;
           proxy_set_header Host $host;
           proxy_set_header X-Real-IP $remote_addr;
           # Tell the app the original request used HTTPS
           proxy_set_header X-Forwarded-Proto $scheme;
       }
   }
   ```

4. **Backup strategy**:
   ```bash
   # Backup database
   docker-compose exec postgres pg_dump -U postgres privacy_analyzer > backup.sql

   # Backup Redis
   docker-compose exec redis redis-cli SAVE
   ```

## Troubleshooting

### Common Issues

**1. Analysis times out**
- Solution: Analysis runs in the background; check the dashboard for status
- Ollama may take 2-5 minutes for the first analysis

**2. 429 Rate Limit Error**
- You're using OpenAI without sufficient quota
- Solution: Switch to Ollama (default) or add billing to your OpenAI account

**3. Service won't start**
```bash
# Check logs
docker-compose logs app

# Verify environment
docker-compose config

# Restart all services
docker-compose restart
```

**4. Database connection fails**
```bash
# Check PostgreSQL status
docker-compose ps postgres

# Run migrations
docker-compose exec app bun run migrate
```

### Logs

```bash
# App logs
docker-compose logs -f app

# All services
docker-compose logs -f

# Specific service
docker-compose logs -f ollama
```

## Development

### Project Structure

```
privacy-policy-analyzer/
├── docker-compose.yml      # Service orchestration
├── Dockerfile              # Bun app container
├── .env                    # Environment variables
├── src/
│   ├── app.js              # Main application
│   ├── config/             # Database, Redis, etc.
│   ├── models/             # Data models
│   ├── services/           # Business logic
│   ├── middleware/         # Auth, security, cache
│   ├── views/              # EJS templates
│   └── scripts/            # Utility scripts
├── migrations/             # SQL migrations
└── public/                 # Static assets
```

### Useful Commands

```bash
# Start services
docker-compose up -d

# View logs
docker-compose logs -f app

# Run migrations
docker-compose exec app bun run migrate

# Database shell
docker-compose exec postgres psql -U postgres -d privacy_analyzer

# Redis shell
docker-compose exec redis redis-cli

# Test AI integration
docker-compose exec app bun run src/scripts/test-ollama.js
```

## License

MIT License - Private pet project

## Contributing

This is a private project. No external contributions expected.

## Support

For issues or questions, check the logs and ensure all services are healthy:

```bash
docker-compose ps
docker-compose logs
```