# 11. Troubleshooting Guide

**Document Type:** Operations Guide
**Last Updated:** 2026-04-04
**Maintainer:** DevOps Team

---

## Table of Contents

1. [Quick Reference](#1-quick-reference)
2. [Infrastructure & Network Issues](#2-infrastructure--network-issues)
3. [Application & Service Issues](#3-application--service-issues)
4. [Database Issues](#4-database-issues)
5. [API Integration Issues](#5-api-integration-issues)
6. [Authentication & Security Issues](#6-authentication--security-issues)
7. [Performance Issues](#7-performance-issues)
8. [Monitoring & Diagnostics](#8-monitoring--diagnostics)
9. [Emergency Procedures](#9-emergency-procedures)
10. [Diagnostic Commands Reference](#10-diagnostic-commands-reference)

---

## 1. Quick Reference

### 1.1 Emergency Contacts

| Role | Contact | Availability |
|------|---------|--------------|
| System Administrator | maciejpi@inpi.local | Business hours |
| Database Administrator | maciejpi@inpi.local | Business hours |
| On-Call Support | See CLAUDE.md | 24/7 |

### 1.2 Critical Services Status Check

```bash
# Quick health check - run this first!
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
# If failed, proceed to relevant section below
```

### 1.3 Issue Decision Tree

```mermaid
graph TD
    A[Issue Detected] --> B{Can access site?}
    B -->|No| C{From where?}
    B -->|Yes, but slow| D[Check Performance Issues]
    C -->|Nowhere| E[Section 2.1: ERR_TOO_MANY_REDIRECTS]
    C -->|Only internal| E
    C -->|500 Error| F[Section 3.1: Application Crash]
    B -->|Yes, specific feature broken| G{Which feature?}
    G -->|Login/Auth| H[Section 6: Authentication Issues]
    G -->|Search| I[Section 3.3: Search Issues]
    G -->|AI Chat| J[Section 5.2: Gemini API Issues]
    G -->|Database| K[Section 4: Database Issues]
```

### 1.4 Severity Levels

| Level | Description | Response Time | Example |
|-------|-------------|---------------|---------|
| **CRITICAL** | Complete service outage | Immediate | ERR_TOO_MANY_REDIRECTS |
| **HIGH** | Major feature broken | < 1 hour | Database connection lost |
| **MEDIUM** | Minor feature degraded | < 4 hours | Search slow |
| **LOW** | Cosmetic or minor bug | Next business day | UI glitch |

---

## 2. Infrastructure & Network Issues

### 2.1 Site Not Accessible / ERR_TOO_MANY_REDIRECTS

**Severity:** CRITICAL

> **Note:** The ERR_TOO_MANY_REDIRECTS issue from 2026-01-02 was specific to the old NPM+on-prem setup. With the current OVH VPS architecture, redirect loops are unlikely since nginx and Gunicorn run on the same host.

#### Symptoms

- Browser error or timeout accessing https://nordabiznes.pl
- Portal completely inaccessible

#### Diagnosis

```bash
# 1. Check nginx status on OVH VPS
ssh maciejpi@57.128.200.27 "sudo systemctl status nginx"

# 2. Check Gunicorn status
ssh maciejpi@57.128.200.27 "sudo systemctl status nordabiznes"

# 3. Test backend directly
ssh maciejpi@57.128.200.27 "curl -I http://localhost:5000/health"
# Should return 200 OK if Flask is running

# 4. Check DNS resolution
dig nordabiznes.pl +short
# Expected: 57.128.200.27

# 5. Test nginx config
ssh maciejpi@57.128.200.27 "sudo nginx -t"
```

#### Solution

```bash
# If nginx is down:
ssh maciejpi@57.128.200.27 "sudo systemctl restart nginx"

# If Gunicorn is down:
ssh maciejpi@57.128.200.27 "sudo systemctl restart nordabiznes"

# If nginx config is broken: fix it, then validate and reload
# (the reload only runs if the config test passes)
ssh maciejpi@57.128.200.27 "sudo nginx -t && sudo systemctl reload nginx"
```

#### Verification

```bash
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
```

#### Prevention

- Monitor /health endpoint for non-200 responses
- Test nginx config before reload: `sudo nginx -t`

---

### 2.2 502 Bad Gateway

**Severity:** HIGH

#### Symptoms

- Browser shows "502 Bad Gateway" error
- nginx error log shows "upstream connection failed"
- Site completely inaccessible

#### Root Causes

1. Flask/Gunicorn service stopped
2. Gunicorn not listening on 127.0.0.1:5000

#### Diagnosis

```bash
# 1. Check Flask service status
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes

# 2. Check if port 5000 is listening
sudo netstat -tlnp | grep :5000
# Expected: gunicorn process listening

# 3. Check Flask logs
sudo journalctl -u nordabiznes -n 50 --no-pager

# 4. Test backend directly
curl http://localhost:5000/health
# Should return JSON with status
```

#### Solution

**If service is stopped:**

```bash
# Restart Flask application
sudo systemctl restart nordabiznes

# Check status
sudo systemctl status nordabiznes

# Verify it's working
curl http://localhost:5000/health
```

**If service won't start:**

```bash
# Check for syntax errors
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/python3 -m py_compile app.py

# Check for missing dependencies
/var/www/nordabiznes/venv/bin/python3 -c "import flask; import sqlalchemy"

# Check environment variables
cat /var/www/nordabiznes/.env | grep -v "PASSWORD\|SECRET\|KEY"

# Try running manually (for debugging)
/var/www/nordabiznes/venv/bin/python3 app.py
```

**If Gunicorn not responding:**

```bash
# Check if Gunicorn is listening
ssh maciejpi@57.128.200.27 "ss -tlnp | grep 5000"

# Check nginx error log
ssh maciejpi@57.128.200.27 "sudo tail -20 /var/log/nginx/error.log"
```

#### Verification

```bash
curl -I https://nordabiznes.pl/health
# Expected: HTTP/2 200
```

---

### 2.3 504 Gateway Timeout

**Severity:** MEDIUM

#### Symptoms

- Browser shows "504 Gateway Timeout"
- Requests take >60 seconds
- Some requests succeed, others time out

#### Root Causes

1. Database query hanging
2. External API timeout (Gemini, PageSpeed, etc.)
3. Insufficient Gunicorn workers
4. Resource exhaustion (CPU, memory)

#### Diagnosis

```bash
# 1. Check Gunicorn worker status
ssh maciejpi@57.128.200.27
ps aux | grep gunicorn
# Look for zombie workers or high CPU usage

# 2. Check database connections
psql -h localhost -U nordabiz_app -d nordabiz -c \
  "SELECT count(*) FROM pg_stat_activity WHERE datname = 'nordabiz';"

# 3. Check for long-running queries
psql -h localhost -U nordabiz_app -d nordabiz -c \
  "SELECT pid, now() - query_start AS duration, query FROM pg_stat_activity WHERE state = 'active' AND now() - query_start > interval '5 seconds';"

# 4. Check system resources
top -n 1
free -h
df -h

# 5. Check Flask logs for slow requests
sudo journalctl -u nordabiznes -n 100 --no-pager | grep -E "slow|timeout|took"
```

#### Solution

**If database query hanging:**

```bash
# Identify and kill long-running query
psql -h localhost -U nordabiz_app -d nordabiz

# Find problematic query
SELECT pid, query FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '30 seconds';

# Kill it (replace PID)
SELECT pg_terminate_backend(12345);
```

**If resource exhaustion:**

```bash
# Restart Flask to clear memory
sudo systemctl restart nordabiznes

# Consider increasing Gunicorn workers (edit systemd service)
sudo nano /etc/systemd/system/nordabiznes.service
# Change: --workers=4 (adjust based on CPU cores)
sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```

**If external API timeout:**

```bash
# Check if Gemini API is responsive
curl -I https://generativelanguage.googleapis.com/v1beta/models

# Check PageSpeed API
curl -I https://www.googleapis.com/pagespeedonline/v5/runPagespeed

# Check Brave Search API
curl -I https://api.search.brave.com/res/v1/web/search
```

#### Verification

```bash
# Test response time
time curl -I https://nordabiznes.pl/health
# Should complete in < 2 seconds
```

---

### 2.4 SSL Certificate Issues

**Severity:** HIGH

#### Symptoms

- Browser shows "Your connection is not private"
- SSL certificate expired or invalid
- Mixed content warnings

#### Diagnosis

```bash
# 1. Check certificate expiry
echo | openssl s_client -servername nordabiznes.pl -connect nordabiznes.pl:443 2>/dev/null | \
  openssl x509 -noout -dates

# 2. Check certificate details
curl -vI https://nordabiznes.pl 2>&1 | grep -E "SSL|certificate"

# 3. Check certbot certificate status
ssh maciejpi@57.128.200.27 "sudo certbot certificates"
```

#### Solution

**If certificate expired:**

```bash
# certbot auto-renews Let's Encrypt certificates
# Force renewal:
ssh maciejpi@57.128.200.27 "sudo certbot renew --force-renewal"

# Reload nginx after renewal:
ssh maciejpi@57.128.200.27 "sudo systemctl reload nginx"
```

**If mixed content warnings:**

```bash
# Check Flask is generating HTTPS URLs
# Verify in templates: url_for(..., _external=True, _scheme='https')

# Check CSP headers in app.py
grep "Content-Security-Policy" /var/www/nordabiznes/app.py
```

---

### 2.5 DNS Resolution Issues

**Severity:** MEDIUM

#### Symptoms

- `nslookup nordabiznes.pl` fails
- Site accessible by IP but not domain
- Inconsistent access from different networks

#### Diagnosis

```bash
# 1. Check external DNS (OVH)
nslookup nordabiznes.pl 8.8.8.8
# Should return: 57.128.200.27

# 2. Check internal DNS (inpi.local)
nslookup nordabiznes.inpi.local 10.22.68.1
# Should return: 57.128.200.27

# 3. Test from different locations
curl -I -H "Host: nordabiznes.pl" http://57.128.200.27/health

# 4. Check DNS points to OVH VPS
dig nordabiznes.pl +short
# Expected: 57.128.200.27
```

#### Solution

**If external DNS issue:**

```bash
# Check OVH DNS settings
# Login to OVH control panel
# Verify A record: nordabiznes.pl → 57.128.200.27
# Verify A record: www.nordabiznes.pl → 57.128.200.27
```

**If internal DNS issue:**

```bash
# Update internal DNS server
# This requires access to INPI DNS management (see dns-manager skill)
```

---

## 3. Application & Service Issues

### 3.1 Application Crash / Won't Start

**Severity:** CRITICAL

#### Symptoms

- Flask service status shows "failed" or "inactive"
- Systemd shows error in logs
- Manual start fails with traceback

#### Diagnosis

```bash
# 1. Check service status
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes

# 2. Check recent logs
sudo journalctl -u nordabiznes -n 100 --no-pager

# 3. Try manual start for detailed error
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/python3 app.py
# Read the traceback carefully
```

#### Common Root Causes & Solutions

**A. Python Syntax Error**

```bash
# Symptom: SyntaxError in logs
# Cause: Recent code change introduced syntax error
# Fix: Check syntax
/var/www/nordabiznes/venv/bin/python3 -m py_compile app.py

# Rollback if necessary
cd /var/www/nordabiznes
git log --oneline -5
git revert HEAD  # or specific commit
sudo systemctl restart nordabiznes
```

**B. Missing Environment Variables**

```bash
# Symptom: KeyError or "SECRET_KEY not found"
# Cause: .env file missing or incomplete
# Fix: Check .env exists and has required variables
ls -la /var/www/nordabiznes/.env
cat /var/www/nordabiznes/.env | grep -E "^[A-Z_]+=" | wc -l
# Should have ~20 environment variables

# Required variables (add if missing):
# - SECRET_KEY
# - DATABASE_URL
# - GEMINI_API_KEY
# - BRAVE_SEARCH_API_KEY
# - GOOGLE_PAGESPEED_API_KEY
# - ADMIN_EMAIL
# - ADMIN_PASSWORD
```

**C. Database Connection Failed**

```bash
# Symptom: "could not connect to server" or "FATAL: password authentication failed"
# Cause: PostgreSQL not running or wrong credentials
# Fix: Check PostgreSQL
sudo systemctl status postgresql

# Test connection
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"

# If password wrong, update .env and restart
```

**D. Missing Python Dependencies**

```bash
# Symptom: ImportError or ModuleNotFoundError
# Cause: Dependency not installed in venv
# Fix: Reinstall dependencies
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/pip install -r requirements.txt

# Verify specific package
/var/www/nordabiznes/venv/bin/pip show flask
```

**E. 
Port 5000 Already in Use**

```bash
# Symptom: "Address already in use"
# Cause: Another process using port 5000
# Fix: Find and kill process
sudo lsof -i :5000
sudo kill <PID>  # replace <PID> with the process ID reported by lsof

# Or restart server if unclear
sudo reboot
```

#### Verification

```bash
sudo systemctl status nordabiznes
# Should show "active (running)"

curl http://localhost:5000/health
# Should return JSON
```

---

### 3.2 White Screen / Blank Page

**Severity:** HIGH

#### Symptoms

- Page loads but shows blank white screen
- No error message in browser
- HTML source is empty or minimal

#### Diagnosis

```bash
# 1. Check browser console (F12)
# Look for JavaScript errors

# 2. Check Flask logs
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 50 --no-pager | grep ERROR

# 3. Check template rendering
curl https://nordabiznes.pl/ -o /tmp/page.html
less /tmp/page.html
# Check if HTML is complete

# 4. Check static assets loading
curl -I https://nordabiznes.pl/static/css/styles.css
# Should return 200
```

#### Root Causes & Solutions

**A. Template Rendering Error**

```bash
# Symptom: Jinja2 error in logs
# Cause: Syntax error in template file
# Fix: Check Flask logs for template name
sudo journalctl -u nordabiznes -n 100 | grep -i jinja

# Test template syntax
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/python3 -c "
from jinja2 import Template
with open('templates/index.html') as f:
    Template(f.read())
"
```

**B. JavaScript Error**

```bash
# Symptom: Console shows JS error
# Cause: Syntax error in JavaScript code
# Fix: Check browser console
# Common issues:
# - extra_js block has <script> tags inside it
# WRONG: {% block extra_js %}<script>code</script>{% endblock %}
# RIGHT: {% block extra_js %}code{% endblock %}
```

**C. 
Database Query Failed**

```bash
# Symptom: 500 error in network tab
# Cause: Database query error preventing page render
# Fix: Check Flask logs
sudo journalctl -u nordabiznes -n 50 | grep -i "sqlalchemy\|database"

# Check database connectivity
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
```

---

### 3.3 Search Not Working

**Severity:** MEDIUM

#### Symptoms

- Search returns no results for valid queries
- Search is very slow (>5 seconds)
- Search returns "Database error"

#### Diagnosis

```bash
# 1. Test search endpoint
curl "https://nordabiznes.pl/search?q=test" -v

# 2. Check search_service.py logs
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 100 | grep -i search

# 3. Test database FTS
psql -h localhost -U nordabiz_app -d nordabiz

# Test FTS query
SELECT name, ts_rank(search_vector, to_tsquery('polish', 'web')) AS score
FROM companies
WHERE search_vector @@ to_tsquery('polish', 'web')
ORDER BY score DESC
LIMIT 5;

# Check pg_trgm extension
SELECT * FROM pg_extension WHERE extname = 'pg_trgm';
```

#### Root Causes & Solutions

**A. Full-Text Search Index Outdated**

```bash
# Symptom: Recent companies don't appear in search
# Cause: search_vector not updated
# Fix: Rebuild FTS index
psql -h localhost -U nordabiz_app -d nordabiz

UPDATE companies SET search_vector = to_tsvector('polish',
  COALESCE(name, '') || ' ' ||
  COALESCE(description, '') || ' ' ||
  COALESCE(array_to_string(services, ' '), '') || ' ' ||
  COALESCE(array_to_string(competencies, ' '), '')
);

VACUUM ANALYZE companies;
```

**B. Synonym Expansion Not Working**

```bash
# Symptom: Search for "www" doesn't find "strony internetowe"
# Cause: SYNONYM_EXPANSION dict in search_service.py incomplete
# Fix: Check synonyms
cd /var/www/nordabiznes
grep -A 20 "SYNONYM_EXPANSION" search_service.py

# Add missing synonyms if needed
# Restart service after editing
```

**C. 
Search Timeout**

```bash
# Symptom: Search takes >30 seconds
# Cause: Missing database indexes
# Fix: Add indexes
psql -h localhost -U nordabiz_app -d nordabiz

CREATE INDEX IF NOT EXISTS idx_companies_search_vector
  ON companies USING gin(search_vector);
CREATE INDEX IF NOT EXISTS idx_companies_name_trgm
  ON companies USING gin(name gin_trgm_ops);

VACUUM ANALYZE companies;
```

#### Verification

```bash
# Test search
curl "https://nordabiznes.pl/search?q=web" | grep -c "company-card"
# Should return number of results found
```

---

### 3.4 AI Chat Not Responding

**Severity:** MEDIUM

#### Symptoms

- Chat shows "thinking..." forever
- Chat returns error message
- Empty responses from AI

#### Root Causes & Solutions

See [Section 5.2: Gemini API Issues](#52-gemini-api-issues) for detailed troubleshooting.

Quick check:

```bash
# 1. Verify Gemini API key
ssh maciejpi@57.128.200.27
cat /var/www/nordabiznes/.env | grep GEMINI_API_KEY
# Should not be empty

# 2. Test Gemini API directly
curl -H "x-goog-api-key: YOUR_API_KEY" \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"parts":[{"text":"Hello"}]}]}'

# 3. Check quota
# Visit: https://console.cloud.google.com/apis/api/generativelanguage.googleapis.com/quotas
```

---

## 4. Database Issues

### 4.1 Database Connection Failed

**Severity:** CRITICAL

#### Symptoms

- Flask logs show "could not connect to server"
- All database queries fail
- 500 error on all pages

#### Diagnosis

```bash
# 1. Check PostgreSQL service
ssh maciejpi@57.128.200.27
sudo systemctl status postgresql

# 2. Check PostgreSQL is listening
sudo netstat -tlnp | grep 5432
# Should show: LISTEN on 127.0.0.1:5432

# 3. Check logs
sudo journalctl -u postgresql -n 50

# 4. Test connection
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"
```

#### Solution

**If PostgreSQL is stopped:**

```bash
sudo systemctl start postgresql
sudo systemctl status postgresql

# If fails to start, check logs
sudo journalctl -u postgresql -n 100 --no-pager
```

**If connection refused:**

```bash
# Check pg_hba.conf allows local connections
sudo cat /etc/postgresql/*/main/pg_hba.conf | grep "127.0.0.1"
# Should have: host all all 127.0.0.1/32 md5

# Reload if changed
sudo systemctl reload postgresql
```

**If authentication failed:**

```bash
# Verify user exists
sudo -u postgres psql -c "\du nordabiz_app"

# Reset password if needed
sudo -u postgres psql
ALTER USER nordabiz_app WITH PASSWORD 'NEW_PASSWORD';
\q

# Update .env with new password
sudo nano /var/www/nordabiznes/.env
# Update DATABASE_URL line

# Restart Flask
sudo systemctl restart nordabiznes
```

#### Verification

```bash
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT count(*) FROM companies;"
# Should return count
```

---

### 4.2 Database Query Slow

**Severity:** MEDIUM

#### Symptoms

- Pages load slowly (>5 seconds)
- Database queries take a long time
- High CPU usage on database server

#### Diagnosis

```bash
# 1. Check for slow queries
psql -h localhost -U nordabiz_app -d nordabiz

SELECT pid, now() - query_start AS duration, query
FROM pg_stat_activity
WHERE state = 'active' AND now() - query_start > interval '1 second'
ORDER BY duration DESC;

# 2. Check index usage (indexes that have never been scanned)
SELECT schemaname, relname AS tablename, indexrelname AS indexname, idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0 AND indexrelname NOT LIKE '%pkey';

# 3. Check table statistics
SELECT schemaname, relname AS tablename, n_live_tup, n_dead_tup,
       last_autovacuum, last_autoanalyze
FROM pg_stat_user_tables
WHERE schemaname = 'public'
ORDER BY n_live_tup DESC;

# 4. Enable query logging temporarily
ALTER DATABASE nordabiz SET log_min_duration_statement = 1000;
-- Log queries taking > 1 second
```

#### Solution

**If missing indexes:**

```bash
# Add appropriate indexes based on queries
psql -h localhost -U nordabiz_app -d nordabiz

-- Example: Index on foreign key
CREATE INDEX idx_company_news_company_id ON company_news(company_id);

-- Example: Composite index for common query
CREATE INDEX idx_users_email_active ON users(email, is_active);

-- Rebuild search index
REINDEX INDEX idx_companies_search_vector;

VACUUM ANALYZE;
```

**If high dead tuple ratio:**

```bash
# Run vacuum
psql -h localhost -U nordabiz_app -d nordabiz

VACUUM ANALYZE;

# For severe cases
VACUUM FULL companies;  -- Locks table!
```

**If table statistics outdated:**

```bash
psql -h localhost -U nordabiz_app -d nordabiz

ANALYZE companies;
ANALYZE users;
ANALYZE ai_chat_messages;
```

#### Verification

```bash
# Check query performance improved
\timing
SELECT * FROM companies WHERE name ILIKE '%test%' LIMIT 10;
# Should complete in < 100ms
```

---

### 4.3 Database Disk Full

**Severity:** HIGH

#### Symptoms

- PostgreSQL logs show "No space left on device"
- INSERT/UPDATE queries fail
- Database becomes read-only

#### Diagnosis

```bash
# 1. Check disk usage
ssh maciejpi@57.128.200.27
df -h
# Check /var/lib/postgresql usage

# 2. Check database size
psql -h localhost -U nordabiz_app -d nordabiz

SELECT pg_size_pretty(pg_database_size('nordabiz'));

SELECT tablename,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

# 3. Check WAL files
sudo du -sh /var/lib/postgresql/*/main/pg_wal/
```

#### Solution

**If WAL files accumulating:**

```bash
# Check WAL settings
sudo -u postgres psql -c "SHOW max_wal_size;"
sudo -u postgres psql -c "SHOW wal_keep_size;"

# Trigger checkpoint
sudo -u postgres psql -c "CHECKPOINT;"
```

**If old backups not cleaned:**

```bash
# Remove old backups (keep last 7 days)
find /backup/nordabiz/ -name "*.sql" -mtime +7 -delete
```

**If logs too large:**

```bash
# Truncate old logs
sudo journalctl --vacuum-time=7d

# Remove WAL segments older than the named segment
# CAUTION: pg_archivecleanup is intended for WAL archive directories;
# run against the live pg_wal directory it can delete segments PostgreSQL still needs
sudo -u postgres pg_archivecleanup /var/lib/postgresql/*/main/pg_wal/ 000000010000000000000001
```

**Emergency: Archive and purge old data:**

```bash
# Archive old data before deletion
psql -h localhost -U nordabiz_app -d nordabiz

-- Example: Archive old AI chat messages (>6 months)
CREATE TABLE ai_chat_messages_archive AS
SELECT * FROM ai_chat_messages
WHERE created_at < NOW() - INTERVAL '6 months';

DELETE FROM ai_chat_messages
WHERE created_at < NOW() - INTERVAL '6 months';

VACUUM FULL ai_chat_messages;
```

---

### 4.4 Database Migration Failed

**Severity:** HIGH

#### Symptoms

- Migration script returns error
- Database schema out of sync with code
- Missing tables or columns

#### Diagnosis

```bash
# 1. Check current schema version
psql -h localhost -U nordabiz_app -d nordabiz
\dt            # List all tables
\d companies   # Describe companies table

# 2. Check migration logs
ls -la /var/www/nordabiznes/database/migrations/

# 3. Check Flask-Migrate status (if using Alembic)
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/flask db current
```

#### Solution

**If table missing:**

```bash
# Re-run migration script
cd /var/www/nordabiznes
psql -h localhost -U nordabiz_app -d nordabiz < database/schema.sql
```

**If column added but missing:**

```bash
# Add column manually
psql -h localhost -U nordabiz_app -d nordabiz

ALTER TABLE companies ADD COLUMN IF NOT EXISTS new_column VARCHAR(255);

-- Grant permissions
GRANT ALL ON TABLE companies TO nordabiz_app;
```

**If migration stuck:**

```bash
# Rollback last migration
/var/www/nordabiznes/venv/bin/flask db downgrade

# Re-apply
/var/www/nordabiznes/venv/bin/flask db upgrade
```

#### Verification

```bash
psql -h localhost -U nordabiz_app -d nordabiz
\d companies
# Verify schema matches expected structure
```

---

## 5. API Integration Issues

### 5.1 API Rate Limit Exceeded

**Severity:** MEDIUM

#### Symptoms

- 429 "Too Many Requests" errors
- API calls fail with quota exceeded message
- Features stop working after heavy usage

#### Diagnosis

```bash
# 1. Check API usage in database
ssh maciejpi@57.128.200.27
psql -h localhost -U nordabiz_app -d nordabiz

-- Gemini API usage today
SELECT COUNT(*), SUM(input_tokens), SUM(output_tokens)
FROM ai_api_costs
WHERE DATE(created_at) = CURRENT_DATE;

-- PageSpeed API usage today
SELECT COUNT(*) FROM company_website_analysis
WHERE DATE(created_at) = CURRENT_DATE;

-- Brave Search API usage this month
SELECT COUNT(*) FROM company_news
WHERE DATE(created_at) >= DATE_TRUNC('month', CURRENT_DATE);

# 2. Check rate limiting logs
sudo journalctl -u nordabiznes -n 100 | grep -i "rate limit\|quota\|429"
```

#### API Quotas Reference

| API | Free Tier Limit | Current Usage Query |
|-----|-----------------|---------------------|
| Gemini AI | 1,500 req/day | `SELECT COUNT(*) FROM ai_api_costs WHERE DATE(created_at) = CURRENT_DATE;` |
| PageSpeed | 25,000 req/day | `SELECT COUNT(*) FROM company_website_analysis WHERE DATE(created_at) = CURRENT_DATE;` |
| Brave Search | 2,000 req/month | `SELECT COUNT(*) FROM company_news WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE);` |
| Google Places | Limited | Check Google Cloud Console |
| MS Graph | Per tenant | Check Azure AD logs |

#### Solution

**If Gemini quota exceeded:**

```bash
# Wait until next day (quota resets at midnight UTC)
# OR upgrade to paid tier

# Temporary workaround: Disable AI chat
sudo nano /var/www/nordabiznes/app.py
# Comment out @app.route('/chat') temporarily
sudo systemctl restart nordabiznes
```

**If PageSpeed quota exceeded:**

```bash
# Stop SEO audit script
pkill -f seo_audit.py

# Wait until next day
# Consider batching audits to stay under quota
```

**If Brave Search quota exceeded:**

```bash
# Disable news monitoring temporarily
# Wait until next month
# Consider upgrading to paid tier ($5/month for 20k requests)
```

#### Prevention

```bash
# Add quota monitoring alerts
# Create script: /var/www/nordabiznes/scripts/check_api_quotas.sh

#!/bin/bash
# -tA prints the bare number (tuples only, unaligned) so the integer test below is reliable
GEMINI_COUNT=$(psql -h localhost -U nordabiz_app -d nordabiz -tA -c \
  "SELECT COUNT(*) FROM ai_api_costs WHERE DATE(created_at) = CURRENT_DATE;")

if [ "$GEMINI_COUNT" -gt 1400 ]; then
  echo "WARNING: Gemini API usage at $GEMINI_COUNT / 1500"
  # Send alert email
fi

# Add to crontab: run hourly
# 0 * * * * /var/www/nordabiznes/scripts/check_api_quotas.sh
```

---

### 5.2 Gemini API Issues

**Severity:** MEDIUM

#### Symptoms

- AI chat returns empty responses
- "Safety filter blocked response" error
- Gemini API timeout
- "Conversation not found" error

#### Diagnosis

```bash
# 1. Test Gemini API directly
# (cut -f2- keeps the whole value even if the key itself contains "=")
GEMINI_KEY=$(grep GEMINI_API_KEY /var/www/nordabiznes/.env | cut -d= -f2-)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash-preview:generateContent" \
  -H "x-goog-api-key: $GEMINI_KEY" \
  -H "Content-Type: application/json" \
  -d '{"contents":[{"parts":[{"text":"Hello, test"}]}]}'

# 2. Check Flask logs for Gemini errors
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 100 | grep -i gemini

# 3. Check conversation ownership
psql -h localhost -U nordabiz_app -d nordabiz

SELECT id, user_id, created_at
FROM ai_chat_conversations
WHERE id = 123;  -- Replace with conversation ID
```

#### Common Issues & Solutions

**A. Empty AI Responses**

```bash
# Cause: Safety filters blocking response
# OR context too long

# Check last message for safety filter
psql -h localhost -U nordabiz_app -d nordabiz

SELECT message, ai_response, error_message
FROM ai_chat_messages
ORDER BY created_at DESC
LIMIT 5;

# If error_message contains "safety" or "blocked":
# - Rephrase user query to be less controversial
# - No technical fix needed, it's Gemini's safety system
```

**B. Conversation Not Found**

```bash
# Cause: User trying to access someone else's conversation

# Verify conversation ownership
SELECT c.id, c.user_id, u.email
FROM ai_chat_conversations c
JOIN users u ON c.user_id = u.id
WHERE c.id = 123;  -- Replace ID

# Fix: Ensure frontend passes correct conversation_id
# OR create new conversation for user
```

**C. Token Limit Exceeded**

```bash
# Cause: Conversation history too long (>200k tokens)

# Check token usage
SELECT id, input_tokens, output_tokens,
       input_tokens + output_tokens AS total_tokens
FROM ai_chat_messages
WHERE conversation_id = 123
ORDER BY created_at DESC;

# Fix: Trim old messages (keeps only the most recent ones)
DELETE FROM ai_chat_messages
WHERE conversation_id = 123
  AND created_at < (
    SELECT created_at FROM ai_chat_messages
    WHERE conversation_id = 123
    ORDER BY created_at DESC
    LIMIT 1 OFFSET 10
  );
```

**D. 
API Key Invalid**

```bash
# Symptom: 401 Unauthorized or 403 Forbidden

# Verify API key
grep GEMINI_API_KEY /var/www/nordabiznes/.env

# Test key directly
curl -H "x-goog-api-key: YOUR_KEY" \
  "https://generativelanguage.googleapis.com/v1beta/models"

# If invalid, regenerate key in Google Cloud Console
# https://console.cloud.google.com/apis/credentials
```

#### Verification

```bash
# Test AI chat endpoint
curl -X POST https://nordabiznes.pl/api/chat \
  -H "Content-Type: application/json" \
  -d '{"conversation_id":123,"message":"test"}' \
  -b "session=YOUR_SESSION_COOKIE"
# Should return JSON with AI response
```

---

### 5.3 PageSpeed API Issues

**Severity:** LOW

#### Symptoms

- SEO audit fails with API error
- PageSpeed scores show as null/0
- Timeout errors in audit script

#### Diagnosis

```bash
# 1. Test PageSpeed API directly
# (cut -f2- keeps the whole value even if the key itself contains "=")
PAGESPEED_KEY=$(grep GOOGLE_PAGESPEED_API_KEY /var/www/nordabiznes/.env | cut -d= -f2-)
curl "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://nordabiznes.pl&key=$PAGESPEED_KEY"

# 2. Check audit logs
ssh maciejpi@57.128.200.27
sudo journalctl -u nordabiznes -n 100 | grep -i pagespeed

# 3. Check recent audits
psql -h localhost -U nordabiz_app -d nordabiz

SELECT company_id, url, seo_score, performance_score, audited_at, error_message
FROM company_website_analysis
ORDER BY audited_at DESC
LIMIT 10;
```

#### Solution

**If API key invalid:**

```bash
# Regenerate key in Google Cloud Console
# https://console.cloud.google.com/apis/credentials?project=gen-lang-client-0540794446

# Update .env
sudo nano /var/www/nordabiznes/.env
# GOOGLE_PAGESPEED_API_KEY=NEW_KEY

sudo systemctl restart nordabiznes
```

**If quota exceeded:**

```bash
# Wait until next day (25k/day limit)
# Check usage
# https://console.cloud.google.com/apis/api/pagespeedonline.googleapis.com/quotas
```

**If timeout:**

```bash
# Increase timeout in seo_audit.py
sudo nano /var/www/nordabiznes/scripts/seo_audit.py
# Find: timeout=30
# Change to: timeout=60

# Or run audits in smaller batches
python seo_audit.py --batch 1-10
# Wait 5 minutes between batches
```

---

### 5.4 Brave Search API Issues

**Severity:** LOW

#### Symptoms

- News monitoring returns no results
- Brave API 429 error
- Invalid search results

#### Diagnosis

```bash
# 1. Test Brave API directly
# (cut -f2- keeps the whole value even if the key itself contains "=")
BRAVE_KEY=$(grep BRAVE_SEARCH_API_KEY /var/www/nordabiznes/.env | cut -d= -f2-)
curl -H "X-Subscription-Token: $BRAVE_KEY" \
  "https://api.search.brave.com/res/v1/news/search?q=test&count=5"

# 2. Check usage this month
psql -h localhost -U nordabiz_app -d nordabiz

SELECT COUNT(*) AS searches_this_month
FROM company_news
WHERE created_at >= DATE_TRUNC('month', CURRENT_DATE);
-- Free tier: 2,000/month

# 3. Check for error logs
sudo journalctl -u nordabiznes -n 100 | grep -i brave
```

#### Solution

**If quota exceeded (2000/month):**

```bash
# Wait until next month
# OR upgrade to paid tier

# Temporary: Disable news monitoring
# Comment out news fetch cron job
```

**If API key invalid:**

```bash
# Get new key from https://brave.com/search/api/

# Update .env
sudo nano /var/www/nordabiznes/.env
# BRAVE_SEARCH_API_KEY=NEW_KEY

sudo systemctl restart nordabiznes
```

---

## 6. Authentication & Security Issues

### 6.1 Cannot Login / Session Expired

**Severity:** MEDIUM

#### Symptoms

- "Invalid credentials" despite correct password
- Redirected to login immediately after logging in
- Session expires too quickly
- "CSRF token missing" error

#### Diagnosis

```bash
# 1. Check user exists and is active
ssh maciejpi@57.128.200.27
psql -h localhost -U nordabiz_app -d nordabiz

SELECT id, email, is_active, email_verified, failed_login_attempts
FROM users
WHERE email = 'user@example.com';

# 2. Check session configuration
grep -E "SECRET_KEY|PERMANENT_SESSION_LIFETIME" /var/www/nordabiznes/app.py

# 3. Check Flask logs for auth errors
sudo journalctl -u nordabiznes -n 100 | grep -i "login\|session\|auth"

# 4. Test from server (bypass network)
curl -c /tmp/cookies.txt -X POST http://localhost:5000/login \
  -d "email=test@nordabiznes.pl&password=TEST_PASSWORD"
```

#### Common Issues & Solutions

**A. Account Locked (Failed Login Attempts)**

```bash
# Check failed attempts
psql -h localhost -U nordabiz_app -d nordabiz

SELECT email, failed_login_attempts, last_failed_login
FROM users
WHERE email = 'user@example.com';

# If >= 5 attempts, reset:
UPDATE users SET failed_login_attempts = 0
WHERE email = 'user@example.com';
```

**B. 
Email Not Verified**

```bash
# Check verification status
SELECT email, email_verified, verification_token, verification_token_expiry
FROM users
WHERE email = 'user@example.com';

# Force verify (for testing)
UPDATE users
SET email_verified = TRUE
WHERE email = 'user@example.com';
```

**C. Session Cookie Not Persisting**

```bash
# Check cookie settings in app.py
grep -A 5 "SESSION_COOKIE" /var/www/nordabiznes/app.py

# Should have:
# SESSION_COOKIE_SECURE = True      # HTTPS only
# SESSION_COOKIE_HTTPONLY = True    # No JS access
# SESSION_COOKIE_SAMESITE = 'Lax'   # CSRF protection

# If accessing via HTTP (not HTTPS), the session cookie is not sent
# Ensure you are using https://nordabiznes.pl, not http://
```

**D. CSRF Token Mismatch**

```bash
# Symptom: "400 Bad Request - CSRF token missing"
# Cause: Form submitted without CSRF token

# Fix: Ensure all forms have:
# {{ form.hidden_tag() }}  # WTForms
# OR
# <input type="hidden" name="csrf_token" value="{{ csrf_token() }}"/>

# Check template
grep -r "csrf_token" /var/www/nordabiznes/templates/login.html
```

**E. Password Hash Algorithm Changed**

```bash
# Symptom: Old users can't login after upgrade

# Check hash format
SELECT id, email, SUBSTRING(password_hash, 1, 20)
FROM users
WHERE email = 'user@example.com';

# Should start with: pbkdf2:sha256:
# If different, user needs password reset
# Send reset email via /forgot-password
```

#### Verification

```bash
# Test login flow
curl -c /tmp/cookies.txt -X POST http://localhost:5000/login \
  -d "email=test@nordabiznes.pl&password=TEST_PASSWORD" \
  -L -v

# Should see: Set-Cookie: session=...
# Should redirect to /dashboard
```

---

### 6.2 Unauthorized Access / Permission Denied

**Severity:** HIGH

#### Symptoms

- "403 Forbidden" error
- User can access pages they shouldn't
- Admin panel not accessible

#### Diagnosis

```bash
# 1. Check user role
psql -h localhost -U nordabiz_app -d nordabiz

SELECT id, email, is_admin, is_norda_member
FROM users
WHERE email = 'user@example.com';

# 2. Check route decorators
# (-A shows the lines after @app.route, where the decorators are stacked)
grep -A 3 "@app.route('/admin" /var/www/nordabiznes/app.py
# Should have: @login_required and @admin_required

# 3. Check Flask logs
sudo journalctl -u nordabiznes -n 50 | grep -i "forbidden\|unauthorized"
```

#### Solution

**If user should be admin:**

```bash
# Grant admin role
psql -h localhost -U nordabiz_app -d nordabiz

UPDATE users
SET is_admin = TRUE
WHERE email = 'admin@nordabiznes.pl';
```

**If authorization check broken:**

```bash
# Check app.py decorators
# Should have:
@app.route('/admin/users')
@login_required
@admin_required
def admin_users():
    ...

# Verify @admin_required is defined:
grep -A 5 "def admin_required" /var/www/nordabiznes/app.py
```

**If company ownership check failed:**

```bash
# Verify company-user association
SELECT c.id, c.name, u.email
FROM companies c
LEFT JOIN users u ON c.id = u.company_id
WHERE c.slug = 'company-slug';

# Update user's company
UPDATE users
SET company_id = 123
WHERE email = 'user@example.com';
```

---

### 6.3 Password Reset Not Working

**Severity:** MEDIUM

#### Symptoms

- Password reset email not received
- Reset token expired or invalid
- "Invalid token" error

#### Diagnosis

```bash
# 1. Check user reset token
psql -h localhost -U nordabiz_app -d nordabiz

SELECT email, reset_token, reset_token_expiry
FROM users
WHERE email = 'user@example.com';

# 2. Check email service logs
sudo journalctl -u nordabiznes -n 100 | grep -i "email\|smtp"

# 3. Test MS Graph API (email service)
# Check if AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID set
grep AZURE /var/www/nordabiznes/.env
```

#### Solution

**If token expired:**

```bash
# Tokens expire after 1 hour
# Generate new token via /forgot-password
# OR manually extend expiry:
UPDATE users
SET reset_token_expiry = NOW() + INTERVAL '1 hour'
WHERE email = 'user@example.com';
```

**If email not sent:**

```bash
# Check MS Graph credentials
python3 << 'EOF'
import os
from email_service import EmailService

service = EmailService()
result = service.send_email(
    to_email="test@example.com",
    subject="Test",
    body="Test email"
)
print(result)
EOF

# If fails, check Azure AD app registration
# Ensure "Mail.Send" permission granted
```

**Manual password reset (emergency):**

```bash
# Generate new password hash
# (method is set explicitly so the hash matches the pbkdf2:sha256: format from Section 6.1)
python3 << 'EOF'
from werkzeug.security import generate_password_hash

password = "NewPassword123"
print(generate_password_hash(password, method="pbkdf2:sha256"))
EOF

# Update database
psql -h localhost -U nordabiz_app -d nordabiz

UPDATE users
SET password_hash = 'HASH_FROM_ABOVE'
WHERE email = 'user@example.com';
```

---

## 7. Performance Issues

### 7.1 Slow Page Load Times

**Severity:** MEDIUM

#### Symptoms

- Pages take >5 seconds to load
- TTFB (Time to First Byte) is high
- Browser shows "waiting for nordabiznes.pl..."

#### Diagnosis

```bash
# 1. Measure response time
time curl -I https://nordabiznes.pl/

# 2. Check Gunicorn worker status
ssh maciejpi@57.128.200.27
ps aux | grep gunicorn
# Look for: worker processes (should be 4-8)

# 3. Check server load
top -n 1
# Look at: CPU usage, memory usage, load average

# 4. Check database query times
psql -h localhost -U nordabiz_app -d nordabiz

SELECT calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
-- If pg_stat_statements not enabled, see solution below

# 5. Profile Flask app
sudo journalctl -u nordabiznes -n 100 | grep -E "took|slow|timeout"
```

#### Root Causes & Solutions

**A. 
Too Few Gunicorn Workers**

```bash
# Current workers (the count includes the Gunicorn master process)
ps aux | grep gunicorn | grep -v grep | wc -l

# Recommended: (2 x CPU cores) + 1
# For 4 core VM: 9 workers

# Update systemd service
sudo nano /etc/systemd/system/nordabiznes.service

# Change:
ExecStart=/var/www/nordabiznes/venv/bin/gunicorn --workers=9 \
    --bind 0.0.0.0:5000 --timeout 120 app:app

sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```

**B. Slow Database Queries**

```bash
# Enable query stats (if not enabled)
sudo -u postgres psql

ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';

# Restart PostgreSQL
sudo systemctl restart postgresql

# Create the extension in the application database (required before the view can be queried)
sudo -u postgres psql -d nordabiz -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"

# Check slow queries
psql -h localhost -U nordabiz_app -d nordabiz

SELECT calls, mean_exec_time, query
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;

# Add indexes for slow queries (see Section 4.2)
```

**C. External API Timeouts**

```bash
# Check for API timeout logs
sudo journalctl -u nordabiznes -n 200 | grep -i timeout

# Common culprits:
# - Gemini API (text generation)
# - PageSpeed API (site audit)
# - Brave Search API

# Solution: Add caching
# Example: Cache PageSpeed results for 24 hours

# app.py modification (pseudocode):
# if last_audit < 24h ago:
#     return cached_result
# else:
#     fetch new audit
```

**D. Missing Static Asset Caching**

```bash
# Check cache headers
curl -I https://nordabiznes.pl/static/css/styles.css | grep -i cache

# Should have: Cache-Control: max-age=31536000

# If missing, add to the nginx config or app.py:
from flask import request

@app.after_request
def add_cache_header(response):
    # Only long-cache static assets; leave dynamic responses untouched
    if request.path.startswith('/static/'):
        response.cache_control.max_age = 31536000
    return response
```

**E. Large Database Result Sets**

```bash
# Check for N+1 queries or loading too much data

# Example bad query:
# for company in Company.query.all():  # Loads ALL companies!
#     print(company.name)

# Fix: Add pagination
# companies = Company.query.paginate(page=1, per_page=20)
```

#### Verification

```bash
# Test response time
for i in {1..5}; do
  time curl -s -o /dev/null https://nordabiznes.pl/
done

# Should average < 500ms
```

---

### 7.2 High Memory Usage

**Severity:** MEDIUM

#### Symptoms

- Server OOM (Out of Memory) errors
- Swapping active (slow performance)
- Gunicorn workers killed by OOM killer

#### Diagnosis

```bash
# 1. Check memory usage
ssh maciejpi@57.128.200.27
free -h

# 2. Check which process using memory
ps aux --sort=-%mem | head -10

# 3. Check for memory leaks
# Monitor over time:
watch -n 5 'ps aux | grep gunicorn | awk "{sum+=\$6} END {print sum/1024 \" MB\"}"'

# 4. Check OOM killer logs
sudo dmesg | grep -i "out of memory\|oom"
```

#### Solution

**If Gunicorn workers too many:**

```bash
# Reduce workers
sudo nano /etc/systemd/system/nordabiznes.service

# Change: --workers=9 to --workers=4

sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```

**If memory leak in application:**

```bash
# Restart workers periodically
sudo nano /etc/systemd/system/nordabiznes.service

# Add: --max-requests=1000 --max-requests-jitter=100
# This restarts each worker after roughly 1000 requests

sudo systemctl daemon-reload
sudo systemctl restart nordabiznes
```

**If PostgreSQL using too much memory:**

```bash
# Check PostgreSQL memory settings
sudo -u postgres psql -c "SHOW shared_buffers;"
sudo -u postgres psql -c "SHOW work_mem;"

# Reduce if necessary
sudo nano /etc/postgresql/*/main/postgresql.conf

# shared_buffers = 256MB  # Was 512MB
# work_mem = 4MB          # Was 16MB

sudo systemctl restart postgresql
```

**If server needs more RAM:**

```bash
# Increase VM RAM in Proxmox
# OR add swap space

# Add 2GB swap
sudo fallocate -l 2G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Make permanent
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```

---

### 7.3 High CPU Usage

**Severity:** MEDIUM

#### Symptoms

- CPU at 100% constantly
- Server load average > number of cores
- Slow response times

#### Diagnosis

```bash
# 1. Check CPU usage
ssh maciejpi@57.128.200.27
top -n 1
# Look for processes using >80% CPU

# 2. Check load average
uptime
# Load should be < number of CPU cores

# 3. Identify CPU-heavy queries
psql -h localhost -U nordabiz_app -d nordabiz

SELECT pid, now() - query_start AS duration, state, query
FROM pg_stat_activity
WHERE state != 'idle'
ORDER BY now() - query_start DESC;
```

#### Solution

**If database query CPU-intensive:**

```bash
# Kill long-running query
psql -h localhost -U nordabiz_app -d nordabiz

SELECT pg_terminate_backend(PID);

# Add index to optimize query
# See Section 4.2
```

**If AI chat overwhelming CPU:**

```bash
# Add rate limiting to chat endpoint
# app.py modification:
from flask_limiter import Limiter
from flask_login import current_user  # assumes Flask-Login provides current_user

# Keyword arguments keep this call valid across Flask-Limiter versions
limiter = Limiter(key_func=lambda: str(current_user.id), app=app)

@app.route('/api/chat', methods=['POST'])
@limiter.limit("10 per minute")  # Add this
def chat_api():
    ...
```

**If search causing high CPU:**

```bash
# Optimize search query
# Use indexes instead of ILIKE

# Cache search results
# Add to app.py:
from functools import lru_cache

@lru_cache(maxsize=100)
def search_companies_cached(query):
    return search_companies(db, query)
```

---

## 8. Monitoring & Diagnostics

### 8.1 Health Check Endpoints

```bash
# Application health
curl https://nordabiznes.pl/health

# Expected response:
{
  "status": "healthy",
  "database": "connected",
  "timestamp": "2026-01-10T12:00:00Z"
}

# Database health
psql -h localhost -U nordabiz_app -d nordabiz -c "SELECT 1;"

# Nginx health (on OVH VPS)
ssh maciejpi@57.128.200.27 "sudo systemctl status nginx"
# Should show: active (running)

# Flask service health
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes
# Should show: active (running)
```

---

### 8.2 Log Locations

```bash
# Flask application logs
sudo journalctl -u nordabiznes -n 100 --no-pager

# Follow live logs
sudo journalctl -u nordabiznes -f

# PostgreSQL logs
sudo journalctl -u postgresql -n 50

# Nginx logs (OVH VPS)
ssh maciejpi@57.128.200.27 "sudo tail -50 /var/log/nginx/error.log"

# System logs
sudo journalctl -n 100

# Nginx access logs (on backend)
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
```

---

### 8.3 Performance Metrics

```bash
# Response time monitoring
# Create script: /usr/local/bin/check_nordabiz_performance.sh

#!/bin/bash
RESPONSE_TIME=$(curl -w '%{time_total}\n' -o /dev/null -s https://nordabiznes.pl/health)
echo "Response time: ${RESPONSE_TIME}s"

if (( $(echo "$RESPONSE_TIME > 2" | bc -l) )); then
    echo "WARNING: Slow response time!"
fi

# Add to cron:
# */5 * * * * /usr/local/bin/check_nordabiz_performance.sh
```

```bash
# Database performance
psql -h localhost -U nordabiz_app -d nordabiz

-- Connection count
SELECT count(*) FROM pg_stat_activity;

-- Active queries
SELECT count(*) FROM pg_stat_activity WHERE state = 'active';

-- Cache hit ratio (should be > 99%)
SELECT sum(heap_blks_hit) / (sum(heap_blks_hit) + sum(heap_blks_read)) * 100 AS cache_hit_ratio
FROM pg_statio_user_tables;

-- Table sizes
SELECT schemaname, tablename,
       pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
```

---

### 8.4 Database Backup Verification

```bash
# Check last backup
ssh maciejpi@57.128.200.27
ls -lah /backup/nordabiz/ | head -10

# Expected: Daily backups (.sql files)

# Test restore (to test database)
sudo -u postgres createdb nordabiz_test
sudo -u postgres psql nordabiz_test < /backup/nordabiz/nordabiz_YYYY-MM-DD.sql

# Verify restore
sudo -u postgres psql nordabiz_test -c "SELECT count(*) FROM companies;"

# Cleanup
sudo -u postgres dropdb nordabiz_test
```

---

## 9. Emergency Procedures

### 9.1 Complete Service Outage

**Severity:** CRITICAL

#### Immediate Actions (First 5 Minutes)

```bash
# 1. Verify outage scope
curl -I https://nordabiznes.pl/health
# If fails, proceed

# 2. Check from internal network
ssh maciejpi@57.128.200.27
curl -I http://localhost:5000/health
# If this works → Network/NPM issue
# If this fails → Application issue

# 3. Notify stakeholders
# Send email/message: "nordabiznes.pl experiencing outage, investigating"

# 4. Check service status
sudo systemctl status nordabiznes
sudo systemctl status postgresql
```

#### If Network/NPM Issue

```bash
# 1. Verify NPM is running
ssh maciejpi@10.22.68.250
docker ps | grep nginx-proxy-manager
# If not running:
docker start nginx-proxy-manager_app_1

# 2. Check NPM configuration
docker exec nginx-proxy-manager_app_1 \
  sqlite3 /data/database.sqlite \
  "SELECT id, forward_host, forward_port FROM proxy_host WHERE id = 27;"
# Must show: 27|57.128.200.27|5000

# 3. Check Fortigate NAT
# Access Fortigate admin panel
# Verify: 85.237.177.83:443 → 10.22.68.250:443
```

#### If Application Issue

```bash
# 1. Check Flask service
ssh maciejpi@57.128.200.27
sudo systemctl status nordabiznes

# If failed, check logs
sudo journalctl -u nordabiznes -n 50

# 2. Try restart
sudo systemctl restart nordabiznes

# If restart fails, check manually
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/python3 app.py
# Read error message

# 3. Common quick fixes:
# - Syntax error: git revert last commit
# - Database down: sudo systemctl start postgresql
# - Port conflict: sudo lsof -i :5000 && kill PID
```

#### If Database Issue

```bash
# 1. Check PostgreSQL
sudo systemctl status postgresql

# If stopped:
sudo systemctl start postgresql

# If start fails:
sudo journalctl -u postgresql -n 50

# 2. Check disk space
df -h
# If full, clean old backups/logs (see Section 4.3)

# 3. Emergency: Restore from backup
sudo systemctl stop nordabiznes
sudo -u postgres dropdb nordabiz
sudo -u postgres createdb nordabiz
sudo -u postgres psql nordabiz < /backup/nordabiz/latest.sql
sudo systemctl start nordabiznes
```

---

### 9.2 Data Loss / Corruption

**Severity:** CRITICAL

#### Immediate Actions

```bash
# 1. STOP the application immediately
sudo systemctl stop nordabiznes

# 2. Create emergency backup of current state
sudo -u postgres pg_dump nordabiz > /tmp/nordabiz_emergency_$(date +%Y%m%d_%H%M%S).sql

# 3. Assess damage
psql -h localhost -U nordabiz_app -d nordabiz

-- Check table counts
SELECT 'companies' AS table, count(*) FROM companies
UNION ALL
SELECT 'users', count(*) FROM users
UNION ALL
SELECT 'ai_chat_conversations', count(*) FROM ai_chat_conversations;

-- Compare with expected counts (should have ~80 companies, etc.)
```

#### Recovery Procedures

**If recent corruption (< 24 hours ago):**

```bash
# Restore from last night's backup
sudo systemctl stop nordabiznes
sudo -u postgres dropdb nordabiz
sudo -u postgres createdb nordabiz
sudo -u postgres psql nordabiz < /backup/nordabiz/nordabiz_$(date -d yesterday +%Y-%m-%d).sql

# Re-grant permissions
sudo -u postgres psql nordabiz << 'EOF'
GRANT ALL PRIVILEGES ON DATABASE nordabiz TO nordabiz_app;
GRANT ALL ON ALL TABLES IN SCHEMA public TO nordabiz_app;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO nordabiz_app;
EOF

sudo systemctl start nordabiznes
```

**If partial data loss:**

```bash
# Identify missing/corrupted records
psql -h localhost -U nordabiz_app -d nordabiz

-- Example: Find companies with NULL required fields
SELECT id, slug, name FROM companies WHERE name IS NULL;

# Restore specific table from backup
# pg_restore only reads custom/directory/tar archives, not plain .sql dumps.
# For a plain .sql backup, load it into a scratch database and extract the table
# (nordabiz_restore and the /tmp path are example names):
sudo -u postgres createdb nordabiz_restore
sudo -u postgres psql nordabiz_restore < /backup/nordabiz/latest.sql
sudo -u postgres pg_dump -t companies --data-only nordabiz_restore > /tmp/companies_restore.sql
sudo -u postgres dropdb nordabiz_restore
```

---

### 9.3 Security Breach

**Severity:** CRITICAL

#### Immediate Actions (First 10 Minutes)

```bash
# 1. ISOLATE the server
ssh maciejpi@57.128.200.27

# Block all incoming traffic except your IP
# (loopback stays open so the psql -h localhost checks below still work)
sudo iptables -A INPUT -i lo -j ACCEPT
sudo iptables -A INPUT -s YOUR_IP -j ACCEPT
sudo iptables -A INPUT -j DROP

# 2. Create forensic copy
sudo -u postgres pg_dump nordabiz > /tmp/forensic_$(date +%Y%m%d_%H%M%S).sql
sudo tar czf /tmp/www_forensic.tar.gz /var/www/nordabiznes/

# 3. Check for unauthorized access
psql -h localhost -U nordabiz_app -d nordabiz

-- Check for new admin users
SELECT id, email, created_at, is_admin
FROM users
WHERE is_admin = TRUE
ORDER BY created_at DESC;

-- Check for recent logins
SELECT user_id, ip_address, created_at
FROM user_login_history
WHERE created_at > NOW() - INTERVAL '24 hours'
ORDER BY created_at DESC;

# 4. Check logs for suspicious activity
sudo journalctl -u nordabiznes --since "24 hours ago" | grep -iE "admin|delete|drop|unauthorized"

# 5. Notify stakeholders
# Email: "Security incident detected on nordabiznes.pl, investigating"
```

#### Investigation

```bash
# Check for SQL injection attempts
sudo journalctl -u nordabiznes --since "7 days ago" | grep -i "UNION\|DROP\|;--"

# Check for unauthorized file changes
sudo find /var/www/nordabiznes/ -type f -mtime -1 -ls

# Check for backdoors
sudo grep -r "eval\|exec\|system\|subprocess" /var/www/nordabiznes/*.py

# Check database for malicious data
psql -h localhost -U nordabiz_app -d nordabiz

SELECT * FROM users WHERE email LIKE '% interval '2 seconds' ORDER BY duration DESC;
EOF

# Check locks
psql -h localhost -U nordabiz_app -d nordabiz << 'EOF'
SELECT relation::regclass, mode, granted
FROM pg_locks
WHERE NOT granted;
EOF
```

### 10.4 Performance Diagnostics

```bash
# Response time test (10 requests)
for i in {1..10}; do
  curl -w "Request $i: %{time_total}s\n" -o /dev/null -s https://nordabiznes.pl/
done

# Server resource usage
ssh maciejpi@57.128.200.27 "top -b -n 1 | head -20"

# Disk usage
ssh maciejpi@57.128.200.27 "df -h && echo -e '\n=== Top 10 Directories ===\n' && du -sh /* 2>/dev/null | sort -rh | head -10"

# Network connectivity
ping -c 5 nordabiznes.pl
traceroute nordabiznes.pl

# SSL certificate check
echo | openssl s_client -servername nordabiznes.pl -connect nordabiznes.pl:443 2>/dev/null | openssl x509 -noout -dates -subject
```

### 10.5 API Integration Diagnostics

```bash
# Test all external APIs
ssh maciejpi@57.128.200.27

# Gemini API
GEMINI_KEY=$(grep GEMINI_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl -s -H "x-goog-api-key: $GEMINI_KEY" \
  "https://generativelanguage.googleapis.com/v1beta/models" | jq '.models[0].name'

# PageSpeed API
PAGESPEED_KEY=$(grep GOOGLE_PAGESPEED_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl -s "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://nordabiznes.pl&key=$PAGESPEED_KEY" | jq '.lighthouseResult.categories.performance.score'

# Brave Search API
BRAVE_KEY=$(grep BRAVE_SEARCH_API_KEY /var/www/nordabiznes/.env | cut -d= -f2)
curl -s \
  -H "X-Subscription-Token: $BRAVE_KEY" \
  "https://api.search.brave.com/res/v1/web/search?q=test&count=1" | jq '.web.results[0].title'

# KRS API
curl -s "https://api-krs.ms.gov.pl/api/krs/OdpisAktualny/0000878913" | jq '.odpis.dane.dzial1'
```

### 10.6 Git & Deployment Diagnostics

```bash
# Check current deployment version
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git log --oneline -5"

# Check for uncommitted changes
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git status"

# Check remote sync
ssh maciejpi@57.128.200.27 "cd /var/www/nordabiznes && git remote -v && git fetch && git status"

# Verify file permissions
ssh maciejpi@57.128.200.27 "ls -la /var/www/nordabiznes/ | head -10"
```

---

## Appendix: Related Documentation

- **System Architecture:** [01-system-context.md](01-system-context.md)
- **Container Diagram:** [02-container-diagram.md](02-container-diagram.md)
- **Deployment Architecture:** [03-deployment-architecture.md](03-deployment-architecture.md)
- **Network Topology:** [07-network-topology.md](07-network-topology.md)
- **Critical Configurations:** [08-critical-configurations.md](08-critical-configurations.md)
- **Security Architecture:** [09-security-architecture.md](09-security-architecture.md)
- **API Endpoints:** [10-api-endpoints.md](10-api-endpoints.md)
- **HTTP Request Flow:** [flows/06-http-request-flow.md](flows/06-http-request-flow.md)
- **Authentication Flow:** [flows/01-authentication-flow.md](flows/01-authentication-flow.md)
- **Incident Report:** [../../INCIDENT_REPORT_20260102.md](../../INCIDENT_REPORT_20260102.md)

---

**Document Status:** ✅ Complete
**Version:** 1.0
**Last Review:** 2026-01-10

---

## Maintenance Notes

**When to Update This Guide:**

1. After any production incident → Add to relevant section
2. When new features added → Add new troubleshooting scenarios
3. When infrastructure changes → Update diagnostic commands
4. Monthly review → Verify commands still work
5. After major version upgrades → Test all procedures

**Contribution Guidelines:**

- Keep solutions actionable (copy-paste commands when possible)
- Include expected output for diagnostic commands
- Reference related architecture docs
- Test all commands before adding
- Use consistent formatting (bash code blocks)

---

**End of Troubleshooting Guide**