Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Production moved from on-prem VM 249 (10.22.68.249) to OVH VPS (57.128.200.27, inpi-vps-waw01). Updated ALL documentation, slash commands, memory files, architecture docs, and deploy procedures. Added |local_time Jinja filter (UTC→Europe/Warsaw) and converted 155 .strftime() calls across 71 templates so timestamps display in Polish timezone regardless of server timezone. Also includes: created_by_id tracking, abort import fix, ICS calendar fix for missing end times, Pros Poland data cleanup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1047 lines
29 KiB
Markdown
1047 lines
29 KiB
Markdown
# Container Diagram (C4 Level 2)
|
|
|
|
**Document Version:** 1.0
|
|
**Last Updated:** 2026-04-04
|
|
**Status:** Production LIVE
|
|
**Diagram Type:** C4 Model - Level 2 (Containers)
|
|
|
|
---
|
|
|
|
## Overview
|
|
|
|
This diagram shows the **major containers** (applications and data stores) that make up the Norda Biznes Partner system. It illustrates:
|
|
|
|
- **What containers exist** (web app, database, proxy, external services)
|
|
- **How containers communicate** with each other
|
|
- **Technology choices** for each container
|
|
- **Network boundaries** and protocols
|
|
|
|
**Abstraction Level:** Container (C4 Level 2)
|
|
**Audience:** Developers, DevOps, Technical Architects
|
|
**Purpose:** Understanding system decomposition and technology stack
|
|
|
|
---
|
|
|
|
## Container Diagram
|
|
|
|
```mermaid
|
|
graph TB
|
|
%% External actors
|
|
Users["👥 Users<br/>(Members, Visitors, Admins)<br/>Web Browser"]
|
|
Admin["👨💼 System Admin<br/>Web Browser + SSH"]
|
|
|
|
%% System boundary
|
|
subgraph "Norda Biznes Partner System"
|
|
subgraph "OVH VPS [inpi-vps-waw01 | 57.128.200.27]"
|
|
Nginx["🔒 Nginx<br/>(Reverse Proxy)<br/><br/>Technology: Nginx<br/>Port: 443 (HTTPS)<br/><br/>Responsibilities:<br/>- SSL/TLS termination<br/>- Request routing<br/>- HTTP→HTTPS redirect<br/>- Let's Encrypt (certbot)"]
|
|
end
|
|
|
|
subgraph "OVH VPS [inpi-vps-waw01 | 57.128.200.27]"
|
|
WebApp["🌐 Flask Web Application<br/>(Application Server)<br/><br/>Technology: Flask 3.0 + Gunicorn<br/>Language: Python 3.9+<br/>Port: 5000<br/><br/>Responsibilities:<br/>- HTTP request handling<br/>- Business logic<br/>- Template rendering<br/>- API endpoints<br/>- Authentication & authorization<br/>- Session management"]
|
|
|
|
Database["💾 PostgreSQL Database<br/>(Data Store)<br/><br/>Technology: PostgreSQL 14<br/>Port: 5432 (localhost only)<br/><br/>Responsibilities:<br/>- Persistent data storage<br/>- Full-text search (FTS)<br/>- Fuzzy matching (pg_trgm)<br/>- Data integrity & constraints<br/>- 36 tables (companies, users, etc.)"]
|
|
|
|
Scripts["⚙️ Background Scripts<br/>(Batch Jobs)<br/><br/>Technology: Python 3.9+<br/>Execution: Cron / Manual<br/><br/>Responsibilities:<br/>- SEO auditing<br/>- Social media auditing<br/>- Data verification<br/>- Database migrations"]
|
|
end
|
|
|
|
subgraph "Service Layer (within Flask)"
|
|
SearchSvc["🔍 Search Service<br/>search_service.py"]
|
|
ChatSvc["💬 Chat Service<br/>nordabiz_chat.py"]
|
|
EmailSvc["📧 Email Service<br/>email_service.py"]
|
|
GeminiSvc["🤖 Gemini Service<br/>gemini_service.py"]
|
|
KRSSvc["🏛️ KRS Service<br/>krs_api_service.py"]
|
|
GBPSvc["📊 GBP Audit Service<br/>gbp_audit_service.py"]
|
|
ITSvc["🖥️ IT Audit Service<br/>it_audit_service.py"]
|
|
end
|
|
end
|
|
|
|
%% External systems
|
|
subgraph "External APIs & Services"
|
|
Gemini["🤖 Google Gemini API<br/>gemini-3-flash-preview<br/>REST API (HTTPS)"]
|
|
BraveAPI["🔍 Brave Search API<br/>News & Social Discovery<br/>REST API (HTTPS)"]
|
|
PageSpeed["📊 Google PageSpeed API<br/>SEO & Performance<br/>REST API (HTTPS)"]
|
|
Places["📍 Google Places API<br/>Business Profiles<br/>REST API (HTTPS)"]
|
|
KRS["🏛️ KRS Open API<br/>Company Registry<br/>REST API (HTTPS)"]
|
|
MSGraph["📧 Microsoft Graph API<br/>Email Sending<br/>REST API + OAuth 2.0"]
|
|
ALEO["🌐 ALEO.com<br/>NIP Verification<br/>Web Scraping (Playwright)"]
|
|
Rejestr["🔗 rejestr.io<br/>Company Connections<br/>Web Scraping (Playwright)"]
|
|
end
|
|
|
|
%% User interactions
|
|
Users -->|"HTTPS<br/>Port 443"| Nginx
|
|
Admin -->|"HTTPS<br/>Port 443"| Nginx
|
|
Admin -->|"SSH<br/>Port 22"| WebApp
|
|
|
|
%% Nginx routing
|
|
Nginx -->|"HTTP<br/>127.0.0.1:5000"| WebApp
|
|
|
|
%% Web app to database
|
|
WebApp -->|"SQL Queries<br/>SQLAlchemy ORM<br/>localhost:5432"| Database
|
|
Scripts -->|"SQL Queries<br/>Direct connection<br/>127.0.0.1:5432"| Database
|
|
|
|
%% Service layer connections
|
|
WebApp --> SearchSvc
|
|
WebApp --> ChatSvc
|
|
WebApp --> EmailSvc
|
|
WebApp --> GeminiSvc
|
|
WebApp --> KRSSvc
|
|
WebApp --> GBPSvc
|
|
WebApp --> ITSvc
|
|
|
|
SearchSvc --> Database
|
|
ChatSvc --> SearchSvc
|
|
ChatSvc --> GeminiSvc
|
|
|
|
%% External API integrations
|
|
GeminiSvc -->|"HTTPS<br/>API Key Auth"| Gemini
|
|
WebApp -->|"HTTPS<br/>API Key Auth"| BraveAPI
|
|
Scripts -->|"HTTPS<br/>API Key Auth"| PageSpeed
|
|
GBPSvc -->|"HTTPS<br/>API Key Auth"| Places
|
|
KRSSvc -->|"HTTPS<br/>Public API"| KRS
|
|
EmailSvc -->|"HTTPS<br/>OAuth 2.0"| MSGraph
|
|
Scripts -->|"HTTPS<br/>Web Scraping"| ALEO
|
|
Scripts -->|"HTTPS<br/>Web Scraping"| Rejestr
|
|
|
|
%% Styling
|
|
classDef containerStyle fill:#1168bd,stroke:#0b4884,color:#ffffff,stroke-width:3px
|
|
classDef databaseStyle fill:#438dd5,stroke:#2e6295,color:#ffffff,stroke-width:3px
|
|
classDef serviceStyle fill:#85bbf0,stroke:#5d92c7,color:#000000,stroke-width:2px
|
|
classDef proxyStyle fill:#ff6b6b,stroke:#cc5555,color:#ffffff,stroke-width:3px
|
|
classDef externalStyle fill:#999999,stroke:#666666,color:#ffffff,stroke-width:2px
|
|
classDef personStyle fill:#08427b,stroke:#052e56,color:#ffffff,stroke-width:2px
|
|
|
|
class WebApp,Scripts containerStyle
|
|
class Database databaseStyle
|
|
class SearchSvc,ChatSvc,EmailSvc,GeminiSvc,KRSSvc,GBPSvc,ITSvc serviceStyle
|
|
class Nginx proxyStyle
|
|
class Gemini,BraveAPI,PageSpeed,Places,KRS,MSGraph,ALEO,Rejestr externalStyle
|
|
class Users,Admin personStyle
|
|
```
|
|
|
|
---
|
|
|
|
## Container Descriptions
|
|
|
|
### 🔒 Nginx (Reverse Proxy)
|
|
|
|
**Location:** OVH VPS (57.128.200.27, hostname: inpi-vps-waw01)
|
|
**Technology:** Nginx
|
|
**Protocol:** HTTPS (Port 443)
|
|
**Purpose:** SSL termination and reverse proxy
|
|
|
|
**Responsibilities:**
|
|
- Terminate SSL/TLS connections (Let's Encrypt certificates via certbot)
|
|
- Route incoming HTTPS requests to backend Gunicorn on 127.0.0.1:5000
|
|
- Automatically renew SSL certificates
|
|
- Force HTTP→HTTPS redirects
|
|
- Security headers (HSTS, CSP)
|
|
|
|
**Critical Configuration:**
|
|
```
|
|
HTTPS :443 → HTTP 127.0.0.1:5000
|
|
```
|
|
|
|
**Verification:**
|
|
```bash
|
|
curl -I https://nordabiznes.pl/health
|
|
# Expected: HTTP 200 OK
|
|
```
|
|
|
|
**SSL:** Let's Encrypt (certbot auto-renewal)
|
|
|
|
---
|
|
|
|
### 🌐 Flask Web Application
|
|
|
|
**Location:** OVH VPS (57.128.200.27, hostname: inpi-vps-waw01)
|
|
**Technology:** Flask 3.0 + Gunicorn WSGI server
|
|
**Language:** Python 3.9+
|
|
**Protocol:** HTTP (Port 5000 - localhost only, via nginx proxy_pass)
|
|
**Main File:** `/var/www/nordabiznes/app.py`
|
|
|
|
**Responsibilities:**
|
|
- Handle HTTP requests from nginx
|
|
- Render HTML templates (Jinja2)
|
|
- Provide REST API endpoints (90+ routes)
|
|
- User authentication (Flask-Login)
|
|
- Session management (Flask sessions)
|
|
- CSRF protection (Flask-WTF)
|
|
- Rate limiting (Flask-Limiter: 200 req/day, 50 req/hour)
|
|
- Business logic orchestration
|
|
- Database ORM interactions (SQLAlchemy)
|
|
|
|
**Technology Stack:**
|
|
- **Framework:** Flask 3.0
|
|
- **WSGI Server:** Gunicorn (4 workers)
|
|
- **Template Engine:** Jinja2
|
|
- **Forms:** Flask-WTF
|
|
- **Authentication:** Flask-Login
|
|
- **ORM:** SQLAlchemy 2.0
|
|
- **Security:** Flask-Limiter, Flask-SeaSurf (CSRF)
|
|
|
|
**Key Modules:**
|
|
| Module | Lines | Purpose |
|
|
|--------|-------|---------|
|
|
| `app.py` | 13,144 | Main application (routes, auth, API) |
|
|
| `database.py` | ~1,500 | SQLAlchemy models (36 tables) |
|
|
| `search_service.py` | ~400 | Company search (FTS, fuzzy) |
|
|
| `nordabiz_chat.py` | ~800 | AI chat engine |
|
|
| `gemini_service.py` | ~500 | Google Gemini integration |
|
|
| `email_service.py` | ~200 | MS Graph email sender |
|
|
| `krs_api_service.py` | ~300 | KRS company verification |
|
|
| `gbp_audit_service.py` | ~600 | Google Business Profile audit |
|
|
| `it_audit_service.py` | ~500 | IT infrastructure assessment |
|
|
|
|
**Service Management:**
|
|
```bash
|
|
sudo systemctl status nordabiznes
|
|
sudo systemctl restart nordabiznes
|
|
sudo journalctl -u nordabiznes -f
|
|
```
|
|
|
|
**Application User:** `maciejpi`
|
|
|
|
---
|
|
|
|
### 💾 PostgreSQL Database
|
|
|
|
**Location:** OVH VPS (57.128.200.27, hostname: inpi-vps-waw01)
|
|
**Technology:** PostgreSQL 14
|
|
**Protocol:** PostgreSQL wire protocol (Port 5432)
|
|
**Access:** **localhost ONLY** (127.0.0.1)
|
|
|
|
**Responsibilities:**
|
|
- Store all application data (companies, users, chats, etc.)
|
|
- Full-text search (tsvector, tsquery)
|
|
- Fuzzy matching (pg_trgm extension)
|
|
- Enforce data integrity (foreign keys, constraints)
|
|
- Transaction management
|
|
- Backup and recovery
|
|
|
|
**Database Details:**
|
|
- **Database Name:** `nordabiz`
|
|
- **Application User:** `nordabiz_app`
|
|
- **Admin User:** `postgres`
|
|
- **Encoding:** UTF-8
|
|
- **Locale:** en_US.UTF-8
|
|
- **Total Tables:** 36 (via SQLAlchemy)
|
|
|
|
**Key Tables:**
|
|
| Table | Records | Purpose |
|
|
|-------|---------|---------|
|
|
| `companies` | 80 | Member company profiles |
|
|
| `users` | ~50 | Registered users |
|
|
| `chat_sessions` | ~500 | AI chat conversations |
|
|
| `chat_messages` | ~2000 | Chat message history |
|
|
| `seo_metrics` | ~200 | SEO audit results |
|
|
| `company_social_media` | ~115 | Social media profiles |
|
|
| `forum_topics` | ~30 | Forum discussions |
|
|
| `company_news` | ~50 | News articles (moderated) |
|
|
|
|
**PostgreSQL Extensions:**
|
|
- `pg_trgm` - Trigram similarity (fuzzy search)
|
|
- `uuid-ossp` - UUID generation
|
|
- `pgcrypto` - Cryptographic functions
|
|
|
|
**Connection Strings:**
|
|
```python
|
|
# Flask App (from OVH VPS)
|
|
DATABASE_URL = 'postgresql://nordabiz_app:***@127.0.0.1:5432/nordabiz'
|
|
|
|
# Background Scripts (from OVH VPS)
|
|
DATABASE_URL = 'postgresql://nordabiz_app:***@127.0.0.1:5432/nordabiz'
|
|
```
|
|
|
|
**⚠️ IMPORTANT:** PostgreSQL is configured to reject external connections.
|
|
Only connections from `localhost` (127.0.0.1) are allowed.
|
|
|
|
**Development Database:**
|
|
- **Host:** localhost (Mac)
|
|
- **Port:** 5433 (Docker mapped port)
|
|
- **Container:** `nordabiz-postgres`
|
|
- **Access:** `postgresql://nordabiz_app:***@localhost:5433/nordabiz`
|
|
|
|
---
|
|
|
|
### ⚙️ Background Scripts
|
|
|
|
**Location:** OVH VPS (57.128.200.27, hostname: inpi-vps-waw01)
|
|
**Directory:** `/var/www/nordabiznes/scripts/`
|
|
**Technology:** Python 3.9+ (same virtualenv as Flask app)
|
|
**Execution:** Manual via SSH or Cron jobs
|
|
|
|
**Purpose:**
|
|
- Periodic data enrichment (SEO, social media, GBP audits)
|
|
- Database maintenance (migrations, cleanup)
|
|
- External data verification (NIP, KRS)
|
|
- AI quality testing
|
|
|
|
**Key Scripts:**
|
|
| Script | Purpose | API Dependencies |
|
|
|--------|---------|------------------|
|
|
| `seo_audit.py` | Website SEO & performance audit | Google PageSpeed |
|
|
| `social_media_audit.py` | Social profile verification | Brave Search |
|
|
| `gbp_audit.py` | Google Business Profile audit | Google Places |
|
|
| `fetch_company_news.py` | News monitoring (planned) | Brave Search + Gemini |
|
|
| `verify_all_companies_data.py` | Data quality report | - |
|
|
| `fix_krs_verification.py` | KRS data validation | KRS API |
|
|
| `import_*.py` | Company data import | ALEO, rejestr.io |
|
|
| `run_ai_quality_tests.py` | AI chat quality tests | Gemini |
|
|
|
|
**Execution Example:**
|
|
```bash
|
|
cd /var/www/nordabiznes
|
|
/var/www/nordabiznes/venv/bin/python3 scripts/seo_audit.py --company-id 26
|
|
```
|
|
|
|
**Cron Jobs (Planned):**
|
|
```bash
|
|
# SEO audit - every week
|
|
0 2 * * 0 /var/www/nordabiznes/venv/bin/python3 /var/www/nordabiznes/scripts/seo_audit.py --all
|
|
|
|
# News monitoring - every 6 hours
|
|
0 */6 * * * /var/www/nordabiznes/venv/bin/python3 /var/www/nordabiznes/scripts/fetch_company_news.py --all
|
|
```
|
|
|
|
---
|
|
|
|
## Service Layer Components
|
|
|
|
The Flask application uses a **service layer pattern** to encapsulate business logic and external integrations. Services are Python modules imported by `app.py`.
|
|
|
|
### 🔍 Search Service (`search_service.py`)
|
|
|
|
**Purpose:** Unified company search across multiple methods
|
|
**Used By:** `/search` route, AI chat, API endpoints
|
|
**Database Access:** Direct SQL + SQLAlchemy ORM
|
|
|
|
**Search Methods:**
|
|
1. **NIP/REGON Lookup** - Direct identifier match
|
|
2. **Synonym Expansion** - Keyword mapping (e.g., "strony" → www, web, portal)
|
|
3. **PostgreSQL FTS** - Full-text search with `tsvector`
|
|
4. **Fuzzy Matching** - `pg_trgm` similarity (handles typos)
|
|
|
|
**Scoring System:**
|
|
- Company name match: +10 points
|
|
- Description match: +5 points
|
|
- Services match: +8 points
|
|
- Competencies match: +7 points
|
|
- City match: +3 points
|
|
|
|
**API:**
|
|
```python
|
|
from search_service import search_companies
|
|
results = search_companies(db, "strony www", limit=10)
|
|
# Returns: List[SearchResult] with company, score, match_type
|
|
```
|
|
|
|
---
|
|
|
|
### 💬 Chat Service (`nordabiz_chat.py`)
|
|
|
|
**Purpose:** AI-powered chat with company context
|
|
**Dependencies:** Gemini Service, Search Service
|
|
**Database Tables:** `chat_sessions`, `chat_messages`, `gemini_usage`
|
|
|
|
**Workflow:**
|
|
1. User sends message → create/load chat session
|
|
2. Search for relevant companies (max 8 companies)
|
|
3. Build conversation context (last 10 messages + company data)
|
|
4. Send to Gemini API with system prompt
|
|
5. Stream response back to user
|
|
6. Track token usage and costs
|
|
|
|
**Context Limits:**
|
|
- **Companies in context:** 8 (prevents token overflow)
|
|
- **Message history:** 10 recent messages
|
|
- **Max response length:** ~1000 tokens
|
|
|
|
**Cost Tracking:**
|
|
- Input tokens counted per request
|
|
- Output tokens counted per response
|
|
- Stored in `gemini_usage` table
|
|
- Cost calculated: $0.075-$1.25 per 1M tokens
|
|
|
|
---
|
|
|
|
### 🤖 Gemini Service (`gemini_service.py`)
|
|
|
|
**Purpose:** Interface to Google Gemini AI API
|
|
**Models:** `gemini-3-flash-preview` (default), `gemini-3-pro-preview` (advanced)
|
|
**Authentication:** API key from `.env`
|
|
|
|
**Capabilities:**
|
|
- Text generation (chat responses)
|
|
- Image analysis (logo descriptions)
|
|
- Content scoring (news relevance)
|
|
- Streaming responses
|
|
|
|
**API Wrapper Functions:**
|
|
```python
|
|
generate_text(prompt, model="gemini-3-flash-preview")
|
|
generate_chat_response(messages, context, stream=True)
|
|
analyze_image(image_bytes, prompt)
|
|
score_content_relevance(content, company_name)
|
|
```
|
|
|
|
**Error Handling:**
|
|
- Rate limit errors → retry with exponential backoff
|
|
- Invalid API key → log error + return fallback response
|
|
- Timeout → 30s default, configurable
|
|
|
|
---
|
|
|
|
### 📧 Email Service (`email_service.py`)
|
|
|
|
**Purpose:** Send email notifications via Microsoft Graph API
|
|
**Authentication:** OAuth 2.0 client credentials
|
|
**Email Provider:** Office 365 / Outlook
|
|
|
|
**Use Cases:**
|
|
- Email verification during registration
|
|
- Password reset emails
|
|
- Admin notifications
|
|
- User alerts (forum replies, messages)
|
|
|
|
**Configuration:**
|
|
```python
|
|
# .env
|
|
MS_GRAPH_CLIENT_ID=...
|
|
MS_GRAPH_CLIENT_SECRET=...
|
|
MS_GRAPH_TENANT_ID=...
|
|
MS_GRAPH_FROM_EMAIL=noreply@nordabiznes.pl
|
|
```
|
|
|
|
**API:**
|
|
```python
|
|
from email_service import send_email
|
|
send_email(to="user@example.com", subject="...", body="...")
|
|
```
|
|
|
|
---
|
|
|
|
### 🏛️ KRS Service (`krs_api_service.py`)
|
|
|
|
**Purpose:** Verify Polish company data from KRS registry
|
|
**API:** Polish Ministry of Justice KRS Open API
|
|
**Authentication:** Public API (no key required)
|
|
|
|
**Data Retrieved:**
|
|
- Company legal verification (KRS number)
|
|
- Corporate structure
|
|
- Board members
|
|
- Share capital
|
|
- Legal form
|
|
|
|
**API:**
|
|
```python
|
|
from krs_api_service import verify_krs
|
|
data = verify_krs(krs_number="0000123456")
|
|
# Returns: Dict with company details or None
|
|
```
|
|
|
|
---
|
|
|
|
### 📊 GBP Audit Service (`gbp_audit_service.py`)
|
|
|
|
**Purpose:** Audit Google Business Profile completeness
|
|
**Dependencies:** Google Places API, Gemini AI
|
|
**Database Table:** `gbp_audits`
|
|
|
|
**Audit Checks:**
|
|
- Profile completeness (name, address, hours, etc.)
|
|
- Photo quantity and quality
|
|
- Review count and average rating
|
|
- Q&A presence
|
|
- Posts activity
|
|
|
|
**AI Recommendations:**
|
|
- Uses Gemini to generate improvement suggestions
|
|
- Personalized based on profile gaps
|
|
- Actionable advice
|
|
|
|
---
|
|
|
|
### 🖥️ IT Audit Service (`it_audit_service.py`)
|
|
|
|
**Purpose:** Assess IT infrastructure maturity
|
|
**Method:** Questionnaire-based scoring
|
|
**Database Table:** `it_audits`
|
|
|
|
**Assessment Categories:**
|
|
- IT Infrastructure (servers, network, backup)
|
|
- Security (firewall, antivirus, access control)
|
|
- Cloud Services (usage, integration)
|
|
- Digital Tools (CRM, ERP, communication)
|
|
- IT Management (policies, documentation, support)
|
|
|
|
**Scoring:**
|
|
- 0-40: Basic level
|
|
- 41-70: Intermediate level
|
|
- 71-100: Advanced level
|
|
|
|
---
|
|
|
|
## External API Integrations
|
|
|
|
All external APIs are called via HTTPS with appropriate authentication.
|
|
|
|
### 🤖 Google Gemini AI API
|
|
|
|
**Endpoint:** `https://generativelanguage.googleapis.com/v1beta/models/*`
|
|
**Authentication:** API Key (in header: `x-goog-api-key`)
|
|
**Pricing:** $0.075-$1.25 per 1M tokens (varies by model)
|
|
**Rate Limit:** No strict limit (cost-limited)
|
|
|
|
**Used For:**
|
|
- AI chat responses
|
|
- Image analysis
|
|
- Content relevance scoring
|
|
- Recommendation generation
|
|
|
|
---
|
|
|
|
### 🔍 Brave Search API
|
|
|
|
**Endpoint:** `https://api.search.brave.com/res/v1/news/search`
|
|
**Authentication:** API Key (header: `X-Subscription-Token`)
|
|
**Pricing:** Free tier - 2,000 requests/month
|
|
**Rate Limit:** 2,000 req/month
|
|
|
|
**Used For:**
|
|
- News monitoring (company mentions)
|
|
- Social media profile discovery
|
|
- Press release detection
|
|
|
|
---
|
|
|
|
### 📊 Google PageSpeed Insights API
|
|
|
|
**Endpoint:** `https://www.googleapis.com/pagespeedonline/v5/runPagespeed`
|
|
**Authentication:** API Key (query param: `key`)
|
|
**Pricing:** Free - 25,000 queries/day
|
|
**Rate Limit:** 25,000 req/day
|
|
|
|
**Used For:**
|
|
- SEO score calculation
|
|
- Performance metrics (Core Web Vitals)
|
|
- Accessibility auditing
|
|
- Best practices assessment
|
|
|
|
---
|
|
|
|
### 📍 Google Places API
|
|
|
|
**Endpoint:** `https://maps.googleapis.com/maps/api/place/*`
|
|
**Authentication:** API Key
|
|
**Pricing:** Pay-per-use (varies by request type)
|
|
**Rate Limit:** Quota-based
|
|
|
|
**Used For:**
|
|
- Business reviews and ratings
|
|
- Business hours and location
|
|
- Photo gallery
|
|
- Google Maps integration
|
|
|
|
---
|
|
|
|
### 🏛️ KRS Open API
|
|
|
|
**Endpoint:** `https://api-krs.ms.gov.pl/`
|
|
**Authentication:** None (public API)
|
|
**Pricing:** Free
|
|
**Rate Limit:** Not strictly enforced
|
|
|
|
**Used For:**
|
|
- Company legal verification
|
|
- Corporate structure data
|
|
- Board member information
|
|
- Share capital details
|
|
|
|
---
|
|
|
|
### 📧 Microsoft Graph API
|
|
|
|
**Endpoint:** `https://graph.microsoft.com/v1.0/`
|
|
**Authentication:** OAuth 2.0 (client credentials)
|
|
**Pricing:** Included with Office 365
|
|
**Rate Limit:** Throttling based on tenant
|
|
|
|
**Used For:**
|
|
- Email notifications
|
|
- User mailbox access (delegated)
|
|
|
|
---
|
|
|
|
### 🌐 ALEO.com
|
|
|
|
**Method:** Web scraping (Playwright)
|
|
**Authentication:** None
|
|
**Rate Limit:** Self-imposed (polite scraping)
|
|
**Reliability:** Medium (subject to website changes)
|
|
|
|
**Used For:**
|
|
- NIP number validation
|
|
- Company legal status check
|
|
- Basic company information
|
|
|
|
---
|
|
|
|
### 🔗 rejestr.io
|
|
|
|
**Method:** Web scraping (Playwright)
|
|
**Authentication:** None
|
|
**Rate Limit:** Self-imposed
|
|
**Reliability:** Medium
|
|
|
|
**Used For:**
|
|
- Board member identification
|
|
- Shareholder/ownership structure
|
|
- Cross-company relationships
|
|
- Beneficial owners
|
|
|
|
---
|
|
|
|
## Network Flow and Protocols
|
|
|
|
### 1. User HTTP Request Flow
|
|
|
|
```
|
|
User Browser (HTTPS :443)
|
|
│
|
|
▼
|
|
Nginx @ 57.128.200.27:443 (SSL termination)
|
|
│
|
|
├─ Decrypt HTTPS
|
|
├─ Verify SSL certificate
|
|
└─ Forward as HTTP
|
|
│
|
|
▼
|
|
Flask App @ 127.0.0.1:5000 (HTTP)
|
|
│
|
|
├─ Authenticate user (session cookie)
|
|
├─ Authorize request (permissions)
|
|
├─ Execute business logic
|
|
└─ Query database
|
|
│
|
|
▼
|
|
PostgreSQL @ 127.0.0.1:5432 (local only)
|
|
│
|
|
├─ Execute SQL query
|
|
├─ Return result set
|
|
└─ Commit transaction
|
|
│
|
|
▼
|
|
Flask App (render template / JSON)
|
|
│
|
|
▼
|
|
Nginx (encrypt response)
|
|
│
|
|
▼
|
|
User Browser (HTTPS response)
|
|
```
|
|
|
|
### 2. AI Chat Request Flow
|
|
|
|
```
|
|
User → Flask /api/chat/<id>/message (POST)
|
|
│
|
|
▼
|
|
Chat Service (nordabiz_chat.py)
|
|
│
|
|
├─ Load chat session from DB
|
|
├─ Extract user query
|
|
└─ Search for relevant companies
|
|
│
|
|
▼
|
|
Search Service (search_companies)
|
|
│
|
|
└─ PostgreSQL FTS + fuzzy search
|
|
│
|
|
▼
|
|
Chat Service (build context)
|
|
│
|
|
├─ Last 10 messages
|
|
├─ Max 8 company profiles
|
|
└─ System prompt
|
|
│
|
|
▼
|
|
Gemini Service → Google Gemini API (HTTPS)
|
|
│
|
|
├─ Send context + query
|
|
├─ Receive streaming response
|
|
└─ Track token usage
|
|
│
|
|
▼
|
|
Chat Service (save to DB)
|
|
│
|
|
├─ Store user message
|
|
├─ Store AI response
|
|
└─ Update gemini_usage
|
|
│
|
|
▼
|
|
Flask → User (streaming JSON)
|
|
```
|
|
|
|
### 3. SEO Audit Flow
|
|
|
|
```
|
|
Admin → /admin/seo (trigger audit)
|
|
│
|
|
▼
|
|
Background Script: scripts/seo_audit.py
|
|
│
|
|
├─ Get company URL from DB
|
|
└─ Call PageSpeed API
|
|
│
|
|
▼
|
|
Google PageSpeed API @ https://www.googleapis.com/...
|
|
│
|
|
├─ Analyze website
|
|
├─ Calculate scores (SEO, Performance, etc.)
|
|
└─ Return audit data (JSON)
|
|
│
|
|
▼
|
|
Script (parse results)
|
|
│
|
|
└─ Store in seo_metrics table
|
|
│
|
|
▼
|
|
PostgreSQL (save audit)
|
|
│
|
|
▼
|
|
Admin Dashboard (display results)
|
|
```
|
|
|
|
---
|
|
|
|
## Security Boundaries
|
|
|
|
### Network Segmentation
|
|
|
|
| Zone | Components | Access Level | Trust Level |
|
|
|------|------------|--------------|-------------|
|
|
| **Public Internet** | User browsers | Untrusted | Low |
|
|
| **Proxy (nginx)** | Nginx on OVH VPS (57.128.200.27:443) | Semi-trusted | Medium |
|
|
| **App Zone** | Flask App (127.0.0.1:5000) | Trusted | High |
|
|
| **Data Zone** | PostgreSQL (127.0.0.1:5432) | Highly trusted | Critical |
|
|
| **External APIs** | Gemini, Brave, etc. | Untrusted | Low |
|
|
|
|
### Authentication & Authorization
|
|
|
|
**User Authentication:**
|
|
- Method: Email/password (bcrypt hashed)
|
|
- Session: Flask-Login (server-side sessions)
|
|
- Cookie: `session` (encrypted, HttpOnly, SameSite=Lax)
|
|
|
|
**Admin Authorization:**
|
|
- Check: `current_user.is_authenticated and current_user.is_admin`
|
|
- Decorator: `@login_required` + manual `is_admin` check
|
|
|
|
**API Authentication:**
|
|
- External APIs: API keys in `.env` (never committed)
|
|
- MS Graph: OAuth 2.0 client credentials
|
|
|
|
### HTTPS/TLS
|
|
|
|
- **Certificate:** Let's Encrypt (auto-renewal via certbot on OVH VPS)
|
|
- **Protocols:** TLS 1.2, TLS 1.3
|
|
- **Cipher Suites:** Modern (AES-GCM, ChaCha20-Poly1305)
|
|
- **HSTS:** Enabled (max-age=31536000)
|
|
|
|
### Database Security
|
|
|
|
- **Access:** Localhost only (no external connections)
|
|
- **Authentication:** Password-based (strong passwords)
|
|
- **Encryption:** At-rest encryption (OS-level)
|
|
- **Backups:** pg_dump + offsite copy
|
|
|
|
---
|
|
|
|
## Deployment Configuration
|
|
|
|
### Production Environment
|
|
|
|
**Nginx (OVH VPS):**
|
|
```nginx
|
|
# /etc/nginx/sites-available/nordabiznes.pl
|
|
server {
|
|
listen 443 ssl http2;
|
|
server_name nordabiznes.pl www.nordabiznes.pl;
|
|
|
|
ssl_certificate /etc/letsencrypt/live/nordabiznes.pl/fullchain.pem;
|
|
ssl_certificate_key /etc/letsencrypt/live/nordabiznes.pl/privkey.pem;
|
|
|
|
location / {
|
|
proxy_pass http://127.0.0.1:5000;
|
|
proxy_set_header Host $host;
|
|
proxy_set_header X-Real-IP $remote_addr;
|
|
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
|
|
proxy_set_header X-Forwarded-Proto $scheme;
|
|
}
|
|
}
|
|
```
|
|
|
|
**Flask/Gunicorn (OVH VPS):**
|
|
```ini
|
|
# /etc/systemd/system/nordabiznes.service
|
|
[Unit]
|
|
Description=Norda Biznes Partner - Flask Application
|
|
After=network.target postgresql.service
|
|
|
|
[Service]
|
|
Type=notify
|
|
User=maciejpi
|
|
Group=maciejpi
|
|
WorkingDirectory=/var/www/nordabiznes
|
|
Environment="PATH=/var/www/nordabiznes/venv/bin"
|
|
ExecStart=/var/www/nordabiznes/venv/bin/gunicorn \
|
|
--workers 4 \
|
|
--bind 127.0.0.1:5000 \
|
|
--timeout 120 \
|
|
--access-logfile /var/log/nordabiznes/access.log \
|
|
--error-logfile /var/log/nordabiznes/error.log \
|
|
app:app
|
|
|
|
[Install]
|
|
WantedBy=multi-user.target
|
|
```
|
|
|
|
**PostgreSQL (OVH VPS):**
|
|
```conf
|
|
# /etc/postgresql/14/main/postgresql.conf
|
|
listen_addresses = 'localhost' # ONLY localhost!
|
|
port = 5432
|
|
max_connections = 100
|
|
shared_buffers = 256MB
|
|
|
|
# /etc/postgresql/14/main/pg_hba.conf
|
|
local nordabiz nordabiz_app md5
|
|
host nordabiz nordabiz_app 127.0.0.1/32 md5
|
|
```
|
|
|
|
---
|
|
|
|
## Development Environment
|
|
|
|
### Local Development (Mac)
|
|
|
|
**Flask Application:**
|
|
```bash
|
|
# Run locally on port 5000 or 5001
|
|
python3 app.py
|
|
# Access: http://localhost:5000
|
|
```
|
|
|
|
**PostgreSQL (Docker):**
|
|
```bash
|
|
# Start PostgreSQL container
|
|
docker compose up -d
|
|
|
|
# Connection string
|
|
postgresql://nordabiz_app:***@localhost:5433/nordabiz
|
|
```
|
|
|
|
**Environment Variables:**
|
|
```bash
|
|
# .env file (NEVER commit!)
|
|
DATABASE_URL=postgresql://nordabiz_app:***@localhost:5433/nordabiz
|
|
SECRET_KEY=...
|
|
GOOGLE_API_KEY=...
|
|
BRAVE_SEARCH_API_KEY=...
|
|
GOOGLE_PAGESPEED_API_KEY=...
|
|
```
|
|
|
|
---
|
|
|
|
## Monitoring and Diagnostics
|
|
|
|
### Health Check Endpoint
|
|
|
|
```bash
|
|
# Production
|
|
curl -I https://nordabiznes.pl/health
|
|
# Expected: HTTP 200 OK
|
|
|
|
# Development
|
|
curl -I http://localhost:5000/health
|
|
# Expected: HTTP 200 OK
|
|
```
|
|
|
|
### Logs
|
|
|
|
**Flask Application:**
|
|
```bash
|
|
# Systemd journal
|
|
sudo journalctl -u nordabiznes -f
|
|
|
|
# Application logs
|
|
tail -f /var/log/nordabiznes/access.log
|
|
tail -f /var/log/nordabiznes/error.log
|
|
```
|
|
|
|
**Nginx (OVH VPS):**
|
|
```bash
|
|
# Nginx access logs
|
|
tail -f /var/log/nginx/access.log
|
|
|
|
# Nginx error logs
|
|
tail -f /var/log/nginx/error.log
|
|
```
|
|
|
|
**PostgreSQL:**
|
|
```bash
|
|
# PostgreSQL logs
|
|
sudo tail -f /var/log/postgresql/postgresql-14-main.log
|
|
```
|
|
|
|
### Database Diagnostics
|
|
|
|
```bash
|
|
# Connect to database
|
|
sudo -u postgres psql nordabiz
|
|
|
|
# Check active connections
|
|
SELECT count(*) FROM pg_stat_activity WHERE datname = 'nordabiz';
|
|
|
|
# Check table sizes
|
|
SELECT schemaname, tablename, pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename))
|
|
FROM pg_tables WHERE schemaname = 'public' ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
|
|
```
|
|
|
|
---
|
|
|
|
## Critical Configuration Warnings
|
|
|
|
### ⚠️ Nginx Proxy Configuration
|
|
|
|
**Production** uses nginx on OVH VPS (57.128.200.27) as reverse proxy to Gunicorn on 127.0.0.1:5000.
|
|
|
|
**Correct Configuration:**
|
|
```
|
|
Nginx (57.128.200.27:443) → Gunicorn (127.0.0.1:5000) ✓
|
|
```
|
|
|
|
**Verification After Changes:**
|
|
```bash
|
|
curl -I https://nordabiznes.pl/health
|
|
# Must return: HTTP 200 OK
|
|
```
|
|
|
|
**Historical Note:** The old on-prem setup used NPM on 10.22.68.250 forwarding to 10.22.68.249:5000. See `docs/INCIDENT_REPORT_20260102.md` for historical port misconfiguration incident.
|
|
|
|
---
|
|
|
|
### ⚠️ Database Connection Security
|
|
|
|
**PostgreSQL MUST reject external connections:**
|
|
|
|
```conf
|
|
# postgresql.conf
|
|
listen_addresses = 'localhost' # NOT '*' or '0.0.0.0'!
|
|
```
|
|
|
|
**Scripts MUST use localhost (127.0.0.1):**
|
|
|
|
```python
|
|
# CORRECT (from OVH VPS)
|
|
DATABASE_URL = 'postgresql://nordabiz_app:***@127.0.0.1:5432/nordabiz'
|
|
|
|
# INCORRECT (external connection attempt)
|
|
DATABASE_URL = 'postgresql://nordabiz_app:***@57.128.200.27:5432/nordabiz'
|
|
```
|
|
|
|
---
|
|
|
|
### ⚠️ API Keys and Secrets
|
|
|
|
**NEVER commit to git:**
|
|
- `.env` file (must be in `.gitignore`)
|
|
- API keys in code
|
|
- Database passwords
|
|
- OAuth client secrets
|
|
|
|
**Storage:**
|
|
- Production: `.env` file on OVH VPS (57.128.200.27)
|
|
- Development: `.env` file on local machine
|
|
- Backup: Secure password manager (not in git!)
|
|
|
|
---
|
|
|
|
## Related Documentation
|
|
|
|
### Next Level (Deeper Dive)
|
|
|
|
- **[Flask Application Components](./04-flask-components.md)** - Internal Flask structure (routes, services, models)
|
|
- **[Database Schema](./05-database-schema.md)** - Table relationships and data model
|
|
- **[External Integrations](./06-external-integrations.md)** - Detailed API documentation
|
|
|
|
### Same Level (Alternative Views)
|
|
|
|
- **[Deployment Architecture](./03-deployment-architecture.md)** - Infrastructure view (servers, IPs, ports)
|
|
- **[Network Topology](./07-network-topology.md)** - Network diagram with Fortigate, VLANs
|
|
|
|
### Higher Level (Context)
|
|
|
|
- **[System Context Diagram](./01-system-context.md)** - Overall system and external actors
|
|
|
|
### Data Flows
|
|
|
|
- **[Authentication Flow](./flows/01-authentication-flow.md)** - User login sequence
|
|
- **[Search Flow](./flows/02-search-flow.md)** - Company search process
|
|
- **[AI Chat Flow](./flows/03-ai-chat-flow.md)** - AI conversation handling
|
|
- **[SEO Audit Flow](./flows/04-seo-audit-flow.md)** - Audit execution
|
|
- **[HTTP Request Flow](./flows/06-http-request-flow.md)** - NPM → Flask → DB
|
|
|
|
---
|
|
|
|
## Maintenance Notes
|
|
|
|
### When to Update This Diagram
|
|
|
|
**✏️ UPDATE when:**
|
|
- New container added (e.g., Redis cache, Celery worker)
|
|
- Container technology changed (e.g., Flask → FastAPI)
|
|
- New external API integrated
|
|
- Database technology changed (e.g., PostgreSQL → MySQL)
|
|
- Service layer significantly refactored
|
|
|
|
**❌ DON'T UPDATE for:**
|
|
- New Flask routes (component-level change)
|
|
- New database tables (schema-level change)
|
|
- Code refactoring within existing containers
|
|
- Infrastructure changes (deployment diagram)
|
|
|
|
### Review Frequency
|
|
|
|
- **After major releases:** Review container boundaries
|
|
- **Quarterly:** Verify technology versions
|
|
- **When onboarding:** Ensure diagram matches reality
|
|
|
|
---
|
|
|
|
## Glossary
|
|
|
|
| Term | Definition |
|
|
|------|------------|
|
|
| **Container** | Separately deployable/runnable unit (app, database, service) |
|
|
| **NPM** | Nginx Proxy Manager - reverse proxy with SSL termination |
|
|
| **Gunicorn** | Python WSGI HTTP server for running Flask apps |
|
|
| **WSGI** | Web Server Gateway Interface - Python web server standard |
|
|
| **ORM** | Object-Relational Mapping (SQLAlchemy) |
|
|
| **FTS** | Full-Text Search (PostgreSQL feature) |
|
|
| **pg_trgm** | PostgreSQL trigram extension for fuzzy search |
|
|
| **Service Layer** | Business logic abstraction (services/*.py) |
|
|
| **Flask-Login** | Flask extension for user session management |
|
|
| **bcrypt** | Password hashing algorithm |
|
|
| **HSTS** | HTTP Strict Transport Security |
|
|
| **DMZ** | Demilitarized Zone (network security boundary) |
|
|
|
|
---
|
|
|
|
**Document Status:** ✅ Complete
|
|
**Diagram Validated:** 2026-04-04
|
|
**Mermaid Syntax:** v10.6+
|
|
**Renders in:** GitHub, GitLab, VS Code (with Mermaid extension)
|
|
**Production Verified:** 2026-04-04 (OVH VPS migration)
|