nordabiz/docs/architecture/05-database-schema.md
Maciej Pienczyn 110d971dca


# Database Schema Diagram (Entity Relationship Diagram)
**Document Version:** 1.0
**Last Updated:** 2026-01-10
**Status:** Production LIVE
**Diagram Type:** Entity Relationship Diagram (ERD)
---
## Overview
This diagram shows the **complete database schema** for the Norda Biznes Partner application. It illustrates:
- **All 36 database entities** (tables) organized into functional domains
- **Relationships** between entities with proper cardinality
- **Key constraints** (primary keys, foreign keys, unique constraints)
- **Data organization** patterns and domain boundaries
**Database Technology:** PostgreSQL 14+
**ORM:** SQLAlchemy 2.0
**Total Tables:** 36
**Total Relationships:** 60+ foreign key relationships
**Special Features:** Full-Text Search (FTS), JSONB, ARRAY types, fuzzy matching
**Abstraction Level:** Data Model (ERD)
**Audience:** Database Administrators, Backend Developers, System Architects
**Purpose:** Understanding data structure, relationships, and database design patterns
---
## Database Architecture Overview
### Functional Domains
| Domain | Tables | Purpose |
|--------|--------|---------|
| **User Management** | 1 | User accounts, authentication, authorization |
| **Company Directory** | 10 | Company data, services, competencies, certifications |
| **Digital Maturity** | 5 | Digital maturity scoring, website analysis, quality tracking |
| **AI Chat** | 4 | Chat conversations, messages, feedback, cost tracking |
| **Forum** | 2 | Community forum topics and replies |
| **Calendar/Events** | 2 | Norda Biznes events and RSVPs |
| **Private Messages** | 1 | Peer-to-peer messaging |
| **B2B Classifieds** | 1 | Company listings (Szukam/Oferuję) |
| **Social & Contact** | 3 | Social media profiles, contact info, recommendations |
| **Auditing Systems** | 3 | GBP audits, IT audits, collaboration matching |
| **Membership Fees** | 2 | Payment tracking, fee configuration |
| **Notifications** | 1 | In-app notifications |
---
## Complete Entity Relationship Diagram
```mermaid
erDiagram
%% ============================================================
%% CORE DOMAIN - User Management & Company Directory
%% ============================================================
users {
int id PK
string email UK "UNIQUE, indexed"
string password_hash "NOT NULL"
string name
int company_id FK "nullable"
string company_nip
boolean is_active "default: true"
boolean is_verified "default: false"
boolean is_admin "default: false"
boolean is_norda_member "default: false"
timestamp created_at
timestamp last_login
string verification_token
string reset_token
}
companies {
int id PK
string name "NOT NULL"
string legal_name
string slug UK "UNIQUE, indexed"
int category_id FK
string nip UK "UNIQUE, 10 digits"
string regon "9 or 14 digits"
string krs "10 digits"
string website
string email
string phone
string status "active/inactive"
string data_quality "basic/enhanced/complete"
int digital_maturity_score "0-100"
boolean ai_enabled
array ai_tools_used "PostgreSQL ARRAY"
timestamp created_at
}
categories {
int id PK
string name UK "UNIQUE"
string slug UK "UNIQUE"
text description
string icon
int sort_order
}
services {
int id PK
string name UK "UNIQUE"
string slug UK "UNIQUE"
text description
}
competencies {
int id PK
string name UK "UNIQUE"
string slug UK "UNIQUE"
string category
text description
}
company_services {
int company_id PK,FK
int service_id PK,FK
boolean is_primary "default: false"
timestamp added_at
}
company_competencies {
int company_id PK,FK
int competency_id PK,FK
string level "proficiency level"
timestamp added_at
}
certifications {
int id PK
int company_id FK
string name "NOT NULL"
string issuer
string certificate_number
date issue_date
date expiry_date
boolean is_active
}
awards {
int id PK
int company_id FK
string name "NOT NULL"
string issuer
int year
text description
}
company_events {
int id PK
int company_id FK
string event_type "news_mention, press_release, etc"
string title "NOT NULL"
text description
date event_date
string source_url
}
%% ============================================================
%% DIGITAL MATURITY DOMAIN
%% ============================================================
company_digital_maturity {
int id PK
int company_id UK "UNIQUE"
int overall_score "0-100"
int online_presence_score "0-100"
int social_media_score "0-100"
int it_infrastructure_score "0-100"
int business_applications_score "0-100"
int backup_disaster_recovery_score "0-100"
int cybersecurity_score "0-100"
int ai_readiness_score "0-100"
int digital_marketing_score "0-100"
array critical_gaps
string improvement_priority "critical/high/medium/low"
numeric estimated_investment_needed
int rank_in_category
int rank_overall
int percentile
numeric total_opportunity_value
string sales_readiness "hot/warm/cold/not_ready"
}
company_website_analysis {
int id PK
int company_id FK
string website_url
int http_status_code
int load_time_ms
boolean has_ssl
timestamp ssl_expires_at
boolean is_responsive
string cms_detected
numeric google_rating "0.0-5.0"
int google_reviews_count
string google_place_id
int content_richness_score "1-10"
int pagespeed_seo_score "0-100"
int pagespeed_performance_score "0-100"
int pagespeed_accessibility_score "0-100"
int pagespeed_best_practices_score "0-100"
jsonb pagespeed_audits
boolean has_google_analytics
int opportunity_score "0-100"
numeric estimated_project_value
timestamp analyzed_at
}
maturity_assessments {
int id PK
int company_id FK
int assessed_by_user_id FK
timestamp assessed_at
string assessment_type "full/quick/self_reported/audit"
int overall_score
int online_presence_score
int social_media_score
int it_infrastructure_score
int score_change "change since last"
array areas_improved
array areas_declined
text notes
}
company_quality_tracking {
int id PK
int company_id UK "UNIQUE"
int verification_count
timestamp last_verified_at
string verified_by
text verification_notes
int quality_score "0-100"
int issues_found
int issues_fixed
}
company_website_content {
int id PK
int company_id FK
timestamp scraped_at
string url
int http_status
text raw_html
text raw_text
string page_title
text meta_description
text main_content
array email_addresses
array phone_numbers
jsonb social_media
int word_count
}
company_ai_insights {
int id PK
int company_id UK "UNIQUE"
int content_id FK
text business_summary
array services_list
text target_market
array unique_selling_points
array company_values
array certifications
string suggested_category
numeric category_confidence "0.00-1.00"
array industry_tags
numeric ai_confidence_score
int processing_time_ms
}
%% ============================================================
%% AI CHAT DOMAIN
%% ============================================================
ai_chat_conversations {
int id PK
int user_id FK "indexed"
string title
string conversation_type "general/search"
timestamp started_at
timestamp updated_at
boolean is_active
int message_count
string model_name
}
ai_chat_messages {
int id PK
int conversation_id FK "indexed"
timestamp created_at
string role "user/assistant"
text content "NOT NULL"
int tokens_input
int tokens_output
numeric cost_usd
int latency_ms
boolean edited
boolean regenerated
int feedback_rating "1=down, 2=up"
text feedback_comment
int companies_mentioned
string query_intent
}
ai_chat_feedback {
int id PK
int message_id UK "UNIQUE"
int user_id FK
int rating "1-5 stars"
boolean is_helpful
boolean is_accurate
boolean found_company
text comment
text suggested_answer
text original_query
text expected_companies
timestamp created_at
}
ai_api_costs {
int id PK
timestamp timestamp "indexed"
string api_provider "gemini/brave/etc"
string model_name
string feature "ai_chat/general"
int user_id FK "indexed"
int input_tokens
int output_tokens
int total_tokens
numeric input_cost "USD"
numeric output_cost "USD"
numeric total_cost "USD"
boolean success
text error_message
int latency_ms
string prompt_hash "SHA256"
}
%% ============================================================
%% COMMUNITY FEATURES DOMAIN
%% ============================================================
forum_topics {
int id PK
string title "NOT NULL"
text content "NOT NULL"
int author_id FK
boolean is_pinned
boolean is_locked
int views_count
timestamp created_at
timestamp updated_at
}
forum_replies {
int id PK
int topic_id FK
int author_id FK
text content "NOT NULL"
timestamp created_at
timestamp updated_at
}
norda_events {
int id PK
string title "NOT NULL"
text description
string event_type "meeting/webinar/networking"
date event_date "NOT NULL"
time time_start
time time_end
string location
string location_url
string speaker_name
int speaker_company_id FK
boolean is_featured
int max_attendees
int created_by FK
timestamp created_at
}
event_attendees {
int id PK
int event_id FK
int user_id FK
string status "confirmed/maybe/declined"
timestamp registered_at
}
private_messages {
int id PK
int sender_id FK
int recipient_id FK
string subject
text content "NOT NULL"
boolean is_read
timestamp read_at
int parent_id FK "thread support"
timestamp created_at
}
classifieds {
int id PK
int author_id FK
int company_id FK
string listing_type "szukam/oferuje"
string category "uslugi/produkty/wspolpraca"
string title "NOT NULL"
text description "NOT NULL"
string budget_info
string location_info
boolean is_active
timestamp expires_at
int views_count
timestamp created_at
}
%% ============================================================
%% SOCIAL & CONTACT DOMAIN
%% ============================================================
company_contacts {
int id PK
int company_id FK
string contact_type "phone/email/fax/mobile"
string value
string purpose "Biuro/Sprzedaż"
boolean is_primary
string source "website/krs/google"
string source_url
date source_date
boolean is_verified
timestamp verified_at
string verified_by
}
company_social_media {
int id PK
int company_id FK "indexed"
string platform "facebook/linkedin/instagram"
string url
timestamp verified_at "indexed"
string source "website_scrape/brave_search"
boolean is_valid
timestamp last_checked_at
string check_status "ok/404/redirect"
string page_name
int followers_count
timestamp created_at
}
company_recommendations {
int id PK
int company_id FK "indexed"
int user_id FK "indexed"
text recommendation_text
string service_category
boolean show_contact
string status "pending/approved/rejected"
int moderated_by FK
timestamp moderated_at
text rejection_reason
timestamp created_at
}
%% ============================================================
%% AUDITING SYSTEMS DOMAIN
%% ============================================================
gbp_audits {
int id PK
int company_id FK "indexed"
timestamp audit_date "indexed"
int completeness_score "0-100"
jsonb fields_status
jsonb recommendations
boolean has_name
boolean has_address
boolean has_phone
boolean has_website
boolean has_hours
boolean has_categories
boolean has_photos
boolean has_description
boolean has_services
boolean has_reviews
int photo_count
boolean logo_present
boolean cover_photo_present
int review_count
numeric average_rating "0.0-5.0"
string google_place_id
string google_maps_url
string audit_source "manual/automated/api"
string audit_version
text audit_errors
}
it_audits {
int id PK
int company_id FK "indexed"
timestamp audit_date "indexed"
string audit_source "form/api_sync"
int audited_by FK
int overall_score "0-100"
int completeness_score "0-100"
int security_score "0-100"
int collaboration_score "0-100"
string maturity_level "basic/developing/established/advanced"
boolean has_it_manager
boolean it_outsourced
string it_provider_name
boolean has_azure_ad
string azure_tenant_name
string azure_user_count "1-10, 11-50, etc"
boolean has_m365
array m365_plans
array teams_usage
boolean has_google_workspace
string server_count "0, 1-3, 4-10, 10+"
array server_types "physical/vm_onprem/cloud_iaas"
string virtualization_platform "vmware/hyperv/proxmox"
array server_os
string network_firewall_brand
string employee_count
string computer_count
array desktop_os
boolean has_mdm
string antivirus_solution
boolean has_edr
boolean has_vpn
boolean has_mfa
array mfa_scope
string backup_solution
array backup_targets
string backup_frequency
boolean has_proxmox_pbs
boolean has_dr_plan
string monitoring_solution
jsonb zabbix_integration
boolean open_to_shared_licensing
boolean open_to_backup_replication
boolean open_to_teams_federation
boolean open_to_shared_monitoring
boolean open_to_collective_purchasing
boolean open_to_knowledge_sharing
jsonb form_data
jsonb recommendations
text audit_errors
}
it_collaboration_matches {
int id PK
int company_a_id FK "indexed"
int company_b_id FK "indexed"
string match_type "indexed: shared_licensing, etc"
text match_reason
int match_score "0-100"
string status "suggested/contacted/in_progress"
jsonb shared_attributes
timestamp created_at
}
%% ============================================================
%% MEMBERSHIP MANAGEMENT DOMAIN
%% ============================================================
membership_fees {
int id PK
int company_id FK "indexed"
int fee_year "e.g., 2026"
int fee_month "1-12"
numeric amount "PLN"
numeric amount_paid "PLN"
string status "pending/paid/partial/overdue"
date payment_date
string payment_method "transfer/cash/card"
string payment_reference
int recorded_by FK
timestamp recorded_at
text notes
}
membership_fee_config {
int id PK
string scope "global/category/company"
int category_id FK "nullable"
int company_id FK "nullable"
numeric monthly_amount "PLN"
date valid_from
date valid_until "NULL = active"
int created_by FK
text notes
timestamp created_at
}
%% ============================================================
%% NOTIFICATIONS DOMAIN
%% ============================================================
user_notifications {
int id PK
int user_id FK "indexed"
string title
text message
string notification_type "news/system/message/event"
string related_type "company_news/event/message"
int related_id
boolean is_read "indexed"
timestamp read_at
string action_url
timestamp created_at "indexed"
}
%% ============================================================
%% RELATIONSHIPS - Core Domain
%% ============================================================
companies |o--o{ users : "has_users (users.company_id)"
companies }o--|| categories : "belongs_to (category_id)"
companies ||--o{ company_services : "has"
services ||--o{ company_services : "offered_by"
companies ||--o{ company_competencies : "has"
competencies ||--o{ company_competencies : "possessed_by"
companies ||--o{ certifications : "holds"
companies ||--o{ awards : "received"
companies ||--o{ company_events : "has_events"
%% ============================================================
%% RELATIONSHIPS - Digital Maturity Domain
%% ============================================================
companies ||--o| company_digital_maturity : "has_maturity (1:1)"
companies ||--o{ company_website_analysis : "analyzed"
companies ||--o{ maturity_assessments : "assessed"
companies ||--o| company_quality_tracking : "tracked (1:1)"
companies ||--o{ company_website_content : "scraped"
companies ||--o| company_ai_insights : "analyzed_by_ai (1:1)"
maturity_assessments }o--|| users : "assessed_by"
company_ai_insights }o--o| company_website_content : "based_on"
%% ============================================================
%% RELATIONSHIPS - AI Chat Domain
%% ============================================================
users ||--o{ ai_chat_conversations : "owns"
ai_chat_conversations ||--o{ ai_chat_messages : "contains"
ai_chat_messages ||--o| ai_chat_feedback : "has_feedback (1:1)"
ai_chat_feedback }o--|| users : "submitted_by"
ai_api_costs }o--o| users : "attributed_to"
%% ============================================================
%% RELATIONSHIPS - Community Features Domain
%% ============================================================
users ||--o{ forum_topics : "created (author_id)"
users ||--o{ forum_replies : "created (author_id)"
forum_topics ||--o{ forum_replies : "has_replies"
users ||--o{ norda_events : "created (created_by)"
companies ||--o{ norda_events : "speaker_company"
norda_events ||--o{ event_attendees : "has_attendees"
users ||--o{ event_attendees : "registered (user_id)"
users ||--o{ private_messages : "sent (sender_id)"
users ||--o{ private_messages : "received (recipient_id)"
private_messages ||--o{ private_messages : "thread (parent_id)"
users ||--o{ classifieds : "posted (author_id)"
companies ||--o{ classifieds : "related_to"
%% ============================================================
%% RELATIONSHIPS - Social & Contact Domain
%% ============================================================
companies ||--o{ company_contacts : "has_contacts"
companies ||--o{ company_social_media : "has_profiles"
companies ||--o{ company_recommendations : "recommended"
users ||--o{ company_recommendations : "recommender"
users ||--o{ company_recommendations : "moderator"
%% ============================================================
%% RELATIONSHIPS - Auditing Systems Domain
%% ============================================================
companies ||--o{ gbp_audits : "audited_gbp"
companies ||--o{ it_audits : "audited_it"
users ||--o{ it_audits : "auditor"
companies ||--o{ it_collaboration_matches : "match_company_a"
companies ||--o{ it_collaboration_matches : "match_company_b"
%% ============================================================
%% RELATIONSHIPS - Membership Management Domain
%% ============================================================
companies ||--o{ membership_fees : "pays_fees"
users ||--o{ membership_fees : "recorded_by"
categories ||--o{ membership_fee_config : "category_fees"
companies ||--o{ membership_fee_config : "company_fees"
users ||--o{ membership_fee_config : "configured_by"
%% ============================================================
%% RELATIONSHIPS - Notifications Domain
%% ============================================================
users ||--o{ user_notifications : "receives"
```
---
## Key Relationships Explained
### One-to-Many Relationships (45+ total)
| Relationship | Cardinality | Cascade | Description |
|--------------|-------------|---------|-------------|
| **Company** → User | 1:many | No cascade | A company can have many users; each user links to at most one company via the nullable `users.company_id` |
| **Company** → Certifications | 1:many | CASCADE | Company certifications auto-delete with company |
| **Company** → Awards | 1:many | CASCADE | Company awards auto-delete with company |
| **Company** → CompanyEvents | 1:many | CASCADE | Company news/events auto-delete |
| **User** → AIChatConversation | 1:many | CASCADE | User's chat history deleted with user |
| **AIChatConversation** → AIChatMessage | 1:many | CASCADE | Messages deleted when conversation deleted |
| **User** → ForumTopic | 1:many | CASCADE | User's forum topics deleted with user |
| **ForumTopic** → ForumReply | 1:many | CASCADE | Replies deleted when topic deleted |
| **NordaEvent** → EventAttendee | 1:many | CASCADE | RSVPs deleted when event deleted |
| **Company** → MembershipFee | 1:many | CASCADE | Fee records deleted with company |
| **Company** → GBPAudit | 1:many | CASCADE | Audit history deleted with company |
| **Company** → ITAudit | 1:many | CASCADE | Audit history deleted with company |
### Many-to-Many Relationships (2 total)
| Entity A | Junction Table | Entity B | Description |
|----------|----------------|----------|-------------|
| **Company** | company_services | **Service** | Companies can offer multiple services |
| **Company** | company_competencies | **Competency** | Companies can have multiple competencies |
**Junction Table Pattern:**
```sql
-- company_services (composite primary key)
CREATE TABLE company_services (
    company_id INTEGER NOT NULL REFERENCES companies(id),
    service_id INTEGER NOT NULL REFERENCES services(id),
    is_primary BOOLEAN DEFAULT FALSE,  -- flags the primary service
    added_at   TIMESTAMP,              -- when the service was added
    PRIMARY KEY (company_id, service_id)
);
```
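The effect of the composite key can be tried out with a minimal sketch using Python's built-in `sqlite3` module. The in-memory database, the sample rows, and the helper names `build_junction_demo` and `link_service` are illustrative assumptions, not the production DDL or application code.

```python
import sqlite3


def build_junction_demo() -> sqlite3.Connection:
    """Create an in-memory schema with a composite-key junction table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(
        """
        CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE services  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
        CREATE TABLE company_services (
            company_id INTEGER NOT NULL REFERENCES companies(id),
            service_id INTEGER NOT NULL REFERENCES services(id),
            is_primary BOOLEAN DEFAULT 0,
            added_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
            PRIMARY KEY (company_id, service_id)
        );
        INSERT INTO companies (id, name) VALUES (1, 'Demo Company');
        INSERT INTO services  (id, name) VALUES (1, 'Web design'), (2, 'Hosting');
        """
    )
    return conn


def link_service(conn, company_id, service_id, is_primary=False) -> bool:
    """Link a company to a service.

    Returns False when the pair already exists or a foreign key is invalid.
    """
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO company_services (company_id, service_id, is_primary) "
                "VALUES (?, ?, ?)",
                (company_id, service_id, int(is_primary)),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Linking the same company to the same service twice fails on the second attempt, because the pair `(company_id, service_id)` is the primary key.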
### One-to-One Relationships (4 total)
| Relationship | Cardinality | Constraint | Description |
|--------------|-------------|------------|-------------|
| **Company** → CompanyDigitalMaturity | 1:1 | UNIQUE(company_id) | One maturity record per company |
| **Company** → CompanyQualityTracking | 1:1 | UNIQUE(company_id) | One quality tracking record |
| **Company** → CompanyAIInsights | 1:1 | UNIQUE(company_id) | One AI insights record |
| **AIChatMessage** → AIChatFeedback | 1:1 | UNIQUE(message_id) | One feedback per message |
### Self-Referential Relationships (1 total)
| Table | Relationship | Description |
|-------|--------------|-------------|
| **PrivateMessage** → PrivateMessage | parent_id → id | Message threading (conversations) |
### Many-to-Many (Self-Join) Relationships (1 total)
| Table | Relationship | Description |
|-------|--------------|-------------|
| **Company** ↔ **Company** | via ITCollaborationMatch | IT collaboration opportunities between two companies |
```sql
-- it_collaboration_matches (both columns reference companies.id)
company_a_id INTEGER REFERENCES companies(id),
company_b_id INTEGER REFERENCES companies(id),
UNIQUE (company_a_id, company_b_id, match_type)
```
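Note that `(1, 2, 'shared_licensing')` and `(2, 1, 'shared_licensing')` are different tuples, so the UNIQUE constraint on its own does not reject a reversed pair. One possible safeguard is to normalise the pair before inserting. This is sketched below as an assumption, not as documented application behaviour, and the helper name `canonical_match_key` is hypothetical.

```python
def canonical_match_key(company_a_id: int, company_b_id: int, match_type: str):
    """Return (low_id, high_id, match_type) so A-B and B-A map to the same key."""
    if company_a_id == company_b_id:
        raise ValueError("a company cannot be matched with itself")
    low, high = sorted((company_a_id, company_b_id))
    return (low, high, match_type)
```

Storing the lower ID in `company_a_id` and the higher in `company_b_id` lets the existing constraint catch both orderings.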
---
## Unique Constraints & Indexes
### Unique Constraints (20+ total)
| Table | Column(s) | Purpose |
|-------|-----------|---------|
| users | `email` | One account per email |
| companies | `slug` | Unique URL identifier |
| companies | `nip` | One company per NIP (nullable) |
| categories | `name`, `slug` | Unique category identifiers |
| services | `name`, `slug` | Unique service identifiers |
| competencies | `name`, `slug` | Unique competency identifiers |
| company_digital_maturity | `company_id` | One maturity record per company |
| company_quality_tracking | `company_id` | One tracking record per company |
| company_ai_insights | `company_id` | One AI insights per company |
| ai_chat_feedback | `message_id` | One feedback per message |
| company_contacts | `(company_id, contact_type, value)` | Prevent duplicate contacts |
| company_social_media | `(company_id, platform, url)` | Prevent duplicate social links |
| company_recommendations | `(user_id, company_id)` | One recommendation per user-company pair |
| it_collaboration_matches | `(company_a_id, company_b_id, match_type)` | Unique collaboration matches |
| membership_fees | `(company_id, fee_year, fee_month)` | One fee record per company per month |
### Performance Indexes (60+ total)
**Primary Key Indexes (36):** Every table has an automatically indexed primary key (`id`, or a composite key in the junction tables)
**Foreign Key Indexes (40+):**
- All `company_id` columns (indexed for JOIN performance)
- All `user_id` columns (indexed for user-related queries)
- All `category_id`, `service_id`, `competency_id` columns
- All conversation/topic/event relationship keys
**Composite Indexes (5+):**
| Table | Columns | Purpose |
|-------|---------|---------|
| company_website_analysis | `(company_id, analyzed_at)` | Latest analysis queries |
| gbp_audits | `(company_id, audit_date)` | Latest audit queries |
| it_audits | `(company_id, audit_date)` | Latest audit queries |
| user_notifications | `(user_id, is_read, created_at)` | Unread notifications queries |
| ai_api_costs | `(timestamp, user_id)` | Cost tracking queries |
**Special Indexes:**
| Table | Type | Purpose |
|-------|------|---------|
| companies | Full-Text Search (tsvector) | Fast company search (PostgreSQL FTS) |
| companies | pg_trgm (trigram) | Fuzzy matching for typos (when available) |
---
## PostgreSQL-Specific Features
### Native Data Types
**ARRAY Types:**
- Used for: `ai_tools_used`, `m365_plans`, `server_types`, `critical_gaps`, `areas_improved`, etc.
- Storage: PostgreSQL native ARRAY(String)
- Fallback: JSON string in SQLite
```sql
-- Example
ai_tools_used VARCHAR[] -- ARRAY(String) in SQLAlchemy; e.g. '{ChatGPT,Copilot,Gemini}'
```
**JSONB Types:**
- Used for: `pagespeed_audits`, `fields_status`, `recommendations`, `form_data`, `zabbix_integration`
- Storage: Binary JSON with indexing support
- Fallback: JSON string in SQLite
```sql
-- Example
pagespeed_audits JSONB -- {"performance": 85, "seo": 92, ...}
```
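The SQLite fallback listed for both ARRAY and JSONB columns amounts to storing JSON text and decoding it on read. A minimal standard-library sketch follows. The single-table schema and the helper names `open_fallback_db`, `save_company`, and `load_company` are illustrative assumptions, not the project's actual fallback code.

```python
import json
import sqlite3


def open_fallback_db() -> sqlite3.Connection:
    """Create an in-memory table whose list/dict columns are plain TEXT."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE companies ("
        "id INTEGER PRIMARY KEY, ai_tools_used TEXT, pagespeed_audits TEXT)"
    )
    return conn


def save_company(conn, company_id, ai_tools_used, pagespeed_audits):
    """Encode the list and the dict as JSON text before writing them."""
    with conn:
        conn.execute(
            "INSERT INTO companies (id, ai_tools_used, pagespeed_audits) "
            "VALUES (?, ?, ?)",
            (company_id, json.dumps(ai_tools_used), json.dumps(pagespeed_audits)),
        )


def load_company(conn, company_id):
    """Read the row back and decode both JSON columns; None if not found."""
    row = conn.execute(
        "SELECT ai_tools_used, pagespeed_audits FROM companies WHERE id = ?",
        (company_id,),
    ).fetchone()
    if row is None:
        return None
    return json.loads(row[0]), json.loads(row[1])
```

The trade-off is that SQLite sees only opaque text, so the element-level operators and indexes that PostgreSQL offers for these types are not available in the fallback.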
**Numeric Types:**
- `Numeric(10,2)` - Currency (PLN) - e.g., 150.00
- `Numeric(2,1)` - Ratings (0.0-5.0) - e.g., 4.5
- `Numeric(3,2)` - Confidence scores (0.00-1.00) - e.g., 0.87
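
`Numeric(precision, scale)` allows at most `precision` significant digits in total, `scale` of them after the decimal point. The check below is a sketch using Python's `decimal` module. `fits_numeric` is a hypothetical helper, and it rounds half-up as an assumption; in production the database's own rounding applies.

```python
from decimal import Decimal, ROUND_HALF_UP


def fits_numeric(value: Decimal, precision: int, scale: int) -> bool:
    """Return True if value, rounded to `scale` places, fits Numeric(precision, scale)."""
    quantum = Decimal(1).scaleb(-scale)         # scale=2 -> Decimal('0.01')
    rounded = value.quantize(quantum, rounding=ROUND_HALF_UP)
    limit = Decimal(10) ** (precision - scale)  # integer part must stay below this
    return abs(rounded) < limit
```

For example, `Numeric(2,1)` holds ratings up to 9.9, so 4.5 fits while 10.0 does not.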
### Full-Text Search (FTS)
**Implementation:**
```sql
-- companies table has tsvector column for search
search_vector tsvector
-- Trigger updates search_vector on INSERT/UPDATE
CREATE TRIGGER companies_search_trigger
BEFORE INSERT OR UPDATE ON companies
FOR EACH ROW EXECUTE FUNCTION companies_search_trigger();
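-- Search query example
-- (assumes a 'polish' text search configuration has been created;
--  PostgreSQL does not ship one by default)
SELECT * FROM companies
WHERE search_vector @@ to_tsquery('polish', 'strony & www');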
```
**Indexed Columns:** `name`, `description_short`, `description_full`, services, competencies
### Fuzzy Matching (pg_trgm)
**Extension:** `pg_trgm` (Trigram matching)
```sql
-- Find companies with similar names (typos)
SELECT * FROM companies
WHERE similarity(name, 'PIXLB') > 0.3 -- matches 'PIXLAB'
ORDER BY similarity(name, 'PIXLB') DESC;
```
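To see why `'PIXLB'` clears the 0.3 threshold against `'PIXLAB'`, the score can be approximated in plain Python. This is a sketch of the trigram idea, not the extension's implementation: it lowercases the input, pads each word with two leading spaces and one trailing space, and divides the number of shared trigrams by the number of distinct trigrams. The function names are hypothetical.

```python
import re


def trigrams(text: str) -> set:
    """Build the set of trigrams for text, padding each word before slicing."""
    result = set()
    for word in re.findall(r"[^\W_]+", text.lower()):
        padded = f"  {word} "  # two leading spaces, one trailing space
        result.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return result


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)
```

Under this approximation, `'PIXLB'` and `'PIXLAB'` share 4 of 9 distinct trigrams, giving a score of about 0.44.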
---
## Data Validation & Constraints
### Check Constraints
**NIP Validation (PostgreSQL):**
```sql
CONSTRAINT valid_nip CHECK (
nip ~ '^\d{10}$' OR nip IS NULL
)
-- Ensures NIP is exactly 10 digits or NULL
```
**Email Validation (PostgreSQL):**
```sql
CONSTRAINT valid_email CHECK (
email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$'
OR email IS NULL
)
-- Regex validation for email format
```
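The same rules can be applied in the application before a row reaches the database, which gives users a readable error instead of a constraint violation. The sketch below mirrors the two CHECK constraints above. The helper names are hypothetical, and whether the application actually pre-validates this way is an assumption.

```python
import re
from typing import Optional

# Same patterns as the CHECK constraints; fullmatch() replaces the ^...$ anchors.
NIP_PATTERN = re.compile(r"[0-9]{10}")
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", re.IGNORECASE
)


def is_valid_nip(nip: Optional[str]) -> bool:
    """Accept None (column is nullable) or exactly 10 ASCII digits."""
    return nip is None or NIP_PATTERN.fullmatch(nip) is not None


def is_valid_email(email: Optional[str]) -> bool:
    """Accept None or a string matching the email format used by the constraint."""
    return email is None or EMAIL_PATTERN.fullmatch(email) is not None
```

Both functions check format only, exactly as the constraints do; neither verifies that the NIP or the mailbox actually exists.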
### Enums (PostgreSQL)
**Data Quality Levels:**
```sql
CREATE TYPE data_quality_level AS ENUM (
'basic', -- Minimal data (name, NIP, contact)
'partial', -- Some enriched data
'complete', -- Full data with verifications
'verified' -- Manually verified
);
```
**Company Status:**
```sql
CREATE TYPE company_status AS ENUM (
'active', -- Active company
'inactive', -- Inactive company
'pending', -- Pending verification
'archived' -- Archived
);
```
---
## Cascade Behaviors
### ON DELETE CASCADE (Database Level)
Applied to prevent orphaned records:
| Parent Table | Child Table | Cascade On |
|-------------|-------------|------------|
| companies | company_contacts | DELETE |
| companies | company_social_media | DELETE |
| companies | company_recommendations | DELETE |
| companies | gbp_audits | DELETE |
| companies | it_audits | DELETE |
| companies | it_collaboration_matches | DELETE (both company_a and company_b) |
| companies | membership_fees | DELETE |
| users | user_notifications | DELETE |
**Effect:** When a company or user is deleted, all related audit records, contacts, and notifications are automatically removed.
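
The database-level behaviour can be reproduced in a few lines with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the two-table schema is a cut-down stand-in for the real tables, the helper names are hypothetical, and SQLite (unlike PostgreSQL) enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3


def build_cascade_demo() -> sqlite3.Connection:
    """Create companies and company_contacts with ON DELETE CASCADE."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # required for cascades in SQLite
    conn.executescript(
        """
        CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE company_contacts (
            id INTEGER PRIMARY KEY,
            company_id INTEGER NOT NULL
                REFERENCES companies(id) ON DELETE CASCADE,
            contact_type TEXT,
            value TEXT
        );
        INSERT INTO companies (id, name) VALUES (1, 'Alpha'), (2, 'Beta');
        INSERT INTO company_contacts (company_id, contact_type, value) VALUES
            (1, 'email', 'biuro@alpha.example'),
            (1, 'phone', '+48 000 000 000'),
            (2, 'email', 'biuro@beta.example');
        """
    )
    return conn


def count_contacts(conn, company_id) -> int:
    """Return the number of contact rows that reference the given company."""
    return conn.execute(
        "SELECT COUNT(*) FROM company_contacts WHERE company_id = ?",
        (company_id,),
    ).fetchone()[0]


def delete_company(conn, company_id) -> None:
    """Delete one company; the database removes its contacts automatically."""
    with conn:
        conn.execute("DELETE FROM companies WHERE id = ?", (company_id,))
```

Deleting company 1 removes its two contact rows without any application code touching `company_contacts`, while the contact belonging to company 2 is left alone.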
### SQLAlchemy cascade='all, delete-orphan' (ORM Level)
Applied to parent-child relationships where children cannot exist without parent:
```python
# Example from User model
conversations = relationship(
    'AIChatConversation',
    back_populates='user',
    cascade='all, delete-orphan'
)
```
**Applied To:**
- User → AIChatConversation
- User → ForumTopic
- User → ForumReply
- Company → CompanyService (M2M junction)
- Company → CompanyCompetency (M2M junction)
- Company → Certification
- Company → Award
- Company → CompanyEvent
- Company → CompanyWebsiteAnalysis
- Company → MaturityAssessment
- Company → CompanyWebsiteContent
- AIChatConversation → AIChatMessage
- ForumTopic → ForumReply
- NordaEvent → EventAttendee
**Effect:** When a parent is deleted through the SQLAlchemy session, all of its children are deleted too. Children that are removed from the parent's collection (de-associated from the parent) are also deleted.
---
## Database Statistics
### Table Statistics
| Metric | Count |
|--------|-------|
| **Total Tables** | 36 |
| **Core Domain Tables** | 10 |
| **Digital Maturity Tables** | 5 |
| **AI Chat Tables** | 4 |
| **Community Features** | 6 |
| **Audit Systems** | 3 |
| **Support Tables** | 8 |
### Relationship Statistics
| Relationship Type | Count | Examples |
|-------------------|-------|----------|
| **One-to-Many** | 45+ | Company → Certifications, User → Conversations |
| **Many-to-Many** | 2 | Company ↔ Services, Company ↔ Competencies |
| **One-to-One** | 4 | Company → CompanyDigitalMaturity |
| **Self-Referential** | 2 | PrivateMessage → PrivateMessage, Company ↔ Company (IT matches) |
### Index Statistics
| Index Type | Count | Purpose |
|------------|-------|---------|
| **Primary Keys** | 36 | Unique row identifiers |
| **Foreign Key Indexes** | 60+ | JOIN performance |
| **Unique Constraints** | 20+ | Data integrity |
| **Composite Indexes** | 5+ | Multi-column queries |
| **Full-Text Search** | 1 | Company search |
| **Trigram (Fuzzy)** | 1 | Typo tolerance |
---
## Schema Evolution & Versioning
### Version History
| Version | Date | Changes |
|---------|------|---------|
| 1.0 | 2025-11-23 | Initial schema (basic company directory, users, auth) |
| 1.1 | 2025-11-26 | Digital maturity system (CompanyDigitalMaturity, MaturityAssessment) |
| 1.2 | 2025-12-29 | Social media & news monitoring (CompanySocialMedia, CompanyEvents) |
| 1.3 | 2026-01-09 | IT audit & collaboration (ITAudit, ITCollaborationMatch) |
| 1.4 | 2026-01-10 | Current production schema (36 tables) |
### Migration Strategy
**Development:**
```bash
# Create migration
alembic revision --autogenerate -m "Add IT audit tables"
# Apply migration
alembic upgrade head
```
**Production:**
```bash
# SSH to OVH VPS
ssh maciejpi@57.128.200.27
# Backup database
pg_dump nordabiz > backup_$(date +%Y%m%d).sql
# Apply migration
cd /var/www/nordabiznes
/var/www/nordabiznes/venv/bin/python3 scripts/run_migration.py database/migrations/XXX_nazwa.sql
# Verify
psql -U nordabiz_app -d nordabiz -c "\dt"
```
---
## Best Practices
### Query Optimization
**✅ DO:**
```python
from sqlalchemy.orm import joinedload

# Use indexed columns in WHERE clauses
companies = db.query(Company).filter(Company.slug == 'pixlab-sp-z-o-o').first()
# Use eager loading for frequently joined relations
companies = db.query(Company).options(joinedload(Company.category)).all()
# Limit result sets
companies = db.query(Company).limit(10).all()
```
**❌ DON'T:**
```python
# Avoid SELECT * without WHERE
all_companies = db.query(Company).all() # loads entire table
# Avoid N+1 queries
for company in companies:
    print(company.category.name)  # separate query for each company
# Avoid unindexed WHERE clauses
companies = db.query(Company).filter(Company.description_full.like('%keyword%')).all()
```
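The cost of the N+1 pattern can be made visible by counting statements. The sketch below uses the standard-library `sqlite3` module rather than SQLAlchemy so that it runs anywhere. The tiny schema, the sample rows, and all function names are illustrative assumptions.

```python
import sqlite3


def make_demo_connection():
    """Return (conn, log): an in-memory DB plus a list recording executed SQL."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE companies (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            category_id INTEGER REFERENCES categories(id)
        );
        INSERT INTO categories (id, name) VALUES (1, 'IT'), (2, 'Construction');
        INSERT INTO companies (id, name, category_id)
        VALUES (1, 'Alpha', 1), (2, 'Beta', 2), (3, 'Gamma', 1);
        """
    )
    log = []
    conn.set_trace_callback(log.append)  # called with each statement SQLite runs
    return conn, log


def count_selects(log) -> int:
    """Count the SELECT statements recorded so far."""
    return sum(1 for sql in log if sql.lstrip().upper().startswith("SELECT"))


def names_n_plus_one(conn):
    """One query for the companies, then one more query per company."""
    rows = conn.execute(
        "SELECT name, category_id FROM companies ORDER BY id"
    ).fetchall()
    result = []
    for name, category_id in rows:
        category = conn.execute(
            "SELECT name FROM categories WHERE id = ?", (category_id,)
        ).fetchone()
        result.append((name, category[0]))
    return result


def names_joined(conn):
    """Fetch the same data with a single JOIN."""
    return conn.execute(
        "SELECT c.name, cat.name FROM companies c "
        "JOIN categories cat ON cat.id = c.category_id ORDER BY c.id"
    ).fetchall()
```

With three companies, `names_n_plus_one` issues four SELECT statements and `names_joined` issues one, and both return the same rows. The gap grows linearly with the number of companies, which is the problem `joinedload` avoids at the ORM level.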
### Data Integrity
**Transaction Management:**
```python
from database import SessionLocal

db = SessionLocal()
try:
    # Multi-table operation
    company = Company(name="New Company", slug="new-company")
    db.add(company)
    db.flush()  # Sends the INSERT so company.id is populated (no commit yet)
    maturity = CompanyDigitalMaturity(company_id=company.id, overall_score=0)
    db.add(maturity)
    db.commit()  # Atomic commit
except Exception:
    db.rollback()  # Rollback on error
    raise
finally:
    db.close()
```
### Connection Management
**Production Settings:**
```python
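from sqlalchemy import create_engine

# Connection pooling (QueuePool is SQLAlchemy's default pool class)
engine = create_engine(
    DATABASE_URL,
    pool_size=20,       # Persistent connections kept open in the pool
    max_overflow=10,    # Extra connections allowed beyond pool_size under load
    pool_timeout=30,    # Seconds to wait for a free connection before raising
    pool_recycle=3600,  # Recycle connections after 1 hour
)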
```
### Security
**User Permissions:**
```sql
-- Application user (nordabiz_app) has limited permissions
GRANT CONNECT ON DATABASE nordabiz TO nordabiz_app;
GRANT USAGE ON SCHEMA public TO nordabiz_app;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO nordabiz_app;
GRANT USAGE, SELECT ON ALL SEQUENCES IN SCHEMA public TO nordabiz_app;
-- PostgreSQL listens only on localhost (security).
-- This is a postgresql.conf setting, not SQL:
--   listen_addresses = 'localhost'
```
**Sensitive data handling:**
- Passwords must be stored only as bcrypt hashes, never in plain text
- API keys belong only in `.env` files (never in the database)
- Credit card data must never be stored
---
## Glossary
| Term | Definition |
|------|------------|
| **PK** | Primary Key - unique identifier for table row |
| **FK** | Foreign Key - reference to another table's primary key |
| **UK** | Unique Key - column(s) that must have unique values |
| **CASCADE** | Auto-delete related records when parent is deleted |
| **JSONB** | PostgreSQL binary JSON format (indexed, searchable) |
| **ARRAY** | PostgreSQL native array type for lists |
| **tsvector** | PostgreSQL full-text search vector |
| **pg_trgm** | PostgreSQL trigram extension for fuzzy matching |
| **ORM** | Object-Relational Mapping (SQLAlchemy) |
| **ERD** | Entity Relationship Diagram |
---
## Maintenance
### Regular Tasks
**Daily:**
- Monitor database size: `SELECT pg_size_pretty(pg_database_size('nordabiz'));`
- Check slow queries (requires the `pg_stat_statements` extension): `SELECT * FROM pg_stat_statements ORDER BY mean_exec_time DESC LIMIT 10;`
**Weekly:**
- Vacuum analyze: `VACUUM ANALYZE;`
- Check index usage: `SELECT * FROM pg_stat_user_indexes WHERE idx_scan = 0;`
**Monthly:**
- Review data quality scores
- Archive old audit records (>6 months)
- Optimize indexes if needed
### Monitoring Queries
**Find candidate columns for indexing (heuristic):**
```sql
SELECT schemaname, tablename, attname, n_distinct, correlation
FROM pg_stats
WHERE schemaname = 'public'
AND n_distinct > 100
AND correlation < 0.1;
```
**Check table sizes:**
```sql
SELECT
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;
```
---
## Related Documentation
- **Flask Components Diagram:** `04-flask-components.md` - How Flask app uses these models
- **Container Diagram:** `02-container-diagram.md` - PostgreSQL container details
- **Deployment Architecture:** `03-deployment-architecture.md` - Database server configuration
- **Database Schema Analysis:** `../.auto-claude/specs/003-.../analysis/database-schema.md` - Detailed model documentation
---
**Maintained By:** Norda Biznes Development Team
**Next Review:** When schema changes are deployed