feat(ai): Thinking Mode for NordaGPT
- New google-genai SDK with thinking mode support
- Reasoning-level switcher in the chat UI (3 levels)
  - Błyskawiczny (minimal) - fastest answers
  - Szybki (low) - balanced
  - Głęboki (high) - maximum analysis
- /api/chat/settings endpoint for saving preferences
- Documentation for NotebookLM (presentation)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
parent 2c13509cd1
commit 30729ef83e

89  CLAUDE.md
@@ -347,3 +347,92 @@ Szczegóły: `docs/DEVELOPMENT.md#audyt-seo`
## Development plan

Roadmap, priorities and monetization strategy: `docs/ROADMAP.md`

## NordaGPT - AI configuration

### Current model (as of 2026-01-29)
- **Model:** `gemini-3-flash-preview` (Gemini 3 Flash Preview)
- **SDK:** `google-genai>=1.0.0` (new SDK with thinking mode)
- **Initialization:** `app.py:286` - `gemini_service.init_gemini_service(model='3-flash')`
- **Chat engine:** `nordabiz_chat.py` uses the global `gemini_service`

### Thinking Mode (NEW!)
Users can pick the AI reasoning level in the chat UI (dropdown next to the "Gemini 3" badge).

| Level | Description | Use case |
|-------|-------------|----------|
| **Błyskawiczny** (minimal) | Fastest answers | Simple questions: "who?", "where?" |
| **Szybki** (low) | Balanced | Most questions about companies and services |
| **Głęboki** (high) | Maximum analysis | Complex questions, recommendations, strategy |

**Changing the level:**
- UI: dropdown in the chat header
- API: `POST /api/chat/settings` with `{"thinking_level": "high"}` (see the sketch below)
- Stored in the user session
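For quick manual testing, a call to this endpoint could look like the sketch below; it assumes a local dev server and an already-authenticated `requests` session (the endpoint sits behind `@login_required`), so the base URL and the login step are placeholders.

```python
# Minimal sketch - assumes a local dev server and a logged-in session cookie.
import requests

BASE = "http://localhost:5000"  # placeholder; use the real deployment URL
s = requests.Session()
# ... authenticate first so @login_required passes (login flow omitted) ...

# Read the current reasoning level (server falls back to "high")
print(s.get(f"{BASE}/api/chat/settings").json())

# Switch to the deepest level; valid values: minimal, low, medium, high
resp = s.post(f"{BASE}/api/chat/settings", json={"thinking_level": "high"})
print(resp.json())  # expected: {"success": true, "thinking_level": "high"}
```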

### Available models in `gemini_service.py`
| Alias | Model ID | Description |
|-------|----------|-------------|
| `flash` | `gemini-2.5-flash` | General purpose |
| `flash-lite` | `gemini-2.5-flash-lite` | Ultra cheap ($0.10/$0.40 per 1M) |
| `pro` | `gemini-2.5-pro` | Best reasoning |
| `flash-2.0` | `gemini-2.0-flash` | Previous generation (retired 31.03.2026) |
| `3-flash` | `gemini-3-flash-preview` | **CURRENT** - 7x better reasoning, thinking mode |
| `3-pro` | `gemini-3-pro-preview` | Premium - 2M context |

### Changing the model
```python
# In app.py, line ~286:
gemini_service.init_gemini_service(model='3-flash')  # Change the alias here
```

### UI badge
The badge in the header of `templates/chat.html`: `<span class="chat-header-badge">Gemini 3</span>`

## Presentation for Chamber members (ACTIVE PROJECT)

### Project goal
Produce video materials presenting the NordaBiz portal to members of the NORDA Chamber.

### Final deliverables
1. **NotebookLM podcast** (2-3 min) - an AI conversation about the portal
2. **Remotion teaser** (30s) - "Problem → Solution" script
3. **Video tutorial** (2-3 min) - portal recordings + Zofia/Marek dialogues
4. **Portal integration** - Academy + dashboard widget

### Source document for NotebookLM
`docs/notebooklm-source.md` - markdown to upload to notebooklm.google.com

### 30s teaser script (Remotion)
```
[0-8s] "Szukasz partnera do projektu?"
[8-16s] "Nie wiesz, kto w Izbie ma potrzebne kompetencje?"
[16-24s] "NordaGPT zna 150 firm i pomoże Ci znaleźć"
[24-30s] "nordabiznes.pl - Twoja sieć kontaktów"
```

### edge-tts voices
- Marek: `pl-PL-MarekNeural` (male voice)
- Zofia: `pl-PL-ZofiaNeural` (female voice; see the generation sketch below)
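A minimal sketch of how the tutorial voice clips could be generated with these voices, assuming the `edge-tts` Python package is installed; the file names and sample line are illustrative only.

```python
# Minimal sketch - assumes `pip install edge-tts`; paths and text are placeholders.
import asyncio
import edge_tts

VOICES = {
    "marek": "pl-PL-MarekNeural",
    "zofia": "pl-PL-ZofiaNeural",
}

async def synthesize(text: str, speaker: str, out_path: str) -> None:
    # Render one dialogue line to an MP3 file with the selected neural voice
    communicate = edge_tts.Communicate(text, VOICES[speaker])
    await communicate.save(out_path)

if __name__ == "__main__":
    asyncio.run(synthesize("Witaj w NordaBiz!", "zofia", "zofia_01.mp3"))
```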

### GIFs to record (Chrome MCP)
1. Home page when logged in
2. Company catalog + filtering
3. Company profile
4. NordaGPT chat - question
5. NordaGPT chat - answer
6. Forum
7. Calendar
8. B2B board

### Remotion files
- Location: `/Users/maciejpi/claude/projects/active/remotion/my-video/`
- Components: `NordaBizZajawka.tsx`, `NordaBizTutorial.tsx`
- Audio: `public/audio/`, `public/voice/tutorial/`
- Recordings: `public/recordings/*.gif`

### Test account (PROD)
- Email: `test@nordabiznes.pl`
- Password: `&Rc2LdbSw&jiGR0ek@Bz`

Project date: 2026-01-29

60  app.py
@@ -4513,6 +4513,44 @@ def chat():
    return render_template('chat.html')


@app.route('/api/chat/settings', methods=['GET', 'POST'])
@csrf.exempt
@login_required
def chat_settings():
    """Get or update chat settings (thinking level)"""
    if request.method == 'GET':
        # Get current thinking level from session or default
        thinking_level = session.get('thinking_level', 'high')
        return jsonify({
            'success': True,
            'thinking_level': thinking_level
        })

    # POST - update settings
    try:
        data = request.get_json()
        thinking_level = data.get('thinking_level', 'high')

        # Validate thinking level
        valid_levels = ['minimal', 'low', 'medium', 'high']
        if thinking_level not in valid_levels:
            thinking_level = 'high'

        # Store in session
        session['thinking_level'] = thinking_level

        logger.info(f"User {current_user.id} set thinking_level to: {thinking_level}")

        return jsonify({
            'success': True,
            'thinking_level': thinking_level
        })

    except Exception as e:
        logger.error(f"Error updating chat settings: {e}")
        return jsonify({'success': False, 'error': str(e)}), 500


@app.route('/api/chat/start', methods=['POST'])
@csrf.exempt
@login_required
@@ -4564,11 +4602,15 @@ def chat_send_message(conversation_id):
    finally:
        db.close()

    # Get thinking level from request or session
    thinking_level = data.get('thinking_level') or session.get('thinking_level', 'high')

    chat_engine = NordaBizChatEngine()
    response = chat_engine.send_message(
        conversation_id=conversation_id,
        user_message=message,
        user_id=current_user.id
        user_id=current_user.id,
        thinking_level=thinking_level
    )

    # Get free tier usage stats for today
@@ -4586,21 +4628,23 @@ def chat_send_message(conversation_id):
        'created_at': response.created_at.isoformat(),
        # Technical metadata
        'tech_info': {
            'model': 'gemini-2.0-flash',
            'data_source': 'PostgreSQL (111 firm Norda Biznes)',
            'architecture': 'Full DB Context (wszystkie firmy w kontekście AI)',
            'model': gemini_service.get_gemini_service().model_name if gemini_service.get_gemini_service() else 'gemini-3-flash-preview',
            'thinking_level': thinking_level,
            'thinking_enabled': gemini_service.get_gemini_service().thinking_enabled if gemini_service.get_gemini_service() else True,
            'data_source': 'PostgreSQL (150 firm Norda Biznes)',
            'architecture': 'Full DB Context + Thinking Mode',
            'tokens_input': tokens_in,
            'tokens_output': tokens_out,
            'tokens_total': tokens_in + tokens_out,
            'latency_ms': response.latency_ms or 0,
            'theoretical_cost_usd': round(theoretical_cost, 6),
            'actual_cost_usd': 0.0,  # Free tier
            'actual_cost_usd': 0.0,  # Paid tier but tracked
            'free_tier': {
                'is_free': True,
                'daily_limit': 1500,  # Gemini free tier: 1500 req/day
                'is_free': False,
                'daily_limit': 10000,  # Gemini paid tier
                'requests_today': free_tier_stats['requests_today'],
                'tokens_today': free_tier_stats['tokens_today'],
                'remaining': max(0, 1500 - free_tier_stats['requests_today'])
                'remaining': max(0, 10000 - free_tier_stats['requests_today'])
            }
        }
    })

163  docs/notebooklm-source.md  Normal file
@@ -0,0 +1,163 @@
# Norda Biznes Hub - Platform Documentation

## About the project

**Norda Biznes Hub** is a modern catalog and networking platform for members of the **NORDA Chamber of Entrepreneurs** in Wejherowo. The portal is available at https://nordabiznes.pl and has been running in production since 23 November 2025.

The platform brings together **150 businesses** from 4 main industry categories and 17 subcategories. It is the central hub for entrepreneurs from Pomerania - a place to find a business partner, check the competences of member companies and start a collaboration.

## Main portal features

### 1. Company catalog (150 businesses)

A full catalog of all NORDA Chamber member companies, split into categories:
- **Services** - IT and technology, accounting, financial services, HR, marketing, legal advisory
- **Construction** - general construction, installations, finishing
- **Trade** - wholesale, retail, automotive
- **Manufacturing** - general manufacturing, furniture, industry

Every company has a detailed profile containing:
- Contact details (phone, email, address)
- Description of its business and specializations
- Services and competences (tags)
- Registry data (NIP, REGON, KRS)
- Links to social media (Facebook, LinkedIn, Instagram)
- Date of joining the NORDA Chamber
- Google ratings (from the Google Business Profile audit)

### 2. NordaGPT - AI assistant

NordaGPT is an intelligent chatbot based on **Google Gemini 3 Flash** - the latest-generation AI model with advanced reasoning (thinking mode). It is the platform's flagship feature.

**What NordaGPT can do:**
- Find companies by service, industry or keyword
- Answer questions about Chamber members (e.g. "Who is the CEO of PIXLAB?")
- Check the Norda Biznes events calendar
- Give recommendations (e.g. "Which company has the best Google reviews?")
- Report B2B listings from the board
- Look up people connected to companies (board members, owners)

**Sample questions:**
- "Who offers IT services in Wejherowo?"
- "When is the next Norda Biznes meeting?"
- "Recommend a construction company with good reviews"
- "I'm looking for a print shop for my company"

NordaGPT has access to the full database of 150 companies, the events calendar, the B2B board, the forum and member recommendations. It answers in Polish and provides direct links to company profiles.

### 3. Events calendar

An interactive calendar of meetings and events organized by the NORDA Chamber:
- Monthly view with events marked
- Details: date, time, venue, description
- Sign-up system (RSVP) - confirm attendance with one click
- Attendee list - see who else is coming
- Home-page banner with the next upcoming event

### 4. Discussion forum

A place for members to exchange knowledge and experience:
- Topics in various categories (questions, announcements, discussions)
- Replies and comments
- Attachments (documents, photos)
- Notifications about new replies

### 5. B2B listings board

A marketplace for Chamber members:
- "Looking for" and "Offering" listings
- Categories: services, products, cooperation, contracts
- Direct contact with the listing's author
- Automatic expiry after 30 days

### 6. Private messages

A messaging system between members:
- Direct messages to other users
- Conversation history
- Notifications about new messages
- Ability to block contacts

### 7. Reports and statistics

Available to logged-in users:
- Membership tenure in the Chamber (ranked from longest)
- Social media coverage (analysis of 6 platforms)
- Industry structure (distribution of companies by category)

### 8. News

A section for announcements and news:
- Information from the Chamber's board
- Business opportunities
- Partner events
- Pinning of important announcements

## Benefits for members

### Networking
- Get to know 150 companies from different industries
- Find a partner for a project
- Start a B2B collaboration
- Exchange experience on the forum

### Visibility
- A professional company profile with a full description
- Placement in the industry catalog
- Links to social media
- Presentation of services and competences

### Convenience
- One portal for everything
- NordaGPT answers questions 24/7
- Event notifications
- Responsive mobile version

### Security
- Only verified Chamber members
- Privacy controls (hiding contact details)
- Blocking of unwanted contacts
- GDPR compliance

## Development history

### November 2025
- Launch of the nordabiznes.pl platform
- NordaGPT goes live on Gemini 2.0
- Import of data for 111 member companies

### December 2025
- Company profiles extended with KRS data
- Google Business Profile audits added
- Social media integration

### January 2026
- NordaGPT upgraded to Gemini 3 Flash (7x better reasoning)
- Conversation history added to the chat
- Privacy and contact-blocking system
- Hierarchical company categories (4 main groups)
- Reports and statistics panel
- News section
- CEIDG integration for fetching sole-trader (JDG) data

## Technology

The portal is built on a modern technology stack:
- **Backend:** Python Flask with SQLAlchemy
- **Frontend:** HTML5, CSS3, JavaScript
- **Database:** PostgreSQL
- **AI:** Google Gemini 3 Flash (1M-token context)
- **Hosting:** Servers in Poland (INPI)
- **SSL:** Let's Encrypt certificate

## Contact information

- **Website:** https://nordabiznes.pl
- **Organizer:** NORDA Chamber of Entrepreneurs, Wejherowo
- **Contact:** Via the form on the website or through a company profile

## Summary

Norda Biznes Hub is a modern platform for entrepreneurs from Pomerania. It combines a catalog of 150 companies with an intelligent AI assistant, an events calendar and networking tools. Thanks to NordaGPT, users can quickly find the right business partner - it's enough to ask a question in natural language.

The platform is under continuous development. Plans for 2026 include an educational platform with webinars, expanded company-to-company recommendations and integrations with external systems.
gemini_service.py

@@ -4,14 +4,14 @@ Google Gemini AI Service
Reusable service for interacting with Google Gemini API.

Features:
- Multiple model support (Flash, Pro, Flash-8B)
- Multiple model support (Flash, Pro, Gemini 3)
- Thinking Mode support for Gemini 3 models
- Error handling and retries
- Cost tracking
- Streaming responses
- Safety settings configuration

Author: MTB Tracker Team
Created: 2025-10-18
Author: NordaBiz Team
Updated: 2026-01-29 (Gemini 3 SDK migration)
"""

import os

@@ -20,8 +20,10 @@ import hashlib
import time
from datetime import datetime
from typing import Optional, Dict, Any, List
import google.generativeai as genai
from google.generativeai.types import HarmCategory, HarmBlockThreshold

# New Gemini SDK (google-genai) with thinking mode support
from google import genai
from google.genai import types

# Configure logging
logger = logging.getLogger(__name__)
@@ -44,26 +46,46 @@ GEMINI_MODELS = {
    '3-pro': 'gemini-3-pro-preview',  # Gemini 3 Pro - najlepszy reasoning, 2M context
}

# Pricing per 1M tokens (USD) - updated 2026-01-29
GEMINI_PRICING = {
    'gemini-2.5-flash': {'input': 0.30, 'output': 2.50},
    'gemini-2.5-flash-lite': {'input': 0.10, 'output': 0.40},
    'gemini-2.5-pro': {'input': 1.25, 'output': 10.00},
    'gemini-2.0-flash': {'input': 0.10, 'output': 0.40},
    'gemini-3-flash-preview': {'input': 0.50, 'output': 3.00},
    'gemini-3-pro-preview': {'input': 2.00, 'output': 12.00},
# Models that support thinking mode
THINKING_MODELS = {'gemini-3-flash-preview', 'gemini-3-pro-preview'}

# Available thinking levels for Gemini 3 Flash
THINKING_LEVELS = {
    'minimal': 'MINIMAL',  # Lowest latency, minimal reasoning
    'low': 'LOW',  # Fast, simple tasks
    'medium': 'MEDIUM',  # Balanced (Gemini 3 Flash only)
    'high': 'HIGH',  # Maximum reasoning depth (default)
}

class GeminiService:
    """Service class for Google Gemini API interactions."""
    # Pricing per 1M tokens (USD) - updated 2026-01-29
    GEMINI_PRICING = {
        'gemini-2.5-flash': {'input': 0.30, 'output': 2.50, 'thinking': 0},
        'gemini-2.5-flash-lite': {'input': 0.10, 'output': 0.40, 'thinking': 0},
        'gemini-2.5-pro': {'input': 1.25, 'output': 10.00, 'thinking': 0},
        'gemini-2.0-flash': {'input': 0.10, 'output': 0.40, 'thinking': 0},
        'gemini-3-flash-preview': {'input': 0.50, 'output': 3.00, 'thinking': 1.00},
        'gemini-3-pro-preview': {'input': 2.00, 'output': 12.00, 'thinking': 4.00},
    }

    def __init__(self, api_key: Optional[str] = None, model: str = 'flash'):

class GeminiService:
    """Service class for Google Gemini API interactions with Thinking Mode support."""

    def __init__(
        self,
        api_key: Optional[str] = None,
        model: str = 'flash',
        thinking_level: str = 'high',
        include_thoughts: bool = False
    ):
        """
        Initialize Gemini service.

        Args:
            api_key: Google AI API key (reads from GOOGLE_GEMINI_API_KEY env if not provided)
            model: Model to use ('flash', 'flash-lite', 'pro', 'flash-2.0')
            model: Model to use ('flash', 'flash-lite', 'pro', '3-flash', '3-pro')
            thinking_level: Reasoning depth ('minimal', 'low', 'medium', 'high')
            include_thoughts: Whether to include thinking process in response (for debugging)
        """
        self.api_key = api_key or os.getenv('GOOGLE_GEMINI_API_KEY')

@@ -79,31 +101,76 @@ class GeminiService:
                "Please add your API key to .env file."
            )

        # Configure Gemini
        genai.configure(api_key=self.api_key)
        # Initialize new Gemini client
        self.client = genai.Client(api_key=self.api_key)

        # Set model
        self.model_name = GEMINI_MODELS.get(model, GEMINI_MODELS['flash'])
        self.model = genai.GenerativeModel(self.model_name)

        # Safety settings (disabled for testing - enable in production if needed)
        # Note: Even BLOCK_ONLY_HIGH was blocking neutral prompts like "mountain biking"
        # For production apps, consider using BLOCK_ONLY_HIGH or BLOCK_MEDIUM_AND_ABOVE
        self.safety_settings = {
            HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_NONE,
            HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_NONE,
            HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: HarmBlockThreshold.BLOCK_NONE,
            HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_NONE,
        # Thinking mode configuration
        self.thinking_level = thinking_level
        self.include_thoughts = include_thoughts
        self._thinking_enabled = self.model_name in THINKING_MODELS

        # Safety settings
        self.safety_settings = [
            types.SafetySetting(
                category="HARM_CATEGORY_HATE_SPEECH",
                threshold="BLOCK_NONE"
            ),
            types.SafetySetting(
                category="HARM_CATEGORY_DANGEROUS_CONTENT",
                threshold="BLOCK_NONE"
            ),
            types.SafetySetting(
                category="HARM_CATEGORY_SEXUALLY_EXPLICIT",
                threshold="BLOCK_NONE"
            ),
            types.SafetySetting(
                category="HARM_CATEGORY_HARASSMENT",
                threshold="BLOCK_NONE"
            ),
        ]

        logger.info(
            f"Gemini service initialized: model={self.model_name}, "
            f"thinking={self._thinking_enabled}, level={thinking_level}"
        )

    @property
    def thinking_enabled(self) -> bool:
        """Whether thinking mode is enabled for current model."""
        return self._thinking_enabled

    @property
    def thinking_level_display(self) -> str:
        """Human-readable thinking level for UI."""
        if not self._thinking_enabled:
            return "Wyłączony"
        return {
            'minimal': 'Minimalny',
            'low': 'Niski',
            'medium': 'Średni',
            'high': 'Wysoki'
        }.get(self.thinking_level, self.thinking_level)

    def get_status(self) -> Dict[str, Any]:
        """Get service status for UI display."""
        return {
            'model': self.model_name,
            'thinking_enabled': self._thinking_enabled,
            'thinking_level': self.thinking_level,
            'thinking_level_display': self.thinking_level_display,
            'include_thoughts': self.include_thoughts
        }

        logger.info(f"Gemini service initialized with model: {self.model_name}")

    def generate_text(
        self,
        prompt: str,
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        stream: bool = False,
        thinking_level: Optional[str] = None,
        feature: str = 'general',
        user_id: Optional[int] = None,
        company_id: Optional[int] = None,
@@ -111,13 +178,14 @@ class GeminiService:
        related_entity_id: Optional[int] = None
    ) -> str:
        """
        Generate text using Gemini API with automatic cost tracking.
        Generate text using Gemini API with automatic cost tracking and thinking mode.

        Args:
            prompt: Text prompt to send to the model
            temperature: Sampling temperature (0.0-1.0). Higher = more creative
            max_tokens: Maximum tokens to generate (None = model default)
            stream: Whether to stream the response
            thinking_level: Override default thinking level for this call
            feature: Feature name for cost tracking ('chat', 'news_evaluation', etc.)
            user_id: Optional user ID for cost tracking
            company_id: Optional company ID for context

@@ -133,65 +201,53 @@ class GeminiService:
        start_time = time.time()

        try:
            # Use minimal configuration to avoid blocking issues with FREE tier
            # Only set temperature if different from default
            generation_config = None
            if temperature != 0.7 or max_tokens:
                generation_config = {'temperature': temperature}
            # Build generation config
            config_params = {
                'temperature': temperature,
            }
            if max_tokens:
                generation_config['max_output_tokens'] = max_tokens
                config_params['max_output_tokens'] = max_tokens

            # Try passing safety_settings to reduce blocking for legitimate news content
            # Note: FREE tier may still have built-in restrictions
            if generation_config:
                response = self.model.generate_content(
                    prompt,
                    generation_config=generation_config,
            # Add thinking config for Gemini 3 models
            if self._thinking_enabled:
                level = thinking_level or self.thinking_level
                thinking_config = types.ThinkingConfig(
                    thinking_level=THINKING_LEVELS.get(level, 'HIGH'),
                    include_thoughts=self.include_thoughts
                )
                config_params['thinking_config'] = thinking_config

            # Build full config
            generation_config = types.GenerateContentConfig(
                **config_params,
                safety_settings=self.safety_settings
            )
            else:
                response = self.model.generate_content(
                    prompt,
                    safety_settings=self.safety_settings

            # Call API
            response = self.client.models.generate_content(
                model=self.model_name,
                contents=prompt,
                config=generation_config
            )

            if stream:
                # Return generator for streaming
                return response

            # Check if response was blocked by safety filters
            if not response.candidates:
                raise Exception(
                    f"Response blocked. No candidates returned. "
                    f"This may be due to safety filters."
                )

            candidate = response.candidates[0]

            # Check finish reason
            if candidate.finish_reason not in [1, 0]:  # 1=STOP, 0=UNSPECIFIED
                finish_reasons = {
                    2: "SAFETY - Content blocked by safety filters",
                    3: "RECITATION - Content blocked due to recitation",
                    4: "OTHER - Other reason",
                    5: "MAX_TOKENS - Reached max token limit"
                }
                reason = finish_reasons.get(candidate.finish_reason, f"Unknown ({candidate.finish_reason})")
                raise Exception(
                    f"Response incomplete. Finish reason: {reason}. "
                    f"Try adjusting safety settings or prompt."
                )
            # Extract response text
            response_text = response.text

            # Count tokens and log cost
            response_text = response.text
            latency_ms = int((time.time() - start_time) * 1000)

            input_tokens = self.count_tokens(prompt)
            output_tokens = self.count_tokens(response_text)
            # Get token counts from response metadata
            input_tokens = self._count_tokens_from_response(response, 'input')
            output_tokens = self._count_tokens_from_response(response, 'output')
            thinking_tokens = self._count_tokens_from_response(response, 'thinking')

            logger.info(
                f"Gemini API call successful. "
                f"Tokens: {input_tokens}+{output_tokens}, "
                f"Tokens: {input_tokens}+{output_tokens}"
                f"{f'+{thinking_tokens}t' if thinking_tokens else ''}, "
                f"Latency: {latency_ms}ms, "
                f"Model: {self.model_name}"
            )
@@ -202,6 +258,7 @@ class GeminiService:
                response_text=response_text,
                input_tokens=input_tokens,
                output_tokens=output_tokens,
                thinking_tokens=thinking_tokens,
                latency_ms=latency_ms,
                success=True,
                feature=feature,

@@ -220,8 +277,9 @@ class GeminiService:
            self._log_api_cost(
                prompt=prompt,
                response_text='',
                input_tokens=self.count_tokens(prompt),
                input_tokens=self._estimate_tokens(prompt),
                output_tokens=0,
                thinking_tokens=0,
                latency_ms=latency_ms,
                success=False,
                error_message=str(e),
@@ -251,15 +309,33 @@ class GeminiService:
            Model's response to the last message
        """
        try:
            chat = self.model.start_chat(history=[])
            # Build contents from messages
            contents = []
            for msg in messages:
                role = 'user' if msg['role'] == 'user' else 'model'
                contents.append(types.Content(
                    role=role,
                    parts=[types.Part(text=msg['content'])]
                ))

            # Add conversation history
            for msg in messages[:-1]:  # All except last
                if msg['role'] == 'user':
                    chat.send_message(msg['content'])
            # Build config with thinking if available
            config_params = {'temperature': 0.7}
            if self._thinking_enabled:
                config_params['thinking_config'] = types.ThinkingConfig(
                    thinking_level=THINKING_LEVELS.get(self.thinking_level, 'HIGH'),
                    include_thoughts=self.include_thoughts
                )

            # Send last message and get response
            response = chat.send_message(messages[-1]['content'])
            generation_config = types.GenerateContentConfig(
                **config_params,
                safety_settings=self.safety_settings
            )

            response = self.client.models.generate_content(
                model=self.model_name,
                contents=contents,
                config=generation_config
            )

            return response.text

@@ -283,9 +359,25 @@ class GeminiService:

            img = PIL.Image.open(image_path)

            response = self.model.generate_content(
                [prompt, img],
                safety_settings=self.safety_settings
            # Convert image to bytes
            import io
            img_bytes = io.BytesIO()
            img.save(img_bytes, format=img.format or 'PNG')
            img_bytes = img_bytes.getvalue()

            contents = [
                types.Part(text=prompt),
                types.Part(
                    inline_data=types.Blob(
                        mime_type=f"image/{(img.format or 'png').lower()}",
                        data=img_bytes
                    )
                )
            ]

            response = self.client.models.generate_content(
                model=self.model_name,
                contents=contents
            )

            return response.text
@@ -305,20 +397,44 @@ class GeminiService:
            Number of tokens
        """
        try:
            result = self.model.count_tokens(text)
            result = self.client.models.count_tokens(
                model=self.model_name,
                contents=text
            )
            return result.total_tokens
        except Exception as e:
            logger.warning(f"Token counting failed: {e}")
            # Rough estimate: ~4 chars per token
            return self._estimate_tokens(text)

    def _estimate_tokens(self, text: str) -> int:
        """Estimate tokens when API counting fails (~4 chars per token)."""
        return len(text) // 4

    def _count_tokens_from_response(self, response, token_type: str) -> int:
        """Extract token count from API response metadata."""
        try:
            usage = response.usage_metadata
            if not usage:
                return 0
            if token_type == 'input':
                return getattr(usage, 'prompt_token_count', 0) or 0
            elif token_type == 'output':
                return getattr(usage, 'candidates_token_count', 0) or 0
            elif token_type == 'thinking':
                # Gemini 3 reports thinking tokens separately
                return getattr(usage, 'thinking_token_count', 0) or 0
        except Exception:
            return 0
        return 0

    def _log_api_cost(
        self,
        prompt: str,
        response_text: str,
        input_tokens: int,
        output_tokens: int,
        latency_ms: int,
        thinking_tokens: int = 0,
        latency_ms: int = 0,
        success: bool = True,
        error_message: Optional[str] = None,
        feature: str = 'general',

@@ -328,13 +444,14 @@ class GeminiService:
        related_entity_id: Optional[int] = None
    ):
        """
        Log API call costs to database for monitoring
        Log API call costs to database for monitoring.

        Args:
            prompt: Input prompt text
            response_text: Output response text
            input_tokens: Number of input tokens used
            output_tokens: Number of output tokens generated
            thinking_tokens: Number of thinking tokens (Gemini 3)
            latency_ms: Response time in milliseconds
            success: Whether API call succeeded
            error_message: Error details if failed
@@ -349,12 +466,13 @@ class GeminiService:

        try:
            # Calculate costs
            pricing = GEMINI_PRICING.get(self.model_name, {'input': 0.075, 'output': 0.30})
            pricing = GEMINI_PRICING.get(self.model_name, {'input': 0.50, 'output': 3.00, 'thinking': 1.00})
            input_cost = (input_tokens / 1_000_000) * pricing['input']
            output_cost = (output_tokens / 1_000_000) * pricing['output']
            total_cost = input_cost + output_cost
            thinking_cost = (thinking_tokens / 1_000_000) * pricing.get('thinking', 0)
            total_cost = input_cost + output_cost + thinking_cost

            # Cost in cents for AIUsageLog (more precise)
            # Cost in cents for AIUsageLog
            cost_cents = total_cost * 100

            # Create prompt hash (for debugging, not storing full prompt for privacy)

@@ -371,10 +489,10 @@ class GeminiService:
                feature=feature,
                user_id=user_id,
                input_tokens=input_tokens,
                output_tokens=output_tokens,
                total_tokens=input_tokens + output_tokens,
                output_tokens=output_tokens + thinking_tokens,  # Combined for legacy
                total_tokens=input_tokens + output_tokens + thinking_tokens,
                input_cost=input_cost,
                output_cost=output_cost,
                output_cost=output_cost + thinking_cost,  # Combined for legacy
                total_cost=total_cost,
                success=success,
                error_message=error_message,

@@ -383,12 +501,12 @@ class GeminiService:
            )
            db.add(legacy_log)

            # Log to new AIUsageLog table (with automatic daily aggregation via trigger)
            # Log to new AIUsageLog table
            usage_log = AIUsageLog(
                request_type=feature,
                model=self.model_name,
                tokens_input=input_tokens,
                tokens_output=output_tokens,
                tokens_output=output_tokens + thinking_tokens,
                cost_cents=cost_cents,
                user_id=user_id,
                company_id=company_id,

@@ -406,7 +524,8 @@ class GeminiService:

            logger.info(
                f"API cost logged: {feature} - ${total_cost:.6f} "
                f"({input_tokens}+{output_tokens} tokens, {latency_ms}ms)"
                f"({input_tokens}+{output_tokens}"
                f"{f'+{thinking_tokens}t' if thinking_tokens else ''} tokens, {latency_ms}ms)"
            )
        finally:
            db.close()
@@ -417,7 +536,7 @@ class GeminiService:
    def generate_embedding(
        self,
        text: str,
        task_type: str = 'retrieval_document',
        task_type: str = 'RETRIEVAL_DOCUMENT',
        title: Optional[str] = None,
        user_id: Optional[int] = None,
        feature: str = 'embedding'

@@ -428,19 +547,17 @@ class GeminiService:
        Args:
            text: Text to embed
            task_type: One of:
                - 'retrieval_document': For documents to be retrieved
                - 'retrieval_query': For search queries
                - 'semantic_similarity': For comparing texts
                - 'classification': For text classification
                - 'clustering': For text clustering
                - 'RETRIEVAL_DOCUMENT': For documents to be retrieved
                - 'RETRIEVAL_QUERY': For search queries
                - 'SEMANTIC_SIMILARITY': For comparing texts
                - 'CLASSIFICATION': For text classification
                - 'CLUSTERING': For text clustering
            title: Optional title for document (improves quality)
            user_id: User ID for cost tracking
            feature: Feature name for cost tracking

        Returns:
            768-dimensional embedding vector or None on error

        Cost: ~$0.00001 per 1K tokens (very cheap)
        """
        if not text or not text.strip():
            logger.warning("Empty text provided for embedding")
@@ -449,26 +566,29 @@ class GeminiService:
        start_time = time.time()

        try:
            # Use text-embedding-004 model (768 dimensions)
            # This is Google's recommended model for embeddings
            result = genai.embed_content(
                model='models/text-embedding-004',
                content=text,
                task_type=task_type,
                title=title
            # Build content with optional title
            content_parts = []
            if title:
                content_parts.append(types.Part(text=f"Title: {title}\n\n"))
            content_parts.append(types.Part(text=text))

            result = self.client.models.embed_content(
                model='text-embedding-004',
                contents=types.Content(parts=content_parts),
                config=types.EmbedContentConfig(
                    task_type=task_type
                )
            )

            embedding = result.get('embedding')
            embedding = result.embeddings[0].values if result.embeddings else None

            if not embedding:
                logger.error("No embedding returned from API")
                return None

            # Log cost (embedding API is very cheap)
            # Log cost
            latency_ms = int((time.time() - start_time) * 1000)
            token_count = len(text) // 4  # Approximate

            # Embedding pricing: ~$0.00001 per 1K tokens
            token_count = len(text) // 4
            cost_usd = (token_count / 1000) * 0.00001

            logger.debug(

@@ -476,31 +596,7 @@ class GeminiService:
                f"{token_count} tokens, {latency_ms}ms, ${cost_usd:.8f}"
            )

            # Log to database (if cost tracking is important)
            if DB_AVAILABLE and user_id:
                try:
                    db = SessionLocal()
                    try:
                        usage_log = AIUsageLog(
                            request_type=feature,
                            model='text-embedding-004',
                            tokens_input=token_count,
                            tokens_output=0,
                            cost_cents=cost_usd * 100,
                            user_id=user_id,
                            prompt_length=len(text),
                            response_length=len(embedding) * 4,  # 4 bytes per float
                            response_time_ms=latency_ms,
                            success=True
                        )
                        db.add(usage_log)
                        db.commit()
                    finally:
                        db.close()
                except Exception as e:
                    logger.error(f"Failed to log embedding cost: {e}")

            return embedding
            return list(embedding)

        except Exception as e:
            logger.error(f"Embedding generation error: {e}")
@@ -509,7 +605,7 @@ class GeminiService:
    def generate_embeddings_batch(
        self,
        texts: List[str],
        task_type: str = 'retrieval_document',
        task_type: str = 'RETRIEVAL_DOCUMENT',
        user_id: Optional[int] = None
    ) -> List[Optional[List[float]]]:
        """

@@ -539,18 +635,27 @@
_gemini_service: Optional[GeminiService] = None


def init_gemini_service(api_key: Optional[str] = None, model: str = 'flash'):
def init_gemini_service(
    api_key: Optional[str] = None,
    model: str = 'flash',
    thinking_level: str = 'high'
):
    """
    Initialize global Gemini service instance.
    Call this in app.py during Flask app initialization.

    Args:
        api_key: Google AI API key (optional if set in env)
        model: Model to use ('flash', 'flash-8b', 'pro')
        model: Model to use ('flash', 'flash-lite', 'pro', '3-flash', '3-pro')
        thinking_level: Reasoning depth for Gemini 3 models ('minimal', 'low', 'medium', 'high')
    """
    global _gemini_service
    try:
        _gemini_service = GeminiService(api_key=api_key, model=model)
        _gemini_service = GeminiService(
            api_key=api_key,
            model=model,
            thinking_level=thinking_level
        )
        logger.info("Global Gemini service initialized successfully")
    except Exception as e:
        logger.error(f"Failed to initialize Gemini service: {e}")

nordabiz_chat.py

@@ -98,7 +98,8 @@ class NordaBizChatEngine:
        if use_global_service:
            # Use global gemini_service for automatic cost tracking to ai_api_costs table
            self.gemini_service = gemini_service.get_gemini_service()
            self.model_name = "gemini-2.5-flash"
            # Get model name from global service (currently Gemini 3 Flash Preview)
            self.model_name = self.gemini_service.model_name if self.gemini_service else "gemini-3-flash-preview"
            self.model = None

        # Initialize tokenizer for cost calculation (still needed for per-message tracking)

@@ -167,7 +168,8 @@ class NordaBizChatEngine:
        self,
        conversation_id: int,
        user_message: str,
        user_id: int
        user_id: int,
        thinking_level: str = 'high'
    ) -> AIChatMessage:
        """
        Send message and get AI response

@@ -179,6 +181,7 @@ class NordaBizChatEngine:
            conversation_id: Conversation ID
            user_message: User's message text
            user_id: User ID (required for ownership validation and cost tracking)
            thinking_level: AI reasoning depth ('minimal', 'low', 'medium', 'high')

        Returns:
            AIChatMessage: AI response message

@@ -240,7 +243,8 @@ class NordaBizChatEngine:
        response = self._query_ai(
            context,
            user_message,
            user_id=user_id
            user_id=user_id,
            thinking_level=thinking_level
        )

        # Calculate metrics for per-message tracking in AIChatMessage table

@@ -828,7 +832,8 @@ class NordaBizChatEngine:
        self,
        context: Dict[str, Any],
        user_message: str,
        user_id: Optional[int] = None
        user_id: Optional[int] = None,
        thinking_level: str = 'high'
    ) -> str:
        """
        Query Gemini AI with full company database context

@@ -837,6 +842,7 @@ class NordaBizChatEngine:
            context: Context dict with ALL companies
            user_message: User's message
            user_id: User ID for cost tracking
            thinking_level: AI reasoning depth ('minimal', 'low', 'medium', 'high')

        Returns:
            AI response text

@@ -1193,7 +1199,8 @@ BŁĘDNIE (NIE RÓB - resetuje numerację):
            prompt=full_prompt,
            feature='ai_chat',
            user_id=user_id,
            temperature=0.7
            temperature=0.7,
            thinking_level=thinking_level
        )
        # Post-process to ensure links are added even if AI didn't format them
        return self._postprocess_links(response_text, context)

requirements.txt

@@ -14,8 +14,8 @@ Flask-Limiter==3.5.0
SQLAlchemy==2.0.23
psycopg2-binary==2.9.9

# Google Gemini AI
google-generativeai==0.3.2
# Google Gemini AI (new SDK with thinking mode support)
google-genai>=1.0.0

# Google Maps/Places API
googlemaps==4.10.0

templates/chat.html

@@ -538,6 +538,151 @@
    margin-right: var(--spacing-sm);
}

/* ============================================
   Thinking Mode Toggle
   ============================================ */
.thinking-toggle {
    position: relative;
    margin-left: var(--spacing-sm);
}

.thinking-btn {
    display: flex;
    align-items: center;
    gap: 6px;
    padding: 4px 10px;
    background: rgba(255,255,255,0.2);
    border: 1px solid rgba(255,255,255,0.3);
    border-radius: var(--radius);
    color: white;
    font-size: var(--font-size-xs);
    font-weight: 500;
    cursor: pointer;
    transition: var(--transition);
}

.thinking-btn:hover {
    background: rgba(255,255,255,0.3);
}

.thinking-icon {
    font-size: 14px;
}

.thinking-arrow {
    transition: transform 0.2s ease;
}

.thinking-toggle.open .thinking-arrow {
    transform: rotate(180deg);
}

.thinking-dropdown {
    display: none;
    position: absolute;
    top: calc(100% + 8px);
    right: 0;
    width: 280px;
    background: white;
    border-radius: var(--radius-lg);
    box-shadow: 0 10px 40px rgba(0,0,0,0.2);
    z-index: 100;
    overflow: hidden;
    animation: dropdownSlide 0.2s ease;
}

@keyframes dropdownSlide {
    from { opacity: 0; transform: translateY(-8px); }
    to { opacity: 1; transform: translateY(0); }
}

.thinking-toggle.open .thinking-dropdown {
    display: block;
}

.thinking-dropdown-header {
    padding: var(--spacing-md);
    background: #f5f3ff;
    border-bottom: 1px solid #e5e7eb;
}

.thinking-dropdown-header strong {
    display: block;
    color: #5b21b6;
    font-size: var(--font-size-sm);
    margin-bottom: 4px;
}

.thinking-dropdown-header p {
    color: var(--text-secondary);
    font-size: var(--font-size-xs);
    margin: 0;
}

.thinking-option {
    padding: var(--spacing-sm) var(--spacing-md);
    cursor: pointer;
    transition: var(--transition);
    border-bottom: 1px solid #f3f4f6;
}

.thinking-option:last-child {
    border-bottom: none;
}

.thinking-option:hover {
    background: #f9fafb;
}

.thinking-option.active {
    background: #f5f3ff;
    border-left: 3px solid #7c3aed;
}

.thinking-option-header {
    display: flex;
    align-items: center;
    gap: var(--spacing-sm);
    margin-bottom: 4px;
}

.thinking-option-icon {
    font-size: 16px;
}

.thinking-option-name {
    font-weight: 600;
    color: var(--text-primary);
    font-size: var(--font-size-sm);
}

.thinking-option-badge {
    font-size: 10px;
    padding: 2px 6px;
    background: #7c3aed;
    color: white;
    border-radius: var(--radius-sm);
    font-weight: 500;
}

.thinking-option-desc {
    color: var(--text-secondary);
    font-size: var(--font-size-xs);
    margin: 0;
    line-height: 1.4;
}

@media (max-width: 768px) {
    .thinking-label {
        display: none;
    }

    .thinking-dropdown {
        width: 260px;
        right: -60px;
    }
}

/* ============================================
   Model Info Button & Modal
   ============================================ */
@@ -978,6 +1123,44 @@
<span style="font-size: 1.5rem;">🤖</span>
<h1>NordaGPT</h1>
<span class="chat-header-badge">Gemini 3</span>
<!-- Thinking Mode Toggle -->
<div class="thinking-toggle" id="thinkingToggle">
    <button class="thinking-btn" onclick="toggleThinkingDropdown()" title="Tryb rozumowania AI">
        <span class="thinking-icon">🧠</span>
        <span class="thinking-label" id="thinkingLabel">Wysoki</span>
        <svg class="thinking-arrow" fill="none" stroke="currentColor" viewBox="0 0 24 24" width="12" height="12">
            <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M19 9l-7 7-7-7"/>
        </svg>
    </button>
    <div class="thinking-dropdown" id="thinkingDropdown">
        <div class="thinking-dropdown-header">
            <strong>Tryb rozumowania AI</strong>
            <p>Określa głębokość analizy przed odpowiedzią</p>
        </div>
        <div class="thinking-option" data-level="minimal" onclick="setThinkingLevel('minimal')">
            <div class="thinking-option-header">
                <span class="thinking-option-icon">⚡</span>
                <span class="thinking-option-name">Błyskawiczny</span>
            </div>
            <p class="thinking-option-desc">Najszybsze odpowiedzi. Dla prostych pytań typu "kto?", "gdzie?".</p>
        </div>
        <div class="thinking-option" data-level="low" onclick="setThinkingLevel('low')">
            <div class="thinking-option-header">
                <span class="thinking-option-icon">🚀</span>
                <span class="thinking-option-name">Szybki</span>
            </div>
            <p class="thinking-option-desc">Zrównoważony. Dobre dla większości pytań o firmy i usługi.</p>
        </div>
        <div class="thinking-option active" data-level="high" onclick="setThinkingLevel('high')">
            <div class="thinking-option-header">
                <span class="thinking-option-icon">🧠</span>
                <span class="thinking-option-name">Głęboki</span>
                <span class="thinking-option-badge">Domyślny</span>
            </div>
            <p class="thinking-option-desc">Maksymalna analiza. Dla złożonych pytań, rekomendacji, strategii.</p>
        </div>
    </div>
</div>
<button class="model-info-btn" onclick="openModelInfoModal()" title="Informacje o modelu AI">
    <svg fill="none" stroke="currentColor" viewBox="0 0 24 24" width="16" height="16">
        <path stroke-linecap="round" stroke-linejoin="round" stroke-width="2" d="M13 16h-1v-4h-1m1-4h.01M21 12a9 9 0 11-18 0 9 9 0 0118 0z"/>
@@ -1189,6 +1372,61 @@
// NordaGPT Chat - State
let currentConversationId = null;
let conversations = [];
let currentThinkingLevel = 'high'; // Default thinking level

// ============================================
// Thinking Mode Toggle Functions
// ============================================
const THINKING_LABELS = {
    'minimal': 'Błyskawiczny',
    'low': 'Szybki',
    'high': 'Głęboki'
};

function toggleThinkingDropdown() {
    const toggle = document.getElementById('thinkingToggle');
    toggle.classList.toggle('open');
}

function setThinkingLevel(level) {
    currentThinkingLevel = level;

    // Update UI
    document.getElementById('thinkingLabel').textContent = THINKING_LABELS[level] || level;

    // Update active state
    document.querySelectorAll('.thinking-option').forEach(opt => {
        opt.classList.toggle('active', opt.dataset.level === level);
    });

    // Close dropdown
    document.getElementById('thinkingToggle').classList.remove('open');

    // Save preference to server
    saveThinkingPreference(level);

    console.log('Thinking level set to:', level);
}

async function saveThinkingPreference(level) {
    try {
        await fetch('/api/chat/settings', {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify({ thinking_level: level })
        });
    } catch (error) {
        console.error('Failed to save thinking preference:', error);
    }
}

// Close thinking dropdown when clicking outside
document.addEventListener('click', function(e) {
    const toggle = document.getElementById('thinkingToggle');
    if (toggle && !toggle.contains(e.target)) {
        toggle.classList.remove('open');
    }
});

// ============================================
// Model Info Modal Functions
@@ -1255,8 +1493,26 @@ document.addEventListener('click', function(e) {
document.addEventListener('DOMContentLoaded', function() {
    loadConversations();
    autoResizeTextarea();
    loadThinkingSettings();
});

// Load thinking settings from server
async function loadThinkingSettings() {
    try {
        const response = await fetch('/api/chat/settings');
        const data = await response.json();
        if (data.success && data.thinking_level) {
            currentThinkingLevel = data.thinking_level;
            document.getElementById('thinkingLabel').textContent = THINKING_LABELS[data.thinking_level] || data.thinking_level;
            document.querySelectorAll('.thinking-option').forEach(opt => {
                opt.classList.toggle('active', opt.dataset.level === data.thinking_level);
            });
        }
    } catch (error) {
        console.log('Using default thinking level:', currentThinkingLevel);
    }
}

// Load conversations list
async function loadConversations() {
    try {
@@ -1454,11 +1710,14 @@ async function sendMessage() {
        }
    }

    // Send message
    // Send message with thinking level
    const response = await fetch(`/api/chat/${currentConversationId}/message`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ message: message })
        body: JSON.stringify({
            message: message,
            thinking_level: currentThinkingLevel
        })
    });
    const data = await response.json();