Commit Graph

35 Commits

Author SHA1 Message Date
5030b71beb chore: update Author to Maciej Pienczyn, InPi sp. z o.o. across all files
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:20:47 +02:00
660ed68a0d feat: Twitter/X data fetching via guest token GraphQL API
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Add twitter_service.py using Twitter's internal GraphQL API with
guest token authentication (free, no API key needed). Fetches
followers, tweets, bio, location, media count, and more.

Integrated into social audit enrichment with _twitter_extra data
stored in content_types JSONB, displayed on audit detail cards.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:25:41 +01:00
fe288f0441 feat: YouTube Data API v3 integration for social media enrichment
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- YouTubeService now fetches: subscribers, views, video count, description,
  avatar, banner, country, creation date, recent 5 videos
- Enricher uses API first, falls back to scraping
- Extra YouTube data stored in content_types JSONB
- Audit detail shows view count, country, creation date, recent videos
- Requires enabling YouTube Data API v3 in Google Cloud Console

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 14:01:15 +01:00
2c9a45230d feat: LinkedIn scraper retry with random delays + authwall detection
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
3 attempts with 2-5s random delay between retries. Detects authwall
and rate limit (429/999) responses. Updated status message to explain
LinkedIn's inconsistent availability to users.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:41:50 +01:00
ce0a6863d2 fix: decode HTML entities in social audit, replace browser dialogs with modals
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Fix checkmarks showing as &#10003; by using Unicode ✓/✗ directly
- Decode HTML entities (&#39; &amp;) from og:meta in enricher results
- Replace native confirm()/alert() with styled modal dialogs and toasts

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 12:31:02 +01:00
d3c81e4880 feat: dual data sources for social audit with provenance indicators
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Scraper no longer overwrites API data (source priority hierarchy)
- Per-platform data provenance badges (API OAuth/Scraping/Manual/Unknown)
- Expandable field-level source breakdown (which fields from API vs scraping)
- OAuth status per platform with connect/renew/sync links
- "Run audit" button on dashboard (background enrichment for all companies)
- "Run audit" button on detail view (single company enrichment)
- Enrichment progress polling with real-time status updates

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 08:37:02 +01:00
ce9513b4bb fix: Remove Brave Search from social media audit — too many false positives
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Brave Search matched unrelated companies by name token (e.g. VINDOR matched
vindorclothing, vindormusic, beautybyneyador). Social media profiles are now
sourced only from website scraping and manual admin entry.

- Disabled BraveSearcher initialization and call in audit_company()
- Removed Brave Search step from audit progress animation
- Updated missing profile message with explanation and link to profile editor
- Added migration 071 to clean up existing brave_search entries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:20:52 +01:00
f2f65abca2 revert: Remove city-aware token matching, keep handle-based exclude fix
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
City tokens caused too many false positives (matching any business from
the same city). Reverted to name-only matching. The exclude fix
(checking handle instead of full URL substring) is preserved as it
fixes a genuine bug where 'p' in exclude list matched any URL.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 18:05:40 +01:00
9892e07b26 fix: Check excludes against extracted handle, not full URL
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
The 'p' exclude in SOCIAL_MEDIA_EXCLUDE matched any URL containing
the letter 'p' (e.g. ?locale=pl_PL). Now extracts handle first
and checks exclusion with exact match on first path segment.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 17:52:16 +01:00
304f0a5002 fix: Add city-aware token matching to social media Brave Search validator
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Company name tokens have weight 2, city tokens weight 1. City-only
matches accepted only for top-3 Brave results to prevent false positives.
Fixes detection of facebook.com/itwejherowo for Informatyk1 (Wejherowo).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 17:43:00 +01:00
a0c85b6182 fix: Remove direct LinkedIn URL check - risk of false positives
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
Direct slug check (e.g. linkedin.com/company/waterm) can match
a different company with the same name. LinkedIn public metadata
is too minimal to verify location/industry without API access.
Rely on Brave Search with title/description validation instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 12:16:19 +01:00
990c6537cb fix: Prioritize LinkedIn company pages over personal profiles
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Add direct URL check for linkedin.com/company/{slug} before Brave Search
- Prioritize /company/ over /in/ in search result ranking
- Use targeted query "company_name linkedin.com/company" first
- Fall back to personal profile search only if company page not found
- Verify page title matches company name to avoid false positives

Fixes: WATERM showed employee's personal profile instead of existing
company page at linkedin.com/company/waterm

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 11:08:13 +01:00
6633b94644 fix: Implement Brave Search for LinkedIn detection and fix URL construction
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Replace placeholder _search_brave() with real Brave API integration
- Fix LinkedIn URL construction: /in/ profiles were incorrectly built as /company/
- Add word-boundary matching to validate search results against company name
- Track source (website_scrape vs brave_search) per platform in audit results
- Increase search results from 5 to 10 for better coverage

Fixes: WATERM LinkedIn profile not detected (website has no LinkedIn link,
but Brave Search finds the personal /in/ profile)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 10:52:09 +01:00
fae99a3137 fix(social-audit): Add 'tr' (Meta Pixel) and other FB endpoints to exclude list
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
The scraper was matching facebook.com/tr (Meta Pixel tracking endpoint)
as a valid Facebook profile handle. Added 'tr', 'privacy', 'policies',
'ads', 'business', 'legal', 'flx' to the Facebook exclusion list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-12 16:44:53 +01:00
70e40d133b feat(oauth): Add OAuth integration UI, API clients, and audit enrichment (Phase 3)
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Company settings page with 4 OAuth cards (GBP, Search Console, Facebook, Instagram)
- 3 API service clients: GBP Management, Search Console, Facebook Graph
- OAuth enrichment in GBP audit (owner responses, posts), social media (FB/IG Graph API),
  and SEO prompt (Search Console data)
- Fix OAuth callback redirects to point to company settings page
- All integrations have graceful fallback when no OAuth credentials configured

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 15:55:02 +01:00
b1438dd514 feat(audit): Phase 0 quick wins - fix bugs, enrich AI prompts, add metrics
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
GBP audit:
- Fix review_response_rate bug: check ownerResponse instead of authorAttribution.displayName
- Mark has_posts/has_products/has_qa as OAuth-dependent in AI prompt
- Add review_keywords and description_keywords to AI prompt

SEO audit:
- Replace deprecated FID with INP (Core Web Vital since March 2024)
- Pass 10 additional metrics to AI prompt: FCP, TTFB, TBT, Speed Index,
  meta title/desc length, html lang, Schema.org field details
- Update templates with INP thresholds (200ms/500ms)

Social media audit:
- Calculate engagement_rate from industry base rates × activity multiplier
- Calculate posting_frequency_score (0-10 based on posts_count_30d)
- Enrich AI prompt with page_name, freq_score, engagement, last_post_date
- Add avg engagement rate and brand name consistency check to prompt

Completeness: 52% → ~68% (estimated)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 11:24:03 +01:00
42ddeabf2a feat(backend): Add enhanced audit models and scraper improvements
- database.py: GBPReview, CompanyCitation, CompanyCompetitor, CompetitorSnapshot, AuditReport models
- gbp_audit_service.py: Enhanced review analysis, NAP consistency, keyword analysis
- scripts/seo_audit.py: Core Web Vitals, heading/image/link analysis, SSL, analytics detection
- scripts/social_media_audit.py: Profile enrichment, content types, posting frequency

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-07 12:00:42 +01:00
60d28a5c24 fix(social-audit): Fix Facebook URL truncation and improve scraping patterns
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
- Add regex pattern for Facebook /p/PageName-ID/ multi-segment URLs
- Add 'p' to Facebook exclusion list (bare /p is always truncated)
- Add minimum length validation for extracted social handles
- Strip Instagram tracking params (?igsh=, &utm_source=) from handles

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 19:29:06 +01:00
14ce54d8b5 fix(social-audit): Fix Facebook profile.php URLs being saved without ID
Some checks are pending
NordaBiz Tests / Unit & Integration Tests (push) Waiting to run
NordaBiz Tests / E2E Tests (Playwright) (push) Blocked by required conditions
NordaBiz Tests / Smoke Tests (Production) (push) Blocked by required conditions
NordaBiz Tests / Send Failure Notification (push) Blocked by required conditions
The regex was capturing 'profile.php' as a username instead of extracting
the numeric ID from profile.php?id=XXX links. Added dedicated pattern for
profile.php URLs and added profile.php to exclusion list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-06 18:03:01 +01:00
986360f7d5 feat: Add URL normalization and inline audit sections
- Add normalize_social_url() function to database.py to prevent
  www vs non-www duplicates in social media records
- Update update_social_media.py to normalize URLs before insert
- Update social_media_audit.py to normalize URLs before insert
- Add inline GBP Audit section to company profile
- Add inline Social Media Audit section to company profile
- Add inline IT Audit section to company profile

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 23:07:03 +01:00
b4dcca6d55 auto-claude: 2.3 - Replace hardcoded password in scripts/social_media_audit.py with safe fallback 2026-01-10 12:50:39 +01:00
39cd257f4e Fix YouTube detection overwriting valid matches
- Add 'channel', 'c', 'user', '@' etc. to YouTube exclusion list
- Add 'bold_themes', 'boldthemes' to Twitter/Facebook exclusions (theme creators)
- Fix pattern matching loop to stop after first valid match per platform
- Prevents fallback pattern from overwriting correct channel ID with 'channel'

Fixes issue where youtube.com/channel/ID was being overwritten with
youtube.com/channel/channel by the second fallback pattern.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:36:06 +01:00
c319777d58 Social Media audit: progress bar improvements
- Add detailed logging to SocialMediaAuditor (website scan, Brave search, results)
- Slow down progress bar animation (400ms instead of 200ms) for better readability
- Bold "ZNALEZIONO" text for found platforms
- Display Google rating and review count in progress
- Increase wait time before modal close (4 seconds)
- Add console.log for debugging audit response

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:29:17 +01:00
8fed190303 fix(social-audit): Convert opening_hours dict to JSON for JSONB column
Fixes: psycopg2.ProgrammingError: can't adapt type 'dict'

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-09 05:14:01 +01:00
5ed97ac1dd auto-claude: subtask-5-1 - Fix opening_hours and photos data passing in audit_company
Fixed a bug where google_opening_hours and google_photos_count were being
fetched from the Google Places API but not passed through to the result
dictionary correctly:

- Changed 'opening_hours' key to 'google_opening_hours' to match what
  save_audit_result() expects
- Added 'google_photos_count' to the result dictionary

Verified with dry-run: INPI company now shows opening hours schedule
and 10 photos count from Google Business Profile.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 23:08:19 +01:00
aacf2cf54b auto-claude: subtask-2-3 - Update save_audit_result() to store google_opening_hours and google_photos_count
- Added google_opening_hours and google_photos_count to INSERT column list
- Added corresponding placeholders to VALUES list
- Added to ON CONFLICT UPDATE SET clause
- Added to parameter dictionary reading from google_reviews result

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 23:00:22 +01:00
5f2cfa06fd auto-claude: subtask-2-2 - Update get_place_details() to return photos count
- Add google_photos_count to result dictionary initialization
- Extract photos count from API response using len(place['photos'])
- Update logging to include photos count in output

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 22:59:04 +01:00
5fa80f9efa auto-claude: subtask-2-1 - Add 'photos' to fields list in GooglePlacesSearcher
Added 'photos' field to the fields list in get_place_details() method
to enable fetching business photos from Google Places API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 22:58:04 +01:00
06c22539d7 auto-claude: subtask-4-2 - Add --company-slug support and dotenv loading
- Add --company-slug argument to social_media_audit.py for easier testing
- Add get_company_id_by_slug() method to SocialMediaAuditor class
- Add python-dotenv support to load .env file from project root
- Create verify_google_places.py script for direct API testing

Note: Full verification blocked - current API key (PageSpeed) doesn't have
Places API enabled. Requires enabling Places API in Google Cloud Console
for project NORDABIZNES (gen-lang-client-0540794446).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:49:59 +01:00
3d69d53550 auto-claude: subtask-3-4 - Run all tests and verify they pass
Fixed bug in social media exclusion logic that was too aggressive.
The substring check `any(ex in match.lower() for ex in excludes)`
was incorrectly excluding valid usernames containing exclusion
strings (e.g., 'testcompany' was excluded because it contained 'p').

Changed to exact match only to properly handle Instagram post URLs
(`instagram.com/p/...`) without false positives on valid usernames.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:41:54 +01:00
3bdbde1621 auto-claude: subtask-2-3 - Update SocialMediaAuditor to use GooglePlacesSearcher
- Add google_places_searcher attribute to SocialMediaAuditor
- Initialize GooglePlacesSearcher if GOOGLE_PLACES_API_KEY env var is set
- Update audit_company() to use Places API directly when available
- Fallback to Brave Search when API key not configured
- Log which data source is being used for reviews

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:31:22 +01:00
b389287697 auto-claude: subtask-2-2 - Replace placeholder search_google_reviews() method
Implemented actual Google reviews data collection in BraveSearcher class:
- Uses GooglePlacesSearcher to find company and get place details
- Returns google_rating, google_reviews_count, opening_hours, business_status
- Falls back to Brave Search API parsing when Google API key not available
- Added _search_brave_for_reviews() helper for fallback implementation
- Proper error handling and logging throughout

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:29:50 +01:00
4110ef63b5 auto-claude: subtask-2-1 - Add GooglePlacesSearcher class to social_media_audit.py
Implements GooglePlacesSearcher class with:
- find_place() method: searches for business by name and city
  using Google Places findplacefromtext API
- get_place_details() method: retrieves rating, review count,
  opening hours, business status, phone, and website

Features:
- Uses GOOGLE_PLACES_API_KEY environment variable
- Comprehensive error handling (timeout, request errors)
- Polish language locale support
- Follows existing BraveSearcher class pattern

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 20:27:49 +01:00
af003798a7 Napraw połączenie z bazą danych w skryptach SEO (localhost zamiast zewnętrznego IP) 2026-01-08 15:57:39 +01:00
02fc67bf40 Initial commit 2026-01-01 14:01:49 +01:00