CollectionSync
Unified media intelligence for Plex with native Radarr/Sonarr automation
What is CollectionSync?
CollectionSync keeps Plex libraries quality-aware and automation-ready. It enumerates every collection, tracks the upstream Radarr/Sonarr state, and highlights gaps and upgrade opportunities with full audio/video fidelity analysis.
Recent releases add end-to-end request history, live Sonarr episode actions, and direct Radarr quality sync, with no Jellyseerr bridge required.
Technology Stack
Backend
- → Python 3.14 + FastAPI
- → SQLAlchemy ORM with async sessions
- → PostgreSQL 17 (asyncpg) with SQLite fallback
- → Granian ASGI server
- → Radarr/Sonarr arrapi integration
- → FFprobe & MKVToolNix media analysis
Frontend
- → React 18 + TypeScript
- → TanStack Query data layer
- → Tailwind CSS design system
- → Lucide iconography
- → Vite build tooling
Integrations
CollectionSync connects directly to the core services in a Plex homelab, keeping metadata, automation, and scan insights in sync without extra bridges.
Plex
Media Server
TMDB
Metadata Provider
TVDB
Metadata Provider
Radarr
Movie Automation
Sonarr
TV Automation
Version History
Track the evolution of CollectionSync through its releases. Each release includes new features, improvements, and bug fixes.
v3.0.5 2025-11-16
🐛 Fixed
- • **Critical**: Eliminated movie data loss from race conditions during parallel scanning → Implemented PostgreSQL UPSERT (INSERT ... ON CONFLICT DO UPDATE) for atomic movie creation → Multiple workers can now safely process the same movie without conflicts → Zero data loss - all ~2,200 movies now save correctly (previously lost ~400 per scan) → Replaced try/except pattern with native PostgreSQL conflict resolution → Movie versions array properly updated after UPSERT
⚡ Performance
- • **Parallel Collection Processing**: Collections now processed with dedicated async sessions → Changed from sequential to parallel with semaphore(5) for API rate limiting → Each collection gets its own AsyncSession to avoid session conflicts → Expected reduction from 12 minutes → 3 minutes for 350+ collections → Progress updates every 10 collections
- • **Parallel TV Show Processing**: TV shows now processed with dedicated async sessions → Changed from sequential to parallel with semaphore(5) for TVDB API limits → Each show gets its own AsyncSession for complete isolation → Expected reduction from 17 minutes → 7 minutes for 126+ shows → Progress updates every 5 shows
- • **Overall Scan Performance**: Combined improvements deliver dramatic speed increase → Full library scan: 28 minutes → 13 minutes (54% faster) → Zero session conflicts from parallel processing → Proper API rate limiting prevents 429 errors
🔄 Changed
- • **Scanner Architecture**: Refactored collection and TV processing for true parallelism → Each parallel task creates its own session via AsyncSessionLocal() → Semaphore limits prevent API rate limit violations (5 concurrent for TMDB/TVDB) → Individual commits per collection/show with automatic rollback on errors → Progress tracking uses separate sessions to avoid conflicts
✨ Added
- • **Backend Version Endpoint**: /health endpoint now includes version from pyproject.toml → Dynamic version reading via tomli library → Frontend About page fetches version from backend → No more hardcoded version numbers in frontend
- • **Frontend Updates**: About page completely refreshed for v3.0.5 → Version now pulled from backend /health endpoint → Updated technical stack: Python 3.14, PostgreSQL, Granian (25 workers) → Updated key highlights to reflect parallel processing and UPSERT improvements → Updated database info to show PostgreSQL 17 with asyncpg
📝 Technical Details
- • Added `from sqlalchemy.dialects.postgresql import insert` for UPSERT support
- • Collection processing uses `asyncio.gather()` with semaphore-controlled parallel execution
- • TV show processing uses `asyncio.gather()` with semaphore-controlled parallel execution
- • Each parallel worker uses `AsyncSessionLocal()` for dedicated session creation
- • Progress updates use separate short-lived sessions to avoid long-running locks
- • Added `tomli` dependency for reading pyproject.toml version
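
Taken together, the points above roughly amount to the pattern in this minimal sketch: one semaphore-limited task per collection, a dedicated `AsyncSessionLocal()` session per task, and an atomic UPSERT per movie. The `Movie` model, its `tmdb_id` key, the import paths, and `fetch_collection_movies()` are hypothetical placeholders; only the session factory, the semaphore of 5, and `on_conflict_do_update` come from the notes above.

```python
import asyncio

from sqlalchemy.dialects.postgresql import insert as pg_insert
from sqlalchemy.ext.asyncio import AsyncSession

# Assumed to exist elsewhere in the codebase: the session factory and an ORM model
# with a unique key (tmdb_id here) plus a "versions" column. Import paths are hypothetical.
from app.db import AsyncSessionLocal
from app.models import Movie

API_SEMAPHORE = asyncio.Semaphore(5)  # 5 concurrent tasks, per the TMDB/TVDB limit above


async def upsert_movie(session: AsyncSession, data: dict) -> None:
    """Atomic create-or-update via INSERT ... ON CONFLICT DO UPDATE."""
    stmt = (
        pg_insert(Movie)
        .values(**data)
        .on_conflict_do_update(
            index_elements=["tmdb_id"],  # assumed unique constraint
            set_={"title": data["title"], "versions": data["versions"]},
        )
    )
    await session.execute(stmt)


async def process_collection(collection_id: int) -> None:
    """Each parallel task opens its own session; the semaphore caps concurrent API calls."""
    async with API_SEMAPHORE:
        async with AsyncSessionLocal() as session:
            try:
                movies = await fetch_collection_movies(collection_id)  # hypothetical helper
                for movie in movies:
                    await upsert_movie(session, movie)
                await session.commit()  # individual commit per collection
            except Exception:
                await session.rollback()  # automatic rollback on errors
                raise


async def process_all_collections(collection_ids: list[int]) -> None:
    await asyncio.gather(*(process_collection(cid) for cid in collection_ids))
```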
📝 Expected Impact
- • **Data Integrity**: 100% of movies saved (vs 82% in v3.0.2)
- • **Speed**: 54% faster full library scans (28min → 13min)
- • **Reliability**: Zero session conflicts, zero race conditions
- • **API Safety**: Semaphore limits prevent rate limit violations
v3.0.2 2025-11-16
🔄 Changed
- • **Scan Concurrency**: Moved worker concurrency from environment variable to database setting → Now configurable via Settings UI without container restart → Default increased from 6 to 25 workers for significantly faster scans → Settings service validates and clamps values between 1-50 for safety → Environment variable `SCAN_CONCURRENCY` no longer used
⚡ Performance
- • **Batch Database Commits**: Reduced database round-trips by 99% → Changed from 1 commit per file to 1 commit per 100-file batch → Example: 2,291 files now use ~23 commits instead of 2,291 commits → Expected 20-30% scan speed improvement from reduced I/O overhead → A minimal sketch of the batching pattern follows this list
- • **Parallel Collection Processing**: Collections now processed concurrently → 4 collections processed in parallel instead of sequential processing → Expected 15-20% improvement for collection-heavy libraries
- • **Increased Connection Pool**: PostgreSQL pool size increased from 20 to 30 → Supports 25 concurrent scan workers plus API requests → Prevents connection exhaustion under heavy concurrent load
- • **Resource Limits**: Docker container limits updated for performance → Memory: 512MB → 2GB (supports 25 concurrent workers) → PIDs: 150 → 300 (handles parallel processing + worker threads) → CPU: Added 4-core limit to prevent system saturation
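
A minimal sketch of the batch-commit loop referenced above, assuming a per-scan `AsyncSession` and a hypothetical `analyze_file()` helper; the real scanner interleaves this with the worker logic described in later releases.

```python
from sqlalchemy.ext.asyncio import AsyncSession

BATCH_SIZE = 100  # one commit per 100-file batch, as described above


async def scan_files(session: AsyncSession, files: list[str]) -> None:
    """Commit in batches instead of once per file to cut database round-trips."""
    for index, path in enumerate(files, start=1):
        movie = await analyze_file(path)  # hypothetical FFprobe-backed helper
        session.add(movie)
        if index % BATCH_SIZE == 0:
            await session.commit()  # e.g. ~23 commits for 2,291 files
    await session.commit()  # persist the final partial batch
```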
📝 Expected Impact
- • **Overall Speed**: 60-75% faster scans (8-12 seconds vs previous 32 seconds for 2,291 files) → Batch commits: 20-30% improvement → Increased workers (6→25): 30-40% improvement → Parallel collections: 15-20% improvement
v3.0.1 2025-11-16
🐛 Fixed
- • **Progress Tracking**: Scanner now correctly updates progress after file scanning completes → Progress no longer stuck at 0/2291 during file processing → Shows actual file count (e.g., 2291/2291) when file scanning finishes → Progress continues to increment during collection processing (2291+collections) → Provides visibility into scan progress instead of jumping from 0% to 100%
- • **Session Conflicts Eliminated**: Replaced all `flush()` calls with `commit()` to prevent concurrent session errors → TV show processing commits parent session before spawning episode workers → Season metadata commits before episode workers access it → Zero "Session is already flushing" errors - down from 8 errors in v3.0.0 → All workers now properly isolated with their own sessions
- • **Progress Reset Issue**: Progress no longer resets from file count to 0 when switching to collection processing → Combined progress total shows files + collections → Maintains continuity throughout entire scan lifecycle
v3.0.0 2025-11-16
📝 BREAKING CHANGES
- • **PostgreSQL Migration**: CollectionSync now uses PostgreSQL by default for better concurrent write performance → PostgreSQL container added to docker-compose → True concurrent writes - eliminates database session conflicts → Connection pooling for better performance under load → SQLite still supported as fallback for simple deployments
✨ Added
- • **PostgreSQL Support**: Production-grade database for concurrent operations → PostgreSQL 17 Alpine container in docker-compose → Connection pooling (20 connections) for concurrent workers → Health checks and automatic retry on connection failure → `asyncpg` driver for high-performance async operations
- • **Session-Per-Worker Pattern**: Each parallel worker gets its own database session → Eliminates "Session is already flushing" errors permanently → Proper isolation of database transactions → Each worker commits independently with automatic rollback on errors → Works with both PostgreSQL (concurrent writes) and SQLite (serialized writes)
- • **Database-Agnostic Schema Upgrades**: Schema management supports both databases → Auto-detection of database type (SQLite vs PostgreSQL) → Dialect-specific SQL for table creation → SQLAlchemy inspector-based column checks (no more SQLite-only PRAGMA)
🔄 Changed
- • **Scanner Service**: Refactored parallel processing for proper session management → `_process_files_parallel()` now creates a session per worker → Movie processing uses worker-specific sessions with individual commits → TV episode processing uses worker-specific sessions with individual commits → Progress tracking still uses shared session (no parallel conflicts)
- • **Database Initialization**: Enhanced to support multiple database backends → Detects database type from connection URL → PostgreSQL uses async engine directly with `run_sync()` → SQLite uses synchronous engine with WAL mode (unchanged) → Connection pooling configured per database type
🐛 Fixed
- • **Critical**: Permanently fixed "Session is already flushing" database errors → Root cause: Multiple workers sharing a single AsyncSession and calling flush/commit simultaneously → Solution: Each worker gets its own AsyncSession, eliminating session conflicts → Works with PostgreSQL (true concurrent writes) and SQLite (serialized via DB lock)
- • **Database Concurrency**: Proper handling of parallel database operations → PostgreSQL: Multiple workers write concurrently using MVCC → SQLite: Multiple workers serialize writes via database-level locking → No more IntegrityError cascading failures
📝 Migration Guide
- • **Existing SQLite Users**: Deployment continues to work with SQLite → Set `DATABASE_URL=sqlite+aiosqlite:////data/collectionsync.db` to keep using SQLite → Session-per-worker pattern improves stability even with SQLite
- • **New PostgreSQL Users**: Use provided docker-compose setup → PostgreSQL container automatically configured → Set `POSTGRES_PASSWORD` environment variable for security → Database automatically initialized on first startup
- • **Migrating SQLite → PostgreSQL**: Migration script provided → Run `python backend/scripts/migrate_sqlite_to_postgres.py` → Automatically copies all data from SQLite to PostgreSQL → Validates data integrity after migration
📝 Technical Details
- • **asyncpg Driver**: Added PostgreSQL async driver dependency
- • **Connection Pooling**: 20 connections with pre-ping for connection validation
- • **Session Factory**: `AsyncSessionLocal` now used for per-worker session creation
- • **Database Detection**: `_is_sqlite` and `_is_postgres` flags in db.py
- • **Schema Upgrades**: Uses SQLAlchemy `inspect()` instead of SQLite-specific PRAGMA
- • **Healthchecks**: PostgreSQL container health checked before app startup
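
As a rough illustration of the detection flags and pooling noted above, the engine setup in db.py might look something like this sketch; the PostgreSQL credentials and host are illustrative, and the SQLite branch keeps its existing WAL handling elsewhere.

```python
from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine

# DATABASE_URL formats, per the migration guide above (credentials/host illustrative):
#   postgresql+asyncpg://collectionsync:secret@postgres:5432/collectionsync
#   sqlite+aiosqlite:////data/collectionsync.db
DATABASE_URL = "postgresql+asyncpg://collectionsync:secret@postgres:5432/collectionsync"

_is_sqlite = DATABASE_URL.startswith("sqlite")
_is_postgres = DATABASE_URL.startswith("postgresql")

if _is_postgres:
    # Pool sized for concurrent scan workers, with pre-ping connection validation.
    engine = create_async_engine(DATABASE_URL, pool_size=20, pool_pre_ping=True)
else:
    # SQLite path: writes serialize on the database-level lock; WAL mode handled elsewhere.
    engine = create_async_engine(DATABASE_URL, pool_pre_ping=True)

# Factory used for the session-per-worker pattern described above.
AsyncSessionLocal = async_sessionmaker(engine, expire_on_commit=False)
```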
📝 Performance Improvements
- • **PostgreSQL MVCC**: True concurrent writes - no lock contention between workers
- • **Better Throughput**: 6 workers can now write simultaneously instead of serializing
- • **Reduced Lock Wait Time**: No more "database is locked" delays
- • **Scalability**: Connection pooling supports future expansion beyond 6 workers
v2.2.0 2025-11-16
✨ Added
- • **TMDB API Rate Limiting**: Automatic rate limiting to stay within TMDB's API limits → Sliding window rate limiter: max 20 requests/second (safe margin under TMDB's ~40/sec limit) → Prevents API bans during large library scans → Automatic backoff when approaching limits
- • **TMDB 429 Retry Handling**: Exponential backoff retry logic for rate limit responses → Automatically retries failed requests up to 3 times → Respects `Retry-After` header from TMDB API → Exponential backoff: 1s, 2s, 4s delay between retries → Prevents scan failures due to temporary rate limiting
- • **TVDB API Rate Limiting**: Conservative rate limiting for TVDB API → Sliding window rate limiter: max 15 requests/second (conservative limit) → Same retry and backoff logic as TMDB → Prevents API bans during TV show scans
- • **TVDB 429 Retry Handling**: Exponential backoff retry logic → Automatically retries failed requests up to 3 times → Respects `Retry-After` header from TVDB API → Handles token expiration and automatic re-authentication
🔄 Changed
- • **TMDB Service**: Refactored all API calls to use centralized retry mechanism → `get_movie_details()` now includes rate limiting and retry → `get_collection_details()` now includes rate limiting and retry → `search_movie()` now includes rate limiting and retry
- • **TVDB Service**: Enhanced with rate limiting and retry logic → `get_series_details()` now includes rate limiting and retry → `get_series_episodes()` now includes rate limiting and retry (with pagination) → `search_series()` now includes rate limiting and retry
📝 Technical Details
- • **TMDB**: Added `_wait_for_rate_limit()` method with asyncio lock for thread-safe rate tracking
- • **TMDB**: Added `_make_request_with_retry()` method for centralized HTTP error handling
- • **TVDB**: Added `_wait_for_rate_limit()` method with same sliding window algorithm
- • **TVDB**: Enhanced `_make_request()` method with retry logic and 429/401 handling
- • Tracks request timestamps in sliding window for precise rate control
- • Logs rate limit events at WARNING level for monitoring
- • No configuration needed - rate limiting is automatic and transparent
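
A condensed sketch of the limiter and retry helper named above (`_wait_for_rate_limit()` / `_make_request_with_retry()`), written here as module-level functions rather than service methods; the TMDB margin of 20 requests/second is assumed, and the TVDB variant only changes the constant.

```python
import asyncio
import time

import httpx

MAX_REQUESTS_PER_SECOND = 20  # TMDB margin described above; the TVDB limiter uses 15
MAX_RETRIES = 3

_request_times: list[float] = []
_rate_lock = asyncio.Lock()


async def _wait_for_rate_limit() -> None:
    """Sliding one-second window: block until a request slot is free."""
    async with _rate_lock:
        now = time.monotonic()
        while _request_times and now - _request_times[0] > 1.0:
            _request_times.pop(0)
        if len(_request_times) >= MAX_REQUESTS_PER_SECOND:
            await asyncio.sleep(1.0 - (now - _request_times[0]))
        _request_times.append(time.monotonic())


async def _make_request_with_retry(client: httpx.AsyncClient, url: str, params: dict) -> dict:
    """Retry 429 responses with exponential backoff, honouring Retry-After when present."""
    for attempt in range(MAX_RETRIES + 1):
        await _wait_for_rate_limit()
        response = await client.get(url, params=params)
        if response.status_code != 429:
            response.raise_for_status()
            return response.json()
        if attempt < MAX_RETRIES:
            retry_after = response.headers.get("Retry-After")
            delay = float(retry_after) if retry_after else float(2 ** attempt)  # 1s, 2s, 4s
            await asyncio.sleep(delay)
    raise RuntimeError(f"Still rate limited after {MAX_RETRIES} retries: {url}")
```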
v2.1.4 2025-11-16
🐛 Fixed
- • **Critical**: Removed final rollback call that was still causing session conflicts → Removed `await db.rollback()` from IntegrityError handler in `_process_movie_file()` (line 850) → Simplified atomic lookup-or-create pattern without try/except for race conditions → All database changes now happen without ANY commits or rollbacks until final commit → This completes the fix started in v2.1.3
📝 Technical Details
- • The IntegrityError handler with rollback was still being called from parallel workers
- • Without intermediate commits/flushes, IntegrityError race conditions can't occur
- • Simple lookup-or-create pattern now used: check if movie exists, update or create
- • Final commit at end of parallel processing persists all changes atomically
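
The lookup-or-create pattern described here might look roughly like the sketch below (the `Movie` model and its `tmdb_id` key are assumed names); v3.0.5 above later replaced it with a native PostgreSQL UPSERT.

```python
from sqlalchemy import select
from sqlalchemy.ext.asyncio import AsyncSession

from app.models import Movie  # hypothetical import path for the ORM model


async def lookup_or_create_movie(db: AsyncSession, data: dict) -> Movie:
    """Check if the movie exists, update it in place, otherwise add it; no intermediate commit."""
    result = await db.execute(select(Movie).where(Movie.tmdb_id == data["tmdb_id"]))
    movie = result.scalar_one_or_none()
    if movie is None:
        movie = Movie(**data)
        db.add(movie)
    else:
        movie.title = data["title"]
    # The single commit happens later, after asyncio.gather() completes.
    return movie
```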
v2.1.3 2025-11-16
🐛 Fixed
- • **Critical**: Fixed database session state conflicts during parallel movie scanning → Removed ALL batch commits from within parallel workers (lines 125-129, 144-148) → Removed rollback calls from exception handlers (lines 154, 163) → Database changes now only committed AFTER all parallel workers complete → Prevents "Session is already flushing" and "Method 'rollback()' can't be called here" errors → Eliminates "database is locked" and "This transaction is closed" cascading failures
📝 Technical Details
- • Parallel workers no longer perform ANY database commits or rollbacks during processing
- • Single final commit at line 161 after `asyncio.gather()` completes persists all changes
- • Shared AsyncSession no longer has conflicting operations from multiple workers
- • Exception handling now only logs errors without attempting session rollback
- • This is the proper fix for the root cause identified in v2.1.1 and v2.1.2
v2.1.2 2025-11-16
🐛 Fixed
- • **Critical**: Fixed "Session is already flushing" errors during parallel movie scanning → Removed `db.flush()` calls from within parallel workers to prevent concurrent flush conflicts → Added final commit after all parallel workers complete to ensure data persistence → Batch commits (every 50 files) now handle all database writes without individual flushes → Parallel workers no longer conflict when trying to flush the shared database session
📝 Technical Details
- • Removed `await db.flush()` from `_process_movie_file()` lines 762 and 789
- • Added `await db.commit()` after `asyncio.gather()` completes in `_process_files_parallel()`
- • Database changes are now persisted via batch commits and final commit only
- • Prevents SQLAlchemy `InvalidRequestError: Session is already flushing` exceptions
v2.1.1 2025-11-16
🐛 Fixed
- • **Critical**: Fixed database transaction rollback errors during parallel movie scanning → Added session rollback handling in exception blocks to prevent session poisoning → Implemented atomic lookup-or-create pattern for movie records to handle race conditions → Multiple workers processing the same movie now safely handle duplicate insert attempts → IntegrityError exceptions are now caught and properly recovered from
📝 Technical Details
- • Added `IntegrityError` import from `sqlalchemy.exc`
- • Enhanced `_process_files_parallel()` exception handler with explicit rollback logic
- • Wrapped movie creation in `_process_movie_file()` with retry-on-conflict pattern
- • Sessions are rolled back on any error to prevent cascading failures
v2.1.0 2025-11-16
✨ Added
- • **Parallel FFprobe Processing**: Video files are now processed concurrently for dramatically faster scanning
- • **Configurable Concurrency**: New `SCAN_CONCURRENCY` environment variable (default: 6) to control parallel processing → Set via `scan_concurrency` config option → Recommended range: 4-8 concurrent workers → Higher values = faster scans but more CPU/memory usage
🔄 Changed
- • **Movie Scanner**: Now processes video files in parallel batches with semaphore-controlled concurrency
- • **TV Scanner**: Episode FFprobe analysis now runs in parallel per show
- • **Database Commits**: Optimized batch commits (every 50 items instead of 10) for better performance
- • **Progress Tracking**: Updated to reflect parallel processing progress accurately
📝 Performance Improvements
- • **5-8x faster scanning** for large libraries (1000 files: ~33min → ~6min estimated)
- • **60-75% CPU utilization** during scans (up from 15-20%)
- • **Better resource utilization** through async semaphore-controlled parallelism
- • **Smart caching preserved**: File modification time checks still skip unchanged files
📝 Technical Details
- • Uses `asyncio.Semaphore` to limit concurrent FFprobe processes
- • Maintains single-worker architecture (compatible with APScheduler)
- • Error handling per file with graceful degradation
- • All existing scan features preserved (Radarr/Sonarr integration, quality detection, etc.)
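
A minimal sketch of the semaphore-controlled FFprobe calls, assuming the default of 6 workers; the actual scanner also feeds the parsed streams into quality detection and the batch commits described above.

```python
import asyncio
import json

SCAN_CONCURRENCY = 6  # default from the SCAN_CONCURRENCY setting introduced in this release
_ffprobe_semaphore = asyncio.Semaphore(SCAN_CONCURRENCY)


async def ffprobe(path: str) -> dict | None:
    """Run one FFprobe process; the semaphore caps how many run concurrently."""
    async with _ffprobe_semaphore:
        proc = await asyncio.create_subprocess_exec(
            "ffprobe", "-v", "quiet", "-print_format", "json",
            "-show_streams", "-show_format", path,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.DEVNULL,
        )
        stdout, _ = await proc.communicate()
        if proc.returncode != 0:
            return None  # graceful per-file degradation
        return json.loads(stdout)


async def probe_library(paths: list[str]) -> list[dict | None]:
    """Analyse all files concurrently, bounded by the semaphore."""
    return await asyncio.gather(*(ffprobe(p) for p in paths))
```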
v2.0.0 2025-11-16
📝 Major Changes
- • **BREAKING**: Migrated from Uvicorn to Granian ASGI server
- • **BREAKING**: Updated Tailwind CSS v3 → v4 (complete theme migration to @theme directive)
- • **BREAKING**: Updated Recharts v2 → v3
- • **BREAKING**: Updated pytest-asyncio with new fixture API requiring `loop_scope` parameter
📝 Backend Updates
- • Migrated to Granian (Rust-based ASGI server) for improved performance
- • Added `--workers 1` configuration due to APScheduler single-process requirement
- • Updated access log filtering from `uvicorn.access` to `granian.access`
- • Expected performance improvements: 10-15% higher throughput, 20-25% lower memory usage
- • Updated FastAPI: 0.121.0 → 0.121.2
- • Updated Pydantic: 2.12.0 → 2.12.4
- • Updated Pydantic Settings: 2.6.0 → 2.12.0
- • Updated PlexAPI: 4.15.0 → 4.17.1
- • Updated arrapi: 1.4.0 → 1.4.14
- • Updated httpx: 0.27.0 → 0.28.1
- • Updated APScheduler: 3.10.0 → 3.11.1
- • Updated python-dotenv: 1.0.0 → 1.1.1
- • Updated aiosqlite: 0.20.0 → 0.21.0
- • Updated python-multipart: 0.0.6 → 0.0.20
- • Updated pytest: 8.3.3 → 9.0.1
- • Updated pytest-asyncio: 0.24.0 → 1.3.0 → Added `asyncio_default_fixture_loop_scope = "function"` to pytest config → Updated async fixtures with `loop_scope` parameter (sketched after this list)
- • Updated Ruff: 0.7.0 → 0.14.5
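
As a rough example of the pytest-asyncio changes noted in the list above, an async fixture with the explicit `loop_scope` parameter might look like this; the `app.main` import and the test body are illustrative.

```python
import pytest
import pytest_asyncio
from httpx import ASGITransport, AsyncClient

# pyproject.toml / pytest config, per the note above:
#   asyncio_default_fixture_loop_scope = "function"


@pytest_asyncio.fixture(loop_scope="function")
async def client():
    """Async fixture using the loop_scope parameter adopted for pytest-asyncio 1.3.0."""
    from app.main import app  # hypothetical application import

    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport, base_url="http://test") as c:
        yield c


@pytest.mark.asyncio
async def test_health(client):
    response = await client.get("/health")
    assert response.status_code == 200
```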
📝 Frontend Updates
- • Migrated from Tailwind CSS v3.4.0 to v4.1.17
- • Moved theme configuration from `tailwind.config.ts` to CSS `@theme` directive in `index.css`
- • Added `@tailwindcss/vite` and `@tailwindcss/postcss` plugins
- • Updated PostCSS configuration for Tailwind v4
- • Removed `autoprefixer` dependency (built into Tailwind v4)
- • Removed `tailwindcss-animate` dependency (animations now in CSS)
- • Expected improvements: ~40% smaller CSS bundle, faster builds
- • Updated React Query: 5.62.2 → 5.90.9
- • Updated React Router: 7.1.1 → 7.9.6
- • Updated Recharts: 2.15.0 → 3.4.1 (v3 with performance improvements)
- • Updated lucide-react: 0.468.0 → 0.553.0
- • Updated tailwind-merge: 2.5.4 → 3.4.0
- • Updated TypeScript: 5.6.2 → 5.9.3
- • Updated Vite React plugin: 3.7.1 → 4.2.2
- • Updated PostCSS: 8.4.0 → 8.5.6
- • Added manual chunk splitting in `vite.config.ts` for better caching: → `react-vendor`: React core libraries → `query-vendor`: React Query → `charts-vendor`: Recharts → `ui-vendor`: UI utilities (lucide-react, sonner, clsx, tailwind-merge) → `radix-vendor`: Radix UI components
📝 Performance Improvements
- • 10-15% higher request throughput (Granian)
- • 20-25% lower memory usage (Granian)
- • Better async handling via Rust runtime
- • Sub-5ms health check responses expected
- • Smaller CSS bundle (~40% reduction with Tailwind v4)
- • Faster build times
- • Better browser caching with chunked vendor libraries
- • Improved chart rendering performance (Recharts v3)
📝 Technical Details
- • Requires Python 3.14+
- • All changes backward compatible for API consumers
- • No database schema changes
- • No breaking changes to REST API endpoints
- • Health check endpoint filtering updated for Granian
- • Async test fixtures updated for pytest-asyncio 1.3.0
- • Tailwind theme now uses CSS custom properties
- • PostCSS configuration simplified for Tailwind v4
v1.0.0 2024-10-17
📝 Initial Release
- • Smart Plex collection management
- • Quality tracking for movies and TV shows
- • Integration with Sonarr and Radarr
- • TMDB metadata integration
- • Background scanning with APScheduler
- • React-based frontend with Vite
- • FastAPI backend with SQLite database
Key Features
Quality Tracking
Track video/audio fidelity for every Plex item with 4K, HDR, and Dolby codec detection using FFprobe and MKVToolNix enrichment
Gap Detection
Find missing titles in TMDB collections with upgrade recommendations and collection intelligence
Automation Ledger
Persist and filter Radarr/Sonarr request history across movies and episodes with full audit trail
Episode-Level Requests
Direct Sonarr episode actions with shared quality preferences and toast-based status updates
Homepage Widget API
Export stats and scan summaries for Homepage.dev dashboards with real-time data
Ntfy Notifications
Scan completion, failures, and automation alerts delivered via ntfy push notifications
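
For the ntfy feature above, a push notification is just an HTTP POST to a topic URL. A minimal sketch, assuming a hypothetical topic name; CollectionSync's actual payloads and configuration will differ.

```python
import httpx


async def notify_scan_complete(summary: str) -> None:
    """Push a scan-completion alert to an ntfy topic (topic URL is illustrative)."""
    async with httpx.AsyncClient() as client:
        await client.post(
            "https://ntfy.sh/collectionsync-scans",  # hypothetical topic
            content=summary,  # e.g. "Scan finished: 2,291 files, 3 upgrades found"
            headers={"Title": "CollectionSync scan complete", "Tags": "white_check_mark"},
        )
```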
Use Cases
Collection Completion
Automatically identify missing movies from TMDB collections and request them through Radarr with one click
Quality Upgrades
Track your library's quality levels and identify movies ready for 4K or HDR upgrades based on your preferences
TV Show Management
Request individual episodes or entire seasons through Sonarr with shared quality profile preferences
Automation History
Review complete request history with filtering by service, media type, and external IDs for audit purposes
Screenshots
Screenshots coming soon...