Commit Graph

12 Commits

Author SHA1 Message Date
Audiolib
edf057f36e Cleanup UI buttons + smarter series-aware search title
UI: Hide developer tools (Cover-aus-Datei, Tags-lesen,
Connectivity-Check) behind a '+ Tools' toggle. Default view has only
Play, Match, Auto-Match. Tag extraction runs automatically on scan
anyway, so the buttons were noise.

Matcher: Item metadata (series, author from tags or earlier matches)
now flows into the search:
- detect_series() also scans inside title, not only prefix — handles
  garbage chars (◆ U+25C6 etc.) before the series name
- New _strip_series_prefix removes "Die drei ???" from search title so
  APIs see only the episode title ("Die Villa der Toten") which is how
  most databases index these
- _build_search_title also strips non-printable / exotic chars and
  bracketed content anywhere (not just trailing)
- Effective series falls back to item.series when detect_series misses
- Search call now logs which series the search is using

Example: title='◆Die◆ drei ??? Die Villa der Toten (drei Fragezeichen)'
  detected_series='Die drei ???', search_title='Die Villa der Toten'

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 20:31:40 +02:00
Audiolib
0824894a7f Read ID3 tags during scan — fixes 'Folge 114 Die Villa der Toten' problem
Diagnosis from connectivity check: 4/5 APIs reachable (only Google Books
rate-limited). So the network is fine — the search title was the problem.
'Folge 114 Die Villa der Toten' isn't indexed under that name anywhere.
The MP3 itself has the real metadata in ID3 tags (album, artist, year).

Scanner now reads ID3/Vorbis/MP4 tags from the first audio file:
- album → item.title
- albumartist / composer / artist → item.author
- date → publish_year
- organization / publisher → publisher
- language → language
- genre → genres
- artist (heuristic) → series, if it doesn't appear in album title

Parent folder name → series hint (skipped if it's a library root).

Only fills empty fields, never overwrites manually edited or matched data.
Runs on new items AND on re-scan for items without an active match.

Search title normalization improved: 'Folge 123 - X' / 'Band 7: Y' etc.
prefixes and infixes get stripped so APIs see the actual episode title.

New endpoint POST /api/items/{id}/extract-tags + 'Tags lesen' button in
BookDetail — triggers tag extraction on demand for existing items.
Returns before/after diff so user can see what was filled in.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 20:15:44 +02:00
Audiolib
e3e6492b1f Make matching debuggable + fix metadata search blockers
DNB rewrite:
- Multiple query strategies with fallback (title+author+mat=ton →
  title+author → title+mat=ton → title-only → fulltext). Returns on
  first hit. Most German audiobooks aren't tagged mat=ton in DNB,
  which was killing all searches.
- Strip CQL wildcard chars (?, *, <, >, =, /, quotes) from search
  terms. The "???" in "Die drei ???" was breaking the CQL parser.
- Log HTTP status, body snippet on non-200, and numberOfRecords on
  every query so log shows exactly what DNB returned.
- Parse SRU diagnostic elements (DNB error messages buried in XML).
- Convert author/narrator from "Lastname, Firstname" to
  "Firstname Lastname" for consistency with other sources.

Matcher:
- Split series patterns: WITH_EPISODE (need digit) and SERIES_ONLY
  (just the series name). "Die drei ??? und der Fluch des Rubins"
  now properly detects "Die drei ???" as series even without folge#.
- New _build_search_title: removes ??? sequences, trailing parens,
  collapses whitespace, before sending to APIs.
- Manual search also passes through normalization. Logs source +
  hit count per query.

Debug endpoint:
- GET /api/items/match/debug?title=...&author=... returns raw results
  from all 4 sources with status, error messages, and full metadata.
- "Debug" button added in BookDetail — shows what each API actually
  returns inline, so the user can see if it's a search problem,
  parse problem, or threshold problem.
- "Cover aus Datei" button — triggers local cover extraction
  (folder.jpg or embedded artwork) on demand.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:34:49 +02:00
Audiolib
d93f972079 Simplify streaming auth + add local cover extraction
Streaming: Drop token-in-URL auth entirely. Session-ID (UUID, 128-bit
entropy) IS the auth — same approach as Audiobookshelf. Eliminates the
entire class of token-related failures and matches how every other
streaming server handles this. Logs every stream request with Range
header and User-Agent for diagnostics.

Player: Visible error banner in UI when audio fails (with HTML5 media
error code translated to German). Stream URL is shown in the banner so
the user can see exactly what failed.

Scanner: Cover extraction from two new sources (in addition to API
matching):
  1. Folder-level images (cover.jpg, folder.jpg, front.jpg, etc.)
  2. Embedded artwork (ID3 APIC, MP4 covr, FLAC/Vorbis pictures)
Runs on every scan — also fills in covers for items that were already
scanned but never got one from matching.

New endpoint POST /api/items/{id}/extract-cover triggers this manually
for a single item.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:09:22 +02:00
Audiolib
17b77afd45 Rewrite player + fix matching metadata loss
Streaming: Custom range-aware HTTP endpoint. Returns 206 Partial Content
for Range requests (with Content-Range, Content-Length, Accept-Ranges).
This was the root cause of broken seeking — Starlette's default
FileResponse behavior wasn't reliable across all clients. Now seeking
works natively via standard HTML5 audio.

Player: Full rewrite. Cleaner separation between absolute book time and
per-track time. Track switching uses pendingSeek + canplay/loadedmetadata
handlers. Console logs for debugging. Removed crossOrigin to avoid CORS
issues. Removed hls.js entirely.

Matcher: Critical bug fix — get_work_details (OpenLibrary) was returning
a sparse MatchResult that REPLACED the rich search result, losing cover,
author, year. New _enrich_match merges details into best without
overwriting existing values (except description/chapters which are
preferred from details fetch).

Scoring: Lenient min/max-weighted similarity (better for German episodic
titles like "Die drei ??? - Folge 215"). Thresholds lowered:
UNCERTAIN 0.50→0.40, AUTO_ACCEPT 0.75→0.65.

Search: search_for_item now returns ALL fields (narrator, publisher,
series, genres, description, language) so manual apply has full data.

Apply: apply_match now always constructs from body first, then enriches
with details. Previously OL applies would lose cover/author. Added
detailed logging across matcher and apply paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 18:02:13 +02:00
Audiolib
eefdfc9886 Fix HLS playback auth + add DNB matching source
Player: hls.js did not send Authorization header for segment requests,
causing 401 errors on all HLS fetches. Fixed via xhrSetup callback.

DNB: Added Deutsche Nationalbibliothek SRU search (mat=ton filter for
audiobooks, MARC21-XML parsing). Extracts title, author, narrator,
publisher, year, series, genres, ISBN-based cover URL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:37:26 +02:00
Audiolib
3aab0ac9f1 Add logging system: app.log + matching.log with admin viewer
- Backend: RotatingFileHandler for app.log (all) and matching.log (matcher/matching services)
- New GET/DELETE /api/logs/{app|matching} endpoints (admin-only)
- matcher.py: improved cover-download logging (URL, bytes, HTTP errors, missing cover URL)
- Frontend: Logs tab in Admin panel with log switcher, line count selector, color-coded ERROR/WARNING lines, clear button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 17:27:54 +02:00
Audiolib
3871da4bcc Fix matching: add missing subtitle field, proper error logging, match-all endpoint
- MatchResult was missing subtitle field, causing AttributeError in
  _apply_match that silently killed every background match task
- Wrap _apply_match in try/except with exc_info logging so failures
  are visible in docker compose logs backend
- New POST /api/libraries/:id/match-all endpoint to trigger matching
  for all unlocked items (useful for items scanned before the fix)
- Admin UI: Match button per library next to the Scan button

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 15:15:11 +02:00
Audiolib
6bb07ff873 Case-insensitive login, auto-matching after scan, per-library sources
- auth: username lookup is now case-insensitive via func.lower()
- scanner: trigger match_audiobook for each newly found item after scan
- matcher: read match_sources from library settings; refactored to loop
  over configured sources in priority order instead of hardcoded sequence
- schemas/routers: expose matchSources in LibraryOut API response
- Admin UI: pill-toggle for MusicBrainz/Open Library/Google Books per library

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 15:01:56 +02:00
Audiolib
464b47fb9c Replace passlib with bcrypt directly to fix Python 3.12 compatibility
passlib 1.7.4 runs an internal bcrypt wrap-bug test on startup that
fails with bcrypt 4.x because it uses a >72 byte test password.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 13:49:23 +02:00
Audiolib
52c10a7518 Phase 5-9: Matching-Engine, Podcast-Support, Web-Interface + Player
Backend:
- Matching-Orchestrator mit deutschen Serien-Patterns (drei ???, TKKG, ...)
- Vollständige MusicBrainz-Integration (Tracklist → Kapitel, Cover Art Archive)
- OpenLibrary + Google Books als Fallback-Quellen
- Auto-Accept (≥0.75) vs zu_prüfen (0.5-0.75) vs kein Match
- Manuelles Matching: GET /api/items/:id/match/search, POST apply
- RSS-Feed-Manager: feedparser, iTunes Search, periodisches Update
- APScheduler für Podcast-Feed-Updates (konfigurierbares Intervall)
- Podcast-Router: Feed-URL setzen, Episoden, Feed-Suche
- HLS: FFmpeg läuft als Background-Task, wartet auf ersten Segment
- main.py: APScheduler + neue Router eingebunden

Frontend (React + Vite + Tailwind + HLS.js):
- Login-Seite mit Fehlerbehandlung
- Library-Seite: Grid/Listen-Ansicht, Suche, Tag-Filter, Pagination, Scan
- BookCard: Cover, Fortschrittsbalken, zu_prüfen Badge, Quick-Play
- BookDetail: Metadaten, Matching-Panel, Kapitel-Liste, Lesezeichen
- AudioPlayer: HLS.js, Kapitel-Marker auf Fortschrittsbalken, Speed,
  Sleep-Timer, Lesezeichen, Keyboard-Shortcuts (Space/Arrows)
- MiniPlayer: persistent an Fußzeile, expandierbar
- PodcastDetail: Feed-URL, iTunes-Suche, Episoden-Liste
- Admin-Panel: Benutzer/Bibliotheken/Einstellungen verwalten
- App.tsx: React Router, Auth-Guard, Player-Overlay

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 13:11:04 +02:00
Audiolib
14ffee3051 Initial commit: Phase 1 – Projektstruktur, DB-Schema, Core-API
- FastAPI-Backend mit vollständiger ABS v2.x API-Kompatibilität
- SQLAlchemy-Models: User, Library, LibraryItem, BookFile, Chapter,
  Podcast, PodcastEpisode, MediaProgress, Bookmark, PlaybackSession
- Auth: JWT-Login (/login, /logout, /api/authorize)
- Library + Items Endpoints inkl. camelCase ABS-Response-Format
- HLS-Streaming via FFmpeg (POST /api/items/:id/play, Session-Sync)
- Me/Progress Endpoints + Lesezeichen
- User-Management + Server-Settings (Admin)
- Library-Scanner (MP3/WAV Discovery, Hintergrund-Task)
- File Watcher (watchdog, 30s Debounce)
- Matching-Skelett (MusicBrainz, OpenLibrary, Google Books – Phase 5)
- Docker-Setup: backend (Python 3.12+FFmpeg), frontend (React/Vite),
  nginx Reverse-Proxy auf Port 3000
- setup.sh: Installiert Docker auf Debian/Ubuntu, richtet .env ein

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-26 11:43:35 +02:00