Make matching debuggable + fix metadata search blockers

DNB rewrite:
- Multiple query strategies with fallback (title+author+mat=ton →
  title+author → title+mat=ton → title-only → fulltext). Returns on
  first hit. Most German audiobooks aren't tagged mat=ton in DNB,
  so always requiring that filter returned zero hits for them.
- Strip CQL special chars (?, *, <, >, =, /, quotes) from search
  terms. The "???" in "Die drei ???" was breaking the CQL parser.
- Log HTTP status, body snippet on non-200, and numberOfRecords on
  every query so the log shows exactly what DNB returned.
- Parse SRU diagnostic elements (DNB error messages buried in XML).
- Convert author/narrator from "Lastname, Firstname" to
  "Firstname Lastname" for consistency with other sources.

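Sketch of the DNB changes above (not the committed code). The index
names tit/per, the helper names, and the synchronous run_query callback
are assumptions for illustration; the real client is async and its
query strings are not shown in this commit message.

```python
import re
from typing import Callable, Optional

# Characters named in the commit message as breaking the CQL parser.
_CQL_SPECIAL = re.compile(r"[?*<>=/\"']")


def sanitize_term(term: str) -> str:
    """Replace CQL-significant characters with spaces and collapse whitespace."""
    return re.sub(r"\s+", " ", _CQL_SPECIAL.sub(" ", term)).strip()


def flip_name(name: str) -> str:
    """Turn 'Lastname, Firstname' into 'Firstname Lastname'; leave other forms alone."""
    last, sep, first = name.partition(",")
    if not sep or not first.strip():
        return name.strip()
    return f"{first.strip()} {last.strip()}"


def build_queries(title: str, author: Optional[str] = None) -> list[str]:
    """Return queries from most to least specific (index names are assumed)."""
    t = sanitize_term(title)
    a = sanitize_term(author) if author else ""
    queries: list[str] = []
    if t and a:
        queries.append(f'tit="{t}" and per="{a}" and mat=ton')
        queries.append(f'tit="{t}" and per="{a}"')
    if t:
        queries.append(f'tit="{t}" and mat=ton')
        queries.append(f'tit="{t}"')
        queries.append(f'"{t}"')  # fulltext fallback
    return queries


def first_hit(queries: list[str], run_query: Callable[[str], list]):
    """Run the queries in order and return (query, records) for the first non-empty result."""
    for q in queries:
        records = run_query(q)
        if records:
            return q, records
    return None, []
```

Usage: first_hit(build_queries(title, author), run_query), where
run_query is whatever function sends one query to DNB and returns the
parsed records.
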
Matcher:
- Split series patterns: WITH_EPISODE (requires an episode number)
  and SERIES_ONLY (just the series name). "Die drei ??? und der Fluch
  des Rubins" now properly detects "Die drei ???" as the series even
  without an episode number.
- New _build_search_title: removes ??? sequences, trailing parens,
  collapses whitespace, before sending to APIs.
- Manual search also passes through normalization. Logs source +
  hit count per query.
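
Sketch of the two-tier series detection and the title normalization
described above. The regexes are hypothetical stand-ins (the real
WITH_EPISODE / SERIES_ONLY lists are not part of this message), and
returning the episode as a string is an assumption.

```python
import re

# Hypothetical patterns for one series only.
WITH_EPISODE = [
    re.compile(r"^(?P<series>Die drei \?\?\?)\s*(?:-\s*)?(?:Folge\s*)?(?P<episode>\d+)\b", re.I),
]
SERIES_ONLY = [
    re.compile(r"^(?P<series>Die drei \?\?\?)", re.I),
]


def detect_series(title: str):
    """Try the episode patterns first, then fall back to series-name-only patterns."""
    for pattern in WITH_EPISODE:
        m = pattern.search(title)
        if m:
            return m.group("series"), m.group("episode")
    for pattern in SERIES_ONLY:
        m = pattern.search(title)
        if m:
            return m.group("series"), None
    return None, None


def build_search_title(title: str) -> str:
    """Normalize a title before it is sent to the metadata APIs."""
    t = re.sub(r"\?{2,}", " ", title)         # remove ??? sequences
    t = re.sub(r"\s*\([^()]*\)\s*$", "", t)   # drop a trailing parenthetical
    return re.sub(r"\s+", " ", t).strip()     # collapse whitespace
```

Checking WITH_EPISODE first keeps the episode number when one is
present; SERIES_ONLY only decides whether the title belongs to a known
series.
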

Debug endpoint:
- GET /api/items/match/debug?title=...&author=... returns raw results
  from all 4 sources with status, error messages, and full metadata.
- "Debug" button added in BookDetail — shows what each API actually
  returns inline, so the user can see if it's a search problem,
  parse problem, or threshold problem.
- "Cover aus Datei" button — triggers local cover extraction
  (folder.jpg or embedded artwork) on demand.
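
Sketch of how the debug response could be triaged into the cases above.
The payload shape follows the endpoint in this commit; the verdict
labels and the 0.7 threshold are assumptions, and parse problems still
need a look at the raw metadata.

```python
def diagnose(payload: dict, threshold: float = 0.7) -> dict[str, str]:
    """Classify each source in a /match/debug response.

    Verdicts: "error" (the call raised), "no_hits" (likely a search problem),
    "below_threshold" (hits exist but score too low), "ok".
    """
    verdicts: dict[str, str] = {}
    for src in payload.get("sources", []):
        name = src["source"]
        if not src.get("ok"):
            verdicts[name] = "error"
        elif not src.get("results"):
            verdicts[name] = "no_hits"
        else:
            # Treat a missing confidence as 0.0 so it never passes the threshold.
            best = max((r.get("confidence") or 0.0) for r in src["results"])
            verdicts[name] = "ok" if best >= threshold else "below_threshold"
    return verdicts
```

Usage: pass the parsed JSON body of
GET /api/items/match/debug?title=...&author=... to diagnose().
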

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Audiolib
2026-05-26 18:34:49 +02:00
parent 38f7c9726e
commit e3e6492b1f
5 changed files with 279 additions and 39 deletions

@@ -52,6 +52,63 @@ async def search_match(
    return {"results": results}


@router.get("/match/debug")
async def debug_match(
    title: str,
    author: str | None = None,
    current_user: User = Depends(get_current_user),
):
    """Debug endpoint: returns the raw results from every search source.

    Call it directly from the browser: /api/items/match/debug?title=Foo&author=Bar
    """
    from ..services.matching.musicbrainz import search_musicbrainz
    from ..services.matching.open_library import search_open_library
    from ..services.matching.google_books import search_google_books
    from ..services.matching.dnb import search_dnb
    from ..services.matcher import _build_search_title, detect_series

    # Normalize the title; for a known series with an episode, search by "<series> <episode>".
    series, episode = detect_series(title)
    search_title = _build_search_title(title)
    if series and episode:
        search_title = f"{series} {episode}"

    logger.info(f"DEBUG: title={title!r} → search={search_title!r} series={series!r} episode={episode!r}")

    async def _try(name, coro):
        # Await one source and capture any failure so it cannot break the other sources.
        try:
            r = await coro
            return {
                "source": name,
                "ok": True,
                "count": len(r),
                "results": [
                    {
                        "title": x.title, "author": x.author, "narrator": x.narrator,
                        "publisher": x.publisher, "year": x.publish_year,
                        "series": x.series, "series_sequence": x.series_sequence,
                        "cover_url": x.cover_url, "language": x.language,
                        "genres": x.genres, "description": (x.description or "")[:200],
                        "confidence": x.confidence, "source_id": x.source_id,
                    } for x in r
                ],
            }
        except Exception as e:
            return {"source": name, "ok": False, "error": f"{type(e).__name__}: {e}"}

    # Query all four sources concurrently.
    results = await asyncio.gather(
        _try("musicbrainz", search_musicbrainz(search_title, author)),
        _try("open_library", search_open_library(search_title, author)),
        _try("google_books", search_google_books(search_title, author)),
        _try("dnb", search_dnb(search_title, author)),
    )

    return {
        "input": {"title": title, "author": author},
        "normalized": {"search_title": search_title, "series": series, "episode": episode},
        "sources": results,
    }


@router.post("/{item_id}/match/apply")
async def apply_match(
    item_id: str,