Commit Graph

7 Commits

Author SHA1 Message Date
albertfj114
92265cf27f feat: add DB operations and CLI wiring for HK parish import
upsertChurch() handles matched churches (replace schedules atomically
via $transaction, update contact fields if null) and new churches
(create with source='diocese-hk', lat/lng=0 for later geocoding).
main() wires up CLI args, file reading, matching loop, and summary.
Guards main() call with ESM import.meta.url check to prevent execution
on import during tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:27:02 -04:00
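
The "update contact fields if null" rule in 92265cf27f can be sketched as a pure function that builds the update payload. This is an illustration only: the Contact field names and the fillMissingContact helper are hypothetical, and the real upsertChurch() may structure the update differently.

```typescript
// Hypothetical contact fields; the actual Prisma model may use other names.
interface Contact {
  phone: string | null;
  website: string | null;
  email: string | null;
}

// Build an update containing only fields that are null on the existing
// record and present in the incoming data. Existing values are never
// overwritten.
function fillMissingContact(
  existing: Contact,
  incoming: Partial<Contact>
): Partial<Contact> {
  const update: Partial<Contact> = {};
  const keys: (keyof Contact)[] = ["phone", "website", "email"];
  keys.forEach((k) => {
    const value = incoming[k];
    if (existing[k] === null && value !== undefined && value !== null) {
      update[k] = value;
    }
  });
  return update;
}
```

With an existing phone number and a null website, an incoming record that carries both fields would yield an update containing only the website.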
albertfj114
8075072c24 fix: use true Jaccard similarity in wordOverlap (intersection/union)
Replaces the max(|A|,|B|) denominator with |A∪B| = |A|+|B|-|A∩B|,
which is the correct Jaccard formula and avoids inflating similarity
when both name sets have significant unique words.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:25:24 -04:00
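
The difference between the two denominators in 8075072c24 can be sketched as follows. The function names overlapByMax and jaccard, and the whitespace tokenizer, are assumptions for illustration; the script's actual wordOverlap may tokenize differently.

```typescript
// Split a name into a set of lowercase words (assumed tokenizer).
function wordSet(name: string): Set<string> {
  return new Set(name.toLowerCase().split(/\s+/).filter((w) => w.length > 0));
}

// Count words present in both sets.
function sharedCount(a: Set<string>, b: Set<string>): number {
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared++;
  });
  return shared;
}

// Previous scoring as described in the commit: intersection / max(|A|, |B|).
function overlapByMax(a: string, b: string): number {
  const A = wordSet(a);
  const B = wordSet(b);
  const largest = Math.max(A.size, B.size);
  return largest === 0 ? 0 : sharedCount(A, B) / largest;
}

// Jaccard similarity: intersection / union,
// where |A ∪ B| = |A| + |B| - |A ∩ B|.
function jaccard(a: string, b: string): number {
  const A = wordSet(a);
  const B = wordSet(b);
  const shared = sharedCount(A, B);
  const union = A.size + B.size - shared;
  return union === 0 ? 0 : shared / union;
}
```

For "St Joseph Kowloon Bay" against "St Joseph Sha Tin", each set has four words and two are shared. The max-based score is 2/4 = 0.5, while Jaccard gives 2/6, about 0.33, which falls below the 0.4 threshold mentioned in 3ebbc3732f.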
albertfj114
3ebbc3732f feat: add name normalizer and church matcher for HK import
normalizeName strips noise words (church/parish/chapel/etc), accents,
and punctuation for robust name comparison. findMatch uses word-overlap
Jaccard score (threshold 0.4) with address-prefix fallback for Chinese-
named churches where English name overlap may be low.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:23:58 -04:00
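
A minimal sketch of the normalization described in 3ebbc3732f, covering English names only. The noise-word list is an assumption (the commit names church, parish, and chapel, followed by "etc"), and the real normalizeName may treat punctuation or non-Latin characters differently.

```typescript
// Assumed noise words; the commit lists these three and implies more.
const NOISE_WORDS = ["church", "parish", "chapel"];

function normalizeName(name: string): string {
  return name
    .normalize("NFD")                 // split accented letters into base + combining mark
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, " ")     // turn punctuation into spaces
    .split(/\s+/)
    .filter((w) => w.length > 0 && NOISE_WORDS.indexOf(w) === -1)
    .join(" ");
}
```

Under these assumptions, "St. André Parish" normalizes to "st andre" and "Holy Cross Church, Shau Kei Wan" to "holy cross shau kei wan". Because this sketch discards everything outside a-z and 0-9, it would reduce a Chinese name to an empty string, which is the kind of case the commit's address-prefix fallback is meant to cover.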
albertfj114
eedb442e78 feat: add full entry parser for HK parishes
parseEntry composes extractNames, extractFields, parseScheduleLine,
and parseWeekdayLine into a single ParsedEntry. Routes schedule
lines by section header (Sunday/Anticipated/Weekday) and skips
Special Masses and Eucharist Adoration sections.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:18:05 -04:00
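
The section routing described in eedb442e78 can be sketched as a small state machine. The header spellings, the trailing-colon convention, and the names classifyHeader and routeLines are all assumptions; the source data and parseEntry itself may differ.

```typescript
type Section = "sunday" | "anticipated" | "weekday" | "skip";

interface Routed {
  sunday: string[];
  anticipated: string[];
  weekday: string[];
}

// Assumption: section headers end with a colon. Unrecognized headers
// (for example Special Masses) are mapped to "skip".
function classifyHeader(line: string): Section | null {
  const t = line.trim().toLowerCase();
  if (t.charAt(t.length - 1) !== ":") return null; // not a header
  if (t.indexOf("anticipated") === 0) return "anticipated";
  if (t.indexOf("sunday") === 0) return "sunday";
  if (t.indexOf("weekday") === 0) return "weekday";
  return "skip";
}

// Assign each schedule line to the most recent section header,
// dropping lines that fall under a skipped section.
function routeLines(lines: string[]): Routed {
  const out: Routed = { sunday: [], anticipated: [], weekday: [] };
  let current: Section = "skip";
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line.trim() === "") continue;
    const header = classifyHeader(line);
    if (header !== null) {
      current = header;
      continue;
    }
    if (current !== "skip") out[current].push(line);
  }
  return out;
}
```

In the real parser, each routed line would then go to parseScheduleLine or parseWeekdayLine depending on its section.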
albertfj114
38274174a9 feat: add HK parish import parser functions (Tasks 2-6)
Implements splitEntries, extractNames, extractFields, normalizeTime,
parseScheduleLine, and parseWeekdayLine with 26 passing unit tests.
Handles full-width parentheses, language tags, conditional schedule
notes, day ranges, and comma-separated day/time lists.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:15:04 -04:00
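
Two of the input quirks named in 38274174a9, full-width parentheses around language tags and day ranges with comma-separated lists, can be sketched as below. The helper names, the three-letter day abbreviations, and the sample line format are assumptions, not the script's actual parseScheduleLine or parseWeekdayLine.

```typescript
// Assumed day abbreviations, in week order.
const DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"];

// Replace full-width parentheses (U+FF08, U+FF09) with ASCII ones.
function asciiParens(text: string): string {
  return text.replace(/\uFF08/g, "(").replace(/\uFF09/g, ")");
}

// Split a line such as "9:00 am (English)" into a time and an
// optional trailing language tag.
function splitLanguageTag(line: string): { time: string; language: string | null } {
  const m = asciiParens(line).match(/^(.*?)\s*\(([^)]*)\)\s*$/);
  if (m === null) return { time: line.trim(), language: null };
  return { time: m[1].trim(), language: m[2].trim() };
}

// Expand "Mon-Fri" or "Mon, Wed, Sat" into individual days.
// Unknown or reversed ranges are ignored in this sketch.
function expandDays(spec: string): string[] {
  const result: string[] = [];
  spec.split(",").forEach((part) => {
    const bounds = part.trim().split("-").map((d) => d.trim());
    const start = DAYS.indexOf(bounds[0]);
    const end = DAYS.indexOf(bounds[bounds.length - 1]);
    if (start === -1 || end === -1 || end < start) return;
    for (let i = start; i <= end; i++) result.push(DAYS[i]);
  });
  return result;
}
```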
albertfj114
328d146201 feat: add HK parish parser functions (Tasks 2-6) with tests
Implements entry splitter, name extractor, field extractor, time normalizer,
schedule line parser, and weekday day-prefix parser. All 26 tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 16:06:26 -04:00
albertfj114
9aea12f4b0 feat: add HK parish import script skeleton
- Imports, types, and Prisma client init
- ParsedSchedule and ParsedEntry types for parsing parish data
- ExistingChurch interface for matching
- ImportStats interface for tracking progress

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-03 15:59:51 -04:00