pypi

goldenmatch

Zero-config entity resolution that scales from a CSV to 100M+ rows on a Ray cluster (verified: 100M deduped in 213s, 0.30 GB driver). Fuzzy + exact + probabilistic dedupe, identity graph, PPRL, LLM boost. Python + full TypeScript port; SQL-native in PostgreSQL & DuckDB; MCP/REST servers, dbt + Airflow recipes.

Latest release
1d ago
Releases
57
Known CVEs
0
First release
Mar 19, 2026
License
MIT
Repository

Source

benseverndev-oss/goldenmatch
Stars
86
Forks
10
Open issues
2
Language
Python
  • data-engineering
  • data-quality
  • deduplication
  • entity-resolution
  • fuzzy-matching
  • llm
  • polars
  • python

Security score

6.5 / 10 OpenSSF
CII-Best-Practices
0
Code-Review
0
Contributors
0
Maintained
0
Token-Permissions
0
Branch-Protection
4

Insights

Activity

Total releases
57
Last 12 months
57
Cadence
~daily
Dependencies
51

Release mix

  • major 1
  • minor 35
  • patch 20
Dependencies

Depends on (version 1.30.0)
  • pypi aiohttp >=3.14.0
  • pypi alembic >=1.13
  • pypi anthropic >=0.30
  • pypi azure-storage-blob >=12.19
  • pypi boto3 >=1.26
  • pypi databricks-sql-connector >=3.0
  • pypi datafusion <54,>=53
  • pypi diptest >=0.5.2
  • pypi duckdb >=0.9
  • pypi faiss-cpu >=1.7
(showing 10 of 51 dependencies)

Used by

Nothing tracked depends on this yet.

Releases
Version Release type
1.30.0 minor
1.29.0 minor
1.28.0 minor
1.27.0 minor
1.26.0 minor
1.25.0 minor
1.24.0 minor
1.23.0 minor
1.22.0 minor
1.21.0 minor
1.20.0 minor
1.19.0 minor
1.18.1 patch
1.18.0 minor
1.17.1 patch
1.17.0 minor
1.16.0 minor
1.15.0 minor
1.14.0 minor
1.13.0 minor
(showing 20 of 57 releases)