betterhtmlchunking
A Python library for intelligent HTML segmentation and ROI extraction. It builds a DOM tree from raw HTML and extracts content-rich regions for efficient web scraping and analysis.
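The idea in the description (build a tree from raw HTML, then pick content-rich regions) can be sketched as follows. This is a minimal illustration using only the Python standard library, not betterhtmlchunking's actual API. The names `Node`, `TreeBuilder`, `parse`, `extract_regions`, `region_text`, and the `max_length` budget are all hypothetical, and the greedy top-down split is an assumed strategy chosen for brevity.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


@dataclass
class Node:
    """Hypothetical tree node; text is stored in '#text' leaf nodes."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)

    def text_length(self) -> int:
        return len(self.text) + sum(c.text_length() for c in self.children)


class TreeBuilder(HTMLParser):
    """Builds a simple element tree from raw HTML."""

    def __init__(self):
        super().__init__()
        self.root = Node("document")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.stack[-1].children.append(Node("#text", text=stripped))


def parse(html: str) -> Node:
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def extract_regions(node: Node, max_length: int) -> list:
    """Greedy top-down split: keep a subtree whole if its text fits the
    budget, otherwise descend into its children. Empty subtrees are skipped."""
    length = node.text_length()
    if length == 0:
        return []
    if length <= max_length or not node.children:
        return [node]
    regions = []
    for child in node.children:
        regions.extend(extract_regions(child, max_length))
    return regions


def region_text(node: Node) -> str:
    parts = [node.text] if node.text else []
    parts.extend(region_text(c) for c in node.children)
    return " ".join(p for p in parts if p)


sample = ("<html><body><nav><a>Home</a><a>About</a></nav>"
          "<article><h1>Title</h1><p>First paragraph of text.</p>"
          "<p>Second paragraph.</p></article><footer></footer></body></html>")
regions = extract_regions(parse(sample), max_length=50)
print([r.tag for r in regions])  # ['nav', 'article']
```

With a budget of 50 characters the whole page (55 characters of text) is too large, so the sketch descends until it finds the `nav` and `article` subtrees, each of which fits, and skips the empty `footer`. A real implementation would likely weigh more than text length when scoring regions.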
- Latest release: Mar 07, 2026
- Releases: 7
- Known CVEs: 0
- First release: Feb 14, 2025
- License: MIT
Repository
Source
- Stars: not available
- Forks: not available
- Open issues: not available
Security score
No OpenSSF Scorecard available for this repository.
Packages from this repo
No other tracked packages from this repository.
Insights
Activity
- Total releases: 7
- Last 12 months: 4
- Cadence: ~27 days
- Dependencies: 11
Releases per month (last 12 months)
Release mix
- patch: 6
- total: 7 releases
Dependencies
Depends on (version 0.9.7)
- attrs
- attrs-strict
- beautifulsoup4
- ipython >=9.9.0
- lxml
- parsel-text
- prettyprinter >=0.18.0
- pytest >=9.0.2
- pytest-cov >=7.0.0
- treelib
1–10 of 11
Used by
Nothing tracked depends on this yet.
Releases
| Version | Released | Type |
|---|---|---|
| 0.9.7 | | patch |
| 0.9.6 | | patch |
| 0.9.5 | | patch |
| 0.9.4 | | patch |
| 0.9.3 | | patch |
| 0.9.2 | | patch |
| 0.9.1 | | initial |