cargo

kreuzberg-tesseract

A polyglot document intelligence framework with a Rust core. Extract text, metadata, images, and structured information from PDFs, Office documents, images, and 97+ formats. Available for Rust, Python, Ruby, Java, Go, PHP, Elixir, C#, R, C, and TypeScript (Node/Bun/Wasm/Deno), or use it via the CLI, REST API, or MCP server.

Latest release
15h ago
Releases
117
Known CVEs
0
First release
Nov 15, 2025
License
MIT
Repository

Source

kreuzberg-dev/kreuzberg
Stars
8.5k
Forks
497
Open issues
8
Language
Rust
  • text-extraction
  • document-intelligence
  • metadata-extraction
  • pdf-extraction
  • pdfium
  • python
  • rag
  • table-extraction

Security score

No OpenSSF Scorecard available for this repository.

Insights

Activity

Total releases
117
Last 12 months
117
Cadence
~daily
Dependencies
6

Release mix

  • minor 9
  • patch 65
  • pre 42
117 releases
Dependencies

Depends on

5.0.0-rc.10

Used by

1
Releases
Version Type
5.0.0-rc.10 pre
5.0.0-rc.8 pre
5.0.0-rc.7 pre
5.0.0-rc.5 pre
5.0.0-rc.4 pre
4.9.9 patch
5.0.0-rc.3 pre
5.0.0-rc.2 pre
5.0.0-rc.1 pre
4.9.8 patch
4.9.7 patch
4.9.6 patch
4.10.0-rc.15 pre
4.10.0-rc.14 pre
4.10.0-rc.12 pre
4.10.0-rc.11 pre
4.10.0-rc.9 pre
4.10.0-rc.8 pre
4.10.0-rc.7 pre
4.10.0-rc.6 pre
1–20 of 117