| Project | Description |
| --- | --- |
| Semble | Code search for agents that uses ~98% fewer tokens than grep+read. |
| Model2Vec | Distil any sentence transformer into a tiny, fast static embedding model. |
| SemHash | Semantic deduplication and dataset filtering across text, images, and audio. |
| Potion | Tiny state-of-the-art static embedding models for English, multilingual, and retrieval tasks. |
| Vicinity | Fast, lightweight nearest neighbor search with pluggable backends. |
| Pyversity | Diversify search and retrieval results to reduce redundancy and improve coverage. |
| Agentcheck | Scans your shell and reports what an AI agent could access, by severity. |
| Tokenlearn | Pre-train static embedding models for distillation pipelines. |
| Model2Vec-rs | A Rust port of Model2Vec. |