This repository was archived by the owner on Jan 15, 2026. It is now read-only.

fix: pull onnx models from huggingface instead of GCS (WIP) #36

Draft
nleroy917 wants to merge 7 commits into Anush008:main from nleroy917:main

Conversation

@nleroy917

Contributor

This prioritizes Hugging Face weights over the ones stored in Google Cloud Storage (GCS). The reason is twofold: 1) the Python fastembed implementation does this, and 2) we shouldn't point at GCS. People using sentence-transformers and fastembed expect the embeddings to be the same. We have zero control over what weights are published upstream for all-MiniLM-L6-v2 or bge-small-en-v1.5, so we should point to the Hugging Face repositories as the source of truth.

This addresses some of the issues raised in #30.

However, some of the models this module supports (bge-small-zh, bge-small-en) don't actually have ONNX weights on HF, so they would still need another source.
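
To make the intended lookup order concrete, here is a rough sketch in Python (used for illustration only, not the language of this module). It tries Hugging Face first and falls back to GCS when a model has no ONNX export there or the download fails. The `HF_REPOS` table, the `onnx/model.onnx` file path, the `GCS_BASE` placeholder bucket, and both function names are assumptions made for this example, not what the PR actually implements.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical table: model name -> Hugging Face repo id, or None when
# no ONNX export is assumed to be published on Hugging Face.
HF_REPOS: Dict[str, Optional[str]] = {
    "all-MiniLM-L6-v2": "sentence-transformers/all-MiniLM-L6-v2",
    "bge-small-en-v1.5": "BAAI/bge-small-en-v1.5",
    "bge-small-zh": None,
}

# Placeholder bucket URL, not the real GCS location used by the module.
GCS_BASE = "https://storage.googleapis.com/example-bucket"


def candidate_urls(model: str) -> List[str]:
    """Return download URLs in priority order: Hugging Face first, then GCS."""
    urls: List[str] = []
    repo = HF_REPOS.get(model)
    if repo is not None:
        # Assumed file layout inside the Hugging Face repository.
        urls.append(f"https://huggingface.co/{repo}/resolve/main/onnx/model.onnx")
    urls.append(f"{GCS_BASE}/{model}.tar.gz")
    return urls


def fetch_model(model: str, fetch: Callable[[str], bytes]) -> Tuple[str, bytes]:
    """Try each candidate URL in order and return the first one that downloads."""
    last_error: Optional[Exception] = None
    for url in candidate_urls(model):
        try:
            return url, fetch(url)
        except OSError as exc:  # network or HTTP failure: try the next source
            last_error = exc
    raise RuntimeError(f"no source available for {model}") from last_error
```

Passing the downloader in as `fetch` keeps the ordering logic testable without network access. A real implementation would also need to handle the different artifact layouts (a single `.onnx` file versus a tarball).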

@nleroy917 changed the title from "Pull onnx models from HF" to "fix: pull onnx models from huggingface instead of GCS (WIP)" on Dec 17, 2025
@Anush008 marked this pull request as draft on December 17, 2025 12:11