torch.compile fails on Tesla T4
It fails because the T4 is SM75, while compiling bf16 requires SM80 or newer.
There is a simpler way around it, though. The model code hard-casts everything to bf16.
Line 991: with torch.autocast('cuda', dtype=torch.bfloat16):
Please fix this so that everything works cleanly on the T4.
For now I have it working in an ugly way, with a quick hack)))
import torch
from transformers import AutoModel
from transformers.modeling_outputs import BaseModelOutputWithPast

# Assumed: the model this discussion thread belongs to.
MODEL_NAME = "ai-sage/Giga-Embeddings-instruct"

def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor,
            return_embeddings: bool = False, **kwargs):
    # Drop token_type_ids if the tokenizer supplied them.
    kwargs.pop('token_type_ids', None)
    # fp16 autocast replaces the hard-coded bf16 autocast from the model code.
    with torch.autocast('cuda', dtype=torch.float16):
        outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, **kwargs)
        last_hidden = self.latent_attention_model(outputs.last_hidden_state, attention_mask)
        if return_embeddings:
            return self.mean_pool(last_hidden, attention_mask)
        return BaseModelOutputWithPast(last_hidden_state=last_hidden)

model = AutoModel.from_pretrained(MODEL_NAME, torch_dtype=torch.float16, trust_remote_code=True)
# Bind the patched function to this model instance as a method.
model.forward = forward.__get__(model, model.__class__)
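
A less hard-coded variant of the same idea would be to pick the autocast dtype from the GPU's compute capability instead of fixing it to one value. The sketch below is only an illustration, not the model's actual code: the helper name pick_autocast_dtype_name is hypothetical, and the SM80 threshold is taken from the explanation above. The torch part is guarded so the snippet also runs on a machine without PyTorch or without a GPU.

```python
from typing import Tuple


def pick_autocast_dtype_name(capability: Tuple[int, int]) -> str:
    """Hypothetical helper: choose an autocast dtype name from (major, minor).

    Assumes bf16 is only usable on SM80 and newer, as stated in the post.
    """
    major, _minor = capability
    return "bfloat16" if major >= 8 else "float16"


# Tesla T4 reports compute capability (7, 5); an SM80 card reports (8, 0).
print(pick_autocast_dtype_name((7, 5)))  # float16
print(pick_autocast_dtype_name((8, 0)))  # bfloat16

try:
    import torch
except ImportError:  # PyTorch is optional for this sketch
    torch = None

if torch is not None and torch.cuda.is_available():
    # get_device_capability() returns a (major, minor) tuple for the current GPU.
    name = pick_autocast_dtype_name(torch.cuda.get_device_capability())
    autocast_dtype = getattr(torch, name)
    print(autocast_dtype)
```

The resulting autocast_dtype could then be passed to torch.autocast('cuda', dtype=autocast_dtype) in place of the fixed torch.float16 in the patch above.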