Improving the inference/classification/prediction speed of this bart-large-mnli model

#15

by abhijit57 - opened Apr 27, 2023

Apr 27, 2023

Hello,

I am working on a text classification research project with a dataset of about 500,000 rows, where each document is fairly long (70-100 tokens). I ran this model on an NVIDIA V100 32 GB GPU for 10 rows with 804 candidate labels, and it took 10 minutes. The requirements do not allow me to reduce the candidate label list. I also tried the Codon compiler and Numba to speed up inference, but without much luck.

Has anyone worked with a C++ implementation of BART, or used DeepSpeed to speed up predictions for this model?
Any leads or help would be greatly appreciated, thank you.
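
For a sense of scale, here is a rough back-of-the-envelope sketch. It assumes that zero-shot classification with an NLI model runs one forward pass per (document, candidate label) pair, and that throughput stays at the rate measured above (10 rows x 804 labels in 10 minutes). The function name `estimate_zero_shot_cost` is only illustrative and not part of any library.

```python
def estimate_zero_shot_cost(rows, labels, sample_rows, sample_seconds):
    """Extrapolate total runtime from a small timed sample.

    Assumption: each (document, label) pair costs one NLI forward pass,
    and throughput stays constant at the measured rate.
    """
    # Pairs processed per second in the timed sample.
    pairs_per_second = (sample_rows * labels) / sample_seconds
    # Total premise/hypothesis pairs for the full dataset.
    total_pairs = rows * labels
    total_seconds = total_pairs / pairs_per_second
    return total_pairs, pairs_per_second, total_seconds / 86_400  # seconds per day


total_pairs, rate, days = estimate_zero_shot_cost(
    rows=500_000, labels=804, sample_rows=10, sample_seconds=600
)
print(f"{total_pairs:,} pairs at {rate:.1f} pairs/s is about {days:.0f} days")
```

If those assumptions hold, the pair count is what dominates the cost, so any speedup (larger batches, half precision, DeepSpeed) has to raise pairs per second by several orders of magnitude to make the full dataset practical.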

manbeast3b

Nov 26, 2023

+1, did you find a way?

abhijit57

Nov 26, 2023

Use deepspeed
