Transcription and translation of videos using fine-tuned XLSR Wav2Vec2 on custom dataset and mBART
Paper • 2403.00212 • Published
How to use Aniket-Tathe-08/XLSR-Wav2Vec2-Finetuned-14min-dataset with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("automatic-speech-recognition", model="Aniket-Tathe-08/XLSR-Wav2Vec2-Finetuned-14min-dataset")

# Load model directly
from transformers import AutoProcessor, AutoModelForCTC
processor = AutoProcessor.from_pretrained("Aniket-Tathe-08/XLSR-Wav2Vec2-Finetuned-14min-dataset")
model = AutoModelForCTC.from_pretrained("Aniket-Tathe-08/XLSR-Wav2Vec2-Finetuned-14min-dataset")
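
The snippet above stops at loading the processor and the CTC model. A CTC model does not emit text directly: it returns one score vector per audio frame, and decoding has to pick the best token per frame, merge consecutive repeats, and drop the blank token. In practice the processor's decode methods are the usual way to do this; the pure-Python sketch below only illustrates the idea. The vocabulary, the `<pad>` blank token, the `|` word delimiter, and the helper names `greedy_ctc_decode` and `one_hot` are all assumptions made for illustration, not values read from this checkpoint.

```python
from itertools import groupby

# Hypothetical toy vocabulary for illustration only; NOT this checkpoint's vocabulary.
BLANK = "<pad>"          # assumed CTC blank token
WORD_DELIMITER = "|"     # assumed word-boundary token
VOCAB = [BLANK, WORD_DELIMITER, "a", "c", "t"]


def greedy_ctc_decode(frame_scores, vocab=VOCAB, blank=BLANK,
                      word_delimiter=WORD_DELIMITER):
    """Greedy (best-path) CTC decoding of per-frame score vectors."""
    # 1. Pick the highest-scoring token id for every frame.
    best_ids = [max(range(len(scores)), key=scores.__getitem__)
                for scores in frame_scores]
    # 2. Collapse runs of the same id into a single id.
    collapsed = [token_id for token_id, _ in groupby(best_ids)]
    # 3. Drop blanks and map the remaining ids to characters.
    chars = [vocab[i] for i in collapsed if vocab[i] != blank]
    # 4. Turn the word delimiter into spaces.
    return "".join(chars).replace(word_delimiter, " ").strip()


def one_hot(index, size=len(VOCAB)):
    """Build a fake score vector that puts all mass on one token."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Fake frames spelling: c c <pad> a a t | a
frames = [one_hot(i) for i in (3, 3, 0, 2, 2, 4, 1, 2)]
print(greedy_ctc_decode(frames))
```

The blank token is what lets CTC output doubled letters: two identical tokens separated by a blank survive the collapse step, while two adjacent identical tokens merge into one.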