Bisher/ASVspoof_2019_LA
How to use 0xmola/wavlm-deepfake-audio-forensics with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("audio-classification", model="0xmola/wavlm-deepfake-audio-forensics")
# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("0xmola/wavlm-deepfake-audio-forensics", dtype="auto")
Fine-tuned WavLM-base for real-time audio deepfake detection.
This model detects AI-cloned/synthetic voices by analyzing raw audio waveforms through a CNN-Transformer hybrid architecture. It identifies synthetic artifacts that human ears miss: unnatural pitch consistency, GAN-generated frequency smoothness, and missing microtremors.
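Because the model works on raw waveforms, one plausible way to approach real-time use is to score a stream in fixed-length, overlapping windows. The sketch below only illustrates the windowing step in plain Python. The helper name `iter_windows`, the 4-second window, and the 2-second hop are assumptions chosen for illustration and are not taken from the model card; the 16 kHz rate matches the rate at which the usage snippet loads audio.

```python
from typing import Iterator, Sequence, Tuple

SAMPLE_RATE = 16_000  # 16 kHz, the rate used when loading audio in the usage snippet


def iter_windows(
    samples: Sequence[float],
    window_s: float = 4.0,   # assumed window length, not specified by the model card
    hop_s: float = 2.0,      # assumed hop length, not specified by the model card
    sample_rate: int = SAMPLE_RATE,
) -> Iterator[Tuple[int, Sequence[float]]]:
    """Yield (start_index, window) pairs that together cover the waveform.

    The last window may be shorter than window_s; the feature extractor
    can pad it at inference time.
    """
    window = int(window_s * sample_rate)
    hop = int(hop_s * sample_rate)
    if window <= 0 or hop <= 0:
        raise ValueError("window_s and hop_s must be positive")

    n = len(samples)
    if n == 0:
        return

    start = 0
    while True:
        yield start, samples[start:start + window]
        if start + window >= n:  # this window already reaches the end of the clip
            break
        start += hop


# Example: a 10-second clip of silence split into 4-second windows with a 2-second hop.
clip = [0.0] * (10 * SAMPLE_RATE)
starts = [start for start, _ in iter_windows(clip)]
print([s / SAMPLE_RATE for s in starts])
```

Each window could then be passed to the feature extractor and model in the same way as a whole file, with the per-window scores combined (for example by taking the maximum) depending on how conservative the application needs to be.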
Based on "WavLM Model Ensemble for Audio Deepfake Detection". Example inference:
from transformers import AutoFeatureExtractor, AutoModelForAudioClassification
import torch, librosa
model_id = "0xmola/wavlm-deepfake-audio-forensics"
extractor = AutoFeatureExtractor.from_pretrained(model_id)
model = AutoModelForAudioClassification.from_pretrained(model_id)
model.eval()
# Load audio
audio, sr = librosa.load("audio.wav", sr=16000)
# Inference
inputs = extractor(audio, sampling_rate=16000, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)
# Risk score (0-100, higher = more likely fake)
spoof_idx = model.config.label2id["spoof"]
risk_score = int(probs[0, spoof_idx].item() * 100)
print(f"Risk Score: {risk_score}/100")
print("⚠️ HIGH RISK" if risk_score >= 60 else "✅ LOW RISK")
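To make the scoring arithmetic concrete without downloading the model, here is a minimal sketch of the same softmax, scale, and threshold steps in plain Python. The function name `risk_from_logits` and the example logits are hypothetical; the 0-100 scale and the cutoff of 60 mirror the snippet above.

```python
import math
from typing import Sequence, Tuple


def risk_from_logits(
    logits: Sequence[float],
    spoof_idx: int,
    threshold: int = 60,  # same cutoff as the snippet above
) -> Tuple[int, str]:
    """Convert raw class logits into a 0-100 risk score and a label."""
    # Numerically stable softmax: subtract the max logit before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # int() truncates toward zero, matching the original snippet.
    score = int(probs[spoof_idx] * 100)
    label = "HIGH RISK" if score >= threshold else "LOW RISK"
    return score, label


# Hypothetical logits for a two-class (bonafide, spoof) head, spoof at index 1.
print(risk_from_logits([-2.0, 2.0], spoof_idx=1))
print(risk_from_logits([2.0, -2.0], spoof_idx=1))
```

Note that the score is a truncated probability, so a value of 59.9% is reported as 59 and falls below the cutoff.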
Try the live demo: Audio Forensics Deepfake Detector
Base model
microsoft/wavlm-base