README

DeBERTa-v3 for Explicit Hate Speech Detection

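This is a DeBERTa-v3-base model fine-tuned for binary hate speech classification (HATE vs. NOT_HATE). It has been converted to ONNX format for high-performance inference.

The model is part of a master's project aimed at identifying the most robust hate speech classifier. Although it performs well on quantitative benchmarks, it has significant, documented limitations in detecting implicit or "coded" hate.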

Author: Taiwo Ogun

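Model Performance

The model was fine-tuned on the HateXplain dataset and evaluated on an out-of-domain (OOD) dataset (Davidson) to test how well it generalizes to real-world data. Its in-domain scores are on par with the four other models compared (BERT, DistilBERT, RoBERTa, XLM-RoBERTa), but it clearly outperforms all of them out of domain.

F1 Score and Accuracy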

Model	F1 (in-domain)	F1 (out-of-domain)	Accuracy (in-domain)	Accuracy (out-of-domain)
DeBERTa-v3 (this model)	83.19%	92.86%	78.78%	87.98%
BERT	83.48%	81.68%	79.70%	73.05%
DistilBERT	83.13%	80.01%	78.68%	71.22%
RoBERTa	82.37%	76.38%	78.78%	66.56%
XLM-RoBERTa	81.58%	66.69%	77.59%	56.36%
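
To make the generalization comparison concrete, the short sketch below computes how much each model's F1 changes when moving from in-domain to out-of-domain data. The numbers are copied from the table above; the names `f1_scores` and `f1_shift` are illustrative and not part of the project code.

```python
# F1 scores in percent, copied from the table above: (in-domain, out-of-domain).
f1_scores = {
    "DeBERTa-v3": (83.19, 92.86),
    "BERT": (83.48, 81.68),
    "DistilBERT": (83.13, 80.01),
    "RoBERTa": (82.37, 76.38),
    "XLM-RoBERTa": (81.58, 66.69),
}


def f1_shift(scores):
    """Return out-of-domain minus in-domain F1, in percentage points, per model."""
    return {name: round(ood - ind, 2) for name, (ind, ood) in scores.items()}


shifts = f1_shift(f1_scores)

# Print models from largest gain to largest drop.
for name, delta in sorted(shifts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:12s} {delta:+.2f} pp")
```

By this measure DeBERTa-v3 is the only model whose F1 rises out of domain (+9.67 percentage points), while XLM-RoBERTa shows the largest drop (-14.89).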

How to Use (with ONNX Runtime)

The model is in ONNX format. The easiest way to use it is with Hugging Face's optimum library.

from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification, pipeline

# Load the ONNX model and tokenizer from the Hub
repo_id = "TaiwoOgun/deberta-v3-hate-speech-onnx"
model = ORTModelForSequenceClassification.from_pretrained(repo_id)
tokenizer = AutoTokenizer.from_pretrained(repo_id)

# Create the pipeline
classifier = pipeline(
    "text-classification",
    model=model,
    tokenizer=tokenizer
)

# Run inference
texts = [
    "This is a wonderful, positive statement.",
    "Go back to where you came from."
]

predictions = classifier(texts)
print(predictions)
# [{'label': 'NOT_HATE', 'score': 0.8...}, 
#  {'label': 'NOT_HATE', 'score': 0.8...}]
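
Because the model has documented limitations on implicit hate, one possible safeguard is to send lower-confidence NOT_HATE predictions to a human reviewer. The sketch below assumes predictions in the list-of-dicts format shown in the output above. The helper name `flag_for_review`, the 0.9 threshold, and the example scores are all hypothetical and not part of the project.

```python
def flag_for_review(texts, predictions, threshold=0.9):
    """Return (text, score) pairs predicted NOT_HATE with a score below threshold.

    The threshold is an illustrative assumption, not a tuned value.
    """
    flagged = []
    for text, pred in zip(texts, predictions):
        if pred["label"] == "NOT_HATE" and pred["score"] < threshold:
            flagged.append((text, pred["score"]))
    return flagged


# Made-up inputs and scores for illustration only; these are not model outputs.
example_texts = ["text A", "text B", "text C"]
example_predictions = [
    {"label": "NOT_HATE", "score": 0.97},
    {"label": "NOT_HATE", "score": 0.81},
    {"label": "HATE", "score": 0.66},
]

print(flag_for_review(example_texts, example_predictions))
```

With real pipeline output, the call would be `flag_for_review(texts, predictions)`. A confidence filter like this only catches uncertain cases; implicit hate that the model confidently labels NOT_HATE would still pass through.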

TaiwoOgun/deberta-v3-hate-speech-onnx

By TaiwoOgun

text-classification

Created: 2025-10-30 21:06:56+00:00

Updated: 2025-11-06 12:18:38+00:00

Files (9)

.gitattributes

README.md

added_tokens.json

config.json

model.onnx ONNX

special_tokens_map.json

spm.model

tokenizer.json

tokenizer_config.json