ZipVoice⚡: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching

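This repository contains checkpoints for two fast, high-quality, non-autoregressive zero-shot text-to-speech models:

  • ZipVoice, for single-speaker speech generation. For details, see the paper and the demo.

  • ZipVoice-Dialog, for spoken dialogue generation. For details, see the paper and the demo.

See our GitHub repository, ZipVoice, for instructions on using the models.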
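As a rough sketch of how the checkpoint files might be fetched before following the GitHub instructions, the snippet below builds the repo-relative paths for one model directory and downloads them. It assumes the `huggingface_hub` package is installed and that the repository id is `k2-fsa/ZipVoice` as shown on this page; the helper names `repo_paths` and `download_directory` are hypothetical and not part of ZipVoice.

```python
from pathlib import Path
from typing import List

# Repository id as displayed on this page (assumed to be the Hugging Face repo id).
REPO_ID = "k2-fsa/ZipVoice"

# Files that every model directory in the file listing contains.
COMMON_FILES = ("model.pt", "model.json", "tokens.txt", "zipvoice_base.json")


def repo_paths(directory: str) -> List[str]:
    """Return the repo-relative paths of the common files for one directory."""
    return [f"{directory}/{name}" for name in COMMON_FILES]


def download_directory(directory: str) -> List[Path]:
    """Download the common files of one directory and return their local paths.

    Requires network access; huggingface_hub is imported lazily so that
    repo_paths() stays usable without it.
    """
    from huggingface_hub import hf_hub_download

    return [
        Path(hf_hub_download(repo_id=REPO_ID, filename=path))
        for path in repo_paths(directory)
    ]
```

Calling `download_directory("zipvoice_distill")` would, under these assumptions, place the four files in the local Hugging Face cache. How the files are then loaded is described in the GitHub repository, not here.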

1. Description of each directory

| Directory | Model type | Training data | Initialized from |
|---|---|---|---|
| zipvoice | ZipVoice | Emilia | - |
| zipvoice_libritts | ZipVoice | LibriTTS | - |
| zipvoice_distill | ZipVoice-Distill | Emilia | zipvoice/model.pt |
| zipvoice_distill_libritts | ZipVoice-Distill | LibriTTS | zipvoice_libritts/model.pt |
| zipvoice_dialog | ZipVoice-Dialog | OpenDialog + internal dataset | zipvoice/model.pt |
| zipvoice_dialog_opendialog | ZipVoice-Dialog | OpenDialog | zipvoice/model.pt |
| zipvoice_dialog_stereo | ZipVoice-Dialog-Stereo | Internal dataset | zipvoice_dialog/model.pt |
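
The initialization column in the table above forms a small dependency chain (for example, the stereo dialogue model starts from the dialogue model, which in turn starts from the base ZipVoice model). The following minimal sketch transcribes that column into a dictionary and walks the chain. The names `INIT_FROM` and `init_chain` are illustrative only, and `None` simply means the table lists no initialization checkpoint.

```python
from typing import Dict, List, Optional

# Directory -> directory whose model.pt it was initialized from,
# transcribed from the table. None means the table lists "-".
INIT_FROM: Dict[str, Optional[str]] = {
    "zipvoice": None,
    "zipvoice_libritts": None,
    "zipvoice_distill": "zipvoice",
    "zipvoice_distill_libritts": "zipvoice_libritts",
    "zipvoice_dialog": "zipvoice",
    "zipvoice_dialog_opendialog": "zipvoice",
    "zipvoice_dialog_stereo": "zipvoice_dialog",
}


def init_chain(directory: str) -> List[str]:
    """Return the directory followed by every checkpoint it descends from."""
    chain = [directory]
    # Follow the "initialized from" links until a model with no parent is reached.
    while INIT_FROM[chain[-1]] is not None:
        chain.append(INIT_FROM[chain[-1]])
    return chain


if __name__ == "__main__":
    for name in INIT_FROM:
        print(" <- ".join(init_chain(name)))
```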

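2. Discussion and communication

You can start a discussion directly on GitHub Issues.

You can also scan the QR codes to join our WeChat group or follow our WeChat official account.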

[QR code images: WeChat group, WeChat official account]

3. Citation

@article{zhu2025zipvoice,
      title={ZipVoice: Fast and High-Quality Zero-Shot Text-to-Speech with Flow Matching},
      author={Zhu, Han and Kang, Wei and Yao, Zengwei and Guo, Liyong and Kuang, Fangjun and Li, Zhaoqing and Zhuang, Weiji and Lin, Long and Povey, Daniel},
      journal={arXiv preprint arXiv:2506.13053},
      year={2025}
}

@article{zhu2025zipvoicedialog,
      title={ZipVoice-Dialog: Non-Autoregressive Spoken Dialogue Generation with Flow Matching},
      author={Zhu, Han and Kang, Wei and Guo, Liyong and Yao, Zengwei and Kuang, Fangjun and Zhuang, Weiji and Li, Zhaoqing and Han, Zhifeng and Zhang, Dong and Zhang, Xin and Song, Xingchen and Lin, Long and Povey, Daniel},
      journal={arXiv preprint arXiv:2507.09318},
      year={2025}
}

k2-fsa/ZipVoice

Author: k2-fsa

text-to-speech

Created: 2025-06-15 07:53:40+00:00

Updated: 2025-08-19 03:06:16+00:00

Files (40)

.gitattributes
README.md
zipvoice/fm_decoder.onnx
zipvoice/fm_decoder_int8.onnx
zipvoice/model.json
zipvoice/model.pt
zipvoice/model.safetensors
zipvoice/text_encoder.onnx
zipvoice/text_encoder_int8.onnx
zipvoice/tokens.txt
zipvoice/zipvoice_base.json
zipvoice_dialog/model.json
zipvoice_dialog/model.pt
zipvoice_dialog/tokens.txt
zipvoice_dialog/zipvoice_base.json
zipvoice_dialog_opendialog/model.json
zipvoice_dialog_opendialog/model.pt
zipvoice_dialog_opendialog/tokens.txt
zipvoice_dialog_opendialog/zipvoice_base.json
zipvoice_dialog_stereo/model.json
zipvoice_dialog_stereo/model.pt
zipvoice_dialog_stereo/tokens.txt
zipvoice_dialog_stereo/zipvoice_base.json
zipvoice_distill/fm_decoder.onnx
zipvoice_distill/fm_decoder_int8.onnx
zipvoice_distill/model.json
zipvoice_distill/model.pt
zipvoice_distill/model.safetensors
zipvoice_distill/text_encoder.onnx
zipvoice_distill/text_encoder_int8.onnx
zipvoice_distill/tokens.txt
zipvoice_distill/zipvoice_base.json
zipvoice_distill_libritts/model.json
zipvoice_distill_libritts/model.pt
zipvoice_distill_libritts/tokens.txt
zipvoice_distill_libritts/zipvoice_base.json
zipvoice_libritts/model.json
zipvoice_libritts/model.pt
zipvoice_libritts/tokens.txt
zipvoice_libritts/zipvoice_base.json
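
In the listing above, only some directories include ONNX exports (`text_encoder.onnx` and `fm_decoder.onnx`, each with an `_int8` variant). The sketch below shows one way to group such a listing by directory and find which directories ship ONNX files. The function names are hypothetical, and `FILES` is only a shortened excerpt of the listing used for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

# Shortened excerpt of the file listing, for illustration only.
FILES = [
    "README.md",
    "zipvoice/fm_decoder.onnx",
    "zipvoice/fm_decoder_int8.onnx",
    "zipvoice/model.pt",
    "zipvoice/text_encoder.onnx",
    "zipvoice_dialog/model.pt",
    "zipvoice_distill/fm_decoder.onnx",
    "zipvoice_distill/text_encoder.onnx",
    "zipvoice_libritts/model.pt",
]


def group_by_directory(paths: Iterable[str]) -> Dict[str, List[str]]:
    """Map each top-level directory to the file names it contains."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        directory, sep, name = path.partition("/")
        if sep:  # skip top-level files such as README.md
            groups[directory].append(name)
    return dict(groups)


def onnx_directories(paths: Iterable[str]) -> List[str]:
    """Return, in sorted order, the directories holding at least one .onnx file."""
    return sorted(
        directory
        for directory, names in group_by_directory(paths).items()
        if any(name.endswith(".onnx") for name in names)
    )


if __name__ == "__main__":
    print(onnx_directories(FILES))
```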