README

Yi-1.5-6B-Chat DirectML ONNX Model

This repository hosts an optimized version of 01-ai/Yi-1.5-6B-Chat for accelerated inference with ONNX Runtime on DirectML.

Usage on Windows (Intel / AMD / NVIDIA / Qualcomm)

conda create -n onnx python=3.10
conda activate onnx
winget install -e --id GitHub.GitLFS
pip install huggingface-hub[cli]
huggingface-cli download EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx --include "onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/*" --local-dir .\01-ai_Yi-1.5-6B-Chat-int4
pip install numpy==1.26.4
Invoke-WebRequest -Uri "https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/examples/python/phi3-qa.py" -OutFile "phi3-qa.py"
pip install onnxruntime-directml
pip install --pre onnxruntime-genai-directml
conda install conda-forge::vs2015_runtime
python phi3-qa.py -m .\01-ai_Yi-1.5-6B-Chat-int4\onnx\directml\01-ai_Yi-1.5-6B-Chat-int4
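
The download step above can also be driven from Python. The following is a minimal sketch, not part of the original instructions: it uses `snapshot_download` from `huggingface_hub` with an `allow_patterns` filter, and the helper names (`allow_pattern`, `model_dir`, `download`) are hypothetical. It assumes the repository's folder layout is reproduced under the local directory, so the model files end up in a nested `onnx/directml/...` folder.

```python
from pathlib import Path

REPO_ID = "EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx"
MODEL_SUBDIR = "onnx/directml/01-ai_Yi-1.5-6B-Chat-int4"


def allow_pattern(subdir: str = MODEL_SUBDIR) -> str:
    """Glob that matches every file directly under the model sub-folder."""
    return subdir.rstrip("/") + "/*"


def model_dir(local_dir: str, subdir: str = MODEL_SUBDIR) -> Path:
    """Folder expected to hold model.onnx.

    Assumption: the repo layout is kept as-is below local_dir.
    """
    return Path(local_dir).joinpath(*subdir.split("/"))


def download(local_dir: str = "01-ai_Yi-1.5-6B-Chat-int4") -> Path:
    """Fetch only the int4 DirectML files and return the model folder."""
    # Imported lazily so the helpers above work without huggingface_hub.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[allow_pattern()],
        local_dir=local_dir,
    )
    return model_dir(local_dir)
```

Calling `download()` returns the folder that would then be passed to `phi3-qa.py` through `-m`, under the layout assumption stated above.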

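What is DirectML

DirectML is a high-performance, hardware-accelerated DirectX 12 library for machine learning. It provides GPU acceleration for common machine learning tasks across a broad range of supported hardware and drivers, including all DirectX 12-capable GPUs from vendors such as AMD, Intel, NVIDIA, and Qualcomm.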
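
To confirm that the installed ONNX Runtime build can actually use DirectML, one option is to inspect the list of execution providers. This is a small sketch, assuming the `onnxruntime-directml` package is installed; `has_directml` and `directml_available` are hypothetical helper names, while `get_available_providers` and the provider name `DmlExecutionProvider` come from ONNX Runtime.

```python
DML_PROVIDER = "DmlExecutionProvider"


def has_directml(providers):
    """Return True if the DirectML execution provider is in the list."""
    return DML_PROVIDER in providers


def directml_available():
    """Ask the installed ONNX Runtime which providers it was built with."""
    # Imported lazily; requires the onnxruntime-directml package.
    import onnxruntime as ort

    return has_directml(ort.get_available_providers())
```

If `directml_available()` returns False, the runtime was likely installed without DirectML support (for example, the CPU-only `onnxruntime` package).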

EmbeddedLLM/01-ai_Yi-1.5-6B-Chat-onnx

Author: EmbeddedLLM

text-generation

Created: 2024-06-18 15:19:10+00:00

Updated: 2024-06-20 12:44:43+00:00

Files (10)

.gitattributes

README.md

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/config.json

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/genai_config.json

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/model.onnx

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/model.onnx.data

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/special_tokens_map.json

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/tokenizer.json

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/tokenizer.model

onnx/directml/01-ai_Yi-1.5-6B-Chat-int4/tokenizer_config.json
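
After downloading, the file list above can serve as a quick completeness check. The sketch below is illustrative only: `EXPECTED_FILES` is copied from the listing, and `missing_files` is a hypothetical helper that reports which of those files are absent from a given model folder.

```python
from pathlib import Path

# File names taken from the listing above (model folder only).
EXPECTED_FILES = [
    "config.json",
    "genai_config.json",
    "model.onnx",
    "model.onnx.data",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer.model",
    "tokenizer_config.json",
]


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means every listed file is present. A missing `model.onnx.data` is worth checking first, since it holds the weights referenced by `model.onnx`.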