Documentation

This is a SparseML-quantized version of https://huggingface.co/laion/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K that is ready to use with the DeepSparse CPU inference engine. It achieves 71.1% zero-shot top-1 accuracy on ImageNet and 95.6% zero-shot top-1 accuracy on Imagenette. For comparison, the dense (original) model achieves 72.8% on ImageNet and 95.7% on Imagenette.
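To put the accuracy figures side by side, the short sketch below computes the absolute drop and the fraction of dense accuracy that the quantized model retains. It uses only the four numbers quoted above; the helper name `accuracy_change` is made up for illustration.

```python
# Zero-shot top-1 accuracies (percent) quoted in the paragraph above.
reported = {
    "ImageNet": {"dense": 72.8, "quantized": 71.1},
    "Imagenette": {"dense": 95.7, "quantized": 95.6},
}


def accuracy_change(dense, quantized):
    """Return (absolute drop in points, fraction of dense accuracy retained)."""
    return dense - quantized, quantized / dense


for dataset, acc in reported.items():
    drop, retained = accuracy_change(acc["dense"], acc["quantized"])
    print(f"{dataset}: -{drop:.1f} points, {retained:.1%} of dense accuracy retained")
# ImageNet: -1.7 points, 97.7% of dense accuracy retained
# Imagenette: -0.1 points, 99.9% of dense accuracy retained
```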

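On a 64-core Intel AVX-512 CPU machine with VNNI support, the model achieves a 2.35x speedup on text inputs and a 2.84x speedup on visual inputs relative to the full-precision model. At batch size 64, measured throughput is 1230 items/sec for images and 2009 items/sec for text.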
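As a rough worked example, the throughput figures can be turned into an average per-batch latency, and the speedups into an implied full-precision throughput. The second calculation assumes the speedups were measured under the same conditions as the throughput numbers (same machine, batch size 64), which the text above does not state explicitly. Treat those results as estimates only.

```python
BATCH_SIZE = 64

# Figures quoted above for the quantized model.
quantized = {
    "image": {"items_per_sec": 1230.0, "speedup": 2.84},
    "text": {"items_per_sec": 2009.0, "speedup": 2.35},
}


def batch_latency_ms(items_per_sec, batch_size=BATCH_SIZE):
    """Average time to process one batch, in milliseconds."""
    return 1000.0 * batch_size / items_per_sec


def implied_dense_throughput(items_per_sec, speedup):
    """Full-precision throughput, ASSUMING the speedup applies at this batch size."""
    return items_per_sec / speedup


for kind, m in quantized.items():
    latency = batch_latency_ms(m["items_per_sec"])
    dense = implied_dense_throughput(m["items_per_sec"], m["speedup"])
    print(f"{kind}: {latency:.1f} ms per batch, ~{dense:.0f} items/sec implied for dense")
```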

Two notebooks accompany the model: one for basic usage and one for Imagenette evaluation.

Team

The model and example pipelines were created by Eugenia Iofinova, Michael Goin, Chris Wendler, and Dan Alistarh. Special thanks to Abhinav Agarwalla and Alexandre Marques for technical support on parts of the project.

Setup for usage

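First, install DeepSparse with the CLIP extension. The requirement is quoted so the shell does not treat `>=` as an output redirect or the brackets as a glob pattern: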

pip install "deepsparse-nightly[clip]>=1.7.0.20231210"
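To confirm which build ended up in the environment, a small check like the following may help. It is a sketch using only the standard library. The distribution names it probes (`deepsparse-nightly`, then `deepsparse`) are assumptions about how the package is registered, and the helper name is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(candidates=("deepsparse-nightly", "deepsparse")):
    """Return (distribution name, version) for the first installed candidate, else None."""
    for name in candidates:
        try:
            return name, version(name)
        except PackageNotFoundError:
            continue
    return None


found = installed_version()
if found is None:
    print("DeepSparse does not appear to be installed in this environment.")
else:
    print(f"Found {found[0]} {found[1]}")
```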

Download some test images of a church, a dog, and elephants:

wget -O basilica.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolo/sample_images/basilica.jpg
wget -O buddy.jpeg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/tests/deepsparse/pipelines/sample_images/buddy.jpeg
wget -O thailand.jpg https://raw.githubusercontent.com/neuralmagic/deepsparse/main/src/deepsparse/yolact/sample_images/thailand.jpg
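A failed download can leave an empty file or an HTML error page under the expected name. The sketch below checks that each file exists and begins with the JPEG magic bytes. It does not validate the whole image, and the helper name `looks_like_jpeg` is made up for this example.

```python
from pathlib import Path

JPEG_MAGIC = b"\xff\xd8\xff"  # JPEG files start with these three bytes


def looks_like_jpeg(path):
    """True if the file exists and starts with the JPEG magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(3) == JPEG_MAGIC


for name in ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]:
    status = "ok" if looks_like_jpeg(name) else "missing or not a JPEG"
    print(f"{name}: {status}")
```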

For this model there is a second input, the length of the tokens, so run this input override code before creating a text pipeline:

import numpy as np
from deepsparse.clip import CLIPTextPipeline

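def custom_process_inputs(self, inputs):
    # Accept a single string as well as a list of strings.
    if not isinstance(inputs.text, list):
        inputs.text = [inputs.text]
    # Non-string input is assumed to be already tokenized; pass it through.
    if not isinstance(inputs.text[0], str):
        return inputs.text
    # Tokenize each prompt and stack into one (num_prompts, context_length) int32 array.
    tokens = [np.array(t).astype(np.int32) for t in self.tokenizer(inputs.text)]
    tokens = np.stack(tokens, axis=0)
    # Second model input: one entry per prompt, each set to context_length - 1.
    tokens_lengths = np.array(tokens.shape[0] * [tokens.shape[1] - 1])
    return [tokens, tokens_lengths]

# This overrides the process_inputs function globally for all CLIPTextPipeline classes
CLIPTextPipeline.process_inputs = custom_process_inputs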
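To see what the second input looks like, the sketch below reproduces the `tokens_lengths` expression on a dummy array, without DeepSparse or a tokenizer. The context length of 77 is an assumption (the usual CLIP value), not something read from this model's files.

```python
import numpy as np

# Stand-in for the stacked tokenizer output: 3 prompts padded to a fixed length.
# CONTEXT_LENGTH = 77 is an assumed value used only for illustration.
CONTEXT_LENGTH = 77
tokens = np.zeros((3, CONTEXT_LENGTH), dtype=np.int32)

# Same expression as in custom_process_inputs: a Python list with one entry
# per prompt (list repetition), each equal to the padded length minus one.
tokens_lengths = np.array(tokens.shape[0] * [tokens.shape[1] - 1])

print(tokens.shape)    # (3, 77)
print(tokens_lengths)  # [76 76 76]
```

Note that every prompt gets the same value regardless of how many real tokens it contains, since the value depends only on the padded shape.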

Text embedding pipeline

Here is an example of how to create and use a DeepSparse pipeline for text embeddings.

from deepsparse import Pipeline
from huggingface_hub import snapshot_download

# Download the model from HF
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

text_embed_pipeline = Pipeline.create(task="clip_text", model_path=model_folder + "/textual.onnx")

text = ["ice cream", "an elephant", "a dog", "a building", "a church"]

embeddings = text_embed_pipeline(text=text).text_embeddings
for i in range(len(embeddings)):
    print(embeddings[i].shape)
    print(embeddings[i])
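One common use of text embeddings is comparing prompts with each other. The following is a minimal sketch of pairwise cosine similarity in NumPy. It assumes the embeddings can be arranged as a single array of shape (n, dim), one row per prompt. The random data and the dimension 512 are stand-ins, so check the shapes printed by the pipeline before reusing it.

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity for an array-like of shape (n, dim)."""
    x = np.asarray(embeddings, dtype=np.float64)
    if x.ndim != 2:
        raise ValueError(f"expected shape (n, dim), got {x.shape}")
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.clip(norms, 1e-12, None)  # guard against zero-length vectors
    return x @ x.T


# Random stand-in data; in practice pass the pipeline's text embeddings instead.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(5, 512))
sim = cosine_similarity_matrix(fake_embeddings)
print(sim.shape)  # (5, 5)
```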

Image embedding pipeline

Here is an example of how to create and use a DeepSparse pipeline for image embeddings.

from deepsparse import Pipeline
from huggingface_hub import snapshot_download

# Download the model from HF
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

image_embed_pipeline = Pipeline.create(task="clip_visual", model_path=model_folder + "/visual.onnx")

images = ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]

embeddings = image_embed_pipeline(images=images).image_embeddings
for i in range(len(embeddings)):
    print(embeddings[i].shape)
    print(embeddings[i])
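Image embeddings are often computed once and reused. As an optional sketch, they can be stored in a NumPy `.npz` archive keyed by file name. The helper names are hypothetical, the sketch assumes one embedding array per image, and the demo uses dummy vectors rather than real pipeline output.

```python
import os
import tempfile

import numpy as np


def save_embeddings(path, names, embeddings):
    """Store one embedding per name in a compressed .npz archive."""
    if len(names) != len(embeddings):
        raise ValueError("names and embeddings must have the same length")
    arrays = {name: np.asarray(vec) for name, vec in zip(names, embeddings)}
    np.savez_compressed(path, **arrays)


def load_embeddings(path):
    """Load the archive back into a {name: array} dictionary."""
    with np.load(path) as archive:
        return {name: archive[name] for name in archive.files}


# Demo with dummy vectors written to a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "image_embeddings.npz")
save_embeddings(demo_path, ["basilica.jpg", "buddy.jpeg"], [np.ones(4), np.zeros(4)])
print(sorted(load_embeddings(demo_path)))  # ['basilica.jpg', 'buddy.jpeg']
```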

Zero-shot image classification pipeline

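Because CLIP trains the text and image embedding models jointly, embeddings from both can be compared directly without retraining. Here is an example of how to create and use a DeepSparse pipeline for zero-shot image classification.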

import numpy as np  # needed for np.argmax below
from deepsparse import Pipeline
from deepsparse.clip import (
    CLIPTextInput,
    CLIPVisualInput,
    CLIPZeroShotInput
)
from huggingface_hub import snapshot_download

# Download the model from HF
model_folder = snapshot_download(repo_id="neuralmagic/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds")

possible_classes = ["ice cream", "an elephant", "a dog", "a building", "a church"]
images = ["basilica.jpg", "buddy.jpeg", "thailand.jpg"]

# Load the model into DeepSparse
pipeline = Pipeline.create(
    task="clip_zeroshot",
    visual_model_path=model_folder + "/visual.onnx",
    text_model_path=model_folder + "/textual.onnx"
)

# Infer
output = pipeline(
    image=CLIPVisualInput(images=images),
    text=CLIPTextInput(text=possible_classes),
).text_scores

for i in range(len(output)):
    prediction = possible_classes[np.argmax(output[i])]
    print(f"Image {images[i]} is a picture of {prediction}")

"""
Image basilica.jpg is a picture of a church
Image buddy.jpeg is a picture of a dog
Image thailand.jpg is a picture of an elephant
"""
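For intuition about how image and text embeddings can be related, here is a sketch of the usual CLIP scoring recipe in plain NumPy: L2-normalize both sets, take scaled dot products, and apply a softmax over the classes. This illustrates the general technique and is not confirmed to match what the `clip_zeroshot` pipeline does internally. The `logit_scale` of 100 is a commonly used value and an assumption here, and the toy vectors are made up.

```python
import numpy as np


def zero_shot_probabilities(image_embeddings, text_embeddings, logit_scale=100.0):
    """Softmax over classes of scaled cosine similarities.

    image_embeddings: (n_images, dim); text_embeddings: (n_classes, dim).
    logit_scale=100.0 is an assumed, commonly used CLIP value.
    """
    img = np.asarray(image_embeddings, dtype=np.float64)
    txt = np.asarray(text_embeddings, dtype=np.float64)
    img = img / np.linalg.norm(img, axis=1, keepdims=True)
    txt = txt / np.linalg.norm(txt, axis=1, keepdims=True)
    logits = logit_scale * (img @ txt.T)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


# Toy 2-D example: two "images" and three "classes".
toy_images = [[1.0, 0.0], [0.0, 1.0]]
toy_classes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
probs = zero_shot_probabilities(toy_images, toy_classes)
print(probs.argmax(axis=1))  # [0 1]
```

Each row of the result sums to 1, so the largest entry in a row can be read as the predicted class for that image, in the same way `np.argmax` is applied to `text_scores` above.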

RedHatAI/CLIP-ViT-B-32-256x256-DataComp-s34B-b86K-quant-ds

Author: RedHatAI

zero-shot-classification

Created: 2023-12-19 18:03:47+00:00

Updated: 2024-01-12 22:47:29+00:00

Files (7)

.gitattributes

README.md

open_clip_config.json

special_tokens_map.json

textual.onnx ONNX

tokenizer_config.json

visual.onnx ONNX