README

license: mit
language:
- en
model_name: BioCLIP-2 Quantized
model_description: "BioCLIP-2 Quantized is a quantized version of BioCLIP-2, a foundation model for biological organismal images. It is trained on TreeOfLife-200M on the basis of a CLIP model (ViT-14/L) pre-trained on LAION-2B. BioCLIP-2 yields state-of-the-art performance in recognizing various species. More importantly, it demonstrates emergent properties beyond species classification after extensive hierarchical contrastive training."
base_model:
- imageomics/bioclip-2
tags:
- bioclip
- bioclip-2
- biology
- CV
- images
- imageomics
- clip
- species-classification
- biological visual task
- multimodal
- animals
- species
- taxonomy
- rare species
- endangered species
- evolutionary biology
- knowledge-guided
- zero-shot-image-classification
datasets:
- imageomics/TreeOfLife-200M
- GBIF
- bioscan-ml/BIOSCAN-5M
- EOL
- FathomNet

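Model Card for BioCLIP-2 Quantized

BioCLIP-2 Quantized is a quantized version of BioCLIP-2, a foundation model for biological organismal images; more information about that model is available on the imageomics/bioclip-2 model card. This quantized model is designed to reduce memory usage and improve inference speed, at the cost of a small loss in accuracy.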

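Model Description

BioCLIP-2 Quantized is obtained from the original BioCLIP-2 model by dynamic quantization. The original model is first converted to ONNX format with PyTorch's torch.onnx.export function. The resulting ONNX model is then dynamically quantized with onnxruntime.quantization.quantize_dynamic, with weight_type set to QuantType.QInt8.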
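The two steps described above can be sketched roughly as follows. This is a minimal sketch, not the author's actual script: the helper name `export_and_quantize`, the tensor names `image` and `image_features`, the 224x224 input resolution, and the choice to export only an image-encoding module are all assumptions made for illustration. Only `torch.onnx.export`, `quantize_dynamic`, and `QuantType.QInt8` come from the description. Running it requires the `torch`, `onnx`, and `onnxruntime` packages.

```python
def export_and_quantize(image_encoder, fp32_path, int8_path, image_size=224):
    """Export an image encoder to ONNX, then dynamically quantize its weights to int8.

    Hypothetical helper: names, shapes, and export options are illustrative assumptions.
    """
    # Imported lazily so the helper can be defined without the heavy dependencies installed.
    import torch
    from onnxruntime.quantization import QuantType, quantize_dynamic

    image_encoder.eval()

    # Dummy batch of one RGB image used to trace the model (resolution is an assumption).
    dummy_input = torch.randn(1, 3, image_size, image_size)

    # Step 1: convert the PyTorch module to a float32 ONNX graph.
    torch.onnx.export(
        image_encoder,
        (dummy_input,),
        fp32_path,
        input_names=["image"],
        output_names=["image_features"],
        # Let the batch dimension vary at inference time.
        dynamic_axes={"image": {0: "batch"}, "image_features": {0: "batch"}},
    )

    # Step 2: dynamic quantization, storing the weights as signed 8-bit integers.
    quantize_dynamic(
        model_input=fp32_path,
        model_output=int8_path,
        weight_type=QuantType.QInt8,
    )
    return int8_path
```

The function would be called with the image-encoding module of a loaded BioCLIP-2 model plus two output paths. Which module the author actually exported is not stated here.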

Quantization Details

You can find the quantization script here.
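For intuition about why 8-bit weights save memory and cost a little accuracy, the sketch below quantizes a handful of made-up weights with a simple symmetric per-tensor scheme. It is a simplified illustration under stated assumptions, not onnxruntime's implementation: the helper names and weight values are hypothetical, and the real quantizer chooses its own scales and zero points and quantizes activations at run time.

```python
def quantize_int8(weights):
    """Map floats to signed 8-bit integers with one shared scale (symmetric scheme)."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range used here.
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Made-up example weights, not taken from the model.
weights = [0.42, -1.30, 0.07, 0.99, -0.55]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The round-trip error of each weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored in one byte instead of the four bytes of a float32 value, and the rounding error per weight is bounded by half of `scale`. That bound is the source of the small accuracy loss.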

Inference

To run inference with the quantized model, you can use the onnxruntime library. Here is an example code snippet:


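import onnxruntime as ort
import torch
import torch.nn.functional as F
import numpy as np
from huggingface_hub import hf_hub_download


# Load the quantized model (for example, a local copy of onnx/bioclip2_model_int8.onnx)
session = ort.InferenceSession("path/to/bioclip-2-quantized.onnx", providers=['CPUExecutionProvider'])

# Return only the top label
k = 1

# Preprocess the image. `img` (the input image) and `preprocess_img` (the image
# transform) are not defined in this snippet and must be supplied by the caller.
img_tensor = preprocess_img(img).unsqueeze(0)
img_np = img_tensor.numpy()

# Run ONNX inference
input_name = session.get_inputs()[0].name
output_name = session.get_outputs()[0].name

img_features_np = session.run([output_name], {input_name: img_np})[0]

# Convert back to torch for compatibility with existing code
img_features = torch.from_numpy(img_features_np)
img_features = F.normalize(img_features, dim=-1)

# Optional: for open-ended classification you need the text embeddings:
txt_emb = torch.from_numpy(
    np.load(
        hf_hub_download(
            repo_id="imageomics/TreeOfLife-200M",
            filename="embeddings/txt_emb_species.npy",
            repo_type="dataset",
        )
    )
)
# Alternatively, skip this step and use your own text inputs for zero-shot classification

# Use the same text embeddings and logit scale as the original model.
# The main model's logit scale is 100.00000762939453. It is assumed here to be the
# already-exponentiated value, standing in for model.logit_scale.exp() because no
# `model` object is loaded in this snippet.
logit_scale = 100.00000762939453
logits = (logit_scale * img_features @ txt_emb).squeeze()
probs = F.softmax(logits, dim=0)

# `txt_names` (the label names matching the columns of txt_emb) and `format_name`
# are likewise not defined in this snippet.
topk = probs.topk(k)
prediction_dict = {
    format_name(*txt_names[i]): prob
    for i, prob in zip(topk.indices.tolist(), topk.values.tolist())
}

print(prediction_dict)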

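Trade-offs

The model was tested on open-ended species classification using Nguyen Le Truong Thien's animals and plants dataset, taking the highest-probability class as the predicted class, and compared against the main BioCLIP-2 model. The results are shown in the comparison figure (doc/comparison_to_main_model.png).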
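A comparison of this kind can be scored as sketched below: top-1 accuracy for each model against the ground-truth labels, plus how often the two models agree. This is a minimal sketch under stated assumptions. The function names are hypothetical, and the label lists are made-up placeholders, not measurements from this model card.

```python
def top1_accuracy(predictions, labels):
    """Fraction of samples whose highest-probability class matches the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def agreement_rate(predictions_a, predictions_b):
    """Fraction of samples on which two models predict the same class."""
    return top1_accuracy(predictions_a, predictions_b)


# Placeholder data for illustration only; these are NOT real evaluation results.
labels = ["Panthera leo", "Quercus robur", "Apis mellifera", "Danaus plexippus"]
main_preds = ["Panthera leo", "Quercus robur", "Apis mellifera", "Danaus plexippus"]
quantized_preds = ["Panthera leo", "Quercus robur", "Apis cerana", "Danaus plexippus"]

main_acc = top1_accuracy(main_preds, labels)
quantized_acc = top1_accuracy(quantized_preds, labels)
agreement = agreement_rate(main_preds, quantized_preds)
print(main_acc, quantized_acc, agreement)
```

Reporting agreement alongside accuracy helps separate errors introduced by quantization from errors the main model already makes.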

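Acknowledgements

The main model was developed by the Imageomics Institute team. The current model is only a quantized version of the main model, intended to reduce memory usage, speed up inference, and make the model more accessible.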

mahan-ym/bioclip-2-quantized

Author: mahan-ym

zero-shot-image-classification

Created: 2025-11-21 09:48:30+00:00

Updated: 2025-11-21 12:46:02+00:00

Files (4)

.gitattributes

README.md

doc/comparison_to_main_model.png

onnx/bioclip2_model_int8.onnx ONNX