
README


library_name: transformers.js
license: apple-amlr
pipeline_tag: image-text-to-text
tags:
  - fastvlm

Usage

Transformers.js

If you haven't already, you can install the Transformers.js JavaScript library from NPM using:

npm i @huggingface/transformers

You can then caption images as follows:

import {
  AutoProcessor,
  AutoModelForImageTextToText,
  load_image,
  TextStreamer,
} from "@huggingface/transformers";

// Load processor and model
const model_id = "onnx-community/FastVLM-0.5B-ONNX";
const processor = await AutoProcessor.from_pretrained(model_id);
const model = await AutoModelForImageTextToText.from_pretrained(model_id, {
  dtype: {
    embed_tokens: "fp16",
    vision_encoder: "q4",
    decoder_model_merged: "q4",
  },
});

// Prepare prompt
const messages = [
  {
    role: "user",
    content: "<image>Describe this image in detail.",
  },
];
const prompt = processor.apply_chat_template(messages, {
  add_generation_prompt: true,
});

// Prepare inputs
const url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg";
const image = await load_image(url);
const inputs = await processor(image, prompt, {
  add_special_tokens: false,
});

// Generate output
const outputs = await model.generate({
  ...inputs,
  max_new_tokens: 512,
  do_sample: false,
  streamer: new TextStreamer(processor.tokenizer, {
    skip_prompt: true,
    skip_special_tokens: false,
    // callback_function: (text) => { /* Do something with the streamed output */ },
  }),
});

// Decode output
const decoded = processor.batch_decode(
  outputs.slice(null, [inputs.input_ids.dims.at(-1), null]),
  { skip_special_tokens: true },
);
console.log(decoded[0]);
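
The slice in the decoding step drops the prompt tokens, so only the newly generated text is decoded. As a minimal sketch of that idea, here is a plain-array analogue that does not use Transformers.js tensors at all. The helper name `trimPrompt` and the token IDs are made up for illustration.

```javascript
// Hypothetical helper: a plain-array analogue of
// outputs.slice(null, [promptLength, null]) on a [batch, sequence] tensor.
function trimPrompt(sequences, promptLength) {
  // Keep every row (batch item), but drop the first promptLength tokens.
  return sequences.map((row) => row.slice(promptLength));
}

// Made-up token IDs: the first three stand for the prompt, the rest are generated.
const sequences = [[101, 102, 103, 7, 8, 9]];
const generatedOnly = trimPrompt(sequences, 3);
console.log(JSON.stringify(generatedOnly)); // [[7,8,9]]
```

In the usage example above, `inputs.input_ids.dims.at(-1)` plays the role of `promptLength`.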

<details>

<summary>Click here to see the example output</summary>

The image depicts a vibrant and colorful scene featuring a variety of flowers and plants. The main focus is on a striking pink flower with a dark center, which appears to be a type of petunia. The petals are a rich, deep pink, and the flower has a classic, slightly ruffled appearance. The dark center of the flower is a contrasting color, likely a deep purple or black, which adds to the flower's visual appeal.

In the background, there are several other flowers and plants, each with their unique colors and shapes. To the left, there is a red flower with a bright, vivid hue, which stands out against the pink flower. The red flower has a more rounded shape and a lighter center, with petals that are a lighter shade of red compared to the pink flower.

To the right of the pink flower, there is a plant with red flowers, which are smaller and more densely packed. The red flowers are a deep, rich red color, and they have a more compact shape compared to the pink flower.

In the foreground, there is a green plant with a few leaves and a few small flowers. The leaves are a bright green color, and the flowers are a lighter shade of green, with a few petals that are slightly open.

Overall, the image is a beautiful representation of a garden or natural setting, with a variety of flowers and plants that are in full bloom. The colors are vibrant and the composition is well-balanced, with the pink flower in the center drawing the viewer's attention.

</details>
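
To capture text like the example output while it is still streaming, the commented-out `callback_function` option in the usage example can be given a function that receives each chunk. The sketch below shows one possible way to accumulate those chunks. `createTextCollector` is a hypothetical helper, not part of Transformers.js, and the chunks are simulated so the snippet runs without loading a model.

```javascript
// Hypothetical helper: builds a callback that accumulates streamed text chunks.
function createTextCollector(onUpdate = () => {}) {
  let text = "";
  return {
    // Intended to be passed as `callback_function` in the TextStreamer options.
    callback: (chunk) => {
      text += chunk;
      onUpdate(text); // e.g. refresh a UI element with the text so far
    },
    // Returns everything received so far.
    getText: () => text,
  };
}

// Simulated chunks stand in for what the streamer would deliver.
const collector = createTextCollector();
for (const chunk of ["The image ", "depicts ", "a flower."]) {
  collector.callback(chunk);
}
console.log(collector.getText()); // The image depicts a flower.
```

In the usage example, this would be wired in as `callback_function: collector.callback`. Because the streamer there sets `skip_special_tokens: false`, the collected text may include special tokens.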

acrkaan/FastVLM-0.5B-ONNX-q4

Author: acrkaan

image-text-to-text transformers.js
Downloads: 1 · Likes: 0

Created: 2025-09-01 13:16:37+00:00

Updated: 2025-09-01 13:31:41+00:00


Files (17)

.gitattributes
LICENSE
README.md
added_tokens.json
config.json
generation_config.json
merges.txt
model.json
onnx/decoder_model_merged.onnx
onnx/embed_tokens.onnx
onnx/vision_encoder.onnx
preprocessor_config.json
processor_config.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
vocab.json