README

Xenova/bge-small-en-v1.5

ONNX weights for BAAI/bge-small-en-v1.5, compatible with Transformers.js.

Usage (Transformers.js)

If you have not installed it yet, you can install the Transformers.js JavaScript library from NPM with:

npm i @huggingface/transformers

You can then use the model to compute embeddings as follows:

import { pipeline } from '@huggingface/transformers';

// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5');

// Compute sentence embeddings
const texts = ['Hello world.', 'Example sentence.'];
const embeddings = await extractor(texts, { pooling: 'mean', normalize: true });
console.log(embeddings);
// Tensor {
//   dims: [ 2, 384 ],
//   type: 'float32',
//   data: Float32Array(768) [ -0.04314826801419258, -0.029488801956176758, ... ],
//   size: 768
// }

console.log(embeddings.tolist()); // Convert embeddings to a JavaScript list
// [
//   [ -0.04314826801419258, -0.029488801956176758, 0.027080481871962547, ... ],
//   [ -0.03605496883392334, 0.01643390767276287, 0.008982205763459206, ... ]
// ]
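
As a rough illustration of what the `pooling: 'mean'` and `normalize: true` options do conceptually, the sketch below averages per-token vectors (skipping padded positions) and then scales the result to unit length. It is plain JavaScript on toy data, not the library's internal implementation; `meanPool`, `l2Normalize`, and the input values are hypothetical names and numbers chosen for this example.

```javascript
// Mean-pool token vectors into a single sentence vector.
// tokenVectors: number[][] (one vector per token position)
// attentionMask: number[] (1 = real token, 0 = padding); padded positions are skipped.
function meanPool(tokenVectors, attentionMask) {
    const dim = tokenVectors[0].length;
    const sum = new Array(dim).fill(0);
    let count = 0;
    for (let t = 0; t < tokenVectors.length; ++t) {
        if (!attentionMask[t]) continue;
        for (let d = 0; d < dim; ++d) sum[d] += tokenVectors[t][d];
        ++count;
    }
    // Guard against an all-padding input to avoid dividing by zero.
    return sum.map((x) => x / Math.max(count, 1));
}

// Scale a vector to unit L2 length (a zero vector is returned unchanged).
function l2Normalize(vector) {
    const norm = Math.hypot(...vector);
    return vector.map((x) => x / (norm || 1));
}

// Toy input (assumed values): 3 token positions, 2 dimensions, last position is padding.
const tokenVectors = [[1, 2], [3, 4], [100, 100]];
const attentionMask = [1, 1, 0];

const pooled = meanPool(tokenVectors, attentionMask); // average of the first two rows
const sentenceEmbedding = l2Normalize(pooled);        // same direction, length 1
console.log(pooled, sentenceEmbedding);
```

Because normalized embeddings have length 1, the cosine similarity between two of them reduces to a plain dot product.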

You can also use the model for retrieval. For example:

import { pipeline, cos_sim } from '@huggingface/transformers';

// Create a feature-extraction pipeline
const extractor = await pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5');

// List of documents you want to embed
const texts = [
    'Hello world.',
    'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.',
    'I love pandas so much!',
];

// Compute sentence embeddings
const embeddings = await extractor(texts, { pooling: 'mean', normalize: true });

// Prepend recommended query instruction for retrieval.
const query_prefix = 'Represent this sentence for searching relevant passages: ';
const query = query_prefix + 'What is a panda?';
const query_embeddings = await extractor(query, { pooling: 'mean', normalize: true });

// Sort by cosine similarity score
const scores = embeddings.tolist().map(
    (embedding, i) => ({
        id: i,
        score: cos_sim(query_embeddings.data, embedding),
        text: texts[i],
    })
).sort((a, b) => b.score - a.score);
console.log(scores);
// [
//   { id: 1, score: 0.7995888037433755, text: 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.' },
//   { id: 2, score: 0.6911046766159414, text: 'I love pandas so much!' },
//   { id: 0, score: 0.39066192695524765, text: 'Hello world.' }
// ]
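
For readers who want to see the ranking step without the library, here is a minimal sketch of cosine-similarity scoring and top-k selection in plain JavaScript. The helpers `dot`, `cosineSimilarity`, and `topK` and the 2-dimensional toy vectors are assumptions made for illustration; in practice the vectors would be the 384-dimensional embeddings produced by the extractor.

```javascript
// Dot product of two equal-length numeric arrays (plain or typed).
function dot(a, b) {
    let total = 0;
    for (let i = 0; i < a.length; ++i) total += a[i] * b[i];
    return total;
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a, b) {
    return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Score every document against the query and keep the k best matches.
function topK(queryVector, docVectors, k) {
    return docVectors
        .map((vector, id) => ({ id, score: cosineSimilarity(queryVector, vector) }))
        .sort((a, b) => b.score - a.score)
        .slice(0, k);
}

// Toy vectors (assumed values) standing in for real embeddings.
const queryVector = [1, 0];
const docVectors = [[0, 1], [0.6, 0.8], [1, 0]];

const ranked = topK(queryVector, docVectors, 2);
console.log(ranked); // document 2 ranks first, then document 1
```

For a few thousand documents this linear scan is usually adequate; larger collections typically call for an approximate nearest-neighbor index instead.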

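Note: Keeping ONNX weights in a separate repo is intended as a temporary solution until WebML gains more traction. If you would like to make your models web-ready, we recommend converting them to ONNX with 🤗 Optimum and structuring your repo like this one (with the ONNX weights in a subfolder named onnx).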

Author: Xenova

feature-extraction transformers.js

Downloads: 147K · Likes: 14

Created: 2023-09-13 15:48:17+00:00

Updated: 2025-07-22 16:45:37+00:00

Files (16)

.gitattributes

README.md

config.json

onnx/model.onnx ONNX

onnx/model_bnb4.onnx ONNX

onnx/model_fp16.onnx ONNX

onnx/model_int8.onnx ONNX

onnx/model_q4.onnx ONNX

onnx/model_q4f16.onnx ONNX

onnx/model_quantized.onnx ONNX

onnx/model_uint8.onnx ONNX

quantize_config.json

special_tokens_map.json

tokenizer.json

tokenizer_config.json

vocab.txt
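
The onnx folder holds several precision variants of the same model. The sketch below shows one way to pick a variant through the `dtype` option of `pipeline` in Transformers.js v3. The suffix table reflects my understanding of the library's default dtype-to-filename convention and should be treated as an assumption to check against the library documentation; `DTYPE_TO_SUFFIX`, `onnxFileForDtype`, and `loadExtractor` are hypothetical helpers written for this example.

```javascript
// Assumed mapping from a Transformers.js `dtype` value to the ONNX file suffix.
const DTYPE_TO_SUFFIX = {
    fp32: '',
    fp16: '_fp16',
    q8: '_quantized',
    int8: '_int8',
    uint8: '_uint8',
    q4: '_q4',
    q4f16: '_q4f16',
    bnb4: '_bnb4',
};

// Hypothetical helper: which file in this repo a given dtype is expected to load.
function onnxFileForDtype(dtype) {
    if (!Object.hasOwn(DTYPE_TO_SUFFIX, dtype)) {
        throw new Error(`Unknown dtype: ${dtype}`);
    }
    return `onnx/model${DTYPE_TO_SUFFIX[dtype]}.onnx`;
}

// Hypothetical helper: load the extractor with an explicit precision.
// The library is imported lazily, so nothing is downloaded until this is called.
async function loadExtractor(dtype = 'q8') {
    const { pipeline } = await import('@huggingface/transformers');
    return pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5', { dtype });
}

console.log(onnxFileForDtype('fp32'));
console.log(onnxFileForDtype('q8'));
```

Lower-precision variants trade some embedding accuracy for smaller downloads and faster inference, so it is worth comparing retrieval quality on your own data before settling on one.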