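SentenceTransformer based on hyrinmansoor/text2frappe-s2-sbert

This is a sentence-transformers model fine-tuned from hyrinmansoor/text2frappe-s2-sbert. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and similar tasks.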

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base Model: hyrinmansoor/text2frappe-s2-sbert <!-- at revision e8a3d9a9dfcf109eaa7315c74feac7b4f705dd36 -->
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
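
The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings, with padding positions excluded through the attention mask. The following is a minimal pure-Python sketch of that idea on toy data. The function name `mean_pool` and the 2-dimensional vectors are illustrative assumptions, not the library's implementation, which operates on batched tensors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the embeddings of the positions whose mask value is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    # Guard against an all-padding input to avoid division by zero.
    count = max(count, 1)
    return [value / count for value in total]


# Toy input: 3 token positions, 2 dimensions, the last position is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

The padded position is ignored, so the large values in the third row do not affect the result.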

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Doctype: Supplier\nQuestion: Vendors tally per country?',
    'country: supplier country',
    'default_currency: currency used',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 1.0000,  0.4299, -0.3074],
#         [ 0.4299,  1.0000,  0.0622],
#         [-0.3074,  0.0622,  1.0000]])
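
In the example above, the first sentence is a question about a Doctype and the other two are field descriptions, so row 0 of the similarity matrix can be read as a relevance score for each field. A minimal sketch of turning such scores into a ranking follows. It uses toy 2-dimensional vectors in place of `model.encode` output, and `cosine_similarity` and `rank_candidates` are hypothetical helper names, not part of the library.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_candidates(query: Sequence[float],
                    candidates: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (candidate index, score) pairs, best match first."""
    scored = [(i, cosine_similarity(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for the question and three field embeddings.
query_vec = [1.0, 0.0]
field_vecs = [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]]
ranking = rank_candidates(query_vec, field_vecs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

With real embeddings, the same ranking could be read off the first row of `model.similarity(embeddings, embeddings)` after dropping the self-similarity entry.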

Training Details

Training Dataset

Unnamed Dataset

  • Size: 92,692 training samples
  • Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>sentence_2</code>
  • Approximate statistics based on the first 1000 samples:
    |         | sentence_0 | sentence_1 | sentence_2 |
    |:--------|:-----------|:-----------|:-----------|
    | type    | string     | string     | string     |
    | details | <ul><li>min: 9 tokens</li><li>mean: 18.14 tokens</li><li>max: 37 tokens</li></ul> | <ul><li>min: 5 tokens</li><li>mean: 11.06 tokens</li><li>max: 27 tokens</li></ul> | <ul><li>min: 3 tokens</li><li>mean: 10.69 tokens</li><li>max: 24 tokens</li></ul> |
  • Samples:
    | sentence_0 | sentence_1 | sentence_2 |
    |:-----------|:-----------|:-----------|
    | <code>Doctype: Employee<br>Question: List employees with designation "Senior Manager".</code> | <code>designation: Designation of the employee.</code> | <code>date_of_joining: Date when the employee joined.</code> |
    | <code>Doctype: Company<br>Question: Give me the tax ID, company name, and establishment date for every business.</code> | <code>company_name: The official name of the company.</code> | <code>company_description: Description of the company.</code> |
    | <code>Doctype: Item<br>Question: Which items have product variants and on what basis?</code> | <code>variant_based_on: The basis for item variants.</code> | <code>customer_items: Customer-specific item details.</code> |
  • Loss: <code>TripletLoss</code> with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.COSINE",
        "triplet_margin": 0.3
    }
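
With a cosine distance metric and a margin of 0.3, the triplet objective asks the anchor to be closer to the positive than to the negative by at least 0.3 in cosine distance; triplets that already satisfy this contribute zero loss. Judging from the samples, `sentence_0` presumably plays the anchor, `sentence_1` the positive, and `sentence_2` the negative. The sketch below computes the per-triplet value `max(0, d(a, p) - d(a, n) + margin)` with `d = 1 - cosine similarity` on toy vectors. The helper names are illustrative, and the library version works on batches and averages the result.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity, for non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_loss(anchor: Sequence[float], positive: Sequence[float],
                 negative: Sequence[float], margin: float = 0.3) -> float:
    """Hinge on the gap between anchor-positive and anchor-negative distance."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)


anchor = [1.0, 0.0]
matching_field = [1.0, 0.0]      # distance 0 from the anchor
unrelated_field = [0.0, 1.0]     # distance 1 from the anchor

easy = triplet_loss(anchor, matching_field, unrelated_field)
hard = triplet_loss(anchor, unrelated_field, matching_field)
print(easy)  # 0.0
print(hard)  # 1.3
```

The first triplet is already separated by more than the margin, so it yields no gradient; the second, with the roles swapped, is penalized by the full gap plus the margin.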
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 15
  • multi_dataset_batch_sampler: round_robin

All Hyperparameters

<details><summary>Click to expand</summary>

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

</details>
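
With `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05`, the learning rate decays in a straight line from 5e-05 at step 0 to 0 at the final step. A small sketch of that schedule follows. The value `total_steps = 86_910` is an assumption (15 epochs of ceil(92,692 / 16) steps on a single device), and `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05) -> float:
    """Linear decay from base_lr to 0 with no warmup phase."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)


# Assumed total number of optimizer steps; not stated in the card.
total_steps = 86_910

print(linear_lr(0, total_steps))                 # 5e-05
print(linear_lr(total_steps // 2, total_steps))  # 2.5e-05
print(linear_lr(total_steps, total_steps))       # 0.0
```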

Training Logs

<details><summary>Click to expand</summary>

Epoch Step Training Loss
0.0863 500 0.0392
0.1726 1000 0.0294
0.2589 1500 0.0249
0.3452 2000 0.0158
0.4315 2500 0.0124
0.5178 3000 0.0102
0.6041 3500 0.0083
0.6904 4000 0.0064
0.7767 4500 0.0067
0.8630 5000 0.0057
0.9493 5500 0.0058
1.0356 6000 0.0049
1.1219 6500 0.0041
1.2081 7000 0.0036
1.2944 7500 0.0044
1.3807 8000 0.0038
1.4670 8500 0.0032
1.5533 9000 0.0035
1.6396 9500 0.0037
1.7259 10000 0.0034
1.8122 10500 0.003
1.8985 11000 0.0027
1.9848 11500 0.0028
2.0711 12000 0.0023
2.1574 12500 0.0021
2.2437 13000 0.0021
2.3300 13500 0.0021
2.4163 14000 0.0021
2.5026 14500 0.0022
2.5889 15000 0.002
2.6752 15500 0.0021
2.7615 16000 0.002
2.8478 16500 0.0019
2.9341 17000 0.0019
3.0204 17500 0.0016
3.1067 18000 0.0011
3.1930 18500 0.0012
3.2793 19000 0.0016
3.3656 19500 0.0015
3.4518 20000 0.0013
3.5381 20500 0.0013
3.6244 21000 0.0008
3.7107 21500 0.0013
3.7970 22000 0.0012
3.8833 22500 0.0017
3.9696 23000 0.0011
4.0559 23500 0.0006
4.1422 24000 0.0007
4.2285 24500 0.001
4.3148 25000 0.0009
4.4011 25500 0.001
4.4874 26000 0.0006
4.5737 26500 0.0009
4.6600 27000 0.0008
4.7463 27500 0.0008
4.8326 28000 0.001
4.9189 28500 0.0008
5.0052 29000 0.0008
5.0915 29500 0.0007
5.1778 30000 0.0007
5.2641 30500 0.0006
5.3504 31000 0.0005
5.4367 31500 0.0006
5.5230 32000 0.0007
5.6093 32500 0.0006
5.6955 33000 0.0005
5.7818 33500 0.0006
5.8681 34000 0.0007
5.9544 34500 0.0007
6.0407 35000 0.0006
6.1270 35500 0.0004
6.2133 36000 0.0005
6.2996 36500 0.0003
6.3859 37000 0.0004
6.4722 37500 0.0003
6.5585 38000 0.0005
6.6448 38500 0.0005
6.7311 39000 0.0003
6.8174 39500 0.0005
6.9037 40000 0.0004
6.9900 40500 0.0006
7.0763 41000 0.0004
7.1626 41500 0.0003
7.2489 42000 0.0004
7.3352 42500 0.0003
7.4215 43000 0.0005
7.5078 43500 0.0005
7.5941 44000 0.0002
7.6804 44500 0.0002
7.7667 45000 0.0004
7.8530 45500 0.0004
7.9392 46000 0.0003
8.0255 46500 0.0003
8.1118 47000 0.0003
8.1981 47500 0.0003
8.2844 48000 0.0002
8.3707 48500 0.0002
8.4570 49000 0.0004
8.5433 49500 0.0002
8.6296 50000 0.0002
8.7159 50500 0.0002
8.8022 51000 0.0002
8.8885 51500 0.0002
8.9748 52000 0.0002
9.0611 52500 0.0001
9.1474 53000 0.0001
9.2337 53500 0.0002
9.3200 54000 0.0002
9.4063 54500 0.0002
9.4926 55000 0.0001
9.5789 55500 0.0001
9.6652 56000 0.0002
9.7515 56500 0.0001
9.8378 57000 0.0003
9.9241 57500 0.0001
10.0104 58000 0.0001
10.0967 58500 0.0001
10.1829 59000 0.0001
10.2692 59500 0.0001
10.3555 60000 0.0001
10.4418 60500 0.0002
10.5281 61000 0.0001
10.6144 61500 0.0002
10.7007 62000 0.0002
10.7870 62500 0.0002
10.8733 63000 0.0001
10.9596 63500 0.0001
11.0459 64000 0.0002
11.1322 64500 0.0001
11.2185 65000 0.0001
11.3048 65500 0.0001
11.3911 66000 0.0001
11.4774 66500 0.0001
11.5637 67000 0.0001
11.6500 67500 0.0001
11.7363 68000 0.0001
11.8226 68500 0.0
11.9089 69000 0.0001
11.9952 69500 0.0
12.0815 70000 0.0
12.1678 70500 0.0
12.2541 71000 0.0001
12.3404 71500 0.0001
12.4266 72000 0.0001
12.5129 72500 0.0001
12.5992 73000 0.0
12.6855 73500 0.0001
12.7718 74000 0.0001
12.8581 74500 0.0001
12.9444 75000 0.0001
13.0307 75500 0.0
13.1170 76000 0.0001
13.2033 76500 0.0001
13.2896 77000 0.0
13.3759 77500 0.0
13.4622 78000 0.0
13.5485 78500 0.0001
13.6348 79000 0.0001
13.7211 79500 0.0
13.8074 80000 0.0
13.8937 80500 0.0001
13.9800 81000 0.0
14.0663 81500 0.0
14.1526 82000 0.0
14.2389 82500 0.0
14.3252 83000 0.0
14.4115 83500 0.0001
14.4978 84000 0.0
14.5841 84500 0.0
14.6703 85000 0.0001
14.7566 85500 0.0001
14.8429 86000 0.0
14.9292 86500 0.0

</details>
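
The Epoch column in the log above can be cross-checked against the dataset size and batch size. The hyperparameters list `gradient_accumulation_steps: 1` and `dataloader_drop_last: False`; assuming in addition that training ran on a single device (not stated in the card), one epoch is ceil(92,692 / 16) optimizer steps, and the logged epoch is the step divided by that number. A short sketch of the arithmetic:

```python
import math

num_samples = 92_692   # training samples, from the dataset section
batch_size = 16        # per_device_train_batch_size
num_epochs = 15        # num_train_epochs

# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs


def epoch_at(step: int) -> float:
    """Fractional epoch reached after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch)   # 5794
print(total_steps)       # 86910
print(epoch_at(500))     # 0.0863
print(epoch_at(86_500))  # 14.9292
```

The computed values match the first and last rows of the log, which is consistent with the single-device assumption.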

Framework Versions

  • Python: 3.12.11
  • Sentence Transformers: 5.1.0
  • Transformers: 4.56.0
  • PyTorch: 2.8.0+cu126
  • Accelerate: 1.10.1
  • Datasets: 4.0.0
  • Tokenizers: 0.22.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

hyrinmansoor/text2frappe-s2-sbert

Author: hyrinmansoor

Tags: sentence-similarity, sentence-transformers

Created: 2025-05-31 06:47:56+00:00

Updated: 2025-09-09 13:50:28+00:00

Files (14)

.gitattributes
1_Pooling/config.json
README.md
config.json
config_sentence_transformers.json
eval/triplet_evaluation_results.csv
model.safetensors
modules.json
onnx/model.onnx (ONNX)
sentence_bert_config.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json
vocab.txt