Task: question-answering
Backend: sagemaker-training
Backend args: {'instance_type': 'ml.m5.2xlarge', 'supported_instructions': 'avx512'}
Number of evaluation samples: full dataset
Fixed parameters:
- Dataset: [{'path': 'squad', 'eval_split': 'validation', 'data_keys': {'question': 'question', 'context': 'context'}, 'ref_keys': ['answers'], 'name': None, 'calibration_split': None}]
- Model name or path: distilbert-base-uncased-distilled-squad
- From transformers: True
- Quantization approach: dynamic
Benchmarked parameters:
- Framework: onnxruntime, pytorch
- Operators to quantize: ['Add', 'MatMul'], ['Add']
- Node exclusion: [], ['layernorm', 'gelu', 'residual', 'gather', 'softmax']
- Per channel: False, True
- Framework args: {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4}, {}
- Reduce range: True, False
- Apply quantization: True, False
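The benchmarked parameters form a grid. As a rough sketch (the dictionary keys and the `quantized_configs` helper are illustrative names, not taken from the benchmark's own code), the four quantization settings can be crossed to enumerate the quantized ONNX Runtime runs. Adding the two unquantized baselines, one for ONNX Runtime and one for PyTorch, gives the number of rows in each results table.

```python
from itertools import product

# Benchmarked values for the quantized ONNX Runtime runs, copied from the parameter list.
operators_to_quantize = [["Add", "MatMul"], ["Add"]]
node_exclusion = [["layernorm", "gelu", "residual", "gather", "softmax"], []]
per_channel = [False, True]
reduce_range = [False, True]


def quantized_configs():
    """Return one dict per combination of the four quantization settings."""
    keys = ("operators_to_quantize", "node_exclusion", "per_channel", "reduce_range")
    grid = product(operators_to_quantize, node_exclusion, per_channel, reduce_range)
    return [dict(zip(keys, combo)) for combo in grid]


configs = quantized_configs()
# Two unquantized baselines (ONNX Runtime and PyTorch) complete the sweep.
total_runs = len(configs) + 2
print(len(configs), total_runs)
```

With the iteration order chosen here, the combinations come out in the same order as the quantized rows of the tables.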
Evaluation
Non-time metrics
| Framework | Operators to quantize | Node exclusion | Per channel | Framework args | Reduce range | Apply quantization | Exact match | F1 |
|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 78.884 | 86.690 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 76.764 | 85.053 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 69.622 | 79.914 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.435 | 5.887 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.165 | 85.973 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 76.764 | 85.053 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 69.622 | 79.914 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.435 | 5.887 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.165 | 85.973 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 78.884 | 86.690 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 78.884 | 86.690 |
| pytorch | None | None | None | {} | None | None | 78.884 | 86.690 |
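To make the accuracy cost of each setting easier to read, the sketch below computes the drop in exact match and F1 relative to the unquantized baseline for the ['Add', 'MatMul'] runs. The numbers are copied from the non-time metrics; the names `BASELINE`, `MATMUL_RESULTS`, `drops` and `best_setting` are hypothetical helpers introduced only for this illustration.

```python
# (per_channel, reduce_range) -> (exact match, F1) for operators ['Add', 'MatMul'].
# Values copied from the non-time metrics; node exclusion does not change them.
BASELINE = (78.884, 86.690)
MATMUL_RESULTS = {
    (False, False): (76.764, 85.053),
    (False, True): (69.622, 79.914),
    (True, False): (0.435, 5.887),
    (True, True): (78.165, 85.973),
}


def drops(results, baseline):
    """Return the exact-match and F1 loss of each setting relative to the baseline."""
    return {
        key: (round(baseline[0] - em, 3), round(baseline[1] - f1, 3))
        for key, (em, f1) in results.items()
    }


def best_setting(results):
    """Pick the (per_channel, reduce_range) pair with the highest F1."""
    return max(results, key=lambda key: results[key][1])


for key, (em_drop, f1_drop) in drops(MATMUL_RESULTS, BASELINE).items():
    print(key, em_drop, f1_drop)
print("best:", best_setting(MATMUL_RESULTS))
```

Among the ['Add', 'MatMul'] runs, per channel = True with reduce range = True stays within about one point of the baseline, while per channel = True with reduce range = False collapses to near-zero exact match. Quantizing only 'Add' leaves both metrics identical to the baseline.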
Time metrics
Time benchmarks were run for 15 seconds per config.
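The benchmark harness itself is not shown here, so the following is only a minimal sketch of how a fixed-duration timing loop could produce a mean latency and a throughput figure. The `benchmark` function and the stand-in workload are assumptions for illustration. One observation in its favor: the reported throughputs are multiples of 1/15 (for example 70.13 ≈ 1052 / 15), which is consistent with counting completed runs inside a 15-second window.

```python
import time


def benchmark(fn, duration_s=15.0):
    """Call fn repeatedly for about duration_s seconds.

    Returns (mean latency in ms, throughput in calls per second).
    """
    latencies = []
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        t0 = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - t0)
    mean_latency_ms = 1000.0 * sum(latencies) / len(latencies)
    throughput = len(latencies) / duration_s
    return mean_latency_ms, throughput


# Stand-in workload; a real run would call the model on a batch of 1 with a fixed input length.
latency_ms, throughput = benchmark(lambda: sum(range(1000)), duration_s=0.2)
print(f"{latency_ms:.3f} ms, {throughput:.1f} /s")
```

Because throughput here is the run count divided by the window length, it need not equal exactly 1000 divided by the mean latency; loop overhead accounts for the small gap.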
Below are the time metrics for batch size = 1, input length = 32.
| Framework | Operators to quantize | Node exclusion | Per channel | Framework args | Reduce range | Apply quantization | Mean latency (ms) | Throughput (/s) |
|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 14.26 | 70.13 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.08 | 99.20 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.60 | 94.33 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.88 | 91.93 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.84 | 92.27 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.34 | 96.73 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.41 | 96.07 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.96 | 91.27 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.69 | 93.53 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.43 | 69.33 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.52 | 68.87 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.35 | 69.73 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.50 | 69.00 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.20 | 70.47 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.24 | 70.27 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.58 | 68.67 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.73 | 67.87 |
| pytorch | None | None | None | {} | None | None | 31.49 | 31.80 |
Below are the time metrics for batch size = 1, input length = 64.
| Framework | Operators to quantize | Node exclusion | Per channel | Framework args | Reduce range | Apply quantization | Mean latency (ms) | Throughput (/s) |
|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 24.83 | 40.33 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.49 | 54.13 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.87 | 53.00 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.17 | 52.20 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.92 | 52.87 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.13 | 52.33 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.95 | 52.80 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.08 | 52.47 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.14 | 52.27 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.83 | 40.33 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.84 | 40.27 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.66 | 40.60 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.76 | 40.40 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 25.07 | 39.93 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 25.27 | 39.60 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.76 | 40.40 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.70 | 40.53 |
| pytorch | None | None | None | {} | None | None | 41.26 | 24.27 |
Below are the time metrics for batch size = 1, input length = 128.
| Framework | Operators to quantize | Node exclusion | Per channel | Framework args | Reduce range | Apply quantization | Mean latency (ms) | Throughput (/s) |
|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 46.89 | 21.33 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 34.84 | 28.73 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.88 | 27.93 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 36.92 | 27.13 |
| onnxruntime | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 36.25 | 27.60 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 36.17 | 27.67 |
| onnxruntime | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.59 | 28.13 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 37.36 | 26.80 |
| onnxruntime | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.97 | 27.87 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 46.94 | 21.33 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.19 | 21.20 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.05 | 21.27 |
| onnxruntime | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 46.79 | 21.40 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 46.87 | 21.40 |
| onnxruntime | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.04 | 21.27 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.08 | 21.27 |
| onnxruntime | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.05 | 21.27 |
| pytorch | None | None | None | {} | None | None | 54.61 | 18.33 |
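The latency figures can be condensed into speedup ratios. The sketch below uses three rows per input length, copied from the time metrics: the PyTorch baseline, the unquantized ONNX Runtime baseline, and one quantized run (operators ['Add', 'MatMul'], node exclusion set, per channel False, reduce range False). The `LATENCY_MS` dictionary and `speedup` helper are hypothetical names used only for this illustration.

```python
# Mean latency in ms for batch size 1, copied from the time metrics.
# "quantized" is the ['Add', 'MatMul'] run with node exclusion,
# per channel = False and reduce range = False.
LATENCY_MS = {
    32: {"pytorch": 31.49, "onnxruntime": 14.26, "quantized": 10.08},
    64: {"pytorch": 41.26, "onnxruntime": 24.83, "quantized": 18.49},
    128: {"pytorch": 54.61, "onnxruntime": 46.89, "quantized": 34.84},
}


def speedup(length, baseline, candidate):
    """Ratio of baseline latency to candidate latency at a given input length."""
    return LATENCY_MS[length][baseline] / LATENCY_MS[length][candidate]


for length in LATENCY_MS:
    vs_ort = speedup(length, "onnxruntime", "quantized")
    vs_pt = speedup(length, "pytorch", "quantized")
    print(f"input length {length}: {vs_ort:.2f}x vs ONNX Runtime, {vs_pt:.2f}x vs PyTorch")
```

For this particular run the gain over unquantized ONNX Runtime stays roughly between 1.3x and 1.4x across the three input lengths, while the gain over PyTorch shrinks as the input gets longer. The runs that quantize only 'Add' show latencies close to the unquantized ONNX Runtime baseline.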
fxmarty/20220911-h13m58s53_squad_qa_distilbert_dynamic
Author: fxmarty
Task: question-answering
Created: 2022-09-11 22:20:48+00:00
Updated: 2022-09-11 22:21:43+00:00
Files on Hugging Face (89):
.gitattributes
20220911-h14m23s56_0/model.onnx
20220911-h14m23s56_0/ort_config.json
20220911-h14m23s56_0/quantized_model.onnx
20220911-h14m23s56_0/results.json
20220911-h14m53s50_1/model.onnx
20220911-h14m53s50_1/ort_config.json
20220911-h14m53s50_1/quantized_model.onnx
20220911-h14m53s50_1/results.json
20220911-h15m19s07_2/model.onnx
20220911-h15m19s07_2/ort_config.json
20220911-h15m19s07_2/quantized_model.onnx
20220911-h15m19s07_2/results.json
20220911-h15m49s21_3/model.onnx
20220911-h15m49s21_3/ort_config.json
20220911-h15m49s21_3/quantized_model.onnx
20220911-h15m49s21_3/results.json
20220911-h16m14s23_4/model.onnx
20220911-h16m14s23_4/ort_config.json
20220911-h16m14s23_4/quantized_model.onnx
20220911-h16m14s23_4/results.json
20220911-h16m39s55_5/model.onnx
20220911-h16m39s55_5/ort_config.json
20220911-h16m39s55_5/quantized_model.onnx
20220911-h16m39s55_5/results.json
20220911-h17m10s11_6/model.onnx
20220911-h17m10s11_6/ort_config.json
20220911-h17m10s11_6/quantized_model.onnx
20220911-h17m10s11_6/results.json
20220911-h17m40s23_7/model.onnx
20220911-h17m40s23_7/ort_config.json
20220911-h17m40s23_7/quantized_model.onnx
20220911-h17m40s23_7/results.json
20220911-h18m05s42_8/model.onnx
20220911-h18m05s42_8/ort_config.json
20220911-h18m05s42_8/quantized_model.onnx
20220911-h18m05s42_8/results.json
20220911-h18m35s59_9/model.onnx
20220911-h18m35s59_9/ort_config.json
20220911-h18m35s59_9/quantized_model.onnx
20220911-h18m35s59_9/results.json
20220911-h19m06s15_10/model.onnx
20220911-h19m06s15_10/ort_config.json
20220911-h19m06s15_10/quantized_model.onnx
20220911-h19m06s15_10/results.json
20220911-h19m36s15_11/model.onnx
20220911-h19m36s15_11/ort_config.json
20220911-h19m36s15_11/quantized_model.onnx
20220911-h19m36s15_11/results.json
20220911-h20m06s26_12/model.onnx
20220911-h20m06s26_12/ort_config.json
20220911-h20m06s26_12/quantized_model.onnx
20220911-h20m06s26_12/results.json
20220911-h20m32s12_13/model.onnx
20220911-h20m32s12_13/ort_config.json
20220911-h20m32s12_13/quantized_model.onnx
20220911-h20m32s12_13/results.json
20220911-h20m57s50_14/model.onnx
20220911-h20m57s50_14/ort_config.json
20220911-h20m57s50_14/quantized_model.onnx
20220911-h20m57s50_14/results.json
20220911-h21m23s01_15/model.onnx
20220911-h21m23s01_15/ort_config.json
20220911-h21m23s01_15/quantized_model.onnx
20220911-h21m23s01_15/results.json
20220911-h21m53s17_16/model.onnx
20220911-h21m53s17_16/results.json
20220911-h22m20s47_17/results.json
README.md
runs.json
tensorboard/1662934853.040312/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.1
tensorboard/1662934853.0417204/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.2
tensorboard/1662934853.042942/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.3
tensorboard/1662934853.0442307/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.4
tensorboard/1662934853.0453482/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.5
tensorboard/1662934853.04647/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.6
tensorboard/1662934853.0478818/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.7
tensorboard/1662934853.0491567/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.8
tensorboard/1662934853.0507572/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.9
tensorboard/1662934853.0518975/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.10
tensorboard/1662934853.0530026/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.11
tensorboard/1662934853.0541465/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.12
tensorboard/1662934853.0553083/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.13
tensorboard/1662934853.056435/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.14
tensorboard/1662934853.0575647/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.15
tensorboard/1662934853.058674/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.16
tensorboard/1662934853.0610743/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.17
tensorboard/1662934853.062237/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.18
tensorboard/events.out.tfevents.1662934853.ip-10-0-142-213.ec2.internal.1.0