ONNX Model Library
README

Task: token-classification
Backend: sagemaker-training
Backend args: {'instance_type': 'ml.m5.2xlarge', 'supported_instructions': 'avx512'}
Number of evaluation samples: entire dataset

Fixed parameters:

  • Dataset: [{'path': 'conll2003', 'eval_split': 'validation', 'data_keys': {'primary': 'tokens'}, 'ref_keys': ['ner_tags'], 'name': None, 'calibration_split': 'train'}]
  • name_or_path: elastic/distilbert-base-uncased-finetuned-conll03-english
  • from_transformers: True
  • Operators to quantize: ['Add', 'MatMul']
  • Calibration:
    • Method: percentile
    • Number of calibration samples: 128
    • Calibration histogram percentile: 99.999
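
The percentile calibration setting above can be illustrated with a small sketch. It is not the ONNX Runtime implementation, which works on histograms collected from an instrumented model; it only shows the idea using the standard library. The function names and the toy activation data are hypothetical.

```python
import math
import random


def percentile_threshold(values, percentile=99.999):
    """Nearest-rank percentile of the absolute values."""
    magnitudes = sorted(abs(v) for v in values)
    rank = math.ceil(percentile / 100.0 * len(magnitudes))
    return magnitudes[max(rank, 1) - 1]


def symmetric_int8_scale(threshold):
    """Scale that maps [-threshold, threshold] onto the int8 range [-127, 127]."""
    return threshold / 127.0


def quantize(value, scale):
    """Quantize one float to int8, clipping anything beyond the threshold."""
    return max(-127, min(127, round(value / scale)))


# Toy calibration data: Gaussian activations plus one large outlier.
random.seed(0)
activations = [random.gauss(0.0, 1.0) for _ in range(200_000)]
activations.append(50.0)

max_threshold = max(abs(v) for v in activations)            # min/max style range
pct_threshold = percentile_threshold(activations, 99.999)   # percentile style range
scale = symmetric_int8_scale(pct_threshold)

print(f"max threshold: {max_threshold:.2f}")
print(f"99.999th percentile threshold: {pct_threshold:.2f}")
print(f"outlier 50.0 quantizes to {quantize(50.0, scale)} (clipped)")
```

With the maximum as the threshold, the single outlier would stretch the int8 range to ±50 and leave few quantization levels for typical values; the percentile threshold clips the outlier instead.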

Benchmarked parameters:

  • Framework: onnxruntime, pytorch
  • Quantization approach: dynamic, static
  • Node exclusion: [], ['layernorm', 'gelu', 'residual', 'gather', 'softmax']
  • Per channel: False, True
  • Framework args: {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4}, {}
  • reduce_range: True, False
  • Apply quantization: True, False
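
As a rough illustration of how these parameters combine, the sketch below enumerates the grid: 2 quantization approaches × 2 node-exclusion lists × 2 per-channel settings × 2 reduce_range settings gives 16 quantized ONNX Runtime configurations, and adding the unquantized ONNX Runtime and PyTorch baselines gives 18 runs. The dictionary keys are illustrative names chosen for this sketch, not a confirmed schema of the benchmark tool.

```python
from itertools import product

quantization_approaches = ["dynamic", "static"]
node_exclusions = [["layernorm", "gelu", "residual", "gather", "softmax"], []]
per_channel_options = [False, True]
reduce_range_options = [False, True]

# Every quantized ONNX Runtime configuration (keys are hypothetical names).
quantized_configs = [
    {
        "framework": "onnxruntime",
        "quantization_approach": approach,
        "node_exclusion": excluded,
        "per_channel": per_channel,
        "reduce_range": reduce_range,
        "apply_quantization": True,
    }
    for approach, excluded, per_channel, reduce_range in product(
        quantization_approaches,
        node_exclusions,
        per_channel_options,
        reduce_range_options,
    )
]

# The two unquantized baselines.
onnxruntime_baseline = {"framework": "onnxruntime", "apply_quantization": False}
pytorch_baseline = {"framework": "pytorch"}

all_runs = [onnxruntime_baseline] + quantized_configs + [pytorch_baseline]
print(len(quantized_configs), len(all_runs))
```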

Evaluation

Non-time metrics

Framework | Quantization approach | Node exclusion | Per channel | Framework args | reduce_range | Apply quantization | Overall precision | Overall recall | Overall F1 | Overall accuracy
onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 0.936 | 0.944 | 0.940 | 0.988
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.935 | 0.943 | 0.939 | 0.988
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.926 | 0.931 | 0.929 | 0.987
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.000 | 0.000 | 0.000 | 0.833
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.934 | 0.944 | 0.939 | 0.988
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.935 | 0.943 | 0.939 | 0.988
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.926 | 0.931 | 0.929 | 0.987
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.000 | 0.000 | 0.000 | 0.833
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.934 | 0.944 | 0.939 | 0.988
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.913 | 0.792 | 0.848 | 0.969
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.913 | 0.792 | 0.848 | 0.969
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.000 | 0.000 | 0.000 | 0.833
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.896 | 0.783 | 0.836 | 0.968
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.925 | 0.844 | 0.883 | 0.975
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.925 | 0.844 | 0.883 | 0.975
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.045 | 0.004 | 0.008 | 0.825
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.922 | 0.839 | 0.879 | 0.975
pytorch | None | None | None | {} | None | None | 0.936 | 0.944 | 0.940 | 0.988
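
The F1 column can be cross-checked against the precision and recall columns, since F1 is the harmonic mean of the two. The sketch below recomputes it for a few rows copied from the table above; the row labels are shorthand invented for this example, and small differences are possible because the inputs are already rounded to three decimals.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
rows = {
    "onnxruntime, unquantized": (0.936, 0.944, 0.940),
    "dynamic, per channel False, reduce_range False": (0.935, 0.943, 0.939),
    "static, no exclusions, per channel False": (0.925, 0.844, 0.883),
    "dynamic, per channel True, reduce_range False": (0.000, 0.000, 0.000),
}

for label, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{label}: computed F1 = {computed:.3f}, reported F1 = {reported:.3f}")
```

The rows with an F1 of 0.000 still report an accuracy near 0.83. One reading, which is an assumption rather than something the table states, is that those models predict almost no entities while most tokens carry the non-entity label.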

Time metrics

Time benchmarks were run for 15 seconds per configuration.

Below are the time metrics for batch size = 1, input length = 32.

Framework | Quantization approach | Node exclusion | Per channel | Framework args | reduce_range | Apply quantization | Mean latency (ms) | Throughput (/s)
onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 14.22 | 70.33
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.22 | 97.87
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.16 | 98.47
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.52 | 95.07
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.70 | 93.47
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.22 | 97.87
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.24 | 97.67
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.36 | 96.53
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.50 | 95.27
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.98 | 91.07
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 11.31 | 88.47
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 11.23 | 89.07
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 11.48 | 87.20
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 13.54 | 73.87
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 13.74 | 72.80
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 13.80 | 72.53
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.08 | 71.07
pytorch | None | None | None | {} | None | None | 31.23 | 32.07
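
At batch size 1 the two timing columns are close to reciprocal: throughput in samples per second is roughly 1000 divided by the mean latency in milliseconds. The sketch below checks this for three rows copied from the table above. The small gaps would be consistent with throughput being counted as completed iterations over the 15-second window, but that is an assumption about the benchmark, not something the page states.

```python
def throughput_from_latency(mean_latency_ms):
    """Approximate samples per second for sequential, batch-size-1 inference."""
    return 1000.0 / mean_latency_ms


# (mean latency in ms, reported throughput in /s) for input length 32.
rows = {
    "pytorch": (31.23, 32.07),
    "onnxruntime, unquantized": (14.22, 70.33),
    "onnxruntime, dynamic, fastest row": (10.16, 98.47),
}

for label, (latency_ms, reported) in rows.items():
    estimated = throughput_from_latency(latency_ms)
    print(f"{label}: estimated {estimated:.2f}/s, reported {reported:.2f}/s")
```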

Below are the time metrics for batch size = 1, input length = 64.

Framework | Quantization approach | Node exclusion | Per channel | Framework args | reduce_range | Apply quantization | Mean latency (ms) | Throughput (/s)
onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 24.52 | 40.80
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.47 | 54.20
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.53 | 54.00
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.85 | 53.07
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.14 | 52.27
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.50 | 54.07
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.50 | 54.07
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.69 | 53.53
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.46 | 51.40
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 20.42 | 49.00
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.91 | 50.27
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 20.20 | 49.53
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 20.74 | 48.27
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.91 | 40.20
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.35 | 41.13
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.99 | 40.07
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.95 | 40.13
pytorch | None | None | None | {} | None | None | 41.31 | 24.27

Below are the time metrics for batch size = 1, input length = 128.

Framework | Quantization approach | Node exclusion | Per channel | Framework args | reduce_range | Apply quantization | Mean latency (ms) | Throughput (/s)
onnxruntime | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 46.79 | 21.40
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.84 | 27.93
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.07 | 28.53
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.71 | 28.00
onnxruntime | dynamic | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.91 | 27.87
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.42 | 28.27
onnxruntime | dynamic | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.22 | 28.40
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.51 | 28.20
onnxruntime | dynamic | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.90 | 27.87
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 39.88 | 25.13
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 39.27 | 25.47
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 39.37 | 25.40
onnxruntime | static | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 39.16 | 25.60
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 44.43 | 22.53
onnxruntime | static | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 46.13 | 21.73
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 45.48 | 22.00
onnxruntime | static | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 45.82 | 21.87
pytorch | None | None | None | {} | None | None | 53.93 | 18.60
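
To compare the three input lengths side by side, the sketch below computes speedups as ratios of mean latencies copied from the tables above. It uses the PyTorch row, the unquantized ONNX Runtime row, and the first dynamic-quantization row of each table; the variable names are illustrative.

```python
# Mean latency in ms, keyed by input length, copied from the three tables.
latency_ms = {
    "pytorch": {32: 31.23, 64: 41.31, 128: 53.93},
    "onnxruntime": {32: 14.22, 64: 24.52, 128: 46.79},
    "onnxruntime_dynamic": {32: 10.22, 64: 18.47, 128: 35.84},
}


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


ort_vs_pytorch = {}
dynamic_vs_ort = {}
for length in (32, 64, 128):
    ort_vs_pytorch[length] = speedup(
        latency_ms["pytorch"][length], latency_ms["onnxruntime"][length]
    )
    dynamic_vs_ort[length] = speedup(
        latency_ms["onnxruntime"][length], latency_ms["onnxruntime_dynamic"][length]
    )
    print(
        f"length {length}: onnxruntime vs pytorch {ort_vs_pytorch[length]:.2f}x, "
        f"dynamic quantization vs onnxruntime {dynamic_vs_ort[length]:.2f}x"
    )
```

On these numbers, both ratios shrink as the input length grows: the ONNX Runtime advantage over PyTorch falls from about 2.2x at length 32 to about 1.15x at length 128, while the gain from dynamic quantization stays near 1.3x to 1.4x.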

fxmarty/20220911-h13m58s51_conll2003_distilbert_quantization

Author: fxmarty

token-classification
Created: 2022-09-11 15:14:40+00:00

Updated: 2022-09-11 15:15:44+00:00

Files (105)

.gitattributes
20220911-h14m00s39_0/model.onnx ONNX
20220911-h14m00s39_0/ort_config.json
20220911-h14m00s39_0/quantized_model.onnx ONNX
20220911-h14m00s39_0/results.json
20220911-h14m02s15_1/model.onnx ONNX
20220911-h14m02s15_1/ort_config.json
20220911-h14m02s15_1/quantized_model.onnx ONNX
20220911-h14m02s15_1/results.json
20220911-h14m09s38_2/augmented_model.onnx ONNX
20220911-h14m09s38_2/calibration_histograms.npy
20220911-h14m09s38_2/model.onnx ONNX
20220911-h14m09s38_2/ort_config.json
20220911-h14m09s38_2/quantized_model.onnx ONNX
20220911-h14m09s38_2/results.json
20220911-h14m16s58_3/augmented_model.onnx ONNX
20220911-h14m16s58_3/calibration_histograms.npy
20220911-h14m16s58_3/model.onnx ONNX
20220911-h14m16s58_3/ort_config.json
20220911-h14m16s58_3/quantized_model.onnx ONNX
20220911-h14m16s58_3/results.json
20220911-h14m24s25_4/augmented_model.onnx ONNX
20220911-h14m24s25_4/calibration_histograms.npy
20220911-h14m24s25_4/model.onnx ONNX
20220911-h14m24s25_4/ort_config.json
20220911-h14m24s25_4/quantized_model.onnx ONNX
20220911-h14m24s25_4/results.json
20220911-h14m25s53_5/model.onnx ONNX
20220911-h14m25s53_5/ort_config.json
20220911-h14m25s53_5/quantized_model.onnx ONNX
20220911-h14m25s53_5/results.json
20220911-h14m33s27_6/augmented_model.onnx ONNX
20220911-h14m33s27_6/calibration_histograms.npy
20220911-h14m33s27_6/model.onnx ONNX
20220911-h14m33s27_6/ort_config.json
20220911-h14m33s27_6/quantized_model.onnx ONNX
20220911-h14m33s27_6/results.json
20220911-h14m40s51_7/augmented_model.onnx ONNX
20220911-h14m40s51_7/calibration_histograms.npy
20220911-h14m40s51_7/model.onnx ONNX
20220911-h14m40s51_7/ort_config.json
20220911-h14m40s51_7/quantized_model.onnx ONNX
20220911-h14m40s51_7/results.json
20220911-h14m42s27_8/model.onnx ONNX
20220911-h14m42s27_8/ort_config.json
20220911-h14m42s27_8/quantized_model.onnx ONNX
20220911-h14m42s27_8/results.json
20220911-h14m44s03_9/model.onnx ONNX
20220911-h14m44s03_9/ort_config.json
20220911-h14m44s03_9/quantized_model.onnx ONNX
20220911-h14m44s03_9/results.json
20220911-h14m45s31_10/model.onnx ONNX
20220911-h14m45s31_10/ort_config.json
20220911-h14m45s31_10/quantized_model.onnx ONNX
20220911-h14m45s31_10/results.json
20220911-h14m46s59_11/model.onnx ONNX
20220911-h14m46s59_11/ort_config.json
20220911-h14m46s59_11/quantized_model.onnx ONNX
20220911-h14m46s59_11/results.json
20220911-h14m48s28_12/model.onnx ONNX
20220911-h14m48s28_12/ort_config.json
20220911-h14m48s28_12/quantized_model.onnx ONNX
20220911-h14m48s28_12/results.json
20220911-h14m55s55_13/augmented_model.onnx ONNX
20220911-h14m55s55_13/calibration_histograms.npy
20220911-h14m55s55_13/model.onnx ONNX
20220911-h14m55s55_13/ort_config.json
20220911-h14m55s55_13/quantized_model.onnx ONNX
20220911-h14m55s55_13/results.json
20220911-h15m03s26_14/augmented_model.onnx ONNX
20220911-h15m03s26_14/calibration_histograms.npy
20220911-h15m03s26_14/model.onnx ONNX
20220911-h15m03s26_14/ort_config.json
20220911-h15m03s26_14/quantized_model.onnx ONNX
20220911-h15m03s26_14/results.json
20220911-h15m10s51_15/augmented_model.onnx ONNX
20220911-h15m10s51_15/calibration_histograms.npy
20220911-h15m10s51_15/model.onnx ONNX
20220911-h15m10s51_15/ort_config.json
20220911-h15m10s51_15/quantized_model.onnx ONNX
20220911-h15m10s51_15/results.json
20220911-h15m12s25_16/model.onnx ONNX
20220911-h15m12s25_16/results.json
20220911-h15m14s39_17/results.json
README.md
runs.json
tensorboard/1662909285.3827937/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.1
tensorboard/1662909285.3842196/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.2
tensorboard/1662909285.3854108/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.3
tensorboard/1662909285.3865986/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.4
tensorboard/1662909285.3881178/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.5
tensorboard/1662909285.389244/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.6
tensorboard/1662909285.390423/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.7
tensorboard/1662909285.391528/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.8
tensorboard/1662909285.3926113/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.9
tensorboard/1662909285.3937325/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.10
tensorboard/1662909285.3948941/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.11
tensorboard/1662909285.3960202/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.12
tensorboard/1662909285.3971512/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.13
tensorboard/1662909285.398288/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.14
tensorboard/1662909285.399385/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.15
tensorboard/1662909285.402039/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.16
tensorboard/1662909285.4033327/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.17
tensorboard/1662909285.4047217/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.18
tensorboard/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.0
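
The listing above can be grouped by run directory to see which runs carry calibration artifacts. In the full listing, eight run directories contain calibration_histograms.npy and augmented_model.onnx, which matches the number of static-quantization configurations; reading them as the static runs is an inference from the file names, not something the page states. The helper names are hypothetical and the sample is a small excerpt of the listing.

```python
from collections import defaultdict


def group_by_run(paths):
    """Map each run directory to the set of file names it contains."""
    runs = defaultdict(set)
    for path in paths:
        directory, separator, filename = path.partition("/")
        # Skip top-level files and the tensorboard logs.
        if separator and directory != "tensorboard":
            runs[directory].add(filename)
    return dict(runs)


def has_calibration_artifacts(files):
    """True when a run directory holds calibration histograms."""
    return "calibration_histograms.npy" in files


# Excerpt of the file listing above.
sample_paths = [
    "20220911-h14m00s39_0/model.onnx",
    "20220911-h14m00s39_0/ort_config.json",
    "20220911-h14m00s39_0/quantized_model.onnx",
    "20220911-h14m00s39_0/results.json",
    "20220911-h14m09s38_2/augmented_model.onnx",
    "20220911-h14m09s38_2/calibration_histograms.npy",
    "20220911-h14m09s38_2/model.onnx",
    "20220911-h14m09s38_2/ort_config.json",
    "20220911-h14m09s38_2/quantized_model.onnx",
    "20220911-h14m09s38_2/results.json",
    "README.md",
    "runs.json",
    "tensorboard/events.out.tfevents.1662909285.ip-10-0-209-21.ec2.internal.1.0",
]

runs = group_by_run(sample_paths)
calibrated_runs = sorted(
    name for name, files in runs.items() if has_calibration_artifacts(files)
)
print(calibrated_runs)
```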