README

Task: text-classification
Backend: sagemaker-training
Backend arguments: {'instance_type': 'ml.m5.2xlarge', 'supported_instructions': 'avx512'}
Number of evaluation samples: full dataset

Fixed parameters:

  • Dataset: [{'path': 'glue', 'eval_split': 'validation', 'data_keys': {'primary': 'sentence'}, 'ref_keys': ['label'], 'name': 'sst2', 'calibration_split': 'train'}]
  • Model path: distilbert-base-uncased-finetuned-sst-2-english
  • From Transformers: True
  • Calibration:
    • Method: percentile
    • Number of calibration samples: 128
    • Calibration histogram percentile: 99.999
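
To illustrate what a percentile calibration setting means, here is a minimal pure-Python sketch. It is not ONNX Runtime's actual implementation: real calibrators work on histograms collected over the calibration samples, whereas this toy version sorts raw values. The function names `percentile_threshold` and `symmetric_int8_scale` are hypothetical.

```python
import math


def percentile_threshold(values, percentile=99.999):
    """Return the smallest magnitude such that at least `percentile` percent
    of the absolute values are less than or equal to it (nearest-rank)."""
    if not values:
        raise ValueError("values must be non-empty")
    magnitudes = sorted(abs(v) for v in values)
    # Nearest-rank percentile: the rank is 1-based, so clamp it to [1, n].
    rank = math.ceil(percentile * len(magnitudes) / 100.0)
    rank = min(max(rank, 1), len(magnitudes))
    return magnitudes[rank - 1]


def symmetric_int8_scale(threshold):
    """Scale that maps [-threshold, threshold] onto the int8 range [-127, 127]."""
    return threshold / 127.0


# Toy activations: 99 small values and one large outlier.
activations = [i / 100 for i in range(1, 100)] + [50.0]

clip = percentile_threshold(activations, percentile=99.0)
scale = symmetric_int8_scale(clip)
print(clip, scale)
```

With the outlier clipped away, the quantization range covers the bulk of the values instead of being stretched by a single extreme activation.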

Benchmarked parameters:

  • Framework: onnxruntime, pytorch
  • Quantization approach: dynamic, static
  • Operators to quantize: ['Add', 'MatMul'], ['Add']
  • Node exclusion: [], ['layernorm', 'gelu', 'residual', 'gather', 'softmax']
  • Per-channel quantization: False, True
  • Framework arguments: {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4}, {}
  • Reduce range: True, False
  • Apply quantization: True, False
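
As a rough sketch of how such a parameter grid expands into individual runs, the five quantization options can be combined with `itertools.product`. This is an assumption about how the sweep was built, inferred from the 32 quantized rows plus two unquantized baselines in the result tables of this card; the variable names are illustrative only.

```python
from itertools import product

quantization_approaches = ["dynamic", "static"]
operators_to_quantize = [["Add", "MatMul"], ["Add"]]
node_exclusion = [["layernorm", "gelu", "residual", "gather", "softmax"], []]
per_channel = [False, True]
reduce_range = [False, True]

# Every combination of the five quantization options (ONNX Runtime only).
quantized_runs = [
    {
        "framework": "onnxruntime",
        "quantization_approach": approach,
        "operators_to_quantize": ops,
        "node_exclusion": excluded,
        "per_channel": channel,
        "reduce_range": rng,
        "apply_quantization": True,
    }
    for approach, ops, excluded, channel, rng in product(
        quantization_approaches,
        operators_to_quantize,
        node_exclusion,
        per_channel,
        reduce_range,
    )
]

# Two unquantized baselines: ONNX Runtime and PyTorch.
baselines = [
    {"framework": "onnxruntime", "apply_quantization": False},
    {"framework": "pytorch", "apply_quantization": None},
]

all_runs = baselines + quantized_runs
print(len(quantized_runs), len(all_runs))  # 32 34
```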

Evaluation

Non-time metrics

framework quantization_approach operators_to_quantize node_exclusion per_channel framework_args reduce_range apply_quantization | accuracy
onnxruntime None None None None {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} None False | 0.911
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.898
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.893
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.490
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.901
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.898
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.893
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.490
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.901
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.911
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.911
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.911
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.911
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.911
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.911
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.911
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.911
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.899
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.899
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.491
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.908
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.899
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.899
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.499
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.900
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.906
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.906
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.906
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.906
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.901
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.901
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 0.901
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 0.901
pytorch None None None None {} None None | 0.911
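
One way to read the accuracy table is to compare each configuration against the unquantized ONNX Runtime baseline (0.911). The sketch below copies a handful of rows from the table by hand and flags those whose accuracy falls more than a chosen tolerance below the baseline. The 0.02 tolerance and the helper name `flag_degraded` are arbitrary choices for illustration.

```python
BASELINE_ACCURACY = 0.911  # unquantized onnxruntime row

# (approach, operators_to_quantize, per_channel, reduce_range, accuracy)
# All rows use node_exclusion = ['layernorm', 'gelu', 'residual', 'gather', 'softmax'].
rows = [
    ("dynamic", ("Add", "MatMul"), False, False, 0.898),
    ("dynamic", ("Add", "MatMul"), False, True, 0.893),
    ("dynamic", ("Add", "MatMul"), True, False, 0.490),
    ("dynamic", ("Add", "MatMul"), True, True, 0.901),
    ("dynamic", ("Add",), True, False, 0.911),
    ("static", ("Add", "MatMul"), True, False, 0.491),
    ("static", ("Add", "MatMul"), True, True, 0.908),
]


def flag_degraded(rows, baseline, tolerance=0.02):
    """Return the rows whose accuracy is more than `tolerance` below `baseline`."""
    return [row for row in rows if baseline - row[-1] > tolerance]


degraded = flag_degraded(rows, BASELINE_ACCURACY)
for approach, ops, per_channel, reduce_range, accuracy in degraded:
    print(approach, ops, per_channel, reduce_range, accuracy)
```

Among these rows, the only configurations flagged are those that quantize MatMul with per_channel = True and reduce_range = False, which land close to chance level for a two-class task.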

Time metrics

Time benchmarks were run for 15 seconds per configuration.
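
A fixed-duration benchmark of this kind can be sketched as follows. This is an assumption about the general approach, not the code used to produce the tables: `run_fixed_duration` and the dummy workload are hypothetical, and a real run would call the model's inference function instead.

```python
import time


def run_fixed_duration(fn, duration_s=15.0, warmup_runs=3):
    """Call `fn` repeatedly for about `duration_s` seconds and report
    mean latency in milliseconds and throughput in calls per second."""
    for _ in range(warmup_runs):
        fn()  # warm-up calls are not timed

    latencies = []
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        t0 = time.perf_counter()
        fn()
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start

    if not latencies:
        raise ValueError("duration_s too short: no timed calls completed")

    return {
        "runs": len(latencies),
        "latency_mean_ms": 1000.0 * sum(latencies) / len(latencies),
        "throughput_per_s": len(latencies) / elapsed,
    }


def dummy_inference():
    # Stand-in workload; replace with a real model call.
    sum(i * i for i in range(10_000))


# A short duration is used here so the example finishes quickly.
stats = run_fixed_duration(dummy_inference, duration_s=0.2)
print(stats)
```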

Below are the time metrics for batch size = 1, input length = 32.

framework quantization_approach operators_to_quantize node_exclusion per_channel framework_args reduce_range apply_quantization | latency_mean (ms) | throughput (/s)
onnxruntime None None None None {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} None False | 14.50 | 69.00
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 10.19 | 98.13
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 10.66 | 93.87
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 10.45 | 95.67
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 10.72 | 93.33
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 10.40 | 96.20
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 10.16 | 98.40
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 10.40 | 96.20
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 10.86 | 92.07
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 14.43 | 69.33
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 14.68 | 68.13
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 14.40 | 69.47
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 14.79 | 67.60
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 14.80 | 67.60
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 14.13 | 70.80
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 14.54 | 68.80
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 14.60 | 68.53
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 11.23 | 89.13
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 11.18 | 89.47
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 11.39 | 87.87
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 11.31 | 88.47
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 13.73 | 72.87
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 14.42 | 69.40
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 14.09 | 71.00
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 13.78 | 72.60
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 16.11 | 62.13
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 15.97 | 62.67
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 15.82 | 63.27
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 15.94 | 62.73
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 19.03 | 52.60
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 18.99 | 52.67
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 18.93 | 52.87
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 18.65 | 53.67
pytorch None None None None {} None None | 31.28 | 32.00

Below are the time metrics for batch size = 1, input length = 64.

framework quantization_approach operators_to_quantize node_exclusion per_channel framework_args reduce_range apply_quantization | latency_mean (ms) | throughput (/s)
onnxruntime None None None None {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} None False | 24.59 | 40.67
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 18.67 | 53.60
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 19.16 | 52.20
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 18.97 | 52.73
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 19.29 | 51.87
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 19.13 | 52.33
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 18.64 | 53.67
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 19.01 | 52.60
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 18.96 | 52.80
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 24.63 | 40.67
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 25.28 | 39.60
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 24.75 | 40.47
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 24.97 | 40.07
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 25.16 | 39.80
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 24.49 | 40.87
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 24.88 | 40.20
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 25.17 | 39.73
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 20.05 | 49.93
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 20.76 | 48.20
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 20.75 | 48.20
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 20.23 | 49.47
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 24.79 | 40.40
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 25.17 | 39.73
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 24.14 | 41.47
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 25.27 | 39.60
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 27.97 | 35.80
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 27.43 | 36.47
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 28.17 | 35.53
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 28.16 | 35.53
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 33.24 | 30.13
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 32.46 | 30.87
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 32.39 | 30.93
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 32.75 | 30.53
pytorch None None None None {} None None | 41.25 | 24.27

Below are the time metrics for batch size = 1, input length = 128.

framework quantization_approach operators_to_quantize node_exclusion per_channel framework_args reduce_range apply_quantization | latency_mean (ms) | throughput (/s)
onnxruntime None None None None {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} None False | 46.51 | 21.53
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 35.33 | 28.33
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 35.92 | 27.87
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 35.56 | 28.13
onnxruntime dynamic ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 36.32 | 27.53
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 35.53 | 28.20
onnxruntime dynamic ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 35.96 | 27.87
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 35.42 | 28.27
onnxruntime dynamic ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 36.06 | 27.80
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 47.40 | 21.13
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 47.14 | 21.27
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 47.46 | 21.13
onnxruntime dynamic ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 47.26 | 21.20
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 47.48 | 21.07
onnxruntime dynamic ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 47.08 | 21.27
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 47.02 | 21.33
onnxruntime dynamic ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 47.05 | 21.27
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 39.63 | 25.27
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 39.52 | 25.33
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 39.78 | 25.20
onnxruntime static ['Add', 'MatMul'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 40.01 | 25.00
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 44.24 | 22.67
onnxruntime static ['Add', 'MatMul'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 44.55 | 22.47
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 45.74 | 21.87
onnxruntime static ['Add', 'MatMul'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 44.12 | 22.67
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 51.41 | 19.47
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 52.52 | 19.07
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 51.25 | 19.53
onnxruntime static ['Add'] ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 51.51 | 19.47
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 59.37 | 16.87
onnxruntime static ['Add'] [] False {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 58.28 | 17.20
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} False True | 59.37 | 16.87
onnxruntime static ['Add'] [] True {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} True True | 58.28 | 17.20
pytorch None None None None {} None None | 53.72 | 18.67
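
To put the latency tables in relative terms, the sketch below computes the speedup of one dynamically quantized configuration (operators ['Add', 'MatMul'], the five-entry node exclusion list, per_channel = False, reduce_range = False) over the unquantized ONNX Runtime and PyTorch rows at each input length. The numbers are copied by hand from the tables above; the dictionary layout is only for illustration.

```python
# Mean latency in milliseconds at batch size 1, keyed by input length.
latency_ms = {
    32: {"onnxruntime": 14.50, "dynamic_quantized": 10.19, "pytorch": 31.28},
    64: {"onnxruntime": 24.59, "dynamic_quantized": 18.67, "pytorch": 41.25},
    128: {"onnxruntime": 46.51, "dynamic_quantized": 35.33, "pytorch": 53.72},
}

# Speedup = baseline latency / quantized latency (values above 1 mean faster).
speedups = {
    length: {
        "vs_onnxruntime": row["onnxruntime"] / row["dynamic_quantized"],
        "vs_pytorch": row["pytorch"] / row["dynamic_quantized"],
    }
    for length, row in latency_ms.items()
}

for length, s in speedups.items():
    print(
        f"input length {length}: "
        f"{s['vs_onnxruntime']:.2f}x vs onnxruntime, "
        f"{s['vs_pytorch']:.2f}x vs pytorch"
    )
```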

fxmarty/20220911-h13m58s49_sst2_distilbert_quantization

Author: fxmarty

text-classification

Created: 2022-09-11 15:52:09+00:00

Updated: 2022-09-11 15:55:26+00:00

Files (201)

.gitattributes
20220911-h14m00s12_0/model.onnx ONNX
20220911-h14m00s12_0/ort_config.json
20220911-h14m00s12_0/quantized_model.onnx ONNX
20220911-h14m00s12_0/results.json
20220911-h14m05s08_1/augmented_model.onnx ONNX
20220911-h14m05s08_1/calibration_histograms.npy
20220911-h14m05s08_1/model.onnx ONNX
20220911-h14m05s08_1/ort_config.json
20220911-h14m05s08_1/quantized_model.onnx ONNX
20220911-h14m05s08_1/results.json
20220911-h14m06s13_2/model.onnx ONNX
20220911-h14m06s13_2/ort_config.json
20220911-h14m06s13_2/quantized_model.onnx ONNX
20220911-h14m06s13_2/results.json
20220911-h14m13s04_3/augmented_model.onnx ONNX
20220911-h14m13s04_3/calibration_histograms.npy
20220911-h14m13s04_3/model.onnx ONNX
20220911-h14m13s04_3/ort_config.json
20220911-h14m13s04_3/quantized_model.onnx ONNX
20220911-h14m13s04_3/results.json
20220911-h14m17s44_4/augmented_model.onnx ONNX
20220911-h14m17s44_4/calibration_histograms.npy
20220911-h14m17s44_4/model.onnx ONNX
20220911-h14m17s44_4/ort_config.json
20220911-h14m17s44_4/quantized_model.onnx ONNX
20220911-h14m17s44_4/results.json
20220911-h14m24s33_5/augmented_model.onnx ONNX
20220911-h14m24s33_5/calibration_histograms.npy
20220911-h14m24s33_5/model.onnx ONNX
20220911-h14m24s33_5/ort_config.json
20220911-h14m24s33_5/quantized_model.onnx ONNX
20220911-h14m24s33_5/results.json
20220911-h14m25s45_6/model.onnx ONNX
20220911-h14m25s45_6/ort_config.json
20220911-h14m25s45_6/quantized_model.onnx ONNX
20220911-h14m25s45_6/results.json
20220911-h14m32s41_7/augmented_model.onnx ONNX
20220911-h14m32s41_7/calibration_histograms.npy
20220911-h14m32s41_7/model.onnx ONNX
20220911-h14m32s41_7/ort_config.json
20220911-h14m32s41_7/quantized_model.onnx ONNX
20220911-h14m32s41_7/results.json
20220911-h14m37s20_8/augmented_model.onnx ONNX
20220911-h14m37s20_8/calibration_histograms.npy
20220911-h14m37s20_8/model.onnx ONNX
20220911-h14m37s20_8/ort_config.json
20220911-h14m37s20_8/quantized_model.onnx ONNX
20220911-h14m37s20_8/results.json
20220911-h14m38s25_9/model.onnx ONNX
20220911-h14m38s25_9/ort_config.json
20220911-h14m38s25_9/quantized_model.onnx ONNX
20220911-h14m38s25_9/results.json
20220911-h14m45s20_10/augmented_model.onnx ONNX
20220911-h14m45s20_10/calibration_histograms.npy
20220911-h14m45s20_10/model.onnx ONNX
20220911-h14m45s20_10/ort_config.json
20220911-h14m45s20_10/quantized_model.onnx ONNX
20220911-h14m45s20_10/results.json
20220911-h14m46s27_11/model.onnx ONNX
20220911-h14m46s27_11/ort_config.json
20220911-h14m46s27_11/quantized_model.onnx ONNX
20220911-h14m46s27_11/results.json
20220911-h14m53s25_12/augmented_model.onnx ONNX
20220911-h14m53s25_12/calibration_histograms.npy
20220911-h14m53s25_12/model.onnx ONNX
20220911-h14m53s25_12/ort_config.json
20220911-h14m53s25_12/quantized_model.onnx ONNX
20220911-h14m53s25_12/results.json
20220911-h14m54s37_13/model.onnx ONNX
20220911-h14m54s37_13/ort_config.json
20220911-h14m54s37_13/quantized_model.onnx ONNX
20220911-h14m54s37_13/results.json
20220911-h14m55s44_14/model.onnx ONNX
20220911-h14m55s44_14/ort_config.json
20220911-h14m55s44_14/quantized_model.onnx ONNX
20220911-h14m55s44_14/results.json
20220911-h15m02s37_15/augmented_model.onnx ONNX
20220911-h15m02s37_15/calibration_histograms.npy
20220911-h15m02s37_15/model.onnx ONNX
20220911-h15m02s37_15/ort_config.json
20220911-h15m02s37_15/quantized_model.onnx ONNX
20220911-h15m02s37_15/results.json
20220911-h15m03s42_16/model.onnx ONNX
20220911-h15m03s42_16/ort_config.json
20220911-h15m03s42_16/quantized_model.onnx ONNX
20220911-h15m03s42_16/results.json
20220911-h15m08s23_17/augmented_model.onnx ONNX
20220911-h15m08s23_17/calibration_histograms.npy
20220911-h15m08s23_17/model.onnx ONNX
20220911-h15m08s23_17/ort_config.json
20220911-h15m08s23_17/quantized_model.onnx ONNX
20220911-h15m08s23_17/results.json
20220911-h15m13s03_18/augmented_model.onnx ONNX
20220911-h15m13s03_18/calibration_histograms.npy
20220911-h15m13s03_18/model.onnx ONNX
20220911-h15m13s03_18/ort_config.json
20220911-h15m13s03_18/quantized_model.onnx ONNX
20220911-h15m13s03_18/results.json
20220911-h15m17s44_19/augmented_model.onnx ONNX
20220911-h15m17s44_19/calibration_histograms.npy
20220911-h15m17s44_19/model.onnx ONNX
20220911-h15m17s44_19/ort_config.json
20220911-h15m17s44_19/quantized_model.onnx ONNX
20220911-h15m17s44_19/results.json
20220911-h15m22s23_20/augmented_model.onnx ONNX
20220911-h15m22s23_20/calibration_histograms.npy
20220911-h15m22s23_20/model.onnx ONNX
20220911-h15m22s23_20/ort_config.json
20220911-h15m22s23_20/quantized_model.onnx ONNX
20220911-h15m22s23_20/results.json
20220911-h15m23s36_21/model.onnx ONNX
20220911-h15m23s36_21/ort_config.json
20220911-h15m23s36_21/quantized_model.onnx ONNX
20220911-h15m23s36_21/results.json
20220911-h15m24s43_22/model.onnx ONNX
20220911-h15m24s43_22/ort_config.json
20220911-h15m24s43_22/quantized_model.onnx ONNX
20220911-h15m24s43_22/results.json
20220911-h15m25s50_23/model.onnx ONNX
20220911-h15m25s50_23/ort_config.json
20220911-h15m25s50_23/quantized_model.onnx ONNX
20220911-h15m25s50_23/results.json
20220911-h15m26s55_24/model.onnx ONNX
20220911-h15m26s55_24/ort_config.json
20220911-h15m26s55_24/quantized_model.onnx ONNX
20220911-h15m26s55_24/results.json
20220911-h15m28s00_25/model.onnx ONNX
20220911-h15m28s00_25/ort_config.json
20220911-h15m28s00_25/quantized_model.onnx ONNX
20220911-h15m28s00_25/results.json
20220911-h15m29s06_26/model.onnx ONNX
20220911-h15m29s06_26/ort_config.json
20220911-h15m29s06_26/quantized_model.onnx ONNX
20220911-h15m29s06_26/results.json
20220911-h15m36s07_27/augmented_model.onnx ONNX
20220911-h15m36s07_27/calibration_histograms.npy
20220911-h15m36s07_27/model.onnx ONNX
20220911-h15m36s07_27/ort_config.json
20220911-h15m36s07_27/quantized_model.onnx ONNX
20220911-h15m36s07_27/results.json
20220911-h15m37s12_28/model.onnx ONNX
20220911-h15m37s12_28/ort_config.json
20220911-h15m37s12_28/quantized_model.onnx ONNX
20220911-h15m37s12_28/results.json
20220911-h15m41s52_29/augmented_model.onnx ONNX
20220911-h15m41s52_29/calibration_histograms.npy
20220911-h15m41s52_29/model.onnx ONNX
20220911-h15m41s52_29/ort_config.json
20220911-h15m41s52_29/quantized_model.onnx ONNX
20220911-h15m41s52_29/results.json
20220911-h15m48s48_30/augmented_model.onnx ONNX
20220911-h15m48s48_30/calibration_histograms.npy
20220911-h15m48s48_30/model.onnx ONNX
20220911-h15m48s48_30/ort_config.json
20220911-h15m48s48_30/quantized_model.onnx ONNX
20220911-h15m48s48_30/results.json
20220911-h15m49s53_31/model.onnx ONNX
20220911-h15m49s53_31/ort_config.json
20220911-h15m49s53_31/quantized_model.onnx ONNX
20220911-h15m49s53_31/results.json
20220911-h15m50s55_32/model.onnx ONNX
20220911-h15m50s55_32/results.json
20220911-h15m52s08_33/results.json
README.md
runs.json
tensorboard/1662911534.732837/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.1
tensorboard/1662911534.734258/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.2
tensorboard/1662911534.7355487/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.3
tensorboard/1662911534.7367046/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.4
tensorboard/1662911534.7379947/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.5
tensorboard/1662911534.739108/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.6
tensorboard/1662911534.7402008/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.7
tensorboard/1662911534.7415037/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.8
tensorboard/1662911534.743044/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.9
tensorboard/1662911534.7445388/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.10
tensorboard/1662911534.745675/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.11
tensorboard/1662911534.7467856/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.12
tensorboard/1662911534.7478852/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.13
tensorboard/1662911534.749013/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.14
tensorboard/1662911534.7501235/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.15
tensorboard/1662911534.7512152/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.16
tensorboard/1662911534.752321/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.17
tensorboard/1662911534.7534142/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.18
tensorboard/1662911534.754547/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.19
tensorboard/1662911534.7556365/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.20
tensorboard/1662911534.7567508/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.21
tensorboard/1662911534.7581015/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.22
tensorboard/1662911534.7595367/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.23
tensorboard/1662911534.7606676/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.24
tensorboard/1662911534.7617643/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.25
tensorboard/1662911534.7628515/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.26
tensorboard/1662911534.7639267/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.27
tensorboard/1662911534.7650144/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.28
tensorboard/1662911534.7661133/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.29
tensorboard/1662911534.7672038/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.30
tensorboard/1662911534.7682934/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.31
tensorboard/1662911534.7693875/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.32
tensorboard/1662911534.7704914/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.33
tensorboard/1662911534.7715504/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.34
tensorboard/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.0