README
Task: text-classification
Backend: sagemaker-training
Backend arguments: {'instance_type': 'ml.m5.2xlarge', 'supported_instructions': 'avx512'}
Number of evaluation samples: entire dataset
Fixed parameters:
- Dataset: [{'path': 'glue', 'eval_split': 'validation', 'data_keys': {'primary': 'sentence'}, 'ref_keys': ['label'], 'name': 'sst2', 'calibration_split': 'train'}]
- Model path: distilbert-base-uncased-finetuned-sst-2-english
- From Transformers: True
- Calibration:
  - Method: percentile
  - Calibration samples: 128
  - Calibration histogram percentile: 99.999
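The calibration settings above (percentile method, 128 samples, 99.999th percentile) pick an activation clipping range from observed values instead of the raw min/max, so that rare outliers do not stretch the quantization scale. The following is a minimal standard-library sketch of that idea only. It is not the histogram-based procedure that the parameter name suggests the backend uses; `percentile_threshold`, `int8_scale`, and the synthetic activations are hypothetical names and data introduced for illustration.

```python
import random


def percentile_threshold(values, percentile=99.999):
    """Return a magnitude that covers `percentile` percent of |values|."""
    magnitudes = sorted(abs(v) for v in values)
    # Position of the smallest magnitude covering the requested fraction.
    rank = int(len(magnitudes) * percentile / 100.0)
    rank = min(rank, len(magnitudes) - 1)
    return magnitudes[rank]


def int8_scale(threshold):
    """Scale that maps [-threshold, threshold] onto the signed 8-bit range."""
    return threshold / 127.0


random.seed(0)
# Synthetic activations: mostly small values plus a single large outlier.
activations = [random.gauss(0.0, 1.0) for _ in range(100_000)] + [1000.0]

clip = percentile_threshold(activations, percentile=99.999)
full = max(abs(v) for v in activations)
print(f"min/max threshold: {full:.1f}, percentile threshold: {clip:.2f}")
print(f"int8 scale: min/max {int8_scale(full):.4f}, percentile {int8_scale(clip):.4f}")
```

With the outlier clipped, the scale is much finer for the bulk of the values, at the cost of saturating the outlier itself.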
Benchmarked parameters:
- framework: onnxruntime, pytorch
- quantization_approach: dynamic, static
- operators_to_quantize: ['Add', 'MatMul'], ['Add']
- node_exclusion: [], ['layernorm', 'gelu', 'residual', 'gather', 'softmax']
- per_channel: False, True
- framework_args: {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4}, {}
- reduce_range: True, False
- apply_quantization: True, False
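As a rough cross-check of the parameter grid, the sketch below enumerates the combinations of the five quantization settings and adds two unquantized baselines. The assumption, labeled in the code, is that each framework contributes exactly one baseline run; under that assumption the count matches the 34 rows of each results table.

```python
from itertools import product

# Benchmarked values for the quantized ONNX Runtime runs, as listed above.
grid = {
    "quantization_approach": ["dynamic", "static"],
    "operators_to_quantize": [["Add", "MatMul"], ["Add"]],
    "node_exclusion": [["layernorm", "gelu", "residual", "gather", "softmax"], []],
    "per_channel": [False, True],
    "reduce_range": [False, True],
}

# One dict per quantized configuration (cartesian product of all values).
quantized_runs = [dict(zip(grid, values)) for values in product(*grid.values())]

# Assumption: the remaining rows are unquantized baselines, one per framework.
baselines = [{"framework": "onnxruntime"}, {"framework": "pytorch"}]

total = len(quantized_runs) + len(baselines)
print(len(quantized_runs), "quantized runs,", total, "runs in total")
```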
Evaluation
Non-time metrics
| framework | quantization_approach | operators_to_quantize | node_exclusion | per_channel | framework_args | reduce_range | apply_quantization | accuracy |
|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 0.911 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.898 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.893 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.490 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.901 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.898 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.893 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.490 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.901 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.911 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.911 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.899 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.899 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.491 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.908 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.899 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.899 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.499 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.900 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.906 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.906 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.906 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.906 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.901 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.901 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 0.901 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 0.901 |
| pytorch | None | None | None | None | {} | None | None | 0.911 |
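To read the accuracy table at a glance, it can help to express each configuration as a drop relative to the unquantized baseline of 0.911. The sketch below does this for the eight rows that quantize `['Add', 'MatMul']` with the five-entry `node_exclusion` list. The values are copied from the table; the 0.02 tolerance is a hypothetical threshold chosen only for illustration.

```python
BASELINE = 0.911  # unquantized accuracy reported in the table

# (quantization_approach, per_channel, reduce_range) -> accuracy, for
# operators_to_quantize = ['Add', 'MatMul'] with the five-entry node_exclusion.
rows = {
    ("dynamic", False, False): 0.898,
    ("dynamic", False, True): 0.893,
    ("dynamic", True, False): 0.490,
    ("dynamic", True, True): 0.901,
    ("static", False, False): 0.899,
    ("static", False, True): 0.899,
    ("static", True, False): 0.491,
    ("static", True, True): 0.908,
}

TOLERANCE = 0.02  # hypothetical acceptable absolute accuracy drop

for config, accuracy in rows.items():
    drop = BASELINE - accuracy
    status = "ok" if drop <= TOLERANCE else "degraded"
    print(config, f"drop={drop:.3f}", status)

# Configurations whose drop exceeds the tolerance.
degraded = [c for c, acc in rows.items() if BASELINE - acc > TOLERANCE]
print("degraded:", degraded)
```

In this subset, the two flagged configurations are the ones with `per_channel=True` and `reduce_range=False`. Their accuracy of about 0.49 is close to chance level for a two-class task such as SST-2.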
Time metrics
Time benchmarks were run for 15 seconds per configuration.
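A fixed-duration benchmark of this kind can be sketched as a loop that keeps calling the model until the time budget is spent, then reports the mean latency and the number of completed calls per second. This is a minimal illustration, not the code that produced the numbers here: `benchmark`, `fake_inference`, and the warm-up count are hypothetical, and whether the original runs used a warm-up phase is not stated.

```python
import time


def benchmark(fn, duration_s=15.0, warmup_runs=5):
    """Call `fn` repeatedly for about `duration_s` seconds and summarize timings."""
    for _ in range(warmup_runs):  # assumption: a few untimed warm-up calls
        fn()
    latencies_ms = []
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        t0 = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - t0) * 1000.0)
    elapsed = time.perf_counter() - start
    return {
        "latency_mean (ms)": sum(latencies_ms) / len(latencies_ms),
        "throughput (/s)": len(latencies_ms) / elapsed,
    }


def fake_inference():
    """Stand-in workload; a real run would call the model on a fixed-size input."""
    time.sleep(0.001)


result = benchmark(fake_inference, duration_s=0.2)
print(result)
```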
Below are the time metrics for batch size = 1, input length = 32.
| framework | quantization_approach | operators_to_quantize | node_exclusion | per_channel | framework_args | reduce_range | apply_quantization | latency_mean (ms) | throughput (/s) |
|---|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 14.50 | 69.00 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.19 | 98.13 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.66 | 93.87 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.45 | 95.67 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.72 | 93.33 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.40 | 96.20 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.16 | 98.40 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 10.40 | 96.20 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 10.86 | 92.07 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.43 | 69.33 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.68 | 68.13 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.40 | 69.47 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.79 | 67.60 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.80 | 67.60 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.13 | 70.80 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.54 | 68.80 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.60 | 68.53 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 11.23 | 89.13 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 11.18 | 89.47 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 11.39 | 87.87 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 11.31 | 88.47 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 13.73 | 72.87 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 14.42 | 69.40 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 14.09 | 71.00 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 13.78 | 72.60 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 16.11 | 62.13 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 15.97 | 62.67 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 15.82 | 63.27 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 15.94 | 62.73 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.03 | 52.60 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.99 | 52.67 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.93 | 52.87 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.65 | 53.67 |
| pytorch | None | None | None | None | {} | None | None | 31.28 | 32.00 |
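With batch size 1 and calls issued back to back, throughput should be close to 1000 divided by the mean latency in milliseconds. The short check below applies that relation to four rows copied from the table above (the two baselines and the first dynamic and static rows). It is a sanity check on the reported figures, not a description of how they were computed.

```python
# (latency_mean in ms, throughput in /s) pairs copied from the table above.
pairs = [(14.50, 69.00), (10.19, 98.13), (11.23, 89.13), (31.28, 32.00)]

rel_gaps = []
for latency_ms, throughput in pairs:
    implied = 1000.0 / latency_ms  # inferences per second if calls run back to back
    rel_gaps.append(abs(implied - throughput) / throughput)
    print(f"{latency_ms:6.2f} ms -> implied {implied:6.2f}/s, reported {throughput:6.2f}/s")

print("largest relative gap:", f"{max(rel_gaps):.4f}")
```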
Below are the time metrics for batch size = 1, input length = 64.
| framework | quantization_approach | operators_to_quantize | node_exclusion | per_channel | framework_args | reduce_range | apply_quantization | latency_mean (ms) | throughput (/s) |
|---|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 24.59 | 40.67 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.67 | 53.60 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.16 | 52.20 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 18.97 | 52.73 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 19.29 | 51.87 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.13 | 52.33 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.64 | 53.67 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 19.01 | 52.60 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 18.96 | 52.80 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.63 | 40.67 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 25.28 | 39.60 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.75 | 40.47 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.97 | 40.07 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 25.16 | 39.80 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 24.49 | 40.87 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.88 | 40.20 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 25.17 | 39.73 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 20.05 | 49.93 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 20.76 | 48.20 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 20.75 | 48.20 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 20.23 | 49.47 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.79 | 40.40 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 25.17 | 39.73 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 24.14 | 41.47 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 25.27 | 39.60 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 27.97 | 35.80 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 27.43 | 36.47 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 28.17 | 35.53 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 28.16 | 35.53 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 33.24 | 30.13 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 32.46 | 30.87 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 32.39 | 30.93 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 32.75 | 30.53 |
| pytorch | None | None | None | None | {} | None | None | 41.25 | 24.27 |
Below are the time metrics for batch size = 1, input length = 128.
| framework | quantization_approach | operators_to_quantize | node_exclusion | per_channel | framework_args | reduce_range | apply_quantization | latency_mean (ms) | throughput (/s) |
|---|---|---|---|---|---|---|---|---|---|
| onnxruntime | None | None | None | None | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | None | False | 46.51 | 21.53 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.33 | 28.33 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.92 | 27.87 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.56 | 28.13 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 36.32 | 27.53 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.53 | 28.20 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 35.96 | 27.87 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 35.42 | 28.27 |
| onnxruntime | dynamic | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 36.06 | 27.80 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.40 | 21.13 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.14 | 21.27 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.46 | 21.13 |
| onnxruntime | dynamic | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.26 | 21.20 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.48 | 21.07 |
| onnxruntime | dynamic | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.08 | 21.27 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 47.02 | 21.33 |
| onnxruntime | dynamic | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 47.05 | 21.27 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 39.63 | 25.27 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 39.52 | 25.33 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 39.78 | 25.20 |
| onnxruntime | static | ['Add', 'MatMul'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 40.01 | 25.00 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 44.24 | 22.67 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 44.55 | 22.47 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 45.74 | 21.87 |
| onnxruntime | static | ['Add', 'MatMul'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 44.12 | 22.67 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 51.41 | 19.47 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 52.52 | 19.07 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 51.25 | 19.53 |
| onnxruntime | static | ['Add'] | ['layernorm', 'gelu', 'residual', 'gather', 'softmax'] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 51.51 | 19.47 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 59.37 | 16.87 |
| onnxruntime | static | ['Add'] | [] | False | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 58.28 | 17.20 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | False | True | 59.37 | 16.87 |
| onnxruntime | static | ['Add'] | [] | True | {'opset': 13, 'optimization_level': 1, 'intra_op_num_threads': 4} | True | True | 58.28 | 17.20 |
| pytorch | None | None | None | None | {} | None | None | 53.72 | 18.67 |
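To compare the three input lengths, the sketch below computes speedups relative to the unquantized ONNX Runtime baseline from mean latencies copied out of the tables above. Only four representative rows are included: the two baselines and the first dynamic and static rows (`['Add', 'MatMul']`, five-entry `node_exclusion`, `per_channel=False`, `reduce_range=False`). The dictionary keys are labels chosen for this example.

```python
# latency_mean (ms) per input length, copied from the three tables above.
latency = {
    "pytorch baseline": {32: 31.28, 64: 41.25, 128: 53.72},
    "onnxruntime baseline": {32: 14.50, 64: 24.59, 128: 46.51},
    # dynamic, ['Add', 'MatMul'], five-entry node_exclusion,
    # per_channel=False, reduce_range=False
    "onnxruntime dynamic": {32: 10.19, 64: 18.67, 128: 35.33},
    # static, same operators, node_exclusion, per_channel and reduce_range
    "onnxruntime static": {32: 11.23, 64: 20.05, 128: 39.63},
}

reference = latency["onnxruntime baseline"]

# Speedup > 1 means faster than the unquantized ONNX Runtime model.
speedup = {
    name: {length: reference[length] / ms for length, ms in by_length.items()}
    for name, by_length in latency.items()
}

for name, by_length in speedup.items():
    print(name, {length: round(s, 2) for length, s in by_length.items()})
```

For these rows, dynamic quantization is roughly 1.3 to 1.4 times faster than the unquantized ONNX Runtime model at every input length, while the PyTorch baseline is slower than ONNX Runtime throughout.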
fxmarty/20220911-h13m58s49_sst2_distilbert_quantization
Author: fxmarty
text-classification
Created: 2022-09-11 15:52:09+00:00
Updated: 2022-09-11 15:55:26+00:00
View files on Hugging Face (201)
.gitattributes
20220911-h14m00s12_0/model.onnx
ONNX
20220911-h14m00s12_0/ort_config.json
20220911-h14m00s12_0/quantized_model.onnx
ONNX
20220911-h14m00s12_0/results.json
20220911-h14m05s08_1/augmented_model.onnx
ONNX
20220911-h14m05s08_1/calibration_histograms.npy
20220911-h14m05s08_1/model.onnx
ONNX
20220911-h14m05s08_1/ort_config.json
20220911-h14m05s08_1/quantized_model.onnx
ONNX
20220911-h14m05s08_1/results.json
20220911-h14m06s13_2/model.onnx
ONNX
20220911-h14m06s13_2/ort_config.json
20220911-h14m06s13_2/quantized_model.onnx
ONNX
20220911-h14m06s13_2/results.json
20220911-h14m13s04_3/augmented_model.onnx
ONNX
20220911-h14m13s04_3/calibration_histograms.npy
20220911-h14m13s04_3/model.onnx
ONNX
20220911-h14m13s04_3/ort_config.json
20220911-h14m13s04_3/quantized_model.onnx
ONNX
20220911-h14m13s04_3/results.json
20220911-h14m17s44_4/augmented_model.onnx
ONNX
20220911-h14m17s44_4/calibration_histograms.npy
20220911-h14m17s44_4/model.onnx
ONNX
20220911-h14m17s44_4/ort_config.json
20220911-h14m17s44_4/quantized_model.onnx
ONNX
20220911-h14m17s44_4/results.json
20220911-h14m24s33_5/augmented_model.onnx
ONNX
20220911-h14m24s33_5/calibration_histograms.npy
20220911-h14m24s33_5/model.onnx
ONNX
20220911-h14m24s33_5/ort_config.json
20220911-h14m24s33_5/quantized_model.onnx
ONNX
20220911-h14m24s33_5/results.json
20220911-h14m25s45_6/model.onnx
ONNX
20220911-h14m25s45_6/ort_config.json
20220911-h14m25s45_6/quantized_model.onnx
ONNX
20220911-h14m25s45_6/results.json
20220911-h14m32s41_7/augmented_model.onnx
ONNX
20220911-h14m32s41_7/calibration_histograms.npy
20220911-h14m32s41_7/model.onnx
ONNX
20220911-h14m32s41_7/ort_config.json
20220911-h14m32s41_7/quantized_model.onnx
ONNX
20220911-h14m32s41_7/results.json
20220911-h14m37s20_8/augmented_model.onnx
ONNX
20220911-h14m37s20_8/calibration_histograms.npy
20220911-h14m37s20_8/model.onnx
ONNX
20220911-h14m37s20_8/ort_config.json
20220911-h14m37s20_8/quantized_model.onnx
ONNX
20220911-h14m37s20_8/results.json
20220911-h14m38s25_9/model.onnx
ONNX
20220911-h14m38s25_9/ort_config.json
20220911-h14m38s25_9/quantized_model.onnx
ONNX
20220911-h14m38s25_9/results.json
20220911-h14m45s20_10/augmented_model.onnx
ONNX
20220911-h14m45s20_10/calibration_histograms.npy
20220911-h14m45s20_10/model.onnx
ONNX
20220911-h14m45s20_10/ort_config.json
20220911-h14m45s20_10/quantized_model.onnx
ONNX
20220911-h14m45s20_10/results.json
20220911-h14m46s27_11/model.onnx
ONNX
20220911-h14m46s27_11/ort_config.json
20220911-h14m46s27_11/quantized_model.onnx
ONNX
20220911-h14m46s27_11/results.json
20220911-h14m53s25_12/augmented_model.onnx
ONNX
20220911-h14m53s25_12/calibration_histograms.npy
20220911-h14m53s25_12/model.onnx
ONNX
20220911-h14m53s25_12/ort_config.json
20220911-h14m53s25_12/quantized_model.onnx
ONNX
20220911-h14m53s25_12/results.json
20220911-h14m54s37_13/model.onnx
ONNX
20220911-h14m54s37_13/ort_config.json
20220911-h14m54s37_13/quantized_model.onnx
ONNX
20220911-h14m54s37_13/results.json
20220911-h14m55s44_14/model.onnx
ONNX
20220911-h14m55s44_14/ort_config.json
20220911-h14m55s44_14/quantized_model.onnx
ONNX
20220911-h14m55s44_14/results.json
20220911-h15m02s37_15/augmented_model.onnx
ONNX
20220911-h15m02s37_15/calibration_histograms.npy
20220911-h15m02s37_15/model.onnx
ONNX
20220911-h15m02s37_15/ort_config.json
20220911-h15m02s37_15/quantized_model.onnx
ONNX
20220911-h15m02s37_15/results.json
20220911-h15m03s42_16/model.onnx
ONNX
20220911-h15m03s42_16/ort_config.json
20220911-h15m03s42_16/quantized_model.onnx
ONNX
20220911-h15m03s42_16/results.json
20220911-h15m08s23_17/augmented_model.onnx
ONNX
20220911-h15m08s23_17/calibration_histograms.npy
20220911-h15m08s23_17/model.onnx
ONNX
20220911-h15m08s23_17/ort_config.json
20220911-h15m08s23_17/quantized_model.onnx
ONNX
20220911-h15m08s23_17/results.json
20220911-h15m13s03_18/augmented_model.onnx
ONNX
20220911-h15m13s03_18/calibration_histograms.npy
20220911-h15m13s03_18/model.onnx
ONNX
20220911-h15m13s03_18/ort_config.json
20220911-h15m13s03_18/quantized_model.onnx
ONNX
20220911-h15m13s03_18/results.json
20220911-h15m17s44_19/augmented_model.onnx
ONNX
20220911-h15m17s44_19/calibration_histograms.npy
20220911-h15m17s44_19/model.onnx
ONNX
20220911-h15m17s44_19/ort_config.json
20220911-h15m17s44_19/quantized_model.onnx
ONNX
20220911-h15m17s44_19/results.json
20220911-h15m22s23_20/augmented_model.onnx
ONNX
20220911-h15m22s23_20/calibration_histograms.npy
20220911-h15m22s23_20/model.onnx
ONNX
20220911-h15m22s23_20/ort_config.json
20220911-h15m22s23_20/quantized_model.onnx
ONNX
20220911-h15m22s23_20/results.json
20220911-h15m23s36_21/model.onnx
ONNX
20220911-h15m23s36_21/ort_config.json
20220911-h15m23s36_21/quantized_model.onnx
ONNX
20220911-h15m23s36_21/results.json
20220911-h15m24s43_22/model.onnx
ONNX
20220911-h15m24s43_22/ort_config.json
20220911-h15m24s43_22/quantized_model.onnx
ONNX
20220911-h15m24s43_22/results.json
20220911-h15m25s50_23/model.onnx
ONNX
20220911-h15m25s50_23/ort_config.json
20220911-h15m25s50_23/quantized_model.onnx
ONNX
20220911-h15m25s50_23/results.json
20220911-h15m26s55_24/model.onnx
ONNX
20220911-h15m26s55_24/ort_config.json
20220911-h15m26s55_24/quantized_model.onnx
ONNX
20220911-h15m26s55_24/results.json
20220911-h15m28s00_25/model.onnx
ONNX
20220911-h15m28s00_25/ort_config.json
20220911-h15m28s00_25/quantized_model.onnx
ONNX
20220911-h15m28s00_25/results.json
20220911-h15m29s06_26/model.onnx
ONNX
20220911-h15m29s06_26/ort_config.json
20220911-h15m29s06_26/quantized_model.onnx
ONNX
20220911-h15m29s06_26/results.json
20220911-h15m36s07_27/augmented_model.onnx
ONNX
20220911-h15m36s07_27/calibration_histograms.npy
20220911-h15m36s07_27/model.onnx
ONNX
20220911-h15m36s07_27/ort_config.json
20220911-h15m36s07_27/quantized_model.onnx
ONNX
20220911-h15m36s07_27/results.json
20220911-h15m37s12_28/model.onnx
ONNX
20220911-h15m37s12_28/ort_config.json
20220911-h15m37s12_28/quantized_model.onnx
ONNX
20220911-h15m37s12_28/results.json
20220911-h15m41s52_29/augmented_model.onnx
ONNX
20220911-h15m41s52_29/calibration_histograms.npy
20220911-h15m41s52_29/model.onnx
20220911-h15m41s52_29/ort_config.json
20220911-h15m41s52_29/quantized_model.onnx
20220911-h15m41s52_29/results.json
20220911-h15m48s48_30/augmented_model.onnx
20220911-h15m48s48_30/calibration_histograms.npy
20220911-h15m48s48_30/model.onnx
20220911-h15m48s48_30/ort_config.json
20220911-h15m48s48_30/quantized_model.onnx
20220911-h15m48s48_30/results.json
20220911-h15m49s53_31/model.onnx
20220911-h15m49s53_31/ort_config.json
20220911-h15m49s53_31/quantized_model.onnx
20220911-h15m49s53_31/results.json
20220911-h15m50s55_32/model.onnx
20220911-h15m50s55_32/results.json
20220911-h15m52s08_33/results.json
README.md
runs.json
tensorboard/1662911534.732837/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.1
tensorboard/1662911534.734258/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.2
tensorboard/1662911534.7355487/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.3
tensorboard/1662911534.7367046/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.4
tensorboard/1662911534.7379947/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.5
tensorboard/1662911534.739108/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.6
tensorboard/1662911534.7402008/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.7
tensorboard/1662911534.7415037/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.8
tensorboard/1662911534.743044/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.9
tensorboard/1662911534.7445388/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.10
tensorboard/1662911534.745675/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.11
tensorboard/1662911534.7467856/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.12
tensorboard/1662911534.7478852/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.13
tensorboard/1662911534.749013/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.14
tensorboard/1662911534.7501235/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.15
tensorboard/1662911534.7512152/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.16
tensorboard/1662911534.752321/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.17
tensorboard/1662911534.7534142/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.18
tensorboard/1662911534.754547/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.19
tensorboard/1662911534.7556365/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.20
tensorboard/1662911534.7567508/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.21
tensorboard/1662911534.7581015/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.22
tensorboard/1662911534.7595367/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.23
tensorboard/1662911534.7606676/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.24
tensorboard/1662911534.7617643/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.25
tensorboard/1662911534.7628515/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.26
tensorboard/1662911534.7639267/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.27
tensorboard/1662911534.7650144/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.28
tensorboard/1662911534.7661133/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.29
tensorboard/1662911534.7672038/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.30
tensorboard/1662911534.7682934/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.31
tensorboard/1662911534.7693875/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.32
tensorboard/1662911534.7704914/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.33
tensorboard/1662911534.7715504/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.34
tensorboard/events.out.tfevents.1662911534.ip-10-2-100-218.ec2.internal.1.0