README
license: apple-ascl
license_url: ./LICENSE
library_name: onnxruntime
pipeline_tag: depth-estimation
tags:
- onnx
- depth-estimation
- apple
- fp16
- gpu
base_model:
- apple/DepthPro
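Performance-Optimized, Lightweight ONNX Version of DepthPro
This ONNX-based DepthPro model generates high-quality depth maps with minimal overhead. Depth values are encoded so that near points are bright and far points are dark, which means the output can be used directly in stereo and parallax applications without additional inversion or preprocessing. The model is optimized for efficient inference on standard hardware.
[!TIP] See it in action: Video Stereo Converter uses this model to convert 2D video into immersive 3D stereo content, with built-in batch processing, resumable workflows, and smart disk management.
Key Features
- Depth-only ONNX export: significantly reduces model size while preserving full depth quality
- Skips field-of-view calibration: outputs the raw predicted depth values with no post-processing step, avoiding normalization artifacts and computational overhead
- Disparity-ready output: compatible with stereo and parallax workflows out of the box, with no conversion needed
- FP16 weights: optimized for DirectML GPU acceleration for faster inference
- Batch size of 1: benchmarks show that single-image batches deliver the best throughput; larger batches are actually slower
- Opset 21: uses modern ONNX operators, enabling a wider range of runtime optimizations
- Aggressive graph optimization: a simplified model graph that reduces computation and speeds up loading
- Fast inference: minimal memory footprint and fast depth map generation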
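Technical Specifications
| Property | Value |
|---|---|
| Input shape | (1, 3, 1536, 1536) NCHW |
| Input data type | float16 |
| Input range | [-1.0, 1.0] (normalized RGB) |
| Output shape | (1, 1536, 1536) |
| Output data type | float16 |
| Output range | Relative depth (larger value = closer) |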
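The specification above can be checked in code before a frame is sent to the model. The following is a minimal sketch, assuming an 8-bit RGB frame that has already been resized to 1536x1536; the helper names `to_model_input` and `spec_violations` are illustrative and are not part of this repository.

```python
import numpy as np

SIZE = 1536  # spatial resolution from the specification table


def to_model_input(rgb_u8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a (1, 3, H, W) float16 tensor in [-1, 1]."""
    if rgb_u8.dtype != np.uint8 or rgb_u8.shape != (SIZE, SIZE, 3):
        raise ValueError(
            f"expected uint8 image of shape ({SIZE}, {SIZE}, 3), "
            f"got {rgb_u8.dtype} {rgb_u8.shape}"
        )
    scaled = rgb_u8.astype(np.float32) / 127.5 - 1.0  # 0..255 -> -1..1
    chw = np.transpose(scaled, (2, 0, 1))             # HWC -> CHW
    return np.ascontiguousarray(chw[np.newaxis].astype(np.float16))


def spec_violations(tensor: np.ndarray) -> list:
    """Return a list of human-readable problems; an empty list means the tensor matches."""
    problems = []
    if tensor.shape != (1, 3, SIZE, SIZE):
        problems.append(f"shape is {tensor.shape}, expected (1, 3, {SIZE}, {SIZE})")
    if tensor.dtype != np.float16:
        problems.append(f"dtype is {tensor.dtype}, expected float16")
    if tensor.size and (tensor.min() < -1.0 or tensor.max() > 1.0):
        problems.append("values fall outside [-1.0, 1.0]")
    return problems
```

Calling `spec_violations(to_model_input(frame))` before `session.run` turns a shape or dtype mismatch into a readable message instead of a runtime error from the execution provider.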
Requirements
- VRAM: ~5.2 GB
- ONNX Runtime: 1.19.0 or later
- Python: 3.8 or later
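These requirements can be verified programmatically before loading the model. The sketch below is one possible check, assuming that version strings follow the usual `major.minor.patch` pattern; the helper names `parse_version`, `ort_version_ok`, and `pick_providers` are hypothetical. It only relies on `onnxruntime.__version__` and `onnxruntime.get_available_providers()`, and it does not check VRAM.

```python
import sys

MIN_ORT = (1, 19, 0)   # minimum ONNX Runtime version listed above
MIN_PYTHON = (3, 8)    # minimum Python version listed above
PREFERRED = ("DmlExecutionProvider", "CPUExecutionProvider")


def parse_version(text: str) -> tuple:
    """Turn '1.23.0' or '1.20.0rc1' into a 3-tuple of leading integers."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:      # pad '1.19' to (1, 19, 0)
        parts.append(0)
    return tuple(parts[:3])


def ort_version_ok(version: str) -> bool:
    """True if the given ONNX Runtime version meets the documented minimum."""
    return parse_version(version) >= MIN_ORT


def pick_providers(available) -> list:
    """Keep the preferred providers that are actually available, in preference order."""
    return [p for p in PREFERRED if p in available]


if __name__ == "__main__":
    print("Python OK:", sys.version_info[:2] >= MIN_PYTHON)
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed")
    else:
        print("ONNX Runtime OK:", ort_version_ok(ort.__version__))
        print("Providers:", pick_providers(ort.get_available_providers()))
```

If `DmlExecutionProvider` is missing from the result, inference falls back to the CPU provider, which is expected to be much slower for a model of this size.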
Quick Start
pip install onnxruntime-directml numpy opencv-python
import cv2
import numpy as np
import onnxruntime as ort
# Load the model
session = ort.InferenceSession('depthpro_1536x1536_bs1_fp16_opset21_optimized.onnx', providers=['DmlExecutionProvider', 'CPUExecutionProvider'])
input_name, output_name = session.get_inputs()[0].name, session.get_outputs()[0].name
# Load and preprocess (cv2.imread returns None if the file cannot be read)
bgr = cv2.imread('examples/sample1/source.jpg')
if bgr is None:
    raise FileNotFoundError('examples/sample1/source.jpg could not be read')
img = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)
img = cv2.resize(img, (1536, 1536))
img = np.transpose(((img.astype(np.float32)/127.5)-1.0).astype(np.float16), (2,0,1))[np.newaxis]
# Run inference
depth = session.run([output_name], {input_name: img})[0].squeeze().astype(np.float32)
# Clip extreme values and normalize to [0, 1]
depth = np.clip(np.nan_to_num(depth, nan=0.0), -1e3, 1e3)
depth_norm = (depth - depth.min()) / max(depth.max() - depth.min(), 1e-6)
# Save an 8-bit PNG for a smaller file size
cv2.imwrite('depth_frame_0001.png', (depth_norm * 255).round().astype(np.uint8))
# Save a 16-bit TIFF for higher precision
cv2.imwrite('depth_frame_0001.tif', (depth_norm * 65535).round().astype(np.uint16), [cv2.IMWRITE_TIFF_COMPRESSION, cv2.IMWRITE_TIFF_COMPRESSION_DEFLATE])
print('Depth map saved')
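Because near pixels are bright, the normalized map `depth_norm` can act directly as a disparity signal. The following is a minimal sketch of that idea, assuming a simple per-pixel horizontal offset proportional to `depth_norm`; the function name `shift_view` and the `max_shift` parameter are illustrative, and this is not the warping method of any particular tool.

```python
import numpy as np


def shift_view(image: np.ndarray, depth_norm: np.ndarray, max_shift: int = 16) -> np.ndarray:
    """Synthesize a second view by sampling each pixel from a horizontally offset column.

    image:      HxWx3 array
    depth_norm: HxW array in [0, 1], where 1 is nearest
    max_shift:  offset in pixels applied to the nearest points (hypothetical default)
    """
    h, w = depth_norm.shape
    offsets = np.rint(depth_norm * max_shift).astype(np.int64)  # near pixels shift further
    cols = np.arange(w)[np.newaxis, :] + offsets                 # source column for each pixel
    cols = np.clip(cols, 0, w - 1)                               # stay inside the frame
    rows = np.arange(h)[:, np.newaxis]
    return image[rows, cols]                                     # gather, so no holes appear
```

Sampling backwards (each output pixel looks up its source) avoids the holes that a forward scatter would leave, at the cost of some stretching around depth edges. Production converters typically add occlusion handling and smoothing on top of this.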
Benchmarks: Speed, Size, and Depth Map Quality
Benchmarked on an AMD Radeon RX 7900 XTX using ONNX Runtime v1.23.0 with DirectML.
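Throughput in images per minute can be derived from wall-clock timing of repeated inference calls. The sketch below shows one way to measure it; the warm-up count, the run count, and the helper names `images_per_minute` and `seconds_per_image` are assumptions, not the procedure used to produce the numbers in this section.

```python
import time
from typing import Callable


def images_per_minute(run_once: Callable[[], object], runs: int = 10, warmup: int = 2) -> float:
    """Time `runs` calls of run_once (after `warmup` untimed calls) and return images/minute."""
    for _ in range(warmup):
        run_once()  # untimed: lets the runtime finish one-off setup work
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    elapsed = max(time.perf_counter() - start, 1e-9)  # guard against a zero interval
    return runs * 60.0 / elapsed


def seconds_per_image(throughput_per_minute: float) -> float:
    """Convert images/minute to seconds per image."""
    return 60.0 / throughput_per_minute
```

With the names from the Quick Start, a measurement would look like `images_per_minute(lambda: session.run([output_name], {input_name: img}))`. For reference, 75.7 images per minute corresponds to roughly 0.79 seconds per image.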
DepthPro Models Without Post-Processing
<table width="100%">
<tr> <th>Model</th> <th width="120">Throughput</th> <th width="100">Model size</th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/source.jpg"><img src="examples/sample1/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/source.jpg"><img src="examples/sample2/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/source.jpg"><img src="examples/sample3/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/source.jpg"><img src="examples/sample4/source.jpg" width="128" /></a></th> </tr>
<tr> <td><a href="https://huggingface.co/apple/DepthPro-hf">apple/DepthPro-hf</a></td> <td>1.5 images/min</td> <td>1.8 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/DepthPro-hf_nopost_plasma.png"><img src="examples/sample1/DepthPro-hf_nopost_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/DepthPro-hf_nopost_plasma.png"><img src="examples/sample2/DepthPro-hf_nopost_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/DepthPro-hf_nopost_plasma.png"><img src="examples/sample3/DepthPro-hf_nopost_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/DepthPro-hf_nopost_plasma.png"><img src="examples/sample4/DepthPro-hf_nopost_plasma.png" width="128" /></a></td> </tr>
<tr> <td>Owl3D Precision V2</td> <td>9.6 images/min</td> <td>1.2 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/owl3d_plasma.png"><img src="examples/sample1/owl3d_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/owl3d_plasma.png"><img src="examples/sample2/owl3d_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/owl3d_plasma.png"><img src="examples/sample3/owl3d_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/owl3d_plasma.png"><img src="examples/sample4/owl3d_plasma.png" width="128" /></a></td> </tr>
<tr> <td><strong>This model</strong></td> <td><strong>75.7 images/min</strong></td> <td>1.2 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png"><img src="examples/sample1/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png"><img src="examples/sample2/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png"><img src="examples/sample3/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png"><img src="examples/sample4/depthpro_1536x1536_bs1_fp16_opset21_optimized_plasma.png" width="128" /></a></td> </tr>
</table>
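DepthPro Models With Post-Processing
DepthPro's post-processing step uses field-of-view information to calibrate the depth values and then normalizes the output. This can cause severe artifacts:
- Compressed contrast: extreme depth outliers (for example 10,000 m, compared with the typical maximum of roughly 130 m observed across a variety of scenes) cause normalization to squeeze the useful depth information into a narrow range, mapping most pixels to extreme near values
- Inconsistent results: these artifacts appear unpredictably, especially in quantized models, but they also occur in the full-precision versions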
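The contrast compression described above is easy to reproduce with synthetic numbers. The sketch below min-max normalizes a made-up depth map in metres, once without and once with a single 10,000 m outlier. The array values and the percentile-clipping alternative are illustrative assumptions; they are neither DepthPro's actual post-processing nor the method used by this model.

```python
import numpy as np


def minmax(values: np.ndarray) -> np.ndarray:
    """Scale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = values.min(), values.max()
    return (values - lo) / max(hi - lo, 1e-6)


# Synthetic scene: depths spread evenly between 1 m and 130 m.
scene = np.linspace(1.0, 130.0, 1000)
with_outlier = scene.copy()
with_outlier[-1] = 10_000.0  # one extreme outlier pixel

clean_norm = minmax(scene)
bad_norm = minmax(with_outlier)

# Share of the 0..1 output range used by every pixel except the last one.
clean_span = clean_norm[:-1].max() - clean_norm[:-1].min()
bad_span = bad_norm[:-1].max() - bad_norm[:-1].min()
print(f"range used without outlier: {clean_span:.3f}")
print(f"range used with outlier:    {bad_span:.3f}")

# One possible mitigation (an assumption, not part of this model): clip to a percentile first.
ceiling = np.percentile(with_outlier, 99.0)
robust_norm = minmax(np.clip(with_outlier, None, ceiling))
robust_span = robust_norm[:-1].max() - robust_norm[:-1].min()
print(f"range used after clipping:  {robust_span:.3f}")
```

A single outlier pixel is enough to push all remaining pixels into a small fraction of the output range, which is why a map that skips this normalization step is not exposed to the problem.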
The following models use post-processing and may show these problems, depending on the scene:
<table width="100%">
<tr> <th>Model</th> <th width="120">Throughput</th> <th width="100">Model size</th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/source.jpg"><img src="examples/sample1/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/source.jpg"><img src="examples/sample2/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/source.jpg"><img src="examples/sample3/source.jpg" width="128" /></a></th> <th width="128"><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/source.jpg"><img src="examples/sample4/source.jpg" width="128" /></a></th> </tr>
<tr> <td><a href="https://huggingface.co/apple/DepthPro-hf">apple/DepthPro-hf</a></td> <td>1.5 images/min</td> <td>1.8 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/DepthPro-hf_plasma.png"><img src="examples/sample1/DepthPro-hf_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/DepthPro-hf_plasma.png"><img src="examples/sample2/DepthPro-hf_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/DepthPro-hf_plasma.png"><img src="examples/sample3/DepthPro-hf_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/DepthPro-hf_plasma.png"><img src="examples/sample4/DepthPro-hf_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_fp16.onnx</a></td> <td>69.4 images/min</td> <td>1.8 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_fp16_plasma.png"><img src="examples/sample1/model_fp16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_fp16_plasma.png"><img src="examples/sample2/model_fp16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_fp16_plasma.png"><img src="examples/sample3/model_fp16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_fp16_plasma.png"><img src="examples/sample4/model_fp16_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_q4f16.onnx</a></td> <td>52.9 images/min</td> <td><strong>0.6 GB</strong></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_q4f16_plasma.png"><img src="examples/sample1/model_q4f16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_q4f16_plasma.png"><img src="examples/sample2/model_q4f16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_q4f16_plasma.png"><img src="examples/sample3/model_q4f16_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_q4f16_plasma.png"><img src="examples/sample4/model_q4f16_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model.onnx</a></td> <td>44.0 images/min</td> <td>3.5 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_plasma.png"><img src="examples/sample1/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_plasma.png"><img src="examples/sample2/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_plasma.png"><img src="examples/sample3/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_plasma.png"><img src="examples/sample4/model_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_q4.onnx</a></td> <td>33.3 images/min</td> <td>0.7 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_q4_plasma.png"><img src="examples/sample1/model_q4_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_q4_plasma.png"><img src="examples/sample2/model_q4_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_q4_plasma.png"><img src="examples/sample3/model_q4_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_q4_plasma.png"><img src="examples/sample4/model_q4_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_quantized.onnx</a></td> <td>17.3 images/min</td> <td>0.9 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_quantized_plasma.png"><img src="examples/sample1/model_quantized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_quantized_plasma.png"><img src="examples/sample2/model_quantized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_quantized_plasma.png"><img src="examples/sample3/model_quantized_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_quantized_plasma.png"><img src="examples/sample4/model_quantized_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_uint8.onnx</a></td> <td>17.3 images/min</td> <td>0.9 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_uint8_plasma.png"><img src="examples/sample1/model_uint8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_uint8_plasma.png"><img src="examples/sample2/model_uint8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_uint8_plasma.png"><img src="examples/sample3/model_uint8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_uint8_plasma.png"><img src="examples/sample4/model_uint8_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_int8.onnx</a></td> <td>15.9 images/min</td> <td>0.9 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_int8_plasma.png"><img src="examples/sample1/model_int8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_int8_plasma.png"><img src="examples/sample2/model_int8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_int8_plasma.png"><img src="examples/sample3/model_int8_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_int8_plasma.png"><img src="examples/sample4/model_int8_plasma.png" width="128" /></a></td> </tr>
<tr> <td><a href="https://huggingface.co/onnx-community/DepthPro-ONNX">DepthPro-ONNX - model_bnb4.onnx</a></td> <td>1.3 images/min</td> <td>0.6 GB</td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample1/model_plasma.png"><img src="examples/sample1/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample2/model_plasma.png"><img src="examples/sample2/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample3/model_plasma.png"><img src="examples/sample3/model_plasma.png" width="128" /></a></td> <td><a href="https://huggingface.co/Jens-Duttke/DepthPro-ONNX-HighPerf/resolve/main/examples/sample4/model_plasma.png"><img src="examples/sample4/model_plasma.png" width="128" /></a></td> </tr>
</table>
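License / Terms of Use
This ONNX version of DepthPro is licensed under the Apple Machine Learning Research Model License.
- Use is restricted to non-commercial scientific research and academic development.
- Redistribution is permitted only if this license is included.
- Apple's trademarks, logos, and names must not be used to promote derivative models.
- Commercial use, product integration, and service deployment are not permitted.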
Jens-Duttke/DepthPro-ONNX-HighPerf
Author: Jens-Duttke
Created: 2026-01-07 15:26:32+00:00
Updated: 2026-01-30 06:56:58+00:00