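Mussel-Gooseneck-RGB: Multi-Species Segmentation Model for RGB Drone Imagery

Model type: ONNX semantic segmentation
Application: Detecting mussels and gooseneck barnacles in high-resolution RGB aerial imagery
Input: 3-band RGB imagery (red, green, blue)
Output: Multi-class segmentation mask (background, mussels, gooseneck barnacles)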

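Model Description

Mussel-Gooseneck-RGB is a deep learning semantic segmentation model built to detect and distinguish mussels and gooseneck barnacles in RGB drone imagery. By producing species-level segmentation maps, it supports detailed intertidal monitoring and research for marine habitat assessment.

Key features:

Multi-species classification (mussels vs. gooseneck barnacles)
Optimized for standard RGB drone imagery
Input normalization with ImageNet statistics
Efficient ONNX format for cross-platform deployment
Designed for intertidal monitoring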

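Model Details

Version: 20250725
Input channels: 3 (RGB)
Input size: Dynamic tiling (recommended: 2048x2048 tiles)
Normalization: Standard (ImageNet statistics)
Output: Multi-class segmentation (0: background, 1: mussels, 2: gooseneck barnacles)
Format: ONNX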

Normalization Parameters

The model expects input images normalized with ImageNet statistics:

{
  "mean": [0.485, 0.456, 0.406],
  "std": [0.229, 0.224, 0.225],
  "max_pixel_value": 255.0
}
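As a worked illustration of how these three values combine, the sketch below scales 8-bit values by max_pixel_value and then standardizes each channel with mean and std. The function name normalize_pixels is hypothetical and only for illustration; the arithmetic follows the preprocessing shown in the ONNX Runtime usage example.

```python
import numpy as np

MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)
MAX_PIXEL_VALUE = 255.0


def normalize_pixels(rgb):
    """Scale 0-255 RGB values to 0-1, then standardize per channel.

    Hypothetical helper: rgb is any array whose last axis holds R, G, B.
    """
    scaled = np.asarray(rgb, dtype=np.float32) / MAX_PIXEL_VALUE
    return (scaled - MEAN) / STD


# A mid-gray pixel: every channel equals 127.5 / 255 = 0.5 before standardization.
gray = normalize_pixels([127.5, 127.5, 127.5])
```

Keeping mean and std as float32 keeps the result float32, which is the dtype ONNX models typically expect for image inputs.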

Usage

1. Using the kelp-o-matic CLI (recommended)

From the command line:

# Install kelp-o-matic
pip install git+https://github.com/HakaiInstitute/kelp-o-matic@dev

# List available models
kom list-models

# Segment an intertidal RGB drone image
kom segment \
    --model mussel-gooseneck-rgb \
    --input /path/to/intertidal_rgb_image.tif \
    --output /path/to/species_segmentation.tif \
    --batch-size 8 \
    --crop-size 2048 \
    --blur-kernel 3 \
    --morph-kernel 3

# Use a specific model version
kom segment \
    --model mussel-gooseneck-rgb \
    --version 20250725 \
    --input image.tif \
    --output result.tif

# For high-resolution intertidal surveys
kom segment \
    --model mussel-gooseneck-rgb \
    --input high_res_intertidal_survey.tif \
    --output result.tif \
    --batch-size 4 \
    --crop-size 1024
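
When a survey consists of many rasters, the same command can be driven from a short script. The sketch below is one possible way to do that, not part of kelp-o-matic: the helper names, the `_segmentation.tif` output suffix, and the `*.tif` glob are assumptions, and only flags shown in the commands above are used.

```python
import subprocess
from pathlib import Path


def build_segment_command(input_path, output_dir, crop_size=2048):
    """Build the `kom segment` argument list for one input raster.

    Hypothetical helper; the output file name pattern is an assumption.
    """
    input_path = Path(input_path)
    output_path = Path(output_dir) / f"{input_path.stem}_segmentation.tif"
    return [
        "kom", "segment",
        "--model", "mussel-gooseneck-rgb",
        "--input", str(input_path),
        "--output", str(output_path),
        "--crop-size", str(crop_size),
    ]


def segment_directory(input_dir, output_dir, crop_size=2048):
    """Run `kom segment` once per .tif file in input_dir (needs kelp-o-matic installed)."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for tif in sorted(Path(input_dir).glob("*.tif")):
        # check=True stops the loop if any single run fails
        subprocess.run(build_segment_command(tif, output_dir, crop_size), check=True)
```

Calling `segment_directory("surveys", "results", crop_size=1024)` would then process every GeoTIFF in a hypothetical `surveys` folder in sorted order.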

2. Using the kelp-o-matic Python API

The simplest way to use the model from Python is through the kelp-o-matic package:

from kelp_o_matic import model_registry

# Load the model (downloaded automatically if needed)
model = model_registry["mussel-gooseneck-rgb"]

# Process a large intertidal survey image with automatic tiling
model.process(
    input_path="path/to/your/intertidal_rgb_image.tif",
    output_path="path/to/output/species_segmentation.tif",
    batch_size=8,  # RGB imagery allows a larger batch size
    crop_size=2048,
    blur_kernel_size=3,  # gentle post-processing
    morph_kernel_size=3,  # morphological operations
)

# For more control, call the predict method directly
import rasterio
import numpy as np

with rasterio.open("intertidal_image.tif") as src:
    # Read one 2048x2048 tile (3 bands: RGB); rasterio returns it in CHW order
    tile = src.read(window=((0, 2048), (0, 2048)))  # shape: (3, 2048, 2048)

    # Add a batch dimension to get BCHW; no transpose is needed
    batch = np.expand_dims(tile, axis=0)  # shape: (1, 3, 2048, 2048)

    # Run inference (preprocessing is handled automatically)
    predictions = model.predict(batch)

    # Post-process to get the final segmentation
    segmentation = model.postprocess(predictions)
    # The result holds class labels: 0=background, 1=mussels, 2=gooseneck barnacles

3. Using ONNX Runtime directly

import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download
from PIL import Image

# Download the model
model_path = hf_hub_download(repo_id="HakaiInstitute/mussel-gooseneck-rgb", filename="model.onnx")

# Load the model
session = ort.InferenceSession(model_path)

# ImageNet normalization parameters (float32 so the model input stays float32)
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Preprocess an RGB image
def preprocess(image):
    """
    Preprocess an RGB image for model input.
    image: numpy array of shape [height, width, 3] with pixel values in 0-255
    """
    # Scale to 0-1
    image = image.astype(np.float32) / 255.0

    # Apply ImageNet normalization
    image = (image - mean) / std

    # Reshape to the model input layout [batch, channels, height, width]
    image = np.transpose(image, (2, 0, 1))  # HWC to CHW
    image = np.expand_dims(image, axis=0)  # add batch dimension

    return image

# Load and preprocess the image (convert() drops any alpha channel)
image = np.array(Image.open("intertidal_drone_image.jpg").convert("RGB"))
preprocessed = preprocess(image)

# Run inference
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: preprocessed})

# Post-process to get the multi-class mask
logits = output[0]
prediction = np.argmax(logits, axis=1).squeeze(0).astype(np.uint8)
# Result: 0=background, 1=mussels, 2=gooseneck barnacles

4. Using the Hugging Face Hub integration

from huggingface_hub import hf_hub_download
import onnxruntime as ort

# Download and load the model
model_path = hf_hub_download(
    repo_id="HakaiInstitute/mussel-gooseneck-rgb",
    filename="model.onnx",
    cache_dir="./models"
)

session = ort.InferenceSession(model_path)
# ... continue with preprocessing and inference as described above

Installation

For use with kelp-o-matic:

# Install with pip
pip install git+https://github.com/HakaiInstitute/kelp-o-matic@dev

For direct ONNX use:

pip install onnxruntime huggingface-hub numpy pillow
# For GPU support (install in place of onnxruntime):
pip install onnxruntime-gpu
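
With the GPU package installed, a session can be asked to prefer CUDA and fall back to the CPU. The following is a minimal sketch under that assumption; choose_providers and open_session are hypothetical helper names, while `ort.get_available_providers()` and the `providers` argument of `InferenceSession` are standard ONNX Runtime calls.

```python
def choose_providers(available):
    """Prefer the CUDA execution provider when present, otherwise use the CPU."""
    preferred = ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return [name for name in preferred if name in available]


def open_session(model_path):
    """Create an ONNX Runtime session with the best available provider."""
    # Imported lazily so choose_providers can be used without onnxruntime installed.
    import onnxruntime as ort

    providers = choose_providers(ort.get_available_providers())
    return ort.InferenceSession(model_path, providers=providers)
```

On a machine without a usable GPU, the provider list reduces to the CPU provider, so the same code runs in both environments.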

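Input Requirements

Image format: 3-band RGB raster (JPEG, PNG, GeoTIFF)
Band order: red, green, blue
Pixel values: standard 8-bit (0-255)
Spatial resolution: optimized for high-resolution drone imagery (centimeter scale)
Scene type: intertidal areas with exposed mussel beds and barnacle colonies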
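
The first three requirements can be checked before inference. The sketch below assumes the image has already been loaded as a height-width-channel NumPy array; check_rgb_input is a hypothetical helper, not part of kelp-o-matic.

```python
import numpy as np


def check_rgb_input(image):
    """Return image unchanged if it is an (H, W, 3) uint8 array, else raise ValueError."""
    if image.ndim != 3 or image.shape[2] != 3:
        raise ValueError(f"expected shape (H, W, 3), got {image.shape}")
    if image.dtype != np.uint8:
        raise ValueError(f"expected 8-bit pixels (uint8), got {image.dtype}")
    return image
```

A 4-band image (for example RGB plus alpha) or a 16-bit raster would fail this check and would need to be converted first.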

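Output Format

Type: single-band raster of class labels
Values:
- 0: background (water, rock, other substrate)
- 1: mussels (typically blue mussels, Mytilus spp.)
- 2: gooseneck barnacles (Pollicipes spp.)
Format: matches the input raster format and projection
Spatial resolution: same as the input

Note: The model itself outputs class probabilities; kelp-o-matic applies argmax automatically to convert them to discrete class labels.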
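
To make the argmax step concrete, the sketch below converts a small array of per-class scores into a label map and counts the pixels in each class. The helper names and the toy scores are illustrative assumptions; the class codes are the ones listed above.

```python
import numpy as np

CLASS_NAMES = {0: "background", 1: "mussels", 2: "gooseneck barnacles"}


def scores_to_labels(scores):
    """Convert a (classes, H, W) score array into an (H, W) uint8 label map."""
    return np.argmax(scores, axis=0).astype(np.uint8)


def class_pixel_counts(labels):
    """Count the pixels assigned to each class in a label map."""
    counts = np.bincount(labels.ravel(), minlength=len(CLASS_NAMES))
    return {CLASS_NAMES[i]: int(counts[i]) for i in CLASS_NAMES}


# Toy scores for a 1x3 image; a different class wins at each pixel.
scores = np.array([
    [[0.90, 0.10, 0.20]],  # background
    [[0.05, 0.80, 0.30]],  # mussels
    [[0.05, 0.10, 0.50]],  # gooseneck barnacles
])
labels = scores_to_labels(scores)
counts = class_pixel_counts(labels)
```

Multiplying each count by the ground area of one pixel would give an approximate coverage estimate per species.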

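Performance Notes

Dynamic tile size: flexible tile dimensions are supported (recommended: 2048x2048 or 1024x1024)
Batch size: start at 4 and adjust to the available GPU memory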

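Large Image Processing

For large geospatial images, the kelp-o-matic package handles:

Automatic tiling: splits large images into manageable tiles
Overlap handling: uses overlapping tiles to avoid edge artifacts
Memory management: processes tiles in batches to control memory use
Geospatial metadata: preserves the coordinate reference system and geotransform
Post-processing: optional median filtering and morphological operations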
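
The tiling and overlap steps can be sketched as a generator of overlapping windows. This illustrates the general technique only and is not kelp-o-matic's actual implementation; the function names and the default overlap of 256 pixels are assumptions.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of tiles of size `tile` covering [0, length), sharing `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile sits flush with the image edge
    return starts


def tile_windows(height, width, tile=2048, overlap=256):
    """Yield (row_start, col_start, rows, cols) windows that cover the whole image."""
    for row in tile_starts(height, tile, overlap):
        for col in tile_starts(width, tile, overlap):
            yield row, col, min(tile, height - row), min(tile, width - col)
```

Each window can be read, segmented, and written back, with predictions in the shared margins blended or cropped so that tile borders do not show in the final mask.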

Citation

If you use this model in your research, please cite:

@software{Denouden_Kelp-O-Matic,
  author = {Denouden, Taylor and Reshitnyk, Luba},
  doi = {10.5281/zenodo.7672166},
  title = {{Kelp-O-Matic}},
  url = {https://github.com/HakaiInstitute/kelp-o-matic}
}

License

MIT License. See the kelp-o-matic repository for details.

Contact

For questions or feedback:

Open an issue on the GitHub repository
Contact: Hakai Institute

HakaiInstitute/mussel-gooseneck-rgb

Author: HakaiInstitute

image-segmentation

Created: 2025-07-28 21:15:49+00:00

Updated: 2026-03-03 22:38:52+00:00

Files (4)

.gitattributes

README.md

model.onnx ONNX

segformer_b3_mussels_goosenecks_best_model_sa2o16ca.ckpt