README

Kelp-RGBI: Kelp Segmentation Model for RGB+NIR Drone Imagery

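Model type: ONNX semantic segmentation
Application: Kelp forest detection in 4-band RGB+NIR aerial imagery
Input: 4-band imagery (red, green, blue, near-infrared)
Output: Binary segmentation mask (kelp vs. non-kelp)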

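Model Description

The Kelp-RGBI model is a deep learning semantic segmentation model trained specifically to detect kelp forests in 4-band RGB+NIR drone imagery. It uses the additional near-infrared band to improve kelp detection accuracy, particularly in challenging water conditions and for submerged kelp.

Key features:

- Optimized for 4-band RGB+NIR imagery from multispectral drones
- Uses min-max normalization for robust performance across different sensors
- Efficient ONNX format for cross-platform deployment
- Uses NIR spectral information to improve accuracy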

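Model Details

- Version: 20231214
- Input channels: 4 (RGB + near-infrared)
- Input size: Dynamic tiling (recommended: 2048x2048 tiles)
- Normalization: Min-max normalization
- Output: Multi-class segmentation (0: background, 1: giant kelp, 2: bull kelp)
- Format: ONNX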

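Normalization Parameters

The model uses min-max normalization applied per image.

Each input image is scaled to the [0, 1] range using the following formula: (pixel - band_min_value) / (band_max_value - band_min_value)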
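
The formula can be sketched in NumPy as follows. This is a minimal illustration, assuming that "band" means each channel of a height x width x 4 array is scaled by its own minimum and maximum. The helper name `minmax_per_band` and the small `eps` guard against constant bands are illustrative choices, not part of kelp-o-matic.

```python
import numpy as np


def minmax_per_band(image, eps=1e-8):
    """Scale each band of an HWC image to [0, 1] using that band's own min and max."""
    image = image.astype(np.float32)
    # Reduce over height and width only, keeping one min/max value per band.
    band_min = image.min(axis=(0, 1), keepdims=True)  # shape (1, 1, bands)
    band_max = image.max(axis=(0, 1), keepdims=True)
    # eps (an assumption) avoids division by zero when a band is constant.
    return (image - band_min) / (band_max - band_min + eps)


# Example: a synthetic 16-bit, 4-band tile
rng = np.random.default_rng(0)
tile = rng.integers(0, 65535, size=(8, 8, 4), dtype=np.uint16)
scaled = minmax_per_band(tile)
```

Scaling each band separately keeps a bright NIR band from compressing the visible bands, which is one plausible reason for a per-band formula. Whether the packaged preprocessing works exactly this way is not confirmed here.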

Usage

1. Using the kelp-o-matic CLI (recommended)

Command-line usage:

# Install kelp-o-matic
pip install git+https://github.com/HakaiInstitute/kelp-o-matic@dev

# List available models
kom list-models

# Run kelp species segmentation on RGB+NIR drone imagery.
# The -b flags rearrange the input bands into Red, Green, Blue, NIR order.
kom segment \
    --model kelp-rgbi \
    --input /path/to/rgbi_drone_image.tif \
    --output /path/to/kelp_species_segmentation.tif \
    --batch-size 6 \
    --crop-size 2048 \
    --blur-kernel 5 \
    --morph-kernel 3 \
    -b 1 \
    -b 2 \
    -b 3 \
    -b 4

# Use specific model version
kom segment \
    --model kelp-rgbi \
    --version 20231214 \
    --input image.tif \
    --output result.tif

# For high-resolution multispectral imagery.
# The -b flags below reorder a BGRI input into RGBI.
kom segment \
    --model kelp-rgbi \
    --input high_res_multispectral.tif \
    --output result.tif \
    --batch-size 4 \
    --crop-size 1024 \
    -b 3 \
    -b 2 \
    -b 1 \
    -b 4

2. Using the kelp-o-matic Python API

The simplest way to use this model from Python is through the kelp-o-matic package:

from kelp_o_matic import model_registry

# Load the model (automatically downloads if needed)
model = model_registry["kelp-rgbi"]

# Process a large multispectral image with automatic tiling
model.process(
    input_path="path/to/your/rgbi_drone_image.tif",
    output_path="path/to/output/kelp_species_segmentation.tif",
    batch_size=6,  # Moderate batch size for 4-band
    crop_size=2048,
    blur_kernel_size=5,  # Post-processing median blur
    morph_kernel_size=3,  # Morphological operations
    band_order=[1, 2, 3, 4],  # Ensure RGBI order
)

# For more control, use the predict method directly
import rasterio
import numpy as np

with rasterio.open("multispectral_image.tif") as src:
    # Read a 2048x2048 tile (4 bands: RGBI); rasterio returns channels-first (CHW)
    tile = src.read(window=((0, 2048), (0, 2048)))  # Shape: (4, 2048, 2048)

    # Add a batch dimension to get BCHW
    batch = np.expand_dims(tile, axis=0)  # Shape: (1, 4, 2048, 2048)
    
    # Run inference (preprocessing handled automatically)
    predictions = model.predict(batch)
    
    # Post-process to get final segmentation
    segmentation = model.postprocess(predictions)
    # Result: 0=background, 1=giant kelp, 2=bull kelp

3. Using ONNX Runtime directly

import numpy as np
import onnxruntime as ort
from huggingface_hub import hf_hub_download

# Download the model
model_path = hf_hub_download(repo_id="HakaiInstitute/kelp-rgbi", filename="model.onnx")

# Load the model
session = ort.InferenceSession(model_path)

# Preprocess your 4-band image
def preprocess(image):
    """
    Preprocess 4-band RGBI image for model input
    image: numpy array of shape [height, width, 4] with any pixel value range
    """
    # Convert to float32 so the normalization below runs in floating point
    image = image.astype(np.float32)
    
    # Apply min-max normalization per image
    img_min = image.min()
    img_max = image.max()
    image = (image - img_min) / (img_max - img_min + 1e-8)
    
    # Reshape to model input format [batch, channels, height, width]
    image = np.transpose(image, (2, 0, 1))  # HWC to CHW
    image = np.expand_dims(image, axis=0)  # Add batch dimension
    
    return image

# Run inference
preprocessed = preprocess(your_4band_image)
input_name = session.get_inputs()[0].name
output = session.run(None, {input_name: preprocessed})

# Postprocess to get class predictions
logits = output[0]  # Raw per-class scores, shape [batch, classes, height, width]
prediction = np.argmax(logits, axis=1).squeeze(0).astype(np.uint8)
# Result: 0=background, 1=giant kelp, 2=bull kelp

4. Using the Hugging Face Hub integration

from huggingface_hub import hf_hub_download
import onnxruntime as ort

# Download and load model
model_path = hf_hub_download(
    repo_id="HakaiInstitute/kelp-rgbi",
    filename="model.onnx",
    cache_dir="./models"
)

session = ort.InferenceSession(model_path)
# ... continue with preprocessing and inference as above

Installation

With kelp-o-matic:

# Via pip
pip install git+https://github.com/HakaiInstitute/kelp-o-matic@dev

With ONNX Runtime directly:

pip install onnxruntime huggingface-hub numpy
# For GPU support:
pip install onnxruntime-gpu

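Input Requirements

- Image format: 4-band raster (GeoTIFF recommended)
- Band order: Red, green, blue, near-infrared
- Pixel values: Any range (the model uses min-max normalization)
- Spatial resolution: Optimized for high-resolution drone imagery (centimeter scale)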
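
When an image does not already store its bands in the required order, they can be rearranged before inference. The sketch below assumes a channels-first NumPy array and 1-based band numbers in the style of the `-b` flags and `band_order` argument shown earlier. The helper `reorder_bands` is hypothetical and not part of kelp-o-matic.

```python
import numpy as np


def reorder_bands(image_chw, band_order):
    """Return a copy of a CHW array with bands arranged per 1-based band_order."""
    indices = [b - 1 for b in band_order]  # 1-based band numbers -> 0-based indices
    if len(indices) != 4:
        raise ValueError("expected exactly 4 bands (R, G, B, NIR)")
    if min(indices) < 0 or max(indices) >= image_chw.shape[0]:
        raise ValueError("band_order refers to a band that is not in the image")
    return image_chw[indices]  # list indexing selects and reorders along axis 0


# Example: a tiny BGRI image where each band is filled with a marker value
bgri = np.stack([np.full((2, 2), v) for v in (30, 20, 10, 40)])  # B=30, G=20, R=10, NIR=40
rgbi = reorder_bands(bgri, [3, 2, 1, 4])  # pick bands 3, 2, 1, 4 -> R, G, B, NIR
```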

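Output Format

- Type: Single-band raster with class labels
- Values:
  - 0: Background (water, other features)
  - 1: Macrocystis pyrifera (giant kelp)
  - 2: Nereocystis luetkeana (bull kelp)
- Format: Matches the input raster format and projection
- Spatial resolution: Same as input

Note: The model outputs class probabilities, but kelp-o-matic automatically applies argmax to convert them to discrete class labels.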

Performance Notes

- Dynamic tile size: Supports flexible tile sizes (recommended: 2048x2048 or 1024x1024)
- Batch size: Start at 4 and adjust based on available GPU memory

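Large Image Processing

When processing large geospatial images, the kelp-o-matic package handles:

- Automatic tiling: Splits large images into manageable tiles
- Overlap handling: Uses overlapping tiles to avoid edge artifacts
- Memory management: Processes tiles in batches to manage memory usage
- Geospatial metadata: Preserves the coordinate reference system and geotransform
- Post-processing: Optional median filtering and morphological operations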
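
To make the tiling and overlap steps concrete, here is a minimal sketch of one way to generate overlapping tile windows over a large image. It is not kelp-o-matic's actual implementation: the function names, the 256-pixel overlap, and the choice to shift the last tile back to the image edge are all assumptions made for illustration. The windows use the same ((row_start, row_stop), (col_start, col_stop)) form that the rasterio example above passes to `src.read`.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` covering `length` pixels, sharing `overlap` pixels."""
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be in [0, tile)")
    if length <= tile:
        return [0]  # a single (possibly short) tile covers the whole axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # shift the final tile back so it ends at the image edge
    return starts


def tile_windows(height, width, tile=2048, overlap=256):
    """Yield ((row_start, row_stop), (col_start, col_stop)) windows covering the image."""
    for r in tile_origins(height, tile, overlap):
        for c in tile_origins(width, tile, overlap):
            yield (r, min(r + tile, height)), (c, min(c + tile, width))


# Example: a 5000 x 3000 pixel image split into 2048-pixel tiles
windows = list(tile_windows(5000, 3000, tile=2048, overlap=256))
```

Shifting the last tile back, rather than padding it, keeps every tile at full size when the image is larger than one tile, at the cost of a larger overlap on the final row and column.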

Citation

If you use this model in your research, please cite:

@software{Denouden_Kelp-O-Matic,
  author = {Denouden, Taylor and Reshitnyk, Luba},
  doi = {10.5281/zenodo.7672166},
  title = {{Kelp-O-Matic}},
  url = {https://github.com/HakaiInstitute/kelp-o-matic}
}

License

MIT License. See the kelp-o-matic repository for details.

Contact

For questions or issues:

- Open an issue on the GitHub repository
- Contact: Hakai Institute

HakaiInstitute/kelp-rgbi

Author: HakaiInstitute

image-segmentation

Created: 2025-07-28 21:13:35+00:00

Updated: 2025-09-03 20:06:34+00:00
