
<style>

.title { font-size: 2.5em; text-align: center; color: #333; font-family: 'Helvetica Neue', sans-serif; text-transform: uppercase; letter-spacing: 0.1em; padding: 0.5em 0; background: transparent; }

.title span { background: -webkit-linear-gradient(45deg, #7ed56f, #28b485); -webkit-background-clip: text; -webkit-text-fill-color: transparent; }

.custom-table { table-layout: fixed; width: 100%; border-collapse: collapse; margin-top: 2em; }

.custom-table td { width: 50%; vertical-align: top; padding: 10px; box-shadow: 0px 0px 0px 0px rgba(0, 0, 0, 0.15); }

.custom-image-container { position: relative; width: 100%; margin-bottom: 0em; overflow: hidden; border-radius: 10px; transition: transform .7s; /* Smooth transition for the container */ }

.custom-image-container:hover { transform: scale(1.05); /* Scale the container on hover */ }

.custom-image { width: 100%; height: auto; object-fit: cover; border-radius: 10px; transition: transform .7s; margin-bottom: 0em; }

.nsfw-filter { filter: blur(8px); /* Apply a blur effect */ transition: filter 0.3s ease; /* Smooth transition for the blur effect */ }

.custom-image-container:hover .nsfw-filter { filter: none; /* Remove the blur effect on hover */ }

.overlay { position: absolute; bottom: 0; left: 0; right: 0; color: white; width: 100%; height: 40%; display: flex; flex-direction: column; justify-content: center; align-items: center; font-size: 1vw; font-weight: bold; text-align: center; opacity: 0; /* Hidden until the container is hovered */ background: linear-gradient(0deg, rgba(0, 0, 0, 0.8) 60%, rgba(0, 0, 0, 0) 100%); transition: opacity .5s; }

.custom-image-container:hover .overlay { opacity: 1; }

.overlay-text { background: linear-gradient(45deg, #7ed56f, #28b485); -webkit-background-clip: text; color: transparent; text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.7); }

.overlay-subtext { font-size: 0.75em; margin-top: 0.5em; font-style: italic; }

.overlay, .overlay-subtext { text-shadow: 2px 2px 4px rgba(0, 0, 0, 0.5); }

</style>

<h1 class="title"> <span>Animagine XL 3.1</span> </h1> <h1 class="title"> <span>ONNX Version</span> </h1> <table class="custom-table"> <tr> <td> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/yq_5AWegnLsGyCYyqJ-1G.png" alt="sample1"> </div> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/sp6w1elvXVTbckkU74v3o.png" alt="sample4"> </div> </td> <td> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/OYBuX1XzffN7Pxi4c75JV.png" alt="sample2"> </div> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/ytT3Oaf-atbqrnPIqz_dq.png" alt="sample3"> </div> </td> <td> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/0oRq204okFxRGECmrIK6d.png" alt="sample1"> </div> <div class="custom-image-container"> <img class="custom-image" src="https://cdn-uploads.huggingface.co/production/uploads/6365c8dbf31ef76df4042821/DW51m0HlDuAlXwu8H8bIS.png" alt="sample4"> </div> </td> </tr> </table>

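Animagine XL 3.1 is an update in the Animagine XL V3 series that builds on the previous version, Animagine XL 3.0. This open-source, anime-themed text-to-image model has been improved to generate higher-quality anime-style images. It includes a broader range of characters from well-known anime series, an optimized dataset, and new aesthetic tags for better image creation. Built on Stable Diffusion XL, Animagine XL 3.1 aims to be a valuable resource for anime fans, artists, and content creators by producing accurate and detailed representations of anime characters.

What is the difference between cagliostrolab/animagine-xl-3.1 and this repository? This repository contains the ONNX checkpoint version of the model.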

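Model Details

Developed by: Cagliostro Research Lab
In collaboration with: SeaArt.ai
Model type: Diffusion-based text-to-image generative model
Model description: Animagine XL 3.1 generates high-quality anime images from text prompts. It features enhanced hand anatomy, improved concept understanding, and advanced prompt interpretation.
License: Fair AI Public License 1.0-SD
Fine-tuned from: Animagine XL 3.0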

Jupyter Notebooks

Note: Colab and Sagemaker Studio Lab do not have enough VRAM or RAM to run inference.

Open the demo in Kaggle:

Open the demo in Google Colab:

🧨 Diffusers Installation

CPU Inference

First, install the required libraries:

pip install diffusers "optimum[onnxruntime]" --upgrade

Then run image generation with the following example code:

import os

from optimum.onnxruntime import ORTStableDiffusionXLPipeline

# Load the ONNX checkpoint from this repository and run it on the CPU.
base = "ecyht2/animagine-xl-3.1-onnx"
pipe = ORTStableDiffusionXLPipeline.from_pretrained(base)
pipe.to("cpu")

prompt = "1girl, souryuu asuka langley, neon genesis evangelion, solo, upper body, v, smile, looking at viewer, outdoors, night"
negative_prompt = "nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]"

# 832 x 1216 is one of the resolutions listed in this card.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=7,
    num_inference_steps=28,
).images[0]

# Create the output directory first; image.save does not create it.
os.makedirs("./output", exist_ok=True)
image.save("./output/asuka_test.png")

GPU Inference

First, install the required libraries:

pip install diffusers "optimum[onnxruntime-gpu]" --upgrade

Then run image generation with the following example code:

import os

from optimum.onnxruntime import ORTStableDiffusionXLPipeline

# Load the ONNX checkpoint from this repository and run it on a CUDA GPU.
base = "ecyht2/animagine-xl-3.1-onnx"
pipe = ORTStableDiffusionXLPipeline.from_pretrained(base)
pipe.to("cuda")

prompt = "1girl, souryuu asuka langley, neon genesis evangelion, solo, upper body, v, smile, looking at viewer, outdoors, night"
negative_prompt = "nsfw, lowres, (bad), text, error, fewer, extra, missing, worst quality, jpeg artifacts, low quality, watermark, unfinished, displeasing, oldest, early, chromatic aberration, signature, extra digits, artistic error, username, scan, [abstract]"

# 832 x 1216 is one of the resolutions listed in this card.
image = pipe(
    prompt,
    negative_prompt=negative_prompt,
    width=832,
    height=1216,
    guidance_scale=7,
    num_inference_steps=28,
).images[0]

# Create the output directory first; image.save does not create it.
os.makedirs("./output", exist_ok=True)
image.save("./output/asuka_test.png")
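The CPU and GPU examples differ only in the installed `optimum` extra and in the device passed to `pipe.to`. A minimal sketch of choosing between the two at runtime, assuming the provider names reported by `onnxruntime.get_available_providers()`. `pick_device` is a hypothetical helper for illustration, not part of Optimum:

```python
def pick_device(available_providers):
    """Return "cuda" when ONNX Runtime reports a CUDA provider, else "cpu"."""
    if "CUDAExecutionProvider" in available_providers:
        return "cuda"
    return "cpu"


# Usage (requires onnxruntime to be installed):
#   import onnxruntime
#   pipe.to(pick_device(onnxruntime.get_available_providers()))
print(pick_device(["CPUExecutionProvider"]))
```

The helper takes the provider list as an argument so it can be exercised without a GPU or an ONNX Runtime install.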

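Usage Guidelines

Tag Ordering

For best results, follow the structured prompt template below, because this is how the model was trained:

1girl/1boy, character name, from what series, everything else in any order.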
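The template above can be sketched as a small helper that joins tags in the trained order. `build_prompt` and its parameter names are hypothetical and only illustrate the ordering; the tags come from the example prompt in this card:

```python
def build_prompt(subject, character=None, series=None, extra_tags=()):
    """Join tags in the order: subject, character, series, everything else."""
    parts = [subject]
    if character:
        parts.append(character)
    if series:
        parts.append(series)
    # The remaining tags may appear in any order, so they are kept as given.
    parts.extend(extra_tags)
    return ", ".join(parts)


prompt = build_prompt(
    "1girl",
    character="souryuu asuka langley",
    series="neon genesis evangelion",
    extra_tags=["solo", "upper body", "v", "smile", "looking at viewer", "outdoors", "night"],
)
print(prompt)
```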

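Special Tags

Animagine XL 3.1 uses special tags to steer results toward a given quality, rating, creation date, and aesthetic. The model can generate images without these tags, but using them helps achieve better results.

Quality Modifiers

Quality tags now consider both scores and post ratings to ensure a balanced quality distribution. We've refined the labels for greater clarity, for example by changing "high quality" to "great quality".

Quality Modifier	Score Criterion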
`masterpiece`	> 95%
`best quality`	> 85% & ≤ 95%
`great quality`	> 75% & ≤ 85%
`good quality`	> 50% & ≤ 75%
`normal quality`	> 25% & ≤ 50%
`low quality`	> 10% & ≤ 25%
`worst quality`	≤ 10%
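The thresholds in the table can be read as a simple lookup. The sketch below assumes the input is already a percentage between 0 and 100; how that score is derived from raw scores and post ratings is not described here, and `quality_tag` is a hypothetical name:

```python
def quality_tag(score_percent):
    """Map a score percentage (0-100) to a quality modifier from the table."""
    # Each entry is (exclusive lower bound, tag), checked from best to worst.
    thresholds = [
        (95, "masterpiece"),
        (85, "best quality"),
        (75, "great quality"),
        (50, "good quality"),
        (25, "normal quality"),
        (10, "low quality"),
    ]
    for lower_bound, tag in thresholds:
        if score_percent > lower_bound:
            return tag
    return "worst quality"


print(quality_tag(90))
```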

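Rating Modifiers

We've also streamlined the rating tags for simplicity and clarity, aiming to establish global rules that can be applied across different models. For example, the tag "rating: general" is now simply "general", and "rating: sensitive" has been condensed to "sensitive".

Rating Modifier	Rating Criterion
`safe`	General
`sensitive`	Sensitive
`nsfw`	Questionable
`explicit, nsfw`	Explicit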
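As an illustration, the table can be expressed as a dictionary keyed by the rating criterion. The lowercase keys and the `rating_tag` helper are assumptions made for this sketch, not part of the model's tooling:

```python
# Rating criterion (lowercased) -> rating modifier, copied from the table above.
RATING_TO_TAG = {
    "general": "safe",
    "sensitive": "sensitive",
    "questionable": "nsfw",
    "explicit": "explicit, nsfw",
}


def rating_tag(rating):
    """Return the rating modifier for a rating criterion name."""
    return RATING_TO_TAG[rating.strip().lower()]


print(rating_tag("Questionable"))
```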

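Year Modifier

We've also redefined the year ranges to steer results more accurately toward specific modern or vintage anime art styles. This update simplifies the ranges, focusing on relevance to current and past eras.

Year Tag	Year Range
`newest`	2021 to 2024
`recent`	2018 to 2020
`mid`	2015 to 2017
`early`	2011 to 2014
`oldest`	2005 to 2010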
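A minimal sketch of the year ranges as a lookup, treating both ends of each range as inclusive. `year_tag` is a hypothetical helper, and returning `None` for years outside the table is an assumption of this sketch:

```python
def year_tag(year):
    """Map a creation year to the year tag from the table, or None if outside it."""
    ranges = [
        (2021, 2024, "newest"),
        (2018, 2020, "recent"),
        (2015, 2017, "mid"),
        (2011, 2014, "early"),
        (2005, 2010, "oldest"),
    ]
    for first, last, tag in ranges:
        if first <= year <= last:
            return tag
    return None


print(year_tag(2019))
```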

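Aesthetic Tags

We've enhanced the tagging system with aesthetic tags that refine content categorization based on visual appeal. These tags are derived from evaluations made by a specialized ViT (Vision Transformer) image classification model trained specifically on anime data. For this purpose, we used the model shadowlilac/aesthetic-shadow-v2, which assesses the aesthetic value of content before it is used for training. This ensures that each piece of content is not only relevant and accurate but also visually appealing.

Aesthetic Tag	Score Range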
`very aesthetic`	> 0.71
`aesthetic`	> 0.45 & < 0.71
`displeasing`	> 0.27 & < 0.45
`very displeasing`	≤ 0.27
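The score ranges can be sketched as a threshold check. The table leaves scores of exactly 0.71 and 0.45 unassigned; this sketch assumes they fall into the lower bucket, and `aesthetic_tag` is a hypothetical name:

```python
def aesthetic_tag(score):
    """Map an aesthetic score (0.0-1.0) to an aesthetic tag from the table."""
    # Assumption: boundary values 0.71 and 0.45 belong to the lower bucket.
    if score > 0.71:
        return "very aesthetic"
    if score > 0.45:
        return "aesthetic"
    if score > 0.27:
        return "displeasing"
    return "very displeasing"


print(aesthetic_tag(0.8))
```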

Dimensions	Aspect Ratio
`1024 x 1024`	1:1 Square
`1152 x 896`	9:7
`896 x 1152`	7:9
`1216 x 832`	19:13
`832 x 1216`	13:19
`1344 x 768`	7:4 Horizontal
`768 x 1344`	4:7 Vertical
`1536 x 640`	12:5 Horizontal
`640 x 1536`	5:12 Vertical
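When a target image has an aspect ratio that is not in the table, one option is to pick the closest listed size. The sketch below compares aspect ratios on a log scale so that wide and tall deviations are treated symmetrically; `closest_size` is a hypothetical helper, and the selection rule is a design choice of this sketch rather than something the model requires:

```python
import math

# (width, height) pairs copied from the table above.
SUPPORTED_SIZES = [
    (1024, 1024), (1152, 896), (896, 1152),
    (1216, 832), (832, 1216), (1344, 768),
    (768, 1344), (1536, 640), (640, 1536),
]


def closest_size(target_width, target_height):
    """Return the listed (width, height) whose aspect ratio is nearest the target."""
    target = math.log(target_width / target_height)
    return min(
        SUPPORTED_SIZES,
        key=lambda size: abs(math.log(size[0] / size[1]) - target),
    )


width, height = closest_size(1920, 1080)
print(width, height)
```

The resulting pair can be passed as `width` and `height` to the pipeline call shown in the inference examples.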

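Training and Hyperparameters

Animagine XL 3.1 was trained on 2x A100 80GB GPUs for approximately 15 days, totaling over 350 GPU hours. The training process consisted of three stages:

Pretraining: Used a data-rich collection of 870k ordered and tagged images to extend the knowledge of Animagine XL 3.0.
Finetuning - First Stage: Used labeled and curated aesthetic datasets to repair the U-Net, which was broken after pretraining.
Finetuning - Second Stage: Used labeled and curated aesthetic datasets to refine the model's art style and improve hand and anatomy rendering.

Hyperparameters

Stage	Epochs	UNet Learning Rate	Train Text Encoder	Batch Size	Noise Offset	Optimizer	LR Scheduler	Gradient Accumulation Steps	GPUs
Pretraining	10	1e-5	True	16	N/A	AdamW	Cosine Annealing Warm Restart	3	2
Finetuning 1st Stage	10	2e-6	False	48	0.0357	Adafactor	Constant with Warmup	1	1
Finetuning 2nd Stage	15	1e-6	False	48	0.0357	Adafactor	Constant with Warmup	1	1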

Model Comparison (Pretraining only)

Training Config

Configuration Item	Animagine XL 3.0	Animagine XL 3.1
GPU	2 x A100 80G	2 x A100 80G
Dataset	1,271,990	873,504
Shuffle Separator	True	True
Num Epochs	10	10
Learning Rate	7.5e-6	1e-5
Text Encoder Learning Rate	3.75e-6	1e-5
Effective Batch Size	48 x 1 x 2	16 x 3 x 2
Optimizer	Adafactor	AdamW
Optimizer Args	Scale Parameter: False, Relative Step: False, Warmup Init: False	Weight Decay: 0.1, Betas: (0.9, 0.99)
LR Scheduler	Constant with Warmup	Cosine Annealing Warm Restart
LR Scheduler Args	Warmup Steps: 100	Num Cycles: 10, Min LR: 1e-6, LR Decay: 0.9, First Cycle Steps: 9,099
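The "Effective Batch Size" row can be worked through as arithmetic. The sketch below assumes the three factors are batch size, gradient accumulation steps, and GPU count, which matches the pretraining row of the hyperparameter table (16, 3, 2). The function names are hypothetical:

```python
def effective_batch_size(batch_size, grad_accum_steps, num_gpus):
    """Number of samples contributing to one optimizer step."""
    return batch_size * grad_accum_steps * num_gpus


def steps_per_epoch(dataset_size, effective_batch):
    """Number of full optimizer steps needed to see the dataset once."""
    return dataset_size // effective_batch


batch_v30 = effective_batch_size(48, 1, 2)
batch_v31 = effective_batch_size(16, 3, 2)
steps_v31 = steps_per_epoch(873_504, batch_v31)
print(batch_v30, batch_v31, steps_v31)  # 96 96 9099
```

Under this reading both versions use the same effective batch size of 96, and the 3.1 dataset divides into exactly 9,099 steps per epoch. That figure equals the "First Cycle Steps" value in the scheduler arguments, which suggests one cosine cycle per epoch, although the table does not state this explicitly.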

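Source code and training config are available here: https://github.com/cagliostrolab/sd-scripts/tree/main/notebook

Acknowledgements

The development and release of Animagine XL 3.1 would not have been possible without the invaluable contributions and support of the following individuals and organizations:

SeaArt.ai: Our collaboration partner and sponsor.
Shadow Lilac: For providing the aesthetic classification model, aesthetic-shadow-v2.
Derrian Distro: For their custom learning rate scheduler, adapted from LoRA Easy Training Scripts.
Kohya SS: For their comprehensive training scripts.
Cagliostrolab Collaborators: For their dedication to model training, project management, and data curation.
Early Testers: For their valuable feedback and quality assurance efforts.
NovelAI: For their innovative approach to aesthetic tagging, which inspired our implementation.
KBlueLeaf: For providing inspiration in balancing the quality tag distribution and managing tags based on Hakubooru Metainfo.

Thank you all for your support and expertise in pushing the boundaries of anime-style image generation.

Collaborators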

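Limitations

While Animagine XL 3.1 represents a significant advancement in anime-style image generation, it is important to acknowledge its limitations:

Anime-Focused: This model is specifically designed for generating anime-style images and is not suitable for creating realistic photos.
Prompt Complexity: This model may not be suitable for users who expect high-quality results from short or simple prompts. The training focus was on concept understanding rather than aesthetic refinement, so more detailed and specific prompts may be needed to achieve the desired output.
Prompt Format: Animagine XL 3.1 is optimized for Danbooru-style tags rather than natural language prompts. For best results, users are encouraged to format their prompts using the appropriate tags and syntax.
Anatomy and Hand Rendering: Despite the improvements made in anatomy and hand rendering, the model may still produce suboptimal results in these areas.
Dataset Size: The dataset used for training Animagine XL 3.1 consists of approximately 870,000 images. Combined with the previous iteration's dataset (1.2 million), the total training data amounts to around 2.1 million images. While substantial, this dataset size may still be considered limited in scope for an "ultimate" anime model.
NSFW Content: Animagine XL 3.1 has been designed to generate more balanced NSFW content. However, the model may still produce NSFW results even when not explicitly prompted.

By acknowledging these limitations, we aim to provide transparency and set realistic expectations for users of Animagine XL 3.1. Despite these constraints, we believe the model represents a significant step forward in anime-style image generation and offers a powerful tool for artists, designers, and enthusiasts.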

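License

Based on Animagine XL 3.0, Animagine XL 3.1 falls under the Fair AI Public License 1.0-SD, which is compatible with the license of Stable Diffusion models. Key points:

**Modification Sharing:** If you modify Animagine XL 3.1, you must share both your changes and the original license.
**Source Code Accessibility:** If your modified version is network-accessible, provide a way (such as a download link) for others to get the source code. This applies to derived models too.
**Distribution Terms:** Any distribution must be under this license or another with similar rules.
**Compliance:** Non-compliance must be fixed within 30 days to avoid license termination, emphasizing transparency and adherence to open-source values.

This license was chosen to keep Animagine XL 3.1 open and modifiable, in line with the spirit of the open-source community. It protects contributors and users and encourages a collaborative, ethical open-source community. This ensures the model not only benefits from community input but also respects the freedoms of open-source development.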

Cagliostro Lab Discord Server

The Cagliostro Lab server is now open to the public: https://discord.gg/cqh9tZgbGc

Feel free to join our Discord server.

ecyht2/animagine-xl-3.1-onnx

Author: ecyht2

text-to-image

Created: 2024-04-12 04:04:05+00:00

Updated: 2024-04-13 14:38:10+00:00

Files (24)

.gitattributes

README.md

model_index.json

scheduler/scheduler_config.json

text_encoder/config.json

text_encoder/model.onnx ONNX

text_encoder_2/config.json

text_encoder_2/model.onnx ONNX

text_encoder_2/model.onnx_data

tokenizer/merges.txt

tokenizer/special_tokens_map.json

tokenizer/tokenizer_config.json

tokenizer/vocab.json

tokenizer_2/merges.txt

tokenizer_2/special_tokens_map.json

tokenizer_2/tokenizer_config.json

tokenizer_2/vocab.json

unet/config.json

unet/model.onnx ONNX

unet/model.onnx_data

vae_decoder/config.json

vae_decoder/model.onnx ONNX

vae_encoder/config.json

vae_encoder/model.onnx ONNX