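Note

You are reading the documentation for MMEditing 0.x. MMEditing 0.x will be gradually deprecated starting at the end of 2022. We recommend upgrading to MMEditing 1.0 to benefit from the new features and better performance brought by OpenMMLab 2.0. See the MMEditing 1.0 release notes, code, and documentation for more information.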

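Super-Resolution Datasets¶

We recommend symlinking the dataset root to $MMEDITING/data. If your folder structure is different, you may need to change the corresponding paths in the config files.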
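
The linking step can be sketched in Python as follows. This is a minimal sketch, not part of MMEditing: the helper name `link_dataset_root` is hypothetical, and it assumes a platform and user account that are allowed to create symbolic links.

```python
import os
from pathlib import Path


def link_dataset_root(dataset_root, mmediting_root):
    """Symlink `dataset_root` to `<mmediting_root>/data` (hypothetical helper)."""
    target = Path(dataset_root).resolve()
    link = Path(mmediting_root) / "data"
    if link.is_symlink() or link.exists():
        # Do not clobber an existing data folder or link.
        raise FileExistsError(f"{link} already exists")
    os.symlink(target, link, target_is_directory=True)
    return link
```

If `data` already exists as a real folder, the alternative mentioned above applies: leave the folder alone and adjust the paths in the config files instead.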

MMEditing supports the following super-resolution datasets:

Image super-resolution
- DIV2K [ Homepage ]
Video super-resolution
- REDS [ Homepage ]
- Vimeo90K [ Homepage ]

Prepare DF2K_OST Dataset¶

@inproceedings{wang2021real,
  title={Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data},
  author={Wang, Xintao and Xie, Liangbin and Dong, Chao and Shan, Ying},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
  pages={1905--1914},
  year={2021}
}

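The DIV2K dataset can be downloaded here (we use the training set only).
The Flickr2K dataset can be downloaded here (we use the training set only).
The OST dataset can be downloaded here (we use the training set OutdoorSceneTrain_v2 only).

First, put all the images into the GT folder (file names do not need to follow any order):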

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── df2k_ost
│   │   ├── GT
│   │   │   ├── 0001.png
│   │   │   ├── 0002.png
│   │   │   ├── ...
...
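
Collecting the three training sets into one flat GT folder could look like the sketch below. It is an illustration only: the helper name `gather_into_gt`, the set of image suffixes, and the collision rule (prefixing the source folder name) are assumptions, not behaviour of any MMEditing tool.

```python
import shutil
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}  # assumed image types


def gather_into_gt(source_dirs, gt_dir):
    """Copy every image found under `source_dirs` into the flat folder `gt_dir`."""
    gt_dir = Path(gt_dir)
    gt_dir.mkdir(parents=True, exist_ok=True)
    copied = 0
    for src in map(Path, source_dirs):
        for path in sorted(src.rglob("*")):
            if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
                continue
            dst = gt_dir / path.name
            if dst.exists():
                # Name clash between datasets: prefix the source folder name.
                dst = gt_dir / f"{src.name}_{path.name}"
                if dst.exists():
                    raise FileExistsError(f"cannot place {path} without overwriting {dst}")
            shutil.copy2(path, dst)
            copied += 1
    return copied
```

Because the file names do not need to be ordered, renaming on a clash is harmless; the only requirement is that no image overwrites another.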

Crop sub-images¶

For faster IO, we recommend cropping the images into sub-images. We provide a script for this:

python tools/data/super-resolution/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost

The generated data is stored under df2k_ost with the following structure, where _sub indicates sub-images.

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── df2k_ost
│   │   ├── GT
│   │   ├── GT_sub
...
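
To give an idea of what cropping into sub-images means, the sketch below computes sliding-window crop boxes for one image. It is not the code of the preprocessing script: the function names and the `crop_size`, `step`, and `thresh_size` parameters are assumptions chosen for illustration, and the script's actual options and defaults may differ.

```python
def crop_origins(length, crop_size, step, thresh_size=0):
    """Return the start offsets of sliding windows along one image axis.

    Windows are placed every `step` pixels. If the pixels left over after the
    last regular window exceed `thresh_size`, one extra window flush with the
    image border is appended so that the border is covered as well.
    """
    if length < crop_size:
        return []
    origins = list(range(0, length - crop_size + 1, step))
    if length - (origins[-1] + crop_size) > thresh_size:
        origins.append(length - crop_size)
    return origins


def sub_image_boxes(height, width, crop_size, step, thresh_size=0):
    """List (top, left, bottom, right) boxes for all sub-images of an image."""
    return [
        (y, x, y + crop_size, x + crop_size)
        for y in crop_origins(height, crop_size, step, thresh_size)
        for x in crop_origins(width, crop_size, step, thresh_size)
    ]
```

With overlapping windows (a step smaller than the crop size), each large image yields many small files, so a data loader only has to decode a small file per sample instead of a full-resolution image.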

Prepare LMDB dataset for DF2K_OST¶

If you want to use LMDB datasets for faster IO, you can create the LMDB files with:

python tools/data/super-resolution/df2k_ost/preprocess_df2k_ost_dataset.py --data-root ./data/df2k_ost --make-lmdb

Prepare DIV2K Dataset¶

@InProceedings{Agustsson_2017_CVPR_Workshops,
    author = {Agustsson, Eirikur and Timofte, Radu},
    title = {NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study},
    booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month = {July},
    year = {2017}
}

Training dataset: DIV2K dataset.
Validation dataset: Set5 and Set14.

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── DIV2K
│   │   ├── DIV2K_train_HR
│   │   ├── DIV2K_train_LR_bicubic
│   │   │   ├── X2
│   │   │   ├── X3
│   │   │   ├── X4
│   │   ├── DIV2K_valid_HR
│   │   ├── DIV2K_valid_LR_bicubic
│   │   │   ├── X2
│   │   │   ├── X3
│   │   │   ├── X4
│   ├── val_set5
│   │   ├── Set5_bicLRx2
│   │   ├── Set5_bicLRx3
│   │   ├── Set5_bicLRx4
│   ├── val_set14
│   │   ├── Set14_bicLRx2
│   │   ├── Set14_bicLRx3
│   │   ├── Set14_bicLRx4

Crop sub-images¶

For faster IO, we recommend cropping the DIV2K images into sub-images. We provide a script for this:

python tools/data/super-resolution/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K

The generated data is stored under DIV2K with the following structure, where _sub indicates sub-images:

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── DIV2K
│   │   ├── DIV2K_train_HR
│   │   ├── DIV2K_train_HR_sub
│   │   ├── DIV2K_train_LR_bicubic
│   │   │   ├── X2
│   │   │   ├── X3
│   │   │   ├── X4
│   │   │   ├── X2_sub
│   │   │   ├── X3_sub
│   │   │   ├── X4_sub
│   │   ├── DIV2K_valid_HR
│   │   ├── ...
...

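Prepare annotation list¶

If you want to use the annotation mode for the dataset, you first need to prepare an annotation file in txt format.

Each line of the annotation file contains an image name and the image shape (usually those of the ground-truth images), separated by a single space.

Example of an annotation file: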

0001_s001.png (480,480,3)
0001_s002.png (480,480,3)
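
Lines in this format can be written and read back with a few lines of Python. The sketch below is illustrative only: the helper names are hypothetical, and interpreting the three numbers as (height, width, channels) is an assumption, since the example above does not disambiguate the order.

```python
import re

# One annotation line: "<name> (<h>,<w>,<c>)"
_LINE = re.compile(r"^(?P<name>\S+) \((?P<h>\d+),\s*(?P<w>\d+),\s*(?P<c>\d+)\)$")


def format_annotation(name, shape):
    """Build one annotation line from an image name and its shape."""
    h, w, c = shape  # assumed order: height, width, channels
    return f"{name} ({h},{w},{c})"


def parse_annotation(line):
    """Split one annotation line into (name, (h, w, c))."""
    match = _LINE.match(line.strip())
    if match is None:
        raise ValueError(f"malformed annotation line: {line!r}")
    return match["name"], (int(match["h"]), int(match["w"]), int(match["c"]))
```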

Prepare LMDB dataset for DIV2K¶

If you want to use LMDB datasets for faster IO, you can create the LMDB files with:

python tools/data/super-resolution/div2k/preprocess_div2k_dataset.py --data-root ./data/DIV2K --make-lmdb

Prepare REDS Dataset¶

@InProceedings{Nah_2019_CVPR_Workshops_REDS,
  author = {Nah, Seungjun and Baik, Sungyong and Hong, Seokil and Moon, Gyeongsik and Son, Sanghyun and Timofte, Radu and Lee, Kyoung Mu},
  title = {NTIRE 2019 Challenge on Video Deblurring and Super-Resolution: Dataset and Study},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
  month = {June},
  year = {2019}
}

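Training dataset: REDS dataset.
Validation dataset: REDS dataset and Vid4.

Note that we merge the training and validation sets of REDS so that it is easy to switch between the REDS4 partition (used in EDVR) and the official validation partition.

The original validation clips (named 000 to 029) are renamed to avoid conflicts with the training set (240 clips in total). Specifically, the validation clips are renamed to 240, 241, ..., 269.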
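
The renaming rule amounts to adding the number of training clips to the original validation index. A minimal sketch follows; the function name is hypothetical and this is not the code of the preprocessing script.

```python
NUM_TRAIN_CLIPS = 240  # number of training clips, as stated above
NUM_VAL_CLIPS = 30     # validation clips 000 to 029


def merged_val_clip_name(val_clip):
    """Map an original validation clip name ('000'..'029') to its merged name."""
    index = int(val_clip)
    if not 0 <= index < NUM_VAL_CLIPS:
        raise ValueError(f"not a validation clip name: {val_clip!r}")
    return f"{index + NUM_TRAIN_CLIPS:03d}"
```

For example, `merged_val_clip_name("000")` gives `"240"` and `merged_val_clip_name("029")` gives `"269"`.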

You can prepare the REDS dataset by running:

python tools/data/super-resolution/reds/preprocess_reds_dataset.py --root-path ./data/REDS

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── REDS
│   │   ├── train_sharp
│   │   │   ├── 000
│   │   │   ├── 001
│   │   │   ├── ...
│   │   ├── train_sharp_bicubic
│   │   │   ├── 000
│   │   │   ├── 001
│   │   │   ├── ...
│   ├── REDS4
│   │   ├── GT
│   │   ├── sharp_bicubic

Prepare LMDB dataset for REDS¶

If you want to use LMDB datasets for faster IO, you can create the LMDB files with:

python tools/data/super-resolution/reds/preprocess_reds_dataset.py --root-path ./data/REDS --make-lmdb

Crop to sub-images¶

MMEditing also supports cropping REDS images into sub-images for faster IO. We provide a script for this:

python tools/data/super-resolution/reds/crop_sub_images.py --data-root ./data/REDS -scales 4

The generated data is stored under REDS with the following structure, where _sub indicates sub-images.

mmediting
├── mmedit
├── tools
├── configs
├── data
│   ├── REDS
│   │   ├── train_sharp
│   │   │   ├── 000
│   │   │   ├── 001
│   │   │   ├── ...
│   │   ├── train_sharp_sub
│   │   │   ├── 000_s001
│   │   │   ├── 000_s002
│   │   │   ├── ...
│   │   │   ├── 001_s001
│   │   │   ├── ...
│   │   ├── train_sharp_bicubic
│   │   │   ├── X4
│   │   │   │   ├── 000
│   │   │   │   ├── 001
│   │   │   │   ├── ...
│   │   │   ├── X4_sub
│   │   │   │   ├── 000_s001
│   │   │   │   ├── 000_s002
│   │   │   │   ├── ...
│   │   │   │   ├── 001_s001
│   │   │   │   ├── ...

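Note that, by default, preprocess_reds_dataset.py does not create LMDB and annotation files for the cropped dataset. You may need to modify the script slightly for that purpose.

Prepare Vid4 Dataset¶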

@article{liu2013bayesian,
  title={On Bayesian adaptive video super resolution},
  author={Liu, Ce and Sun, Deqing},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  volume={36},
  number={2},
  pages={346--360},
  year={2013},
  publisher={IEEE}
}

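The Vid4 dataset can be downloaded from here. It contains images produced by two downsampling methods:

BIx4 contains images downsampled by bicubic interpolation.
BDx4 contains images blurred by a Gaussian kernel with σ=1.6 and then subsampled every 4 pixels.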
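
The BD degradation (blur, then subsample) can be illustrated with a dependency-free sketch on a grayscale image stored as a list of rows. This is not the code used to generate Vid4: the kernel radius (13 taps), the edge-replication border handling, and sampling from pixel 0 are all assumptions made for the example; only σ=1.6 and the factor 4 come from the description above.

```python
import math


def gaussian_kernel_1d(sigma=1.6, radius=6):
    """Normalised 1-D Gaussian kernel; `radius` is an assumed choice."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def _blur_rows(image, kernel):
    """Convolve each row with `kernel`, replicating edge pixels at the border."""
    radius = len(kernel) // 2
    out = []
    for row in image:
        n = len(row)
        out.append([
            sum(k * row[min(max(x + i - radius, 0), n - 1)]
                for i, k in enumerate(kernel))
            for x in range(n)
        ])
    return out


def _transpose(image):
    return [list(col) for col in zip(*image)]


def bd_downsample(image, scale=4, sigma=1.6, radius=6):
    """Gaussian-blur a 2-D image, then keep every `scale`-th pixel."""
    kernel = gaussian_kernel_1d(sigma, radius)
    # A 2-D Gaussian is separable: blur the rows, then blur the columns.
    blurred = _transpose(_blur_rows(_transpose(_blur_rows(image, kernel)), kernel))
    return [row[::scale] for row in blurred[::scale]]
```

Blurring before subsampling suppresses high frequencies that would otherwise alias, which is why BD inputs look different from plainly interpolated ones.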

Prepare Vimeo90K Dataset¶

@article{xue2019video,
  title={Video Enhancement with Task-Oriented Flow},
  author={Xue, Tianfan and Chen, Baian and Wu, Jiajun and Wei, Donglai and Freeman, William T},
  journal={International Journal of Computer Vision (IJCV)},
  volume={127},
  number={8},
  pages={1106--1125},
  year={2019},
  publisher={Springer}
}

The training and test datasets can be downloaded from here.

The Vimeo90K dataset has a clip/sequence/img folder structure:

├── GT/LQ
│   ├── 00001
│   │   ├── 0001
│   │   │   ├── im1.png
│   │   │   ├── im2.png
│   │   │   ├── ...
│   │   ├── 0002
│   │   ├── 0003
│   │   ├── ...
│   ├── 00002
│   ├── ...
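
Given this layout, the frame paths of one sequence can be derived from a `clip/sequence` string. The sketch below is an illustration under stated assumptions: the helper name is hypothetical, the entry format `00001/0001` is assumed, and `num_frames=7` reflects the seven-frame septuplet sequences of Vimeo90K, so adjust it if your copy differs.

```python
from pathlib import PurePosixPath


def frame_paths(root, entry, num_frames=7):
    """Expand a 'clip/sequence' entry into the frame paths below `root`.

    `root` is the GT or LQ folder; frames are named im1.png, im2.png, ...
    """
    clip, sequence = entry.strip().split("/")
    folder = PurePosixPath(root) / clip / sequence
    return [str(folder / f"im{i}.png") for i in range(1, num_frames + 1)]
```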

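Prepare the annotation files for Vimeo90K dataset¶

To prepare the annotation file for training, first download the official training list from the Vimeo90K website, then run the following command: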

python tools/data/super-resolution/vimeo90k/preprocess_vimeo90k_dataset.py ./data/Vimeo90K/official_train_list.txt

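The annotation file for the test set can be generated in a similar way.

Prepare LMDB dataset for Vimeo90K¶

If you want to use LMDB datasets for faster IO, you can create the LMDB files with: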

python tools/data/super-resolution/vimeo90k/preprocess_vimeo90k_dataset.py ./data/Vimeo90K/official_train_list.txt --gt-path ./data/Vimeo90K/GT --lq-path ./data/Vimeo90K/LQ --make-lmdb