mmedit.models.editors.sagan.sagan_generator
Module Contents
Classes
SNGANGenerator: Generator for SNGAN / Proj-GAN.
- class mmedit.models.editors.sagan.sagan_generator.SNGANGenerator(output_scale, num_classes=0, base_channels=64, out_channels=3, input_scale=4, noise_size=128, attention_cfg=dict(type='SelfAttentionBlock'), attention_after_nth_block=0, channels_cfg=None, blocks_cfg=dict(type='SNGANGenResBlock'), act_cfg=dict(type='ReLU'), use_cbn=True, auto_sync_bn=True, with_spectral_norm=False, with_embedding_spectral_norm=None, sn_style='torch', norm_eps=0.0001, sn_eps=1e-12, init_cfg=dict(type='BigGAN'), pretrained=None, rgb_to_bgr=False)[source]
Bases: torch.nn.Module
Generator for SNGAN / Proj-GAN. The implementation refers to https://github.com/pfnet-research/sngan_projection/tree/master/gen_models
Our implementation has two notable design choices: channels_cfg and blocks_cfg.
channels_cfg: In the default config of SNGAN / Proj-GAN, the number of ResBlocks and the channels of those blocks are determined by the resolution of the output image. We therefore allow users to define channels_cfg to try their own models. We also provide a default config so that users can build the model from the output resolution alone.
blocks_cfg: In the reference code, the generator consists of a group of ResBlocks. To make the model more general, our implementation lets users define blocks_cfg and loads the blocks by calling the build_module method.
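The config-driven block construction described here can be illustrated with a minimal sketch. The registry, the build_from_cfg helper, and the ToyResBlock class are hypothetical stand-ins written for this example; they are not the library's build_module or SNGANGenResBlock. The way num_classes is injected is only an assumption about what overwriting a block config might look like.

```python
from copy import deepcopy

# Hypothetical registry mapping a `type` string to a class.
BLOCKS = {}


def register(cls):
    """Register a block class under its class name."""
    BLOCKS[cls.__name__] = cls
    return cls


@register
class ToyResBlock:
    """Stand-in block that only records its construction arguments."""

    def __init__(self, in_channels, out_channels, num_classes=0):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.num_classes = num_classes


def build_from_cfg(cfg, **overrides):
    """Build a block from a config dict without mutating the caller's dict."""
    cfg = deepcopy(cfg)
    cfg.update(overrides)  # generator-level values overwrite the block config
    block_type = cfg.pop('type')
    if block_type not in BLOCKS:
        raise KeyError(f'unknown block type: {block_type}')
    return BLOCKS[block_type](**cfg)


blocks_cfg = dict(type='ToyResBlock')
block = build_from_cfg(
    blocks_cfg, in_channels=256, out_channels=128, num_classes=10)
print(type(block).__name__, block.in_channels, block.out_channels,
      block.num_classes)
```

Copying the config before updating it keeps the user's blocks_cfg reusable across several blocks, which matters when one dict is used to build every block in the generator.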
- Parameters
output_scale (int) – Output scale for the generated image.
num_classes (int, optional) – The number of classes you would like to generate. This argument influences the structure of the intermediate blocks and the label sampling operation in forward (e.g., if num_classes=0, ConditionalNormalization layers degrade to unconditional ones). This argument is passed to the intermediate blocks by overwriting their config. Defaults to 0.
base_channels (int, optional) – The basic channel number of the generator. The channel counts of the other layers are based on this number. Defaults to 64.
out_channels (int, optional) – Channels of the output images. Defaults to 3.
input_scale (int, optional) – Input scale for the features. Defaults to 4.
noise_size (int, optional) – Size of the input noise vector. Defaults to 128.
attention_cfg (dict, optional) – Config for the self-attention block. Defaults to dict(type='SelfAttentionBlock').
attention_after_nth_block (int | list[int], optional) – Index of the ConvBlock after which a self-attention block is added. If an int is passed, only one attention block is added. If a list is passed, self-attention blocks are added after multiple ConvBlocks. Note that if an index is smaller than 1, the self-attention block corresponding to that index is ignored. Defaults to 0.
channels_cfg (list | dict[list], optional) – Config for the input channels of the intermediate blocks. If a list is passed, each element gives the input channels of the corresponding block as a multiple of base_channels. For block i, the input and output channels should be channels_cfg[i] and channels_cfg[i+1]. If a dict is provided, each key should be an output scale and the corresponding value should be a list defining the channels. Default: Please refer to _default_channels_cfg.
blocks_cfg (dict, optional) – Config for the intermediate blocks. Defaults to dict(type='SNGANGenResBlock').
act_cfg (dict, optional) – Activation config for the final output layer. Defaults to dict(type='ReLU').
use_cbn (bool, optional) – Whether to use conditional normalization. This argument is passed to the norm layers. Defaults to True.
auto_sync_bn (bool, optional) – Whether to convert Batch Norm layers to synchronized ones when distributed training is on. Defaults to True.
with_spectral_norm (bool, optional) – Whether to use spectral norm for conv blocks. Defaults to False.
with_embedding_spectral_norm (bool, optional) – Whether to use spectral norm for the embedding layers in normalization blocks. If not specified (set to None), with_embedding_spectral_norm is set to the same value as with_spectral_norm. Defaults to None.
sn_style (str, optional) – The style of spectral normalization. If set to ajbrock, the implementation by ajbrock (https://github.com/ajbrock/BigGAN-PyTorch/blob/master/layers.py) is adopted. If set to torch, the implementation by PyTorch is adopted. Defaults to torch.
norm_eps (float, optional) – eps for normalization layers (both conditional and non-conditional ones). Defaults to 1e-4.
sn_eps (float, optional) – eps for spectral normalization operation. Defaults to 1e-12.
init_cfg (dict, optional) – Config for weight initialization. Defaults to dict(type='BigGAN').
pretrained (str | dict, optional) – Path to the pretrained model, or a dict containing information about the pretrained model whose required key is 'ckpt_path'. You can also provide 'prefix' to load the generator part from the whole state dict. Defaults to None.
rgb_to_bgr (bool, optional) – Whether to reorder the output channels to BGR. We provide several pre-trained BigGAN weights whose output channel order is RGB. You can set this argument to True to use those weights. Defaults to False.
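As a rough illustration of the channels_cfg and attention_after_nth_block descriptions, the sketch below turns channel multipliers into per-block (input, output) channel pairs and drops attention indices smaller than 1. Both helper names and the multiplier values are hypothetical; the values are not the library's _default_channels_cfg. The indexing follows the wording of the parameter description (block i uses channels_cfg[i] and channels_cfg[i+1]) and may differ from the actual implementation.

```python
def resolve_channel_pairs(channels_cfg, output_scale, base_channels=64):
    """Return one (in_channels, out_channels) pair per block."""
    if isinstance(channels_cfg, dict):
        # A dict is keyed by output scale; pick the list for this scale.
        if output_scale not in channels_cfg:
            raise KeyError(
                f'no channel config for output scale {output_scale}')
        factors = channels_cfg[output_scale]
    elif isinstance(channels_cfg, (list, tuple)):
        factors = list(channels_cfg)
    else:
        raise TypeError('channels_cfg must be a list or a dict of lists')
    # Each factor is a multiple of base_channels.
    channels = [base_channels * f for f in factors]
    # Block i maps channels[i] -> channels[i + 1].
    return list(zip(channels[:-1], channels[1:]))


def normalize_attention_indices(attention_after_nth_block):
    """Accept an int or a list and ignore indices smaller than 1."""
    if isinstance(attention_after_nth_block, int):
        indices = [attention_after_nth_block]
    else:
        indices = list(attention_after_nth_block)
    return [i for i in indices if i >= 1]


# Hypothetical multipliers chosen only for illustration.
example_cfg = {32: [4, 4, 4, 4], 64: [16, 8, 4, 2, 1]}
pairs = resolve_channel_pairs(example_cfg, output_scale=64, base_channels=64)
print(pairs)
print(normalize_attention_indices(0))
print(normalize_attention_indices([0, 2, 3]))
```

Under this reading, a list of n multipliers describes n - 1 blocks, and the default attention_after_nth_block=0 adds no attention block at all.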
- forward(noise, num_batches=0, label=None, return_noise=False)[source]
Forward function.
- Parameters
noise (torch.Tensor | callable | None) – You can directly give a batch of noise as a torch.Tensor or offer a callable function to sample a batch of noise data. Otherwise, None indicates that the default noise sampler is used.
num_batches (int, optional) – The batch size. Defaults to 0.
label (torch.Tensor | callable | None) – You can directly give a batch of labels as a torch.Tensor or offer a callable function to sample a batch of label data. Otherwise, None indicates that the default label sampler is used.
return_noise (bool, optional) – If True, noise_batch will be returned in a dict with fake_img. Defaults to False.
- Returns
If return_noise is False, only the output image is returned. Otherwise, a dict containing fake_image, noise_batch and label_batch is returned.
- Return type
torch.Tensor | dict
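The three accepted forms of the noise argument (an explicit batch, a callable sampler, or None) can be sketched as a small dispatch function. This is an illustration only: nested Python lists stand in for torch.Tensor so that the example needs nothing beyond the standard library, resolve_noise and default_sampler are hypothetical names, and passing the desired shape to the callable is an assumption rather than documented behaviour.

```python
import random


def default_sampler(shape):
    """Draw standard-normal noise as a nested list with a 2-D shape."""
    rows, cols = shape
    return [[random.gauss(0.0, 1.0) for _ in range(cols)]
            for _ in range(rows)]


def resolve_noise(noise, num_batches=0, noise_size=128):
    """Return a noise batch from an explicit batch, a callable, or None."""
    if isinstance(noise, list):
        # An explicit batch is used as given (lists stand in for tensors).
        if any(len(row) != noise_size for row in noise):
            raise ValueError('each noise vector must have noise_size entries')
        return noise
    if noise is not None and not callable(noise):
        raise TypeError('noise must be a batch, a callable, or None')
    if num_batches <= 0:
        raise ValueError('num_batches must be positive when sampling noise')
    # Assumption: a user-supplied sampler receives the desired shape.
    sampler = noise if noise is not None else default_sampler
    return sampler((num_batches, noise_size))


given = [[0.0] * 128 for _ in range(2)]
sampled = resolve_noise(None, num_batches=3)
custom = resolve_noise(
    lambda shape: [[1.0] * shape[1] for _ in range(shape[0])],
    num_batches=2, noise_size=4)
print(len(resolve_noise(given)), len(sampled), custom)
```

The same dispatch would apply to label, with the default sampler drawing class indices instead of Gaussian noise.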
- init_weights(pretrained=None, strict=True)[source]
Init weights for SNGAN-Proj and SAGAN. If pretrained=None, weight initialization follows the INIT_TYPE in init_cfg=dict(type=INIT_TYPE).
For SNGAN-Proj (INIT_TYPE.upper() in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']), we follow the initialization method in the official Chainer implementation (https://github.com/pfnet-research/sngan_projection).
For SAGAN (INIT_TYPE.upper() == 'SAGAN'), we follow the initialization method in the official TensorFlow implementation (https://github.com/brain-research/self-attention-gan).
Besides reimplementing the official code's initialization, we provide BigGAN-style and PyTorch-StudioGAN-style initialization (INIT_TYPE.upper() == 'BIGGAN' and INIT_TYPE.upper() == 'STUDIO'). Please refer to https://github.com/ajbrock/BigGAN-PyTorch and https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.
- Parameters
pretrained (str | dict, optional) – Path to the pretrained model, or a dict containing information about the pretrained model whose required key is 'ckpt_path'. You can also provide 'prefix' to load the generator part from the whole state dict. Defaults to None.
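The INIT_TYPE matching and the 'prefix' option can be sketched as follows. Both helpers are hypothetical and only mirror the rules stated in the description: classify_init_type returns a label instead of initializing any weights, and select_prefix shows one plausible way to pick the generator entries out of a whole state dict. The checkpoint path and the integer values (standing in for tensors) are placeholders.

```python
def classify_init_type(init_cfg):
    """Map init_cfg['type'] to an initialization family, case-insensitively."""
    init_type = init_cfg['type'].upper()
    if init_type in ['SNGAN', 'SNGAN-PROJ', 'GAN-PROJ']:
        return 'sngan-proj'
    if init_type == 'SAGAN':
        return 'sagan'
    if init_type == 'BIGGAN':
        return 'biggan'
    if init_type == 'STUDIO':
        return 'studio'
    raise NotImplementedError(f'unsupported init type: {init_cfg["type"]}')


def select_prefix(state_dict, prefix):
    """Keep only keys under `prefix` and strip the prefix from them."""
    if not prefix.endswith('.'):
        prefix += '.'
    selected = {
        key[len(prefix):]: value
        for key, value in state_dict.items() if key.startswith(prefix)
    }
    if not selected:
        raise KeyError(f'no keys start with prefix {prefix!r}')
    return selected


# Placeholder path and toy state dict; integers stand in for tensors.
pretrained = dict(ckpt_path='path/to/checkpoint.pth', prefix='generator')
full_state = {
    'generator.conv.weight': 1,
    'generator.conv.bias': 2,
    'discriminator.fc.weight': 3,
}
gen_state = select_prefix(full_state, pretrained['prefix'])
print(classify_init_type(dict(type='BigGAN')), gen_state)
```

Appending the trailing dot before matching avoids accidentally selecting keys from a sibling module whose name merely begins with the same characters.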