16. Operator Fusion for Convolutional Residual Blocks
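R-Drop: Regularized Dropout for Neural Networks
Dropout is a powerful and widely used technique to regularize the training of deep neural networks.
Dropout behaves inconsistently between training and inference: training samples a random sub-model at each step, while inference uses the full model, which acts like an ensemble of those sub-models.
R-Drop addresses this by applying a KL-divergence term between the output distributions of the sub-models.
import numpy as np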
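def train_r_drop(ratio, x, w1, b1, w2, b2):
    # Duplicate the input so every sample goes through the network twice
    # (axis 0 is assumed to be the batch axis).
    x = np.concatenate([x, x], axis=0)
    # First layer: affine transform followed by ReLU.
    layer1 = np.maximum(0, np.dot(w1, x) + b1)
    mask1 = np.random.binomial(1, 1...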
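
The idea described above (two dropout passes over the same input, tied together by a KL term) can be illustrated with a small self-contained sketch. It is not a completion of the truncated function above. It assumes a two-layer MLP with row-major inputs of shape (batch, features), inverted dropout on the hidden layer, and a loss made of the two cross-entropy terms plus a symmetric KL term weighted by `alpha`. The names `forward_with_dropout`, `r_drop_loss`, `keep_prob`, and `alpha` are hypothetical and chosen only for this example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def forward_with_dropout(x, w1, b1, w2, b2, keep_prob, rng):
    # One stochastic forward pass: ReLU hidden layer with inverted dropout.
    h = np.maximum(0, x @ w1 + b1)
    mask = rng.binomial(1, keep_prob, size=h.shape) / keep_prob
    return softmax((h * mask) @ w2 + b2)

def kl(p, q, eps=1e-12):
    # KL(p || q) per sample; eps guards against log(0).
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

def r_drop_loss(x, y, w1, b1, w2, b2, keep_prob=0.9, alpha=1.0, rng=None):
    # Assumed loss form: CE of both passes + alpha * symmetric KL between them.
    rng = np.random.default_rng() if rng is None else rng
    p1 = forward_with_dropout(x, w1, b1, w2, b2, keep_prob, rng)
    p2 = forward_with_dropout(x, w1, b1, w2, b2, keep_prob, rng)
    idx = np.arange(x.shape[0])
    eps = 1e-12
    ce = -(np.log(p1[idx, y] + eps) + np.log(p2[idx, y] + eps))
    sym_kl = 0.5 * (kl(p1, p2) + kl(p2, p1))
    return float(np.mean(ce + alpha * sym_kl))
```

With `keep_prob=1.0` both passes use the same full model, so the KL term vanishes and `alpha` has no effect on the loss. With dropout enabled, the two passes see different masks and the KL term penalizes disagreement between their predictions.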