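1. Data processing
Organize the dataset into the required layout.
First, run the following script to extract the training and validation archives: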
import os
import tarfile

TRAIN_SRC_DIR = '/root/autodl-pub/ImageNet/ILSVRC2012/ILSVRC2012_img_train.tar'
TRAIN_DEST_DIR = '/root/autodl-tmp/imagenet/train'
VAL_SRC_DIR = '/root/autodl-pub/ImageNet/ILSVRC2012/ILSVRC2012_img_val.tar'
VAL_DEST_DIR = '/root/autodl-tmp/imagenet/val'


def extract_train():
    # The train archive holds one inner .tar per class; unpack each into its own folder.
    with tarfile.open(TRAIN_SRC_DIR, mode='r:') as tar:
        for i, item in enumerate(tar):
            # Drop only the ".tar" extension (str.strip removes characters, not a suffix).
            cls_name = os.path.splitext(item.name)[0]
            e_path = os.path.join(TRAIN_DEST_DIR, cls_name)
            os.makedirs(e_path, exist_ok=True)
            print("#", i, "extract train dataset to >>>", e_path)
            with tarfile.open(fileobj=tar.extractfile(item), mode='r:') as inner:
                inner.extractall(e_path)


def extract_val():
    # The validation archive is flat: all images go into one folder.
    with tarfile.open(VAL_SRC_DIR, mode='r:') as tar:
        os.makedirs(VAL_DEST_DIR, exist_ok=True)
        print("extract val dataset to >>>", VAL_DEST_DIR)
        tar.extractall(VAL_DEST_DIR)


if __name__ == '__main__':
    extract_train()
    extract_val()
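One detail in the class-name step is worth illustrating. The inner archives are named after the class, for example n01440764.tar, and the folder name is that name without its extension. The sketch below contrasts str.strip, which removes a set of characters from both ends, with os.path.splitext, which removes only the extension. The helper name class_name_from_member is hypothetical and only serves the illustration.

```python
import os


def class_name_from_member(member_name: str) -> str:
    """Return the archive member name without its final extension."""
    return os.path.splitext(member_name)[0]


# str.strip(".tar") treats its argument as the character set {'.', 't', 'a', 'r'}
# and removes those characters from both ends of the string.
stripped_ok = "n01440764.tar".strip(".tar")   # name ends in digits, so nothing extra is lost
stripped_bad = "train.tar".strip(".tar")      # the leading 't', 'r', 'a' are removed as well

print(stripped_ok, stripped_bad, class_name_from_member("train.tar"))
```

For ImageNet class IDs both approaches give the same folder name, but only the splitext version is safe for arbitrary names.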
Then run the following script to extract the test archive:
import os
import tarfile

# The training and validation sets were already extracted by the previous script.
TEST_SRC_DIR = '/root/autodl-pub/ImageNet/ILSVRC2012/ILSVRC2012_img_test.tar'
TEST_DEST_DIR = '/root/autodl-tmp/imagenet/test'


def extract_test():
    # The test archive is flat: all images go into one folder.
    with tarfile.open(TEST_SRC_DIR, mode='r:') as tar:
        os.makedirs(TEST_DEST_DIR, exist_ok=True)
        print("extract test dataset to >>>", TEST_DEST_DIR)
        tar.extractall(TEST_DEST_DIR)


if __name__ == '__main__':
    extract_test()
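With the data extracted, one label file is still missing.
I have already prepared it for you. The process is tedious, so I will not go through it here.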
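Before moving on, it may be worth checking that the extraction produced the expected layout. The sketch below is only a suggestion: the helper summarize_split is hypothetical, and it assumes the train folder contains one sub-folder per class while val and test are flat folders of images, as the extraction scripts produce.

```python
import os


def summarize_split(split_dir: str) -> tuple:
    """Return (number of sub-folders, number of files) found under split_dir."""
    n_dirs = 0
    n_files = 0
    for _, dirnames, filenames in os.walk(split_dir):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_dirs, n_files


if __name__ == "__main__":
    root = "/root/autodl-tmp/imagenet"
    for split in ("train", "val", "test"):
        path = os.path.join(root, split)
        if os.path.isdir(path):
            print(split, summarize_split(path))
        else:
            print(split, "is missing:", path)
```

For ILSVRC2012, the train split should report 1000 class folders and the val split 50,000 images.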
2. Run the following to generate the extra folder used for training
from dinov2.data.datasets import ImageNet

for split in ImageNet.Split:
    dataset = ImageNet(split=split, root="/root/autodl-tmp/imagenet", extra="/root/autodl-tmp/extra")
    dataset.dump_extra()
This step raises an error related to the labels.
At the line that raises the error,
class_id, class_name = row
change it to
class_id, class_name, *_ = row
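The original line presumably fails because some rows of the label file contain more than two comma-separated fields, so two-name unpacking raises ValueError. The sketch below reproduces this with made-up rows (the sample data is an assumption, not the real label file) and shows what the starred form keeps.

```python
import csv
import io

# Hypothetical label rows: the second one has an extra comma-separated field.
SAMPLE = "n01440764,tench\nn01443537,goldfish,Carassius auratus\n"


def parse_labels(text: str) -> list:
    """Parse label rows, keeping the first two fields and ignoring any others."""
    labels = []
    for row in csv.reader(io.StringIO(text)):
        class_id, class_name, *_ = row  # extra fields are collected into _ and dropped
        labels.append((class_id, class_name))
    return labels


def strict_unpack_fails(text: str) -> bool:
    """Return True if plain two-name unpacking raises ValueError on any row."""
    try:
        for row in csv.reader(io.StringIO(text)):
            class_id, class_name = row
    except ValueError:
        return True
    return False


print(parse_labels(SAMPLE))
print(strict_unpack_fails(SAMPLE))
```

Note that with the starred form only the first name after the class ID is kept; any further names on the same row are discarded.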
3. OK, the environment is already set up
If you need to set it up again,
run
conda env create -f conda.yaml
conda activate dinov2
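That is all.
A string-related error is raised at runtime, probably because the code calls str.removesuffix, which only exists in Python 3.9 and later. Change the code at the error location to: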
def remove_suffix(s, suffix):
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s

args.arch = remove_suffix(args.arch, "_memeff")
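As a quick check of the helper above, the sketch below repeats it with one extra guard and runs it on a few inputs. The guard for an empty suffix is an addition here: without it, s[:-len(suffix)] becomes s[:0] and returns an empty string. The architecture names are only sample inputs.

```python
def remove_suffix(s: str, suffix: str) -> str:
    """Return s without the trailing suffix, or s unchanged if it does not end with it."""
    # Guard: every string ends with "", and s[:-0] would be the empty string.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(remove_suffix("vit_large_memeff", "_memeff"))
print(remove_suffix("vit_large", "_memeff"))
print(remove_suffix("vit_large", ""))
```

On Python 3.9 or later, the built-in str.removesuffix does the same job, so the helper is only needed on older interpreters.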
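4. Running
The launch command given on GitHub is meant for a cluster, so we cannot use it.
Below is how to run on a single GPU.
I have already written the configuration:
First
cd /root/dinov2
python setup.py install (already done, no need to repeat)
Then cd to /root/dinov2/dinov2/train
source activate base
python main.py
and training starts right away.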
5. Configuration
Change vitl16_short.yaml to
train:
  dataset_path: ImageNet:split=TRAIN:root=/root/autodl-tmp/imagenet:extra=/root/autodl-tmp/extra
  batch_size_per_gpu: 8
student:
  block_chunks: 1
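The dataset_path value packs the dataset name and its keyword arguments into one colon-separated string. The sketch below shows how such a string breaks down into a name and key/value pairs. The function split_dataset_path is a hypothetical helper written for illustration and is not DINOv2's own parser.

```python
def split_dataset_path(dataset_path: str):
    """Split 'Name:key=value:key=value' into (name, {key: value})."""
    name, *tokens = dataset_path.split(":")
    kwargs = {}
    for token in tokens:
        key, value = token.split("=", 1)  # split once so values may contain '='
        kwargs[key] = value
    return name, kwargs


name, kwargs = split_dataset_path(
    "ImageNet:split=TRAIN:root=/root/autodl-tmp/imagenet:extra=/root/autodl-tmp/extra"
)
print(name)
print(kwargs)
```

Under this scheme the colon is the separator, so the root and extra paths should not contain a colon themselves.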
Change train.py to
parser.add_argument(
    "--config-file",
    default="/root/dinov2/dinov2/configs/train/vitl16_short.yaml",
    metavar="FILE",
    help="path to config file",
)
parser.add_argument(
    "--output-dir",
    "--output_dir",
    default="~/output",
    type=str,
    help="Output directory to save logs and checkpoints",
)
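One thing to watch with the output directory default: argparse stores "~/output" as a literal string and does not expand the tilde. Whether the training code expands it later is not checked here, so the sketch below shows one way to expand it explicitly with os.path.expanduser. The build_parser function is a stand-in for illustration, not the actual train.py.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two defaults used in this guide (illustrative only)."""
    parser = argparse.ArgumentParser("single-GPU training sketch")
    parser.add_argument(
        "--config-file",
        default="/root/dinov2/dinov2/configs/train/vitl16_short.yaml",
        metavar="FILE",
        help="path to config file",
    )
    parser.add_argument(
        "--output-dir",
        "--output_dir",
        default="~/output",
        type=str,
        help="Output directory to save logs and checkpoints",
    )
    return parser


args = build_parser().parse_args([])  # empty list: use the defaults only
args.output_dir = os.path.expanduser(args.output_dir)  # turn "~" into the home directory
print(args.config_file)
print(args.output_dir)
```

If logs end up in a folder literally named "~" under the working directory, using an absolute path as the default is the simplest fix.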