目前研究
Ego4d与EgoExo4d数据集准备
Ego4d与EgoExo4d的数据集是申请制,申请得到的license有access id与access key。这里根据官网信息进行下载
pip install awscli
aws configure
egoexo -o /data0/hh_data/egoexo4d/ --parts downscaled_takes/448
训练一个小型多模态模型(pretrain + sft)
info
下载qwen2.5-0.5b和siglip2 qwen2.5-0.5b:https://hf-mirror.com/Qwen/Qwen2.5-0.5B-Instruct siglip2:https://hf-mirror.com/google/siglip2-base-patch16-224
下载数据集 1、预训练数据: 图片数据: https://hf-mirror.com/datasets/liuhaotian/LLaVA-CC3M-Pretrain-595K 中文文本数据: https://hf-mirror.com/datasets/LinkSoul/Chinese-LLaVA-Vision-Instructions
2、SFT数据: 图片数据: https://hf-mirror.com/datasets/jingyaogong/minimind-v_dataset 中文文本数据: https://hf-mirror.com/datasets/LinkSoul/Chinese-LLaVA-Vision-Instructions
- 视觉编码器:Siglip2
- 文本LLM:qwen2.5-0.5b
流程:
- 使用LLaVA-CC3M-Pretrain-595K图片集与中文翻译的文本数据集chat-translated.json进行预训练。chat-translated.json例子如下:
[{
"id": "GCC_train_002582585",
"image": "GCC_train_002582585.jpg",
"conversations": [
{
"from": "human",
"value": "提供给定图像的简要描述。\n<image>"
},
{
"from": "gpt",
"value": "橄榄油是自由使用的健康成分。"
}
]
},
{
"id": "GCC_train_002429825",
"image": "GCC_train_002429825.jpg",
"conversations": [
{
"from": "human",
"value": "<image>\n写一篇简短但有益的图片摘要."
},
{
"from": "gpt",
"value": "随机旋转的薄金符号的3D向量豪华字母。"
}
]
},
{
"id": "GCC_train_000190697",
"image": "GCC_train_000190697.jpg",
"conversations": [
{
"from": "human",
"value": "分享提供图像的简要解释。\n<image>"
},
{
"from": "gpt",
"value": "气候变化——岩石的物理和化学分解。"
}
]
}] - 111
[
{
"id": "000000033471",
"image": "000000033471.jpg",
"conversations": [
{
"from": "human",
"value": "<image>\n图中巴士的颜色是 什么?"
},
{
"from": "gpt",
"value": "图中的公共汽车是白色和红色。"
},
{
"from": "human",
"value": "公共汽车后部可看到什么特征?"
},
{
"from": "gpt",
"value": "公共汽车的后部设有广告。"
},
{
"from": "human",
"value": "公共汽车在街上行走 吗?还是停靠在一边 呢?"
},
{
"from": "gpt",
"value": "公共汽车正在沿着拥挤着人和其他车辆的街道行驶。"
}
]
},
]
element 0 of tensors does not require grad and does not have a grad_fn
参考: 网址
add model.enable_input_require_grads() before get_peft_model or
if peft_config is not None:
model.enable_input_require_grads()
model = get_peft_model(model, peft_config)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
参考: 网址
错误复现:导入deepspeed后执行:
import deepspeed
deepspeed.ops.op_builder.CPUAdamBuilder().load()
该报错较为复杂,经过查阅和实践后,认为和以下四方面的因素有关:
- torch和deepspeed的版本冲突(参见:相关博客1);
- deepspeed版本问题,通过DS_BUILD_CPU_ADAM=1 pip install deepspeed重新安装后解决(参见:相关博客2);
- 环境变量没有定义,将cuda, library的环境变量导入(参见:相关链接3);
export CUDA_HOME="/usr/local/cuda-12.5"
export LIBRARY_PATH="/usr/local/cuda-12.5/lib64:$LIBRARY_PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-12.5/lib64:$LD_LIBRARY_PATH" - gcc版本太低,需将conda环境中的gcc版本进行升级:在很多新的Linux设备上,gcc初始化版本为4.8.5,torch则要求最小为4.9,因此需要对gcc进行升级;(在conda环境中可单独升级gcc版本,而不影响系统环境,适用于不具备管理员sudo权限的设备)
AttributeError: 'Linear' object has no attribute 'ds_grads_remaining'
(vtg-r1) hh@amax:~/workspace/vtg-r1$ uv pip list | grep deepspeed
Using Python 3.11.13 environment at: /home/hh/.conda/envs/vtg-r1
deepspeed 0.18.0
deepspeed版本问题:
降到0.16.4
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
info
model.enable_input_require_grads() # 加上这个
# ...
model = get_peft_model()