Python notes
python
Python registry
Use a registry decorator to store every entity class in a global CONFIG dict
import functools
import inspect
from collections import defaultdict
from typing import Any

GLOBAL_CONFIG = defaultdict(dict)


def register(dct: Any = GLOBAL_CONFIG, name=None, force=False):
    """
    dct:
        if dct is a dict, register foo into dct as a key-value pair
        if dct is a class, register foo as an attribute of that class
    name:
        key used when registering a class; defaults to foo.__name__
    force:
        whether to force registration over an existing entry.
    """
    def decorator(foo):
        register_name = foo.__name__ if name is None else name
        if not force:
            if inspect.isclass(dct):
                assert not hasattr(dct, foo.__name__), \
                    f'module {dct.__name__} has {foo.__name__}'
            else:
                assert foo.__name__ not in dct, \
                    f'{foo.__name__} has been already registered'

        if inspect.isfunction(foo):
            @functools.wraps(foo)
            def wrap_func(*args, **kwargs):
                return foo(*args, **kwargs)
            if isinstance(dct, dict):
                dct[foo.__name__] = wrap_func
            elif inspect.isclass(dct):
                setattr(dct, foo.__name__, wrap_func)
            else:
                raise AttributeError('')
            return wrap_func

        elif inspect.isclass(foo):
            # extract_schema is defined elsewhere (not shown in these notes)
            dct[register_name] = extract_schema(foo)
        else:
            raise ValueError(f'Do not support {type(foo)} register')
        return foo
    return decorator
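To see how such a decorator is used, here is a minimal stand-alone sketch. It is a simplified variant, not the code above: `REGISTRY`, `simple_register`, `Backbone` and `make_head` are hypothetical names, and it stores the object itself instead of calling `extract_schema`.

```python
import inspect
from collections import defaultdict

# Hypothetical stand-alone registry, independent of GLOBAL_CONFIG above.
REGISTRY = defaultdict(dict)


def simple_register(dct=REGISTRY, name=None, force=False):
    """Store the decorated class or function in dct under its name."""
    def decorator(obj):
        key = obj.__name__ if name is None else name
        if not force and key in dct:
            raise KeyError(f"{key} has already been registered")
        if not (inspect.isclass(obj) or inspect.isfunction(obj)):
            raise ValueError(f"Do not support {type(obj)} register")
        dct[key] = obj
        return obj
    return decorator


@simple_register()
class Backbone:
    def __init__(self, depth=18):
        self.depth = depth


@simple_register(name="build_head")
def make_head(num_classes):
    return {"num_classes": num_classes}


# Look entries up by name, e.g. from a string read out of a YAML config.
model = REGISTRY["Backbone"](depth=50)
head = REGISTRY["build_head"](num_classes=3)
print(sorted(REGISTRY))  # ['Backbone', 'build_head']
```

Registering the same name twice without `force=True` raises, which catches accidental name clashes between modules early.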
conda
# Export / import an environment
conda env export > py36.yaml
conda env create -f py36.yaml
conda create --name d2l python=3.9 -y
# Create an environment
conda create --name xxx python=3.9 -y
# Remove an environment
conda remove -n env_name --all -y
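# Other commands I use
# Note: conda channels need a conda mirror; the PyPI index (.../simple) only works with pip -i
conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main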
conda config --show channels
conda config --remove-key channels
conda list
# This PyPI mirror works (for pip, not conda)
# https://mirrors.tuna.tsinghua.edu.cn/help/pypi/
pip install xxx -i https://pypi.tuna.tsinghua.edu.cn/simple
# In the d2l environment
jupyter notebook
pandas
Filtering a DataFrame
For example, drop the rows whose lat value is greater than 90 or less than -90:
import pandas as pd
# Assume you already have a DataFrame called df
# df = pd.DataFrame({'lat': [45, 91, -91, 30, -85]})
# Keep only the rows with -90 <= lat <= 90
df_filtered = df[(df['lat'] <= 90) & (df['lat'] >= -90)]
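The same filter can be written with `Series.between`, which is inclusive on both ends by default. A small runnable sketch with made-up data (the extra `None` row is my addition, to show how missing values behave):

```python
import pandas as pd

df = pd.DataFrame({"lat": [45, 91, -91, 30, -85, None]})

# between() keeps -90 <= lat <= 90.
# A missing lat compares as False, so that row is dropped as well.
df_filtered = df[df["lat"].between(-90, 90)].reset_index(drop=True)
print(df_filtered["lat"].tolist())  # [45.0, 30.0, -85.0]
```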
pytorch
Deformable DETR: no suitable conversion function from "const at::DeprecatedTypeProperties" to "c10::ScalarType" exists
Installing the MultiScaleDeformableAttention op of Deformable DETR fails with this error. Environment:
- python 3.11
- torch 2.6.0+cu118
- gcc version 9.4.0
for (int n = 0; n < batch/im2col_step_; ++n)
{
    auto grad_output_g = grad_output_n.select(0, n);
    // Fix: replace value.type() with value.scalar_type()
    AT_DISPATCH_FLOATING_TYPES(value.scalar_type(), "ms_deform_attn_backward_cuda", ([&] {
        ms_deformable_col2im_cuda(at::cuda::getCurrentCUDAStream(),
                                grad_output_g.data<scalar_t>(),
                                value.data<scalar_t>() + n * im2col_step_ * per_value_size,
                                spatial_shapes.data<int64_t>(),
                                level_start_index.data<int64_t>(),
                                sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size,
                                attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size,
                                batch_n, spatial_size, num_heads, channels, num_levels, num_query, num_point,
                                grad_value.data<scalar_t>() + n * im2col_step_ * per_value_size,
                                grad_sampling_loc.data<scalar_t>() + n * im2col_step_ * per_sample_loc_size,
                                grad_attn_weight.data<scalar_t>() + n * im2col_step_ * per_attn_weight_size);
    }));
}
Debugging PyTorch distributed training (torchrun) in VS Code
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "train.py",
            "type": "debugpy",
            "request": "launch",
            "program": "/data2/hh/anaconda3/envs/owod/lib/python3.12/site-packages/torch/distributed/run.py", // run.py of the env's Python version (find the install location with: pip show torch)
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": [
                "--master_port", "9909", // distributed launch arguments
                "--nproc_per_node", "3",
                "./rtdetrv2_pytorch/tools/train.py", // the training script
                "-c", "rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_sar.yml",
                "--use-amp",
                "--seed", "0"
            ],
            "env": {
                "TORCH_DISTRIBUTED_DEBUG": "DETAIL",
                "CUDA_VISIBLE_DEVICES": "0,1,2"
            }
        }
    ]
}
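The "program" path above depends on the environment. One way to look it up is to ask the interpreter where the module lives. A small sketch using only the standard library (`module_file` is a hypothetical helper name); run it with the Python of the target conda env:

```python
import importlib.util


def module_file(name):
    """Return the source file of an importable module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when a parent package (e.g. torch) is missing in this env.
        return None
    return spec.origin if spec is not None else None


# Prints the path to use as "program", or None if torch is not installed here.
print(module_file("torch.distributed.run"))
```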
Installing detectron2
conda install conda-forge::detectron2
pip install protobuf==3.20.*
pip install tensorboard
pip install black==21.4b2
The detected CUDA version (11.8) mismatches the version that was used to compile PyTorch (12.1)
PyTorch was installed with CUDA 12.1, but the system has CUDA 11.8 installed. Presumably you are building PyTorch C++ code: the build is trying to use NVCC 11.8 against a PyTorch that was compiled with 12.1.
Either upgrade the system to CUDA 12.1 (check $CUDA_HOME if it is set), change the CUDA_HOME, LD_LIBRARY_PATH and PATH environment variables to include where PyTorch has pulled the binaries for CUDA 12.1, or downgrade PyTorch to a CUDA 11.8 build.
Alternatively, build PyTorch from source against the existing CUDA 11.8 installation, but that is more work than installing the prebuilt CUDA 11.8 binaries:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
Expected floating point type for target with class probabilities, got Long
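torch.nn.CrossEntropyLoss applies log-softmax internally, so the first argument of torch.nn.CrossEntropyLoss()(pred, label) must be raw logits, not probabilities. The second argument must be class indices of integer type (torch.long) with shape (N,); otherwise it raises an error (nll_loss_forward_reduce_cuda_kernel_2d_index not implemented for Double). If the target has the same shape as pred, it is treated as class probabilities and must be a floating-point tensor, which is what the error in the heading complains about.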
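A minimal sketch of the two accepted target formats, with made-up shapes (4 samples, 3 classes). The probability form assumes a PyTorch version that supports class-probability targets. Passing the one-hot tensor without `.float()` is what I would expect to trigger the error in the heading.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
criterion = nn.CrossEntropyLoss()  # a module: build it first, then call it

logits = torch.randn(4, 3)            # (N, C) raw scores, no softmax applied
labels = torch.tensor([0, 2, 1, 2])   # (N,) class indices, dtype torch.long
loss_idx = criterion(logits, labels)

# Same labels as class probabilities: same shape as logits, floating dtype
probs = nn.functional.one_hot(labels, num_classes=3).float()
loss_prob = criterion(logits, probs)

print(loss_idx.item(), loss_prob.item())  # the two values should agree
```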
Expected all tensors to be the same device
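Move the model and the data to the same device first, then wrap the model in DataParallel; it scatters the inputs and gathers the outputs automatically. Do not create a new tensor inside forward and then move it to a hard-coded device; create it on the device of the input instead (for example with device=x.device).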
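Do not use operations such as argmax when computing the loss
The output passed to a PyTorch loss should be the model's logits (or probabilities), not predicted class indices; argmax is not differentiable, so no gradient would flow back. For BCELoss, the output must have the same shape as the target, and the target for each sample should be one-hot encoded. The following code is wrong: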
bag_loss = criterion(bag_prediction.argmax(dim=-1).float(), bag_label)
loss.backward()
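A corrected version might look like the sketch below. The shapes are made up, and `BCEWithLogitsLoss` is my assumption since the notes do not say which criterion was used. With plain `BCELoss`, apply a sigmoid to the output first.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Hypothetical shapes: 1 bag, 2 classes
bag_prediction = torch.randn(1, 2, requires_grad=True)  # raw logits from the model
bag_label = torch.tensor([[0.0, 1.0]])                  # one-hot float target, same shape

criterion = nn.BCEWithLogitsLoss()   # applies sigmoid internally, so pass logits
bag_loss = criterion(bag_prediction, bag_label)
bag_loss.backward()                  # gradients flow because no argmax was applied

# The predicted class is only needed for metrics, outside the loss:
pred_class = bag_prediction.argmax(dim=-1)
print(bag_loss.item(), pred_class.tolist())
```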
"nll_loss_forward_reduce_cuda_kernel_2d_index" not implemented for 'Float'
for i, (x, y) in pbar:
    x = x.to(device[0])
    x = bacth_STFT(x, N_FFT, HOP_LENGTH, WINDOM_LENGTH, torch.hamming_window(WINDOM_LENGTH).to(device[0]), verbose=True)
    y = y.type(torch.LongTensor).to(device[0])  # convert the labels to LongTensor here
    logits = model(x)
    loss_1 = criterion(logits, y)
    ...
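PyTorch error: RuntimeError: self must be a matrix
torch.mm() multiplies two matrices, i.e. two 2-D tensors. If either operand has more or fewer than two dimensions, it raises this error. For batched or broadcast products use torch.bmm() or torch.matmul() instead.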
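A quick sketch of which call fits which shapes (the tensors are made up for illustration):

```python
import torch

a = torch.randn(2, 3)
b = torch.randn(3, 4)
v = torch.randn(3)
batch = torch.randn(5, 2, 3)

out_mm = torch.mm(a, b)               # 2-D x 2-D: fine, shape (2, 4)
out_mv = torch.mv(a, v)               # matrix x vector, shape (2,)
out_matmul = torch.matmul(batch, b)   # broadcasts over the batch dim, shape (5, 2, 4)
out_bmm = torch.bmm(batch, b.expand(5, 3, 4))  # explicit batch on both sides

mm_failed = False
try:
    torch.mm(batch, b)                # 3-D operand: torch.mm refuses it
except RuntimeError as err:
    mm_failed = True
    print("torch.mm failed:", err)
```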
RuntimeError: zero-dimensional tensor (at position 0) cannot be concatenated
Code that raises the error:
obj_list = [tensor(0., device='cuda:0', grad_fn=<SubBackward0>)]
obj_tensor = torch.cat(obj_list, dim=0)
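Cause: torch.cat is being called on a zero-dimensional (scalar) tensor, but torch.cat needs tensors with at least one dimension to concatenate. Here obj_list contains a scalar (zero-dimensional) tensor.
Solution: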