martinxm / gap
Official implementation for Language Supervised Training for Skeleton-based Action Recognition
License: Apache License 2.0
Thanks for the great work!
When I trained the bone modality following the README, I found that the modality was still joint.
In the config, "bone" is set to False, and the run command
CUDA_VISIBLE_DEVICES=0,1 python main_multipart_ntu.py --config config/nturgbd120-cross-subject/lst_bone.yaml --model model.ctrgcn.Model_lst_4part_bone --work-dir work_dir/ntu120/csub/lst_bone --device 0 1
does not change this setting.
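For context, here is a minimal sketch of what a bone-modality conversion usually looks like in skeleton pipelines: each bone is the child joint minus its parent joint. The function name `joints_to_bones`, the toy pair list `TOY_BONE_PAIRS`, and the `(N, C, T, V, M)` layout are assumptions for illustration, not the repository's code. If a data feeder applies this step only when its `bone` flag is True, leaving the flag at False would hand the model plain joint coordinates.

```python
import numpy as np

# Hypothetical (child, parent) joint pairs for a 5-joint toy skeleton, 0-indexed.
# Real datasets define their own pair list; this one is for illustration only.
TOY_BONE_PAIRS = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 3)]


def joints_to_bones(data, pairs):
    """Convert joint coordinates to bone vectors (child minus parent).

    data:  array of shape (N, C, T, V, M) = batch, channels, frames, joints, persons
           (an assumed layout).
    pairs: iterable of (child, parent) joint indices.
    """
    bones = np.zeros_like(data)
    for child, parent in pairs:
        bones[:, :, :, child] = data[:, :, :, child] - data[:, :, :, parent]
    return bones


rng = np.random.default_rng(0)
joints = rng.normal(size=(2, 3, 4, 5, 1)).astype(np.float32)
bones = joints_to_bones(joints, TOY_BONE_PAIRS)
```

A quick way to check which modality a run actually uses is to compare one batch from the data loader against the raw joint data: if they are identical, no bone conversion was applied.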
Hello, when I tried to use CTR-GCN to train on NW-UCLA, I got 95.5% accuracy, 1% lower than the result reported for CTR-GCN.
Did you run into the same problem in your experiments?
Hi, thanks for your wonderful work. I have some questions about the ablation study. I ran the training process with the same settings described in the 'Implementation Details' to see the influence of the partition strategies, but I got a different result. I therefore doubt the settings of my model and hope you can help me correct them. Thank you!
I use the model Model_lst_4part in the training process, with a modified TCN_GCN_unit that applies single-scale temporal convolution. The code is shown below.
class TCN_GCN_unit(nn.Module):
    def __init__(self, in_channels, out_channels, A, stride=1, residual=True, adaptive=True, kernel_size=5, dilations=[1,2]):
        super(TCN_GCN_unit, self).__init__()
        self.gcn1 = unit_gcn(in_channels, out_channels, A, adaptive=adaptive)
        # self.tcn1 = MultiScale_TemporalConv(out_channels, out_channels, kernel_size=kernel_size, stride=stride, dilations=dilations,
        #                                     residual=False)
        self.tcn1 = unit_tcn(out_channels, out_channels, kernel_size=kernel_size, stride=stride)
        self.relu = nn.ReLU(inplace=True)
        if not residual:
            self.residual = lambda x: 0
        elif (in_channels == out_channels) and (stride == 1):
            self.residual = lambda x: x
        else:
            self.residual = unit_tcn(in_channels, out_channels, kernel_size=1, stride=stride)
When using the 4-part partition strategy, I use the code in main_multipart_ntu.py without any modification.
When using the global partition strategy, I apply the same model but use only the return values self.fc(x) and feature_dict in the training process. I set num_text_aug = 1 in the forward pass of every iteration so that only the encoded global information is used to compute the contrastive loss.
for batch_idx, (data, label, index) in enumerate(process):
    self.global_step += 1
    with torch.no_grad():
        data = data.float().cuda(self.output_device)
    timer['dataloader'] += self.split_time()
    self.optimizer.zero_grad()
    # forward
    with torch.cuda.amp.autocast():
        output, feature_dict, logit_scale, part_feature_list = self.model(data)
        # here is what I added
        num_text_aug = 1
        label_g = gen_label(label)
        label = label.long().cuda(self.output_device)
        loss_te_list = []
        for ind in range(num_text_aug):
            if ind > 0:
                text_id = np.ones(len(label), dtype=np.int8) * ind
                texts = torch.stack([text_dict[j][i, :] for i, j in zip(label, text_id)])
                texts = texts.cuda(self.output_device)
            else:
                texts = list()
                for i in range(len(label)):
                    text_len = len(text_list[label[i]])
                    text_id = np.random.randint(text_len, size=1)
                    text_item = text_list[label[i]][text_id.item()]
                    texts.append(text_item)
                texts = torch.cat(texts).cuda(self.output_device)
I trained both models for 110 epochs. The best accuracies of the global and 4-part strategies are 0.9045 and 0.9027, respectively, not around 85% as the paper reports. I know it seems odd that the global strategy even outperforms the 4-part strategy in my experiment, so I suspect there are mistakes in my implementation. I hope you can help me. Looking forward to your reply, and thank you very much!
Thanks for the great work!
I'm working on a project that could greatly benefit from the use of your pre-trained weights as described in your paper. Would it be possible for me to access or obtain the pre-trained weights to use in my research? I would greatly appreciate your assistance.
Thank you for your time and consideration.
Sincerely.
I want to switch the backbone to ST-GCN, but this approach does not seem to be effective. I suspect it might be because I haven't trained the CLIP model. Can you tell me how CLIP is trained?
I ran your code according to your settings, but I got an accuracy of 96.12% on NW-UCLA.
Hello, I got some errors when I tried to run the project: 'gen_label' and 'create_logits' seem to be undefined functions.
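As a stopgap, here is a minimal sketch of what helpers with these names commonly do in CLIP-style contrastive training: `gen_label` builds a pairwise same-class matrix, and `create_logits` returns scaled cosine-similarity logits in both directions. Both bodies are assumptions inferred from the names and from how `gen_label(label)` is called in the training loop, not the repository's definitions, and they use NumPy for illustration where the training script would need PyTorch tensors.

```python
import numpy as np


def gen_label(labels):
    """Assumed behavior: entry (i, k) is 1 when samples i and k share a class."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(np.float32)


def create_logits(x1, x2, logit_scale):
    """Assumed behavior: scaled cosine-similarity logits between two embedding batches."""
    x1 = x1 / np.linalg.norm(x1, axis=-1, keepdims=True)
    x2 = x2 / np.linalg.norm(x2, axis=-1, keepdims=True)
    logits_per_x1 = logit_scale * (x1 @ x2.T)  # rows index x1, columns index x2
    logits_per_x2 = logits_per_x1.T            # same scores viewed from x2
    return logits_per_x1, logits_per_x2


ground_truth = gen_label([0, 1, 0])
emb = np.array([[1.0, 0.0], [0.0, 2.0]])
logits_a, logits_b = create_logits(emb, emb, logit_scale=10.0)
```

Before writing replacements, it is worth checking whether the repository already ships these functions in a utility module that the training script simply fails to import.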
Hi, thanks for sharing your work. I am interested in applying it to another task. I am aware that the dataset is based on 3D coordinates, but I have a 2D pose dataset. Would it be possible to use that data to train this model via transfer learning? If so, how can I do it? Thanks.
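One simple way to feed 2D poses to a model that expects three input channels is to pad a zero z-coordinate, sketched below. The function name `pad_2d_to_3d` and the `(N, C, T, V, M)` layout are assumptions for illustration. Whether weights pre-trained on 3D data transfer well to padded 2D input is not established here, and the joint ordering of the 2D dataset would also have to match the skeleton graph the model was built for.

```python
import numpy as np


def pad_2d_to_3d(pose_2d):
    """Append a zero z-channel to 2D poses.

    pose_2d: array of shape (N, 2, T, V, M) = batch, (x, y), frames, joints, persons
             (an assumed layout).
    Returns an array of shape (N, 3, T, V, M).
    """
    n, c, t, v, m = pose_2d.shape
    if c != 2:
        raise ValueError(f"expected 2 coordinate channels, got {c}")
    z = np.zeros((n, 1, t, v, m), dtype=pose_2d.dtype)
    return np.concatenate([pose_2d, z], axis=1)


pose_2d = np.ones((2, 2, 4, 5, 1), dtype=np.float32)
pose_3d = pad_2d_to_3d(pose_2d)
```

An alternative is to train from scratch with a two-channel input layer, which avoids the artificial z-axis but gives up the pre-trained weights.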