
bottlenecktransformers's Issues

Number of heads in MHSA?

It seems that MHSA has only one head in the released code, but the paper uses 4 heads in MHSA. Is this a simplification for the CIFAR dataset?

A bug appears when batch_size is set to 1

When I set batch_size to 1, I get the error "ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512, 1, 1])". How can I solve this problem?
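This ValueError comes from BatchNorm, which cannot compute per-channel batch statistics from a single value while in training mode. A minimal reproduction and two common workarounds (a sketch, not code from this repo):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(512)

# Training mode with one sample and a 1x1 spatial map fails, because
# there is only one value per channel to normalize over:
# bn(torch.randn(1, 512, 1, 1))  # raises the ValueError above

# Workaround 1: for inference, switch to eval mode so BatchNorm uses
# its running statistics instead of batch statistics.
bn.eval()
out_eval = bn(torch.randn(1, 512, 1, 1))

# Workaround 2: for training, keep the batch size at 2 or more
# (e.g. pass drop_last=True to the DataLoader so no size-1 batch remains).
bn.train()
out_train = bn(torch.randn(2, 512, 1, 1))
```
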

Unable to reach the reported CIFAR-10 accuracy

Hi, I only changed your code to ResNet50(num_classes=10, resolution=(224, 224)), which ends with a lower accuracy of 90.15%. Did you make other changes to reach the 95.11%?

One problem

When I run the main function in model.py I get this error:
"
energy = content_content + content_position
RuntimeError: The size of tensor a (196) must match the size of tensor b (256) at non-singleton dimension 1
"

About MHSA problems

Hello.
In your paper you say that you use multi-head self-attention with four heads, but in this code I only see one head in the MHSA module.
Another question: why is addition used to apply the position encoding?
Looking forward to your reply, thank you!
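For reference, here is a minimal multi-head self-attention sketch with a `heads` parameter and the additive learned 2-D position term the questions above refer to. It is a simplified illustration assembled from the snippets quoted in these issues (`rel_h`, `rel_w`, `content_content + content_position`), not the repo's exact code:

```python
import torch
import torch.nn as nn

class MHSA(nn.Module):
    """Simplified MHSA: content attention plus an additive learned
    2-D relative position term, split across `heads` heads."""
    def __init__(self, n_dims, width, height, heads=4):
        super().__init__()
        self.heads = heads
        self.query = nn.Conv2d(n_dims, n_dims, kernel_size=1)
        self.key = nn.Conv2d(n_dims, n_dims, kernel_size=1)
        self.value = nn.Conv2d(n_dims, n_dims, kernel_size=1)
        # Learned position embeddings; their sum broadcasts to (H, W),
        # which is why they are ADDED to the content logits.
        self.rel_h = nn.Parameter(torch.randn(1, heads, n_dims // heads, 1, height))
        self.rel_w = nn.Parameter(torch.randn(1, heads, n_dims // heads, width, 1))
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).view(b, self.heads, c // self.heads, -1)
        k = self.key(x).view(b, self.heads, c // self.heads, -1)
        v = self.value(x).view(b, self.heads, c // self.heads, -1)
        # Content-content logits: (b, heads, hw, hw)
        content_content = torch.matmul(q.permute(0, 1, 3, 2), k)
        # Content-position logits from the broadcast rel_h + rel_w grid.
        pos = (self.rel_h + self.rel_w).view(1, self.heads, c // self.heads, -1)
        content_position = torch.matmul(pos.permute(0, 1, 3, 2), q)
        attention = self.softmax(content_content + content_position)
        out = torch.matmul(v, attention.permute(0, 1, 3, 2))
        return out.view(b, c, h, w)
```

With `heads=4` the channel dimension is simply split into four groups before the attention products, so supporting multiple heads is a reshape, not a different algorithm.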

Can't reach the reported CIFAR accuracy by directly running main

Current Learning Rate: [0.030934962553363768]
[Epoch: 251], Loss: 0.085, Acc: 97.030, Correct 12544.0 / Total 12928.0
[Epoch: 251], Loss: 0.084, Acc: 97.104, Correct 24983.0 / Total 25728.0
[Epoch: 251], Loss: 0.081, Acc: 97.194, Correct 37447.0 / Total 38528.0
[Epoch: 251], Acc: 87.820
Current Learning Rate: [0.030032595786498105]
[Epoch: 252], Loss: 0.075, Acc: 97.401, Correct 12592.0 / Total 12928.0
[Epoch: 252], Loss: 0.077, Acc: 97.365, Correct 25050.0 / Total 25728.0
[Epoch: 252], Loss: 0.077, Acc: 97.366, Correct 37513.0 / Total 38528.0
[Epoch: 252], Acc: 86.980
Current Learning Rate: [0.029137946110005482]
[Epoch: 253], Loss: 0.064, Acc: 97.857, Correct 12651.0 / Total 12928.0
[Epoch: 253], Loss: 0.066, Acc: 97.777, Correct 25156.0 / Total 25728.0
[Epoch: 253], Loss: 0.070, Acc: 97.610, Correct 37607.0 / Total 38528.0
[Epoch: 253], Acc: 87.350
Current Learning Rate: [0.02825135842836657]
[Epoch: 254], Loss: 0.073, Acc: 97.563, Correct 12613.0 / Total 12928.0
[Epoch: 254], Loss: 0.074, Acc: 97.477, Correct 25079.0 / Total 25728.0
[Epoch: 254], Loss: 0.070, Acc: 97.560, Correct 37588.0 / Total 38528.0
[Epoch: 254], Acc: 86.640
Current Learning Rate: [0.02737317453800964]
[Epoch: 255], Loss: 0.063, Acc: 97.826, Correct 12647.0 / Total 12928.0
[Epoch: 255], Loss: 0.061, Acc: 97.889, Correct 25185.0 / Total 25728.0
[Epoch: 255], Loss: 0.065, Acc: 97.742, Correct 37658.0 / Total 38528.0
[Epoch: 255], Acc: 87.650
Current Learning Rate: [0.026503732995541415]
[Epoch: 256], Loss: 0.064, Acc: 97.803, Correct 12644.0 / Total 12928.0
[Epoch: 256], Loss: 0.064, Acc: 97.831, Correct 25170.0 / Total 25728.0
[Epoch: 256], Loss: 0.063, Acc: 97.861, Correct 37704.0 / Total 38528.0
[Epoch: 256], Acc: 87.970
Current Learning Rate: [0.025643368987227095]
[Epoch: 257], Loss: 0.066, Acc: 97.788, Correct 12642.0 / Total 12928.0
[Epoch: 257], Loss: 0.066, Acc: 97.804, Correct 25163.0 / Total 25728.0
[Epoch: 257], Loss: 0.065, Acc: 97.835, Correct 37694.0 / Total 38528.0
[Epoch: 257], Acc: 87.380
Current Learning Rate: [0.02479241419976968]
[Epoch: 258], Loss: 0.055, Acc: 98.198, Correct 12695.0 / Total 12928.0
[Epoch: 258], Loss: 0.053, Acc: 98.204, Correct 25266.0 / Total 25728.0
[Epoch: 258], Loss: 0.055, Acc: 98.183, Correct 37828.0 / Total 38528.0
[Epoch: 258], Acc: 87.930
Current Learning Rate: [0.023951196692438358]
[Epoch: 259], Loss: 0.049, Acc: 98.213, Correct 12697.0 / Total 12928.0
[Epoch: 259], Loss: 0.051, Acc: 98.231, Correct 25273.0 / Total 25728.0
[Epoch: 259], Loss: 0.054, Acc: 98.108, Correct 37799.0 / Total 38528.0
[Epoch: 259], Acc: 87.880
Current Learning Rate: [0.023120040770595558]
[Epoch: 260], Loss: 0.046, Acc: 98.430, Correct 12725.0 / Total 12928.0
[Epoch: 260], Loss: 0.050, Acc: 98.340, Correct 25301.0 / Total 25728.0
[Epoch: 260], Loss: 0.051, Acc: 98.271, Correct 37862.0 / Total 38528.0
[Epoch: 260], Acc: 87.700
Current Learning Rate: [0.022299266860670866]
[Epoch: 261], Loss: 0.052, Acc: 98.229, Correct 12699.0 / Total 12928.0
[Epoch: 261], Loss: 0.051, Acc: 98.278, Correct 25285.0 / Total 25728.0
[Epoch: 261], Loss: 0.050, Acc: 98.279, Correct 37865.0 / Total 38528.0
[Epoch: 261], Acc: 88.190
Current Learning Rate: [0.021489191386630774]
[Epoch: 262], Loss: 0.046, Acc: 98.391, Correct 12720.0 / Total 12928.0
[Epoch: 262], Loss: 0.046, Acc: 98.395, Correct 25315.0 / Total 25728.0
[Epoch: 262], Loss: 0.047, Acc: 98.378, Correct 37903.0 / Total 38528.0
[Epoch: 262], Acc: 87.440
Current Learning Rate: [0.020690126647990973]
[Epoch: 263], Loss: 0.041, Acc: 98.577, Correct 12744.0 / Total 12928.0
[Epoch: 263], Loss: 0.043, Acc: 98.496, Correct 25341.0 / Total 25728.0
[Epoch: 263], Loss: 0.045, Acc: 98.435, Correct 37925.0 / Total 38528.0
[Epoch: 263], Acc: 87.740
Current Learning Rate: [0.019902380699419107]
[Epoch: 264], Loss: 0.041, Acc: 98.700, Correct 12760.0 / Total 12928.0
[Epoch: 264], Loss: 0.040, Acc: 98.706, Correct 25395.0 / Total 25728.0
[Epoch: 264], Loss: 0.040, Acc: 98.679, Correct 38019.0 / Total 38528.0
[Epoch: 264], Acc: 87.720
Current Learning Rate: [0.019126257231973805]
[Epoch: 265], Loss: 0.037, Acc: 98.824, Correct 12776.0 / Total 12928.0
[Epoch: 265], Loss: 0.038, Acc: 98.764, Correct 25410.0 / Total 25728.0
[Epoch: 265], Loss: 0.039, Acc: 98.713, Correct 38032.0 / Total 38528.0
[Epoch: 265], Acc: 88.020
Current Learning Rate: [0.018362055456025896]
[Epoch: 266], Loss: 0.034, Acc: 98.971, Correct 12795.0 / Total 12928.0
[Epoch: 266], Loss: 0.034, Acc: 98.865, Correct 25436.0 / Total 25728.0
[Epoch: 266], Loss: 0.038, Acc: 98.731, Correct 38039.0 / Total 38528.0
[Epoch: 266], Acc: 88.280
Current Learning Rate: [0.01761006998590733]
[Epoch: 267], Loss: 0.028, Acc: 99.033, Correct 12803.0 / Total 12928.0
[Epoch: 267], Loss: 0.030, Acc: 98.978, Correct 25465.0 / Total 25728.0
[Epoch: 267], Loss: 0.031, Acc: 98.936, Correct 38118.0 / Total 38528.0
[Epoch: 267], Acc: 88.010
Current Learning Rate: [0.016870590726331475]
[Epoch: 268], Loss: 0.030, Acc: 99.033, Correct 12803.0 / Total 12928.0
[Epoch: 268], Loss: 0.031, Acc: 98.989, Correct 25468.0 / Total 25728.0
[Epoch: 268], Loss: 0.032, Acc: 98.928, Correct 38115.0 / Total 38528.0
[Epoch: 268], Acc: 88.690
Current Learning Rate: [0.016143902760629568]
[Epoch: 269], Loss: 0.028, Acc: 99.087, Correct 12810.0 / Total 12928.0
[Epoch: 269], Loss: 0.027, Acc: 99.122, Correct 25502.0 / Total 25728.0
[Epoch: 269], Loss: 0.028, Acc: 99.110, Correct 38185.0 / Total 38528.0
[Epoch: 269], Acc: 88.200
Current Learning Rate: [0.015430286240845494]
[Epoch: 270], Loss: 0.028, Acc: 99.010, Correct 12800.0 / Total 12928.0
[Epoch: 270], Loss: 0.025, Acc: 99.090, Correct 25494.0 / Total 25728.0
[Epoch: 270], Loss: 0.026, Acc: 99.079, Correct 38173.0 / Total 38528.0
[Epoch: 270], Acc: 88.660
Current Learning Rate: [0.014730016279731955]
[Epoch: 271], Loss: 0.028, Acc: 99.103, Correct 12812.0 / Total 12928.0
[Epoch: 271], Loss: 0.025, Acc: 99.172, Correct 25515.0 / Total 25728.0
[Epoch: 271], Loss: 0.026, Acc: 99.164, Correct 38206.0 / Total 38528.0
[Epoch: 271], Acc: 88.780
Best Model Saving...
Current Learning Rate: [0.014043362844689204]
[Epoch: 272], Loss: 0.023, Acc: 99.273, Correct 12834.0 / Total 12928.0
[Epoch: 272], Loss: 0.026, Acc: 99.122, Correct 25502.0 / Total 25728.0
[Epoch: 272], Loss: 0.025, Acc: 99.167, Correct 38207.0 / Total 38528.0
[Epoch: 272], Acc: 88.590
Current Learning Rate: [0.0133705906536875]
[Epoch: 273], Loss: 0.024, Acc: 99.188, Correct 12823.0 / Total 12928.0
[Epoch: 273], Loss: 0.020, Acc: 99.386, Correct 25570.0 / Total 25728.0
[Epoch: 273], Loss: 0.019, Acc: 99.403, Correct 38298.0 / Total 38528.0
[Epoch: 273], Acc: 88.260
Current Learning Rate: [0.0127119590732133]
[Epoch: 274], Loss: 0.023, Acc: 99.157, Correct 12819.0 / Total 12928.0
[Epoch: 274], Loss: 0.021, Acc: 99.265, Correct 25539.0 / Total 25728.0
[Epoch: 274], Loss: 0.020, Acc: 99.299, Correct 38258.0 / Total 38528.0
[Epoch: 274], Acc: 88.300
Current Learning Rate: [0.012067722018278455]
[Epoch: 275], Loss: 0.018, Acc: 99.373, Correct 12847.0 / Total 12928.0
[Epoch: 275], Loss: 0.018, Acc: 99.378, Correct 25568.0 / Total 25728.0
[Epoch: 275], Loss: 0.018, Acc: 99.377, Correct 38288.0 / Total 38528.0
[Epoch: 275], Acc: 88.700
Current Learning Rate: [0.011438127854531303]
[Epoch: 276], Loss: 0.015, Acc: 99.590, Correct 12875.0 / Total 12928.0
[Epoch: 276], Loss: 0.015, Acc: 99.569, Correct 25617.0 / Total 25728.0
[Epoch: 276], Loss: 0.015, Acc: 99.564, Correct 38360.0 / Total 38528.0
[Epoch: 276], Acc: 88.500
Current Learning Rate: [0.010823419302506784]
[Epoch: 277], Loss: 0.017, Acc: 99.404, Correct 12851.0 / Total 12928.0
[Epoch: 277], Loss: 0.016, Acc: 99.456, Correct 25588.0 / Total 25728.0
[Epoch: 277], Loss: 0.016, Acc: 99.452, Correct 38317.0 / Total 38528.0
[Epoch: 277], Acc: 88.690
Current Learning Rate: [0.010223833344053286]
[Epoch: 278], Loss: 0.014, Acc: 99.520, Correct 12866.0 / Total 12928.0
[Epoch: 278], Loss: 0.015, Acc: 99.518, Correct 25604.0 / Total 25728.0
[Epoch: 278], Loss: 0.014, Acc: 99.525, Correct 38345.0 / Total 38528.0
[Epoch: 278], Acc: 89.290
Best Model Saving...
Current Learning Rate: [0.00963960113097138]
[Epoch: 279], Loss: 0.012, Acc: 99.629, Correct 12880.0 / Total 12928.0
[Epoch: 279], Loss: 0.013, Acc: 99.600, Correct 25625.0 / Total 25728.0
[Epoch: 279], Loss: 0.013, Acc: 99.624, Correct 38383.0 / Total 38528.0
[Epoch: 279], Acc: 89.070
Current Learning Rate: [0.009070947895900596]
[Epoch: 280], Loss: 0.011, Acc: 99.675, Correct 12886.0 / Total 12928.0
[Epoch: 280], Loss: 0.011, Acc: 99.666, Correct 25642.0 / Total 25728.0
[Epoch: 280], Loss: 0.011, Acc: 99.678, Correct 38404.0 / Total 38528.0
[Epoch: 280], Acc: 89.250
Current Learning Rate: [0.008518092865487875]
[Epoch: 281], Loss: 0.011, Acc: 99.667, Correct 12885.0 / Total 12928.0
[Epoch: 281], Loss: 0.011, Acc: 99.708, Correct 25653.0 / Total 25728.0
[Epoch: 281], Loss: 0.011, Acc: 99.655, Correct 38395.0 / Total 38528.0
[Epoch: 281], Acc: 89.110
Current Learning Rate: [0.007981249175871482]
[Epoch: 282], Loss: 0.011, Acc: 99.652, Correct 12883.0 / Total 12928.0
[Epoch: 282], Loss: 0.011, Acc: 99.670, Correct 25643.0 / Total 25728.0
[Epoch: 282], Loss: 0.010, Acc: 99.689, Correct 38408.0 / Total 38528.0
[Epoch: 282], Acc: 89.260
Current Learning Rate: [0.007460623790513096]
[Epoch: 283], Loss: 0.008, Acc: 99.737, Correct 12894.0 / Total 12928.0
[Epoch: 283], Loss: 0.009, Acc: 99.740, Correct 25661.0 / Total 25728.0
[Epoch: 283], Loss: 0.009, Acc: 99.725, Correct 38422.0 / Total 38528.0
[Epoch: 283], Acc: 89.420
Best Model Saving...
Current Learning Rate: [0.006956417420409298]
[Epoch: 284], Loss: 0.010, Acc: 99.683, Correct 12887.0 / Total 12928.0
[Epoch: 284], Loss: 0.010, Acc: 99.697, Correct 25650.0 / Total 25728.0
[Epoch: 284], Loss: 0.010, Acc: 99.689, Correct 38408.0 / Total 38528.0
[Epoch: 284], Acc: 89.370
Current Learning Rate: [0.0064688244467137924]
[Epoch: 285], Loss: 0.006, Acc: 99.838, Correct 12907.0 / Total 12928.0
[Epoch: 285], Loss: 0.007, Acc: 99.802, Correct 25677.0 / Total 25728.0
[Epoch: 285], Loss: 0.007, Acc: 99.785, Correct 38445.0 / Total 38528.0
[Epoch: 285], Acc: 89.180
Current Learning Rate: [0.005998032845799671]
[Epoch: 286], Loss: 0.006, Acc: 99.845, Correct 12908.0 / Total 12928.0
[Epoch: 286], Loss: 0.006, Acc: 99.817, Correct 25681.0 / Total 25728.0
[Epoch: 286], Loss: 0.006, Acc: 99.795, Correct 38449.0 / Total 38528.0
[Epoch: 286], Acc: 89.140
Current Learning Rate: [0.0055442241167910295]
[Epoch: 287], Loss: 0.008, Acc: 99.737, Correct 12894.0 / Total 12928.0
[Epoch: 287], Loss: 0.007, Acc: 99.759, Correct 25666.0 / Total 25728.0
[Epoch: 287], Loss: 0.007, Acc: 99.779, Correct 38443.0 / Total 38528.0
[Epoch: 287], Acc: 89.310
Current Learning Rate: [0.005107573211591536]
[Epoch: 288], Loss: 0.007, Acc: 99.729, Correct 12893.0 / Total 12928.0
[Epoch: 288], Loss: 0.007, Acc: 99.759, Correct 25666.0 / Total 25728.0
[Epoch: 288], Loss: 0.007, Acc: 99.756, Correct 38434.0 / Total 38528.0
[Epoch: 288], Acc: 89.330
Current Learning Rate: [0.004688248467437186]
[Epoch: 289], Loss: 0.007, Acc: 99.783, Correct 12900.0 / Total 12928.0
[Epoch: 289], Loss: 0.007, Acc: 99.817, Correct 25681.0 / Total 25728.0
[Epoch: 289], Loss: 0.006, Acc: 99.834, Correct 38464.0 / Total 38528.0
[Epoch: 289], Acc: 89.370
Current Learning Rate: [0.004286411541999064]
[Epoch: 290], Loss: 0.006, Acc: 99.830, Correct 12906.0 / Total 12928.0
[Epoch: 290], Loss: 0.005, Acc: 99.841, Correct 25687.0 / Total 25728.0
[Epoch: 290], Loss: 0.005, Acc: 99.839, Correct 38466.0 / Total 38528.0
[Epoch: 290], Acc: 89.200
Current Learning Rate: [0.0039022173510612273]
[Epoch: 291], Loss: 0.006, Acc: 99.845, Correct 12908.0 / Total 12928.0
[Epoch: 291], Loss: 0.006, Acc: 99.829, Correct 25684.0 / Total 25728.0
[Epoch: 291], Loss: 0.005, Acc: 99.834, Correct 38464.0 / Total 38528.0
[Epoch: 291], Acc: 89.350
Current Learning Rate: [0.003535814008797773]
[Epoch: 292], Loss: 0.004, Acc: 99.876, Correct 12912.0 / Total 12928.0
[Epoch: 292], Loss: 0.005, Acc: 99.837, Correct 25686.0 / Total 25728.0
[Epoch: 292], Loss: 0.005, Acc: 99.826, Correct 38461.0 / Total 38528.0
[Epoch: 292], Acc: 89.340
Current Learning Rate: [0.003187342770671916]
[Epoch: 293], Loss: 0.004, Acc: 99.930, Correct 12919.0 / Total 12928.0
[Epoch: 293], Loss: 0.004, Acc: 99.880, Correct 25697.0 / Total 25728.0
[Epoch: 293], Loss: 0.004, Acc: 99.875, Correct 38480.0 / Total 38528.0
[Epoch: 293], Acc: 89.560
Best Model Saving...
Current Learning Rate: [0.002856937978979447]
[Epoch: 294], Loss: 0.004, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 294], Loss: 0.004, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 294], Loss: 0.004, Acc: 99.873, Correct 38479.0 / Total 38528.0
[Epoch: 294], Acc: 89.490
Current Learning Rate: [0.002544727011057081]
[Epoch: 295], Loss: 0.004, Acc: 99.892, Correct 12914.0 / Total 12928.0
[Epoch: 295], Loss: 0.004, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 295], Loss: 0.004, Acc: 99.881, Correct 38482.0 / Total 38528.0
[Epoch: 295], Acc: 89.650
Best Model Saving...
Current Learning Rate: [0.002250830230176169]
[Epoch: 296], Loss: 0.004, Acc: 99.853, Correct 12909.0 / Total 12928.0
[Epoch: 296], Loss: 0.004, Acc: 99.887, Correct 25699.0 / Total 25728.0
[Epoch: 296], Loss: 0.004, Acc: 99.881, Correct 38482.0 / Total 38528.0
[Epoch: 296], Acc: 89.450
Current Learning Rate: [0.001975360939140324]
[Epoch: 297], Loss: 0.003, Acc: 99.899, Correct 12915.0 / Total 12928.0
[Epoch: 297], Loss: 0.003, Acc: 99.903, Correct 25703.0 / Total 25728.0
[Epoch: 297], Loss: 0.003, Acc: 99.901, Correct 38490.0 / Total 38528.0
[Epoch: 297], Acc: 89.440
Current Learning Rate: [0.0017184253366050195]
[Epoch: 298], Loss: 0.004, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 298], Loss: 0.004, Acc: 99.868, Correct 25694.0 / Total 25728.0
[Epoch: 298], Loss: 0.004, Acc: 99.868, Correct 38477.0 / Total 38528.0
[Epoch: 298], Acc: 89.690
Best Model Saving...
Current Learning Rate: [0.001480122476136056]
[Epoch: 299], Loss: 0.003, Acc: 99.915, Correct 12917.0 / Total 12928.0
[Epoch: 299], Loss: 0.003, Acc: 99.911, Correct 25705.0 / Total 25728.0
[Epoch: 299], Loss: 0.004, Acc: 99.899, Correct 38489.0 / Total 38528.0
[Epoch: 299], Acc: 89.680
Current Learning Rate: [0.0012605442280224245]
[Epoch: 300], Loss: 0.003, Acc: 99.930, Correct 12919.0 / Total 12928.0
[Epoch: 300], Loss: 0.003, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 300], Loss: 0.003, Acc: 99.922, Correct 38498.0 / Total 38528.0
[Epoch: 300], Acc: 89.660
Current Learning Rate: [0.00105977524385864]
[Epoch: 301], Loss: 0.002, Acc: 99.954, Correct 12922.0 / Total 12928.0
[Epoch: 301], Loss: 0.002, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 301], Loss: 0.003, Acc: 99.907, Correct 38492.0 / Total 38528.0
[Epoch: 301], Acc: 89.690
Current Learning Rate: [0.0008778929239099148]
[Epoch: 302], Loss: 0.003, Acc: 99.907, Correct 12916.0 / Total 12928.0
[Epoch: 302], Loss: 0.003, Acc: 99.918, Correct 25707.0 / Total 25728.0
[Epoch: 302], Loss: 0.003, Acc: 99.920, Correct 38497.0 / Total 38528.0
[Epoch: 302], Acc: 89.660
Current Learning Rate: [0.000714967387272874]
[Epoch: 303], Loss: 0.003, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 303], Loss: 0.003, Acc: 99.899, Correct 25702.0 / Total 25728.0
[Epoch: 303], Loss: 0.003, Acc: 99.904, Correct 38491.0 / Total 38528.0
[Epoch: 303], Acc: 89.600
Current Learning Rate: [0.0005710614448433164]
[Epoch: 304], Loss: 0.004, Acc: 99.899, Correct 12915.0 / Total 12928.0
[Epoch: 304], Loss: 0.003, Acc: 99.911, Correct 25705.0 / Total 25728.0
[Epoch: 304], Loss: 0.003, Acc: 99.914, Correct 38495.0 / Total 38528.0
[Epoch: 304], Acc: 89.660
Current Learning Rate: [0.0004462305751014317]
[Epoch: 305], Loss: 0.003, Acc: 99.892, Correct 12914.0 / Total 12928.0
[Epoch: 305], Loss: 0.003, Acc: 99.914, Correct 25706.0 / Total 25728.0
[Epoch: 305], Loss: 0.003, Acc: 99.927, Correct 38500.0 / Total 38528.0
[Epoch: 305], Acc: 89.630
Current Learning Rate: [0.00034052290272376895]
[Epoch: 306], Loss: 0.002, Acc: 99.954, Correct 12922.0 / Total 12928.0
[Epoch: 306], Loss: 0.002, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 306], Loss: 0.003, Acc: 99.912, Correct 38494.0 / Total 38528.0
[Epoch: 306], Acc: 89.620
Current Learning Rate: [0.0002539791800302582]
[Epoch: 307], Loss: 0.002, Acc: 99.946, Correct 12921.0 / Total 12928.0
[Epoch: 307], Loss: 0.002, Acc: 99.949, Correct 25715.0 / Total 25728.0
[Epoch: 307], Loss: 0.002, Acc: 99.943, Correct 38506.0 / Total 38528.0
[Epoch: 307], Acc: 89.630
Current Learning Rate: [0.00018663277127344463]
[Epoch: 308], Loss: 0.003, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 308], Loss: 0.003, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 308], Loss: 0.003, Acc: 99.891, Correct 38486.0 / Total 38528.0
[Epoch: 308], Acc: 89.550
Current Learning Rate: [0.0001385096397758911]
[Epoch: 309], Loss: 0.003, Acc: 99.938, Correct 12920.0 / Total 12928.0
[Epoch: 309], Loss: 0.003, Acc: 99.930, Correct 25710.0 / Total 25728.0
[Epoch: 309], Loss: 0.003, Acc: 99.920, Correct 38497.0 / Total 38528.0
[Epoch: 309], Acc: 89.700
Best Model Saving...
Current Learning Rate: [0.00010962833792086233]
[Epoch: 310], Loss: 0.002, Acc: 99.961, Correct 12923.0 / Total 12928.0
[Epoch: 310], Loss: 0.002, Acc: 99.953, Correct 25716.0 / Total 25728.0
[Epoch: 310], Loss: 0.003, Acc: 99.945, Correct 38507.0 / Total 38528.0
[Epoch: 310], Acc: 89.670
Current Learning Rate: [0.1]

torch.matmul

out = torch.matmul(v, attention.permute(0, 1, 3, 2))
Would out = torch.matmul(v, attention) work as well?
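One way to see why the permute matters: the attention matrix is square, so both products run without a shape error, but attention is not symmetric, and dropping the transpose aggregates over the wrong axis (the query-indexed axis instead of the softmax-normalized key axis). A small check with arbitrary shapes:

```python
import torch

torch.manual_seed(0)
heads, d, n = 4, 8, 16
v = torch.randn(1, heads, d, n)  # values: (batch, heads, head_dim, positions)
attention = torch.softmax(torch.randn(1, heads, n, n), dim=-1)

out_permuted = torch.matmul(v, attention.permute(0, 1, 3, 2))
out_plain = torch.matmul(v, attention)

# Same shape, different tensors: the results only coincide when the
# attention matrix happens to be symmetric, which it generally is not.
assert out_permuted.shape == out_plain.shape == (1, heads, d, n)
assert not torch.allclose(out_permuted, out_plain)
```

So the short answer is no: without the permute each output position mixes values with weights that were normalized over the wrong dimension.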

Error

An error occurs when I load the untrained model: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
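This error usually means the model was moved to the GPU while the input (or a loaded state dict) stayed on the CPU, or vice versa. A minimal sketch of the usual fix, using a stand-in module rather than the repo's ResNet50:

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)  # stand-in for the real network
x = torch.randn(3, 4)

# Move the input to the model's device before the forward pass;
# on a CUDA machine, model(x) without .to(device) raises the
# "two devices" RuntimeError quoted above.
out = model(x.to(device))

# When loading weights, map them onto the target device as well:
# model.load_state_dict(torch.load("ckpt.pth", map_location=device))
```

A quick sanity check is `{p.device for p in model.parameters()}`, which should contain exactly one device.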

Error about model.py forward

Hi! When I run model.py I get this error:

File "/home/sc/carl_test/BottleneckTransformers-main/model.py", line 43, in forward
energy = content_content + content_position
RuntimeError: The size of tensor a (4) must match the size of tensor b (196) at non-singleton dimension 1

NaN or Inf found in input tensor

I trained for about 10k steps, and the values of v and the attention matrix became NaN, and the loss became NaN as well. How can I solve this problem? Thank you!
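NaNs that appear after many steps often come from exploding attention logits or gradients; common mitigations are lowering the learning rate, scaling the attention scores by 1/sqrt(head_dim) before the softmax, and clipping gradient norms. A hypothetical minimal training step showing the latter two safeguards (not the repo's actual loop):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)  # stand-in for the real network
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
criterion = nn.CrossEntropyLoss()

x = torch.randn(16, 8)
y = torch.randint(0, 2, (16,))

optimizer.zero_grad()
loss = criterion(model(x), y)
# Skip the update entirely if the loss has already gone non-finite,
# and clip gradient norms so one bad batch cannot blow up the weights.
if torch.isfinite(loss):
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
```

Note that the MHSA code discussed in these issues applies softmax directly to `content_content + content_position`; the standard 1/sqrt(d) scaling from the Transformer literature, if missing, makes large-logit overflow more likely at high feature dimensions.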

An error will occur when using CIFAR-10

Changing line 76 of main.py from
model = ResNet50()
into
model = ResNet50(resolution=(32, 32))
solves the problem, because
self.rel_h = nn.Parameter(torch.randn([1, heads, n_dims // heads, 1, height]), requires_grad=True)
self.rel_w = nn.Parameter(torch.randn([1, heads, n_dims // heads, width, 1]), requires_grad=True)
in model.py, lines 25 and 26, size the positional embeddings from the resolution:
the default width and height of 14 were calculated for resolution=(224, 224).
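Going by the numbers in this issue (224 gives a 14x14 grid), the MHSA blocks see a feature map that is the input resolution divided by 16, so the positional-embedding grid must be rebuilt for each input size. A small sketch of that relation (the helper name and the fixed factor of 16 are assumptions from the quoted values, not the repo's API):

```python
def mhsa_feature_size(resolution, downsample=16):
    """Feature-map size seen by the MHSA positional embeddings,
    assuming a fixed downsampling factor: 224 // 16 == 14."""
    h, w = resolution
    return h // downsample, w // downsample

# The two cases discussed in this issue:
print(mhsa_feature_size((224, 224)))  # (14, 14), the default
print(mhsa_feature_size((32, 32)))    # (2, 2), needed for CIFAR-10
```

With a 32x32 CIFAR input the default 14x14 `rel_h`/`rel_w` tensors cannot broadcast against the actual 2x2 feature map, which is exactly the tensor-size mismatch reported in the errors above.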
