bottlenecktransformers's People

Contributors

leaderj1001

bottlenecktransformers's Issues

An error occurs when using CIFAR-10

Changing line 76 of main.py from
model = ResNet50()
to
model = ResNet50(resolution=(32, 32))
solves the problem.
The reason is that the relative position parameters
self.rel_h = nn.Parameter(torch.randn([1, heads, n_dims // heads, 1, height]), requires_grad=True)
self.rel_w = nn.Parameter(torch.randn([1, heads, n_dims // heads, width, 1]), requires_grad=True)
in model.py (lines 25 and 26) take their width and height from the resolution.
The default width and height of 14 were calculated for resolution=(224, 224).
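
The arithmetic behind this can be sketched in plain Python. The total downsampling factor of 16 is an assumption inferred from the statement that a 224x224 input gives a 14x14 map; the function names are hypothetical and not part of the repository.

```python
def feature_map_size(resolution, total_stride=16):
    """Return (height, width) of the feature map after downsampling.

    total_stride=16 is an assumption inferred from the issue text,
    where a 224x224 input yields a 14x14 map (224 / 16 = 14).
    """
    height, width = resolution
    return height // total_stride, width // total_stride


def num_positions(resolution, total_stride=16):
    """Number of spatial positions the attention layer would see."""
    h, w = feature_map_size(resolution, total_stride)
    return h * w


imagenet_map = feature_map_size((224, 224))   # 14 x 14 map
cifar_map = feature_map_size((32, 32))        # 2 x 2 map
```

Under this assumption a 32x32 CIFAR-10 image leaves only 4 positions, while parameters built for the default resolution expect 196. That would be consistent with the "4 vs. 196" size mismatch reported in another issue on this page.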

About MHSA problems

Hello.
In your paper, you say that you use multi-head self-attention with four heads.
But in this code, I see only one head in the MHSA module.
Another question: why is this addition used to get the position encoding?
Looking forward to your reply!
Thank you!
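
On the second question, one possible reading (not confirmed by the author) is that the addition relies on broadcasting. The rel_h and rel_w parameters quoted in the CIFAR-10 issue above have singleton axes in complementary places, so their sum yields one embedding per (width, height) position while only width + height vectors are learned. The sketch below computes the resulting shape in plain Python; the helper name and the numeric values are hypothetical.

```python
def broadcast_shape(shape_a, shape_b):
    """Shape produced by adding two equal-rank tensors under broadcasting rules."""
    if len(shape_a) != len(shape_b):
        raise ValueError("this sketch only handles equal-rank shapes")
    result = []
    for a, b in zip(shape_a, shape_b):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"cannot broadcast {a} with {b}")
    return tuple(result)


# Hypothetical values chosen for illustration; not taken from the repository.
heads, n_dims, width, height = 4, 512, 14, 14

rel_h_shape = (1, heads, n_dims // heads, 1, height)
rel_w_shape = (1, heads, n_dims // heads, width, 1)

# The singleton axes expand against each other, giving a full 2D grid.
combined = broadcast_shape(rel_h_shape, rel_w_shape)
```

With these values the combined shape is (1, 4, 128, 14, 14): each head receives a 128-dimensional embedding for every one of the 14 x 14 positions.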

A bug appears when batch_size is set to 1

When I set batch_size to 1, I get the error "ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 512, 1, 1])". How can I solve this problem?
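
This message typically comes from a batch normalization layer in training mode, which needs more than one value per channel to compute batch statistics. The sketch below is a plain-Python model of that condition, not PyTorch's actual code, and the function names are hypothetical.

```python
from math import prod


def values_per_channel(input_size):
    """Count the values batch normalization averages over for each channel.

    For an (N, C, *spatial) input this is N times the product of the
    spatial dimensions.
    """
    batch, _channels, *spatial = input_size
    return batch * prod(spatial)


def can_normalize_in_training(input_size):
    """Batch statistics need more than one value per channel."""
    return values_per_channel(input_size) > 1


# The size from the error message: one sample, 512 channels, 1x1 spatial map.
failing = can_normalize_in_training((1, 512, 1, 1))
# The same layer with two samples in the batch.
passing = can_normalize_in_training((2, 512, 1, 1))
```

Under this reading, training with a batch size of at least 2 (with `drop_last=True` on the `DataLoader` so a trailing single-sample batch is skipped), or calling `model.eval()` before single-image inference, would avoid the error.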

CIFAR results fall short when running main directly

Current Learning Rate: [0.030934962553363768]
[Epoch: 251], Loss: 0.085, Acc: 97.030, Correct 12544.0 / Total 12928.0
[Epoch: 251], Loss: 0.084, Acc: 97.104, Correct 24983.0 / Total 25728.0
[Epoch: 251], Loss: 0.081, Acc: 97.194, Correct 37447.0 / Total 38528.0
[Epoch: 251], Acc: 87.820
Current Learning Rate: [0.030032595786498105]
[Epoch: 252], Loss: 0.075, Acc: 97.401, Correct 12592.0 / Total 12928.0
[Epoch: 252], Loss: 0.077, Acc: 97.365, Correct 25050.0 / Total 25728.0
[Epoch: 252], Loss: 0.077, Acc: 97.366, Correct 37513.0 / Total 38528.0
[Epoch: 252], Acc: 86.980
Current Learning Rate: [0.029137946110005482]
[Epoch: 253], Loss: 0.064, Acc: 97.857, Correct 12651.0 / Total 12928.0
[Epoch: 253], Loss: 0.066, Acc: 97.777, Correct 25156.0 / Total 25728.0
[Epoch: 253], Loss: 0.070, Acc: 97.610, Correct 37607.0 / Total 38528.0
[Epoch: 253], Acc: 87.350
Current Learning Rate: [0.02825135842836657]
[Epoch: 254], Loss: 0.073, Acc: 97.563, Correct 12613.0 / Total 12928.0
[Epoch: 254], Loss: 0.074, Acc: 97.477, Correct 25079.0 / Total 25728.0
[Epoch: 254], Loss: 0.070, Acc: 97.560, Correct 37588.0 / Total 38528.0
[Epoch: 254], Acc: 86.640
Current Learning Rate: [0.02737317453800964]
[Epoch: 255], Loss: 0.063, Acc: 97.826, Correct 12647.0 / Total 12928.0
[Epoch: 255], Loss: 0.061, Acc: 97.889, Correct 25185.0 / Total 25728.0
[Epoch: 255], Loss: 0.065, Acc: 97.742, Correct 37658.0 / Total 38528.0
[Epoch: 255], Acc: 87.650
Current Learning Rate: [0.026503732995541415]
[Epoch: 256], Loss: 0.064, Acc: 97.803, Correct 12644.0 / Total 12928.0
[Epoch: 256], Loss: 0.064, Acc: 97.831, Correct 25170.0 / Total 25728.0
[Epoch: 256], Loss: 0.063, Acc: 97.861, Correct 37704.0 / Total 38528.0
[Epoch: 256], Acc: 87.970
Current Learning Rate: [0.025643368987227095]
[Epoch: 257], Loss: 0.066, Acc: 97.788, Correct 12642.0 / Total 12928.0
[Epoch: 257], Loss: 0.066, Acc: 97.804, Correct 25163.0 / Total 25728.0
[Epoch: 257], Loss: 0.065, Acc: 97.835, Correct 37694.0 / Total 38528.0
[Epoch: 257], Acc: 87.380
Current Learning Rate: [0.02479241419976968]
[Epoch: 258], Loss: 0.055, Acc: 98.198, Correct 12695.0 / Total 12928.0
[Epoch: 258], Loss: 0.053, Acc: 98.204, Correct 25266.0 / Total 25728.0
[Epoch: 258], Loss: 0.055, Acc: 98.183, Correct 37828.0 / Total 38528.0
[Epoch: 258], Acc: 87.930
Current Learning Rate: [0.023951196692438358]
[Epoch: 259], Loss: 0.049, Acc: 98.213, Correct 12697.0 / Total 12928.0
[Epoch: 259], Loss: 0.051, Acc: 98.231, Correct 25273.0 / Total 25728.0
[Epoch: 259], Loss: 0.054, Acc: 98.108, Correct 37799.0 / Total 38528.0
[Epoch: 259], Acc: 87.880
Current Learning Rate: [0.023120040770595558]
[Epoch: 260], Loss: 0.046, Acc: 98.430, Correct 12725.0 / Total 12928.0
[Epoch: 260], Loss: 0.050, Acc: 98.340, Correct 25301.0 / Total 25728.0
[Epoch: 260], Loss: 0.051, Acc: 98.271, Correct 37862.0 / Total 38528.0
[Epoch: 260], Acc: 87.700
Current Learning Rate: [0.022299266860670866]
[Epoch: 261], Loss: 0.052, Acc: 98.229, Correct 12699.0 / Total 12928.0
[Epoch: 261], Loss: 0.051, Acc: 98.278, Correct 25285.0 / Total 25728.0
[Epoch: 261], Loss: 0.050, Acc: 98.279, Correct 37865.0 / Total 38528.0
[Epoch: 261], Acc: 88.190
Current Learning Rate: [0.021489191386630774]
[Epoch: 262], Loss: 0.046, Acc: 98.391, Correct 12720.0 / Total 12928.0
[Epoch: 262], Loss: 0.046, Acc: 98.395, Correct 25315.0 / Total 25728.0
[Epoch: 262], Loss: 0.047, Acc: 98.378, Correct 37903.0 / Total 38528.0
[Epoch: 262], Acc: 87.440
Current Learning Rate: [0.020690126647990973]
[Epoch: 263], Loss: 0.041, Acc: 98.577, Correct 12744.0 / Total 12928.0
[Epoch: 263], Loss: 0.043, Acc: 98.496, Correct 25341.0 / Total 25728.0
[Epoch: 263], Loss: 0.045, Acc: 98.435, Correct 37925.0 / Total 38528.0
[Epoch: 263], Acc: 87.740
Current Learning Rate: [0.019902380699419107]
[Epoch: 264], Loss: 0.041, Acc: 98.700, Correct 12760.0 / Total 12928.0
[Epoch: 264], Loss: 0.040, Acc: 98.706, Correct 25395.0 / Total 25728.0
[Epoch: 264], Loss: 0.040, Acc: 98.679, Correct 38019.0 / Total 38528.0
[Epoch: 264], Acc: 87.720
Current Learning Rate: [0.019126257231973805]
[Epoch: 265], Loss: 0.037, Acc: 98.824, Correct 12776.0 / Total 12928.0
[Epoch: 265], Loss: 0.038, Acc: 98.764, Correct 25410.0 / Total 25728.0
[Epoch: 265], Loss: 0.039, Acc: 98.713, Correct 38032.0 / Total 38528.0
[Epoch: 265], Acc: 88.020
Current Learning Rate: [0.018362055456025896]
[Epoch: 266], Loss: 0.034, Acc: 98.971, Correct 12795.0 / Total 12928.0
[Epoch: 266], Loss: 0.034, Acc: 98.865, Correct 25436.0 / Total 25728.0
[Epoch: 266], Loss: 0.038, Acc: 98.731, Correct 38039.0 / Total 38528.0
[Epoch: 266], Acc: 88.280
Current Learning Rate: [0.01761006998590733]
[Epoch: 267], Loss: 0.028, Acc: 99.033, Correct 12803.0 / Total 12928.0
[Epoch: 267], Loss: 0.030, Acc: 98.978, Correct 25465.0 / Total 25728.0
[Epoch: 267], Loss: 0.031, Acc: 98.936, Correct 38118.0 / Total 38528.0
[Epoch: 267], Acc: 88.010
Current Learning Rate: [0.016870590726331475]
[Epoch: 268], Loss: 0.030, Acc: 99.033, Correct 12803.0 / Total 12928.0
[Epoch: 268], Loss: 0.031, Acc: 98.989, Correct 25468.0 / Total 25728.0
[Epoch: 268], Loss: 0.032, Acc: 98.928, Correct 38115.0 / Total 38528.0
[Epoch: 268], Acc: 88.690
Current Learning Rate: [0.016143902760629568]
[Epoch: 269], Loss: 0.028, Acc: 99.087, Correct 12810.0 / Total 12928.0
[Epoch: 269], Loss: 0.027, Acc: 99.122, Correct 25502.0 / Total 25728.0
[Epoch: 269], Loss: 0.028, Acc: 99.110, Correct 38185.0 / Total 38528.0
[Epoch: 269], Acc: 88.200
Current Learning Rate: [0.015430286240845494]
[Epoch: 270], Loss: 0.028, Acc: 99.010, Correct 12800.0 / Total 12928.0
[Epoch: 270], Loss: 0.025, Acc: 99.090, Correct 25494.0 / Total 25728.0
[Epoch: 270], Loss: 0.026, Acc: 99.079, Correct 38173.0 / Total 38528.0
[Epoch: 270], Acc: 88.660
Current Learning Rate: [0.014730016279731955]
[Epoch: 271], Loss: 0.028, Acc: 99.103, Correct 12812.0 / Total 12928.0
[Epoch: 271], Loss: 0.025, Acc: 99.172, Correct 25515.0 / Total 25728.0
[Epoch: 271], Loss: 0.026, Acc: 99.164, Correct 38206.0 / Total 38528.0
[Epoch: 271], Acc: 88.780
Best Model Saving...
Current Learning Rate: [0.014043362844689204]
[Epoch: 272], Loss: 0.023, Acc: 99.273, Correct 12834.0 / Total 12928.0
[Epoch: 272], Loss: 0.026, Acc: 99.122, Correct 25502.0 / Total 25728.0
[Epoch: 272], Loss: 0.025, Acc: 99.167, Correct 38207.0 / Total 38528.0
[Epoch: 272], Acc: 88.590
Current Learning Rate: [0.0133705906536875]
[Epoch: 273], Loss: 0.024, Acc: 99.188, Correct 12823.0 / Total 12928.0
[Epoch: 273], Loss: 0.020, Acc: 99.386, Correct 25570.0 / Total 25728.0
[Epoch: 273], Loss: 0.019, Acc: 99.403, Correct 38298.0 / Total 38528.0
[Epoch: 273], Acc: 88.260
Current Learning Rate: [0.0127119590732133]
[Epoch: 274], Loss: 0.023, Acc: 99.157, Correct 12819.0 / Total 12928.0
[Epoch: 274], Loss: 0.021, Acc: 99.265, Correct 25539.0 / Total 25728.0
[Epoch: 274], Loss: 0.020, Acc: 99.299, Correct 38258.0 / Total 38528.0
[Epoch: 274], Acc: 88.300
Current Learning Rate: [0.012067722018278455]
[Epoch: 275], Loss: 0.018, Acc: 99.373, Correct 12847.0 / Total 12928.0
[Epoch: 275], Loss: 0.018, Acc: 99.378, Correct 25568.0 / Total 25728.0
[Epoch: 275], Loss: 0.018, Acc: 99.377, Correct 38288.0 / Total 38528.0
[Epoch: 275], Acc: 88.700
Current Learning Rate: [0.011438127854531303]
[Epoch: 276], Loss: 0.015, Acc: 99.590, Correct 12875.0 / Total 12928.0
[Epoch: 276], Loss: 0.015, Acc: 99.569, Correct 25617.0 / Total 25728.0
[Epoch: 276], Loss: 0.015, Acc: 99.564, Correct 38360.0 / Total 38528.0
[Epoch: 276], Acc: 88.500
Current Learning Rate: [0.010823419302506784]
[Epoch: 277], Loss: 0.017, Acc: 99.404, Correct 12851.0 / Total 12928.0
[Epoch: 277], Loss: 0.016, Acc: 99.456, Correct 25588.0 / Total 25728.0
[Epoch: 277], Loss: 0.016, Acc: 99.452, Correct 38317.0 / Total 38528.0
[Epoch: 277], Acc: 88.690
Current Learning Rate: [0.010223833344053286]
[Epoch: 278], Loss: 0.014, Acc: 99.520, Correct 12866.0 / Total 12928.0
[Epoch: 278], Loss: 0.015, Acc: 99.518, Correct 25604.0 / Total 25728.0
[Epoch: 278], Loss: 0.014, Acc: 99.525, Correct 38345.0 / Total 38528.0
[Epoch: 278], Acc: 89.290
Best Model Saving...
Current Learning Rate: [0.00963960113097138]
[Epoch: 279], Loss: 0.012, Acc: 99.629, Correct 12880.0 / Total 12928.0
[Epoch: 279], Loss: 0.013, Acc: 99.600, Correct 25625.0 / Total 25728.0
[Epoch: 279], Loss: 0.013, Acc: 99.624, Correct 38383.0 / Total 38528.0
[Epoch: 279], Acc: 89.070
Current Learning Rate: [0.009070947895900596]
[Epoch: 280], Loss: 0.011, Acc: 99.675, Correct 12886.0 / Total 12928.0
[Epoch: 280], Loss: 0.011, Acc: 99.666, Correct 25642.0 / Total 25728.0
[Epoch: 280], Loss: 0.011, Acc: 99.678, Correct 38404.0 / Total 38528.0
[Epoch: 280], Acc: 89.250
Current Learning Rate: [0.008518092865487875]
[Epoch: 281], Loss: 0.011, Acc: 99.667, Correct 12885.0 / Total 12928.0
[Epoch: 281], Loss: 0.011, Acc: 99.708, Correct 25653.0 / Total 25728.0
[Epoch: 281], Loss: 0.011, Acc: 99.655, Correct 38395.0 / Total 38528.0
[Epoch: 281], Acc: 89.110
Current Learning Rate: [0.007981249175871482]
[Epoch: 282], Loss: 0.011, Acc: 99.652, Correct 12883.0 / Total 12928.0
[Epoch: 282], Loss: 0.011, Acc: 99.670, Correct 25643.0 / Total 25728.0
[Epoch: 282], Loss: 0.010, Acc: 99.689, Correct 38408.0 / Total 38528.0
[Epoch: 282], Acc: 89.260
Current Learning Rate: [0.007460623790513096]
[Epoch: 283], Loss: 0.008, Acc: 99.737, Correct 12894.0 / Total 12928.0
[Epoch: 283], Loss: 0.009, Acc: 99.740, Correct 25661.0 / Total 25728.0
[Epoch: 283], Loss: 0.009, Acc: 99.725, Correct 38422.0 / Total 38528.0
[Epoch: 283], Acc: 89.420
Best Model Saving...
Current Learning Rate: [0.006956417420409298]
[Epoch: 284], Loss: 0.010, Acc: 99.683, Correct 12887.0 / Total 12928.0
[Epoch: 284], Loss: 0.010, Acc: 99.697, Correct 25650.0 / Total 25728.0
[Epoch: 284], Loss: 0.010, Acc: 99.689, Correct 38408.0 / Total 38528.0
[Epoch: 284], Acc: 89.370
Current Learning Rate: [0.0064688244467137924]
[Epoch: 285], Loss: 0.006, Acc: 99.838, Correct 12907.0 / Total 12928.0
[Epoch: 285], Loss: 0.007, Acc: 99.802, Correct 25677.0 / Total 25728.0
[Epoch: 285], Loss: 0.007, Acc: 99.785, Correct 38445.0 / Total 38528.0
[Epoch: 285], Acc: 89.180
Current Learning Rate: [0.005998032845799671]
[Epoch: 286], Loss: 0.006, Acc: 99.845, Correct 12908.0 / Total 12928.0
[Epoch: 286], Loss: 0.006, Acc: 99.817, Correct 25681.0 / Total 25728.0
[Epoch: 286], Loss: 0.006, Acc: 99.795, Correct 38449.0 / Total 38528.0
[Epoch: 286], Acc: 89.140
Current Learning Rate: [0.0055442241167910295]
[Epoch: 287], Loss: 0.008, Acc: 99.737, Correct 12894.0 / Total 12928.0
[Epoch: 287], Loss: 0.007, Acc: 99.759, Correct 25666.0 / Total 25728.0
[Epoch: 287], Loss: 0.007, Acc: 99.779, Correct 38443.0 / Total 38528.0
[Epoch: 287], Acc: 89.310
Current Learning Rate: [0.005107573211591536]
[Epoch: 288], Loss: 0.007, Acc: 99.729, Correct 12893.0 / Total 12928.0
[Epoch: 288], Loss: 0.007, Acc: 99.759, Correct 25666.0 / Total 25728.0
[Epoch: 288], Loss: 0.007, Acc: 99.756, Correct 38434.0 / Total 38528.0
[Epoch: 288], Acc: 89.330
Current Learning Rate: [0.004688248467437186]
[Epoch: 289], Loss: 0.007, Acc: 99.783, Correct 12900.0 / Total 12928.0
[Epoch: 289], Loss: 0.007, Acc: 99.817, Correct 25681.0 / Total 25728.0
[Epoch: 289], Loss: 0.006, Acc: 99.834, Correct 38464.0 / Total 38528.0
[Epoch: 289], Acc: 89.370
Current Learning Rate: [0.004286411541999064]
[Epoch: 290], Loss: 0.006, Acc: 99.830, Correct 12906.0 / Total 12928.0
[Epoch: 290], Loss: 0.005, Acc: 99.841, Correct 25687.0 / Total 25728.0
[Epoch: 290], Loss: 0.005, Acc: 99.839, Correct 38466.0 / Total 38528.0
[Epoch: 290], Acc: 89.200
Current Learning Rate: [0.0039022173510612273]
[Epoch: 291], Loss: 0.006, Acc: 99.845, Correct 12908.0 / Total 12928.0
[Epoch: 291], Loss: 0.006, Acc: 99.829, Correct 25684.0 / Total 25728.0
[Epoch: 291], Loss: 0.005, Acc: 99.834, Correct 38464.0 / Total 38528.0
[Epoch: 291], Acc: 89.350
Current Learning Rate: [0.003535814008797773]
[Epoch: 292], Loss: 0.004, Acc: 99.876, Correct 12912.0 / Total 12928.0
[Epoch: 292], Loss: 0.005, Acc: 99.837, Correct 25686.0 / Total 25728.0
[Epoch: 292], Loss: 0.005, Acc: 99.826, Correct 38461.0 / Total 38528.0
[Epoch: 292], Acc: 89.340
Current Learning Rate: [0.003187342770671916]
[Epoch: 293], Loss: 0.004, Acc: 99.930, Correct 12919.0 / Total 12928.0
[Epoch: 293], Loss: 0.004, Acc: 99.880, Correct 25697.0 / Total 25728.0
[Epoch: 293], Loss: 0.004, Acc: 99.875, Correct 38480.0 / Total 38528.0
[Epoch: 293], Acc: 89.560
Best Model Saving...
Current Learning Rate: [0.002856937978979447]
[Epoch: 294], Loss: 0.004, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 294], Loss: 0.004, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 294], Loss: 0.004, Acc: 99.873, Correct 38479.0 / Total 38528.0
[Epoch: 294], Acc: 89.490
Current Learning Rate: [0.002544727011057081]
[Epoch: 295], Loss: 0.004, Acc: 99.892, Correct 12914.0 / Total 12928.0
[Epoch: 295], Loss: 0.004, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 295], Loss: 0.004, Acc: 99.881, Correct 38482.0 / Total 38528.0
[Epoch: 295], Acc: 89.650
Best Model Saving...
Current Learning Rate: [0.002250830230176169]
[Epoch: 296], Loss: 0.004, Acc: 99.853, Correct 12909.0 / Total 12928.0
[Epoch: 296], Loss: 0.004, Acc: 99.887, Correct 25699.0 / Total 25728.0
[Epoch: 296], Loss: 0.004, Acc: 99.881, Correct 38482.0 / Total 38528.0
[Epoch: 296], Acc: 89.450
Current Learning Rate: [0.001975360939140324]
[Epoch: 297], Loss: 0.003, Acc: 99.899, Correct 12915.0 / Total 12928.0
[Epoch: 297], Loss: 0.003, Acc: 99.903, Correct 25703.0 / Total 25728.0
[Epoch: 297], Loss: 0.003, Acc: 99.901, Correct 38490.0 / Total 38528.0
[Epoch: 297], Acc: 89.440
Current Learning Rate: [0.0017184253366050195]
[Epoch: 298], Loss: 0.004, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 298], Loss: 0.004, Acc: 99.868, Correct 25694.0 / Total 25728.0
[Epoch: 298], Loss: 0.004, Acc: 99.868, Correct 38477.0 / Total 38528.0
[Epoch: 298], Acc: 89.690
Best Model Saving...
Current Learning Rate: [0.001480122476136056]
[Epoch: 299], Loss: 0.003, Acc: 99.915, Correct 12917.0 / Total 12928.0
[Epoch: 299], Loss: 0.003, Acc: 99.911, Correct 25705.0 / Total 25728.0
[Epoch: 299], Loss: 0.004, Acc: 99.899, Correct 38489.0 / Total 38528.0
[Epoch: 299], Acc: 89.680
Current Learning Rate: [0.0012605442280224245]
[Epoch: 300], Loss: 0.003, Acc: 99.930, Correct 12919.0 / Total 12928.0
[Epoch: 300], Loss: 0.003, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 300], Loss: 0.003, Acc: 99.922, Correct 38498.0 / Total 38528.0
[Epoch: 300], Acc: 89.660
Current Learning Rate: [0.00105977524385864]
[Epoch: 301], Loss: 0.002, Acc: 99.954, Correct 12922.0 / Total 12928.0
[Epoch: 301], Loss: 0.002, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 301], Loss: 0.003, Acc: 99.907, Correct 38492.0 / Total 38528.0
[Epoch: 301], Acc: 89.690
Current Learning Rate: [0.0008778929239099148]
[Epoch: 302], Loss: 0.003, Acc: 99.907, Correct 12916.0 / Total 12928.0
[Epoch: 302], Loss: 0.003, Acc: 99.918, Correct 25707.0 / Total 25728.0
[Epoch: 302], Loss: 0.003, Acc: 99.920, Correct 38497.0 / Total 38528.0
[Epoch: 302], Acc: 89.660
Current Learning Rate: [0.000714967387272874]
[Epoch: 303], Loss: 0.003, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 303], Loss: 0.003, Acc: 99.899, Correct 25702.0 / Total 25728.0
[Epoch: 303], Loss: 0.003, Acc: 99.904, Correct 38491.0 / Total 38528.0
[Epoch: 303], Acc: 89.600
Current Learning Rate: [0.0005710614448433164]
[Epoch: 304], Loss: 0.004, Acc: 99.899, Correct 12915.0 / Total 12928.0
[Epoch: 304], Loss: 0.003, Acc: 99.911, Correct 25705.0 / Total 25728.0
[Epoch: 304], Loss: 0.003, Acc: 99.914, Correct 38495.0 / Total 38528.0
[Epoch: 304], Acc: 89.660
Current Learning Rate: [0.0004462305751014317]
[Epoch: 305], Loss: 0.003, Acc: 99.892, Correct 12914.0 / Total 12928.0
[Epoch: 305], Loss: 0.003, Acc: 99.914, Correct 25706.0 / Total 25728.0
[Epoch: 305], Loss: 0.003, Acc: 99.927, Correct 38500.0 / Total 38528.0
[Epoch: 305], Acc: 89.630
Current Learning Rate: [0.00034052290272376895]
[Epoch: 306], Loss: 0.002, Acc: 99.954, Correct 12922.0 / Total 12928.0
[Epoch: 306], Loss: 0.002, Acc: 99.934, Correct 25711.0 / Total 25728.0
[Epoch: 306], Loss: 0.003, Acc: 99.912, Correct 38494.0 / Total 38528.0
[Epoch: 306], Acc: 89.620
Current Learning Rate: [0.0002539791800302582]
[Epoch: 307], Loss: 0.002, Acc: 99.946, Correct 12921.0 / Total 12928.0
[Epoch: 307], Loss: 0.002, Acc: 99.949, Correct 25715.0 / Total 25728.0
[Epoch: 307], Loss: 0.002, Acc: 99.943, Correct 38506.0 / Total 38528.0
[Epoch: 307], Acc: 89.630
Current Learning Rate: [0.00018663277127344463]
[Epoch: 308], Loss: 0.003, Acc: 99.884, Correct 12913.0 / Total 12928.0
[Epoch: 308], Loss: 0.003, Acc: 99.883, Correct 25698.0 / Total 25728.0
[Epoch: 308], Loss: 0.003, Acc: 99.891, Correct 38486.0 / Total 38528.0
[Epoch: 308], Acc: 89.550
Current Learning Rate: [0.0001385096397758911]
[Epoch: 309], Loss: 0.003, Acc: 99.938, Correct 12920.0 / Total 12928.0
[Epoch: 309], Loss: 0.003, Acc: 99.930, Correct 25710.0 / Total 25728.0
[Epoch: 309], Loss: 0.003, Acc: 99.920, Correct 38497.0 / Total 38528.0
[Epoch: 309], Acc: 89.700
Best Model Saving...
Current Learning Rate: [0.00010962833792086233]
[Epoch: 310], Loss: 0.002, Acc: 99.961, Correct 12923.0 / Total 12928.0
[Epoch: 310], Loss: 0.002, Acc: 99.953, Correct 25716.0 / Total 25728.0
[Epoch: 310], Loss: 0.003, Acc: 99.945, Correct 38507.0 / Total 38528.0
[Epoch: 310], Acc: 89.670
Current Learning Rate: [0.1]

NaN or Inf found in input tensor

I trained for about 10k steps, then the values of v and the attention matrix became NaN, and the loss became NaN as well. How can I solve this problem? Thank you!

error

An error occurs when I load the untrained model: RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!

Unable to reach the CIFAR-10 accuracy

Hi, I only changed your code to ResNet50(num_classes=10, resolution=(224, 224)), which ended with a lower accuracy of 90.15%. Did you make other changes to get 95.11%?

torch.matmul

out = torch.matmul(v, attention.permute(0, 1, 3, 2))
Would out = torch.matmul(v, attention) also be OK?
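
A toy single-head example in plain Python can show how the two products differ. It assumes that attention is a softmax over the last axis of an (N, N) energy matrix whose rows are indexed by the query position, and that v is stored as (channels, positions); neither assumption is confirmed by the source, and the helper functions are hypothetical.

```python
import math


def softmax_rows(matrix):
    """Apply softmax along the last axis (each row sums to 1)."""
    result = []
    for row in matrix:
        peak = max(row)
        exps = [math.exp(x - peak) for x in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def transpose(matrix):
    """Swap the two axes of a nested-list matrix."""
    return [list(col) for col in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two nested lists."""
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]


# Toy example with d = 1 channel and N = 2 positions (assumed shapes).
v = [[1.0, 0.0]]                      # shape (d, N)
energy = [[0.0, 0.0], [2.0, 0.0]]     # shape (N, N), rows indexed by query
attention = softmax_rows(energy)

# out[c][i] = sum_j v[c][j] * attention[i][j]: weights come from a softmax row.
with_permute = matmul(v, transpose(attention))
# out[c][j] = sum_i v[c][i] * attention[i][j]: weights come from a column.
without_permute = matmul(v, attention)
```

Both calls have valid shapes because the attention matrix is square, so no error is raised either way. Under the stated assumptions, however, only the transposed version weights the values with a normalized softmax row per output position, and the two results differ in this example.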

Number of heads in MHSA?

It seems that MHSA has only one head in the released code, but the paper uses 4 heads in MHSA. Is this a simplification for the CIFAR dataset?

Error in model.py forward

File "/home/sc/carl_test/BottleneckTransformers-main/model.py", line 43, in forward
energy = content_content + content_position
RuntimeError: The size of tensor a (4) must match the size of tensor b (196) at non-singleton dimension 1

Hi! I get this error when I run model.py.

One problem

When I run the main function in model.py, I get an error, for example:
"
energy = content_content + content_position
RuntimeError: The size of tensor a (196) must match the size of tensor b (256) at non-singleton dimension 1
"
