Comments (8)

aribornstein commented on July 19, 2024

@ananyahjha93

from lightning-flash.

akihironitta commented on July 19, 2024

Maybe the same as Lightning-Universe/lightning-bolts#436?

Zumbalamambo commented on July 19, 2024

I still get the same problem, even after setting max_epochs :(

ananyahjha93 commented on July 19, 2024

@Zumbalamambo are you using the SimCLR/SwAV script from bolts? If so, can you post the number of samples in your dataset, your batch size, your accelerator count, and your max epochs?

pengbohua commented on July 19, 2024

I ran into the same issue when trying to reproduce SwAV pretraining on CIFAR10.

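# imports the snippet relies on (assumed; not shown in the original post)
import argparse

import pytorch_lightning as pl
from pl_bolts.datamodules import CIFAR10DataModule
from pl_bolts.models.self_supervised import SwAV
from pl_bolts.models.self_supervised.swav.transforms import (
    SwAVEvalDataTransform,
    SwAVTrainDataTransform,
)
from pytorch_lightning.callbacks import EarlyStopping

# data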
batch_size = 2048
dm = CIFAR10DataModule(data_dir='./data/', batch_size=batch_size, normalize=True)
# loaders are contained in the DataModule which are self consistent

parser = argparse.ArgumentParser('SwAV CIFAR-10')
parser = SwAV.add_model_specific_args(parser)

args = parser.parse_args('')

# model
args.gpus = 1
args.arch = 'resnet18'
args.hidden_mlp = 1024
args.max_epochs = 100
args.dataset = dm
args.batch_size = batch_size
args.size_crops = [32, 16]
args.maxpool1 = False
args.nmb_crops = [2, 1]
args.gaussian_blur = False
args.num_samples = dm.num_samples
dm.train_transforms = SwAVTrainDataTransform(
    size_crops=args.size_crops,
    nmb_crops=args.nmb_crops,
    gaussian_blur=args.gaussian_blur
)

dm.val_transforms = SwAVEvalDataTransform(
    size_crops=args.size_crops,
    nmb_crops=args.nmb_crops,
    gaussian_blur=args.gaussian_blur
)
dm.test_transforms = SwAVEvalDataTransform(
    size_crops=args.size_crops,
    nmb_crops=args.nmb_crops,
    gaussian_blur=args.gaussian_blur
)
print('hypers', args)

# logger
from pytorch_lightning.loggers import CSVLogger

csv_logger = CSVLogger("/content/drive/MyDrive/contrastive_learning/Swav/logs", name="SwAV-CIFAR10")
model = SwAV(**args.__dict__)


# fit
trainer = pl.Trainer(max_epochs=args.max_epochs, gpus=1, precision=16, logger=csv_logger, callbacks=[EarlyStopping(monitor='val_loss')])
trainer.fit(model, datamodule=dm)


# error message
/usr/local/lib/python3.7/dist-packages/pytorch_lightning/trainer/training_loop.py in optimizer_step(self, optimizer, opt_idx, batch_idx, train_step_and_backward_closure)
    431             on_tpu=self.trainer._device_type == DeviceType.TPU and _TPU_AVAILABLE,
    432             using_native_amp=using_native_amp,
--> 433             using_lbfgs=is_lbfgs,
    434         )
    435 

/usr/local/lib/python3.7/dist-packages/pl_bolts/models/self_supervised/swav/swav_module.py in optimizer_step(self, epoch, batch_idx, optimizer, optimizer_idx, optimizer_closure, on_tpu, using_native_amp, using_lbfgs)
    329         # adjust LR of optim contained within LARSWrapper
    330         for param_group in optimizer.param_groups:
--> 331             param_group["lr"] = self.lr_schedule[self.trainer.global_step]
    332 
    333         # from lightning

IndexError: index 1900 is out of bounds for axis 0 with size 1900
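
A note on the numbers in this traceback. The sketch below shows one way the lookup could overrun. It assumes (this is not stated in the thread) that the SwAV module sizes `lr_schedule` as `(num_samples // global_batch_size) * max_epochs`, while the dataloader keeps the last partial batch of each epoch. The figure of 40,000 training samples is also an assumption (a 50,000-image CIFAR-10 train set with a 20% validation split); it happens to reproduce the size 1900 reported above. The helper names are hypothetical.

```python
import math


def schedule_length(num_samples, batch_size, num_devices, max_epochs):
    """Assumed length of the precomputed LR schedule (floor division per epoch)."""
    global_batch_size = batch_size * num_devices
    return (num_samples // global_batch_size) * max_epochs


def optimizer_steps(num_samples, batch_size, num_devices, max_epochs, drop_last=False):
    """Optimizer steps actually taken if the last partial batch is kept (drop_last=False)."""
    global_batch_size = batch_size * num_devices
    if drop_last:
        per_epoch = num_samples // global_batch_size
    else:
        per_epoch = math.ceil(num_samples / global_batch_size)
    return per_epoch * max_epochs


# Values from the post above; num_samples=40_000 is an assumption.
num_samples, batch_size, num_devices, max_epochs = 40_000, 2048, 1, 100

sched_len = schedule_length(num_samples, batch_size, num_devices, max_epochs)
total_steps = optimizer_steps(num_samples, batch_size, num_devices, max_epochs)

# Steps per epoch when the partial batch is kept, and the (0-indexed) epoch
# in which global_step would first reach the end of the schedule.
steps_per_epoch = total_steps // max_epochs
first_bad_epoch = sched_len // steps_per_epoch

print(f"schedule length: {sched_len}, steps taken: {total_steps}")
print(f"first out-of-range lookup would occur in epoch {first_bad_epoch}")
```

Under these assumptions the schedule holds 19 entries per epoch while training takes 20 steps per epoch, so the run would ask for an index past the end of the schedule before reaching max_epochs.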

edgarriba commented on July 19, 2024

@Zumbalamambo are you still having those issues?
BTW, which version of flash are you using?

tarunn2799 commented on July 19, 2024

@edgarriba I'm facing the same issue when training on a custom dataset. I'm running 4 GPUs with a batch size of 2048.

from lightning-flash.

ananyahjha93 commented on July 19, 2024

This has been fixed in bolts master.
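
For anyone pinned to an older release, here is a hedged sketch of a defensive workaround. It is not necessarily what the upstream fix does. The idea is to clamp the schedule lookup so that any step past the end reuses the final learning rate. `lr_for_step` is a hypothetical helper; in a subclass it would stand in for the direct `self.lr_schedule[self.trainer.global_step]` indexing shown in the traceback earlier in this thread.

```python
def lr_for_step(lr_schedule, global_step):
    """Return the scheduled LR, reusing the last entry once the schedule is exhausted."""
    if len(lr_schedule) == 0:
        raise ValueError("lr_schedule must not be empty")
    # Clamp the index so steps beyond the schedule do not raise IndexError.
    index = min(global_step, len(lr_schedule) - 1)
    return lr_schedule[index]


# Toy schedule for illustration only; real schedules come from the model.
toy_schedule = [0.1, 0.2, 0.3]
in_range = lr_for_step(toy_schedule, 1)
past_end = lr_for_step(toy_schedule, 5)
print(in_range, past_end)
```

Clamping hides the mismatch rather than fixing the schedule length, so upgrading to a release that contains the fix is still the better route.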
