lromul / argus
Lightweight library for training neural networks in PyTorch
Home Page: https://pytorch-argus.readthedocs.io
License: MIT License
Can I use custom metrics?
If not, do you have any suggestions for metrics for multiclass segmentation problems?
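One commonly used metric for multiclass segmentation is mean intersection over union (mean IoU). The sketch below is plain Python and does not use argus at all: the class name MeanIoU and its reset/update/compute methods are assumptions chosen for illustration, and plugging it into argus would mean adapting it to whatever metric interface the library documents.

```python
class MeanIoU:
    """Illustrative accumulator (not an argus class): mean IoU over classes."""

    def __init__(self, num_classes):
        self.num_classes = num_classes
        self.reset()

    def reset(self):
        # confusion[t][p] counts pixels with true class t predicted as class p.
        n = self.num_classes
        self.confusion = [[0] * n for _ in range(n)]

    def update(self, predicted, target):
        # Both arguments are flat sequences of integer class ids, one per pixel.
        for p, t in zip(predicted, target):
            self.confusion[t][p] += 1

    def compute(self):
        ious = []
        for c in range(self.num_classes):
            tp = self.confusion[c][c]
            fn = sum(self.confusion[c]) - tp
            fp = sum(row[c] for row in self.confusion) - tp
            denom = tp + fp + fn
            # Skip classes that never appear in predictions or targets.
            if denom > 0:
                ious.append(tp / denom)
        return sum(ious) / len(ious) if ious else 0.0
```

In practice the per-pixel predictions would come from an argmax over the class dimension of the network output, flattened together with the target mask before calling update.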
I am training a model on 4 different GPUs. During training, I use the MonitorCheckpoint callback to save the model to disk. Because training takes a long time, my script stops once training finishes. Testing is done by hand: I restart the Python kernel and load the trained model.
I load it with model = argus.load_model() and, by printing the model variable, I can see the 4 devices I used in training. However, I would like to move the model to a single GPU, because the tensor operations I need to perform must all be on the same device. Indeed, I got:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:2 and cuda:0!
Is there any way to reassign the CUDA device? Using model.set_device(0) does not work; I can still see all the previous GPUs running.
Thanks in advance, and congratulations on the package!
When I install argus, some modules cannot be found. However, these modules can be found after installing pytorch-argus. Is there any difference between argus and pytorch-argus?
Hey, I am currently facing a problem with argus when trying to combine multiple models (resnet, then pspnet).
It works fine when I do it conventionally, but I get the error TypeError: forward() missing 1 required positional argument when using argus.
The model I mean is something like this:
class PSPNet(nn.Module):
    def __init__(self, in_ch=1, nb_classes=6, bins=(1, 2, 3, 6)):
        super(PSPNet, self).__init__()
        fea_dim = 128
        self.prev_model = resnet(in_ch, nb_classes)  # feature map extract
        self.ppm = PPM(fea_dim, int(fea_dim / len(bins)), bins)
        self.cls = nn.Sequential(
            nn.Conv2d(fea_dim * 2, 64, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, nb_classes, kernel_size=1)
        )

    def forward(self, x):
        x = self.prev_model(x)
        x = self.ppm(x)
        x = self.cls(x)
        return x


class resnet(nn.Module):
    ...  # blabla (body omitted in the original post)
Any recommendations?
Thanks.
Currently the default num_epochs is 1, and there is no support for an infinite number of epochs (using -1, for example).
Line 169 in 37c1cad
Line 227 in 37c1cad
What about adding this feature for the "I do not know how many epochs I need, I just want to provide the EarlyStopping callback or stop the training with KeyboardInterrupt" case?
Checking that either num_epochs > 0 or an EarlyStopping instance is in callbacks would be a valid option.
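The proposed check could look roughly like the sketch below. It is not taken from the argus source: the EarlyStopping class here is an empty stand-in for the library's callback, and epoch_range is a hypothetical helper name. A positive num_epochs gives a bounded range, while a non-positive value is accepted only when an early-stopping callback is present.

```python
import itertools


class EarlyStopping:
    """Hypothetical stand-in for the library's EarlyStopping callback."""


def epoch_range(num_epochs, callbacks=()):
    """Return an iterable of epoch indices, possibly unbounded."""
    if num_epochs > 0:
        return range(num_epochs)
    # Non-positive num_epochs (e.g. -1) means "train until stopped",
    # which is only safe if something can actually stop the loop.
    if any(isinstance(cb, EarlyStopping) for cb in callbacks):
        return itertools.count()
    raise ValueError(
        "num_epochs must be > 0 unless an EarlyStopping callback is provided"
    )
```

A training loop would then iterate with `for epoch in epoch_range(num_epochs, callbacks):` and rely on the callback (or a KeyboardInterrupt) to break out in the unbounded case.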