
Comments (4)

glenn-jocher commented on May 18, 2024

Multi-GPU training is not supported yet. See Issue #21.

from yolov3.

longxianlei commented on May 18, 2024

The error occurs because box2 is a torch.FloatTensor: anchor_vec lives on the CPU, while box1 is on the GPU. Calling .cuda() on anchor_vec converts it to a torch.cuda.FloatTensor:

    box2 = anchor_vec.cuda().unsqueeze(1)
    inter_area = torch.min(box1, box2).prod(2)

Once this is fixed, the following lines raise the same kind of error:

    txy[b, a, gj, gi] = gxy - gxy.floor()

    # Width and height
    twh[b, a, gj, gi] = torch.log(gwh / anchor_vec[a])

Move each tensor named in the error message to the GPU in the same way.
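As a device-agnostic variant of the fix above, the anchors can be moved to whatever device box1 is on instead of hard-coding .cuda(). This is only a minimal sketch, not the repository's code: the function name `anchor_intersection` and the tensor shapes in the comments are assumptions made for illustration.

```python
import torch


def anchor_intersection(box1, anchor_vec):
    """Intersection area between target sizes and anchor sizes.

    box1:       (n, 2) tensor of target widths and heights (assumed shape)
    anchor_vec: (na, 2) tensor of anchor widths and heights (assumed shape)
    """
    # Follow box1's device so the same code runs on CPU or on any GPU.
    box2 = anchor_vec.to(box1.device).unsqueeze(1)  # (na, 1, 2)
    # Element-wise min broadcasts to (na, n, 2); the product over the
    # last dimension gives width * height of the overlap.
    inter_area = torch.min(box1, box2).prod(2)  # (na, n)
    return inter_area
```

The same `.to(other.device)` pattern would apply to `anchor_vec[a]` in the `twh` line.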
However, the main obstacle to multi-GPU training lies in the training loop:

    for i, (imgs, targets, _, _) in enumerate(dataloader):

Here imgs is a tensor, but targets is a list. When the model runs in parallel, imgs.to(device) works and the images are split into chunks of batch_size / GPU_nums per GPU. targets cannot be moved with targets.to(device) because it is a list, and since it still holds entries for the whole batch, it cannot be distributed across the GPUs.
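One possible way around the list problem is to pack the per-image targets into a single tensor whose first column is the image index within the batch. This is a sketch under assumptions and not necessarily what the repository does. The helper name `collate_targets` and the (n_i, 5) row layout (class, x, y, w, h) are hypothetical.

```python
import torch


def collate_targets(targets):
    """Pack a list of per-image target tensors into one (N, 6) tensor.

    targets: list with one (n_i, 5) tensor per image (assumed layout:
             class, x, y, w, h). Images without labels may be empty.
    Returns rows of the form (image_index, class, x, y, w, h).
    """
    rows = []
    for i, t in enumerate(targets):
        if t.numel() == 0:
            continue  # image has no labels
        # Prepend the image's position in the batch to every row.
        idx = torch.full((t.shape[0], 1), float(i), dtype=t.dtype)
        rows.append(torch.cat((idx, t), dim=1))
    if not rows:
        return torch.zeros((0, 6))
    return torch.cat(rows, dim=0)
```

A single tensor like this can be moved with `.to(device)`, and the rows belonging to image `i` can be selected with `out[out[:, 0] == i]`.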


longxianlei commented on May 18, 2024

    if nM > 0:
        lxy = k * MSELoss(xy[mask], txy[mask])
        lwh = k * MSELoss(wh[mask], twh[mask])

Here xy, txy, wh, and twh do not share the same batch dimension: xy and wh have size batch_size / GPU_nums, while txy and twh are built from the targets of the whole batch (batch_size). This mismatch causes an error.


glenn-jocher commented on May 18, 2024

@longxianlei we just merged our under-development multi_gpu branch into the master branch via PR, so multi-GPU training now works. Many of the items you raised above should be resolved. See #135 for more info.
