gcc-net's Introduction

Hi there 👋

  • 👩‍🏫 I'm currently an Assistant Professor at Shenzhen University.
  • 🌻 I graduated with a Ph.D. from Peking University in July 2024.
  • 🌱 I’m currently learning object detection.
  • 📫 Feel free to reach me: [email protected]

gcc-net's People

Contributors

ixiaohuihuihui

Forkers

xlnn monkeyzhy

gcc-net's Issues

In "mmdet/models/backbones/swin_test.py", line 204: are these lines of code unused in forward propagation?

In "mmdet/models/backbones/swin_test.py", at line 204:

        # Inside the module's __init__.
        # About 2x faster than original impl
        Wh, Ww = self.window_size
        rel_index_coords = self.double_step_seq(2 * Ww - 1, Wh, 1, Ww)
        rel_position_index = rel_index_coords + rel_index_coords.T
        rel_position_index = rel_position_index.flip(1).contiguous()
        self.register_buffer('relative_position_index', rel_position_index)

    # Defined at class level, next to __init__.
    @staticmethod
    def double_step_seq(step1, len1, step2, len2):
        seq1 = torch.arange(0, step1 * len1, step1)
        seq2 = torch.arange(0, step2 * len2, step2)
        return (seq1[:, None] + seq2[None, :]).reshape(1, -1)

Are these lines used in forward propagation? If so, what is their purpose, especially "rel_index_coords = self.double_step_seq(2 * Ww - 1, Wh, 1, Ww)"? Thank you!
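As a rough illustration of what the quoted lines compute, here is a minimal sketch in plain Python (lists instead of torch tensors, so it runs without PyTorch). It assumes the tensor semantics described in the comments. The function names fast_relative_position_index and reference_relative_position_index are hypothetical and do not come from the repository. Under those assumptions, the result appears to be the usual Swin-style relative position index: entry [i][j] encodes the row and column offset between window positions i and j. The forward code quoted in the other issue on this page indexes relative_position_bias_table with self.relative_position_index, which suggests the buffer is used in forward propagation.

```python
def double_step_seq(step1, len1, step2, len2):
    """Plain-Python analogue of the quoted static method: flattened outer sum of two ranges."""
    seq1 = range(0, step1 * len1, step1)
    seq2 = range(0, step2 * len2, step2)
    return [a + b for a in seq1 for b in seq2]


def fast_relative_position_index(Wh, Ww):
    """Hypothetical mirror of the quoted lines: coords + coords.T, then flip along dim 1."""
    coords = double_step_seq(2 * Ww - 1, Wh, 1, Ww)
    n = len(coords)
    # (coords + coords.T)[i][j] = coords[i] + coords[j]; flip(1) reverses each row.
    return [[coords[i] + coords[n - 1 - j] for j in range(n)] for i in range(n)]


def reference_relative_position_index(Wh, Ww):
    """Direct definition: shift row and column offsets to be non-negative, then flatten."""
    pos = [(h, w) for h in range(Wh) for w in range(Ww)]
    return [[(hi - hj + Wh - 1) * (2 * Ww - 1) + (wi - wj + Ww - 1)
             for (hj, wj) in pos] for (hi, wi) in pos]


index = fast_relative_position_index(2, 2)
for row in index:
    print(row)
```

For a 2 x 2 window the values range from 0 to 8, that is (2 * Wh - 1) * (2 * Ww - 1) distinct offsets, one per possible row of the bias table.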

I have a question, as follows. Thank you!

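Hello, and thank you for your excellent work. Sorry to bother you. The architecture diagram in your paper says that the original image and the enhanced image are each multiplied with the other. I found the corresponding code, and it seems to differ slightly from the diagram. The code is below: "attn = (qx @ ky.transpose(-2, -1))" corresponds to multiplying from the left branch to the right branch in the diagram to obtain the attention scores, but there seems to be no multiplication from the right branch to the left branch to obtain attention scores. Do you know why this is? Thank you! I am not sure what is going on, and I may also have misunderstood.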

        B, N, C = x.shape
        # self.qkv produces q, k, v; output of self.qkv: [5640, 49, 288]
        qkv_x = self.qkv(x).reshape(B, N, 3, self.num_heads,
                                  C // self.num_heads).permute(2, 0, 3, 1, 4)
        qkv_y = self.qkv(y).reshape(B, N, 3, self.num_heads,
                                  C // self.num_heads).permute(2, 0, 3, 1, 4)
        # qkv.shape = [3, 5640, 3, 49, 32]
        # make torchscript happy (cannot use tensor as tuple)
        qx, kx, vx = qkv_x[0], qkv_x[1], qkv_x[2]
        qy, ky, vy = qkv_y[0], qkv_y[1], qkv_y[2]

        qx = qx * self.scale
        # qy = qy * self.scale
        attn = (qx @ ky.transpose(-2, -1))

        # Look up the relative position bias table:
        relative_position_bias = self.relative_position_bias_table[
            self.relative_position_index.view(-1)].view(
                self.window_size[0] * self.window_size[1],
                self.window_size[0] * self.window_size[1],
                -1)  # Wh*Ww,Wh*Ww,nH
        relative_position_bias = relative_position_bias.permute(
            2, 0, 1).contiguous()  # nH, Wh*Ww, Wh*Ww
        attn = attn + relative_position_bias.unsqueeze(0)

        if mask is not None:
            nW = mask.shape[0]
            attn = attn.view(B // nW, nW, self.num_heads, N,
                             N) + mask.unsqueeze(1).unsqueeze(0)
            attn = attn.view(-1, self.num_heads, N, N)
        attn = self.softmax(attn)

        attn = self.attn_drop(attn)

        x = (attn @ vx).transpose(1, 2).reshape(B, N, C)
        x = self.proj(x)
        x = self.proj_drop(x)

        y = (attn @ vy).transpose(1, 2).reshape(B, N, C)
        y = self.proj(y)
        y = self.proj_drop(y)
        return x, y
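To make the observation in the question concrete, here is a minimal sketch in plain Python with hypothetical toy values (2 tokens, 2 channels, one head, no scaling, bias, or softmax). It only illustrates that the score matrix built from qx and ky generally differs from the one built from qy and kx, so the quoted code computes a single direction and reuses that one attn for both vx and vy. It makes no claim about what the paper's diagram intends.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


# Hypothetical toy values, not taken from the repository.
qx = [[1, 0], [0, 1]]
kx = [[1, 2], [3, 4]]
qy = [[0, 1], [1, 0]]
ky = [[5, 6], [7, 8]]

x_to_y = matmul(qx, transpose(ky))  # analogue of the only score matrix in the quoted code
y_to_x = matmul(qy, transpose(kx))  # the reverse direction, absent from the quoted code

print(x_to_y)
print(y_to_x)
```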
