cuda-code's People

Contributors

sangyc10

cuda-code's Issues

Is there a mistake in the code?

In 3.5lesson/grid2D_block1D.cu, line 14 reads int iy = threadIdx.y;
Should it be replaced with int iy = blockIdx.y;

Speed

I hope the author can quickly write up the blog posts for the earlier lessons. This project is great.

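Question about the unique thread ID

When computing the thread ID with the form shown in the kernel's commented-out lines, the result is correct for a square block_size but wrong for a non-square one. The form currently used in the kernel does not have this problem. Does this mean the thread ID cannot be computed from the x/y coordinates and must instead be computed from tid and bid?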

#include <cstdio>
#include <iostream>
#include <cuda_runtime.h>
#include <opencv2/opencv.hpp>

void Error_check(cudaError_t error, const char* filename, int lineNum){
    if (error!=cudaSuccess){
        printf("cuda error:\r\ncode=%d, name=%s, description=%s\r\nfile=%s, line%d\r\n",
                error, cudaGetErrorName(error), cudaGetErrorString(error), filename, lineNum);
    }
    else{
        printf("no cuda_errors\n");
    }
}

__global__ void bgrtogray(u_char* input, u_char* output, int width, int height){
    // const int x = threadIdx.x + blockIdx.x * blockDim.x;
    // const int y = threadIdx.y + blockIdx.y * blockDim.x;
    // const int idx = x + y * width;

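    // Linear index of this thread within its block.
    const int tid = threadIdx.x + threadIdx.y * blockDim.x;
    // Linear index of this block within the grid.
    const int bid = blockIdx.x + blockIdx.y * gridDim.x;
    // Global linear index: threads per block times block index, plus tid.
    const int idx = tid + blockDim.x * blockDim.y * bid;
    // cv::imread returns interleaved BGR, so 0.114 weights B and 0.299 weights R.
    if (idx < width * height) {
        output[idx] = 0.114f * input[3 * idx] + 0.587f * input[3 * idx + 1] + 0.299f * input[3 * idx + 2];
    }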
}


int main(){
    cv::Mat image = cv::imread("/home/jia/PycharmProjects/img_process/images/3v3蓝色黄色/train/images/chn0_20230808T100503.mp4_360.jpg");
    if (image.empty()){
        std::cerr << "failed to read image" << std::endl;
        return -1;
    }
    std::cout << image.size() << std::endl;
    int height = image.rows;
    int width = image.cols;
    int channels = image.channels();
    size_t nBytes = height * width * channels * sizeof(u_char);
    u_char* image_cuda;
    u_char* output_cuda;
    Error_check(cudaMalloc((void**)&image_cuda, nBytes), __FILE__, __LINE__);
    Error_check(cudaMalloc((void**)&output_cuda, height * width * sizeof(u_char)), __FILE__, __LINE__);
    cudaMemcpy(image_cuda, image.data, nBytes, cudaMemcpyHostToDevice);
    cudaDeviceSynchronize();
    dim3 block_size(4, 1);
    dim3 grid_size((width + block_size.x -1) / block_size.x, (height + block_size.y - 1)/block_size.y);
    bgrtogray<<<grid_size, block_size>>>(image_cuda, output_cuda, width, height);
    cudaDeviceSynchronize();
    Error_check(cudaGetLastError(), __FILE__, __LINE__);
    cv::Mat output(image.size(), CV_8UC1);
    cudaMemcpy(output.data, output_cuda, height * width * sizeof(u_char), cudaMemcpyDeviceToHost);
    cudaDeviceSynchronize();
    cudaFree(image_cuda);
    cudaFree(output_cuda);
    cv::imshow("out",output);
    cv::waitKey(5000);
    return 0;
}

No output

In 2.2lesson, after compiling the code into an executable with nvcc, running it does not print Hello World from the the GPU in the terminal.

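Please upload the later files

Hello, thank you very much for your course and materials. I have studied up to section 4.4 on global memory, but I could not find the PPT files for that section or for the later lessons. Could those documents be shared? Thank you very much!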
