
Comments (39)

qleenju avatar qleenju commented on June 14, 2024

Hello, thanks for your attention!

But I don't quite understand your question. Could you please give a more detailed explanation or an example?

from pdpu.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

I cannot see the error report you attached. Is your question how to assign values to operands_a and operands_b in the testbench?

In the top module, operands_a is declared as follows: input logic [N-1:0][n_i-1:0] operands_a. Assuming N=4 and n_i=8, you can drive it in the testbench with operands_a = {8'h12, 8'h34, 8'h56, 8'h78} or, equivalently, operands_a = 32'h12345678. In both cases operands_a[3] is 8'h12 and operands_a[0] is 8'h78. operands_b is assigned in the same way.
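The element ordering of the packed array can be modelled in software. The following is a minimal Python sketch of the bit layout only, assuming N=4 and n_i=8 as in the example above; `pack_operands` and `unpack_operands` are hypothetical helper names, not part of the PDPU repo.

```python
def pack_operands(elements, width=8):
    """Pack elements[0..N-1] into one integer, with element 0 in the least significant bits."""
    word = 0
    for index, element in enumerate(elements):
        if not 0 <= element < (1 << width):
            raise ValueError(f"element {index} does not fit in {width} bits")
        word |= element << (index * width)
    return word


def unpack_operands(word, count=4, width=8):
    """Split a packed word back into `count` elements, element 0 first."""
    mask = (1 << width) - 1
    return [(word >> (index * width)) & mask for index in range(count)]


# {8'h12, 8'h34, 8'h56, 8'h78} and 32'h12345678 are the same 32-bit value.
elements = unpack_operands(0x12345678)
print([hex(e) for e in elements])  # ['0x78', '0x56', '0x34', '0x12']
```

The leftmost item of the concatenation ends up in the highest index, which is easy to get backwards when writing operands element by element.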

Hope it helps you.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

There is an error in the always block you wrote in your testbench. You should assign values to all inputs (including operands_a and operands_b) at the same time.

For example, the testbench can be revised as follows:

`timescale 1ns/1ps
module tbpdpu;
   ...

  // change the `always` block to an `initial` block
  initial
  begin
    #10;
    operands_a[0]=8'b00000011;
    operands_a[1]=8'b00011100;
    operands_a[2]=8'b00000000;
    operands_a[3]=8'b00000000;
    operands_b[0]=8'b11111111;
    operands_b[1]=8'b11111111;
    operands_b[2]=8'b11111111;
    operands_b[3]=8'b11111111;
    // `acc` has been initialized to zero in the former `initial` block

    #10;  // wait for computation
    // Display the result `result_o`
    $display("Result: %h", result_o);
  end
endmodule

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

To verify the functional correctness of a hardware module, the stimulus and the golden results are usually generated with a software library. The stimulus is then applied to the hardware module, and the hardware output is compared against the golden results.
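The flow described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `random_vectors` and `compare` are hypothetical helpers, the default widths (N=4, n_i=8, n_o=16) are taken from the testbenches in this thread, and `toy_golden` is an integer stand-in used to exercise the harness, not posit arithmetic. In a real flow the golden model would call a posit software library, and the hardware side would return the `result_o` values captured from simulation.

```python
import random


def random_vectors(count, n=4, n_i=8, n_o=16, seed=0):
    """Yield (operands_a, operands_b, acc) tuples of raw bit patterns."""
    rng = random.Random(seed)
    for _ in range(count):
        a = [rng.getrandbits(n_i) for _ in range(n)]
        b = [rng.getrandbits(n_i) for _ in range(n)]
        yield a, b, rng.getrandbits(n_o)


def compare(vectors, golden_model, hardware_output):
    """Return every vector on which the two callables disagree."""
    mismatches = []
    for a, b, acc in vectors:
        expected = golden_model(a, b, acc)
        actual = hardware_output(a, b, acc)
        if expected != actual:
            mismatches.append((a, b, acc, expected, actual))
    return mismatches


# Toy integer stand-in (NOT posit arithmetic), used only to run the harness.
def toy_golden(a, b, acc):
    return (acc + sum(x * y for x, y in zip(a, b))) & 0xFFFF


vectors = list(random_vectors(10))
print(len(compare(vectors, toy_golden, toy_golden)))  # 0: a model always agrees with itself
```

Fixing the seed keeps the stimulus reproducible, so a mismatch found in one run can be replayed in the simulator.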

By the way, in the posit_encoder module, the statement result_o = input_not_zero ? normal_result : '0 handles the special case where the input is all zeros.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

Out = Acc + Va · Vb = Acc + a0·b0 + a1·b1 + a2·b2 + a3·b3 (for N = 4). The inputs are operands_a, operands_b and acc in pdpu_top.sv, and the output is result_o in pdpu_top.sv.
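As a real-valued reference for this formula, here is a minimal Python sketch using exact rationals. `dot_product_accumulate` is a hypothetical name, and the arguments are the numeric values that the operands represent, not their posit bit patterns. Since result_o is an n_o-bit posit, the hardware result may differ from this exact value by rounding.

```python
from fractions import Fraction


def dot_product_accumulate(acc, a, b):
    """Return acc + sum(a[i] * b[i]) as an exact rational number."""
    if len(a) != len(b):
        raise ValueError("a and b must have the same length")
    return Fraction(acc) + sum(Fraction(x) * Fraction(y) for x, y in zip(a, b))


out = dot_product_accumulate(
    Fraction(1, 2),
    [1, 2, Fraction(1, 4), 0],
    [3, Fraction(1, 2), 4, 5],
)
print(out)  # 1/2 + 3 + 1 + 1 + 0 = 11/2
```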

To use the modules, you can start with the top module, i.e., pdpu_top.sv or the pipelined version pdpu_top_pipelined.sv.

You can also refer to our paper published at ISCAS'23 for a better understanding. (https://ieeexplore.ieee.org/document/10182007)

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

You can try running the simple testbench below, where the inputs are randomly generated.

`timescale 1ns/1ps
module tb_pdpu_top();
    localparam int unsigned N = 4;
    localparam int unsigned n_i = 8;
    localparam int unsigned es_i = 1;
    localparam int unsigned n_o = 16;
    localparam int unsigned es_o = 2;
    localparam int unsigned ALIGN_WIDTH = 14;

    logic [N-1:0][n_i-1:0] operands_a;
    logic [N-1:0][n_i-1:0] operands_b;
    logic [n_o-1:0] acc;
    logic [n_o-1:0] result_o;

    pdpu_top #(
        .N(N),
        .n_i(n_i),
        .es_i(es_i),
        .n_o(n_o),
        .es_o(es_o),
        .ALIGN_WIDTH(ALIGN_WIDTH)
    ) u_pdpu_top(
        .operands_a(operands_a),
        .operands_b(operands_b),
        .acc(acc),
        .result_o(result_o)
    );

    integer i,j;
    
    initial begin
        for(j=0;j<10;j++) begin
            for(i=0;i<N;i++) begin
                operands_a[i] = $random;
                operands_b[i] = $random;       
            end
            acc = $random;
            #10;
        end
    end
endmodule

The waveform is as follows:
[waveform image]

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

Could you please post your complete testbench code in the comments? I will try running it to find the potential problems.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

Firstly, you should add a $finish or $stop command as follows; otherwise the simulation will run forever.

initial begin
  #1000 $finish;
end

Secondly, I have tried running the testbench you attached above. The waveform is as follows:
[waveform image]
result_o is not always zero; it settles to a fixed value because the inputs no longer change.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

The testbench code is revised from your attached code.

module tb_pdpu_top;
  reg [3:0][7:0]operands_a;
  reg [3:0][7:0]operands_b;
  reg [15:0] acc;
  reg [15:0] result_o;
  parameter int unsigned N = 4;                   // dot-product size
  parameter int unsigned n_i = 8;                 // word size
  parameter int unsigned es_i = 2;                // exponent size
  parameter int unsigned n_o = 16;
  parameter int unsigned es_o = 2;
  parameter int unsigned ALIGN_WIDTH = 14;

  pdpu_top #(
    .N(N), .n_i(n_i), .es_i(es_i), .n_o(n_o), .es_o(es_o), .ALIGN_WIDTH(ALIGN_WIDTH)
  ) dut (
    .operands_a(operands_a), .operands_b(operands_b), .acc(acc), .result_o(result_o)
  );

  initial
  begin
    operands_a[0]=8'b00000000;
    operands_a[1]=8'b00000000;
    operands_a[2]=8'b00000000;
    operands_a[3]=8'b00000000;
    operands_b[0]=8'b00000000;
    operands_b[1]=8'b00000000;
    operands_b[2]=8'b00000000;
    operands_b[3]=8'b00000000;
    acc=16'b0000000000000000;
  end

  always
  begin
    #10 operands_a[0]=8'b00000011;
    #10 operands_a[1]=8'b00011100;
    #10 operands_a[2]=8'b00000000;
    #10 operands_a[3]=8'b00000000;
    #10 operands_b[0]=8'b11111111;
    #10 operands_b[1]=8'b11111111;
    #10 operands_b[2]=8'b11111111;
    #10 operands_b[3]=8'b11111111;
    #10 acc=16'b0001011111111001;
    // Display the result
    $display("Result: %h", result_o);
  end

  initial begin
    #1000;
    $finish;
  end

endmodule

You can run it with Mentor QuestaSim. The simulation script is as follows:

quit -sim
vlib work
vmap work work

# Compile rtl files
vlog ../sources/pdpu_pkg.sv

vlog ../sources/cf_math_pkg.sv
vlog ../sources/lzc.sv
vlog ../sources/barrel_shifter.sv
vlog ../sources/posit_decoder.sv

vlog ../sources/booth_encoder.sv
vlog ../sources/gen_product.sv
vlog ../sources/gen_prods.sv

vlog ../sources/fulladder.sv
vlog ../sources/compressor_3to2.sv
vlog ../sources/counter_5to3.sv
vlog ../sources/compressor_4to2.sv
vlog ../sources/csa_tree.sv

vlog ../sources/radix4_booth_multiplier.sv

vlog ../sources/comparator.sv
vlog ../sources/comp_tree.sv

vlog ../sources/mantissa_norm.sv

vlog ../sources/posit_encoder.sv

vlog ../sources/pdpu_top.sv

# Compile testbench
vlog ../test/tb_pdpu_top.sv

# Simulation
vsim -novopt work.tb_pdpu_top
# Add wave
add wave /*
# Run
run -all

This is a hardware arithmetic unit that can be deployed as a computing core in an accelerator or processor. I don't think the code can be used directly in a CNN model.


qleenju avatar qleenju commented on June 14, 2024

By the way, the simulation script can be simplified as follows:

quit -sim
vlib work
vmap work work

# Compile rtl files
vlog ../sources/pdpu_pkg.sv
vlog ../sources/cf_math_pkg.sv
vlog ../sources/*.sv

# Compile testbench
vlog ../test/tb_pdpu_top.sv

# Simulation
vsim -novopt work.tb_pdpu_top
# Add wave
add wave /*
# Run
run -all

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

You can give it a try. The Tcl simulation script is the same, but as I recall ModelSim does not support SystemVerilog very well.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

Then you can have a try. These questions are beyond the scope of the repo, and you can solve the remaining problems through Google.

tharshinikumar avatar tharshinikumar commented on June 14, 2024

tharshinikumar avatar tharshinikumar commented on June 14, 2024

fantasysee avatar fantasysee commented on June 14, 2024

Please see our papers below for further details on leveraging posit numbers in deep learning.

  1. Lu, Jinming, et al. "Evaluations on deep neural networks training using posit number system." IEEE Transactions on Computers 70.2 (2020): 174-187.
  2. Lu, Jinming, et al. "Training deep neural networks using posit number system." 2019 32nd IEEE International System-on-Chip Conference (SOCC). IEEE, 2019.

You can find other references in the reference list of our PDPU paper as well. Hope this is helpful to you.

> You mentioned in the paper that it is used for deep learning applications. In what types of deep learning applications can it be implemented? If so, could you please send me any reference papers?

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

> Dear Madam/Sir, is it possible to use a Dadda multiplier instead of the Booth multiplier in this PDPU?

Yes, you can replace the Booth multiplier in PDPU with a Dadda multiplier (I have not implemented it).

tharshinikumar avatar tharshinikumar commented on June 14, 2024

qleenju avatar qleenju commented on June 14, 2024

> Can you provide an example of solving the equation out = acc + a0·b0 + ...? I am not getting the expected value when I apply inputs and check them against the equation. For example, with acc=2, a0=1, a1=1, a2=1, a3=1, b0=2, b1=2, b2=2, b3=2 (decimal), the output should be 10 (decimal), but I get 12. In another example, with acc=10, a0=3, a1=4, a2=5, a3=6, b0=1, b1=2, b2=3, b3=4 (decimal), the output should be 60 (decimal), but I get 98. I also tried various hex values and did not get the correct output. Can you please tell me the reason?

This is because PDPU is implemented based on the posit format, rather than the traditional IEEE-754 floating-point format or an integer format. The input bit patterns are therefore interpreted as posit values, not as integers.

Regarding the posit format, you can refer to the following article:
[1] Gustafson J L, Yonemoto I T. Beating floating point at its own game: Posit arithmetic[J]. Supercomputing Frontiers and Innovations, 2017, 4(2): 71-86.
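To see why integer-looking inputs do not behave like integers, it can help to decode a few bit patterns by hand. The following is a minimal Python sketch of a posit decoder, assuming the sign/regime/exponent/fraction layout described in [1]; `decode_posit` is a hypothetical helper, not part of the PDPU repo, and it has not been checked against the PDPU RTL. The example uses n=8 and es=2, matching the es_i of the revised testbench earlier in this thread.

```python
from fractions import Fraction


def decode_posit(bits, n, es):
    """Decode an n-bit posit with es exponent bits; returns a Fraction, or None for NaR."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return Fraction(0)
    if bits == 1 << (n - 1):
        return None  # NaR ("not a real")
    negative = bool(bits >> (n - 1))
    if negative:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    body = format(bits, f"0{n}b")[1:]  # the n-1 bits after the sign bit
    first = body[0]
    run = len(body) - len(body.lstrip(first))  # length of the regime run
    k = run - 1 if first == "1" else -run
    rest = body[run + 1:]  # skip the bit that terminates the regime
    exp_bits = rest[:es].ljust(es, "0")  # truncated exponent bits read as zeros
    frac_bits = rest[es:]
    exponent = int(exp_bits, 2) if es else 0
    fraction = Fraction(int(frac_bits, 2), 1 << len(frac_bits)) if frac_bits else Fraction(0)
    scale = k * (1 << es) + exponent
    value = (1 + fraction) * Fraction(2) ** scale
    return -value if negative else value


for pattern in (0x01, 0x02, 0x40, 0x48):
    print(f"0x{pattern:02X} -> {decode_posit(pattern, n=8, es=2)}")
```

Under these assumptions, the pattern 8'h01 decodes to 2^-24 rather than 1, while the value 1 is encoded as 8'h40 and the value 2 as 8'h48. Driving the operands with the integer patterns 1, 2, 3, ... therefore feeds very small posit values into the unit, not the integers used in the hand calculation.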
