The ready valid controllers I use have combinational paths through them. The decision


Support for Testing Combinational Ready Valid Controllers about chiseltest HOT 8 CLOSED

ucb-bar commented on July 25, 2024

Support for Testing Combinational Ready Valid Controllers

from chiseltest.

Comments (8)

chick commented on July 25, 2024

@ducky64 I'm looking over this, but I think this issue is going to need your attention.


ducky64 commented on July 25, 2024

Yeah, I know. It's on my to-do list. As are a bunch of other things.


stevenmburns commented on July 25, 2024

Here is another way to do the same thing (using logical threads). It is not as beautiful, but it does currently work in testers2.

package examples.testers2

import org.scalatest._

import chisel3.tester._

import chisel3._
import chisel3.util._

import scala.util.Random


class Join extends Module {
  val io = IO( new Bundle {
    val inp0 = Flipped( DecoupledIO( UInt(16.W)))
    val inp1 = Flipped( DecoupledIO( UInt(16.W)))
    val out = DecoupledIO( UInt(16.W))
  })

  io.inp0.nodeq
  io.inp1.nodeq
  io.out.noenq

  when ( io.inp0.valid && io.inp1.valid && io.out.ready) {
    io.out.bits := io.inp0.bits + io.inp1.bits
    io.inp0.ready := true.B
    io.inp1.ready := true.B
    io.out.valid := true.B
  }

}

class Tie extends Module {
  val io = IO( new Bundle {
    val inp = Flipped( DecoupledIO( UInt(16.W)))
    val out = DecoupledIO( UInt(16.W))
  })
 
  io.inp.nodeq
  io.out.noenq

  when ( io.inp.valid && io.out.ready) {
    io.out.bits := io.inp.bits
    io.inp.ready := true.B
    io.out.valid := true.B
  }

}

import chisel3.internal.firrtl.{LitArg, ULit, SLit}

object LogicalThreadTestAdapters {
  class RandomFlip( seed : Int, val num : Int, val den : Int) {
    val rnd = new Random( seed)  

    def nextBoolean() : Boolean = {
      val r = rnd.nextInt(den) < num
      println( s"Random flip: ${r}")
      r
    }
  }

  val rflip = new RandomFlip( 47, 12, 16)

  abstract class BaseAdapter {
    def doPeek : Unit
    def doPoke : Unit
    def isDone : Boolean
  }

  class Manager( clk: Clock, timeout: Int) {
    var nsteps = 0
    def step : Unit = {
      if ( timeout > 0 && nsteps >= timeout) throw new Exception( s"Exceeded clock tick limit (${timeout}).")
      clk.step(1)
      nsteps += 1
    }

    val adapters = collection.mutable.ArrayBuffer[BaseAdapter]()

    def register( a : BaseAdapter) : Unit = {
      adapters.append( a)
    }

    // Each cycle: drive every input first, then sample, so peeks see settled combinational outputs.
    def checkAll : Boolean = {
      while ( !adapters.forall( _.isDone)) {
         adapters.foreach( _.doPoke)
         adapters.foreach( _.doPeek)
         step
      }
      true
    }
  }


  class ReadyValidSource[T <: Data](x: ReadyValidIO[T], tossCoin : () => Boolean, var lst : Stream[T]) extends BaseAdapter {

    x.valid.poke(false.B)

    var pending = false

    def doPoke : Unit = {
      pending = pending || !lst.isEmpty && tossCoin()
      if ( pending) {
         x.bits.poke(lst.head)
      }
      x.valid.poke(pending.B)
    }

    def doPeek : Unit = {
      if ( pending && x.ready.peek().litToBoolean) {
         lst = lst.tail                        
         pending = false
      }
    }

    def isDone : Boolean = lst.isEmpty

  }

  class ReadyValidSink[T <: Data](x: ReadyValidIO[T], tossCoin : () => Boolean, var lst : Stream[T]) extends BaseAdapter {

    x.ready.poke(false.B)

    var pending = false

    def doPoke : Unit = {
      pending = pending || !lst.isEmpty && tossCoin()      
      x.ready.poke( pending.B)
    }

    def doPeek : Unit = {
      if ( pending && x.valid.peek().litToBoolean) {
         println( s"${x.bits.peek().litValue} ${lst.head.litValue}")
         x.bits.expect(lst.head)
         lst = lst.tail
         pending = false
      }
    }

    def isDone : Boolean = lst.isEmpty
  }
}


class TieTestLogicalThread extends FlatSpec with ChiselScalatestTester {
  import LogicalThreadTestAdapters._

  val rnd = new Random()

  behavior of "Testers2 with Tie"

  it should "work with a tie" in {
    test( new Tie) { c =>
      val INP = Stream.fill( 100){ BigInt( 16, rnd)}
      val OUT = INP

      val mgr = new Manager( c.clock, 2000)

      mgr.register( new ReadyValidSource( c.io.inp, rflip.nextBoolean, INP map (_.U)))
      mgr.register( new ReadyValidSink( c.io.out, rflip.nextBoolean, OUT map (_.U)))

      assert( mgr.checkAll)
    }
  }
}

class JoinTestLogicalThread extends FlatSpec with ChiselScalatestTester {
  import LogicalThreadTestAdapters._

  val rnd = new Random()

  behavior of "Testers2 with Join"

  it should "work with a join" in {
    test( new Join) { c =>
      val INP0 = Stream.fill( 100){ BigInt( 16, rnd)}
      val INP1 = Stream.fill( 100){ BigInt( 16, rnd)}
      val OUT = (INP0 zip INP1) map { case (x,y) => (x+y) & ((1<<16)-1)}

      val mgr = new Manager( c.clock, 2000)

      mgr.register( new ReadyValidSource( c.io.inp0, rflip.nextBoolean, INP0 map (_.U)))
      mgr.register( new ReadyValidSource( c.io.inp1, rflip.nextBoolean, INP1 map (_.U)))
      mgr.register( new ReadyValidSink( c.io.out, rflip.nextBoolean, OUT map (_.U)))

      assert( mgr.checkAll)
    }
  }
}
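
As a side note, the combinational dependency that makes the poke-then-peek ordering in Manager.checkAll necessary can be shown with a plain-Scala model of Join. This is only a sketch: JoinModel, Outputs and eval are hypothetical names for illustration, not part of chisel3 or chiseltest, and outBits = None stands in for the don't-care value left by noenq.

```scala
// Plain-Scala model of the Join handshake; hypothetical names, not chiseltest code.
object JoinModel {
  // Outputs of the combinational block for one evaluation.
  final case class Outputs(
    inp0Ready: Boolean,
    inp1Ready: Boolean,
    outValid: Boolean,
    outBits: Option[Int] // None models the don't-care value when nothing fires
  )

  // Every handshake output depends on every handshake input in the same cycle.
  def eval(inp0Valid: Boolean, inp0Bits: Int,
           inp1Valid: Boolean, inp1Bits: Int,
           outReady: Boolean): Outputs = {
    val fire = inp0Valid && inp1Valid && outReady
    Outputs(
      inp0Ready = fire,
      inp1Ready = fire,
      outValid = fire,
      // 16-bit wrap-around, matching the UInt(16.W) addition
      outBits = if (fire) Some((inp0Bits + inp1Bits) & 0xFFFF) else None
    )
  }
}
```

Because inp0Ready is a function of outReady and inp1Valid, a source that samples ready before the sink and the other source have driven their signals would see a stale value for that cycle.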


ducky64 commented on July 25, 2024

Yeah, your second approach is kind of what AdvancedTester currently does, but more understandable.

Overall, I think this is a tough problem. Here are my thoughts on possible solutions, though they all have drawbacks. Perhaps as a group we can come up with something more clever...

Phases

This is basically the longest-standing solution to this problem and kind of a catch-all for thread-ordering issues. It allows user-defined phases, where in each time step all threads execute with phases as the primary partial order. Pokes / peeks between phases are not checked.

This provides a very coarse-grained way of imposing a partial thread order, though it's unclear how scalable it is (or needs to be). Primarily, testers2 could define a few stock phases (e.g. main testdriver and monitor phases come to mind), but it would be unclear how additional user-defined phases, which might not be aware of each other's existence, would be ordered with regard to each other. Potential solution: punt to the top-level test writer, who would specify constraints between phases that are ambiguous. But it's unclear how these solutions could be made re-usable, e.g. not specified again for every test where they are used.

Note that your ReadyValidSink doesn't seem to fit neatly in the monitor phase, since it's also manipulating signals. I know UVM defines a bunch of stock phases (which I'm not familiar with, and which arguably contribute to the learning curve).

Additionally, phases cannot go backwards in time, so when switching to an earlier phase, there must be some kind of time step advance (e.g. clock.step()) before any kind of circuit interaction. This would be checked dynamically, similar to inter-thread interaction checking. Still a gotcha, but at least noisy and safe.
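
To make the "phases as the primary partial order" idea concrete, here is a minimal plain-Scala sketch of one time step. Everything in it (PhaseSketch, Phase, Action, runTimeStep) is a hypothetical name for illustration and not an existing testers2 API; it assumes only the two stock phases mentioned above.

```scala
// Hypothetical sketch of phase-ordered execution within one time step.
object PhaseSketch {
  // Stock phases, ordered by their numeric rank.
  sealed abstract class Phase(val order: Int)
  case object TestDriver extends Phase(0)
  case object Monitor extends Phase(1)

  // One thread's work for the current time step, tagged with its phase.
  final case class Action(thread: String, phase: Phase, run: () => Unit)

  // Runs all actions for one time step and returns the order they ran in.
  // sortBy is stable, so threads keep their spawn order within a phase.
  def runTimeStep(actions: Seq[Action]): Seq[String] = {
    val ordered = actions.sortBy(_.phase.order)
    ordered.foreach(_.run())
    ordered.map(a => s"${a.thread}@${a.phase}")
  }
}
```

With this ordering a sink registered in Monitor always runs after a source registered in TestDriver, regardless of which one was spawned first.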

Thread Order Constraints

Kind of a more fine-grained version of phases, this instead allows users to specify ordering constraints between threads. By being very targeted, this avoids the scalability problem of phases, since the test writer isn't forced to lump a bunch of possibly different threads under an only-somewhat-close-sounding phase.

I haven't fully thought this through, so syntax and semantics are currently unclear. One possibility is for a thread-specific option:

val driverThread = fork { ReadyValidSource.enqueue(...) }
val receiverThread = fork.after(driverThread) { ReadyValidSink.dequeue(...) }

This suffers from the same problem as phases, where if you spawn a thread constrained to run before another thread executes, but after that thread has executed, it would incur a silent one-cycle delay. Potential solutions include:

Only allowing threads to be constrained to run after an existing thread. Current thread ordering constraints mean that any newly spawned thread executes after all previous (and hence reference-able in code) threads have run. It's unclear how much of a limitation not having a before constraint would be.
Allowing users to explicitly specify the one-cycle delay where necessary, similar to the phases proposal (where the system will check that the first thing done is a clock.step() call). This is kind of annoying, in that it forces the test writer to spawn the thread one clock ahead, which cuts against the testers2 goal of lightweight, simple, cycle-accurate testing.

Note that the current threading rules already impose a total thread ordering, so this constraint might just be a way of indicating what cross-thread accesses are safe (and won't cause an error).
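
The "only after an existing thread" option, read as a declaration of which cross-thread accesses are safe, could be modelled roughly as follows. This is a plain-Scala sketch under stated assumptions: ThreadOrderSketch, Registry, fork and mayObserve are hypothetical names, thread ids are plain integers, and no real simulator threads are involved.

```scala
// Hypothetical sketch: "after" constraints as a record of safe cross-thread accesses.
object ThreadOrderSketch {
  final class Registry {
    private var nextId = 0
    // For each thread id, the ids it is (transitively) declared to run after.
    private val runsAfter = scala.collection.mutable.Map[Int, Set[Int]]()

    // Spawns a thread that may only be constrained to run after existing threads,
    // so spawn order is always a valid execution order.
    def fork(after: Int*): Int = {
      require(after.forall(a => runsAfter.contains(a)), "can only run after an existing thread")
      val id = nextId
      nextId += 1
      val direct: Set[Int] = after.toSet
      runsAfter(id) = direct ++ after.flatMap(a => runsAfter(a))
      id
    }

    // The reader may observe the writer's pokes in the same time step
    // only if an ordering between the two was declared.
    def mayObserve(reader: Int, writer: Int): Boolean =
      runsAfter.get(reader).exists(_.contains(writer))
  }
}
```

Since a thread can only name threads that already exist, there is no way to request a "before" ordering, which is exactly the limitation discussed above.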

Example-specific issues

One issue with the example is that it doesn't appear either of the above solutions could be moved into infrastructure (because it interacts across a ReadyValid Sink and Source, and on different Bundles altogether) - so you'd have to attach a phase(...) or thread order constraint wrapper to each fork call, which is cumbersome though not onerous.

Alternatively, it might be possible to have a default phase associated with a wire / ReadyValidSink / etc, though I'm not sure about the generalizability of that abstraction.


ducky64 commented on July 25, 2024

It turns out I need this now. I've given this a bit more thought, and I think the best solution at a library infrastructure level is regions (borrowing hopefully familiar-to-some Verilog terminology for how events within a time slot are ordered; I previously called them ~~phases~~ but that's apparently confusing). The main change from above would be that regions are defined as blocks, instead of being associated with a thread; this allows a thread to have both driver and monitor actions.

There would be two regions per timeslot, main testdriver (default region) and monitor, in that order. Cross-thread operations are not checked between regions, since a region boundary forms a global synchronization barrier. Combinational logic continues to propagate instantly (observably, at least). There are no restrictions going forward through regions (main testdriver to monitor): the thread will pause until the region executes. Going backwards, however, requires a clock step (checked at runtime) before any further simulator interaction.

This is very, very loosely inspired by SystemVerilog regions, though [System]Verilog allows multiple iterations through regions in a timeslot where we don't. But that's probably to support simulating RTL and test stimulus generation in the same framework, whereas testers2 separates them and we don't need to worry about simulating RTL.
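
The forward/backward rule can be sketched as a small per-thread state machine in plain Scala. This is an illustration only: RegionSketch, ThreadState, enter, step and interact are hypothetical names and not the proposed testers2 implementation.

```scala
// Hypothetical sketch of the runtime check for moving between regions.
object RegionSketch {
  sealed abstract class Region(val order: Int)
  case object TestDriver extends Region(0) // default region
  case object Monitor extends Region(1)

  final class ThreadState {
    private var current: Region = TestDriver
    private var needsStep = false

    // Moving forward is always allowed; moving backward arms the runtime check.
    def enter(r: Region): Unit = {
      if (r.order < current.order) needsStep = true
      current = r
    }

    // A clock step starts a new timeslot, which clears the pending check.
    def step(): Unit = {
      needsStep = false
    }

    // Any poke / peek / expect would call this first.
    def interact(): Unit = {
      if (needsStep) throw new IllegalStateException(
        s"returned to $current without a clock step before simulator interaction")
    }
  }
}
```

Leaving a region(Monitor) block corresponds to enter(TestDriver) here, which is why a clock step is the first thing that has to happen after the block.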

In this proposal, the modified DecoupledDriver code would look like:

  def enqueueNow(data: T): Unit = timescope {
    x.bits.poke(data)
    x.valid.poke(true.B)
    region(Monitor) {
      x.ready.expect(true.B)
    }
    getSourceClock.step(1)
  }

  def enqueue(data: T): Unit = timescope {
    x.bits.poke(data)
    x.valid.poke(true.B)
    region(Monitor) {
      while (!x.ready.peek().litToBoolean) {
        getSourceClock.step(1)
      }
    }
    getSourceClock.step(1)
  }

...

  def expectDequeue(data: T): Unit = timescope {
    x.ready.poke(true.B)
    region(Monitor) {
      waitForValid()
      expectPeek(data)
    }
    getSinkClock.step(1)    
  }

  def expectDequeueNow(data: T): Unit = timescope {
    x.ready.poke(true.B)
    region(Monitor) {
      expectPeek(data)
    }
    getSinkClock.step(1)
  }

One big question (as above) would be style conventions for where regions are needed, and how to avoid gotchas like "you need to step the clock after you invoke this driver function". Above, functions always expect to be called in the main testdriver region, and they return in the same testdriver region.


stevenmburns commented on July 25, 2024

This looks good to me. Does it get rid of your issue expressed above: 'it would incur a silent one-cycle delay'?


ducky64 commented on July 25, 2024

Kind of: there's still a delay required, but the compromise is that if you don't put a clock step between switching to an earlier region and the next simulator interaction, it will error out.

Without a fundamental reworking of the abstractions towards an event-driven approach like Verilog, where a line can be run multiple times as sources are updated (which breaks the imperative programming abstraction), I don't think there's a perfect way around the going-backwards-in-time issue.


stevenmburns commented on July 25, 2024

I'm not sure I need to go backwards in time. I'm mostly testing streams and want to make sure they produce the correct values in the correct sequence (assuming various stalls on the valid and ready lines.) Let me know when I can try it out.

