The NEORV32 RISC-V Processor
- Overview
- CPU Features
- Processor/SoC Features
- Software Framework
- FPGA Implementation Results
- Performance
- Getting Started
- Legal
Overview
The NEORV32 Processor is a customizable microcontroller-like system on chip (SoC) that is based on the RISC-V NEORV32 CPU. The project is intended as an auxiliary processor in larger SoC designs or as a ready-to-go stand-alone custom / customizable microcontroller.
The asciidoc sources of the documentation can be found in docs/src_adoc.
The doxygen-based documentation of the software framework is also available online at GitHub-pages.
The project's change log is available in CHANGELOG.md.
To see the changes between official releases, visit the project's release page.
The boards folder provides exemplary EDA setups targeting various FPGA boards to get you started.
See CONTRIBUTE.md for how to contribute.
Project Key Features
- CPU plus Processor/SoC plus Software Framework
- completely described in behavioral, platform-independent VHDL - no primitives, macros, etc.
- fully synchronous design, no latches, no gated clocks
- be as small as possible (while being as RISC-V-compliant as possible), but with a reasonable size-performance trade-off (the processor has to fit in a Lattice iCE40 UltraPlus 5k low-power FPGA running at 22+ MHz)
- from zero to printf("hello world!");
- completely open source and documented
- easy to use even for FPGA/RISC-V starters; intended to work out of the box
NEORV32 CPU Features
The CPU (top entity: rtl/core/neorv32_cpu.vhd) implements the RISC-V 32-bit rv32 ISA with optional extensions. It is compatible with a subset of the Unprivileged ISA Specification (Version 2.2) and a subset of the Privileged Architecture Specification (Version 1.12-draft). The CPU passes the official RISC-V architecture tests (see riscv-arch-test/README).
To keep the design small, the NEORV32 CPU implements a two-stage pipeline in which each stage uses a multi-cycle processing scheme. Instruction and data accesses are conducted via independent bus interfaces that are multiplexed into a single SoC bus ("modified Harvard architecture"). As a special execution safety feature, all reserved or unimplemented instructions raise an exception. Furthermore, the CPU was assigned an official RISC-V open-source architecture ID.
Currently implemented RISC-V-compatible ISA extensions
- A - atomic memory access instructions (optional)
- B - bit manipulation instructions (subset, optional, still experimental)
- C - compressed 16-bit instructions (optional)
- E - embedded CPU (reduced register file size) (optional)
- I - base integer instruction set (always enabled)
- M - integer multiplication and division hardware (optional)
- U - less-privileged user mode in combination with the standard machine mode (optional)
- X - NEORV32-specific extensions (always enabled)
- Zfinx - IEEE-754 single-precision floating-point extensions (optional)
- Zicsr - control and status register access instructions (+ exception/irq system) (optional)
- Zifencei - instruction stream synchronization (optional)
- PMP - physical memory protection (optional)
- HPM - hardware performance monitors (optional)
- DB - RISC-V CPU debug mode (optional)
Operation modes / privilege levels
- machine
- user (U extension)
- debug_mode (DB extension)
Interrupts (machine level)
- RISC-V standard interrupts
  - timer - via MTIME SoC module or via external signal
  - external - via external signal
  - software - via external signal
- 16 additional "fast interrupt" requests
NEORV32 Processor Features
The NEORV32 Processor (top entity: rtl/core/neorv32_top.vhd) provides a full-featured SoC built around the NEORV32 CPU. It is highly configurable, so you can customize it to your needs.
Included SoC modules:
- processor-internal data and instruction memories (DMEM / IMEM) & cache (iCACHE)
- bootloader (BOOTLDROM) with UART console and automatic application boot from external SPI flash option
- machine system timer (MTIME), RISC-V-compatible
- watchdog timer (WDT)
- two independent universal asynchronous receivers and transmitters (UART0 and UART1) with optional RTS/CTS hardware flow control
- 8/16/24/32-bit serial peripheral interface controller (SPI) with 8 dedicated chip select lines
- two-wire serial interface controller (TWI) supporting clock-stretching, compatible with the I²C standard
- general purpose parallel IO port (GPIO), 32xOut & 32xIn with pin-change interrupt
- 32-bit external bus interface, Wishbone b4 compatible (WISHBONE)
- wrapper for AXI4-Lite Master Interface
- PWM controller with 4 channels and 8-bit duty cycle resolution (PWM)
- ring-oscillator-based true random number generator (TRNG)
- custom functions subsystem (CFS) for tightly-coupled custom co-processor extensions
- numerically-controlled oscillator (NCO) with three independent channels
- smart LED interface (NEOLED) to directly drive WS2812-compatible (NeoPixel(TM)) LEDs
- on-chip debugger (OCD) via JTAG - compatible with the Minimal RISC-V Debug Specification Version 0.13.2 and with OpenOCD and gdb
- alternative top entities/wrappers available
NEORV32 Software Framework
- core libraries for high-level usage of the provided functions and peripherals
- application compilation based on GNU makefiles
- gcc-based toolchain (pre-compiled toolchains available)
- bootloader with UART interface console
- runtime environment for handling traps
- several example programs to get started including CoreMark, FreeRTOS and Conway's Game of Life
- doxygen-based documentation, available on GitHub pages
FPGA Implementation Results
NEORV32 CPU
Implementation results for exemplary CPU configurations generated for an Intel Cyclone IV EP4CE22F17C6N FPGA on a DE0-nano board using Intel Quartus Prime Lite 20.1 ("balanced implementation"). The timing information is derived from the Timing Analyzer / Slow 1200mV 0C Model. No constraints were used at all.
Results generated for hardware version 1.5.3.2.
CPU Configuration | LEs | FFs | Memory bits | DSPs (9-bit) | f_max |
---|---|---|---|---|---|
rv32i | 980 | 409 | 1024 | 0 | 123 MHz |
rv32i + Zicsr | 1835 | 856 | 1024 | 0 | 124 MHz |
rv32imac + Zicsr | 2685 | 1156 | 1024 | 0 | 124 MHz |
rv32imac + Zicsr + u + Zifencei | 2715 | 1162 | 1024 | 0 | 122 MHz |
rv32imac + Zicsr + u + Zifencei + Zfinx | 4004 | 1812 | 1024 | 7 | 121 MHz |
Setups with the E extension (embedded CPU extension) enabled provide the same LUT and FF utilization and identical f_max as the corresponding I configuration. However, the size of the register file, and thus the embedded memory utilization, is cut in half.
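The table can be read as a rough cost estimate per added extension. The following is a minimal sketch that only re-uses the LE and FF figures listed above; the helper names are made up for illustration. The register file sizes assume the standard RISC-V register counts (32 registers for I, 16 for E) at 32 bits each, which is consistent with the 1024 memory bits reported in the table. Differences between rows are only approximate costs, since synthesis optimizes each configuration as a whole.

```python
# Resource figures copied from the CPU table above
# (Cyclone IV, hardware version 1.5.3.2): (configuration, LEs, FFs)
cpu_results = [
    ("rv32i", 980, 409),
    ("rv32i + Zicsr", 1835, 856),
    ("rv32imac + Zicsr", 2685, 1156),
    ("rv32imac + Zicsr + u + Zifencei", 2715, 1162),
    ("rv32imac + Zicsr + u + Zifencei + Zfinx", 4004, 1812),
]


def incremental_cost(results):
    """Return (configuration, extra LEs, extra FFs) relative to the previous row."""
    deltas = []
    for (_, prev_le, prev_ff), (name, le, ff) in zip(results, results[1:]):
        deltas.append((name, le - prev_le, ff - prev_ff))
    return deltas


def regfile_bits(num_registers, xlen=32):
    """Storage needed for num_registers registers of xlen bits each."""
    return num_registers * xlen


for name, d_le, d_ff in incremental_cost(cpu_results):
    print(f"{name}: +{d_le} LEs, +{d_ff} FFs")

# Assumption: 32 registers for the I base ISA, 16 for the E extension.
print("I register file:", regfile_bits(32), "bits")
print("E register file:", regfile_bits(16), "bits")
```

In this reading, adding Zicsr to the bare rv32i core accounts for roughly 855 LEs, while the step that adds u and Zifencei costs only about 30 LEs.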
NEORV32 Processor
See the boards folder for exemplary setups targeting various FPGA boards.
Results generated for hardware version 1.4.9.0.
Unless otherwise noted, the setups use the default configuration (like no TRNG),
no external memory interface and only internal instruction and data memories
(IMEM uses 16kB and DMEM uses 8kB memory space).
Vendor | FPGA | Board | Toolchain | CPU Configuration | LUT / LE | FF / REG | DSP (9-bit) | Memory Bits | BRAM / EBR | SPRAM | Frequency |
---|---|---|---|---|---|---|---|---|---|---|---|
Intel | Cyclone IV EP4CE22F17C6N | Terasic DE0-Nano | Quartus Prime Lite 20.1 | rv32imcu_Zicsr_Zifencei | 3813 (17%) | 1904 (8%) | 0 (0%) | 231424 (38%) | - | - | 119 MHz |
Lattice | iCE40 UltraPlus iCE40UP5K-SG48I | boards/UPduino_v3 | Radiant 2.1 (LSE) | rv32imac_Zicsr | 5123 (97%) | 1972 (37%) | 0 (0%) | - | 12 (40%) | 4 (100%) | c 24 MHz |
Xilinx | Artix-7 XC7A35TICSG324-1L | Arty A7-35T | Vivado 2019.2 | rv32imcu_Zicsr_Zifencei + PMP | 2465 (12%) | 1912 (5%) | 0 (0%) | - | 8 (16%) | - | c 100 MHz |
Performance
The NEORV32 CPU is based on a two-stage pipelined architecture. Each stage uses a multi-cycle processing scheme.
Hence, each instruction requires several clock cycles to execute (2 cycles for ALU operations and up to 40 cycles for divisions).
By default, the CPU-internal shifter as well as the multiplier and divider of the M extension use a bit-serial approach
and require several cycles for completion. The average CPI (cycles per instruction) depends on the instruction mix of a
specific application and also on the available CPU extensions.
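To illustrate how the instruction mix drives the average CPI, here is a minimal sketch of a weighted-average calculation. Only the two cycle counts (2 cycles for ALU operations, up to 40 cycles for divisions) come from the text above; the instruction mixes and the function name are hypothetical and not measured on the NEORV32.

```python
def average_cpi(mix):
    """Weighted-average CPI for a list of (fraction, cycles) pairs."""
    total = sum(fraction for fraction, _ in mix)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("instruction fractions must sum to 1")
    return sum(fraction * cycles for fraction, cycles in mix)


ALU_CYCLES = 2   # from the text above
DIV_CYCLES = 40  # upper bound from the text above

# Hypothetical instruction mixes, chosen only for illustration.
alu_only = [(1.0, ALU_CYCLES)]
one_percent_div = [(0.99, ALU_CYCLES), (0.01, DIV_CYCLES)]

print(f"ALU only:     {average_cpi(alu_only):.2f}")
print(f"1% divisions: {average_cpi(one_percent_div):.2f}")
```

Under these assumptions, a mix with just 1% worst-case divisions raises the average CPI from 2.00 to about 2.38, which is why the result depends so strongly on the application.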
The following table shows the performance results (relative CoreMark score and average cycles per instruction) for successfully running 2000 iterations of the CoreMark CPU benchmark, which serves as a reasonable proxy for a "real-life" workload. The source files are available in sw/example/coremark.
**CoreMark Setup**
Hardware: 32kB IMEM, 8kB DMEM, no caches, 100MHz clock
CoreMark: 2000 iterations, MEM_METHOD is MEM_STACK
Compiler: RISCV32-GCC 10.1.0 (rv32i toolchain)
Compiler flags: default, see makefile
Optimization: -O3
Peripherals: UART for printing the results
Results generated for hardware version 1.4.9.8.
CPU (including Zicsr extension) | Executable Size | CoreMark Score | CoreMarks/MHz | Total Clock Cycles | Executed Instructions | Average CPI |
---|---|---|---|---|---|---|
rv32i | 28 756 bytes | 36.36 | 0.3636 | 5595750503 | 1466028607 | 3.82 |
rv32imc | 22 008 bytes | 68.97 | 0.6897 | 2981786734 | 611814918 | 4.87 |
rv32imc + FAST_MUL_EN + FAST_SHIFT_EN | 22 008 bytes | 90.91 | 0.9091 | 2265135174 | 611814948 | 3.70 |
The FAST_MUL_EN configuration uses DSPs for the multiplier of the M extension (enabled via the FAST_MUL_EN generic). The FAST_SHIFT_EN configuration uses a barrel shifter for CPU shift operations (enabled via the FAST_SHIFT_EN generic).
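The derived columns of the CoreMark table can be recomputed from the raw figures. The sketch below assumes that CoreMarks/MHz is the score divided by the 100 MHz clock from the setup, and that the average CPI is the total clock cycles divided by the executed instructions; the function and variable names are illustrative only.

```python
# Raw figures copied from the CoreMark table above:
# (configuration, CoreMark score, total clock cycles, executed instructions)
runs = [
    ("rv32i", 36.36, 5595750503, 1466028607),
    ("rv32imc", 68.97, 2981786734, 611814918),
    ("rv32imc + FAST_MUL_EN + FAST_SHIFT_EN", 90.91, 2265135174, 611814948),
]

CLOCK_MHZ = 100  # clock frequency from the CoreMark setup


def derive(score, cycles, instructions, clock_mhz=CLOCK_MHZ):
    """Return (CoreMarks/MHz, average CPI) for one benchmark run."""
    return score / clock_mhz, cycles / instructions


baseline_score = runs[0][1]
for name, score, cycles, instructions in runs:
    per_mhz, cpi = derive(score, cycles, instructions)
    speedup = score / baseline_score
    print(f"{name}: {per_mhz:.4f} CoreMarks/MHz, "
          f"CPI {cpi:.2f}, {speedup:.2f}x vs. rv32i")
```

The recomputed CPI values match the table. Note that rv32imc has a higher average CPI than rv32i but still scores better, because it executes far fewer instructions for the same work.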
Getting Started
This overview provides some quick links to the most important sections of the online documentation.
Hardware Overview
- NEORV32 Processor - the SoC
  - Top Entity - Signals - how to connect to the processor
  - Top Entity - Generics - configuration options
  - Address Space - memory space and memory-mapped IO
  - SoC Modules - available IO/peripheral modules and memories
  - On-Chip Debugger - online debugging of the processor via JTAG
- NEORV32 CPU - the RISC-V core
  - RISC-V compatibility - what is compatible with the specs and what is not
  - ISA and Extensions - available RISC-V ISA extensions
  - CSRs - control and status registers
  - Traps - interrupts and exceptions
Software Overview
- Core Libraries - high-level functions for accessing the processor's peripherals
- Software Framework Documentation - doxygen-based documentation
- Application Makefiles - turning your application into an executable
- Bootloader - the built-in NEORV32 bootloader
User Guides (see full overview)
- Toolchain Setup - install and set up RISC-V gcc
- General Hardware Setup - set up a new NEORV32 EDA project
- General Software Setup - configure the software framework
- Application Compilation - compile an application using make
- Upload via Bootloader - upload and execute executables
- Debugging via the On-Chip Debugger - step through code online and in-system
Acknowledgements
A big shoutout to all contributors who helped improve this project!
RISC-V - Instruction Sets Want To Be Free!
Continuous integration provided by GitHub Actions and powered by GHDL.
This project is not affiliated with or endorsed by the Open Source Initiative (https://www.oshwa.org / https://opensource.org).
Legal
This project is released under the BSD 3-Clause license. No copyright infringement intended. For more information see the online documentation - "Proprietary and Legal Notice". Other implied or used projects might have different licensing - see their documentation to get more information.
Limitation of Liability for External Links
Our website contains links to the websites of third parties ("external links"). As the content of these websites is not under our control, we cannot assume any liability for such external content. In all cases, the provider of information of the linked websites is liable for the content and accuracy of the information provided. At the point in time when the links were placed, no infringements of the law were recognisable to us. As soon as an infringement of the law becomes known to us, we will immediately remove the link in question.
Citing
If you are using the NEORV32 or parts of the project in some kind of publication, please cite it as follows:
S. Nolting, "The NEORV32 RISC-V Processor", github.com/stnolting/neorv32