A FIFO (first in first out) buffer allows for temporary storage for transmission of data between subsystems. It is a very common construct used in digital systems, but why do we need them?
As an example, let’s say that we have a keyboard subsystem that, when a key is pressed, sends corresponding ASCII data to a UART subsystem, which then serially passes the data on to a receiving PC. Assuming that data is directly relayed between the keyboard and UART subsystems, there is risk of data corruption if another key is pressed before the last key’s ASCII data has been processed by the UART subsystem. This condition is known as data overrun, and can be avoided by inserting a data buffer such as a FIFO between the keyboard and UART subsystems. If the baud rate for a UART is high enough, then the likelihood of data overrun occurring is limited by how fast a user can mash the keyboard’s keys (which for the previous implementation was unlikely). Regardless, it is a good idea to provide data buffers between subsystems that are not instantaneous in their processing of sent and received data.
In this post I will briefly detail how to implement a FIFO buffer in Verilog HDL using the Block RAM on a Xilinx Spartan 3 FPGA. The implementation will allow us to specify the number of words (pieces of data) in the FIFO as well as word width (number of bits per word). We will be able to read from and write to the FIFO, assuming it isn’t in an empty or full state, which we will keep track of and signal out from the FIFO to make the utilizing subsystem aware.
First we need to consider how to utilize the block RAM of our FPGA. Block RAMs are dedicated memory blocks within an FPGA, separate from the regular logic cells. On Spartan 3E FPGAs each block RAM holds 18K bits and can be organized into various data widths (e.g. 256 x 64). As an example, the Basys 2 FPGA board uses an XC3S100E Spartan 3E FPGA, which has 4 block RAMs for a total of 72K bits.
The benefit to using block RAM instead of the distributed RAM of the logic cells (16 x 1 LUT per cell) is that we can easily implement intermediate sized memory storage without sacrificing the logic cells that are used to build our digital designs.
For the FIFO we will want to implement a dual port block RAM, dual port meaning that there are separate ports for reading from and writing to the RAM. To make sure Xilinx ISE infers the type of block RAM we want from our HDL, we follow the template given on page 145 of the Xilinx XST User Guide.
```verilog
module sync_dual_port_ram
    #( parameter ADDRESS_WIDTH = 12, // number of address bits (ram holds 2**ADDRESS_WIDTH words)
                 DATA_WIDTH    = 8   // number of bits in word
     )

    // IO ports
    (
        input wire clk,                                             // clk for synchronous read/write
        input wire write_en,                                        // signal to enable synchronous write
        input wire [ADDRESS_WIDTH-1:0] read_address, write_address, // inputs for dual port addresses
        input wire [DATA_WIDTH-1:0] write_data_in,                  // input for data to write to ram
        output wire [DATA_WIDTH-1:0] read_data_out, write_data_out  // outputs for dual data ports
    );

    // internal signal declarations
    reg [DATA_WIDTH-1:0] ram [2**ADDRESS_WIDTH-1:0];             // 2**ADDRESS_WIDTH x DATA_WIDTH RAM declaration
    reg [ADDRESS_WIDTH-1:0] read_address_reg, write_address_reg; // dual port address declarations

    // synchronous write and address update
    always @(posedge clk)
        begin
        if (write_en)                            // if write enabled
            ram[write_address] <= write_data_in; // write data to ram at write_address

        read_address_reg  <= read_address;       // store read_address to reg
        write_address_reg <= write_address;      // store write_address to reg
        end

    // assignments for two data out ports
    assign read_data_out  = ram[read_address_reg];
    assign write_data_out = ram[write_address_reg];
endmodule
```
We begin the module with parameters for the address width and word width, so they can be set by the circuit utilizing this module, in this instance the FIFO. Note that ADDRESS_WIDTH is the number of address bits, so the RAM holds 2^ADDRESS_WIDTH words (4096 for the default of 12). Following the template, we include a clk input, a write enable input to signal when to write to the RAM, read/write addresses, data in, and two ports for data out. Inside the module we declare ram using the parameters DATA_WIDTH and ADDRESS_WIDTH. This dual port block RAM has synchronous read/write, which is inferred from the always block. The input data is written to ram on a positive clock edge. The internal registers that store the input read/write addresses are updated on the same edge, so the two data outputs for the read/write addresses also change synchronously with the clock.
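To make the relationship between ADDRESS_WIDTH and the RAM size concrete, here is a small simulation-only sketch. The module name ram_size_check and the localparams DEPTH and TOTAL_BITS are my own illustrative names and are not part of the design; the parameter values are simply the defaults used in sync_dual_port_ram.

```verilog
// Simulation-only sketch; ram_size_check, DEPTH and TOTAL_BITS are
// hypothetical names used for illustration, not part of the design.
module ram_size_check;
    localparam ADDRESS_WIDTH = 12;                 // address bits (default in sync_dual_port_ram)
    localparam DATA_WIDTH    = 8;                  // bits per word (default in sync_dual_port_ram)
    localparam DEPTH         = 2**ADDRESS_WIDTH;   // number of addressable words
    localparam TOTAL_BITS    = DEPTH * DATA_WIDTH; // total storage in bits

    initial begin
        // prints: depth = 4096 words, total = 32768 bits
        $display("depth = %0d words, total = %0d bits", DEPTH, TOTAL_BITS);
        $finish;
    end
endmodule
```

With the defaults this comes to 32768 bits, which is more than a single 18K bit block RAM can hold, so a smaller ADDRESS_WIDTH may be a better fit when only a short buffer is needed.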
Next we will implement the FIFO, using the block RAM module to instantiate the buffer’s data storage. We will base our FIFO on a circular-queue design, which is also known as a ring buffer.
The FIFO buffer module will have a signal for write, which when asserted will write the input data to the buffer at the write_address, and then increment write_address by 1 index. When the read input is asserted, the read_address will be incremented by 1 and the data in the buffer at the previous read_address will be left behind. We will need to keep track of when the buffer is empty or full, to know when we can appropriately read or write from it. We will do this by creating two sets of pointer registers, one to keep track of the current read/write addresses and another to keep track of the following read/write addresses (the incremented values). By incrementing the read/write addresses, a ring buffer design is created, as each address pointer can transition all the way to the max address index and back to 0 again.
The accompanying figure walks through examples of writing to and reading from a ring buffer with 8 addresses. The buffer is initialized with the read and write addresses both at 0. When we write ‘O’ to the buffer we place ‘O’ at index 0 and increment the write_address to 1. After writing 2 more inputs to the buffer, the write address is at index 3.
At this point, the data that is read from the buffer is at read_address which is pointing to index 0 in the buffer. When we read once we increment the read_address to 1, leaving behind the data at index 0 to be overwritten later. Now the data read from the buffer is at index 1. We can read again, and move the read_address to index 2.
Now when we read again, the read_address catches up with the write_address at index 3, which means that the buffer is in an “empty” state. From this point we should not allow another read operation to occur, as it would increment read_address to an index that hasn’t been written to yet.
To reiterate, when we are to read from the FIFO and the read address following a potential read operation is the same as the current write address, then the read address pointer has caught up with the write address pointer, signifying that the buffer will now be empty. Likewise, when we are to write to the FIFO and the write address following a write operation is the same as the current read address, then the write address pointer has caught up with the read address pointer, signifying that the buffer will now be full.
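The walkthrough above can be traced with a small simulation-only sketch. It models only the two pointers and the empty/full flags for an 8 address buffer, not the data storage, and the module and task names (ring_pointer_trace, show, write_word, read_word) are hypothetical names chosen for illustration.

```verilog
// Simulation-only sketch of the pointer walkthrough; all names here are
// hypothetical and only the pointers and flags are modeled, not the data.
module ring_pointer_trace;
    reg [2:0] write_address, read_address; // 3 bits give 8 addresses (0 to 7)
    reg       empty, full;
    integer   i;

    task show;
        begin
            $display("write_address=%0d read_address=%0d empty=%b full=%b",
                     write_address, read_address, empty, full);
        end
    endtask

    task write_word;
        begin
            if (!full) begin
                write_address = write_address + 1;       // wraps from 7 back to 0
                empty = 1'b0;                            // a write means not empty
                full  = (write_address == read_address); // write pointer caught up
            end
            show;
        end
    endtask

    task read_word;
        begin
            if (!empty) begin
                read_address = read_address + 1;         // wraps from 7 back to 0
                full  = 1'b0;                            // a read means not full
                empty = (read_address == write_address); // read pointer caught up
            end
            show;
        end
    endtask

    initial begin
        write_address = 0; read_address = 0; empty = 1'b1; full = 1'b0;
        for (i = 0; i < 3; i = i + 1) write_word; // write pointer ends at 3
        for (i = 0; i < 3; i = i + 1) read_word;  // read pointer reaches 3: empty
        for (i = 0; i < 8; i = i + 1) write_word; // eight writes wrap around: full
        $finish;
    end
endmodule
```

After the three reads both pointers sit at index 3 with empty set. After eight further writes the write pointer has wrapped all the way around to index 3 again, this time with full set. Equal pointers alone cannot tell the two cases apart, which is why the flags are tracked separately.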
```verilog
// FIFO buffer implemented with synchronous dual-port block ram
module fifo
    #( parameter ADDRESS_WIDTH = 12, // number of address bits (ram holds 2**ADDRESS_WIDTH words)
                 DATA_WIDTH    = 8   // number of bits in word
     )

    // IO ports
    (
        input wire clk, reset,
        input wire read, write,
        input wire [DATA_WIDTH-1:0] write_data,
        output wire empty, full,
        output wire [DATA_WIDTH-1:0] read_data
    );

    // internal signal declarations
    reg [ADDRESS_WIDTH-1:0] write_address_reg, write_address_next, write_address_after;
    reg [ADDRESS_WIDTH-1:0] read_address_reg, read_address_next, read_address_after;
    reg full_reg, empty_reg, full_next, empty_next;
    wire write_en, read_en;

    // write enable is asserted when write input is asserted and FIFO isn't full
    assign write_en = write & ~full_reg;

    // read enable is asserted when read input is asserted and FIFO isn't empty
    assign read_en = read & ~empty_reg;

    // instantiate synchronous block ram
    sync_dual_port_ram #(.ADDRESS_WIDTH(ADDRESS_WIDTH), .DATA_WIDTH(DATA_WIDTH)) ram
        (.clk(clk), .write_en(write_en), .write_address(write_address_reg),
         .read_address(read_address_reg), .write_data_in(write_data),
         .write_data_out(), .read_data_out(read_data));

    // register for address pointers, full/empty status
    always @(posedge clk, posedge reset)
        if (reset)
            begin
            write_address_reg <= 0;
            read_address_reg  <= 0;
            full_reg          <= 1'b0;
            empty_reg         <= 1'b1;
            end
        else
            begin
            write_address_reg <= write_address_next;
            read_address_reg  <= read_address_next;
            full_reg          <= full_next;
            empty_reg         <= empty_next;
            end

    // next-state logic for address index values after read/write operations
    always @*
        begin
        write_address_after = write_address_reg + 1;
        read_address_after  = read_address_reg + 1;
        end

    // next-state logic for address pointers
    always @*
        begin
        // defaults
        write_address_next = write_address_reg;
        read_address_next  = read_address_reg;
        full_next          = full_reg;
        empty_next         = empty_reg;

        // read only: read input asserted and FIFO isn't empty
        // (also taken when both inputs are asserted but the FIFO is full)
        if (read_en && ~write_en)
            begin
            read_address_next = read_address_after;      // read address moves forward
            full_next = 1'b0;                            // FIFO isn't full if a read occurred
            if (read_address_after == write_address_reg) // if read address caught up with write address,
                empty_next = 1'b1;                       // FIFO is empty
            end

        // write only: write input asserted and FIFO isn't full
        // (also taken when both inputs are asserted but the FIFO is empty)
        else if (write_en && ~read_en)
            begin
            write_address_next = write_address_after;    // write address moves forward
            empty_next = 1'b0;                           // FIFO isn't empty if a write occurred
            if (write_address_after == read_address_reg) // if write address caught up with read address,
                full_next = 1'b1;                        // FIFO is full
            end

        // simultaneous write and read, FIFO neither empty nor full
        else if (write_en && read_en)
            begin
            write_address_next = write_address_after;    // write address moves forward
            read_address_next  = read_address_after;     // read address moves forward
            end
        end

    // assign full/empty status to output ports
    assign full  = full_reg;
    assign empty = empty_reg;
endmodule
```
In the HDL implementation, the write_en (write enable) signal is routed to the block RAM’s write_en input port, and is asserted when the FIFO write input is asserted and the FIFO isn’t currently full.
The next state logic for the address pointer registers is handled in an always block with a priority routing network, for a read, write, or read/write operation. Note that when a read operation occurs, we check if read_address_after (next index after the read operation) is equal to the current write_address, and if so set empty_next to 1, so that empty_reg is set on the next clock edge and prevents a subsequent read operation until a write operation occurs. Likewise, when a write operation occurs, we check if write_address_after (next index after the write operation) is equal to the current read_address, and if so set full_next to 1, so that full_reg is set on the next clock edge and prevents a subsequent write operation until a read operation occurs.
The FIFO buffer module also has size parameters that can set the size of the block RAM that is instantiated, and thus sets the size of the FIFO buffer. This will allow it to serve as a ready made tool in the future when data buffers between subsystems are needed (I’m looking at you VGA monitor).
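As a starting point for simulation, here is a sketch of a self-checking testbench for the fifo module above. It is not from the original design: the names fifo_tb, push, pop and errors are hypothetical, and the clock period, the small ADDRESS_WIDTH of 3 (8 words) and the test values are arbitrary choices. It only exercises one operation at a time: fill the FIFO, attempt one write while full, then drain it and check the order.

```verilog
`timescale 1ns / 1ps
// Sketch of a self-checking testbench; fifo_tb, push, pop and errors are
// hypothetical names, and the stimulus values are arbitrary.
module fifo_tb;
    localparam ADDRESS_WIDTH = 3;                // 2**3 = 8 words
    localparam DATA_WIDTH    = 8;
    localparam DEPTH         = 2**ADDRESS_WIDTH;

    reg clk = 1'b0, reset = 1'b1;
    reg read = 1'b0, write = 1'b0;
    reg [DATA_WIDTH-1:0] write_data = 0;
    wire empty, full;
    wire [DATA_WIDTH-1:0] read_data;
    integer i, errors;

    fifo #(.ADDRESS_WIDTH(ADDRESS_WIDTH), .DATA_WIDTH(DATA_WIDTH)) dut
        (.clk(clk), .reset(reset), .read(read), .write(write),
         .write_data(write_data), .empty(empty), .full(full), .read_data(read_data));

    always #5 clk = ~clk;                        // 10 ns clock period

    // assert write for one clock, driving inputs on the falling edge
    task push(input [DATA_WIDTH-1:0] value);
        begin
            @(negedge clk);
            write = 1'b1; write_data = value;
            @(negedge clk);
            write = 1'b0;
        end
    endtask

    // check the word at the head of the FIFO, then assert read for one clock
    task pop(input [DATA_WIDTH-1:0] expected);
        begin
            @(negedge clk);
            if (read_data !== expected) begin
                errors = errors + 1;
                $display("ERROR: read_data=%h, expected %h", read_data, expected);
            end
            read = 1'b1;
            @(negedge clk);
            read = 1'b0;
        end
    endtask

    initial begin
        errors = 0;
        repeat (2) @(negedge clk);               // hold reset for two clocks
        reset = 1'b0;
        @(negedge clk);
        if (empty !== 1'b1 || full !== 1'b0) begin
            errors = errors + 1;
            $display("ERROR: expected empty and not full after reset");
        end

        for (i = 0; i < DEPTH; i = i + 1) push(8'h10 + i); // fill the FIFO
        if (full !== 1'b1 || empty !== 1'b0) begin
            errors = errors + 1;
            $display("ERROR: expected full after %0d writes", DEPTH);
        end

        push(8'hFF);                             // should be ignored while full

        for (i = 0; i < DEPTH; i = i + 1) pop(8'h10 + i);  // drain in order
        if (empty !== 1'b1 || full !== 1'b0) begin
            errors = errors + 1;
            $display("ERROR: expected empty after %0d reads", DEPTH);
        end

        if (errors == 0) $display("PASS");
        else             $display("FAIL: %0d error(s)", errors);
        $finish;
    end
endmodule
```

The pop task leaves an idle clock between reads on purpose. Because the block RAM registers its read address, read_data updates one clock after the FIFO's read pointer moves, so the testbench samples the head word only once it has settled. Back-to-back reads and simultaneous read/write are not covered by this sketch and would be worth adding as separate cases.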
The graphic is confusing at the beginning, the “out” is depicted as being the first out instead of “first”
Thank you very much! very helpful
One question though: In the comment next to ADDRESS_WIDTH, you wrote that there would be 12 words in the BRAM. Is that a mistake? Because it seems according to the code that there would be 2^12=4096 words in the BRAM
Yes, I’m not sure I understand this part either. Why is it 2^n?
How about testbench? Thank you!
TEST BENCH???