FIFO Buffer Using Block RAM on a Xilinx Spartan 3 FPGA

A FIFO (first in, first out) buffer provides temporary storage for data that is passed between subsystems. It is a very common construct in digital systems, but why do we need one?

As an example, let’s say that we have a keyboard subsystem that, when a key is pressed, sends corresponding ASCII data to a UART subsystem, which then serially passes the data on to a receiving PC. If data is relayed directly between the keyboard and UART subsystems, there is a risk of data corruption if another key is pressed before the last key’s ASCII data has been processed by the UART subsystem. This condition is known as data overrun, and can be avoided by inserting a data buffer such as a FIFO between the keyboard and UART subsystems. If the baud rate for a UART is high enough, then the likelihood of data overrun occurring is limited by how fast a user can mash the keyboard’s keys (which for the previous implementation was unlikely). Regardless, it is a good idea to provide data buffers between subsystems that are not instantaneous in their processing of sent and received data.

In this post I will briefly detail how to implement a FIFO buffer in Verilog HDL using the Block RAM on a Xilinx Spartan 3 FPGA. The implementation will allow us to specify the number of words (pieces of data) in the FIFO as well as the word width (number of bits per word). We will be able to read from the FIFO as long as it isn’t empty and write to it as long as it isn’t full. The FIFO will keep track of both states and signal them out so that the subsystem using it is aware of them.

First we will need to consider how to utilize the block RAM of our FPGA. Block RAMs are dedicated memory blocks within an FPGA, separate from the regular logic cells. On the Spartan 3E FPGAs each block RAM holds 18K bits, and can be organized into various data widths (e.g. 256 x 64). As an example, the Basys 2 FPGA board uses an XC3S100E Spartan 3E FPGA, which has 4 block RAMs for a total of 72K bits.

The benefit to using block RAM instead of the distributed RAM of the logic cells (16 x 1 LUT per cell) is that we can easily implement intermediate sized memory storage without sacrificing the logic cells that are used to build our digital designs.

For the FIFO we will want to implement a dual port block RAM, dual port meaning that there will be separate ports for reading from and writing to the RAM. To make it clear to Xilinx ISE which type of block RAM we want it to infer from our HDL, we will follow the proper template given on page 145 of the Xilinx XST User Guide.

module sync_dual_port_ram

	#( parameter ADDRESS_WIDTH = 12, // number of address bits (2**ADDRESS_WIDTH words in ram)
	             DATA_WIDTH    =  8  // number of bits in word
	 )

	// IO ports
	(
		input wire clk,                                             // clk for synchronous read/write
		input wire write_en,                                        // signal to enable synchronous write
		input wire [ADDRESS_WIDTH-1:0] read_address, write_address, // inputs for dual port addresses
		input wire [DATA_WIDTH-1:0] write_data_in,                  // input for data to write to ram
		output wire [DATA_WIDTH-1:0] read_data_out, write_data_out  // outputs for dual data ports
	);

	// internal signal declarations
	reg [DATA_WIDTH-1:0] ram [2**ADDRESS_WIDTH-1:0];             // 2**ADDRESS_WIDTH x DATA_WIDTH RAM declaration
	reg [ADDRESS_WIDTH-1:0] read_address_reg, write_address_reg; // registered dual port addresses

	// synchronous write and address update
	always @(posedge clk)
		begin
		if (write_en)                             // if write enabled
			ram[write_address] <= write_data_in;  // write data to ram at write_address

		read_address_reg  <= read_address;        // store read_address to reg
		write_address_reg <= write_address;       // store write_address to reg
		end

	// assignments for two data out ports
	assign read_data_out  = ram[read_address_reg];
	assign write_data_out = ram[write_address_reg];
endmodule

 

We begin the module by including parameters for the address width (which sets the number of words to 2**ADDRESS_WIDTH) and the word width so they can be set by the circuit utilizing this module, in this instance the FIFO. Following the template we include a clk input, a write enable input to signal when to write to the RAM, read/write addresses, data in, and two ports for data out. Inside the module we declare ram using the parameters DATA_WIDTH and ADDRESS_WIDTH. This dual port block RAM will have synchronous read/write, which is inferred from the always block. The input data is written to ram on a positive clock edge. The internal registers that store the input read/write addresses are synchronously updated as well, which causes the two data outputs for the read/write addresses to be updated synchronously with the clock.
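To see the synchronous read in action, here is a minimal simulation sketch. The testbench name sync_dual_port_ram_tb, the 16 x 8 size and the 10 ns clock are my own assumptions for illustration, and it needs to be compiled together with the sync_dual_port_ram module above.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench, not part of the original design files.
// Compile together with the sync_dual_port_ram module.
module sync_dual_port_ram_tb;

	reg        clk = 1'b0;
	reg        write_en = 1'b0;
	reg  [3:0] read_address = 4'd0, write_address = 4'd0;
	reg  [7:0] write_data_in = 8'd0;
	wire [7:0] read_data_out, write_data_out;

	// small 16 x 8 RAM (assumed size, chosen only to keep the simulation short)
	sync_dual_port_ram #(.ADDRESS_WIDTH(4), .DATA_WIDTH(8)) uut
		(.clk(clk), .write_en(write_en),
		 .read_address(read_address), .write_address(write_address),
		 .write_data_in(write_data_in),
		 .read_data_out(read_data_out), .write_data_out(write_data_out));

	always #5 clk = ~clk;   // 10 ns clock period

	initial
		begin
		// write 8'h4F (ASCII "O") to address 3 on the next rising edge
		@(negedge clk);
		write_address = 4'd3;
		write_data_in = 8'h4F;
		write_en      = 1'b1;

		// present read address 3; the RAM registers it on the next rising edge
		@(negedge clk);
		write_en     = 1'b0;
		read_address = 4'd3;

		// one rising edge later the data at address 3 appears on read_data_out
		@(negedge clk);
		$display("read_data_out = %h", read_data_out);
		$finish;
		end
endmodule
```

The thing to notice is that read_data_out should only show the stored value (4f) after the clock edge that follows the change of read_address, since the address is registered inside the RAM.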

Next we will implement the FIFO, using the block RAM module to instantiate the buffer’s data storage. We will base our FIFO on a circular-queue design, which is also known as a ring buffer.

The FIFO buffer module will have a signal for write, which when asserted will write the input data to the buffer at the write_address, and then increment write_address by 1 index. When the read input is asserted, the read_address will be incremented by 1 and the data in the buffer at the previous read_address will be left behind. We will need to keep track of when the buffer is empty or full, to know when we can appropriately read or write from it. We will do this by creating two sets of pointer registers, one to keep track of the current read/write addresses and another to keep track of the following read/write addresses (the incremented values). By incrementing the read/write addresses, a ring buffer design is created, as each address pointer can transition all the way to the max address index and back to 0 again.

Above are examples of writing to and reading from a ring buffer with 8 addresses. The buffer is initialized with the read/write addresses both at 0. When we write ‘O’ to the buffer we place ‘O’ at index 0 and increment the write_address to 1. After writing 2 more inputs to the buffer, the write address is then at index 3.

At this point, the data that is read from the buffer is at read_address which is pointing to index 0 in the buffer. When we read once we increment the read_address to 1, leaving behind the data at index 0 to be overwritten later. Now the data read from the buffer is at index 1. We can read again, and move the read_address to index 2.

Now when we read again, the read_address will catch up with the write_address at index 3, which means that the buffer will be in an “empty” state. In this state we should not allow another read operation to occur, as it would increment read_address to an index that hasn’t been written to yet.

To reiterate, when we are to read from the FIFO and the read address following a potential read operation is the same as the current write address, then the read address pointer has caught up with the write address pointer, signifying that the buffer will now be empty. Likewise, when we are to write to the FIFO and the write address following a write operation is the same as the current read address, then the write address pointer has caught up with the read address pointer, signifying that the buffer will now be full.
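The pointer behavior described above can be tried out on its own before looking at the real FIFO. The following is a simulation-only sketch, not synthesizable FIFO code: the module name ring_pointer_demo and the tasks do_write and do_read are hypothetical names I made up for this illustration, and the 3-bit pointers give the 8 addresses of the example.

```verilog
// Simulation-only sketch of the ring buffer pointers (hypothetical names).
module ring_pointer_demo;

	localparam ADDRESS_WIDTH = 3;               // 2**3 = 8 addresses

	reg [ADDRESS_WIDTH-1:0] write_address, read_address;
	reg [ADDRESS_WIDTH-1:0] address_after;      // same width as the pointers, so 7 + 1 wraps to 0
	reg empty, full;
	integer i;

	// model one write: move the write pointer, flag full if it catches the read pointer
	task do_write;
		begin
		if (!full)
			begin
			address_after = write_address + 1'b1;
			write_address = address_after;
			empty = 1'b0;
			full  = (address_after == read_address);
			end
		end
	endtask

	// model one read: move the read pointer, flag empty if it catches the write pointer
	task do_read;
		begin
		if (!empty)
			begin
			address_after = read_address + 1'b1;
			read_address  = address_after;
			full  = 1'b0;
			empty = (address_after == write_address);
			end
		end
	endtask

	initial
		begin
		write_address = 0; read_address = 0; empty = 1'b1; full = 1'b0;

		for (i = 0; i < 3; i = i + 1) do_write;   // three writes
		$display("w=%0d r=%0d empty=%b full=%b", write_address, read_address, empty, full);

		for (i = 0; i < 3; i = i + 1) do_read;    // three reads drain the buffer
		$display("w=%0d r=%0d empty=%b full=%b", write_address, read_address, empty, full);

		for (i = 0; i < 8; i = i + 1) do_write;   // eight writes wrap around and fill it
		$display("w=%0d r=%0d empty=%b full=%b", write_address, read_address, empty, full);
		end
endmodule
```

After the three reads both pointers sit at index 3 with empty set, and after the eight writes they sit at index 3 again but with full set. Equal pointers alone cannot tell the two states apart, which is why the status has to be tracked separately.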

// FIFO buffer implemented with synchronous dual-port block ram
module fifo

	#( parameter ADDRESS_WIDTH = 12, // number of address bits (2**ADDRESS_WIDTH words in ram)
	             DATA_WIDTH    =  8  // number of bits in word
	 )
	 
	// IO ports
	(
		input wire clk, reset,
		input wire read, write,
		input wire [DATA_WIDTH-1:0] write_data,
		output wire empty, full,
		output wire [DATA_WIDTH-1:0] read_data
	);
	
	// internal signal declarations
	reg [ADDRESS_WIDTH-1:0] write_address_reg, write_address_next, write_address_after;
	reg [ADDRESS_WIDTH-1:0] read_address_reg, read_address_next, read_address_after;
	reg full_reg, empty_reg, full_next, empty_next;
	wire write_en;
	
	// write enable is asserted when write input is asserted and FIFO isn't full
	assign write_en = write & ~full_reg;
	
	// instantiate synchronous block ram
	sync_dual_port_ram #(.ADDRESS_WIDTH(ADDRESS_WIDTH), .DATA_WIDTH(DATA_WIDTH)) ram
		(.clk(clk), .write_en(write_en), .write_address(write_address_reg),
		 .read_address(read_address_reg), .write_data_in(write_data),
		 .write_data_out(), .read_data_out(read_data));
	
	// register for address pointers, full/empty status
	always @(posedge clk, posedge reset)
		if (reset)
			begin
			write_address_reg <= 0;
			read_address_reg  <= 0;
			full_reg          <= 1'b0;
			empty_reg         <= 1'b1;
			end
		else
			begin
			write_address_reg <= write_address_next;
			read_address_reg  <= read_address_next;
			full_reg          <= full_next;
			empty_reg         <= empty_next;
			end
			
	// next-state logic for address index values after read/write operations
	always @*
		begin
		write_address_after = write_address_reg + 1;
		read_address_after  = read_address_reg + 1;
		end
		
	// next-state logic for address pointers
	always @*
		begin
		// defaults
		write_address_next = write_address_reg;
		read_address_next  = read_address_reg;
		full_next          = full_reg;
		empty_next         = empty_reg;
		
		// if only read input asserted and FIFO isn't empty
		if (read && ~empty_reg && ~write)
			begin
			read_address_next = read_address_after;       // read address moves forward
			full_next = 1'b0;                             // FIFO isn't full if a read occurred

			if (read_address_after == write_address_reg)  // if read address caught up with write address,
				empty_next = 1'b1;                        // FIFO is empty
			end

		// if only write input asserted and FIFO isn't full
		else if (write && ~full_reg && ~read)
			begin
			write_address_next = write_address_after;     // write address moves forward
			empty_next = 1'b0;                            // FIFO isn't empty if a write occurred

			if (write_address_after == read_address_reg)  // if write address caught up with read address,
				full_next = 1'b1;                         // FIFO is full
			end

		// if write and read are both asserted
		else if (write && read)
			begin
			if (~full_reg)                                // write only takes effect if FIFO isn't full
				write_address_next = write_address_after; // (matches write_en sent to the block ram)
			if (~empty_reg)                               // read only takes effect if FIFO isn't empty
				read_address_next = read_address_after;
			if (empty_reg)                                // only the write took effect,
				empty_next = 1'b0;                        // so FIFO now holds one word
			if (full_reg)                                 // only the read took effect,
				full_next = 1'b0;                         // so FIFO now has one free slot
			end
		end
	
   // assign full/empty status to output ports
   assign full  = full_reg;
   assign empty = empty_reg;
endmodule

 

In the HDL implementation, the write_en (write enable) signal is routed to the block RAM’s write_en input port, and is asserted when the FIFO write input is asserted and the FIFO isn’t currently full.

The next-state logic for the address pointer registers is handled in an always block with a priority routing network, for a read, write, or read/write operation. Note that when a read operation occurs, we check if read_address_after (the next index after the read operation) is equal to the current write address, and if so set empty_next to 1. On the next clock edge empty_reg is then set, which prevents a subsequent read operation until a write operation occurs. Likewise, when a write operation occurs, we check if write_address_after (the next index after the write operation) is equal to the current read address, and if so set full_next to 1. On the next clock edge full_reg is then set, which prevents a subsequent write operation until a read operation occurs.
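A short simulation makes the read and write behavior easier to follow. This is only a sketch under my own assumptions: the testbench name fifo_tb, the 16-word size and the 10 ns clock are made up for illustration, and it has to be compiled together with the fifo and sync_dual_port_ram modules above.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench, not part of the original design files.
// Compile together with the fifo and sync_dual_port_ram modules.
module fifo_tb;

	reg        clk = 1'b0;
	reg        reset, read, write;
	reg  [7:0] write_data;
	wire       empty, full;
	wire [7:0] read_data;

	// 16 x 8 FIFO (assumed size, chosen only to keep the simulation short)
	fifo #(.ADDRESS_WIDTH(4), .DATA_WIDTH(8)) uut
		(.clk(clk), .reset(reset), .read(read), .write(write),
		 .write_data(write_data), .empty(empty), .full(full),
		 .read_data(read_data));

	always #5 clk = ~clk;   // 10 ns clock period

	initial
		begin
		reset = 1'b1; read = 1'b0; write = 1'b0; write_data = 8'd0;
		repeat (2) @(negedge clk);       // hold reset across at least one rising edge
		reset = 1'b0;

		// write two words: 8'h4F ("O") then 8'h4B ("K")
		@(negedge clk); write = 1'b1; write_data = 8'h4F;
		@(negedge clk); write_data = 8'h4B;
		@(negedge clk); write = 1'b0;
		$display("first word  = %h, empty = %b", read_data, empty);

		// pulse read for one clock to move on to the next word
		read = 1'b1;
		@(negedge clk); read = 1'b0;

		// the RAM registers the read address, so allow one more clock
		@(negedge clk);
		$display("second word = %h, empty = %b", read_data, empty);
		$finish;
		end
endmodule
```

The first word written (4f) should be visible on read_data without asserting read, and the second word (4b) should appear one clock after the read pulse has moved the read address forward. That extra clock comes from the address register inside the block RAM, so a subsystem using this FIFO needs to account for it.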

The FIFO buffer module also has size parameters that set the size of the block RAM that is instantiated, and thus the size of the FIFO buffer. This will allow it to serve as a ready-made tool in the future when data buffers between subsystems are needed (I’m looking at you, VGA monitor).
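When picking values for these parameters it helps to check how much of the block RAM a given FIFO will need. Below is a rough sizing sketch. The module name ram_budget_demo is hypothetical, and it assumes 16K data bits are usable per 18K-bit block RAM (the remaining 2K bits are parity bits). The result is only a lower bound, since the actual mapping is decided by the synthesis tool.

```verilog
// Simulation-only sizing sketch (hypothetical module, not part of the design).
module ram_budget_demo;

	localparam ADDRESS_WIDTH   = 12;                          // default of the fifo module
	localparam DATA_WIDTH      = 8;                           // default of the fifo module
	localparam WORDS           = 2**ADDRESS_WIDTH;            // number of words in the FIFO
	localparam TOTAL_BITS      = WORDS * DATA_WIDTH;          // total storage in bits
	localparam BLOCK_DATA_BITS = 16384;                       // assumed data bits per block RAM
	localparam BLOCKS_NEEDED   = (TOTAL_BITS + BLOCK_DATA_BITS - 1) / BLOCK_DATA_BITS; // round up

	initial
		$display("%0d words x %0d bits = %0d bits, at least %0d block RAM(s)",
		         WORDS, DATA_WIDTH, TOTAL_BITS, BLOCKS_NEEDED);
endmodule
```

With the default parameters this works out to 4096 words of 8 bits, or 32768 bits, which needs at least 2 of the 4 block RAMs on the XC3S100E.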

5 thoughts on “FIFO Buffer Using Block RAM on a Xilinx Spartan 3 FPGA”

  1. The graphic is confusing at the beginning, the “out” is depicted as being the first out instead of “first”

  2. Thank you very much! very helpful
    One question though: In the comment next to ADDRESS_WIDTH, you wrote that there would be 12 words in the BRAM. Is that a mistake? Because it seems according to the code that there would be 2^12=4096 words in the BRAM
