Memory tutorial for USB-FPGA-Modules 1.11

This tutorial explains how the memory controller IP Core is created on USB-FPGA-Modules 1.11.

Creating the IP Core

This section describes how the IP Core is created in an ISE project. The screenshots below were taken with MIG 3.5 (part of ISE 12.2). The settings for other versions should be very similar.

  1. In the menu: Project → New Source
  2. Choose IP Core Generator and enter a name for the new Core
  3. Select Memories & Storage Elements → Memory Interface Generators → MIG and click on NEXT → FINISH
  4. Verify the settings in the first dialog box (USB-FPGA Module 1.11a: xc6slx9-ftg256, speed grade -2; 1.11b: xc6slx16-ftg256, speed grade -2; 1.11c: xc6slx25-ftg256, speed grade -2 or -3):
    USB-FPGA Module 1.11, MIG screen 1
  5. The settings on the next screen can be ignored:
    USB-FPGA Module 1.11, MIG screen 2
  6. Choose DDR SDRAM for bank 3:
    USB-FPGA Module 1.11, MIG screen 3
  7. Select memory part MT46V32M16XX-5B-IT and make sure that the clock period is 5000 ps:
    USB-FPGA Module 1.11, MIG screen 4
  8. Output Drive Strength should be normal:
    USB-FPGA Module 1.11, MIG screen 5
  9. The recommended address mapping scheme is Row-Bank-Column. The memory port settings depend on the application. These are the values for the memtest example:
    USB-FPGA Module 1.11, MIG screen 6
  10. The arbitration settings depend on the application. Usually the Round Robin algorithm is a good choice:
    USB-FPGA Module 1.11, MIG screen 7
  11. Choose SSTL Class II as the output signal standard, uncalibrated 50 Ohm termination and a single-ended system clock:
    USB-FPGA Module 1.11, MIG screen 8
  12. Click NEXT on the following dialog boxes and Generate on the last screen.

Modify your .ucf file

Insert the following code snippet into your .ucf file:

# 48 MHz EZ-USB clock 
NET "FXCLK" TNM_NET = "FXCLK";
TIMESPEC "TS_FXCLK" = PERIOD "FXCLK" 20.833333 ns HIGH 50 %;
NET "FXCLK"  LOC = "K14" | IOSTANDARD = LVCMOS33 ;

############################################################################
## Memory Controller 3                               
## Memory Device: DDR_SDRAM->MT46V32M16XX-5B-IT 
## Frequency: 200 MHz
## Time Period: 5000 ps
## Supported Part Numbers: MT46V32M16BN-5B-IT
############################################################################

############################################################################
## I/O TERMINATION                                                          
############################################################################
NET "mcb3_dram_dq[*]"                                 IN_TERM = UNTUNED_SPLIT_50;
NET "mcb3_dram_dqs"                                   IN_TERM = UNTUNED_SPLIT_50;
NET "mcb3_dram_udqs"                                  IN_TERM = UNTUNED_SPLIT_50;

NET  "mcb3_dram_a[*]"                                 OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_ba[*]"                                OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_ck"                                   OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_ck_n"                                 OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_cke"                                  OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_ras_n"                                OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_cas_n"                                OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_we_n"                                 OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_dm"                                   OUT_TERM = UNTUNED_50; 
NET  "mcb3_dram_udm"                                  OUT_TERM = UNTUNED_50; 

############################################################################
# I/O STANDARDS 
############################################################################
NET  "mcb3_dram_dq[*]"                               IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_dqs"                                 IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_udqs"                                IOSTANDARD = SSTL2_II;
NET  "mcb3_rzq"                                      IOSTANDARD = SSTL2_II;
NET  "mcb3_zio"                                      IOSTANDARD = SSTL2_II;

NET  "mcb3_dram_a[*]"                                IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_ba[*]"                               IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_ck"                                  IOSTANDARD = DIFF_SSTL2_II;
NET  "mcb3_dram_ck_n"                                IOSTANDARD = DIFF_SSTL2_II;
NET  "mcb3_dram_cke"                                 IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_ras_n"                               IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_cas_n"                               IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_we_n"                                IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_dm"                                  IOSTANDARD = SSTL2_II;
NET  "mcb3_dram_udm"                                 IOSTANDARD = SSTL2_II;


############################################################################
# MCB 3
# Pin Location Constraints for Clock, Masks, Address, and Controls
############################################################################

NET  "mcb3_dram_dq[4]"                           LOC = "F2" ;
NET  "mcb3_dram_dq[5]"                           LOC = "F1" ;
NET  "mcb3_dram_dq[6]"                           LOC = "G3" ;
NET  "mcb3_dram_dq[7]"                           LOC = "G1" ;
NET  "mcb3_dram_dq[2]"                           LOC = "J3" ;
NET  "mcb3_dram_dq[3]"                           LOC = "J1" ;
NET  "mcb3_dram_dq[0]"                           LOC = "K2" ;
NET  "mcb3_dram_dq[1]"                           LOC = "K1" ;

NET  "mcb3_dram_dq[8]"                           LOC = "L3" ;
NET  "mcb3_dram_dq[9]"                           LOC = "L1" ;
NET  "mcb3_dram_dq[10]"                          LOC = "M2" ;
NET  "mcb3_dram_dq[11]"                          LOC = "M1" ;
NET  "mcb3_dram_dq[12]"                          LOC = "P2" ;
NET  "mcb3_dram_dq[13]"                          LOC = "P1" ;
NET  "mcb3_dram_dq[14]"                          LOC = "R2" ;
NET  "mcb3_dram_dq[15]"                          LOC = "R1" ;

NET  "mcb3_dram_dqs"                             LOC = "H2" ;
NET  "mcb3_dram_udqs"                            LOC = "N3" ;

NET  "mcb3_dram_ba[0]"                           LOC = "C3" ;
NET  "mcb3_dram_ba[1]"                           LOC = "C2" ;

NET  "mcb3_dram_a[0]"                            LOC = "K5" ;
NET  "mcb3_dram_a[1]"                            LOC = "K6" ;
NET  "mcb3_dram_a[2]"                            LOC = "D1" ;
NET  "mcb3_dram_a[3]"                            LOC = "L4" ;
NET  "mcb3_dram_a[4]"                            LOC = "G5" ;
NET  "mcb3_dram_a[5]"                            LOC = "H4" ;
NET  "mcb3_dram_a[6]"                            LOC = "H3" ;
NET  "mcb3_dram_a[7]"                            LOC = "D3" ;
NET  "mcb3_dram_a[8]"                            LOC = "B2" ;
NET  "mcb3_dram_a[9]"                            LOC = "A2" ;
NET  "mcb3_dram_a[10]"                           LOC = "G6" ;
NET  "mcb3_dram_a[11]"                           LOC = "E3" ;
NET  "mcb3_dram_a[12]"                           LOC = "F3" ;

NET  "mcb3_dram_dm"                              LOC = "J4" ;
NET  "mcb3_dram_udm"                             LOC = "K3" ;

NET  "mcb3_dram_ras_n"                           LOC = "J6" ;
NET  "mcb3_dram_cas_n"                           LOC = "H5" ;
NET  "mcb3_dram_we_n"                            LOC = "C1" ;

NET  "mcb3_dram_ck"                              LOC = "E2" ;
NET  "mcb3_dram_ck_n"                            LOC = "E1" ;
NET  "mcb3_dram_cke"                             LOC = "F4" ;

# NC pins 
NET  "mcb3_rzq"                                  LOC = "M4" ;
NET  "mcb3_zio"                                  LOC = "M5" ;

Insert the Core into your VHDL code

Please use memtest.vhd from the memory test example as reference.

  1. Add the following inputs/outputs to the entity declaration:
    -- EZ-USB clock
    FXCLK           : in std_logic;
    RESET_IN        : in std_logic;        -- reset input pin
    -- DDR-SDRAM
    mcb3_dram_dq    : inout std_logic_vector(15 downto 0);
    mcb3_rzq        : inout std_logic;
    mcb3_zio        : inout std_logic;
    mcb3_dram_udqs  : inout std_logic;
    mcb3_dram_dqs   : inout std_logic;
    mcb3_dram_a     : out std_logic_vector(12 downto 0);
    mcb3_dram_ba    : out std_logic_vector(1 downto 0);
    mcb3_dram_cke   : out std_logic;
    mcb3_dram_ras_n : out std_logic;
    mcb3_dram_cas_n : out std_logic;
    mcb3_dram_we_n  : out std_logic;
    mcb3_dram_dm    : out std_logic;
    mcb3_dram_udm   : out std_logic;
    mcb3_dram_ck    : out std_logic;
    mcb3_dram_ck_n  : out std_logic
  2. Add the component declaration from ipcore_dir/<ipcore name>.vho to the architecture header.
  3. Define the following signals in the architecture header:
    signal CLK : std_logic;         -- 50 MHz system clock (for example)
    signal RESET0 : std_logic;	-- released after dcm0 is ready
    signal RESET : std_logic;	-- released after MCB is ready
    signal MEM_CLK : std_logic;     -- memory clock
    signal C3_CALIB_DONE : std_logic; 
    signal C3_RST0 : std_logic;
  4. Insert the instantiation template from ipcore_dir/<ipcore name>.vho into the architecture body and connect the following component ports:
    mcb3_dram_dq    =>  mcb3_dram_dq,  
    mcb3_dram_a     =>  mcb3_dram_a,  
    mcb3_dram_ba    =>  mcb3_dram_ba,
    mcb3_dram_ras_n =>  mcb3_dram_ras_n,                        
    mcb3_dram_cas_n =>  mcb3_dram_cas_n,                        
    mcb3_dram_we_n  =>  mcb3_dram_we_n,                          
    mcb3_dram_cke   =>  mcb3_dram_cke,                          
    mcb3_dram_ck    =>  mcb3_dram_ck,                          
    mcb3_dram_ck_n  =>  mcb3_dram_ck_n,       
    mcb3_dram_dqs   =>  mcb3_dram_dqs,                          
    mcb3_dram_udqs  =>  mcb3_dram_udqs,    -- for X16 parts           
    mcb3_dram_udm   =>  mcb3_dram_udm,     -- for X16 parts
    mcb3_dram_dm    =>  mcb3_dram_dm,
    mcb3_rzq        =>  mcb3_rzq,
     
    c3_sys_clk      =>  MEM_CLK,
    c3_sys_rst_n    =>  RESET0,
     
    c3_clk0	        =>  open,
    c3_rst0		=>  C3_RST0,
    c3_calib_done   =>  C3_CALIB_DONE,
  5. Add the following statement to the architecture body (right after the instantiations):
    RESET <= RESET0 or (not C3_CALIB_DONE) or C3_RST0;

The signals CLK, RESET0 and MEM_CLK are defined in the next section.
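
As a minimal sketch of how application logic might use these signals, the following hypothetical entity holds a counter in reset until RESET is released, i.e. until the clock source is ready and the MCB has finished calibration. The entity name, the COUNT port and the counter itself are illustrative assumptions and are not taken from the memtest example. The sketch assumes that RESET is active high and is sampled on the rising edge of CLK.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical user logic: counts CLK cycles while RESET is inactive.
entity reset_gated_counter is
  port (
    CLK   : in  std_logic;                     -- system clock (CLK in this tutorial)
    RESET : in  std_logic;                     -- active-high RESET as combined in step 5
    COUNT : out std_logic_vector(7 downto 0)   -- illustrative output
  );
end entity reset_gated_counter;

architecture rtl of reset_gated_counter is
  signal cnt : unsigned(7 downto 0) := (others => '0');
begin
  process (CLK)
  begin
    if rising_edge(CLK) then
      if RESET = '1' then
        cnt <= (others => '0');   -- held in reset until the MCB is ready
      else
        cnt <= cnt + 1;           -- normal operation
      end if;
    end if;
  end process;

  COUNT <= std_logic_vector(cnt);
end architecture rtl;
```

In a real design the counter would be replaced by the logic that drives the memory ports; the point is only that such logic should stay idle while RESET is '1'.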

Set up the clock resource

On ZTEX USB-FPGA Modules 1.11 the memory clock is generated from the 48 MHz output clock of the EZ-USB. Unfortunately, the memory interface generator (MIG) assumes that the memory clock is supplied through an external pin (at least up to version 3.5). It does not support the common case in which the memory clock is derived from another clock.

In order to support internally generated clocks, a few lines of the memory controller block (MCB) interface generated by the MIG have to be modified.

There are two ways to generate the memory clock:

  1. The memory clock is generated by an additional DCM instance.
  2. The PLL instance used to generate the internal clocks required by the MCB is modified and the 48 MHz clock is directly used as input clock. This requires no additional clock resources.

Both solutions described below define a system clock CLK.

Memory clock generation using a DCM

This approach is used in the memtest examples.

  1. Insert the following statements before the entity declaration:
    Library UNISIM;
    use UNISIM.vcomponents.all;
  2. Define the following signals in the architecture header:
    signal DCM0_LOCKED : std_logic;
    signal DCM0_CLK_STATUS :  std_logic_vector(2 downto 1);
  3. Insert the DCM instantiation into the architecture body:
       inst_dcm0 : DCM_CLKGEN
       generic map (
          CLKFXDV_DIVIDE  => 4,        -- modify if other CLK than 50 MHz is desired
          CLKFX_DIVIDE    => 6,
          CLKFX_MULTIPLY  => 25,
          CLKFX_MD_MAX    => 0.0,
          CLKIN_PERIOD    => 20.833333,
          SPREAD_SPECTRUM => "NONE",
          STARTUP_WAIT    => FALSE 
       )
       port map (
          CLKFX     => MEM_CLK,        -- 200 MHz = 48 MHz / CLKFX_DIVIDE * CLKFX_MULTIPLY
          CLKFX180  => open,  		
          CLKFXDV   => CLK,     	   -- can be used as system clock, 50 MHz = MEM_CLK / CLKFXDV_DIVIDE
          LOCKED    => DCM0_LOCKED,
          PROGDONE  => open,
          STATUS    => DCM0_CLK_STATUS, 
          CLKIN     => FXCLK,
          FREEZEDCM => '0',
          PROGCLK   => '0',
          PROGDATA  => '0',
          PROGEN    => '0',
          RST	=> RESET_IN
       );
  4. Add the following statement to the architecture body (right after the instantiations):
    RESET0 <= RESET_IN or (not DCM0_LOCKED) or DCM0_CLK_STATUS(2);
  5. Apply the changes / patch listed below to ipcore_dir/<ipcore name>/user_design/rtl/memc3_infrastructure.vhd, i.e. remove all global input buffer (IBUFG) code and replace CLKIN1 => sys_clk_ibufg by CLKIN1 => sys_clk. This makes it possible to connect an internally generated clock to the input of the PLL instance that generates the MCB clocks. The .diff file can be found in the memtest example as ipcore_dir/mem0/user_design/rtl/memc3_infrastructure.vhd.diff.
    --- memc3_infrastructure.orig.vhd	2010-08-20 11:42:53.000000000 +0200
    +++ memc3_infrastructure.vhd	2010-08-20 11:48:07.000000000 +0200
    @@ -122,7 +122,6 @@
       signal   mcb_drp_clk_bufg_in : std_logic;
       signal   clkfbout_clkfbin    : std_logic;
       signal   rst_tmp             : std_logic;
    -  signal   sys_clk_ibufg       : std_logic;
       signal   sys_rst             : std_logic;
       signal   rst0_sync_r         : std_logic_vector(RST_SYNC_NUM-1 downto 0);
       signal   powerup_pll_locked  : std_logic;
    @@ -135,7 +134,6 @@
       attribute KEEP : string; 
       attribute max_fanout of rst0_sync_r : signal is "10";
       attribute syn_maxfan of rst0_sync_r : signal is 10;
    -  attribute KEEP of sys_clk_ibufg     : signal is "TRUE";
     
     begin 
     
    @@ -144,33 +142,6 @@
       pll_lock <= bufpll_mcb_locked;
       mcb_drp_clk <= mcb_drp_clk_sig;
     
    -  diff_input_clk : if(C_INPUT_CLK_TYPE = "DIFFERENTIAL") generate   
    -      --***********************************************************************
    -      -- Differential input clock input buffers
    -      --***********************************************************************
    -      u_ibufg_sys_clk : IBUFGDS
    -        generic map (
    -          DIFF_TERM => TRUE		    
    -        )
    -        port map (
    -          I  => sys_clk_p,
    -          IB => sys_clk_n,
    -          O  => sys_clk_ibufg
    -          );
    -  end generate;   
    -  
    -  
    -  se_input_clk : if(C_INPUT_CLK_TYPE = "SINGLE_ENDED") generate   
    -      --***********************************************************************
    -      -- SINGLE_ENDED input clock input buffers
    -      --***********************************************************************
    -      u_ibufg_sys_clk : IBUFG
    -        port map (
    -          I  => sys_clk,
    -          O  => sys_clk_ibufg
    -          );
    -  end generate;   
    -
       --***************************************************************************
       -- Global clock generation and distribution
       --***************************************************************************
    @@ -209,7 +180,7 @@
               (
                CLKFBIN          => clkfbout_clkfbin,
                CLKINSEL         => '1',
    -           CLKIN1           => sys_clk_ibufg,
    +           CLKIN1           => sys_clk,
                CLKIN2           => '0',
                DADDR            => (others => '0'),
                DCLK             => '0',
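
The clock arithmetic behind the DCM_CLKGEN settings in step 3 can be checked with a small simulation-only design. This is a sketch under stated assumptions: the entity name and the *_MHZ constants are hypothetical helpers, while the divider and multiplier values are the ones used above. If a different system clock is wanted, changing CLKFXDV_DIVIDE here shows the resulting frequency before the design is rebuilt.

```vhdl
-- Simulation-only sanity check of the DCM_CLKGEN settings.
-- Entity and constant names are hypothetical; the values are from step 3.
entity dcm_clock_check is
end entity dcm_clock_check;

architecture sim of dcm_clock_check is
  constant FXCLK_MHZ      : real     := 48.0;  -- EZ-USB clock
  constant CLKFX_MULTIPLY : positive := 25;
  constant CLKFX_DIVIDE   : positive := 6;
  constant CLKFXDV_DIVIDE : positive := 4;

  -- CLKFX = CLKIN * CLKFX_MULTIPLY / CLKFX_DIVIDE
  constant MEM_CLK_MHZ : real :=
    FXCLK_MHZ * real(CLKFX_MULTIPLY) / real(CLKFX_DIVIDE);
  -- CLKFXDV = CLKFX / CLKFXDV_DIVIDE
  constant CLK_MHZ : real := MEM_CLK_MHZ / real(CLKFXDV_DIVIDE);
begin
  -- Both assertions are evaluated once at the start of the simulation.
  assert abs(MEM_CLK_MHZ - 200.0) < 1.0e-6
    report "MEM_CLK is not 200 MHz" severity failure;
  assert abs(CLK_MHZ - 50.0) < 1.0e-6
    report "CLK is not 50 MHz" severity failure;
end architecture sim;
```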

Memory clock generation by modification of the MCB PLL

With this method it is not possible to generate exactly 200 MHz from 48 MHz within the PLL constraints, so only non-standard memory clock frequencies (for example 198 MHz) can be generated.

  1. Add the following port connection to the memory core instantiation (it replaces c3_clk0 => open from the instantiation above):
    c3_clk0 => CLK,
  2. Add the following statements to the architecture body (right after the instantiations):
    RESET0 <= RESET_IN;
    MEM_CLK <= FXCLK;
  3. Modify ipcore_dir/<ipcore name>/user_design/rtl/memc3_infrastructure.vhd
    1. Around line 114: Set the value for CLK_PERIOD_NS to
      constant CLK_PERIOD_NS  : real := 20.8333333;
    2. Around line 184: Modify the following clock dividers / multipliers of the u_pll_adv instantiation as desired (examples can be found below):
      CLKOUT0_DIVIDE     => C_CLKOUT0_DIVIDE,    -- memory clock * 2
      CLKOUT1_DIVIDE     => C_CLKOUT1_DIVIDE,    -- memory clock * 2
      CLKOUT2_DIVIDE     => C_CLKOUT2_DIVIDE,    -- memory clock / 4
      CLKOUT3_DIVIDE     => C_CLKOUT3_DIVIDE,    -- memory clock / 2
      ...         
      DIVCLK_DIVIDE      => C_DIVCLK_DIVIDE,     -- valid values for 48 MHz input clock: 1 and 2
      CLKFBOUT_MULT      => C_CLKFBOUT_MULT,
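
The relation between these settings and the resulting clocks can be sketched as a simulation-only check. The entity name and the *_MHZ constants are hypothetical helpers; the generic defaults are the values of Example 1 below, and the 19 MHz limit is the minimum phase-frequency detector (PFD) frequency quoted in Example 2. The sketch assumes the usual PLL relation: output clock = input clock / DIVCLK_DIVIDE * CLKFBOUT_MULT / CLKOUTn_DIVIDE.

```vhdl
-- Simulation-only check of the PLL divider arithmetic.
-- Entity and constant names are hypothetical helpers.
entity pll_clock_check is
  generic (
    DIVCLK_DIVIDE  : positive := 2;
    CLKFBOUT_MULT  : positive := 33;
    CLKOUT0_DIVIDE : positive := 2;   -- memory clock * 2
    CLKOUT2_DIVIDE : positive := 16;  -- memory clock / 4
    CLKOUT3_DIVIDE : positive := 8    -- memory clock / 2
  );
end entity pll_clock_check;

architecture sim of pll_clock_check is
  constant FXCLK_MHZ   : real := 48.0;  -- EZ-USB clock
  constant PFD_MIN_MHZ : real := 19.0;  -- minimum PFD frequency quoted in this tutorial
  constant PFD_MHZ     : real := FXCLK_MHZ / real(DIVCLK_DIVIDE);
  constant VCO_MHZ     : real := PFD_MHZ * real(CLKFBOUT_MULT);
  constant MEM_MHZ     : real := VCO_MHZ / real(CLKOUT0_DIVIDE) / 2.0;
  constant CLKOUT2_MHZ : real := VCO_MHZ / real(CLKOUT2_DIVIDE);
  constant CLKOUT3_MHZ : real := VCO_MHZ / real(CLKOUT3_DIVIDE);
begin
  check : process
  begin
    report "memory clock = " & real'image(MEM_MHZ) & " MHz";
    assert PFD_MHZ >= PFD_MIN_MHZ
      report "PFD frequency is below 19 MHz" severity warning;
    assert abs(CLKOUT2_MHZ - MEM_MHZ / 4.0) < 1.0e-6
      report "CLKOUT2 is not memory clock / 4" severity error;
    assert abs(CLKOUT3_MHZ - MEM_MHZ / 2.0) < 1.0e-6
      report "CLKOUT3 is not memory clock / 2" severity error;
    wait;  -- run the checks once
  end process check;
end architecture sim;
```

With the default generics the memory clock evaluates to 198 MHz and no assertion fires. Overriding the generics with the values of Example 2 triggers the PFD warning, which matches the constraint violation described there.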

Example 1: 198 MHz memory clock

CLKOUT0_DIVIDE     => 2,    -- 396 MHz, memory clock * 2
CLKOUT1_DIVIDE     => 2,    -- 396 MHz, memory clock * 2
CLKOUT2_DIVIDE     => 16,   -- 49.5 MHz, memory clock / 4
CLKOUT3_DIVIDE     => 8,    -- 99 MHz, memory clock / 2
...         
DIVCLK_DIVIDE      => 2,    -- valid values for 48 MHz input clock: 1 and 2
CLKFBOUT_MULT      => 33,    

Example 2: 200 MHz memory clock (PLL constraints violated)

This example violates the PLL constraints: the phase-frequency detector runs at 48 MHz / 3 = 16 MHz, which is below the minimum of 19 MHz. It usually works nevertheless:

CLKOUT0_DIVIDE     => 1,    -- 400 MHz, memory clock * 2
CLKOUT1_DIVIDE     => 1,    -- 400 MHz, memory clock * 2
CLKOUT2_DIVIDE     => 8,    -- 50 MHz, memory clock / 4
CLKOUT3_DIVIDE     => 4,    -- 100 MHz, memory clock / 2
...         
DIVCLK_DIVIDE      => 3,    -- invalid: 48 MHz / 3 < 19 MHz
CLKFBOUT_MULT      => 25,    

Example 3: 132 MHz memory clock (low power setup)

The memory and the parallel (on-chip) input termination of the FPGA together consume about 1.8 W at maximum frequency. Reducing the memory frequency to 132 MHz and disabling the parallel termination drastically reduces the power consumption. (The memory bandwidth of this setting is 528 MByte/s: 132 MHz × 2 transfers per clock cycle × 2 bytes per transfer.)

For this setting the MT46V32M16XX-6 should be chosen instead of the MT46V32M16XX-5B-IT and the clock period should be set to 7500 ps (step 7 in the section “Creating the IP Core”).

The parallel termination can be disabled by commenting out the following lines in the ucf file:

# NET "mcb3_dram_dq[*]"                                 IN_TERM = UNTUNED_SPLIT_50;
# NET "mcb3_dram_dqs"                                   IN_TERM = UNTUNED_SPLIT_50;
# NET "mcb3_dram_udqs"                                  IN_TERM = UNTUNED_SPLIT_50;

The settings for ipcore_dir/<ipcore name>/user_design/rtl/memc3_infrastructure.vhd are

CLKOUT0_DIVIDE     => 2,    -- 264 MHz, memory clock * 2
CLKOUT1_DIVIDE     => 2,    -- 264 MHz, memory clock * 2
CLKOUT2_DIVIDE     => 16,   -- 33 MHz, memory clock / 4
CLKOUT3_DIVIDE     => 8,    -- 66 MHz, memory clock / 2
...         
DIVCLK_DIVIDE      => 1,    -- valid values for 48 MHz input clock: 1 and 2
CLKFBOUT_MULT      => 11,