.. _arch_model_timing_tutorial:
Primitive Block Timing Modeling Tutorial
----------------------------------------
To accurately model an FPGA, the architect needs to specify the timing characteristics of the FPGA's primitive blocks.
This involves two key steps:
#. Specifying the logical timing characteristics of a primitive including:
* whether primitive pins are sequential or combinational, and
* what the timing dependencies are between the pins.
#. Specifying the physical delay values
These two steps separate the logical timing characteristics of a primitive from the physically dependent delays.
This enables a single logical netlist primitive type (e.g. Flip-Flop) to be mapped into different physical locations with different timing characteristics.
The :ref:`FPGA architecture description <fpga_architecture_description>` describes the logical timing characteristics in the :ref:`models section <arch_models>`, while the physical timing information is specified on ``pb_types`` within :ref:`complex blocks <arch_complex_blocks>`.
The following sections illustrate some common block timing modeling approaches.
Combinational block
~~~~~~~~~~~~~~~~~~~
A typical combinational block is a full adder,
.. figure:: fa.*
:width: 50%
Full Adder
where ``a``, ``b`` and ``cin`` are combinational inputs, and ``sum`` and ``cout`` are combinational outputs.
We can model these timing dependencies on the model with the ``combinational_sink_ports`` attribute, which specifies the output ports that depend on an input port:
.. code-block:: xml
<model name="adder">
<input_ports>
<port name="a" combinational_sink_ports="sum cout"/>
<port name="b" combinational_sink_ports="sum cout"/>
<port name="cin" combinational_sink_ports="sum cout"/>
</input_ports>
<output_ports>
<port name="sum"/>
<port name="cout"/>
</output_ports>
</model>
The physical timing delays are specified on any ``pb_type`` instances of the adder model.
For example:
.. code-block:: xml
<pb_type name="adder" blif_model=".subckt adder" num_pb="1">
<input name="a" num_pins="1"/>
<input name="b" num_pins="1"/>
<input name="cin" num_pins="1"/>
<output name="cout" num_pins="1"/>
<output name="sum" num_pins="1"/>
<delay_constant max="300e-12" in_port="adder.a" out_port="adder.sum"/>
<delay_constant max="300e-12" in_port="adder.b" out_port="adder.sum"/>
<delay_constant max="300e-12" in_port="adder.cin" out_port="adder.sum"/>
<delay_constant max="300e-12" in_port="adder.a" out_port="adder.cout"/>
<delay_constant max="300e-12" in_port="adder.b" out_port="adder.cout"/>
<delay_constant max="10e-12" in_port="adder.cin" out_port="adder.cout"/>
</pb_type>
This specifies that all edges have a delay of 300ps, except the ``cin`` to ``cout`` edge, which has a delay of 10ps.
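To see why a small ``cin`` to ``cout`` delay matters, the following Python sketch estimates the worst-case ``sum`` arrival time of a ripple-carry chain built from this adder. It is only an illustration: the helper name ``ripple_carry_delay_ps``, the chain topology, and the zero-delay ``cout`` to ``cin`` connection are assumptions made here and are not part of the architecture description.

.. code-block:: python

    # Delays in picoseconds, copied from the delay_constant tags above.
    DELAY_PS = {
        ("a", "sum"): 300, ("b", "sum"): 300, ("cin", "sum"): 300,
        ("a", "cout"): 300, ("b", "cout"): 300, ("cin", "cout"): 10,
    }


    def ripple_carry_delay_ps(num_bits):
        """Worst-case delay to any ``sum`` output of a ripple-carry chain.

        Assumes each cout feeds the next cin with no interconnect delay
        and that all a/b inputs and the first cin arrive at time 0.
        """
        if num_bits < 1:
            raise ValueError("num_bits must be at least 1")
        carry_arrival = 0
        worst_sum = 0
        for _ in range(num_bits):
            # Each output settles after the latest of its input paths.
            sum_arrival = max(DELAY_PS[("a", "sum")], DELAY_PS[("b", "sum")],
                              carry_arrival + DELAY_PS[("cin", "sum")])
            cout_arrival = max(DELAY_PS[("a", "cout")], DELAY_PS[("b", "cout")],
                               carry_arrival + DELAY_PS[("cin", "cout")])
            worst_sum = max(worst_sum, sum_arrival)
            carry_arrival = cout_arrival
        return worst_sum


    print(ripple_carry_delay_ps(4))  # prints 620

Under these assumptions each additional bit adds only the 10ps carry delay, so a 4-bit chain settles in 620ps rather than four times 300ps.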
.. _dff_timing_modeling:
Sequential block (no internal paths)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A typical sequential block is a D-Flip-Flop (DFF).
DFFs have no internal timing paths between their input and output ports.
.. note:: If you are using BLIF's ``.latch`` directive to represent DFFs there is no need to explicitly provide a ``<model>`` definition, as it is supported by default.
.. figure:: dff.*
:width: 50%
DFF
Sequential model ports are specified by providing the ``clock="<name>"`` attribute, where ``<name>`` is the name of the associated clock port.
The associated clock port must have ``is_clock="1"`` specified to indicate it is a clock.
.. code-block:: xml
<model name="dff">
<input_ports>
<port name="d" clock="clk"/>
<port name="clk" is_clock="1"/>
</input_ports>
<output_ports>
<port name="q" clock="clk"/>
</output_ports>
</model>
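The rule that every ``clock`` attribute must name a port marked ``is_clock="1"`` can be checked mechanically. The sketch below is a stand-alone illustration using Python's standard ``xml.etree.ElementTree``; the function ``undeclared_clocks`` is a hypothetical helper and is not part of VTR.

.. code-block:: python

    import xml.etree.ElementTree as ET

    DFF_MODEL = """
    <model name="dff">
      <input_ports>
        <port name="d" clock="clk"/>
        <port name="clk" is_clock="1"/>
      </input_ports>
      <output_ports>
        <port name="q" clock="clk"/>
      </output_ports>
    </model>
    """


    def undeclared_clocks(model_xml):
        """Return (port, clock) pairs whose clock is not an is_clock="1" port."""
        model = ET.fromstring(model_xml)
        ports = list(model.iter("port"))
        clock_ports = {p.get("name") for p in ports if p.get("is_clock") == "1"}
        return [(p.get("name"), p.get("clock")) for p in ports
                if p.get("clock") is not None
                and p.get("clock") not in clock_ports]


    # An empty list means every sequential port refers to a declared clock.
    print(undeclared_clocks(DFF_MODEL))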
The physical timing delays are specified on any ``pb_type`` instances of the model.
In the example below the setup-time of the input is specified as 66ps, while the clock-to-q delay of the output is set to 124ps.
.. code-block:: xml
<pb_type name="ff" blif_model=".subckt dff" num_pb="1">
<input name="d" num_pins="1"/>
<output name="q" num_pins="1"/>
<clock name="clk" num_pins="1"/>
<T_setup value="66e-12" port="ff.d" clock="clk"/>
<T_clock_to_Q max="124e-12" port="ff.q" clock="clk"/>
</pb_type>
.. _mixed_sp_ram_timing_modeling:
Mixed Sequential/Combinational Block
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is possible to define a block with some sequential ports and some combinational ports.
In the example below, the ``single_port_ram_mixed`` has sequential input ports: ``we``, ``addr`` and ``data`` (which are controlled by ``clk``).
.. figure:: mixed_sp_ram.*
:width: 75%
Mixed sequential/combinational single port ram
However, the output port (``out``) is a combinational output, connected internally to the ``we``, ``addr`` and ``data`` input registers.
.. code-block:: xml
<model name="single_port_ram_mixed">
<input_ports>
<port name="we" clock="clk" combinational_sink_ports="out"/>
<port name="addr" clock="clk" combinational_sink_ports="out"/>
<port name="data" clock="clk" combinational_sink_ports="out"/>
<port name="clk" is_clock="1"/>
</input_ports>
<output_ports>
<port name="out"/>
</output_ports>
</model>
In the ``pb_type`` we define the external setup time of the input registers (50ps) as we did for :ref:`dff_timing_modeling`.
However, we also specify the following additional timing information:
* The internal clock-to-q delay of the input registers (200ps)
* The combinational delay from the input registers to the ``out`` port (800ps)
.. code-block:: xml
<pb_type name="mem_sp" blif_model=".subckt single_port_ram_mixed" num_pb="1">
<input name="addr" num_pins="9"/>
<input name="data" num_pins="64"/>
<input name="we" num_pins="1"/>
<output name="out" num_pins="64"/>
<clock name="clk" num_pins="1"/>
<!-- External input register timing -->
<T_setup value="50e-12" port="mem_sp.addr" clock="clk"/>
<T_setup value="50e-12" port="mem_sp.data" clock="clk"/>
<T_setup value="50e-12" port="mem_sp.we" clock="clk"/>
<!-- Internal input register timing -->
<T_clock_to_Q max="200e-12" port="mem_sp.addr" clock="clk"/>
<T_clock_to_Q max="200e-12" port="mem_sp.data" clock="clk"/>
<T_clock_to_Q max="200e-12" port="mem_sp.we" clock="clk"/>
<!-- Internal combinational delay -->
<delay_constant max="800e-12" in_port="mem_sp.addr" out_port="mem_sp.out"/>
<delay_constant max="800e-12" in_port="mem_sp.data" out_port="mem_sp.out"/>
<delay_constant max="800e-12" in_port="mem_sp.we" out_port="mem_sp.out"/>
</pb_type>
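The two internal delays combine into a clock-to-output delay for each input register. The Python sketch below reads a trimmed copy of the ``pb_type`` above (only the tags it needs) and adds each ``T_clock_to_Q`` to the matching ``delay_constant``. The helper ``clock_to_output_ps`` is hypothetical and only illustrates the arithmetic; it is not how VPR itself processes the architecture file.

.. code-block:: python

    import xml.etree.ElementTree as ET

    PB_TYPE = """
    <pb_type name="mem_sp" blif_model=".subckt single_port_ram_mixed" num_pb="1">
      <T_clock_to_Q max="200e-12" port="mem_sp.addr" clock="clk"/>
      <T_clock_to_Q max="200e-12" port="mem_sp.data" clock="clk"/>
      <T_clock_to_Q max="200e-12" port="mem_sp.we" clock="clk"/>
      <delay_constant max="800e-12" in_port="mem_sp.addr" out_port="mem_sp.out"/>
      <delay_constant max="800e-12" in_port="mem_sp.data" out_port="mem_sp.out"/>
      <delay_constant max="800e-12" in_port="mem_sp.we" out_port="mem_sp.out"/>
    </pb_type>
    """


    def clock_to_output_ps(pb_type_xml):
        """Return {(input port, output port): clock-to-output delay in ps}."""
        pb = ET.fromstring(pb_type_xml)
        tcq = {t.get("port"): float(t.get("max"))
               for t in pb.iter("T_clock_to_Q")}
        delays = {}
        for d in pb.iter("delay_constant"):
            src, dst = d.get("in_port"), d.get("out_port")
            if src in tcq:
                # Register launch delay plus the combinational path to out.
                total_s = tcq[src] + float(d.get("max"))
                delays[(src, dst)] = round(total_s * 1e12)
        return delays


    for (src, dst), ps in clock_to_output_ps(PB_TYPE).items():
        print(f"{src} -> {dst}: {ps} ps")

With the values above, data appears on ``out`` 1000ps (200ps + 800ps) after the clock edge for every input.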
.. _seq_sp_ram_timing_modeling:
Sequential block (with internal paths)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Some primitives are more complex and have timing paths contained completely within the block.
The model below specifies a sequential single-port RAM.
The ports ``we``, ``addr``, and ``data`` are sequential inputs, while the port ``out`` is a sequential output.
``clk`` is the common clock.
.. figure:: seq_sp_ram.*
:width: 75%
Sequential single port ram
.. code-block:: xml
<model name="single_port_ram_seq">
<input_ports>
<port name="we" clock="clk" combinational_sink_ports="out"/>
<port name="addr" clock="clk" combinational_sink_ports="out"/>
<port name="data" clock="clk" combinational_sink_ports="out"/>
<port name="clk" is_clock="1"/>
</input_ports>
<output_ports>
<port name="out" clock="clk"/>
</output_ports>
</model>
Similarly to :ref:`mixed_sp_ram_timing_modeling` the ``pb_type`` defines the input register timing:
* external input register setup time (50ps)
* internal input register clock-to-q time (200ps)
Since the output port ``out`` is sequential we also define the:
* internal *output* register setup time (60ps)
* external *output* register clock-to-q time (300ps)
The combinational delay between the input and output registers is set to 740ps.
Note that the internal path from the input to output registers can limit the maximum operating frequency.
In this case, the internal path delay is 1ns (200ps + 740ps + 60ps), limiting the maximum frequency to 1 GHz.
.. code-block:: xml
<pb_type name="mem_sp" blif_model=".subckt single_port_ram_seq" num_pb="1">
<input name="addr" num_pins="9"/>
<input name="data" num_pins="64"/>
<input name="we" num_pins="1"/>
<output name="out" num_pins="64"/>
<clock name="clk" num_pins="1"/>
<!-- External input register timing -->
<T_setup value="50e-12" port="mem_sp.addr" clock="clk"/>
<T_setup value="50e-12" port="mem_sp.data" clock="clk"/>
<T_setup value="50e-12" port="mem_sp.we" clock="clk"/>
<!-- Internal input register timing -->
<T_clock_to_Q max="200e-12" port="mem_sp.addr" clock="clk"/>
<T_clock_to_Q max="200e-12" port="mem_sp.data" clock="clk"/>
<T_clock_to_Q max="200e-12" port="mem_sp.we" clock="clk"/>
<!-- Internal combinational delay -->
<delay_constant max="740e-12" in_port="mem_sp.addr" out_port="mem_sp.out"/>
<delay_constant max="740e-12" in_port="mem_sp.data" out_port="mem_sp.out"/>
<delay_constant max="740e-12" in_port="mem_sp.we" out_port="mem_sp.out"/>
<!-- Internal output register timing -->
<T_setup value="60e-12" port="mem_sp.out" clock="clk"/>
<!-- External output register timing -->
<T_clock_to_Q max="300e-12" port="mem_sp.out" clock="clk"/>
</pb_type>
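The internal path arithmetic can be reproduced with a few lines of Python. This is a worked example only; the constant names and the helper ``max_frequency_hz`` are chosen here for illustration, and clock skew and uncertainty are ignored.

.. code-block:: python

    # Values in picoseconds, taken from the pb_type above.
    T_CLOCK_TO_Q_INPUT_PS = 200   # internal input register clock-to-q
    T_COMB_PS = 740               # input register to output register
    T_SETUP_OUTPUT_PS = 60        # internal output register setup


    def max_frequency_hz(path_delay_ps):
        """Highest clock frequency at which a path of this delay meets setup."""
        return 1e12 / path_delay_ps


    internal_path_ps = T_CLOCK_TO_Q_INPUT_PS + T_COMB_PS + T_SETUP_OUTPUT_PS
    print(internal_path_ps)                          # 1000
    print(max_frequency_hz(internal_path_ps) / 1e9)  # 1.0 (GHz)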
.. _seq_sp_ram_comb_inputs_timing_modeling:
Sequential block (with internal paths and combinational input)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A primitive may have a mix of sequential and combinational inputs.
The model below specifies a mostly sequential single-port RAM.
The ports ``addr`` and ``data`` are sequential inputs, while the port ``we`` is a combinational input.
The port ``out`` is a sequential output.
``clk`` is the common clock.
.. figure:: seq_comb_sp_ram.*
:width: 75%
Sequential single port ram with a combinational input
.. code-block:: xml
:emphasize-lines: 3
<model name="single_port_ram_seq_comb">
<input_ports>
<port name="we" combinational_sink_ports="out"/>
<port name="addr" clock="clk" combinational_sink_ports="out"/>
<port name="data" clock="clk" combinational_sink_ports="out"/>
<port name="clk" is_clock="1"/>
</input_ports>
<output_ports>
<port name="out" clock="clk"/>
</output_ports>
</model>
We use register delays similar to :ref:`seq_sp_ram_timing_modeling`.
However, we also specify the purely combinational delay between the combinational ``we`` input and sequential output ``out`` (800ps).
Note that the setup time of the output register still affects the ``we`` to ``out`` path, for an effective delay of 860ps.
.. code-block:: xml
:emphasize-lines: 17
<pb_type name="mem_sp" blif_model=".subckt single_port_ram_seq_comb" num_pb="1">
<input name="addr" num_pins="9"/>
<input name="data" num_pins="64"/>
<input name="we" num_pins="1"/>
<output name="out" num_pins="64"/>
<clock name="clk" num_pins="1"/>
<!-- External input register timing -->
<T_setup value="50e-12" port="mem_sp.addr" clock="clk"/>
<T_setup value="50e-12" port="mem_sp.data" clock="clk"/>
<!-- Internal input register timing -->
<T_clock_to_Q max="200e-12" port="mem_sp.addr" clock="clk"/>
<T_clock_to_Q max="200e-12" port="mem_sp.data" clock="clk"/>
<!-- External combinational delay -->
<delay_constant max="800e-12" in_port="mem_sp.we" out_port="mem_sp.out"/>
<!-- Internal combinational delay -->
<delay_constant max="740e-12" in_port="mem_sp.addr" out_port="mem_sp.out"/>
<delay_constant max="740e-12" in_port="mem_sp.data" out_port="mem_sp.out"/>
<!-- Internal output register timing -->
<T_setup value="60e-12" port="mem_sp.out" clock="clk"/>
<!-- External output register timing -->
<T_clock_to_Q max="300e-12" port="mem_sp.out" clock="clk"/>
</pb_type>
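The effect of the output register's setup time on the ``we`` path can be expressed as a setup slack. The following Python sketch assumes a single clock with no skew; the function ``we_setup_slack_ps`` and the arrival times passed to it are hypothetical values chosen for illustration.

.. code-block:: python

    # Values in picoseconds, taken from the pb_type above.
    T_COMB_WE_PS = 800        # delay_constant from we to out
    T_SETUP_OUTPUT_PS = 60    # setup time of the internal output register


    def we_setup_slack_ps(clock_period_ps, we_arrival_ps):
        """Setup slack at the output register for a signal arriving at ``we``.

        we_arrival_ps is the arrival time at the ``we`` pin, measured from
        the launching clock edge.
        """
        required = clock_period_ps - T_SETUP_OUTPUT_PS
        return required - (we_arrival_ps + T_COMB_WE_PS)


    print(T_COMB_WE_PS + T_SETUP_OUTPUT_PS)  # 860: effective delay
    print(we_setup_slack_ps(1000, 100))      # 40: timing is met
    print(we_setup_slack_ps(1000, 200))      # -60: timing is violated

In other words, with a 1ns clock a signal must reach the ``we`` pin no later than 140ps after the launching edge.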
.. _multiclock_dp_ram_timing_modeling:
Multi-clock Sequential block (with internal paths)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It is also possible for a sequential primitive to have multiple clocks.
The following model represents a multi-clock simple dual-port sequential RAM with:
* one write port (``addr1``, ``data1`` and ``we1``) controlled by ``clk1``, and
* one read port (``addr2`` and ``data2``) controlled by ``clk2``.
.. figure:: multiclock_dp_ram.*
:width: 75%
Multi-clock sequential simple dual port ram
.. code-block:: xml
<model name="multiclock_dual_port_ram">
<input_ports>
<!-- Write Port -->
<port name="we1" clock="clk1" combinational_sink_ports="data2"/>
<port name="addr1" clock="clk1" combinational_sink_ports="data2"/>
<port name="data1" clock="clk1" combinational_sink_ports="data2"/>
<port name="clk1" is_clock="1"/>
<!-- Read Port -->
<port name="addr2" clock="clk2" combinational_sink_ports="data2"/>
<port name="clk2" is_clock="1"/>
</input_ports>
<output_ports>
<!-- Read Port -->
<port name="data2" clock="clk2"/>
</output_ports>
</model>
On the ``pb_type`` the input and output register timing is defined similarly to :ref:`seq_sp_ram_timing_modeling`, except multiple clocks are used.
.. code-block:: xml
<pb_type name="mem_dp" blif_model=".subckt multiclock_dual_port_ram" num_pb="1">
<input name="addr1" num_pins="9"/>
<input name="data1" num_pins="64"/>
<input name="we1" num_pins="1"/>
<input name="addr2" num_pins="9"/>
<output name="data2" num_pins="64"/>
<clock name="clk1" num_pins="1"/>
<clock name="clk2" num_pins="1"/>
<!-- External input register timing -->
<T_setup value="50e-12" port="mem_dp.addr1" clock="clk1"/>
<T_setup value="50e-12" port="mem_dp.data1" clock="clk1"/>
<T_setup value="50e-12" port="mem_dp.we1" clock="clk1"/>
<T_setup value="50e-12" port="mem_dp.addr2" clock="clk2"/>
<!-- Internal input register timing -->
<T_clock_to_Q max="200e-12" port="mem_dp.addr1" clock="clk1"/>
<T_clock_to_Q max="200e-12" port="mem_dp.data1" clock="clk1"/>
<T_clock_to_Q max="200e-12" port="mem_dp.we1" clock="clk1"/>
<T_clock_to_Q max="200e-12" port="mem_dp.addr2" clock="clk2"/>
<!-- Internal combinational delay -->
<delay_constant max="740e-12" in_port="mem_dp.addr1" out_port="mem_dp.data2"/>
<delay_constant max="740e-12" in_port="mem_dp.data1" out_port="mem_dp.data2"/>
<delay_constant max="740e-12" in_port="mem_dp.we1" out_port="mem_dp.data2"/>
<delay_constant max="740e-12" in_port="mem_dp.addr2" out_port="mem_dp.data2"/>
<!-- Internal output register timing -->
<T_setup value="60e-12" port="mem_dp.data2" clock="clk2"/>
<!-- External output register timing -->
<T_clock_to_Q max="300e-12" port="mem_dp.data2" clock="clk2"/>
</pb_type>
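Because the write port and the read port use different clocks, some internal paths are launched by ``clk1`` and captured by ``clk2``. The Python sketch below lists the launch and capture clock of every internal path, using a trimmed copy of the model above that keeps only the attributes the sketch reads. The helper ``internal_paths`` is hypothetical and is not part of VTR.

.. code-block:: python

    import xml.etree.ElementTree as ET

    MODEL = """
    <model name="multiclock_dual_port_ram">
      <input_ports>
        <port name="we1" clock="clk1" combinational_sink_ports="data2"/>
        <port name="addr1" clock="clk1" combinational_sink_ports="data2"/>
        <port name="data1" clock="clk1" combinational_sink_ports="data2"/>
        <port name="clk1" is_clock="1"/>
        <port name="addr2" clock="clk2" combinational_sink_ports="data2"/>
        <port name="clk2" is_clock="1"/>
      </input_ports>
      <output_ports>
        <port name="data2" clock="clk2"/>
      </output_ports>
    </model>
    """


    def internal_paths(model_xml):
        """Yield (input, launch clock, output, capture clock) tuples."""
        model = ET.fromstring(model_xml)
        capture = {p.get("name"): p.get("clock")
                   for p in model.find("output_ports")}
        for port in model.find("input_ports"):
            launch = port.get("clock")
            if launch is None:
                continue  # clock ports and combinational inputs launch nothing
            for sink in port.get("combinational_sink_ports", "").split():
                if capture.get(sink) is not None:
                    yield port.get("name"), launch, sink, capture[sink]


    for src, launch, dst, cap in internal_paths(MODEL):
        kind = "same clock" if launch == cap else "clock crossing"
        print(f"{src} ({launch}) -> {dst} ({cap}): {kind}")

For this model the three write-port inputs are reported as clock crossings, while ``addr2`` to ``data2`` stays within ``clk2``.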
.. _clock_generator_timing_modeling:
Clock Generators
~~~~~~~~~~~~~~~~
Some blocks (such as PLLs) generate clocks on-chip.
To ensure that these generated clocks are identified as clocks, the associated model output port should be marked with ``is_clock="1"``.
As an example consider the following simple PLL model:
.. code-block:: xml
<model name="simple_pll">
<input_ports>
<port name="in_clock" is_clock="1"/>
</input_ports>
<output_ports>
<port name="out_clock" is_clock="1"/>
</output_ports>
</model>
The port named ``in_clock`` is specified as a clock sink, since it is an input port with ``is_clock="1"`` set.
The port named ``out_clock`` is specified as a clock generator, since it is an *output* port with ``is_clock="1"`` set.
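The distinction between clock sinks and clock generators depends only on which section of the model a port marked ``is_clock="1"`` appears in. A minimal Python sketch of that classification follows; ``classify_clock_ports`` is a hypothetical helper written for this example.

.. code-block:: python

    import xml.etree.ElementTree as ET

    PLL_MODEL = """
    <model name="simple_pll">
      <input_ports>
        <port name="in_clock" is_clock="1"/>
      </input_ports>
      <output_ports>
        <port name="out_clock" is_clock="1"/>
      </output_ports>
    </model>
    """


    def classify_clock_ports(model_xml):
        """Return {port name: "clock sink" or "clock generator"}."""
        model = ET.fromstring(model_xml)
        roles = {}
        for section, role in (("input_ports", "clock sink"),
                              ("output_ports", "clock generator")):
            for port in model.find(section):
                if port.get("is_clock") == "1":
                    roles[port.get("name")] = role
        return roles


    print(classify_clock_ports(PLL_MODEL))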
.. _clock_buffers_timing_modeling:
Clock Buffers & Muxes
~~~~~~~~~~~~~~~~~~~~~
Some architectures contain special primitives for buffering or controlling clocks.
VTR supports modelling these using the ``is_clock`` attribute on the model to differentiate between 'data' and 'clock' signals, allowing users to control how clocks are traced through these primitives.
When VPR traces through the netlist it will propagate clocks from clock inputs to the downstream combinationally connected pins.
Clock Buffers/Gates
^^^^^^^^^^^^^^^^^^^
Consider the following black-box clock buffer with an enable:
.. code-block:: none
.subckt clkbufce \
in=clk3 \
enable=clk3_enable \
out=clk3_buf
We wish to have VPR understand that the ``in`` port of the ``clkbufce`` connects to the ``out`` port, and that as a result the nets ``clk3`` and ``clk3_buf`` are equivalent.
This is accomplished by tagging the ``in`` port as a clock (``is_clock="1"``), and combinationally connecting it to the ``out`` port (``combinational_sink_ports="out"``):
.. code-block:: xml
<model name="clkbufce">
<input_ports>
<port name="in" combinational_sink_ports="out" is_clock="1"/>
<port name="enable" combinational_sink_ports="out"/>
</input_ports>
<output_ports>
<port name="out"/>
</output_ports>
</model>
With the corresponding pb_type:
.. code-block:: xml
<pb_type name="clkbufce" blif_model=".subckt clkbufce" num_pb="1">
<clock name="in" num_pins="1"/>
<input name="enable" num_pins="1"/>
<output name="out" num_pins="1"/>
<delay_constant max="10e-12" in_port="clkbufce.in" out_port="clkbufce.out"/>
<delay_constant max="5e-12" in_port="clkbufce.enable" out_port="clkbufce.out"/>
</pb_type>
Notably, although the ``enable`` port is combinationally connected to the ``out`` port it will not be considered as a potential clock since it is not marked with ``is_clock="1"``.
Clock Muxes
^^^^^^^^^^^
Another common clock control block is a clock mux, which selects from one of several potential clocks.
For instance, consider:
.. code-block:: none
.subckt clkmux \
clk1=clka \
clk2=clkb \
sel=select \
clk_out=clk_downstream
which selects one of two input clocks (``clk1`` and ``clk2``) to be passed through to ``clk_out``, based on the value of ``sel``.
This could be modelled as:
.. code-block:: xml
<model name="clkmux">
<input_ports>
<port name="clk1" combinational_sink_ports="clk_out" is_clock="1"/>
<port name="clk2" combinational_sink_ports="clk_out" is_clock="1"/>
<port name="sel" combinational_sink_ports="clk_out"/>
</input_ports>
<output_ports>
<port name="clk_out"/>
</output_ports>
</model>
<pb_type name="clkmux" blif_model=".subckt clkmux" num_pb="1">
<clock name="clk1" num_pins="1"/>
<clock name="clk2" num_pins="1"/>
<input name="sel" num_pins="1"/>
<output name="clk_out" num_pins="1"/>
<delay_constant max="10e-12" in_port="clkmux.clk1" out_port="clkmux.clk_out"/>
<delay_constant max="10e-12" in_port="clkmux.clk2" out_port="clkmux.clk_out"/>
<delay_constant max="20e-12" in_port="clkmux.sel" out_port="clkmux.clk_out"/>
</pb_type>
where both input clock ports ``clk1`` and ``clk2`` are tagged with ``is_clock="1"`` and combinationally connected to the ``clk_out`` port.
As a result both nets ``clka`` and ``clkb`` in the netlist would be identified as independent clocks feeding ``clk_downstream``.
.. note::
Clock propagation is driven by netlist connectivity, so if one of the input clock ports (e.g. ``clk1``) were disconnected in the netlist, no associated clock would be created/considered.
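The propagation rule described above (clocks pass from connected ``is_clock="1"`` inputs to their combinational sinks) can be sketched in Python as follows. This is a simplified illustration and not VPR's implementation: the ``CLKMUX_MODEL`` table and the function ``propagated_clocks`` are hypothetical, and the connections mirror the ``.subckt clkmux`` example.

.. code-block:: python

    # Facts from the clkmux model: input port -> (is_clock, combinational sinks).
    CLKMUX_MODEL = {
        "clk1": (True, ["clk_out"]),
        "clk2": (True, ["clk_out"]),
        "sel": (False, ["clk_out"]),
    }


    def propagated_clocks(model, connections):
        """Map each output net to the clock nets that reach it.

        connections maps port names to net names, as on a .subckt line;
        ports that are missing are treated as disconnected.
        """
        reached = {}
        for port, (is_clock, sinks) in model.items():
            net = connections.get(port)
            if not is_clock or net is None:
                continue  # data inputs and unconnected ports carry no clock
            for sink in sinks:
                out_net = connections.get(sink)
                if out_net is not None:
                    reached.setdefault(out_net, []).append(net)
        return reached


    full = {"clk1": "clka", "clk2": "clkb", "sel": "select",
            "clk_out": "clk_downstream"}
    print(propagated_clocks(CLKMUX_MODEL, full))

    # With clk1 left unconnected, only clkb reaches clk_downstream.
    partial = {k: v for k, v in full.items() if k != "clk1"}
    print(propagated_clocks(CLKMUX_MODEL, partial))

Note that ``select`` never appears in the result, because the ``sel`` port is not tagged as a clock.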
Clock Mux Timing Constraints
""""""""""""""""""""""""""""
For the clock mux example above, if the user specified the following :ref:`SDC timing constraints <sdc_commands>`:
.. code-block:: tcl
create_clock -period 3 clka
create_clock -period 2 clkb
VPR would propagate both ``clka`` and ``clkb`` through the clock mux.
Therefore the logic connected to ``clk_downstream`` would be analyzed for both the ``clka`` and ``clkb`` constraints.
Most likely (unless ``clka`` and ``clkb`` are used elsewhere) the user should additionally specify:
.. code-block:: tcl
set_clock_groups -exclusive -group clka -group clkb
This avoids analyzing paths between the two clocks (i.e. ``clka`` -> ``clkb`` and ``clkb`` -> ``clka``), which are not physically realizable.
The muxing logic means only one clock can drive ``clk_downstream`` at any point in time (i.e. the mux enforces that ``clka`` and ``clkb`` are mutually exclusive).
This is the behaviour of :ref:`VPR's default timing constraints <default_timing_constraints>`.
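The effect of the exclusive clock groups on the set of analyzed launch/capture clock pairs can be illustrated with a short Python sketch. It models only the pairing logic under the assumption that clocks in different exclusive groups are never analyzed against each other; ``analysed_domain_pairs`` is a hypothetical helper, not an SDC or VPR API.

.. code-block:: python

    from itertools import product


    def analysed_domain_pairs(clocks, exclusive_groups=()):
        """List the (launch, capture) clock pairs that would be analysed.

        exclusive_groups is a sequence of sets of clock names; clocks in
        different sets are treated as mutually exclusive.
        """
        def group_of(clock):
            for index, group in enumerate(exclusive_groups):
                if clock in group:
                    return index
            return None

        pairs = []
        for launch, capture in product(clocks, repeat=2):
            g_launch, g_capture = group_of(launch), group_of(capture)
            cut = (g_launch is not None and g_capture is not None
                   and g_launch != g_capture)
            if not cut:
                pairs.append((launch, capture))
        return pairs


    clocks = ["clka", "clkb"]
    # Without clock groups all four pairs are analysed.
    print(analysed_domain_pairs(clocks))
    # With exclusive groups only clka -> clka and clkb -> clkb remain.
    print(analysed_domain_pairs(clocks, [{"clka"}, {"clkb"}]))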