 Xilinx 7 Series F4PGA Partial Reconfiguration Flow
 ##################################################

 .. Note::
   F4PGA currently does not support partial bitstream generation.
   This is a future goal; for now, partial FASM must be concatenated with the overlay FASM to generate a full
   bitstream.

 Background
 ==========

 Partition Regions
 -----------------

 In this documentation the terms partition region and region of interest (ROI) are used interchangeably to refer to some
 smaller portion of a larger FPGA architecture.
 This region may or may not align with frame boundaries, but the best-tested use case is a partition region that is
 one clock region tall.

 Overlay Architecture
 --------------------

 The overlay architecture is essentially the "inverse" of all the partition regions in a design; it includes everything
 in the full device that is not in a partition region.
 Typically this includes the chip IOs and, if the chip has one, the processing system (PS) region.
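
 As a rough illustration of this "inverse" relationship, the following Python sketch treats each partition region as an
 inclusive rectangle of grid coordinates and classifies a tile as overlay when no region contains it.
 The ``Region`` type, its field names, and the coordinates are hypothetical and are not taken from f4pga-arch-defs.

 .. code-block:: python

     from dataclasses import dataclass
     from typing import Iterable


     @dataclass(frozen=True)
     class Region:
         """Hypothetical inclusive rectangle of grid coordinates."""
         x_min: int
         x_max: int
         y_min: int
         y_max: int

         def contains(self, x: int, y: int) -> bool:
             return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


     def in_overlay(x: int, y: int, regions: Iterable[Region]) -> bool:
         """A tile belongs to the overlay when no partition region contains it."""
         return not any(region.contains(x, y) for region in regions)


     # Made-up coordinates, for illustration only.
     pr1 = Region(10, 20, 0, 50)
     pr2 = Region(30, 40, 0, 50)
     print(in_overlay(15, 25, [pr1, pr2]))  # False: the tile lies inside pr1
     print(in_overlay(25, 25, [pr1, pr2]))  # True: the tile lies between the regions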

 Synthetic IO Tiles (Synth IOs)
 ------------------------------

 Synthetic IO tiles are "fake" IOs inserted into the partition region architecture so that VPR routes top-level IOs to a
 specific graph node.
 This method allows partition region architectures to interface with each other and the overlay.

 Vivado Node vs Wire
 -------------------

 A wire is a small electrically connected part of the FPGA contained within a single tile.
 A Vivado node is an electrically connected collection of wires that can span multiple tiles.
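
 The relationship between the two can be pictured with a small data structure.
 This is only a sketch: the ``Wire`` and ``Node`` classes and the tile and wire names are placeholders, not types or
 names from prjxray or f4pga-arch-defs.

 .. code-block:: python

     from dataclasses import dataclass
     from typing import List, Set


     @dataclass(frozen=True)
     class Wire:
         """A wire lives in exactly one tile."""
         tile: str
         name: str


     @dataclass
     class Node:
         """A node groups electrically connected wires, possibly across tiles."""
         name: str
         wires: List[Wire]

         def tiles(self) -> Set[str]:
             return {wire.tile for wire in self.wires}


     # Placeholder names, for illustration only.
     node = Node(
         "EXAMPLE_TILE_A/EXAMPLE_NODE",
         [Wire("EXAMPLE_TILE_A", "OUT0"), Wire("EXAMPLE_TILE_B", "IN0")],
     )
     print(sorted(node.tiles()))  # ['EXAMPLE_TILE_A', 'EXAMPLE_TILE_B']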

 Flow Overview
 =============

 A simplified view of the partition region flow is as follows:

 -  Define each partition region architecture

 -  Define the overlay architecture based on the partition regions chosen

 -  Build each architecture separately

 -  Map a top-level Verilog file to each architecture

 -  Generate FASM for each partition region and the overlay

 -  Concatenate the FASM for all architectures and generate the final bitstream

 Partition Region Example (switch_processing)
 ============================================

 This example contains two partition regions that are each about the size of one clock region.

 The goal of this test is to have two partition regions with identical interfaces so switch "data" can be passed through
 each region before being displayed on LEDs.
 Each partition region can then have an arbitrary module mapped to it that processes the data in some way before the
 output.
 The example modules used currently are an add_1 module, a blink module, and an identity module.

 Define the first partition region:

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr1-roi-virt/design.json`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr1-roi-virt/design.json
   :language: JSON

 Here the info section defines the boundaries of the partition region.
 It is important to use the prjxray grid, not the VPR grid or the Vivado grid, to define these boundaries.
 The ports section then defines the interface pins for the region.
 A synth IO is placed for each of these interface pins.
 Each pin must contain a name, pin name, type, and node name.
 The name and pin name must be unique identifiers.
 The type can be ``in``, ``out``, or ``clk``.
 The node is the Vivado node that the synth IO should be connected to.
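
 The rules above lend themselves to a quick sanity check before building.
 The sketch below is not part of f4pga-arch-defs; the key names ``name``, ``pin``, ``type``, and ``node`` are
 assumptions and should be compared against the ``design.json`` shown above, and the port values are placeholders.

 .. code-block:: python

     from typing import Dict, List

     # Assumed key names; verify them against the real design.json.
     REQUIRED_KEYS = ("name", "pin", "type", "node")
     ALLOWED_TYPES = {"in", "out", "clk"}


     def check_ports(ports: List[Dict[str, str]]) -> List[str]:
         """Return human-readable problems; an empty list means none were found."""
         errors: List[str] = []
         seen = {"name": set(), "pin": set()}
         for index, port in enumerate(ports):
             missing = [key for key in REQUIRED_KEYS if key not in port]
             if missing:
                 errors.append(f"port {index}: missing {', '.join(missing)}")
                 continue
             if port["type"] not in ALLOWED_TYPES:
                 errors.append(f"port {index}: unknown type {port['type']!r}")
             for key in ("name", "pin"):
                 if port[key] in seen[key]:
                     errors.append(f"port {index}: duplicate {key} {port[key]!r}")
                 seen[key].add(port[key])
         return errors


     # Placeholder values, not taken from the real example.
     ports = [
         {"name": "in0", "pin": "SYN0", "type": "in", "node": "TILE_A/NODE_A"},
         {"name": "out0", "pin": "SYN0", "type": "out", "node": "TILE_B/NODE_B"},
     ]
     print(check_ports(ports))  # ["port 1: duplicate pin 'SYN0'"]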

 Optionally, a wire name can be provided to give an exact location for the synth IO.
 If a wire is not provided it will be inferred as the first wire outside of the partition region on the given node.
 Providing an explicit wire name is especially important when using nodes that cross all the way through the partition
 region, such as clock nodes.
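
 To see why an explicit wire matters for nodes that cross the region, consider the following sketch of the inference
 rule.
 It assumes the wires of a node are visited in some fixed order and that each wire has a grid position; the
 ``PlacedWire`` type, the function, and all names and coordinates are hypothetical, and the actual inference in
 f4pga-arch-defs may differ.

 .. code-block:: python

     from typing import List, NamedTuple, Optional


     class PlacedWire(NamedTuple):
         name: str
         x: int
         y: int


     def first_wire_outside(
         wires: List[PlacedWire], x_min: int, x_max: int, y_min: int, y_max: int
     ) -> Optional[PlacedWire]:
         """Return the first wire, in list order, that lies outside the region."""
         for wire in wires:
             inside = x_min <= wire.x <= x_max and y_min <= wire.y <= y_max
             if not inside:
                 return wire
         return None


     # A made-up node that enters the region on the left and leaves on the right.
     crossing_node = [
         PlacedWire("LEFT_TILE/W0", 5, 10),
         PlacedWire("INNER_TILE/W1", 12, 10),
         PlacedWire("RIGHT_TILE/W2", 25, 10),
     ]
     inferred = first_wire_outside(crossing_node, x_min=10, x_max=20, y_min=0, y_max=50)
     print(inferred.name)  # LEFT_TILE/W0

 In this made-up case ``RIGHT_TILE/W2`` is also outside the region, so if the synth IO belongs on the right-hand side
 only an explicit wire name can select it.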

 Next, define the CMake files for the first partition region architecture:

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr1-roi-virt/CMakeLists.txt`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr1-roi-virt/CMakeLists.txt
   :language: cmake

 The important argument here is ``ROI_DIR``, which points to the directory containing the ``design.json`` defined earlier.

 Next, define the second partition region in the same way as the first:

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr2-roi-virt/design.json`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr2-roi-virt/design.json
   :language: JSON

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr2-roi-virt/CMakeLists.txt`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-pr2-roi-virt/CMakeLists.txt
   :language: cmake

 The last ``design.json`` that must be defined is for the overlay.
 It is mostly a list of the JSON for each partition region contained in the design, with two changes.
 First, the pin names must be unique across all ports in the overlay, not just within one partition region.
 Second, any explicit wires must be moved to the other side of the partition region boundary.
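
 The uniqueness requirement is easy to violate when copying port lists from several partition regions into one file.
 A minimal sketch of a cross-region check follows, assuming each port is a dictionary with a ``pin`` key; that key name
 and the values are assumptions, not confirmed by this document.

 .. code-block:: python

     from collections import Counter
     from typing import Dict, List


     def duplicated_pins(regions: List[List[Dict[str, str]]]) -> List[str]:
         """Return pin names that occur more than once across all regions."""
         counts = Counter(port["pin"] for ports in regions for port in ports)
         return sorted(pin for pin, count in counts.items() if count > 1)


     # Placeholder port lists, one per partition region.
     pr1_ports = [{"pin": "SYN0"}, {"pin": "SYN1"}]
     pr2_ports = [{"pin": "SYN0"}, {"pin": "SYN2"}]
     print(duplicated_pins([pr1_ports, pr2_ports]))  # ['SYN0']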

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-overlay-virt/design.json`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-overlay-virt/design.json
   :language: JSON

 :ghsrc:`xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-overlay-virt/CMakeLists.txt`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/xc7a50t-arty-switch-processing-overlay-virt/CMakeLists.txt
   :language: cmake

 The important argument here is ``OVERLAY_DIR``, which points to the directory containing the ``design.json`` for this
 overlay.
 Notice that this ``CMakeLists.txt`` also contains more tile/pb types because the overlay contains the real IOs.

 With the ``design.json`` files defined, CMake must be told to build the new architectures.
 This is done in another ``CMakeLists.txt`` by adding the following:

 :ghsrc:`xilinx/xc7/archs/artix7/devices/CMakeLists.txt`

 .. literalinclude:: ../../../xilinx/xc7/archs/artix7/devices/CMakeLists.txt
   :language: cmake

 The last step before adding a test is to add the new boards to ``boards.cmake``:

 :ghsrc:`xilinx/xc7/boards.cmake`

 .. literalinclude:: ../../../xilinx/xc7/boards.cmake
   :language: cmake

 This defines a separate board for each partition region and for the overlay so that a design can be mapped to each one
 separately.

 The next step is to define a test.
 This documentation does not go into detail on how to define a new test case in f4pga-arch-defs, but it points out the
 items that matter for the partial reconfiguration flow.

 All of the following snippets are from :ghsrc:`xilinx/xc7/tests/switch_processing/CMakeLists.txt`.

 .. code-block:: cmake

 	add_file_target(FILE switch_processing_add_1.v SCANNER_TYPE verilog)
 	add_fpga_target(
 	  NAME switch_processing_arty_add_1_pr1
 	  BOARD arty-switch-processing-pr1
 	  SOURCES switch_processing_add_1.v
 	  INPUT_IO_FILE ${COMMON}/arty_switch_processing_pr1.pcf
 	  EXPLICIT_ADD_FILE_TARGET
 	  )

 	add_file_target(FILE switch_processing_blink.v SCANNER_TYPE verilog)
 	add_fpga_target(
 	  NAME switch_processing_arty_blink_pr2
 	  BOARD arty-switch-processing-pr2
 	  SOURCES switch_processing_blink.v
 	  INPUT_IO_FILE ${COMMON}/arty_switch_processing_pr2.pcf
 	  EXPLICIT_ADD_FILE_TARGET
 	  )

 	add_file_target(FILE switch_processing_identity.v SCANNER_TYPE verilog)
 	add_fpga_target(
 	  NAME switch_processing_arty_identity_pr1
 	  BOARD arty-switch-processing-pr1
 	  SOURCES switch_processing_identity.v
 	  INPUT_IO_FILE ${COMMON}/arty_switch_processing_pr1.pcf
 	  EXPLICIT_ADD_FILE_TARGET
 	  )

 	add_fpga_target(
 	  NAME switch_processing_arty_identity_pr2
 	  BOARD arty-switch-processing-pr2
 	  SOURCES switch_processing_identity.v
 	  INPUT_IO_FILE ${COMMON}/arty_switch_processing_pr2.pcf
 	  EXPLICIT_ADD_FILE_TARGET
 	  )

 Here the add_1 and blink modules are mapped to pr1 and pr2, respectively.
 The identity module is also mapped to each partition region.

 .. code-block:: cmake

 	add_file_target(FILE switch_processing_arty_overlay.v SCANNER_TYPE verilog)
 	add_fpga_target(
 	  NAME switch_processing_arty_overlay
 	  BOARD arty-switch-processing-overlay
 	  SOURCES switch_processing_arty_overlay.v
 	  INPUT_IO_FILE ${COMMON}/arty_switch_processing_overlay.pcf
 	  EXPLICIT_ADD_FILE_TARGET
 	  )

 Here the overlay Verilog is mapped to the overlay architecture.
 This overlay Verilog connects the switches to the input of the first partition region, connects the output of the first
 partition region to the input of the second partition region, and connects the output of the second partition region
 to the LEDs.

 .. code-block:: cmake

 	add_bitstream_target(
 	  NAME switch_processing_arty_both_merged
 	  USE_FASM
 	  INCLUDED_TARGETS switch_processing_arty_add_1_pr1 switch_processing_arty_blink_pr2 switch_processing_arty_overlay
 	  )

 	add_bitstream_target(
 	  NAME switch_processing_arty_add_1_merged
 	  USE_FASM
 	  INCLUDED_TARGETS switch_processing_arty_add_1_pr1 switch_processing_arty_identity_pr2 switch_processing_arty_overlay
 	  )

 	add_bitstream_target(
 	  NAME switch_processing_arty_blink_merged
 	  USE_FASM
 	  INCLUDED_TARGETS switch_processing_arty_identity_pr1 switch_processing_arty_blink_pr2 switch_processing_arty_overlay
 	  )

 	add_bitstream_target(
 	  NAME switch_processing_arty_identity_merged
 	  USE_FASM
 	  INCLUDED_TARGETS switch_processing_arty_identity_pr1 switch_processing_arty_identity_pr2 switch_processing_arty_overlay
 	  )

 Lastly, several merged bitstream targets are defined.
 Each target concatenates the FASM generated by its included targets and produces a final bitstream.
 Varying which targets are included changes the functionality of the design without remapping any region that has
 already been mapped once.
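
 Because FASM is line-oriented text, the concatenation step itself is simple.
 The following Python sketch shows the idea only; it is not how ``add_bitstream_target`` is implemented, and the file
 names and feature lines are placeholders.

 .. code-block:: python

     import tempfile
     from pathlib import Path
     from typing import Iterable


     def concatenate_fasm(inputs: Iterable[Path], output: Path) -> None:
         """Write the input FASM files to the output file, one after another."""
         with output.open("w") as merged:
             for path in inputs:
                 text = path.read_text()
                 merged.write(text)
                 # Keep the last line of one file from fusing with the next file.
                 if text and not text.endswith("\n"):
                     merged.write("\n")


     with tempfile.TemporaryDirectory() as tmp:
         tmp_dir = Path(tmp)
         parts = []
         # Placeholder contents standing in for real per-architecture FASM.
         for stem in ("pr1", "pr2", "overlay"):
             part = tmp_dir / f"{stem}.fasm"
             part.write_text(f"FEATURE_FROM_{stem.upper()}")  # no trailing newline
             parts.append(part)
         merged_path = tmp_dir / "merged.fasm"
         concatenate_fasm(parts, merged_path)
         merged_text = merged_path.read_text()

     print(merged_text, end="")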

 The last topic related to the F4PGA partial reconfiguration flow is the synthetic IBUFs and OBUFs required in the
 overlay Verilog:

 :ghsrc:`switch_processing_arty_overlay.v <xilinx/xc7/tests/switch_processing/switch_processing_arty_overlay.v>`

 Currently a ``SYN_IBUF`` or ``SYN_OBUF`` must be explicitly instantiated for each top-level IO that will be constrained
 to a synth IO.
 In the future a Yosys IO mapping pass should be able to handle this, but for now, if explicit synthetic buffers are not
 instantiated, the top-level IOs are packed into real IOs.
 This prevents constraining them to the intended synthetic IO locations.

 The overlay PCF file can then be written to constrain real IOs to chip IOs and synthetic IOs to synth IOs.


 Frequently Encountered Errors
 =============================

 SYN-IOPAD unroutable
 --------------------

 * Make sure the chosen node is driven in the correct direction for the I/O type it is being used as.

 * Inputs to a partition region must be driven from outside the partition region and outputs must be driven from inside
   the partition region.
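
 The direction rule above can be written down as a small check.
 This is a sketch only: the function name is hypothetical, and the ``clk`` type is deliberately rejected because this
 document does not state a rule for it.

 .. code-block:: python

     def driver_side_ok(port_type: str, driver_inside_region: bool) -> bool:
         """Check the rule: inputs are driven from outside, outputs from inside."""
         if port_type == "in":
             return not driver_inside_region
         if port_type == "out":
             return driver_inside_region
         raise ValueError(f"no direction rule stated for port type {port_type!r}")


     print(driver_side_ok("in", driver_inside_region=False))   # True
     print(driver_side_ok("out", driver_inside_region=False))  # False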