Initial implementation of genfasm (#510)

genfasm is a tool for generating FASM output based on arch and rrgraph
metadata, for the purposes of bitstream generation.

Signed-off-by: Keith Rothman <537074+litghost@users.noreply.github.com>
diff --git a/CMakeLists.txt b/CMakeLists.txt
index d5c1ee2..0b54a72 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -258,6 +258,7 @@
 add_subdirectory(abc)
 add_subdirectory(ODIN_II)
 add_subdirectory(ace2)
+add_subdirectory(utils)
 if(WITH_BLIFEXPLORER)
     add_subdirectory(blifexplorer)
 endif()
diff --git a/doc/src/index.rst b/doc/src/index.rst
index 907e80f..63a2e08 100644
--- a/doc/src/index.rst
+++ b/doc/src/index.rst
@@ -22,6 +22,7 @@
    odin/index
    abc/index
    tutorials/index
+   utils/index
 
 .. toctree::
    :maxdepth: 2
diff --git a/doc/src/utils/fasm.rst b/doc/src/utils/fasm.rst
new file mode 100644
index 0000000..6b27ef4
--- /dev/null
+++ b/doc/src/utils/fasm.rst
@@ -0,0 +1,263 @@
+FPGA Assembly (FASM) Output Support
+===================================
+
+After VPR has generated a placed and routed design, the ``genfasm`` utility
+can emit a FASM_ file that represents the design via FASM metadata encoded in
+the VPR architecture definition and routing graph.  The output FASM file can
+then be converted into a bitstream via architecture-specific tooling.
+
+.. _FASM: https://github.com/SymbiFlow/fasm
+
+FASM metadata
+-------------
+
+The ``genfasm`` utility uses ``metadata`` blocks (see :ref:`arch_metadata`)
+attached to the architecture definition and routing graph to emit FASM
+features.  By adding FASM-specific metadata to both the architecture
+definition and the routing graph, a FASM file that represents the placed and
+routed design can be generated.
+
+All metadata tags are ignored when packing, placing and routing.  After VPR
+has completed place and route, the ``genfasm`` utility loads the VPR output
+files (.net, .place, .route) and then uses the FASM metadata to emit a FASM
+file.  The following metadata "keys" are recognized by ``genfasm``:
+
+ * "fasm_prefix"
+ * "fasm_features"
+ * "fasm_type" and "fasm_lut"
+ * "fasm_mux"
+ * "fasm_params"
+
+Invoking genfasm
+----------------
+
+``genfasm`` expects that place and route on the design is completed (e.g.
+.net, .place, .route files are present), so ensure that routing is complete
+before executing ``genfasm``.  ``genfasm`` should be invoked in the same
+subdirectory as the routing output.  The output FASM file will be written to
+``<blif root>.fasm``.
+
+FASM prefixing
+--------------
+
+FASM feature names have structure through their prefixes.  In general the first
+part of the FASM feature is the location of the feature, such as the name of
+the tile the feature is located in, e.g. INT_L_X5Y6 or CLBLL_L_X10Y12.  The
+next part is typically an identifier within the tile.  For example a CLBLL
+tile has two slices, so the next part of the FASM feature name is the slice
+identifier, e.g. SLICE_X0 or SLICE_X1.
+
+Now consider the CLBLL_L pb_type.  This pb_type is repeated in the grid for
+each tile of that type.  So that one pb_type definition can serve every such
+tile, the "fasm_prefix" metadata tag can be attached at the layout level on
+the <single> tag.  This enables the same pb_type to be used for all CLBLL_L
+tiles, and the "fasm_prefix" is prepended to all FASM metadata within that
+pb_type.  For example:
+
+.. code-block:: xml
+
+      <single priority="1" type="BLK_TI-CLBLL_L" x="35" y="51">
+        <metadata>
+          <meta name="fasm_prefix">CLBLL_L_X12Y100</meta>
+        </metadata>
+      </single>
+      <single priority="1" type="BLK_TI-CLBLL_L" x="35" y="50">
+        <metadata>
+          <meta name="fasm_prefix">CLBLL_L_X12Y101</meta>
+        </metadata>
+      </single>
+
+"fasm_prefix" tags can also be used within a pb_type to handle repeated
+features.  For example, the 4 LUTs in a CLB can be described by a common
+pb_type, except that the prefix changes for each.  Similarly, consider the
+FFs within a CLB: there are 8 FFs that share a common structure, except for
+a prefix change.  "fasm_prefix" can be a space separated list that assigns a
+prefix to each index of the pb_type, rather than needing to emit N copies of
+the pb_type with varying prefixes.
+
+.. code-block:: xml
+
+    <pb_type name="BEL_FF-FDSE_or_FDRE" num_pb="8">
+      <input  name="D" num_pins="1"/>
+      <input  name="CE" num_pins="1"/>
+      <clock  name="C" num_pins="1"/>
+      <input  name="SR" num_pins="1"/>
+      <output name="Q" num_pins="1"/>
+      <metadata>
+        <meta name="fasm_prefix">AFF BFF CFF DFF A5FF B5FF C5FF D5FF</meta>
+      </metadata>
+    </pb_type>
+
+Construction of the prefix
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+"fasm_prefix" is accumulated throughout the structure of the architecture
+definition.  Each "fasm_prefix" is joined together with a period ('.'), and
+then a period is added after the prefix before the FASM feature name.
+
+
+Simple FASM feature emissions
+-----------------------------
+
+In cases where a FASM feature needs to be emitted simply via use of a pb_type,
+the "fasm_features" tag can be used.  If the pb_type (or mode) is selected,
+then all "fasm_features" in the metadata will be emitted.  Multiple features
+can be listed, whitespace separated.  Example:
+
+.. code-block:: xml
+
+    <metadata>
+        <meta name="fasm_features">ZRST</meta>
+    </metadata>
+
+The other place that "fasm_features" is used heavily is on <edge> tags in the
+routing graph.  If an edge is used in the final routed design, "genfasm" will
+emit features attached to the edge.  Example:
+
+.. code-block:: xml
+
+    <edge sink_node="431195" src_node="418849" switch_id="0">
+      <metadata>
+        <meta name="fasm_features">HCLK_R_X58Y130.HCLK_LEAF_CLK_B_TOP4.HCLK_CK_BUFHCLK7 HCLK_R_X58Y130.ENABLE_BUFFER.HCLK_CK_BUFHCLK7</meta>
+      </metadata>
+    </edge>
+
+In this example, when the routing graph connects node 418849 to 431195, two
+FASM features will be emitted:
+
+ * ``HCLK_R_X58Y130.HCLK_LEAF_CLK_B_TOP4.HCLK_CK_BUFHCLK7``
+ * ``HCLK_R_X58Y130.ENABLE_BUFFER.HCLK_CK_BUFHCLK7``
+
+Emitting LUTs
+-------------
+
+LUTs are a structure that is explicitly understood by VPR.  In order to emit
+LUTs, two metadata keys must be used, "fasm_type" and "fasm_lut".  "fasm_type"
+must be either "LUT" or "SPLIT_LUT".  The "fasm_type" modifies how the
+"fasm_lut" key is interpreted.  If the pb_type that the metadata is attached
+to has no "num_pb" or "num_pb" equals 1, then "fasm_type" can be "LUT".
+"fasm_lut" is then the feature that represents the LUT table storage features,
+example:
+
+.. code-block:: xml
+
+   <metadata>
+     <meta name="fasm_type">LUT</meta>
+     <meta name="fasm_lut">
+       ALUT.INIT[63:0]
+     </meta>
+   </metadata>
+
+When specifying a FASM feature with more than one bit, explicitly specify the
+bit range being set.  This is required because "genfasm" does not have access
+to the actual bit database, and would otherwise not have the width of the
+feature.
+
+When "fasm_type" is "SPLIT_LUT", "fasm_lut" must specify both the feature that
+represents the LUT table storage features and the pb_type path to the LUT
+being specified.  Example:
+
+.. code-block:: xml
+
+   <metadata>
+     <meta name="fasm_type">SPLIT_LUT</meta>
+     <meta name="fasm_lut">
+       ALUT.INIT[31:0] = BEL_LT-A5LUT[0]
+       ALUT.INIT[63:32] = BEL_LT-A5LUT[1]
+     </meta>
+   </metadata>
+
+In this case, the LUT in pb_type BEL_LT-A5LUT[0] will use INIT[31:0], and the
+LUT in pb_type BEL_LT-A5LUT[1] will use INIT[63:32].
+
+Within tile interconnect features
+---------------------------------
+
+When a tile has an interconnect feature, e.g. output muxes, the "fasm_mux"
+tag should be attached to the interconnect tag, likely the ``<direct>`` or
+``<mux>`` tags.  From the perspective of genfasm, the ``<direct>`` and
+``<mux>`` tags are equivalent.  The "fasm_mux" value is a newline separated
+list of mappings from mux input wire names to FASM features, using ':' or '='.
+Example:
+
+.. code-block:: xml
+
+    <mux name="D5FFMUX" input="BLK_IG-COMMON_SLICE.DX BLK_IG-COMMON_SLICE.DO5" output="BLK_BB-SLICE_FF.D5[3]" >
+      <metadata>
+        <meta name="fasm_mux">
+          BLK_IG-COMMON_SLICE.DO5 : D5FFMUX.IN_A
+          BLK_IG-COMMON_SLICE.DX : D5FFMUX.IN_B
+        </meta>
+      </metadata>
+    </mux>
+
+The above mux connects input BLK_IG-COMMON_SLICE.DX or BLK_IG-COMMON_SLICE.DO5
+to BLK_BB-SLICE_FF.D5[3].  When VPR selects BLK_IG-COMMON_SLICE.DO5 for the
+mux, "genfasm" will emit D5FFMUX.IN_A, etc.
+
+There is no requirement that all inputs result in a feature being set.
+In cases where some mux selections result in no feature being set, use "NULL"
+as the feature name.  Example:
+
+.. code-block:: xml
+
+    <mux name="CARRY_DI3" input="BLK_IG-COMMON_SLICE.DO5 BLK_IG-COMMON_SLICE.DX" output="BEL_BB-CARRY[2].DI" >
+      <metadata>
+        <meta name="fasm_mux">
+          BLK_IG-COMMON_SLICE.DO5 : CARRY4.DCY0
+          BLK_IG-COMMON_SLICE.DX : NULL
+        </meta>
+      </metadata>
+    </mux>
+
+The above examples all used the ``<mux>`` tag.  The "fasm_mux" metadata key
+can also be used with the ``<direct>`` tag in the same way, example:
+
+.. code-block:: xml
+
+    <direct name="WA7"  input="BLK_IG-SLICEM.CX" output="BLK_IG-SLICEM_MODES.WA7">
+      <metadata>
+        <meta name="fasm_mux">
+          BLK_IG-SLICEM.CX = WA7USED
+        </meta>
+      </metadata>
+    </direct>
+
+Passing parameters through to the FASM Output
+---------------------------------------------
+
+In many cases there are parameters that need to be passed directly from the
+input :ref:`vpr_eblif_file` to the FASM file.  These can be passed into a FASM
+feature via the "fasm_params" key.  Note that care must be taken to attach the
+"fasm_params" metadata to the pb_type that the packer uses, i.e. the pb_type
+with the blif_model= ".subckt".
+
+The "fasm_params" value is a newline separated list of mappings from FASM
+features to eblif parameters. Example:
+
+.. code-block:: xml
+
+  <metadata>
+    <meta name="fasm_params">
+      INIT[31:0] = INIT_00
+      INIT[63:32] = INIT_01
+    </meta>
+  </metadata>
+
+The FASM feature is on the left hand side of the equals.  When setting a
+parameter with multiple bits, the bit range must be specified.  If the
+parameter is a single bit, the bit range is not required, but can be supplied
+for clarity.  The right hand side is the parameter name from eblif.  If the
+parameter name is not found in the eblif, that FASM feature will not be
+emitted.
+
+No errors or warnings will be generated for unused parameters from eblif or
+unused mappings between eblif parameters and FASM parameters to allow for
+flexibility in the synthesis output.  This does mean it is important to check
+the spelling of the metadata, and to create tests that verify the mapping is
+working as expected.
+
+Also note that "genfasm" will not accept "x" (unknown/don't care) or "z"
+(high impedance) values in parameters.  Prior to emitting the eblif for place
+and route, ensure that all parameters that will be mapped to FASM have a
+valid "1" or "0".
diff --git a/doc/src/utils/index.rst b/doc/src/utils/index.rst
new file mode 100644
index 0000000..fd804f8
--- /dev/null
+++ b/doc/src/utils/index.rst
@@ -0,0 +1,9 @@
+.. _utils:
+
+Utilities
+---------
+
+.. toctree::
+   :maxdepth: 2
+
+   fasm
diff --git a/utils/CMakeLists.txt b/utils/CMakeLists.txt
new file mode 100644
index 0000000..fbde764
--- /dev/null
+++ b/utils/CMakeLists.txt
@@ -0,0 +1 @@
+add_subdirectory(fasm)
diff --git a/utils/fasm/CMakeLists.txt b/utils/fasm/CMakeLists.txt
new file mode 100644
index 0000000..1280304
--- /dev/null
+++ b/utils/fasm/CMakeLists.txt
@@ -0,0 +1,51 @@
+cmake_minimum_required(VERSION 2.8.12)
+
+project("genfasm")
+
+#Create library
+add_library(fasm
+  src/fasm.cpp
+  src/fasm.h
+  src/lut.cpp
+  src/lut.h
+  src/parameters.cpp
+  src/parameters.h
+  src/fasm_utils.cpp
+  src/fasm_utils.h
+  )
+target_include_directories(fasm PUBLIC src)
+target_link_libraries(fasm
+  libvpr
+  libvtrutil
+  libarchfpga
+  libsdcparse
+  libblifparse
+  libeasygl
+  libtatum
+  libargparse
+  libpugixml)
+
+add_executable(genfasm src/main.cpp)
+target_link_libraries(genfasm fasm)
+
+#Specify link-time dependencies
+install(TARGETS genfasm DESTINATION bin)
+
+#
+# Unit Tests
+#
+set(TEST_SOURCES
+  test/main.cpp
+  test/test_fasm.cpp
+  test/test_lut.cpp
+  test/test_parameters.cpp
+  test/test_utils.cpp
+  )
+add_executable(test_fasm ${TEST_SOURCES})
+target_link_libraries(test_fasm fasm libcatch)
+
+add_test(
+  NAME test_fasm
+  COMMAND test_fasm --use-colour=yes
+  WORKING_DIRECTORY ${CMAKE_CURRENT_SOURCE_DIR}/test
+  )
diff --git a/utils/fasm/src/fasm.cpp b/utils/fasm/src/fasm.cpp
new file mode 100644
index 0000000..adcaaa5
--- /dev/null
+++ b/utils/fasm/src/fasm.cpp
@@ -0,0 +1,612 @@
+#include "fasm.h"
+
+#include <algorithm>
+#include <fstream>
+#include <iomanip>
+#include <iostream>
+#include <iterator>
+#include <set>
+#include <sstream>
+#include <string>
+#include <unordered_map>
+
+#include "globals.h"
+
+#include "rr_metadata.h"
+
+#include "vtr_assert.h"
+#include "vtr_logic.h"
+#include "vtr_version.h"
+#include "vpr_error.h"
+
+#include "atom_netlist_utils.h"
+#include "netlist_writer.h"
+#include "vpr_utils.h"
+
+#include "fasm_utils.h"
+
+namespace fasm {
+
+FasmWriterVisitor::FasmWriterVisitor(std::ostream& f) : os_(f) {}
+
+void FasmWriterVisitor::visit_top_impl(const char* top_level_name) {
+    (void)top_level_name;
+    auto& device_ctx = g_vpr_ctx.device();
+    pb_graph_pin_lookup_from_index_by_type_.resize(device_ctx.num_block_types);
+    for(int itype = 0; itype < device_ctx.num_block_types; itype++) {
+        pb_graph_pin_lookup_from_index_by_type_.at(itype) = alloc_and_load_pb_graph_pin_lookup_from_index(&device_ctx.block_types[itype]);
+    }
+}
+
+void FasmWriterVisitor::visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb) {
+    auto& place_ctx = g_vpr_ctx.placement();
+    auto& device_ctx = g_vpr_ctx.device();
+
+    current_blk_id_ = blk_id;
+
+    VTR_ASSERT(clb->pb_graph_node != nullptr);
+    VTR_ASSERT(clb->pb_graph_node->pb_type);
+
+    root_clb_ = clb->pb_graph_node;
+
+    int x = place_ctx.block_locs[blk_id].x;
+    int y = place_ctx.block_locs[blk_id].y;
+    int z = place_ctx.block_locs[blk_id].z;
+    auto &grid_loc = device_ctx.grid[x][y];
+    blk_type_ = grid_loc.type;
+
+    current_blk_has_prefix_ = true;
+    std::string grid_prefix;
+    if(grid_loc.meta != nullptr && grid_loc.meta->has("fasm_prefix")) {
+      std::string prefix_unsplit = grid_loc.meta->get("fasm_prefix")->front().as_string();
+      std::vector<std::string> fasm_prefixes = vtr::split(prefix_unsplit, " \t\n");
+      if(fasm_prefixes.size() != static_cast<size_t>(blk_type_->capacity)) {
+        vpr_throw(VPR_ERROR_OTHER,
+                  __FILE__, __LINE__,
+                  "number of fasm_prefix (%s) options (%zu) for block (%s) must match capacity(%d)",
+                  prefix_unsplit.c_str(), fasm_prefixes.size(), blk_type_->name, blk_type_->capacity);
+      }
+      grid_prefix = fasm_prefixes[z];
+    } else {
+      current_blk_has_prefix_= false;
+    }
+
+    if(current_blk_has_prefix_) {
+      blk_prefix_ = grid_prefix + ".";
+    }
+}
+
+void FasmWriterVisitor::check_interconnect(const t_pb_routes &pb_routes, int inode) {
+  auto iter = pb_routes.find(inode);
+  if(iter == pb_routes.end() || !iter->second.atom_net_id) {
+    // Net is open.
+    return;
+  }
+
+  /* No previous driver implies that this is either a top-level input pin
+    * or a primitive output pin */
+  int prev_node = iter->second.driver_pb_pin_id;
+  if(prev_node == OPEN) {
+    return;
+  }
+
+  t_pb_graph_pin *prev_pin = pb_graph_pin_lookup_from_index_by_type_.at(blk_type_->index)[prev_node];
+
+  int prev_edge;
+  for(prev_edge = 0; prev_edge < prev_pin->num_output_edges; prev_edge++) {
+      VTR_ASSERT(prev_pin->output_edges[prev_edge]->num_output_pins == 1);
+      if(prev_pin->output_edges[prev_edge]->output_pins[0]->pin_count_in_cluster == inode) {
+          break;
+      }
+  }
+  VTR_ASSERT(prev_edge < prev_pin->num_output_edges);
+
+  auto *interconnect = prev_pin->output_edges[prev_edge]->interconnect;
+  if(interconnect->meta.has("fasm_mux")) {
+    std::string fasm_mux = interconnect->meta.get("fasm_mux")->front().as_string();
+    output_fasm_mux(fasm_mux, interconnect, prev_pin);
+  }
+}
+
+std::string FasmWriterVisitor::build_clb_prefix(const t_pb_graph_node* pb_graph_node) const {
+  std::string clb_prefix = "";
+
+  if(root_clb_ != pb_graph_node && pb_graph_node->parent_pb_graph_node != root_clb_) {
+    VTR_ASSERT(pb_graph_node->parent_pb_graph_node != nullptr);
+    clb_prefix = build_clb_prefix(pb_graph_node->parent_pb_graph_node);
+  }
+
+  const auto *pb_type = pb_graph_node->pb_type;
+  if(!pb_type->meta.has("fasm_prefix")) {
+    return clb_prefix;
+  }
+
+  auto fasm_prefix_unsplit = pb_type->meta.one("fasm_prefix")->as_string();
+  auto fasm_prefix = vtr::split(fasm_prefix_unsplit, " \t\n");
+  VTR_ASSERT(pb_type->num_pb >= 0);
+  if(fasm_prefix.size() != static_cast<size_t>(pb_type->num_pb)) {
+    vpr_throw(VPR_ERROR_OTHER,
+              __FILE__, __LINE__,
+              "number of fasm_prefix (%s) options (%zu) for pb_type (%s) must match num_pb (%d)",
+              fasm_prefix_unsplit.c_str(), fasm_prefix.size(), pb_type->name, pb_type->num_pb);
+  }
+
+  if(pb_graph_node->placement_index >= pb_type->num_pb) {
+    vpr_throw(VPR_ERROR_OTHER,
+              __FILE__, __LINE__,
+              "pb_graph_node->placement_index = %d >= pb_type->num_pb = %d",
+              pb_graph_node->placement_index, pb_type->num_pb);
+  }
+
+  return clb_prefix + fasm_prefix.at(pb_graph_node->placement_index) + ".";
+
+}
+
+static const t_pb_graph_pin* is_node_used(const t_pb_routes &top_pb_route, const t_pb_graph_node* pb_graph_node) {
+    // Is the node used at all?
+    const t_pb_graph_pin* pin = nullptr;
+    for(int port_index = 0; port_index < pb_graph_node->num_output_ports; ++port_index) {
+        for(int pin_index = 0; pin_index < pb_graph_node->num_output_pins[port_index]; ++pin_index) {
+            pin = &pb_graph_node->output_pins[port_index][pin_index];
+            if (top_pb_route.count(pin->pin_count_in_cluster) > 0 && top_pb_route[pin->pin_count_in_cluster].atom_net_id != AtomNetId::INVALID()) {
+                return pin;
+            }
+        }
+    }
+    for(int port_index = 0; port_index < pb_graph_node->num_input_ports; ++port_index) {
+        for(int pin_index = 0; pin_index < pb_graph_node->num_input_pins[port_index]; ++pin_index) {
+            pin = &pb_graph_node->input_pins[port_index][pin_index];
+            if (top_pb_route.count(pin->pin_count_in_cluster) > 0 && top_pb_route[pin->pin_count_in_cluster].atom_net_id != AtomNetId::INVALID()) {
+                return pin;
+            }
+        }
+    }
+    return nullptr;
+}
+
+void FasmWriterVisitor::check_features(const t_metadata_dict *meta) const {
+  if(meta == nullptr) {
+    return;
+  }
+
+  if(!meta->has("fasm_features")) {
+    return;
+  }
+
+  output_fasm_features(meta->one("fasm_features")->as_string());
+}
+
+void FasmWriterVisitor::visit_all_impl(const t_pb_routes &pb_routes, const t_pb* pb) {
+  VTR_ASSERT(pb != nullptr);
+  VTR_ASSERT(pb->pb_graph_node != nullptr);
+
+  const t_pb_graph_node *pb_graph_node = pb->pb_graph_node;
+
+  clb_prefix_ = build_clb_prefix(pb_graph_node);
+
+  t_pb_type *pb_type = pb_graph_node->pb_type;
+  auto *mode = &pb_type->modes[pb->mode];
+
+  check_features(&pb_type->meta);
+  if(mode != nullptr) {
+    check_features(&mode->meta);
+  }
+
+  if(mode != nullptr && std::string(mode->name) == "wire") {
+      auto io_pin = is_node_used(pb_routes, pb_graph_node);
+      if(io_pin != nullptr) {
+        const auto& route = pb_routes.at(io_pin->pin_count_in_cluster);
+        const int num_inputs = *route.pb_graph_pin->parent_node->num_input_pins;
+        const auto *lut_definition = find_lut(route.pb_graph_pin->parent_node);
+        VTR_ASSERT(lut_definition->num_inputs == num_inputs);
+
+        output_fasm_features(lut_definition->CreateWire(route.pb_graph_pin->pin_number));
+      }
+  }
+
+  int port_index = 0;
+  for (int i = 0; i < pb_type->num_ports; i++) {
+    if (pb_type->ports[i].is_clock || pb_type->ports[i].type != IN_PORT) {
+      continue;
+    }
+    for (int j = 0; j < pb_type->ports[i].num_pins; j++) {
+      int inode = pb->pb_graph_node->input_pins[port_index][j].pin_count_in_cluster;
+      check_interconnect(pb_routes, inode);
+    }
+    port_index += 1;
+  }
+  port_index = 0;
+  for (int i = 0; i < pb_type->num_ports; i++) {
+    if (pb_type->ports[i].type != OUT_PORT) {
+      continue;
+    }
+    for (int j = 0; j < pb_type->ports[i].num_pins; j++) {
+      int inode = pb->pb_graph_node->output_pins[port_index][j].pin_count_in_cluster;
+      check_interconnect(pb_routes, inode);
+    }
+    port_index += 1;
+  }
+}
+
+void FasmWriterVisitor::visit_route_through_impl(const t_pb* atom) {
+  check_for_lut(atom);
+  check_for_param(atom);
+}
+
+static AtomNetId _find_atom_input_logical_net(const t_pb* atom, const t_pb_routes &pb_route, int atom_input_idx) {
+    const t_pb_graph_node* pb_node = atom->pb_graph_node;
+    const int cluster_pin_idx = pb_node->input_pins[0][atom_input_idx].pin_count_in_cluster;
+    if(pb_route.count(cluster_pin_idx) > 0) {
+        return pb_route[cluster_pin_idx].atom_net_id;
+    } else {
+        return AtomNetId::INVALID();
+    }
+}
+
+static LogicVec lut_outputs(const t_pb* atom_pb, size_t num_inputs, const t_pb_routes &pb_route) {
+    auto& atom_ctx = g_vpr_ctx.atom();
+    AtomBlockId block_id = atom_ctx.lookup.pb_atom(atom_pb);
+    const auto& truth_table = atom_ctx.nlist.block_truth_table(block_id);
+    auto ports = atom_ctx.nlist.block_input_ports(atom_ctx.lookup.pb_atom(atom_pb));
+
+    const t_pb_graph_node* gnode = atom_pb->pb_graph_node;
+
+    if(ports.size() != 1) {
+      if(ports.size() != 0) {
+        vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__, "LUT port unexpected size is %d", ports.size());
+      }
+
+      Lut lut(num_inputs);
+      if(truth_table.size() == 1) {
+        VTR_ASSERT(truth_table[0].size() == 1);
+        lut.SetConstant(truth_table[0][0]);
+      } else if(truth_table.size() == 0) {
+        lut.SetConstant(vtr::LogicValue::FALSE);
+      } else {
+        vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__, "LUT truth table unexpected size is %zu", truth_table.size());
+      }
+
+      return lut.table();
+    }
+
+    VTR_ASSERT(gnode->num_input_ports == 1);
+    //VTR_ASSERT(gnode->num_input_pins[0] >= num_inputs);
+    std::vector<vtr::LogicValue> inputs(num_inputs, vtr::LogicValue::DONT_CARE);
+    std::vector<int> permutation(num_inputs, -1);
+
+    AtomPortId port_id = *ports.begin();
+
+    for(size_t ipin = 0; ipin < num_inputs; ++ipin) {
+        //The net currently connected to input ipin
+        const AtomNetId impl_input_net_id = _find_atom_input_logical_net(atom_pb, pb_route, ipin);
+
+        //Find the original pin index
+        const t_pb_graph_pin* gpin = &gnode->input_pins[0][ipin];
+        const BitIndex orig_index = atom_pb->atom_pin_bit_index(gpin);
+
+        if(impl_input_net_id) {
+            //If there is a valid net connected in the implementation
+            AtomNetId logical_net_id = atom_ctx.nlist.port_net(port_id, orig_index);
+            VTR_ASSERT(impl_input_net_id == logical_net_id);
+
+            //Mark the permutation.
+            //  The net originally located at orig_index in the atom netlist
+            //  was moved to ipin in the implementation
+            permutation[orig_index] = ipin;
+        }
+    }
+
+    // truth_table is a vector of truth table rows.  The last value in each row is
+    // the output value, and the other values are the input values.  Open pins
+    // are don't cares, and should be -1 in the permutation table.
+    Lut lut(num_inputs);
+    for(const auto& row : truth_table) {
+        std::fill(std::begin(inputs), std::end(inputs), vtr::LogicValue::DONT_CARE);
+        for(size_t i = 0; i < row.size() - 1; i++) {
+            int permuted_idx = permutation[i];
+            if(permuted_idx != -1) {
+              inputs[permuted_idx] = row[i];
+            }
+        }
+
+        lut.SetOutput(inputs, row[row.size() - 1]);
+    }
+
+    return lut.table();
+}
+
+static const t_metadata_dict *get_fasm_type(const t_pb_graph_node* pb_graph_node, std::string target_type) {
+  if(pb_graph_node == nullptr) {
+    return nullptr;
+  }
+
+  if(pb_graph_node->pb_type == nullptr) {
+    return nullptr;
+  }
+
+  t_metadata_dict *meta = nullptr;
+  if(pb_graph_node->pb_type->meta.has("fasm_type")) {
+    meta = &pb_graph_node->pb_type->meta;
+  }
+
+  if(pb_graph_node->pb_type->parent_mode != nullptr &&
+     pb_graph_node->pb_type->parent_mode->meta.has("fasm_type")) {
+    meta = &pb_graph_node->pb_type->parent_mode->meta;
+  }
+
+  if(meta != nullptr && meta->one("fasm_type")->as_string() == target_type) {
+    return meta;
+  }
+
+  return nullptr;
+}
+
+const LutOutputDefinition* FasmWriterVisitor::find_lut(const t_pb_graph_node* pb_graph_node) {
+  while(pb_graph_node != nullptr) {
+    VTR_ASSERT(pb_graph_node->pb_type != nullptr);
+
+    auto iter = lut_definitions_.find(pb_graph_node->pb_type);
+    if(iter == lut_definitions_.end()) {
+      const t_metadata_dict *meta = get_fasm_type(pb_graph_node, "LUT");
+      if(meta != nullptr) {
+        VTR_ASSERT(meta->has("fasm_lut"));
+        std::vector<std::pair<std::string, LutOutputDefinition>> luts;
+        luts.push_back(std::make_pair(
+            vtr::string_fmt("%s[0]", pb_graph_node->pb_type->name),
+            LutOutputDefinition(meta->one("fasm_lut")->as_string())));
+
+        auto insert_result = lut_definitions_.insert(
+            std::make_pair(pb_graph_node->pb_type, luts));
+        VTR_ASSERT(insert_result.second);
+        iter = insert_result.first;
+      }
+
+      meta = get_fasm_type(pb_graph_node, "SPLIT_LUT");
+      if(meta != nullptr) {
+        VTR_ASSERT(meta->has("fasm_lut"));
+        std::string fasm_lut = meta->one("fasm_lut")->as_string();
+        auto lut_parts = split_fasm_entry(fasm_lut, "\n", "\t ");
+        if(__builtin_popcount(lut_parts.size()) != 1) {
+          vpr_throw(VPR_ERROR_OTHER,
+                    __FILE__, __LINE__,
+                    "Number of lut splits must be power of two, found %zu parts",
+                    lut_parts.size());
+        }
+
+        std::vector<std::pair<std::string, LutOutputDefinition>> luts;
+        luts.reserve(lut_parts.size());
+        for(const auto &part : lut_parts) {
+          auto parts = vtr::split(part, "=");
+          if(parts.size() != 2) {
+            vpr_throw(VPR_ERROR_OTHER,
+                      __FILE__, __LINE__,
+                      "Split lut definition fasm_lut = %s does not parse.",
+                      fasm_lut.c_str());
+          }
+
+          luts.push_back(std::make_pair(
+              parts[1], LutOutputDefinition(parts[0])));
+        }
+
+        auto insert_result = lut_definitions_.insert(
+            std::make_pair(pb_graph_node->pb_type,
+                           luts));
+        VTR_ASSERT(insert_result.second);
+        iter = insert_result.first;
+      }
+    }
+
+    if(iter != lut_definitions_.end()) {
+      auto string_at_node = vtr::string_fmt("%s[%d]", pb_graph_node->pb_type->name, pb_graph_node->placement_index);
+      for(const auto &lut : iter->second) {
+        if(lut.first == string_at_node) {
+          return &lut.second;
+        }
+      }
+    }
+
+    pb_graph_node = pb_graph_node->parent_pb_graph_node;
+  }
+
+  vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__,
+            "Failed to find LUT output definition.");
+  return nullptr;
+}
+
+static const t_pb_routes &find_pb_route(const t_pb* pb) {
+  const t_pb* parent = pb->parent_pb;
+  while(parent != nullptr) {
+    pb = parent;
+    parent = pb->parent_pb;
+  }
+  return pb->pb_route;
+}
+
+void FasmWriterVisitor::check_for_param(const t_pb *atom) {
+    auto& atom_ctx = g_vpr_ctx.atom();
+
+    auto atom_blk_id = atom_ctx.lookup.pb_atom(atom);
+    if (atom_blk_id == AtomBlockId::INVALID()) {
+        return;
+    }
+
+    if(atom->pb_graph_node == nullptr ||
+       atom->pb_graph_node->pb_type == nullptr) {
+        return;
+    }
+
+    const auto *meta = &atom->pb_graph_node->pb_type->meta;
+    if(!meta->has("fasm_params")) {
+        return;
+    }
+
+    auto iter = parameters_.find(atom->pb_graph_node->pb_type);
+
+    if(iter == parameters_.end()) {
+        Parameters params;
+        std::string fasm_params = meta->one("fasm_params")->as_string();
+        for(const auto &param : vtr::split(fasm_params, "\n")) {
+            auto param_parts = split_fasm_entry(param, "=", "\t ");
+            if(param_parts.size() == 0) {
+                continue;
+            }
+            VTR_ASSERT(param_parts.size() == 2);
+
+            params.AddParameter(param_parts[1], param_parts[0]);
+        }
+
+        auto ret = parameters_.insert(std::make_pair(
+                    atom->pb_graph_node->pb_type,
+                    params));
+
+        VTR_ASSERT(ret.second);
+        iter = ret.first;
+    }
+
+    auto &params = iter->second;
+
+    for(auto param : atom_ctx.nlist.block_params(atom_blk_id)) {
+        auto feature = params.EmitFasmFeature(param.first, param.second);
+
+        if(feature.size() > 0) {
+            output_fasm_features(feature);
+        }
+    }
+}
+
+void FasmWriterVisitor::check_for_lut(const t_pb* atom) {
+    auto& atom_ctx = g_vpr_ctx.atom();
+
+    auto atom_blk_id = atom_ctx.lookup.pb_atom(atom);
+    if (atom_blk_id == AtomBlockId::INVALID()) {
+        return;
+    }
+
+    const t_model* model = atom_ctx.nlist.block_model(atom_blk_id);
+    if (model->name == std::string(MODEL_NAMES)) {
+      VTR_ASSERT(atom->pb_graph_node != nullptr);
+      const auto *lut_definition = find_lut(atom->pb_graph_node);
+      VTR_ASSERT(lut_definition->num_inputs == *atom->pb_graph_node->num_input_pins);
+
+      const t_pb_routes &pb_route = find_pb_route(atom);
+      LogicVec lut_mask = lut_outputs(atom, lut_definition->num_inputs, pb_route);
+      output_fasm_features(lut_definition->CreateInit(lut_mask));
+    }
+}
+
+void FasmWriterVisitor::visit_atom_impl(const t_pb* atom) {
+  check_for_lut(atom);
+  check_for_param(atom);
+}
+
+void FasmWriterVisitor::walk_routing() {
+    auto& route_ctx = g_vpr_ctx.routing();
+
+    for(const auto &trace : route_ctx.trace) {
+      const t_trace *head = trace.head;
+      while(head != nullptr) {
+        const t_trace *next = head->next;
+
+        if(next != nullptr) {
+          const auto next_inode = next->index;
+          auto *meta = vpr::rr_edge_metadata(head->index, next_inode, head->iswitch, "fasm_features");
+          if(meta != nullptr) {
+            current_blk_has_prefix_ = false;
+            output_fasm_features(meta->as_string());
+          }
+        }
+
+        head = next;
+      }
+    }
+}
+
+void FasmWriterVisitor::finish_impl() {
+    auto& device_ctx = g_vpr_ctx.device();
+    for(int itype = 0; itype < device_ctx.num_block_types; itype++) {
+        free_pb_graph_pin_lookup_from_index (pb_graph_pin_lookup_from_index_by_type_.at(itype));
+    }
+
+    walk_routing();
+}
+
+void FasmWriterVisitor::output_fasm_mux(std::string fasm_mux,
+                                        t_interconnect *interconnect,
+                                        t_pb_graph_pin *mux_input_pin) {
+    auto *pb_name = mux_input_pin->parent_node->pb_type->name;
+    auto pb_index = mux_input_pin->parent_node->placement_index;
+    auto *port_name = mux_input_pin->port->name;
+    auto pin_index = mux_input_pin->pin_number;
+    auto mux_inputs = vtr::split(fasm_mux, "\n");
+    for(const auto &mux_input : mux_inputs) {
+      auto mux_parts = split_fasm_entry(mux_input, "=:", "\t ");
+
+      if(mux_parts.size() == 0) {
+        // Swallow whitespace.
+        continue;
+      }
+
+      if(mux_parts.size() != 2) {
+        vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__,
+            "fasm_mux line %s does not have 2 parts, has %zu parts.\n",
+            mux_input.c_str(), mux_parts.size());
+      }
+
+      auto vtr_parts = vtr::split(mux_parts[0], ".");
+      if(vtr_parts.size() != 2) {
+        vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__,
+            "fasm_mux input %s is not of the form <pb_type>.<port>, has %zu parts.\n",
+            mux_parts[0].c_str(), vtr_parts.size());
+      }
+
+      std::string mux_pb_name;
+      int mux_pb_index;
+      parse_name_with_optional_index(vtr_parts[0], &mux_pb_name, &mux_pb_index);
+      std::string mux_port_name;
+      int mux_pin_index;
+      parse_name_with_optional_index(vtr_parts[1], &mux_port_name, &mux_pin_index);
+
+      bool root_level_connection = interconnect->parent_mode->parent_pb_type ==
+          mux_input_pin->parent_node->pb_type;
+
+      if(root_level_connection) {
+        // This connection is root level.  pb_index selects between
+        // pb_type_prefixes_, not on the mux input.
+        if(mux_pb_name == pb_name && mux_port_name == port_name && mux_pin_index == pin_index) {
+          if(mux_parts[1] != "NULL") {
+            output_fasm_features(mux_parts[1]);
+          }
+          return;
+        }
+      } else if(mux_pb_name == pb_name &&
+                mux_pb_index == pb_index &&
+                mux_port_name == port_name &&
+                mux_pin_index == pin_index) {
+        if(mux_parts[1] != "NULL") {
+          output_fasm_features(mux_parts[1]);
+        }
+        return;
+      }
+    }
+
+    vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__,
+        "fasm_mux %s[%d].%s[%d] found no matches in:\n%s\n",
+        pb_name, pb_index, port_name, pin_index, fasm_mux.c_str());
+}
+
+void FasmWriterVisitor::output_fasm_features(std::string features) const {
+  std::stringstream os(features);
+
+  while(os) {
+    std::string feature;
+    os >> feature;
+    if(os) {
+      if(current_blk_has_prefix_) {
+        os_ << blk_prefix_ << clb_prefix_;
+      }
+      os_ << feature << std::endl;
+    }
+  }
+}
+
+} // namespace fasm
diff --git a/utils/fasm/src/fasm.h b/utils/fasm/src/fasm.h
new file mode 100644
index 0000000..808f3da
--- /dev/null
+++ b/utils/fasm/src/fasm.h
@@ -0,0 +1,93 @@
+// Core of the FPGA assembly (FASM) output code.
+//
+// The netlist walker implements the core walking of the netlist and routing
+// graph with the intention of emitting the complete implemented design to the
+// output FASM file.  Once output, the FASM file is expected to completely
+// describe the features required to implement the design in a hardware
+// bitstream.
+#ifndef FASM_H
+#define FASM_H
+
+#include <iostream>
+#include <list>
+#include <map>
+#include <ostream>
+#include <set>
+#include <sstream>
+#include <string>
+#include <unordered_map>
+#include <vector>
+
+#include "netlist_walker.h"
+#include "netlist_writer.h"
+#include "lut.h"
+#include "parameters.h"
+
+namespace fasm {
+
+// FASM netlist visitor.
+//
+// This netlist visitor emits FASM features to the provided ostream using
+// FASM metadata tags.  See the FASM documentation in doc/src/utils/fasm.rst
+// for a complete list of tags.
+//
+// High level methodology:
+//  - At each top level pb_type, the root tile FASM prefix is checked.
+//  - At each pb_type, the CLB prefix is built by walking from the child to
+//    the root pb_type.
+//  - All LUTs are visited and emitted via the fasm_type/fasm_lut tags.
+//  - All route through wires are emitted via the same mechanism as LUTs.
+//  - All atoms (e.g. leaf pb_types) are visited and have their parameters
+//    emitted via fasm_params tags.
+//  - All pb_type interconnects are visited and are emitted via fasm_mux tags.
+//  - All pb_types and pb_type modes are checked for static feature emission
+//    via the fasm_features tag.
+//
+// Once the netlist visitor is done, in finish_impl, the routing graph is
+// walked and static features are emitted via the fasm_features tag.
+//
+// Note that FASM features are incrementally written to the ostream as they are
+// seen, so the order of FASM output is arbitrary based on the walk order of
+// the netlist and routing graph.
+class FasmWriterVisitor : public NetlistVisitor {
+
+  public:
+      FasmWriterVisitor(std::ostream& f);
+
+  private:
+      void visit_top_impl(const char* top_level_name) override;
+      void visit_route_through_impl(const t_pb* atom) override;
+      void visit_atom_impl(const t_pb* atom) override;
+      // clb in visit_clb_impl stands for complex logic block.
+      // visit_clb_impl is called on each top-level pb_type used in the design.
+      void visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb) override;
+      void visit_all_impl(const t_pb_routes &top_pb_route, const t_pb* pb) override;
+      void finish_impl() override;
+
+  private:
+      void output_fasm_features(std::string features) const;
+      void check_features(const t_metadata_dict *meta) const;
+      void check_interconnect(const t_pb_routes &pb_route, int inode);
+      void check_for_lut(const t_pb* atom);
+      void output_fasm_mux(std::string fasm_mux, t_interconnect *interconnect, t_pb_graph_pin *mux_input_pin);
+      void walk_routing();
+      std::string build_clb_prefix(const t_pb_graph_node* pb_graph_node) const;
+      const LutOutputDefinition* find_lut(const t_pb_graph_node* pb_graph_node);
+      void check_for_param(const t_pb *atom);
+
+      std::ostream& os_;
+
+      t_pb_graph_node *root_clb_;
+      bool current_blk_has_prefix_;
+      t_type_ptr blk_type_;
+      std::string blk_prefix_;
+      std::string clb_prefix_;
+      ClusterBlockId current_blk_id_;
+      std::vector<t_pb_graph_pin**> pb_graph_pin_lookup_from_index_by_type_;
+      std::map<const t_pb_type*, std::vector<std::pair<std::string, LutOutputDefinition>>> lut_definitions_;
+      std::map<const t_pb_type*, Parameters> parameters_;
+};
+
+} // namespace fasm
+
+#endif  // FASM_H
diff --git a/utils/fasm/src/fasm_utils.cpp b/utils/fasm/src/fasm_utils.cpp
new file mode 100644
index 0000000..fddb62a
--- /dev/null
+++ b/utils/fasm/src/fasm_utils.cpp
@@ -0,0 +1,33 @@
+#include "fasm_utils.h"
+#include "vpr_utils.h"
+
+namespace fasm {
+
+void parse_name_with_optional_index(const std::string in, std::string *name, int *index) {
+  auto in_parts = vtr::split(in, "[]");
+
+  if(in_parts.size() == 1) {
+    *name = in;
+    *index = 0;
+  } else if(in_parts.size() == 2) {
+    *name = in_parts[0];
+    *index = vtr::atoi(in_parts[1]);
+  } else {
+    vpr_throw(VPR_ERROR_OTHER, __FILE__, __LINE__,
+              "Cannot parse %s.", in.c_str());
+  }
+}
+
+std::vector<std::string> split_fasm_entry(std::string entry,
+                                                 std::string delims,
+                                                 std::string ignore) {
+  for (size_t ii=0; ii<entry.length(); ii++) {
+    while (ignore.find(entry[ii]) != std::string::npos) {
+      entry.erase(ii, 1);
+    }
+  }
+
+  return vtr::split(entry, delims);
+}
+
+} // namespace fasm
diff --git a/utils/fasm/src/fasm_utils.h b/utils/fasm/src/fasm_utils.h
new file mode 100644
index 0000000..99f82c7
--- /dev/null
+++ b/utils/fasm/src/fasm_utils.h
@@ -0,0 +1,27 @@
+#ifndef UTILS_FASM_FASM_UTILS_H_
+#define UTILS_FASM_FASM_UTILS_H_
+
+#include <string>
+#include <vector>
+
+namespace fasm {
+
+// Parse a port name that may have an index.
+//
+// in="A" parses to *name="A", *index=0
+// in="A[5]" parses to *name="A", *index=5
+//
+// Throws vpr exception if parsing fails.
+void parse_name_with_optional_index(const std::string in, std::string *name, int *index);
+
+// Split FASM entry into parts.
+//
+// delims - Characters to split on.
+// ignore - Characters to ignore.
+std::vector<std::string> split_fasm_entry(std::string entry,
+                                                 std::string delims,
+                                                 std::string ignore);
+
+} // namespace fasm
+
+#endif /* UTILS_FASM_FASM_UTILS_H_ */
diff --git a/utils/fasm/src/lut.cpp b/utils/fasm/src/lut.cpp
new file mode 100644
index 0000000..5356fdc
--- /dev/null
+++ b/utils/fasm/src/lut.cpp
@@ -0,0 +1,137 @@
+#include "lut.h"
+
+#include <sstream>
+
+#include "vtr_assert.h"
+#include "vpr_error.h"
+#include "vtr_util.h"
+
+namespace fasm {
+
+Lut::Lut(size_t num_inputs) : num_inputs_(num_inputs), table_(1 << num_inputs, vtr::LogicValue::DONT_CARE) {}
+
+void Lut::SetOutput(const std::vector<vtr::LogicValue> &inputs, vtr::LogicValue value) {
+  VTR_ASSERT(inputs.size() == num_inputs_);
+  std::vector<size_t> dont_care_inputs;
+  dont_care_inputs.reserve(num_inputs_);
+
+  for(size_t address = 0; address < table_.size(); ++address) {
+    bool match = true;
+    for(size_t input = 0; input < inputs.size(); ++input) {
+      if(inputs[input] == vtr::LogicValue::TRUE && (address & (1 << input)) == 0) {
+        match = false;
+        break;
+      } else if(inputs[input] == vtr::LogicValue::FALSE && (address & (1 << input)) != 0) {
+        match = false;
+        break;
+      }
+    }
+
+    if(match) {
+      VTR_ASSERT(table_[address] == vtr::LogicValue::DONT_CARE || table_[address] == value);
+      table_[address] = value;
+    }
+  }
+}
+
+void Lut::CreateWire(size_t input_pin) {
+  std::vector<vtr::LogicValue> inputs(num_inputs_, vtr::LogicValue::DONT_CARE);
+  inputs[input_pin] = vtr::LogicValue::FALSE;
+  SetOutput(inputs, vtr::LogicValue::FALSE);
+  inputs[input_pin] = vtr::LogicValue::TRUE;
+  SetOutput(inputs, vtr::LogicValue::TRUE);
+}
+
+void Lut::SetConstant(vtr::LogicValue value) {
+  std::vector<vtr::LogicValue> inputs(num_inputs_, vtr::LogicValue::DONT_CARE);
+  SetOutput(inputs, value);
+}
+
+const LogicVec & Lut::table() {
+  // Make sure the entire table is defined.
+  for(size_t address = 0; address < table_.size(); ++address) {
+    if(table_[address] == vtr::LogicValue::DONT_CARE) {
+      table_[address] = vtr::LogicValue::FALSE;
+    }
+  }
+
+  return table_;
+}
+
+LutOutputDefinition::LutOutputDefinition(std::string definition) {
+  // Parse LUT.INIT[63:0] into
+  // fasm_feature = LUT.INIT
+  // start_bit = 0
+  // end_bit = 63
+  // num_inputs = log2(end_bit-start_bit+1)
+
+  size_t slice_start = definition.find_first_of('[');
+  size_t slice = std::string::npos;
+  size_t slice_end = std::string::npos;
+
+  if(slice_start != std::string::npos) {
+    slice = definition.find_first_of(':', slice_start);
+  }
+  if(slice != std::string::npos) {
+    slice_end = definition.find_first_of(']');
+  }
+
+  if(slice_start == std::string::npos ||
+      slice == std::string::npos ||
+      slice_end == std::string::npos ||
+      slice_start+1 > slice-1 ||
+      slice+1 > slice_end-1) {
+    vpr_throw(
+        VPR_ERROR_OTHER, __FILE__, __LINE__,
+        "Could not parse LUT definition %s",
+        definition.c_str());
+  }
+
+  fasm_feature = definition.substr(0, slice_start);
+  std::string end_bit_str = definition.substr(slice_start+1, (slice-1)-(slice_start+1)+1);
+  std::string start_bit_str = definition.substr(slice+1, (slice_end-1)-(slice+1)+1);
+
+  end_bit = vtr::atoi(end_bit_str);
+  start_bit = vtr::atoi(start_bit_str);
+
+  int width = end_bit - start_bit + 1;
+
+  // If an exact power of two, only 1 bit will be set in width.
+  if(width < 0 || __builtin_popcount(width) != 1) {
+    vpr_throw(
+        VPR_ERROR_OTHER, __FILE__, __LINE__,
+        "Invalid LUT start_bit %d and end_bit %d, not a power of 2 width.",
+        start_bit, end_bit);
+  }
+
+  // For exact powers of 2, ctz (count trailing zeros) is log2(width).
+  num_inputs = __builtin_ctz(width);
+}
+
+std::string LutOutputDefinition::CreateWire(int input) const {
+  Lut lut(num_inputs);
+  lut.CreateWire(input);
+
+  return CreateInit(lut.table());
+}
+
+std::string LutOutputDefinition::CreateConstant(vtr::LogicValue value) const {
+  Lut lut(num_inputs);
+  lut.SetConstant(value);
+  return CreateInit(lut.table());
+}
+
+std::string LutOutputDefinition::CreateInit(const LogicVec & table) const {
+  if(table.size() != (1u << num_inputs)) {
+    vpr_throw(
+        VPR_ERROR_OTHER, __FILE__, __LINE__,
+        "LUT with %d inputs requires an INIT LogicVec of size %d, got %zu",
+        num_inputs, (1 << num_inputs), table.size());
+  }
+  std::stringstream ss;
+  ss << fasm_feature << "[" << end_bit << ":" << start_bit << "]=" << table;
+
+  return ss.str();
+}
+
+} // namespace fasm
diff --git a/utils/fasm/src/lut.h b/utils/fasm/src/lut.h
new file mode 100644
index 0000000..798accc
--- /dev/null
+++ b/utils/fasm/src/lut.h
@@ -0,0 +1,65 @@
+#ifndef LUT_H
+#define LUT_H
+
+#include "logic_vec.h"
+
+namespace fasm {
+
+// Utility class to create a LUT initialization.
+class Lut {
+ public:
+  // Initialize an LUT of a given number of inputs.
+  Lut(size_t num_inputs);
+
+  // SetOutput sets the lut to output value when the inputs match.
+  //
+  // By default the output from the LUT is always false.
+  void SetOutput(const std::vector<vtr::LogicValue> &inputs, vtr::LogicValue value);
+
+  // Create a wire from input_pin to LUT output.
+  // Also known as a route through LUT.
+  //
+  // input_pin must be less than num_inputs.
+  void CreateWire(size_t input_pin);
+
+  // Create a LUT with a constant output of value.
+  void SetConstant(vtr::LogicValue value);
+
+  // Return current LUT initialization.
+  const LogicVec & table();
+ private:
+  size_t num_inputs_;
+  LogicVec table_;
+};
+
+// Utility class that creates a FASM feature directive based on the FASM LUT definition.
+struct LutOutputDefinition {
+  // Definition should be of the format <feature>[<end_bit>:<start_bit>].
+  LutOutputDefinition(std::string definition);
+
+  // Return a FASM feature directive for a wire from the specified input to the output.
+  // Also known as a route through LUT.
+  std::string CreateWire(int input) const;
+
+  // Return a FASM feature directive for a constant LUT.
+  std::string CreateConstant(vtr::LogicValue value) const;
+
+  // Return a FASM feature directive for a LUT with the specified LUT initialization.
+  std::string CreateInit(const LogicVec & table) const;
+
+  // Base feature name.
+  std::string fasm_feature;
+
+  // Number of inputs to this LUT.
+  int num_inputs;
+
+  // First bit of the LUT INIT parameter.
+  int start_bit;
+
+  // Last bit of the LUT INIT parameter.
+  int end_bit;
+};
+
+} // namespace fasm
+
+#endif  // LUT_H
diff --git a/utils/fasm/src/main.cpp b/utils/fasm/src/main.cpp
new file mode 100644
index 0000000..dae4edf
--- /dev/null
+++ b/utils/fasm/src/main.cpp
@@ -0,0 +1,116 @@
+// Tool to output FASM from placed and routed design via metadata tagging.
+#include <cstdio>
+#include <cstring>
+#include <ctime>
+#include <fstream>
+using namespace std;
+
+#include "vtr_error.h"
+#include "vtr_memory.h"
+#include "vtr_log.h"
+
+#include "tatum/error.hpp"
+
+#include "vpr_error.h"
+#include "vpr_api.h"
+#include "vpr_signal_handler.h"
+
+#include "globals.h"
+
+#include "net_delay.h"
+#include "RoutingDelayCalculator.h"
+
+#include "fasm.h"
+
+/*
+ * Exit codes to signal success/failure to scripts
+ * calling vpr
+ */
+constexpr int SUCCESS_EXIT_CODE = 0; //Everything OK
+constexpr int ERROR_EXIT_CODE = 1; //Something went wrong internally
+constexpr int UNIMPLEMENTABLE_EXIT_CODE = 2; //Could not implement (e.g. unroutable)
+constexpr int INTERRUPTED_EXIT_CODE = 3; //VPR was interrupted by the user (e.g. SIGINT/Ctrl-C)
+
+/*
+ * Writes FASM file based on the netlist name by walking the netlist.
+ */
+static bool write_fasm() {
+  auto& atom_ctx = g_vpr_ctx.atom();
+
+  std::string fasm_filename = atom_ctx.nlist.netlist_name() + ".fasm";
+  vtr::printf("Writing Implementation FASM: %s\n", fasm_filename.c_str());
+  std::ofstream fasm_os(fasm_filename);
+  fasm::FasmWriterVisitor visitor(fasm_os);
+  NetlistWalker nl_walker(visitor);
+  nl_walker.walk();
+
+  return true;
+}
+
+/*
+ * Generate FASM utility.
+ *
+ * 1. Loads pack, place and route files
+ * 2. Walks netlist and outputs FASM.
+ * 3. Cleans up and exits.
+ *
+ */
+int main(int argc, const char **argv) {
+    t_options Options = t_options();
+    t_arch Arch = t_arch();
+    t_vpr_setup vpr_setup = t_vpr_setup();
+    clock_t entire_flow_begin, entire_flow_end;
+
+    entire_flow_begin = clock();
+
+    try {
+        vpr_install_signal_handler();
+
+        /* Read options, architecture, and circuit netlist */
+        vpr_init(argc, argv, &Options, &vpr_setup, &Arch);
+
+        vpr_setup.PackerOpts.doPacking    = STAGE_LOAD;
+        vpr_setup.PlacerOpts.doPlacement  = STAGE_LOAD;
+        vpr_setup.RouterOpts.doRouting    = STAGE_LOAD;
+        vpr_setup.AnalysisOpts.doAnalysis = STAGE_SKIP;
+
+        /* Load the packed, placed and routed design. */
+        bool flow_succeeded = vpr_flow(vpr_setup, Arch);
+
+        /* Write the output FASM file only if loading succeeded. */
+        flow_succeeded = flow_succeeded && write_fasm();
+        if (!flow_succeeded) {
+            return UNIMPLEMENTABLE_EXIT_CODE;
+        }
+
+        entire_flow_end = clock();
+
+        vtr::printf_info("The entire flow of VPR took %g seconds.\n",
+                (float) (entire_flow_end - entire_flow_begin) / CLOCKS_PER_SEC);
+
+        /* free data structures */
+        vpr_free_all(Arch, vpr_setup);
+
+    } catch (const tatum::Error& tatum_error) {
+        vtr::printf_error(__FILE__, __LINE__, "STA Engine: %s\n", tatum_error.what());
+
+        return ERROR_EXIT_CODE;
+
+    } catch (const VprError& vpr_error) {
+        vpr_print_error(vpr_error);
+
+        if (vpr_error.type() == VPR_ERROR_INTERRUPTED) {
+            return INTERRUPTED_EXIT_CODE;
+        } else {
+            return ERROR_EXIT_CODE;
+        }
+
+    } catch (const vtr::VtrError& vtr_error) {
+        vtr::printf_error(__FILE__, __LINE__, "%s:%d %s\n", vtr_error.filename_c_str(), vtr_error.line(), vtr_error.what());
+
+        return ERROR_EXIT_CODE;
+    }
+
+    /* Signal success to scripts */
+    return SUCCESS_EXIT_CODE;
+}
diff --git a/utils/fasm/src/parameters.cpp b/utils/fasm/src/parameters.cpp
new file mode 100644
index 0000000..1d81437
--- /dev/null
+++ b/utils/fasm/src/parameters.cpp
@@ -0,0 +1,62 @@
+#include "parameters.h"
+#include "vtr_assert.h"
+
+namespace fasm {
+
+void Parameters::AddParameter(const std::string &eblif_parameter, const std::string &fasm_feature) {
+    auto ret = features_.insert(std::make_pair(eblif_parameter, FeatureParameter()));
+
+    VTR_ASSERT(ret.second);
+
+    ret.first->second.feature = fasm_feature;
+    ret.first->second.width = FeatureWidth(fasm_feature);
+}
+
+std::string Parameters::EmitFasmFeature(const std::string &eblif_parameter, const std::string &value) {
+    auto iter = features_.find(eblif_parameter);
+    if(iter == features_.end()) {
+        return "";
+    }
+
+    // Parameter should have exactly the expected number of bits.
+    if(value.size() != iter->second.width) {
+        vpr_throw(VPR_ERROR_OTHER,
+                __FILE__, __LINE__, "When emitting FASM for parameter %s, expected width of %zu got width of %zu, value = \"%s\".",
+                eblif_parameter.c_str(), iter->second.width, value.size(), value.c_str());
+    }
+    VTR_ASSERT(value.size() == iter->second.width);
+
+    return vtr::string_fmt("%s=%zu'b%s", iter->second.feature.c_str(),
+            iter->second.width, value.c_str());
+}
+
+size_t Parameters::FeatureWidth(const std::string &feature) const {
+    size_t start_of_address = feature.rfind('[');
+    size_t end_of_address = feature.rfind(']');
+
+    if(start_of_address == std::string::npos) {
+        VTR_ASSERT(end_of_address == std::string::npos);
+        return 1;
+    }
+
+    VTR_ASSERT(end_of_address > start_of_address+1);
+
+    auto address = feature.substr(start_of_address+1, end_of_address - start_of_address - 1);
+
+    size_t address_split = address.find(':');
+
+    if(address_split == std::string::npos) {
+        return 1;
+    }
+
+    VTR_ASSERT(address_split > 0);
+    VTR_ASSERT(address_split+1 < address.size());
+
+    int high_slice = vtr::atoi(address.substr(0, address_split));
+    int low_slice = vtr::atoi(address.substr(address_split+1));
+    VTR_ASSERT(high_slice >= low_slice);
+
+    return high_slice - low_slice + 1;
+}
+
+} // namespace fasm
diff --git a/utils/fasm/src/parameters.h b/utils/fasm/src/parameters.h
new file mode 100644
index 0000000..65be708
--- /dev/null
+++ b/utils/fasm/src/parameters.h
@@ -0,0 +1,30 @@
+#ifndef PARAMETERS_H
+#define PARAMETERS_H
+
+#include "netlist_writer.h"
+
+namespace fasm {
+
+// Utility class to emit FASM features from eblif parameters.
+class Parameters {
+ public:
+  // Adds a mapping from an eblif parameter (e.g. .param <param> <value>)
+  // to a FASM feature.
+  void AddParameter(const std::string &eblif_parameter, const std::string &fasm_feature);
+
+  // Return a FASM feature directive for the given parameter and value.
+  std::string EmitFasmFeature(const std::string &eblif_parameter, const std::string &value);
+ private:
+  struct FeatureParameter {
+      size_t width;
+      std::string feature;
+  };
+
+  std::unordered_map<std::string, FeatureParameter> features_;
+
+  size_t FeatureWidth(const std::string &feature) const;
+};
+
+} // namespace fasm
+
+#endif  // PARAMETERS_H
diff --git a/utils/fasm/test/main.cpp b/utils/fasm/test/main.cpp
new file mode 100644
index 0000000..0c7c351
--- /dev/null
+++ b/utils/fasm/test/main.cpp
@@ -0,0 +1,2 @@
+#define CATCH_CONFIG_MAIN
+#include "catch.hpp"
diff --git a/utils/fasm/test/test_fasm.cpp b/utils/fasm/test/test_fasm.cpp
new file mode 100644
index 0000000..04b9e74
--- /dev/null
+++ b/utils/fasm/test/test_fasm.cpp
@@ -0,0 +1,145 @@
+#include "catch.hpp"
+
+#include "vpr_api.h"
+#include "vtr_util.h"
+#include "rr_metadata.h"
+#include "fasm.h"
+#include "arch_util.h"
+#include "rr_graph_writer.h"
+#include <sstream>
+
+static constexpr const char kArchFile[] = "test_fasm_arch.xml";
+static constexpr const char kRrGraphFile[] = "test_fasm_rrgraph.xml";
+
+namespace {
+
+using Catch::Matchers::Equals;
+
+TEST_CASE("fasm_integration_test", "[fasm]") {
+    {
+        t_vpr_setup vpr_setup;
+        t_arch arch;
+        t_options options;
+        const char *argv[] = {
+            "test_vpr",
+            kArchFile,
+            "wire.eblif",
+            "--route_chan_width",
+            "100",
+        };
+        vpr_init(sizeof(argv)/sizeof(argv[0]), argv,
+                &options, &vpr_setup, &arch);
+        bool flow_succeeded = vpr_flow(vpr_setup, arch);
+        REQUIRE(flow_succeeded == true);
+
+        auto &device_ctx = g_vpr_ctx.mutable_device();
+        for(int inode = 0; inode < device_ctx.rr_nodes.size(); ++inode) {
+            for(int iedge = 0; iedge < device_ctx.rr_nodes[inode].num_edges(); ++iedge) {
+                auto sink_inode = device_ctx.rr_nodes[inode].edge_sink_node(iedge);
+                auto switch_id = device_ctx.rr_nodes[inode].edge_switch(iedge);
+                vpr::add_rr_edge_metadata(inode, sink_inode, switch_id,
+                        "fasm_features", vtr::string_fmt("%d_%d_%d",
+                            inode, sink_inode, switch_id));
+            }
+        }
+
+        write_rr_graph(kRrGraphFile, vpr_setup.Segments);
+        vpr_free_all(arch, vpr_setup);
+    }
+
+    t_vpr_setup vpr_setup;
+    t_arch arch;
+    t_options options;
+    const char *argv[] = {
+        "test_vpr",
+        kArchFile,
+        "wire.eblif",
+        "--route_chan_width",
+        "100",
+        "--read_rr_graph",
+        kRrGraphFile,
+    };
+
+    vpr_init(sizeof(argv)/sizeof(argv[0]), argv,
+              &options, &vpr_setup, &arch);
+
+    vpr_setup.gen_netlist_as_blif     = false;
+    vpr_setup.PackerOpts.doPacking    = STAGE_LOAD;
+    vpr_setup.PlacerOpts.doPlacement  = STAGE_LOAD;
+    vpr_setup.RouterOpts.doRouting    = STAGE_LOAD;
+    vpr_setup.AnalysisOpts.doAnalysis = STAGE_SKIP;
+
+    bool flow_succeeded = vpr_flow(vpr_setup, arch);
+    REQUIRE(flow_succeeded == true);
+
+    std::stringstream fasm_string;
+    fasm::FasmWriterVisitor visitor(fasm_string);
+    NetlistWalker nl_walker(visitor);
+    nl_walker.walk();
+
+    fasm_string.seekg(0);
+
+    std::set<std::tuple<int, int, short>> routing_edges;
+    bool found_lut5 = false;
+    bool found_lut6 = false;
+    while(fasm_string) {
+        // Should see something like:
+        // CLB.FLE0.N2_LUT5
+        // CLB.FLE8.LUT5_1.LUT[31:0]=32'b00000000000000010000000000000000
+        // CLB.FLE9.OUT_MUX.LUT
+        // CLB.FLE9.LUT6[63:0]=64'b0000000000000000000000000000000100000000000000000000000000000000
+        // 3634_3690_0
+        std::string line;
+        std::getline(fasm_string, line);
+
+        if(line == "") {
+            continue;
+        }
+
+        if(line.find("CLB") != std::string::npos) {
+            auto pos = line.find("LUT[");
+            if(pos != std::string::npos) {
+                CHECK_THAT(line.substr(pos), Equals("LUT[31:0]=32'b00000000000000010000000000000000"));
+                found_lut5 = true;
+            }
+
+            pos = line.find("LUT6[");
+            if(pos != std::string::npos) {
+                CHECK_THAT(line.substr(pos), Equals("LUT6[63:0]=64'b0000000000000000000000000000000100000000000000000000000000000000"));
+                found_lut6 = true;
+            }
+        } else {
+            auto parts = vtr::split(line, "_");
+            REQUIRE(parts.size() == 3);
+            auto src_inode = vtr::atoi(parts[0]);
+            auto sink_inode = vtr::atoi(parts[1]);
+            auto switch_id = vtr::atoi(parts[2]);
+
+            auto ret = routing_edges.insert(std::make_tuple(src_inode, sink_inode, switch_id));
+            CHECK(ret.second == true);
+        }
+    }
+
+    const auto & route_ctx = g_vpr_ctx.routing();
+    for(const auto &trace : route_ctx.trace) {
+        const t_trace *head = trace.head;
+        while(head != nullptr) {
+            const t_trace *next = head->next;
+
+            if(next != nullptr) {
+                const auto next_inode = next->index;
+                auto iter = routing_edges.find(std::make_tuple(head->index, next_inode, head->iswitch));
+                CHECK(iter != routing_edges.end());
+            }
+
+            head = next;
+        }
+    }
+
+    CHECK(found_lut5);
+    CHECK(found_lut6);
+
+    vpr_free_all(arch, vpr_setup);
+}
+
+} // namespace
diff --git a/utils/fasm/test/test_fasm_arch.xml b/utils/fasm/test/test_fasm_arch.xml
new file mode 100644
index 0000000..ce1ba63
--- /dev/null
+++ b/utils/fasm/test/test_fasm_arch.xml
@@ -0,0 +1,274 @@
+<architecture>
+  <models/>
+
+  <layout>
+    <fixed_layout height="10" width="10" name="test" >
+      <perimeter type="io" priority="100">
+        <metadata>
+          <meta name="type">io</meta>
+        </metadata>
+      </perimeter>
+      <corners type="EMPTY" priority="101"/>
+      <fill type="clb" priority="10">
+        <metadata>
+          <meta name="fasm_prefix">CLB</meta>
+        </metadata>
+      </fill>
+      <col type="EMPTY" startx="6" repeatx="8" starty="1" priority="19"/>
+      <col type="EMPTY" startx="2" repeatx="8" starty="1" priority="19"/>
+    </fixed_layout>
+  </layout>
+
+  <device>
+    <sizing R_minW_nmos="8926" R_minW_pmos="16067"/>
+    <area grid_logic_tile_area="53894"/>
+    <chan_width_distr>
+      <x distr="uniform" peak="1.000000"/>
+      <y distr="uniform" peak="1.000000"/>
+    </chan_width_distr>
+    <switch_block type="wilton" fs="3"/>
+    <connection_block input_switch_name="ipin_cblock"/>
+  </device>
+  <switchlist>
+    <switch type="mux" name="0" R="551" Cin=".77e-15" Cout="4e-15" Tdel="58e-12" mux_trans_size="2.630740" buf_size="27.645901"/>
+    <switch type="mux" name="ipin_cblock" R="2231.5" Cout="0." Cin="1.47e-15" Tdel="7.247000e-11" mux_trans_size="1.222260" buf_size="auto"/>
+  </switchlist>
+  <segmentlist>
+    <segment freq="1.000000" length="4" type="unidir" Rmetal="101" Cmetal="22.5e-15">
+      <mux name="0"/>
+      <sb type="pattern">1 1 1 1 1</sb>
+      <cb type="pattern">1 1 1 1</cb>
+    </segment>
+  </segmentlist>
+
+  <complexblocklist>
+    <pb_type name="io" capacity="8">
+      <input name="outpad" num_pins="1"/>
+      <output name="inpad" num_pins="1"/>
+      <clock name="clock" num_pins="1"/>
+
+      <mode name="inpad">
+        <metadata>
+          <meta name="mode">inpad</meta>
+        </metadata>
+        <pb_type name="inpad" blif_model=".input" num_pb="1">
+          <output name="inpad" num_pins="1"/>
+        </pb_type>
+        <interconnect>
+          <direct name="inpad" input="inpad.inpad" output="io.inpad">
+            <delay_constant max="4.243e-11" in_port="inpad.inpad" out_port="io.inpad"/>
+            <metadata>
+              <meta name="interconnect">inpad_iconnect</meta>
+            </metadata>
+          </direct>
+        </interconnect>
+
+      </mode>
+      <mode name="outpad">
+        <pb_type name="outpad" blif_model=".output" num_pb="1">
+          <input name="outpad" num_pins="1"/>
+        </pb_type>
+        <interconnect>
+          <direct name="outpad" input="io.outpad" output="outpad.outpad">
+            <delay_constant max="1.394e-11" in_port="io.outpad" out_port="outpad.outpad"/>
+          </direct>
+        </interconnect>
+      </mode>
+
+      <!-- Every input pin is driven by 15% of the tracks in a channel, every output pin is driven by 10% of the tracks in a channel -->
+      <fc in_type="frac" in_val="0.15" out_type="frac" out_val="0.10"/>
+
+      <!-- IOs go on the periphery of the FPGA, for consistency,
+          make it physically equivalent on all sides so that only one definition of I/Os is needed.
+          If I do not make a physically equivalent definition, then I need to define 4 different I/Os, one for each side of the FPGA
+        -->
+      <pinlocations pattern="custom">
+        <loc side="left">io.outpad io.inpad io.clock</loc>
+        <loc side="top">io.outpad io.inpad io.clock</loc>
+        <loc side="right">io.outpad io.inpad io.clock</loc>
+        <loc side="bottom">io.outpad io.inpad io.clock</loc>
+      </pinlocations>
+
+      <!-- Place I/Os on the sides of the FPGA -->
+      <power method="ignore"/>
+    </pb_type>
+
+    <pb_type name="clb">
+      <input name="I" num_pins="33" equivalent="full"/>
+      <output name="O" num_pins="20" equivalent="none"/>
+      <clock name="clk" num_pins="1"/>
+      <pb_type name="fle" num_pb="10">
+        <input name="in" num_pins="6"/>
+        <output name="out" num_pins="2"/>
+        <clock name="clk" num_pins="1"/>
+
+        <mode name="n2_lut5">
+          <pb_type name="lut5inter" num_pb="1">
+            <input name="in" num_pins="5"/>
+            <output name="out" num_pins="2"/>
+            <clock name="clk" num_pins="1"/>
+            <pb_type name="ble5" num_pb="2">
+              <input name="in" num_pins="5"/>
+              <output name="out" num_pins="1"/>
+              <clock name="clk" num_pins="1"/>
+
+              <pb_type name="lut5" blif_model=".names" num_pb="1" class="lut">
+                <input name="in" num_pins="5" port_class="lut_in"/>
+                <output name="out" num_pins="1" port_class="lut_out"/>
+                <delay_matrix type="max" in_port="lut5.in" out_port="lut5.out">
+                  235e-12
+                  235e-12
+                  235e-12
+                  235e-12
+                  235e-12
+                </delay_matrix>
+                <metadata>
+                  <meta name="fasm_type">LUT</meta>
+                  <meta name="fasm_lut">
+                    LUT[31:0] = LUT
+                  </meta>
+                </metadata>
+              </pb_type>
+
+              <pb_type name="ff" blif_model=".latch" num_pb="1" class="flipflop">
+                <input name="D" num_pins="1" port_class="D"/>
+                <output name="Q" num_pins="1" port_class="Q"/>
+                <clock name="clk" num_pins="1" port_class="clock"/>
+                <T_setup value="66e-12" port="ff.D" clock="clk"/>
+                <T_clock_to_Q max="124e-12" port="ff.Q" clock="clk"/>
+              </pb_type>
+
+              <interconnect>
+                <direct name="direct1" input="ble5.in[4:0]" output="lut5[0:0].in[4:0]"/>
+                <direct name="direct2" input="lut5[0:0].out" output="ff[0:0].D">
+                  <!-- Advanced user option that tells CAD tool to find LUT+FF pairs in netlist -->
+                  <pack_pattern name="ble5" in_port="lut5[0:0].out" out_port="ff[0:0].D"/>
+                </direct>
+                <direct name="direct3" input="ble5.clk" output="ff[0:0].clk"/>
+                <mux name="mux1" input="ff[0:0].Q lut5.out[0:0]" output="ble5.out[0:0]">
+                  <!-- LUT to output is faster than FF to output on a Stratix IV -->
+                  <delay_constant max="25e-12" in_port="lut5.out[0:0]" out_port="ble5.out[0:0]"/>
+                  <delay_constant max="45e-12" in_port="ff[0:0].Q" out_port="ble5.out[0:0]"/>
+                  <meta name="fasm_mux">
+                    ff.Q : OUT_MUX.FFQ
+                    lut5.out : OUT_MUX.LUT
+                  </meta>
+                </mux>
+              </interconnect>
+              <metadata>
+                <meta name="fasm_prefix">
+                  LUT5_0 LUT5_1
+                </meta>
+              </metadata>
+            </pb_type>
+            <interconnect>
+              <direct name="direct1" input="lut5inter.in" output="ble5[0:0].in"/>
+              <direct name="direct2" input="lut5inter.in" output="ble5[1:1].in"/>
+              <direct name="direct3" input="ble5[1:0].out" output="lut5inter.out"/>
+              <complete name="complete1" input="lut5inter.clk" output="ble5[1:0].clk"/>
+            </interconnect>
+          </pb_type>
+          <interconnect>
+            <direct name="direct1" input="fle.in[4:0]" output="lut5inter.in"/>
+            <direct name="direct2" input="lut5inter.out" output="fle.out"/>
+            <direct name="direct3" input="fle.clk" output="lut5inter.clk"/>
+          </interconnect>
+          <metadata>
+            <meta name="fasm_features">
+              N2_LUT5
+            </meta>
+          </metadata>
+        </mode>
+        <mode name="n1_lut6">
+          <pb_type name="ble6" num_pb="1">
+            <input name="in" num_pins="6"/>
+            <output name="out" num_pins="1"/>
+            <clock name="clk" num_pins="1"/>
+
+            <pb_type name="lut6" blif_model=".names" num_pb="1" class="lut">
+              <input name="in" num_pins="6" port_class="lut_in"/>
+              <output name="out" num_pins="1" port_class="lut_out"/>
+              <delay_matrix type="max" in_port="lut6.in" out_port="lut6.out">
+                261e-12
+                261e-12
+                261e-12
+                261e-12
+                261e-12
+                261e-12
+              </delay_matrix>
+              <metadata>
+                <meta name="fasm_type">LUT</meta>
+                <meta name="fasm_lut">
+                  LUT6[63:0] = LUT
+                </meta>
+              </metadata>
+            </pb_type>
+
+            <!-- Define flip-flop -->
+            <pb_type name="ff" blif_model=".latch" num_pb="1" class="flipflop">
+              <input name="D" num_pins="1" port_class="D"/>
+              <output name="Q" num_pins="1" port_class="Q"/>
+              <clock name="clk" num_pins="1" port_class="clock"/>
+              <T_setup value="66e-12" port="ff.D" clock="clk"/>
+              <T_clock_to_Q max="124e-12" port="ff.Q" clock="clk"/>
+            </pb_type>
+
+            <interconnect>
+              <direct name="direct1" input="ble6.in" output="lut6[0:0].in"/>
+              <direct name="direct2" input="lut6.out" output="ff.D">
+                <pack_pattern name="ble6" in_port="lut6.out" out_port="ff.D"/>
+              </direct>
+              <direct name="direct3" input="ble6.clk" output="ff.clk"/>
+              <mux name="mux1" input="ff.Q lut6.out" output="ble6.out">
+                <delay_constant max="25e-12" in_port="lut6.out" out_port="ble6.out"/>
+                <delay_constant max="45e-12" in_port="ff.Q" out_port="ble6.out"/>
+                <metadata>
+                  <meta name="fasm_mux">
+                    ff.Q : OUT_MUX.FFQ
+                    lut6.out : OUT_MUX.LUT
+                  </meta>
+                </metadata>
+              </mux>
+            </interconnect>
+          </pb_type>
+          <interconnect>
+            <direct name="direct1" input="fle.in" output="ble6.in"/>
+            <direct name="direct2" input="ble6.out" output="fle.out[0:0]"/>
+            <direct name="direct3" input="fle.clk" output="ble6.clk"/>
+          </interconnect>
+          <metadata>
+            <meta name="fasm_features">
+              N1_LUT6
+            </meta>
+          </metadata>
+        </mode>
+        <metadata>
+          <meta name="fasm_prefix">
+            FLE0 FLE1 FLE2 FLE3 FLE4 FLE5 FLE6 FLE7 FLE8 FLE9
+          </meta>
+        </metadata>
+      </pb_type>
+      <interconnect>
+        <complete name="crossbar" input="clb.I fle[9:0].out" output="fle[9:0].in">
+          <delay_constant max="95e-12" in_port="clb.I" out_port="fle[9:0].in"/>
+          <delay_constant max="75e-12" in_port="fle[9:0].out" out_port="fle[9:0].in"/>
+        </complete>
+        <complete name="clks" input="clb.clk" output="fle[9:0].clk">
+        </complete>
+
+        <direct name="clbouts1" input="fle[9:0].out[0:0]" output="clb.O[9:0]"/>
+        <direct name="clbouts2" input="fle[9:0].out[1:1]" output="clb.O[19:10]"/>
+      </interconnect>
+
+      <fc in_type="frac" in_val="0.15" out_type="frac" out_val="0.10"/>
+
+      <pinlocations pattern="spread"/>
+    </pb_type>
+  </complexblocklist>
+  <power>
+    <local_interconnect C_wire="2.5e-10"/>
+  </power>
+  <clocks>
+    <clock buffer_size="auto" C_wire="2.5e-10"/>
+  </clocks>
+</architecture>
diff --git a/utils/fasm/test/test_lut.cpp b/utils/fasm/test/test_lut.cpp
new file mode 100644
index 0000000..4c64848
--- /dev/null
+++ b/utils/fasm/test/test_lut.cpp
@@ -0,0 +1,105 @@
+#include "catch.hpp"
+
+#include "lut.h"
+
+namespace {
+
+using Catch::Matchers::Equals;
+
+TEST_CASE("default_lut", "[fasm]") {
+    for(size_t num_inputs = 0; num_inputs < 10; ++num_inputs) {
+        fasm::Lut lut(num_inputs);
+
+        const LogicVec &table = lut.table();
+        CHECK(table.size() == (1 << num_inputs));
+
+        for(const vtr::LogicValue & value : table) {
+            CHECK(value == vtr::LogicValue::FALSE);
+        }
+    }
+}
+
+
+TEST_CASE("const_true", "[fasm]") {
+    for(size_t num_inputs = 0; num_inputs < 10; ++num_inputs) {
+        fasm::Lut lut(num_inputs);
+        lut.SetConstant(vtr::LogicValue::TRUE);
+
+        const LogicVec &table = lut.table();
+        CHECK(table.size() == (1 << num_inputs));
+
+        for(const vtr::LogicValue & value : table) {
+            CHECK(value == vtr::LogicValue::TRUE);
+        }
+    }
+}
+
+TEST_CASE("const_false", "[fasm]") {
+    for(size_t num_inputs = 0; num_inputs < 10; ++num_inputs) {
+        {
+            fasm::Lut lut(num_inputs);
+            lut.SetConstant(vtr::LogicValue::FALSE);
+
+            const LogicVec &table = lut.table();
+            CHECK(table.size() == (1 << num_inputs));
+
+            for(const vtr::LogicValue & value : table) {
+                CHECK(value == vtr::LogicValue::FALSE);
+            }
+        }
+    }
+}
+
+TEST_CASE("wire", "[fasm]") {
+    for(size_t num_inputs = 0; num_inputs < 10; ++num_inputs) {
+        for(size_t input_pin = 0; input_pin < num_inputs; ++input_pin) {
+            fasm::Lut lut(num_inputs);
+            lut.CreateWire(input_pin);
+
+            const LogicVec &table = lut.table();
+            CHECK(table.size() == (1 << num_inputs));
+            for(size_t i = 0; i < table.size(); ++i) {
+                if(((1 << input_pin) & i) != 0) {
+                    CHECK(table[i] == vtr::LogicValue::TRUE);
+                } else {
+                    CHECK(table[i] == vtr::LogicValue::FALSE);
+                }
+            }
+        }
+    }
+}
+
+TEST_CASE("lut_output", "[fasm]") {
+    fasm::LutOutputDefinition lut_def("TEST[31:0]");
+
+    CHECK_THAT(lut_def.fasm_feature, Equals("TEST"));
+    CHECK(lut_def.num_inputs == 5);
+    CHECK(lut_def.start_bit == 0);
+    CHECK(lut_def.end_bit == 31);
+
+    CHECK_THAT(lut_def.CreateConstant(vtr::LogicValue::TRUE), Equals("TEST[31:0]=32'b11111111111111111111111111111111"));
+    CHECK_THAT(lut_def.CreateConstant(vtr::LogicValue::FALSE), Equals("TEST[31:0]=32'b00000000000000000000000000000000"));
+    CHECK_THAT(lut_def.CreateWire(0), Equals("TEST[31:0]=32'b10101010101010101010101010101010"));
+    CHECK_THAT(lut_def.CreateWire(1), Equals("TEST[31:0]=32'b11001100110011001100110011001100"));
+}
+
+TEST_CASE("lut_table_output", "[fasm]") {
+    fasm::LutOutputDefinition lut_def("TEST[7:4]");
+
+    CHECK_THAT(lut_def.fasm_feature, Equals("TEST"));
+    CHECK(lut_def.num_inputs == 2);
+    CHECK(lut_def.start_bit == 4);
+    CHECK(lut_def.end_bit == 7);
+
+    LogicVec vec(4, vtr::LogicValue::FALSE);
+
+    CHECK_THAT(lut_def.CreateInit(vec), Equals("TEST[7:4]=4'b0000"));
+
+    vec[0] = vtr::LogicValue::TRUE;
+    CHECK_THAT(lut_def.CreateInit(vec), Equals("TEST[7:4]=4'b0001"));
+
+    vec[3] = vtr::LogicValue::TRUE;
+    CHECK_THAT(lut_def.CreateInit(vec), Equals("TEST[7:4]=4'b1001"));
+}
+
+} // namespace
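Note on the expected strings in the lut_output test: they follow from two rules the tests rely on. Entry i of a "wire" table is true exactly when bit input_pin of i is set, and the table is printed MSB first. The sketch below reproduces that with plain std::vector<bool>. The helper names make_wire_table and to_verilog_bits are hypothetical and are not part of fasm::Lut or fasm::LutOutputDefinition.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: truth table of a LUT that forwards `input_pin` to its
// output. Entry i is true exactly when bit `input_pin` of the index i is set.
std::vector<bool> make_wire_table(std::size_t num_inputs, std::size_t input_pin) {
    assert(input_pin < num_inputs);
    std::vector<bool> table(std::size_t(1) << num_inputs, false);
    for (std::size_t i = 0; i < table.size(); ++i) {
        table[i] = ((i >> input_pin) & 1u) != 0;
    }
    return table;
}

// Hypothetical helper: renders "<width>'b" followed by the bits, MSB first,
// i.e. the table is walked from its last entry down to entry 0.
std::string to_verilog_bits(const std::vector<bool>& table) {
    std::string out = std::to_string(table.size()) + "'b";
    for (auto it = table.rbegin(); it != table.rend(); ++it) {
        out += *it ? '1' : '0';
    }
    return out;
}
```

For example, to_verilog_bits(make_wire_table(5, 0)) yields the same 32-bit pattern that the test expects after "TEST[31:0]=" for CreateWire(0).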
diff --git a/utils/fasm/test/test_parameters.cpp b/utils/fasm/test/test_parameters.cpp
new file mode 100644
index 0000000..817f47f
--- /dev/null
+++ b/utils/fasm/test/test_parameters.cpp
@@ -0,0 +1,24 @@
+#include "catch.hpp"
+
+#include "parameters.h"
+
+namespace {
+
+using Catch::Matchers::Equals;
+
+TEST_CASE("parameters", "[fasm]") {
+    fasm::Parameters params;
+
+    params.AddParameter("A", "B");
+    params.AddParameter("INIT_0", "INIT[31:0]");
+    params.AddParameter("INIT_1", "INIT[63:32]");
+
+    // Unmatched parameters return an empty string.
+    CHECK_THAT(params.EmitFasmFeature("C", "0"), Equals(""));
+
+    CHECK_THAT(params.EmitFasmFeature("A", "0"), Equals("B=1'b0"));
+    CHECK_THAT(params.EmitFasmFeature("INIT_0", "10100000000000000000000000000001"), Equals("INIT[31:0]=32'b10100000000000000000000000000001"));
+    CHECK_THAT(params.EmitFasmFeature("INIT_1", "00010000000000000000000000001001"), Equals("INIT[63:32]=32'b00010000000000000000000000001001"));
+}
+
+} // namespace
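Note on the behavior the parameters test pins down: a parameter name maps to a FASM feature, the emitted width is the number of characters in the value string, and an unknown parameter yields an empty string. A minimal sketch consistent with those assertions follows. ParametersSketch is a hypothetical stand-in and may differ from the real fasm::Parameters in parameters.h.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for fasm::Parameters, written only from the
// behavior asserted in the test above.
class ParametersSketch {
  public:
    // Registers that `eblif_parameter` should be emitted as `fasm_feature`.
    void AddParameter(const std::string& eblif_parameter, const std::string& fasm_feature) {
        assert(!fasm_feature.empty());
        features_[eblif_parameter] = fasm_feature;
    }

    // Returns "<feature>=<width>'b<value>", or "" if the parameter is unknown.
    // The width is taken from the length of the binary value string.
    std::string EmitFasmFeature(const std::string& eblif_parameter, const std::string& value) const {
        const auto it = features_.find(eblif_parameter);
        if (it == features_.end()) {
            return "";
        }
        return it->second + "=" + std::to_string(value.size()) + "'b" + value;
    }

  private:
    std::map<std::string, std::string> features_;
};
```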
diff --git a/utils/fasm/test/test_utils.cpp b/utils/fasm/test/test_utils.cpp
new file mode 100644
index 0000000..3cf1d89
--- /dev/null
+++ b/utils/fasm/test/test_utils.cpp
@@ -0,0 +1,40 @@
+#include "catch.hpp"
+
+#include "fasm_utils.h"
+
+namespace fasm {
+namespace {
+
+using Catch::Matchers::Equals;
+using Catch::Matchers::Contains;
+
+TEST_CASE("parse_names", "[fasm]") {
+    std::string name;
+    int index;
+
+    parse_name_with_optional_index("A", &name, &index);
+    CHECK_THAT(name, Equals("A"));
+    CHECK(index == 0);
+
+    parse_name_with_optional_index("ABCD[500]", &name, &index);
+    CHECK_THAT(name, Equals("ABCD"));
+    CHECK(index == 500);
+}
+
+TEST_CASE("split_fasm_entry", "[fasm]") {
+    auto parts = split_fasm_entry("A B C", " ", "");
+    REQUIRE(parts.size() == 3);
+    CHECK_THAT(parts[0], Equals("A"));
+    CHECK_THAT(parts[1], Equals("B"));
+    CHECK_THAT(parts[2], Equals("C"));
+
+    parts = split_fasm_entry("A \nB\n\tC ", "\n", "\t ");
+
+    REQUIRE(parts.size() == 3);
+    CHECK_THAT(parts[0], Equals("A"));
+    CHECK_THAT(parts[1], Equals("B"));
+    CHECK_THAT(parts[2], Equals("C"));
+}
+
+} // namespace
+} // namespace fasm
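Note on parse_names: the test fixes two cases, a bare name (index defaults to 0) and a name followed by a bracketed decimal index. One way to get that behavior is sketched below. parse_name_sketch is a hypothetical re-implementation for illustration only; the real parse_name_with_optional_index in fasm_utils.h may handle malformed input differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: splits "NAME" or "NAME[idx]" into its parts.
// A missing "[idx]" suffix yields index 0, matching the test above.
void parse_name_sketch(const std::string& in, std::string* name, int* index) {
    assert(name != nullptr && index != nullptr);

    const auto open = in.find('[');
    if (open == std::string::npos) {
        *name = in;
        *index = 0;
        return;
    }

    // Assumption: a well-formed input has a closing bracket with at least
    // one digit between the brackets.
    const auto close = in.find(']', open);
    assert(close != std::string::npos && close > open + 1);

    *name = in.substr(0, open);
    *index = std::stoi(in.substr(open + 1, close - open - 1));
}
```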
diff --git a/utils/fasm/test/wire.eblif b/utils/fasm/test/wire.eblif
new file mode 100644
index 0000000..1f4c799
--- /dev/null
+++ b/utils/fasm/test/wire.eblif
@@ -0,0 +1,8 @@
+.model top
+.inputs di0 di1 di2 di3 di4 di5
+.outputs do0 do1
+.names di0 di1 di2 di3 di4 di5 do0
+000001 1
+.names di0 di1 di2 di3 di4 do1
+00010 1
+.end
diff --git a/vpr/src/base/logic_vec.cpp b/vpr/src/base/logic_vec.cpp
new file mode 100644
index 0000000..cdf6594
--- /dev/null
+++ b/vpr/src/base/logic_vec.cpp
@@ -0,0 +1,12 @@
+#include "logic_vec.h"
+#include "vtr_assert.h"
+
+//Output operator for vtr::LogicValue
+std::ostream& operator<<(std::ostream& os, vtr::LogicValue val) {
+    if(val == vtr::LogicValue::FALSE) os << "0";
+    else if (val == vtr::LogicValue::TRUE) os << "1";
+    else if (val == vtr::LogicValue::DONT_CARE) os << "-";
+    else if (val == vtr::LogicValue::UNKOWN) os << "x";
+    else VTR_ASSERT(false);
+    return os;
+}
diff --git a/vpr/src/base/logic_vec.h b/vpr/src/base/logic_vec.h
new file mode 100644
index 0000000..5e6741a
--- /dev/null
+++ b/vpr/src/base/logic_vec.h
@@ -0,0 +1,51 @@
+#ifndef LOGIC_VEC_H
+#define LOGIC_VEC_H
+
+#include <vector>
+#include <ostream>
+
+#include "vtr_logic.h"
+
+
+std::ostream& operator<<(std::ostream& os, vtr::LogicValue val);
+
+//A vector-like object containing logic values.
+class LogicVec {
+    public:
+        LogicVec() = default;
+        LogicVec(size_t size_val, //Number of logic values
+                 vtr::LogicValue init_value) //Default value
+            : values_(size_val, init_value)
+            {}
+        LogicVec(std::vector<vtr::LogicValue> values)
+            : values_(values) {}
+
+        //Array indexing operator
+        const vtr::LogicValue& operator[](size_t i) const { return values_[i]; }
+        vtr::LogicValue& operator[](size_t i) { return values_[i]; }
+
+        // Size accessor
+        size_t size() const { return values_.size(); }
+
+        //Output operator which writes the logic vector in verilog format
+        friend std::ostream& operator<<(std::ostream& os, const LogicVec& logic_vec) {
+            os << logic_vec.values_.size() << "'b";
+            //Print in reverse since the convention is MSB on the left, LSB on the right
+            //but we store things in array order (putting LSB on left, MSB on right)
+            for(auto iter = logic_vec.begin(); iter != logic_vec.end(); iter++) {
+                os << *iter;
+            }
+            return os;
+        }
+
+        //Standard iterators
+        std::vector<vtr::LogicValue>::reverse_iterator begin() { return values_.rbegin(); }
+        std::vector<vtr::LogicValue>::reverse_iterator end() { return values_.rend(); }
+        std::vector<vtr::LogicValue>::const_reverse_iterator begin() const { return values_.crbegin(); }
+        std::vector<vtr::LogicValue>::const_reverse_iterator end() const { return values_.crend(); }
+
+    private:
+        std::vector<vtr::LogicValue> values_; //The logic values
+};
+
+#endif
diff --git a/vpr/src/base/netlist_walker.cpp b/vpr/src/base/netlist_walker.cpp
index a54fde4..1648b2b 100644
--- a/vpr/src/base/netlist_walker.cpp
+++ b/vpr/src/base/netlist_walker.cpp
@@ -10,43 +10,48 @@
     visitor_.visit_top(atom_ctx.nlist.netlist_name().c_str());
 
     for(auto blk_id : cluster_ctx.clb_nlist.blocks()) {
+        const auto *pb = cluster_ctx.clb_nlist.block_pb(blk_id);
+
         //Visit the top-level block
-		visitor_.visit_clb(cluster_ctx.clb_nlist.block_pb(blk_id));
+        visitor_.visit_clb(blk_id, pb);
 
         //Visit all the block's primitives
-        walk_atoms(cluster_ctx.clb_nlist.block_pb(blk_id));
+        walk_blocks(pb->pb_route, pb);
     }
 
     visitor_.finish();
 }
 
-void NetlistWalker::walk_atoms(const t_pb* pb) {
-    //Recursively travers this pb calling visitor_.visit_atom()
-    //on any of its primitive pb's
+void NetlistWalker::walk_blocks(const t_pb_routes &top_pb_route, const t_pb* pb) {
+    // Recursively traverse this pb, calling visitor_.visit_atom() or
+    // visitor_.visit_route_through() on each of its primitive pb's.
 
-    if(pb == nullptr || pb->name == nullptr) {
-        //Empty pb
-        return;
-    }
+    VTR_ASSERT(pb != nullptr);
+    VTR_ASSERT(pb->pb_graph_node != nullptr);
 
+    visitor_.visit_all(top_pb_route, pb);
     if(pb->child_pbs == nullptr) {
         //Primitive pb
-        visitor_.visit_atom(pb);
+        if (pb->name != nullptr) {
+            visitor_.visit_atom(pb);
+        } else {
+            visitor_.visit_route_through(pb);
+        }
         return;
     }
 
     //Recurse
     const t_pb_type* pb_type = pb->pb_graph_node->pb_type;
     if(pb_type->num_modes > 0) {
-        for(int i = 0; i < pb_type->modes[pb->mode].num_pb_type_children; i++) {
-            for(int j = 0; j < pb_type->modes[pb->mode].pb_type_children[i].num_pb; j++) {
-                walk_atoms(&pb->child_pbs[i][j]);
+        const t_mode *mode = &pb_type->modes[pb->mode];
+        for(int i = 0; i < mode->num_pb_type_children; i++) {
+            for(int j = 0; j < mode->pb_type_children[i].num_pb; j++) {
+                walk_blocks(top_pb_route, &pb->child_pbs[i][j]);
             }
         }
     }
 }
 
-
 void NetlistVisitor::start_impl() {
     //noop
 }
@@ -55,7 +60,11 @@
     //noop
 }
 
-void NetlistVisitor::visit_clb_impl(const t_pb* /*clb*/) {
+void NetlistVisitor::visit_clb_impl(ClusterBlockId /*blk_id*/, const t_pb* /*clb*/) {
+    //noop
+}
+
+void NetlistVisitor::visit_route_through_impl(const t_pb* /*atom*/) {
     //noop
 }
 
@@ -63,6 +72,10 @@
     //noop
 }
 
+void NetlistVisitor::visit_all_impl(const t_pb_routes & /*top_pb_route*/, const t_pb* /* pb */) {
+    //noop
+}
+
 void NetlistVisitor::finish_impl() {
     //noop
 }
diff --git a/vpr/src/base/netlist_walker.h b/vpr/src/base/netlist_walker.h
index 2acd472..d9dac04 100644
--- a/vpr/src/base/netlist_walker.h
+++ b/vpr/src/base/netlist_walker.h
@@ -13,7 +13,7 @@
         void walk();
 
     private:
-        void walk_atoms(const t_pb* pb);
+        void walk_blocks(const t_pb_routes &pb_route, const t_pb *pb);
 
     private:
         NetlistVisitor& visitor_;
@@ -25,16 +25,36 @@
         virtual ~NetlistVisitor() = default;
         void start() { start_impl(); }
         void visit_top(const char* top_level_name) { visit_top_impl(top_level_name); }
-        void visit_clb(const t_pb* clb) { visit_clb_impl(clb); }
+        void visit_clb(ClusterBlockId blk_id, const t_pb* clb) { visit_clb_impl(blk_id, clb); }
+
+        // visit_atom is called on leaf pb nodes that map to a netlist element.
         void visit_atom(const t_pb* atom) { visit_atom_impl(atom); }
+
+        // visit_route_through is called on leaf pb nodes that do not map to a
+        // netlist element.  This is generally used for route-through nodes.
+        void visit_route_through(const t_pb* atom) {
+            visit_route_through_impl(atom);
+        }
+
+        // visit_all is called on all t_pb nodes that are in use for any
+        // reason.
+        //
+        // top_pb_route is the pb_route for the cluster being visited.
+        // pb is the current element in the cluster being visited.
+        void visit_all(const t_pb_routes &top_pb_route, const t_pb* pb) {
+            visit_all_impl(top_pb_route, pb);
+        }
         void finish() { finish_impl(); }
 
     protected:
         //All implementation methods are no-ops in this base class
         virtual void start_impl();
         virtual void visit_top_impl(const char* top_level_name);
-        virtual void visit_clb_impl(const t_pb* clb);
+        virtual void visit_clb_impl(ClusterBlockId blk_id, const t_pb* clb);
         virtual void visit_atom_impl(const t_pb* atom);
+        virtual void visit_route_through_impl(const t_pb* atom);
+        virtual void visit_all_impl(const t_pb_routes &top_pb_route, const t_pb* pb);
+
         virtual void finish_impl();
 };
 #endif
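Note on the new visitor hooks: visit_all fires for every node the walker reaches, while leaves are split into visit_atom (named, maps to a netlist element) and visit_route_through (unnamed). The sketch below shows that dispatch on a simplified tree. Node, CountingVisitor, walk and walk_example are hypothetical stand-ins; the real walker operates on t_pb, decides "primitive" via child_pbs, and iterates only the children of the active mode.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for t_pb.
struct Node {
    std::string name;            // empty: no netlist element is mapped here
    std::vector<Node> children;  // empty: primitive (leaf) node
};

// Counts how often each hook is invoked.
struct CountingVisitor {
    int all = 0;
    int atoms = 0;
    int route_throughs = 0;
    void visit_all(const Node&) { ++all; }
    void visit_atom(const Node&) { ++atoms; }
    void visit_route_through(const Node&) { ++route_throughs; }
};

// Mirrors the dispatch order in walk_blocks: visit_all first, then the
// leaf-specific hook, otherwise recurse into the children.
void walk(const Node& node, CountingVisitor& v) {
    v.visit_all(node);
    if (node.children.empty()) {
        if (!node.name.empty()) {
            v.visit_atom(node);
        } else {
            v.visit_route_through(node);
        }
        return;
    }
    for (const Node& child : node.children) {
        walk(child, v);
    }
}

// Walks a cluster holding one mapped leaf and one unnamed (route-through) leaf.
CountingVisitor walk_example() {
    Node clb{"clb", {Node{"lut_a", {}}, Node{"", {}}}};
    CountingVisitor v;
    walk(clb, v);
    // Every leaf triggers exactly one leaf hook; the root only triggers visit_all.
    assert(v.all == v.atoms + v.route_throughs + 1);
    return v;
}
```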
diff --git a/vpr/src/base/netlist_writer.cpp b/vpr/src/base/netlist_writer.cpp
index 109f1b0..7ec05da 100644
--- a/vpr/src/base/netlist_writer.cpp
+++ b/vpr/src/base/netlist_writer.cpp
@@ -26,6 +26,7 @@
 #include "path_delay.h"
 #include "atom_netlist.h"
 #include "atom_netlist_utils.h"
+#include "logic_vec.h"
 
 //Overview
 //========
@@ -91,8 +92,6 @@
 //
 //File local type declarations
 //
-std::ostream& operator<<(std::ostream& os, vtr::LogicValue val);
-
 
 /*enum class PortType {
     IN,
@@ -121,45 +120,6 @@
 //
 //
 
-//A vector-like object containing logic values.
-class LogicVec {
-    public:
-        LogicVec() = default;
-        LogicVec(size_t size_val, //Number of logic values
-                 vtr::LogicValue init_value) //Default value
-            : values_(size_val, init_value)
-            {}
-        LogicVec(std::vector<vtr::LogicValue> values)
-            : values_(values) {}
-
-        //Array indexing operator
-        vtr::LogicValue& operator[](size_t i) { return values_[i]; }
-
-        //Size accessor
-        size_t size() { return values_.size(); }
-
-
-        //Output operator which writes the logic vector in verilog format
-        friend std::ostream& operator<<(std::ostream& os, LogicVec logic_vec) {
-            os << logic_vec.values_.size() << "'b";
-            //Print in reverse since th convention is MSB on the left, LSB on the right
-            //but we store things in array order (putting LSB on left, MSB on right)
-            for(auto iter = logic_vec.begin(); iter != logic_vec.end(); iter++) {
-                os << *iter;
-            }
-            return os;
-        }
-
-        //Standard iterators
-        std::vector<vtr::LogicValue>::reverse_iterator begin() { return values_.rbegin(); }
-        std::vector<vtr::LogicValue>::reverse_iterator end() { return values_.rend(); }
-        std::vector<vtr::LogicValue>::const_reverse_iterator begin() const { return values_.crbegin(); }
-        std::vector<vtr::LogicValue>::const_reverse_iterator end() const { return values_.crend(); }
-
-    private:
-        std::vector<vtr::LogicValue> values_; //The logic values
-};
-
 //A combinational timing arc
 class Arc {
     public:
@@ -1846,7 +1806,6 @@
             }
             return count;
         }
-
         //Returns the logical net ID
         AtomNetId find_atom_input_logical_net(const t_pb* atom, int atom_input_idx) {
             const t_pb_graph_node* pb_node = atom->pb_graph_node;
@@ -1939,17 +1898,6 @@
 // File-scope function implementations
 //
 
-//Output operator for vtr::LogicValue
-std::ostream& operator<<(std::ostream& os, vtr::LogicValue val) {
-    if(val == vtr::LogicValue::FALSE) os << "0";
-    else if (val == vtr::LogicValue::TRUE) os << "1";
-    else if (val == vtr::LogicValue::DONT_CARE) os << "-";
-    else if (val == vtr::LogicValue::UNKOWN) os << "x";
-    else VTR_ASSERT(false);
-    return os;
-}
-
-
 //Returns a blank string for indenting the given depth
 std::string indent(size_t depth) {
     std::string indent_ = "    ";
diff --git a/vpr/src/base/netlist_writer.h b/vpr/src/base/netlist_writer.h
index 37dce0b..141ce6a 100644
--- a/vpr/src/base/netlist_writer.h
+++ b/vpr/src/base/netlist_writer.h
@@ -1,6 +1,11 @@
 #ifndef NETLIST_WRITER_H
 #define NETLIST_WRITER_H
 #include <memory>
+#include <string>
+#include <sstream>
+
+#include "vtr_logic.h"
+
 #include "AnalysisDelayCalculator.h"
 
 //Writes out the post-synthesis implementation netlists in BLIF and Verilog formats,
diff --git a/vpr/src/base/vpr_api.cpp b/vpr/src/base/vpr_api.cpp
index 3ae388d..2deb1af 100644
--- a/vpr/src/base/vpr_api.cpp
+++ b/vpr/src/base/vpr_api.cpp
@@ -983,6 +983,29 @@
 	cluster_ctx.clb_nlist = ClusteredNetlist();
 }
 
+static void free_atoms() {
+    auto& atom_ctx = g_vpr_ctx.mutable_atom();
+    atom_ctx.nlist = AtomNetlist();
+    atom_ctx.lookup = AtomLookup();
+}
+
+static void free_placement() {
+    auto& place_ctx = g_vpr_ctx.mutable_placement();
+    place_ctx.block_locs.clear();
+    place_ctx.grid_blocks.clear();
+}
+
+static void free_routing() {
+    auto& routing_ctx = g_vpr_ctx.mutable_routing();
+    routing_ctx.trace.clear();
+    routing_ctx.trace_nodes.clear();
+    routing_ctx.net_rr_terminals.clear();
+    routing_ctx.rr_blk_source.clear();
+    routing_ctx.rr_node_route_inf.clear();
+    routing_ctx.net_status.clear();
+    routing_ctx.route_bb.clear();
+}
+
 void vpr_free_vpr_data_structures(t_arch& Arch,
         t_vpr_setup& vpr_setup) {
 
@@ -993,6 +1016,9 @@
     free_echo_file_info();
     free_timing_stats();
     free_sdc_related_structs();
+    free_placement();
+    free_routing();
+    free_atoms();
 }
 
 void vpr_free_all(t_arch& Arch,
diff --git a/vpr/src/route/route_common.cpp b/vpr/src/route/route_common.cpp
index c039e0d..12f8f4d 100644
--- a/vpr/src/route/route_common.cpp
+++ b/vpr/src/route/route_common.cpp
@@ -1715,7 +1715,10 @@
 void free_chunk_memory_trace() {
 	if (trace_ch.chunk_ptr_head != nullptr) {
 		free_chunk_memory(&trace_ch);
+		trace_ch.chunk_ptr_head = nullptr;
+		trace_free_head = nullptr;
 	}
+
 }