tree: 5f0b1fac00db66fbb1ae4fc25deac1ea17b65805
  1. BUILD
  2. concrete_syntax_leaf.cc
  3. concrete_syntax_leaf.h
  4. concrete_syntax_leaf_test.cc
  5. concrete_syntax_tree.cc
  6. concrete_syntax_tree.h
  7. concrete_syntax_tree_test.cc
  8. config_utils.cc
  9. config_utils.h
  10. config_utils_test.cc
  11. constants.h
  12. macro_definition.cc
  13. macro_definition.h
  14. macro_definition_test.cc
  15. parser_verifier.cc
  16. parser_verifier.h
  17. parser_verifier_test.cc
  18. README.md
  19. symbol.cc
  20. symbol.h
  21. syntax_tree_context.h
  22. syntax_tree_context_test.cc
  23. text_structure.cc
  24. text_structure.h
  25. text_structure_test.cc
  26. text_structure_test_utils.cc
  27. text_structure_test_utils.h
  28. token_info.cc
  29. token_info.h
  30. token_info_json.cc
  31. token_info_json.h
  32. token_info_json_test.cc
  33. token_info_test.cc
  34. token_info_test_util.cc
  35. token_info_test_util.h
  36. token_info_test_util_test.cc
  37. token_stream_view.cc
  38. token_stream_view.h
  39. token_stream_view_test.cc
  40. tree_builder_test_util.cc
  41. tree_builder_test_util.h
  42. tree_builder_test_util_test.cc
  43. tree_compare.cc
  44. tree_compare.h
  45. tree_compare_test.cc
  46. tree_context_visitor.cc
  47. tree_context_visitor.h
  48. tree_context_visitor_test.cc
  49. tree_utils.cc
  50. tree_utils.h
  51. tree_utils_test.cc
  52. visitors.h
common/text/README.md

Text Structural Representation Libraries

At the heart of language-tooling libraries and applications lie various structural representations of text, and the functions that operate on them. This directory contains language-agnostic data structures like:

  • Tokens: annotated substrings of a body of text, often what a lexer produces.
    • Token streams: iterable representations of lexer output, including filtered views thereof.
  • Syntax trees: represent how parsers understand and organize code hierarchically.
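
As a rough illustration of the first two items, the sketch below models a token as a tag plus a view into the analyzed text, and a filtered view as a list of positions into the token stream. The names `Token`, `kWord`, `kSpace`, `Lex`, and `NonSpaceView` are hypothetical and chosen only for this example; they are not the classes defined in this directory, and `std::string_view` stands in for `absl::string_view`.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical token: an annotation (tag) plus a substring of the original
// text. The text is a view, so no characters are copied.
struct Token {
  int tag;                // lexer-specific enumeration value
  std::string_view text;  // substring of the analyzed buffer
};

constexpr int kWord = 1;
constexpr int kSpace = 2;

// Toy lexer: splits `base` into alternating runs of spaces and non-spaces.
inline std::vector<Token> Lex(std::string_view base) {
  std::vector<Token> tokens;
  std::size_t pos = 0;
  while (pos < base.size()) {
    const bool is_space = base[pos] == ' ';
    std::size_t end = pos;
    while (end < base.size() && (base[end] == ' ') == is_space) ++end;
    assert(end > pos);  // every iteration consumes at least one byte
    tokens.push_back({is_space ? kSpace : kWord, base.substr(pos, end - pos)});
    pos = end;
  }
  return tokens;
}

// Filtered view: positions of the non-space tokens in the stream.
// It stores indices into the stream rather than copies of the tokens.
inline std::vector<std::size_t> NonSpaceView(const std::vector<Token>& stream) {
  std::vector<std::size_t> view;
  for (std::size_t i = 0; i < stream.size(); ++i) {
    if (stream[i].tag != kSpace) view.push_back(i);
  }
  return view;
}
```

Calling `Lex` on a buffer and then `NonSpaceView` on the result gives a parser-facing sequence that skips whitespace while the full stream stays available, for example to a formatter.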

Key Concepts

absl::string_views represent more than text: because each view carries begin and end bounds, comparing those bounds against a larger body of text also gives the view's position within it. This concept is leveraged heavily to avoid unnecessary string copying. A base string_view that spans an entire body of text serves as the basis for converting between substring views and byte offsets relative to the start of the base.
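
The conversion between substring views and byte offsets can be sketched as follows. This is a minimal illustration and not the helpers this library provides: `IsSubRange`, `ByteOffset`, and `SubstringAt` are hypothetical names, and `std::string_view` is used in place of `absl::string_view` so that the snippet needs only the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>

// Returns true if `sub` lies entirely within the memory range of `base`.
// std::less_equal is used because it yields a well-defined total order on
// pointers, even when they do not point into the same array.
inline bool IsSubRange(std::string_view sub, std::string_view base) {
  const std::less_equal<const char*> le;
  return le(base.data(), sub.data()) &&
         le(sub.data() + sub.size(), base.data() + base.size());
}

// Substring view -> byte offset relative to the start of `base`.
// Precondition: `sub` must be a sub-range of `base`.
inline std::size_t ByteOffset(std::string_view sub, std::string_view base) {
  assert(IsSubRange(sub, base));
  return static_cast<std::size_t>(sub.data() - base.data());
}

// Byte offset and length -> substring view of `base`. No bytes are copied.
inline std::string_view SubstringAt(std::string_view base, std::size_t offset,
                                    std::size_t length) {
  return base.substr(offset, length);
}
```

The position of a view is recovered from its bounds alone, so this only works for views that point into the base buffer. A copy of the same characters (for example a `std::string` built from the view) compares equal as text but carries no position within the base.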