Commit 69a5f77

author
Patrick Thomson
committed
Describe how to use tree-sitter tags as well.
1 parent 1fbace1 commit 69a5f77

1 file changed

+27
-7
lines changed

docs/section-8-code-navigation-systems.md

Lines changed: 27 additions & 7 deletions
@@ -5,21 +5,19 @@ permalink: code-navigation-systems
55

66
# Code Navigation Systems
77

8-
Tree-sitter can be used in conjunction with its [tree query language](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries) as a part of code navigation systems. An example of such a system can be seen in the `tree-sitter tag` command, which emits a textual dump of the interesting syntactic nodes in its file argument. A notable application of this is GitHub's support for [search-based code navigation](https://docs.github.com/en/repositories/working-with-files/using-files/navigating-code-on-github#precise-and-search-based-navigation). This document exists to describe how to extend the
8+
Tree-sitter can be used in conjunction with its [tree query language](https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries) as a part of code navigation systems. An example of such a system can be seen in the `tree-sitter tags` command, which emits a textual dump of the interesting syntactic nodes in its file argument. A notable application of this is GitHub's support for [search-based code navigation](https://docs.github.com/en/repositories/working-with-files/using-files/navigating-code-on-github#precise-and-search-based-navigation). This document exists to describe how to integrate with such systems, and how to extend this functionality to any language with a Tree-sitter grammar.
99

1010
## Tagging and captures
1111

1212
*Tagging* is the act of identifying the entities that can be named in a program. We use Tree-sitter queries to find those entities. Having found them, you use a syntax capture to label the entity and its name.
1313

14-
You can use the `tree-sitter tag` command to test out a given set of tags
15-
1614
The essence of a given tag lies in two pieces of data: the _kind_ of entity that is matched (usually a definition or a reference) and the _role_ of that entity, which describes how the entity is used (i.e. whether it's a class definition, function call, variable reference, and so on). Our convention is to use a syntax capture following the `@kind.role` capture name format, and another inner capture, always called `@name`, that pulls out the name of a given identifier.
1715
'
18-
You may optionally include a capture named `@doc `to bind a docstring. For convenience purposes, the tagging system provides two built-in functions, `#select-adjacent` and `#strip` that are convenient for removing comment syntax from a docstring. `#strip` takes a capture as its first argument and a regular expression, expressed as a quoted string. Any text patterns matched by the regular expression will be removed from the text associated with the passed capture. `#select-adjacent`, when passed two capture names, filters the text associated with the first capture so that only text adjacent to the second capture is preserved. This can be useful when writing queries that would otherwise include too much information in matched comments.
16+
You may optionally include a capture named `@doc` to bind a docstring. The tagging system provides two built-in functions, `#select-adjacent!` and `#strip!`, that are convenient for removing comment syntax from a docstring. `#strip!` takes a capture as its first argument and a regular expression, expressed as a quoted string, as its second. Any text patterns matched by the regular expression will be removed from the text associated with the passed capture. `#select-adjacent!`, when passed two capture names, filters the text associated with the first capture so that only text adjacent to the second capture is preserved. This can be useful when writing queries that would otherwise include too much information in matched comments.
1917

2018
## Examples
2119

22-
An [example query](https://github.com/tree-sitter/tree-sitter-python/blob/78c4e9b6b2f08e1be23b541ffced47b15e2972ad/queries/tags.scm#L4-L5) follows, one that recognizes Python function definitions and captures their declared name. The `function_definition` syntax node is defined in the [Python Tree-sitter grammar](https://github.com/tree-sitter/tree-sitter-python/blob/78c4e9b6b2f08e1be23b541ffced47b15e2972ad/grammar.js#L354).
20+
This [query](https://github.com/tree-sitter/tree-sitter-python/blob/78c4e9b6b2f08e1be23b541ffced47b15e2972ad/queries/tags.scm#L4-L5) recognizes Python function definitions and captures their declared name. The `function_definition` syntax node is defined in the [Python Tree-sitter grammar](https://github.com/tree-sitter/tree-sitter-python/blob/78c4e9b6b2f08e1be23b541ffced47b15e2972ad/grammar.js#L354).
2321

2422
``` scheme
2523
(function_definition
@@ -64,7 +62,7 @@ An even more sophisticated query is in the [Ruby Tree-sitter repository](https:/
6462
)
6563
```
6664

67-
The below table describes a standard vocabulary for kinds and roles during the tagging process. User applications may extend (or only recognize a subset of) these capture names, but it is desirable to standardize on the names below when supported by a given system or language. Language communities that write tagging rules using these names can work out-of-the-box with a steadily increasing set of analysis tools.
65+
The table below describes a standard vocabulary for kinds and roles during the tagging process. New applications may extend (or only recognize a subset of) these capture names, but it is desirable to standardize on the names below.
6866

6967
| Category | Tag |
7068
|--------------------------|-----------------------------|
@@ -77,4 +75,26 @@ The below table describes a standard vocabulary for kinds and roles during the t
7775
| Class reference | `@reference.class` |
7876
| Interface implementation | `@reference.implementation` |
7977

80-
By convention, tags for a given language are made available in a `queries/tags.scm `file in that language's repository.
78+
## Command-line invocation
79+
80+
You can use the `tree-sitter tags` command to test out a tags query file. We can run this tool from within the Tree-sitter Ruby repository, over code in a file called `test.rb`:
81+
82+
``` ruby
83+
module Foo
84+
class Bar
85+
def baz
86+
end
87+
end
88+
end
89+
```
90+
91+
Invoking `tree-sitter tags test.rb` produces the following console output:
92+
93+
```
94+
test.rb
95+
Foo | module def (0, 7) - (0, 10) `module Foo`
96+
Bar | class def (1, 8) - (1, 11) `class Bar`
97+
baz | method def (2, 8) - (2, 11) `def baz`
98+
```
99+
100+
By convention, tags for a given language are made available in a `queries/tags.scm` file in that language's repository.
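
When integrating with the command-line tool, it can help to turn the console output shown above into structured records. The following is a minimal Python sketch, not part of Tree-sitter itself. It assumes the output has exactly the shape of the sample above: a name, a `|` separator, a role, `def` or `ref`, a start and end position, and the source line in backticks. The names `Tag`, `parse_tag_line`, and `parse_tags_output` are hypothetical, and real output may differ in spacing or columns between CLI versions.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Tag(NamedTuple):
    """One tag line as printed in the sample output (hypothetical record type)."""
    name: str                # e.g. "Foo"
    role: str                # e.g. "module", "class", "method"
    kind: str                # "def" or "ref", as abbreviated in the output
    start: Tuple[int, int]   # (row, column) of the start of the name
    end: Tuple[int, int]     # (row, column) of the end of the name
    line: str                # source line shown between backticks


# Assumed line shape, taken from the sample output only:
#   Foo | module def (0, 7) - (0, 10) `module Foo`
TAG_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s*\|\s*(?P<role>\S+)\s+(?P<kind>def|ref)\s+"
    r"\((?P<srow>\d+),\s*(?P<scol>\d+)\)\s*-\s*"
    r"\((?P<erow>\d+),\s*(?P<ecol>\d+)\)\s+"
    r"`(?P<line>.*)`\s*$"
)


def parse_tag_line(text: str) -> Optional[Tag]:
    """Return a Tag for a tag line, or None for any other line (e.g. a file name)."""
    m = TAG_LINE.match(text)
    if m is None:
        return None
    return Tag(
        name=m["name"],
        role=m["role"],
        kind=m["kind"],
        start=(int(m["srow"]), int(m["scol"])),
        end=(int(m["erow"]), int(m["ecol"])),
        line=m["line"],
    )


def parse_tags_output(output: str) -> List[Tag]:
    """Parse every recognizable tag line, skipping lines that do not match."""
    parsed = (parse_tag_line(line) for line in output.splitlines())
    return [tag for tag in parsed if tag is not None]


# The sample console output from the section above.
SAMPLE = """\
test.rb
Foo | module def (0, 7) - (0, 10) `module Foo`
Bar | class def (1, 8) - (1, 11) `class Bar`
baz | method def (2, 8) - (2, 11) `def baz`
"""

if __name__ == "__main__":
    for tag in parse_tags_output(SAMPLE):
        print(tag.name, tag.role, tag.kind, tag.start, tag.end)
```

Lines that do not match the assumed shape, such as the leading file name, are skipped rather than treated as errors, so the sketch tolerates headers but will silently drop tag lines printed in a different format.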
