Commit 613382c

docs: update badges; fix markdown lint complains
Linter config `.vscode/settings.json`:

```json
{
  "[markdown]": {
    "files.trimTrailingWhitespace": false,
  },
  "markdownlint.config": {
    "default": true,
    // "ul-style": {
    //   "style": "asterisk"
    // },
    "MD001": false,
    "MD024": false,
    "MD025": false,
    "MD033": false,
    "MD041": false,
    "MD053": false,
  },
}
```

1 parent 6c52045 commit 613382c

14 files changed, with 121 additions and 95 deletions

README.md

Lines changed: 4 additions & 1 deletion
@@ -1,8 +1,11 @@
 # tree-sitter
 
-[![CICD](https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml/badge.svg)](https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml)
+[![CICD badge]][CICD]
 [![DOI](https://zenodo.org/badge/14164618.svg)](https://zenodo.org/badge/latestdoi/14164618)
 
+[CICD badge]: https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml/badge.svg
+[CICD]: https://github.com/tree-sitter/tree-sitter/actions/workflows/CICD.yml
+
 Tree-sitter is a parser generator tool and an incremental parsing library. It can build a concrete syntax tree for a source file and efficiently update the syntax tree as the source file is edited. Tree-sitter aims to be:
 
 - **General** enough to parse any programming language

cli/README.md

Lines changed: 13 additions & 6 deletions
@@ -1,7 +1,11 @@
-Tree-sitter CLI
-===============
+# Tree-sitter CLI
 
-[![Crates.io](https://img.shields.io/crates/v/tree-sitter-cli.svg)](https://crates.io/crates/tree-sitter-cli)
+[![crates.io badge]][crates.io] [![npmjs.com badge]][npmjs.com]
+
+[crates.io]: https://crates.io/crates/tree-sitter-cli
+[crates.io badge]: https://img.shields.io/crates/v/tree-sitter-cli.svg?color=%23B48723
+[npmjs.com]: https://www.npmjs.org/package/tree-sitter-cli
+[npmjs.com badge]: https://img.shields.io/npm/v/tree-sitter-cli.svg?color=%23BF4A4A
 
 The Tree-sitter CLI allows you to develop, test, and use Tree-sitter grammars from the command line. It works on MacOS, Linux, and Windows.
 
@@ -19,7 +23,7 @@ or with `npm`:
 npm install tree-sitter-cli
 ```
 
-You can also download a pre-built binary for your platform from [the releases page](https://github.com/tree-sitter/tree-sitter/releases/latest).
+You can also download a pre-built binary for your platform from [the releases page].
 
 ### Dependencies
 
@@ -30,8 +34,11 @@ The `tree-sitter` binary itself has no dependencies, but specific commands have
 
 ### Commands
 
-* `generate` - The `tree-sitter generate` command will generate a Tree-sitter parser based on the grammar in the current working directory. See [the documentation](https://tree-sitter.github.io/tree-sitter/creating-parsers) for more information.
+* `generate` - The `tree-sitter generate` command will generate a Tree-sitter parser based on the grammar in the current working directory. See [the documentation] for more information.
 
-* `test` - The `tree-sitter test` command will run the unit tests for the Tree-sitter parser in the current working directory. See [the documentation](https://tree-sitter.github.io/tree-sitter/creating-parsers) for more information.
+* `test` - The `tree-sitter test` command will run the unit tests for the Tree-sitter parser in the current working directory. See [the documentation] for more information.
 
 * `parse` - The `tree-sitter parse` command will parse a file (or list of files) using Tree-sitter parsers.
+
+[the documentation]: https://tree-sitter.github.io/tree-sitter/creating-parsers
+[the releases page]: https://github.com/tree-sitter/tree-sitter/releases/latest
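
The README changes above replace inline links with shortcut reference links plus link definitions. As a rough illustration of that rewrite (a minimal sketch, not the tooling used for this commit; the function name `to_reference_links` and the regular expression are assumptions made for this example, and nested constructs such as badge images inside links are deliberately left alone):

```python
import re

# Matches a simple inline link such as [text](https://example.com).
# The negative lookbehind skips images (![alt](url)); labels containing
# brackets (for example a badge image wrapped in a link) do not match.
INLINE_LINK = re.compile(r"(?<!!)\[([^\]\[]+)\]\((\S+?)\)")


def to_reference_links(markdown: str) -> str:
    """Rewrite simple inline links as shortcut reference links plus definitions."""
    definitions: dict[str, str] = {}

    def replace(match: re.Match) -> str:
        label, url = match.group(1), match.group(2)
        # Reuse a label only when it already points at the same URL.
        if definitions.setdefault(label, url) != url:
            return match.group(0)  # conflicting label: keep the link inline
        return f"[{label}]"

    body = INLINE_LINK.sub(replace, markdown)
    if not definitions:
        return body
    refs = "\n".join(f"[{label}]: {url}" for label, url in definitions.items())
    return f"{body.rstrip()}\n\n{refs}\n"
```

Two links that share a label and a URL, like the two `[the documentation]` links in this file, collapse into a single definition, which is why the diff adds only one `[the documentation]:` line.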

docs/index.md

Lines changed: 6 additions & 6 deletions
@@ -160,9 +160,9 @@ By convention, parsers are named with the language last, eg. tree-sitter-ruby.
 
 The design of Tree-sitter was greatly influenced by the following research papers:
 
-- [Practical Algorithms for Incremental Software Development Environments](https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf)
-- [Context Aware Scanning for Parsing Extensible Languages](https://www-users.cse.umn.edu/~evw/pubs/vanwyk07gpce/vanwyk07gpce.pdf)
-- [Efficient and Flexible Incremental Parsing](https://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf)
-- [Incremental Analysis of Real Programming Languages](https://harmonia.cs.berkeley.edu/papers/twagner-glr.pdf)
-- [Error Detection and Recovery in LR Parsers](https://what-when-how.com/compiler-writing/bottom-up-parsing-compiler-writing-part-13)
-- [Error Recovery for LR Parsers](https://apps.dtic.mil/sti/pdfs/ADA043470.pdf)
+* [Practical Algorithms for Incremental Software Development Environments](https://www2.eecs.berkeley.edu/Pubs/TechRpts/1997/CSD-97-946.pdf)
+* [Context Aware Scanning for Parsing Extensible Languages](https://www-users.cse.umn.edu/~evw/pubs/vanwyk07gpce/vanwyk07gpce.pdf)
+* [Efficient and Flexible Incremental Parsing](https://harmonia.cs.berkeley.edu/papers/twagner-parsing.pdf)
+* [Incremental Analysis of Real Programming Languages](https://harmonia.cs.berkeley.edu/papers/twagner-glr.pdf)
+* [Error Detection and Recovery in LR Parsers](https://what-when-how.com/compiler-writing/bottom-up-parsing-compiler-writing-part-13)
+* [Error Recovery for LR Parsers](https://apps.dtic.mil/sti/pdfs/ADA043470.pdf)
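
The bullet change in this file and the fence change in `docs/section-2-using-parsers.md` (`- item` to `* item`, and "``` scheme" to "```scheme") are mechanical. A minimal sketch of how such a pass could look follows. It is an assumption for illustration, not the process used for this commit; `normalize` is a hypothetical helper, and it ignores edge cases such as thematic breaks written as `- - -`:

```python
import re

# A fence line: optional indent, three or more backticks or tildes, optional info string.
FENCE = re.compile(r"^(\s*)(`{3,}|~{3,})[ \t]*(.*)$")
# A dash bullet: optional indent, "-", then at least one whitespace character.
BULLET = re.compile(r"^(\s*)-(\s+)")


def normalize(markdown: str) -> str:
    """Use '*' bullets and drop the space between a fence and its info string."""
    out = []
    open_fence = None  # marker that opened the current code block, if any
    for line in markdown.split("\n"):
        fence = FENCE.match(line)
        if fence:
            indent, marker, info = fence.groups()
            if open_fence is None:
                # Opening fence: "``` scheme" becomes "```scheme".
                open_fence = marker
                line = f"{indent}{marker}{info.strip()}"
            elif marker.startswith(open_fence) and not info.strip():
                # Closing fence: same character, at least as long, no info string.
                open_fence = None
        elif open_fence is None:
            # Only touch bullets outside code blocks.
            line = BULLET.sub(r"\1*\2", line)
        out.append(line)
    return "\n".join(out)
```

Tracking the open fence matters here because query snippets and shell examples inside code blocks can contain lines starting with `-` that must not be rewritten.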

docs/section-2-using-parsers.md

Lines changed: 36 additions & 36 deletions
@@ -21,21 +21,21 @@ Alternatively, you can incorporate the library in a larger project's build syste
 
 **source file:**
 
-- `tree-sitter/lib/src/lib.c`
+* `tree-sitter/lib/src/lib.c`
 
 **include directories:**
 
-- `tree-sitter/lib/src`
-- `tree-sitter/lib/include`
+* `tree-sitter/lib/src`
+* `tree-sitter/lib/include`
 
 ### The Basic Objects
 
 There are four main types of objects involved when using Tree-sitter: languages, parsers, syntax trees, and syntax nodes. In C, these are called `TSLanguage`, `TSParser`, `TSTree`, and `TSNode`.
 
-- A `TSLanguage` is an opaque object that defines how to parse a particular programming language. The code for each `TSLanguage` is generated by Tree-sitter. Many languages are already available in separate git repositories within the [Tree-sitter GitHub organization](https://github.com/tree-sitter). See [the next page](./creating-parsers) for how to create new languages.
-- A `TSParser` is a stateful object that can be assigned a `TSLanguage` and used to produce a `TSTree` based on some source code.
-- A `TSTree` represents the syntax tree of an entire source code file. It contains `TSNode` instances that indicate the structure of the source code. It can also be edited and used to produce a new `TSTree` in the event that the source code changes.
-- A `TSNode` represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children.
+* A `TSLanguage` is an opaque object that defines how to parse a particular programming language. The code for each `TSLanguage` is generated by Tree-sitter. Many languages are already available in separate git repositories within the [Tree-sitter GitHub organization](https://github.com/tree-sitter). See [the next page](./creating-parsers) for how to create new languages.
+* A `TSParser` is a stateful object that can be assigned a `TSLanguage` and used to produce a `TSTree` based on some source code.
+* A `TSTree` represents the syntax tree of an entire source code file. It contains `TSNode` instances that indicate the structure of the source code. It can also be edited and used to produce a new `TSTree` in the event that the source code changes.
+* A `TSNode` represents a single node in the syntax tree. It tracks its start and end positions in the source code, as well as its relation to other nodes like its parent, siblings and children.
 
 ### An Example Program
 
@@ -442,31 +442,31 @@ Many code analysis tasks involve searching for patterns in syntax trees. Tree-si
 
 A _query_ consists of one or more _patterns_, where each pattern is an [S-expression](https://en.wikipedia.org/wiki/S-expression) that matches a certain set of nodes in a syntax tree. The expression to match a given node consists of a pair of parentheses containing two things: the node's type, and optionally, a series of other S-expressions that match the node's children. For example, this pattern would match any `binary_expression` node whose children are both `number_literal` nodes:
 
-``` scheme
+```scheme
 (binary_expression (number_literal) (number_literal))
 ```
 
 Children can also be omitted. For example, this would match any `binary_expression` where at least _one_ of child is a `string_literal` node:
 
-``` scheme
+```scheme
 (binary_expression (string_literal))
 ```
 
 #### Fields
 
 In general, it's a good idea to make patterns more specific by specifying [field names](#node-field-names) associated with child nodes. You do this by prefixing a child pattern with a field name followed by a colon. For example, this pattern would match an `assignment_expression` node where the `left` child is a `member_expression` whose `object` is a `call_expression`.
 
-``` scheme
+```scheme
 (assignment_expression
 left: (member_expression
 object: (call_expression)))
 ```
 
 #### Negated Fields
 
-You can also constrain a pattern so that it only matches nodes that *lack* a certain field. To do this, add a field name prefixed by a `!` within the parent pattern. For example, this pattern would match a class declaration with no type parameters:
+You can also constrain a pattern so that it only matches nodes that _lack_ a certain field. To do this, add a field name prefixed by a `!` within the parent pattern. For example, this pattern would match a class declaration with no type parameters:
 
-``` scheme
+```scheme
 (class_declaration
 name: (identifier) @class_name
 !type_parameters)
@@ -476,7 +476,7 @@ You can also constrain a pattern so that it only matches nodes that *lack* a cer
 
 The parenthesized syntax for writing nodes only applies to [named nodes](#named-vs-anonymous-nodes). To match specific anonymous nodes, you write their name between double quotes. For example, this pattern would match any `binary_expression` where the operator is `!=` and the right side is `null`:
 
-``` scheme
+```scheme
 (binary_expression
 operator: "!="
 right: (null))
@@ -488,15 +488,15 @@ When matching patterns, you may want to process specific nodes within the patter
 
 For example, this pattern would match any assignment of a `function` to an `identifier`, and it would associate the name `the-function-name` with the identifier:
 
-``` scheme
+```scheme
 (assignment_expression
 left: (identifier) @the-function-name
 right: (function))
 ```
 
 And this pattern would match all method definitions, associating the name `the-method-name` with the method name, `the-class-name` with the containing class name:
 
-``` scheme
+```scheme
 (class_declaration
 name: (identifier) @the-class-name
 body: (class_body
@@ -510,21 +510,21 @@ You can match a repeating sequence of sibling nodes using the postfix `+` and `*
 
 For example, this pattern would match a sequence of one or more comments:
 
-``` scheme
+```scheme
 (comment)+
 ```
 
 This pattern would match a class declaration, capturing all of the decorators if any were present:
 
-``` scheme
+```scheme
 (class_declaration
 (decorator)* @the-decorator
 name: (identifier) @the-name)
 ```
 
 You can also mark a node as optional using the `?` operator. For example, this pattern would match all function calls, capturing a string argument if one was present:
 
-``` scheme
+```scheme
 (call_expression
 function: (identifier) @the-function
 arguments: (arguments (string)? @the-string-arg))
@@ -534,7 +534,7 @@ You can also mark a node as optional using the `?` operator. For example, this p
 
 You can also use parentheses for grouping a sequence of _sibling_ nodes. For example, this pattern would match a comment followed by a function declaration:
 
-``` scheme
+```scheme
 (
 (comment)
 (function_declaration)
@@ -543,7 +543,7 @@ You can also use parentheses for grouping a sequence of _sibling_ nodes. For exa
 
 Any of the quantification operators mentioned above (`+`, `*`, and `?`) can also be applied to groups. For example, this pattern would match a comma-separated series of numbers:
 
-``` scheme
+```scheme
 (
 (number)
 ("," (number))*
@@ -558,7 +558,7 @@ This is similar to _character classes_ from regular expressions (`[abc]` matches
 For example, this pattern would match a call to either a variable or an object property.
 In the case of a variable, capture it as `@function`, and in the case of a property, capture it as `@method`:
 
-``` scheme
+```scheme
 (call_expression
 function: [
 (identifier) @function
@@ -569,7 +569,7 @@ In the case of a variable, capture it as `@function`, and in the case of a prope
 
 This pattern would match a set of possible keyword tokens, capturing them as `@keyword`:
 
-``` scheme
+```scheme
 [
 "break"
 "delete"
@@ -592,7 +592,7 @@ and `_` will match any named or anonymous node.
 
 For example, this pattern would match any node inside a call:
 
-``` scheme
+```scheme
 (call (_) @call.inner)
 ```
 
@@ -602,21 +602,21 @@ The anchor operator, `.`, is used to constrain the ways in which child patterns
 
 When `.` is placed before the _first_ child within a parent pattern, the child will only match when it is the first named node in the parent. For example, the below pattern matches a given `array` node at most once, assigning the `@the-element` capture to the first `identifier` node in the parent `array`:
 
-``` scheme
+```scheme
 (array . (identifier) @the-element)
 ```
 
 Without this anchor, the pattern would match once for every identifier in the array, with `@the-element` bound to each matched identifier.
 
 Similarly, an anchor placed after a pattern's _last_ child will cause that child pattern to only match nodes that are the last named child of their parent. The below pattern matches only nodes that are the last named child within a `block`.
 
-``` scheme
+```scheme
 (block (_) @last-expression .)
 ```
 
 Finally, an anchor _between_ two child patterns will cause the patterns to only match nodes that are immediate siblings. The pattern below, given a long dotted name like `a.b.c.d`, will only match pairs of consecutive identifiers: `a, b`, `b, c`, and `c, d`.
 
-``` scheme
+```scheme
 (dotted_name
 (identifier) @prev-id
 .
@@ -633,7 +633,7 @@ You can also specify arbitrary metadata and conditions associated with a pattern
 
 For example, this pattern would match identifier whose names is written in `SCREAMING_SNAKE_CASE`:
 
-``` scheme
+```scheme
 (
 (identifier) @constant
 (#match? @constant "^[A-Z][A-Z_]+")
@@ -642,7 +642,7 @@ For example, this pattern would match identifier whose names is written in `SCRE
 
 And this pattern would match key-value pairs where the `value` is an identifier with the same name as the key:
 
-``` scheme
+```scheme
 (
 (pair
 key: (property_identifier) @key-name
@@ -723,8 +723,8 @@ The node types file contains an array of objects, each of which describes a part
 
 Every object in this array has these two entries:
 
-- `"type"` - A string that indicates which grammar rule the node represents. This corresponds to the `ts_node_type` function described [above](#syntax-nodes).
-- `"named"` - A boolean that indicates whether this kind of node corresponds to a rule name in the grammar or just a string literal. See [above](#named-vs-anonymous-nodes) for more info.
+* `"type"` - A string that indicates which grammar rule the node represents. This corresponds to the `ts_node_type` function described [above](#syntax-nodes).
+* `"named"` - A boolean that indicates whether this kind of node corresponds to a rule name in the grammar or just a string literal. See [above](#named-vs-anonymous-nodes) for more info.
 
 Examples:
 
@@ -745,14 +745,14 @@ Together, these two fields constitute a unique identifier for a node type; no tw
 
 Many syntax nodes can have _children_. The node type object describes the possible children that a node can have using the following entries:
 
-- `"fields"` - An object that describes the possible [fields](#node-field-names) that the node can have. The keys of this object are field names, and the values are _child type_ objects, described below.
-- `"children"` - Another _child type_ object that describes all of the node's possible _named_ children _without_ fields.
+* `"fields"` - An object that describes the possible [fields](#node-field-names) that the node can have. The keys of this object are field names, and the values are _child type_ objects, described below.
+* `"children"` - Another _child type_ object that describes all of the node's possible _named_ children _without_ fields.
 
 A _child type_ object describes a set of child nodes using the following entries:
 
-- `"required"` - A boolean indicating whether there is always _at least one_ node in this set.
-- `"multiple"` - A boolean indicating whether there can be _multiple_ nodes in this set.
-- `"types"`- An array of objects that represent the possible types of nodes in this set. Each object has two keys: `"type"` and `"named"`, whose meanings are described above.
+* `"required"` - A boolean indicating whether there is always _at least one_ node in this set.
+* `"multiple"` - A boolean indicating whether there can be _multiple_ nodes in this set.
+* `"types"`- An array of objects that represent the possible types of nodes in this set. Each object has two keys: `"type"` and `"named"`, whose meanings are described above.
 
 Example with fields:
 
@@ -812,7 +812,7 @@ In Tree-sitter grammars, there are usually certain rules that represent abstract
 
 Normally, hidden rules are not mentioned in the node types file, since they don't appear in the syntax tree. But if you add a hidden rule to the grammar's [`supertypes` list](./creating-parsers#the-grammar-dsl), then it _will_ show up in the node types file, with the following special entry:
 
-* `"subtypes"` - An array of objects that specify the _types_ of nodes that this 'supertype' node can wrap.
+* `"subtypes"` - An array of objects that specify the _types_ of nodes that this 'supertype' node can wrap.
 
 Example:
 
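
The `"required"` and `"multiple"` flags described in the node-types hunks above combine into four possible cardinalities for a set of children. The sketch below spells that out. It assumes entries shaped as the documentation describes; the `EXAMPLE` entry and the helper names `cardinality` and `describe_children` are hypothetical and not taken from any real grammar or API:

```python
def cardinality(child_type: dict) -> str:
    """Map the "required"/"multiple" flags of a child type object to a count."""
    required = child_type["required"]  # always at least one node in the set
    multiple = child_type["multiple"]  # more than one node is possible
    if required:
        return "one or more" if multiple else "exactly one"
    return "zero or more" if multiple else "zero or one"


def describe_children(node_type: dict) -> dict:
    """Return {field name: cardinality} for one entry of the node types array."""
    summary = {
        name: cardinality(child)
        for name, child in node_type.get("fields", {}).items()
    }
    if "children" in node_type:
        # Named children that are not attached to any field.
        summary["(children without fields)"] = cardinality(node_type["children"])
    return summary


# Hypothetical entry for illustration only.
EXAMPLE = {
    "type": "method_definition",
    "named": True,
    "fields": {
        "name": {
            "multiple": False,
            "required": True,
            "types": [{"type": "identifier", "named": True}],
        },
        "decorator": {
            "multiple": True,
            "required": False,
            "types": [{"type": "decorator", "named": True}],
        },
    },
    "children": {
        "multiple": False,
        "required": False,
        "types": [{"type": "comment", "named": True}],
    },
}

print(describe_children(EXAMPLE))
```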
