Restrict repository indexing by glob match #7767

Merged
merged 29 commits into from
Sep 11, 2019
a85393b
Restrict repository indexing by file extension
Aug 6, 2019
4353c71
Merge master into indexbyfileext
Aug 6, 2019
c48c37a
Use REPO_EXTENSIONS_LIST_INCLUDE instead of REPO_EXTENSIONS_LIST_EXCL…
Aug 6, 2019
c6d5a79
Corrected to pass lint gosimple
Aug 6, 2019
021014a
Merge master into indexbyfileext
Aug 7, 2019
72a650c
Add wildcard support to REPO_INDEXER_EXTENSIONS
Aug 7, 2019
1d7edb4
This reverts commit 72a650c8e42f4abf59d5df7cd5dc27b451494cc6.
Aug 7, 2019
7450aee
Add wildcard support to REPO_INDEXER_EXTENSIONS (no make vendor)
Aug 7, 2019
106faf3
Simplify isIndexable() for better clarity
guillep2k Aug 7, 2019
e48f041
Add gobwas/glob to vendors
guillep2k Aug 7, 2019
bf82bdb
Merge branch master into indexbyfileext
guillep2k Aug 7, 2019
d63a0fb
Merge branch 'master' of github.com:go-gitea/gitea into indexbyfileext
guillep2k Aug 7, 2019
49e260c
Merge master into indexbyfileext and resolve conflicts
guillep2k Aug 28, 2019
7c66a16
Merge branch 'master' into indexbyfileext
guillep2k Aug 28, 2019
f2342d6
Merge branch 'master' into indexbyfileext
guillep2k Aug 28, 2019
55a93a3
manually set appengine new release
guillep2k Aug 28, 2019
89773e9
Merge branch 'master' into indexbyfileext
guillep2k Sep 3, 2019
0eca697
Merge branch 'master' of github.com:go-gitea/gitea into indexbyfileext
guillep2k Sep 4, 2019
6bd0ac8
Implement better REPO_INDEXER_INCLUDE and REPO_INDEXER_EXCLUDE
guillep2k Sep 4, 2019
435a222
Merge branch 'master' of github.com:go-gitea/gitea into indexbyfileext
guillep2k Sep 4, 2019
4de7be9
Add unit and integration tests
guillep2k Sep 5, 2019
b79406a
Merge branch 'indexbyfileext' of github.com:guillep2k/gitea into inde…
guillep2k Sep 5, 2019
9afc70e
Merge branch 'master' of github.com:go-gitea/gitea into indexbyfileext
guillep2k Sep 6, 2019
d92ee99
Update app.ini.sample and reword config-cheat-sheet
guillep2k Sep 6, 2019
59fe641
Add doc page and correct app.ini.sample
guillep2k Sep 6, 2019
a1f675b
Some polish on the doc
guillep2k Sep 6, 2019
480a406
Simplify code as suggested by @lafriks
guillep2k Sep 6, 2019
e0fbcb7
Merge branch 'master' into indexbyfileext
guillep2k Sep 11, 2019
326f770
Merge branch 'master' into indexbyfileext
lafriks Sep 11, 2019
Some polish on the doc
guillep2k committed Sep 6, 2019
commit a1f675b11e7d97326c5824cad895a361c748ef43
19 changes: 10 additions & 9 deletions docs/content/doc/advanced/repo-indexer.en-us.md
@@ -17,7 +17,7 @@ menu:

## Setting up the repository indexer

-Gitea can search through the files of the repositories by enabling this function in your `app.ini`:
+Gitea can search through the files of the repositories by enabling this function in your [`app.ini`](https://docs.gitea.io/en-us/config-cheat-sheet/):

```
[indexer]
@@ -30,28 +30,29 @@ REPO_INDEXER_INCLUDE =
REPO_INDEXER_EXCLUDE = resources/bin/**
```

-Please bear in mind that indexing the contents can consume a lot of system resources, especially when the index is created or globally updated (e.g. after upgrading Gitea).
+Please bear in mind that indexing the contents can consume a lot of system resources, especially when the index is created for the first time or globally updated (e.g. after upgrading Gitea).

### Choosing the files for indexing by size

-The `MAX_FILE_SIZE` option will make the indexer to skip all files larger than the specified value.
+The `MAX_FILE_SIZE` option will make the indexer skip all files larger than the specified value.

### Choosing the files for indexing by path

Gitea applies glob pattern matching from the [`gobwas/glob` library](https://github.com/gobwas/glob) to choose which files will be included in the index.

-Limiting the list of files can help preventing the indexes to become polluted with derived or irrelevant files (e.g. lss, sym, map, etc.), so the search results are more relevant. It can also help reduce the index size.
+Limiting the list of files prevents the indexes from becoming polluted with derived or irrelevant files (e.g. lss, sym, map, etc.), so the search results are more relevant. It can also help reduce the index size.

-`REPO_INDEXER_INCLUDE` (default: empty) is a comma separated list of glob patterns to include in the index. An empty list means "include all files".
-`REPO_INDEXER_EXCLUDE` (default: empty) is a comma separated list of glob patterns to exclude from the index. Files that match this list will not be indexed. `REPO_INDEXER_EXCLUDE` takes precedence over `REPO_INDEXER_INCLUDE`.
+`REPO_INDEXER_INCLUDE` (default: empty) is a comma separated list of glob patterns to **include** in the index. An empty list means "_include all files_".
+`REPO_INDEXER_EXCLUDE` (default: empty) is a comma separated list of glob patterns to **exclude** from the index. Files that match this list will not be indexed. `REPO_INDEXER_EXCLUDE` takes precedence over `REPO_INDEXER_INCLUDE`.

Pattern matching works as follows:

* To match all files with a `.txt` extension no matter what directory, use `**.txt`.
-* To match all files with a `.txt` extension at the root level of the repository, use `*.txt`.
+* To match all files with a `.txt` extension _only at the root level of the repository_, use `*.txt`.
* To match all files inside `resources/bin` and below, use `resources/bin/**`.
-* To match all files immediately inside `resources/bin`, use `resources/bin/*`.
+* To match all files _immediately inside_ `resources/bin`, use `resources/bin/*`.
* To match all files named `Makefile`, use `**Makefile`.
* Matching a directory has no effect; the pattern `resources/bin` will not include/exclude files inside that directory; `resources/bin/**` will.
-* All files and patterns are normalized to lower case.
+* All files and patterns are normalized to lower case, so `**Makefile`, `**makefile` and `**MAKEFILE` are equivalent.
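
The include/exclude rules documented in this commit can be illustrated with a minimal Go sketch. It is not Gitea's implementation and does not use `gobwas/glob`: `globToRegexp`, `compile` and `matchAny` are hypothetical helpers that translate only `*` and `**` into a standard-library regular expression, and this `isIndexable` is an illustrative stand-in that merely borrows the name from the commit list. It assumes `**` matches any characters, `*` matches any characters except `/`, and everything is lowercased before matching.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// globToRegexp is a simplified stand-in for a glob library (assumption):
// "**" matches any characters, "*" matches any characters except "/",
// and every other character is taken literally.
func globToRegexp(pattern string) *regexp.Regexp {
	var b strings.Builder
	b.WriteString("^")
	r := []rune(strings.ToLower(pattern)) // patterns are normalized to lower case
	for i := 0; i < len(r); i++ {
		switch {
		case r[i] == '*' && i+1 < len(r) && r[i+1] == '*':
			b.WriteString(".*")
			i++ // consume the second asterisk
		case r[i] == '*':
			b.WriteString("[^/]*")
		default:
			b.WriteString(regexp.QuoteMeta(string(r[i])))
		}
	}
	b.WriteString("$")
	return regexp.MustCompile(b.String())
}

// compile turns a comma separated list of globs into matchers,
// skipping empty entries so that an empty setting yields an empty list.
func compile(list string) []*regexp.Regexp {
	var out []*regexp.Regexp
	for _, p := range strings.Split(list, ",") {
		if p = strings.TrimSpace(p); p != "" {
			out = append(out, globToRegexp(p))
		}
	}
	return out
}

// matchAny reports whether path matches at least one of the matchers.
func matchAny(res []*regexp.Regexp, path string) bool {
	for _, re := range res {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

// isIndexable applies the documented rules: exclusion takes precedence,
// and an empty include list means "include all files".
func isIndexable(path string, include, exclude []*regexp.Regexp) bool {
	path = strings.ToLower(path) // file names are normalized to lower case
	if matchAny(exclude, path) {
		return false
	}
	return len(include) == 0 || matchAny(include, path)
}

func main() {
	include := compile("**.txt, **Makefile")
	exclude := compile("resources/bin/**")
	paths := []string{"README.txt", "docs/notes.TXT", "src/Makefile", "resources/bin/tool.txt", "main.go"}
	for _, path := range paths {
		fmt.Printf("%-24s %v\n", path, isIndexable(path, include, exclude))
	}
}
```

Under these assumptions, `resources/bin/tool.txt` is rejected even though it matches `**.txt`, because the exclude list is checked first, and `main.go` is rejected because it matches no include pattern. The other three paths are accepted.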