docs(self-hosted): explain self-hosted data flow #13745
Merged
6 commits
6ff84c9 docs(self-hosted): explain self-hosted data flow (aldy505)
9e68b9a Update data-flow.mdx (aldy505)
fb7a95a Update data-flow.mdx (aldy505)
7296229 Update data-flow.mdx (aldy505)
7c2578c Merge remote-tracking branch 'origin/master' into docs/self-hosted/da… (aldy505)
31942cc Grammatical fixes (BYK)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
--- | ||
title: Self-hosted Data Flow | ||
sidebar_title: Data Flow | ||
sidebar_order: 20 | ||
description: Learn about the data flow of self-hosted Sentry | ||
--- | ||
|
||
This diagram shows the data flow of self-hosted Sentry. It is similar to the [Application Architecture](/application-architecture/overview/) diagram, but focuses more on the self-hosted components. | ||
|
||
```mermaid | ||
graph LR | ||
kafka@{ shape: cyl, label: "Kafka\n(eventstream)" } | ||
redis@{ shape: cyl, label: "Redis" } | ||
postgres@{ shape: cyl, label: "Postgres" } | ||
memcached@{ shape: cyl, label: "Memcached" } | ||
clickhouse@{ shape: cyl, label: "Clickhouse" } | ||
smtp@{ shape: win-pane, label: "SMTP Server" } | ||
symbol-server@{ shape: win-pane, label: "Public/Private Symbol Servers" } | ||
internet@{ shape: trap-t, label: "Internet" } | ||
|
||
internet --> nginx | ||
|
||
nginx -- Event submitted by SDKs --> relay | ||
nginx -- Web UI & API --> web | ||
|
||
subgraph querier [Event Querier] | ||
snuba-api --> clickhouse | ||
end | ||
|
||
subgraph processing [Event Processing] | ||
kafka --> snuba-consumer --> clickhouse | ||
snuba-consumer --> kafka | ||
kafka --> snuba-replacer --> clickhouse | ||
kafka --> snuba-subscription-scheduler --> clickhouse | ||
kafka --> snuba-subscription-executor --> clickhouse | ||
redis -- As a celery queue --> sentry-consumer | ||
kafka --> sentry-consumer --> kafka | ||
kafka --> sentry-post-process-forwarder --> kafka | ||
sentry-post-process-forwarder -- Preventing concurrent processing of the same event --> redis | ||
|
||
vroom-blob-storage@{ shape: cyl, label: "Blob Storage\n(default is filesystem)" } | ||
|
||
kafka -- Profiling event processing --> vroom -- Republish to Kafka to be consumed by Snuba --> kafka | ||
vroom --> snuba-api | ||
vroom -- Store profiles data --> vroom-blob-storage | ||
|
||
outgoing-monitors@{ shape: win-pane, label: "Outgoing HTTP Monitors" } | ||
redis -- Fetching uptime configs --> uptime-checker -- Publishing uptime monitoring results --> kafka | ||
uptime-checker --> outgoing-monitors | ||
end | ||
|
||
subgraph ui [Web User Interface] | ||
sentry-blob-storage@{ shape: cyl, label: "Blob Storage\n(default is filesystem)" } | ||
|
||
web --> worker | ||
web --> postgres | ||
web -- Caching layer --> memcached | ||
web -- Queries on event (errors, spans, etc) data (to snuba-api) --> snuba-api | ||
web -- Avatars, attachments, etc --> sentry-blob-storage | ||
worker -- As a celery queue --> redis | ||
worker --> postgres | ||
worker -- Alert & digest emails --> smtp | ||
web -- Sending test emails --> smtp | ||
end | ||
|
||
subgraph ingestion [Event Ingestion] | ||
relay@{ shape: rect, label: "Relay" } | ||
sentry_ingest_consumer[sentry-ingest-consumers] | ||
|
||
relay -- Process envelope into specific types --> kafka --> sentry_ingest_consumer -- Caching event data (to redis) --> redis | ||
relay -- Register relay instance --> web | ||
relay -- Fetching project configs (to redis) --> redis | ||
sentry_ingest_consumer -- Symbolicate stack traces --> symbolicator --> symbol-server | ||
sentry_ingest_consumer -- Save event payload to Nodestore --> postgres | ||
sentry_ingest_consumer -- Republish to events topic --> kafka | ||
end | ||
``` | ||
|
||
### Event Ingestion Pipeline | ||
|
||
1. Events from the SDK are sent to the `relay` service. | ||
2. Relay parses the incoming envelope and validates the DSN and project ID. It reads project config data from `redis`. | ||
3. Relay builds a new payload to be consumed by Sentry ingest consumers, and sends it to `kafka`. | ||
4. Sentry `ingest-*` consumers (where the `*` wildcard stands for the event type: errors, transactions, profiles, etc.) consume the event, cache it in `redis`, and start the `preprocess_event` task. | ||
5. The `preprocess_event` task symbolicates stack traces with the `symbolicator` service, and processes the event according to its event type. | ||
6. The `preprocess_event` task saves the event payload to nodestore (the default nodestore backend is `postgres`). | ||
7. The `preprocess_event` task publishes the event to `kafka` under the `events` topic. | ||
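
The seven steps above can be sketched as a toy, in-memory model. This is only an illustration of the ordering of the steps, not Sentry or Relay code: the dictionaries standing in for `redis`, `kafka`, and nodestore, the key formats, the `ingest-<type>` topic naming, and every function name are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the real services.
kafka = defaultdict(list)  # topic name -> list of messages
redis = {"project_config:1": {"dsn_key": "abc"}}  # project configs and event cache
nodestore = {}  # event_id -> payload (Postgres is the default backend)


def relay_handle(envelope):
    """Steps 1-3: validate the DSN and project, then publish to Kafka."""
    config = redis.get(f"project_config:{envelope['project_id']}")
    if config is None or config["dsn_key"] != envelope["dsn_key"]:
        return False  # rejected: unknown project or DSN mismatch
    topic = f"ingest-{envelope['type']}"  # assumed topic naming
    kafka[topic].append(
        {
            "event_id": envelope["event_id"],
            "project_id": envelope["project_id"],
            "payload": envelope["payload"],
        }
    )
    return True


def symbolicate(payload):
    """Placeholder for the request to the symbolicator service."""
    return {**payload, "symbolicated": True}


def ingest_consumer(topic):
    """Steps 4-7: cache, preprocess, save to nodestore, republish."""
    while kafka[topic]:
        msg = kafka[topic].pop(0)
        redis[f"event_cache:{msg['event_id']}"] = msg["payload"]  # step 4
        processed = symbolicate(msg["payload"])  # step 5
        nodestore[msg["event_id"]] = processed  # step 6
        kafka["events"].append(  # step 7
            {"event_id": msg["event_id"], "project_id": msg["project_id"]}
        )
```

Calling `relay_handle(...)` with a matching DSN key and then `ingest_consumer("ingest-errors")` leaves one entry in `nodestore` and one message on the `events` topic, which is the hand-off point to the processing pipeline.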
|
||
### Event Processing Pipeline | ||
|
||
1. The `snuba-consumer` service consumes events from the `events` topic and processes them. After the events are written to ClickHouse, Snuba publishes error and transaction events to the `post-process-forwarder`. | ||
2. The Sentry `post-process-forwarder` consumer consumes messages and spawns a `post_process_group` task for each processed error and issue occurrence. | ||
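
A minimal sketch of these two steps, again using in-memory stand-ins rather than real Kafka, ClickHouse, or Sentry task APIs. The queue names, the event `type` field, and the rule that only errors spawn a task are assumptions for illustration (issue occurrences are left out to keep the example small); the one detail taken from the text is the ordering, where the write to ClickHouse happens before the message is forwarded.

```python
from collections import deque

# Hypothetical stand-ins for Kafka topics, ClickHouse, and the task queue.
events_topic = deque()
forwarder_topic = deque()
clickhouse_rows = []
spawned_tasks = []

# Step 1 says only error and transaction events are forwarded.
FORWARDED_TYPES = {"error", "transaction"}


def snuba_consumer():
    """Write every event to storage, then forward the relevant ones."""
    while events_topic:
        event = events_topic.popleft()
        clickhouse_rows.append(event)  # write first
        if event["type"] in FORWARDED_TYPES:
            forwarder_topic.append(event)  # forward only after the write


def post_process_forwarder():
    """Spawn a post_process_group task per forwarded error (assumed filter)."""
    while forwarder_topic:
        event = forwarder_topic.popleft()
        if event["type"] == "error":
            spawned_tasks.append(("post_process_group", event["event_id"]))
```

Forwarding only after the write means that, by the time a `post_process_group` task runs, the event it refers to should already be queryable.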
|
||
### Web User Interface | ||
|
||
1. The `web` service is what you see: the Django web UI and API that serve Sentry's frontend. | ||
2. The `worker` service mainly consumes tasks from `redis`, which acts as a Celery queue. One notable task is sending emails through the SMTP server. | ||
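
The split between `web` and `worker` can be sketched as a producer and a consumer sharing a queue. This is a simplified model using Python's standard `queue` module, not the Celery API: `web_enqueue`, `worker_drain`, the `TASKS` registry, and the `outbox` list standing in for the SMTP server are all hypothetical names.

```python
import queue

task_queue = queue.Queue()  # stands in for Redis acting as the task queue
outbox = []  # stands in for the SMTP server


def send_email(to, subject):
    """Placeholder task: record the email instead of sending it."""
    outbox.append((to, subject))


# Registry mapping task names to callables (hypothetical).
TASKS = {"send_email": send_email}


def web_enqueue(name, **kwargs):
    """What `web` does: put a task on the queue and return immediately."""
    task_queue.put((name, kwargs))


def worker_drain():
    """What `worker` does: pull tasks off the queue and run them."""
    handled = 0
    while True:
        try:
            name, kwargs = task_queue.get_nowait()
        except queue.Empty:
            return handled
        TASKS[name](**kwargs)
        handled += 1
```

For example, `web_enqueue("send_email", to="dev@example.com", subject="Alert")` returns without sending anything; the email only reaches `outbox` once `worker_drain()` runs, which is why slow work such as email delivery does not block web requests.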
|
Would be nice to note why we do this? Like where do they go after?
I have no idea.