Translog architecture guide Distributed team #126416

kingherc · 2025-04-07T15:37:59Z

Closes ES-7879

elasticsearchmachine · 2025-04-07T15:41:12Z

Pinging @elastic/es-docs (Team:Docs)

elasticsearchmachine · 2025-04-07T15:41:12Z

Pinging @elastic/es-distributed-indexing (Team:Distributed Indexing)

JeremyDahlgren

As a newbie consumer this looks great, very helpful, thank you Iraklis.

kingherc

Thank you all for the feedback! Feel free to review again.

kingherc · 2025-04-21T16:25:31Z

Dear reviewers, handled all feedback, gentle reminder for review!

JeremyDahlgren

LGTM

nicktindall

LGTM

One thing to consider, you can declare the links outside of the text and reference them by name inside the text so that they're not so disruptive when reading the MD in plain-text form.

See example here.

kingherc · 2025-04-22T08:13:37Z

Thanks @JeremyDahlgren @nicktindall !

> One thing to consider, you can declare the links outside of the text and reference them by name inside the text so that they're not so disruptive when reading the MD in plain-text form.

Unfortunately does not work if I'd like to make the linked text a code style as well (with the backticks). It only works for regular styled text. :(

nicktindall · 2025-04-22T08:27:13Z

> Thanks @JeremyDahlgren @nicktindall !
>
> > One thing to consider, you can declare the links outside of the text and reference them by name inside the text so that they're not so disruptive when reading the MD in plain-text form.
>
> Unfortunately does not work if I'd like to make the linked text a code style as well (with the backticks). It only works for regular styled text. :(

You might be able to do (if you haven't tried already, I used this elsewhere to have text other than the link ID inline)

[`ClassName`][LinkIDForClassName]

Still a bit clunky but better than the full link inline IMO

kingherc

> Still a bit clunky but better than the full link inline IMO

Thanks @nicktindall that works! Replaced all links as such.

DaveCTurner

All good, just some suggestions.

DaveCTurner · 2025-04-25T09:56:51Z

docs/internal/DistributedArchitectureGuide.md

+A [`MultiSnapshot`] can be used to iterate operations over multiple [`TranslogSnapshot`]s.
+A [`TranslogWriter`] can be used to write operations to the translog.
+
+#### Real-time GETs from the translog


I know the Recovery section isn't written yet, but may be worth at least linking to it here and saying something about how we replay the operations from the translog during recovery by just reading them sequentially.

DaveCTurner · 2025-04-25T09:59:14Z

docs/internal/DistributedArchitectureGuide.md

+Each translog is a sequence of files, each identified by a translog generation ID, each containing a sequence of operations, with the last file open for writes.
+The last file has a part which has been fsync'ed to disk, and a part which has been written but not necessarily fsync'ed yet to disk.
+Each operation is identified by a sequence number (`seqno`), which is monotonically increased by the engine's ingestion functionality.
+A [`Checkpoint`] file is also maintained, that contains, among other information, the current translog generation ID, and its last fsync'ed operation and location, the minimum translog generation ID, and the minimum and maximum sequence number of operations the sequence of translog generations include.


Could you explain here why the separate Checkpoint is necessary?

kingherc

Thanks @DaveCTurner ! Feel free to review again.

kingherc · 2025-04-25T12:21:23Z

docs/internal/DistributedArchitectureGuide.md

+Each translog is a sequence of files, each identified by a translog generation ID, each containing a sequence of operations, with the last file open for writes.
+The last file has a part which has been fsync'ed to disk, and a part which has been written but not necessarily fsync'ed yet to disk.
+Each operation is identified by a sequence number (`seqno`), which is monotonically increased by the engine's ingestion functionality.
+A [`Checkpoint`] file is also maintained, that contains, among other information, the current translog generation ID, and its last fsync'ed operation and location, the minimum translog generation ID, and the minimum and maximum sequence number of operations the sequence of translog generations include.


kingherc · 2025-04-25T12:28:32Z

docs/internal/DistributedArchitectureGuide.md

+A [`MultiSnapshot`] can be used to iterate operations over multiple [`TranslogSnapshot`]s.
+A [`TranslogWriter`] can be used to write operations to the translog.
+
+#### Real-time GETs from the translog


Good point, added in the introduction:

, so they can be replayed by just reading them sequentially from the translog during recovery in the event of ephemeral failures such as a crash or power loss.
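To make the "replayed by just reading them sequentially" point concrete, here is a minimal sketch of replay over several translog generations. It is only an illustration, not Elasticsearch code: `Operation`, `read_sequentially`, `replay`, and the `doc_id`/`source` fields are hypothetical names, and the "higher seqno wins" rule is an assumption of this toy model.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Operation:
    seqno: int    # sequence number identifying the operation
    doc_id: str   # hypothetical: which document the operation targets
    source: str   # hypothetical: the document body to index


def read_sequentially(generations: Iterable[List[Operation]]) -> Iterator[Operation]:
    """Yield operations file by file, in the order they were written."""
    for generation in generations:
        yield from generation


def replay(generations: Iterable[List[Operation]]) -> Dict[str, Tuple[int, str]]:
    """Rebuild a doc_id -> (seqno, source) map from the translog contents."""
    state: Dict[str, Tuple[int, str]] = {}
    for op in read_sequentially(generations):
        current = state.get(op.doc_id)
        # Toy-model assumption: a higher seqno wins, so an entry that was
        # written out of seqno order cannot overwrite a newer one.
        if current is None or op.seqno > current[0]:
            state[op.doc_id] = (op.seqno, op.source)
    return state


generations = [
    [Operation(0, "a", "v1"), Operation(2, "a", "v3")],  # older generation
    [Operation(1, "a", "v2"), Operation(3, "b", "v1")],  # newest generation
]
recovered = replay(generations)
```

The generator plays the role of a single iterator over multiple per-file snapshots: the caller sees one stream and never needs to know where one generation ends and the next begins.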

DaveCTurner · 2025-04-25T12:38:01Z

docs/internal/DistributedArchitectureGuide.md

+The last file has a part which has been fsync'ed to disk, and a part which has been written but not necessarily fsync'ed yet to disk.
+Each operation is identified by a sequence number (`seqno`), which is monotonically increased by the engine's ingestion functionality.
+Typically the entries in the translog are in increasing order of their sequence number, but not necessarily.
+A [`Checkpoint`] file is also maintained, which is written on each fsync operation of the translog, and records important metadata and statistics about the translog, such as the current translog generation ID, its last fsync'ed operation and location, the minimum translog generation ID, and the minimum and maximum sequence number of operations the sequence of translog generations include, all of which are useful to identify the translog operations needed to be replayed upon recovery.


It's more than just "useful", this is essential for correctness. The Checkpoint records the last fsynced location in the translog file, i.e. it is safe to read up to the location in the checkpoint but beyond that point are only dragons.

Reworded, thanks!
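The correctness argument above can be sketched as follows: the checkpoint is updated only after the data it covers has been fsync'ed, and a reader never goes past the recorded offset. This is a simplified sketch under stated assumptions, not the real `Checkpoint` implementation: the `ToyTranslog` class, the JSON checkpoint format, the length-prefixed record layout, and the file names are all illustrative.

```python
import json
import os
import tempfile


class ToyTranslog:
    """One generation file plus a checkpoint file (hypothetical layout)."""

    def __init__(self, directory: str) -> None:
        self.log_path = os.path.join(directory, "translog-1.tlog")
        self.ckp_path = os.path.join(directory, "translog.ckp")
        self.log = open(self.log_path, "ab")

    def add(self, payload: bytes) -> None:
        # Written, but not necessarily durable until sync() is called.
        self.log.write(len(payload).to_bytes(4, "big") + payload)

    def sync(self) -> None:
        # 1. Make the operations themselves durable.
        self.log.flush()
        os.fsync(self.log.fileno())
        offset = self.log.tell()
        # 2. Only then publish the new safe offset, atomically.
        tmp_path = self.ckp_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"offset": offset}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.ckp_path)


def read_safe(translog: ToyTranslog) -> list:
    """Read operations, stopping at the offset recorded in the checkpoint."""
    with open(translog.ckp_path) as f:
        limit = json.load(f)["offset"]
    ops = []
    with open(translog.log_path, "rb") as f:
        while f.tell() < limit:
            size = int.from_bytes(f.read(4), "big")
            ops.append(f.read(size))
    return ops


with tempfile.TemporaryDirectory() as directory:
    translog = ToyTranslog(directory)
    translog.add(b"op-0")
    translog.add(b"op-1")
    translog.sync()
    translog.add(b"op-2")   # written after the last sync
    translog.log.flush()    # present in the file, but beyond the checkpoint
    safe_ops = read_safe(translog)
    translog.log.close()
```

In this sketch `op-2` is physically in the file when `read_safe` runs, yet it is not returned, because bytes past the checkpointed offset may be partial or missing after a crash.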

DaveCTurner

LGTM (could probably iterate on this forever but this is a great start)

Translog architecture guide Distributed team

7b47e98

Closes ES-7879

kingherc added the >docs, >non-issue, :Distributed Indexing/Engine, Team:Distributed Indexing, and v9.1.0 labels Apr 7, 2025

kingherc self-assigned this Apr 7, 2025

kingherc marked this pull request as ready for review April 7, 2025 15:40

elasticsearchmachine added the Team:Docs label Apr 7, 2025

kingherc requested review from BrianRothermich, JeremyDahlgren, Tim-Brooks and arteam April 7, 2025 15:41

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

3ef5edc

…nslog-guide

JeremyDahlgren reviewed Apr 8, 2025

nicktindall reviewed Apr 8, 2025

nicktindall reviewed Apr 8, 2025

nicktindall reviewed Apr 9, 2025

nicktindall reviewed Apr 9, 2025

kingherc added 4 commits April 14, 2025 09:58

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

ac038f2

…nslog-guide

Added links

37fd83d

PR comments

231e939

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

c8722f7

…nslog-guide

kingherc commented Apr 14, 2025

kingherc requested review from nicktindall, ywangd and JeremyDahlgren April 14, 2025 17:16

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

7e44680

…nslog-guide

kingherc added 3 commits April 17, 2025 14:08

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

d176ba6

…nslog-guide

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

805a3f6

…nslog-guide

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

6f0c73b

…nslog-guide

JeremyDahlgren approved these changes Apr 21, 2025

nicktindall approved these changes Apr 22, 2025

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

c04966d

…nslog-guide

Typo

03e6a7e

DaveCTurner reviewed Apr 22, 2025

kingherc added 4 commits April 23, 2025 12:12

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

ce767ad

…nslog-guide

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

09ceff8

…nslog-guide

PR comments

9b5994d

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

f4a3aa8

…nslog-guide

kingherc commented Apr 25, 2025

kingherc requested a review from DaveCTurner April 25, 2025 09:47

DaveCTurner reviewed Apr 25, 2025

kingherc added 2 commits April 25, 2025 15:13

Merge remote-tracking branch 'origin/main' into non-issue/ES-7879-tra…

9c81a71

…nslog-guide

PR comments

742d7a7

kingherc commented Apr 25, 2025

kingherc requested a review from DaveCTurner April 25, 2025 12:31

DaveCTurner reviewed Apr 25, 2025

DaveCTurner approved these changes Apr 25, 2025

PR comments

c623c8d

kingherc merged commit fd7d973 into elastic:main Apr 25, 2025
16 checks passed

kingherc deleted the non-issue/ES-7879-translog-guide branch April 25, 2025 13:50

ywangd added a commit to ywangd/elasticsearch that referenced this pull request Apr 28, 2025

Remove translog from javadoc for acquireHistoryRetentionLock

192bc0c

Relates: elastic#126416

ywangd mentioned this pull request Apr 28, 2025

Remove translog from javadoc for acquireHistoryRetentionLock #127454

Merged

ywangd added a commit that referenced this pull request Apr 28, 2025

Remove translog from javadoc for acquireHistoryRetentionLock (#127454)

9106a44

Relates: #126416
