Skip to content

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented May 13, 2025

Rationale for this change

This allows users only to fetch the snapshots that are referenced by a tag or a branch.

Are these changes tested?

Are there any user-facing changes?

@Fokko Fokko requested a review from kevinjqliu May 13, 2025 16:28
Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines +683 to +687
if mode := self.properties.get(SNAPSHOT_LOADING_MODE):
if mode in {"all", "refs"}:
params["snapshots"] = mode
else:
raise ValueError("Invalid snapshot-loading-mode: {}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def load_table(self, identifier: Union[str, Identifier]) -> Table:
response = self._session.get(self.url(Endpoints.load_table, prefixed=True, **self._split_identifier_for_path(identifier)))
params = {}
if mode := self.properties.get(SNAPSHOT_LOADING_MODE):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to add this as a parameter to load_table that takes preference over session-level properties? I guess it's not part of the base catalog interface but 🤷🏻‍♂️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good seeing you here @corleyma, and I share your concern here. In some situations, we want to override this as well:

  • To enable time-travel.
  • In the case of a write, we want to know about all the snapshots to see if there are any conflicts.

I'm open to adding an option to the base interface, however, it does not make a lot of sense in the case of non-REST catalogs where we have to read the JSON anyway.

Copy link

@corleyma corleyma May 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for coming back to this late.

Maybe the base load_table interface should just change to accept kwargs, and we can add here some implementation-specific kwargs for REST catalog?

@Fokko Fokko merged commit b8e5a4b into apache:main May 13, 2025
11 checks passed
@Fokko Fokko deleted the fd-add-loading-mode branch May 13, 2025 21:54
gabeiglio pushed a commit to Netflix/iceberg-python that referenced this pull request Aug 13, 2025
<!--
Thanks for opening a pull request!
-->

<!-- In the case this PR will resolve an issue, please replace
${GITHUB_ISSUE_ID} below with the actual Github issue id. -->
<!-- Closes #${GITHUB_ISSUE_ID} -->

# Rationale for this change

This allows users only to fetch the snapshots that are referenced by a
tag or a branch.

# Are these changes tested?

# Are there any user-facing changes?

<!-- In the case of user-facing changes, please add the changelog label.
-->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants