feat: document client: more filetypes #404

cebtenzzre · 2025-11-11T00:24:07Z

Important

Enhance upload_file() in nomic/client.py to support more file types using filetype, update tests and CI configuration accordingly.

Behavior:
- upload_file() in nomic/client.py now supports more file types (Office documents, images) using _detect_content_type().
- _detect_content_type() added to nomic/client.py for MIME type detection using filetype library.
Testing:
- New test_ci target in makefile excludes tests marked with atlas.
- Added atlas marker to multiple tests in tests/test_atlas_client.py.
CI/CD:
- .circleci/config.yml updated to use make test_ci for unit tests.
Dependencies:
- Added filetype to install_requires in setup.py.

^{This description was created by}^{for 115095a. You can customize this summary. It will automatically update as commits are pushed.}

cebtenzzre · 2025-11-11T00:24:22Z

feat: document client: more filetypes #404 👈 (View in Graphite)
main

This stack of pull requests is managed by Graphite. Learn more about stacking.

ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to bd231ae in 1 minute and 55 seconds. Click for details.

Reviewed 113 lines of code in 2 files
Skipped 0 files when reviewing.
Skipped posting 5 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. nomic/client.py:160

Draft comment:
Ensure filetype.guess_mime() handles pathlib.Path; consider using str(path) for compatibility.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 70% None

2. nomic/client.py:167

Draft comment:
Clarify ZIP archive handling: non-Office ZIP files trigger a ValueError. Update the docstring if this behavior is intentional.
Reason this comment was not posted:
Confidence changes required: 66% <= threshold 70% None

3. nomic/client.py:201

Draft comment:
Rename variable 'pdf_file' to a more generic name (e.g., file_obj) since the file may not be a PDF.
Reason this comment was not posted:
Comment was on unchanged code.

4. setup.py:41

Draft comment:
Good explicit version constraint for 'filetype'. Consider adding a comment explaining the rationale behind the version range.
Reason this comment was not posted:
Confidence changes required: 33% <= threshold 70% None

5. nomic/client.py:145

Draft comment:
Typo: "docling's approach" might be a misspelling. Please confirm if it should be "docile's approach" or another term.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 5% vs. threshold = 70% "Docling" is likely a reference to a specific library or tool used for document processing. A quick search would reveal that "docling" is indeed a real document processing library. The automated tool appears to have incorrectly flagged this as a typo when it's actually a proper noun/technical term. The suggestion to change it to "docile's approach" would be nonsensical. This comment is incorrect and should be deleted. I might be wrong if "docling" is truly a typo and the author meant something else. However, given the technical context and the fact that the code is implementing MIME type detection with specific handling for Office formats, it's highly likely this is a reference to an actual library or methodology. Even if I'm uncertain about whether "docling" is a real library, the suggested correction to "docile's approach" makes no sense in this technical context. The comment is not providing value and is likely incorrect. Additionally, this is just a comment/documentation change, not a code logic issue, and the rules state not to make comments that are obvious or unimportant. This comment should be deleted. It incorrectly flags "docling" as a typo when it appears to be a legitimate technical term or library reference. The suggested correction "docile's approach" would be nonsensical in this context.

Workflow ID: wflow_oBTF6zrEX7DZ2A1k

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev

Important

Looks good to me! 👍

Reviewed 115095a in 1 minute and 57 seconds. Click for details.

Reviewed 188 lines of code in 4 files
Skipped 0 files when reviewing.
Skipped posting 5 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. .circleci/config.yml:42

Draft comment:
Changed test command from 'make test' to 'make test_ci', which filters out tests marked with 'atlas'. Please confirm that skipping integration tests on CI is intentional and consider documenting this decision.
Reason this comment was not posted:
Comment did not seem useful. Confidence is useful = 30% <= threshold 70% The comment is asking the PR author to confirm their intention regarding skipping integration tests on CI. This violates the rule against asking for confirmation of intention. However, it also suggests documenting the decision, which is a specific and actionable suggestion. The comment is partially useful but mostly violates the rules.

2. makefile:40

Draft comment:
New 'test_ci' target excludes 'atlas' integration tests using '-m "not atlas"'. It would be helpful to add a comment explaining that these tests depend on external services and are excluded from CI by design.
Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 70% This comment is asking the author to add a comment/documentation explaining the purpose of the test exclusion. However, the rules explicitly state "Do NOT comment unless there is clearly a code change required. Do not make purely informative comments." The code itself is fairly self-explanatory - it's a CI test target that excludes atlas tests. The naming convention (test_ci vs test) and the pytest marker usage make the intent reasonably clear. Additionally, looking at the full file context, I can see other CI-related targets like black_ci, isort_ci, pyright_ci which follow a similar naming pattern, suggesting this is a consistent pattern in the codebase. The comment is essentially asking for documentation/explanation rather than pointing out a functional issue. This violates the rule about not making purely informative comments. While the code might benefit from a comment explaining why atlas tests are excluded, the naming and context make it reasonably clear. The comment could be considered helpful for future maintainers, and the rule about "Extremely Complicated Code Needs Comments" might apply if this is confusing. The code is not "extremely complicated" - it's a simple pytest command with a marker exclusion. The naming convention (test_ci) and the marker usage are standard pytest practices. This is more about documentation preference than a necessary code change. The rules prioritize actionable code changes over documentation requests. This comment should be deleted because it's requesting documentation/explanation rather than pointing out a functional issue or necessary code change. The code is self-explanatory enough through naming conventions and standard pytest practices.

3. pyproject.toml:13

Draft comment:
The addition of the 'atlas' marker in pytest config is good. Consider expanding the documentation (e.g., in the developer guide) on when to use the '@pytest.mark.atlas' marker.
Reason this comment was not posted:
Confidence changes required: 20% <= threshold 70% None

4. tests/test_atlas_client.py:30

Draft comment:
Integration test functions now include '@pytest.mark.atlas'. If the majority of tests in this file are integration tests, consider applying a module-level marker (e.g., 'pytestmark = pytest.mark.atlas') to reduce repetition and improve maintainability. Also, ensure consistent naming (e.g., differentiate between integration tests and mocked tests).
Reason this comment was not posted:
Confidence changes required: 70% <= threshold 70% None

5. tests/test_atlas_client.py:49

Draft comment:
Tests using 'pytest.raises(Exception)' might catch too broad an exception. If possible, consider using a more specific exception type to ensure that only the expected error is caught.
Reason this comment was not posted:
Confidence changes required: 70% <= threshold 70% None

Workflow ID: wflow_F90OyLRY96tJKN1z

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

document client: more filetypes

bd231ae

cebtenzzre changed the title ~~document client: more filetypes~~ feat: document client: more filetypes Nov 11, 2025

cebtenzzre marked this pull request as ready for review November 11, 2025 00:31

ellipsis-dev bot reviewed Nov 11, 2025

View reviewed changes

disable atlas tests in ci

115095a

ellipsis-dev bot reviewed Nov 11, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: document client: more filetypes #404

feat: document client: more filetypes #404

cebtenzzre commented Nov 11, 2025 •

edited by ellipsis-dev bot

Loading

Uh oh!

cebtenzzre commented Nov 11, 2025

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

ellipsis-dev bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: document client: more filetypes #404

Are you sure you want to change the base?

feat: document client: more filetypes #404

Conversation

cebtenzzre commented Nov 11, 2025 • edited by ellipsis-dev bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cebtenzzre commented Nov 11, 2025

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

ellipsis-dev bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

cebtenzzre commented Nov 11, 2025 •

edited by ellipsis-dev bot

Loading