-
Notifications
You must be signed in to change notification settings - Fork 204
feat: document client: more filetypes #404
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed everything up to bd231ae in 1 minute and 55 seconds. Click for details.
- Reviewed
113lines of code in2files - Skipped
0files when reviewing. - Skipped posting
5draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. nomic/client.py:160
- Draft comment:
Ensure filetype.guess_mime() handles pathlib.Path; consider using str(path) for compatibility. - Reason this comment was not posted:
Confidence changes required:66%<= threshold70%None
2. nomic/client.py:167
- Draft comment:
Clarify ZIP archive handling: non-Office ZIP files trigger a ValueError. Update the docstring if this behavior is intentional. - Reason this comment was not posted:
Confidence changes required:66%<= threshold70%None
3. nomic/client.py:201
- Draft comment:
Rename variable 'pdf_file' to a more generic name (e.g., file_obj) since the file may not be a PDF. - Reason this comment was not posted:
Comment was on unchanged code.
4. setup.py:41
- Draft comment:
Good explicit version constraint for 'filetype'. Consider adding a comment explaining the rationale behind the version range. - Reason this comment was not posted:
Confidence changes required:33%<= threshold70%None
5. nomic/client.py:145
- Draft comment:
Typo: "docling's approach" might be a misspelling. Please confirm if it should be "docile's approach" or another term. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 5% vs. threshold = 70% "Docling" is likely a reference to a specific library or tool used for document processing. A quick search would reveal that "docling" is indeed a real document processing library. The automated tool appears to have incorrectly flagged this as a typo when it's actually a proper noun/technical term. The suggestion to change it to "docile's approach" would be nonsensical. This comment is incorrect and should be deleted. I might be wrong if "docling" is truly a typo and the author meant something else. However, given the technical context and the fact that the code is implementing MIME type detection with specific handling for Office formats, it's highly likely this is a reference to an actual library or methodology. Even if I'm uncertain about whether "docling" is a real library, the suggested correction to "docile's approach" makes no sense in this technical context. The comment is not providing value and is likely incorrect. Additionally, this is just a comment/documentation change, not a code logic issue, and the rules state not to make comments that are obvious or unimportant. This comment should be deleted. It incorrectly flags "docling" as a typo when it appears to be a legitimate technical term or library reference. The suggested correction "docile's approach" would be nonsensical in this context.
Workflow ID: wflow_oBTF6zrEX7DZ2A1k
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Important
Looks good to me! 👍
Reviewed 115095a in 1 minute and 57 seconds. Click for details.
- Reviewed
188lines of code in4files - Skipped
0files when reviewing. - Skipped posting
5draft comments. View those below. - Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. .circleci/config.yml:42
- Draft comment:
Changed test command from 'make test' to 'make test_ci', which filters out tests marked with 'atlas'. Please confirm that skipping integration tests on CI is intentional and consider documenting this decision. - Reason this comment was not posted:
Comment did not seem useful. Confidence is useful =30%<= threshold70%The comment is asking the PR author to confirm their intention regarding skipping integration tests on CI. This violates the rule against asking for confirmation of intention. However, it also suggests documenting the decision, which is a specific and actionable suggestion. The comment is partially useful but mostly violates the rules.
2. makefile:40
- Draft comment:
New 'test_ci' target excludes 'atlas' integration tests using '-m "not atlas"'. It would be helpful to add a comment explaining that these tests depend on external services and are excluded from CI by design. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 70% This comment is asking the author to add a comment/documentation explaining the purpose of the test exclusion. However, the rules explicitly state "Do NOT comment unless there is clearly a code change required. Do not make purely informative comments." The code itself is fairly self-explanatory - it's a CI test target that excludes atlas tests. The naming convention (test_ci vs test) and the pytest marker usage make the intent reasonably clear. Additionally, looking at the full file context, I can see other CI-related targets likeblack_ci,isort_ci,pyright_ciwhich follow a similar naming pattern, suggesting this is a consistent pattern in the codebase. The comment is essentially asking for documentation/explanation rather than pointing out a functional issue. This violates the rule about not making purely informative comments. While the code might benefit from a comment explaining why atlas tests are excluded, the naming and context make it reasonably clear. The comment could be considered helpful for future maintainers, and the rule about "Extremely Complicated Code Needs Comments" might apply if this is confusing. The code is not "extremely complicated" - it's a simple pytest command with a marker exclusion. The naming convention (test_ci) and the marker usage are standard pytest practices. This is more about documentation preference than a necessary code change. The rules prioritize actionable code changes over documentation requests. This comment should be deleted because it's requesting documentation/explanation rather than pointing out a functional issue or necessary code change. The code is self-explanatory enough through naming conventions and standard pytest practices.
3. pyproject.toml:13
- Draft comment:
The addition of the 'atlas' marker in pytest config is good. Consider expanding the documentation (e.g., in the developer guide) on when to use the '@pytest.mark.atlas' marker. - Reason this comment was not posted:
Confidence changes required:20%<= threshold70%None
4. tests/test_atlas_client.py:30
- Draft comment:
Integration test functions now include '@pytest.mark.atlas'. If the majority of tests in this file are integration tests, consider applying a module-level marker (e.g., 'pytestmark = pytest.mark.atlas') to reduce repetition and improve maintainability. Also, ensure consistent naming (e.g., differentiate between integration tests and mocked tests). - Reason this comment was not posted:
Confidence changes required:70%<= threshold70%None
5. tests/test_atlas_client.py:49
- Draft comment:
Tests using 'pytest.raises(Exception)' might catch too broad an exception. If possible, consider using a more specific exception type to ensure that only the expected error is caught. - Reason this comment was not posted:
Confidence changes required:70%<= threshold70%None
Workflow ID: wflow_F90OyLRY96tJKN1z
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.

Important
Enhance
upload_file()innomic/client.pyto support more file types usingfiletype, update tests and CI configuration accordingly.upload_file()innomic/client.pynow supports more file types (Office documents, images) using_detect_content_type()._detect_content_type()added tonomic/client.pyfor MIME type detection usingfiletypelibrary.test_citarget inmakefileexcludes tests marked withatlas.atlasmarker to multiple tests intests/test_atlas_client.py..circleci/config.ymlupdated to usemake test_cifor unit tests.filetypetoinstall_requiresinsetup.py.This description was created by
for 115095a. You can customize this summary. It will automatically update as commits are pushed.