Skip to content

[importer] Introduce new Importer module with separate Configs, API Endpoints, and Dependencies #4089

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
May 30, 2025
Merged
Changes from 1 commit
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
75defec
[importer] Add new component and API endpoint with new directory stru…
Harshg999 Apr 1, 2025
935ca5d
[importer] Implement file upload API for CSV and Excel formats with v…
Harshg999 Apr 8, 2025
1ce2856
Refactor importer API: remove unused import and delete obsolete templ…
Harshg999 Apr 8, 2025
c20623e
Refactors file format detection and metadata extraction
Harshg999 Apr 25, 2025
db2a4ec
Add file metadata detection and update dependencies
Harshg999 Apr 25, 2025
6ba570f
Refactors file upload API for better separation of concerns
Harshg999 Apr 29, 2025
5552e78
Refactor file metadata detection API and improve efficiency
Harshg999 Apr 29, 2025
b901255
Improves file metadata extraction and error handling
Harshg999 Apr 30, 2025
48f07aa
Improves file type detection with graceful magic lib fallback
Harshg999 Apr 30, 2025
c7a2a11
Adds file preview API for data import functionality
Harshg999 May 5, 2025
4810c36
Add coerce_bool handling for has_header parameter
Harshg999 May 13, 2025
c9a2760
Enhance field separator handling in preview_file API with unicode dec…
Harshg999 May 20, 2025
880fbb5
Refactor file metadata handling by introducing GuessFileMetadataSeria…
Harshg999 May 21, 2025
e1af698
Refactor file preview functionality by introducing PreviewFileSeriali…
Harshg999 May 21, 2025
70e7fd7
Add has_header field to PreviewFileSerializer for explicit header det…
Harshg999 May 21, 2025
307fe3f
Uncomment old code
Harshg999 May 21, 2025
c312738
Change variable name to uploaded_file
Harshg999 May 21, 2025
3732baf
fix docstring
Harshg999 May 21, 2025
4d163db
Remove unnecessary blank line before api_error_handler function
Harshg999 May 21, 2025
06b5d0a
Refactor comments
Harshg999 May 22, 2025
323fd19
Add configurable importer restrictions and settings
Harshg999 May 27, 2025
1d4ff77
Improves Excel sheet name extraction performance
Harshg999 May 27, 2025
6baa522
Add API and logic for file header row detection
Harshg999 May 28, 2025
590f772
Add API for mapping Polars types to SQL types
Harshg999 May 28, 2025
708e75a
Sort req packages
Harshg999 May 28, 2025
5a9832b
Add unit tests
Harshg999 May 29, 2025
e235b65
Refactor import statements for better organization and clarity
Harshg999 May 29, 2025
dba5f8f
Remove redundant validate methods from GuessFileMetadataSerializer an…
Harshg999 May 30, 2025
a023370
Change "data file importer" to just "importer"
Harshg999 May 30, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Prev Previous commit
Next Next commit
Enhance field separator handling in preview_file API with unicode dec…
…oding and error logging
  • Loading branch information
Harshg999 committed May 28, 2025
commit c9a2760d7a063a56cd5b0ece2788041e47505fac
9 changes: 8 additions & 1 deletion desktop/core/src/desktop/lib/importer/api.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
import re
import csv
import uuid
import codecs
import logging
import tempfile
from functools import wraps
Expand Down Expand Up @@ -262,7 +263,13 @@ def preview_file(request: Request) -> Response:

result = _preview_excel_file(fh, file_type, sheet_name, sql_dialect, has_header)
elif file_type in ['csv', 'tsv', 'delimiter_format']:
field_separator = request.query_params.get('field_separator', ',')
field_sep = request.query_params.get('field_separator', ',')
try:
field_separator = codecs.decode(field_sep, 'unicode_escape')
except Exception as e:
LOG.error(f"Error decoding field_separator: {e}", exc_info=True)
return Response({'error': "Invalid field_separator provided."}, status=status.HTTP_400_BAD_REQUEST)

quote_char = request.query_params.get('quote_char', '"')
record_separator = request.query_params.get('record_separator', '\n')

Expand Down