Python Data Formats Software

View related business solutions

Browse free open source Python Data Formats Software and projects below. Use the toggles on the left to filter open source Python Data Formats Software by OS, license, language, programming language, and project status.

  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    Build gen AI apps with an all-in-one modern database: MongoDB Atlas

    MongoDB Atlas provides built-in vector search and a flexible document model so developers can build, scale, and run gen AI apps without stitching together multiple databases. From LLM integration to semantic search, Atlas simplifies your AI architecture—and it’s free to get started.
    Start Free
  • No-Nonsense Code-to-Cloud Security for Devs | Aikido Icon
    No-Nonsense Code-to-Cloud Security for Devs | Aikido

    Connect your GitHub, GitLab, Bitbucket, or Azure DevOps account to start scanning your repos for free.

    Aikido provides a unified security platform for developers, combining 12 powerful scans like SAST, DAST, and CSPM. AI-driven AutoFix and AutoTriage streamline vulnerability management, while runtime protection blocks attacks.
    Start for Free
  • 1
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 69 This Week
    Last Update:
    See Project
  • 2
    PdfBooklet
    PdfBooklet is a Python Gtk application which allows to make books or booklets from existing pdf files. It can also adjust margins, rotate, scale, merge files or extract pages.
    Leader badge
    Downloads: 398 This Week
    Last Update:
    See Project
  • 3
    lxml

    lxml

    The lxml XML toolkit for Python

    A Python library for efficient XML and HTML processing, known for speed and compatibility. The lxml XML toolkit is a Pythonic binding for the C libraries libxml2 and libxslt. It is unique in that it combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, mostly compatible but superior to the well-known ElementTree API. The latest release works with all CPython versions from 3.6 to 3.12. See the introduction for more information about the background and goals of the lxml project.
    Downloads: 45 This Week
    Last Update:
    See Project
  • 4
    Asymptote

    Asymptote

    2D & 3D TeX-Aware Vector Graphics Language

    Asymptote is a powerful descriptive vector graphics language for technical drawing, inspired by MetaPost but with an improved C++-like syntax. Asymptote provides for figures the same high-quality typesetting that LaTeX does for scientific text.
    Leader badge
    Downloads: 172 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 5
    Grassroots DICOM

    Grassroots DICOM

    Cross-platform DICOM implementation

    Grassroots DiCoM is a C++ library for DICOM medical files. It is accessible from Python, C#, Java and PHP. It supports RAW, JPEG, JPEG 2000, JPEG-LS, RLE and deflated transfer syntax. It comes with a super fast scanner implementation to quickly scan hundreds of DICOM files. It supports SCU network operations (C-ECHO, C-FIND, C-STORE, C-MOVE). PS 3.3 & 3.6 are distributed as XML files. It also provides PS 3.15 certificates and password based mecanism to anonymize and de-identify DICOM datasets.
    Leader badge
    Downloads: 192 This Week
    Last Update:
    See Project
  • 6
    Biosignal Tools
    BioSig is a software library for processing of biomedical signals (EEG, ECG, etc.) with Matlab, Octave, C/C++ and Python. About 50 different data formats are supported.
    Leader badge
    Downloads: 157 This Week
    Last Update:
    See Project
  • 7
    Pix2Text

    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical formulas, and integrate all of these contents into Markdown format. P2T can also convert an entire PDF file (which can contain scanned images or any other format) into Markdown format.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 8
    SQLModel

    SQLModel

    SQL databases in Python, designed for simplicity, compatibility

    SQLModel, SQL databases in Python, designed for simplicity, compatibility, and robustness. SQLModel is a library for interacting with SQL databases from Python code, with Python objects. It is designed to be intuitive, easy to use, highly compatible, and robust. SQLModel is based on Python-type annotations, and powered by Pydantic and SQLAlchemy.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 9
    LaTeX Cookbook

    LaTeX Cookbook

    A comprehensive LaTeX template with examples for theses, books, etc.

    This repo contains a LaTeX document, usable as a cookbook (different "recipes" to achieve various things in LaTeX) as well as a template. The resulting PDF covers LaTeX-specific topics and instructions on compiling the LaTeX source.
    Downloads: 9 This Week
    Last Update:
    See Project
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    PDF-Shuffler
    PDF-Shuffler is a small python-gtk application, which helps the user to merge or split pdf documents and rotate, crop and rearrange their pages using an interactive and intuitive graphical interface. It is a frontend for python-pyPdf.
    Downloads: 39 This Week
    Last Update:
    See Project
  • 11
    Great Expectations

    Great Expectations

    Always know what to expect from your data

    Great Expectations helps data teams eliminate pipeline debt, through data testing, documentation, and profiling. Software developers have long known that testing and documentation are essential for managing complex codebases. Great Expectations brings the same confidence, integrity, and acceleration to data science and data engineering teams. Expectations are assertions for data. They are the workhorse abstraction in Great Expectations, covering all kinds of common data issues. Expectations are a great start, but it takes more to get to production-ready data validation. Where are Expectations stored? How do they get updated? How do you securely connect to production data systems? How do you notify team members and triage when data validation fails? Great Expectations supports all of these use cases out of the box. Instead of building these components for yourself over weeks or months, you will be able to add production-ready validation to your pipeline in a day.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 12
    TOML

    TOML

    Tom Preston-Werner's obvious, minimal language

    Tom's Obvious, Minimal Language. By Tom Preston-Werner, Pradyun Gedam, et al. TOML aims to be a minimal configuration file format that's easy to read due to obvious semantics. TOML is designed to map unambiguously to a hash table. TOML should be easy to parse into data structures in a wide variety of languages. TOML shares traits with other file formats used for application configuration and data serialization, such as YAML and JSON. TOML and JSON both are simple and use ubiquitous data types, making them easy to code for or parse with machines. TOML and YAML both emphasize human readability features, like comments that make it easier to understand the purpose of a given line. TOML differs in combining these, allowing comments (unlike JSON) but preserving simplicity (unlike YAML). Because TOML is explicitly intended as a configuration file format, parsing it is easy, but it is not intended for serializing arbitrary data structures.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 13
    FreeTAKServer

    FreeTAKServer

    Situational Awareness Server compatible with TAK clients

    FTS is a Python3 implementation of a TAK Server for devices like ATAK, WinTAK, and ITAK, it is cross-platform and runs from a multi-node installation on AWS down to the Android edition. It's free and open source (released under the Eclipse Public License. FTS allows you to connect ATAK clients to share geo-information, to chat with all the connected clients, exchange files and more. It intends to support all the major use cases of the original TAK server.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 14
    JC

    JC

    CLI tool and python library

    CLI tool and python library that converts the output of popular command-line tools and file types to JSON or Dictionaries. This allows piping of output to tools like jq and simplifying automation scripts. jc JSONifies the output of many CLI tools and file types for easier parsing in scripts. This allows further command-line processing of output with tools like jq or jello by piping commands. The JC parsers can also be used as python modules. In this case, the output will be a python dictionary, or a list of dictionaries, instead of JSON. Two representations of the data are available. The default representation uses a strict schema per parser and converts known numbers to int/float JSON values. Certain known values of None are converted to JSON null, known boolean values are converted, and, in some cases, additional semantic context fields are added.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 15
    TikZ

    TikZ

    TikZ figures for concepts in physics/chemistry/ML

    Collection of 111 standalone TikZ figures for illustrating concepts in physics, chemistry, and machine learning. Check out janosh.github.io to search, sort, open in Overleaf, and download figures (PDF/SVG/PNG) from this collection.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 16
    FontForge Windows builds

    FontForge Windows builds

    Unofficial Windows builds of FontForge

    The aim of this project is to compile up-to-date Windows builds of FontForge. For 'stable' builds, see https://github.com/fontforge/fontforge/releases The build system used was based off that offered by Matthew Petroff (http://www.mpetroff.net/software/fontforge-windows/), but has since been practically rewritten. New in 11/07/2020: * Synced with the 20201107 release. New in 06/04/2020: * Updated to latest master, picks up a clipboard copying fix New in 14/03/2020: * Synced with the 20200314 release. New in 01/03/2020: * Updated to latest master, now built with CMake. (prerelease) New in 02/06/2019: * The 32-bit build now uses Python 3 (3.7) instead of Python 2. No further Python 2 builds will be provided. * The GDK3 backend is now used. VcXsrv is no longer bundled. New in 31/07/2017: KNOWN ISSUES: * CTRL-C from console no longer interrupts/stops FontForge
    Leader badge
    Downloads: 43 This Week
    Last Update:
    See Project
  • 17
    jsonschema

    jsonschema

    An implementation of the JSON Schema specification for Python

    jsonschema is an implementation of the JSON Schema specification for Python. Full support for Draft 2020-12, Draft 2019-09, Draft 7, Draft 6, Draft 4 and Draft 3. Lazy validation that can iteratively report all validation errors. Programmatic querying of which properties or items failed validation.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 18
    autopep8

    autopep8

    A tool that automatically formats Python code to conform to the PEP 8

    autopep8 automatically formats Python code to conform to the PEP 8 style guide. It uses the pycodestyle utility to determine what parts of the code need to be formatted. autopep8 is capable of fixing most of the formatting issues that can be reported by pycodestyle. Correct deprecated or non-idiomatic Python code (via lib2to3). Use this for making Python 2.7 code more compatible with Python 3. Put a blank line between a class docstring and its first method declaration. Remove blank lines between a function declaration and its docstring.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    isort

    isort

    A Python utility / library to sort imports

    isort is a Python utility/library to sort imports alphabetically, and automatically separated into sections and by type. It provides a command-line utility, Python library and plugins for various editors to quickly sort all your imports. It requires Python 3.6+ to run but supports formatting Python 2 code too. Several plugins have been written that enable to use isort from within a variety of text-editors. You can find a full list of them on the isort wiki. Additionally, I will enthusiastically accept pull requests that include plugins for other text editors and add documentation for them as I am notified. As of isort 3.1.0 support for balanced multi-line imports has been added. With this enabled isort will dynamically change the import length to the one that produces the most balanced grid, while staying below the maximum import length defined.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    tqdm

    tqdm

    A Fast, Extensible Progress Bar for Python and CLI

    tqdm is a fast, extensible progress bar for Python and CLI that enables you to see the progress of your loops in a clear and smart way. Simply wrap any iterable with tqdm(iterable), and sit back and watch that progress meter go! tqdm can be wrapped around any iterable, or executed as a module with pipes. Just by inserting tqdm (or python -m tqdm) between pipes will pass through all stdin to stdout while printing progress to stderr. tqdm does not require any dependencies, has a very low overhead and uses smart algorithms to predict the remaining time and skip unnecessary iteration displays. It works on just about any platform, console or in a GUI, as well as IPython/Jupyter notebooks.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 21
    Canmatrix

    Canmatrix

    Converting Can (Controller Area Network) Database Formats

    Canmatrix is a Python package to read and write several CAN (Controller Area Network) database formats. Canmatrix implements a "Python Can Matrix Object" which describes the can-communication and the needed objects (Board units, Frames, Signals, Values, ...) Canmatrix also includes two Tools (can convert and can compare) for converting and comparing CAN databases.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    Mashumaro

    Mashumaro

    Fast and well tested serialization library on top of dataclasses

    When using data classes, you often need to dump and load objects based on the schema you have. Mashumaro not only lets you save and load things in different ways, but it also does it super quickly.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 23
    Pandas Profiling

    Pandas Profiling

    Create HTML profiling reports from pandas DataFrame objects

    pandas-profiling generates profile reports from a pandas DataFrame. The pandas df.describe() function is handy yet a little basic for exploratory data analysis. pandas-profiling extends pandas DataFrame with df.profile_report(), which automatically generates a standardized univariate and multivariate report for data understanding. High correlation warnings, based on different correlation metrics (Spearman, Pearson, Kendall, Cramér’s V, Phik). Most common categories (uppercase, lowercase, separator), scripts (Latin, Cyrillic) and blocks (ASCII, Cyrilic). File sizes, creation dates, dimensions, indication of truncated images and existance of EXIF metadata. Mostly global details about the dataset (number of records, number of variables, overall missigness and duplicates, memory footprint). Comprehensive and automatic list of potential data quality issues (high correlation, skewness, uniformity, zeros, missing values, constant values, between others).
    Downloads: 3 This Week
    Last Update:
    See Project
  • 24
    Remarshal

    Remarshal

    Convert between CBOR, JSON, MessagePack, TOML, and YAML

    Convert between CBOR, JSON, MessagePack, TOML, and YAML. When installed, provides the command-line command remarshal as well as the short commands {cbor,json,msgpack,toml,yaml}2{cbor,json,msgpack,toml,yaml}. You can perform format conversion, reformatting, and error detection using these commands. CBOR, MessagePack, and YAML with binary fields cannot be converted to JSON or TOML. Binary fields are converted between CBOR, MessagePack, and YAML.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    TexText

    TexText

    Re-editable LaTeX/ typst graphics for Inkscape

    Re-editable LaTeX and typst graphics for Inkscape. TexText is a Python extension for the vector graphics editor Inkscape providing the possibility to add and re-edit LaTeX and typst generated SVG elements to your drawing.
    Downloads: 3 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.