Text Processing Software

Browse free open source Text Processing software and projects for Mac and BSD below.

  • 1
    Ada Class Library

    Ada Class Library - an object-oriented library for Ada.

    Text search and replace. Scripting (small tool programs). CGI scripts. Execution of external programs (incl. I/O redirection). Garbage Collection. Extended Booch Components. CD-Recorder
    Downloads: 479 This Week
  • 2
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.)
    Downloads: 98 This Week
  • 3
    FCKeditor (retired)

    FCKeditor is the previous version of CKEditor and has been discontinued after version 2. The new CKEditor is redesigned from the ground up, offering more WYSIWYG text editing features, enhanced security and better integration. Don’t hold yourself back with the retro FCKeditor. Switch to the new, cool CKEditor at ckeditor.com.
    Downloads: 42 This Week
  • 4
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons and tools, etc. can be found at https://github.com/itext/.
    Downloads: 255 This Week
  • 5
    PDF Clown

    General-Purpose PDF Library for Java and .NET

    PDF Clown is a general-purpose Java and .NET library for manipulating PDF files through multiple abstraction layers, rigorously adhering to the PDF 1.7 specification (ISO 32000-1). This project aims to provide universal access to PDF files (creation, reading, editing, rendering...) through an accurate and elegant object-oriented API.

    * Features: http://pdfclown.org/overview/features/
    * Overview: http://pdfclown.org/overview/architecture/
    * Website: http://pdfclown.org/
    * Blog: http://www.pdfclown.org/blog/
    * Twitter: https://twitter.com/PDFClown
    Downloads: 18 This Week
  • 6
    Early Access iText, a PDF generation library in Java
    Downloads: 22 This Week
  • 7
    Jericho HTML Parser is a Java library allowing analysis and manipulation of parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 4 This Week
  • 8
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 4 This Week
  • 9
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. A command-line executable is shipped that allows sorting documents by codepage.
    Downloads: 10 This Week
  • 10
    A Perl script that splits a long HTML file into separate inter-linked pages, according to the headings in the original file. Useful for maintaining both a print version and a browsable version of a site.
    Downloads: 8 This Week
  • 11
    JODReports is a solution for generating dynamic documents and reports in Java based on the OpenDocument format (ODF). Templates can be easily composed with a word processor such as OpenOffice.org Writer. Data sources include POJOs and XML.
    Downloads: 2 This Week
  • 12
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Downloads: 7 This Week
  • 13
    CPLed is an OpenSIPS tool for editing CPL scripts in a friendly and easy graphical way. It can be used as a standalone application or embedded in a web page as an applet. It also provides CPL script transport functionality via the SIP and HTTP protocols.
    Downloads: 2 This Week
  • 14
    Xelem is a compact Java library to read and write Excel files of type SpreadsheetML. It can produce sophisticated, intricate and complex spreadsheets from within any Java program. And, since the release of xelem.2.0, it can read XML spreadsheets.
    Downloads: 2 This Week
  • 15
    A Python-based template and view-controller framework derived from HTML::Mason. Supports the full feature set of Mason, allowing component-based web development with Python-embedded HTML, and includes many new concepts and features not found in Mason.
    Downloads: 2 This Week
  • 16
    This project aims to create an NFO generator that can create different kinds of NFO files with different artwork for the different needs of its users.
    Downloads: 2 This Week
  • 17
    A stand-alone editor using the MediaWiki markup language to generate HTML code. You can create and preview pages written using MediaWiki markup (i.e. Wikipedia pages) while offline.
    Downloads: 1 This Week
  • 18
    JLoom is a JSP-like template language for text generation - e.g. source code, HTML, XML. JLoom templates are modular and encapsulated. Parameters can be any Java type, even Generics or Varargs. There is a plugin for Eclipse and a command line tool.
    Downloads: 1 This Week
  • 19
    (XSLT transformer/editor) A text editor that allows the loading and editing of an XML document and an XSLT document at the same time. It can also apply the XSLT to the XML and display the output for further editing/saving. Pluggable XML and XSLT parsers.
    Downloads: 1 This Week
  • 20
    CPIA is a macro-processing engine for XML (and HTML), written in C. The engine can either be used offline as a processor, or inside a web server. Both developers have lost interest. If you are interested in maintaining it, please contact either admin.
    Downloads: 0 This Week
  • 21
    Fast C++ template engine (C++ analog of Perl HTML::Template). CTPP completely separates project source and data representation. Home page: http://reki.ru/products/ctpp CTPP2: http://reki.ru/products/ctpp2/ New API and architecture.
    Downloads: 0 This Week
  • 22
    Chaperon is an LALR(1) parser that parses structured text documents and generates XML documents as output. It includes a parser generator like yacc and a regex scanner like lex. As input, Chaperon uses a grammar written in XML.
    Downloads: 0 This Week
  • 23
    I have stopped developing this project, as I found DokuWiki to be exactly what I was looking for when starting to develop Codeslang. Sorry.
    Downloads: 0 This Week
  • 24
    Concurrence is a networked file editing program that enables multiple people to modify a document simultaneously. It is written entirely in Python, and uses the wxPython library for the GUI and the Twisted library for networking.
    Downloads: 0 This Week
  • 25
    DOMIT! is a Document Object Model (DOM) XML parser for PHP, written purely in PHP. It is mostly compliant with the DOM Level 2 specification.
    Downloads: 0 This Week