Browse free open source Text Processing software and projects for Windows and Linux below.

  • 1
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons and tools, etc. can be found at https://github.com/itext/.
    Downloads: 271 This Week
  • 2
    Command-line/Ant-task/embeddable text file preprocessor. Macros, flow control, expressions. Recursive directory processing. Extensible in Java to display data from any data source (such as a database). Can generate complete homepages (trees of HTML files, images, etc.).
    Downloads: 89 This Week
  • 3
    FCKeditor

    FCKeditor (retired)

    FCKeditor is the previous version of CKEditor and was discontinued after version 2. The new CKEditor is redesigned from the ground up, offering more WYSIWYG text editing features, enhanced security and better integration. There is no need to stay on the retired FCKeditor; switch to the new CKEditor at ckeditor.com
    Downloads: 24 This Week
  • 4
    Ada Class Library

    Ada Class Library - an object-oriented library for Ada.

    Text search and replace. Scripting (small tool programs). CGI scripts. Execution of external programs (incl. I/O redirection). Garbage Collection. Extended Booch Components. CD-Recorder
    Downloads: 89 This Week
  • 5
    Jericho HTML Parser is a Java library for analysing and manipulating parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 7 This Week
  • 6
    PDF Clown

    General-Purpose PDF Library for Java and .NET

    PDF Clown is a general-purpose Java and .NET library for manipulating PDF files through multiple abstraction layers, rigorously adhering to the PDF 1.7 specification (ISO 32000-1). This project aims to provide universal access to PDF files (creation, reading, editing, rendering...) through an accurate and elegant object-oriented API.
    * Features: http://pdfclown.org/overview/features/
    * Overview: http://pdfclown.org/overview/architecture/
    * Website: http://pdfclown.org/
    * Blog: http://www.pdfclown.org/blog/
    * Twitter: https://twitter.com/PDFClown
    Downloads: 5 This Week
  • 7
    cpDetector is a proxy for codepage detection of documents. It delegates to multiple instances that try to detect the codepage by different techniques. It ships with a command-line executable that sorts documents by codepage.
    Downloads: 13 This Week
  • 8
    RTF to HTML converter for use both with your applications and as a standalone tool. Small and fast. Processes tables better than any other tool I've seen.
    Downloads: 10 This Week
  • 9
    Web Book Downloader

    Download websites as e-books: PDF, TXT, EPUB.

    This application lets the user download chapters from a website in three ways: from a table of contents; from a range (first chapter address to last chapter address); or by crawling from the first chapter to chapter n. In the settings you can customize the language and the input (website) encoding; for simplicity, the output uses the same encoding. To add your language, add a new class to the strings package, and new fields to the Settings class and the GUI menu (initialize method).
    Downloads: 5 This Week
  • 10
    Early Access iText, a PDF generation library in Java
    Downloads: 4 This Week
  • 11
    IMPORTANT NOTE: This project has moved to GitHub (https://github.com/pkozelka/libxml2-pas). Pascal units accessing the popular XML API from Daniel Veillard (http://www.xmlsoft.org). This should be usable at least from Kylix and Delphi, but hopefully also from other Pascal compilers (like Free Pascal).
    Downloads: 2 This Week
  • 12
    Babeldoc is an integration tool that can plumb together data flows. It is completely configurable and scriptable. It is heavily XML biased but not exclusively so.
    Downloads: 7 This Week
  • 13
    Piccolo is the fastest SAX parser for Java, supporting SAX1, SAX2, and JAXP (SAX only). Piccolo is different from other parsers in that it was developed using parser generators. It weighs 160K including XML APIs. See http://piccolo.sf.net for more info.
    Downloads: 3 This Week
  • 14
    Functional XML parsing framework: SAX/DOM and SXML parsers with support for XML Namespaces and validation. Related to SSAX are SXPath queries and SXML transformations, with applications to XML/HTML authoring and literate Scheme and XML programming.
    Downloads: 5 This Week
  • 15
    PDFBox is a Java PDF Library. This project will allow access to all of the components in a PDF document. More PDF manipulation features will be added as the project matures. This ships with a utility to take a PDF document and output a text file.
    Downloads: 1 This Week
  • 16
    JODReports is a solution for generating dynamic documents and reports in Java based on the OpenDocument format (ODF). Templates can be easily composed with a word processor such as OpenOffice.org Writer. Data sources include POJOs and XML.
    Downloads: 1 This Week
  • 17
    This project aims to create an NFO generator that can create different kinds of NFO files with different artwork for the different needs of its users.
    Downloads: 3 This Week
  • 18
    A Perl script that splits a long HTML file into separate inter-linked pages, according to the headings in the original file. Useful for maintaining both a print version and a browsable version of a site.
    Downloads: 3 This Week
  • 19
    Mamba is an extensible XML template preprocessor written in Python. Using it, you can rapidly develop powerful applications ready to integrate with the internet. It can work as a generic CGI program or be used to generate content.
    Downloads: 3 This Week
  • 20
    The Doc2Html command-line program strips the HTML files produced by Word (by opening the document and saving it as HTML), leaving pure text plus minimal HTML code. It also has a mode to convert data between different charsets: DOS, Windows-1250 and ISO-8859
    Downloads: 1 This Week
  • 21
    Xelem is a compact Java library to read and write Excel files of type SpreadsheetML. It can produce sophisticated, intricate and complex spreadsheets from within any Java program. Since the release of xelem.2.0, it can also read XML spreadsheets.
    Downloads: 1 This Week
  • 22
    The objective of the OpenBerg Project is to develop Open-Source, Open-Standards-based, Multi-Platform tools for eBook authors, editors and users. We are currently working on OpenBerg Lector, an e-Book reader, and OpenBerg Rector, an e-Book compiler.
    Downloads: 1 This Week
  • 23
    PDF::API2 is 'The Next Generation' of Text::PDF::API, a Perl module-chain that facilitates the creation and modification of PDF files. It features support for the 14 base PDF Core Fonts, TrueType fonts, and Adobe-Type1, with unicode mappings, embedding o
    Downloads: 1 This Week
  • 24
    PDML is an informal markup language written in PHP that is similar to HTML. It allows for the creation of complex PDF documents and can also be used in conjunction with PHP, to define templates which can generate dynamic PDF documents.
    Downloads: 1 This Week
  • 25
    PWEditor is a professional web development tool, enabling users to efficiently design, develop and maintain websites both online and offline. It includes a WYSIWYG HTML editor, CSS editor, JS editor and text editor. It works with Firefox, Mozilla and IE.
    Downloads: 1 This Week