Python Linguistics Software

View 2686 business solutions

Browse free open source Python Linguistics Software and projects below. Use the toggles on the left to filter open source Python Linguistics Software by OS, license, language, programming language, and project status.

  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Get Avast Free Antivirus with 24/7 AI-powered online scam detection Icon
    Get Avast Free Antivirus with 24/7 AI-powered online scam detection

    Get protection for today’s online threats. Free.

    Award-winning antivirus protection, as well as protection against online scams, dangerous Wi-Fi connections, hacked accounts, and ransomware. It includes Avast Assistant, your built-in AI partner, which gives you help with suspicious online messages, offers, and more.
    Free Download
  • 1
    Argos Translate

    Argos Translate

    Open-source offline translation library written in Python

    Argos Translate uses OpenNMT for translations and can be used as either a Python library, command-line, or GUI application. Argos Translate supports installing language model packages which are zip archives with a ".argosmodel" extension containing the data needed for translation. LibreTranslate is an API and web-app built on top of Argos Translate. Argos Translate also manages automatically pivoting through intermediate languages to translate between languages that don't have a direct translation between them installed. For example, if you have a es → en and en → fr translation installed you are able to translate from es → fr as if you had that translation installed. This allows for translating between a wide variety of languages at the cost of some loss of translation quality.
    Downloads: 57 This Week
    Last Update:
    See Project
  • 2
    PDFMathTranslate

    PDFMathTranslate

    PDF scientific paper translation with preserved formats

    PDFMathTranslate is a Python-based tool that uses AI translation to convert academic PDFs into bilingual (e.g. Chinese-English) documents while preserving formatting, including math notation. It supports OCR-enhanced content and offers CLI, GUI, Docker, and Zotero integration under AGPL v3.
    Downloads: 31 This Week
    Last Update:
    See Project
  • 3
    iramuteq
    IRAMUTEQ : Interface de R pour les Analyses Multidimensionnelles de Textes et de Questionnaires. Logiciel de traitement de données pour des corpus texte ou de type individus/caractères. Permet notamment de réaliser des analyses de type "ALCESTE"
    Leader badge
    Downloads: 632 This Week
    Last Update:
    See Project
  • 4
    Mishkal: Arabic Text Vocalization

    Mishkal: Arabic Text Vocalization

    Arabic Text Vocalization system

    Automatic system of vocalization of arabic text.
    Downloads: 63 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 5

    Presage

    the intelligent predictive text entry platform

    Presage (formerly Soothsayer) is an intelligent predictive text entry system. Presage generates predictions by modelling natural language as a combination of redundant information sources. Presage computes probabilities for words which are most likely to be entered next by merging predictions generated by the different predictive algorithms. Presage's modular and extensible architecture allows its language model to be extended and customized to utilize statistical, syntactic, and semantic predictive algorithms. Presage's predictive capabilities are implemented by predictive plugins. Predictive plugins use services provided by the platform to implement multiple prediction techniques.
    Leader badge
    Downloads: 71 This Week
    Last Update:
    See Project
  • 6
    SPPAS

    SPPAS

    SPPAS - the automatic annotation and analyses of speech

    SPPAS is a scientific computer software package written and maintained by Brigitte Bigi of the Laboratoire Parole et Langage, in Aix-en-Provence, France. Available for free, with open source code, there is simply no other package for linguists to simple use in the automatic annotations of speech, the analyses of any kind of annotated data and the conversion of annotated files. SPPAS is able to produce automatically speech annotations from a recorded speech sound and its orthographic transcription. SPPAS is helpful for the analysis of any annotated data: estimate statistical distributions, make requests, manage files, visualize annotations. SPPAS offers a file converter from/to a wide range of formats: xra, TextGrid, eaf, trs... <https://sppas.org>
    Downloads: 37 This Week
    Last Update:
    See Project
  • 7

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented. Project files contain: - simple code to hold/read/write data and perform sample processing. - BioC-formatted corpora - BioC tools that work with BioC corpora BioC goals - simplicity - interoperability - broad use - reuse There should be little investment required to learn to use a format or a software module to process that format. We are interested in reuse, and we focus on common NLP tasks that are broadly useful for textmining.
    Leader badge
    Downloads: 7 This Week
    Last Update:
    See Project
  • 8
    Arramooz Alwaseet Arabic Dictionary
    Arramooz Alwaseet Open Arabic Dictionary for morphological analyze. To be useful for Arabic language processing. This dictionary is derived from the Ayaspell Arabic spell checker.
    Leader badge
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9

    Arabic Corpus

    Text categorization, arabic language processing, language modeling

    The Arabic Corpus {compiled by Dr. Mourad Abbas ( http://sites.google.com/site/mouradabbas9/corpora ) The corpus Khaleej-2004 contains 5690 documents. It is divided to 4 topics (categories). The corpus Watan-2004 contains 20291 documents organized in 6 topics (categories). Researchers who use these two corpora would mention the two main references: (1) For Watan-2004 corpus ---------------------- M. Abbas, K. Smaili, D. Berkani, (2011) Evaluation of Topic Identification Methods on Arabic Corpora,JOURNAL OF DIGITAL INFORMATION MANAGEMENT,vol. 9, N. 5, pp.185-192. 2) For Khaleej-2004 corpus --------------------------------- M. Abbas, K. Smaili (2005) Comparison of Topic Identification Methods for Arabic Language, RANLP05 : Recent Advances in Natural Language Processing ,pp. 14-17, 21-23 september 2005, Borovets, Bulgary. More useful references to check: ------------------------------------------- https://sites.google.com/site/mouradabbas9/corpora
    Downloads: 3 This Week
    Last Update:
    See Project
  • Picsart Enterprise Background Removal API for Stunning eCommerce Visuals Icon
    Picsart Enterprise Background Removal API for Stunning eCommerce Visuals

    Instantly remove the background from your images in just one click.

    With our Remove Background API tool, you can access the transformative capabilities of automation , which will allow you to turn any photo asset into compelling product imagery. With elevated visuals quality on your digital platforms, you can captivate your audience, and therefore achieve higher engagement and sales.
    Learn More
  • 10

    Aelius Brazilian Portuguese POS-Tagger

    Python, NLTK-based package for shallow parsing of Brazilian Portuguese

    Aelius is an ongoing open source project aiming at developing a suite of Python, NLTK-based modules and interfaces to external freely available tools for shallow parsing of Brazilian Portuguese. It also includes language resources such as language models, sample texts, and gold standards. Presently, Aelius already offers facilities for POS-tagging and chunking corpora and outputting annotations in different formats, such as in XML in the TEI P5 encoding scheme.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11

    Automatic Compound Processing (AuCoPro)

    Automatic compound splitting and semantic analysis of compounds

    The central problem to be addressed in this project concerns a multidisciplinary (linguistics and computational linguistics) investigation into sharing of knowledge and resources between closely-related languages, specifically relating to the automatic processing of compounds. Specifically, we will explore the possibility to create new knowledge about closely-related languages, and efficiently develop additional, more advanced resources for (a) compound segmentation; and (b) the semantic analysis of compounds; as such, the project will be divided into two interrelated subprojects, to be executed simultaneously. The focus in this project will be on Afrikaans (with Dutch as the closely-related, well-sourced language), which will lay grounds for future work on other closely-related language pairs.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12

    Language Constructor

    Complete tool for constructing/manipulating languages in digital form

    With this tool you can easily design a new language, digitize an existing one or incrementally reconstruct an ancient language. It allows for free experimentation of all aspects of the language, so it does not have to be made consistent on paper first. You can edit script, syntax, grammar, morphology, lexicon and phonology, as well as write documents in the language, as it might be too complex to be handled by current font technology. The information is stored in xml format for easy integration with other software.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    Part-of-speech tagging is the task of assigning symbols from a particular set to words in a natural language text. ACOPOST implements and extends well-known machine learning techniques and provides a uniform environment for testing.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Alfanous

    Alfanous

    Quran Search Engine

    Alfanous (The Lantern - الفانوس ) is an Arabic search engine API provide the simple and advanced search in the Holy Quran , more features and many interfaces...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    AsiEs stands for Asistente de Escritura (writing assistant). It provides word prediction and autocomplete for fast writing. Thought for people with difficulties writing on keyboard, improves the writing speed preventing the user from pressing at most 50% of keys to write and avoids ortographic errors. Made by Fundación Teletón Uruguay (http://www.teleton.org.uy/home/)
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Color to Word

    Color to Word

    Turn colors into words

    The program will turn a color into a list of 10 words, obtained according to a custom designed algorithm based on letter shape and position in the alphabet. - Click inside the frame on the left to pick a color through the color chooser window - The program will match the color with the colors corresponding to a list of all the English words contained in the file wordcolor.txt - The first 10 matches will appear in the frame on the right - Right-click - Copy to copy the word matches and the RGB values This version comes with a text file (wordcolor.txt) containing all the English words followed by Red, Green, Blue channel values for the corresponding color. The colors were obtained through a modified version of the program "Text to Color" by same author, available for download on GitHub and SourceForge on the profile page of Fonazza-Stent. The next version (coming soon) will include a tool to convert a custom word list into a word+color list named wordcolor.txt
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17

    CompE Toolkit

    Data Type Converter

    CompE Toolkit allows the user to seamlessly convert between binary, decimal, hexadecimal, and 32-bit floating point representation. It uses a simple, user-friendly interface designed for maximum efficiency and minimal clutter.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    Redundancy due to cut-paste operations in text creates bias in machine learning for NLP. This module takes a directory and produces a subset of the files in that directory (in a list) with an upper bound on similarity between two files.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Donatus is an on-going project consisting of Python, NLTK-based tools and grammars for deep parsing and syntactical annotation of Brazilian Portuguese corpora. It includes a user-friendly graphical user interface for building syntactic parsers with the NLTK, providing some additional functionalities.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Helsinki Finite-State Technology
    The Helsinki Finite-State Transducer toolkit is intended for processing natural language morphologies. The toolkit is demonstrated by wide-coverage implementations of a number of languages of varying morphological complexity.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    An agent-based situated language learning simulation that focuses on lexical learning and grounding, featuring a unigram syntax structure and a CFG-based semantic grammar. Created as a MSc thesis project, using python.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    KAF2Tiger2 is a KAF (KYOTO annotation format) to <tiger2/> (Tiger2 XML) converter.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Kana no quiz is a little educational tool to memorize the transcription and pronunciation of Japanese kana (katakana & hiragana), presented as a quiz. It is written in Python and uses a GTK+ interface for a nice cross-paltform rendering!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Little Cohesion Helper (LCH), alias TraglWeck, semi-automates the annotation of lexical-cohesion in a given text. Input is a raw text file and this software generates a bunch of XML files which can be used with MMAX2.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25

    MITRE Annotation Toolkit

    A toolkit for managing and manipulating text annotations

    The MITRE Annotation Toolkit (MAT) is a suite of tools which can be used for automated and human tagging of annotations. Annotation is a process, used mostly by researchers in natural language processing, of enhancing documents with information about the various phrase types the documents contain. MAT supports both UI interaction and command-line interaction, and provides various levels of control over the overall annotation process. It can be customized for specific tasks (e.g., named entity identification, de-identification of medical records). The goal of MAT is not to help you configure your training engine (in the default case, the Carafe CRF system) to achieve the best possible performance on your data. MAT is for "everything else": all the tools you end up wishing you had.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.