Showing 68 open source projects for "data analysis"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Level Up Your Cyber Defense with External Threat Management Icon
    Level Up Your Cyber Defense with External Threat Management

    See every risk before it hits. From exposed data to dark web chatter. All in one unified view.

    Move beyond alerts. Gain full visibility, context, and control over your external attack surface to stay ahead of every threat.
    Try for Free
  • 1
    Smile

    Smile

    Statistical machine intelligence and learning engine

    Smile is a fast and comprehensive machine learning engine. With advanced data structures and algorithms, Smile delivers the state-of-art performance. Compared to this third-party benchmark, Smile outperforms R, Python, Spark, H2O, xgboost significantly. Smile is a couple of times faster than the closest competitor. The memory usage is also very efficient. If we can train advanced machine learning models on a PC, why buy a cluster? Write applications quickly in Java, Scala, or any JVM...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    Discourse Network Analyzer (DNA)

    Discourse Network Analyzer (DNA)

    Discourse Network Analyzer (DNA)

    The Java software Discourse Network Analyzer (DNA) is a qualitative content analysis tool with network export facilities. You import text files and annotate statements that persons or organizations make, and the program will return network matrices of actors connected by shared concepts.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    MOA - Massive Online Analysis

    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a continuous supply of examples, a data stream. Includes classification, regression, clustering, outlier detection and recommender systems. Related to the WEKA project, also written in Java, while scaling to adaptive large scale machine learning.
    Leader badge
    Downloads: 48 This Week
    Last Update:
    See Project
  • 4

    LegacyInsight

    Legacy reverse engineering tool

    LegacyInsight is an AI-powered reverse engineering platform that transforms legacy software systems into comprehensible business logic. Using cutting-edge GenAI, it analyzes legacy and extracts core operations, business rules, and data transformations—all translated into natural language. LegacyInsight supports enterprise-grade systems built on Java, COBOL, NET and other legacy stacks, helping organizations reclaim understanding of business-critical code.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    TXM

    TXM

    Unicode XML TEI text analysis platform

    TXM is a free and open-source cross-platform Unicode & XML based text analysis environment and graphical client, supporting Windows, Linux and Mac OS X. It can also be used online as a J2EE standard compliant web portal (GWT based) with access control built in. DOWNLOAD LATEST VERSION OF TXM : http://textometrie.ens-lyon.fr/spip.php?rubrique61&lang=en TXM offers a comprehensive range of analysis tools (concordances, collocate search, frequency lists, etc.) based on the powerfull CQP...
    Downloads: 16 This Week
    Last Update:
    See Project
  • 6
    Common Resource Grep - crgrep

    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    DynaQ

    DynaQ

    Innovative text document search. http://dynaq.opendfki.de for details.

    The goal of DynaQ is to develop an inquiry system to explore the personal information space, supporting you with the searching paradigm 'orienteering'. DynaQ is a (desktop)search engine with enhanced functionality for file, email and blog search. Look at our GitLab homepage for sourcecode and documentation: http://dynaq.opendfki.de
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    VIKAMINE is a flexible environment for visual analytics, data mining and business intelligence - implemented in pure Java. It features several powerful visualization and mining methods, and can utilize background knowledge.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 9
    DSTK - Data Science TooKit 3

    DSTK - Data Science TooKit 3

    Data and Text Mining Software for Everyone

    DSTK - Data Science Toolkit 3 is a set of data and text mining softwares, following the CRISP DM model. DSTK offers data understanding using statistical and text analysis, data preparation using normalization and text processing, modeling and evaluation for machine learning and algorithms. It is based on the old version DSTK at https://sourceforge.net/projects/dstk2/ DSTK Engine is like R.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • 10
    Wandora
    Wandora is a general purpose information extraction, management, and publishing environment based on Topic Maps and Java. Wandora has several data storage options, rich data extraction, import and export capabilities and embedded server.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit

    DSTK - DataScience ToolKit for All of Us

    DSTK - DataScience ToolKit is an opensource free software for statistical analysis, data visualization, text analysis, and predictive analytics. Newer version and smaller file size can be found at: https://sourceforge.net/projects/dstk3/ It is designed to be straight forward and easy to use, and familar to SPSS user. While JASP offers more statistical features, DSTK tends to be a broad solution workbench, including text analysis and predictive analytics features. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    MYRA

    MYRA

    A collection of ACO algorithms for the data mining classification task

    MYRA is a collection of Ant Colony Optimization (ACO) algorithms for the data mining classification task. It includes popular rule induction and decision tree induction algorithms. The algorithms are ready to be used from the command line or can be easily called from your own Java code. They are build using a modular architecture, so they can be easily extended to incorporate different procedures and/or use different parameter values. This project is now hosted at: https://github.com/febo/myra
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13

    JCLTP

    A Java Class Library for Text Processing

    JCLTP is a class library designed for processing text. JCLTP is free, open source and developed with the Java programming language. JCLTP is distributed under the GNU license. It incorporates several technologies that enable process information while applying AI techniques, in order to build predictive models for text classification. Through a flexible structure of interfaces and classes, the opportunity to extend, adapt and add functionality JCLTP is provided. Thus, analysis of new types...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    GUI Ant-Miner is a tool for extracting classification rules from data. It is an updated version of a data mining algorithm called Ant-Miner (Ant Colony-based Data Miner).
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    Twitter Research Data Collector
    It gives facility of collecting tweets through Twitter Streaming API w.r.t different search criteria and to save tweets in CSV and ARFF (WEKA) file formats.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16

    BioC

    We describe a simple XML format to share text documents and annotation

    A minimalist approach to share text documents and data annotations. Allows a large number of different annotations to be represented. Project files contain: - simple code to hold/read/write data and perform sample processing. - BioC-formatted corpora - BioC tools that work with BioC corpora BioC goals - simplicity - interoperability - broad use - reuse There should be little investment required to learn to use a format or a software module to process that format. We are...
    Leader badge
    Downloads: 4 This Week
    Last Update:
    See Project
  • 17
    Open Cezeri Library

    Open Cezeri Library

    Effective Linear Algebra and Computer Vision Library with JAVA

    OCL stands for Open Cezeri Library (yet another linear algebra and matrix library). This library provides rapid coding as matlab ease of use. To learn for library please try to use test examples at OpenCezeriLibrary\test\test. It is originally developed at el-cezeri laboratory of Siirt University, in order to establish generic framework of reusable components and software tools for machine vision, machine learning, AI and robotic applications. Currently, it holds following main concepts 1-...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    The Java Data Mining Package (JDMP) is a library that provides methods for analyzing data with the help of machine learning algorithms (e.g. clustering, classification, graphical models, neural networks, Bayesian networks, text processing, optimization).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Text Expander, Inverse summarizer

    Text Expander, Inverse summarizer

    Expand text, inverse summarizer

    IT WILL WORK WITH A JAVA DEVELOPMENT KIT 1.7 ONLY !!! This program is a data-miner and a knowledge-miner. It does exactly the opposite of what the text summarizers do. A text summarizer produces a shortened text given some text as an input. An inverse summarizer takes the shortened input, a similar or a same text and does the process in reverse. This results in an expanded text. It can be used with any text or notes that have the knowledge gaps. It is a great aid to any creative...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20

    Chordalysis

    Log-linear analysis (data modelling) for high-dimensional data

    ===== Project moved to https://github.com/fpetitjean/Chordalysis ===== Log-linear analysis is the statistical method used to capture multi-way relationships between variables. However, due to its exponential nature, previous approaches did not allow scale-up to more than a dozen variables. We present here Chordalysis, a log-linear analysis method for big data. Chordalysis exploits recent discoveries in graph theory by representing complex models as compositions of triangular structures, also known as chordal graphs. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    MODLEM

    MODLEM

    rule-based, WEKA compatible, Machine Learning algorithm

    This project is a WEKA (Waikato Environment for Knowledge Analysis) compatible implementation of MODLEM - a Machine Learning algorithm which induces minimum set of rules. These rules can be adopted as a classifier (in terms of ML). It is a sequential covering algorithm, which was invented to cope with numeric data without discretization. Actually the nominal and numeric attributes are treated in the same way: attribute's space is being searched to find the best rule condition during rule induction. ...
    Leader badge
    Downloads: 31 This Week
    Last Update:
    See Project
  • 22
    Flamingo Project

    Flamingo Project

    Workflow Designer, Hive Editor, Pig Editor, File System Browser

    Flamingo is a open-source Big Data Platform that combine a Ajax Rich Web Interface + Workflow Engine + Workflow Designer + MapReduce + Hive Editor + Pig Editor. 1. Easy Tool for big data 2. Use comfortable in Hadoop EcoSystem projects 3. Based GPL V3 License Supporting Pig IDE, Hive IDE, HDFS Browser, Scheduler, Hadoop Job Monitoring, Workflow Engine, Workflow Designer, MapReduce.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Graphical Grammar Studio

    Graphical Grammar Studio

    An user friendly grammar tool for natural language processing tasks

    Full documentation with tutorials is included in the download package. Graphical Grammar Studio is a tool for applying grammars which behave as words acceptors/consumers and annotators. GGS grammars can be used to find and annotate sequences of words which respect certain conditions, in a given input. Its purpose is for creating NLP tools like phrase chunkers, named entity finders, pronoun co-reference solvers etc. A grammar is represented by a state machine which can be visualized, edited...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Unsupervised TXT classifier

    Unsupervised TXT classifier

    Classify any two TXT documents, no training required - JAVA

    ...This extracts a relevant structure for both documents (and thus avoids the over-training) which are then compared using the Vector-Space analysis to give a range of belonging of one document to another (and thus avoids the shortage of information). This method can be used to create the user-defined classes by merging texts of certain categories and then to calculate the relevant distances between the documents, but this is not necessary.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    LExAu: Learning Expectations Autonomously. Library for on-line data driven statistical machine learning.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • Next