Browse free open source Data Analytics tools and projects below. Use the toggles on the left to filter open source Data Analytics tools by OS, license, language, programming language, and project status.

  • 1
    Bdash

    Simple SQL Client for lightweight data analysis

    Bdash is a simple SQL client for lightweight data analysis. You can share query results as a gist. It supports MySQL, PostgreSQL (Amazon Redshift), SQLite3, Google BigQuery, Treasure Data, and Amazon Athena. You can download and install it from the website or the Releases page.
    Downloads: 15 This Week
  • 2
    dlib

    Toolkit for making machine learning and data analysis applications

    Dlib is a modern C++ toolkit containing machine learning algorithms and tools for creating complex software in C++ to solve real-world problems. It is used in both industry and academia in a wide range of domains including robotics, embedded devices, mobile phones, and large high-performance computing environments. Dlib's open source licensing allows you to use it in any application, free of charge. It has good unit test coverage: the ratio of unit test lines of code to library lines of code is about 1 to 4. The library is tested regularly on MS Windows, Linux, and Mac OS X systems. No other packages are required to use the library; only the APIs provided by an out-of-the-box OS are needed. There is no installation or configure step needed before you can use the library. All operating-system-specific code is isolated inside the OS abstraction layers, which are kept as small as possible.
    Downloads: 14 This Week
  • 3
    PySchool

    Installable / Portable Python Distribution for Everyone.

    PySchool is a free and open-source Python distribution intended primarily for students learning Python and data analysis, but it can also be used by scientists, engineers, and data scientists. The full edition includes more than 150 Python packages, including numpy, pandas, scipy, sympy, keras, scikit-learn, matplotlib, seaborn, beautifulsoup4...
    Downloads: 230 This Week
  • 4
    ECharts

    A powerful, interactive charting and visualization library for the browser

    ECharts is a free and open source charting and visualization library that gives you an easy way to add interactive, intuitive, custom charts to your commercial products, projects, presentations and more. It offers a rich set of features, including the ability to render datasets on the order of ten million points, WeChat and PowerPoint support, multi-dimensional data analysis, and more. It also has a number of extensions for various applications. ECharts is written in pure JavaScript and is based on zrender, a new and lightweight canvas library.
    Downloads: 8 This Week
  • 5
    Apache InLong

    Apache InLong - a one-stop integration framework for massive data

    Apache InLong is a one-stop integration framework for massive data that provides automatic, secure and reliable data transmission capabilities. InLong supports batch and stream data processing at the same time, which makes it well suited to building data analysis, modeling and other real-time applications based on streaming data. InLong (应龙) is a divine beast in Chinese mythology who guides the river into the sea, and it serves as a metaphor for how the InLong system reports data streams. InLong was originally built at Tencent, where it has served online businesses for more than 8 years, supporting massive data reporting services (more than 80 trillion pieces of data per day) in big data scenarios. The platform integrates five modules: Ingestion, Convergence, Caching, Sorting, and Management, so that the business only needs to provide data sources, data service quality, data landing clusters and data landing formats.
    Downloads: 3 This Week
  • 6
    DocWire SDK

    Award-winning modern data processing SDK in C++20

    DocWire SDK, an AI-driven data processing tool written in C++20, has received an award from SourceForge and strong backing from Microsoft. It handles nearly 100 file types and supports efficient text extraction, web data extraction, and document analysis. For businesses, it offers comprehensive document format support and the ability to extract insights from email boxes, databases, and websites using AI. DocWire SDK aims to expand its capabilities, focusing on versatile data extraction, platform support, and seamless integration with various systems. The project is dedicated to streamlining data processing, reducing development time and costs, and harnessing the potential of AI. It is the successor to DocToText and is intended to improve on it.
    Downloads: 57 This Week
  • 7
    Blue Whale Configuration Platform

    Blue Whale smart cloud configuration platform

    The Blue Whale Configuration Platform draws on experience supporting hundreds of Tencent businesses and is compatible with a variety of complex system architectures. It grew out of operations and maintenance work and is built for it. From configuration management to job execution, task scheduling, and monitoring with self-healing, through to operations and maintenance big data analysis that assists operational decision-making, it covers full-cycle assurance management of business operations. The open PaaS has a powerful development framework and scheduling engine, as well as a complete operations and maintenance development training system, which helps operations and maintenance teams transform and upgrade quickly. The Blue Whale intelligent cloud system helps enterprises quickly automate basic operations and maintenance services, thereby accelerating the move to DevOps, establishing a tool culture, and maximizing operational efficiency.
    Downloads: 2 This Week
  • 8
    Kapacitor

    Open source framework for processing, monitoring, and alerting

    Kapacitor is an open source framework for processing, monitoring, and alerting on time series data. It is a real-time data processing engine designed specifically to work with time-series data from InfluxDB.
    Downloads: 2 This Week
  • 9
    POCO

    Cross-platform C++ libraries for building network applications

    The POCO C++ Libraries are powerful cross-platform C++ libraries for building network- and internet-based applications that run on desktop, server, mobile, IoT, and embedded systems. Whether in building automation, industrial automation, IoT platforms, air traffic management, enterprise IT application and infrastructure management, security and network analytics, automotive infotainment and telematics, finance or healthcare, C++ developers have trusted the POCO C++ Libraries for 15+ years and deployed them in millions of devices. Create software for connected embedded devices running Linux, Windows Embedded or QNX. Create cross-platform backends in C++ for iOS and Android applications and combine them with a native or HTML5-based user interface. Create software for IoT devices that talk to cloud backends over HTTP REST APIs. See macchina.io for an IoT platform built with POCO.
    Downloads: 2 This Week
  • 10
    SageMaker Spark Container

    Docker image used to run data processing workloads

    Apache Spark™ is a unified analytics engine for large-scale data processing. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing. The SageMaker Spark Container is a Docker image used to run batch data processing workloads on Amazon SageMaker using the Apache Spark framework. The container images in this repository are used to build the pre-built container images that are used when running Spark jobs on Amazon SageMaker using the SageMaker Python SDK. The pre-built images are available in the Amazon Elastic Container Registry (Amazon ECR), and this repository serves as a reference for those wishing to build their own customized Spark containers for use in Amazon SageMaker.
    Downloads: 1 This Week
  • 11
    PANDA

    A comprehensive and flexible quantification tool for proteomics data

    PANDA is a comprehensive and flexible tool for quantitative proteomics data analysis, built on our years of work in quantitative proteomics. It introduces several novelties. First, it incorporates the strengths of the LFQuant (Proteomics 2012, 12, (23-24), 3475-84) and SILVER (Bioinformatics 2014, 30, (4), 586-7) algorithms. Second, it brings the state-of-the-art concept of quantification reliability into the quantitative workflow: at the levels of spectra, peptides and proteins, PANDA provides several quantitative filters and new scores for quantification confidence. Third, PANDA is designed to process large proteomics datasets in parallel.
    Downloads: 8 This Week
  • 12

    TSA_CRAFT

    Automatic command line tool for TSA data analysis

    DSF is a high-throughput TSA platform for screening various conditions that affect protein stability. To facilitate TSA data analysis, we developed an automatic tool, "TSA-CRAFT", by integrating Perl scripts and Gnuplot. The Perl scripts manage the entire workflow and the data processing procedures of TSA data analysis, while Gnuplot handles curve fitting and result presentation. All analysed results are written to an HTML file that can be easily displayed in a web browser.
    Downloads: 7 This Week
  • 13
    neural network designer

    a dbms for neural nets. Chatbots, DTrees, random forests, n-grams,...

    This project consists of a Windows-based designer application and a library (which can run on multiple platforms, including Android), together with several demo applications (including an MVC3 chatbot client and an Android application). It is probably best compared to a database management system, but for neural networks instead of relational data. As such, the library is optimized for handling data of any size by using advanced streaming and caching algorithms. With the designer, you can create different types of decision trees, random forests, n-grams, pattern matchers, conversational agents and all sorts of AI-related algorithms. You can combine statistical approaches with pattern matchers or other techniques. Do natural language processing, image or data analysis & interpretation,...
    Downloads: 5 This Week
  • 14

    Xross Tab / QScript

    Market Research Data Processing Suite

    XTCC/Qscript revolves around two concepts: 1) data collection with Qscript and 2) data analysis with XTCC. Qscript enables market research companies to securely collect data through online and offline channels such as the web, tablets, and mobile devices. The products are: 1) Pen-and-paper survey management: Qscript enables faster and easier keyboard entry with real-time logic and validation. It generates SPSS- and IBM Quantum-compatible data tabulation programs, leading to time savings of up to 80%. This product is already used by various companies in the market research industry. 2) Tablet/web-enabled survey management: companies can launch the same survey online and on tablets or mobile devices. This product is in the prototype phase. 3) XTCC: a web-based cross-tabulation tool that enables companies to carry out real-time analysis. It can analyze very large datasets in a very short period of time. It is in the prototype phase. If you are interested in doing a pilot, please write to us.
    Downloads: 1 This Week
  • 15

    TrollEditor

    Editor of virtual worlds.

    Troll enables the creation and visualisation of complex environments. It is based on Ogre3D (graphics engine), Bullet (physics engine), PhysX (physics engine), OpenAL (3D sound and music), OpenCV (image analysis and camera model) and Boost (threading and Python binding). It uses Python for scripting, as well as a simple built-in scripting language. Some features:
      - built-in multiplayer
      - incorporation of data from different sensors (IMU, AHRS, Razer Hydra)
      - image server/client
      - built-in support for motion capture data analysis
      - path finding using data from the physics engine (backwave propagation algorithm)
      - built-in XML parser
      - graphical world editor
      - plugin system (C++)
    Downloads: 1 This Week
  • 16
    .NET for Apache Spark

    A free, open-source, and cross-platform big data analytics framework

    .NET for Apache Spark provides high-performance APIs for using Apache Spark from C# and F#. With these .NET APIs, you can access the most popular DataFrame and Spark SQL aspects of Apache Spark for working with structured data, and Spark Structured Streaming for working with streaming data. .NET for Apache Spark is compliant with .NET Standard, a formal specification of .NET APIs that are common across .NET implementations. This means you can use .NET for Apache Spark anywhere you write .NET code, allowing you to reuse all the knowledge, skills, code, and libraries you already have as a .NET developer. .NET for Apache Spark runs on Windows, Linux, and macOS using .NET Core, or on Windows using .NET Framework. It also runs on all major cloud providers including Azure HDInsight Spark, Amazon EMR Spark, AWS & Azure Databricks.
    Downloads: 0 This Week
  • 17
    Amazon Kinesis Flink Connectors

    Contains various Apache Flink connectors to connect to AWS data

    This repository contains various Apache Flink connectors to connect to AWS Kinesis data sources and sinks. Flink maintains backwards compatibility for the Sink interface used by the Firehose Producer. This project is compatible with Flink 1.x; there is no guarantee it will support Flink 2.x should it be released in the future. An Apache Flink application is a Java or Scala application that is created with the Apache Flink framework. You author and build your Apache Flink application locally. Applications primarily use either the DataStream API or the Table API. The other Apache Flink APIs are also available for you to use, but they are less commonly used in building streaming applications. The Apache Flink DataStream API programming model is based on two components.
    Downloads: 0 This Week
  • 18
    CANStream

    An application for CAN bus communication development, testing and validation

    CANStream is a .NET application intended for CAN (Controller Area Network) bus communication development, testing and validation. CANStream lets you extensively test any system that uses CAN communication by sending and receiving CAN frames. Thanks to its powerful built-in mathematical expression evaluator, CANStream can also behave as a real control system, feeding back the test device or commanding a third-party device with context-sensitive data. Its extended data logging and data analysis features provide a comprehensive solution for testing and results analysis. CANStream makes extensive use of the PCAN-Basic API developed by PEAK System for the PCAN-USB adapter. You will therefore need a PCAN-USB adapter and at least one free USB port to make full use of CANStream. CANStream can manage up to eight PCAN-USB adapters at the same time, in which case you will need eight free USB ports.
    Downloads: 0 This Week
  • 19
    CSI-Math-Notation-PostfixInfix

    Perl Lib Math Notation

    Introduction: This module is a Perl library. It provides:
      - conversion of INFIX expressions to POSTFIX;
      - conversion of POSTFIX expressions to INFIX;
      - POSTFIX context validation.
    Context validation can be used in item selection routines or in data context validation, where a data analysis process needs to identify which data to select or ignore.
    Note: Before any implementation, we recommend reading the details in the wiki (https://sourceforge.net/p/csi-math-notation-postfixinfix/wiki/) or the CPAN Perl module documentation at https://metacpan.org/pod/Math::Notation::PostfixInfix
    Support: The support service is free. Do you need support? Open a ticket and I will get back to you as soon as possible.
    Professional Services: Do you need any free professional services? Open a ticket as "FAQ request" and I will get back to you as soon as possible. i.e: subject:
    Downloads: 0 This Week
  • 20
    Qt4 GUI for scientific data analysis and visualization using Matlab (tm) style syntax. Link together Qt projects and other C++ libraries in an integrated user interface. For more information, visit: http://groups.google.com/group/chainlink
    Downloads: 0 This Week
  • 21
    Dash

    Dash

    Build beautiful web-based analytic apps, no JavaScript required

    Dash is a Python framework for building beautiful analytical web applications without any JavaScript. Built on top of Plotly.js, React and Flask, Dash easily achieves what an entire team of designers and engineers normally would. It ties modern UI controls and displays such as dropdown menus, sliders and graphs directly to your analytical Python code, and creates exceptional, interactive analytics apps. Dash apps are very lightweight, requiring only a limited number of lines of Python or R code; and every aesthetic element can be customized and rendered in the web. It’s also not just for dashboards. You have full control over the look and feel of your apps, so you can style them to look any way you want.
    Downloads: 0 This Week
  • 22
    GNUexp
    GNUexp is a GNUstep/Cocoa framework and a collection of tools helping you to plan, develop, and administer applications associated with visual psychophysics, experimental psychology and statistical data analysis.
    Downloads: 0 This Week
  • 23

    HEIMDALL

    High-End Interface for Monitoring & spatial Data Analysis using L2-Norm

    HEIMDALL, short for High-End Interface for Monitoring and spatial Data Analysis using L2-Norm, is a set of public libraries written in Java that contains routines and algorithms for connecting to geodetic hardware. To observe objects (in general, spatial points) continuously, HEIMDALL's programmable logic arrays can be used to design monitoring systems. For data analysis, an interface to the least-squares program JAG3D can be provided. Serial communication with devices such as inclination sensors, meteorological stations or total stations is handled by the RXTX library. If a device supports only proprietary communication via a dynamic-link library (the DLLs are not distributed with the project), JNA is used instead. Note that HEIMDALL is not a stand-alone monitoring application; it is only a set of libraries.
    Downloads: 0 This Week
  • 24
    Logbus-ng consists of a set of tools that help developers perform log analysis at every stage: log generation, collection, distribution, storage and analysis. It is designed specifically for Field Failure Data Analysis in critical distributed systems.
    Downloads: 0 This Week
  • 25

    MS-Helios

    MS-Helios: A Circos wrapper to visualize multi-omic datasets

    Advances in high-resolution mass spectrometry facilitate the identification of hundreds of metabolites, thousands of proteins and their post-translational modifications. This remarkable progress poses a challenge to data analysis and visualization, requiring methods to reduce dimensionality and represent the data in a compact way. To provide a more holistic view, we recently introduced circular proteome maps (CPMs). However, the CPM construction requires prior data transformation and extensive knowledge of the Perl-based tool, Circos. We present MS-Helios, an easy to use command line tool with multiple built-in data processing functions, allowing non-expert users to construct CPMs or in general terms circular plots with a non-genomic basis. MS-Helios automatically generates data and configuration files to create high quality and publishable circular plots with Circos.
    Downloads: 0 This Week