Java Big Data Tools for BSD

Browse free open source Java Big Data Tools for BSD and projects below. Use the toggles on the left to filter open source Java Big Data Tools for BSD by OS, license, language, programming language, and project status.

  • Get Avast Free Antivirus with 24/7 AI-powered online scam detection Icon
    Get Avast Free Antivirus with 24/7 AI-powered online scam detection

    Get protection for today’s online threats. Free.

    Award-winning antivirus protection, as well as protection against online scams, dangerous Wi-Fi connections, hacked accounts, and ransomware. It includes Avast Assistant, your built-in AI partner, which gives you help with suspicious online messages, offers, and more.
    Free Download
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    MOA - Massive Online Analysis

    MOA - Massive Online Analysis

    Big Data Stream Analytics Framework.

    A framework for learning from a continuous supply of examples, a data stream. Includes classification, regression, clustering, outlier detection and recommender systems. Related to the WEKA project, also written in Java, while scaling to adaptive large scale machine learning.
    Downloads: 54 This Week
    Last Update:
    See Project
  • 2
    Open Source Data Quality and Profiling

    Open Source Data Quality and Profiling

    World's first open source data quality & data preparation project

    This project is dedicated to open source data quality and data preparation solutions. Data Quality includes profiling, filtering, governance, similarity check, data enrichment alteration, real time alerting, basket analysis, bubble chart Warehouse validation, single customer view etc. defined by Strategy. This tool is developing high performance integrated data management platform which will seamlessly do Data Integration, Data Profiling, Data Quality, Data Preparation, Dummy Data Creation, Meta Data Discovery, Anomaly Discovery, Data Cleansing, Reporting and Analytic. It also had Hadoop ( Big data ) support to move files to/from Hadoop Grid, Create, Load and Profile Hive Tables. This project is also known as "Aggregate Profiler" Resful API for this project is getting built as (Beta Version) https://sourceforge.net/projects/restful-api-for-osdq/ apache spark based data quality is getting built at https://sourceforge.net/projects/apache-spark-osdq/
    Downloads: 4 This Week
    Last Update:
    See Project
  • 3

    Big Sack

    Big Sack: A lightweight Java Key/Value store with undo and disk cache.

    Big Sack is a Java persistence mechanism that allows storage of key value pairs following the popular Big Data paradigms. Its a very simple and straightforward way to bridge the gap between in-memory data structures and long-term storage. It has the convenience of Java SDK TreeMap and TreeSet classes and is used the same easy way, but it includes rollback through undo logging to checkpoint data so it does not wind up in an unknown state regardless of failures. Data storage in the exabyte range is possible using filesystem and/or memory-mapped IO. Three levels of configurable write-through caching at different granularities ensure performance.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Universal Java Matrix Package

    Universal Java Matrix Package

    sparse and dense matrix, linear algebra, visualization, big data

    The Universal Java Matrix Package (UJMP) is an open source Java library which provides sparse and dense matrix classes, as well as a large number of calculations for linear algebra such as matrix multiplication or matrix inverse. Operations such as mean, correlation, standard deviation, replacement of missing values or the calculation of mutual information are supported, too. The Universal Java Matrix Package provides various visualization methods, import and export filters for a large number of file formats, and even the possibility to link to JDBC databases. Multi-dimensional matrices as well as generic matrices with a specified object type are supported and very large matrices can be handled even when they do not fit into memory.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Powering the best of the internet | Fastly Icon
    Powering the best of the internet | Fastly

    Fastly's edge cloud platform delivers faster, safer, and more scalable sites and apps to customers.

    Ensure your websites, applications and services can effortlessly handle the demands of your users with Fastly. Fastly’s portfolio is designed to be highly performant, personalized and secure while seamlessly scaling to support your growth.
    Try for free
  • 5
    apache spark data pipeline osDQ

    apache spark data pipeline osDQ

    osDQ dedicated to create apache spark based data pipeline using JSON

    This is an offshoot project of open source data quality (osDQ) project https://sourceforge.net/projects/dataquality/ This sub project will create apache spark based data pipeline where JSON based metadata (file) will be used to run data processing , data pipeline , data quality and data preparation and data modeling features for big data. This uses java API of apache spark. It can run in local mode also. Get json example at https://github.com/arrahtech/osdq-spark How to run Unzip the zip file Windows : java -cp .\lib\*;osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c .\example\samplerun.json Mac UNIX java -cp ./lib/*:./osdq-spark-0.0.1.jar org.arrah.framework.spark.run.TransformRunner -c ./example/samplerun.json For those on windows, you need to have hadoop distribtion unzipped on local drive and HADOOP_HOME set. Also copy winutils.exe from here into HADOOP_HOME\bin
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    giServer

    giServer

    giServer the easy to use and extensible batch and integration server

    The giServer is an easy-to-use integration server for process automation and event-driven or scheduled execution of batch jobs. Instead of using complex XML configuration files an elaborate GUI for batch job management is included. Some possible usage scenarios are: - Automatic processing of incoming data files - Big Data applications - Process automation - Data Mining/Aggregation applications - Automatic Reporting - Processing and analysis of database records
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.