mufasa

Mufasa is a simple query processing engine (aka a dataframe library).

WIP: It is still in development.

Installation

pip install git+https://github.com/zeenfaizpy/mufasa.git@main

Usage

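from mufasa.core import ExecutionContext
# Assumption: the aggregate `sum` is exported by mufasa.functions; Python's
# built-in sum() cannot aggregate a column expression.
from mufasa.functions import col, lit, sum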

ctx = ExecutionContext()
df = (
    ctx.csv("employee.csv")
    # keep dept and salary: the SQL query, filter and aggregation below need them
    .select(col('state'), col('dept'), col('first_name'), col('last_name'), col('salary'))
)

# save it to temp table, then query using sql
# (registered before aggregating, so first_name and salary are still available)
df.create_or_replace_table('employees')
new_df = ctx.sql("select first_name, salary from employees where salary > 10000")

# where
df = df.filter(col("salary").gt(lit(12000)))

# group by and aggregations
df = (
    df.group_by(col('dept'))
    .agg(sum(col('salary')))
)


# print the logical plan
df.show_plan()

# print the final data
df.collect()
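
The snippet above reads an `employee.csv` that is not described here. To try it locally, the following sketch writes a small sample file and computes the same "filter, then sum salary per dept" result in plain Python, as a cross-check for what the engine returns. The column names are inferred from the usage code; the rows and the helper names `write_sample_csv` and `salary_by_dept` are hypothetical and are not part of mufasa.

```python
import csv
from collections import defaultdict
from pathlib import Path

# Hypothetical sample rows; only the column names come from the usage snippet.
ROWS = [
    {"first_name": "Asha", "last_name": "Rao", "state": "CA", "dept": "eng", "salary": 15000},
    {"first_name": "Ben", "last_name": "Ortiz", "state": "NY", "dept": "eng", "salary": 11000},
    {"first_name": "Chen", "last_name": "Li", "state": "CA", "dept": "ops", "salary": 13000},
    {"first_name": "Dana", "last_name": "Kim", "state": "TX", "dept": "ops", "salary": 9000},
]


def write_sample_csv(path="employee.csv"):
    """Write the sample rows to `path`, including a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(ROWS[0]))
        writer.writeheader()
        writer.writerows(ROWS)
    return Path(path)


def salary_by_dept(path, min_salary=12000):
    """Plain-Python reference: keep rows with salary > min_salary, sum salary per dept."""
    totals = defaultdict(int)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            salary = int(row["salary"])  # CSV values are read as strings
            if salary > min_salary:
                totals[row["dept"]] += salary
    return dict(totals)


if __name__ == "__main__":
    csv_path = write_sample_csv()
    print(salary_by_dept(csv_path))
```

With these sample rows, the reference computation keeps one employee in each department, so the grouped result should have two rows. How mufasa names the aggregated column is not specified here.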

Features

DataFrame API
SQL support with catalog
PySpark-compatible API

SQL Operations

License

The GNU license. Please check LICENSE for more details.
