Porting to Spark for **very large industrial databases**: should I use SparkSQL operators only, or combine them with SQL queries?