
Conversation

wbo4958 commented Oct 30, 2025

SparkContext and Py4J are unavailable in the Spark Connect environment. This PR uses consistent code wherever possible so that the same functionality works across both classic Spark and Spark Connect modes.

I ran the CPU and GPU tests separately on Spark 3.5.1, Spark 4.0.0 (classic), and Spark 4.0.0 (Connect), with all tests passing.
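For context, the general pattern in this change (a minimal sketch, not the exact code from the PR) is to go through session-level APIs that exist in both modes instead of through SparkContext:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Classic-only: sparkContext is backed by Py4J and is not available
    # in a Spark Connect session.
    # app_id = spark.sparkContext.applicationId

    # The session conf is available in both classic Spark and Spark Connect,
    # so the application id is read from "spark.app.id" instead.
    app_id = spark.conf.get("spark.app.id")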

wbo4958 (Author) commented Oct 30, 2025

Hi @eordentlich, @jihoonson, please help review it. thx very much.

<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
firestarman commented Oct 31, 2025


So will this plugin use the version defined by <maven.compiler.source> and <maven.compiler.target>? I saw you removed the relevant configs.

execution_time_list: a list recording query execution time.
"""
-spark_app_id = spark_session.sparkContext.applicationId
+spark_app_id = spark_session.conf.get("spark.app.id")


Does the spark.app.id config always exist?

'query': query_name,
}

def _is_above_spark_4(self):


nit:

Suggested change:
-def _is_above_spark_4(self):
+def _is_spark_400_or_later(self):
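For reference, one way such a helper could be written so it also works under Spark Connect (a minimal sketch; the spark_session attribute name is an assumption, not necessarily what the PR uses):

    def _is_spark_400_or_later(self) -> bool:
        # SparkSession.version is available in both classic Spark and
        # Spark Connect, unlike SparkContext-based lookups.
        # e.g. "4.0.0" -> major version 4.
        major = int(self.spark_session.version.split(".")[0])
        return major >= 4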
