[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths #18971


Closed
wants to merge 2 commits into from

Conversation


@HyukjinKwon HyukjinKwon commented Aug 17, 2017

What changes were proposed in this pull request?

org.apache.spark.deploy.RPackageUtilsSuite

 - jars without manifest return false *** FAILED *** (109 milliseconds)
   java.io.IOException: Unable to delete file: C:\projects\spark\target\tmp\1500266936418-0\dep1-c.jar

org.apache.spark.deploy.SparkSubmitSuite

 - download one file to local *** FAILED *** (16 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2630198944759847458.jar

 - download list of files to local *** FAILED *** (0 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2783551769392880031.jar

org.apache.spark.scheduler.ReplayListenerSuite

 - Replay compressed inprogress log file succeeding on partial read (156 milliseconds)
   Exception encountered when attempting to run a suite with class name: 
   org.apache.spark.scheduler.ReplayListenerSuite *** ABORTED *** (1 second, 391 milliseconds)
   java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-8f3cacd6-faad-4121-b901-ba1bba8025a0

 - End-to-end replay *** FAILED *** (62 milliseconds)
   java.io.IOException: No FileSystem for scheme: C

 - End-to-end replay with compression *** FAILED *** (110 milliseconds)
   java.io.IOException: No FileSystem for scheme: C

org.apache.spark.sql.hive.StatisticsSuite

 - SPARK-21079 - analyze table with location different than that of individual partitions *** FAILED *** (875 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

 - SPARK-21079 - analyze partitioned table with only a subset of partitions visible *** FAILED *** (47 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

Note: this PR does not fix:

org.apache.spark.deploy.SparkSubmitSuite

 - launch simple application with spark-submit with redaction *** FAILED *** (172 milliseconds)
   java.util.NoSuchElementException: next on empty iterator

I can't reproduce this on my Windows machine, but it appears to fail consistently on AppVeyor. This one is still unclear to me and hard to debug, so I did not include it for now.

Note: it looks like there are more instances, but they are hard to identify, partly due to flakiness and partly due to noisy logs and errors. I will probably make another pass later if that is fine.

How was this patch tested?

Manually via AppVeyor:

Before

After

Jenkins tests are required and AppVeyor tests will be triggered.

val logDir = new File(testDir.getAbsolutePath, "test-replay")
// Here, the `Path` is created from the URI rather than the absolute path, so that
// the string representation of this `Path` carries the explicit file scheme.
val logDirPath = new Path(logDir.toURI)
HyukjinKwon (Member Author):

If we create this from the absolute path, it appears that the string ends up in the C:/../.. form, and Utils.resolveURI recognises C as the scheme, causing a "No FileSystem for scheme: C" exception.

It looks like Path can handle this, but we can't currently replace Utils.resolveURI with Path because of some corner cases where the behaviour changes.

For example, with Path, "hdfs:///root/spark.jar#app.jar" becomes
 "hdfs:///root/spark.jar%23app.jar" but with Utils.resolveURI,
 "hdfs:///root/spark.jar#app.jar" becomes "hdfs:///root/spark.jar#app.jar".
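The same distinction can be sketched with plain java.net.URI (illustrative only, not Spark code): parsing the full string keeps the fragment, while constructors that treat the whole string as a path percent-encode the #.

```scala
import java.net.URI

// Parsing the full string keeps "#app.jar" as a URI fragment,
// matching what Utils.resolveURI preserves.
val kept = URI.create("hdfs:///root/spark.jar#app.jar")
// kept.getFragment is "app.jar"

// The multi-argument constructor treats the whole string as a path
// component and percent-encodes the "#", roughly what happens when the
// string goes through Hadoop's Path.
val encoded = new URI("hdfs", null, "/root/spark.jar#app.jar", null)
// encoded.toString contains "%23" instead of "#"
```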

Utils.resolveURI is being called via,

sc = new SparkContext("local-cluster[2,1,1024]", "Test replay", conf)

Some(Utils.resolveURI(unresolvedDir))

@HyukjinKwon (Member Author) commented Aug 17, 2017

This test itself was added a long time ago, but it looks like there was a recent change related to this code path - edcb878.

val conf = EventLoggingListenerSuite.getLoggingConf(logFilePath)

I think this simple test describes what I intended:

Before

scala> import org.apache.hadoop.fs.Path
import org.apache.hadoop.fs.Path

scala> val path = new Path("C:\\a\\b\\c")
path: org.apache.hadoop.fs.Path = C:/a/b/c

scala> path.toString
res0: String = C:/a/b/c

scala> path.toUri.toString
res1: String = /C:/a/b/c

After

scala> import org.apache.hadoop.fs.Path
import org.apache.hadoop.fs.Path

scala> import java.io.File
import java.io.File

scala> val file = new File("C:\\a\\b\\c")
file: java.io.File = C:\a\b\c

scala> val path = new Path(file.toURI)
path: org.apache.hadoop.fs.Path = file:/C:/a/b/c

scala> path.toString
res2: String = file:/C:/a/b/c

scala> path.toUri.toString
res3: String = file:/C:/a/b/c

Please correct me if I am mistaken here.
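For reference, the cross-platform guarantee relied on above is that File.toURI always attaches an explicit file scheme (a minimal illustration, not Spark code):

```scala
import java.io.File

// File.toURI resolves the file against the working directory and always
// attaches a "file" scheme, on Windows and Unix alike, so later URI
// parsing cannot mistake a drive letter for the scheme.
val uri = new File("a/b/c").toURI
// uri.getScheme is always "file"
```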

HyukjinKwon (Member Author):

cc @vanzin, I believe I need your review here. Could you take a look when you have some time?

HyukjinKwon (Member Author):

cc @sarutak too. The other changes should be fine, as they are the kind of fix I have usually made, but I am less sure about this one. The current state conservatively fixes only the test, but I think I need a sign-off on this.

Contributor:

From your description it sounds like Utils.resolveURI might not be correct for Windows paths. I don't have Windows available, so if you could try these it might help in understanding:

Utils.resolveURI("C:\\WINDOWS")
Utils.resolveURI("/C:/WINDOWS")
Utils.resolveURI("C:/WINDOWS")

The first two should return the same thing ("file:/C:/WINDOWS" or something along those lines) while the third I'm not sure, since it's ambiguous. But that's probably the cause of the change of behavior.

Anyway the code change looks correct.

@HyukjinKwon (Member Author) commented Aug 23, 2017

I have a Windows machine set up properly as a dev environment, and also a set of scripts to automatically run specific Scala tests with test-only via AppVeyor against a PR, so it is not really hard to test. I am fine with being asked to try more cases in the future.

println("=== org.apache.spark.util.Utils.resolveURI")
println(Utils.resolveURI("C:\\WINDOWS").toString)
println(Utils.resolveURI("/C:/WINDOWS").toString)
println(Utils.resolveURI("C:/WINDOWS").toString)
println
println(Utils.resolveURI("C:\\WINDOWS").getScheme)
println(Utils.resolveURI("/C:/WINDOWS").getScheme)
println(Utils.resolveURI("C:/WINDOWS").getScheme)
println

println("=== java.io.File")
println(new File("C:\\WINDOWS").toURI.toString)
println(new File("/C:/WINDOWS").toURI.toString)
println(new File("C:/WINDOWS").toURI.toString)
println
println(new File("C:\\WINDOWS").toURI.getScheme)
println(new File("/C:/WINDOWS").toURI.getScheme)
println(new File("C:/WINDOWS").toURI.getScheme)
println

println("=== org.apache.hadoop.fs.Path")
println(new Path("C:\\WINDOWS").toUri.toString)
println(new Path("/C:/WINDOWS").toUri.toString)
println(new Path("C:/WINDOWS").toUri.toString)
println
println(new Path("C:\\WINDOWS").toString)
println(new Path("/C:/WINDOWS").toString)
println(new Path("C:/WINDOWS").toString)
println
println(new Path("C:\\WINDOWS").toUri.getScheme)
println(new Path("/C:/WINDOWS").toUri.getScheme)
println(new Path("C:/WINDOWS").toUri.getScheme)
println

println("=== java.io.File.toURI and org.apache.hadoop.fs.Path")
println(new Path(new File("C:\\WINDOWS").toURI).toUri.toString)
println(new Path(new File("/C:/WINDOWS").toURI).toUri.toString)
println(new Path(new File("C:/WINDOWS").toURI).toUri.toString)
println
println(new Path(new File("C:\\WINDOWS").toURI).toString)
println(new Path(new File("/C:/WINDOWS").toURI).toString)
println(new Path(new File("C:/WINDOWS").toURI).toString)
println
println(new Path(new File("C:\\WINDOWS").toURI).toUri.getScheme)
println(new Path(new File("/C:/WINDOWS").toURI).toUri.getScheme)
println(new Path(new File("C:/WINDOWS").toURI).toUri.getScheme)

produced

=== org.apache.spark.util.Utils.resolveURI
file:/C:/WINDOWS/
file:/C:/WINDOWS/
C:/WINDOWS

file
file
C

=== java.io.File
file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file
file
file

=== org.apache.hadoop.fs.Path
/C:/WINDOWS
/C:/WINDOWS
/C:/WINDOWS

C:/WINDOWS
C:/WINDOWS
C:/WINDOWS

null
null
null

=== java.io.File.toURI and org.apache.hadoop.fs.Path
file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file
file
file

@sarutak (Member) commented Aug 23, 2017

@HyukjinKwon I think this change itself looks reasonable.
resolveURI should be fixed so that Windows paths are handled correctly.
If C:/path/to/some/file is passed to resolveURI, the drive letter "C" should not be parsed as the URI scheme.
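One hypothetical way such a resolver could distinguish a drive letter from a real scheme (a sketch only, not Spark's actual fix; resolveLocalOrUri is an invented name):

```scala
import java.io.File
import java.net.{URI, URISyntaxException}

// Hypothetical sketch: treat any one-character "scheme" (e.g. "C" in
// "C:/WINDOWS") as a Windows drive letter and fall back to a file URI.
def resolveLocalOrUri(raw: String): URI = {
  val parsed =
    try Some(new URI(raw))
    catch { case _: URISyntaxException => None } // e.g. backslash paths
  parsed match {
    case Some(uri) if uri.getScheme != null && uri.getScheme.length > 1 => uri
    case _ => new File(raw).toURI // yields an explicit "file" scheme
  }
}
```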

@HyukjinKwon (Member Author):

Build started: [TESTS] org.apache.spark.deploy.RPackageUtilsSuite PR-18971
Build started: [TESTS] org.apache.spark.deploy.SparkSubmitSuite PR-18971
Build started: [TESTS] org.apache.spark.scheduler.ReplayListenerSuite PR-18971
Build started: [TESTS] org.apache.spark.sql.hive.StatisticsSuite PR-18971
Diff: master...spark-test:D07ED879-66AE-454C-B77C-C6463F79387F

Due to flakiness, some tests might fail; I will rerun them soon if they do. org.apache.spark.deploy.SparkSubmitSuite is expected to fail as described in the PR.

SparkQA commented Aug 17, 2017

Test build #80790 has finished for PR 18971 at commit 9019548.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title [WIP][SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths [SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths Aug 17, 2017
Utils.tryWithResource(new JarFile(jar)) { jarFile =>
assert(jarFile.getManifest == null, "jar file should have null manifest")
assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
}
HyukjinKwon (Member Author):

This simply closes the JarFile; it should be closed once the test is done with it.
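Utils.tryWithResource follows the usual loan pattern; a minimal sketch of the idea (assuming nothing beyond java.io.Closeable, and not Spark's exact signature):

```scala
import java.io.Closeable

// Loan pattern sketch: the resource is closed even if the body throws,
// which is what lets Windows delete the jar file after the test.
def withResource[R <: Closeable, T](createResource: => R)(f: R => T): T = {
  val resource = createResource
  try f(resource) finally resource.close()
}
```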

@@ -824,7 +824,7 @@ class SparkSubmitSuite
val hadoopConf = new Configuration()
val tmpDir = Files.createTempDirectory("tmp").toFile
updateConfWithFakeS3Fs(hadoopConf)
val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
val sourcePath = s"s3a://${jarFile.toURI.getPath}"
@HyukjinKwon (Member Author) commented Aug 26, 2017

Windows:

Before:

scala> f.getAbsolutePath
res2: String = C:\a\b\c

After:

scala> f.toURI.getPath
res1: String = /C:/a/b/c

Linux:

Before:

scala> new File("/a/b/c").getAbsolutePath
res0: String = /a/b/c

After:

scala> new File("/a/b/c").toURI.getPath
res1: String = /a/b/c
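Why this matters for the s3a:// interpolation: toURI.getPath always yields a forward-slash path with a leading slash, so the concatenated string stays a parseable URI on both platforms (a sketch with an illustrative jar path):

```scala
import java.io.File
import java.net.URI

// "/tmp/test.jar" here is illustrative. On Linux getPath gives
// "/tmp/test.jar"; on Windows it would give "/C:/...", both of which
// parse cleanly after "s3a://", unlike a raw "C:\..." absolute path.
val jarFile = new File("/tmp/test.jar")
val sourcePath = s"s3a://${jarFile.toURI.getPath}"
val parsed = new URI(sourcePath) // no URISyntaxException
```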

@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp

// Verify the replay returns the events given the input maybe truncated.
val logData = EventLoggingListener.openEventLog(logFilePath, fileSystem)
val failingStream = new EarlyEOFInputStream(logData, buffered.size - 10)
replayer.replay(failingStream, logFilePath.toString, true)
Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 10)) { failingStream =>
HyukjinKwon (Member Author):

Here EarlyEOFInputStream was not being closed.

}

override def close(): Unit = in.close()
HyukjinKwon (Member Author):

EarlyEOFInputStream's underlying stream was not being closed; this override forwards close() to it.
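A self-contained sketch of the pattern (names and sizes are illustrative, not the suite's actual EarlyEOFInputStream): a truncating wrapper must forward close() to the stream it wraps, or the underlying file handle leaks.

```scala
import java.io.InputStream

// Illustrative wrapper: reports EOF after maxBytes bytes, and forwards
// close() so the underlying stream (and its file handle) is released.
class TruncatingInputStream(in: InputStream, maxBytes: Int) extends InputStream {
  private var readSoFar = 0
  override def read(): Int =
    if (readSoFar >= maxBytes) -1
    else { readSoFar += 1; in.read() }
  override def close(): Unit = in.close() // the missing piece in the fix
}
```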

@@ -203,7 +203,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
}

sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
HyukjinKwon (Member Author):

These tests do not look dedicated to testing path handling, so I fixed them as above.
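The difference the toURI change makes can be sketched with a hypothetical table location (the table name and path are illustrative):

```scala
import java.io.File

// Interpolating the File itself yields an OS-specific string
// ("C:\...\t1" on Windows), which Hive's metastore cannot turn into a
// Path; toURI always yields a scheme-qualified forward-slash form.
val location = new File("/tmp/warehouse/t1")
val fragile  = s"ALTER TABLE t1 SET LOCATION '$location'"
val portable = s"ALTER TABLE t1 SET LOCATION '${location.toURI}'"
// portable embeds "file:/tmp/warehouse/t1"
```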

SparkQA commented Aug 26, 2017

Test build #81151 has finished for PR 18971 at commit 236b986.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

retest this please

SparkQA commented Aug 26, 2017

Test build #81152 has finished for PR 18971 at commit 236b986.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987 (Contributor) left a comment

LGTM

val failingStream2 = new EarlyEOFInputStream(logData2, buffered.size - 10)
intercept[EOFException] {
replayer.replay(failingStream2, logFilePath.toString, false)
Utils.tryWithResource(new EarlyEOFInputStream(logData2, buffered.size - 10)) { failingStream2 =>
Contributor:

nit: I think we can still use failingStream here?

HyukjinKwon (Member Author):

It looks so, but I am not confident enough to change this. I will keep it in mind and point it out when someone touches the code around this.

@HyukjinKwon (Member Author):

I simply rebased here.

SparkQA commented Aug 30, 2017

Test build #81239 has finished for PR 18971 at commit 7d3716c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

Merged to master.

@HyukjinKwon (Member Author):

Thank you @vanzin, @sarutak, @jiangxb1987 and @srowen for reviewing this.

@asfgit asfgit closed this in b30a11a Aug 30, 2017
@HyukjinKwon HyukjinKwon deleted the windows-fixes branch January 2, 2018 03:37
6 participants