[SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths #18971


Closed
wants to merge 2 commits into from

Conversation


@HyukjinKwon HyukjinKwon commented Aug 17, 2017

What changes were proposed in this pull request?

org.apache.spark.deploy.RPackageUtilsSuite

 - jars without manifest return false *** FAILED *** (109 milliseconds)
   java.io.IOException: Unable to delete file: C:\projects\spark\target\tmp\1500266936418-0\dep1-c.jar

org.apache.spark.deploy.SparkSubmitSuite

 - download one file to local *** FAILED *** (16 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2630198944759847458.jar

 - download list of files to local *** FAILED *** (0 milliseconds)
   java.net.URISyntaxException: Illegal character in authority at index 6: s3a://C:\projects\spark\target\tmp\test2783551769392880031.jar

org.apache.spark.scheduler.ReplayListenerSuite

 - Replay compressed inprogress log file succeeding on partial read (156 milliseconds)
   Exception encountered when attempting to run a suite with class name: 
   org.apache.spark.scheduler.ReplayListenerSuite *** ABORTED *** (1 second, 391 milliseconds)
   java.io.IOException: Failed to delete: C:\projects\spark\target\tmp\spark-8f3cacd6-faad-4121-b901-ba1bba8025a0

 - End-to-end replay *** FAILED *** (62 milliseconds)
   java.io.IOException: No FileSystem for scheme: C

 - End-to-end replay with compression *** FAILED *** (110 milliseconds)
   java.io.IOException: No FileSystem for scheme: C

org.apache.spark.sql.hive.StatisticsSuite

 - SPARK-21079 - analyze table with location different than that of individual partitions *** FAILED *** (875 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

 - SPARK-21079 - analyze partitioned table with only a subset of partitions visible *** FAILED *** (47 milliseconds)
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:java.lang.IllegalArgumentException: Can not create a Path from an empty string);

Note: this PR does not fix:

org.apache.spark.deploy.SparkSubmitSuite

 - launch simple application with spark-submit with redaction *** FAILED *** (172 milliseconds)
   java.util.NoSuchElementException: next on empty iterator

I can't reproduce this on my Windows machine, but it appears to fail consistently on AppVeyor. This one is still unclear to me and hard to debug, so I did not include it for now.

Note: it looks like there are more instances, but they are hard to identify, partly due to flakiness and partly due to noisy logs and errors. I will probably make another pass later if that is fine.

How was this patch tested?

Manually via AppVeyor:

Before

After

Jenkins tests are required and AppVeyor tests will be triggered.

val logDir = new File(testDir.getAbsolutePath, "test-replay")
// Here, the `Path` is created from the URI rather than the absolute path, so that
// the string representation of this `Path` carries the explicit file scheme.
val logDirPath = new Path(logDir.toURI)
HyukjinKwon (Member Author):

If we create this from the absolute path, it appears that the string ends up in the C:/../.. form, and Utils.resolveURI recognises C as the scheme, causing a "No FileSystem for scheme: C" exception.

It looks like Path can handle this, but we can't currently replace Utils.resolveURI with Path because of some corner cases where the behaviour changes.

For example, with Path, "hdfs:///root/spark.jar#app.jar" becomes
 "hdfs:///root/spark.jar%23app.jar" but with Utils.resolveURI,
 "hdfs:///root/spark.jar#app.jar" becomes "hdfs:///root/spark.jar#app.jar".
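The same distinction can be sketched with plain java.net.URI (illustrative only, not Spark code): parsing the full string keeps the fragment, while constructors that treat the whole string as a path percent-encode the #.

```scala
import java.net.URI

// Parsing the full string keeps "#app.jar" as a URI fragment,
// matching what Utils.resolveURI preserves.
val kept = URI.create("hdfs:///root/spark.jar#app.jar")
// kept.getFragment is "app.jar"

// The multi-argument constructor treats the whole string as a path
// component and percent-encodes the "#", roughly what happens when the
// string goes through Hadoop's Path.
val encoded = new URI("hdfs", null, "/root/spark.jar#app.jar", null)
// encoded.toString contains "%23" instead of "#"
```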

Utils.resolveURI is being called via,

sc = new SparkContext("local-cluster[2,1,1024]", "Test replay", conf)

Some(Utils.resolveURI(unresolvedDir))

@HyukjinKwon (Member Author) commented Aug 17, 2017

This test itself was added a long time ago, but it looks like there was a recent change related to this code path - edcb878.

val conf = EventLoggingListenerSuite.getLoggingConf(logFilePath)

I think this simple test describes what I intended:

Before

scala> import org.apache.hadoop.fs.Path
import org.apache.hadoop.fs.Path

scala> val path = new Path("C:\\a\\b\\c")
path: org.apache.hadoop.fs.Path = C:/a/b/c

scala> path.toString
res0: String = C:/a/b/c

scala> path.toUri.toString
res1: String = /C:/a/b/c

After

scala> import org.apache.hadoop.fs.Path
import org.apache.hadoop.fs.Path

scala> import java.io.File
import java.io.File

scala> val file = new File("C:\\a\\b\\c")
file: java.io.File = C:\a\b\c

scala> val path = new Path(file.toURI)
path: org.apache.hadoop.fs.Path = file:/C:/a/b/c

scala> path.toString
res2: String = file:/C:/a/b/c

scala> path.toUri.toString
res3: String = file:/C:/a/b/c

Please correct me if I am mistaken here.
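For reference, the cross-platform guarantee relied on above is that File.toURI always attaches an explicit file scheme (a minimal illustration, not Spark code):

```scala
import java.io.File

// File.toURI resolves the file against the working directory and always
// attaches a "file" scheme, on Windows and Unix alike, so later URI
// parsing cannot mistake a drive letter for the scheme.
val uri = new File("a/b/c").toURI
// uri.getScheme is always "file"
```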

HyukjinKwon (Member Author):

cc @vanzin, I believe I need your review here. Could you take a look when you have some time?

HyukjinKwon (Member Author):

cc @sarutak too. The other changes should be fine, as they are the kind of fix I have usually made, but I am less sure about this one. The current state conservatively fixes only the test, but I think I need a sign-off on this.

Contributor:

From your description it sounds like Utils.resolveURI might not be correct for Windows paths. I don't have Windows available, so if you could try these it might help in understanding:

Utils.resolveURI("C:\\WINDOWS")
Utils.resolveURI("/C:/WINDOWS")
Utils.resolveURI("C:/WINDOWS")

The first two should return the same thing ("file:/C:/WINDOWS" or something along those lines) while the third I'm not sure, since it's ambiguous. But that's probably the cause of the change of behavior.

Anyway the code change looks correct.

@HyukjinKwon (Member Author) commented Aug 23, 2017

I have a Windows machine set up properly as a dev environment, and also a set of scripts to automatically run specific Scala tests with test-only via AppVeyor against a PR, so it is not really hard to test. I am fine with being asked to try more cases in the future.

println("=== org.apache.spark.util.Utils.resolveURI")
println(Utils.resolveURI("C:\\WINDOWS").toString)
println(Utils.resolveURI("/C:/WINDOWS").toString)
println(Utils.resolveURI("C:/WINDOWS").toString)
println
println(Utils.resolveURI("C:\\WINDOWS").getScheme)
println(Utils.resolveURI("/C:/WINDOWS").getScheme)
println(Utils.resolveURI("C:/WINDOWS").getScheme)
println

println("=== java.io.File")
println(new File("C:\\WINDOWS").toURI.toString)
println(new File("/C:/WINDOWS").toURI.toString)
println(new File("C:/WINDOWS").toURI.toString)
println
println(new File("C:\\WINDOWS").toURI.getScheme)
println(new File("/C:/WINDOWS").toURI.getScheme)
println(new File("C:/WINDOWS").toURI.getScheme)
println

println("=== org.apache.hadoop.fs.Path")
println(new Path("C:\\WINDOWS").toUri.toString)
println(new Path("/C:/WINDOWS").toUri.toString)
println(new Path("C:/WINDOWS").toUri.toString)
println
println(new Path("C:\\WINDOWS").toString)
println(new Path("/C:/WINDOWS").toString)
println(new Path("C:/WINDOWS").toString)
println
println(new Path("C:\\WINDOWS").toUri.getScheme)
println(new Path("/C:/WINDOWS").toUri.getScheme)
println(new Path("C:/WINDOWS").toUri.getScheme)
println

println("=== java.io.File.toURI and org.apache.hadoop.fs.Path")
println(new Path(new File("C:\\WINDOWS").toURI).toUri.toString)
println(new Path(new File("/C:/WINDOWS").toURI).toUri.toString)
println(new Path(new File("C:/WINDOWS").toURI).toUri.toString)
println
println(new Path(new File("C:\\WINDOWS").toURI).toString)
println(new Path(new File("/C:/WINDOWS").toURI).toString)
println(new Path(new File("C:/WINDOWS").toURI).toString)
println
println(new Path(new File("C:\\WINDOWS").toURI).toUri.getScheme)
println(new Path(new File("/C:/WINDOWS").toURI).toUri.getScheme)
println(new Path(new File("C:/WINDOWS").toURI).toUri.getScheme)

produced

=== org.apache.spark.util.Utils.resolveURI
file:/C:/WINDOWS/
file:/C:/WINDOWS/
C:/WINDOWS

file
file
C

=== java.io.File
file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file
file
file

=== org.apache.hadoop.fs.Path
/C:/WINDOWS
/C:/WINDOWS
/C:/WINDOWS

C:/WINDOWS
C:/WINDOWS
C:/WINDOWS

null
null
null

=== java.io.File.toURI and org.apache.hadoop.fs.Path
file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file:/C:/WINDOWS/
file:/C:/WINDOWS/
file:/C:/WINDOWS/

file
file
file

@sarutak (Member) commented Aug 23, 2017

@HyukjinKwon I think this change itself looks reasonable.
resolveURI should be fixed so that Windows paths are handled correctly.
If C:/path/to/some/file is passed to resolveURI, the drive letter "C" should not be parsed as the URI scheme.
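One hypothetical way such a resolver could distinguish a drive letter from a real scheme (a sketch only, not Spark's actual fix; resolveLocalOrUri is an invented name):

```scala
import java.io.File
import java.net.{URI, URISyntaxException}

// Hypothetical sketch: treat any one-character "scheme" (e.g. "C" in
// "C:/WINDOWS") as a Windows drive letter and fall back to a file URI.
def resolveLocalOrUri(raw: String): URI = {
  val parsed =
    try Some(new URI(raw))
    catch { case _: URISyntaxException => None } // e.g. backslash paths
  parsed match {
    case Some(uri) if uri.getScheme != null && uri.getScheme.length > 1 => uri
    case _ => new File(raw).toURI // yields an explicit "file" scheme
  }
}
```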

@HyukjinKwon (Member Author):

Build started: [TESTS] org.apache.spark.deploy.RPackageUtilsSuite PR-18971
Build started: [TESTS] org.apache.spark.deploy.SparkSubmitSuite PR-18971
Build started: [TESTS] org.apache.spark.scheduler.ReplayListenerSuite PR-18971
Build started: [TESTS] org.apache.spark.sql.hive.StatisticsSuite PR-18971
Diff: master...spark-test:D07ED879-66AE-454C-B77C-C6463F79387F

Due to flakiness, some tests might fail; I will rerun them soon if they do. org.apache.spark.deploy.SparkSubmitSuite is expected to fail as described in the PR.

SparkQA commented Aug 17, 2017

Test build #80790 has finished for PR 18971 at commit 9019548.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon HyukjinKwon changed the title [WIP][SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths [SPARK-21764][TESTS] Fix tests failures on Windows: resources not being closed and incorrect paths Aug 17, 2017
Utils.tryWithResource(new JarFile(jar)) { jarFile =>
assert(jarFile.getManifest == null, "jar file should have null manifest")
assert(!RPackageUtils.checkManifestForR(jarFile), "null manifest should return false")
}
HyukjinKwon (Member Author):

This simply closes the JarFile; it should be closed once the test is done with it.
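Utils.tryWithResource follows the usual loan pattern; a minimal sketch of the idea (assuming nothing beyond java.io.Closeable, and not Spark's exact signature):

```scala
import java.io.Closeable

// Loan pattern sketch: the resource is closed even if the body throws,
// which is what lets Windows delete the jar file after the test.
def withResource[R <: Closeable, T](createResource: => R)(f: R => T): T = {
  val resource = createResource
  try f(resource) finally resource.close()
}
```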

@@ -824,7 +824,7 @@ class SparkSubmitSuite
val hadoopConf = new Configuration()
val tmpDir = Files.createTempDirectory("tmp").toFile
updateConfWithFakeS3Fs(hadoopConf)
val sourcePath = s"s3a://${jarFile.getAbsolutePath}"
val sourcePath = s"s3a://${jarFile.toURI.getPath}"
@HyukjinKwon (Member Author) commented Aug 26, 2017

Windows:

Before:

scala> f.getAbsolutePath
res2: String = C:\a\b\c

After:

scala> f.toURI.getPath
res1: String = /C:/a/b/c

Linux:

Before:

scala> new File("/a/b/c").getAbsolutePath
res0: String = /a/b/c

After:

scala> new File("/a/b/c").toURI.getPath
res1: String = /a/b/c
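Why this matters for the s3a:// interpolation: toURI.getPath always yields a forward-slash path with a leading slash, so the concatenated string stays a parseable URI on both platforms (a sketch with an illustrative jar path):

```scala
import java.io.File
import java.net.URI

// "/tmp/test.jar" here is illustrative. On Linux getPath gives
// "/tmp/test.jar"; on Windows it would give "/C:/...", both of which
// parse cleanly after "s3a://", unlike a raw "C:\..." absolute path.
val jarFile = new File("/tmp/test.jar")
val sourcePath = s"s3a://${jarFile.toURI.getPath}"
val parsed = new URI(sourcePath) // no URISyntaxException
```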

@@ -112,17 +112,19 @@ class ReplayListenerSuite extends SparkFunSuite with BeforeAndAfter with LocalSp

// Verify the replay returns the events given the input maybe truncated.
val logData = EventLoggingListener.openEventLog(logFilePath, fileSystem)
val failingStream = new EarlyEOFInputStream(logData, buffered.size - 10)
replayer.replay(failingStream, logFilePath.toString, true)
Utils.tryWithResource(new EarlyEOFInputStream(logData, buffered.size - 10)) { failingStream =>
HyukjinKwon (Member Author):

Here EarlyEOFInputStream was not being closed.

}

override def close(): Unit = in.close()
HyukjinKwon (Member Author):

EarlyEOFInputStream's underlying stream was not being closed; this override forwards close() to it.
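A self-contained sketch of the pattern (names and sizes are illustrative, not the suite's actual EarlyEOFInputStream): a truncating wrapper must forward close() to the stream it wraps, or the underlying file handle leaks.

```scala
import java.io.InputStream

// Illustrative wrapper: reports EOF after maxBytes bytes, and forwards
// close() so the underlying stream (and its file handle) is released.
class TruncatingInputStream(in: InputStream, maxBytes: Int) extends InputStream {
  private var readSoFar = 0
  override def read(): Int =
    if (readSoFar >= maxBytes) -1
    else { readSoFar += 1; in.read() }
  override def close(): Unit = in.close() // the missing piece in the fix
}
```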

@@ -203,7 +203,7 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
sql(s"INSERT INTO TABLE $tableName PARTITION (ds='$ds') SELECT * FROM src")
}

sql(s"ALTER TABLE $tableName SET LOCATION '$path'")
sql(s"ALTER TABLE $tableName SET LOCATION '${path.toURI}'")
HyukjinKwon (Member Author):

These tests do not look dedicated to testing path handling, so I fixed them as above.
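The difference the toURI change makes can be sketched with a hypothetical table location (the table name and path are illustrative):

```scala
import java.io.File

// Interpolating the File itself yields an OS-specific string
// ("C:\...\t1" on Windows), which Hive's metastore cannot turn into a
// Path; toURI always yields a scheme-qualified forward-slash form.
val location = new File("/tmp/warehouse/t1")
val fragile  = s"ALTER TABLE t1 SET LOCATION '$location'"
val portable = s"ALTER TABLE t1 SET LOCATION '${location.toURI}'"
// portable embeds "file:/tmp/warehouse/t1"
```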

SparkQA commented Aug 26, 2017

Test build #81151 has finished for PR 18971 at commit 236b986.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

retest this please

SparkQA commented Aug 26, 2017

Test build #81152 has finished for PR 18971 at commit 236b986.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jiangxb1987 (Contributor) left a comment

LGTM

val failingStream2 = new EarlyEOFInputStream(logData2, buffered.size - 10)
intercept[EOFException] {
replayer.replay(failingStream2, logFilePath.toString, false)
Utils.tryWithResource(new EarlyEOFInputStream(logData2, buffered.size - 10)) { failingStream2 =>
Contributor:

nit: I think we can still use failingStream here?

HyukjinKwon (Member Author):

It looks so, but I am not confident enough to change this. I will keep it in mind and point it out when someone touches the code around this.

@HyukjinKwon (Member Author):

I simply rebased here.

SparkQA commented Aug 30, 2017

Test build #81239 has finished for PR 18971 at commit 7d3716c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon (Member Author):

Merged to master.

@HyukjinKwon (Member Author):

Thank you @vanzin, @sarutak, @jiangxb1987 and @srowen for reviewing this.

@asfgit asfgit closed this in b30a11a Aug 30, 2017
@HyukjinKwon HyukjinKwon deleted the windows-fixes branch January 2, 2018 03:37
6 participants