Random build failures: Catalog the failures #1474

Closed
TomFinley opened this issue Oct 31, 2018 · 5 comments
Labels: Build (Build related issue), P0 (Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away), test (related to tests)

Comments

@TomFinley
Contributor

TomFinley commented Oct 31, 2018

Companion issue to #1471. We should investigate more fully at a high level why the builds are failing.

@eerhardt has helpfully provided a good list of these failures for builds that should, in principle, have succeeded, since they were against master. The task would then be, somehow, to go through them and determine why each has failed. This even has an analytics tab that will show the precise test failures, so perhaps this is not so hard an issue! But getting a sense for why each has failed might itself be worthwhile, I imagine.

While tests are, I believe, the primary culprit, I don't quite have an appreciation for how big a problem they might be compared to other issues. For those builds whose failure does not appear to be due to a test failure, why did they fail? Timeout? Network resources unavailable? Some other failure to set up? An actual failure to build somehow? (Of course some of these things, especially timeouts, could themselves be test failures.)

Also, for those that are test failures, which tests are failing, and in what environments? I have a sense for some "usual suspects" among the problematic tests, but I am sometimes surprised by new ones, and it would surely help to build a more complete catalog so that the investigation can focus its efforts usefully.

@TomFinley TomFinley added Build Build related issue test related to tests labels Oct 31, 2018
@Zruty0
Contributor

Zruty0 commented Oct 31, 2018

(Of course some of these things, especially timeouts, could themselves be test failures.)

I wholeheartedly

@Ivanidzo4ka
Contributor

Ivanidzo4ka commented Nov 2, 2018

| Test | Platform | Count | Comments |
| --- | --- | --- | --- |
| Microsoft.ML.Runtime.RunTests.TestPredictors.MulticlassTreeFeaturizedLRTest | MacOS_Release | 1 | |
| | MacOs_Debug | 7 | |
| StaticPipelineTests hung | Windows x64 Debug | 3 | |
| | MacOS_Release | 15 | |
| | Windows_x64_Release | 4 | |
| | MacOs_Debug | 5 | |
| Microsoft.ML.Core.Tests Hung | Windows_x86_Debug | 2 | |
| | Windows_x64_Release | 1 | |
| Microsoft.ML.Runtime.RunTests.TestTimeSeries.SavePipeSlidingWindowW1L2 | Windows_x86_Debug | 2 | at Microsoft.ML.Runtime.Data.TextLoader.Cursor.ParseParallel(ParallelState state)+MoveNext() in D:\a\1\s\src\Microsoft.ML.Data\DataLoadSave\Text\TextLoaderCursor.cs:line 612 |
| | Windows_x64_Debug | 1 | |
| Microsoft.ML.Tests Hung | MacOS_Release | 14 | |
| | MacOs_Debug | 4 | |
| | Windows_x64_Release | 1 | |
| Microsoft.ML.Tests.Transformers.TextFeaturizerTests.TextTokenizationWorkout | MacOs_Debug | 2 | |
| Microsoft.ML.Runtime.RunTests.TestCSharpApi.TestOvaMacro | MacOs_Debug | 1 | |
| PipelineApiScenarioTests.Metacomponents | Windows_x86_Debug | 1 | |
| Microsoft.ML.FSharp.Tests.SmokeTest3.FSharp-Sentiment-Smoke-Test | Windows_x64_Debug | 1 | The file 'C:\Users\VssAdministrator\AppData\Local\Temp\TLC_26831444\0' already exists. |
| | Windows_x64_Release | 1 | |
| TestPipelineSweeper.PipelineSweeperMultiClassClassification | Windows_x86_Debug | 1 | Microsoft.ML.Trainers.SdcaMultiClassTrainer.TrainWithoutLock(IProgressChannelProvider progress, Factory cursorFactory, IRandom rand, IdToIdxLookup idToIdx, Int32 numThreads, DualsTableBase duals, Single[] biasReg, Single[] invariants, Single lambdaNInv, VBuffer1[] weights, Single[] biasUnreg, VBuffer1[] l1IntermediateWeights, Single[] l1IntermediateBias, Single[] featureNormSquared) in D:\a\1\s\src\Microsoft.ML.StandardLearners\Standard\SdcaMultiClass.cs:line 178 |
| TestTimeSeries.SavePipeMovingAverageNonUniform | Windows_x86_Debug | 1 | Baseline |
| ScenariosTests.TrainAndPredictIrisModelWithStringLabelTest | MacOS_Debug | 1 | SDCA invariants |
| Microsoft.ML.Runtime.RunTests.TestPredictors.LinearClassifierTest | Windows_x64_Debug | 1 | |
| Microsoft.ML.Runtime.RunTests.TestCSharpApi.TestCrossValidationMacro | Windows_x86_Debug | 1 | |
| Microsoft.ML.Runtime.RunTests.TestEntryPoints.EntryPointCaching | Linux_Debug | 1 | |
| Microsoft.ML.Tests.TrainerEstimators.TrainerEstimators.SdcaWorkout | MacOS_Debug | 1 | SDCA invariants |
| ApiScenariosTests.New_FileBasedSavingOfData | MacOs_Debug | 1 | Probably temp file |
| ApiScenariosTests.New_TrainSaveModelAndPredict | MacOS_Debug | 1 | Probably temp file |
| Microsoft.ML.Tests.Transformers.TextFeaturizerTests.WordBagWorkout | MacOS_Debug | 1 | Probably temp file |
| Microsoft.ML.StaticPipelineTesting.Training.CrossValidate | MacOS_Debug | 1 | SDCA invariants |
| CookbookSamples.CookbookSamples.CrossValidationIris | Linux_Debug | 1 | SDCA invariants |
| Microsoft.ML.Tests.OnnxTests.MultiClassificationLRSaveModelToOnnxTest | Windows_x64_Debug | 1 | |
| Microsoft.ML.Tests.Scenarios.Api.ApiScenariosTests.TrainWithInitialPredictor | Windows_x64_Debug | 1 | SDCA invariants |
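
To see where the counts above concentrate, they can be tallied by operating system and by hang versus individual test failure. The following is only a sketch in Python: the rows are transcribed from the table with test names shortened, and the helper names (`os_family`, `tally`) and the rule "a row whose name contains 'hung' is a hang" are assumptions made for illustration.

```python
from collections import Counter

# (test or assembly, platform, count), transcribed from the table above.
# Test names are shortened to their last segment for readability.
ROWS = [
    ("MulticlassTreeFeaturizedLRTest", "MacOS_Release", 1),
    ("MulticlassTreeFeaturizedLRTest", "MacOs_Debug", 7),
    ("StaticPipelineTests hung", "Windows x64 Debug", 3),
    ("StaticPipelineTests hung", "MacOS_Release", 15),
    ("StaticPipelineTests hung", "Windows_x64_Release", 4),
    ("StaticPipelineTests hung", "MacOs_Debug", 5),
    ("Microsoft.ML.Core.Tests Hung", "Windows_x86_Debug", 2),
    ("Microsoft.ML.Core.Tests Hung", "Windows_x64_Release", 1),
    ("SavePipeSlidingWindowW1L2", "Windows_x86_Debug", 2),
    ("SavePipeSlidingWindowW1L2", "Windows_x64_Debug", 1),
    ("Microsoft.ML.Tests Hung", "MacOS_Release", 14),
    ("Microsoft.ML.Tests Hung", "MacOs_Debug", 4),
    ("Microsoft.ML.Tests Hung", "Windows_x64_Release", 1),
    ("TextTokenizationWorkout", "MacOs_Debug", 2),
    ("TestOvaMacro", "MacOs_Debug", 1),
    ("Metacomponents", "Windows_x86_Debug", 1),
    ("FSharp-Sentiment-Smoke-Test", "Windows_x64_Debug", 1),
    ("FSharp-Sentiment-Smoke-Test", "Windows_x64_Release", 1),
    ("PipelineSweeperMultiClassClassification", "Windows_x86_Debug", 1),
    ("SavePipeMovingAverageNonUniform", "Windows_x86_Debug", 1),
    ("TrainAndPredictIrisModelWithStringLabelTest", "MacOS_Debug", 1),
    ("LinearClassifierTest", "Windows_x64_Debug", 1),
    ("TestCrossValidationMacro", "Windows_x86_Debug", 1),
    ("EntryPointCaching", "Linux_Debug", 1),
    ("SdcaWorkout", "MacOS_Debug", 1),
    ("New_FileBasedSavingOfData", "MacOs_Debug", 1),
    ("New_TrainSaveModelAndPredict", "MacOS_Debug", 1),
    ("WordBagWorkout", "MacOS_Debug", 1),
    ("CrossValidate", "MacOS_Debug", 1),
    ("CrossValidationIris", "Linux_Debug", 1),
    ("MultiClassificationLRSaveModelToOnnxTest", "Windows_x64_Debug", 1),
    ("TrainWithInitialPredictor", "Windows_x64_Debug", 1),
]


def os_family(platform):
    # "MacOs_Debug", "MacOS_Release" and "Windows x64 Debug" differ only in
    # case and separator, so normalize before taking the first token.
    return platform.lower().replace(" ", "_").split("_")[0]


def tally(rows):
    """Sum failure counts per OS family and per kind (hang vs. test failure)."""
    by_os, by_kind = Counter(), Counter()
    for test, platform, count in rows:
        by_os[os_family(platform)] += count
        # Assumption: rows labelled "hung" are whole-assembly hangs.
        kind = "hang" if "hung" in test.lower() else "test failure"
        by_kind[kind] += count
    return by_os, by_kind


by_os, by_kind = tally(ROWS)
if __name__ == "__main__":
    print(by_os.most_common())
    print(by_kind.most_common())
```

On the rows as transcribed, this gives 55 failures on macOS, 23 on Windows and 2 on Linux, and 49 of the 80 total are hangs rather than individual test failures.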

@Ivanidzo4ka
Contributor

Ivanidzo4ka commented Nov 2, 2018

I've looked at the past two days of builds (it's hard to look further back; the pages start to load really slowly).
I also found that the test system does not always record which test has finished (I can see that a test started, and that the run of that test DLL ended with everything marked as passing, but there is no finishing line in the log).
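
One way to spot tests that start but never log a finish could be sketched as below. This is an illustration in Python, not the project's tooling: the `[START] name` / `[FINISH] name` line format and the function name `unfinished_tests` are invented for the example and do not reflect the actual build log format.

```python
def unfinished_tests(log_lines):
    """Return names of tests that logged a start but no finish, in start order.

    Assumes a hypothetical log format of "[START] <name>" / "[FINISH] <name>".
    """
    started = {}  # dicts preserve insertion order, so start order is kept
    for line in log_lines:
        marker, _, name = line.strip().partition(" ")
        if marker == "[START]":
            started[name] = True
        elif marker == "[FINISH]":
            started.pop(name, None)
    return list(started)


# Made-up sample log; the test names are borrowed from the table above.
SAMPLE_LOG = [
    "[START] TestOvaMacro",
    "[FINISH] TestOvaMacro",
    "[START] SdcaWorkout",
    "[START] WordBagWorkout",
    "[FINISH] WordBagWorkout",
]

if __name__ == "__main__":
    print(unfinished_tests(SAMPLE_LOG))
```

With parallel execution, start and finish lines from different tests interleave, which is why the sketch matches them by name rather than by position.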

@Ivanidzo4ka
Contributor

Ivanidzo4ka commented Nov 2, 2018

The point of this exercise is to find the main pain point and allocate resources accordingly.
So far the biggest issue is macOS, with tests hanging until they hit the timeout.
I have the following obvious proposals to ease the pain:

  • Let's reduce the timeout to 30 minutes; our regular run on MacOs_Debug takes about 18 minutes.
  • Let's move to a bigger Mac machine pool (a hosted one, instead of the current one).

Proposals for investigation:

  • We need a way to download test artifacts.
  • We need a better way to determine which tests started and which finished. (We can add a hook in the base test class that writes start and finish information into one artifact file.)
  • Let's turn off parallel test execution to determine whether the hangs are related to simultaneous test execution or not.
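
The base-class hook idea could look roughly like this. It is sketched with Python's `unittest` rather than the project's own test framework, so it only illustrates the pattern; the class name `LoggedTestCase`, the `TEST_PROGRESS_LOG` environment variable and the log line layout are all hypothetical.

```python
import os
import time
import unittest


class LoggedTestCase(unittest.TestCase):
    """Hypothetical base class that records the start and finish of each test."""

    # Assumed artifact location; the variable name is made up for this sketch.
    log_path = os.environ.get("TEST_PROGRESS_LOG", "test-progress.log")

    def _record(self, event):
        # Open, append and close on every call so the line is on disk even if
        # a later test hangs or the process is killed by the build timeout.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(f"{time.strftime('%H:%M:%S')} {event} {self.id()}\n")

    def setUp(self):
        super().setUp()
        self._record("START")
        # Cleanups run whether the test passes, fails, or raises an error.
        self.addCleanup(self._record, "FINISH")


class ExampleTests(LoggedTestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)
```

A test that hangs leaves a `START` line with no matching `FINISH` line in the artifact file, which is exactly the information missing from the current logs.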

@codemzs codemzs closed this as completed Jun 30, 2019
@codemzs codemzs reopened this Jun 30, 2019
@harishsk harishsk added P2 Priority of the issue for triage purpose: Needs to be fixed at some point. P0 Priority of the issue for triage purpose: IMPORTANT, needs to be fixed right away. and removed P2 Priority of the issue for triage purpose: Needs to be fixed at some point. labels Jan 12, 2020
@frank-dong-ms-zz
Contributor

Introduced a threading analyzer and fixed hangs:
#4790
#4791
#4792
#4793
#4794

Fixed several race conditions that caused some of the random failures:
#4829
#4950

Investigated and fixed a LightGBM-related crash:
#4929

Fixed a hanging benchmark test:
#4985

@ghost ghost locked as resolved and limited conversation to collaborators Mar 27, 2022
6 participants