Skip to content

Return distinct array of ParameterSet when ProposeSweep is called #368

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jun 19, 2018
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 7 additions & 0 deletions Microsoft.ML.sln
Original file line number Diff line number Diff line change
Expand Up @@ -114,6 +114,8 @@ Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "netstandard2.0", "netstanda
pkg\Microsoft.ML\build\netstandard2.0\Microsoft.ML.targets = pkg\Microsoft.ML\build\netstandard2.0\Microsoft.ML.targets
EndProjectSection
EndProject
Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "Microsoft.ML.Sweeper.Tests", "test\Microsoft.ML.Sweeper.Tests\Microsoft.ML.Sweeper.Tests.csproj", "{3DEB504D-7A07-48CE-91A2-8047461CB3D4}"
EndProject
Global
GlobalSection(SolutionConfigurationPlatforms) = preSolution
Debug|Any CPU = Debug|Any CPU
Expand Down Expand Up @@ -216,6 +218,10 @@ Global
{362A98CF-FBF7-4EBB-A11B-990BBF845B15}.Debug|Any CPU.Build.0 = Debug|Any CPU
{362A98CF-FBF7-4EBB-A11B-990BBF845B15}.Release|Any CPU.ActiveCfg = Release|Any CPU
{362A98CF-FBF7-4EBB-A11B-990BBF845B15}.Release|Any CPU.Build.0 = Release|Any CPU
{3DEB504D-7A07-48CE-91A2-8047461CB3D4}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
{3DEB504D-7A07-48CE-91A2-8047461CB3D4}.Debug|Any CPU.Build.0 = Debug|Any CPU
{3DEB504D-7A07-48CE-91A2-8047461CB3D4}.Release|Any CPU.ActiveCfg = Release|Any CPU
{3DEB504D-7A07-48CE-91A2-8047461CB3D4}.Release|Any CPU.Build.0 = Release|Any CPU
EndGlobalSection
GlobalSection(SolutionProperties) = preSolution
HideSolutionNode = FALSE
Expand Down Expand Up @@ -253,6 +259,7 @@ Global
{362A98CF-FBF7-4EBB-A11B-990BBF845B15} = {09EADF06-BE25-4228-AB53-95AE3E15B530}
{487213C9-E8A9-4F94-85D7-28A05DBBFE3A} = {DEC8F776-49F7-4D87-836C-FE4DC057D08C}
{9252A8EB-ABFB-440C-AB4D-1D562753CE0F} = {487213C9-E8A9-4F94-85D7-28A05DBBFE3A}
{3DEB504D-7A07-48CE-91A2-8047461CB3D4} = {AED9C836-31E3-4F3F-8ABC-929555D3F3C4}
EndGlobalSection
GlobalSection(ExtensibilityGlobals) = postSolution
SolutionGuid = {41165AF1-35BB-4832-A189-73060F82B01D}
Expand Down
5 changes: 5 additions & 0 deletions src/Microsoft.ML.Core/Prediction/ISweeper.cs
Original file line number Diff line number Diff line change
Expand Up @@ -174,6 +174,11 @@ public override string ToString()
{
return string.Join(" ", _parameterValues.Select(kvp => string.Format("{0}={1}", kvp.Value.Name, kvp.Value.ValueText)).ToArray());
}

public override int GetHashCode()
{
return _hash;
}
}

/// <summary>
Expand Down
8 changes: 4 additions & 4 deletions src/Microsoft.ML.Sweeper/Algorithms/Grid.cs
Original file line number Diff line number Diff line change
Expand Up @@ -64,10 +64,10 @@ protected SweeperBase(ArgumentsBase args, IHostEnvironment env, IValueGenerator[
SweepParameters = sweepParameters;
}

public virtual ParameterSet[] ProposeSweeps(int maxSweeps, IEnumerable<IRunResult> previousRuns)
public virtual ParameterSet[] ProposeSweeps(int maxSweeps, IEnumerable<IRunResult> previousRuns = null)
{
var prevParamSets = previousRuns?.Select(r => r.ParameterSet).ToList() ?? new List<ParameterSet>();
var result = new List<ParameterSet>();
var result = new HashSet<ParameterSet>();
for (int i = 0; i < maxSweeps; i++)
{
ParameterSet paramSet;
Expand Down Expand Up @@ -150,12 +150,12 @@ public RandomGridSweeper(IHostEnvironment env, Arguments args, IValueGenerator[]
}
}

public override ParameterSet[] ProposeSweeps(int maxSweeps, IEnumerable<IRunResult> previousRuns)
public override ParameterSet[] ProposeSweeps(int maxSweeps, IEnumerable<IRunResult> previousRuns = null)
{
if (_nGridPoints == 0)
return base.ProposeSweeps(maxSweeps, previousRuns);

var result = new List<ParameterSet>();
var result = new HashSet<ParameterSet>();
Copy link
Member

@sfilipi sfilipi Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HashSet() [](start = 28, length = 24)

Looking at the other sweeping algos that we have. They might need the same fix. Will update the comment. #Resolved

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. If you'd like to extend the fix to the other sweepers, take a look at the same method in the other algorithms (inside the Algorithms) ex. ProposeSweeps, in KdoSweeper.

Besides that method, there are other instance fields that cache the results, that might benefit from changing their type from a List to a HashSet.


In reply to: 196221925 [](ancestors = 196221925)

Copy link
Contributor Author

@ross-p-smith ross-p-smith Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sweeper in KdoSweeper eventually uses SweeperBase which this PR already fixes. Yes, there are some other fixes to make, but probably better as another issue. I could look at them in a few days under a different issue #Resolved

Copy link
Member

@sfilipi sfilipi Jun 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressing them on a different PR, and logging something as a reminder sounds good to me.

AFA ProposeSweep in Kdo, the do while loop as a potential for creating duplicates, since they all are generated indipendantly, there is no duplication check on the cached results etc. But again, adressing it as a separate issue sounds good.


In reply to: 196239510 [](ancestors = 196239510)

var prevParamSets = (previousRuns != null)
? previousRuns.Select(r => r.ParameterSet).ToList()
: new List<ParameterSet>();
Expand Down
12 changes: 12 additions & 0 deletions test/Microsoft.ML.Sweeper.Tests/Microsoft.ML.Sweeper.Tests.csproj
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netcoreapp2.0</TargetFramework>
<DefineConstants>CORECLR</DefineConstants>
<IsPackable>false</IsPackable>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

false [](start = 3, length = 31)

is this leftover?

</PropertyGroup>
<ItemGroup>
<ProjectReference Include="..\..\src\Microsoft.ML.Sweeper\Microsoft.ML.Sweeper.csproj" />
<ProjectReference Include="..\Microsoft.ML.TestFramework\Microsoft.ML.TestFramework.csproj" />
</ItemGroup>
</Project>
69 changes: 69 additions & 0 deletions test/Microsoft.ML.Sweeper.Tests/SweeperTest.cs
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

using Microsoft.ML.Runtime;
Copy link
Contributor

@TomFinley TomFinley Jun 19, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will see our other files tend to have a standard header:

// Licensed to the .NET Foundation under one or more agreements.
// The .NET Foundation licenses this file to you under the MIT license.
// See the LICENSE file in the project root for more information.

It may be a nice idea to include that as well in this file. #Closed

using Microsoft.ML.Runtime.CommandLine;
using Microsoft.ML.Runtime.Data;
using Microsoft.ML.Runtime.RunTests;
using Microsoft.ML.Runtime.Sweeper;
using System;
using System.IO;
using Xunit;

namespace Microsoft.ML.Sweeper.Tests
{
public class SweeperTest
{
[Fact]
public void UniformRandomSweeperReturnsDistinctValuesWhenProposeSweep()
{
DiscreteValueGenerator valueGenerator = CreateDiscreteValueGenerator();

using (var writer = new StreamWriter(new MemoryStream()))
using (var env = new TlcEnvironment(42, outWriter: writer, errWriter: writer))
{
var sweeper = new UniformRandomSweeper(env,
new SweeperBase.ArgumentsBase(),
new[] { valueGenerator });

var results = sweeper.ProposeSweeps(3);
Assert.NotNull(results);

int length = results.Length;
Assert.Equal(2, length);
}
}

[Fact]
public void RandomGridSweeperReturnsDistinctValuesWhenProposeSweep()
{
DiscreteValueGenerator valueGenerator = CreateDiscreteValueGenerator();

using (var writer = new StreamWriter(new MemoryStream()))
using (var env = new TlcEnvironment(42, outWriter: writer, errWriter: writer))
{
var sweeper = new RandomGridSweeper(env,
new RandomGridSweeper.Arguments(),
new[] { valueGenerator });

var results = sweeper.ProposeSweeps(3);
Assert.NotNull(results);

int length = results.Length;
Assert.Equal(2, length);
}
}

private static DiscreteValueGenerator CreateDiscreteValueGenerator()
{
var args = new DiscreteParamArguments()
{
Name = "TestParam",
Values = new string[] { "one", "two" }
};

return new DiscreteValueGenerator(args);
}
}
}