Skip to content

Precompute the normalized type names of DataTypes and use string.Create on NS2.1 for faster normalization. #364

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Dec 9, 2019

Conversation

ahsonkhan
Copy link
Contributor

@ahsonkhan ahsonkhan commented Dec 5, 2019

I noticed some areas where the implementation could be made more performant by avoiding using Linq and caching the normalized type names for the simple and complex DataTypes. That way, they don't have to be computed every time. I am not sure if the code paths are perf critical, in which case, I would understand if this PR isn't useful. If that's the case, please let me know.

Also, added a call to use string.Create to make normalizing the type names a bit faster (and allocate less). If adding a netstandard2.1 TFM isn't feasible, please let me know and I can revert that part of the change.

The behavior isn't being changed at all, hence existing tests should be sufficient.

BenchmarkDotNet=v0.11.5.1159-nightly, OS=Windows 10.0.18362
Intel Core i7-6700 CPU 3.40GHz (Skylake), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=3.1.100-preview1-014459
  [Host]     : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT
  Job-PDGMZC : .NET Core 3.0.0 (CoreCLR 4.700.19.46205, CoreFX 4.700.19.46214), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250.0000 ms  MaxIterationCount=10  
MinIterationCount=5  WarmupCount=1  

Accessing the TypeName property for the known simple/complex types got much faster:

Method Mean Error StdDev Median Min Max Ratio Gen 0 Gen 1 Gen 2 Allocated
TypeName_old 41.71 ns 0.684 ns 0.304 ns 41.61 ns 41.31 ns 42.22 ns 1.00 0.0095 - - 40 B
TypeName_new 14.52 ns 0.360 ns 0.188 ns 14.53 ns 14.25 ns 14.72 ns 0.35 - - - -

Normalizing a new type name (using string.Create) reduces allocations:

Method Mean Error StdDev Median Min Max Ratio Gen 0 Gen 1 Gen 2 Allocated
NormalizeTypeName_old 42.60 ns 1.359 ns 0.899 ns 42.54 ns 41.42 ns 43.86 ns 1.00 0.0190 - - 80 B
NormalizeTypeName_new 38.42 ns 0.762 ns 0.453 ns 38.22 ns 38.02 ns 39.35 ns 0.90 - - - 40 B

cc @eerhardt

@@ -1,7 +1,7 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<TargetFramework>netstandard2.0</TargetFramework>
<TargetFrameworks>netstandard2.0; netstandard2.1</TargetFrameworks>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want netcoreapp2.1, so we get the perf improvements there? Or is getting them only on .NET Core 3.0+ ok?

Copy link
Contributor Author

@ahsonkhan ahsonkhan Dec 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind adding the ncapp2.1 TFM if there are no concerns with doing so. If there are other perf benefits of adding netcoreapp2.1 specifically, then let's do it. Otherwise, this change alone might not warrant doing so.

Debug.Assert(s_complexTypeNormalizedNames == null);
BuildNormalizedStringMapping();
}
return s_simpleTypeNormalizedNames.AsSpan().IndexOf(typeName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why AsSpan here? Why not use Array.IndexOf?

Copy link
Contributor Author

@ahsonkhan ahsonkhan Dec 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Habit/personal preference to use spans :)

Array.IndexOf works fine (and is effectively equivalent). Shall I change it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your call. It was just surprising to get a span just for IndexOf.

eerhardt
eerhardt previously approved these changes Dec 6, 2019
Copy link
Member

@eerhardt eerhardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

@ahsonkhan
Copy link
Contributor Author

@ahsonkhan ahsonkhan Ahson Khan FTE dismissed eerhardt Eric Erhardt FTE’s stale review via b33d6b7 1 hour ago

Looks like GitHub is asking for another approval :)

@imback82
Copy link
Contributor

imback82 commented Dec 6, 2019

Thanks @ahsonkhan for the PR! This code is not in a critical path, but I will evaluate if it's worthwhile for extra complexity. @suhsteve can you also take a look? Thanks!

Comment on lines 43 to 44
private static string[] s_simpleTypeNormalizedNames = null;
private static string[] s_complexTypeNormalizedNames = null;
Copy link
Member

@suhsteve suhsteve Dec 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this with something like

        private static Lazy<string[]> s_simpleTypeNormalizedNames =
            new Lazy<string[]>(
                () => s_simpleTypes.Select(t => NormalizeTypeName(t.Name)).ToArray());

        private static Lazy<string[]> s_complexTypeNormalizedNames =
            new Lazy<string[]>(
                () => s_complexTypes.Select(t => NormalizeTypeName(t.Name)).ToArray());

With this you should be able to get rid of

        private static int SimpleTypeIndex(string typeName)
        private static int ComplexTypeIndex(string typeName)
        private static void BuildNormalizedStringMapping()

as well as remove the if check in

        private static string NormalizeTypeName(Type type)
        {
            if (s_simpleTypeNormalizedNames == null)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this you should be able to get rid of

Even with lazy initialization, how can we get rid of SimpleTypeIndex and ComplexTypeIndex? We still need to figure out what index the incoming type maps to in the normalized names array, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you call
s_simpleTypeNormalizedNames.Value.IndexOf(typeName) at the callsite instead ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I was thinking of keeping it a method to add doc comments.

/// <summary>
/// Normalized type name.
/// </summary>
public string TypeName => NormalizeTypeName(GetType().Name);
public string TypeName => NormalizeTypeName(GetType());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of having a new method for this, it may be simpler to normalize the name and cache the result. Something like

        private string _typeName;

        /// <summary>
        /// Normalized type name.
        /// </summary>
        public string TypeName
        {
            get
            {
                return _typeName ?? (_typeName = NormalizeTypeName(GetType().Name));
            }
        }

@ahsonkhan
Copy link
Contributor Author

ahsonkhan commented Dec 7, 2019

Fyi, the ParseSimpleType and ParseDataType(JToken json) methods got faster and allocate less (when the type was found in the cache):

Method Mean Error StdDev Median Min Max Ratio Gen 0 Gen 1 Gen 2 Allocated
ParseSimpleType_old 465.22 ns 35.651 ns 23.581 ns 461.61 ns 429.68 ns 507.12 ns 1.00 0.0875 - - 368 B
ParseSimpleType_new 88.88 ns 7.526 ns 4.978 ns 87.10 ns 84.02 ns 96.57 ns 0.19 0.0057 - - 24 B

I suspect ParseDataType(string) will get even faster once this change is in (and we avoid using JToken.Parse): #358

Copy link
Contributor

@imback82 imback82 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @ahsonkhan

@imback82
Copy link
Contributor

imback82 commented Dec 7, 2019

@eerhardt, before I merge this PR, I have a naive question. Do you see any cons of targeting multiple frameworks going forward? I don't see any, but just wanted to confirm since you are an expert on this. Thanks!

@suhsteve
Copy link
Member

suhsteve commented Dec 7, 2019

@ahsonkhan Just a general comment, but if you really wanted, we can hardcode and create a mapping between the typename => type and use this instead of doing a linear search of the Array for the index of the type.

@eerhardt
Copy link
Member

eerhardt commented Dec 9, 2019

Do you see any cons of targeting multiple frameworks going forward?

The only real con is that you need to build multiple times, but since this repo is small, it won't matter.

This change will only help future code that wants to take advantage of new APIs (typically for perf reasons, like this change).

@imback82 imback82 merged commit e650769 into dotnet:master Dec 9, 2019
@ahsonkhan ahsonkhan deleted the NormalizeTypeNameImprovements branch December 9, 2019 21:47
@imback82 imback82 mentioned this pull request Jan 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants