Skip to content

Format-Hex: Improve mixed-collection piped input and piped streams of input #8674

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Oct 15, 2019

Conversation

vexx32
Copy link
Collaborator

@vexx32 vexx32 commented Jan 17, 2019

/cc @iSazonov

PR Summary

Refactoring & Streamed Input Support

  • Handles continuous data stream input as a single input stream, and will display it in the same manner for both Format-Hex -InputObject $byteArray and $byteArray | Format-Hex
    • This includes arrays / streams of all basic value types.
  • Handles jagged streams of input by clumping input objects according to their type. This is not expected to handle a jagged array passed directly to the -InputObject parameter by name.
  • Adds ByteCollection properties:
    • Label — stores the type name and a random hexadecimal ID. Used to group objects in output formatting. This can be utilised by, for example, Group-Object, in order to enable downstream cmdlets to handle each group of ByteCollection objects separately, if needed.
    • Ascii — stores the ASCII representation of the bytes in the ByteCollection
    • HexBytes — stores the hexadecimal representation of the bytes in the ByteCollection
  • Filters ASCII display to replace control characters with the <?> unicode glyph

Additional tests have been added to cover this behaviour. Details below.

Behaviour of Piped Input

A Primitive Value

  1. Each item is converted to bytes and collected into an input buffer.
  2. The type of the last InputObject is stored in a private field.
  3. If a new InputObject is received of the same type, it is converted to bytes and added to the buffer.
  4. If a new InputObject is received of a different type, the buffer is divided into 16-byte segments and each segment is sent to the success output stream as a ByteCollection. Then, the process restarts from the beginning.

System.IO.FileInfo

  1. If any data is in the primitives buffer, it is output as described above, the buffer is cleared, as well as the information on what the last type was (resetting the grouping logic).
  2. One ByteCollection object is emitted per 16 bytes of the file.
  3. All ByteCollection objects are tagged with the input file's path as their Label, which will group their output into a single table.
  4. Offset and Count are interpreted as applying to each file, the same way as they would be for -Path inputs.

String

  1. If any data is in the primitives buffer, it is output as described above, the buffer is cleared, as well as the information on what the last type was (resetting the grouping logic).
  2. All strings are handled on an individual basis (no grouping), and are converted into bytes according to the -Encoding parameter's value.
  3. One ByteCollection object is emitted per 16 bytes of the string, and each is tagged with the System.String type name and a shared random hexadecimal ID to ensure that each string is assigned a separate Label group.
  4. Offset and Count are interpreted as applying to each input string.

Array of Primitives

  1. If any data is in the primitives buffer, it is output as described above, the buffer is cleared, as well as the information on what the last type was (resetting the grouping logic).
  2. Each array is treated as a single "group", and will not be grouped with other items in the output.
  3. All items in the array are converted into a single chunk of bytes, which then is sliced according to Offset and Count.
  4. One ByteCollection object is emitted per 16 bytes, and each is tagged with the array's type and a shared random hexadecimal ID to ensure that each array is assigned a single Label group.

Examples

Jagged Input Example
Jagged Input Incl. FileInfo Object Example

PR Context

Format-Hex needs a bit more versatility and to be able to handle piped input in a more effective manner.

PR Checklist

@vexx32
Copy link
Collaborator Author

vexx32 commented Jan 17, 2019

RE the Codacy issues - _isHeterogenousPipedInput and _lastInputType exist as fields as their values must persist between calls to the method it's used in.

I'll take care of the rest.

The Static Analysis link failure is unrelated, possibly an older page was taken down:

2019-01-17T17:09:58.2571685Z       [-] http://www.powershellmagazine.com/2014/04/24/windows-powershell-4-0-and-other-quick-reference-guides/ should work 131.74s
2019-01-17T17:09:58.3083022Z         RuntimeException: retry of url failed with error: 
2019-01-17T17:09:58.3092376Z         at <ScriptBlock>, /home/vsts/work/1/s/test/common/markdown/markdown-link.tests.ps1: line 109

EDIT: Seems fine now. Weird.

@vexx32
Copy link
Collaborator Author

vexx32 commented Jan 17, 2019

Codacy is a little wonky at times, it seems. Its request to make the methods static cannot be resolved, for example if we try to make the methods static we get an error on the WriteObject() calls:

image

@iSazonov
Copy link
Collaborator

@vexx32 Thanks for the great contribution!
Could you please move all style changes to another PR (we will fast merge it)? It'll make the code review more easy and fast. Thanks!

@vexx32
Copy link
Collaborator Author

vexx32 commented Jan 18, 2019

@iSazonov yeah, that makes sense. Will do!

@vexx32 vexx32 mentioned this pull request Jan 18, 2019
11 tasks
@vexx32 vexx32 closed this Jan 19, 2019
@vexx32 vexx32 reopened this Jan 19, 2019
@vexx32
Copy link
Collaborator Author

vexx32 commented Jan 19, 2019

Close & Reopen to restart failing xUnit test on Windows. Don't think that should have been affected by anything here.

@vexx32
Copy link
Collaborator Author

vexx32 commented Jan 28, 2019

/cc @iSazonov @JamesWTruher @PaulHigin

Style changes PR was #8683 and is merged. This one's just pending your guys review, I think? Let me know. 😄

@stale
Copy link

stale bot commented Feb 27, 2019

This PR has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs within 10 days.
Thank you for your contributions.
Community members are welcome to grab these works.

@stale stale bot added the Stale label Feb 27, 2019
@vexx32
Copy link
Collaborator Author

vexx32 commented Feb 27, 2019

Ping @PaulHigin for review. 😃

@stale stale bot removed the Stale label Feb 27, 2019
@iSazonov iSazonov changed the title Format-Hex: Improve mixed-collection piped input and piped streams of input WIP: Format-Hex: Improve mixed-collection piped input and piped streams of input Mar 16, 2019
@iSazonov
Copy link
Collaborator

@vexx32 I reviewed this again. I think we could close the PR as controversial and continue the discussion in an issue:

Handles continuous data stream input as a single input stream

I doubt it is necessary. Perhaps this is exactly what users want in some scenarios.

Handles heterogenous streams of input by treating each input item as a discrete item to display

Looks useful in all scenarious.

@vexx32
Copy link
Collaborator Author

vexx32 commented Mar 16, 2019

If we must, we must. 🤷‍♂️

I'm inclined to think that users would find its current behaviour when piping a byte stream/array to it extremely useless, but that's just me.

But if I'm honest, it may well be just that it doesn't need the behaviour as it isn't used extensively.

@stale
Copy link

stale bot commented Apr 15, 2019

This PR has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed if no further activity occurs within 10 days.
Thank you for your contributions.
Community members are welcome to grab these works.

@stale stale bot added the Stale label Apr 15, 2019
@iSazonov
Copy link
Collaborator

@vexx32 Please resolve the merge conflict. I think we need rebase to get latest CIs updates.

@stale stale bot removed the Stale label Apr 16, 2019
@vexx32 vexx32 force-pushed the FormatHexRefactor branch from 5de1539 to d405003 Compare April 16, 2019 20:46
@vexx32
Copy link
Collaborator Author

vexx32 commented Apr 16, 2019

@iSazonov done and done. Let's see how CI runs fare. 😄

@vexx32 vexx32 changed the title WIP: Format-Hex: Improve mixed-collection piped input and piped streams of input Format-Hex: Improve mixed-collection piped input and piped streams of input Apr 18, 2019
@vexx32
Copy link
Collaborator Author

vexx32 commented May 31, 2019

@anmenaga ping for review. 🙂

Copy link
Collaborator

@JamesWTruher JamesWTruher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not blocking.
one comment about the new tests (and i suppose a reflection on all the tests for this).

@vexx32
Copy link
Collaborator Author

vexx32 commented Oct 10, 2019

@iSazonov updated tests to account for behavioural changes and updated the PR description to hopefully clarify what's going on. If anything still isn't clear, please let me know! 💖

@vexx32
Copy link
Collaborator Author

vexx32 commented Oct 10, 2019

@PoshChan please retry macos

@PoshChan
Copy link
Collaborator

@vexx32, successfully started retry of PowerShell-CI-macOS

@iSazonov
Copy link
Collaborator

@PoshChan Please remind me in 1 day

@PoshChan
Copy link
Collaborator

@iSazonov, this is the reminder you requested 1 day ago

@iSazonov
Copy link
Collaborator

@SteveL-MSFT @TravisEz13 @rkeithhill The PR is ready for final code review.

}
else
{
WriteHexadecimal(bytes, offset: 0, label: GetGroupLabel(typeof(string)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should either consistently name the label param or don't, but it's a bit inconsistent.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll opt for naming it, since one of the other possible overloads also has a string param (path), so it's safer to name this one.

Copy link
Member

@SteveL-MSFT SteveL-MSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one NIT comment

@vexx32
Copy link
Collaborator Author

vexx32 commented Oct 12, 2019

@PoshChan please retry static

@PoshChan
Copy link
Collaborator

@vexx32, successfully started retry of PowerShell-CI-static-analysis

@vexx32
Copy link
Collaborator Author

vexx32 commented Oct 12, 2019

Oh, I guess MSDN is down for a bit. 😄

@TravisEz13
Copy link
Member

retried the URL check

@anmenaga
Copy link

Added doc issue reference to PR template: MicrosoftDocs/PowerShell-Docs#4532

@anmenaga anmenaga merged commit 2e95179 into PowerShell:master Oct 15, 2019
@anmenaga
Copy link

@vexx32 Thank you!

@ghost
Copy link

ghost commented Oct 23, 2019

🎉v7.0.0-preview.5 has been released which incorporates this pull request.:tada:

Handy links:

kilasuit pushed a commit to kilasuit/PowerShell that referenced this pull request Nov 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CL-General Indicates that a PR should be marked as a general cmdlet change in the Change Log Documentation Needed in this repo Documentation is needed in this repo
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants