Convert a Log file to JSON file - json

I would like to convert a log file to JSON format.
The content of the Log file is as below:
2021-07-13T14:32:00.197904 DDD client=10.4.35.4
2021-07-13T14:32:00.271923 BBB from=<josh.josh@test.com>
2021-07-13T14:32:01.350434 CCC from=<rob.roder@test.com>
2021-07-13T14:32:01.417904 DDD message-id=<1-2-3-a-a@A1>
2021-07-13T14:32:01.586494 DDD from=<Will.Smith@test.com>
2021-07-13T14:32:02.643101 DDD to=<Will.Smith@test.com>
2021-07-13T14:32:02.712803 AAA client=10.1.35.2
2021-07-13T14:32:03.832661 BBB client=2021:8bd::98e7:2e04:f94s
2021-07-13T14:32:03.920297 DDD status=sent
However, I need to match the lines by session ID so that each session is exported as a JSON entry that looks like this:
[
  {
    "time": {
      "start": "2021-07-13T14:32:01.417904",
      "duration": "0:00:02.502393"
    },
    "sessionid": "DDD",
    "client": "10.4.35.4",
    "messageid": "<1-2-3-a-a@A1>",
    "address": {
      "from": "<Will.Smith@test.com>",
      "to": "<Will.Smith@test.com>"
    },
    "status": "sent"
  }
]
The next step is to import this data into an analysis tool that only accepts JSON. I've tried this with PowerShell and Python, but got nowhere near the expected output.
The problems along the way:
How to link the rows that belong to the same session?
How to calculate each session's duration from its start and end timestamps?
How to collect the third-column values of each session and convert them to JSON?
I would really appreciate any help, links, studies, etc.

You may do something similar to the following:
Get-Content a.log | Foreach-Object {
    if ($_ -match '^(?<time>\S+)\s+(?<sessionid>\w+)\s+(?<key>[^=]+)=(?<data>.*)$') {
        [pscustomobject]@{
            time = $matches.time
            sessionid = $matches.sessionid
            key = $matches.key
            data = $matches.data
        }
    }
} | Group sessionid | Foreach-Object {
    $jsonTemplate = [pscustomobject]@{
        time = [pscustomobject]@{ start = ''; duration = '' }
        sessionid = ''
        client = ''
        messageid = ''
        address = [pscustomobject]@{from = ''; to = ''}
        status = ''
    }
    $start = ($_.Group | where key -eq 'message-id').time
    $end = ($_.Group | where key -eq 'status').time -as [datetime]
    $jsonTemplate.time.start = $start
    $jsonTemplate.time.duration = ($end - ($start -as [datetime])).ToString()
    $jsonTemplate.sessionid = $_.Name
    $jsonTemplate.client = ($_.Group | where key -eq 'client').data
    $jsonTemplate.messageid = ($_.Group | where key -eq 'message-id').data
    $jsonTemplate.address.from = ($_.Group | where key -eq 'from').data
    $jsonTemplate.address.to = ($_.Group | where key -eq 'to').data
    $jsonTemplate.status = ($_.Group | where key -eq 'status').data
    [regex]::Unescape(($jsonTemplate | convertTo-Json))
}
The general steps that are happening are the following:
Parse the log file to separate data elements
Group by sessionid to easily identify all event entries belonging to the session id
Create a custom object that contains the schema that'll easily convert to the desired JSON format.
The regex unescape is to remove the unicode escape codes for the < and > characters.
The $matches automatic variable updates when the -match operation returns $true. Since we are using named capture groups in the regex expression, capture groups are accessible as keys in the $matches hashtable.
Caveats:
Assumes sessionid only has one session per id.
Missing session data shows up as null in JSON format.
Alternative solutions may use ConvertFrom-String when reading the file. It is just simpler for me personally to do regex matching instead.

A solution based on a switch statement, which enables fast line-by-line processing with its -File parameter:
A nested (ordered) hashtable is used to compile session-specific information across lines.
The -split operator is used to split each line into fields, and to split the last field into property name and value.
Note:
The calculation of the session duration assumes that the first line for a given session ID marks the start of the session, and a line with a status= value the end.
$sessions = [ordered] @{}
switch -File file.log { # process file line by line
    default {
        $timestamp, $sessionId, $property = -split $_ # split the line into fields
        $name, $value = $property -split '=', 2 # split the property into name and value
        if ($session = $sessions.$sessionId) { # entry for session ID already exists
            $session.$name = $value
            # end of session? calculate the duration
            if ($name -eq 'status') { $session.time.duration = ([datetime] $timestamp - [datetime] $session.time.start).ToString() }
        }
        else { # create new entry for this session ID
            $sessions.$sessionId = [ordered] @{
                $name = $value
                time = [ordered] @{
                    start = $timestamp
                    duration = $null
                }
            }
        }
    }
}
# Convert the hashtable to JSON
$sessions | ConvertTo-Json
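The duration arithmetic that both answers rely on can be checked in isolation. The following is only a minimal sketch using the two DDD timestamps from the question (the variable names are illustrative): casting the ISO 8601 strings to [datetime] and subtracting them yields a [timespan].

```powershell
# Timestamps taken from the DDD session in the sample log
$start = [datetime] '2021-07-13T14:32:01.417904'   # message-id line
$end   = [datetime] '2021-07-13T14:32:03.920297'   # status line

# Subtracting two [datetime] values yields a [timespan]
$duration = $end - $start
$duration.ToString()   # 00:00:02.5023930
```

Note that the default [timespan] format is hh:mm:ss.fffffff, so the string differs slightly from the 0:00:02.502393 shown in the desired output; a custom format string would be needed if the analysis tool requires that exact shape.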

Related

I want to create json file by substituting values from environment variables in a json template file

One requirement: I am on Windows and cannot use any tools that are not already part of Windows or the AWS CLI.
For example, I have this json file test.json with below content:
"My number is $myvar"
I read this into a powershell variable like so:
$myobj=(get-content .\test.json | convertfrom-json)
$myvar=1
From here, I would like to do something with this $myobj which will enable me to get this output:
$myobj | tee json_with_values_from_environment.json
My number is 1
I got some limited success with iex, but not sure if it can be made to work for this example
You can use $ExecutionContext.InvokeCommand.ExpandString()
$myobj = '{"test": "My number is $myvar"}' | ConvertFrom-Json
$myvar = 1
$ExecutionContext.InvokeCommand.ExpandString($myobj.test)
Output
My number is 1
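Since ExpandString evaluates the whole string as PowerShell (including $(...) subexpressions), it may be preferable to substitute only environment variables when the template is not fully trusted. The following is a rough sketch of that idea, not part of the answer above: the MYVAR name and the `$env:NAME` placeholder convention are assumptions made for the illustration.

```powershell
# Assumption for this sketch: placeholders in the template look like $env:NAME
$env:MYVAR = '1'
$template = '{ "test": "My number is $env:MYVAR" }'

# Replace each $env:NAME placeholder with the value of that environment variable;
# the script block is converted to a MatchEvaluator delegate by PowerShell.
$expanded = [regex]::Replace($template, '\$env:(\w+)', {
    param($m)
    [Environment]::GetEnvironmentVariable($m.Groups[1].Value)
})

($expanded | ConvertFrom-Json).test   # My number is 1
```

An undefined environment variable is replaced with an empty string here; a real script would probably want to detect and report that case.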
Here is one way to do it using the Parser to find all VariableExpressionAst and replace them with the values in your session.
Given the following test.json:
{
  "test1": "My number is $myvar",
  "test2": {
    "somevalue": "$env:myothervar",
    "someothervalue": "$anothervar !!"
  }
}
We want to find $myvar, $env:myothervar and $anothervar and replace them with their corresponding values defined in the current session. Note that the replacement is done on the raw Json string before converting it into an object, which is much easier. The code looks like this:
using namespace System.Management.Automation.Language
$isCore7 = $PSVersionTable.PSVersion -ge '7.2'
# Define the variables here
$myvar = 10
$env:myothervar = 'hello'
$anothervar = 'world'
# Read the Json
$json = Get-Content .\test.json -Raw
# Now parse it
$ast = [Parser]::ParseInput($json, [ref] $null, [ref] $null)
# Find all variables in it, and enumerate them
$ast.FindAll({ $args[0] -is [VariableExpressionAst] }, $true) |
    Sort-Object { $_.Extent.Text } -Unique | ForEach-Object {
        # now replace the text with the actual value
        if($isCore7) {
            # in PowerShell 7.2+ this is very easy
            $json = $json.Replace($_.Extent.Text, $_.SafeGetValue($true))
            return
        }
        # in Windows PowerShell, not so much
        $varText = $_.Extent.Text
        $varPath = $_.VariablePath
        # find the value of the var (here we use the path)
        $value = $ExecutionContext.SessionState.PSVariable.GetValue($varPath.UserPath)
        if($varPath.IsDriveQualified) {
            $value = $ExecutionContext.SessionState.InvokeProvider.Item.Get($varPath.UserPath).Value
        }
        # now replace the text with the actual value
        $json = $json.Replace($varText, $value)
    }
# now we can safely convert the string to an object
$json | ConvertFrom-Json
If we were to convert it back to Json to see the result:
{
  "test1": "My number is 10",
  "test2": {
    "somevalue": "hello",
    "someothervalue": "world !!"
  }
}

Get-ADUser in function returns nothing on first run, then returns double values [duplicate]

I have a script block/function that returns PSCustomObject followed by Write-Host.
I want to get the output first then print the write-host but I can't seem to figure it out.
function ReturnArrayList {
    param (
        [int] $number
    )
    [System.Collections.ArrayList]$folderList = @()
    $folderObject = [PSCustomObject]@{
        Name = 'John'
        number = $number
    }
    #Add the object to the array
    $folderList.Add($folderObject) | Out-Null
    return $folderList
}
$sb = {
    param (
        [int] $number
    )
    [System.Collections.ArrayList]$folderList = @()
    $folderObject = [PSCustomObject]@{
        Name = 'John'
        number = $number
    }
    #Add the object to the array
    $folderList.Add($folderObject) | Out-Null
    return $folderList
}
ReturnArrayList -number 5
#Invoke-Command -ScriptBlock $sb -ArgumentList 5
Write-Host "This write host should come later"
Result:
This write host should come later
Name number
---- ------
John 5
Desired result:
Name number
---- ------
John 5
This write host should come later
How can I get the return result first and print the write-host message?
Thank you for your help in advance!
You can force PowerShell to write the output from ReturnArrayList to the screen before reaching Write-Host by piping it to either one of the Format-* cmdlets or Out-Default:
$object = ReturnArrayList -number 5
$object |Out-Default
Write-Host "This write host should come later"
Result:
Name number
---- ------
John 5
This write host should come later
Beware that your ReturnArrayList function does not actually return an ArrayList - PowerShell will automatically enumerate the item(s) in $folderlist, and since it only contains one item, the result is just a single PSCustomObject, "unwrapped" from the ArrayList so to speak:
PS ~> $object = ReturnArrayList -number 5
PS ~> $object.GetType().Name
PSCustomObject
To preserve enumerable types as output from functions, you'll need to either use Write-Output -NoEnumerate, or wrap the collection in an array using the , operator:
function ReturnArrayList {
    param (
        [int] $number
    )
    [System.Collections.ArrayList]$folderList = @()
    $folderObject = [PSCustomObject]@{
        Name = 'John'
        number = $number
    }
    #Add the object to the array
    $folderList.Add($folderObject) | Out-Null
    return ,$folderList
    # or, instead of the return statement:
    # Write-Output $folderList -NoEnumerate
}
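The effect of the , operator can be seen side by side with two tiny functions. This is only an illustrative sketch; Get-Unwrapped and Get-Wrapped are made-up names, and a list holding a single integer stands in for the folder list.

```powershell
# Hypothetical helper: returns the list as-is, so PowerShell enumerates it
function Get-Unwrapped {
    $list = [System.Collections.ArrayList] @(1)
    return $list
}

# Hypothetical helper: wraps the list in a one-element array, which is
# enumerated instead, leaving the ArrayList itself intact
function Get-Wrapped {
    $list = [System.Collections.ArrayList] @(1)
    return ,$list
}

(Get-Unwrapped).GetType().Name   # Int32
(Get-Wrapped).GetType().Name     # ArrayList
```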
Data is usually output to the pipeline, while Write-Host bypasses the pipeline and writes to the console directly.
Using Write-Output instead of Write-Host will fix this issue. You can easily find more in-depth information on this topic, including when not to use Write-Host.

Passing switch parameter thru pipeline in PowerShell

Problem
I am trying to write a function that has a switch parameter, but I also want to be able to pass all of the function's parameters through the pipeline in a script, and I don't know how to do that. Is that even possible? In my case I load the parameters from a .csv file, in which all values are strings.
Exposition
To simplify my problem and to make it easier for others to use the answers to this question, I am not going to use my code but an abstract version of it. Let us call my function New-Function; it has -StringParameter, -IntParameter and -SwitchParameter parameters. And just to be clear, in my .csv file all fields are named the same as the New-Function parameters.
Using the function
Normally you can use New-Function this way:
New-Function -StringParameter "value" -IntParameter 123 -SwitchParameter
But I also want to use the New-Function this way:
$Data = Import-Csv -Path "$PSScriptRoot\Data.csv" -Delimiter ';'
$Data | New-Function
My attempts
I have tried to convert the string values in the pipeline to Boolean, but it seems that the function's -SwitchParameter does not accept Boolean ($true, $false) values, because the process block is skipped completely when I debug it.
$Data | ForEach-Object -Process {
    if ($_.SwitchParameter -eq "true") {
        $_.SwitchParameter = $true
    }
    else {
        $_.SwitchParameter = $false
    }
} | New-Function
My temporary workaround
I have settled on using a string parameter instead of a switch parameter, so I can feed New-Function with data through the pipeline from a .csv file with no problem.
function New-Function {
    param (
        [Parameter(Position = 0, Mandatory, ValueFromPipelineByPropertyName)]
        [string]
        $StringParameter,
        [Parameter(Position = 1, Mandatory, ValueFromPipelineByPropertyName)]
        [int]
        $IntParameter,
        [Parameter(Position = 2, ValueFromPipelineByPropertyName)]
        [string]
        $SwitchParameter = "false"
    )
    #----------------------------------------------------------------------------
}
You have to convert the values for the switch parameter to the Boolean type.
This works for me:
function Out-Test
{
    param
    (
        [Parameter(ValueFromPipelineByPropertyName)]
        [String]
        $Label,
        [Parameter(ValueFromPipelineByPropertyName)]
        [Switch]
        $Show
    )
    process
    {
        $Color = if ($Show) { 'Yellow' } else { 'Gray' }
        Write-Host -ForegroundColor $Color $Label
    }
}
$row1 = '' | select Label, Show
$row1.Label = 'First'
$row1.Show = 'True'
$row2 = '' | select Label, Show
$row2.Label = 'Second'
$row2.Show = 'False'
$rows = $row1, $row2
# [bool]'False' is $true (any non-empty string is), so parse the text instead
$rows |% { $_.Show = [bool]::Parse($_.Show) }
$rows | Out-Test
Result: "First" is printed in yellow and "Second" in gray.
You can convert your string to a Boolean object while leaving your parameter as type [switch] in your function. The Boolean type will be coerced into [switch] during binding.
$Data | Foreach-Object {
    $_.SwitchParameter = [boolean]::Parse($_.SwitchParameter)
    $_
} | New-Function
Alternatively, you can update all of your objects first and then pipe to your function. It matters how your function handles the input objects.
$Data | Foreach-Object {
    $_.SwitchParameter = [boolean]::Parse($_.SwitchParameter)
}
$Data | New-Function
Part of the issue with your Foreach-Object attempt is that you never output the updated object $_ before piping into your function.
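To make the binding behaviour concrete, here is a small self-contained sketch. Test-SwitchBinding, its Name and Enabled parameters, and the inline CSV rows are all made up for the illustration; the rows stand in for the Data.csv file from the question.

```powershell
# Hypothetical function with a [switch] that binds by property name
function Test-SwitchBinding {
    param (
        [Parameter(ValueFromPipelineByPropertyName)]
        [string] $Name,
        [Parameter(ValueFromPipelineByPropertyName)]
        [switch] $Enabled
    )
    process { '{0}={1}' -f $Name, $Enabled.IsPresent }
}

# Inline CSV data; every field is imported as a string
$rows = 'Name;Enabled', 'first;true', 'second;false' | ConvertFrom-Csv -Delimiter ';'

$result = $rows | ForEach-Object {
    $_.Enabled = [bool]::Parse($_.Enabled)   # string -> Boolean
    $_                                       # emit the updated object
} | Test-SwitchBinding

$result   # first=True, second=False (one string per row)
```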

Convert Import-CSV result from string to arbitrary data types

Suppose I read in a CSV file from PowerShell:
$data = Import-Csv "myfilename.csv"
CSV files (in general) can contain strings and numbers, but PowerShell stores them in memory as strings:
PS D:\> $data[0].Col3.GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True String System.Object
After importing, it would be useful to be able to convert the types from string. If there are only one or two columns then I can convert them using a calculated property as follows:
$data = Import-Csv "myfilename.csv" |
    select -Property @{n='Col2';e={[int]$_.Col2}},
                     @{n='Col3';e={[double]$_.Col3}}
But suppose I don't know in advance the column names and intended types. Instead I have an arbitrary "schema" telling me which columns should be which type, for example:
$Schema = @{Col1=[string];Col2=[int];Col3=[double]}
How can I convert the output from Import-CSV to the types as determined by the schema? (And preferably in an efficient/elegant way)
Sample CSV file
"Col1","Col2","Col3"
"a",2,4.3
"b",5,7.9
You can do this with a -as cast:
$data = Import-Csv "myfilename.csv" |
    select -Property @{n='Col2';e={$_.Col2 -as $Schema.Col2}},
                     @{n='Col3';e={$_.Col3 -as $Schema.Col3}}
For an arbitrary number of columns you can extend the approach outlined in this answer to a similar question:
$data = Import-Csv "myfilename.csv" | Foreach-Object {
    foreach ($property in $_.PSObject.Properties) {
        $property.Value = $property.Value -as $Schema[$property.Name]
    }
    $_ # return the modified object
}
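The loop can be tried without a file by feeding the sample rows from the question through ConvertFrom-Csv. This is a minimal sketch under that assumption; the extra check for a missing schema entry is an addition of mine so that columns absent from the schema are left as strings.

```powershell
$schema = @{ Col1 = [string]; Col2 = [int]; Col3 = [double] }

# The sample CSV from the question, supplied inline instead of via Import-Csv
$data = '"Col1","Col2","Col3"', '"a",2,4.3', '"b",5,7.9' | ConvertFrom-Csv | ForEach-Object {
    foreach ($property in $_.PSObject.Properties) {
        $type = $schema[$property.Name]
        # Only convert columns that the schema actually describes
        if ($type) { $property.Value = $property.Value -as $type }
    }
    $_   # return the modified object
}

$data[0].Col2.GetType().Name   # Int32
$data[0].Col3.GetType().Name   # Double
```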
I expanded upon Martin Brandl's great answer here in two ways:
First, it can handle more complex cases. Instead of having the schema be a hash table of data types, I generalized it to be a hash table of conversion functions. This allows you to do non-trivial data-type conversions as well as per-column pre-processing/clean-up.
I also flipped the for-each logic so that it iterates through the schema keys instead of the object's properties. That way, your schema doesn't need to contain every field, which is helpful if you have a CSV with many string fields that can be left alone and just a few fields that need data type conversion.
In the example below:
The text column is purposefully left out of the schema to demonstrate that that's ok.
The memory column is converted from bytes to kilobytes.
The third column is converted from sloppy strings to booleans.
Example
$testData = @(
    [PSCustomObject]@{Text = 'A'; MemoryWithConversion = 10*1024; BooleanText="yes"},
    [PSCustomObject]@{Text = 'B'; MemoryWithConversion = 20*1024; BooleanText="no"},
    [PSCustomObject]@{Text = 'C'; MemoryWithConversion = 30*1024; BooleanText=""}
)
$testData | Export-Csv 'test.csv'
$schema = @{
    MemoryWithConversion = {Param($value) $value / 1kB -as [int]}
    BooleanText = {Param($value) $value -in 'true', 't', 'yes', 'y' -as [boolean]}
}
Import-Csv 'test.csv' | Foreach-Object {
    foreach ($key in $schema.Keys) {
        $property = $_.PSObject.Properties[$key]
        if ($null -ne $property) {
            $property.Value = & $schema[$property.Name] $property.Value
        }
    }
    $_
}
Result
Text MemoryWithConversion BooleanText
---- -------------------- -----------
A 10 True
B 20 False
C 30 False

Powershell: Custom object to CSV

I created a custom object that basically stores a date and a few integers in its keys:
$data = [Ordered]@{
    "Date" = $currentdate.ToString('dd-MM-yyyy');
    "Testers" = $totalTesterCount;
    "StNoFeedback" = $tester_status.NoFeedback;
    "StNotSolved" = $tester_status.NotSolved;
    "StSolved" = $tester_status.Solved;
    "StNoIssues" = $tester_status.NoIssues;
    "OSNoFeedback" = $tester_os.NoFeedback;
    "OSW7" = $tester_os.W7;
    "OSW10" = $tester_os.W10;
    "OfficeNoFeedback" = $tester_Office.NoFeedback;
    "OfficeO10" = $tester_Office.O10;
    "OfficeO13" = $tester_Office.O13;
    "OfficeO16" = $tester_Office.O16;
}
I need to output it to a CSV file so that every value is written in its own column.
I tried using $data | export-csv dump.csv
but my CSV looks like this:
#TYPE System.Collections.Specialized.OrderedDictionary
"Count","IsReadOnly","Keys","Values","IsFixedSize","SyncRoot","IsSynchronized"
"13","False","System.Collections.Specialized.OrderedDictionary+OrderedDictionaryKeyValueCollection","System.Collections.Specialized.OrderedDictionary+OrderedDictionaryKeyValueCollection","False","System.Object","False"
Not even close to what I want to achieve. How to get something closer to:
date,testers,stnofeedback....
04-03-2016,2031,1021....
I created the object because it was supposed to be easy to export it as csv. Maybe there is entirely different, better approach? Or is my object lacking something?
You didn't create an object, you created an ordered dictionary. A dictionary can't be exported to CSV directly, as it's a single object that holds multiple key-value entries.
([ordered]@{}).GetType()
IsPublic IsSerial Name BaseType
-------- -------- ---- --------
True True OrderedDictionary System.Object
To export a dictionary you need to use GetEnumerator() to get the objects one by one, which would result in this CSV:
$data = [Ordered]@{
    "Date" = (get-date).ToString('dd-MM-yyyy')
    "Testers" = "Hello world"
}
$data.GetEnumerator() | ConvertTo-Csv -NoTypeInformation
"Name","Key","Value"
"Date","Date","04-03-2016"
"Testers","Testers","Hello world"
If you want a single object, cast the hashtable of properties to a PSObject using [pscustomobject]@{}.
$data = [pscustomobject]@{
    "Date" = (get-date).ToString('dd-MM-yyyy')
    "Testers" = "Hello world"
}
$data | ConvertTo-Csv -NoTypeInformation
"Date","Testers"
"04-03-2016","Hello world"
Or if you're using PS 1.0 or 2.0:
$data = New-Object -TypeName psobject -Property @{
    "Date" = (get-date).ToString('dd-MM-yyyy')
    "Testers" = "Hello world"
}
$data | ConvertTo-Csv -NoTypeInformation
"Testers","Date"
"Hello world","04-03-2016"