I have no experience with PowerShell and I was asked to create this script as a favor for a friend of mine. The script is supposed to read a CSV file (these files have different columns except for time and host, which are common among all files) and output its content into a JSON file of the following format:
CSV file contains columns:
host | message | time | severity | source
{
"time": 1437522387,
"host": "dataserver992.example.com",
"event": {
"message": "Something happened",
"severity": "INFO",
"source": "testapp"
#...All columns except for time and host should be under "event"
}
}
*The only guaranteed columns are time and host. All other column headers vary from file to file.
This is part of what I have so far:
$csvFile = Import-Csv $filePath
function jsonConverter($file)
{
#Currently not in use
$eventString = $file| select * -ExcludeProperty time, host
$file | Foreach-Object {
Write-Host '{'
Write-Host '"host":"'$_.host'",'
Write-Host '"time":"'$_.time'",'
Write-Host '"event":{'
#TODO: Put all other columns (key, values) under event - except for time and host
Write-Host '}'
}
}
jsonConverter($csvFile)
Any ideas on how I could extract only the remaining columns, row by row, and output their content as key/value pairs in JSON like the example above?
Thank you!
Provided your csv looks like this:
"host","message","time","severity","source"
"dataserver992.example.com","Something happened","1437522387","INFO","testapp"
this script:
$filepath = '.\input.csv'
$csvData = Import-Csv $filePath
$NewCsvData = foreach($Row in $csvData){
[PSCustomObject]@{
time = $Row.time
host = $Row.host
event = ($Row| Select-Object -Property * -ExcludeProperty time,host)
}
}
$NewCsvData | ConvertTo-Json
will output this Json:
{
"time": "1437522387",
"host": "dataserver992.example.com",
"event": {
"message": "Something happened",
"severity": "INFO",
"source": "testapp"
}
}
If your PowerShell version is 3.0 or higher (it should be):
Import-CSV $filepath | ConvertTo-JSON
Done!
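If you need the nested event layout from the question written straight to a file, here is a sketch combining the two approaches above (the output path and the -Depth value are my own choices):
Import-Csv $filePath | ForEach-Object {
    [PSCustomObject]@{
        time  = $_.time
        host  = $_.host
        event = $_ | Select-Object -Property * -ExcludeProperty time, host
    }
} | ConvertTo-Json -Depth 3 | Set-Content .\output.json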
One requirement of mine: I'm on Windows, and I can't use any tools that aren't already available as part of the AWS CLI or Windows.
For example, I have this JSON file test.json with the content below:
"My number is $myvar"
I read this into a powershell variable like so:
$myobj=(get-content .\test.json | convertfrom-json)
$myvar=1
From here, I would like to do something with this $myobj which will enable me to get this output:
$myobj | tee json_with_values_from_environment.json
My number is 1
I had some limited success with iex (Invoke-Expression), but I'm not sure it can be made to work for this example.
You can use $ExecutionContext.InvokeCommand.ExpandString()
$myobj = '{test: "My number is $myvar"}' | ConvertFrom-Json
$myvar = 1
$ExecutionContext.InvokeCommand.ExpandString($myobj.test)
Output
My number is 1
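Applied to the test.json from the question (which contains just the string shown there), a minimal sketch using the same technique; the output file name is taken from the question:
$myvar = 1
$myobj = Get-Content .\test.json | ConvertFrom-Json   # the string 'My number is $myvar'
$ExecutionContext.InvokeCommand.ExpandString($myobj) |
    Tee-Object -FilePath json_with_values_from_environment.json
# My number is 1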
Here is one way to do it using the Parser to find all VariableExpressionAst and replace them with the values in your session.
Given the following test.json:
{
"test1": "My number is $myvar",
"test2": {
"somevalue": "$env:myothervar",
"someothervalue": "$anothervar !!"
}
}
We want to find and replace $myvar, $myothervar and $anothervar with their corresponding values defined in the current session, so the code looks like this (note that we do the replacement before converting the JSON string into an object; it's much easier this way):
using namespace System.Management.Automation.Language
$isCore7 = $PSVersionTable.PSVersion -ge '7.2'
# Define the variables here
$myvar = 10
$env:myothervar = 'hello'
$anothervar = 'world'
# Read the Json
$json = Get-Content .\test.json -Raw
# Now parse it
$ast = [Parser]::ParseInput($json, [ref] $null, [ref] $null)
# Find all variables in it, and enumerate them
$ast.FindAll({ $args[0] -is [VariableExpressionAst] }, $true) |
Sort-Object { $_.Extent.Text } -Unique | ForEach-Object {
# now replace the text with the actual value
if($isCore7) {
# in PowerShell 7.2+ this is very easy
$json = $json.Replace($_.Extent.Text, $_.SafeGetValue($true))
return
}
# in Windows PowerShell not so much
$varText = $_.Extent.Text
$varPath = $_.VariablePath
# find the value of the var (here we use the path)
$value = $ExecutionContext.SessionState.PSVariable.GetValue($varPath.UserPath)
if($varPath.IsDriveQualified) {
$value = $ExecutionContext.SessionState.InvokeProvider.Item.Get($varPath.UserPath).Value
}
# now replace the text with the actual value
$json = $json.Replace($varText, $value)
}
# now we can safely convert the string to an object
$json | ConvertFrom-Json
If we were to convert it back to Json to see the result:
{
"test1": "My number is 10",
"test2": {
"somevalue": "hello",
"someothervalue": "world !!"
}
}
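If, as in the previous question, you also want to write the expanded result back to a file, a small follow-up sketch (re-serializing the object; the file name is reused from that question):
$json | ConvertFrom-Json | ConvertTo-Json -Depth 10 |
    Set-Content json_with_values_from_environment.json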
I have extra large log file in CSV format which includes JSON formatted data inside. What I'm trying to do is extract JSON parts from the data and store it in a separate file.
The real problem is that the file size is almost 70 GB, which causes some interesting problems to tackle.
The file size makes it impossible to read the whole file in one chunk. With PowerShell's Get-Content combined with -ReadCount and Foreach-Object I can take smaller chunks and run a regex pattern over them, chunk by chunk.
$Path = <pathToFile>
$outPath = <pathToOutput>
Out-File -Encoding utf8 -FilePath $outPath
$JsonRegex = "(?smi)\{.*?\}"
Get-Content -Path $Path -ReadCount 100000 | Foreach-Object {
( "$_" | Select-String -Pattern $JsonRegex -AllMatches | Foreach-Object { $_.Matches } | Foreach-Object { $_.Value } ) | Add-Content $outPath
}
But what happens here is that every 100k lines the chunk boundary falls in the middle of a JSON object, so that object is skipped and processing continues from the next one.
Here is an example of what this log data looks like. Each row starts with some columns and then contains JSON-formatted data, and the JSON is not consistent in length, so I cannot use any fixed ReadCount value that would avoid landing in the middle of a JSON object.
"5","5","9/10/2019 12:00:46 AM","2","some","data","removed","comment","{
"message": "comment",
"level": "Information",
"logType": "User",
"timeStamp": "2019-09-10T03:00:46.5573047+03:00",
"fingerprint": "some",
}","11"
"5","5","9/10/2019 12:00:46 AM","2","some","data","removed","comment","{
"message": "comment",
"level": "Information",
"logType": "User",
"timeStamp": "2019-09-10T03:00:46.5672713+03:00",
"fingerprint": "some",
"windowsIdentity": "LOCAL\\WinID",
"machineName": "TK-141",
"processVersion": "1.0.71",
"jobId": "24a8",
"machineId": 11
}","11"
Is there any way to accomplish this without missing any data rows from the gigantous logfile?
Use a switch statement with the -Regex and -File parameters to efficiently (by PowerShell standards) read the file line by line and keep state across multiple lines.
For efficient writing to a file, use a .NET API, namely a System.IO.StreamWriter instance.
The following code assumes:
Each JSON string spans multiple lines and is non-nested.
On a given line, an opening { / closing } unambiguously marks the start / end of a (multi-line) JSON string.
# Input file path
$path = '...'
# Output file path
# Important: specify a *full* path
$outFileStream = [System.IO.StreamWriter] "$PWD/out.txt"
$json = ''
switch -Regex -File $path {
'\{.*' { $json = $Matches[0]; continue }
'.*\}' {
$json += "`n" + $Matches[0]
$outFileStream.WriteLine($json)
$json = ''
continue
}
default { if ($json) { $json += "`n" + $_ } }
}
$outFileStream.Close()
If you can further assume that no part of the JSON string follows the opening { / precedes the closing } on the same line, as your sample data suggest, you can simplify (and speed up) the switch statement:
$json = ''
switch -Regex -File $path {
'\{$' { $json ='{'; continue }
'^\}' { $outFileStream.WriteLine(($json + "`n}")); $json = ''; continue }
default { if ($json) { $json += "`n" + $_ } }
}
$outFileStream.Close()
Doug Maurer suggested a solution using a System.Text.StringBuilder instance to optimize the iterative concatenation of the parts making up each JSON string.
However, at least with an input file crafted from many repetitions of the sample data, I saw only a small performance gain in my informal tests.
For the sake of completeness, here's the System.Text.StringBuilder solution:
$json = [System.Text.StringBuilder]::new(512) # tweak the buffer size as needed
switch -Regex -File $path {
'\{$' { $null = $json.Append('{'); continue }
'^\}' { $outFileStream.WriteLine($json.Append("`n}").ToString()); $null = $json.Clear(); continue }
default { if ($json.Length) { $null = $json.Append("`n").Append($_) } }
}
$outFileStream.Close()
I need to integrate a JSON file which contains paths of the different objects in a PS script that generates and compares the hash files of source and destination. The paths in the JSON file are written in the format that I have stated below. I want to use the paths in that manner and pipe them into Get-FileHash in PowerShell. I can't figure out how to integrate my current PowerShell script with the JSON file that contains the information (File Name, full path) etc.
I have two scripts that I have tested and they work fine. One generates the MD5 hashes of two directories (source and destination) and stores them in a CSV file. The other compares the MD5 hashes from the two CSV files and generates a new one, showing the result (whether a file is absent from the source or the destination).
Now, I need to integrate these scripts in another one, which is basically a PowerShell installer. The installer saves the configurations (path, ports, new files to be made, etc.) in a JSON format. In my original scripts, the user would type the path of the source and destination which needed to be compared. However, I now need to take the path from the JSON configuration files. For example, the JSON file below is of a similar nature to the one I have.
{
"destinatiopath": "C:\\Destination\\Mobile Phones\\",
"sourcepath": "C:\\Source\\Mobile Phones\\",
"OnePlus" : {
"files": [
{
"source": "6T",
"destination": "Model\\6T"
}
]
},
"Samsung" : {
"files": [
{
"source": "S20",
"destination": "Galaxy\\S20"
}
]
This is just a snippet of the JSON code; it's supposed to hold the destination and source files. So, for instance, if the destination path is C:\\Destination\\Mobile Phones\\ and the source path is C:\\Source\\Mobile Phones\\, and OnePlus has 6T as source and Model\\6T as destination, that means the PowerShell installer will use the full path C:\\Destination\\Mobile Phones\\Model\\6T as the destination and C:\\Source\\Mobile Phones\\6T as the source. The same will happen for Samsung and the others.
For now, the MD5 hash comparison PS script just generates the CSV files in the two desired directories and compares them. However, I need to check the source and destination of each object in this case. I can't figure out how I can integrate it here. I'm pasting my MD5 hash generation code below.
Generating hash
#$p is the path. In this case, I'm running the script twice in order to get the hashes of both source and destination.
#$csv is the path where the csv will be exported.
Get-ChildItem $p -Recurse | ForEach-Object{ Get-FileHash $_.FullName -Algorithm MD5 -ErrorAction SilentlyContinue} | Select-Object Hash,
@{
Name = "FileName";
Expression = { [string]::Join("\", ($_.Path -split "\\" | Select-Object -Skip ($number))) }
} | Export-Csv -Path $csv
I want to use the paths in that manner and pipe them into Get-FileHash in PowerShell.
As the first step I would reorganize the JSON to be easier to handle. This will make a big difference on the rest of the script.
{
"source": "C:\\Source\\Mobile Phones",
"destination": "C:\\Destination\\Mobile Phones",
"phones": [
{
"name": "OnePlus",
"source": "6T",
"destination": "Model\\6T"
},
{
"name": "Samsung",
"source": "S20",
"destination": "Galaxy\\S20"
}
]
}
Now it's very easy to get all the paths no matter how many "phone" entries there are. You don't even really need an intermediary CSV file.
$config = Get-Content config.json -Encoding UTF8 -Raw | ConvertFrom-Json
$config.phones | ForEach-Object {
$source_path = Join-Path $config.source $_.source
$destination_path = Join-Path $config.destination $_.destination
$source_hashes = Get-ChildItem $source_path -File -Recurse | Get-FileHash -Algorithm MD5
$destination_hashes = Get-ChildItem $destination_path -File -Recurse | Get-FileHash -Algorithm MD5
# the combination of relative path and file hash needs to be unique, so let's combine them
$source_relative = $source_hashes | ForEach-Object {
[pscustomobject]@{
Path = $_.Path
PathHash = $_.Path.Replace($source_path, "") + '|' + $_.Hash
}
}
$destination_relative = $destination_hashes | ForEach-Object {
[pscustomobject]@{
Path = $_.Path
PathHash = $_.Path.Replace($destination_path, "") + '|' + $_.Hash
}
}
# Compare-Object finds the difference between both lists
$diff = Compare-Object $source_relative $destination_relative -Property PathHash, Path
Write-Host $diff
$diff | ForEach-Object {
# work with $_.Path and $_.SideIndicator
}
}
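If you still want a CSV report of which files are missing on which side (as in the original two-script setup), one possibility is to replace the # work with ... comment inside the loop with something like the following sketch (the report file name and column names are my own choice):
$report = foreach ($entry in $diff) {
    # '<=' means the path/hash combination exists only on the source side
    $status = if ($entry.SideIndicator -eq '<=') { 'only in source' } else { 'only in destination' }
    [pscustomobject]@{
        Phone  = $_.name      # the current phone entry from the outer ForEach-Object
        File   = $entry.Path
        Status = $status
    }
}
if ($report) { $report | Export-Csv -Path "hash_report_$($_.name).csv" -NoTypeInformation }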
I'm trying to access a JSON attribute which contains an array of strings, using PowerShell.
JSON
{
"id": "00000000-0000-0000-0000-000000000000",
"teamName": "Team A",
"securityGroups": [{
"name": "Security Group 1",
"members:": ["abc#mail.com", "def#mail.com", "ghi#mail.com"]
},
{
"name": "Securiy Group 2",
"members:": ["123#mail.com", "456#mail.com", "789#mail.com"]
}]
}
PowerShell
$json = Get-Content 'test.json' | ConvertFrom-Json
ForEach($group in $json.securityGroups)
{
Write-Host "Team: $($group.name)"
ForEach($member in $group.members)
{
Write-Host "Member: $($member)"
}
}
Output
Team: Security Group 1
Team: Securiy Group 2
As you can see, only the name of the security group (securityGroup.name) gets shown. I'm unable to access the securityGroups.members node, which contains an array of strings (containing emails). My goal is to store this list of strings and loop through them.
When I check what the $json object looks like in PS, I get the following:
PS C:\Users\XYZ> $json
id teamName securityGroups
-- -------- --------------
00000000-0000-0000-0000-000000000000 Team A {@{name=Security Group 1; members:=System.Object[]}, @{name=Securiy Group 2; members:=System.Object[]}}
What am I missing here?
You can use this:
$json = Get-Content 'test.json' | ConvertFrom-Json
ForEach ($group in $json.securityGroups)
{
Write-Host "Team: $($group.name)"
ForEach ($member in $group."members:")
{
Write-Host "Member: $($member)"
}
}
You hadn't noticed that the members key contains a colon at the end. Without it, you get the wrong result.
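If you would rather not hard-code the odd key name, an alternative sketch (it relies on the fact that ConvertFrom-Json returns PSCustomObjects whose properties you can enumerate) looks the property up dynamically:
ForEach ($group in $json.securityGroups)
{
    Write-Host "Team: $($group.name)"
    # pick up the members property whether or not it carries the trailing colon
    $membersProperty = $group.PSObject.Properties | Where-Object Name -like 'members*'
    ForEach ($member in $membersProperty.Value)
    {
        Write-Host "Member: $($member)"
    }
}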
I have imported some JSON data and converted it to a PowerShell Object. I would like to understand how to retrieve specific portions of said data.
test.json:
{
"Table": {
"Users": {
"Columns": [ "[Id]",
"[FName]",
"[MName]",
"[SName]",
"[UName]",
"[Pasword]" ],
"data": "CustomUserData"
},
"Roles": {
"Columns": [ "[Id]",
"[Role]",
"[Description]" ],
"data": "CustomRoleData"
}
}
}
Import to PS Object:
$userdata = Get-Content .\test.json |ConvertFrom-Json
Retrieve and format column data:
PS> $userdata = Get-Content ./test.json |ConvertFrom-Json
PS> $columns = $userdata.Table.Users.Columns -join ","
PS> $columns
[Id],[FName],[MName],[SName],[UName],[Pasword]
Example retrieval of custom data:
PS> $userdata.Table.Users.data
CustomUserData
What I would like to do is:
Select just the table names. When I try to do this by calling $userdata.Table I get the following:
PS> $userdata.Table |Format-List
Users : #{Columns=System.Object[]; data=CustomUserData}
Roles : #{Columns=System.Object[]; data=CustomRoleData}
What I am looking for is just a list of the table names, in this case - Users,Roles
I would also like to know how to leverage this to create a ForEach loop which cycles through each table name and prints the columns associated with each table - ultimately I will be using this to craft a SQL query.
Thank you!
Maybe this can help you.
It is a small function to output the property names recursively.
function Get-Properties($obj, [int]$level = 0) {
$spacer = " "*$level
$obj.PSObject.Properties | ForEach-Object {
$spacer + $_.Name
if ($_.Value -is [PSCustomObject]){
Get-Properties $_.Value ($level + 2)
}
}
}
In your case, you can use it like this:
$userdata = Get-Content ./test.json | ConvertFrom-Json
Get-Properties $userData
The console output will look like this:
Table
Users
Columns
data
Roles
Columns
data
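To get only the table names, and to loop over each table and print its columns as asked, a minimal sketch along the same PSObject.Properties lines:
$userdata = Get-Content ./test.json | ConvertFrom-Json

# Just the table names: Users,Roles
$tableNames = $userdata.Table.PSObject.Properties.Name
$tableNames -join ','

# Cycle through each table and print the columns associated with it
foreach ($tableName in $tableNames) {
    $columns = $userdata.Table.$tableName.Columns -join ','
    "Table [$tableName]: $columns"
}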