Doing basic math in regex capture groups [duplicate] - json

I have a string in a file that needs to be toggled between two values: if I find value A, change it to value B; if I find value B, change it to value A. Afterwards a message box should pop up saying that the value has been changed to [xxxxx], and the background picture should change accordingly.
$path = c:\work\test.xml
$A = AAAAA
$B = BBBBB
$settings = get-content $path
$settings | % { $_.replace($A, $B) } | set-content $path
I could not figure out how to express "if A, replace with B; if B, replace with A". Also, the code above deletes the rest of the file's contents and saves only the part that I modified back to the file.

Assuming that $A and $B contain just simple strings rather than regular expressions you could use a switch statement with wildcard matches:
$path = 'c:\work\test.xml'
$A = 'AAAAA'
$B = 'BBBBB'
(Get-Content $path) | % {
switch -wildcard ($_) {
"*$A*" { $_ -replace [regex]::Escape($A), $B }
"*$B*" { $_ -replace [regex]::Escape($B), $A }
default { $_ }
}
} | Set-Content $path
The [regex]::Escape() call makes sure that characters having a special meaning in regular expressions are escaped, so the values are replaced as literal strings.
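To see why the escaping matters, here is a minimal sketch with a hypothetical value that contains a regex metacharacter (the sample strings are made up for illustration and are not from the question):

```powershell
# Hypothetical sample data: the dot in '1.50' is a regex wildcard unless escaped.
$value = '1.50'
$line  = 'price 1.50, code 1x50'

# Unescaped: '.' matches any character, so '1x50' is replaced as well.
$unescaped = $line -replace $value, 'N'

# Escaped: only the literal text '1.50' is replaced.
$escaped = $line -replace [regex]::Escape($value), 'N'

$unescaped   # price N, code N
$escaped     # price N, code 1x50
```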
If you're aiming for something a little more advanced, you could use a regular expression replacement with a callback function:
$path = 'c:\work\test.xml'
$A = 'AAAAA'
$B = 'BBBBB'
$rep = @{
$A = $B
$B = $A
}
$callback = { $rep[$args[0].Groups[1].Value] }
$re = [regex]("({0}|{1})" -f [regex]::Escape($A), [regex]::Escape($B))
(Get-Content $path) | % {
$re.Replace($_, $callback)
} | Set-Content $path
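The reason a single-pass replacement is useful here can be sketched on an in-memory string (the sample line is hypothetical). Chaining two plain replacements undoes the first one, while the regex callback decides per match:

```powershell
$A = 'AAAAA'
$B = 'BBBBB'
$line = 'first=AAAAA second=BBBBB'   # hypothetical sample line

# Chained replacements: A becomes B, then every B (old and new) becomes A.
$chained = $line.Replace($A, $B).Replace($B, $A)

# Single pass: each match is swapped exactly once by the callback.
$re = [regex]('{0}|{1}' -f [regex]::Escape($A), [regex]::Escape($B))
$swapped = $re.Replace($line, { param($m) if ($m.Value -ceq $A) { $B } else { $A } })

$chained   # first=AAAAA second=AAAAA
$swapped   # first=BBBBB second=AAAAA
```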

This isn't tested extensively, but I think it should work:
$path = 'c:\work\test.xml'
$A = 'AAAAA'
$B = 'BBBBB'
[regex]$regex = "$A|$B"
$text =
Get-Content $path |
foreach {
$regex.Replace($_,{if ($args[0].value -eq $A){$B} else {$A}})
}
$text | Set-Content $path
Hard to be sure without knowing exactly what the data looks like.

Related

What is the good way to read data from CSV and converting them to JSON?

I am trying to read data from a CSV file that has 2,200,000 records using PowerShell and store each record in a JSON file, but this takes almost 12 hours.
Sample CSV Data:
We are only concerned with the values in the first column.
Code:
function Read-IPData
{
$dbFilePath = Get-ChildItem -Path $rootDir -Filter "IP2*.CSV" | ForEach-Object{ $_.FullName }
Write-Host "file path - $dbFilePath"
Write-Host "Reading..."
$data = Get-Content -Path $dbFilePath | Select-Object -Skip 1
Write-Host "Reading data finished"
$count = $data.Count
Write-host "Total $count records found"
return $data
}
function Convert-NumbetToIP
{
param(
[Parameter(Mandatory=$true)][string]$number
)
try
{
$w = [int64]($number/16777216)%256
$x = [int64]($number/65536)%256
$y = [int64]($number/256)%256
$z = [int64]$number%256
$ipAddress = "$w.$x.$y.$z"
Write-Host "IP Address - $ipAddress"
return $ipAddress
}
catch
{
Write-Host "$_"
continue
}
}
Write-Host "Getting IP Addresses from $dbFileName"
$data = Read-IPData
Write-Host "Checking whether output.json file exist, if not create"
$outputFile = Join-Path -Path $rootDir -ChildPath "output.json"
if(!(Test-Path $outputFile))
{
Write-Host "$outputFile doestnot exist, creating..."
New-Item -Path $outputFile -type "file"
}
foreach($item in $data)
{
$row = $item -split ","
$ipNumber = $row[0].trim('"')
Write-Host "Converting $ipNumber to ipaddress"
$toIpAddress = Convert-NumbetToIP -number $ipNumber
Write-Host "Preparing document JSON"
$object = [PSCustomObject]@{
"ip-address" = $toIpAddress
"is-vpn" = "true"
"@timestamp" = (Get-Date).ToString("o")
}
$document = $object | ConvertTo-Json -Compress -Depth 100
Write-Host "Adding document - $document"
Add-Content -Path $outputFile $document
}
Could you please help optimize the code, or is there a better way to do it, such as multi-threading?
Here is a possible optimization:
function Get-IPDataPath
{
$dbFilePath = Get-ChildItem -Path $rootDir -Filter "IP2*.CSV" | ForEach-Object FullName | Select-Object -First 1
Write-Host "file path - $dbFilePath"
$dbFilePath # implicit output
}
function Convert-NumberToIP
{
param(
[Parameter(Mandatory=$true)][string]$number
)
[Int64] $numberInt = 0
if( [Int64]::TryParse( $number, [ref] $numberInt ) ) {
if( ($numberInt -ge 0) -and ($numberInt -le 0xFFFFFFFFl) ) {
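# Convert to IP address like '192.168.23.42'.
# The IPAddress constructor treats the lowest-order byte as the first octet,
# so reverse the bytes to match the arithmetic used in the question.
$bytes = ([IPAddress] $numberInt).GetAddressBytes()
[array]::Reverse( $bytes )
$bytes -join '.'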
}
}
# In case TryParse() returns $false or the number is out of range for an IPv4 address,
# the output of this function will be empty, which converts to $false in a boolean context.
}
$dbFilePath = Get-IPDataPath
$outputFile = Join-Path -Path $rootDir -ChildPath "output.json"
Write-Host "Converting CSV file $dbFilePath to $outputFile"
$object = [PSCustomObject]@{
'ip-address' = ''
'is-vpn' = 'true'
'@timestamp' = ''
}
# Enclose foreach loop in a script block to be able to pipe its output to Set-Content
& {
foreach( $item in [Linq.Enumerable]::Skip( [IO.File]::ReadLines( $dbFilePath ), 1 ) )
{
$row = $item -split ','
$ipNumber = $row[0].trim('"')
if( $ip = Convert-NumberToIP -number $ipNumber )
{
$object.'ip-address' = $ip
$object.'@timestamp' = (Get-Date).ToString('o')
# Implicit output
$object | ConvertTo-Json -Compress -Depth 100
}
}
} | Set-Content -Path $outputFile
Remarks for improving performance:
Avoid Get-Content, which tends to be slow, especially for line-by-line processing. A much faster alternative is the File.ReadLines method. To skip the header line, use the Linq.Enumerable.Skip() method.
There is no need to read the whole CSV into memory first. Using ReadLines in a foreach loop does lazy enumeration, i.e. it reads only one line per loop iteration. This works because it returns an enumerator instead of a collection of lines.
Avoid try and catch if exceptions occur often, because the "exceptional" code path is very slow. Instead use Int64.TryParse() which returns a boolean indicating successful conversion.
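Instead of "manually" converting the IP number to bytes, use the IPAddress class, which has a constructor that takes an integer. Its .GetAddressBytes() method returns the four address bytes. Because the constructor treats the lowest-order byte of the integer as the first octet, reverse the array when the number stores the first octet in the highest-order byte, as the arithmetic in the question assumes. Finally, use the PowerShell -join operator to create a string of the expected format.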
Don't allocate a [pscustomobject] for each row, which has some overhead. Create it once before the loop and inside the loop only assign the values.
Avoid Write-Host (or any output to the console) within inner loops.
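As a hedged illustration of the IPAddress remark, the sketch below uses a made-up sample number built the same way as the arithmetic in the question (first octet in the highest-order byte). A direct cast reads the lowest-order byte as the first octet, so the bytes are reversed before joining:

```powershell
# Hypothetical sample: 192*16777216 + 168*65536 + 1*256 + 1
[Int64] $number = 3232235777

# Direct cast: the lowest-order byte becomes the first octet.
$direct = ([IPAddress] $number).ToString()

# Reverse the address bytes to get the octet order the question expects.
$bytes = ([IPAddress] $number).GetAddressBytes()
[array]::Reverse($bytes)
$dotted = $bytes -join '.'

$direct   # 1.1.168.192
$dotted   # 192.168.1.1
```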
Unrelated to performance:
I've removed the New-Item call to create the output file, which isn't necessary because Set-Content automatically creates the file if it doesn't exist.
Note that the output is in NDJSON format, where each line is a standalone JSON document. In case you actually want this to be a regular JSON file, enclose the output in [ ] and insert a comma between the rows.
Modified processing loop to write a regular JSON file instead of NDJSON file:
& {
'[' # begin array
$first = $true
foreach( $item in [Linq.Enumerable]::Skip( [IO.File]::ReadLines( $dbFilePath ), 1 ) )
{
$row = $item -split ','
$ipNumber = $row[0].trim('"')
if( $ip = Convert-NumberToIP -number $ipNumber )
{
$object.'ip-address' = $ip
$object.'@timestamp' = (Get-Date).ToString('o')
$row = $object | ConvertTo-Json -Compress -Depth 100
# write array element delimiter if necessary
if( $first ) { $row; $first = $false } else { ",$row" }
}
}
']' # end array
} | Set-Content -Path $outputFile
You can optimize the function Convert-NumberToIP like below:
function Convert-NumberToIP {
param(
[Parameter(Mandatory=$true)][uint32]$number
)
# either do the math yourself like this:
# $w = ($number -shr 24) -band 255
# $x = ($number -shr 16) -band 255
# $y = ($number -shr 8) -band 255
# $z = $number -band 255
# '{0}.{1}.{2}.{3}' -f $w, $x, $y, $z # output the dotted IP string
# or use .Net:
$n = ([IPAddress]$number).GetAddressBytes()
[array]::Reverse($n)
([IPAddress]$n).IPAddressToString
}

PowerShell JSON adding value format

I am adding data to a json file. I do this by
$blockcvalue = @"
{
"connectionString":"server=(localdb)\\mssqllocaldb; Integrated Security=true;Database=$database;"
}
"@
$ConfigJson = Get-Content C:\Users\user\Desktop\myJsonFile.json -raw | ConvertFrom-Json
$ConfigJson.data | add-member -Name "database" -value (Convertfrom-Json $blockcvalue) -MemberType NoteProperty
$ConfigJson | ConvertTo-Json| Set-Content C:\Users\user\Desktop\myJsonFile.json
But the format comes out like this:
{
"data": {
"database": {
"connectionString": "server=(localdb)\\mssqllocaldb; Integrated Security=true;Database=mydatabase;"
}
}
}
but I need it like this:
{
"data": {
"database":"server=(localdb)\\mssqllocaldb; Integrated Security=true;Database=mydatabase;"
}
}
Can someone help please?
Here's my function to prettify JSON output:
function Format-Json {
<#
.SYNOPSIS
Prettifies JSON output.
.DESCRIPTION
Reformats a JSON string so the output looks better than what ConvertTo-Json outputs.
.PARAMETER Json
Required: [string] The JSON text to prettify.
.PARAMETER Indentation
Optional: The number of spaces to use for indentation. Defaults to 2.
.PARAMETER AsArray
Optional: If set, the output will be in the form of a string array, otherwise a single string is output.
.EXAMPLE
$json | ConvertTo-Json | Format-Json -Indentation 4
#>
[CmdletBinding()]
Param(
[Parameter(Mandatory = $true, Position = 0, ValueFromPipeline = $true)]
[string]$Json,
[int]$Indentation = 2,
[switch]$AsArray
)
# If the input JSON text has been created with ConvertTo-Json -Compress
# then we first need to reconvert it without compression
if ($Json -notmatch '\r?\n') {
$Json = ($Json | ConvertFrom-Json) | ConvertTo-Json -Depth 100
}
$indent = 0
$Indentation = [Math]::Abs($Indentation)
$regexUnlessQuoted = '(?=([^"]*"[^"]*")*[^"]*$)'
$result = $Json -split '\r?\n' |
ForEach-Object {
# If the line contains a ] or } character,
# we need to decrement the indentation level unless it is inside quotes.
if ($_ -match "[}\]]$regexUnlessQuoted") {
$indent = [Math]::Max($indent - $Indentation, 0)
}
# Replace all colon-space combinations by ": " unless it is inside quotes.
$line = (' ' * $indent) + ($_.TrimStart() -replace ":\s+$regexUnlessQuoted", ': ')
# If the line contains a [ or { character,
# we need to increment the indentation level unless it is inside quotes.
if ($_ -match "[\{\[]$regexUnlessQuoted") {
$indent += $Indentation
}
$line
}
if ($AsArray) { return $result }
return $result -Join [Environment]::NewLine
}
Use it like so:
$ConfigJson | ConvertTo-Json | Format-Json | Set-Content C:\Users\user\Desktop\myJsonFile.json
Replace
(Convertfrom-Json $blockcvalue)
with
(Convertfrom-Json $blockcvalue).connectionString
Then your output object's data.database property will directly contain the "server=(localdb)\\..." value, as desired, not via a nested object that has a connectionString property.
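A self-contained sketch of that change, using an in-memory stand-in for the JSON file (the '{ "data": {} }' starting document and the $database value are assumptions for illustration):

```powershell
$database = 'mydatabase'   # assumed value for illustration
$blockcvalue = @"
{
"connectionString":"server=(localdb)\\mssqllocaldb; Integrated Security=true;Database=$database;"
}
"@

# Stand-in for: Get-Content <file> -Raw | ConvertFrom-Json
$ConfigJson = '{ "data": {} }' | ConvertFrom-Json

# Add the string itself, not the object that wraps it.
$ConfigJson.data | Add-Member -Name 'database' -Value (ConvertFrom-Json $blockcvalue).connectionString -MemberType NoteProperty

# data.database is now a plain string property.
$ConfigJson | ConvertTo-Json
```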
The Newtonsoft.Json parser makes it really simple to get the required format:
Import-Module Newtonsoft.Json
$path = "C:\..."
$json = Get-Content -Path $path -Raw
$parsedJson = [Newtonsoft.Json.Linq.JToken]::Parse($json);
Set-Content $path $parsedJson.ToString();
Enjoy ;)

Loop through a CSV file and verify column count for each row

I'm new to PowerShell and have been trying to loop through a CSV file and return the column count of each row, compare that count to the first row's, and have something happen if it's not equal (in this case, replace the commas with nothing), then create a new file with the changes.
$csvColumnCount = (import-csv "a CSV file" | get-member -type NoteProperty).count
$CurrentFile = Get-Content "a CSV file" |
ForEach-Object { $CurrentLineCount = import-csv "a CSV file" | get-member -type NoteProperty).count
$Line = $_
if ($csvColumnCount -ne $CurrentLineCount)
{ $Line -Replace "," , "" }
else
{ $Line } ;
$CurrentLineCount++} |
Set-Content ($CurrentFile+".out")
Copy-Item ($CurrentFile+".out") $ReplaceCSVFile
If your intention is to check which rows of a CSV file are invalid then just use a simple split and count, something like so:
$csv = Get-Content 'your_file.csv'
$count = ($csv[0] -split ',').count
$csv | Select -Skip 1 | % {
if(($_ -split ',').count -eq $count) {
# ...do valid stuff
} else {
# ...do invalid stuff
}
}
For CSV checking purposes avoid CSV cmdlets because these will have a tendency to try and correct problems, for example:
$x = @"
a,b,c
1,2,3,4
"@
$x | ConvertFrom-Csv
> a b c
- - -
1 2 3
Also, I think the flow of your code is a little confused. You are trying to return the results of a pipeline to a variable called $CurrentFile, whilst at the other end of that pipeline you are trying to use the same variable as a file name for Set-Content.
If your CSV has quoted fields which could contain commas then a simple split will not work. If that is the case a better option would be to use a regex to break each line into columns which can then be counted. Something like this:
$re = '(?:^|,)(?:\"(?:[^\"]+|\"\")*\"|[^,]*)'
$csv = Get-Content 'your_file.csv'
$count = [regex]::matches($csv[0], $re).groups.count
$csv | Select -Skip 1 | % {
if([regex]::matches($_, $re).groups.count -eq $count) {
# ...do valid stuff
} else {
# ...do invalid stuff
}
}
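To illustrate the difference on a single hypothetical line with a quoted field that contains a comma, here is a quick sketch comparing the two counting approaches:

```powershell
$re = '(?:^|,)(?:\"(?:[^\"]+|\"\")*\"|[^,]*)'
$line = '1,"Smith, John",42'   # hypothetical sample row with three fields

# Simple split also breaks on the comma inside the quotes.
$splitCount = ($line -split ',').Count

# The regex treats the quoted field as a single column.
$regexCount = [regex]::Matches($line, $re).Count

$splitCount   # 4
$regexCount   # 3
```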

Unable to combine all csv files using powershell

I would like to combine all the csv files in my local folder but it shows empty results. I am trying to take the header of the first file and skip all the headers in the rest of the files in the folder and join them.
get-childItem "C:\Users\*.csv" | foreach {[System.IO.File]::AppendAllText
("C:\Users\finalCSV.csv", [System.IO.File]::ReadAllText($_.FullName))}
$getFirstLine = $true
get-childItem "C:\Users\*.csv" | foreach {
$filePath = $_
$lines = $lines = Get-Content $filePath
$linesToWrite = switch($getFirstLine) {
$true {$lines}
$false {$lines | Select -Skip 1}
}
$getFirstLine = $false
Add-Content "C:\Users\finalCSV.csv" $linesToWrite
}
My end result is that when I open finalCSV.csv it shows no results.
I think you are trying to overwork your solution. Just use Import-Csv and append to an array. Something like this:
$a = @(); ls *.csv | % {$a += (Import-Csv $_.FullName)}; $a
Works even if the columns are in a different order.
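If the combined rows should end up in a file again, one possible sketch is to pipe the imported objects to Export-Csv. The function name Merge-CsvFile and its parameters are hypothetical; the target file is assumed to live outside the source folder so that it is not picked up as an input:

```powershell
function Merge-CsvFile {
    param(
        [Parameter(Mandatory)] [string] $SourceDir,   # folder holding the input CSV files
        [Parameter(Mandatory)] [string] $TargetFile   # output file, assumed to be outside $SourceDir
    )
    Get-ChildItem -Path $SourceDir -Filter '*.csv' |
        Sort-Object Name |
        ForEach-Object { Import-Csv -Path $_.FullName } |   # rows become objects, headers are not repeated
        Export-Csv -Path $TargetFile -NoTypeInformation     # writes a single header line
}

# Example call (hypothetical paths):
# Merge-CsvFile -SourceDir 'C:\Users\csv' -TargetFile 'C:\Users\merged\finalCSV.csv'
```

The header of the output is taken from the first imported row, and later rows are matched to it by column name.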

Replace blank characters from a file line by line

I would like to find all blanks in a CSV file. If a blank character is found on a line, the line should be shown on the screen and I should be asked whether I want to keep the entire line containing that whitespace or remove it.
Let's say the directory is C:\Cr\Powershell\test. It contains one CSV file, abc.csv.
I tried doing it like this, but in PowerShell ISE $_.PSObject.Properties isn't recognized.
$csv = Import-Csv C:\Cr\Powershell\test\*.csv | Foreach-Object {
$_.PSObject.Properties | Foreach-Object {$_.Value = $_.Value.Trim()}
}
I apologize for not including more code or more of what I have tried so far, but those were silly attempts since I have only just begun.
This looks helpful but I don't know exactly how to adapt it for my problem.
Ok man here you go:
$yes = New-Object System.Management.Automation.Host.ChoiceDescription "&Yes", "Retain line."
$no = New-Object System.Management.Automation.Host.ChoiceDescription "&No", "Delete line."
$n = @()
$f = Get-Content .\test.csv
foreach($item in $f) {
if($item -like "* *"){
$res = $host.ui.PromptForChoice("Title", "want to keep this line? `n $item", [System.Management.Automation.Host.ChoiceDescription[]]($yes, $no), 0)
switch ($res)
{
0 {$n+=$item}
1 {}
}
} else {
$n+=$item
}
}
$n | Set-Content .\test.csv
If you have questions, please post them in the comments and I will explain.
Get-Content is probably a better approach than Import-Csv, because that'll allow you to check an entire line for spaces instead of having to check each individual field. For fully automated processing you'd just use a Where-Object filter to remove non-matching lines from the output:
Get-Content 'C:\Cr\Powershell\test\input.csv' |
Where-Object { $_ -notlike '* *' } |
Set-Content 'C:\Cr\Powershell\test\output.csv'
However, since you want to prompt for each individual line that contains spaces, you need a ForEach-Object (or a similar construct) and a nested conditional, like this:
Get-Content 'C:\Cr\Powershell\test\input.csv' | ForEach-Object {
if ($_ -notlike '* *') { $_ }
} | Set-Content 'C:\Cr\Powershell\test\output.csv'
The simplest way to prompt a user for input is Read-Host:
$answer = Read-Host -Prompt 'Message'
if ($answer -eq 'y') {
# do one thing
} else {
# do another
}
In your particular case you'd probably do something like this for any matching line:
$answer = Read-Host "$_`nKeep the line? [y/n] "
if ($answer -ne 'n') { $_ }
The above checks if the answer is not n to make removal of the line a conscious decision.
Other ways to prompt for user input are choice.exe (which has the additional advantage of allowing a timeout and a default answer):
choice.exe /c YN /d N /t 10 /m "$_`nKeep the line"
if ($LastExitCode -ne 2) { $_ }
or the host UI:
$title = $_
$message = 'Keep the line?'
$yes = New-Object Management.Automation.Host.ChoiceDescription '&Yes'
$no = New-Object Management.Automation.Host.ChoiceDescription '&No'
$options = [Management.Automation.Host.ChoiceDescription[]]($yes, $no)
$answer = $Host.UI.PromptForChoice($title, $message, $options, 1)
if ($answer -ne 1) { $_ }
I'm leaving it as an exercise for you to integrate whichever prompting routine you choose with the rest of the code.