& sign is converted into \u0026 through powershell - json

I have below code:
$getvalue= 'true&replicaSet=users-shard-0&authSource=adsfsdfin&readPreference=neasrest&maxPoolSize=50&minPoolSize=10&maxIdleTimeMS=60'
$getvalue = $getvalue -replace '&','&'
$pathToJson = 'C:\1\test.json'
$a = Get-content -Path $pathToJson | ConvertFrom-Json
$a.connectionStrings.serverstring=$getvalue
$a | ConvertTo-Json | Set-content $pathToJson -ErrorAction SilentlyContinue
I got below result:
true\u0026replicaSet=users-shard-0\u0026authSource=adsfsdfin\u0026readPreference=neasrest\u0026maxPoolSize=50\u0026minPoolSize=10\u0026maxIdleTimeMS=60
The & sign is converted into \u0026. How can I prevent this conversion?
For reference, see this question.
I need the & sign in the JSON file instead of \u0026.

Windows PowerShell's ConvertTo-Json unexpectedly serializes & as its equivalent Unicode escape sequence (\u0026); ditto for ', < and > (fortunately, this no longer happens in PowerShell (Core) 7+). While unexpected and a hindrance to readability, this isn't a problem for programmatic processing, since JSON parsers, including ConvertFrom-Json, do recognize such escape sequences:
($json = 'a & b' | ConvertTo-Json) # -> `"a \u0026 b"` (WinPS)
ConvertFrom-Json $json # -> verbatim `a & b`, i.e. successful roundtrip
If you do want to convert such escape sequences to the verbatim character they represent:
This answer to the linked question shows a robust, general string-substitution approach.
However, in your case - given that you know the specific and only Unicode sequence to replace and there seems to be no risk of false positives - you can simply use another -replace operation:
$getvalue= 'true&replicaSet=users-shard-0&authSource=adsfsdfin&readPreference=neasrest&maxPoolSize=50&minPoolSize=10&maxIdleTimeMS=60'
$getvalue = $getvalue -replace '&','&'
# Simulate reading an object from a JSON
# and update one of its properties with the string of interest.
$a = [pscustomobject] @{
  connectionStrings = [pscustomobject] @{
    serverstring = $getValue
  }
}
# Convert the object back to JSON and translate '\\u0026' into '&'.
# ... | Set-Content omitted for brevity.
($a | ConvertTo-Json) -replace '\\u0026', '&'
Output (note how the \u0026 instances were replaced with &):
{
"connectionStrings": {
"serverstring": "true&replicaSet=users-shard-0&authSource=adsfsdfin&readPreference=neasrest&maxPoolSize=50&minPoolSize=10&maxIdleTimeMS=60"
}
}
If you need to rule out false positives (e.g., \\u0026), the more sophisticated solution from the aforementioned answer is required.
Otherwise, you can cover all problematic characters (&, ', < and >) with multiple -replace operations:
# Note: Use only if false positives aren't a concern.
# Sample input string that serializes to:
# "I\u0027m \u003cfine\u003e \u0026 dandy."
($json = "I'm <fine> & dandy." | ConvertTo-Json)
# Transform the Unicode escape sequences for chars. & ' < >
# back into those chars.
$json -replace '\\u0026', '&' -replace '\\u0027', "'" -replace '\\u003c', '<' -replace '\\u003e', '>'
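If false positives are a concern, one possible approach is a regex callback that only touches a \uXXXX sequence when it is not itself preceded by an escaping backslash. The following is a minimal sketch, not the code from the linked answer; the function name Restore-JsonChar is hypothetical, and it deliberately handles only the four characters discussed here.

```powershell
# Hypothetical helper: turns \u0026, \u0027, \u003c and \u003e back into
# & ' < > while leaving escaped backslashes such as \\u0026 untouched.
function Restore-JsonChar {
  param([Parameter(Mandatory)] [string] $Json)

  # (?<!\\)      the match must not start right after a backslash
  # ((?:\\\\)*)  any number of backslash pairs (escaped backslashes), kept as-is
  # \\u(...)     the Unicode escape sequence for one of the four characters
  $pattern = '(?<!\\)((?:\\\\)*)\\u(0026|0027|003[cCeE])'

  [regex]::Replace($Json, $pattern, {
    param($m)
    # Re-emit the backslash pairs, then the character the escape stands for.
    $m.Groups[1].Value + [char] [Convert]::ToInt32($m.Groups[2].Value, 16)
  })
}

Restore-JsonChar '"a \u0026 b"'    # escape sequence: becomes "a & b"
Restore-JsonChar '"a \\u0026 b"'   # escaped backslash: left unchanged
```

Restricting the pattern to these four code points is a design choice: unescaping arbitrary sequences (for instance \u0022, a double quote) could produce invalid JSON.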

Related

How to convert cyrillic into utf16

tl;dr Is there a way to convert cyrillic stored in hashtable into UTF-16?
Like кириллица into \u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430
I need to import a file, parse each line into an id and a value, and then convert the result to .json; now I'm struggling to find a way to convert the value into UTF-16 escape codes.
And yes, it is needed that way.
cyrillic.txt:
1 кириллица
PH:
clear-host
foreach ($line in (Get-Content C:\Users\users\Downloads\cyrillic.txt)){
  $nline = $line.Split(' ', 2)
  $properties = @{
    'id'= $nline[0] #stores "1" from file
    'value'=$nline[1] #stores "кириллица" from file
  }
  $temp+=New-Object PSObject -Property $properties
}
$temp | ConvertTo-Json | Out-File "C:\Users\user\Downloads\data.json"
Output:
[
{
"id": "1",
"value": "кириллица"
},
]
Needed:
[
{
"id": "1",
"value": "\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430"
},
]
At this point, as a newcomer to PowerShell, I have no idea how to even search for this properly.
Building on Jeroen Mostert's helpful comment, the following works robustly, assuming that the input file contains no NUL characters (which is usually a safe assumption for text files):
# Sample value pair; loop over file lines omitted for brevity.
$nline = '1 кириллица'.Split(' ', 2)
$properties = [ordered] @{
  id = $nline[0]
  # Insert aux. NUL characters before the 4-digit hex representations of each
  # code unit, to be removed later.
  value = -join ([uint16[]] [char[]] $nline[1]).ForEach({ "`0{0:x4}" -f $_ })
}
# Convert to JSON, then remove the escaped representations of the aux. NUL chars.,
# resulting in proper JSON escape sequences.
# Note: ... | Out-File ... omitted.
(ConvertTo-Json @($properties)) -replace '\\u0000', '\u'
Output (pipe to ConvertFrom-Json to verify that it works):
[
{
"id": "1",
"value": "\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430"
}
]
Explanation:
[uint16[]] [char[]] $nline[1] converts the [char] instances of the string stored in $nline[1] into the underlying UTF-16 code units (a .NET [char] is an unsigned 16-bit integer representing one UTF-16 code unit).
Note that this works even with Unicode characters that have code points above 0xFFFF, i.e. that are too large to fit into a [uint16]. Such characters outside the so-called BMP (Basic Multilingual Plane), e.g. 👍, are simply represented as pairs of UTF-16 code units, so-called surrogate pairs, which a JSON processor should recognize (ConvertFrom-Json does).
However, on Windows such chars. may not render correctly, depending on your console window's font. The safest option is to use Windows Terminal, available in the Microsoft Store.
The call to the .ForEach() array method processes each resulting code unit:
"`0{0:x4}" -f $_ uses an expandable string to create a string that starts with a NUL character ("`0"), followed by a 4-digit hex. representation (x4) of the code unit at hand, created via -f, the format operator.
This trick of temporarily replacing what should ultimately be a verbatim \u prefix with a NUL character is needed because a verbatim \ embedded in a string value would invariably be doubled in its JSON representation, given that \ acts as the escape character in JSON.
The result is something like "<NUL>043a", which ConvertTo-Json transforms as follows, given that it must escape each NUL character as \u0000:
"\u0000043a"
The result from ConvertTo-Json can then be transformed into the desired escape sequences simply by replacing \u0000 (escaped as \\u0000 for use with the regex-based -replace operator) with \u, e.g.:
"\u0000043a" -replace '\\u0000', '\u' # -> "\u043a", i.e. к
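The surrogate-pair behavior described above can be checked with a short round trip. This is only a sketch: the sample string is built from code points (an assumption made here so that the script file's encoding does not matter), and the escape sequences are formatted directly rather than via the NUL trick, because the value is not passed through ConvertTo-Json.

```powershell
# U+043A (к) is inside the BMP; U+1F44D (thumbs-up) is outside it.
$text = "$([char] 0x043a)$([char]::ConvertFromUtf32(0x1F44D))"

# One \uXXXX sequence per UTF-16 code unit: 1 for к, 2 for the surrogate pair.
$escaped = -join ([uint16[]] [char[]] $text).ForEach({ '\u{0:x4}' -f $_ })
$escaped   # -> \u043a\ud83d\udc4d

# Wrap the sequences in double quotes to form a JSON string and parse it back.
$roundTripped = ConvertFrom-Json ('"' + $escaped + '"')
$roundTripped -ceq $text   # -> True
```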
Here's a way that simply saves the string to a UTF-16BE file, reads the bytes back two at a time, and formats them, skipping the first 2 bytes, which are the BOM (\ufeff). Note that $_ holds a pair of bytes here, so it has to be indexed rather than used by itself. Also note that there are two UTF-16 encodings with different byte orders, big endian and little endian. The range of Cyrillic is U+0400..U+04FF. -NoNewline prevents a trailing newline from being written to the file.
'кириллица' | set-content utf16be.txt -encoding BigEndianUnicode -nonewline
$list = get-content utf16be.txt -Encoding Byte -readcount 2 |
% { '\u{0:x2}{1:x2}' -f $_[0],$_[1] } | select -skip 1
-join $list
\u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430
There must be a simpler way of doing this, but this could work for you:
$temp = foreach ($line in (Get-Content -Path 'C:\Users\users\Downloads\cyrillic.txt')){
  $nline = $line.Split(' ', 2)
  # output an object straight away so it gets collected in variable $temp
  [PsCustomObject]@{
    id = $nline[0] #stores "1" from file
    value = (([system.Text.Encoding]::BigEndianUnicode.GetBytes($nline[1]) |
      ForEach-Object {'{0:x2}' -f $_ }) -join '' -split '(.{4})' -ne '' |
      ForEach-Object { '\u{0}' -f $_ }) -join ''
  }
}
($temp | ConvertTo-Json) -replace '\\\\u', '\u' | Out-File 'C:\Users\user\Downloads\data.json'
Simpler using .ToCharArray():
$temp = foreach ($line in (Get-Content -Path 'C:\Users\users\Downloads\cyrillic.txt')){
  $nline = $line.Split(' ', 2)
  # output an object straight away so it gets collected in variable $temp
  [PsCustomObject]@{
    id = $nline[0] #stores "1" from file
    value = ($nline[1].ToCharArray() | ForEach-Object {'\u{0:x4}' -f [uint16]$_ }) -join ''
  }
}
($temp | ConvertTo-Json) -replace '\\\\u', '\u' | Out-File 'C:\Users\user\Downloads\data.json'
Value "кириллица" will be converted to \u043a\u0438\u0440\u0438\u043b\u043b\u0438\u0446\u0430

Powershell: Modify key value pair in JSON file

How do I modify a Key Value Pair in a JSON File with powershell?
We are trying to modify the database connection string; sometimes it is nested two levels deep, sometimes three.
Currently we are switching servers in multiple JSON files, so we can test in different server environments.
We are trying to utilize this answer: Add new key value pair to JSON file in powershell.
"JWTToken": {
"SecretKey": "Security Key For Generate Token",
"Issuer": "ABC Company"
},
"AllowedHosts": "*",
"ModulesConfiguration": {
"AppModules": [ "ABC Modules" ]
},
"ConnectionStrings": {
"DatabaseConnection": "Server=testserver,1433;Database=TestDatabase;User Id=code-developer;password=xyz;Trusted_Connection=False;MultipleActiveResultSets=true;",
"TableStorageConnection": "etc",
"BlobStorageConnection": "etc"
},
Once you convert a JSON string to an object with PowerShell, changing its properties is not really a problem. The main issue you are going to face here is that your string is currently invalid JSON, or at least .NET won't accept it in its current format. We can fix that though.
Here is your current JSON.
"JWTToken": {
"SecretKey": "Security Key For Generate Token",
"Issuer": "ABC Company"
},
"AllowedHosts": "*",
"ModulesConfiguration": {
"AppModules": [ "ABC Modules" ]
},
"ConnectionStrings": {
"DatabaseConnection": "Server=testserver,1433;Database=TestDatabase;User Id=code-developer;password=xyz;Trusted_Connection=False;MultipleActiveResultSets=true;",
"TableStorageConnection": "etc",
"BlobStorageConnection": "etc"
},
There may be other issues, for PowerShell JSON, in your application.config file, but these two are immediately noticeable to me.
Unnecessary trailing commas
No definitive opening { and closing }
How Can We Fix This?
We can use simple string concatenation to add { and } where necessary.
$RawText = Get-Content -Path .\path_to\application.config -Raw
$RawText = "{ " + $RawText + " }"
To avoid parsing issues with trailing commas when parsing the JSON with ConvertFrom-Json, we need to remove them via a regex. My proposed approach is to identify them by whether the enclosing object or array closes right after them with } or ]; these closing brackets may be preceded by any amount of whitespace (\s). So we would have a regex that looks like this:
"\,(?=\s*?[\}\]])".
We could then use that with -replace in PowerShell. Of course we will replace them with an empty string.
$FormattedText = $RawText -replace "\,(?=\s*?[\}\]])",""
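To see what the regex does, and where it can go wrong, here is a small sketch on made-up sample strings (the sample data is an assumption for illustration, not taken from the config file above):

```powershell
$sample = '{ "a": [1, 2, ], "b": { "c": "x", }, }'
$clean  = $sample -replace '\,(?=\s*?[\}\]])', ''
$clean                            # -> { "a": [1, 2 ], "b": { "c": "x" } }
($clean | ConvertFrom-Json).b.c   # -> x

# Caveat: the regex is not string-aware, so a comma directly before } or ]
# inside a string value is removed as well.
$damaged = '{ "note": "a,]" }' -replace '\,(?=\s*?[\}\]])', ''
$damaged                          # -> { "note": "a]" }
```

If your values can contain such character sequences (a connection string might), it is worth checking the result before writing it back.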
From here we convert the JSON text to an object.
$JsonObj = $FormattedText | ConvertFrom-Json
We can now change your database string by setting a property.
$JsonObj.ConnectionStrings.DatabaseConnection = "your new string"
We use ConvertTo-Json to convert the object back to a JSON string.
$JsonString = $JsonObj | ConvertTo-Json
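One thing to keep in mind with more deeply nested files: ConvertTo-Json only serializes 2 levels of contained objects by default, controlled by its -Depth parameter. The sketch below uses a made-up object (the property names A through D are purely illustrative) to show the difference; whether your config file is affected depends on how deep its objects go.

```powershell
$nested = [pscustomobject] @{
  A = [pscustomobject] @{
    B = [pscustomobject] @{
      C = [pscustomobject] @{ D = 'deep' }
    }
  }
}

# Default depth: the object stored in C is beyond the limit and is written as a
# string rather than as a nested JSON object (PowerShell 7+ also warns about it).
$default = $nested | ConvertTo-Json -Compress

# With an explicit, sufficiently large -Depth the structure survives.
$full = $nested | ConvertTo-Json -Depth 10 -Compress
$full   # -> {"A":{"B":{"C":{"D":"deep"}}}}
```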
It's not important to restore the trailing commas (they aren't valid JSON), but the first { and last } need to be removed before we write the result back to the file with Set-Content.
# Remove the first { and trim white space. Second TrimStart() clears the space.
$JsonString = $JsonString.TrimStart("{").TrimStart()
# Repeat this but for the final } and use TrimEnd().
$JsonString = $JsonString.TrimEnd("}").TrimEnd()
# Write back to file.
$JsonString | Set-Content -Path .\path_to\application.config -Force
Your config file should be written back more or less as you found it. I will try and think of a regex to fix the appearance of the formatting, it shouldn't error, it just doesn't look great. Hope that helps.
EDIT
Here is a function to fix the unsightly appearance of the text in the file.
function Restore-Formatting {
    Param (
        [parameter(Mandatory=$true,ValueFromPipeline=$true)][string]$InputObject
    )
    $JsonArray = $InputObject -split "\n"
    $Tab = 0
    $Output = @()
    foreach ($Line in $JsonArray) {
        if ($Line -match "{" -or $Line -match "\[") {
            $Output += (" " * $Tab) + $Line.TrimStart()
            $Tab += 4
        }
        elseif ($Line -match "^\s+}" -or $Line -match "^\s+\]") {
            $Tab -= 4
            $Output += (" " * $Tab) + $Line.TrimStart()
        }
        else {
            $Output += (" " * $Tab) + $Line.TrimStart()
        }
    }
    $Output
}
TL;DR Script:
$RawText = Get-Content -Path .\path_to\application.config -Raw
$RawText = "{ " + $RawText + " }"
$FormattedText = $RawText -replace "\,(?=\s*?[\}\]])",""
$JsonObj = $FormattedText | ConvertFrom-Json
$JsonObj.ConnectionStrings.DatabaseConnection = "your new string"
$JsonString = $JsonObj | ConvertTo-Json
$JsonString = $JsonString.TrimStart("{").TrimStart()
$JsonString = $JsonString.TrimEnd("}").TrimEnd()
$JsonString | Restore-Formatting | Set-Content -Path .\path_to\application.config -NoNewLine -Force

What is the most efficient way to replace all \ with \\, within a huge JSON File?

I have to replace all occurrences of \ with \\ within a huge JSON Lines File. I wanted to use Powershell, but there might be other options too.
The source file is 4.000.000 lines and is about 6GB.
The Powershell script I was using took too much time, I let it run for 2 hours and it wasn't done yet. A performance of half an hour would be acceptable.
$Importfile = "C:\file.jsonl"
$Exportfile = "C:\file2.jsonl"
(Get-Content -Path $Importfile) -replace "[\\]", "\\" | Set-Content -Path $Exportfile
If the replacement is simply a conversion of a single backslash to a double backslash, the file can be processed row by row.
Using a StringBuilder puts data into a memory buffer, which is flushed to disk every now and then. Like so:
$src = "c:\path\MyBigFile.json"
$dst = "c:\path\MyOtherFile.json"
$sb = New-Object Text.StringBuilder
$reader = [IO.File]::OpenText($src)
$i = 0
$MaxRows = 10000
while($null -ne ($line = $reader.ReadLine())) {
  # Replace slashes
  $line = $line.replace('\', '\\')
  # ' markdown coloring is confused by backslash-apostrophe
  # so here is an extra one just for looks
  [void]$sb.AppendLine($line)
  ++$i
  # Write builder contents into file every now and then
  if($i -ge $MaxRows) {
    add-content $dst $sb.ToString() -NoNewline
    [void]$sb.Clear()
    $i = 0
  }
}
# Flush the builder after the while loop if there's data
if($sb.Length -gt 0) {
  add-content $dst $sb.ToString() -NoNewline
}
$reader.close()
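A variation on the same row-by-row idea is to let a .NET StreamWriter do the buffering instead of a StringBuilder plus Add-Content. This is a sketch under assumptions, not a benchmarked solution: temp files stand in for the real source and target, and UTF-8 without a BOM is assumed for the output.

```powershell
# Self-contained demo: temp files stand in for the real source and target.
$src = [IO.Path]::GetTempFileName()
$dst = [IO.Path]::GetTempFileName()
Set-Content -Path $src -Value '{"path":"C:\temp"}', '{"path":"D:\data\x"}'

$reader = $writer = $null
try {
  $reader = [IO.StreamReader]::new($src)
  # $false = overwrite rather than append; UTF-8 without BOM is an assumption.
  $writer = [IO.StreamWriter]::new($dst, $false, [Text.UTF8Encoding]::new($false))
  while ($null -ne ($line = $reader.ReadLine())) {
    # String.Replace() is a literal (non-regex) replacement.
    $writer.WriteLine($line.Replace('\', '\\'))
  }
}
finally {
  # Dispose the writer first so that buffered data is flushed to disk.
  if ($writer) { $writer.Dispose() }
  if ($reader) { $reader.Dispose() }
}

$result = Get-Content $dst
$result   # each \ from the input is now \\
```

Full paths are used on purpose: .NET file APIs resolve relative paths against the process's working directory, which may differ from PowerShell's current location.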
Use -ReadCount parameter for Get-Content cmdlet (and set it to 0).
-ReadCount
Specifies how many lines of content are sent through the pipeline at a
time. The default value is 1. A value of 0 (zero) sends all of the
content at one time.
This parameter does not change the content displayed, but it does
affect the time it takes to display the content. As the value of
ReadCount increases, the time it takes to return the first line
increases, but the total time for the operation decreases. This can
make a perceptible difference in large items.
Example (runs approx. 17× faster for a file of approx. 20 MB):
$file = 'D:\bat\files\FileTreeLista.txt'
(Measure-Command {
$xType = (Get-Content -Path $file ) -replace "[\\]", "\\"
}).TotalSeconds, $xType.Count -join ', '
(Measure-Command {
$yType = (Get-Content -Path $file -ReadCount 0) -replace "[\\]", "\\"
}).TotalSeconds, $yType.Count -join ', '
Get-Item $file | Select-Object FullName, Length
13,3288848, 338070
0,7557814, 338070
FullName Length
-------- ------
D:\bat\files\FileTreeLista.txt 20723656
Based on your earlier question, How can I optimize this Powershell script, converting JSON to CSV?, you should try to use the PowerShell pipeline for this, especially as it concerns large input and output files.
The point is that you shouldn't focus on single parts of the solution to determine the performance, because this usually leaves the wrong impression: the performance of a complete (PowerShell) pipeline solution is supposed to be better than the sum of its parts. Besides, it saves a lot of memory and results in lean PowerShell syntax.
In your specific case, if correctly set up, the CPU will be replacing the slashes, rebuilding the JSON strings and converting them to objects while the hard disk is busy reading and writing the data.
To implement the replacement of the slashes in the PowerShell pipeline together with the ConvertFrom-JsonLines cmdlet:
Get-Content .\file.jsonl | ForEach-Object { $_.replace('\', '\\') } |
ConvertFrom-JsonLines | ForEach-Object { $_.events.items } |
Export-Csv -Path $Exportfile -NoTypeInformation -Encoding UTF8

How can I organize this location data (json output) in a text file using PowerShell?

C:\temp\GeoDATA.txt:39:Content : {"ip":"68.55.28.227","city":"Plymouth","region_code":"MI","zip":"48170"}
C:\temp\GeoDATA.txt:56:Content : {"ip":"72.95.198.227","city":"Homestead","region_code":"PA","zip":"15120"}
C:\temp\GeoDATA.txt:73:Content : {"ip":"68.180.94.219","city":"Normal","region_code":"IL","zip":"61761"}
C:\temp\GeoDATA.txt:90:Content : {"ip":"75.132.165.245","city":"Belleville","region_code":"IL","zip":"62226"}
C:\temp\GeoDATA.txt:107:Content : {"ip":"97.92.20.220","city":"Farmington","region_code":"MN","zip":"55024"}
Each line starts with the path and ends with the closing }
I would like to organize this as a table with the headers being "ip, city, region_code, zip" and the appropriate data below each header. Something like this...
ip city region_code zip
68.55.28.227 Plymouth MI 48170
72.95.198.227 Homestead PA 15120
68.180.94.219 Normal IL 61761
75.132.165.245 Belleville IL 62226
97.92.20.220 Farmington MN 55024
This is the first 5 lines of a text file with hundreds more, so please keep that in mind.
Assuming that file input.txt contains data like your sample input data, the following should work:
(Get-Content input.txt) -replace '.*: (?=\{)' | ConvertFrom-Json
-replace '.*: (?=\{)' strips the prefix from each input line using a regular expression, returning only the JSON part:
.*:  matches any sequence of characters followed by : and a space.
(?=\{) is a lookahead assertion ((?=...)) that matches a single { (escaped as \{, because { has special meaning in regexes).
Since lookaround assertions aren't considered part of the substring matched by the regex, each line is only matched up to the space before the { that starts the JSON part, and by replacing the matching part with the empty string (implicitly, because no replacement string is given), it is effectively removed from each line, leaving just the JSON part.
Piping the result to ConvertFrom-Json yields a collection of custom objects whose properties reflect the JSON input, yielding the desired tabular output by default.
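As a quick check of the regex described above, here is a sketch that uses two of the sample lines inline instead of reading input.txt. Parsing each line separately and selecting the columns explicitly are choices made for this illustration, not requirements.

```powershell
$lines = @(
  'C:\temp\GeoDATA.txt:39:Content : {"ip":"68.55.28.227","city":"Plymouth","region_code":"MI","zip":"48170"}'
  'C:\temp\GeoDATA.txt:56:Content : {"ip":"72.95.198.227","city":"Homestead","region_code":"PA","zip":"15120"}'
)

# Strip everything up to and including the ': ' that precedes the opening {.
$jsonOnly = $lines -replace '.*: (?=\{)'
$jsonOnly[0]   # -> {"ip":"68.55.28.227","city":"Plymouth","region_code":"MI","zip":"48170"}

# Parse line by line, then pick the column order explicitly.
$rows = $jsonOnly | ForEach-Object { $_ | ConvertFrom-Json }
$rows | Select-Object ip, city, region_code, zip | Format-Table -AutoSize
```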
Assuming the data is in the test.txt file.
Try this:
$Data = $null
$Table = @()
$Data = Get-Content C:\Users\lt\AppData\Local\Temp\test.txt
$Data | %{
  $IP = (($_ -split "{")[1] -split "," -split ":")[1] -replace "`"",""
  $City = (($_ -split "{")[1] -split "," -split ":")[3] -replace "`"",""
  $Region_Code = (($_ -split "{")[1] -split "," -split ":")[5] -replace "`"",""
  $ZIP = (($_ -split "{")[1] -split "," -split ":")[7] -replace "}","" -replace "`"",""
  $Table += "$IP,$City,$Region_Code,$ZIP"
}
ConvertFrom-Csv -Header "IP","City","Region_Code","ZIP" -InputObject $Table
Please let me know if this helps and don't forget to mark it as answer :).

Eliminate Nulls

I'm out of luck finding information...
This powershell script collecting cert info in LocalMachine:
$cert_days = get-childitem cert:LocalMAchine -recurse |
  select @{Name="{#CERTINFO}"; Expression={($_.FriendlyName)}} |
  Sort "{#CERTINFO}"
write-host "{"
write-host " `"data`":`n"
convertto-json $cert_days
write-host
write-host "}"
I can't exclude nulls or empty items like " ".
Using -ne $Null I get Boolean results like true or false...
I would appreciate your advice on how to eliminate nulls or empty entries.
To exclude empty entries, you could add a filter to remove those, preferably before the Sort-Object call, e.g.:
$certs = ls Cert:\LocalMachine\ -Recurse |
  Select @{Name = '{#CertInfo}'; Expression = {$_.FriendlyName}} |
  Where { $_.'{#CertInfo}' } |
  Sort '{#CertInfo}'
Robert Westerlund's helpful answer shows one way of filtering out $null and '' (empty-string) values, using the Where-Object cmdlet, which coerces the output from the script block to a Boolean, causing both $null and '' to evaluate to $False and thus to be filtered out.
This answer shows an alternative approach and discusses other aspects of the question.
tl;dr:
@{
  data = @((Get-ChildItem -Recurse Cert:\LocalMachine).FriendlyName) -notlike '' |
    Sort-Object | Select-Object @{ n='{#CERTINFO}'; e={ $_ } }
} | ConvertTo-Json
Using -ne $Null i get boolean results like true or false...
You only get a Boolean if the LHS is a scalar rather than an array - in the case of an array, the matching array elements are returned.
To ensure that the LHS (or any expression or command output) is an array, wrap it in @(...), the array-subexpression operator (the following uses PSv3+ syntax):
@((Get-ChildItem -Recurse Cert:\LocalMachine).FriendlyName) -notlike ''
Note the use of -notlike '' to weed out both $null and '' values: -notlike forces the LHS to a string, and $null is converted to ''.
By contrast, if you wanted to use -ne $null, you'd have to use -ne '' too so as to also eliminate empty strings (though, in this particular case you could get away with just -ne '', because ConvertTo-Json would simply ignore $null values in its input).
Calling .FriendlyName on the typically array-valued output of Get-ChildItem directly is a PSv3+ feature called member-access enumeration: the .FriendlyName property access is applied to each element of the array, and the results are returned as a new array.
Filtering and sorting the values before constructing the wrapper objects with the {#CERTINFO} property not only simplifies the command, but is also more efficient.
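The scalar-versus-array behavior of the comparison operators described above can be tried without touching the certificate store. In this sketch the $names array is made-up sample data standing in for the FriendlyName values:

```powershell
$names = 'Cert A', $null, '', 'Cert B'

'Cert A' -ne $null            # scalar LHS -> True (a Boolean)
@($names -ne $null).Count     # array LHS acts as a filter -> 3 ('' is still there)
@($names -notlike '').Count   # -> 2 ($null and '' are both removed)

# Wrapper hashtable handed to ConvertTo-Json, as in the tl;dr command.
$json = @{
  data = @($names) -notlike '' | Sort-Object |
    Select-Object @{ n = '{#CERTINFO}'; e = { $_ } }
} | ConvertTo-Json
$json
```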
Further thoughts:
Do not use Write-Host to output data: Write-Host bypasses PowerShell's (success) output stream; instead, use Write-Output, which you rarely need to call explicitly however, because its use is implied.
Instead of write-host "{", use write-output "{" or - preferably - simply "{" by itself.
PowerShell supports multi-line strings (see Get-Help about_Quoting_Rules), so there's no need to output the result line by line:
@"
{
  "data":
$(<your ConvertTo-Json pipeline>)
}
"@
However, given that you're invoking ConvertTo-Json anyway, it's simpler to provide the data wrapper as a PowerShell object (in the simplest form as a hashtable) to ConvertTo-Json, as shown above.