In a PowerShell script, I have a mysqldump command which writes its output to stdout.
The goal is to replace all occurrences of a string in that output before writing it to a file, because there is not enough disk space on the machine to hold two separate files (the dump is around 30 GB).
I have tried this (with the Invoke-Expression and mysql args removed):
mysqldump [...args] | ForEach-Object -Process {$_ -replace 'sourceText','targetText' | Add-Content $dumpDataFile}
Or this:
mysqldump [...args] | Foreach-Object {$_ -replace 'sourceText','targetText'} | Set-Content $dumpDataFile
but it is eating up all the memory on the machine.
I have also tried replacing the content in the result file, but that always ends up copying to another file.
I also thought about reading line by line, replacing line by line into a new file, and removing lines from the original file every X lines, but the methods I have found to cut lines from a file also end up eating all the memory.
In Linux I would have used sed; I know it exists for Windows, but I do not want to add a dependency to the script.
Here is the command that is run:
$expr = "& 'C:\Program Files\MySQL\MySQL Server 5.7\bin\mysqldump.exe' --defaults-extra-file=env.cnf --log-error=err.log --no-create-info foo | ForEach-Object -Process {$_ -replace 'foo','bar' | Add-Content dump.sql}"
Invoke-Expression $expr
UPDATE
I have found that even piping the output to Out-Null eats up all the memory:
& 'C:\Program Files\MySQL\MySQL Server 5.7\bin\mysqldump.exe' --defaults-extra-file=env.cnf --log-error=err.log --no-create-info foo | out-null
Also, the script runs on an Amazon virtual machine which has PowerShell 4.
UPDATE 2
This also eats up all the memory, but it does not when running from cmd:
& 'C:\Program Files\MySQL\MySQL Server 5.7\bin\mysqldump.exe' --defaults-extra-file=env.cnf --log-error=err.log --no-create-info foo > dump.sql
Do you know how to call the full replace command through cmd? I cannot manage to escape the mysqldump executable path.
UPDATE 3
I realized that my dump contains huge tables, which results in some of the INSERT lines being extremely long (hence, perhaps, the memory usage). I tried without extended inserts, but the import then takes too long.
If disk space is at a premium, how about compressing the data? If NTFS compression isn't good enough, let's write the output into a GZipStream. It should offer good savings for text data, so the file on disk would be considerably smaller.
First off, a compression function (idea from a blog post):
function Compress-Stream {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [AllowEmptyString()]
        [string]$Row
    )
    begin {
        # Compress into an in-memory buffer, Base64-encode it at the end
        $ms = New-Object System.IO.MemoryStream
        $cs = New-Object System.IO.Compression.GZipStream($ms, [System.IO.Compression.CompressionMode]::Compress)
        $sw = New-Object System.IO.StreamWriter($cs)
    }
    process {
        # Skip blank rows; everything else goes into the gzip stream
        if(-not [string]::IsNullOrWhiteSpace($Row)) {
            $sw.Write($Row + [Environment]::NewLine)
        }
    }
    end {
        # Close the writer first so it flushes its buffer into the gzip
        # stream, then close the gzip stream so it writes its footer
        try {$sw.Close(); $sw.Dispose()} catch {}
        try {$cs.Close(); $cs.Dispose()} catch {}
        # ToArray() works even after the MemoryStream has been closed
        $s = [System.Convert]::ToBase64String($ms.ToArray())
        try {$ms.Close(); $ms.Dispose()} catch {}
        $s
    }
}
Sample usage queries the DBA Overflow data dump; it's much more manageable than SO. On my system the result set is 13 MB uncompressed, 3.5 MB compressed.
# SQL Server, so sqlcmd for illustration.
# Pipe results to compression and pipe compressed data into a file
sqlcmd -E -S .\sqli001 -d dbaoverflow -Q "select id, postid from votes order by id;" `
| compress-stream | Set-Content -Encoding ascii -Path c:\temp\data.b64
This should provide a compressed text file. To process it, use MemoryStream and GZipStream again:
$d = get-content c:\temp\data.b64
$data = [System.Convert]::FromBase64String($d)
$ms = New-Object System.IO.MemoryStream
$ms.Write($data, 0, $data.Length)
$ms.Seek(0,0) | Out-Null
$sr = New-Object System.IO.StreamReader(New-Object System.IO.Compression.GZipStream($ms, [System.IO.Compression.CompressionMode]::Decompress))
# $sr can now read decompressed data. For example,
$sr.ReadLine()
id postid
$sr.ReadLine()
----------- -----------
$sr.ReadLine()
1 2
Doing the replacements and writing the final result into another file should be easy enough.
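For example, a minimal streaming sketch continuing from the $sr reader above (the output path and search strings are placeholders):
# Read decompressed lines one at a time, replace, and write out
$writer = [System.IO.File]::CreateText('c:\temp\data_replaced.sql')
while ($null -ne ($line = $sr.ReadLine())) {
    $writer.WriteLine($line.Replace('sourceText', 'targetText'))
}
$writer.Close()
$sr.Close()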
In the end I used Python to replace the string in the dump while sending it to mysql.
It is fast enough and light on memory.
Related
I have to replace all occurrences of \ with \\ within a huge JSON Lines file. I wanted to use PowerShell, but there might be other options too.
The source file is 4,000,000 lines and about 6 GB.
The PowerShell script I was using took too much time; I let it run for 2 hours and it wasn't done yet. A runtime of half an hour would be acceptable.
$Importfile = "C:\file.jsonl"
$Exportfile = "C:\file2.jsonl"
(Get-Content -Path $Importfile) -replace "[\\]", "\\" | Set-Content -Path $Exportfile
If the replacement is simply a conversion of a single backslash to a double backslash, the file can be processed row by row.
Using a StringBuilder buffers the data in memory and flushes it to disk every now and then. Like so,
$src = "c:\path\MyBigFile.json"
$dst = "c:\path\MyOtherFile.json"
$sb = New-Object Text.StringBuilder
$reader = [IO.File]::OpenText($src)
$i = 0
$MaxRows = 10000
while($null -ne ($line = $reader.ReadLine())) {
    # Replace slashes
    $line = $line.Replace('\', '\\')
    [void]$sb.AppendLine($line)
    ++$i
    # Write builder contents into the file every now and then
    if($i -ge $MaxRows) {
        Add-Content $dst $sb.ToString() -NoNewline
        [void]$sb.Clear()
        $i = 0
    }
}
# Flush the builder after the loop if it still holds data
if($sb.Length -gt 0) {
    Add-Content $dst $sb.ToString() -NoNewline
}
$reader.Close()
Use the -ReadCount parameter of the Get-Content cmdlet (and set it to 0).
-ReadCount
Specifies how many lines of content are sent through the pipeline at a
time. The default value is 1. A value of 0 (zero) sends all of the
content at one time.
This parameter does not change the content displayed, but it does
affect the time it takes to display the content. As the value of
ReadCount increases, the time it takes to return the first line
increases, but the total time for the operation decreases. This can
make a perceptible difference in large items.
Example (runs about 17× faster for a file of about 20 MB):
$file = 'D:\bat\files\FileTreeLista.txt'
(Measure-Command {
    $xType = (Get-Content -Path $file) -replace "[\\]", "\\"
}).TotalSeconds, $xType.Count -join ', '
(Measure-Command {
    $yType = (Get-Content -Path $file -ReadCount 0) -replace "[\\]", "\\"
}).TotalSeconds, $yType.Count -join ', '
Get-Item $file | Select-Object FullName, Length
13,3288848, 338070
0,7557814, 338070
FullName Length
-------- ------
D:\bat\files\FileTreeLista.txt 20723656
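Applied to the replacement task from the question, that could look like this (using the same file variables as the question; note this still loads the whole file into memory, so the gain is speed, not memory):
(Get-Content -Path $Importfile -ReadCount 0) -replace "[\\]", "\\" |
    Set-Content -Path $Exportfile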
Based on your earlier question How can I optimize this Powershell script, converting JSON to CSV?, you should try to use the PowerShell pipeline for this, especially as it concerns large input and output files.
The point is that you shouldn't focus on single parts of the solution to determine the performance, because this usually leaves a wrong impression: the performance of a complete (PowerShell) pipeline solution is supposed to be better than the sum of its parts. Besides, it saves a lot of memory, and the result is lean PowerShell syntax...
In your specific case, if correctly set up, the CPU will be replacing the slashes, rebuilding the JSON strings, and converting them to objects while the hard disk is busy reading and writing the data...
To implement the replacement of the slashes in the PowerShell pipeline, together with the ConvertFrom-JsonLines cmdlet:
Get-Content .\file.jsonl | ForEach-Object { $_.replace('\', '\\') } |
ConvertFrom-JsonLines | ForEach-Object { $_.events.items } |
Export-Csv -Path $Exportfile -NoTypeInformation -Encoding UTF8
I am working on a PowerShell script to get the IIS log files of the last 2 days, check the size difference (growth) of the files, and send an email or generate an HTML email with the values.
I have done a few steps but am stuck on getting the difference between the 2 log files, and also on the HTML part.
Here is my code:
# Set your backup path
$BackupPath = "D:\log files\"
# Get the log file created today
$BackupToday = Get-ChildItem $BackupPath -Filter "*.log" | Where-Object {$_.CreationTime.Date -eq (Get-Date).Date} | %{[int]($_.length/1KB)}
# Get the log file created yesterday
$BackupYDay = Get-ChildItem $BackupPath -Filter "*.log" | Where-Object {$_.CreationTime.Date -eq ((Get-Date).AddDays(-1)).Date} | %{[int]($_.length/1KB)}
# Compare the two files based on the size
$compare = ($BackupYDay - $BackupToday)
Write-Host($BackupToday)
Write-Host($BackupYDay)
Write-Host($compare)
Not quite sure what you mean by "and email or generate a html email with values".
For that, have a look at the Send-MailMessage cmdlet.
If what you want to email is as simple as a string showing the difference (if any) in size of the two files, you could do something like this:
# Set your backup path
$BackupPath = "D:\log files"
# get the logfiles and select only the latest two
$logNewest, $logBefore = Get-ChildItem -Path $BackupPath -Filter "*.log" -File | Sort-Object CreationTime -Descending | Select-Object -First 2
$sizeDiff = $logNewest.Length - $logBefore.Length
$difference = switch ([math]::Sign($sizeDiff)) {
1 { 'File {0} is {1:N2}KB larger than file {2}' -f $logNewest.Name, [math]::Abs($sizeDiff / 1KB ), $logBefore.Name }
-1 { 'File {0} is {1:N2}KB smaller than file {2}' -f $logNewest.Name, [math]::Abs($sizeDiff / 1KB ), $logBefore.Name }
0 { 'File {0} and file {1} are of equal size' -f $logNewest.Name, $logBefore.Name }
}
Write-Host $difference
Result:
File Today.log is 0,01KB larger than file Yesterday.log
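To then send that string by email, a minimal Send-MailMessage sketch could look like this (the server and addresses are placeholders):
Send-MailMessage -SmtpServer 'smtp.example.com' -From 'backup@example.com' `
                 -To 'admin@example.com' -Subject 'IIS log size report' `
                 -Body $difference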
How can I access "dxdiag" with PowerShell? I want to run a script to gather some information from a few remote computers.
If you use the /x parameter, you can have dxdiag output to an XML file, which is then really easy to parse from PowerShell. Basically just something like this:
# Drop output in temp dir
$logFile = $env:TEMP + "\dxDiagLog.xml"
# Piping to Out-Null forces it to wait for dxdiag to complete before continuing. Otherwise
# it tries to load the file before it actually gets written
dxdiag.exe /whql:off /dontskip /x $logFile | Out-Null
[xml]$dxDiagLog = Get-Content $logFile
$dxDiagLog.DxDiag.DirectSound.SoundDevices.SoundDevice | ft Description, DriverProvider
Which dumps this for output on my machine:
Description DriverProvider
----------- --------------
Speakers (Realtek High Definition Audio) Realtek Semiconductor Corp.
Polycom CX700 (Polycom CX700) Microsoft
In my case the command would run, and only later would it create the file.
(& dxdiag.exe /whql:off /dontskip /t `"$path`") | Out-Null
The problem is with the ampersand &, which made the command exit before completion.
So either use:
dxdiag.exe /whql:off /dontskip /x $logFile | Out-Null
Or:
Start-Process -FilePath "C:\Windows\System32\dxdiag.exe" -ArgumentList "/dontskip /whql:off /t C:\Temp\dxdiag.txt" -Wait
From: https://community.spiceworks.com/topic/2116806-how-can-i-run-dxdiag-on-a-remote-pc
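For the remote computers mentioned in the question, one possible approach is to wrap the same commands in Invoke-Command. This is only a sketch: it assumes PowerShell remoting is enabled on the targets and that dxdiag can run in a non-interactive session, and the computer names are placeholders:
# Return the raw XML text from each machine and cast it locally
$xmlTexts = Invoke-Command -ComputerName 'PC01', 'PC02' -ScriptBlock {
    $logFile = Join-Path $env:TEMP 'dxDiagLog.xml'
    dxdiag.exe /whql:off /dontskip /x $logFile | Out-Null
    Get-Content $logFile -Raw
}
foreach ($text in $xmlTexts) {
    ([xml]$text).DxDiag.DirectSound.SoundDevices.SoundDevice |
        Format-Table Description, DriverProvider
}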
I have the following PowerShell code, where I am trying to find the latest backup file of a MySQL database and then import it.
Up to the last step I get the desired output; when I copy that output and execute it in a separate cmd window, it runs smoothly, but when I try to do the same thing in PowerShell it fails with the following error. Please help me.
Error message
C:\wamp\bin\mysql\mysql5.5.24\bin\mysql.exe --user=root --password=xxx testdest < "C:\mysqltemp\testsrc_2013-12-23_10-46-AM.sql"
cmd.exe : The system cannot find the file specified.
At C:\Users\IBM_ADMIN\AppData\Local\Temp\8a7b4576-97b2-42aa-a0eb-42bb934833a6.ps1:19 char:4
+ cmd <<<< /c " "$pathtomysqldump" --user=$param1 --password=$param2 $param3 < $param5 "
+ CategoryInfo : NotSpecified: (The system cann...file specified.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
The script is as follows:
##Select latest file created by Export of mysql dumper
$a=(get-childitem C:\mysqltemp | sort LastWriteTime -Descending | Select-Object Name | select -first 1 -ExpandProperty Name)
$pathtomysqldump = "C:\wamp\bin\mysql\mysql5.5.24\bin\mysql.exe"
#Write-Host "Print variable A -------------"
#$a
$a=$a.Replace(" ", "")
#Write-Host "After Triming ---------------"
#$a
$param1="root"
$param2="xxx"
$param3="testdest"
#$param4="""'<'"""
$param5="""C:\mysqltemp\$a"""
#$p1="$param1 $param2 $param3 < $param5"
# Invoke backup Command. /c forces the system to wait to do the backup
Write-Host " "$pathtomysqldump" --user=$param1 --password=$param2 $param3 < $param5 "
cmd /c " "$pathtomysqldump" --user=$param1 --password=$param2 $param3 < $param5 "
Thanks, and I appreciate your help and time.
This is a common misunderstanding involving calling command lines in the Windows operating system, particularly from PowerShell.
I highly recommend using the Start-Process cmdlet to launch a process instead of calling cmd.exe. It's much easier to mentally parse out and understand the path to the executable, and all of the command line parameters separately. The problem with your current script is that you're trying to call an executable file with the following name: C:\wamp\bin\mysql\mysql5.5.24\bin\mysql.exe --user=root --password=xxx testdest < "C:\mysqltemp\testsrc_2013-12-23_10-46-AM.sql", which has been wrapped in a call to cmd.exe. Obviously, that file does not exist, because you're including all of the parameters as part of the filesystem path.
There are too many layers going on here to make it simple to understand. Instead, use Start-Process similar to the following example:
# 1. Define path to mysql.exe
$MySQL = "C:\wamp\bin\mysql\mysql5.5.24\bin\mysql.exe"
# 2. Define some parameters
$Param1 = 'value1';
$Param2 = 'value2';
$Param3 = 'value 3 with spaces';
# 3. Build the command line arguments
# NOTE: Since the value of $Param3 has spaces in it, we must
# surround the value with double quotes in the command line
$ArgumentList = '--user={0} --password={1} "{2}"' -f $Param1, $Param2, $Param3;
Write-Host -Object "Arguments are: $ArgumentList";
# 4. Call Start-Process
# NOTE: The -Wait parameter tells it to "wait"
# The -NoNewWindow parameter prevents a new window from popping up
# for the process.
Start-Process -FilePath $MySQL -ArgumentList $ArgumentList -Wait -NoNewWindow;
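One caveat: the original command also relied on cmd.exe's < redirection to feed the .sql file to mysql.exe. Start-Process can cover that part with its -RedirectStandardInput parameter; here is a sketch reusing the values from the question:
# -RedirectStandardInput replaces the cmd.exe "< file.sql" redirection
Start-Process -FilePath $MySQL `
    -ArgumentList '--user=root --password=xxx testdest' `
    -RedirectStandardInput 'C:\mysqltemp\testsrc_2013-12-23_10-46-AM.sql' `
    -Wait -NoNewWindow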
I am pretty new to PowerShell. I was recently tasked with making an error message popup that would help a local user determine whether or not an MS SQL on-demand DB merge worked. I wrote a script that would do the following:
Run the batch file that conducted the merge
Read the results of a text log file into a variable
Check the variable for any instances of the word "ERROR" and pop a success or failure dialog box depending on whether it found the word in the log file.
Quick and simple, I thought, but I appear to be struggling with the conditional statement. Here is the script:
cmd /c c:\users\PERSON\desktop\merge.bat
$c = get-content c:\replmerg.log
if ($c -contains ("ERROR"))
{
$a = new-object -comobject wscript.shell
$b = $a.popup("ERROR - Database Merge",0,"Please Contact Support",1)
}
else
{
$a = new-object -comobject wscript.shell
$b = $a.popup("SUCCESS - Database Merge",0,"Good Job!",1)
}
Right now the script runs and just skips to the success message. I can confirm that simply running the get-content command on its own in PowerShell produces a variable that shows the content of the log file. My script, however, does not appear to actually check the $c variable for the word and pop the error message as intended. What am I missing here?
Christian's answer is correct. You could also use the -match operator. For example:
if ((Get-Content c:\replmerg.log) -match "ERROR")
{
'do error stuff'
}
else
{
'do success stuff'
}
You can use -cmatch if you want a case-sensitive comparison.
You actually don't need to use Get-Content at all; Select-String can take a -Path parameter. I created two very simple text files, one which has the word ERROR and one which does not:
PS C:\> cat .\noerror.txt
not in here
PS C:\> cat .\haserror.txt
ERROR in here
this has ERROR in here
PS C:\> if ( Select-String -Path .\noerror.txt -Pattern ERROR) {"Has Error"}
PS C:\> if ( Select-String -Path .\haserror.txt -Pattern ERROR) {"Has Error"}
Has Error
PS C:\>
The one thing that might trip you up is that -Pattern actually takes a regular expression, so be careful what you use for your pattern. This will find ERROR anywhere in the log file, even if there are multiple instances, as in my "haserror.txt" file.
The -contains operator is used for looking for an exact match in a list (or array). As the other answers indicate, you should use -match, -like, or -eq to compare strings.
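A quick illustration of the difference (the sample strings are made up):
$lines = 'step one', 'ERROR: merge failed', 'step two'
$lines -contains 'ERROR'   # False: no element equals 'ERROR' exactly
$lines -match 'ERROR'      # Returns 'ERROR: merge failed'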
You can use the -Quiet switch of Select-String if you just want to test for the presence of a string in a file.
Select-String -Path c:\replmerg.log -Pattern "ERROR" -CaseSensitive -Quiet
Will return $true if the string is found in the file, and $false if it is not.
The -contains operator tests only for an identical value (not part of a value).
You can try this
$c = Get-Content c:\replmerg.log | Select-String "ERROR" -CaseSensitive
if ($c.Length -gt 0) {
    # at least one matching line was found in the log
}