Linux shell script command - gzip - csv

I have a shell script on Linux whose output is generated in .csv format.
At the end of the script I convert the .csv to .gz format to reduce the space used on my machine.
The generated file is named in this format: Output_04-07-2015.csv
The command I have written to compress it is: gzip Output_*.csv
The issue I am facing is that if the compressed file already exists, the new file should instead be created with the reported timestamp in its name.
Can anyone help me with this?

If all you want is to overwrite the file when it already exists, gzip has a -f flag for that.
gzip -f Output_*.csv
What the -f flag does is forcibly create the gzip file, overwriting any existing archive of the same name.
Have a look at the man pages by typing man gzip, or even this link, for many other options.
If instead you want to handle it more elegantly, you could add the check in your script itself, but the exact approach would differ depending on what shell you have: bash, csh, etc.
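For the timestamp behaviour asked about in the question, here is a minimal bash sketch, assuming the Output_*.csv naming shown above (the timestamp format is just an example):
#!/bin/bash
# Compress each CSV; if the .gz already exists, rename the new CSV
# with the current timestamp first so nothing is overwritten.
for f in Output_*.csv; do
    if [ -e "$f.gz" ]; then
        ts=$(date +%H-%M-%S)
        mv "$f" "${f%.csv}_$ts.csv"
        gzip "${f%.csv}_$ts.csv"
    else
        gzip "$f"
    fi
done
This leaves the existing archive intact and produces, for example, Output_04-07-2015_09-30-00.csv.gz next to it.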

Executing binary SQL file using SQLCMD from WiX

I'm trying to install a SQL script (SSDT) using SQLCMD, as this script contains too many SSDT definitions and cannot be run by the WiX SQL extension.
I want my SQL script file to be a binary (as I don't want it to stay on the target machine).
How can I set the SQLCMD command to use the binary script (with -i)?
P.S.
I tried this blog:
http://neilsleightholm.blogspot.co.il/2008/08/executing-sqlcmd-from-wix.html
but its code doesn't show the link between the binary SQL file and the SQLCMD command.
Can someone help me with the correct code?
This is the code I used, which did not work for me:
<Binary Id="CreateSchema.sql" SourceFile="..\SQL\CreateSchema.sql" />
<CustomAction Id="sqlcmd.cmd"
              Property="sqlcmd"
              Value="&quot;sqlcmd.exe&quot; -S [DATABASE_SERVER] -i &quot;[#CreateSchema.sql]&quot; -v var=SYSTEM_USER -o [INSTALLDIR]installSql.log" />
<CustomAction Id="sqlcmd"
              BinaryKey="WixCA"
              DllEntry="CAQuietExec"
              Return="check"
              Execute="deferred"
              Impersonate="yes" />
<InstallExecuteSequence>
  <Custom Action="sqlcmd.cmd" After="InstallFiles">NOT Installed</Custom>
  <Custom Action="sqlcmd" After="sqlcmd.cmd">NOT Installed</Custom>
</InstallExecuteSequence>
The log file showed that the -i parameter did not have any file name value:
MSI (s) (4C:6C) [09:58:15:610]: Executing op: CustomActionSchedule(Action=sqlcmd,ActionType=1025,Source=BinaryData,Target=CAQuietExec,CustomActionData="sqlcmd.exe" -S (local) -i "" -v var=SYSTEM_USER -o C:\installSql.log)
That's not how <Binary> works. The [#FileID] syntax is used to dynamically resolve, at runtime, the full installed path of a component's file.
Binary entries are typically used as temporarily extracted files for custom actions or, in this case, SQL files, among other things.
Consider looking into the SQL extension in WiX. As a minimal example, take a look at this code.
Add the sql namespace: xmlns:sql="http://schemas.microsoft.com/wix/SqlExtension"
<Binary Id="CreateSchema" SourceFile="..\SQL\CreateSchema.sql" />
<sql:SqlDatabase Id="MyDB" Database="[DATABASE]" Server="[DATABASE_SERVER]" />
And in a component you can add
<sql:SqlScript Id="CreateSchemaScript" BinaryKey="CreateSchema" ExecuteOnInstall="yes" Sequence="1" SqlDb="MyDB"/>
Here is a link to the SQL schema definition with all the available elements. I haven't done much with the SQL extension, so you may need to do some reading to get a better idea of what you will need to do to create your DB on install.
As I mentioned, I wanted to use both SQLCMD (since my SQL script is in SSDT format) and a binary file (so the file is deleted at the end of the install).
After looking for answers I understood that I cannot use the WiX [#filekey] syntax, as a binary file will not be extracted as long as there is no custom action explicitly using it.
So in the end I understood that the best way is to extract the binary file myself.
The steps I used, in one single custom action, are:
extract the binary SQL script from the MSI Binary table
save this file locally
run SQLCMD with -i and the new file path (the one I saved to; see the example invocation below)
delete the SQL file
One issue worth mentioning: if you save the file to INSTALLDIR, the directory may not exist at the run time of the custom action, so consider saving it to the temp folder or creating the directory beforehand.
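For illustration, step 3 boils down to running something like this; the paths here are hypothetical examples, and the server name would already have been resolved from the [DATABASE_SERVER] property inside the custom action:
sqlcmd.exe -S myserver\SQLEXPRESS -i "C:\Windows\Temp\CreateSchema.sql" -v var=SYSTEM_USER -o "C:\Windows\Temp\installSql.log"
The key difference from the original attempt is that -i now points at a real file the custom action wrote to disk, rather than at a [#FileID] reference that never resolves for a <Binary> entry.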

Importing large datasets into Couchbase

I am having difficulty importing large datasets into Couchbase. I have experience doing this very quickly with Redis via the command line, but I have not seen anything similar for Couchbase.
I have tried using the PHP SDK, and it imports about 500 documents per second. I have also tried the cbdocloader script in the Couchbase bin folder, but it seems to want each document in its own JSON file. It is a bit of work to create all these files and then load them. Is there some other import process I am missing? If cbdocloader is the only way to load data fast, is it possible to put multiple documents into one JSON file?
Take the file that has all the JSON documents in it and zip up the file:
zip somefile.zip somefile.json
Place the zip file(s) into a directory. I used ~/json_files/ in my home directory.
Then load the file or files with the following command:
cbdocloader -u Administrator -p s3kre7Pa55 -b MyBucketToLoad -n 127.0.0.1:8091 -s 1000 \
~/json_files/somefile.zip
Note: '-s 1000' is the memory size. You'll need to adjust this value for your bucket.
If successful you'll see output stating how many documents were loaded, success, etc.
Here is a brief script to load up a lot of .zip files in a given directory:
#!/bin/bash
# Load every .zip file in JSON_Dir into Couchbase via cbdocloader.
JSON_Dir=~/json_files/
for ZipFile in "$JSON_Dir"/*.zip; do
    /Applications/Couchbase\ Server.app/Contents/Resources/couchbase-core/bin/cbdocloader \
        -u Administrator -p s3kre7Pa55 -b MyBucketToLoad \
        -n 127.0.0.1:8091 -s 1000 "$ZipFile"
done
UPDATED: Keep in mind this script will only work if your data is formatted correctly and each document is under the maximum single-document size of 20 MB (not the zip file itself, but any document extracted from the zip).
I have also created a blog post describing bulk loading from a single file, and it is listed here:
Bulk Loading Documents Into Couchbase

How to get AWStats to generate static HTML files?

I want to get AWStats running on my webserver that runs Debian 4.4.5-8 with Apache 2.
There are several websites that all have their own configuration file, similar to this:
Include "/etc/awstats/awstats.model.conf"
LogFile="/var/customers/logs/myname-example.com-access.log"
LogType=W
LogFormat = 1
LogSeparator=" "
SiteDomain="example.com"
HostAliases="*.example.com"
DirData="/www/myname/awstats/example.com/"
What I expect is that HTML files are written to /www/myname/awstats/example.com/, which I can then access through Apache. However, when I run /usr/share/awstats/tools/buildstatic.sh, .txt files are written to that directory and the HTML files that I want are written to /var/cache/awstats instead. The error file in /tmp remains empty.
Why is this happening and how do I make it work the way I want?
DirData is not supposed to be read directly by the web server; it is used by awstats.pl.
The fact is that /var/cache/awstats is hardcoded in buildstatic.sh so you have to change the two lines mentioning it:
mkdir -p /var/cache/awstats/$c/$Y/$m/
and
-dir=/var/cache/awstats/$c/$Y/$m/ >$TMPFILE 2>&1
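For example, a sketch of the edit, assuming you want the static pages written under /www/myname/awstats (substitute whatever directory Apache serves):
mkdir -p /www/myname/awstats/$c/$Y/$m/
and
-dir=/www/myname/awstats/$c/$Y/$m/ >$TMPFILE 2>&1
Alternatively, leave the script untouched and point an Apache alias at /var/cache/awstats instead.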

--no-clobber still overwrites file if --html-extension used in wget?

I have a script for downloading all of my Chrome bookmarks. I use wget with --html-extension because some of the bookmarks end in .php and can't be opened in a web browser unless the --html-extension option is used. The problem I am having is that when I use --html-extension with --no-clobber, wget doesn't recognize that most of the files are already there, so it goes through the whole process of re-downloading things it already has.
An example:
wget -nc http://www.test.com/
Run once, it will save the file as it is supposed to. If you run it again, it will say the file is already there and will not retrieve it. That is the behaviour I would expect.
However, delete the file that was just saved and run:
wget -nc http://www.test.com/ --html-extension
and then run that same command again. It overwrites the file instead of saying the file is already there. What is going on?
When the html suffix is added, wget can't tell what remote file you want to compare it to.
man wget: http://unixhelp.ed.ac.uk/CGI/man-cgi?wget
======================
--html-extension
If a file of type application/xhtml+xml or text/html is downloaded
and the URL does not end with the regexp \.[Hh][Tt][Mm][Ll]?, this
option will cause the suffix .html to be appended to the local
filename. This is useful, for instance, when you're mirroring a
remote site that uses .asp pages, but you want the mirrored pages
to be viewable on your stock Apache server. Another good use for
this is when you're downloading CGI-generated materials. A URL
like http://site.com/article.cgi?25 will be saved as
article.cgi?25.html.
Note that filenames changed in this way will be re-downloaded every
time you re-mirror a site, because Wget can't tell that the local
X.html file corresponds to remote URL X (since it doesn't yet know
that the URL produces output of type text/html or
application/xhtml+xml). To prevent this re-downloading, you must
use -k and -K so that the original version of the file will be
saved as X.orig.
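Based on that note, the command from the question would become something like the following; treat it as a sketch, since behaviour varies somewhat between wget versions:
wget -nc -k -K --html-extension http://www.test.com/
Here -k converts links in the downloaded pages and -K backs up each original as X.orig, which is what lets wget match the local file to the remote URL on later runs instead of re-downloading it.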

Recursive directory parsing with Pandoc on Mac

I found this question, which answers how to perform batch conversions with Pandoc, but it doesn't answer how to make the process recursive. I stipulate up front that I'm not a programmer, so I'm seeking some help on this here.
The Pandoc documentation is slim on details regarding passing batches of files to the executable, and based on the script it looks like Pandoc itself is not capable of processing more than a single file at a time. The script below works just fine on Mac OS X, but it only processes the files in the current directory and outputs the results in the same place.
find . -name \*.md -type f -exec pandoc -o {}.txt {} \;
I used the following code to get something of the result I was hoping for:
find . -name \*.html -type f -exec pandoc -o {}.markdown {} \;
This simple script, run using Pandoc installed on Mac OS X 10.7.4, converts all matching files in the directory I run it in to Markdown and saves them in the same directory. For example, a file named apps.html would be converted to apps.html.markdown in the same directory as the source file.
While I'm pleased that it makes the conversion, and it's fast, I need it to process all files located in one directory and put the Markdown versions in a set of mirrored directories for editing. Ultimately, these directories are in GitHub repositories: one branch is for editing while another branch is for production/publishing. In addition, this simple script retains the original extension and appends the new extension to it. If I convert back again, it will add the .html extension after the .markdown extension, and the file name would just grow and grow.
Technically, all I need to do is parse one branch's directory and sync it with the production one; then, when all changed, removed, and new content is verified correct, I can run commits to publish the changes. It looks like the find command can handle all of this, but I just have no clue how to configure it properly, even after reading the Mac OS X and Ubuntu man pages.
Any kind words of wisdom would be deeply appreciated.
TC
Create the following Makefile:
TXTDIR=sources
HTMLS=$(wildcard *.html)
MDS=$(patsubst %.html,$(TXTDIR)/%.markdown, $(HTMLS))

.PHONY : all
all : $(MDS)

$(TXTDIR) :
	mkdir $(TXTDIR)

$(TXTDIR)/%.markdown : %.html $(TXTDIR)
	pandoc -f html -t markdown -s $< -o $@
(Note: The indented lines must begin with a TAB -- this may not come through in the above, since markdown usually strips out tabs.)
Then you just need to type 'make', and it will run pandoc on every file with a .html extension in the working directory, producing a markdown version in 'sources'. An advantage of this method over using 'find' is that it will only run pandoc on a file that has changed since it was last run.
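If you want the Markdown output to land somewhere else, such as a mirrored editing directory, you can override the variable on the command line instead of editing the Makefile; the path here is a hypothetical example:
make TXTDIR=../editing
Command-line assignments take precedence over assignments inside the Makefile, so the same rules run with the new target directory.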
Just for the record: here is how I achieved the conversion of a bunch of HTML files to their Markdown equivalents:
for file in *.html; do pandoc -f html -t markdown "${file}" -o "${file%html}md"; done
If you have a look at the -o argument, you'll see it uses shell string manipulation ("${file%html}md") to replace the existing html ending with the md file ending.
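To cover the recursive, mirrored-directory part of the original question, here is a minimal bash sketch; the html and markdown directory names are hypothetical examples, so substitute your own checkouts:
#!/bin/bash
# Recursively convert html/**/*.html into markdown/**/*.md,
# mirroring the source directory structure.
find html -type f -name '*.html' | while read -r src; do
    dst="markdown/${src#html/}"   # swap the top-level directory
    dst="${dst%.html}.md"         # swap the extension instead of appending
    mkdir -p "$(dirname "$dst")"  # create the mirrored subdirectory
    pandoc -f html -t markdown -s "$src" -o "$dst"
done
Because the extension is replaced rather than appended, converting back and forth will not make the file names grow.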