I have a number of files that get dumped on a server every morning. There are 9 comma-delimited CSV files, varying in size from 59 KB to as large as 127 MB. I need to import these into a MySQL database, but the files currently arrive in Western ISO-8859-16 encoding and I need to convert them to UTF-8 for the import. The manual process of converting these is quite tedious, as you can imagine.
Is there a script / batch file I can create to run every morning via Task Scheduler? Or what is the best way to automate this process on Windows Server 2012?
Current script, which reads the files but doesn't seem to convert them:
#!/bin/bash
# Recursive file conversion CP1252 --> UTF-8
# Place this file in the root of your site, add execute permission and run.
# Converts *.csv, *.html, *.css, *.js files.
# To add a file type by extension, e.g. *.cgi, add '-o -name "*.cgi"' to the find command.
find ./ \( -name "*.csv" -o -name "*.html" -o -name "*.css" -o -name "*.js" \) -type f |
while read -r file
do
  echo " $file"
  mv "$file" "$file.icv"
  iconv -f WINDOWS-1252 -t UTF-8 "$file.icv" > "$file"
  rm -f "$file.icv"
done
Any help is greatly appreciated.
Thanks.
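As a side note, the script above converts from Windows-1252 (-f WINDOWS-1252), while the question says the files arrive as ISO-8859-16, so the source encoding passed to iconv probably needs to match the actual files. Below is a minimal sketch of the same conversion limited to the CSV files, assuming iconv is available on the server (for example through Cygwin or Git Bash, since Windows Server 2012 does not ship one) and treating /path/to/drop as a placeholder for the dump directory; Task Scheduler could then run this script every morning via the bash executable.
#!/bin/bash
# Convert every CSV in the drop directory from ISO-8859-16 to UTF-8 in place.
# /path/to/drop is a placeholder for wherever the files are dumped each morning.
for file in /path/to/drop/*.csv
do
    iconv -f ISO-8859-16 -t UTF-8 "$file" > "$file.utf8" && mv "$file.utf8" "$file"
done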
Related
I have multiple JSON files within a folder and I would like to post them all at once, in a single command line using curl. Is there a way to do this?
I have these files within a folder in my directory:
20190116_101859_WifiSensor(1).json
20190116_101859_WifiSensor(2).json
20190116_101859_WifiSensor(3).json
20190116_101859_WifiSensor(4).json
20190116_101859_WifiSensor(5).json
20190116_101859_WifiSensor(6).json
20190116_101859_WifiSensor(7).json
20190116_101859_WifiSensor(8).json
... plus more
I'd like to post all of the files from the folder in one go.
I know how to post one file using
curl -d "#20190116_101859_WifiSensor(1).json" http://iconsvr:8005/data
I need a way of posting them in one go, without having to write out each file name, if possible.
You can use a for loop to iterate over all files in your current directory whose filenames contain WifiSensor.
In Linux (Bash) you could use
for f in *WifiSensor*.json; do curl -d @"$f" http://iconsvr:8005/data; done
In Windows (CMD)
for /r %f in (*WifiSensor*.json) do curl -d @%f http://iconsvr:8005/data
Don't forget that if you use the Windows snippet above in a batch file, you need to double the % signs (%%f instead of %f).
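If it helps, a slightly more defensive version of the bash loop is sketched below; it assumes the same endpoint, uses the @ prefix so curl reads the request body from each file, and adds -f so curl exits non-zero on HTTP errors and failed posts get reported:
for f in *WifiSensor*.json; do
    # -f: fail on HTTP errors; @"$f": send the file contents as the request body
    curl -f -d @"$f" http://iconsvr:8005/data || echo "failed to post: $f"
done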
I have more than 1000 files in an AWS S3 bucket, in different folders; all of them are JSON files. These JSON files have 30 properties, and I now have to rename 2 of those properties (e.g. code to httpCode and time to responseTime). Can we write a script that changes these property names in all the files?
Note: you should first run the command without the -i switch to sed, just to verify that you are getting the desired results; -i makes the changes in the file. Only add the -i switch once the output looks right.
# Get the files from the s3 bucket
aws s3 sync s3://mybucket .
find . -iname "*.json" -type f -exec sed -i 's/code/httpCode/g;s/time/responseTime/g' {} \;
# Sync the files back to s3 from the current local directory
aws s3 sync . s3://mybucket
PS: this is untested.
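One caveat worth adding: a bare s/code/httpCode/g will also rewrite any other occurrence of the substring code, for example inside values or inside a longer key name. A slightly safer sketch, assuming the keys appear in the files as quoted JSON names immediately followed by a colon:
# Preview the substitutions first (no -i), then apply them in place once the output looks right
find . -iname "*.json" -type f -exec sed 's/"code":/"httpCode":/g; s/"time":/"responseTime":/g' {} \; | head -50
find . -iname "*.json" -type f -exec sed -i 's/"code":/"httpCode":/g; s/"time":/"responseTime":/g' {} \;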
I've got 300 .csv files. I need to add a column that contains the filename in each row. I'm new to using the command line in Terminal. I've looked around but haven't found code I understand for doing this on a Mac. I would be very grateful for help.
You could write a bash script:
vi csv-script.sh
Press i (to insert text).
Copy/paste this code as an example (it is kind of self-explanatory):
#!/bin/bash
# note: this relies on the .csv file names not containing spaces
for file in $(find ./ -maxdepth 1 -name '*.csv' -type f)
do
  touch tmp
  fileCol="$(basename "$file")"   # just the file name, without the leading path
  while IFS= read -r line         # read whole lines so spaces inside fields are preserved
  do
    echo "${fileCol};${line}" >>tmp
  done < "$file"
  mv -f tmp "$file"
done
Change the semicolon within the echo command to whatever separator you need.
Press Esc and then type :wq to save and quit.
Then call it from the CLI with: . csv-script.sh (pay attention to the dot and the space before csv-script.sh).
Have a try.
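As an alternative, a single awk invocation per file avoids the line-by-line shell loop entirely; this is only a sketch, using the same semicolon separator as the script above:
for f in ./*.csv; do
    # prepend the file name (without the ./ prefix) to every line, then replace the original file
    awk -v name="${f#./}" '{ print name ";" $0 }' "$f" > tmp && mv tmp "$f"
done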
I have a shell script on Linux whose output is generated in .csv format.
At the end of the script I convert this .csv to .gz format to reduce the space used on my machine.
The file that is generated comes in this format: Output_04-07-2015.csv
The command I have written to zip it is: gzip Output_*.csv
But I am facing an issue: if the file already exists, it should create the new file with the reported timestamp.
Can anyone help me with it?
If all you want is to overwrite the file when it already exists, gzip has a -f flag for that.
gzip -f Output_*.csv
What the -f flag does is forcefully create the gzip file, overwriting whatever gzip file might already be there.
Have a look at the man pages by typing man gzip, or even this link, for many other options.
If instead you want to do it more elegantly, you could check whether shell constructs in your script work for you or not; that would differ depending on what shell you have: bash, csh, etc.
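If the goal is instead to keep the existing archive and write the new one under a timestamped name, here is a rough sketch along those lines (the HH-MM-SS suffix is just an assumed format):
for f in Output_*.csv; do
    if [ -e "$f.gz" ]; then
        # an archive with this name already exists: rename the new csv with a timestamp first
        new="${f%.csv}_$(date +%H-%M-%S).csv"
        mv "$f" "$new"
        gzip "$new"
    else
        gzip "$f"
    fi
done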
I found this question, which had an answer about performing batch conversions with Pandoc, but it doesn't address how to make the conversion recursive. I'll stipulate up front that I'm not a programmer, so I'm seeking some help on this here.
The Pandoc documentation is slim on details regarding passing batches of files to the executable, and based on the script it looks like Pandoc itself is not capable of parsing more than a single file at a time. The script below works just fine on Mac OS X, but only processes the files in the local directory and outputs the results in the same place.
find . -name \*.md -type f -exec pandoc -o {}.txt {} \;
I used the following code to get something like the result I was hoping for:
find . -name \*.html -type f -exec pandoc -o {}.markdown {} \;
This simple script, run using Pandoc installed on Mac OS X 10.7.4, converts all matching files in the directory I run it in to Markdown and saves them in the same directory. For example, a file named apps.html is converted to apps.html.markdown in the same directory as the source file.
While I'm pleased that it makes the conversion, and it's fast, I need it to process all files located in one directory and put the Markdown versions in a set of mirrored directories for editing. Ultimately, these directories are in GitHub repositories: one branch is for editing while another branch is for production/publishing. In addition, this simple script retains the original extension and appends the new extension to it, so if I convert back again it will add the HTML extension after the Markdown extension, and the file name would just grow and grow.
Technically, all I need is to be able to parse one branch's directory and sync it with the production one; then, when all changed, removed, and new content is verified as correct, I can run commits to publish the changes. It looks like the find command can handle all of this, but I just have no clue how to configure it properly, even after reading the Mac OS X and Ubuntu man pages.
Any kind words of wisdom would be deeply appreciated.
TC
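Regarding the mirrored-directories part of the question, here is a rough find-based sketch; the production and editing directory names are placeholders for whatever the two working trees are actually called, and it swaps the extension rather than appending a second one:
# Walk the source tree, mirror each file's relative path into the editing tree,
# and write a .markdown file instead of appending another extension.
find production -name '*.html' -type f | while read -r f; do
    out="editing/${f#production/}"
    out="${out%.html}.markdown"
    mkdir -p "$(dirname "$out")"
    pandoc -f html -t markdown -s "$f" -o "$out"
done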
Create the following Makefile:
TXTDIR=sources
HTMLS=$(wildcard *.html)
MDS=$(patsubst %.html,$(TXTDIR)/%.markdown, $(HTMLS))
.PHONY : all
all : $(MDS)
$(TXTDIR) :
	mkdir $(TXTDIR)
$(TXTDIR)/%.markdown : %.html $(TXTDIR)
	pandoc -f html -t markdown -s $< -o $@
(Note: The indented lines must begin with a TAB -- this may not come through in the above, since markdown usually strips out tabs.)
Then you just need to type 'make', and it will run pandoc on every file with a .html extension in the working directory, producing a markdown version in 'sources'. An advantage of this method over using 'find' is that it will only run pandoc on a file that has changed since it was last run.
Just for the record: here is how I achieved the conversion of a bunch of HTML files to their Markdown equivalents:
for file in *.html; do pandoc -f html -t markdown "${file}" -o "${file%html}md"; done
If you look at the -o argument, you'll see it uses shell string manipulation: ${file%html} strips the trailing html from the name and md is appended, so the existing .html extension is replaced with .md.
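A tiny illustration of that parameter expansion, just to make the mechanism explicit:
file="apps.html"
echo "${file%html}md"   # ${file%html} drops the trailing "html", so this prints apps.md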