Cassandra CQLSH COPY FROM CSV: Can I create my own column from others?

I often use the cqlsh command COPY...FROM CSV..., but I have a new need.
I'd like to add an extra column to my Cassandra table that would be created from two other columns.
Example (csv file):
1;2
2;4
3;6
would become a table with these values:
my table:
12;1;2
24;2;4
36;3;6
I've used other options, but they're much slower than COPY...FROM CSV.
Do you know if I can do that using COPY...FROM CSV?

You can't do this with the COPY command alone.
If you are using Linux:
First dump the CSV to a file with the COPY command, let's say csv_test.csv:
1;2
2;4
3;6
Then use the command below to prepend a new column built from the first two:
cat csv_test.csv | awk -F ";" '{print $1$2 ";" $0}' > csv_test_combine.csv
Output file csv_test_combine.csv:
12;1;2
24;2;4
36;3;6
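You can then load the combined file back with COPY ... FROM. This is a sketch only, since I don't know your schema; the keyspace, table and column names below are made up, and the delimiter is set to match the semicolons:
cqlsh -e "COPY my_keyspace.my_table (combined, col1, col2)
          FROM 'csv_test_combine.csv' WITH DELIMITER=';' AND HEADER=false;"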

Related

Not able to access tables from a corrupted MySQL Dump file

grep -n "Table Structure" dumpfile.sql
returns
XXXXXX:-- Table structure for table `table_name_1`
XXXXXX:-- Table structure for table `table_name_2`
XXXXXX:-- Table structure for table `table_name_3`
But after this point, it breaks, and I'm not sure why.
And also:
For retrieving a single table from the huge dump file (around 489 GB), I used:
sed -n -e '/Table Structure 'table_name'/p' dump_file_name.sql > extracted_file.sql
But it is not able to locate the table_name.
So my question is: how can all the tables be accessed? And why, after a certain table, is it no longer able to find them?
If anyone can help me with this, it would be greatly appreciated!
You have two problems with your sed command.
First, you're using single quotes inside the string that's delimited by single quotes. That won't work, because the inside quotes will just end the shell string, not be included literally.
Second, the quotes in the dump file are backticks, not single quotes.
Also, you're missing "for table" in your pattern, and the "s" in "structure" should be lowercase.
sed -n -e '/Table structure for table `table_name`/p' dump_file_name.sql > extracted_file.sql
But you can just use grep for this, you don't need sed:
grep 'Table structure for table `table_name`' dump_file_name.sql > extracted_file.sql
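If what you actually want is the whole table (structure plus data) rather than just that one header line, a range pattern is one option. This is a rough sketch; next_table_name is a placeholder for whatever table comes right after table_name in the dump:
# print everything from table_name's header up to the next table's header
sed -n '/Table structure for table `table_name`/,/Table structure for table `next_table_name`/p' dump_file_name.sql > extracted_table.sql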

How to insert content of a file in different fields in mysql database using shell script?

I am trying to scan a folder for new files, read those files, insert their content into a database, and then delete the files from the folder. Up to here it works, but the issue is that the whole content is getting inserted into one field in the database.
Below is the code:
inotifywait -m /home/a/b/c -e create -e moved_to |
while read path action file; do
    for filename in `ls -1 /home/a/b/c/*.txt`
    do
        while read line
        do
            echo $filename $line
            mysql -uroot -p -Bse "use datatable; INSERT INTO table_entries (file, data) VALUES ('$filename','$line');"
        done <$filename
    done
    find /home/a/b/c -type f -name "*.txt" -delete
done
Basically the files contain: name,address,contact_no,email.
I want to insert the name from the file into the name field in the database, the address into address, and so on. In PHP we use explode to split data; what do I use in a shell script?
This would be far easier if you use LOAD DATA INFILE (see the manual for full explanation of syntax and options).
Something like this (though I have not tested it):
inotifywait -m /home/a/b/c -e create -e moved_to |
while read path action file; do
    for filename in `ls -1 /home/a/b/c/*.txt`
    do
        mysql datatable -e "LOAD DATA LOCAL INFILE '$filename'
            INTO TABLE table_entries (name, address, contact_no, email)
            SET file='$filename'"
    done
    find /home/a/b/c -type f -name "*.txt" -delete
done
Edit: I specified mysql datatable, which is like using USE datatable; to set the default database. This should resolve the error about "no database selected."
The columns you list as (name, address, contact_no, email) name the columns in the table, and they must match the columns in the input file.
If you have another column in your table that you want to set, but not from data in the input file, you use the extra clause SET file='$filename'.
You should also use some error checking to make sure the import was successful before you delete your *.txt files.
Note that LOAD DATA INFILE assumes lines end in newline (\n), and fields are separated by tab (\t). If your text file uses commas or some other separator, you can add syntax to the LOAD DATA INFILE statement to customize how it reads your file. The documentation shows how to do this, with many examples: https://dev.mysql.com/doc/refman/5.7/en/load-data.html I recommend you spend some time and read it. It's really not very long.
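For example, since your files are comma-separated, the statement inside the loop might look something like this (an untested sketch of the same idea):
mysql datatable -e "LOAD DATA LOCAL INFILE '$filename'
    INTO TABLE table_entries
    FIELDS TERMINATED BY ','
    LINES TERMINATED BY '\n'
    (name, address, contact_no, email)
    SET file='$filename'"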

AWK or GREP 1 instance of repeated output

So what I have here is some output from a Cisco switch, and I need to capture the hostname and use that to populate a CSV file.
Basically I run a show mac address-table, pull the MAC addresses, and populate them into a CSV file. That part I've got; however, I can't figure out how to grab the hostname so that I can put it in a separate column.
I have done this:
awk '/#/{print $1}'
but that will print every line that has '#' in it. I only need one to populate a variable so I can reuse it. The end result needs to look like this (the CSV file has MAC address, port number, and hostname; I use commas to indicate the column separation):
0011.2233.4455,Gi1/1,Switch1#
0011.2233.4488,Gi1/2,Switch1#
0011.2233.4499,Gi1/3,Switch1#
Without knowing what the input file looks like, the exact solution that is required will be uncertain. However, as an example, given an input file like the requested output (which I've called switch.txt):
0011.2233.4455,Gi1/1,Switch1#
0011.2233.4488,Gi1/2,Switch1#
0011.2233.4499,Gi1/3,Switch1#
0011.2233.4455,Gi1/1,Switch3#
0011.2233.4488,Gi1/2,Switch2#
0011.2233.4498,Gi1/3,Switch3#
... a list of the unique values of the first field (comma-separated) can be obtained from:
$ awk -F, '{print $1}' <switch.txt | sort | uniq
0011.2233.4455
0011.2233.4488
0011.2233.4498
0011.2233.4499
An approach like this might help with extracting unique values from the actual input file.
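If you also need the hostname just once so you can put it in a variable, exiting after the first match is enough. A sketch, assuming the raw capture is saved in switch_output.txt (the file name is made up):
# print the first field of the first line containing '#', then stop reading
hostname=$(awk '/#/ {print $1; exit}' switch_output.txt)
echo "$hostname"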

DB load CSV into multiple tables

UPDATE: added an example to clarify the format of the data.
Considering a CSV with each line formatted like this:
tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,[tbl2.col1:tbl2.col2]+
where [tbl2.col1:tbl2.col2]+ means that there could be any number of these pairs repeated
ex:
tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2,tbl2.col1:tbl2.col2
The tables would relate to each other using the line number as a key, which would have to be created in addition to any columns mentioned above.
Is there a way to use MySQL LOAD DATA INFILE to load the data into two separate tables?
If not, what Unix command line tools would be best suited for this?
No, not directly. LOAD DATA can only insert into one table (or a partitioned table).
What you can do is load the data into a staging table, then use INSERT INTO ... SELECT to move the individual columns into the two final tables. You may also need SUBSTRING_INDEX if you're using different delimiters for tbl2's values. The line number is handled by an auto-incrementing column in the staging table (the easiest way is to make the auto column last in the staging table definition).
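For the tbl1 half, a minimal sketch of that staging idea (untested; the database name mydb, the staging table name, and the input file name file.csv are assumptions):
# the auto-increment column goes last so LOAD DATA fills the data columns first;
# it supplies the line number that links the two final tables
mysql mydb -e "
  CREATE TABLE staging (
    col1 TEXT, col2 TEXT, col3 TEXT, col4 TEXT, col5 TEXT,
    line_no INT AUTO_INCREMENT PRIMARY KEY
  );
  LOAD DATA LOCAL INFILE 'file.csv' INTO TABLE staging
    FIELDS TERMINATED BY ','
    (col1, col2, col3, col4, col5);
  INSERT INTO tbl1 (line_no, col1, col2, col3, col4, col5)
    SELECT line_no, col1, col2, col3, col4, col5 FROM staging;"
The variable-length tbl2.col1:tbl2.col2 pairs are the awkward part in pure SQL, which is why splitting them outside the database, as below, is often simpler.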
The format is not exactly clear, and this is best done with Perl/PHP/Python, but if you really want to use shell tools:
cut -d , -f 1-5 file | awk -F, '{print NR "," $0}' > table1
cut -d , -f 6- file | sed 's/:/,/g' | \
awk -F, '{i=1; while (i<=NF) {print NR "," $(i) "," $(i+1); i+=2;}}' > table2
This creates the table1 and table2 files with these contents:
1,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
2,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
3,tbl1.col1,tbl1.col2,tbl1.col3,tbl1.col4,tbl1.col5
and
1,tbl2.col1,tbl2.col2
1,tbl2.col1,tbl2.col2
2,tbl2.col1,tbl2.col2
2,tbl2.col1,tbl2.col2
3,tbl2.col1,tbl2.col2
3,tbl2.col1,tbl2.col2
As you say, the problematic part is the unknown number of [tbl2.col1:tbl2.col2] pairs declared in each line. I would be tempted to solve this with sed: split the one file into two files, one for each table. Then you can use LOAD DATA INFILE to load each file into its corresponding table.

Manipulating giant MySQL dump files

What's the easiest way to get the data for a single table, delete a single table or break up the whole dump file into files each containing individual tables? I usually end up doing a lot of vi regex munging, but I bet there are easier ways to do these things with awk/perl, etc. The first page of Google results brings back a bunch of non-working perl scripts.
When I need to pull a single table from an sql dump, I use a combination of grep, head and tail.
Eg:
grep -n "CREATE TABLE" dump.sql
This then gives you the line numbers for each one, so if your table is on line 200 and the one after is on line 269, I do:
head -n 268 dump.sql > tophalf.sql
tail -n 69 tophalf.sql > yourtable.sql
I would imagine you could extend upon those principles to knock up a script that would split the whole thing down into one file per table.
Anyone want a go doing it here?
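Here's one rough attempt, untested, that does the split in a single awk pass instead of line-number arithmetic (anything before the first CREATE TABLE line, such as the dump header, is skipped):
awk '/^CREATE TABLE /{
        if (out) close(out)        # finish the previous table file
        name = $3                  # e.g. `FooTable`
        gsub(/[`(]/, "", name)     # strip backticks (and a parenthesis, if any)
        out = name ".sql"
     }
     out { print > out }           # copy every line into the current table file
    ' dump.sql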
Another bit that might help start a bash loop going:
grep -n "CREATE TABLE " dump.sql | tr ':`(' ' ' | awk '{print $1, $4}'
That gives you a nice list of line numbers and table names like:
200 FooTable
269 BarTable
Save yourself a lot of hassle and use mysqldump -T if you can.
From the documentation:
--tab=path, -T path
Produce tab-separated data files. For each dumped table, mysqldump
creates a tbl_name.sql file that contains the CREATE TABLE statement
that creates the table, and a tbl_name.txt file that contains its
data. The option value is the directory in which to write the files.
By default, the .txt data files are formatted using tab characters
between column values and a newline at the end of each line. The
format can be specified explicitly using the --fields-xxx and
--lines-terminated-by options.
Note This option should be used only when mysqldump is run on the
same machine as the mysqld server. You must have the FILE privilege,
and the server must have permission to write files in the directory
that you specify.
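For the record, the invocation is just the --tab/-T option plus the output directory. A sketch with placeholder paths and credentials; the directory must be writable by the mysqld server and the account needs the FILE privilege, as the note above says:
mkdir -p /tmp/dumpdir
mysqldump --tab=/tmp/dumpdir -u root -p mydatabase
# afterwards /tmp/dumpdir contains one tbl_name.sql and one tbl_name.txt per table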
This shell script will grab the tables you want and pass them to splitted.sql.
It’s capable of understanding regular expressions as I’ve added a sed -r option.
Also MyDumpSplitter can split the dump into individual table dumps.
Maatkit seems quite appropriate for this with mk-parallel-dump and mk-parallel-restore.
I am a bit late on this one, but if it can help anyone: I had to split a huge SQL dump file in order to import the data into another MySQL server.
What I ended up doing was splitting the dump file using the system's split command.
split -l 1000 import.sql splited_file
The above will split the SQL file into chunks of 1000 lines each.
Hope this helps someone.
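If you later need to replay the pieces on the target server, note that split's default suffixes (aa, ab, ...) sort in order, so feeding them back as one stream keeps any statement that straddles a chunk boundary intact. A sketch with placeholder credentials and database name:
cat splited_file* | mysql -u root -p target_db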