Insert fields from database1 into a table in database2 - MySQL

I am using PrestaShop and have data in Zen Cart. I am matching up the information and want to select the data so it can be inserted into a different table under different fields.
insert into presta_table1 (c1, c2, ...)
select c1, c2, ...
from zen_table1;
Since a lot is different, I need to do approximately 800 records once I match up which field corresponds to which in each table.
I recently found an example:
USE datab1;
INSERT INTO datab3.prestatable (author, editor)
SELECT author_name, editor_name
FROM author, datab2.editor
WHERE author.editor_id = datab2.editor.editor_id;
It would be nice to find a way to import while avoiding duplicates.
I am unable to find examples of this.
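One common way to avoid duplicates on import is to guard the INSERT ... SELECT with a NOT EXISTS check against the target table (or, if the target has a unique key, use INSERT IGNORE). Here is a rough sketch in Python using the mysql-connector-python driver; the connection details, the database names presta_db and zen_db, and the choice of c1 as the duplicate test are assumptions to adapt to your setup:

import mysql.connector  # assumes the mysql-connector-python driver is installed

# Connection details and database names are placeholders - adjust to your setup.
conn = mysql.connector.connect(host='localhost', user='me', password='secret')
cursor = conn.cursor()

# Copy rows from the Zen Cart table into the PrestaShop table,
# skipping any row whose c1 already exists in the target.
cursor.execute("""
    INSERT INTO presta_db.presta_table1 (c1, c2)
    SELECT z.c1, z.c2
    FROM zen_db.zen_table1 AS z
    WHERE NOT EXISTS (
        SELECT 1 FROM presta_db.presta_table1 AS p
        WHERE p.c1 = z.c1
    )
""")

conn.commit()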

Here is what I did to get data out of this POS (point of sale) system that uses MySQL for its database.
I found the tables with the data I needed and exported the data, which came out in CSV format. I used Calc in LibreOffice to open it and then, in another sheet, manipulated the data into the example CSV fields, and that worked well.
I had some issues with some of the data, but I used console commands to help me get past these; let me share them with you.
Zen Cart description data exported from MySQL Workbench had some model numbers I needed to extract into their own field:
43,1,"Black triple SS","Black Triple SS
12101-57 (7.5 inch)",,0
I used a command to add in " , " so I could extract the data and overlay it, essentially copying and pasting into the Calc spreadsheet where I needed it, in the normal order.
I did a database dump, removed the starting ( and ending ), and saved it as zencart_product.csv. Then I ran the following in a console; this sed command replaces every ),( in the file with a newline:
sed 's/),(/\n/g' zencart_product.csv > zencart_productNEW.csv
I had about 1000 files with $ and # in their names, so I put them all in a directory and renamed them with the following commands.
get rid of the $ symbol
rename 's/\$//g' *
get rid of the # symbol
rename 's/#//g' *
get rid of space
rename "s/\s+//g" *
I hope people stuck in some software who want their data out are able to get it out with some time and effort, and that this helps someone. Thanks.

Related

Difficulties creating CSV table in Google BigQuery

I'm having some difficulties creating a table in Google BigQuery using CSV data that we download from another system.
The goal is to have a bucket in Google Cloud Platform to which we will upload one CSV file per month. These CSV files have around 3,000 - 10,000 rows of data, depending on the month.
The error I am getting from the job history in the Big Query API is:
Error while reading data, error message: CSV table encountered too
many errors, giving up. Rows: 2949; errors: 1. Please look into the
errors[] collection for more details.
When I am uploading the CSV files, I am selecting the following:
file format: csv
table type: native table
auto detect: tried automatic and manual
partitioning: no partitioning
write preference: WRITE_EMPTY (cannot change this)
number of errors allowed: 0
ignore unknown values: unchecked
field delimiter: comma
header rows to skip: 1 (also tried 0 and manually deleting the header rows from the csv files).
Any help would be greatly appreciated.
This usually points to an error in the structure of the data source (in this case your CSV file). Since your CSV file is small, you can run a little validation script to check that the number of columns is exactly the same across all rows of the CSV before running the export.
Maybe something like:
awk -F, '{ a[NF]++ } END { for (n in a) print a[n], "rows have", n, "columns" }' myfile.csv
Or you can bind it to a condition (let's say your number of columns should be 5):
ncols=$(awk -F, '{ a[NF]++ } END { for (n in a) { print n; exit } }' myfile.csv); if [ "$ncols" -eq 5 ]; then python myexportscript.py; else echo "number of columns invalid: $ncols"; fi
It's impossible to point out the error without seeing an example CSV file, but it's very likely that your file is incorrectly formatted. As a result, one typo confuses BQ into thinking there are thousands of errors. Let's say you have the following csv file:
Sally Whittaker,2018,McCarren House,312,3.75
Belinda Jameson 2017,Cushing House,148,3.52 //Missing a comma after the name
Jeff Smith,2018,Prescott House,17-D,3.20
Sandy Allen,2019,Oliver House,108,3.48
With the following schema:
Name(String) Class(Int64) Dorm(String) Room(String) GPA(Float64)
Since that row is missing a comma, everything in it is shifted one column over. If you have a large file, this results in thousands of errors as BigQuery attempts to insert Strings into Ints/Floats.
I suggest you run your csv file through a csv validator before uploading it to BQ. It might find something that breaks it. It's even possible that one of your fields has a comma inside its value, which breaks everything.
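If you don't have a validator handy, a short Python check along these lines can flag rows whose field count differs from the header; because the csv module understands quoting, a comma inside a quoted value won't be counted as an extra field. The filename is just a placeholder:

import csv

# 'myfile.csv' is a placeholder - point this at the file you are trying to load.
with open('myfile.csv', newline='') as f:
    reader = csv.reader(f)
    header = next(reader)
    for line_number, row in enumerate(reader, start=2):
        if len(row) != len(header):
            print(f'row {line_number}: expected {len(header)} fields, got {len(row)}')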
Another thing to check is that all required columns receive an appropriate (non-null) value. A common cause of this error is casting data incorrectly, which returns a null value for a specific field in every row.
As mentioned by Scicrazed, this issue seems to occur because some file rows have an incorrect format, in which case you need to validate the content in order to figure out the specific error that is causing the problem.
I recommend you check the errors[] collection, which might contain additional information about what is making the process fail. You can do this by using the Jobs: get method, which returns detailed information about your BigQuery job, or by referring to the additionalErrors field of the JobStatus Stackdriver logs, which contains the same complete error data that is reported by the service.
I'm probably too late for this, but it seems the file has some errors (it can be a character that cannot be parsed or just a string in an int column) and BigQuery cannot upload it automatically.
You need to understand what the error is and fix it somehow. An easy way to do it is by running this command on the terminal:
bq --format=prettyjson show -j <JobID>
and you will be able to see additional logs for the error to help you understand the problem.
If the error happens only a few times, you can just increase the number of errors allowed.
If it happens many times you will need to manipulate your CSV file before you upload it.
Hope it helps

SSIS - Reading from a multi part text file

I'm trying to build an SSIS package that reads from a text file and outputs into another text file. The catch is that the file I'm trying to read from has multiple sections and I can't find anything that shows how to do that.
The file looks like this:
[sectionA]
key1=value1
key2=value2
key3=value3
[sectionB]
key4=value4
key5=value5
key6=value6
I started with a couple of tutorials that read from a flat file source but the data gets pulled into an equally ugly table. Hoping someone has some input on this.
The SSIS Flat File Connection is built for speed, so it doesn't allow for niceties like that.
I would still use the Flat File Connection but just load all the data into a single, wide NVARCHAR column in a SQL table. I would add an IDENTITY column to that table to get a relative Row Number.
Then I would add downstream tasks using SQL to select by Sections e.g. for Section A rows:
WHERE File_Row_Number > ( SELECT MIN ( File_Row_Number ) FROM Staging_Table WHERE nvarchar_column = '[sectionA]' )
AND File_Row_Number < ( SELECT MIN ( File_Row_Number ) FROM Staging_Table WHERE nvarchar_column = '[sectionB]' )
If the split requirements are as simple as those shown I might attempt them in SQL e.g.
How do I split a string so I can access item x?
But I would probably lean towards using Strings.Split in a Script Task where the code will be simpler and safer.

Howto process multivariate time series given as multiline, multirow *.csv files with Apache Pig?

I need to process multivariate time series given as multiline, multirow *.csv files with Apache Pig. I am trying to use a custom UDF (EvalFunc) to solve my problem. However, all loaders I tried (except org.apache.pig.impl.io.ReadToEndLoader, which I cannot get to work) to load the data in my CSV files and pass it to the UDF return one line of the file as one record. What I need, however, is one column (or the content of the complete file) in order to process a complete time series. Processing a single value is obviously useless because I need longer sequences of values...
The data in the csv-files looks like this (30 columns, 1st is a datetime, all others are double values, here 3 sample lines):
17.06.2013 00:00:00;427;-13.793273;2.885583;-0.074701;209.790688;233.118828;1.411723;329.099170;331.554919;0.077026;0.485670;0.691253;2.847106;297.912382;50.000000;0.000000;0.012599;1.161726;0.023110;0.952259;0.024673;2.304819;0.027350;0.671688;0.025068;0.091313;0.026113;0.271128;0.032320;0
17.06.2013 00:00:01;430;-13.879651;3.137179;-0.067678;209.796500;233.141233;1.411920;329.176863;330.910693;0.071084;0.365037;0.564816;2.837506;293.418550;50.000000;0.000000;0.014108;1.159334;0.020250;0.954318;0.022934;2.294808;0.028274;0.668540;0.020850;0.093157;0.027120;0.265855;0.033370;0
17.06.2013 00:00:02;451;-15.080651;3.397742;-0.078467;209.781511;233.117081;1.410744;328.868437;330.494671;0.076037;0.358719;0.544694;2.841955;288.345883;50.000000;0.000000;0.017203;1.158976;0.022345;0.959076;0.018688;2.298611;0.027253;0.665095;0.025332;0.099996;0.023892;0.271983;0.024882;0
Does anyone have an idea how I could process this as 29 time series?
Thanks in advance!
What do you want to achieve?
If you want to read all rows in all files as a single record, this can work:
a = LOAD '...' USING PigStorage(';') as <schema> ;
b = GROUP a ALL;
b will contain all the rows in a bag.
If you want to read each CSV file as a single record, this can work:
a = LOAD '...' USING PigStorage(';', '-tagsource') as <schema> ;
b = GROUP a BY $0; --$0 is the filename
b will contain all the rows per file in a bag.

Sybase ASE 12.0 CSV Table Export

What I'm trying to do is export a view/table from Sybase ASE 12.0 into a CSV file, but I'm having a lot of difficulty in it.
We want to import it into IDEA or MS-Access. The way that these programs operate is with the text-field encapsulation character and a field separator character, along with new lines being the record separator (without being able to modify this).
Well, using bcp to export it is ultimately fruitless with its built in options. It doesn't allow you to define a text field encapsulation character (as far as I can tell). So we tried to create another view that reads from the other view/table that concatenates the fields that have new lines in them (text fields), however, you may not do that without losing some of the precision because it forces the field into a varchar of 8000 characters/bytes, of which our max field used is 16000 (so there's definitely some truncation).
So, we decided to create columns in the new view that had the text field delimiters. However, that put our column count for the view at 320 -- 70 more than the 250 column limit in ASE 12.0.
bcp can only work on existing tables and views, so what can we do to export this data? We're pretty much open to anything.
If it's only the newline char that is causing problems, can you not just do a replace?
create view new_view as
select field1, field2, replace(text_field_with_char, 'new line char', ' ')
from old_view
You may have to consider exporting as 2 files, importing into your target as 2 tables and then combining them again in the target. If both files have a primary key this is simple.
That sounds like bcp is right, but process the output via awk or Perl.
But are those things you have and know? That might be a little overhead for you.
If you're on Windows you can get ActivePerl for free and it could be quick.
Something like:
perl -F, -lane 'print "\"$F[0]\",$F[1],\"$F[2]\",$F[3]";' bcp-output-file
How's that? @F is the array of fields; the text ones you wrap in \".
You can use BCP format files for this.
bcp .... -f XXXX.fmt
BCP can also produce these format files interactively if you don't specify any of the -c, -n, or -f flags. Then you can save the format file and experiment with it, editing it and running BCP.
To save time while exporting and debugging, use the -F and -L flags, like "-F 1 -L 10" -- this copies only the first 10 rows.

Transferring a flat file database to a MySQL database

I have a flat file database (yeah, gross, I know - the worst part is that it's 1.4GB), and I'm in the process of moving it to a MySQL database. The problem is that I'm not sure how to go about doing this - and I've checked through every related question on here, but none relate to what I want to do, nor to how my database is currently set up.
My current flat file database is set up so that what would be a normal MySQL row is its own file, and a MySQL table would be the directory. So, for example, if you have a user named Jon, there would be a file for that user in a directory named /members/. Within that file would be various information for the user, including the user's id, rank, etc. - each key and value separated by a tab, each pair on its own line (userid\t4).
So here's an example user file:
userid 4
notes staff notes: bla bla staff2 notes: bla bla bla
username Example
So how can I convert the above into their own rows and fields in MySQL? And if possible, could I do thousands of these files at once?
Thanks.
This seems like a fairly trivial scripting problem.
See the example below for how you might read the user directory into a users table; it is sketched here in Python using the mysql-connector-python driver, but any MySQL client library would do.
Clearly, you would want it to be a bit more robust, with error checking / data validation, but just for perspective, see below:
import os
import mysql.connector  # assumes the mysql-connector-python driver; any MySQL client library would do

members_dir = '/path/to/members/'

# Connection details are placeholders - fill in your own.
conn = mysql.connector.connect(host='localhost', user='me', password='secret', database='mydb')
cursor = conn.cursor()

for filename in os.listdir(members_dir):
    # Each file holds one user: key<TAB>value, one pair per line.
    line_data = {}
    with open(os.path.join(members_dir, filename)) as f:
        for line in f:
            key, value = line.rstrip('\n').split('\t', 1)
            line_data[key] = value
    cursor.execute(
        'INSERT INTO users (userid, username, notes) VALUES (%s, %s, %s)',
        (line_data['userid'], line_data['username'], line_data['notes'])
    )

conn.commit()
LOAD DATA INFILE is used for CSV files, and yours are not, so:
merge all the files in a directory into a single CSV file, removing the names of the columns (userid, username, ...) and separating the columns with a delimiter ([TAB], ";", ...), then import it as CSV.
Loop over every directory you've got.
Or write a "stupid" program (PHP works well) that does all this work for you.
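If you would rather not write the PHP yourself, a minimal Python sketch of that idea (merge every per-user file into one tab-delimited CSV that MySQL can then load) might look like this; the paths and the field list are assumptions taken from the example file:

import csv
import os

members_dir = '/path/to/members/'         # hypothetical path - adjust to your layout
fields = ['userid', 'username', 'notes']  # assumed field names from the example file

with open('members.csv', 'w', newline='') as out:
    writer = csv.writer(out, delimiter='\t')
    for filename in os.listdir(members_dir):
        record = {}
        with open(os.path.join(members_dir, filename)) as f:
            for line in f:
                key, value = line.rstrip('\n').split('\t', 1)
                record[key] = value
        # Write the values in a fixed column order, without the key names.
        writer.writerow([record.get(field, '') for field in fields])

The resulting members.csv can then be imported with MySQL's LOAD DATA INFILE, using FIELDS TERMINATED BY '\t'; repeat the loop for each directory/table.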