Insert data into a MySQL database from a Lua table

I am trying to populate a table in a MySQL database with around 1,000,000 rows.
I am using Lua and, for each row, calling:
conn:execute("INSERT INTO orders (dates, ordertype) VALUES ('"..tab[1][dateIndex]......
The problem is that this is very slow, and I really need it to be more efficient.
Do you have other solutions (maybe creating a .csv file and loading it with MySQL, or maybe there is a function that can load a matrix into a database more efficiently)? Using Lua is a requirement, as I am working within an existing project.
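For reference, here is roughly what I mean by the CSV route (an untested sketch: it assumes the server permits LOAD DATA LOCAL INFILE, that the values contain no commas or newlines, and orderTypeIndex is a placeholder for my second column index):
local f = assert(io.open("/tmp/orders.csv", "w"))
for _, row in ipairs(tab) do
    f:write(row[dateIndex], ",", row[orderTypeIndex], "\n")
end
f:close()
conn:execute([[
LOAD DATA LOCAL INFILE '/tmp/orders.csv'
INTO TABLE orders
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
(dates, ordertype)
]])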
Thank you for your help

First, you can stop committing on each insert.
You can also use prepared queries; these are provided by both Lua-DBI and Lua-ODBC. I use ODBC:
local env = odbc.environment()
local cnn = env:driverconnect{
    Driver = IS_WINDOWS and '{MySQL ODBC 5.2 ANSI Driver}' or 'MySQL';
    db  = 'test';
    uid = 'root';
}
cnn:set_autocommit(false)
local stmt = cnn:prepare("INSERT INTO orders (dates, ordertype) VALUES (?, ?)")
for i, row in ipairs(tab) do
    stmt:bindstr(1, row[dateIndex])
    -- ... bind the remaining parameters the same way
    stmt:execute()
    if i % 1000 == 0 then
        cnn:commit() -- commit in batches of 1000 rows
    end
end
cnn:commit() -- commit whatever is left over
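For comparison, the same batching with Lua-DBI would look roughly like this (a sketch from memory of the luadbi API; the connection arguments and orderTypeIndex are placeholders, so verify the names against the luadbi documentation):
local DBI = require "DBI"
-- arguments: driver, database, user, password, host, port
local dbh = assert(DBI.Connect('MySQL', 'test', 'root', '', 'localhost', 3306))
dbh:autocommit(false)
local sth = assert(dbh:prepare("INSERT INTO orders (dates, ordertype) VALUES (?, ?)"))
for i, row in ipairs(tab) do
    sth:execute(row[dateIndex], row[orderTypeIndex])
    if i % 1000 == 0 then dbh:commit() end
end
dbh:commit()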
Lua-ODBC also provides bind variables. These may be faster because they do not call SQLBindParam on every execute:
-- create env, cnn and stmt as above
local dateValue  = odbc.date():bind_param(stmt, 1)
local orderValue = odbc.ulong():bind_param(stmt, 2)
for i, row in ipairs(tab) do
    dateValue:set(row[1])  -- date in yyyy-mm-dd form, e.g. 2014-10-14
    orderValue:set(row[2])
    stmt:execute()
    if i % 1000 == 0 then
        cnn:commit() -- same batched commit as above
    end
end
cnn:commit()

Related

Generate DDL from 4D database

I have inherited a 4D database that I need to extract all the data from to import into another relational database. The 4D database ODBC driver seems to have quite a few quirks that prevent it from being used as a SQL Server linked server. I can give the gory details if anyone wants them, but suffice it to say, it's not looking like a possibility.
Another possibility I tried was using the MS SQL Server Import Data wizard. This is, of course, SSIS under the covers and it requires the 32 bit ODBC driver. This gets part of the way but it fails trying to create the target tables because it doesn't understand what a CLOB datatype is.
So my reasoning is that if I can build the DDL from the existing table structure in the 4D database I might be able to just import the data using the Data Import wizard if I create the tables first.
Any thoughts on what tools I could use to do this?
Thanks.
Alas, the 4D ODBC drivers are a (ahem) vessel filled with a fertiliser so powerful that none may endure its odour...
There is no simple answer, but if you have made it here you are already in a bad place, so I will share some things that will help.
You can use the freeware ODBC Query Tool, which can connect to the database through a user or system DSN with the 64-bit driver. Then you run this query:
SELECT table_id, table_name,column_name, data_type, data_length, nullable, column_id FROM _user_columns ORDER BY table_id, column_id limit ALL
Note: ODBC Query Tool fetches results in pages of 200 rows by default, so you need to scroll to the bottom of the result set to retrieve all rows.
I also tried DataGrip from JetBrains and RazorSQL. Neither would work against the 4D ODBC DSN.
Now that you have this result set, export it to Excel and save the spreadsheet. I found the text file outputs not to be useful: they are exported as human-readable text, not CSV or tab-delimited.
I then used the Microsoft SQL Server Import Data Wizard (which is SSIS under the covers) to import that data into a table that I could then manipulate. I am targeting SQL Server, so this step makes sense for me; if you are importing into another destination database, you can create the table definitions from the data you now have using whatever tool you think best.
Once I had this in a table, I used this T-SQL script to generate the DDL:
use scratch;

-- Reference for data types: https://github.com/PhenX/4d-dumper/blob/master/dump.php
declare @TableName varchar(255) = '';

declare C1 CURSOR for
    select distinct table_name
    from [dbo].[4DMetadata]
    order by 1;

open C1;
fetch next from C1 into @TableName;

declare @SQL nvarchar(max) = '';
declare @ColumnDefinition nvarchar(max) = '';
declare @Results table(columnDefinition nvarchar(max));

while @@FETCH_STATUS = 0
begin
    set @SQL = 'CREATE TABLE [' + @TableName + '] (';

    declare C2 CURSOR for
        select
            '[' + column_name + '] ' +
            case data_type
                when 1 then 'BIT'
                when 3 then 'INT'
                when 4 then 'BIGINT'
                when 5 then 'BIGINT'
                when 6 then 'REAL'
                when 7 then 'FLOAT'
                when 8 then 'DATE'
                when 9 then 'DATETIME'
                when 10 then
                    case
                        when data_length > 0 then 'NVARCHAR(' + cast(data_length / 2 as nvarchar(5)) + ')'
                        else 'NVARCHAR(MAX)'
                    end
                when 12 then 'VARBINARY(MAX)'
                when 13 then 'NVARCHAR(50)'
                when 14 then 'VARBINARY(MAX)'
                when 18 then 'VARBINARY(MAX)'
                else 'BLURFL' -- Put some garbage in to prevent this from creating a table!
            end +
            case nullable
                when 0 then ' NOT NULL'
                when 1 then ' NULL'
            end +
            ', '
        from [dbo].[4DMetadata]
        where table_name = @TableName
        order by column_id;

    open C2;
    fetch next from C2 into @ColumnDefinition;

    while @@FETCH_STATUS = 0
    begin
        set @SQL = @SQL + @ColumnDefinition;
        fetch next from C2 into @ColumnDefinition;
    end

    -- Replace the trailing comma with a closing parenthesis and statement-terminating semicolon
    set @SQL = SUBSTRING(@SQL, 1, LEN(@SQL) - 1) + ');';

    close C2;
    deallocate C2;

    -- Add the result
    insert into @Results (columnDefinition) values (@SQL);

    fetch next from C1 into @TableName;
end

close C1;
deallocate C1;

select * from @Results;
I used the generated DDL to create the database table definitions.
Unfortunately, SSIS will not work with the 4D database ODBC driver; it keeps throwing authentication errors. But you may be able to load this database with your own bespoke tool that works around the ODBC weirdness of 4D.
I have my own tool (unfortunately I cannot share it) that will load the XML exported data directly to the database. So I am finished.
Good luck.
Boffin,
Does "inherited a 4D database" mean it's running or that you have the datafile and structure but can't open it?
If it's running and you have access to the user environment, the easy thing to do is simply use 4D's export functions. If you don't have access to the user environment, the only ODBC option would be if it's designed to allow ODBC or if the developer provided some export capability.
If you can't run it, you won't be able to access the datafile directly. 4D uses a proprietary data structure, and it has changed from version to version. It's not encrypted by default, so you can actually read/scavenge the data, but you can't just build a DDL and pull from it. ODBC is a connection between the running app and some other source.
Your best bet will be to contact the developer and ask for help. If that's not an option, get the thing running. If it's really old, you can contact 4D to get a copy of archived versions. Depending on which version it is and how it's built (compiled, interpreted, engined), your options vary.
[Edit] The developer can specify the schema that's available through SQL, and we frequently limit what's exposed, either for security or usability reasons. It sounds like this may be the case here - it would explain why you don't see the entire structure.
This can also be done with the native 4D structure. I can limit how much of the 4D structure is visible in user mode on a field-by-field and table-by-table basis. Usually this is to make the system less confusing to users, but it's also a way to enforce data security. So I could allow you to download all your 'data' while not allowing you to download the internal elements that make the database work.
If you are able to export the data you want that sounds like the thing to do even if it is slow.

Deploying to hundreds of mysql databases [duplicate]

Is there any way to easily create a stored procedure on multiple MySQL databases at once? All the databases are on the same MySQL install.
Installing in all schemas
To get a list of the schemas, use show databases;. Combine this with a block of use statements, all but one commented out:
use schemaA;
-- use schemaB;
-- use schemaC;
create procedure ...
Manually iterate through the schemas, commenting out and uncommenting the use clauses as you move on, checking that everything works out. In MySQL Workbench, Ctrl+Shift+Enter is your friend.
Installing routines in a subset of schemas
Normally you don't want to install the stored routine in all schemas on a server, but only in a subset, often defined by the set of schemas which already have some specific stored routine installed. Then, as discussed on SO, you can use a query like this to get the names of the relevant schemas:
SELECT ROUTINE_SCHEMA FROM `information_schema`.`ROUTINES` where specific_name = 'MyRoutine';
Verification
After deploying routines, to verify the existence of them, you can use a query like this:
SELECT distinct
r1.ROUTINE_SCHEMA,
case when r2.specific_name is not null then '' else '####' end as RoutineName1,
case when r3.specific_name is not null then '' else '####' end as RoutineName2,
case when r4.specific_name is not null then '' else '####' end as RoutineName3
FROM
`information_schema`.`ROUTINES` as r1
LEFT JOIN (select * from `information_schema`.`ROUTINES` where specific_name = 'RoutineName1') as r2 on r1.routine_schema = r2.routine_schema
LEFT JOIN (select * from `information_schema`.`ROUTINES` where specific_name = 'RoutineName2') as r3 on r1.routine_schema = r3.routine_schema
LEFT JOIN (select * from `information_schema`.`ROUTINES` where specific_name = 'RoutineName3') as r4 on r1.routine_schema = r4.routine_schema
where
r1.specific_name = 'FilteringRoutineName';
This query will check whether RoutineName1, RoutineName2 and RoutineName3 exist in the database schemas on your server which have the routine FilteringRoutineName. If a routine is missing, it will be marked with ####.
Of course, this only checks for routine existence. To verify their implementation, you may need a database diff tool (such as MySQL Compare or similar).
Assuming you are using Linux, a simple BASH loop with an array of schema names will let you do this.
Save your procedure definition to a file (e.g. myproc.sql), then use the file as input to mysql in the loop. If you put your sign-in details in ~/.my.cnf (shown below), you can also avoid having to put usernames and passwords on the command line.
for i in dbname1 dbname2 dbname3; do mysql ${i} < myproc.sql; done;
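For reference, the ~/.my.cnf mentioned above is a plain INI-style file; with placeholder credentials it looks like this:
[client]
user=myusername
password=mypassword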
I would recommend doing a copy-paste and creating the stored procedure in each database schema if it needs to be available to that schema only. Otherwise, I would follow the recommendation from 'Kelly Vista' and just refer to the stored procedure located in one of the schemas; MySQL lets you call a routine in another schema by its qualified name, e.g. CALL shared_schema.my_proc().

How to parsimoniously refer to a data frame in RMySQL

I have a MySQL table that I am reading with the RMySQL package in R. I would like to be able to refer directly to the data frame stored in the table so that I can interact with it seamlessly, rather than having to execute an RMySQL statement every time I want to do something. Is there a way to accomplish this? I tried:
data <- dbReadTable(conn = con, name = 'tablename')
For example, if I now want to check how many rows I have in this table I would run:
nrow(data)
Does this go through the database connection, or am I now storing the object "data" locally, defeating the whole purpose of using an external database?
data <- dbReadTable(conn = con, name = 'tablename')
This command downloads all the data into a local R dataframe (assuming you have enough RAM). Any operations with data from that point forward do not require the SQL connection.
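If the goal is to avoid pulling the whole table locally, an alternative is to push the computation to the server and fetch only the result, e.g. dbGetQuery(con, "SELECT COUNT(*) FROM tablename") (dbGetQuery is part of the DBI interface that RMySQL implements), which transfers a single number instead of the entire table.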

How to use prepared statements in Stata?

I want to use prepared statements in Stata, like in the following (pseudocode) example:
for each key in keylist
odbc load, exec("SELECT * FROM table where tablekey = $key")
do stuff
end
How can I bring the parameter value key into my statements? I have tried string concatenation, local variables, etc., but nothing works. I would like to know whether there are prepared statements like in Java (SELECT * FROM Table WHERE tablekey = ?).
Read help local in Stata. Local macros start with a backquote ` (the key to the left of 1) and end with a closing single quote ' (the key to the left of Enter). Then maybe read help foreach. I guess that the right syntax would be:
local keylist "the actual list of keys"
foreach key of local keylist {
    odbc load, exec("SELECT * FROM table where tablekey = `key'")
    save thisdataset`key', replace
}
etc.
(Stata is the only programming environment I know :) ).

SQL Server 2008: insert into table in batches

I have a linked server (Sybase) set up in SQL Server from which I need to draw data. The Sybase server sits on the other side of the world, and connectivity is pretty shoddy. I would like to insert data into one of the SQL Server tables in manageable batches (e.g. 1000 records at a time), i.e. I want to do:
INSERT INTO [SQLServerTable] ([field])
SELECT [field] from [LinkedServer].[DbName].[dbo].[SybaseTable]
but I want to fetch 1000 records at a time and insert them.
Thanks
Karl
I typically use Python with the pyodbc module to perform batches like this against a SQL Server. Take a look and see if it is an option; if so, I can provide you with an example.
You will need to modify a lot of this code to fit your particular situation, but you should be able to follow the logic. You can comment out the cnxn.commit() line to roll back the transactions until you get everything working.
import pyodbc

# This is an MS SQL 2008 connection string
conn = 'DRIVER={SQL Server};SERVER=SERVERNAME;DATABASE=DBNAME;UID=USERNAME;PWD=PWD'

cnxn = pyodbc.connect(conn)
cursor = cnxn.cursor()
rowCount = cursor.execute('SELECT Count(*) from RemoteTable').fetchone()[0]
cnxn.close()

count = 0
lastID = 0
while count < rowCount:
    # You may want to close the previous connection and start a new one in this loop.
    # Otherwise the connection stays open the entire time, defeating the purpose of
    # performing the transactions in batches.
    cnxn = pyodbc.connect(conn)
    cursor = cnxn.cursor()
    # ORDER BY the key column so the keyed batching is deterministic
    rows = cursor.execute('SELECT TOP 1000 ID, Field1, Field2 FROM INC WHERE ID > ? ORDER BY ID', lastID).fetchall()
    for row in rows:
        # Use ? placeholders rather than string formatting so values are quoted safely
        cursor.execute('INSERT INTO LOCALTABLE (FIELD1, FIELD2) VALUES (?, ?)', row.Field1, row.Field2)
    cnxn.commit()
    cnxn.close()
    # The [0] assumes the id is the first field in the select statement.
    lastID = rows[len(rows) - 1][0]
    count += len(rows)
    # Pause after each batch to see if the user wants to continue.
    raw_input("%s down, %s to go! Press enter to continue." % (count, rowCount - count))