I'll start by saying I'm new to MySQL, at least at the level of my question. :)
I got a data logger with a high data output and I'm interested in saving the data to a database.
I've been wondering if it's possible to filter the INSERT query in the database itself, so that it saves data only if certain values appear in the query.
As #Akina mentioned, you can use a CHECK constraint and INSERT IGNORE. However, it is better not to try to insert any problematic data at all, since it will slow down the insert operation.
You need to filter the data before the insert operation. You may want to consider writing a custom log shipper, or, if you have the option, you can use Logstash.
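For completeness, here is a minimal sketch of the CHECK constraint / INSERT IGNORE route mentioned above, assuming a hypothetical readings table whose value column must stay within a range (table and column names are made up; CHECK constraints are only enforced from MySQL 8.0.16 on):

CREATE TABLE readings (
    id INT AUTO_INCREMENT PRIMARY KEY,
    sensor_id INT NOT NULL,
    value DECIMAL(10,2) NOT NULL,
    -- rows outside the accepted range violate this constraint
    CONSTRAINT chk_value_range CHECK (value BETWEEN 0 AND 100)
);

-- With INSERT IGNORE the offending row is skipped with a warning
-- instead of aborting the whole statement:
INSERT IGNORE INTO readings (sensor_id, value) VALUES (1, 42.5), (1, 250.0);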
My problem:
I am trying to delete some important rows from multiple tables, around 20 tables. I am afraid that deleting the rows might cause some problems (I am not the creator of this website), so before deleting the rows I am selecting them and writing them to a file. But I write them as an array.
Is there a way to write them as SQL INSERT statements to a file, so that it would be easy for me to restore the data if there is some problem?
For me it would be easier to store the information in a way that allows me to understand the data. Then, IF I need it, I can turn the data into INSERT statements.
I strongly encourage you, as a professional software engineer, not to try to solve a problem that you might encounter until you DO encounter it.
If you use phpMyAdmin you can run a query that selects those rows, then click the Export link under Query results operations:
In the next page, select Custom - display all possible options and SQL Format:
Then, further down the page, select data under Format specific options:
And then press Go. You will be prompted to Save or Open a file, which will include the appropriate INSERT statements to recreate the data from those rows.
I have a situation where I am using Data Pipeline to import data from a CSV file stored in S3. For the initial data load, Data Pipeline executes fine.
Now I need to keep this database up to date and in sync with the on-premise DB. That means there will be a set of CSV files arriving in S3 that contain updates to existing records, new records, or deletions. I need those to be applied to RDS through Data Pipeline.
Question - Is Data Pipeline designed for such a purpose, or is it just meant for one-off data loads? If it can be used for incremental updates, how do I go about it?
Any help is much appreciated!
Yes, you need to do an update and insert (aka upsert).
If you have a table with keys key_a and key_b and other columns col_c and col_d, you can use the following SQL:
insert into TABLENAME (key_a, key_b, col_c, col_d) values (?,?,?,?) ON DUPLICATE KEY UPDATE col_c=values(col_c), col_d=values(col_d)
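As a concrete sketch (table and column names are hypothetical), the upsert works because the target table declares the key columns as its primary key:

CREATE TABLE target_table (
    key_a INT NOT NULL,
    key_b INT NOT NULL,
    col_c VARCHAR(50),
    col_d VARCHAR(50),
    PRIMARY KEY (key_a, key_b)  -- ON DUPLICATE KEY UPDATE relies on this
);

INSERT INTO target_table (key_a, key_b, col_c, col_d)
VALUES (1, 2, 'new c', 'new d')
ON DUPLICATE KEY UPDATE col_c = VALUES(col_c), col_d = VALUES(col_d);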
Kindly refer to the AWS documentation: http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-template-incrementalcopyrdstos3.html
There is a predefined template for MySQL RDS incremental upload; I personally have tried incremental uploads from MySQL, SQL Server and Redshift.
You can start with the MySQL template and edit it in Architect view to get an insight into the new/additional fields it uses, and likewise create a data pipeline for other RDS databases as well.
Internally, the incremental template requires you to provide a change column, which essentially needs to be a date column; this change column is then used in the SQL script, which looks like:
select * from #{table} where #{myRDSTableLastModifiedCol} >= '#{format(#scheduledStartTime, 'YYYY-MM-dd HH-mm-ss')}' and #{myRDSTableLastModifiedCol} <= '#{format(#scheduledEndTime, 'YYYY-MM-dd HH-mm-ss')}'
scheduledStartTime and scheduledEndTime are the Data Pipeline expressions whose values depend on your schedule.
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-pipeline-expressions.html
The schedule type is timeseries, so the SQL is executed at the end of the schedule period to guarantee that there is no data loss.
And yes, deleted data can't be tracked through Data Pipeline; Data Pipeline also won't help if the datetime column is not present in your table, in which case I would prefer loading the full table.
I hope I have covered pretty much everything I know :)
Regards,
Varun R
I came across the Insert query generator in Pentaho Spoon, which writes input data to a text file in the form of a set of SQL statements.
I wonder if there is any method that can be used similarly, but which generates UPDATE queries based on the input.
Well, if you need to update a table based on some key columns compared to your stream, you may use the Insert/Update step.
The downside is that it won't generate the statements in a file, it will execute the updates or inserts based on that comparison and that's all.
Can you give more details about your scenario? We may work things out together.
Why do you need a file with UPDATE statements?
Can't we connect to the database and run the updates right away?
Sure, use the "Dynamic SQL Row" step.
I'm receiving a MySQL dump file (.sql) daily from an external server, which I don't have any control over. I created a local database to store all the data in the .sql file. I hope I can set up a script to automatically update my local database daily. The SQL file I'm receiving daily contains old data that is already in the local database. How can I avoid duplicating such old data and insert only new data into the local MySQL server? Thank you very much!
You can use a third-party database compare tool, such as those from Red Gate, with two databases: one current (your "master") and one loaded from the new dump. You can then run the compare tool between the two versions and apply only the changes between them, updating your master.
Use unique constraints on the fields that you want to be unique.
Also, as Danny Beckett mentioned, to avoid errors in the output (which I would prefer to redirect into a file for later analysis, to check whether I have missed anything in the process), you can use the INSERT IGNORE construct instead of INSERT.
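A rough sketch of that combination, with made-up table and column names:

-- Add a unique constraint on the column(s) that identify a record
ALTER TABLE my_table ADD UNIQUE KEY uq_external_id (external_id);

-- Duplicates are then skipped with a warning instead of raising an error
INSERT IGNORE INTO my_table (external_id, payload)
VALUES (1001, 'already imported'), (1002, 'new row');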
You can use a constraint together with the IGNORE keyword.
As a second option, you can first insert the data into a temp table and then insert only the difference.
With the second option you can also add a restriction so that you don't have to search for duplicates across all the records already stored in the database.
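A sketch of that temp-table approach, with hypothetical table and column names:

-- 1. Load the incoming data into a staging table
CREATE TEMPORARY TABLE staging LIKE my_table;
-- (load the dump/CSV into `staging` here)

-- 2. Insert only the rows that are not in the target table yet
INSERT INTO my_table (id, payload)
SELECT s.id, s.payload
FROM staging s
LEFT JOIN my_table t ON t.id = s.id
WHERE t.id IS NULL;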
You need to create a primary key in your table. It should be a unique combination of column values. Using the INSERT query with IGNORE will avoid adding duplicates to this table.
see http://dev.mysql.com/doc/refman/5.5/en/insert.html
If this is a plain vanilla mysqldump file, then normally it includes DROP TABLE IF EXISTS ... statements and CREATE TABLE statements, so the tables are recreated when the data is imported. So duplicate data should not be a problem, unless I'm missing something.
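For reference, the relevant part of a plain mysqldump file typically has this shape (the table definition here is just a placeholder):

DROP TABLE IF EXISTS `my_table`;
CREATE TABLE `my_table` (
  `id` INT NOT NULL,
  `payload` VARCHAR(64),
  PRIMARY KEY (`id`)
);
INSERT INTO `my_table` VALUES (1,'a'),(2,'b');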
I have a MySQL database that I use only for logging. It consists of several simple, look-alike MyISAM tables. There is always one local client (i.e. located on the same machine) that only writes data to the DB, and several remote clients that only read data.
What I need is to insert bulks of data from local client as fast as possible.
I have already tried many approaches to make this faster, such as reducing the number of inserts by increasing the length of the VALUES list, or using LOAD DATA .. INFILE, and some others.
Now it seems to me that I've come up against the limitation of parsing values from strings into their target data types (it doesn't matter whether this is done while parsing queries or a text file).
So the question is:
Does MySQL provide any means of manipulating data directly for local clients (i.e. not using SQL)? Maybe there is some API that allows inserting data by simply passing a pointer.
Once again: I don't want to optimize the SQL code or invoke the same queries in a script, as hd1 advised. What I want is to pass a buffer of data directly to the database engine. This means I don't want to invoke SQL at all. Is that possible?
Use MySQL's LOAD DATA command:
Write the data to a file in CSV format, then execute this SQL statement:
LOAD DATA INFILE 'somefile.csv' INTO TABLE mytable
For more info, see the documentation
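A slightly fuller sketch, assuming a comma-separated file with a header row (the file path, table and column names are placeholders):

LOAD DATA INFILE '/var/lib/mysql-files/somefile.csv'
INTO TABLE mytable
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(col1, col2, col3);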
Other than LOAD DATA INFILE, I'm not sure there is any other way to get data into MySQL without using SQL. If you want to avoid parsing multiple times, you should use a client library that supports parameter binding; the query can then be parsed and prepared once and executed multiple times with different data.
However, I highly doubt that parsing the query is your bottleneck. Is this a dedicated database server? What kind of hard disks are being used? Are they fast? Does your RAID controller have battery backed RAM? If so, you can optimize disk writes. Why aren't you using InnoDB instead of MyISAM?
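Client libraries expose parameter binding through their own APIs; purely as a sketch with made-up names, the server-side equivalent below illustrates the parse-once / execute-many idea:

-- The INSERT is parsed and prepared once...
PREPARE ins FROM 'INSERT INTO log_table (sensor_id, value) VALUES (?, ?)';

-- ...and then executed repeatedly with different data
SET @a = 1, @b = 42.5;
EXECUTE ins USING @a, @b;
SET @a = 2, @b = 17.0;
EXECUTE ins USING @a, @b;

DEALLOCATE PREPARE ins;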
With MySQL you can insert multiple tuples with one insert statement. I don't have an example, because I did this several years ago and don't have the source anymore.
Consider, as mentioned, using one INSERT with multiple values:
INSERT INTO table_name (col1, col2) VALUES (1, 'A'), (2, 'B'), (3, 'C'), ( ... )
This means you only have to connect to your database with one bigger query instead of several smaller ones. It's easier to carry the entire couch through the door once than to run back and forth with all the disassembled pieces of the couch, opening the door every time. :)
Apart from that, you can also run LOCK TABLES table_name WRITE before the INSERT and UNLOCK TABLES afterwards. That will ensure that nothing else is inserted in the meantime.
Lock tables
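Putting the two together, a minimal sketch (table and column names are made up):

LOCK TABLES log_table WRITE;

INSERT INTO log_table (col1, col2)
VALUES (1, 'A'), (2, 'B'), (3, 'C');

UNLOCK TABLES;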
INSERT INTO foo (foocol1, foocol2) VALUES ('foocol1val1', 'foocol2val1'), ('foocol1val2', 'foocol2val2') and so on should sort you out. More information and sample code can be found here. If you have further problems, do leave a comment.
UPDATE
If you don't want to use SQL directly, then try this shell script to do as many inserts as you want. Put it in a file, say insertToDb.sh, and get on with your day/evening:
#!/bin/sh
# Inserts one row into tablename; $1 and $2 become the values for col1 and col2.
# The single quotes around $1 and $2 are needed so string values reach MySQL intact.
mysql --user=me --password=foo dbname -h foo.example.com -e "insert into tablename (col1, col2) values ('$1', '$2');"
Invoke as sh insertToDb.sh col1value col2value. If I've still misunderstood your question, leave another comment.
After some investigation I found no way of passing data directly to the MySQL database engine (i.e. without parsing it).
My aim was to speed up communication between the local client and the DB server as much as possible. The idea was that if the client is local, it could use some API functions to pass data to the DB engine, thereby not using (i.e. parsing) SQL and the values in it. The closest solution was proposed by bobwienholt (using a prepared statement and binding parameters), but LOAD DATA .. INFILE turned out to be a bit faster in my case.
The best way to insert data in MS SQL without using INSERT INTO or UPDATE queries is just to use the MS SQL interface. Right-click on the table name and select "Edit Top 200 Rows". Then you will be able to add data to the database directly by typing into each cell. To enable searching, or to use SELECT or other SQL commands, just right-click on any of the 200 rows you have selected, go to Pane, then select SQL, and you can add an SQL command. Check it out. :D
Without using an INSERT statement, use "SQLite Studio" for inserting data into MySQL. It's free and open source, so you can download it and check it out.