I have a MySQL table with an auto-increment id field (id), a videoid (int), and a userid (int). The site is live now. I forgot to put a date/time field there, and the table already contains data. Is it possible to get the insert time for all the existing rows?
I have another table that gets reset every week by a cron job. Something went wrong last week and now I badly need that data. Is there any way to get some kind of backup from a certain date? Does MySQL have auto backup or something like that?
If you have access to the binary log, you can get the insert statements by using mysqlbinlog. Read more about it in the manual.
The output from mysqlbinlog can be re-executed (for example, by using it as input to mysql) to redo the statements in the log. This is useful for recovery operations after a server crash.
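As a rough sketch, assuming binary logging was enabled on the server and the relevant log files are still on disk (the log path and the table name 'video_user' below are examples, not your actual names):

```shell
# Ask the server which binary logs it knows about.
mysql -u root -p -e "SHOW BINARY LOGS;"

# Dump one log as SQL text; the grep narrows the output to inserts
# on the table in question ('video_user' is a made-up name).
mysqlbinlog /var/lib/mysql/mysql-bin.000001 | grep -i "INSERT INTO video_user"
```

Each event in the mysqlbinlog output is preceded by a comment line carrying the event's timestamp, which gives you the approximate insert time of each existing row.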
I am running an ETL script that loads data from MySQL into Teradata. The script aims to select all rows newer than the timestamp of the last successful run of the bash script. Since I do not have write access to the MySQL database, I need to store the last-run timestamp with the bash script. Is there an easy way to store the timestamp of a successful run? I was thinking I could have a file that I would touch at the end of the script and then check its mtime, or just parse the timestamp out of a log file. What are some better strategies for this?
Within your script, use set -e so that the script exits immediately if any command within the script fails. Then, at the end, log successful completion with a Unix timestamp from date +%s.
You can then use FROM_UNIXTIME(<YOUR TIMESTAMP>, <YOUR MYSQL DATE FORMAT>) in your query to pull rows that are newer than the last successful completion.
One big caveat: I would not rely solely on timestamps for this. I would pull from MySQL with some time overlap and check primary keys on each insert into Teradata to avoid inserting duplicates. To follow this approach, just subtract 1800 from <YOUR TIMESTAMP> to ensure a 30-minute overlap.
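A minimal sketch of the timestamp-file approach, combining the points above (the file path and the table/column names in the commented-out query are made up):

```shell
#!/bin/sh
set -e  # exit immediately if any command fails

STAMP_FILE="/var/tmp/etl_last_success"  # hypothetical location
OVERLAP=1800                            # 30-minute overlap, per the caveat above

# Previous successful run's Unix timestamp; falls back to 0 on the first run.
LAST_RUN=$(cat "$STAMP_FILE" 2>/dev/null || echo 0)
SINCE=$((LAST_RUN - OVERLAP))

# Run the extract here; this query is illustrative only, e.g.:
# mysql -N -e "SELECT * FROM events WHERE created_at >= FROM_UNIXTIME($SINCE);"

# Only reached if everything above succeeded, thanks to set -e.
date +%s > "$STAMP_FILE"
```

Because each insert into Teradata is checked against primary keys, re-reading the overlap window is harmless.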
I have an SQL file with a lot of creates, alterations, and modifications to a database. If I need to back out at some point (up to a day, maybe) after executing the SQL script, is there an easy way to do that? For example, is there any tool that can read an SQL script and produce a 'rollback' script from it?
I am using SQLyog as well, in case there happen to be any such features built in (I haven't found any).
No, sorry, there are many statements that cannot be reversed from looking at the SQL command.
DROP TABLE (what was in the table that was dropped?)
UPDATE mytable SET timestamp = NOW() (what was the timestamp before?)
INSERT INTO mytable (id) VALUES (NULL) (assuming id is auto-increment, what row was created?)
Many others...
If you want to recover the database from before your day's worth of changes, take a backup before you begin changing it.
You can also do point-in-time recovery using binary logs, to restore the database to any moment since your last backup.
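Sketched out, point-in-time recovery looks roughly like this (the database name, file names, and timestamps are all examples):

```shell
# 1. Restore the last full backup taken before the script was run.
mysql -u root -p mydb < backup_before_changes.sql

# 2. Replay binary-log events recorded after that backup, stopping
#    just before the statements you want to back out.
mysqlbinlog --start-datetime="2012-04-01 03:00:00" \
            --stop-datetime="2012-04-02 11:59:59" \
            /var/lib/mysql/mysql-bin.000042 | mysql -u root -p mydb
```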
I have a development server that is getting crowded.
I would like to see the date the databases were last accessed, to determine which ones can be deleted. Is there a way to do this?
The only thing I found when searching was for PostgreSQL:
How to get last access/modification date of a PostgreSQL database?
If you have a table that constantly receives inserts, you can add an insert/update trigger to it. Inside this trigger, record the current timestamp in a dedicated tracking database, along with the name of the database the insert came from.
This way, the only requirement on your database is that it supports triggers.
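A sketch of what that trigger could look like, fed to the mysql client via a heredoc (the tracking schema, the source table mydb.events, and all column names are invented for illustration):

```shell
mysql -u root -p <<'SQL'
-- One row per source database, refreshed on every insert.
CREATE TABLE tracking.last_access (
    db_name     VARCHAR(64) PRIMARY KEY,
    last_insert TIMESTAMP
);

-- Single-statement trigger body, so no DELIMITER change is needed.
CREATE TRIGGER log_access AFTER INSERT ON mydb.events
FOR EACH ROW
    INSERT INTO tracking.last_access (db_name, last_insert)
    VALUES ('mydb', NOW())
    ON DUPLICATE KEY UPDATE last_insert = NOW();
SQL
```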
How can I use Flume to read continuously from MySQL and load into HBase?
I am familiar with Sqoop, but I need to do this continuously from a MySQL source.
Is a custom source required to do this?
Sqoop is good for bulk imports from an RDBMS to HDFS/Hive/HBase. If it's just a one-time import, it's very good; it does what it promises on paper. But the problem comes when you want real-time incremental updates. Sqoop supports two types of incremental updates:
append: this one allows you to re-run the Sqoop job, and every new job starts where the last one ended. E.g., if the first job imported only rows 0-100, the next job will start from 101 based on --last-value=100. But even if rows 0-100 were updated in the meantime, append mode won't cover them anymore.
lastmodified: this one is even worse, IMHO; it requires the source table to have a timestamp column indicating when each row was last updated. The incremental import is then based on that timestamp. If the source table doesn't have anything like that, this mode is not useful.
I'd say, if you do have control over your source database, you can go for the lastmodified mode using Sqoop.
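For reference, a lastmodified incremental import might look something like this (the connection string, credentials, table, and column names are all placeholders):

```shell
sqoop import \
  --connect jdbc:mysql://dbhost/mydb \
  --username etl -P \
  --table events \
  --incremental lastmodified \
  --check-column updated_at \
  --last-value "2012-01-01 00:00:00" \
  --merge-key id
```

On each re-run, Sqoop prints the new --last-value to use for the next invocation; --merge-key lets it fold updated rows into the previously imported data.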
There are a number of ways to do this, but I would write a script that pulls your data from MySQL and generates an Avro event for each row.
You can then use the built-in Avro source to receive this data, sending it to the HDFS sink.
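The Flume side of that would be an agent with an Avro source feeding an HDFS sink. A minimal configuration sketch (the agent/channel/sink names, port, and HDFS path are examples):

```properties
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 41414
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = /flume/mysql-events
a1.sinks.k1.channel = c1
```

Your script would then send its Avro events to port 41414 on the agent host.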
As the title says...
How do I determine the time an entry was made in a MySQL table without adding a new column? I realize that I could add a table.created TIMESTAMP column, but I'd rather not do this. I'm using MySQL 5.1.
I don't think you can do that. If you could, then timestamp columns would be unnecessary.
Why the reluctance to use a column?
Well, you first need to figure out where you want this data to be stored. MySQL doesn't automatically track when rows are created or updated, so it's up to you to store that information.
Your first option is to store it in the database. This means altering your table and adding a new column, or storing it elsewhere in the database. If you want to store the information in another table, you have to modify the code that does the insert to also log the data - or use a TRIGGER to automatically log the data.
If you don't want to store the data in the database, you could perhaps use a logging library to write the information to an event log or file. You'd have to modify the code that does the insert to also log this data through that mechanism.
Hope this helps.