I created logical volumes for the already existing /home and /var directories. Creating the logical volumes was no problem, but afterwards all of the content in those directories looks like it was wiped.
Is there a way to avoid this?
I have a MySQL database on my SDA. It's mostly one schema with "popular" tables in it. I want to store the less "popular" tables of the schema (which take up another 1TB or so) on my SDB partition.
What is the right way to do this? Do I need another MySQL server running on that drive, or can I simply set something like DATA_DIRECTORY=? This is Ubuntu with MySQL 5.7.38. Thank you for any help, it's much appreciated.
As of MySQL 8.0.21, the ability to specify the data directory per table has finally improved.
CREATE TABLE t1 (c1 INT PRIMARY KEY) DATA DIRECTORY = '/external/directory';
Read https://dev.mysql.com/doc/refman/8.0/en/innodb-create-table-external.html#innodb-create-table-external-data-directory for details.
In earlier versions of MySQL, you could use symbolic links. That is, the link still has to reside under the default data directory, but the link can point to a file on another physical device.
It was unreliable to use symbolic links for individual tables in this way, because OPTIMIZE TABLE or many forms of ALTER TABLE would recreate the file without the symbolic link, effectively moving it back to the primary storage device. To solve this, it was recommended to use a symbolic link for the schema subdirectory instead of individual tables.
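The difference between the two kinds of link can be reproduced at the filesystem level. The following is only a sketch: it assumes a POSIX system, uses throwaway temporary directories, and mimics a table rebuild as "write a new file, then rename it into place". The names rebuild_table, demo, db1, db2, t1.ibd and t2.ibd are made up for the illustration, and this is not MySQL's actual code.

```python
import os
import tempfile


def rebuild_table(path, data):
    """Mimic a table rebuild: write a fresh copy next to the file, then rename it into place."""
    tmp = path + ".rebuild"
    with open(tmp, "wb") as f:
        f.write(data)
    # rename() replaces the directory entry itself, so a symlink at `path` is overwritten
    os.replace(tmp, path)


def demo():
    with tempfile.TemporaryDirectory() as root:
        datadir = os.path.join(root, "datadir")    # stands in for the default data directory
        external = os.path.join(root, "external")  # stands in for the other device
        os.makedirs(os.path.join(datadir, "db1"))
        os.makedirs(os.path.join(external, "db2"))

        # Case 1: symlink for an individual table file
        target = os.path.join(external, "t1.ibd")
        with open(target, "wb") as f:
            f.write(b"old")
        link = os.path.join(datadir, "db1", "t1.ibd")
        os.symlink(target, link)
        rebuild_table(link, b"new")
        file_link_survived = os.path.islink(link)

        # Case 2: symlink for the whole schema subdirectory
        schema_link = os.path.join(datadir, "db2")
        os.symlink(os.path.join(external, "db2"), schema_link)
        table = os.path.join(schema_link, "t2.ibd")
        with open(table, "wb") as f:
            f.write(b"old")
        rebuild_table(table, b"new")

        return {
            "file_link_survived": file_link_survived,
            "dir_link_survived": os.path.islink(schema_link),
            "table_on_external": os.path.exists(os.path.join(external, "db2", "t2.ibd")),
        }


if __name__ == "__main__":
    print(demo())
```

In the first case the rebuilt file ends up as a regular file inside the data directory; in the second, the rebuild happens inside the linked directory, so the file stays on the external location.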
To be honest, I've never found a case where I needed to use either of these techniques. Just keep it simple: one data directory on one filesystem, and don't put the data directory on the same device as the root filesystem. Make sure the data storage volume is large enough for all your data. Use software RAID if you need to use multiple devices to make one larger filesystem.
I want MySQL to store its data on Amazon S3, so I mounted an S3 bucket on my server and changed the data dir path in my.cnf to the mounted directory.
After doing this, I restarted the server and created the database without any problem, but when I try to create a table (say test), it gives me the following error.
ERROR 1033 (HY000): Incorrect information in file: './test/t.frm'
Can anyone please tell me whether what I am trying to do is actually possible?
If yes, where am I going wrong?
If not, why?
There is no viable solution for storing MySQL databases on S3. None.
There's nothing wrong with using s3fs in limited applications where it is appropriate, but it's not appropriate here.
S3 is not a filesystem. It is an object store. To modify a single byte of a multi-gigabyte "file" in S3 requires that the entire file be copied over itself.
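To make the cost concrete, here is a toy model of an object store that, like the description above, only offers whole-object writes. ToyObjectStore and change_one_byte are hypothetical names for this sketch; it is an in-memory stand-in and does not call S3 or any AWS library.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store: an object can only be written as a whole."""

    def __init__(self):
        self._objects = {}
        self.bytes_uploaded = 0  # running total of bytes sent by put()

    def put(self, key, body):
        # There is no "write at offset": every put replaces the complete object
        self._objects[key] = bytes(body)
        self.bytes_uploaded += len(body)

    def get(self, key):
        return self._objects[key]


def change_one_byte(store, key, offset, value):
    """Modify a single byte the only way the store allows: fetch, patch, re-upload everything."""
    data = bytearray(store.get(key))
    data[offset] = value
    store.put(key, data)


if __name__ == "__main__":
    store = ToyObjectStore()
    size = 1024 * 1024  # a 1 MiB "table file"; a real one could be many gigabytes
    store.put("t.ibd", bytes(size))
    before = store.bytes_uploaded
    change_one_byte(store, "t.ibd", 10, 0xFF)
    print("bytes uploaded to change one byte:", store.bytes_uploaded - before)
```

A database engine issues small in-place page writes constantly, so under this model every one of them turns into a full re-upload of the file.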
Now... there are tools like s3nbd and s3backer that take a different approach to using S3 for storage. These use S3 to emulate a block device over which you can create a filesystem, and these would come closer than s3fs to being an appropriate bridge between what S3 is and what MySQL would need, but still this approach cannot reliably be used either, for one reason.
Consistency.
When MySQL writes data to a file, it needs absolute assurance that if it reads that same data, it will get back what it wrote. S3 does not guarantee this.
Q: What data consistency model does Amazon S3 employ?
Amazon S3 buckets in all Regions provide read-after-write consistency for PUTS of new objects and eventual consistency for overwrite PUTS and DELETES.
https://aws.amazon.com/s3/faqs/
When an object in S3 is "modified" (that's done with an overwrite PUT), there is no guarantee that a read of that file won't return a previous version for a short time after the write occurred.
In short, you are pursuing an essentially impossible objective trying to use S3 for something it isn't designed to do.
There is, however, a built-in mechanism in MySQL that can save on storage costs: InnoDB natively supports on-the-fly table compression.
Or if you have large, read-only MyISAM tables, those can also be compressed with myisampack.
Some EC2 instances include the ephemeral disks/instance store, which are zero-cost, but volatile, hard drives that should never be used for critical data, but that might be a good option to consider if the database in question is a secondary database that can easily be rebuilt from authoritative sources in the event of data loss. They can be quite nice for "disposable" databases, like QA databases or log analytics, where the database is not the authoritative store for the data.
Actually, S3 is not really a filesystem, so it will not work as a data directory in a normal scenario.
You may be able to use it as the data directory by mounting it at a path like /var/lib/mysql, but it will still perform slowly, so I don't think it is a good idea.
An S3 bucket is object storage where you can keep your images, files, backup files, etc.
If you still want to use it as the data directory, you can get help here.
http://centosfaq.org/centos/s3-as-mysql-directory/
Files cannot be appended to or modified in AWS S3 once created, so it might not be possible to store a MySQL database on S3.
MySQL with RocksDB engine can possibly do this:
Run MyRocks on S3FS, or
Use Rockset's RocksDB-Cloud and modify MyRocks to support RocksDB-Cloud.
Both solutions would probably require some modification to MyRocks.
See the source code:
MyRocks
RocksDB-cloud
I'm just getting started with learning Hadoop, and I'm wondering the following: suppose I have a bunch of large MySQL production tables that I want to analyze.
It seems like I have to dump all the tables into text files, in order to bring them into the Hadoop filesystem -- is this correct, or is there some way that Hive or Pig or whatever can access the data from MySQL directly?
If I'm dumping all the production tables into text files, do I need to worry about affecting production performance during the dump? (Does it depend on what storage engine the tables are using? What do I do if so?)
Is it better to dump each table into a single file, or to split each table into 64 MB (or whatever my block size is) files?
Importing data from MySQL can be done very easily. I recommend Cloudera's Hadoop distribution; it comes with a program called 'sqoop', which provides a very simple interface for importing data straight from MySQL (other databases are supported too).
Sqoop can be used with mysqldump or a normal MySQL query (select * ...).
With this tool there's no need to manually partition tables into files. For Hadoop, though, it's much better to have one big file.
Useful links:
Sqoop User Guide
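As a rough illustration of what such an import looks like, the sketch below only builds the argument list for a Sqoop 1 import; it does not run anything. The function name sqoop_import_command and every host, database, table, user and path value are placeholders, and the assumption is that the sqoop binary from the distribution is on the PATH. With direct=True it adds --direct, which for MySQL is the mysqldump-based path mentioned above.

```python
def sqoop_import_command(host, database, table, user, target_dir, mappers=4, direct=False):
    """Build (but do not run) the argv for importing one MySQL table into HDFS with Sqoop."""
    cmd = [
        "sqoop", "import",
        "--connect", "jdbc:mysql://{}/{}".format(host, database),
        "--username", user,
        "-P",                            # prompt for the password instead of putting it in argv
        "--table", table,
        "--target-dir", target_dir,      # HDFS directory that receives the imported files
        "--num-mappers", str(mappers),   # number of parallel map tasks reading the table
    ]
    if direct:
        cmd.append("--direct")           # MySQL fast path, which uses mysqldump
    return cmd


if __name__ == "__main__":
    # Placeholder values, for illustration only
    print(" ".join(sqoop_import_command("db.example.com", "shop", "orders", "etl", "/data/shop/orders")))
```

The resulting list can be handed to subprocess.run on a machine where Sqoop is installed.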
2)
Since I don't know your environment I will err on the safe side: YES, worry about affecting production performance.
Depending on the frequency and quantity of data being written, you may find that it processes in an acceptable amount of time, particularly if you are just writing new/changed data. [subject to complexity of your queries]
If you don't require real time, or your servers typically have periods when they are underutilized (overnight?), then you could create the files at that time.
Depending on how you have your environment set up, you could replicate/log ship to specific DB server(s) whose sole job is to create your data file(s).
3)
No need for you to split the file; HDFS will take care of partitioning the data file into blocks and replicating them over the cluster. By default it will automatically split it into 64 MB data blocks.
see - Apache - HDFS Architecture
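The block arithmetic is simple enough to sketch. This assumes the 64 MB block size quoted above (counted here as 64 * 1024 * 1024 bytes); hdfs_block_sizes is a made-up helper for illustration, not part of any Hadoop API.

```python
from typing import List

DEFAULT_BLOCK_SIZE = 64 * 1024 * 1024  # the 64 MB default mentioned above, in bytes


def hdfs_block_sizes(file_size: int, block_size: int = DEFAULT_BLOCK_SIZE) -> List[int]:
    """Return the sizes of the blocks a file of `file_size` bytes would be cut into."""
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    # Every block is full except possibly the last, which only holds the leftover bytes
    return [block_size] * full_blocks + ([remainder] if remainder else [])


if __name__ == "__main__":
    mib = 1024 * 1024
    print([size // mib for size in hdfs_block_sizes(200 * mib)])
```

So a single 200 MB dump file is stored as three full blocks plus one 8 MB block, without any manual splitting.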
re: Wojtek answer - SQOOP clicky (doesn't work in comments)
If you have more questions or specific environment info, let us know
HTH
Ralph
I have a PHP based web application which is currently only using one webserver but will shortly be scaling up to another. In most regards this is pretty straightforward, but the application also stores a lot of files on the filesystem. It seems that there are many approaches to sharing the files between the two servers, from the very simple to the reasonably complex.
These are the options that I'm aware of
Simple network storage
NFS
SMB/CIFS
Clustered filesystems
Lustre
GFS/GFS2
GlusterFS
Hadoop DFS
MogileFS
What I want is for a file uploaded via one webserver to be immediately available if accessed through the other. The data is extremely important and absolutely cannot be lost, so whatever is implemented needs to a) never lose data and b) have very high availability (as good as, or better than, a local filesystem).
It seems like the clustered filesystems will also provide faster data access than local storage (for large files) but that isn't of vital importance at the moment.
What would you recommend? Do you have any suggestions to add or anything specifically to look out for with the above options? Any suggestions on how to manage backup of data on the clustered filesystems?
You can look at the Mirror File System, which replicates files between servers in real time.
It's very easy to install and set up. One mount command does it, and you can have an HA, load balancing and backup solution in less than 10 minutes.
http://www.TwinPeakSoft.com/
Fish.Ada
It looks like the clustered filesystems are the best bet. Backup can be done as for any other filesystem, although since most of them have built-in redundancy, they are already more reliable than a standard filesystem.
The application that I am working on generates files dynamically as it is used. This makes backup and synchronization between staging, development and production a really big challenge. One way we might get a smooth solution (if feasible) is a script that, when backing up the database, also stores the dynamically generated files inside the database, and at restore time brings those files out of the database and back onto the filesystem.
I am wondering if there are any available (paid or free) applications that could be used as scripts to make this happen.
Basically if I have
/usr/share/appname/server/dynamicdir
/usr/share/appname/server/otherdir/etc/resource.file
Then, taking the examples above, the script would put them into the MySQL database.
Please let me know if you need more information.
Do you mean that the application is storing files as blobs in the MySQL database, and/or creating lots of temporary tables? Or that you just want temporary files - themselves unrelated to a database - to be stored in MySQL as a backup?
I'm not sure that trying to use MySQL as a net-new intermediary for backups of files is a good idea. If the app already uses it, that's one thing; if not, MySQL isn't the right tool here.
Anyway. If you are interested in capturing a filesystem at point-in-time, the answer is to utilize LVM snapshots. You would likely have to rebuild your server to get your filesystems onto LVM, and have enough free storage there for as many snapshots as you think you'd need.
I would recommend having a new mount point just for this app's temporary files. If your MySQL tables are using InnoDB, a simple script that runs mysqldump --single-transaction in the background and then starts the LVM snapshot process could get these synced up to within less than a second.
This should be trivial to accomplish using PHP, Perl, Python, etc. Are you looking for someone to write this for you?
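For what it's worth, the skeleton of such a script might look like the sketch below. It only builds the two command lines and runs nothing; backup_plan is a made-up name, and the volume group, logical volume, dump directory and snapshot size are placeholders to replace with your own. It assumes the standard mysqldump and lvcreate tools are installed and that MySQL credentials come from an option file.

```python
from datetime import datetime, timezone


def backup_plan(volume_group, logical_volume, dump_dir, snapshot_size="1G", now=None):
    """Return the two commands, in start order: the database dump, then the LVM snapshot."""
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y%m%d%H%M%S")
    dump = [
        "mysqldump",
        "--single-transaction",  # consistent InnoDB dump without locking the tables
        "--all-databases",
        "--result-file={}/db-{}.sql".format(dump_dir, stamp),
    ]
    snapshot = [
        "lvcreate",
        "--snapshot",
        "--size", snapshot_size,  # space reserved for changes made while the snapshot exists
        "--name", "{}-{}".format(logical_volume, stamp),
        "/dev/{}/{}".format(volume_group, logical_volume),
    ]
    return [dump, snapshot]


if __name__ == "__main__":
    # Placeholder names, for illustration only
    for command in backup_plan("vg0", "appfiles", "/backup"):
        print(" ".join(command))
```

To get the two captures close together, the first list would be started in the background (for example with subprocess.Popen) and the second run immediately afterwards as root.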