Many MySQL databases - problem?

I'd like to know whether it's any kind of issue to have 200+ MySQL databases on the same server. None of them is likely to see much use; I'm just wondering if there is any problem with having so many databases.
Thanks in advance

Not necessarily. Shared-hosting services will usually have many hundreds of databases on each server, all relatively small. Just be sure you're not confusing "databases" with "tables," a common mix-up for those new to this area of development.
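For anyone unsure of that distinction, here's a quick way to see it on any MySQL server (the database name below is hypothetical):

    SHOW DATABASES;            -- lists every database (schema) on the server
    SHOW TABLES FROM shop_db;  -- lists the tables inside one database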

No issues; they just take up some disk space. If you don't need them, you can delete them, or back them up and then delete them.

Having many tables is the problem, not having many databases.
Having too many tables results in very poor performance, as tables will constantly need to be closed and reopened. With some engines (MyISAM), this also blows away part of the cache, which makes for very poor performance.
Whether you put them in multiple databases or a single one makes no difference from a performance point of view.
It does, however, make permissions management much easier.
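For example, keeping each application in its own database lets you scope a user's privileges to just that database, and the table-cache churn mentioned above is easy to inspect (user and database names below are hypothetical):

    -- Privileges scoped to one database only
    CREATE USER 'app1'@'localhost' IDENTIFIED BY 'secret';
    GRANT ALL PRIVILEGES ON app1_db.* TO 'app1'@'localhost';

    -- If Opened_tables keeps growing relative to table_open_cache,
    -- tables are being closed and reopened constantly
    SHOW VARIABLES LIKE 'table_open_cache';
    SHOW GLOBAL STATUS LIKE 'Opened_tables';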

Related

Large 250 GB MySQL Database of Online Shop - Indexing Needed

Hello to everyone and happy new year.
I am quite new to MySQL databases and need a bit of help and advice if possible. I have created a very large e-shop with over 250 GB worth of products, and it is still growing. I have optimized my dedicated server and WordPress site as best I can, but I am still not satisfied with the speed of the website; some features, like the search bar, are extremely slow. I understand that I need further MySQL optimization. I have done some of it, but I am not sure how to proceed, perhaps with more optimizations or by indexing the tables. I don't know how to do this effectively, or what keys and commands to use on such a big database to get the indexing right.
Thank you in advance
WordPress uses an EAV schema design, which is inherently inefficient. To top it off, the INDEXes it uses on wp_postmeta could be made better:
http://mysql.rjweb.org/doc.php/index_cookbook_mysql#speeding_up_wp_postmeta
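Roughly, that recipe replaces wp_postmeta's generic indexes with a composite primary key; the statement is along these lines (a sketch based on the linked page; verify against it for your WordPress version):

    ALTER TABLE wp_postmeta
        DROP PRIMARY KEY,
        DROP INDEX post_id,
        ADD PRIMARY KEY (post_id, meta_key, meta_id),
        ADD INDEX (meta_id);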
Congratulations on your success with your e-shop!
The most common optimization is to create indexes. Which indexes you need depends on what queries you are running.
There's no way anyone on Stack Overflow can recommend specific optimizations, since we know nothing about your tables or queries.
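As a generic illustration only (table and column names are hypothetical), the usual loop is: find a slow query, run EXPLAIN on it, and add an index matching its WHERE and ORDER BY columns:

    -- A hypothetical slow query
    EXPLAIN SELECT * FROM orders
    WHERE customer_id = 42 AND status = 'shipped'
    ORDER BY created_at DESC;

    -- A composite index matching the filter and sort columns
    ALTER TABLE orders
        ADD INDEX idx_cust_status_date (customer_id, status, created_at);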
I made a presentation about the process of optimizing with indexes years ago, but the principles haven't changed:
How to Design Indexes, Really
Video of the presentation: https://www.youtube.com/watch?v=ELR7-RdU9XU

Is there any difference between multiple databases and a single database when using a single MySQL server?

I have a MySQL server.
I need to run WordPress and Discuz on it, and there are two ways to use the MySQL server:
create two databases in MySQL, one for WordPress and the other for Discuz
create only one database in MySQL, which the two applications share.
Which one provides better MySQL performance?
Thanks!
As far as I can tell, it's almost the same; it's mostly a readability issue. Lots of websites are hosted on servers that allow only a single database. In that case you can use the same database with different table prefixes, for example, to improve readability. Other than that, it's almost the same, with no performance issues. On the other hand, having two databases will let you track problems on the database server, like overload, more easily, since you will already know where they are coming from. I hope that's enough.
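To make the two options concrete (database names hypothetical):

    -- Option 1: one database per application
    CREATE DATABASE wordpress;
    CREATE DATABASE discuz;

    -- Option 2: one shared database; the applications stay apart
    -- only by table prefix (wp_*, dz_*, and so on)
    CREATE DATABASE shared_db;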
From a software engineering point of view, using two databases is probably better.
(OK, if there isn't too much data in your databases it makes nearly no difference, but if your data is big enough, it's better to separate the two and manage them independently. Good engineering work is mostly about forecasting performance, so that's my suggestion; maybe it's not a good answer.)

MySQL was shut down during indexing - How to fix the index?

MySQL was shut down in the middle of an indexing operation.
It still works but some of the queries seem much slower than before.
Is there anything particular we can check?
Is it possible that an index was left half-built?
Thanks much
As suggested in my comment, you could try a repair on the relevant table(s).
That said, there's a section of the MySQL manual dedicated to this precise topic, which details how to use the REPAIR <table> statement and, indeed, how to dump and re-import.
If this doesn't make any difference, you may need to check the database settings (if it's an InnoDB table/database, it'll love being able to be resident in memory, for example) and perhaps see which specific indexes are being used via an EXPLAIN on the queries that are causing pain.
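A minimal sketch of those checks (table name hypothetical; note that REPAIR TABLE only applies to engines like MyISAM, while an InnoDB table is usually rebuilt instead):

    CHECK TABLE orders;                -- verify table and index integrity
    REPAIR TABLE orders;               -- MyISAM repair
    ALTER TABLE orders ENGINE=InnoDB;  -- rebuilds an InnoDB table in place

    -- See which indexes a painful query actually uses
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;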
There are also commercial tools such as New Relic that'll show which specific queries are sluggish in quite a lot of detail, as well as monitoring other aspects of your system; these may be worth exploring if this is a commercial project/website.

Clustering, Sharding or simple Partition / Replication

We created a Facebook application and it went viral. The problem is that our database started getting REALLY FULL (some tables now have more than 25 million rows). It got to the point that the app just stopped working because there was a queue of thousands and thousands of writes to be made.
I need to implement a solution for scaling this app QUICKLY, but I'm not sure whether I should pursue sharding or clustering, since I'm not sure what the pros and cons of each are. I was thinking of a partition/replication approach, but I think that doesn't help if the load is on the writes?
25 million rows is a completely reasonable size for a well-constructed relational database. Something you should bear in mind, however, is that the more indexes you have (and the more comprehensive they are), the slower your writes will be. Indexes are designed to improve query performance at the expense of write speed. Be sure that you're not over-indexed.
What sort of hardware is powering this database? Do you have enough RAM? It's far easier to change these attributes than it is to try to implement complex RDBMS load balancing techniques, especially if you're under a time crunch.
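A quick way to audit for over-indexing (table and index names hypothetical):

    SHOW INDEX FROM posts;  -- list every index on the table

    -- Every redundant index dropped is one less structure to
    -- maintain on each INSERT/UPDATE
    ALTER TABLE posts DROP INDEX idx_redundant;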
Clustering/sharding/partitioning comes in when a single node has reached the point where its hardware cannot bear the load. But your hardware still has room to expand.
This was the first lesson I learnt when I started being hit by such issues.
Well, to understand that, you need to understand how MySQL handles clustering. There are 2 main ways to do it. You can either do Master-Master replication, or NDB (Network Database) clustering.
Master-Master replication won't help with write loads, since both masters need to replay every single write issued (so you're not gaining anything).
NDB clustering will work very well for you if and only if you are doing mostly primary-key lookups (since only with PK lookups can NDB operate more efficiently than a regular master-master setup). All data is automatically partitioned among many servers. Like I said, I would only consider this if the vast majority of your queries are nothing more than PK lookups.
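For what it's worth, an NDB table is declared like any other table, just with a different storage engine (a sketch with a hypothetical schema; it requires a running MySQL Cluster setup):

    CREATE TABLE sessions (
        id   BIGINT NOT NULL PRIMARY KEY,
        data VARBINARY(8000)
    ) ENGINE=NDBCLUSTER;

    -- The access pattern NDB is best at: a pure primary-key lookup
    SELECT data FROM sessions WHERE id = 12345;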
So that leaves two more options. Sharding and moving away from MySQL.
Sharding is a good option for handling a situation like this. However, to take full advantage of sharding, the application needs to be fully aware of it. So you would need to go back and rewrite all the database-access code to pick the right server to talk to for each query. And depending on how your system is currently set up, it may not be possible to shard effectively...
But another option, which I think may suit your needs best, is switching away from MySQL. Since you're going to need to rewrite your DB access code anyway, it shouldn't be too hard to switch to a NoSQL database (again, depending on your current setup). There are tons of NoSQL servers out there, but I like MongoDB. It should be able to withstand your write load without worry. Just beware that you really need a 64-bit server to use it properly (given your data volume).
Replication is for data backup, not for performance, so it's out of the question.
Well, 8 GB of RAM is still not that much; you can have many hundreds of GB of RAM with quite a lot of hard disk space, and MySQL would still work for you. As said above, clustering/sharding/partitioning only comes in once a single node's hardware cannot bear the load, and yours still has room to expand.
If you don't want to upgrade your hardware, then you need to give more information about the database design and whether there are a lot of joins or not, so that the options named above can be weighed properly.

Is MySQL appropriate for a read-heavy database with 3.5m+ rows? If so, which engine?

My experience with databases is with fairly small web applications, but now I'm working with a dataset of voter information for an entire state. There are approximately 3.5m voters and I will need to do quite a bit of reporting on them based on their address, voting history, age, etc. The web application itself will be written with Django, so I have a few choices of database including MySQL and PostgreSQL.
In the past I've almost exclusively used MySQL, since it was so easily available. I realize that 3.5m rows in a table isn't really all that much, but it's the largest dataset I've personally worked with, so I'm out of my personal comfort zone. This project isn't a quickie throw-away application, though, so I want to make sure I choose the best database for the job and not just the one I'm most comfortable with.
If MySQL is an appropriate tool for the job I would also like to know if it makes sense to use InnoDB or MyISAM. I understand the basic differences between the two, but some sources say to use MyISAM for speed but InnoDB if you want a "real" database, while others say all modern uses of MySQL should use InnoDB.
Thanks!
I've run DBs far bigger than this on MySQL; you should be fine. Just tune your indexes carefully.
InnoDB supports better locking semantics, so if there will be occasional or frequent writes (or if you want better data integrity), I'd suggest starting there, and then benchmarking MyISAM later if you can't hit your performance targets.
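Converting later is a one-liner, so starting with InnoDB costs little (table name hypothetical):

    SHOW TABLE STATUS LIKE 'voters';   -- shows the current storage engine
    ALTER TABLE voters ENGINE=InnoDB;  -- convert; ENGINE=MyISAM reverses it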
MyISAM only makes sense if you need speed so badly that you're willing to accept many data-integrity downsides to achieve it. You can end up with database corruption on any unclean shutdown; there are no foreign keys and no transactions; it's really limited. And since 3.5 million rows on modern hardware is a trivial data set (unless your rows are huge), you're certainly not at the point where you're forced to optimize for performance instead of reliability because there's no other way to hit your performance goals. That's the only situation where you should have to put up with MyISAM.
As for whether to choose PostgreSQL instead, you won't really see a big performance difference between the two on an app this small. If you're familiar with MySQL already, you could certainly justify just using it again to keep your learning curve down.
I don't like MySQL because there are so many ways you can get bad data into the database, whereas PostgreSQL is intolerant of that behavior (see Comparing Speed and Reliability); the bad MyISAM behavior is just a subset of the concerns there. Given how fractured the MySQL community is now, and the uncertainty about what Oracle is going to do with it, you might want to consider taking a look at PostgreSQL just so you have more options in the future. There's a lot less drama around the always-free, BSD-licensed PostgreSQL lately, and while its community is smaller, at least the whole of it is pushing in the same direction.
Since it's a read-heavy table, I'd recommend using the MyISAM table type.
If you do not use foreign keys, you can avoid bugs like this and that.
Backing up or copying the table to another server is as simple as copying the .frm, .MYI and .MYD files.
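If you do copy the files, quiesce the server first so the copy is consistent (a sketch):

    FLUSH TABLES WITH READ LOCK;  -- close all tables and block writes
    -- copy the .frm, .MYI and .MYD files at the filesystem level
    UNLOCK TABLES;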
If you need to compute reports and complex aggregates, be aware that Postgres' query optimizer is rather smart and ingenious, whereas the MySQL "optimizer" is quite simple and dumb.
On a big join the difference can be huge.
The only advantage MySQL has is that it can hit the indexes without hitting the tables.
You should load your dataset into both databases and experiment with the bigger queries you intend to run. It is better to spend a few days experimenting than to be stuck with the wrong choice.
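The index-only advantage mentioned above refers to covering indexes, where the index alone answers the query without touching table rows (hypothetical names; EXPLAIN reports "Using index" when this kicks in):

    CREATE INDEX idx_county_age ON voters (county, age);

    -- Both the filtered and the selected columns live in the index,
    -- so the table rows are never read
    EXPLAIN SELECT age FROM voters WHERE county = 'Marion';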