Is there any tool equivalent to pgpool-II (which is for PostgreSQL) for a MySQL database?
Of particular importance to me is the load balancing feature:
Load Balance
If a database is replicated, executing a SELECT query on any server will return the same result. pgpool-II takes advantage of the replication feature to reduce the load on each PostgreSQL server by distributing SELECT queries among multiple servers, improving the system's overall throughput. At best, performance improves proportionally to the number of PostgreSQL servers. Load balancing works best in a situation where there are many users executing many queries at the same time.
Related
I have a MySQL table with about 150 million rows. When I perform a search with a single WHERE clause, it takes about 1.5 minutes to return the result. Why does it take so long? I am running Debian in VirtualBox with 2 CPU cores and 4 GB of RAM. I am using MySQL and Apache2.
I am a bit new to this, so I don't know what more information to provide.
Searches, or rather queries, in MySQL or any other Relational Database Management System (RDBMS) are subject to a number of performance factors, including:
Structure of the WHERE clause and Indexing to support it
Contention for system resources such as Memory and CPU
The amount of data being retrieved and how it is delivered
Some quick wins and strategies for each:
Structure of the WHERE clause and Indexing to support it
Order your WHERE clause so that the condition that cuts down the result set by the biggest margin comes first, and align your indexes to the order of those columns in the WHERE clause. If you're searching a large table with SELECT * FROM MyTable WHERE SomeID = 5 AND CreatedDate > '2015-10-01', be sure you have an index in place with the columns SomeID and CreatedDate in the order that makes the most sense. If SomeID is highly unique, or likely to match far fewer rows than CreatedDate > '2015-10-01', then write the query in that order and create an index with its columns in the same order, as in the sketch below.
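A minimal sketch of that advice in SQL, using the hypothetical names MyTable and idx_someid_createddate:

    -- Composite index: SomeID first (the selective equality column),
    -- then CreatedDate for the range condition.
    CREATE INDEX idx_someid_createddate ON MyTable (SomeID, CreatedDate);

    -- Confirm MySQL actually picks the index:
    EXPLAIN SELECT * FROM MyTable
    WHERE SomeID = 5 AND CreatedDate > '2015-10-01';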
Contention for system resources such as Memory and CPU
Are you using a table that is constantly updated? There are transactional databases (OLTP) and databases meant for analysis (OLAP). If you're hitting a table that is constantly updated, you may be slowing things down for everyone, including yourself. Remember you're a citizen in a shared environment, and as such you need to respect the other use cases. This includes knowing a bit about how the system is used and what resources are available, and being mindful of how your queries will affect others.
The amount of data being retrieved and how it is delivered
Even the best query cannot escape the time it takes to move data from one place to another. You can optimize the RDBMS settings and have incredible bandwidth, but factors such as disk IOPS and network throughput all play into the cost of doing business. Make sure you're using the right transfer protocols, have good disk IOPS, and follow the usual MySQL best practices.
Some final thoughts:
If you're using AWS and hosting your database in the cloud, you may consider using Amazon Aurora, a MySQL-compatible RDBMS that is substantially faster than stock MySQL.
I have a MySQL site with 90,000 accounts, and I need to split the traffic across two or three servers. How can I do this? The site has many queries to the database, more INSERTs than SELECTs. How do I split this traffic across several servers and, if possible, optimize the database (to reduce the number of connections)?
I will add that, for now, I do not want to switch to a different database such as MemSQL, because I am not familiar with it or its current development. I intend to do that in the future.
One solution is to use database virtualization; you can use ParElastic software for this.
Many consider it essential to use a NoSQL database for high data ingestion rates. This is simply not true.
It is true that a single MySQL server in the cloud cannot ingest data at high rates: often no more than a few thousand rows per second when inserting in small batches, or tens of thousands of rows per second using a larger batch size. However, database virtualization software from ParElastic is able to scale MySQL servers to hundreds of thousands and even more than 1,000,000 (one million) rows per second.
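To illustrate the batch-size point, compare single-row inserts with a multi-row INSERT; the readings table here is made up for the example:

    -- Small batches: every statement pays its own round trip and commit.
    INSERT INTO readings (sensor_id, value) VALUES (1, 20.5);
    INSERT INTO readings (sensor_id, value) VALUES (2, 21.1);

    -- Larger batch: one statement carries many rows (typically hundreds),
    -- amortizing that overhead and raising rows per second considerably.
    INSERT INTO readings (sensor_id, value) VALUES
        (1, 20.5), (2, 21.1), (3, 19.8), (4, 22.0);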
I'm running a MySQL database (InnoDB) on its own server, with a number of other servers conducting high-frequency, long-running (approx. 5-10 seconds per query) queries against the database.
The majority of the queries involve a SELECT statement, followed by either an UPDATE or an INSERT.
I'm seeing significant query bottlenecks building up on the database server.
Initially I thought this might be due to queries locking rows inside tables and creating a queue; however, the majority of queries do not interact with the rows created/updated by the other queries.
What other factors should I be considering in order to reduce this bottleneck?
I believe the number of concurrent connections is unlikely to be the cause, because I think MySQL can handle a relatively large number of them (in the thousands).
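For what it's worth, that assumption is easy to check on the server itself (max_connections defaults to 151 on recent MySQL versions):

    SHOW VARIABLES LIKE 'max_connections';  -- configured connection ceiling
    SHOW STATUS LIKE 'Threads_connected';   -- connections open right now
    SHOW FULL PROCESSLIST;                  -- what each connection is doing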
In Oracle we can create a table, insert data, and select from it with the PARALLEL option.
Is there any similar option in MySQL? I am migrating from Oracle to MySQL, and my system has more selects than data changes, so an option to select in parallel is what I am looking for.
E.g.: suppose my table has 1 million rows. If I use the PARALLEL(5) option, five threads run the same query with limits, each fetching approximately 200K rows, and as the final result I get 1 million records in about 1/5th of the usual time.
In short, the answer is no.
The MySQL server is designed to execute concurrent user sessions in parallel, but not to execute one given user session in several parts in parallel.
This is a personal opinion, but I would refrain from applying optimizations up front based on assumptions about how the RDBMS works. Better to measure the query first and see whether the response time is a real concern, and only then investigate possible optimizations.
"Premature optimization is the root of all evil." (Donald Knuth)
Concurrent queries within MySQL always run in parallel with one another. If you want to run different queries simultaneously from your own program, however, you would need to open separate connections through workers that your program has async access to, as in the sketch below.
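A minimal sketch of the chunking those workers could do, assuming a hypothetical table big_table with an indexed numeric id running from 1 to 1,000,000; each worker's connection runs one slice:

    -- Worker 1:
    SELECT * FROM big_table WHERE id >= 1 AND id < 200001;
    -- Worker 2:
    SELECT * FROM big_table WHERE id >= 200001 AND id < 400001;
    -- Workers 3 and 4 take the next two ranges, then worker 5:
    SELECT * FROM big_table WHERE id >= 800001 AND id <= 1000000;

Range predicates on an indexed column are preferable to LIMIT ... OFFSET here, since a large OFFSET still walks past all of the skipped rows.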
You could also run tasks by creating events or using delayed inserts; however, I don't think that applies very well here. Something else to consider:
Generally, some operations are guarded between individual query sessions (called transactions). These are supported by InnoDB backends, but not by MyISAM tables (though MyISAM supports a concept called atomic operations). There are various levels of isolation, which differ in which operations are guarded from each other (and thus in how operations in one parallel transaction affect another) and in their performance impact. - Holger Just
He also mentions the MySQL transactions page, which briefly goes over the different engine types available to MySQL (MyISAM being faster, but not as reliable):
MySQL Transactions
Can anyone explain to me why there is a dramatic difference in performance between MySQL and SQL Server for this simple select statement?
SELECT email FROM Users WHERE id = 1
Currently the database has just one table with 3 users. MySQL time is on average 0.0003 seconds, while SQL Server's is 0.05 seconds. Is this normal, or is the MSSQL server just not configured properly?
EDIT:
Both tables have the same structure, primary key is set to id, MySQL engine type is InnoDB.
I tried the query with WITH(NOLOCK) but the result is the same.
Are the servers of the same level of power? Hardware makes a difference too. And are roughly the same number of people accessing the db at the same time? Are any other applications using the same hardware? (Databases in general should not share servers with other applications.)
Personally I wouldn't worry about this type of difference. If you want to see which performs better, add millions of records to the database and then test queries. Databases in general all perform well with simple queries on tiny tables, even badly designed or incorrectly set up ones. To know whether you will have a performance problem, you need to test with large amounts of data and many simultaneous users on hardware similar to what you will have in prod.
The issue with diagnosing low-cost queries is that the fixed costs may swamp the variable costs. Not that I'm an MS fanboy, but I'm more familiar with MS-SQL, so I'll address that primarily.
MS-SQL probably has more overhead for optimization and query parsing, which adds a fixed cost to the query when deciding whether to use an index, looking at statistics, etc. MS-SQL also logs a lot of information about the query plan when it executes, and stores a lot of data for future optimization, which adds overhead.
This would all be helpful when the query takes a long time, but when benchmarking a single query it shows up as a slower result.
There are several factors that might affect that benchmark but the most significant is probably the way MySQL caches queries.
When you run a query, MySQL will cache the text of the query and its result. When the same query is issued again, it simply returns the result from the cache without actually running the query.
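On MySQL versions that still have the query cache (it was removed in 8.0), you can take it out of the equation when benchmarking:

    -- Bypass the cache for this one statement:
    SELECT SQL_NO_CACHE email FROM Users WHERE id = 1;

    -- Or clear the cache entirely between benchmark runs:
    RESET QUERY CACHE;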
Another important factor is that the SQL Server metric is the total elapsed time, not just the time it takes to seek to that record or pull it from cache. In SQL Server, turning on SET STATISTICS TIME ON will break it down a little bit more, but you're still not really comparing like for like.
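For example, wrapping the same query from the question:

    SET STATISTICS TIME ON;
    SELECT email FROM Users WHERE id = 1;
    -- The Messages output then reports parse/compile time and
    -- execution time separately, in milliseconds.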
Finally, I'm not sure what the goal of this benchmarking is since that is an overly simplistic query. Are you comparing the platforms for a new project? What are your criteria for selection?