MySQL GROUP BY on large tables - mysql

I have a table with over 75 million records. I want to run a GROUP BY to summarize these records.
The table structure is:
CREATE TABLE `output_medicos_full` (
`name` varchar(100) NOT NULL DEFAULT '',
`term` varchar(50) NOT NULL DEFAULT '',
`hash` varchar(40) NOT NULL DEFAULT '',
`url` varchar(2000) DEFAULT NULL,
PRIMARY KEY (`name`,`term`,`hash`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
I want to execute the query below, but it is taking very long on a dedicated MySQL 5.5 server with 4GB RAM:
INSERT INTO report
SELECT
`hash`
,CASE UPPER(SUBSTRING_INDEX(url, ':', 1))
WHEN 'HTTP' THEN 1
WHEN 'HTTPS' THEN 2
WHEN 'FTP' THEN 3
WHEN 'FTPS' THEN 4
ELSE 0 end
,url
FROM output_medicos_full
GROUP BY `hash`;
The report table has a unique index on the hash column.
Any help to speed it up?
Thanks

The main cost here is all the I/O. The entire table needs to be read.
innodb_buffer_pool_size = 2G is dangerously high for 4GB of RAM. If swapping occurs, performance will suffer terribly.
Since the hash is a SHA1, it is extremely likely to be unique across a mere 75M urls. So that GROUP BY will yield 75M rows. This is probably not what you wanted. Once you rewrite the query, we can discuss optimizations.
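Whether the hash really is unique can be checked before rewriting anything. The following is a minimal sketch, assuming only the table and column names from the question; the aliases are made up for illustration. It reads the whole table once, so it is not cheap either.

```sql
-- If distinct_hashes is (almost) equal to total_rows,
-- GROUP BY `hash` collapses next to nothing.
SELECT COUNT(*)               AS total_rows,
       COUNT(DISTINCT `hash`) AS distinct_hashes
FROM output_medicos_full;
```

If the two numbers match, the GROUP BY is doing no summarizing, and the query should be rethought before it is tuned.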

Related

Very Slow MySQL Performance on NAS

Just got myself an Asustor NAS to handle my videos, pictures, music, etc.
Having both a Desktop and a Notebook in the house, I figured it would be a good idea to setup my database on the NAS (which already comes with MariaDB preinstalled).
The Setup: RAID 1, max read speeds of about 110MB/s from the disk, connected via 1.3 Mbps WiFi with gigabit connection. Getting about 60MB/s using BlackMagic Benchmark.
The query:
SELECT items.title, items.discount, items.qtd, items.price, ((price * qtd) - discount) AS total, DATE_FORMAT(orders.created_at, '%m-%y')
FROM items
INNER JOIN orders ON orders.order_id = items.order_id
ORDER BY created_at;
The table orders has about 1.8k rows, the table items has about 4.7k rows. The query affects 5k rows and takes between 4.8 and 7.0 seconds to run, which seems absurd for such a simple query. I used to run the same query on my localhost (OK, it is an NVMe SSD, which I get is a lot faster) in milliseconds. order_id is a VARCHAR with about 10 characters in it.
It took about 7 (9 last time) minutes to insert all the data in all tables:
`orders` - 1.7k rows, 11 columns
`items` - 4.8k rows, 12 columns
`customers` - 1.7k rows, 9 columns
My question:
Is it really that bad of a performance, or am I getting the wrong performance benchmark after having used NVMe SSD's?
If it is bad indeed, what can I do to improve it (still hosting my DB on my NAS)?
What could I expect performance-wise on an online hosted database?
Thanks a lot.
**Tables:**
CREATE TABLE `orders` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`order_id` varchar(15) DEFAULT NULL COMMENT 'VA',
`created_at` datetime DEFAULT NULL,
`gateway` varchar(25) DEFAULT NULL,
`total` decimal(15,0) DEFAULT NULL,
`subtotal` decimal(15,0) DEFAULT NULL,
`status` varchar(20) DEFAULT NULL,
`discounts` decimal(15,0) DEFAULT NULL,
`total_price` decimal(15,0) DEFAULT NULL,
`order_number` varchar(15) DEFAULT NULL,
`processing` varchar(15) DEFAULT NULL,
`customer_id` varchar(15) DEFAULT NULL,
`number` varchar(15) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `number` (`number`),
UNIQUE KEY `order_id` (`order_id`),
KEY `customer_id` (`customer_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1712 DEFAULT CHARSET=utf8;
Is it really that bad of a performance, or am I getting the wrong performance benchmark after having used NVMe SSD's?
Yes, this is fairly poor performance. Correct indexing for the query will go part of the way toward solving it. Getting the NAS to use innodb_buffer_pool and free memory as disk cache is going to be difficult with only 512M on board.
If it is bad indeed, what can I do to improve it (still hosting my DB on my NAS)?
Correct indexing of the tables will help the join and the ORDER BY. Design changes, such as using integer primary keys for the joins, will help as well. As a first step, if order_id really isn't utf8 and is just latin1, changing the character set for that column would make the key smaller; you could also make it the primary key.
As this query reads both tables in full, the only way to eliminate I/O latency is for all of the data to stay in RAM.
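As a rough illustration of the indexing advice, the sketch below adds an index on the join column of items and asks for the plan again. The items schema was not posted, so this assumes items.order_id exists (it is used in the JOIN) and is not yet indexed; the index name idx_items_order_id is hypothetical.

```sql
-- Hypothetical index name; assumes items.order_id has no index yet.
ALTER TABLE items ADD INDEX idx_items_order_id (order_id);

-- Check whether the join now uses an index instead of scanning items
-- once per order.
EXPLAIN
SELECT items.title, items.discount, items.qtd, items.price,
       ((price * qtd) - discount) AS total,
       DATE_FORMAT(orders.created_at, '%m-%y')
FROM items
INNER JOIN orders ON orders.order_id = items.order_id
ORDER BY created_at;
```

If both order_id columns do not share the same character set and collation, the index may not be usable for the join, so that is worth checking in the EXPLAIN output as well.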
What could I expect performance-wise on an online hosted database?
Hosted database will offer more RAM, and probably a faster CPU.
My NAS runs MariaDB 10.3.29. The change below worked for me; you can try it. Edit:
/volume1/#appstore/MariaDB10/usr/local/mariadb10/etc/mysql/my.cnf
Change
#innodb_flush_log_at_trx_commit = 1
TO
innodb_flush_log_at_trx_commit = 0
Restart MySQL
----OR----
Login MySQL
SET GLOBAL innodb_flush_log_at_trx_commit=0;
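You can change it at runtime and check the difference. Be aware that with a value of 0 the log is only flushed to disk about once per second, so a crash can lose up to roughly the last second of committed transactions.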
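A small sketch of how the runtime change could be tried and undone, using only the variable named above. It needs a user with the privilege to set global variables, and the change is lost on restart unless it is also written to my.cnf.

```sql
-- See the current setting first.
SHOW GLOBAL VARIABLES LIKE 'innodb_flush_log_at_trx_commit';

-- Trade durability for speed while testing the bulk insert.
SET GLOBAL innodb_flush_log_at_trx_commit = 0;

-- ... run the inserts and time them ...

-- Go back to the fully durable default afterwards if the data matters.
SET GLOBAL innodb_flush_log_at_trx_commit = 1;
```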

Optimize Query on mysql

I have a query that runs really slowly (15 to 20 seconds) when the data is not in memory and quite fast when it is in memory (0.6 to 2 seconds):
select count(distinct(concat(conexiones.tMacAdres,date_format(conexiones.fFecha,'%Y%m%d')))) as Conexiones,
sum(if(conexiones.tEvento='megusta',1,0)) as MeGusta,sum(if(conexiones.tEvento='megusta',conexiones.nAmigos,0)) as ImpactosMeGusta,
sum(if(conexiones.tEvento='checkin',1,0)) as CheckIn,sum(if(conexiones.tEvento='checkin',conexiones.nAmigos,0)) as ImpactosCheckIn,
min(conexiones.fFecha) Fecha_Inicio, now() Fecha_fin,datediff(now(),min(conexiones.fFecha)) as dias
from conexiones, instalaciones
where conexiones.idInstalacion=instalaciones.idInstalacion and conexiones.idInstalacion=190
and (fFecha between '2014-01-01 00:00:00' and '2016-06-18 23:59:59')
group by instalaciones.tNombre
order by instalaciones.idCliente
These are the table schemas:
Instalaciones with 1332 rows:
CREATE TABLE `instalaciones` (
`idInstalacion` int(10) unsigned NOT NULL AUTO_INCREMENT,
`idCliente` int(10) unsigned DEFAULT NULL,
`tRouterSerial` varchar(50) DEFAULT NULL,
`tFacebookPage` varchar(256) DEFAULT NULL,
`tidFacebook` varchar(64) DEFAULT NULL,
`tNombre` varchar(128) DEFAULT NULL,
`tMensaje` varchar(128) DEFAULT NULL,
`tWebPage` varchar(128) DEFAULT NULL,
`tDireccion` varchar(128) DEFAULT NULL,
`tPoblacion` varchar(128) DEFAULT NULL,
`tProvincia` varchar(64) DEFAULT NULL,
`tCodigoPosta` varchar(8) DEFAULT NULL,
`tLatitud` decimal(15,12) DEFAULT NULL,
`tLongitud` decimal(15,12) DEFAULT NULL,
`tSSID1` varchar(40) DEFAULT NULL,
`tSSID2` varchar(40) DEFAULT NULL,
`tSSID2_Pass` varchar(40) DEFAULT NULL,
`fSincro` datetime DEFAULT NULL,
`tEstado` varchar(10) DEFAULT NULL,
`tHotspot` varchar(10) DEFAULT NULL,
`fAlta` datetime DEFAULT NULL,
PRIMARY KEY (`idInstalacion`),
UNIQUE KEY `tRouterSerial` (`tRouterSerial`),
KEY `idInstalacion` (`idInstalacion`)
) ENGINE=InnoDB AUTO_INCREMENT=1332 DEFAULT CHARSET=utf8;
Conexiones with 2370365 rows
CREATE TABLE `conexiones` (
`idConexion` int(10) unsigned NOT NULL AUTO_INCREMENT,
`idInstalacion` int(10) unsigned DEFAULT NULL,
`idUsuario` int(11) DEFAULT NULL,
`tMacAdres` varchar(64) DEFAULT NULL,
`tUsuario` varchar(128) DEFAULT NULL,
`tNombre` varchar(64) DEFAULT NULL,
`tApellido` varchar(64) DEFAULT NULL,
`tEmail` varchar(64) DEFAULT NULL,
`tSexo` varchar(20) DEFAULT NULL,
`fNacimiento` date DEFAULT NULL,
`nAmigos` int(11) DEFAULT NULL,
`tPoblacion` varchar(64) DEFAULT NULL,
`fFecha` datetime DEFAULT NULL,
`tEvento` varchar(20) DEFAULT NULL,
PRIMARY KEY (`idConexion`),
KEY `idInstalacion` (`idInstalacion`),
KEY `tMacAdress` (`tMacAdres`) USING BTREE,
KEY `fFecha` (`fFecha`),
KEY `idUsuario` (`idUsuario`),
KEY `insta_fecha` (`idInstalacion`,`fFecha`)
) ENGINE=InnoDB AUTO_INCREMENT=2370365 DEFAULT CHARSET=utf8;
This is the EXPLAIN output:
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE instalaciones const PRIMARY,idInstalacion PRIMARY 4 const 1
1 SIMPLE conexiones ref idInstalacion,fFecha,insta_fecha idInstalacion 5 const 110234 "Using where"
Thanks !
(Edited)
SHOW TABLE STATUS LIKE 'conexiones'
Name Engine Version Row_format Rows Avg_row_length Data_length Max_data_length Index_length Data_free Auto_increment Create_time Update_time Check_time Collation Checksum Create_options Comment
conexiones InnoDB 10 Compact 2305296 151 350060544 0 331661312 75497472 2433305 28/06/2016 22:26 NULL NULL utf8_general_ci NULL
Here's why it is so slow. And I will end with a possible speedup.
First, please do
SELECT COUNT(*) FROM conexiones
WHERE idInstalacion=190
and fFecha >= '2014-01-01'
and fFecha < '2016-06-19';
in order to see how many rows we are dealing with. The EXPLAIN suggests 110234, but that is only a crude estimate.
Assuming there are 110K rows of conexiones involved in the query, and assuming the rows were (approximately) inserted in chronological order by fFecha, then...
There are a lot of rows to work with, and
They are scattered around the table on disk, hence
The query takes a lot of I/O, unless it is cached.
Let's further check on my last claim... How much RAM do you have? What is the value of innodb_buffer_pool_size? It should be about 70% of available RAM. Use a lower percentage if you have less than 4GB of RAM.
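The buffer pool question can be answered from the server itself. This is a sketch using standard system variables and information_schema columns; the aliases are made up, and the second query assumes it is run in the schema that contains conexiones.

```sql
-- Configured buffer pool size, in MB.
SELECT @@innodb_buffer_pool_size / 1024 / 1024 AS buffer_pool_mb;

-- Approximate on-disk size of the table (data plus indexes), in MB.
SELECT table_name,
       ROUND(data_length  / 1024 / 1024) AS data_mb,
       ROUND(index_length / 1024 / 1024) AS index_mb
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name   = 'conexiones';
```

Comparing the two numbers gives a first idea of whether the table can stay cached.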
Assuming that conexiones is too big to be 'cached' in the 'buffer_pool', we need to find a way to decrease the I/O.
There are 1332 different values for idInstalacion. Perhaps you insert 1332 rows every few minutes/hours into conexiones? Since the PRIMARY KEY is merely an AUTO_INCREMENT, those rows will be 'appended' to the end of the table.
Now let's look at where the idInstalacion=190 rows are. A new one of them occurs every 1332 (or so) rows. That means they are spread out. It means that (probably) no two rows are in the same block (16KB in InnoDB). That means that the 110234 will be in 110234 different blocks. That's about 2GB. If the buffer_pool is smaller than that, then there will be I/O. Even if it is bigger than that, that's a lot of data to touch.
But what to do about it? If we could arrange the =190 rows to be consecutive in the table, then the 2GB might drop to, say, 20MB -- a much more manageable and cacheable size. But how can that be done? By changing the PRIMARY KEY.
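The block arithmetic above can be reproduced as a quick sanity check. It only restates the answer's own assumptions (one matching row per 16KB block today, roughly 100 rows per block once the rows are stored together); the aliases are made up for illustration.

```sql
SELECT
  -- Scattered: 110234 rows, each in its own 16KB block.
  ROUND(110234 * 16 / 1024, 1)             AS scattered_mb,
  -- Clustered: about 100 rows share one block (assumed).
  CEIL(110234 / 100)                       AS clustered_blocks,
  ROUND(CEIL(110234 / 100) * 16 / 1024, 1) AS clustered_mb;
```

This comes out at roughly 1.7GB of blocks touched in the scattered case against roughly 17MB in the clustered case, which is where the round "2GB" and "20MB" figures come from.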
PRIMARY KEY(idInstalacion, fFecha, idConexion),
INDEX(idConexion)
and DROP any other indexes starting with idInstalacion or idConexion. To explain:
Since the PK is "clustered" with the data, all idInstalacion=190 rows over any consecutive fFecha range will be consecutive in the data. So, fetching one block will get about 100 rows -- much less I/O.
A PK must be unique. Assuming (idInstalacion, fFecha) is not unique, I tacked on idConexion to make it unique.
I added INDEX(idConexion) to make AUTO_INCREMENT happy.
Potential drawback... Since this change rearranges the order of the data, other queries, including the INSERTs, may be slowed down. The INSERTs will be scattered, but not really slowed down. 1332 "hot spots" would be accepting the new rows; that many blocks can easily be cached.
Arithmetic... If you have spinning drives, I would expect the existing structure to take about 1102 seconds (perhaps under 110 seconds for SSD) for 110234 rows. Since it is taking under 20 seconds, I suspect there is some caching (or you have SSDs) or the 110234 is grossly overestimated. My suggested change should decrease the "worst" time significantly, and slightly improve the "in memory" time. This "slight improvement" comes from being able to use the PK instead of a secondary key.
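The time estimate follows from one random disk read per row. The per-read latencies below (about 10ms for a spinning drive, about 1ms for an SSD) are assumptions inferred from the figures in the answer, not measurements.

```sql
SELECT ROUND(110234 * 0.010) AS hdd_seconds,  -- ~10 ms per random read (assumed)
       ROUND(110234 * 0.001) AS ssd_seconds;  -- ~1 ms per random read (assumed)
```

Both are worst-case figures with nothing cached, which is why an observed time under 20 seconds points to caching, SSDs, or an overestimated row count.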
Caveat: Since 110234 * 1332 is nowhere near 2370365, much of my numerical analysis is probably nowhere near correct. For example, 2370365 rows with that schema is possibly less than 1GB. Please provide SHOW TABLE STATUS LIKE 'conexiones'.
Addenda
"server has 2GB Ram and innodb_buffer_pool_size is 5368709120" -- Either that is a typo or it is terrible. Since the buffer_pool needs to reside in RAM, do not set the buffer_pool to 5GB. 500MB might be OK for your tiny 2GB of RAM.
The SHOW TABLE STATUS confirms that it (data + indexes) won't quite fit in 500M, so you may periodically experience I/O bound queries with 500M.
Increasing your RAM and buffer_pool would temporarily (until the data gets bigger) help performance.
Before putting this into production, test the ALTER and time the various queries you use:
ALTER TABLE conexiones
DROP PRIMARY KEY,
DROP INDEX insta_fecha,
DROP INDEX idInstalacion,
-- PRIMARY KEY columns must be NOT NULL; idInstalacion and fFecha are currently nullable.
ADD PRIMARY KEY(idInstalacion, fFecha, idConexion),
ADD INDEX(idConexion);
Caution: The ALTER will need about 1GB of free disk space.
When timing, run with the Query Cache off, and run twice -- the first may involve I/O; the second is the 'in memory' as you mentioned.
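One way to keep the Query Cache out of a timing run is the SQL_NO_CACHE modifier, shown here on the row-count query from earlier in this answer. This is a sketch for the MySQL 5.x servers discussed here; newer versions have no Query Cache, so the modifier does nothing there.

```sql
-- Run twice: the first run may hit disk, the second shows the
-- 'in memory' time.
SELECT SQL_NO_CACHE COUNT(*)
FROM conexiones
WHERE idInstalacion = 190
  AND fFecha >= '2014-01-01'
  AND fFecha <  '2016-06-19';
```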
Revised analysis: Since the bigger table has 300MB of data and some amount of indexes in use, and assuming 500MB buffer pool, I suspect that blocks are bumped out of the buffer pool some of the time. This fits well with your initial comment on the query's speed. My suggested index changes should help avoid the speed variance, but may hurt the performance of other queries.
Try to use a multi column index:
CREATE INDEX idx_nn_1 ON conexiones(idInstalacion,fFecha);
You might need the columns the other way around depending on the data, so test both. With this index, MySQL no longer has to read every record matching the idInstalacion condition and then check the BETWEEN condition on fFecha, which should improve performance.
Try the following:
Either delete the idInstalacion INDEX or tell the engine to use the correct key in the from clause:
from conexiones use index (insta_fecha), instalaciones
And you don't need to JOIN, GROUP or ORDER. You are joining on a constant value (190) with one row. And you don't use any column from instalaciones.
So all you need is this:
select count(distinct(concat(conexiones.tMacAdres,date_format(conexiones.fFecha,'%Y%m%d')))) as Conexiones,
sum(if(conexiones.tEvento='megusta',1,0)) as MeGusta,sum(if(conexiones.tEvento='megusta',conexiones.nAmigos,0)) as ImpactosMeGusta,
sum(if(conexiones.tEvento='checkin',1,0)) as CheckIn,sum(if(conexiones.tEvento='checkin',conexiones.nAmigos,0)) as ImpactosCheckIn,
min(conexiones.fFecha) Fecha_Inicio, now() Fecha_fin,datediff(now(),min(conexiones.fFecha)) as dias
from conexiones -- use index (insta_fecha)
where conexiones.idInstalacion=190
and (fFecha between '2014-01-01 00:00:00' and '2016-06-18 23:59:59')
However - it doesn't mean it will be faster. MySQL will probably optimize all that stuff away.

Changing data organization on disk in MySQL

We have a data set that is fairly static in a MySQL database, but the read times are terrible (even with indexes on the columns being queried). The theory is that since rows are stored randomly (or sometimes in order of insertion), the disk head has to scan around to find different rows, even if it knows where they are due to the index, instead of just reading them sequentially.
Is it possible to change the order data is stored in on disk so that it can be read sequentially? Unfortunately, we can't add a ton more RAM at the moment to have all the queries cached. If it's possible to change the order, can we define an order within an order? As in, sort by a certain column, then sort by another column if the first column is equal.
Could this have something to do with the indices?
Additional details: non-relational single-table database with 16 million rows, 1 GB of data total, 512 mb RAM, MariaDB 5.5.30 on Ubuntu 12.04 with a standard hard drive. Also this is a virtualized machine using OpenVZ, 2 dedicated core E5-2620 2Ghz CPU
Create syntax:
CREATE TABLE `Events` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`provider` varchar(10) DEFAULT NULL,
`location` varchar(5) DEFAULT NULL,
`start_time` datetime DEFAULT NULL,
`end_time` datetime DEFAULT NULL,
`cost` int(11) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `provider` (`provider`),
KEY `location` (`location`),
KEY `start_time` (`start_time`),
KEY `end_time` (`end_time`),
KEY `cost` (`cost`)
) ENGINE=InnoDB AUTO_INCREMENT=16321002 DEFAULT CHARSET=utf8;
Select statement that takes a long time:
SELECT *
FROM `Events`
WHERE `Events`.start_time >= '2013-05-03 23:00:00' AND `Events`.start_time <= '2013-06-04 22:00:00' AND `Events`.location = 'Chicago'
Explain select:
1 SIMPLE Events ref location,start_time location 18 const 3684 Using index condition; Using where
MySQL will generally use only one index per table to filter (which makes sense, because having restricted the results using one index it cannot easily determine how that restriction has affected other indices). Therefore, it tracks the cardinality of each index and chooses the one that is likely to be the most selective (i.e. has the highest cardinality). In this case it has chosen the location index, but that still leaves an estimated 3,684 records that must be fetched and then filtered (Using where) to find those that match the desired range of start_time.
You should try creating a composite index over (location, start_time):
ALTER TABLE Events ADD INDEX (location, start_time)
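To confirm the composite index is actually picked up, the plan can be compared before and after adding it. This is a sketch: the index name idx_location_start is hypothetical, and the WHERE clause assumes the location filter was meant to apply to Events.

```sql
-- Same index as suggested above, given an explicit (hypothetical) name.
ALTER TABLE Events ADD INDEX idx_location_start (location, start_time);

-- The key column should now show the composite index, with a much
-- smaller row estimate than the 3684 seen with the location index.
EXPLAIN
SELECT *
FROM Events
WHERE location = 'Chicago'
  AND start_time >= '2013-05-03 23:00:00'
  AND start_time <= '2013-06-04 22:00:00';
```

Putting the equality column (location) first and the range column (start_time) second lets the index narrow on both conditions.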

Slow INSERT .. ON DUPLICATE KEY UPDATE query with InnoDB

Basically I am monitoring the slowest queries on a website. It turns out they are something like:
INSERT INTO beststat (bestid,period,rawView) VALUES ( 'idX' , 2012 , 1 )
ON DUPLICATE KEY UPDATE rawView = rawView+1
Basically it's a logging table. If the row is already there, it updates rawView with a +1.
beststat is InnoDB, so I have row-level locking, and considering I do a lot of insert-updates it should be faster than MyISAM.
Anyway, that query shouldn't take so long; maybe there is something else wrong. What could it be?
Of course I have a unique index on (bestid, period).
Additional Info
This table (beststat) currently has ~1mil record and its size is: 68MB. I have 4GB RAM and innodb buffer pool size = 104,857,600. Mysql: 5.1.49-3
CREATE TABLE `beststat` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`bestid` int(11) unsigned NOT NULL,
`period` mediumint(8) unsigned NOT NULL,
`view` mediumint(8) unsigned NOT NULL DEFAULT '0',
`rawView` mediumint(8) unsigned NOT NULL DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `bestid` (`bestid`,`period`)
) ENGINE=InnoDB AUTO_INCREMENT=2020577 DEFAULT CHARSET=utf8
Note that, to make things a little faster, I could do something like:
UPDATE beststat SET rawView = rawView + 1 WHERE bestid = idX AND period = 2012;
if (mysql_affected_rows()==0)
INSERT INTO beststat (bestid,period,rawView) VALUES ('idX',2012,1)
So most of the time I would run only the first query, the UPDATE. But I would like to understand why the original, more concise query is slow.
I found this interesting article... still reading
When dealing with a large number of rows, I suggest using LOAD DATA INFILE to make the inserts faster.
To further improve the query time, you can consider using a MEMORY table as well.

Select takes long time. How to solve this problem?

I have a big MySQL database (300 MB) with 4 tables: the first one is about 200 MB and the second about 80 MB.
There are 150,000 records in the first table and 200,000 in the second.
I use an inner join between them.
The select takes 3 seconds now that I use optimization and indexes (before that it took about 20-30 seconds).
That is a good enough result, but I need more, because the page takes 7-8 seconds to load (3-4 for the select, 1 for the count, 1 for other small queries, and 1-2 for page generation).
So what should I do next? Maybe Postgres takes less time than MySQL? Or maybe it is better to use memcached, but in that case it could take a lot of memory (there are too many sorting variants).
Maybe somebody has another idea? I would be glad to hear it :)
OK. I see we need queries:)
I renamed fields for table_1.
CREATE TABLE `table_1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`field` varchar(2048) DEFAULT NULL,
`field` varchar(2048) DEFAULT NULL,
`field` int(10) unsigned DEFAULT NULL,
`field` text,
`field` text,
`field` text,
`field` varchar(128) DEFAULT NULL,
`field` text,
`field` text,
`field` text,
`field` text,
`field` text,
`field` varchar(128) DEFAULT NULL,
`field` text,
`field` varchar(4000) DEFAULT NULL,
`field` varchar(4000) DEFAULT NULL,
`field` int(10) unsigned DEFAULT '1',
`field` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`field` text,
`new` tinyint(1) NOT NULL DEFAULT '0',
`applications` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `indexNA` (`new`,`applications`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=153235 DEFAULT CHARSET=utf8;
CREATE TABLE `table_2` (
`id_record` int(10) unsigned NOT NULL AUTO_INCREMENT,
`catalog_name` varchar(512) NOT NULL,
`catalog_url` varchar(4000) NOT NULL,
`parent_id` int(10) unsigned NOT NULL DEFAULT '0',
`checked` tinyint(1) NOT NULL DEFAULT '0',
`level` int(10) unsigned NOT NULL DEFAULT '0',
`work` int(10) unsigned NOT NULL DEFAULT '0',
`update` int(10) unsigned NOT NULL DEFAULT '1',
`type` int(10) unsigned NOT NULL DEFAULT '0',
`hierarchy` varchar(512) DEFAULT NULL,
`synt` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`id_record`,`type`) USING BTREE,
KEY `rec` (`id_record`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=14504 DEFAULT CHARSET=utf8;
CREATE TABLE `table_3` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`id_table_1` int(10) unsigned NOT NULL,
`id_category` int(10) unsigned NOT NULL,
`work` int(10) unsigned NOT NULL DEFAULT '1',
`update` int(10) unsigned NOT NULL DEFAULT '1',
PRIMARY KEY (`id`),
KEY `site` (`id_table_1`,`id_category`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=203844 DEFAULT CHARSET=utf8;
The queries are:
1) get general count (takes less than 1 sec):
SELECT count(table_1.id) FROM table_1
INNER JOIN table_3 ON table_3.id_table_id = table_1.id
INNER JOIN table_2 ON table_2.id_record = table_3.id_category
WHERE ((table_2.type = 0)
AND (table_3.work = 1 AND table_2.work = 1)
AND (table_1.new = 1))AND 1 IN (table_1.applications)
2) get list for page with limit (it takes from 3 to 7 seconds, depends on count):
SELECT table_1.field, table_1.field, table_1.field, table_1.field, table_2.catalog_name FROM table_1
INNER JOIN table_3 ON table_3.id_table_id = table_1.id
INNER JOIN table_2 ON table_2.id_record = table_3.id_category
WHERE ((table_2.type = 0)
AND (table_3.work = 1 AND table_2.work = 1)
AND (table_1.new = 1))AND 1 IN (table_1.applications) LIMIT 10 OFFSET 10
Do Not Change DBMS
I would not suggest changing your DBMS; it may be very disruptive. If you have used MySQL-specific queries that are not compatible with Postgres, you might need to redo all of your indexing and so on. Even then, a performance improvement is not guaranteed.
Caching is a Good Option
Caching is a really good idea. It takes load off your DBMS. It is best suited to workloads with heavy reads and light writes, because objects then stay in the cache longer. Memcached is a really good caching mechanism, and it is really simple. Rapidly scaling sites (like Facebook and the like) make heavy use of Memcached to alleviate the load on the database.
How to Scale-up Really Big Time
You do not have very much data, so caching would most likely help you. The next step beyond caching is a NoSQL-based solution like Cassandra. We use Cassandra in one of our applications, where we have heavy read and write (50:50) operations and the database is really large and fast growing. Cassandra gives good performance. But I guess in your case Cassandra is overkill.
But...
Before you dive into any serious changes, I would suggest really looking into your indexes. Try scaling vertically. Look into slow queries (search for the slow query logging directive). Hopefully MySQL will be faster after optimizing these things, and you will not need additional tools.
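For reference, slow query logging can be switched on at runtime on MySQL 5.1 and later. This is a minimal sketch; the 1-second threshold is an arbitrary example value, and the settings are lost on restart unless they are also put in my.cnf.

```sql
SET GLOBAL slow_query_log  = 'ON';
SET GLOBAL long_query_time = 1;   -- seconds; applies to new connections

-- Where the log is being written.
SHOW VARIABLES LIKE 'slow_query_log_file';
```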
You should look into indexing specific to the most frequent/time consuming queries you use. Check this post on indexing for mysql.
Aside from the suggestions others have offered, I've slightly altered the queries, though I'm not positive of the performance impact under MySQL. I've added STRAIGHT_JOIN so the optimizer doesn't try to decide the table join order for you.
Next, I moved the "AND" conditions into the respective JOIN clauses for tables 2 & 3.
Finally, the join from table 1 to 3 had (in your post)
table_3.id_table_id = table_1.id
instead of
table_3.id_table_1 = table_1.id
Additionally, I can't predict the performance, but it may help to have a stand-alone index on just the "new" column, for an exact match first without regard to the "applications" column. I don't know if the compound index is causing an issue, since you are using an "IN" for applications, which is not truly an indexable search condition.
Here are the modified queries:
SELECT STRAIGHT_JOIN
count(table_1.id)
FROM
table_1
JOIN table_3
ON table_1.id = table_3.id_table_1
AND table_3.work = 1
JOIN table_2
ON table_3.id_category = table_2.id_record
AND table_2.type = 0
AND table_2.work = 1
WHERE
table_1.new = 1
AND 1 IN (table_1.applications);
SELECT STRAIGHT_JOIN
table_1.field,
table_1.field,
table_1.field,
table_1.field,
table_2.catalog_name
FROM
table_1
JOIN table_3
ON table_1.id = table_3.id_table_1
AND table_3.work = 1
JOIN table_2
ON table_3.id_category = table_2.id_record
AND table_2.type = 0
AND table_2.work = 1
WHERE
table_1.new = 1
AND 1 IN (table_1.applications)
LIMIT 10 OFFSET 10
You should also optimize your query.
Without a look into the statements this question can only be answered using theoretical approaches. Just a few ideas to take into consideration...
The SELECT-Statement...
First of all, make sure that your query is as "good" as it can be. Are there any indexes you might have missed? Do the indexed columns have the same field types, and so on? Can you perhaps narrow the query down so the database has less to work on?
The Query cache...
If your query is repeated pretty often, it might help to use the Query cache or - in case you're already using it - give it more RAM.
The Hardware...
Of course some RDBMSs are slower or faster than others, depending on their strengths and weaknesses, but if your query is already optimized into oblivion, you can only make it faster by scaling up the database server (better CPU, better I/O and so on, depending on where the bottleneck is).
Other Factors...
If this all is maxed out, maybe try speeding up the other components (1-2 secs for page generation looks pretty slow to me).
For all of the factors mentioned there is a huge number of ideas and posts on stackoverflow.com.
That is actually not such a big database, certainly not too much for your database system. As comparison, the database that we are using is currently around 40 GB. It's an MS SQL Server, though, so it's not directly comparable, but there is no dramatic difference between the database systems.
My guess is that you haven't been completely successful in using indexes to speed up the query. You should look at the execution plan for the query and see if you can spot which part of the execution is taking most of the time.
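In MySQL the execution plan is obtained by prefixing the query with EXPLAIN. The sketch below does this for the count query from the question, assuming the join column is id_table_1 as declared in the table_3 schema rather than id_table_id as written in the posted query.

```sql
EXPLAIN
SELECT COUNT(table_1.id)
FROM table_1
INNER JOIN table_3 ON table_3.id_table_1 = table_1.id
INNER JOIN table_2 ON table_2.id_record  = table_3.id_category
WHERE table_2.type = 0
  AND table_3.work = 1
  AND table_2.work = 1
  AND table_1.new  = 1
  AND 1 IN (table_1.applications);
```

Rows with type ALL or a large rows estimate are the usual places to look first.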