I have a table in MySQL with more than 50 million rows. Below is the table structure:
CREATE TABLE `links` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`loc` text NOT NULL,
`lastmod` datetime NOT NULL,
`changefreq` varchar(15) NOT NULL,
`priority` float NOT NULL,
`isdownloaded` tinyint(1) NOT NULL,
`mainrepoid` bigint(20) NOT NULL,
PRIMARY KEY (`id`),
FULLTEXT KEY `locfulltext` (`loc`)
) ENGINE=MyISAM AUTO_INCREMENT=11426345 DEFAULT CHARSET=latin1
A FULLTEXT index is enabled on the loc column. I need to fetch all the rows whose loc contains both of the words name and details. The following query didn't return the expected results:
SELECT *
FROM links
WHERE
MATCH (
loc
)
AGAINST (
'name +details' IN BOOLEAN MODE
)
so I am forced to use the following query:
SELECT id, loc
FROM links
WHERE
loc like '%name%' and
loc like '%details%'
Is there any better alternative?
I don't know which version of MySQL you are using, but the query you posted works fine on my server:
SELECT *
FROM links
WHERE
MATCH (
loc
)
AGAINST (
'name +details' IN BOOLEAN MODE
)
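If you still have a problem with this query, try adding a plus sign (+) before name as well. In boolean mode a word without an operator is optional, so 'name +details' only requires details, while '+name +details' requires both words: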
SELECT *
FROM links
WHERE
MATCH (
loc
)
AGAINST (
'+name +details' IN BOOLEAN MODE
)
For more information, see the following link:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-boolean.html
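If the search words come from application code, the boolean-mode string can be built so that every word is required. The following is only a sketch: `build_boolean_query` is a hypothetical helper, not part of MySQL or any driver, and the `%s` placeholder assumes a driver that uses that parameter style.

```python
import re


def build_boolean_query(words):
    """Return a BOOLEAN MODE search string that requires every word.

    Hypothetical helper for illustration only.
    """
    terms = []
    for word in words:
        # Drop characters that act as operators in boolean mode.
        cleaned = re.sub(r'[+\-<>()~*"@]', "", word).strip()
        if cleaned:
            # A leading "+" marks the word as required.
            terms.append("+" + cleaned)
    return " ".join(terms)


query = build_boolean_query(["name", "details"])

# Assumed usage with a MySQL driver that accepts %s placeholders:
sql = "SELECT id, loc FROM links WHERE MATCH (loc) AGAINST (%s IN BOOLEAN MODE)"
params = (query,)
print(query)
```

Passing the search string as a parameter rather than concatenating it into the SQL keeps user input out of the statement text.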
I have a set of tables containing staff members' secondary specialities in different formats. Only a few primary specialities allow a person to have a secondary speciality.
I created a UNION view that combines the secondary specialities in this way, and it is working fine:
SELECT tpl.id, tpl.speciality, 20 as gr FROM tpath_list tpl
UNION ALL
SELECT tlt.id, tlt.lab_type as speciality, 11 as gr FROM tlab_types tlt
UNION ALL
SELECT til.id, til.speciality, 10 as gr FROM tinstrumental_list til
The gr field contains the id of the primary speciality (10, 11 or 20) that allows a secondary speciality. All other specialities (with id 1, 2, 3 etc.) do not have secondary ones.
I receive something like this...
I can now fetch data from the created view using WHERE gr=:n.
How can I modify the view so that fetching data from it with a WHERE gr=:n clause, where :n is one of the other ids (1, 2, 3 etc.), gives me a single record: id=1, speciality="Not specified", gr=n?
Added in response to comments from the community:
If :n IN(10,11,20), I need to fetch all the records that are present in the tables and exposed by the view, for example the 3 records listed in the picture for :n=20. Only if :n NOT IN(10,11,20) do I need an additional (single) record that is absent from the tables (and the view). Please note that the records with gr IN(10,11,20) already include one with id=1 and speciality='Not specified'.
Added at the request of Jon Tofte-Hansen:
Here is an unfiltered view output (without WHERE clause).
If I fetch the view with WHERE gr=20 I get the following (that is what I want):
If with WHERE gr=10 - this set (that is what I want):
If I fetch it with WHERE gr=1, then I would like to receive a single record that is absent from all of the tables used to create the view. I wish to get this output but do not know how:
Here is the structure of the tables if it might help:
DROP TABLE IF EXISTS `tspecialities_list`;
CREATE TABLE `tspecialities_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=21 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tlab_types`;
CREATE TABLE `tlab_types` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`lab_type` varchar(255) DEFAULT NULL,
`deleted` smallint(5) unsigned DEFAULT '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=9 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tpath_list`;
CREATE TABLE `tpath_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;
DROP TABLE IF EXISTS `tinstrumental_list`;
CREATE TABLE `tinstrumental_list` (
`id` smallint(5) unsigned NOT NULL AUTO_INCREMENT,
`speciality` varchar(30) NOT NULL DEFAULT '',
PRIMARY KEY (`id`,`speciality`),
KEY `i_by_speciality` (`speciality`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8;
It is not entirely clear what you want, but this could be it. Add an extra UNION that produces all the ids not present in the three tables with secondary specialities:
SELECT tpl.id, tpl.speciality, 20 as gr FROM tpath_list tpl
UNION ALL
SELECT tlt.id, tlt.lab_type as speciality, 11 as gr FROM tlab_types tlt
UNION ALL
SELECT til.id, til.speciality, 10 as gr FROM tinstrumental_list til
UNION ALL
select tsl.id, 'Not specified', tsl.id as gr from tspecialities_list tsl
where tsl.id not in
(SELECT tpl.id FROM tpath_list tpl
UNION
SELECT tlt.id FROM tlab_types tlt
UNION
SELECT til.id FROM tinstrumental_list til)
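One way to check the view's behaviour is to run it on a few rows. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL, with made-up sample rows and a hypothetical view name (vsecondary). It also makes one assumption about the intent of the question: the extra branch filters on the primary speciality ids (10, 11, 20) rather than on the ids stored in the secondary tables, and it always returns id=1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tspecialities_list (id INTEGER PRIMARY KEY, speciality TEXT);
CREATE TABLE tpath_list (id INTEGER PRIMARY KEY, speciality TEXT);
CREATE TABLE tlab_types (id INTEGER PRIMARY KEY, lab_type TEXT);
CREATE TABLE tinstrumental_list (id INTEGER PRIMARY KEY, speciality TEXT);

-- Made-up sample rows, for illustration only.
INSERT INTO tspecialities_list VALUES
  (1, 'Primary A'), (2, 'Primary B'),
  (10, 'Instrumental'), (11, 'Laboratory'), (20, 'Pathology');
INSERT INTO tpath_list VALUES (1, 'Not specified'), (2, 'Path X');
INSERT INTO tlab_types VALUES (1, 'Not specified'), (2, 'Lab X');
INSERT INTO tinstrumental_list VALUES (1, 'Not specified'), (2, 'Instr X');

-- vsecondary is a hypothetical view name.
CREATE VIEW vsecondary AS
SELECT tpl.id, tpl.speciality, 20 AS gr FROM tpath_list tpl
UNION ALL
SELECT tlt.id, tlt.lab_type AS speciality, 11 AS gr FROM tlab_types tlt
UNION ALL
SELECT til.id, til.speciality, 10 AS gr FROM tinstrumental_list til
UNION ALL
-- One placeholder row per primary speciality without secondary ones.
SELECT 1 AS id, 'Not specified' AS speciality, tsl.id AS gr
FROM tspecialities_list tsl
WHERE tsl.id NOT IN (10, 11, 20);
""")


def fetch(gr):
    """Return the view rows for one primary speciality id."""
    return conn.execute(
        "SELECT id, speciality, gr FROM vsecondary WHERE gr = ? ORDER BY id",
        (gr,),
    ).fetchall()


print(fetch(20))  # rows taken from tpath_list
print(fetch(1))   # the single placeholder row
```

The same CREATE VIEW statement should carry over to MySQL unchanged, since it only uses UNION ALL, constants and NOT IN.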
It seems like you are just asking for the following:
SELECT *
FROM viewname
WHERE speciality='Not specified' and gr=:n
Is there a reason this isn't what you want?
I have a query that is executed in 35s, which is waaaaay too long.
Here are the 3 tables involved in the query (each table has approx. 13000 rows, and will grow much larger in the future) :
Table 1 : Domains
CREATE TABLE IF NOT EXISTS `domain` (
`id_domain` int(11) NOT NULL AUTO_INCREMENT,
`domain_domain` varchar(255) NOT NULL,
`projet_domain` int(11) NOT NULL,
`date_crea_domain` int(11) NOT NULL,
`date_expi_domain` int(11) NOT NULL,
`active_domain` tinyint(1) NOT NULL,
`remarques_domain` text NOT NULL,
PRIMARY KEY (`id_domain`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Table 2 : Keywords
CREATE TABLE IF NOT EXISTS `kw` (
`id_kw` int(11) NOT NULL AUTO_INCREMENT,
`kw_kw` varchar(255) NOT NULL,
`clics_kw` int(11) NOT NULL,
`cpc_kw` float(11,3) NOT NULL,
`date_kw` int(11) NOT NULL,
PRIMARY KEY (`id_kw`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Table 3 : Linking between domain and keyword
CREATE TABLE IF NOT EXISTS `kw_domain` (
`id_kd` int(11) NOT NULL AUTO_INCREMENT,
`kw_kd` int(11) NOT NULL,
`domain_kd` int(11) NOT NULL,
`selected_kd` tinyint(1) NOT NULL,
PRIMARY KEY (`id_kd`),
KEY `kw_to_domain` (`kw_kd`,`domain_kd`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The query is as follows :
SELECT ng.*, kd.*, kg.*
FROM domain ng
LEFT JOIN kw_domain kd ON kd.domain_kd = ng.id_domain
LEFT JOIN kw kg ON kg.id_kw = kd.kw_kd
GROUP BY ng.id_domain
ORDER BY kd.selected_kd DESC, kd.id_kd DESC
Basically, it selects all domains, with, for each one of these domains, the last associated keyword.
Does anyone have an idea on how to optimize the tables or the query ?
The following will get the last keyword, according to your logic:
select ng.*,
(select kw_kd
from kw_domain kd
where kd.domain_kd = ng.id_domain and kd.selected_kd = 1
order by kd.id_kd desc
limit 1
) as kw_kd
from domain ng;
For performance, you want an index on kw_domain(domain_kd, selected_kd, kw_kd). In this case, the order of the fields matters.
You can use this as a subquery to get more information about the keyword:
select ng.*, kg.*
from (select ng.*,
(select kw_kd
from kw_domain kd
where kd.domain_kd = ng.id_domain and kd.selected_kd = 1
order by kd.id_kd desc
limit 1
) as kw_kd
from domain ng
) ng left join
kw kg
on kg.id_kw = ng.kw_kd;
In MySQL, group by can have poor performance, so this might work better, particularly with the right indexes.
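The correlated-subquery approach can be tried on a small data set. This is a minimal sketch using Python's built-in sqlite3 in place of MySQL; the sample rows and the index name kd_lookup are made up, and the tables are reduced to the columns the query touches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE domain (id_domain INTEGER PRIMARY KEY, domain_domain TEXT);
CREATE TABLE kw (id_kw INTEGER PRIMARY KEY, kw_kw TEXT);
CREATE TABLE kw_domain (
  id_kd INTEGER PRIMARY KEY,
  kw_kd INTEGER,
  domain_kd INTEGER,
  selected_kd INTEGER
);
-- Index suggested in the answer; kd_lookup is a made-up name.
CREATE INDEX kd_lookup ON kw_domain (domain_kd, selected_kd, kw_kd);

-- Made-up sample rows.
INSERT INTO domain VALUES (1, 'a.example'), (2, 'b.example');
INSERT INTO kw VALUES (1, 'red'), (2, 'green'), (3, 'blue');
INSERT INTO kw_domain VALUES (1, 1, 1, 1), (2, 2, 1, 1), (3, 3, 1, 0);
""")

rows = conn.execute("""
SELECT ng.id_domain, ng.domain_domain, kg.kw_kw
FROM (SELECT d.*,
             (SELECT kd.kw_kd
              FROM kw_domain kd
              WHERE kd.domain_kd = d.id_domain AND kd.selected_kd = 1
              ORDER BY kd.id_kd DESC
              LIMIT 1
             ) AS kw_kd
      FROM domain d
     ) ng
LEFT JOIN kw kg ON kg.id_kw = ng.kw_kd
ORDER BY ng.id_domain
""").fetchall()

for row in rows:
    print(row)
```

Domains without a selected keyword still appear, with NULL (None in Python) in the keyword column, because of the LEFT JOIN.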
CREATE TABLE `schedule` (
`id` smallint(6) NOT NULL AUTO_INCREMENT,
`aircraftType` varchar(50) DEFAULT NULL,
-- ...other fields
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=5611 DEFAULT CHARSET=latin1;
CREATE TABLE `aircrafts` (
`id` smallint(6) NOT NULL AUTO_INCREMENT,
`aircraftType` varchar(50) DEFAULT NULL,
-- ...other fields
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=5611 DEFAULT CHARSET=latin1;
SAMPLE CONTENT OF DB TABLES:
Table "Schedule"
aircraftType = "320"
Table "Aircrafts"
aircraftType = "A320"
aircraftType = "A330"
Query:
SELECT *
FROM Schedule F, Aircrafts A
WHERE F.aircraftType = A.aircraftType;
How can I update this query so that the aircraft types "320" and "A320" are treated as matching in the WHERE clause?
SELECT *
FROM Schedule F, Aircrafts A
WHERE A.aircraftType LIKE CONCAT('%', F.aircraftType, '%')
OR A.aircraftType LIKE CONCAT('"', '%', F.aircraftType, '%', '"') -- variant with added double quotes
Try cutting off the first character with SUBSTRING():
SELECT *
FROM Schedule F, Aircrafts A
WHERE F.aircraftType = SUBSTRING(A.aircraftType, 2)
Or, as @Mihai suggested, put the % in front of the value that doesn't have the A in it and use it as the LIKE pattern:
SELECT *
FROM Schedule F, Aircrafts A
WHERE A.aircraftType LIKE CONCAT('%', F.aircraftType)
But the best solution would be to update the data so that both tables store identical strings for the relation.
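Both matching strategies can be compared on the sample data from the question. The sketch below uses Python's built-in sqlite3 rather than MySQL, so it is only an approximation: SQLite spells string concatenation as || and SUBSTRING as substr, and the tables are reduced to the two columns that matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE schedule (id INTEGER PRIMARY KEY, aircraftType TEXT);
CREATE TABLE aircrafts (id INTEGER PRIMARY KEY, aircraftType TEXT);
INSERT INTO schedule (aircraftType) VALUES ('320');
INSERT INTO aircrafts (aircraftType) VALUES ('A320'), ('A330');
""")

# Suffix match: the pattern is always the right-hand operand of LIKE.
by_like = conn.execute("""
SELECT F.aircraftType, A.aircraftType
FROM schedule F
JOIN aircrafts A ON A.aircraftType LIKE '%' || F.aircraftType
""").fetchall()

# Drop the first character of the longer code, then compare exactly.
by_substr = conn.execute("""
SELECT F.aircraftType, A.aircraftType
FROM schedule F
JOIN aircrafts A ON F.aircraftType = substr(A.aircraftType, 2)
""").fetchall()

print(by_like)
print(by_substr)
```

Neither condition can use an ordinary index on aircraftType, which is one more reason to normalise the stored values.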
In the same database I have a table messages, from which I want the columns id, title and text. I want only the records whose title has no matching entry in the table lastlogon, where the equivalent column is named username.
I have been using this SQL command in PHP, it generally took 2-3 seconds to pull up:
SELECT DISTINCT * FROM messages WHERE title NOT IN (SELECT username FROM lastlogon) LIMIT 1000
This was all good until the table lastlogon grew to contain about 80% of the values in the table messages. Messages has about 8000 entries, lastlogon about 7000. Now the query takes one to two minutes to finish, and MySQL shoots up to very high CPU usage.
I tried the following but had no luck reducing the time:
SELECT id,title,text FROM messages a LEFT OUTER JOIN lastlogon b ON (a.title = b.username) LIMIT 1000
Why all of a sudden is it taking so long for such low amount of entries? I tried restarting mysql and apache multiple times. I am using debian linux.
Edit: Here are the structures
--
-- Table structure for table `lastlogon`
--
CREATE TABLE IF NOT EXISTS `lastlogon` (
`username` varchar(25) NOT NULL,
`lastlogon` date NOT NULL,
`datechecked` date NOT NULL,
PRIMARY KEY (`username`),
KEY `username` (`username`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
-- --------------------------------------------------------
--
-- Table structure for table `messages`
--
CREATE TABLE IF NOT EXISTS `messages` (
`id` smallint(9) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(255) NOT NULL,
`name` varchar(255) NOT NULL,
`email` varchar(50) NOT NULL,
`text` mediumtext,
`folder` tinyint(2) NOT NULL,
`read` smallint(5) unsigned NOT NULL,
`dateline` int(10) unsigned NOT NULL,
`ip` varchar(15) NOT NULL,
`attachment` varchar(255) NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
`username` varchar(300) NOT NULL,
`error` varchar(500) NOT NULL,
PRIMARY KEY (`id`),
KEY `title` (`title`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=9010 ;
Edit 2
Edited structure with new indexes.
After putting an index on both messages.title and lastlogon.username I came up with these results:
Showing rows 0 - 29 (623 total, Query took 74.4938 sec)
First: replace the key on title, with a compound key on title + id
ALTER TABLE messages DROP INDEX title;
ALTER TABLE messages ADD INDEX title (title, id);
Now change the select to:
SELECT m.* FROM messages m
LEFT JOIN lastlogon l ON (l.username = m.title)
WHERE l.username IS NULL
-- GROUP BY m.id DESC -- faster replacement for distinct. I don't think you need this.
LIMIT 1000;
Or
SELECT m.* FROM messages m
WHERE m.title NOT IN (SELECT l.username FROM lastlogon l)
-- GROUP BY m.id DESC -- faster than distinct, I don't think you need it though.
LIMIT 1000;
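The two forms above should return the same rows. Here is a minimal sketch that checks this on toy data with Python's built-in sqlite3 (a stand-in for MySQL; the sample rows and the index name idx_title are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lastlogon (username TEXT PRIMARY KEY);
CREATE TABLE messages (id INTEGER PRIMARY KEY, title TEXT);
-- Compound index on (title, id); idx_title is a made-up name.
CREATE INDEX idx_title ON messages (title, id);

-- Made-up sample rows.
INSERT INTO lastlogon VALUES ('alice'), ('bob');
INSERT INTO messages (title) VALUES ('alice'), ('carol'), ('bob'), ('dave');
""")

# Anti-join: keep only the messages for which the join found no partner.
left_join = conn.execute("""
SELECT m.id, m.title
FROM messages m
LEFT JOIN lastlogon l ON l.username = m.title
WHERE l.username IS NULL
ORDER BY m.id
LIMIT 1000
""").fetchall()

# Equivalent NOT IN form (safe here because username is NOT NULL).
not_in = conn.execute("""
SELECT m.id, m.title
FROM messages m
WHERE m.title NOT IN (SELECT l.username FROM lastlogon l)
ORDER BY m.id
LIMIT 1000
""").fetchall()

print(left_join)
print(not_in)
```

The WHERE l.username IS NULL filter is what turns the LEFT JOIN into an anti-join; without it the join returns every message.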
Another cause of the slowness is the SELECT m.* part.
By selecting all columns, you are forcing MySQL to do extra work.
Only select the columns you need:
SELECT m.title, m.name, m.email, ......
This will speed up the query as well.
There's another trick you can use:
Replace the LIMIT 1000 with a cutoff date.
First add an index on timestamp (or whatever field you want to use for the cutoff), then run:
SELECT m.* FROM messages m
LEFT JOIN lastlogon l ON (l.username = m.title)
WHERE (m.id > (SELECT MIN(m2.id) FROM messages m2 WHERE m2.timestamp >= '2011-09-01'))
AND l.username IS NULL
-- GROUP BY m.id DESC -- faster replacement for distinct. I don't think you need this.
I suggest adding an index on messages.title. Then run the query again and test the performance.
Looking at this query there's got to be something bogging it down that I'm not noticing. I ran it for 7 minutes and it only updated 2 rows.
//set product count for makes
$tru->query->run(array(
'name' => 'get-make-list',
'sql' => 'SELECT id, name FROM vehicle_make',
'connection' => 'core'
));
while($tempMake = $tru->query->getArray('get-make-list')) {
$tru->query->run(array(
'name' => 'update-product-count',
'sql' => 'UPDATE vehicle_make SET product_count = (
SELECT COUNT(product_id) FROM taxonomy_master WHERE v_id IN (
SELECT id FROM vehicle_catalog WHERE make_id = '.$tempMake['id'].'
)
) WHERE id = '.$tempMake['id'],
'connection' => 'core'
));
}
I'm sure this query can be optimized to perform better, but I can't think of how to do it.
vehicle_make = 45 rows
taxonomy_master = 11,223 rows
vehicle_catalog = 5,108 rows
All tables have appropriate indexes
UPDATE: I should note that this is a 1-time script so overhead isn't a big deal as long as it runs.
CREATE TABLE IF NOT EXISTS `vehicle_make` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(32) NOT NULL,
`product_count` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=46 ;
CREATE TABLE IF NOT EXISTS `taxonomy_master` (
`product_id` int(10) NOT NULL,
`v_id` int(10) NOT NULL,
`vehicle_requirement` varchar(255) DEFAULT NULL,
`is_sellable` enum('True','False') DEFAULT 'True',
`programming_override` varchar(25) DEFAULT NULL,
PRIMARY KEY (`product_id`,`v_id`),
KEY `idx2` (`product_id`),
KEY `idx3` (`v_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
CREATE TABLE IF NOT EXISTS `vehicle_catalog` (
`v_id` int(10) NOT NULL,
`id` int(11) NOT NULL,
`v_make` varchar(255) NOT NULL,
`make_id` int(11) NOT NULL,
`v_model` varchar(255) NOT NULL,
`model_id` int(11) NOT NULL,
`v_year` varchar(255) NOT NULL,
PRIMARY KEY (`v_id`,`v_make`,`v_model`,`v_year`),
UNIQUE KEY `idx` (`v_make`,`v_model`,`v_year`),
UNIQUE KEY `idx2` (`v_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Update: The successful query to get what I needed is here....
SELECT
m.id,COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
INNER JOIN vehicle_make m ON v.make_id=m.id
GROUP BY m.id;
Without the table and column definitions, this is my best guess from reverse engineering the given queries (written in MySQL's multi-table UPDATE syntax):
UPDATE vehicle_make m
INNER JOIN (
SELECT v.make_id, COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
GROUP BY v.make_id
) c ON c.make_id=m.id
SET m.product_count=c.CountOf
The given code loops over each make and then runs a separate query to count the products for it. My answer does them all in one query and should be a lot faster.
Have an index on each of these:
vehicle_make.id cover on name
vehicle_catalog.id cover make_id
taxonomy_master.v_id
EDIT
give this a try:
CREATE TEMPORARY TABLE CountsOf (
id int(11) NOT NULL
, CountOf int(11) NOT NULL DEFAULT 0
);
INSERT INTO CountsOf
(id, CountOf )
SELECT
m.id,COUNT(t.product_id) AS CountOf
FROM taxonomy_master t
INNER JOIN vehicle_catalog v ON t.v_id=v.id
INNER JOIN vehicle_make m ON v.make_id=m.id
GROUP BY m.id;
UPDATE vehicle_make,CountsOf
SET vehicle_make.product_count=CountsOf.CountOf
WHERE vehicle_make.id=CountsOf.id;
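The idea of setting every count in one pass, instead of one UPDATE per make from PHP, can be sketched on toy data. The example uses Python's built-in sqlite3 as a stand-in for MySQL, with made-up rows, and a correlated subquery in place of the temporary table, so treat it as an illustration of the approach rather than the exact statements above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle_make (
  id INTEGER PRIMARY KEY,
  name TEXT,
  product_count INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE vehicle_catalog (v_id INTEGER, id INTEGER, make_id INTEGER);
CREATE TABLE taxonomy_master (product_id INTEGER, v_id INTEGER);

-- Made-up sample rows.
INSERT INTO vehicle_make (id, name) VALUES (1, 'Make A'), (2, 'Make B'), (3, 'Make C');
INSERT INTO vehicle_catalog VALUES (100, 10, 1), (101, 11, 1), (102, 12, 2);
INSERT INTO taxonomy_master VALUES (500, 10), (501, 10), (502, 11), (503, 12);
""")

# One statement updates every make; makes without products end up at 0.
conn.execute("""
UPDATE vehicle_make
SET product_count = (
  SELECT COUNT(t.product_id)
  FROM taxonomy_master t
  JOIN vehicle_catalog v ON t.v_id = v.id
  WHERE v.make_id = vehicle_make.id
)
""")

counts = conn.execute(
    "SELECT id, product_count FROM vehicle_make ORDER BY id"
).fetchall()
print(counts)
```

The join condition t.v_id = v.id follows the successful query in the question; if taxonomy_master.v_id actually refers to vehicle_catalog.v_id, the condition would need to change accordingly.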
Instead of using a nested query, you can separate this query into 2 or 3 queries, and in PHP insert the result of the inner query into the outer query. It's faster!
@haim-evgi Separating the queries will not increase the speed significantly; it will just shift the load from the DB server to the web server and add the overhead of moving data between the two servers.
I doubt that such a query would run for 7 minutes with the appropriate indexes in place. Could you please show the structure of the tables involved in these queries?
Seems like you need the following indices:
INDEX BTREE('make_id') on vehicle_catalog
INDEX BTREE('v_id') on taxonomy_master