SELECT COUNT(*) Performance - mysql

Lately I discovered that the most expensive queries on my website are the SELECT COUNT(*) ones.
A simple query can sometimes take more than a second:
SELECT COUNT(*) as count FROM post WHERE category regexp '[[:<:]](17|222)[[:>:]]' AND approve=1 AND date < '2014-01-25 19:08:17';
+-------+
| count |
+-------+
| 3585 |
+-------+
1 row in set (0.49 sec)
I'm not sure what the problem is. I have indexes on category, approve and date.

This is your query:
SELECT COUNT(*) as count
FROM post
WHERE category regexp '[[:<:]](17|222)[[:>:]]' AND approve=1 AND
date < '2014-01-25 19:08:17';
It is not a simple request because the regexp has to run on every row (or every row filtered by the other conditions).
An index on post(approve, date, category) might help. You want one index with the columns listed in that order.
EDIT:
If the values are being stored in a space separated list, you might try this to see if it is faster:
WHERE (concat(' ', category, ' ') like '% 17 %' or concat(' ', category, ' ') like '% 222 %') AND
approve = 1 AND date < '2014-01-25 19:08:17';
It is possible that these expressions are faster than the regular expression.
And, finally, if you really do need to search for "words" in a field, then consider a full text index. I think you might have to tinker with the options in this case so numbers are allowed in the index.
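The space-padding trick above can be tried on a small data set. The following is a minimal sketch, assuming a space-separated category list and using SQLite (via Python's sqlite3 module) as a stand-in for MySQL; the sample rows are made up, and SQLite spells concat() as ||.

```python
import sqlite3

# Hypothetical miniature of the post table; SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post (id INTEGER PRIMARY KEY, category TEXT, approve INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO post (category, approve, date) VALUES (?, ?, ?)",
    [
        ("17 5", 1, "2014-01-01 00:00:00"),      # contains the token 17
        ("170 222", 1, "2014-01-02 00:00:00"),   # contains the token 222
        ("170 2220", 1, "2014-01-03 00:00:00"),  # only longer numbers, no match
        ("17", 0, "2014-01-04 00:00:00"),        # not approved
        ("222", 1, "2015-01-01 00:00:00"),       # after the cutoff date
    ],
)

# Padding both ends with a space makes '% 17 %' match whole tokens only.
# In MySQL the padded expression would be concat(' ', category, ' ').
query = """
SELECT COUNT(*) FROM post
WHERE ((' ' || category || ' ') LIKE '% 17 %'
    OR (' ' || category || ' ') LIKE '% 222 %')
  AND approve = 1
  AND date < '2014-01-25 19:08:17'
"""
count = conn.execute(query).fetchone()[0]
print(count)
```

Only the first two rows are counted, which shows that 170 and 2220 are not mistaken for 17 and 222. Whether this is actually faster than the regular expression has to be measured on the real table.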

Related

Is there a way to count the LIKE results per row in MySQL?

I have a MySQL table jobs like this:
ID | title | keywords
1 | UI Designer | HTML, CSS, Photoshop
2 | Web site Designer | PHP
3 | UI/UX Developer | CSS, HTML, JavaScript
and I have a query like this:
SELECT * FROM jobs
WHERE title LIKE '%UX%' OR title LIKE '%UI%' OR title LIKE '%Developer%' OR keywords LIKE '%HTML%' OR keywords LIKE '%CSS%'
I want to sort the results by similarity, best match first.
For example, the first row (ID 1) contains UI, HTML and CSS, so the number of matching LIKE conditions is 3 for the first row. By the same calculation, it is 0 for the second row and 5 for the third row.
I want the result ordered by the number of matching LIKE conditions, like this:
Results
ID | title | keywords
3 | UI/UX Developer | CSS, HTML, JavaScript
1 | UI Designer | HTML, CSS, Photoshop
Is there any way to count the number of matches per row in the query and sort the result as described?
You could sum the matching results in the ORDER BY:
SELECT *
FROM jobs
WHERE title LIKE '%UX%'
OR title LIKE '%UI%'
OR title LIKE '%Developer%'
OR keywords LIKE '%HTML%'
OR keywords LIKE '%CSS%'
ORDER BY ((title LIKE '%UX%') + (title LIKE '%UI%') + (title LIKE '%Developer%') +
(keywords LIKE '%HTML%') + (keywords LIKE '%CSS%')) DESC
Each LIKE comparison returns 1 or 0, so adding them up gives the number of matching conditions, and the rows with the most matches come first. The parentheses are needed because + binds more tightly than LIKE.
You should not be storing keywords in a string like that. You should have a separate table.
If -- for some reason such as someone else's really, really, really bad design choices -- you have to deal with this data, then take the delimiters into account. In MySQL, I would recommend find_in_set() for this purpose:
SELECT j.*
FROM jobs j
WHERE title LIKE '%UX%' OR
title LIKE '%UI%' OR
title LIKE '%Developer%' OR
FIND_IN_SET('HTML', REPLACE(keywords, ', ', ',')) > 0 OR
FIND_IN_SET('CSS', REPLACE(keywords, ', ', ',')) > 0
ORDER BY ( (title LIKE '%UX%') +
(title LIKE '%UI%') +
(title LIKE '%Developer%') +
(FIND_IN_SET('HTML', REPLACE(keywords, ', ', ',')) > 0) +
(FIND_IN_SET('CSS', REPLACE(keywords, ', ', ',')) > 0)
) DESC ;
This finds an exact match on the keyword.
You can simplify the WHERE, but not the ORDER BY, to:
WHERE title REGEXP 'UX|UI|Developer' OR
FIND_IN_SET('HTML', REPLACE(keywords, ', ', ',')) > 0 OR
FIND_IN_SET('CSS', REPLACE(keywords, ', ', ',')) > 0
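For comparison, here is a rough sketch of the "separate table" design recommended above. The job_keywords table and the score alias are hypothetical names, and SQLite (through Python's sqlite3 module) stands in for MySQL, so details such as LIKE case sensitivity may differ on a real server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobs (id INTEGER PRIMARY KEY, title TEXT);
-- Hypothetical normalized table: one row per (job, keyword) pair.
CREATE TABLE job_keywords (job_id INTEGER, keyword TEXT);
INSERT INTO jobs VALUES
  (1, 'UI Designer'), (2, 'Web site Designer'), (3, 'UI/UX Developer');
INSERT INTO job_keywords VALUES
  (1, 'HTML'), (1, 'CSS'), (1, 'Photoshop'),
  (2, 'PHP'),
  (3, 'CSS'), (3, 'HTML'), (3, 'JavaScript');
""")

# Title matches are summed as 0/1 values; keyword matches are exact
# comparisons counted in a correlated subquery.
query = """
SELECT id, title, score FROM (
    SELECT j.id, j.title,
           (j.title LIKE '%UX%') + (j.title LIKE '%UI%')
           + (j.title LIKE '%Developer%')
           + (SELECT COUNT(*) FROM job_keywords k
              WHERE k.job_id = j.id AND k.keyword IN ('HTML', 'CSS')) AS score
    FROM jobs j
) AS scored
WHERE score > 0
ORDER BY score DESC, id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data from the question this yields job 3 with a score of 5 followed by job 1 with a score of 3, and job 2 is dropped. The keyword comparison is an exact match, so no delimiter handling is needed.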

SQL Query Select Clause, need a solution to not return few rows

I have a column in a static table like this:
Vehicles
-------------
Bike
Truck
car_2018
car_2019
car_2020
car_2021
Bus
The select query needs to fetch only the car row for the year in which the query runs (for example, it is 2018 now; if I run this next year, it should return car_2019), along with the rest of the rows that are not based on years. I need a solution for this.
So far I have this:
SELECT Vehicles
FROM VehicleMaster
WHERE 'some where clause based on other columns'
select Vehicles
from table_name
where Vehicles like '%2018'
union all
select Vehicles
from table_name
where Vehicles not like '%car%'
You can use substring_index to split that field by underscore _ and query based on that:
CREATE TABLE vehicles(f1 varchar(30));
INSERT INTO vehicles VALUES ('Bike'),
('Truck'),
('car_2018'),
('car_2019'),
('car_2020'),
('car_2021'),
('Bus');
SELECT f1
FROM vehicles
WHERE
f1 NOT LIKE 'car%'
OR (f1 LIKE 'car%' AND substring_index(f1, "_", -1) = YEAR(CURDATE()));
+----------+
| f1 |
+----------+
| Bike |
| Truck |
| car_2018 |
| Bus |
+----------+
You can use regex to exclude all car_#### rows, except for the current year. Assuming that your Vehicles column is called name, this should work for you:
select *
from Vehicles
where
(
-- Exclude all car_####
not trim(name) REGEXP '^car_[0-9]{4}$'
-- Except for the current year
or name = concat('car_', year(now()))
)
I think you want:
select t.*
from t
where t.vehicle = concat('car_', year(curdate())) or
t.vehicle not regexp '[0-9]{4}$'
If you want a general purpose "any current year or any without a year", then:
select t.*
from t
where t.vehicle like concat('%_', year(curdate())) or
t.vehicle not regexp '[0-9]{4}$'
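The "current year or no year at all" filter can be sketched with the year passed in as a parameter, which also makes it easy to check what next year's result would be. This is an illustration only: SQLite stands in for MySQL, GLOB replaces REGEXP, and the vehicles_for_year helper and the name column are assumed names.

```python
import sqlite3
from datetime import date

def vehicles_for_year(conn, year):
    """Return rows without a 4-digit year suffix plus the car row for `year`."""
    rows = conn.execute(
        """
        SELECT name FROM vehicles
        WHERE name NOT GLOB '*_[0-9][0-9][0-9][0-9]'
           OR name = 'car_' || ?
        ORDER BY rowid
        """,
        (str(year),),
    )
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vehicles (name TEXT)")
names = ["Bike", "Truck", "car_2018", "car_2019", "car_2020", "car_2021", "Bus"]
conn.executemany("INSERT INTO vehicles VALUES (?)", [(n,) for n in names])

result_2018 = vehicles_for_year(conn, 2018)
result_2019 = vehicles_for_year(conn, 2019)
# In production the year would come from the clock, like YEAR(CURDATE()).
result_now = vehicles_for_year(conn, date.today().year)
print(result_2018)
print(result_2019)
```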

Count the frequency of each word

I've been trolling the internet and realize that MySQL is not the best tool for this, but I'm asking anyway. What query, function or stored procedure has anyone seen or used that will get the frequency of each word across a text column?
Example:
ID | comment
---|-------------------
1  | I love this burger
2  | I hate this burger
Desired result:
word   | count
-------|-------
burger | 2
I      | 2
this   | 2
love   | 1
hate   | 1
This solution seems to do the job (stolen almost verbatim from this page). It requires an auxiliary table filled with sequential numbers from 1 to at least the largest number of words in any single comment. It is important to check that the auxiliary table is large enough, or the results will be wrong without any error being shown.
SELECT
SUBSTRING_INDEX(SUBSTRING_INDEX(maintable.comment, ' ', auxiliary.id), ' ', -1) AS word,
COUNT(*) AS frequency
FROM maintable
JOIN auxiliary ON
LENGTH(comment)>0 AND SUBSTRING_INDEX(SUBSTRING_INDEX(comment, ' ', auxiliary.id), ' ', -1)
<> SUBSTRING_INDEX(SUBSTRING_INDEX(comment, ' ', auxiliary.id-1), ' ', -1)
GROUP BY word
HAVING word <> ' '
ORDER BY frequency DESC;
This approach is as inefficient as one can be, because it cannot use any index.
As an alternative, I would use a statistics table that I would keep up to date with triggers. Perhaps initialise the stats table with the query above.
Something like this should work. Just make sure you don't pass in a 0 length string.
SET @searchString = 'burger';
SELECT
ID,
-- Characters removed by REPLACE, divided by the length of the search string
(LENGTH(comment) - LENGTH(REPLACE(comment, @searchString, ''))) / LENGTH(@searchString) AS count
FROM MyTable;
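Since the question itself notes that MySQL may not be the best tool for this, here is a small sketch of doing the counting in application code instead. It assumes the comments have already been fetched into a list, splits on whitespace only, and mirrors the LENGTH/REPLACE formula above in a helper; the function names are illustrative.

```python
from collections import Counter

# Stand-in for the rows of the comment column.
comments = ["I love this burger", "I hate this burger"]

def word_frequencies(texts):
    """Count whitespace-separated words across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.split())
    return counts

def count_occurrences(text, needle):
    """Same idea as the LENGTH/REPLACE query: removed characters / needle length."""
    if not needle:
        raise ValueError("needle must not be empty")
    return (len(text) - len(text.replace(needle, ""))) // len(needle)

freq = word_frequencies(comments)
for word, n in freq.most_common():
    print(word, n)

per_comment = [count_occurrences(c, "burger") for c in comments]
print(per_comment)
```

Punctuation and letter case are not normalized here, so "Burger." and "burger" would be counted as different words.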

Selecting value corresponding with MAX value of a group

I am trying to get the OXSEOURL of my OXSEO table.
Structure:
oxobjectid | oxseourl | oxparams
Data:
http://imageshack.com/a/img268/7443/3xr4.png
http://imageshack.com/a/img42/315/8bdu.png
My deepest SEO URL always has the highest value in the OXPARAMS field.
Only the numeric values matter; the others are never counted.
Return should be:
http://imageshack.com/a/img29/8404/4jbv.png
I found a solution yesterday, but it was very slow, now I am trying to get a faster way to do it.
So I would like to get the oxseourl for the same oxobjectid with the max oxparams value.
I have more than 330,000 rows, so every millisecond counts.
I only have to select the URLs for products whose objectid starts with "tbproduct_".
My query:
SELECT seo2.oxseourl, seo2.oxobjectid, seo2.oxparams
FROM oxseo AS seo2
JOIN (
SELECT oxobjectid,
MAX(oxparams) AS maxparam
FROM oxseo
GROUP BY
oxobjectid
) AS usm
ON usm.maxparam = seo2.oxparams
WHERE seo2.oxobjectid LIKE '%tbproduct_%'
AND seo2.oxparams REGEXP '^-?[0-9]+$'
But this returns the same rows for the products.
Thanks for any help.
A bit optimized, and a lot faster:
SELECT seo.oxseourl, seo.oxobjectid, MAX(seo.oxparams)
FROM oxseo AS seo
WHERE seo.oxobjectid LIKE 'tbproduct_%' AND seo.oxparams REGEXP '^-?[0-9]+$'
GROUP BY seo.oxseourl, seo.oxobjectid
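One common way to pick the row that holds the per-group maximum is to join back on both the group key and the maximum. The sketch below is not the answer's query: it uses made-up rows, SQLite in place of MySQL, a GLOB test instead of the REGEXP (so only non-negative integers are treated as numeric), and a numeric cast so that '10' sorts above '2'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE oxseo (oxobjectid TEXT, oxseourl TEXT, oxparams TEXT)")
conn.executemany(
    "INSERT INTO oxseo VALUES (?, ?, ?)",
    [
        ("tbproduct_1", "shoes/", "2"),
        ("tbproduct_1", "shoes/red/", "10"),    # deepest URL for product 1
        ("tbproduct_1", "shoes/misc/", "abc"),  # non-numeric, ignored
        ("tbproduct_2", "hats/", "3"),
        ("category_9", "sale/", "99"),          # not a product
    ],
)

# The join uses BOTH oxobjectid and the maximum, so a maximum found for
# one product cannot match rows that belong to another product.
query = """
SELECT s.oxobjectid, s.oxseourl, s.oxparams
FROM oxseo AS s
JOIN (
    SELECT oxobjectid, MAX(CAST(oxparams AS INTEGER)) AS maxparam
    FROM oxseo
    WHERE oxobjectid LIKE 'tbproduct_%'
      AND oxparams <> '' AND oxparams NOT GLOB '*[^0-9]*'
    GROUP BY oxobjectid
) AS m
  ON m.oxobjectid = s.oxobjectid
 AND m.maxparam = CAST(s.oxparams AS INTEGER)
WHERE s.oxparams <> '' AND s.oxparams NOT GLOB '*[^0-9]*'
ORDER BY s.oxobjectid
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two rows of the same product share the maximum, both are returned, so a tie-breaker would be needed when exactly one URL per product is required.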

MYSQL get count of each column where it equals a specific value

I recently set up a MYSQL database connected to a form filled with checkboxes. If the checkbox was selected, it would insert into the associated column a value of '1'; otherwise, it would receive a value of '0'.
I'd like to eventually look at aggregate data from this form, and was wondering if there was any way I could use MYSQL to get a number for each column which would be equal to the number of rows that had a value of '1'.
I've tried variations of:
select count(*) from POLLDATA group by column_name
which was unsuccessful, and nothing else I can think of seems to make sense (admittedly, I'm not all too experienced in SQL).
I'd really like to avoid:
select count(*) from POLLDATA where column_1='1'
for each column (there are close to 100 of them).
Is there any way to do this besides typing out a select count(*) statement for each column?
EDIT:
If it helps, the columns are 'artist1', 'artist2', ....'artist88', 'gender', 'age', 'city', 'state'. As I tried to explain below, I was hoping that I'd be able to do something like:
select sum(EACH_COLUMN) from POLLDATA where gender='Male' and city='New York City';
(obviously EACH_COLUMN is bogus)
SELECT SUM(CASE
WHEN t.your_column = '1' THEN 1
ELSE 0
END) AS OneCount,
SUM(CASE
WHEN t.your_column='0' THEN 1
ELSE 0
END) AS ZeroCount
FROM YOUR_TABLE t
If you are just looking for the sheer number of 1's in the columns, you could try…
select sum(col1), sum(col2), sum(col3) from POLLDATA
A slightly more compact notation is SUM( IF( expression ) ).
For the asker's example, this could look something like:
select
count(*) as total,
sum(if(gender = 'MALE', 1, 0)) as males,
sum(if(gender = 'FEMALE', 1, 0)) as females,
sum(if(city = 'New York City', 1, 0)) as newYorkResidents
from POLLDATA;
Example result:
+-------+-------+---------+------------------+
| total | males | females | newYorkResidents |
+-------+-------+---------+------------------+
| 42 | 23 | 19 | 42 |
+-------+-------+---------+------------------+
select count(*) from POLLDATA group by column_name
I don't think you want a count, because it will also count the rows with a 0.
Try
select column_name, sum(column_name) from POLLDATA group by column_name
or
select column_name, count(*) from POLLDATA
where column_name <> 0
group by column_name
only adds the 0
Instead of strings, why not store actual numbers, 1 or 0?
Then you could use the SQL SUM function.
When the query begins to be a little too complicated, maybe it's because you should think again about your database structure. But if you want to keep your table as it is, you could use a prepared statement that automatically calculates all the sums for you, without specifying every single column:
SELECT
CONCAT(
'SELECT ',
GROUP_CONCAT(CONCAT('SUM(', `column_name`, ') AS sum_', `column_name`)),
' FROM POLLDATA WHERE gender=? AND city=?')
FROM `information_schema`.`columns`
WHERE `table_schema`=DATABASE()
AND `table_name`='POLLDATA'
AND `column_name` LIKE 'artist%'
INTO @sql;
SET @gender := 'male';
SET @city := 'New York';
PREPARE stmt FROM @sql;
EXECUTE stmt USING @gender, @city;
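The same idea, building the SUM list from the table's own column catalog, can also be done in application code. This is a minimal sketch under several assumptions: SQLite stands in for MySQL, PRAGMA table_info plays the role of information_schema.columns, and the table has only three artist columns with made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE POLLDATA "
    "(artist1 INTEGER, artist2 INTEGER, artist3 INTEGER, gender TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO POLLDATA VALUES (?, ?, ?, ?, ?)",
    [
        (1, 0, 1, "Male", "New York City"),
        (1, 1, 0, "Male", "New York City"),
        (0, 1, 1, "Female", "New York City"),
        (1, 1, 1, "Male", "Boston"),
    ],
)

# Column names come from the database catalog, not from user input;
# table_info rows are (cid, name, type, notnull, dflt_value, pk).
columns = [
    row[1]
    for row in conn.execute("PRAGMA table_info(POLLDATA)")
    if row[1].startswith("artist")
]

# One SUM(...) per artist column; the filter values stay bound parameters.
select_list = ", ".join(f'SUM("{c}") AS "sum_{c}"' for c in columns)
sql = f"SELECT {select_list} FROM POLLDATA WHERE gender = ? AND city = ?"
sums = conn.execute(sql, ("Male", "New York City")).fetchone()
print(dict(zip(columns, sums)))
```

Only identifiers read from the catalog are interpolated into the statement text, while gender and city are passed as parameters, much like the ? placeholders in the prepared statement above.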