I have trouble understanding subqueries. Can someone properly explain them to me, please?
I have this query:
SELECT substring_index(marca, ' ', 1) AS companie, count(*) AS numar_masini
FROM tip_masina
GROUP BY companie;
That displays:
companie numar_masini
Chevrolet 5
Dacia 1
Dodge 5
Ford 6
And this query:
SELECT substring_index(id_masina, ' ', 1) AS marca, SUM(nr_vehicule) AS nr_masini
FROM proprietate
GROUP BY marca;
That displays:
marca nr_masini
Chevrolet 18
Dodge 9
Ford 11
I want to 'combine' these queries properly and get a result of:
companie numar_marci numar_masini
Chevrolet 5 18
Dacia 1 0 (or null)
Dodge 5 9
Ford 6 11
How can I do that properly? I don't want code, I want a proper explanation of subqueries, as I did not understand them properly and I couldn't find proper documentation that explains them.
A sub query (in this way) is just returning a set of rows. You can treat the result of a sub query as if it were a table.
For example your queries could be used as follows.
SELECT c.companie, c.numar_marci, b.numar_masini
FROM
(
SELECT substring_index(marca, ' ', 1) AS companie, count(*) AS numar_marci
FROM tip_masina
GROUP BY companie
) c
LEFT OUTER JOIN
(
SELECT substring_index(id_masina, ' ', 1) AS marca, SUM(nr_vehicule) AS numar_masini
FROM proprietate
GROUP BY marca
) b
ON c.companie = b.marca
Here it is taking your first query and getting the 4 makes (with the counts to go with them), then joining that result against the result of your 2nd query. Your 2nd query only brings back 3 rows, so a make that appears only in the first result gets NULL for the columns of the second.
It is possible that each query could return rows that the other query doesn't return. In that case you might use a 3rd sub query to get the main rows. Then you can join the other 2 against that:
SELECT a.companie, b.nr_masini, c.numar_masini
FROM
(
SELECT substring_index(marca, ' ', 1) AS companie
FROM tip_masina
UNION
SELECT substring_index(id_masina, ' ', 1) AS companie
FROM proprietate
) a
LEFT OUTER JOIN
(
SELECT substring_index(id_masina, ' ', 1) AS marca, SUM(nr_vehicule) AS nr_masini
FROM proprietate
GROUP BY marca
) b
ON a.companie = b.marca
LEFT OUTER JOIN
(
SELECT substring_index(marca, ' ', 1) AS companie, count(*) AS numar_masini
FROM tip_masina
GROUP BY companie
) c
ON a.companie = c.companie
In this way you are using one sub query to get a list of all the possible companie values, and treating that as a table to join against the other 2 sub queries.
You could set up each of these sub queries as a view which would make them look even more like tables when you join them.
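To see the "sub query as a table" idea run end to end, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is an assumption made for illustration: the rows are made up, SQLite stands in for MySQL, and because SQLite has no SUBSTRING_INDEX a small Python function is registered under that name (it only mimics positive counts).

```python
import sqlite3


def substring_index(text, delim, count):
    # Stand-in for MySQL's SUBSTRING_INDEX, positive counts only (assumption).
    return delim.join(text.split(delim)[:count])


conn = sqlite3.connect(":memory:")
conn.create_function("substring_index", 3, substring_index)
conn.executescript("""
    CREATE TABLE tip_masina (marca TEXT);
    CREATE TABLE proprietate (id_masina TEXT, nr_vehicule INTEGER);
    -- Made-up sample rows, not the asker's data.
    INSERT INTO tip_masina VALUES
        ('Chevrolet Camaro'), ('Chevrolet Impala'), ('Dacia Logan'), ('Ford Focus');
    INSERT INTO proprietate VALUES
        ('Chevrolet Camaro', 3), ('Chevrolet Impala', 2), ('Ford Focus', 4);
""")

# Each parenthesised SELECT produces a set of rows that the outer query
# treats exactly like a table named c or b.
rows = conn.execute("""
    SELECT c.companie, c.numar_marci, b.numar_masini
    FROM (
        SELECT substring_index(marca, ' ', 1) AS companie, COUNT(*) AS numar_marci
        FROM tip_masina
        GROUP BY companie
    ) c
    LEFT OUTER JOIN (
        SELECT substring_index(id_masina, ' ', 1) AS marca, SUM(nr_vehicule) AS numar_masini
        FROM proprietate
        GROUP BY marca
    ) b ON c.companie = b.marca
    ORDER BY c.companie
""").fetchall()

for row in rows:
    print(row)
```

Dacia has no row in proprietate in this sample, so the LEFT OUTER JOIN keeps it and returns None (NULL) for its numar_masini.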
That is not a subquery. substring_index is a function.
This is how a subquery works.
SELECT *
FROM
(
-- This is the subquery
SELECT id, test1, test2
FROM `table`
) TableAlias
Basically, MySQL runs the inner select first:
SELECT id, test1, test2 FROM `table`
Its result set is then available under the name TableAlias, so its columns can be referenced like this:
SELECT TableAlias.id, TableAlias.test1, TableAlias.test2
FROM
(
-- This is the subquery
SELECT id, test1, test2
FROM `table`
) TableAlias
Subqueries are useful for many things, for example when using aggregate functions like SUM or AVG. They can also really help when you need to ORDER results.
Try this link: http://www.toadworld.com/platforms/mysql/w/wiki/6355.table-subqueries.aspx
I hope that explains Subqueries a little bit better for you.
Related
I have two tables, Posts and Categories. In the Posts table I store the category IDs as a comma-separated string like 5,8,23,7. When displaying the posts, I want to show the category names as a comma-separated list like Flowers, Birds, Animals. I tried some queries, but none of them got me there. An example of the Posts table:
ID Post title categories
3 Example Post 5,7,23,8
And the Categories Table will be like this
ID name
5 Flowers
7 Animals
8 Birds
23 Naturals
And I want result like this
ID Post Title Category
3 Example Post Flowers, Animals, Birds
For that I tried this query, but it didn't get me that result.
SELECT post.ID, post.Post_title, (SELECT cat.name FROM Categories as cat WHERE cat.ID IN (post.category)) AS Categories FROM Posts as post
And it returns only one category, it retrieves the first category name only.
If you simply must use that schema, you could try something like this:
select P.ID, P.Post_title, (
select group_concat(C.name SEPARATOR ', ')
from Categories C
-- wrap both the ID and the list in commas so only whole values match
where LOCATE(CONCAT(',', C.ID, ','), CONCAT(',', REPLACE(P.categories, ' ', ''), ',')) > 0
) as categories
from Posts P;
It's hacky: wrapping both the list and the ID in commas means every value, including the first and the last one, is delimited on both sides. You can't just do a straight substring match, because otherwise you'll get a category ID of 5 matched to a 'categories' value of '1, 2, 555'.
EDIT: Updated to consider the fact that Posts.categories is a CSV value.
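The delimiter problem described above is easy to reproduce outside the database. The following is a small Python sketch of it; the function names naive_match and delimited_match are made up for this illustration and are not part of any answer.

```python
def naive_match(category_id, csv):
    # Straight substring test: matches partial numbers as well.
    return str(category_id) in csv


def delimited_match(category_id, csv):
    # Strip spaces, then wrap the list and the ID in commas so that
    # only complete values can match, including the first and last one.
    normalized = "," + csv.replace(" ", "") + ","
    return f",{category_id}," in normalized


print(naive_match(5, "1, 2, 555"))      # substring hit inside 555
print(delimited_match(5, "1, 2, 555"))  # no whole value equals 5
print(delimited_match(8, "5,7,23,8"))   # last value in the list
```

The same wrapping idea carries over to SQL string functions, although a separate link table between posts and categories avoids the issue altogether.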
You need to use the GROUP_CONCAT() function, together with the trick posted in "SQL split comma separated row", to split the CSV column for the JOIN and then build the output CSV:
SELECT
Posts.ID,
Posts.Post_title,
GROUP_CONCAT(Categories.name SEPARATOR ',') AS `Category`
FROM Posts
INNER JOIN Categories
ON Categories.ID IN (
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(Posts.categories, ',', n.n), ',', -1) value
FROM (
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(Posts.categories) - LENGTH(REPLACE(Posts.categories, ',', '')))
ORDER BY value
)
GROUP BY
Posts.ID,
Posts.Post_title
Fiddle: http://sqlfiddle.com/#!9/b1ddc9/4
I believe you can use group_concat. Just join the Categories table and apply group_concat(name).
As for the JOIN, try find_in_set(Categories.ID, Posts.categories) > 0.
Though this approach may not work if the comma-delimited category IDs contain spaces (e.g. 1, 2). But if you are saving them without spaces, this may work.
Problem
I have a query that I pasted below. The problem I face is how can I trim the latency to under the current time of about 10 seconds.
SET @csum = 0;
SELECT Date_format(assigneddate, '%b %d %Y') AS assigneddate, (@csum := @csum + numactionitems) AS totalactionitems
FROM(
SELECT assigneddate,
Sum(numactionitems) AS numactionitems FROM(
SELECT assigneddate,
Count( * ) AS numactionitems FROM(
SELECT *
FROM(
SELECT actionitemtitle,
actionitemstatement,
altownerid,
approvalstatement,
assigneddate,
assignorid,
closeddate,
closurecriteria,
closurestatement,
criticality,
duedate,
ecd,
notes,
ownerid,
Concat(lastname, ', ', firstname) AS owner,
cnames2.categoryvalue AS `team`,
cnames2.categorynameid AS `teamid`,
cnames3.categoryvalue AS `department`,
cnames3.categorynameid AS `departmentid`,
cnames4.categoryvalue AS `source`,
cnames4.categorynameid AS `sourceid`,
cnames5.categoryvalue AS `project_phase`,
cnames5.categorynameid AS `project_phaseid`,
ac1.actionitemid FROM actionitemcategories AS ac1 INNER JOIN actionitems AS a INNER JOIN users AS u INNER JOIN(
SELECT actionitemid AS a2id,
categorynameid AS c2 FROM actionitemcategories WHERE categoryid = 195) AS ac2 INNER JOIN categorynames AS cnames2 ON cnames2.categorynameid = ac2.c2 AND ac1.categoryid = 195 AND a.actionitemid = ac2.a2id AND ac1.actionitemid = a.actionitemid AND a.ownerid = u.userid INNER JOIN(
SELECT actionitemid AS a3id,
categorynameid AS c3 FROM actionitemcategories WHERE categoryid = 200) AS ac3 INNER JOIN categorynames AS cnames3 ON cnames3.categorynameid = ac3.c3 AND ac2.a2id = ac3.a3id INNER JOIN(
SELECT actionitemid AS a4id,
categorynameid AS c4 FROM actionitemcategories WHERE categoryid = 202) AS ac4 INNER JOIN categorynames AS cnames4 ON cnames4.categorynameid = ac4.c4 AND ac3.a3id = ac4.a4id INNER JOIN(
SELECT actionitemid AS a5id,
categorynameid AS c5 FROM actionitemcategories WHERE categoryid = 203) AS ac5 INNER JOIN categorynames AS cnames5 ON cnames5.categorynameid = ac5.c5 AND ac4.a4id = ac5.a5id) s WHERE 1 = 1) f GROUP BY assigneddate UNION ALL(
SELECT a.date AS assigneddate,
0 AS numactionitems FROM(
SELECT '2015-03-05' + INTERVAL(a.a + (10 * b.a) + (100 * c.a)) day AS date FROM(
SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a CROSS JOIN(
SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b CROSS JOIN(
SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c) a) ORDER BY assigneddate ASC) t GROUP BY assigneddate LIMIT 282) t
WHERE assigneddate != '0000-00-00'
Purpose of Query
The purpose of this query is to get all records that contain date values and to collect the running count of all records that fall on a certain date. Date values are computed within the sqlfiddle below. Its final purpose is to be displayed in a graph that plots the running total as a line graph. It only counts upwards, so it is a growing graph.
The graph I am displaying it in is called a build-up graph of all my records (action items with date values).
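For reference, the running total described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the dates are made up, they are ISO-formatted strings (so sorting them as text is chronological), and the zero-filling of days without records that the query does is left out.

```python
from collections import Counter
from itertools import accumulate

# Made-up assigneddate values, one entry per action item.
assigned_dates = [
    "2015-03-05", "2015-03-05", "2015-03-07",
    "2015-03-08", "2015-03-08", "2015-03-08",
]

# Count items per day, ordered by date.
per_day = sorted(Counter(assigned_dates).items())

# Running total over the daily counts, paired back up with the dates.
running = accumulate(count for _, count in per_day)
build_up = [(day, total) for (day, _), total in zip(per_day, running)]

for day, total in build_up:
    print(day, total)
```

Each pair is one point of the build-up line: the date and the number of action items assigned up to and including that date.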
Description of Issue
My problem is that I am getting the results of the query in at least 10 seconds.
Question
How can I accelerate and reduce the latency of the query so that I will not stall the loading of my graph?
Complete Schema and portion of my above query that Runs Successfully
(I am having difficulty getting the main query to run at all on sqlfiddle, though I can run it from my own machine).
http://sqlfiddle.com/#!9/865ee/11
Any help or suggestions would be tremendously appreciated!
EDIT
ADDED Sample Screenshot of my Categories Interface
Category (first table) has a field called categoryname, which holds one of 4 values (the list can be expanded or trimmed): Team, Department, Source, Project_Phase.
CategoryName (second table) has a field called categoryvalue, which is the actual allowed value for each category in the first table.
Example: Team 1, Team 2 and Team 3 are categoryvalues that belong to the category Team.
Start by making that table of dates a permanent table, not a subquery.
This construct performs very poorly, and can usually be turned into JOINs without subqueries:
JOIN ( SELECT ... )
JOIN ( SELECT ... )
This is because there is no index on the subqueries, so full scans are needed.
Provide EXPLAIN for the entire query.
Addenda
A PRIMARY KEY is a key; don't add another key with the same column(s).
EAV schema leads to complexity and sluggishness that you are encountering.
Don't use TINYTEXT; it slows down tmp tables in complex queries; use VARCHAR(255). Don't use VARCHAR(255), use VARCHAR with a realistic limit.
Why do you need both categories and categorynames?
I have a simple MySQL statement:
SELECT q1, COUNT(q1) FROM results WHERE q1 IN ('1','2','3') GROUP BY q1;
Currently there are only results for 1 and 3 - results are:
1 = 6
3 = 7
But what I need is for MySQL to bring back a result for 1, 2 and 3 even though 2 has no data, like this:
1 = 6
2 = 0
3 = 7
Any ideas?
This is tricky because no rows match your value (2), so there is nothing to count.
I would solve this by creating a temp table containing the list of values I want counts for:
CREATE TEMPORARY TABLE q ( q1 INT PRIMARY KEY );
INSERT INTO q (q1) VALUES (1), (2), (3);
Then do an OUTER JOIN to your results table:
SELECT q.q1, COUNT(results.q1) AS count
FROM q LEFT OUTER JOIN results USING (q1)
GROUP BY q.q1;
This way each value will be part of the final result set, even if it has no matching rows.
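One detail worth checking in an outer join like this is which expression gets counted. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, with made-up rows, and compares COUNT(*) with COUNT of a column from the results side; the helper name counts is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE q (q1 INTEGER PRIMARY KEY);
    INSERT INTO q (q1) VALUES (1), (2), (3);
    CREATE TABLE results (q1 INTEGER);
    -- Made-up data: nothing for the value 2.
    INSERT INTO results (q1) VALUES (1), (1), (3), (3), (3);
""")


def counts(expr):
    # Run the same outer join with a different counting expression.
    sql = f"""
        SELECT q.q1, {expr}
        FROM q LEFT OUTER JOIN results AS r ON r.q1 = q.q1
        GROUP BY q.q1
        ORDER BY q.q1
    """
    return conn.execute(sql).fetchall()


count_star = counts("COUNT(*)")    # counts joined rows, NULL-padded or not
count_col = counts("COUNT(r.q1)")  # counts only non-NULL r.q1 values

print(count_star)
print(count_col)
```

The outer join still produces one row for the value 2, padded with NULLs, so COUNT(*) reports 1 for it, while counting a column from the results side reports 0.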
Re comment from @Mike Christensen:
MySQL didn't support CTEs at the time of writing (they arrived in MySQL 8.0), in spite of the feature being requested as far back as 2006: http://bugs.mysql.com/bug.php?id=16244
You could do the same thing with a derived table:
SELECT q.q1, COUNT(results.q1) AS count
FROM (SELECT 1 AS q1 UNION ALL SELECT 2 UNION ALL SELECT 3) AS q
LEFT OUTER JOIN results USING (q1)
GROUP BY q.q1;
But this creates a temp table anyway.
A SQL query doesn't really have a way to refer to the values in your IN clause. I think you'd have to break this down into one query for each value. Something like:
SELECT 1 as q1, COUNT(1) FROM results WHERE q1 = '1'
UNION ALL
SELECT 2 as q1, COUNT(1) FROM results WHERE q1 = '2'
UNION ALL
SELECT 3 as q1, COUNT(1) FROM results WHERE q1 = '3'
Note: If there are a lot of values in your IN clause, you might be better off to write your code in a way where missing values are assumed to have zero.
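On the application side, "missing values are assumed to be zero" can be as small as the following Python sketch. The rows list stands for what a plain GROUP BY query would return for the question's data; the variable names are made up.

```python
wanted = ["1", "2", "3"]

# What the grouped query returns: only values that actually have rows.
rows = [("1", 6), ("3", 7)]

# Start every wanted value at zero, then overwrite with the real counts.
counts = dict.fromkeys(wanted, 0)
counts.update(rows)

for value in wanted:
    print(value, "=", counts[value])
```

This keeps the SQL simple and scales to long IN lists without building a UNION per value.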
In general, you cannot query something that does not exist. So you must create data for it. Use union to add the missing values.
select q1, COUNT(*)
from results
where q1 in ('1','2','3')
group by q1
union
select q1, 0
from (
select '1' as q1
union
select '2'
union
select '3'
) as q
where q1 not in (
select q1
from results
)
I have 4 tables, which have two fields in common:
total_share and idea_user_id
I have written a query to calculate total_income as the SUM of total_share from each table.
Here's my query:
SELECT(
(SELECT SUM(total_share) FROM `idea_submitter_percentage` WHERE idea_user_id='3')
+
(SELECT SUM(total_share) FROM `idea_revisor_percentage` WHERE idea_user_id='3')
+
(SELECT SUM(total_share) FROM `idea_contributor_percentage` WHERE idea_user_id='3')
+
(SELECT SUM(total_share) FROM `idea_comparisor_percentage` WHERE idea_user_id='3')
)
AS total_income
The problem is that it works fine when I have at least one row in each table where idea_user_id='3',
but if one table has no entries for idea_user_id='3', its SUM returns NULL, and so total_income as a whole comes back as NULL.
How can I solve this problem?
You could use coalesce() to deal with the nulls:
select coalesce(sum(...), 0) ...
Or you could move the sum in the outer query, and stick to union all in the subqueries:
select sum(...)
from ( select ... from ...
union all
select ... from ...
... ) as sub
...
You could also join all the tables (juergen's answer).
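Both suggestions can be tried out quickly. The sketch below uses Python's sqlite3 module in place of MySQL, only two of the four tables, and made-up share values; SUM over zero rows yields NULL in SQLite as well, and NULL plus a number is NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE idea_submitter_percentage (idea_user_id TEXT, total_share REAL);
    CREATE TABLE idea_revisor_percentage (idea_user_id TEXT, total_share REAL);
    -- Made-up data: user 3 has rows in the first table only.
    INSERT INTO idea_submitter_percentage VALUES ('3', 10), ('3', 20);
""")

SUBMITTER = "(SELECT {0} FROM idea_submitter_percentage WHERE idea_user_id = '3')"
REVISOR = "(SELECT {0} FROM idea_revisor_percentage WHERE idea_user_id = '3')"


def scalar(sql):
    # Return the single value of a one-row, one-column query.
    return conn.execute(sql).fetchone()[0]


# Plain sums: the empty table contributes NULL, so the total is NULL.
plain = scalar("SELECT " + SUBMITTER.format("SUM(total_share)")
               + " + " + REVISOR.format("SUM(total_share)"))

# COALESCE turns each NULL sum into 0 before adding.
with_coalesce = scalar("SELECT " + SUBMITTER.format("COALESCE(SUM(total_share), 0)")
                       + " + " + REVISOR.format("COALESCE(SUM(total_share), 0)"))

# UNION ALL collects the rows first, then one SUM runs over all of them.
with_union = scalar("""
    SELECT SUM(total_share) FROM (
        SELECT total_share FROM idea_submitter_percentage WHERE idea_user_id = '3'
        UNION ALL
        SELECT total_share FROM idea_revisor_percentage WHERE idea_user_id = '3'
    ) AS sub
""")

print(plain, with_coalesce, with_union)
```

Note that the UNION ALL variant still returns NULL when no table has any row for the user, so an outer COALESCE may be wanted there too.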
This type of syntax works in oracle... basically, you need an outer join.
SELECT NVL(SUM(total_share),0)
FROM `idea_comparisor_percentage`
WHERE idea_user_id (+) ='3'
I have a table with a varchar(255) field. I want to get (via a query, function, or SP) the number of occurrences of each word in a group of rows from this table.
If there are 2 rows with these fields:
"I like to eat bananas"
"I don't like to eat like a monkey"
I want to get
word | count()
---------------
like 3
eat 2
to 2
i 2
a 1
Any idea? I am using MySQL 5.2.
@Elad Meidar, I like your question and I found a solution:
SELECT SUM(total_count) as total, value
FROM (
SELECT count(*) AS total_count, REPLACE(REPLACE(REPLACE(x.value,'?',''),'.',''),'!','') as value
FROM (
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.sentence, ' ', n.n), ' ', -1) value
FROM table_name t CROSS JOIN
(
SELECT a.N + b.N * 10 + 1 n
FROM
(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a
,(SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
ORDER BY n
) n
WHERE n.n <= 1 + (LENGTH(t.sentence) - LENGTH(REPLACE(t.sentence, ' ', '')))
ORDER BY value
) AS x
GROUP BY x.value
) AS y
GROUP BY value
Here is the full working fiddle: http://sqlfiddle.com/#!2/17481a/1
First we run a query to extract all the words, as explained here by @peterm (follow his instructions if you want to customize the total number of words processed). Then we turn that into a sub-query and COUNT and GROUP BY the value of each word. Finally, another query on top of that sums the counts again after REPLACE has stripped punctuation, so that variants such as hello and hello! are counted as the same word.
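The word-splitting step relies on nesting SUBSTRING_INDEX twice: the inner call keeps the first n words, the outer call keeps the last word of that prefix. Here is a Python sketch of that mechanism; the substring_index function is a hand-written approximation of the MySQL function for non-empty delimiters, and nth_word is a made-up helper name.

```python
def substring_index(text, delim, count):
    # Approximation of MySQL's SUBSTRING_INDEX for a non-empty delimiter.
    parts = text.split(delim)
    if count >= 0:
        return delim.join(parts[:count])  # everything left of the count-th delimiter
    return delim.join(parts[count:])      # everything right of it, counted from the end


def nth_word(sentence, n):
    # Same nesting as in the query: first n words, then the last of those.
    return substring_index(substring_index(sentence, " ", n), " ", -1)


sentence = "I like to eat bananas"

# Same bound as the WHERE clause: 1 + number of spaces in the sentence.
word_count = 1 + len(sentence) - len(sentence.replace(" ", ""))
words = [nth_word(sentence, n) for n in range(1, word_count + 1)]

print(words)
```

In the query, the numbers table n plays the role of range(1, word_count + 1), producing one row per word position.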
I would recommend not to do this in SQL at all. You're loading DB with something that it isn't best at. Selecting a group of rows and doing frequency calculation on the application side will be easier to implement, will work faster and will be maintained with less issues/headaches.
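A minimal sketch of that application-side approach in Python, assuming the rows have already been fetched into a list of strings. The tokenizing rule (lowercase, letters and apostrophes only) is an assumption; adjust it to whatever counts as a word in your data.

```python
import re
from collections import Counter


def word_counts(rows):
    # Count every word across all rows, ignoring case and punctuation.
    counts = Counter()
    for text in rows:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# The two example rows from the question.
rows = ["I like to eat bananas", "I don't like to eat like a monkey"]
counts = word_counts(rows)

for word, n in counts.most_common(5):
    print(word, n)
```

Counter.most_common returns the words ordered by descending count, which corresponds to the ordering asked for in the question.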
You can try this slightly perverse way:
SELECT
(LENGTH(field) - LENGTH(REPLACE(field, 'word', ''))) / LENGTH('word') AS `count`
FROM your_table -- placeholder table name
ORDER BY `count` DESC
This query can be very slow. Also, it looks pretty ugly.
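The arithmetic behind that expression is: removing every occurrence of the word shortens the string by (number of occurrences) times (length of the word). A short Python sketch of the same calculation, with a made-up function name:

```python
def occurrences(field, word):
    # Length lost by deleting the word, divided by the word's length.
    return (len(field) - len(field.replace(word, ""))) // len(word)


print(occurrences("I don't like to eat like a monkey", "like"))
print(occurrences("unlikely", "like"))
```

The second call shows a limitation: the trick counts substrings, so "like" inside "unlikely" is counted as well.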
I think you should do it like indexing, with additional table.
Whenever you create, update, or delete a row in your original table, you should update your indexing table. That indexing table should have two columns: the word and its number of occurrences.
I think you are trying to do too much with SQL if all the words are in one field of each row. I recommend to do any text processing/counting with your application after you grab the text fields from the db.