OK, a slight variation on an earlier theme. Using the same basic idea, I want to get independent counts of the fields, then I want them grouped by a higher order breakdown.
I expanded the example by David to include a higher order column:
district_id, project_id, service_id
dist proj serv
1 1 1
1 1 2
1 1 2
1 1 3
1 1 3
1 1 4
1 2 2
1 2 4
1 2 4
1 2 5
1 2 5
2 1 1
2 2 1
2 1 6
2 2 6
2 3 6
To get a result on the total, I used a simple query with two sub-queries.
select
(select count(Distinct project_id) from GroupAndCountTest) AS "projects",
(select count(Distinct service_id) from GroupAndCountTest) as "services";
projects services
3 6
The challenge was to get this grouped within the district_id. What I wanted was:
district_id projects services
1 2 5
2 3 2
I ended up using similar sub-queries, but the only way I was able to combine them (other than using a stored function) was to re-run the sub-queries for every district. This is not a big problem here, but in my application the sub-queries use multiple tables with a substantial number of "districts", so the two sub-queries are run again for each "district", which will become increasingly inefficient.
This query works, but I would love to see something more efficient.
select t1.district_id, p1.projects, s1.services
from GroupAndCountTest as t1
join (select district_id, count(Distinct project_id) as projects
from GroupAndCountTest
group by district_id) AS p1
on p1.district_id=t1.district_id
join (select district_id, count(Distinct service_id) as services
from GroupAndCountTest
group by district_id) as s1
on s1.district_id=t1.district_id
group by t1.district_id;
Thanks.
PS: If you want to experiment, you can create the table with:
CREATE TABLE `GroupAndCountTest` (
`district_id` int(5) DEFAULT NULL,
`project_id` int(5) DEFAULT NULL,
`service_id` int(5) DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
insert into `GroupAndCountTest`(`district_id`,`project_id`,`service_id`)
values (1,1,1),(1,1,2),(1,1,2),(1,1,3),(1,1,3),(1,1,4),(1,2,2),(1,2,4),
(1,2,4),(1,2,5),(1,2,5),(2,1,1),(2,2,1),(2,1,6),(2,2,6),(2,3,6);
select district_id,
count(distinct(product_id)) projects,
count(distinct(service_id)) services
from MyTable group by district_id;
where MyTable contains district_id, product_id, service_id columns
You're making this way harder than it needs to be. You don't need subqueries for this, just a GROUP BY.
select district_id, count(distinct project_id), count(distinct service_id)
from GroupAndCountTest
group by district_id
SELECT district_id, count( DISTINCT (
project_id
) ) projects, count( DISTINCT (
service_id
) ) services
FROM GroupAndCountTest
GROUP BY district_id
Someone beat me to it :(
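To check the single GROUP BY suggested in the answers above against the sample rows from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (the engine is my assumption, not part of the thread); the table name, columns and rows are taken from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; that substitution is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE GroupAndCountTest (district_id INT, project_id INT, service_id INT)"
)

# Sample rows copied from the INSERT statement in the question.
rows = [
    (1, 1, 1), (1, 1, 2), (1, 1, 2), (1, 1, 3), (1, 1, 3), (1, 1, 4),
    (1, 2, 2), (1, 2, 4), (1, 2, 4), (1, 2, 5), (1, 2, 5),
    (2, 1, 1), (2, 2, 1), (2, 1, 6), (2, 2, 6), (2, 3, 6),
]
conn.executemany("INSERT INTO GroupAndCountTest VALUES (?, ?, ?)", rows)

# One pass over the table: both distinct counts are computed per district.
grouped = conn.execute(
    """
    SELECT district_id,
           COUNT(DISTINCT project_id) AS projects,
           COUNT(DISTINCT service_id) AS services
    FROM GroupAndCountTest
    GROUP BY district_id
    ORDER BY district_id
    """
).fetchall()

for row in grouped:
    print(row)
```

With these rows, district 1 has projects {1, 2} and services {1, 2, 3, 4, 5}, while district 2 has projects {1, 2, 3} but only services {1, 6}, so the service count for district 2 comes out as 2.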
I want to join columns from multiple tables to one column, in my case column 'battery_value' and 'technical_value' into column 'value'. I want to fetch data for only given category_ids, but because of UNION, I get data from other tables as well.
I have 4 tables:
Table: car
car_id model_name
1 e6
Table: battery
battery_category_id car_id battery_value
1 1 125 kW
Table: technical_data
technical_category_id car_id technical_value
1 1 5
3 1 2008
Table: categories
category_id category_name category_type
1 engine power battery
1 seats technical
3 release year technical
From searching, people are suggesting that I use union to join these columns. My query now looks like this:
SELECT CARS.car_id,
category_id,
CATEGORIES.category_name,
value
FROM CARS
left join (SELECT BATTERY.battery_category_id AS category_id,
BATTERY.car_id AS car_id,
BATTERY.value AS value
FROM BATTERY
WHERE `BATTERY`.`battery_category_id` IN (1)
UNION
SELECT TECHNICAL_DATA.technical_category_id AS category_id,
TECHNICAL_DATA.car_id AS car_id,
TECHNICAL_DATA.value AS value
FROM TECHNICAL_DATA
WHERE `TECHNICAL_DATA`.`technical_category_id` IN (3))
tt
ON CARS.car_id = tt.car_id
left join CATEGORIES
ON category_id = CATEGORIES.id
So the result I want is this, because I only want to get the data where category_id 1 is in battery table:
car_id category_id category_name value
1 1 engine power 125 kW
1 3 release year 2008
But with the query above I get this instead. Category_id 1 from the technical table is included, which is not what I want:
car_id category_id category_name value
1 1 engine power 125 kW
1 1 seats 125 kW
1 3 release year 2008
How can I exclude the 'seats' row?
For the results you want, I don't see why the cars table is needed. Then, you seem to need an additional key for the join to categories based on which table it is referring to.
So, I suggest:
SELECT tt.*, c.category_name
FROM ((SELECT b.battery_category_id AS category_id,
b.car_id AS car_id, b.value AS value,
'battery' as which
FROM BATTERY b
WHERE b.battery_category_id IN (1)
) UNION ALL
(SELECT td.technical_category_id AS category_id,
td.car_id AS car_id, td.value AS value,
'technical' as which
FROM TECHNICAL_DATA td
WHERE td.technical_category_id IN (3)
)
) tt LEFT JOIN
CATEGORIES c
ON c.id = tt.category_id AND
c.category_type = tt.which;
That said, you seem to have a problem with your data model, if the join to categories requires "hidden" data such as the type. However, that is outside the scope of the question.
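The idea of tagging each branch of the UNION with its source and joining on that tag can be tried out on the sample rows from the question. This is only a sketch under assumptions: it uses Python's built-in sqlite3 module instead of MySQL, it uses the column names shown in the table listings (battery_value, technical_value, category_id) rather than the value and id used in the queries, and it stores all values as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript(
    """
    CREATE TABLE battery (battery_category_id INT, car_id INT, battery_value TEXT);
    CREATE TABLE technical_data (technical_category_id INT, car_id INT, technical_value TEXT);
    CREATE TABLE categories (category_id INT, category_name TEXT, category_type TEXT);

    INSERT INTO battery VALUES (1, 1, '125 kW');
    INSERT INTO technical_data VALUES (1, 1, '5'), (3, 1, '2008');
    INSERT INTO categories VALUES
        (1, 'engine power', 'battery'),
        (1, 'seats', 'technical'),
        (3, 'release year', 'technical');
    """
)

# Each branch adds a literal 'which' column naming its source table, so the
# join to categories can match on (category_id, category_type) together.
result = conn.execute(
    """
    SELECT tt.car_id, tt.category_id, c.category_name, tt.value
    FROM (
        SELECT battery_category_id AS category_id, car_id,
               battery_value AS value, 'battery' AS which
        FROM battery
        WHERE battery_category_id IN (1)
        UNION ALL
        SELECT technical_category_id, car_id,
               technical_value, 'technical'
        FROM technical_data
        WHERE technical_category_id IN (3)
    ) AS tt
    LEFT JOIN categories AS c
           ON c.category_id = tt.category_id
          AND c.category_type = tt.which
    ORDER BY tt.category_id
    """
).fetchall()

for row in result:
    print(row)
```

Because the battery row can only match a category of type 'battery', the 'seats' category (id 1, type 'technical') no longer pairs with it.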
I have a database with the following tables: Students, Classes, and link_student_class. Students contains the information about the registered students, and Classes contains the information about the classes. As every student can attend multiple classes and every class can be attended by multiple students, I added a linking table for the mapping between students and classes.
Linking-Table
id | student_id | class_id
1 1 1
2 1 2
3 2 1
4 3 3
In this table both student_id as well as class_id will appear multiple times!
What I am looking for is an SQL query that returns the information about all students (like in 'SELECT * FROM students') who are not attending a certain class (given by its id).
I tried the following SQL-query
SELECT * FROM `students`
LEFT JOIN(
SELECT * FROM link_student_class
WHERE class_id = $class_id
)
link_student_class ON link_student_class.student_id = students.student_id
Here $class_id is the id of the class whose students I want to exclude.
In the result, the students I want to include and those I want to exclude differ in the value of the column 'class_id'.
Those to be included have the value NULL, whereas those I want to exclude have a numerical value.
NOT EXISTS comes to mind:
select s.*
from students s
where not exists (select 1
from link_student_class lsc
where lsc.student_id = s.student_id and
lsc.class_id = ?
);
The ? is a placeholder for the parameter that provides the class.
You should check for a NULL link_student_class.student_id:
SELECT *
FROM `students`
LEFT JOIN(
SELECT *
FROM link_student_class
WHERE class_id = $class_id
) link_student_class ON link_student_class.student_id = students.student_id
where link_student_class.student_id is null
Or also a NOT IN predicate:
WITH
stud_class(id,stud_id,class_id) AS (
SELECT 1, 1,1
UNION ALL SELECT 2, 1,2
UNION ALL SELECT 3, 2,1
UNION ALL SELECT 4, 3,3
)
,
stud(stud_id,fname,lname) AS (
SELECT 1,'Arthur','Dent'
UNION ALL SELECT 2,'Ford','Prefect'
UNION ALL SELECT 3,'Tricia','McMillan'
UNION ALL SELECT 4,'Zaphod','Beeblebrox'
)
SELECT
s.*
FROM stud s
WHERE stud_id NOT IN (
SELECT
stud_id
FROM stud_class
WHERE class_id= 2
);
-- out stud_id | fname | lname
-- out ---------+--------+------------
-- out 3 | Tricia | McMillan
-- out 4 | Zaphod | Beeblebrox
-- out 2 | Ford | Prefect
-- out (3 rows)
-- out
-- out Time: First fetch (3 rows): 9.516 ms. All rows formatted: 9.550 ms
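The three approaches suggested above (NOT EXISTS, LEFT JOIN with an IS NULL check, and NOT IN) can be compared side by side. The sketch below assumes Python's built-in sqlite3 module in place of the original database, reuses the link rows from the question and the student names from the NOT IN answer, and uses class 2 as the class to exclude. The final step, where a link row with a NULL student_id is added, is my own hypothetical addition to show where NOT IN behaves differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in (assumption)
conn.executescript(
    """
    CREATE TABLE students (student_id INT, fname TEXT, lname TEXT);
    CREATE TABLE link_student_class (id INT, student_id INT, class_id INT);
    INSERT INTO students VALUES
        (1, 'Arthur', 'Dent'), (2, 'Ford', 'Prefect'),
        (3, 'Tricia', 'McMillan'), (4, 'Zaphod', 'Beeblebrox');
    INSERT INTO link_student_class VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 3);
    """
)

queries = {
    "not_exists": """
        SELECT s.student_id FROM students s
        WHERE NOT EXISTS (SELECT 1 FROM link_student_class lsc
                          WHERE lsc.student_id = s.student_id AND lsc.class_id = ?)
        ORDER BY s.student_id""",
    "left_join": """
        SELECT s.student_id FROM students s
        LEFT JOIN link_student_class lsc
               ON lsc.student_id = s.student_id AND lsc.class_id = ?
        WHERE lsc.student_id IS NULL
        ORDER BY s.student_id""",
    "not_in": """
        SELECT s.student_id FROM students s
        WHERE s.student_id NOT IN (SELECT student_id FROM link_student_class
                                   WHERE class_id = ?)
        ORDER BY s.student_id""",
}

def run_all(class_id):
    """Run every variant for one class id and collect the student ids."""
    return {name: [r[0] for r in conn.execute(sql, (class_id,))]
            for name, sql in queries.items()}

before = run_all(2)

# Hypothetical extra row: a link to class 2 whose student_id is NULL.
conn.execute("INSERT INTO link_student_class VALUES (5, NULL, 2)")
after = run_all(2)

print(before)
print(after)
```

On the clean data all three variants return students 2, 3 and 4. Once the subquery can return a NULL, the NOT IN variant returns no rows at all, while NOT EXISTS and the LEFT JOIN are unaffected, which is one reason NOT EXISTS is often preferred when the linking column is nullable.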
I am having trouble understanding how to solve a seemingly simple problem of sorting results.
I want to find out how many fruits each user has in common with the user with ID 1, count the matches per user, and display the results in descending order of that count.
users:
1 jack
2 john
3 jim
fruits:
id, title
1 apple
2 banana
3 orange
4 pear
5 mango
relations: 2 indexes (user_id, fruit_id) and (fruit_id, user_id)
user_id, fruit_id
1 1
1 2
1 5
2 1
2 2
2 4
3 3
3 1
expected results: (comparing with Jack's favourite fruits (user_id=1))
user_id, count
1 3
2 2
3 1
Query:
SELECT user_id, COUNT(*) AS count FROM relations
WHERE fruit_id IN (SELECT fruit_id FROM relations WHERE user_id=1)
GROUP BY user_id
HAVING count>=2
More "optimized" query:
SELECT user_id, COUNT(*) AS count FROM relations r
WHERE EXISTS (SELECT 1 FROM relations WHERE user_id=1 and r.fruit_id=fruit_id)
GROUP BY user_id
HAVING count>=2
2 is the minimum number of matches. (required for the future)
explain:
id select_type table type possible_keys key key_len ref rows Extra
1 PRIMARY r index NULL uid 8 NULL 15 Using where; Using index
2 DEPENDENT SUBQUERY relations eq_ref xox,uid xox 8 r.relations,const 1 Using where; Using index
Everything is working fine, until I try to use ORDER BY count DESC
Then I see: Using temporary; Using filesort
I don't want to use temporary tables or filesort, because in the future the database will be under high load.
I know this is how SQL is defined and how it operates, but I cannot figure out how to do it any other way, without temporary tables and filesort.
I need to show the users who have the most matches first.
Please, help me out.
UPDATE:
I did some tests with the query from Walker Farrow (which still uses a filesort).
20,000 rows: avg 0.05 sec.
120,000 rows: 0.20 sec.
1,100,000 rows: 2.9 sec.
Disappointing results.
It would be possible to change the table structure, but with this kind of counting and sorting I don't know how.
Are there any suggestions on how this can be done?
Probably the best way to do this would be to create a subquery and then order by in the outer-query, something like this:
select *
from (
SELECT user_id, COUNT(*) AS count FROM relations r
WHERE EXISTS (SELECT 1 FROM relations WHERE user_id=1 and r.fruit_id=fruit_id)
GROUP BY user_id
HAVING count(*)>=2
) x
order by count desc
Also, I don't know why you need to add exists. Can you just say the following:
select *
from (
SELECT user_id, COUNT(*) AS count FROM relations r
WHERE user_id=1
GROUP BY user_id
HAVING count(*)>=2
) x
order by count desc
?
I am not sure, maybe I am missing something. Hope that helps!
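The two queries in this answer can be run against the sample relations rows from the question to see what each one returns. This is a sketch only: it assumes Python's built-in sqlite3 module rather than MySQL, renames the count alias to matches (my choice, to avoid clashing with the function name), and says nothing about filesort or performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE relations (user_id INT, fruit_id INT)")
conn.executemany(
    "INSERT INTO relations VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 5), (2, 1), (2, 2), (2, 4), (3, 3), (3, 1)],
)

# Variant with EXISTS: counts, for every user, the fruits shared with user 1.
with_exists = conn.execute(
    """
    SELECT *
    FROM (
        SELECT r.user_id, COUNT(*) AS matches
        FROM relations r
        WHERE EXISTS (SELECT 1 FROM relations mine
                      WHERE mine.user_id = 1 AND mine.fruit_id = r.fruit_id)
        GROUP BY r.user_id
        HAVING COUNT(*) >= 2
    ) x
    ORDER BY matches DESC, user_id
    """
).fetchall()

# Variant without EXISTS: only looks at user 1's own rows.
without_exists = conn.execute(
    """
    SELECT *
    FROM (
        SELECT user_id, COUNT(*) AS matches
        FROM relations
        WHERE user_id = 1
        GROUP BY user_id
        HAVING COUNT(*) >= 2
    ) x
    ORDER BY matches DESC
    """
).fetchall()

print(with_exists)
print(without_exists)
```

The variant without EXISTS can only ever return user 1, so the EXISTS (or an equivalent IN or self-join) is what brings the other users into the comparison.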
I have a MySQL table with actions and question_ids. Each action comes with a score like this:
ACTION | SCORE
downvote_question | -1
upvote_question | +1
in_cardbox | +2
I want to query for the question with the highest score but I can't figure it out.
http://sqlfiddle.com/#!2/84e26/15
So far my query is:
SELECT count(*), l1.question_id, l1.action
FROM `log` l1
GROUP BY l1.question_id, l1.action
which gives me every question_id with all its accumulated actions.
What I want is this:
QUESTION_ID | SCORE
2 | 5
1 | 4
3 | 1
4 | 1
5 | 1
I can't figure it out - I probably need subqueries, JOINS or UNIONS...
Maybe you can try this one.
SELECT a.question_id, sum(b.score) totalScore
FROM `log` a INNER JOIN scoreValue b
on a.`action` = b.actionname
group by a.question_id
ORDER BY totalScore DESC
SQLFiddle Demo
You should replace count(*) with sum(l1.score), because sum adds up all the values within each group:
SELECT sum(l1.score), l1.question_id, l1.action
FROM `log` l1
GROUP BY l1.question_id, l1.action
With constant scores, this works on SQL Fiddle (grouping by question):
SELECT
sum(
CASE WHEN l1.action = 'downvote_question' THEN -1
WHEN l1.action = 'upvote_question' THEN 1
ELSE 2 END
) score,
l1.question_id
FROM `log` l1
GROUP BY l1.question_id
SELECT sum(SCORE), l1.question_id, l1.action
FROM `log` l1
GROUP BY l1.question_id, l1.action
Is this what you want?
Update: in your code on the fiddle there is no such column as score, but I think it won't be a problem to create a table with action | score and join it to get sum(score).
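The lookup-table join and the CASE expression suggested above should give the same totals. The sketch below checks that on a few log rows that I made up for illustration (the fiddle's real data is not reproduced in this thread). It assumes Python's built-in sqlite3 module instead of MySQL, and a scoreValue table with actionname and score columns as named in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript(
    """
    CREATE TABLE log (question_id INT, "action" TEXT);
    CREATE TABLE scoreValue (actionname TEXT, score INT);

    INSERT INTO scoreValue VALUES
        ('downvote_question', -1), ('upvote_question', 1), ('in_cardbox', 2);

    -- Hypothetical log rows, invented for this example only.
    INSERT INTO log VALUES
        (1, 'upvote_question'), (1, 'in_cardbox'), (1, 'upvote_question'),
        (2, 'in_cardbox'), (2, 'in_cardbox'), (2, 'upvote_question'),
        (3, 'upvote_question'), (3, 'upvote_question'), (3, 'downvote_question');
    """
)

# Variant 1: scores live in a lookup table and are summed through a join.
joined = conn.execute(
    """
    SELECT a.question_id, SUM(b.score) AS totalScore
    FROM log a
    INNER JOIN scoreValue b ON a."action" = b.actionname
    GROUP BY a.question_id
    ORDER BY totalScore DESC, a.question_id
    """
).fetchall()

# Variant 2: scores are hard-coded in a CASE expression.
cased = conn.execute(
    """
    SELECT l1.question_id,
           SUM(CASE WHEN l1."action" = 'downvote_question' THEN -1
                    WHEN l1."action" = 'upvote_question' THEN 1
                    ELSE 2 END) AS score
    FROM log l1
    GROUP BY l1.question_id
    ORDER BY score DESC, l1.question_id
    """
).fetchall()

print(joined)
print(cased)
```

The lookup table keeps the scores in data, so a new action type only needs a new row; the CASE version needs no extra table but treats every unlisted action as worth 2.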
Example:
SELECT `film_id`,COUNT(film_id) AS COUNT FROM films_genres AS FilmsGenre
WHERE genre_id In (4)
GROUP BY film_id
HAVING COUNT = 1
return:
film_id | COUNT
7 1
6 1
But I want it to return:
film_id
7
6
How do I return only 1 column?
To do that, just move your COUNT(film_id) out of the SELECT list and into the HAVING clause, which will do the work for you.
SELECT `film_id` FROM films_genres AS FilmsGenre
WHERE genre_id In (4)
GROUP BY film_id
HAVING COUNT(film_id) = 1
This isn't phrased as a CakePHP question, although it's tagged as such.
However, in CakePHP:
$this->FilmGenre->find('list',array('fields'=>array('film_id','film_id','anime_id')));
Or make use of a derived table:
SELECT film_id from
(
SELECT `film_id`,COUNT(film_id) AS COUNT FROM films_genres AS FilmsGenre
WHERE genre_id In (4)
GROUP BY film_id
HAVING COUNT = 1
) as t
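Both suggestions above rely on the fact that an aggregate can be used as a filter without appearing in the output. Here is a minimal sketch of the two forms, assuming Python's built-in sqlite3 module instead of MySQL and a handful of films_genres rows that I invented for illustration (the question does not show its data). The alias cnt is also my own choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.execute("CREATE TABLE films_genres (film_id INT, genre_id INT)")

# Hypothetical rows: film 9 is linked to genre 4 twice, film 8 not at all.
conn.executemany(
    "INSERT INTO films_genres VALUES (?, ?)",
    [(6, 4), (7, 4), (7, 2), (8, 1), (9, 4), (9, 4)],
)

# Form 1: the aggregate appears only in HAVING, so one column is returned.
having_only = conn.execute(
    """
    SELECT film_id
    FROM films_genres
    WHERE genre_id IN (4)
    GROUP BY film_id
    HAVING COUNT(film_id) = 1
    ORDER BY film_id
    """
).fetchall()

# Form 2: compute the count in a derived table, then select only film_id.
derived = conn.execute(
    """
    SELECT film_id
    FROM (
        SELECT film_id, COUNT(film_id) AS cnt
        FROM films_genres
        WHERE genre_id IN (4)
        GROUP BY film_id
        HAVING COUNT(film_id) = 1
    ) AS t
    ORDER BY film_id
    """
).fetchall()

print(having_only)
print(derived)
```

The first form is shorter; the derived table is mainly useful when the inner query has to stay as it is and only its output needs trimming.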