MySQL: COUNT with implicit GROUP BY - mysql

I've been trying to guess how to solve my problem for some time and I cannot seem to find a solution, so I come to you, experts.
What I've got
A MySQL table with the following structure and values (as an example):
+----+---------+----------------+-----------------+--------------+
| id | item_id | attribute_name | attribute_value | deleted_date |
+----+---------+----------------+-----------------+--------------+
| 1 | 2 | action | call | NULL |
| 2 | 2 | person | Joseph | NULL |
| 3 | 2 | action | fault | NULL |
| 4 | 2 | otherattr | otherval | NULL |
| 5 | 5 | action | call | NULL |
| 6 | 5 | person | Mike | NULL |
| 7 | 5 | action | sprint | NULL |
| 8 | 8 | action | call | NULL |
| 9 | 8 | person | Joseph | NULL |
| 10 | 8 | action | block | NULL |
| 11 | 8 | action | call | NULL |
+----+---------+----------------+-----------------+--------------+
What I need
I'd like a query to return me how many items (item_id) have at least one attribute_name with 'action' and with attribute_value as 'call', grouped by 'person', but only counting one of them.
So, if - like in the example, at ids 8 and 11 - there is an item_id with two "action" = "call", only COUNT one of them.
The query should return something like this:
+--------+--------------+
| person | action_calls |
+--------+--------------+
| Joseph | 2 |
| Mike | 1 |
+--------+--------------+
The problem
The problem is that I don't know how to do this in a simple way that doesn't cause a big performance hit, as this query will search through a lot of rows and, in some cases, return a lot of them too.
The only thing that comes to my mind is with nested and nested queries, and I'd like to avoid that.
If I use a COUNT(DISTINCT), it only returns '1' for 'Joseph', because the value is always 'call', and if I GROUP BY b.item_id, it returns two rows for Joseph (and, in this case too, it counts both 'call' attributes, so it isn't the solution either).
What I've tried
The query that I've tried is the following:
SELECT a.attribute_value AS person, COUNT(b.attribute_value) AS action_calls
FROM `er_item_attributes` a, `er_item_attributes` b
WHERE a.attribute_name = 'person'
AND b.item_id IN (SELECT DISTINCT item_id FROM er_item_parents WHERE parent_id IN (1234,4567))
AND b.item_id = a.item_id
AND b.attribute_name = 'action'
AND b.attribute_value = 'call'
AND b.deleted_date IS NULL
GROUP BY a.attribute_value, b.attribute_name
Additional information
The item_id, as you can see, is also chosen through an inner WHERE clause, because the valid ones are listed in another table (a parent-child table). The parent_id numbers are just an example and are not relevant.
To sum up
How can I make COUNT in MySQL behave like a COUNT with GROUP BY, without nesting SELECTs that could hurt performance?
If any further information is needed, leave a comment and I will try to add it.
Also, any recommendations on another way to query the information needed to improve performance will be welcome.
Thank you everyone for your time and help!
Kind regards.

Try this!
SELECT attribute_value AS person, COUNT(*) FROM `stack_1239`
WHERE item_id IN (
SELECT item_id FROM `stack_1239` WHERE attribute_name = 'action' AND attribute_value = 'call'
)
AND attribute_name = 'person'
GROUP BY person;
:)

DROP TABLE IF EXISTS eav_hell;
CREATE TABLE eav_hell
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,entity INT NOT NULL
,attribute VARCHAR(20) NOT NULL
,value VARCHAR(20) NOT NULL
);
INSERT INTO eav_hell
VALUES
( 1 ,2 ,'action','call'),
( 2 ,2 ,'person','Joseph'),
( 3 ,2 ,'action','fault'),
( 4 ,2 ,'otherattr','otherval'),
( 5 ,5 ,'action','call'),
( 6 ,5 ,'person','Mike'),
( 7 ,5 ,'action','sprint'),
( 8 ,8 ,'action','call'),
( 9 ,8 ,'person','Joseph'),
(10 ,8 ,'action','block'),
(11 ,8 ,'action','call');
SELECT e1.entity
, e1.value person
, e2.value action
, COUNT(*)
FROM eav_hell e1
LEFT
JOIN eav_hell e2
ON e2.entity = e1.entity
AND e2.attribute = 'action'
AND e2.value = 'call'
WHERE e1.attribute = 'person'
GROUP
BY entity
, person
, action;
+--------+--------+--------+----------+
| entity | person | action | COUNT(*) |
+--------+--------+--------+----------+
| 2 | Joseph | call | 1 |
| 5 | Mike | call | 1 |
| 8 | Joseph | call | 2 |
+--------+--------+--------+----------+
Edit:
SELECT e1.value person
, e2.value action
, COUNT(DISTINCT e1.entity)
FROM eav_hell e1
LEFT
JOIN eav_hell e2
ON e2.entity = e1.entity
AND e2.attribute = 'action'
AND e2.value = 'call'
WHERE e1.attribute = 'person'
GROUP
BY person
, action;
+--------+--------+---------------------------+
| person | action | COUNT(DISTINCT e1.entity) |
+--------+--------+---------------------------+
| Joseph | call | 2 |
| Mike | call | 1 |
+--------+--------+---------------------------+
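For the table layout from the question (including deleted_date), the same COUNT(DISTINCT ...) idea can be sketched with an inner self-join, so people without any 'call' action drop out of the result. This is only a sketch under assumptions: it runs through Python's sqlite3 module as a stand-in for MySQL, it leaves out the er_item_parents filter, and the helper name count_calls_per_person is made up for illustration.

```python
import sqlite3

# Sample rows from the question: (id, item_id, attribute_name, attribute_value)
ROWS = [
    (1, 2, "action", "call"), (2, 2, "person", "Joseph"),
    (3, 2, "action", "fault"), (4, 2, "otherattr", "otherval"),
    (5, 5, "action", "call"), (6, 5, "person", "Mike"),
    (7, 5, "action", "sprint"), (8, 8, "action", "call"),
    (9, 8, "person", "Joseph"), (10, 8, "action", "block"),
    (11, 8, "action", "call"),
]

# One row per person; each item is counted once even if it has
# several 'action' = 'call' attributes (ids 8 and 11).
QUERY = """
SELECT p.attribute_value         AS person,
       COUNT(DISTINCT a.item_id) AS action_calls
FROM er_item_attributes p
JOIN er_item_attributes a
  ON a.item_id = p.item_id
 AND a.attribute_name = 'action'
 AND a.attribute_value = 'call'
 AND a.deleted_date IS NULL
WHERE p.attribute_name = 'person'
GROUP BY p.attribute_value
ORDER BY p.attribute_value
"""


def count_calls_per_person():
    """Hypothetical helper: load the sample data and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE er_item_attributes ("
        " id INTEGER PRIMARY KEY,"
        " item_id INTEGER NOT NULL,"
        " attribute_name TEXT NOT NULL,"
        " attribute_value TEXT NOT NULL,"
        " deleted_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO er_item_attributes VALUES (?, ?, ?, ?, NULL)", ROWS
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(count_calls_per_person())
```

The SQL itself uses nothing SQLite-specific, so it should carry over to MySQL unchanged; whether it performs well there depends on the indexes available on item_id, attribute_name and attribute_value.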

Related

Remove duplication of rows while joining multiple tables - Mysql

media
id | title | ...
1 | a song |
2 | a video |
media setting
media_id | setting_id | chosen_option
1 | 1 | 2
1 | 2 | 3
2 | 1 | 1
2 | 2 | 4
So I have a media table with various information about user-uploaded media files. Each file has two settings: 1. privacy (option 1 for public, option 2 for private) and 2. age-safety (option 3 is for all, option 4 is for adults only). Now, when an (adult) user searches for media, suppose with a title that starts with a.....
Here is my query:
SELECT
m.id AS media_id, m.title AS media_title,
ms.setting_id AS setting, ms.chosen_option as opt
FROM media m
LEFT JOIN media_setting ms ON m.id = ms.media_id
WHERE m.title LIKE 'a%'
It gives me an output with duplicate rows, one row for each setting, which I don't want.
So what I want is:
media_id | media_title | setting_1 | option_for_1 | setting_2 | option_for_2
1 | a song | 1 | 2 | 2 | 3
2 | a video | 1 | 1 | 2 | 4
How can I achieve this? Thanks.
As per comments, I'd stick with the query you've got, and resolve the display issues in application code.
But anyway, here's a standard (and non-dynamic) approach in sql...
CREATE TABLE media
(id SERIAL PRIMARY KEY
,title VARCHAR(20) NOT NULL
);
INSERT INTO media VALUES
(1,'a song'),
(2,'a video');
DROP TABLE IF EXISTS media_setting;
CREATE TABLE media_setting
(media_id INT NOT NULL
,setting_id INT NOT NULL
,chosen_option INT NOT NULL
,PRIMARY KEY(media_id,setting_id)
);
INSERT INTO media_setting VALUES
(1,1,2),
(1,2,3),
(2,1,1),
(2,2,4);
SELECT m.*
, MAX(CASE WHEN s.setting_id = 1 THEN chosen_option END) option_for_1
, MAX(CASE WHEN s.setting_id = 2 THEN chosen_option END) option_for_2
FROM media m
LEFT
JOIN media_setting s
ON s.media_id = m.id
GROUP
BY m.id;
+----+---------+--------------+--------------+
| id | title | option_for_1 | option_for_2 |
+----+---------+--------------+--------------+
| 1 | a song | 2 | 3 |
| 2 | a video | 1 | 4 |
+----+---------+--------------+--------------+

MySQL Performance - LEFT JOIN / HAVING vs Sub Query

Which of the following query styles is better for performance?
Basically, I'm returning many related records in one row with GROUP_CONCAT and I need to filter by another join on the GROUP_CONCAT value. I will also need to add many more joins/GROUP_CONCATs/HAVINGs or subqueries in order to filter by more related values. I saw that, officially, LEFT JOIN is faster, but I wonder if the GROUP_CONCAT and HAVING throw that off.
(This is a very simplified example, the actual data has many more attributes and it's reading from a Drupal MySQL architecture)
Thanks!
Main Records
+----+-----------------+----------------+-----------+-----------+
| id | other_record_id | value | type | attribute |
+----+-----------------+----------------+-----------+-----------+
| 1 | 0 | Red Building | building | |
| 2 | 1 | ACME Plumbing | attribute | company |
| 3 | 1 | east_side | attribute | location |
| 4 | 0 | Green Building | building | |
| 5 | 4 | AJAX Heating | attribute | company |
| 6 | 4 | west_side | attribute | location |
| 7 | 0 | Blue Building | building | |
| 8 | 7 | ZZZ Mattresses | attribute | company |
| 9 | 7 | south_side | attribute | location |
+----+-----------------+----------------+-----------+-----------+
location_translations
+-------------+------------+
| location_id | value |
+-------------+------------+
| 1 | east_side |
| 2 | west_side |
| 3 | south_side |
+-------------+------------+
locations
+----+--------------------+
| id | name |
+----+--------------------+
| 1 | Arts District |
| 2 | Warehouse District |
| 3 | Suburb |
+----+--------------------+
Query #1
SELECT
a.id,
GROUP_CONCAT(
IF(b.attribute = 'company', b.value, NULL)
) AS company_value,
GROUP_CONCAT(
IF(b.attribute = 'location', b.value, NULL)
) AS location_value,
GROUP_CONCAT(
IF(b.attribute = 'location', lt.location_id, NULL)
) AS location_id
FROM
records a
LEFT JOIN records b ON b.other_record_id = a.id AND b.type = 'attribute'
LEFT JOIN location_translations lt ON lt.value = b.value
WHERE a.type = 'building'
GROUP BY a.id
HAVING location_id = 2
Query #2
SELECT temp.* FROM (
SELECT
a.id,
GROUP_CONCAT(
IF(b.attribute = 'company', b.value, NULL)
) AS company_value,
GROUP_CONCAT(
IF(b.attribute = 'location', b.value, NULL)
) AS location_value
FROM
records a
LEFT JOIN records b ON b.other_record_id = a.id AND b.type = 'attribute'
WHERE a.type = 'building'
GROUP BY a.id
) as temp
LEFT JOIN location_translations lt ON lt.value = temp.location_value
WHERE location_id = 2
Using JOIN is preferable in most cases, because it helps the optimizer work out which indexes it can use. In your case, query #1 looks good enough.
Of course, this only works if the tables have indexes. Check that table records has indexes on the id, other_record_id, value and type columns, and that table location_translations has one on value.
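The index advice above can be sketched as follows. Everything here is an assumption made for illustration: the index names are invented, Python's sqlite3 module stands in for MySQL, and the filter uses MAX(CASE ...) instead of the GROUP_CONCAT(IF(...)) from Query #1 so that HAVING compares a number rather than a concatenated string.

```python
import sqlite3

SCHEMA = """
CREATE TABLE records (
    id INTEGER PRIMARY KEY,
    other_record_id INTEGER NOT NULL,
    value TEXT NOT NULL,
    type TEXT NOT NULL,
    attribute TEXT NOT NULL
);
CREATE TABLE location_translations (
    location_id INTEGER NOT NULL,
    value TEXT NOT NULL
);
-- Hypothetical index names covering the join and filter columns
CREATE INDEX idx_records_parent ON records (other_record_id, type);
CREATE INDEX idx_records_type ON records (type);
CREATE INDEX idx_translations_value ON location_translations (value);
"""

RECORDS = [
    (1, 0, "Red Building", "building", ""),
    (2, 1, "ACME Plumbing", "attribute", "company"),
    (3, 1, "east_side", "attribute", "location"),
    (4, 0, "Green Building", "building", ""),
    (5, 4, "AJAX Heating", "attribute", "company"),
    (6, 4, "west_side", "attribute", "location"),
    (7, 0, "Blue Building", "building", ""),
    (8, 7, "ZZZ Mattresses", "attribute", "company"),
    (9, 7, "south_side", "attribute", "location"),
]
TRANSLATIONS = [(1, "east_side"), (2, "west_side"), (3, "south_side")]

# Same shape as Query #1: join, group per building, filter after grouping.
QUERY = """
SELECT a.id,
       GROUP_CONCAT(CASE WHEN b.attribute = 'company' THEN b.value END)
           AS company_value,
       GROUP_CONCAT(CASE WHEN b.attribute = 'location' THEN b.value END)
           AS location_value,
       MAX(CASE WHEN b.attribute = 'location' THEN lt.location_id END)
           AS location_id
FROM records a
LEFT JOIN records b
       ON b.other_record_id = a.id AND b.type = 'attribute'
LEFT JOIN location_translations lt
       ON lt.value = b.value
WHERE a.type = 'building'
GROUP BY a.id
HAVING MAX(CASE WHEN b.attribute = 'location' THEN lt.location_id END) = 2
"""


def buildings_in_location_2():
    """Hypothetical helper: build the sample tables and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?, ?)", RECORDS)
    conn.executemany(
        "INSERT INTO location_translations VALUES (?, ?)", TRANSLATIONS
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(buildings_in_location_2())
```

On MySQL itself, running EXPLAIN on both query styles against real data is the more reliable way to see which indexes actually get used.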

MySQL - How get this result?

I have two tables.
work:
+----+----------+
| id | position |
+----+----------+
| 1 | 1 |
| 2 | 2 |
+----+----------+
content:
+----+---------+------+-------------+
| id | work_id | name | translation |
+----+---------+------+-------------+
| 1 | 1 | Kot | 1 |
| 2 | 1 | Cat | 2 |
| 3 | 2 | Ptak | 1 |
| 4 | 2 | Bird | 2 |
| 5 | 2 | Ssss | 3 |
+----+---------+------+-------------+
I want to get result like this:
+----+------+----------+
| id | name | sortName |
+----+------+----------+
| 1 | Kot | NULL |
| 1 | Cat | NULL |
| 2 | Ptak | Ssss |
| 2 | Bird | Ssss |
+----+------+----------+
My non-working query is here:
select
w.id,
c.name,
cSort.name as sortName
from
work w
LEFT JOIN
content c
ON
(w.id=c.work_id)
LEFT JOIN
content cSort
ON
(w.id=cSort.work_id)
WHERE
c.translation IN(1,2) AND
cSort.translation=3
ORDER BY
sortName
For each work I want to get at least one translation, and the second one if it exists (translation=1 always exists). For every row I also want a special column with the translation used for sorting, but this translation does not always exist for a given work.id. In this example I want to sort works by translation=3.
Sorry for my English, it is not fluent. Any ideas?
Best regards
/*
create table work ( id int, position int);
insert into work values
( 1 , 1 ),
( 2 , 2 );
create table content(id int, work_id int, name varchar(4), translation int);
insert into content values
( 1 , 1 , 'Kot' , 1),
( 2 , 1 , 'Cat' , 2),
( 3 , 2 , 'Ptak' , 1),
( 4 , 2 , 'Bird' , 2),
( 5 , 2 , 'Ssss' , 3);
*/
select w.id,c.name,(select s.name from content s where s.work_id = w.id and s.translation = 3) sortname
from work w
join content c on w.id = c.work_id
where c.translation <> 3;
result
+------+------+----------+
| id | name | sortname |
+------+------+----------+
| 1 | Kot | NULL |
| 1 | Cat | NULL |
| 2 | Ptak | Ssss |
| 2 | Bird | Ssss |
+------+------+----------+
So translation is also a work_id and you consider translation = 3 a translation in your example and translation <> 3 an original. You want to join each original record with every translation record where the latter's work_id matches the former's translation.
I think you are simply confusing IDs here. It should be ON (w.translation = cSort.work_id).
Another way to write the query:
select o.work_id as id, o.name, t.name as sortname
from (select * from content where translation <> 3) o
left join (select * from content where translation = 3) t
on t.work_id = o.translation
order by t.name;
There seems to be no need to join table work.
I'd like to add that the table design is a bit confusing. Somehow it is not clear from it what is a translation for what. In your example you interpret translation 3 as a translation for the non-three records, but this is just an example as you say. I don't find this readable.
UPDATE: In order to sort your results by work.position, you can join that table or use a subquery instead. Here is the order by clause for the latter:
order by (select position from work w where w.id = o.work_id);

Correctly join 1:n:1:1 relation in mysql database

I'm developing a system to manage rental processes right now and I'm wondering how to efficiently query all rentable objects with the person name, who is currently renting it, if the object is rented at the moment. Otherwise there should be NULL in that column.
My tables look like:
object
| object_id | object_name |
---------------------------
| 1 | Object A |
| 2 | Object B |
| 3 | Object C |
| 4 | Object D |
| 5 | Object E |
---------------------------
person
| person_id | person_name |
---------------------------
| 1 | John Doe |
| 2 | Jane Doe |
| 3 | Max Muster |
| 4 | Foobar |
---------------------------
rental
| rental_id | rental_state| person_person_id |
----------------------------------------------
| 1 | open | 1 |
| 2 | returned | 1 |
| 3 | returned | 2 |
| 4 | open | 3 |
| 5 | returned | 4 |
----------------------------------------------
rental2object
| rental_rental_id | object_object_id |
---------------------------------------
| 1 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
| 4 | 2 |
| 4 | 5 |
| 5 | 2 |
---------------------------------------
The result I want should look like this:
| object_id | object_name | rented_to |
-------------------------------------------
| 1 | Object A | John Doe |
| 2 | Object B | Max Muster |
| 3 | Object C | NULL |
| 4 | Object D | NULL |
| 5 | Object E | Max Muster |
-------------------------------------------
What I've got so far is:
SELECT `object_id`, `object_name`, `person_name` FROM `object`
LEFT JOIN `rental2object` ON `object_id` = `object_object_id`
LEFT JOIN `rental` ON `rental_id` = `rental_rental_id` AND `rental_state` = 'open'
LEFT JOIN `person` ON `person_id` = `person_person_id`
GROUP BY `object_id`
The obvious problem is that I don't know how to aggregate the right way while grouping.
What would be the most efficient way to achieve my goal? Appreciate your help.
EDIT
Corrected the expected result, so that Object B is also rented to Max Muster.
About your question
Objects #2 and #5 are both in rental #4, but in your expected results you handle them differently. Object E and Object B should behave the same way because they are in the same rental. If not, you should explain which criterion decides whether an object has a related person or not.
Group by
To be SQL92 compliant, you should include all nonaggregated columns of the select clause in the GROUP BY clause:
SELECT `object_id`, `object_name`, `person_name` as rented_to
FROM `object`
...
GROUP BY `object_id`, `object_name`, `person_name`
To be SQL99 compliant, the GROUP BY clause only needs the nonaggregated columns that are not functionally dependent on the grouped columns. In your case there is a dependency between object_id and object_name: object_id -> object_name (the rental_state field breaks the functional dependency for person), so you can just write:
SELECT `object_id`, `object_name`, `person_name` as rented_to
FROM `object`
...
GROUP BY `object_id`, `person_name`
MySQL 5.7.5 and up implements detection of functional dependence, so this last select is valid there, but I suggest that, for readability, you use the first one.
Read MySQL Handling of GROUP BY for more info and ONLY_FULL_GROUP_BY parameter details.
Performance
Be sure you have indexes for:
object: Object_id ( is primary key, then index is implicit )
rental2object: object_object_id ( may be a composite index with the other field, but be sure object_object_id is the first field on index )
rental : rental_id & rental_state ( a composite index with both fields )
person: person_id ( is primary key, then index is implicit )
Try this
SELECT
o.object_id,
o.object_name,
p.person_name AS rented_to
FROM
rental2object ro
RIGHT JOIN object o ON ro.object_object_id = o.object_id
LEFT JOIN rental r ON ro.rental_rental_id = r.rental_id AND r.rental_state = 'open'
LEFT JOIN person p ON r.person_person_id = p.person_id
SELECT `object_id`, `object_name`,
case
when rental_state = 'Open' then `person_name`
when r1.rental_rental_id is null then null
else `rental_state`
end as RentedTo
FROM `object`
LEFT JOIN `rental2object` r1 ON `object_id` = r1.`object_object_id`
LEFT JOIN `rental` ON `rental_id` = r1.`rental_rental_id`
LEFT JOIN `person` ON `person_id` = `person_person_id`
where r1.rental_rental_id =
(select max(r2.`rental_rental_id`)
from `rental2object` r2
where r2.`object_object_id` = r1.`object_object_id`
group by r2.`object_object_id`)
or r1.rental_rental_id is null
GROUP BY `object_id`;
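One way to get exactly one row per object is to keep the LEFT JOINs from the question and aggregate the person name, since MAX() skips the NULLs produced by returned rentals. This is a sketch under assumptions rather than a definitive answer: it presumes an object has at most one open rental at a time, it runs through Python's sqlite3 module as a stand-in for MySQL, and the helper name current_renters is invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE object (object_id INTEGER PRIMARY KEY, object_name TEXT);
CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT);
CREATE TABLE rental (
    rental_id INTEGER PRIMARY KEY,
    rental_state TEXT,
    person_person_id INTEGER
);
CREATE TABLE rental2object (
    rental_rental_id INTEGER,
    object_object_id INTEGER
);
INSERT INTO object VALUES
    (1, 'Object A'), (2, 'Object B'), (3, 'Object C'),
    (4, 'Object D'), (5, 'Object E');
INSERT INTO person VALUES
    (1, 'John Doe'), (2, 'Jane Doe'), (3, 'Max Muster'), (4, 'Foobar');
INSERT INTO rental VALUES
    (1, 'open', 1), (2, 'returned', 1), (3, 'returned', 2),
    (4, 'open', 3), (5, 'returned', 4);
INSERT INTO rental2object VALUES
    (1, 1), (2, 2), (2, 3), (3, 3), (4, 2), (4, 5), (5, 2);
"""

# Only open rentals survive the second LEFT JOIN, so person_name is NULL
# for every other row; MAX() then picks the single non-NULL name, if any.
QUERY = """
SELECT o.object_id,
       o.object_name,
       MAX(p.person_name) AS rented_to
FROM object o
LEFT JOIN rental2object ro
       ON ro.object_object_id = o.object_id
LEFT JOIN rental r
       ON r.rental_id = ro.rental_rental_id
      AND r.rental_state = 'open'
LEFT JOIN person p
       ON p.person_id = r.person_person_id
GROUP BY o.object_id, o.object_name
ORDER BY o.object_id
"""


def current_renters():
    """Hypothetical helper: build the sample tables and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in current_renters():
        print(row)
```

Because every selected column is either grouped or aggregated, the query should also pass with ONLY_FULL_GROUP_BY enabled.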

Select maximum value of each member from table [duplicate]

I want to select the best result of each member from the mysql table, for a given discipline.
(if there are entries with the same value, the entries with the lowest event start date should be taken)
DDLs:
CREATE TABLE `results` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`discipline` int(11) NOT NULL,
`member` int(11) DEFAULT '0',
`event` int(11) DEFAULT '0',
`value` int(11) DEFAULT '0',
PRIMARY KEY (`id`),
UNIQUE KEY `member_2` (`member`,`discipline`,`event`)
);
INSERT INTO results VALUES
(1,1,2,4,10),
(2,1,1,4, 8),
(3,1,2,5, 9),
(4,2,3,5, 9),
(5,1,2,6,11),
(6,1,2,7,11),
(7,1,2,1,11),
(8,1,2,3, 7);
CREATE TABLE `events` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) DEFAULT NULL,
`startDate` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO events VALUES
(1 ,'Not in scope','2012-05-23'),
(3 ,'Test 0', '2014-05-09'),
(4 ,'Test 1', '2014-05-10'),
(5 ,'Test 2', '2014-05-11'),
(6 ,'Test 3', '2014-05-12'),
(7 ,'Test 4', '2014-05-13');
SELECT * FROM results;
+----+------------+--------+-------+-------+
| id | discipline | member | event | value |
+----+------------+--------+-------+-------+
| 1 | 1 | 2 | 4 | 10 |
| 2 | 1 | 1 | 4 | 8 |
| 3 | 1 | 2 | 5 | 9 |
| 4 | 2 | 3 | 5 | 9 |
| 5 | 1 | 2 | 6 | 11 |
| 6 | 1 | 2 | 7 | 11 |
| 7 | 1 | 2 | 1 | 11 |
| 8 | 1 | 2 | 3 | 7 |
+----+------------+--------+-------+-------+
SELECT * FROM events;
+----+--------------+---------------------+
| id | name | startDate |
+----+--------------+---------------------+
| 1 | Not in scope | 2012-05-23 00:00:00 |
| 3 | Test 0 | 2014-05-09 00:00:00 |
| 4 | Test 1 | 2014-05-10 00:00:00 |
| 5 | Test 2 | 2014-05-11 00:00:00 |
| 6 | Test 3 | 2014-05-12 00:00:00 |
| 7 | Test 4 | 2014-05-13 00:00:00 |
+----+--------------+---------------------+
Result should be:
+---------+------------+--------+-------+-------+
| id | discipline | member | event | value |
+---------+------------+--------+-------+-------+
| 2 | 1 | 1 | 4 | 8 |
| 5 | 1 | 2 | 6 | 11 |
+---------+------------+--------+-------+-------+
My first approach was to group by member id, but it's not that easy. So I tried a lot of different approaches from the web and from my colleagues.
The last one was:
select res.*
from `results` as res
join (select id, max(value)
from results
join events on results.event = events.id
where discipline = 1
and events.name like 'Test%'
Group by id
Order by events.startDate ASC) as tmpRes
on res.id = tmpRes.id
group by member
order by value DESC
But the result in this example would be a random result id for member 2.
Should be correct now, but let me know if there's a mistake...
SELECT r.*
FROM events e
JOIN results r
ON r.event = e.id
JOIN
( SELECT r.member
, MIN(e.startdate) min_startdate
FROM events e
JOIN results r
ON r.event = e.id
JOIN
( SELECT member
, MAX(value) max_value
, discipline
FROM events e
JOIN results r
ON r.event = e.id
WHERE discipline = 1
AND name LIKE 'Test%'
GROUP
BY member
) x
ON x.member = r.member
AND x.max_value = r.value
AND x.discipline = r.discipline
AND e.name LIKE 'Test%'
GROUP
BY member
) y
ON y.member = r.member
AND y.min_startdate = e.startdate;
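These queries are fast, but because they can get rather complex and cumbersome, there's also an undocumented hack that is sometimes used to achieve the same result. It relies on GROUP BY keeping the first row of each group from the ordered subquery, which MySQL does not guarantee, and it is rejected when ONLY_FULL_GROUP_BY is enabled. It goes something like this...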
SELECT *
FROM
( SELECT r.*
FROM events e
JOIN results r
ON r.event = e.id
WHERE discipline = 1
AND name LIKE 'Test%'
ORDER
BY member
, value DESC
, startdate
) x
GROUP
BY member;
If I understand your question correctly, you need to group on member in the sub-query. Try the following:
select res.*
from `results` as res
join (select member, min(event) AS minEvent, max(value) AS maxValue
from results
where discipline = 1
Group by member) as tmpRes
on res.member = tmpRes.member AND res.event=tmpRes.minEvent AND res.value=tmpRes.maxValue
order by res.value
EDIT (based on the most recent comment): If that's the case, you'll need to join on the events table. Unless the startDate field is actually a temporal field, it's going to be a big mess.
It would have made things easier with all the requirements included in the original question.
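On servers with window functions (MySQL 8.0 and later), the "best value, earliest event on ties" rule can also be sketched with ROW_NUMBER(). This is an alternative to the joins above, not something taken from the answers, and the assumptions are: Python's sqlite3 module (SQLite 3.25 or newer) stands in for MySQL, dates are stored as ISO strings, and the helper name best_results is invented.

```python
import sqlite3

SCHEMA = """
CREATE TABLE results (
    id INTEGER PRIMARY KEY,
    discipline INTEGER NOT NULL,
    member INTEGER,
    event INTEGER,
    value INTEGER
);
CREATE TABLE events (
    id INTEGER PRIMARY KEY,
    name TEXT,
    startDate TEXT
);
INSERT INTO results VALUES
    (1, 1, 2, 4, 10), (2, 1, 1, 4, 8), (3, 1, 2, 5, 9), (4, 2, 3, 5, 9),
    (5, 1, 2, 6, 11), (6, 1, 2, 7, 11), (7, 1, 2, 1, 11), (8, 1, 2, 3, 7);
INSERT INTO events VALUES
    (1, 'Not in scope', '2012-05-23'), (3, 'Test 0', '2014-05-09'),
    (4, 'Test 1', '2014-05-10'), (5, 'Test 2', '2014-05-11'),
    (6, 'Test 3', '2014-05-12'), (7, 'Test 4', '2014-05-13');
"""

# Rank each member's rows: highest value first, earliest event on ties.
# Row number 1 per member is the row the question asks for.
QUERY = """
SELECT id, discipline, member, event, value
FROM (
    SELECT r.*,
           ROW_NUMBER() OVER (
               PARTITION BY r.member
               ORDER BY r.value DESC, e.startDate ASC
           ) AS rn
    FROM results r
    JOIN events e ON e.id = r.event
    WHERE r.discipline = 1
      AND e.name LIKE 'Test%'
) ranked
WHERE rn = 1
ORDER BY member
"""


def best_results():
    """Hypothetical helper: build the sample tables and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in best_results():
        print(row)
```

Unlike the ORDER BY inside a subquery followed by GROUP BY, the ranking here is explicit, so the chosen row does not depend on how the server happens to scan the table.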