Left Join Subselect with LIMIT in MySQL - mysql

I have 3 tables:
actor
| FIELD | TYPE | NULL | KEY | DEFAULT | EXTRA |
|----------|------------------|------|-----|---------|----------------|
| actor_id | int(10) unsigned | NO | PRI | (null) | auto_increment |
| username | varchar(30) | NO | | (null) | |
tag
| FIELD | TYPE | NULL | KEY | DEFAULT | EXTRA |
|--------|------------------|------|-----|---------|----------------|
| tag_id | int(10) unsigned | NO | PRI | (null) | auto_increment |
| title | varchar(40) | NO | | (null) | |
actor_tag_count
| FIELD | TYPE | NULL | KEY | DEFAULT | EXTRA |
|------------------|------------------|------|-----|-------------------|-----------------------------|
| actor_id | int(10) unsigned | NO | PRI | (null) | |
| tag_id | int(10) unsigned | NO | PRI | (null) | |
| clip_count | int(10) unsigned | NO | | (null) | |
| update_timestamp | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
I want to get the 5 most frequent (highest clip_count) and most recently updated (latest update_timestamp) tags for each actor.
My attempted query is:
SELECT
`a`.`actor_id`,
`a`.`username`,
GROUP_CONCAT(atc.clip_count) AS `tag_clip_counts`,
GROUP_CONCAT(t.tag_id) AS `tag_ids`,
GROUP_CONCAT(t.title) AS `tag_titles`
FROM
`actor` AS `a`
LEFT JOIN (
SELECT
`atc`.`actor_id`,
`atc`.`tag_id`,
`atc`.`clip_count`
FROM
`actor_tag_count` AS `atc`
INNER JOIN `actor` AS `a` USING (actor_id)
ORDER BY
atc.clip_count DESC,
atc.update_timestamp DESC
LIMIT 5
) AS `atc` USING (actor_id)
LEFT JOIN `tag` AS `t` ON atc.tag_id = t.tag_id
GROUP BY
`a`.`actor_id`
The problem is that the LEFT JOIN subselect is evaluated only once, so LIMIT 5 applies to the whole table rather than to each actor, and the tags for every actor in the result are drawn from the same pool of 5 rows.
Expected GROUP_CONCAT'd tag title results for Keanu Reeves:
comedy, scifi, action, suspense, western
(Both western and documentary have a clip_count of 2, but western should come first because it has a later update_timestamp)
I'm not sure whether this is relevant, but my real query has other joins on the actor table, which I removed for this question.
It would be highly preferable to make this all 1 query, but I'm stumped on how to do this even with 2 queries. 1-or-2-query solutions appreciated.

SQLFiddle, with the help of a very nice answer about using a GROUP_CONCAT limit workaround:
SELECT
`a`.`actor_id`,
`a`.`username`,
SUBSTRING_INDEX(GROUP_CONCAT(atc.clip_count ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_clip_counts`,
SUBSTRING_INDEX(GROUP_CONCAT(t.tag_id ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_ids`,
SUBSTRING_INDEX(GROUP_CONCAT(t.title ORDER BY atc.clip_count DESC, atc.update_timestamp DESC), ',', 5) AS `tag_titles`
FROM
`actor` AS `a`
LEFT JOIN actor_tag_count AS `atc` USING (actor_id)
LEFT JOIN `tag` AS `t` ON atc.tag_id = t.tag_id
GROUP BY
`a`.`actor_id`

It is possible by adding a sequence number, but might not perform well on large tables.
Something like this (not tested):
SELECT actor_id,
    username,
    GROUP_CONCAT(clip_count ORDER BY aSequence) AS tag_clip_counts,
    GROUP_CONCAT(tag_id ORDER BY aSequence) AS tag_ids,
    GROUP_CONCAT(title ORDER BY aSequence) AS tag_titles
FROM
(
    SELECT sub0.actor_id,
        sub0.username,
        sub0.clip_count,
        sub0.tag_id,
        sub0.title,
        -- restart the counter whenever the actor changes
        @aSeq := IF(@aActorId = sub0.actor_id, @aSeq, 0) + 1 AS aSequence,
        @aActorId := sub0.actor_id AS aActorId
    FROM
    (
        SELECT actor.actor_id,
            actor.username,
            atc.clip_count,
            tag.tag_id,
            tag.title
        FROM actor
        LEFT JOIN actor_tag_count AS atc ON actor.actor_id = atc.actor_id
        LEFT JOIN tag ON atc.tag_id = tag.tag_id
        ORDER BY actor.actor_id, atc.clip_count DESC, atc.update_timestamp DESC
    ) sub0
    CROSS JOIN (SELECT @aSeq := 0, @aActorId := 0) sub1
) sub2
WHERE aSequence <= 5
GROUP BY actor_id, username
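An alternative would be a correlated subquery with a limit of 5 per actor, with an outer query that does the GROUP_CONCATs. A subquery in the SELECT list can return only one column and one row, so the correlated part has to be a derived table, which MySQL 8.0.14 and later allow with LATERAL. Something like this (again not tested):
SELECT
    a.actor_id,
    a.username,
    GROUP_CONCAT(top5.clip_count ORDER BY top5.clip_count DESC, top5.update_timestamp DESC) AS tag_clip_counts,
    GROUP_CONCAT(top5.tag_id ORDER BY top5.clip_count DESC, top5.update_timestamp DESC) AS tag_ids,
    GROUP_CONCAT(top5.title ORDER BY top5.clip_count DESC, top5.update_timestamp DESC) AS tag_titles
FROM actor a
LEFT JOIN LATERAL
(
    SELECT
        atc.clip_count,
        atc.update_timestamp,
        t.tag_id,
        t.title
    FROM actor_tag_count AS atc
    LEFT JOIN tag t ON atc.tag_id = t.tag_id
    WHERE atc.actor_id = a.actor_id
    ORDER BY atc.clip_count DESC, atc.update_timestamp DESC
    LIMIT 5
) AS top5 ON TRUE
GROUP BY a.actor_id, a.username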
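The sequence-number idea above can also be written with a window function on engines that have them (MySQL 8.0+ does). The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (it assumes the bundled SQLite is 3.25 or newer), and the sample rows are invented for illustration, not taken from the question's fiddle.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor_tag_count (
    actor_id INTEGER, tag_id INTEGER, clip_count INTEGER, update_timestamp TEXT,
    PRIMARY KEY (actor_id, tag_id)
);
-- Invented sample data: actor 1 has six tags, two of them tied on clip_count.
INSERT INTO actor_tag_count VALUES
    (1, 1, 9, '2013-01-01'), (1, 2, 7, '2013-01-02'), (1, 3, 5, '2013-01-03'),
    (1, 4, 3, '2013-01-04'), (1, 5, 2, '2013-01-05'), (1, 6, 2, '2013-01-06'),
    (2, 1, 4, '2013-01-01'), (2, 6, 1, '2013-01-02');
""")

# Number each actor's tags by clip_count, newest first on ties, keep the first 5.
TOP_N_SQL = """
SELECT actor_id, tag_id
FROM (
    SELECT actor_id, tag_id,
           ROW_NUMBER() OVER (
               PARTITION BY actor_id
               ORDER BY clip_count DESC, update_timestamp DESC
           ) AS seq
    FROM actor_tag_count
) AS ranked
WHERE seq <= 5
ORDER BY actor_id, seq
"""

top_tags = defaultdict(list)
for actor_id, tag_id in conn.execute(TOP_N_SQL):
    top_tags[actor_id].append(tag_id)

print(dict(top_tags))
```

For actor 1 the tie between tags 5 and 6 is broken by the later update_timestamp, so tag 6 takes the fifth slot and tag 5 is dropped.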

Related

SQL improvement in MySQL

I have these tables in MySQL.
CREATE TABLE `tableA` (
`id_a` int(11) NOT NULL,
`itemCode` varchar(50) NOT NULL,
`qtyOrdered` decimal(15,4) DEFAULT NULL,
:
PRIMARY KEY (`id_a`),
KEY `INDEX_A1` (`itemCode`)
) ENGINE=InnoDB
CREATE TABLE `tableB` (
`id_b` int(11) NOT NULL AUTO_INCREMENT,
`qtyDelivered` decimal(15,4) NOT NULL,
`id_a` int(11) DEFAULT NULL,
`opType` int(11) NOT NULL, -- '0' delivered to customer, '1' returned from customer
:
PRIMARY KEY (`id_b`),
KEY `INDEX_B1` (`id_a`),
KEY `INDEX_B2` (`opType`)
) ENGINE=InnoDB
tableA records the quantity the customer ordered for each order; tableB records the quantity we delivered to the customer for each order.
I want a query that calculates the quantity still remaining for delivery for each itemCode.
The SQL is below. It works, but it is slow.
SELECT T1.itemCode,
SUM(IFNULL(T1.qtyOrdered,'0')-IFNULL(T2.qtyDelivered,'0')+IFNULL(T3.qtyReturned,'0')) as qty
FROM tableA AS T1
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyDelivered FROM tableB WHERE opType = '0' GROUP BY id_a)
AS T2 on T1.id_a = T2.id_a
LEFT JOIN (SELECT id_a,SUM(qtyDelivered) as qtyReturned FROM tableB WHERE opType = '1' GROUP BY id_a)
AS T3 on T1.id_a = T3.id_a
WHERE T1.itemCode = '?'
GROUP BY T1.itemCode
I tried explain on this SQL, and the result is as below.
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
| 1 | PRIMARY | T1 | ref | INDEX_A1 | INDEX_A1 | 152 | const | 1 | Using where |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 21211 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 10 | |
| 3 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 96 | Using where; Using temporary; Using filesort |
| 2 | DERIVED | tableB | ref | INDEX_B2 | INDEX_B2 | 4 | | 55614 | Using where; Using temporary; Using filesort |
+----+-------------+------------+------+----------------+----------+---------+-------+-------+----------------------------------------------+
I want to improve my query. How can I do that?
First, tableB declares opType as int, but you are comparing it to the strings '0' and '1'. Use the numeric literals 0 and 1. To optimize your pre-aggregates, you should not have individual column indexes but one composite, and in this case covering, index on tableB (opType, id_a, qtyDelivered). opType serves the WHERE, id_a serves the GROUP BY, and qtyDelivered lets the SUM be computed from the index without going to the raw data pages.
Since you are looking for both types, you can roll them up into a single subquery that tests for either in one pass, and then join that to your tableA rows.
SELECT
T1.itemCode,
SUM( IFNULL(T1.qtyOrdered, 0 )
- IFNULL(T2.qtyDelivered, 0)
+ IFNULL(T2.qtyReturned, 0)) as qty
FROM
tableA AS T1
LEFT JOIN ( SELECT
id_a,
SUM( IF( opType=0,qtyDelivered, 0)) as qtyDelivered,
SUM( IF( opType=1,qtyDelivered, 0)) as qtyReturned
FROM
tableB
WHERE
opType IN ( 0, 1 )
GROUP BY
id_a) AS T2
on T1.id_a = T2.id_a
WHERE
T1.itemCode = '?'
GROUP BY
T1.itemCode
Now, depending on the size of your tables, you might be better off joining the inner query to tableA so that you only aggregate rows for the item code you expect. If you have 50k items and only 120 of them qualify, the inner query above still aggregates all 50k, which is overkill. In that case, I would suggest an index on tableA (itemCode, id_a) and adjusting the inner query to
LEFT JOIN ( SELECT
b.id_a,
SUM( IF( b.opType = 0, b.qtyDelivered, 0)) as qtyDelivered,
SUM( IF( b.opType = 1, b.qtyDelivered, 0)) as qtyReturned
FROM
( select distinct id_a
from tableA
where itemCode = '?' ) pqA
JOIN tableB b
on pqA.id_a = b.id_a
AND b.opType IN ( 0, 1 )
GROUP BY
b.id_a) AS T2
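The single-pass rollup can be tried out on a small data set. This is a rough sketch rather than a benchmark: it runs on Python's sqlite3 instead of MySQL, uses CASE in place of MySQL's IF(), and the rows and the index name idx_b_cover are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tableA (id_a INTEGER PRIMARY KEY, itemCode TEXT, qtyOrdered REAL);
CREATE TABLE tableB (id_b INTEGER PRIMARY KEY, qtyDelivered REAL,
                     id_a INTEGER, opType INTEGER);
-- Hypothetical name for the composite covering index suggested above.
CREATE INDEX idx_b_cover ON tableB (opType, id_a, qtyDelivered);
-- Invented rows: order 1 has two deliveries and one return, order 2 has none.
INSERT INTO tableA VALUES (1, 'X', 10), (2, 'X', 5), (3, 'Y', 8);
INSERT INTO tableB (qtyDelivered, id_a, opType) VALUES
    (4, 1, 0), (3, 1, 0), (1, 1, 1), (5, 3, 0);
""")

# One pass over tableB yields both the delivered and the returned totals.
REMAINING_SQL = """
SELECT a.itemCode,
       SUM(IFNULL(a.qtyOrdered, 0)
           - IFNULL(b.qtyDelivered, 0)
           + IFNULL(b.qtyReturned, 0)) AS qty
FROM tableA AS a
LEFT JOIN (
    SELECT id_a,
           SUM(CASE WHEN opType = 0 THEN qtyDelivered ELSE 0 END) AS qtyDelivered,
           SUM(CASE WHEN opType = 1 THEN qtyDelivered ELSE 0 END) AS qtyReturned
    FROM tableB
    WHERE opType IN (0, 1)
    GROUP BY id_a
) AS b ON b.id_a = a.id_a
WHERE a.itemCode = ?
GROUP BY a.itemCode
"""

remaining = conn.execute(REMAINING_SQL, ("X",)).fetchone()
print(remaining)
```

With these rows, order 1 contributes 10 - 7 + 1 = 4 and order 2 contributes 5, so item X has 9 units outstanding.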

How to rewrite a NOT IN subquery as join

Let's assume that the following tables in MySQL describe documents contained in folders.
mysql> select * from folder;
+----+----------------+
| ID | PATH |
+----+----------------+
| 1 | matches/1 |
| 2 | matches/2 |
| 3 | shared/3 |
| 4 | no/match/4 |
| 5 | unreferenced/5 |
+----+----------------+
mysql> select * from DOC;
+----+------+------------+
| ID | F_ID | DATE |
+----+------+------------+
| 1 | 1 | 2000-01-01 |
| 2 | 2 | 2000-01-02 |
| 3 | 2 | 2000-01-03 |
| 4 | 3 | 2000-01-04 |
| 5 | 3 | 2000-01-05 |
| 6 | 3 | 2000-01-06 |
| 7 | 4 | 2000-01-07 |
| 8 | 4 | 2000-01-08 |
| 9 | 4 | 2000-01-09 |
| 10 | 4 | 2000-01-10 |
+----+------+------------+
The columns ID are the primary keys and the column F_ID of table DOC is a not-null foreign key that references the primary key of table FOLDER. By using the 'DATE' of documents in the where clause, I would like to find which folders contain only the selected documents. For documents earlier than 2000-01-05, this could be written as:
SELECT DISTINCT d1.F_ID
FROM DOC d1
WHERE d1.DATE < '2000-01-05'
AND d1.F_ID NOT IN (
SELECT d2.F_ID
FROM DOC d2 WHERE NOT (d2.DATE < '2000-01-05')
);
and it correctly returns '1' and '2'. According to
http://dev.mysql.com/doc/refman/5.5/en/rewriting-subqueries.html
performance on big tables can be improved by replacing the subquery with a join. I found other questions about NOT IN and joins, but not exactly what I was looking for. So, any ideas on how this could be written with joins?
The general answer is:
select t.*
from t
where t.id not in (select id from s)
Can be rewritten as:
select t.*
from t left outer join
(select distinct id from s) s
on t.id = s.id
where s.id is null
I think you can apply this to your situation.
select distinct d1.F_ID
from DOC d1
left outer join (
select F_ID
from DOC
where date >= '2000-01-05'
) d2 on d1.F_ID = d2.F_ID
where d1.date < '2000-01-05'
and d2.F_ID is null
If I understand your question correctly, you want the F_IDs of folders that contain only documents from before '2000-01-05'. Then simply:
SELECT F_ID
FROM DOC
GROUP BY F_ID
HAVING MAX(DATE) < '2000-01-05'
Sample Table and Insert Statements
CREATE TABLE `tleft` (
`id` int(2) NOT NULL,
`name` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
CREATE TABLE `tright` (
`id` int(2) NOT NULL,
`t_left_id` int(2) DEFAULT NULL,
`description` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8
INSERT INTO `tleft` (`id`, `name`)
VALUES
(1, 'henry'),
(2, 'steve'),
(3, 'jeff'),
(4, 'richards'),
(5, 'elon');
INSERT INTO `tright` (`id`, `t_left_id`, `description`)
VALUES
(1, 1, 'sample'),
(2, 2, 'sample');
Left Join : SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id ;
Returns Id : 1, 2, 3, 4, 5
Right Join : SELECT l.id,l.name FROM tleft l RIGHT JOIN tright r ON l.id = r.t_left_id ;
Returns Id : 1,2
Subquery Not in tright : select id from tleft where id not in ( select t_left_id from tright);
Returns Id : 3,4,5
Equivalent Join For above subquery :
SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id WHERE r.t_left_id IS NULL;
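A condition added to the ON clause with AND is applied during the join; a condition in the WHERE clause is applied after the join.
Example : SELECT l.id,l.name FROM tleft l LEFT JOIN tright r ON l.id = r.t_left_id AND r.description ='hello' WHERE r.t_left_id IS NULL ;
Returns Id : 1, 2, 3, 4, 5 (no tright row has the description 'hello', so every tleft row is kept with NULLs on the right side)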
Hope this helps
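To check that the three formulations from the answers (NOT IN, LEFT JOIN ... IS NULL, and GROUP BY ... HAVING MAX) agree on the question's DOC data, here is a small sketch. It uses Python's sqlite3 rather than MySQL, so it says nothing about relative performance; it only compares results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DOC (ID INTEGER PRIMARY KEY, F_ID INTEGER NOT NULL, DATE TEXT NOT NULL)"
)
# The DOC rows exactly as listed in the question.
conn.executemany(
    "INSERT INTO DOC VALUES (?, ?, ?)",
    [(1, 1, "2000-01-01"), (2, 2, "2000-01-02"), (3, 2, "2000-01-03"),
     (4, 3, "2000-01-04"), (5, 3, "2000-01-05"), (6, 3, "2000-01-06"),
     (7, 4, "2000-01-07"), (8, 4, "2000-01-08"), (9, 4, "2000-01-09"),
     (10, 4, "2000-01-10")],
)

QUERIES = {
    "not_in": """
        SELECT DISTINCT d1.F_ID FROM DOC d1
        WHERE d1.DATE < '2000-01-05'
          AND d1.F_ID NOT IN (SELECT d2.F_ID FROM DOC d2
                              WHERE NOT (d2.DATE < '2000-01-05'))
    """,
    "anti_join": """
        SELECT DISTINCT d1.F_ID FROM DOC d1
        LEFT JOIN (SELECT DISTINCT F_ID FROM DOC
                   WHERE DATE >= '2000-01-05') d2 ON d1.F_ID = d2.F_ID
        WHERE d1.DATE < '2000-01-05' AND d2.F_ID IS NULL
    """,
    "having_max": """
        SELECT F_ID FROM DOC
        GROUP BY F_ID
        HAVING MAX(DATE) < '2000-01-05'
    """,
}

# Run every variant and collect the folder ids it returns.
results = {name: sorted(row[0] for row in conn.execute(sql))
           for name, sql in QUERIES.items()}
print(results)
```

All three return folders 1 and 2 here. Because F_ID is declared NOT NULL, the NULL-handling difference between NOT IN and the join form does not come into play.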

MySQL Query Optimization with MAX()

I have 3 tables with the following schema:
CREATE TABLE `devices` (
`device_id` int(11) NOT NULL auto_increment,
`name` varchar(20) default NULL,
`appliance_id` int(11) default '0',
`sensor_type` int(11) default '0',
`display_name` VARCHAR(100),
PRIMARY KEY USING BTREE (`device_id`)
)
CREATE TABLE `channels` (
`channel_id` int(11) NOT NULL AUTO_INCREMENT,
`device_id` int(11) NOT NULL,
`channel` varchar(10) NOT NULL,
PRIMARY KEY (`channel_id`),
KEY `device_id_idx` (`device_id`)
)
CREATE TABLE `historical_data` (
`date_time` datetime NOT NULL,
`channel_id` int(11) NOT NULL,
`data` float DEFAULT NULL,
`unit` varchar(10) DEFAULT NULL,
KEY `devices_datetime_idx` (`date_time`) USING BTREE,
KEY `channel_id_idx` (`channel_id`)
)
The setup is that a device can have one or more channels and each channel has many (historical) data.
I use the following query to get the last historical data for one device and all its related channels:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1')
GROUP BY c.channel
ORDER BY h.date_time, channel
The query plan looks as follows:
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
| 1 | SIMPLE | c | ALL | PRIMARY,device_id_idx | NULL | NULL | NULL | 34 | Using where |
| 1 | SIMPLE | d | eq_ref | PRIMARY | PRIMARY | 4 | c.device_id | 1 | Using where |
| 1 | SIMPLE | h | ref | channel_id_idx | channel_id_idx | 4 | c.channel_id | 322019 | |
+----+-------------+-------+--------+-----------------------+----------------+---------+---------------------------+--------+-------------+
3 rows in set (0.00 sec)
The above query is currently taking approximately 15 secs and I wanted to know if there are any tips or way to improve the query?
Edit:
Example data from historical_data
+---------------------+------------+------+------+
| date_time | channel_id | data | unit |
+---------------------+------------+------+------+
| 2011-11-20 21:30:57 | 34 | 23.5 | C |
| 2011-11-20 21:30:57 | 9 | 68 | W |
| 2011-11-20 21:30:54 | 34 | 23.5 | C |
| 2011-11-20 21:30:54 | 5 | 316 | W |
| 2011-11-20 21:30:53 | 34 | 23.5 | C |
| 2011-11-20 21:30:53 | 2 | 34 | W |
| 2011-11-20 21:30:51 | 34 | 23.4 | C |
| 2011-11-20 21:30:51 | 9 | 68 | W |
| 2011-11-20 21:30:49 | 34 | 23.4 | C |
| 2011-11-20 21:30:49 | 4 | 193 | W |
+---------------------+------------+------+------+
10 rows in set (0.00 sec)
Edit 2:
Multiple channel SELECT example:
SELECT c.channel_id, c.channel, max(h.date_time), h.data
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
INNER JOIN historical_data h ON h.channel_id = c.channel_id
WHERE d.name = 'livingroom' AND d.appliance_id = '0'
AND d.sensor_type = 1 AND ( c.channel = 'ch1' OR c.channel = 'ch2' OR c.channel = 'ch2')
GROUP BY c.channel
ORDER BY h.date_time, channel
I've used OR in the c.channel where clause because it was easier to generate programmatically, but it can be changed to use IN if necessary.
Edit 3:
Example result of what I'm trying to achieve:
+-----------+------------+---------+---------------------+-------+
| device_id | channel_id | channel | max(h.date_time) | data |
+-----------+------------+---------+---------------------+-------+
| 28 | 9 | ch1 | 2011-11-21 20:39:36 | 0 |
| 28 | 35 | ch2 | 2011-11-21 20:30:55 | 32767 |
+-----------+------------+---------+---------------------+-------+
I have added the device_id to the example but my select will only need to return channel_id, channel, last date_time i.e max and the data. The results should be the last record from the historical_data table for each channel for one device.
It seems that dropping and re-creating the index on date_time sped up my original SQL to around 2 seconds.
I haven't been able to test this, so please run it and let us know whether it gives you the desired result and whether it runs faster than your current query:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel = param_channel
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE ('livingroom', 0, 1, 'ch1');
I tried working it into a stored procedure so that even if you get the desired results using this for one device, you can try it with another device and see the results... Thanks!
[edit]: In response to Danny's comment, here's an updated test version:
CREATE DEFINER=`root`@`localhost` PROCEDURE `GetLatestHistoricalData_EXAMPLE_3Channel`
(
IN param_device_name VARCHAR(20)
, IN param_appliance_id INT
, IN param_sensor_type INT
, IN param_channel_1 VARCHAR(10)
, IN param_channel_2 VARCHAR(10)
, IN param_channel_3 VARCHAR(10)
)
BEGIN
SELECT
h.date_time, h.data
FROM
historical_data h
INNER JOIN
(
SELECT c.channel_id
FROM devices d
INNER JOIN channels c ON c.device_id = d.device_id
WHERE
d.name = param_device_name
AND d.appliance_id = param_appliance_id
AND d.sensor_type = param_sensor_type
AND c.channel IN (param_channel_1
,param_channel_2
,param_channel_3
)
)
c ON h.channel_id = c.channel_id
ORDER BY h.date_time DESC
LIMIT 1;
END
Then to run a test:
CALL GetLatestHistoricalData_EXAMPLE_3Channel ('livingroom', 0, 1, 'ch1', 'ch2' , 'ch3');
Again, this is just for testing, so you'll be able to see if it meets your needs..
I would first add an index on the devices table on ( appliance_id, sensor_type, name ) to match your query. I don't know how many entries are in this table, but if it is large, with many rows per device, the index lets the query go straight to the matching ones.
Second, on your channels table, add an index on ( device_id, channel ).
Third, on your historical_data table, add an index on ( channel_id, date_time ).
then,
SELECT STRAIGHT_JOIN
PreQuery.MostRecent,
PreQuery.Channel_ID,
PreQuery.Channel,
H2.Data,
H2.Unit
from
( select
c.channel_id,
c.channel,
max( h.date_time ) as MostRecent
from
devices d
join channels c
on d.device_id = c.device_id
and c.channel in ( 'ch1', 'ch2', 'ch3' )
join historical_data h
on c.channel_id = h.channel_id
where
d.appliance_id = 0
and d.sensor_type = 1
and d.name = 'livingroom'
group by
c.channel_id ) PreQuery
JOIN historical_data H2
on PreQuery.Channel_ID = H2.Channel_ID
AND PreQuery.MostRecent = H2.Date_Time
order by
PreQuery.MostRecent,
PreQuery.Channel
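The pattern in that query (find MAX(date_time) per channel, then join back to fetch the rest of that row) can be seen in isolation below. This is a sketch on Python's sqlite3, not on MySQL: the rows are a subset of the sample data from the question, the devices and channels joins are replaced by a literal channel_id list, and the index name channel_datetime_idx is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE historical_data (
    date_time  TEXT NOT NULL,
    channel_id INTEGER NOT NULL,
    data       REAL,
    unit       TEXT
);
-- Composite index in the spirit of the ( channel_id, date_time ) suggestion.
CREATE INDEX channel_datetime_idx ON historical_data (channel_id, date_time);
INSERT INTO historical_data VALUES
    ('2011-11-20 21:30:49', 34, 23.4, 'C'),
    ('2011-11-20 21:30:57', 34, 23.5, 'C'),
    ('2011-11-20 21:30:51',  9, 68,   'W'),
    ('2011-11-20 21:30:57',  9, 68,   'W'),
    ('2011-11-20 21:30:54',  5, 316,  'W');
""")

# Step 1 (inner query): latest timestamp per channel.
# Step 2 (outer join): pick up the data value stored at that timestamp.
LATEST_SQL = """
SELECT h.channel_id, h.date_time, h.data
FROM historical_data AS h
JOIN (
    SELECT channel_id, MAX(date_time) AS most_recent
    FROM historical_data
    WHERE channel_id IN (9, 34)
    GROUP BY channel_id
) AS latest
  ON latest.channel_id = h.channel_id
 AND latest.most_recent = h.date_time
ORDER BY h.channel_id
"""

latest_rows = conn.execute(LATEST_SQL).fetchall()
print(latest_rows)
```

Unlike selecting h.data next to max(h.date_time) in one grouped query, the join back guarantees that the data value comes from the same row as the latest timestamp.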

Select distinct records on a join

I have two mysql tables - a sales table:
+----------------+------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------------------------+------+-----+---------+-------+
| StoreId | bigint(20) unsigned | NO | PRI | NULL | |
| ItemId | bigint(20) unsigned | NO | | NULL | |
| SaleWeek | int(10) unsigned | NO | PRI | NULL | |
+----------------+------------------------------+------+-----+---------+-------+
and an items table:
+--------------------+------------------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+------------------------------+------+-----+---------+-------+
| ItemId | bigint(20) unsigned | NO | PRI | NULL | |
| ItemName | varchar(100) | NO | | NULL | |
+--------------------+------------------------------+------+-----+---------+-------+
The sales table contains multiple records for each ItemID - one for each SaleWeek. I want to select all items sold by joining the two tables like so:
SELECT items.ItemName, items.ItemId FROM items
JOIN sales ON items.ItemId = sales.ItemId
WHERE sales.StoreID = ? ORDER BY sales.SaleWeek DESC;
However, this is returning multiple ItemId values based on the multiple entries for each SaleWeek. Can I do a distinct select to only return one ItemID - I don't want to have to query for the latest SaleWeek because some items may not have an entry for the latest SaleWeek so I need to get the last sale. Do I need to specify DISTINCT or use a LEFT OUTER JOIN or something?
A DISTINCT should do what you're looking for:
SELECT DISTINCT items.ItemName, items.ItemId FROM items
JOIN sales ON items.ItemId = sales.ItemId
WHERE sales.StoreID = ? ORDER BY sales.SaleWeek DESC;
That would return only distinct items.ItemName, items.ItemId tuples.
You also mentioned the sale week. If you want the most recent week for each item, you may want to try a GROUP BY:
SELECT
items.ItemName,
items.ItemId,
max( Sales.SaleWeek ) MostRecentSaleWeek
FROM
items JOIN sales ON items.ItemId = sales.ItemId
WHERE
sales.StoreID = ?
GROUP BY
items.ItemID,
items.ItemName
ORDER BY
MostRecentSaleWeek, -- ordinal column number 3 via the MAX() call
items.ItemName
You may have to change the ORDER BY to the ordinal reference 3 (the third column) if you want to sort on that column. This query gives you each distinct item and the most recent week it was sold.
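A quick way to see the GROUP BY version return one row per item is to run it on a few rows. The sketch below uses Python's sqlite3 with invented items and sales, and sorts newest week first as the question asked; treat it as an illustration, not as the answer's exact query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (ItemId INTEGER PRIMARY KEY, ItemName TEXT NOT NULL);
CREATE TABLE sales (StoreId INTEGER NOT NULL, ItemId INTEGER NOT NULL,
                    SaleWeek INTEGER NOT NULL);
-- Invented rows: store 7 sold apples in two weeks and pears in one.
INSERT INTO items VALUES (1, 'apple'), (2, 'pear'), (3, 'plum');
INSERT INTO sales VALUES (7, 1, 10), (7, 1, 11), (7, 2, 9), (8, 3, 11);
""")

# One row per item, carrying the latest week in which the store sold it.
LAST_SALE_SQL = """
SELECT items.ItemName, items.ItemId, MAX(sales.SaleWeek) AS MostRecentSaleWeek
FROM items
JOIN sales ON items.ItemId = sales.ItemId
WHERE sales.StoreId = ?
GROUP BY items.ItemId, items.ItemName
ORDER BY MostRecentSaleWeek DESC, items.ItemName
"""

rows = conn.execute(LAST_SALE_SQL, (7,)).fetchall()
print(rows)
```

The apple appears once even though it has two sales rows, and the pear is still listed although it has no sale in the latest week.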
SELECT u.user_name,u.user_id, u.user_country,u.user_phone_no,ind.Industry_name,inv.id,u.user_email
FROM invitations inv
LEFT JOIN users u
ON inv.sender_id = u.user_id
LEFT JOIN employee_info ei
ON inv.sender_id=ei.employee_fb_id
LEFT JOIN industries ind
ON ei.industry_id=ind.id
WHERE inv.receiver_id='XXX'
AND inv.invitation_status='0'
AND inv.invitation_status_desc='PENDING'
GROUP BY (user_id)
We can use this:
INSERT INTO `test_table` (`id`, `name`) SELECT DISTINCT
a.`employee_id`,b.`first_name` FROM `employee_leave_details`as a INNER JOIN
`employee_register` as b ON a.`employee_id` = b.`employee_id`

How to get smallest column value without triggering "Mixing of GROUP columns [...] with no GROUP columns is illegal if there is no GROUP BY clause"?

I have a table 'foo' with a timestamp field 'bar'. How do I get only the oldest timestamp for a query like: SELECT foo.bar from foo? I tried doing something like: SELECT MIN(foo.bar) from foo but it failed with this error
ERROR 1140 (42000) at line 1: Mixing of GROUP columns (MIN(),MAX(),COUNT(),...) with no GROUP columns is illegal if there is no GROUP BY clause
OK, so my query is much more complicated than that and that's why I am having a hard time with it. This is the query with the MIN(a.timestamp):
select distinct a.user_id as 'User ID',
a.project_id as 'Remix Project Id',
prjs.based_on_pid as 'Original Project ID',
(case when f.reasons is NULL then 'N' else 'Y' end)
as 'Flagged Y or N',
f.reasons, f.timestamp, MIN(a.timestamp)
from view_stats a
join (select id, based_on_pid, user_id
from projects p) prjs on
(a.project_id = prjs.id)
left outer join flaggers f on
( f.project_id = a.project_id
and f.user_id = a.user_id)
where a.project_id in
(select distinct b.id
from projects b
where b.based_on_pid in
( select distinct c.id
from projects c
where c.user_id = a.user_id
)
)
order by f.reasons desc, a.user_id, a.project_id;
Any help would be greatly appreciated.
The view_stats table:
+------------+------------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+------------------+------+-----+-------------------+----------------+
| id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| user_id | int(10) unsigned | NO | MUL | 0 | |
| project_id | int(10) unsigned | NO | MUL | 0 | |
| ipaddress | bigint(20) | YES | MUL | NULL | |
| timestamp | timestamp | NO | | CURRENT_TIMESTAMP | |
+------------+------------------+------+-----+-------------------+----------------+
If you are going to use aggregate functions (like min(), max(), avg(), etc.) you need to tell the database what exactly it needs to take the min() of.
transaction date
one 8/4/09
one 8/5/09
one 8/6/09
two 8/1/09
two 8/3/09
three 8/4/09
I assume you want the following.
transaction date
one 8/4/09
two 8/1/09
three 8/4/09
Then to get that you can use the following query...note the group by clause which tells the database how to group the data and get the min() of something.
select
transaction,
min(date)
from
table
group by
transaction
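As a runnable illustration of that rule, the sketch below loads the sample rows into Python's sqlite3 and takes the minimum both per group and over the whole table. The names txn_log, txn and txn_date are hypothetical (chosen to avoid reserved words), and the dates are rewritten in ISO format so that they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn_log (txn TEXT NOT NULL, txn_date TEXT NOT NULL)")
# The answer's sample rows, with dates converted to ISO format.
conn.executemany(
    "INSERT INTO txn_log VALUES (?, ?)",
    [("one", "2009-08-04"), ("one", "2009-08-05"), ("one", "2009-08-06"),
     ("two", "2009-08-01"), ("two", "2009-08-03"),
     ("three", "2009-08-04")],
)

# Aggregate plus a plain column: the plain column must be in GROUP BY.
oldest = dict(conn.execute(
    "SELECT txn, MIN(txn_date) FROM txn_log GROUP BY txn"
))

# Aggregate on its own: no GROUP BY needed, the whole table is one group.
overall_oldest = conn.execute("SELECT MIN(txn_date) FROM txn_log").fetchone()[0]

print(oldest, overall_oldest)
```

The first query gives the oldest date per transaction; the second gives the single oldest date in the table.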