MySQL query, selecting from 3 tables

I am wrangling with a query pulling unique values from 3 tables. Is this better done in 2 separate queries?
the query is to:
count as returned (
all leadID from lds where status = "ok"
AND leadID is also in rlds with recID="999"
AND rdate > (03-20-2012)
+
(all distinct leadID from plds where recID="999"
AND change != NULL
AND pdate > (03-20-2012))
the result of the working query should be "2": leadID 1 and leadID 4
table lds:
leadID | status
1 | ok
2 | bad
3 | ok
table plds:
leadID | recID | change | pdate
4 | 999 | ch1 | 03-27-2012
4 | 999 | ch2 | 03-27-2012
5 | 888 | NULL | 03-27-2012
table rlds:
leadID | recID2 | rdate
1 | 999 | 03-27-2012
6 | 999 | 03-27-2012
Thanks!

SELECT lds.leadID
FROM
lds JOIN
rlds ON rlds.leadID = lds.leadID AND rlds.recID2 = 999 AND rlds.rdate > '03-20-2012'
WHERE lds.status = 'ok'
UNION
SELECT leadID
FROM plds
-- change is a reserved word in MySQL, so it is quoted with backticks
WHERE recID = 999 AND `change` IS NOT NULL AND pdate > '03-20-2012'

You can use UNION ALL to retrieve a two-row result: the first row contains the count for your first condition and the second row the count for your second.
You can then use SUM in an enclosing SELECT to add the two rows together. The result is a single row containing your count.
SELECT SUM(cnt)
FROM
(
SELECT COUNT(1) AS cnt
FROM lds
INNER JOIN rlds ON lds.leadid = rlds.leadid
WHERE lds.status='ok' AND rlds.recid2=999 AND rlds.rdate > '03-20-2012'
-- UNION ALL keeps both rows even when the two counts are equal
UNION ALL
-- DISTINCT so that a leadID appearing in several plds rows is counted once
SELECT COUNT(DISTINCT plds.leadid)
FROM plds
WHERE plds.recid=999
AND `change` IS NOT NULL
AND pdate > '03-20-2012'
) AS tmpcount
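As a cross-check against the sample data, here is a minimal sketch that counts the rows of the UNION of lead IDs. It assumes Python's built-in sqlite3 module in place of MySQL, and it stores the dates as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly. Both choices are assumptions made for illustration and are not part of the original schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lds  (leadID INTEGER, status TEXT);
CREATE TABLE plds (leadID INTEGER, recID INTEGER, `change` TEXT, pdate TEXT);
CREATE TABLE rlds (leadID INTEGER, recID2 INTEGER, rdate TEXT);
INSERT INTO lds  VALUES (1, 'ok'), (2, 'bad'), (3, 'ok');
INSERT INTO plds VALUES (4, 999, 'ch1', '2012-03-27'),
                        (4, 999, 'ch2', '2012-03-27'),
                        (5, 888, NULL,  '2012-03-27');
INSERT INTO rlds VALUES (1, 999, '2012-03-27'), (6, 999, '2012-03-27');
""")

# UNION removes duplicate lead IDs, so the outer COUNT(*) counts unique leads.
lead_count = conn.execute("""
SELECT COUNT(*) FROM (
    SELECT lds.leadID
    FROM lds
    JOIN rlds ON rlds.leadID = lds.leadID
    WHERE lds.status = 'ok' AND rlds.recID2 = 999 AND rlds.rdate > '2012-03-20'
    UNION
    SELECT leadID
    FROM plds
    WHERE recID = 999 AND `change` IS NOT NULL AND pdate > '2012-03-20'
) AS matched
""").fetchone()[0]

print(lead_count)
```

With the sample rows this counts leadID 1 and leadID 4. Note that dates kept as MM-DD-YYYY strings do not sort chronologically across years, which is why the sketch uses ISO strings.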

Related

How can I optimize a query involving a subselect in MySQL?

Suppose I have a table like this:
id | price | group1
1 | 6 | some_group
2 | 7 | some_group
3 | 8 | some_group
4 | 9 | some_other_group
If I want to select the lowest price grouped by group1
I can just do this:
SELECT id, min(price), group1 FROM some_table GROUP BY group1;
The problem is when I have a table which is not sorted by price like this:
id | price | group1
1 | 8 | some_group
2 | 7 | some_group
3 | 6 | some_group
4 | 9 | some_other_group
Then my query returns this result set:
id | price | group1
1 | 6 | some_group
4 | 9 | some_other_group
The problem is that I get 1 in the id column but the id of the row with the price of 6 is not 1 but 3.
My question is: how can I get the values from the row that contains the minimum price when I use GROUP BY?
I tried this:
SELECT f.id, min(f.price), f.group1 FROM (SELECT * FROM some_table ORDER BY price) f
GROUP BY f.group1;
but this is really slow and if I have multiple columns and aggregations it may fail.
Please note that the names above are just for demonstration purposes. My real query looks like this:
SELECT depdate, retdate, min(totalprice_eur) price FROM
(SELECT * FROM flight_results
WHERE (
fromcity = 30001350
AND tocity = 30001249
AND website = 80102118
AND roundtrip = 1
AND serviceclass = 1
AND depdate > date(now()))
ORDER BY totalprice_eur) F
WHERE (
fromcity = 30001350
AND tocity = 30001249
AND website = 80102118
AND roundtrip = 1
AND serviceclass = 1
AND depdate > date(now()))
GROUP BY depdate,retdate
and there is a concatenated primary key including website, fromcity, tocity, roundtrip, depdate, and retdate. There are no other indexes.
Explain says:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra
1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 2837 | Using where; Using temporary; Using filesort
2 | DERIVED | flight_results | ALL | PRIMARY | NULL | NULL | NULL | 998378 | Using filesort
You can do this instead:
SELECT t1.id, t1.price, t1.group1
FROM some_table AS t1
INNER JOIN
(
SELECT min(price) minprice, group1
FROM some_table
GROUP BY group1
) AS t2 ON t1.price = t2.minprice AND t1.group1 = t2.group1;
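To see the join-on-minimum approach at work on the unsorted sample table, here is a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the table and column names are the ones from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE some_table (id INTEGER, price INTEGER, group1 TEXT);
INSERT INTO some_table VALUES
    (1, 8, 'some_group'),
    (2, 7, 'some_group'),
    (3, 6, 'some_group'),
    (4, 9, 'some_other_group');
""")

# The derived table finds the minimum price per group; joining back on
# (group1, price) returns the full row that holds that minimum.
rows = conn.execute("""
SELECT t1.id, t1.price, t1.group1
FROM some_table AS t1
INNER JOIN (
    SELECT MIN(price) AS minprice, group1
    FROM some_table
    GROUP BY group1
) AS t2 ON t1.price = t2.minprice AND t1.group1 = t2.group1
ORDER BY t1.id
""").fetchall()

print(rows)  # [(3, 6, 'some_group'), (4, 9, 'some_other_group')]
```

If two rows in a group share the minimum price, this join returns both of them, so an extra tie-breaker is needed when exactly one row per group is required.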

Condition row total as a column in another table using MySQL

Firstly, I apologize for the terrible wording, but I'm not sure how to describe what I'm doing...
I have a table of computer types (id, type, name), called com_types
id | type | name
1 | 1 | Dell
2 | 4 | HP
In a second table, I have each individual computer, with a column 'type_id' to denote what type of computer it is, called com_assets
id | type_id | is_assigned
1 | 4 | 0
2 | 1 | 1
I'd like to create a view that shows each computer type, and how many we have on hand and in use, and a total, so the outcome would be
id | type | name | on_hand | in_use | total |
1 | 1 | Dell | 0 | 1 | 1 |
2 | 4 | HP | 1 | 0 | 1 |
As you can see, the on_hand, in_use, and total columns are dependent on the type_id and is_assigned column in the second table.
So far I have tried this...
CREATE VIEW test AS
SELECT id, type, name,
( SELECT COUNT(*) FROM com_assets WHERE type_id = id AND is_assigned = '0' ) as on_hand,
( SELECT COUNT(*) FROM com_assets WHERE type_id = id AND is_assigned = '1' ) as in_use,
SUM( on_hand + in_use ) AS total
FROM com_types
But all this returns is one column with all correct values, except the total equals ALL of the computers in the other table. Will I need a trigger to do this instead?
on_hand is the count of assigned = 0, and in_use is the count of assigned = 1. You can count them together, without the correlated subqueries, like this:
SELECT
com_types.id,
com_types.type,
com_types.name,
COUNT(CASE WHEN com_assets.is_assigned = 0 THEN 1 END) AS on_hand,
COUNT(CASE WHEN com_assets.is_assigned = 1 THEN 1 END) AS in_use,
COUNT(*) AS total
FROM com_types
JOIN com_assets ON com_types.type = com_assets.type_id
GROUP BY
com_types.id,
com_types.type,
com_types.name
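The conditional-count idea can be tried on the sample rows with the sketch below. It assumes Python's built-in sqlite3 module instead of MySQL, and it assumes that com_assets.type_id refers to com_types.type, which is what the expected output in the question suggests. The Lenovo row is hypothetical and is added only to show how a type with no assets behaves.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE com_types  (id INTEGER, type INTEGER, name TEXT);
CREATE TABLE com_assets (id INTEGER, type_id INTEGER, is_assigned INTEGER);
INSERT INTO com_types  VALUES (1, 1, 'Dell'), (2, 4, 'HP');
INSERT INTO com_types  VALUES (3, 7, 'Lenovo');  -- hypothetical type with no assets
INSERT INTO com_assets VALUES (1, 4, 0), (2, 1, 1);
""")

# COUNT ignores NULLs, so each CASE counts only the rows that match it.
# LEFT JOIN plus COUNT(a.id) keeps types without assets and reports 0 for them.
rows = conn.execute("""
SELECT t.id, t.type, t.name,
       COUNT(CASE WHEN a.is_assigned = 0 THEN 1 END) AS on_hand,
       COUNT(CASE WHEN a.is_assigned = 1 THEN 1 END) AS in_use,
       COUNT(a.id) AS total
FROM com_types AS t
LEFT JOIN com_assets AS a ON a.type_id = t.type
GROUP BY t.id, t.type, t.name
ORDER BY t.id
""").fetchall()

for row in rows:
    print(row)
```

The LEFT JOIN is a design choice of this sketch: with an inner join, a type that has no assets would disappear from the view instead of showing zeros.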

Mysql: event table :chronological consecutive join

I have the following table:
user_id | Membership_type | start_Date
1 | 1 | 1
1 | 1 | 2
1 | 2 | 3
1 | 3 | 4
with several users, and I need to find out, for each user, when the membership type changes and what the change is, in the following format (start_Date is a datetime; it is shown here as an int for ease of understanding):
user_id |Membership_change| change_Date
1 | 1 to 2 | 3
1 | 2 to 3 | 4
I have tried
select m1.user_id, concat(m1.Membership_type, ' to ',m2.Membership_type), m2.start_date
from table_membership m1
join table_membership m2
on m1.user_id=m2.user_id and m1.start_date<m2.start_date and m1.membership_type<>m2.membership_type
but this will return
user_id |Membership_change| change_Date
1 | 1 to 2 | 3
1 | 1 to 2 | 3
1 | 1 to 3 | 4
1 | 2 to 3 | 4
The duplicate 1 to 2 is not a problem to remove through a grouping, but I cannot seem to be able to think of a way to avoid having the 1 to 3 result. I basically just need to join chronologically from one membership to the next
Any ideas would be appreciated!
Edit: Had an idea to add the column m1.start_date and group by account_id and m1.start_date, so I would only get the first row where each entry is joined. Also a pre-sort by date before the joins, to make sure they are all in order. Will test.
You are missing GROUP BY:
select
m1.user_id,
concat(m1.Membership_type, ' to ',m2.Membership_type) as Membership_change,
m2.start_date as change_Date
from table_membership m1
join table_membership m2
on m1.user_id = m2.user_id
and m1.start_date < m2.start_date
and m1.membership_type <> m2.membership_type
GROUP BY m1.user_id, Membership_change, change_Date
Had an idea to add the column m1.start_date and group by account_id and m1.start_date, so I would only get the first row where each entry is joined. Also a pre-sort by date before the joins, to make sure they are all in order.
select m.user_id, m.membership_change, m.change_date from
(
select
m1.user_id,
concat(m1.Membership_type, ' to ',m2.Membership_type) as membership_change,
m2.start_date as change_date,
m1.start_date
from (select * from table_membership order by start_date asc)m1
join (select * from table_membership order by start_date asc)m2
on m1.user_id = m2.user_id
and m1.start_date < m2.start_date
and m1.membership_type <> m2.membership_type
GROUP BY m1.user_id, m1.start_Date
)m group by 1,2,3
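An alternative that does not depend on the order in which GROUP BY picks rows is to join each row only to the row immediately before it for the same user, and keep the pairs whose types differ. The sketch below is one way to express that, assuming Python's built-in sqlite3 module in place of MySQL and assuming start_date is unique per user. SQLite's || operator is used for concatenation where MySQL would use CONCAT().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_membership (user_id INTEGER, Membership_type INTEGER, start_date INTEGER);
INSERT INTO table_membership VALUES (1, 1, 1), (1, 1, 2), (1, 2, 3), (1, 3, 4);
""")

# For each row (cur), the correlated subquery finds the latest earlier
# start_date of the same user; prev is the row at that date.
rows = conn.execute("""
SELECT cur.user_id,
       prev.Membership_type || ' to ' || cur.Membership_type AS membership_change,
       cur.start_date AS change_date
FROM table_membership AS cur
JOIN table_membership AS prev
  ON prev.user_id = cur.user_id
 AND prev.start_date = (SELECT MAX(p.start_date)
                        FROM table_membership AS p
                        WHERE p.user_id = cur.user_id
                          AND p.start_date < cur.start_date)
WHERE prev.Membership_type <> cur.Membership_type
ORDER BY cur.user_id, cur.start_date
""").fetchall()

print(rows)  # [(1, '1 to 2', 3), (1, '2 to 3', 4)]
```

Because every row is compared only with its direct predecessor, neither the duplicate "1 to 2" nor the "1 to 3" row can appear.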

How to include dates with zero messages into the resultset anyway?

I have the following table with messages:
+---------+---------+------------+----------+
| msg_id | user_id | m_date | m_time |
+---------+---------+------------+----------+
| 1 | 1 | 2011-01-22 | 06:23:11 |
| 2 | 1 | 2011-01-23 | 16:17:03 |
| 3 | 1 | 2011-01-23 | 17:05:45 |
| 4 | 2 | 2011-01-22 | 23:58:13 |
| 5 | 2 | 2011-01-23 | 23:59:32 |
| 6 | 2 | 2011-01-24 | 21:02:41 |
| 7 | 3 | 2011-01-22 | 13:45:00 |
| 8 | 3 | 2011-01-23 | 13:22:34 |
| 9 | 3 | 2011-01-23 | 18:22:34 |
| 10 | 3 | 2011-01-24 | 02:22:22 |
| 11 | 3 | 2011-01-24 | 13:12:00 |
+---------+---------+------------+----------+
What I want is for each day, to see how many messages each user has sent BEFORE and AFTER 16:00:
SELECT
user_id,
m_date,
SUM(m_time <= '16:00') AS before16,
SUM(m_time > '16:00') AS after16
FROM messages
GROUP BY user_id, m_date
ORDER BY user_id, m_date ASC
This produces:
user_id m_date before16 after16
-------------------------------------
1 2011-01-22 1 0
1 2011-01-23 0 2
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
Because user 1 has written no messages on 2011-01-24, this date is not in the resultset. However, this is undesirable. I have a second table in my database, called "date_range":
+---------+------------+
| date_id | d_date |
+---------+------------+
| 1 | 2011-01-21 |
| 1 | 2011-01-22 |
| 1 | 2011-01-23 |
| 1 | 2011-01-24 |
+---------+------------+
I want to check the "messages" against this table. For each user, all these dates have to be in the resultset. As you can see, none of the users have written messages on 2011-01-21, and as said, user 1 has no messages on 2011-01-24. The desired output of the query would be:
user_id d_date before16 after16
-------------------------------------
1 2011-01-21 0 0
1 2011-01-22 1 0
1 2011-01-23 0 2
1 2011-01-24 0 0
2 2011-01-21 0 0
2 2011-01-22 0 1
2 2011-01-23 0 1
2 2011-01-24 0 1
3 2011-01-21 0 0
3 2011-01-22 1 0
3 2011-01-23 1 1
3 2011-01-24 2 0
How can I link the two tables so that the query result also holds rows with zero values for before16 and after16?
Edit: yes, I have a "users" table:
+---------+------------+
| user_id | user_date |
+---------+------------+
| 1 | foo |
| 2 | bar |
| 3 | foobar |
+---------+------------+
Test bed:
create table messages (msg_id integer, user_id integer, _date date, _time time);
create table date_range (date_id integer, _date date);
insert into messages values
(1,1,'2011-01-22','06:23:11'),
(2,1,'2011-01-23','16:17:03'),
(3,1,'2011-01-23','17:05:05');
insert into date_range values
(1, '2011-01-21'),
(1, '2011-01-22'),
(1, '2011-01-23'),
(1, '2011-01-24');
Query:
SELECT p._date, p.user_id,
coalesce(m.before16, 0) b16, coalesce(m.after16, 0) a16
FROM
(SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr) p
LEFT JOIN
(SELECT user_id, _date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
ORDER BY user_id, _date ASC) m
ON p.user_id = m.user_id AND p._date = m._date;
EDIT:
Your initial query is left as is; I hope it doesn't require any explanation.
SELECT DISTINCT user_id, dr._date FROM messages m, date_range dr returns the Cartesian product (CROSS JOIN) of the two tables, which gives every date in the range for each user. As I'm interested in each pair only once, I use the DISTINCT clause. Try this query with and without it.
Then I use LEFT JOIN on the two sub-selects.
This join means: first, an INNER join is performed, i.e. all rows with matching fields in the ON condition are returned. Then, each row in the left-side relation that has no match on the right side is returned with NULLs in the right-side columns (thus the name LEFT JOIN: the left relation is always there and the right side may be NULL). This join does what you expect: it returns user_id + date combinations even if there were no messages on the given date for a given user. Note that I use the user_id + date sub-select first (on the left) and the messages query second (on the right).
coalesce() is used to replace NULL with zero.
I hope this clarifies how this query works.
Give this a shot:
select u.user_id, u._date,
coalesce(sum(_time <= '16:00'), 0) as before16,
coalesce(sum(_time > '16:00'), 0) as after16
from (
select m.user_id, d._date
from messages m
cross join date_range d
group by m.user_id, d._date
) u
left join messages m on u.user_id=m.user_id
and u._date=m._date
group by u.user_id, u._date
The inner query just builds the set of all possible/desired user-date pairs. It would be more efficient to use a users table, but you didn't mention that you had one, so I won't assume. Otherwise, you just need the left join so that the non-joined records are not removed.
EDIT
--More detailed explanation: taking the query apart.
Start with the innermost query; the goal is to get a list of all desired dates for every user. Since there's a table of users and a table of dates it can look like this:
select distinct u.user_id, d.d_date
from users u
cross join date_range d
The key here is the cross join, taking every row in the users table and associating it with every row in the date_range table. The distinct keyword is really just a shorthand for a group by on all columns, and is here just in case there's duplicated data.
Note that there are several other methods of getting this same result set (like in my original query), but this is probably the simplest from both a logical and computational standpoint.
Really, the only other steps are to add the left join (associating all of the rows we got above to all available data, and not removing anything that doesn't have any data) and the group by and select components which are basically the same as you had before. So, putting everything together it looks like this:
select t.user_id, t.d_date,
coalesce(sum(m.m_time <= '16:00'), 0) as before16,
coalesce(sum(m.m_time > '16:00'), 0) as after16
from (
select distinct u.user_id, d.d_date
from users u
cross join date_range d
) t
left join messages m on t.user_id = m.user_id
and t.d_date = m.m_date
group by t.user_id, t.d_date
Based on some other comments/questions, note the explicit use of prefixes for all uses of all tables and sub-queries (which is pretty straight forward since we're not using any table more than once anymore): u for the users table, d for the date_range table, t for the sub-query containing the dates to use for each user, and m for the message table. This is probably where my first explanation fell a little short, since I used the message table twice, both times with the same prefix. It works there because of the context of both uses (one was in a sub-query), but it probably isn't the best practice.
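The cross join plus left join can be checked against the desired output with the sketch below. It assumes Python's built-in sqlite3 module instead of MySQL and stores dates and times as text, so '16:00' is compared as a string here (in MySQL the comparison would be on TIME values). The COALESCE turns the NULL that SUM returns for a user/date pair without messages into 0.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users      (user_id INTEGER);
CREATE TABLE date_range (d_date TEXT);
CREATE TABLE messages   (msg_id INTEGER, user_id INTEGER, m_date TEXT, m_time TEXT);
INSERT INTO users VALUES (1), (2), (3);
INSERT INTO date_range VALUES ('2011-01-21'), ('2011-01-22'), ('2011-01-23'), ('2011-01-24');
INSERT INTO messages VALUES
    (1,  1, '2011-01-22', '06:23:11'), (2,  1, '2011-01-23', '16:17:03'),
    (3,  1, '2011-01-23', '17:05:45'), (4,  2, '2011-01-22', '23:58:13'),
    (5,  2, '2011-01-23', '23:59:32'), (6,  2, '2011-01-24', '21:02:41'),
    (7,  3, '2011-01-22', '13:45:00'), (8,  3, '2011-01-23', '13:22:34'),
    (9,  3, '2011-01-23', '18:22:34'), (10, 3, '2011-01-24', '02:22:22'),
    (11, 3, '2011-01-24', '13:12:00');
""")

# t holds every (user, date) pair; the LEFT JOIN attaches messages where
# they exist and leaves NULLs otherwise.
rows = conn.execute("""
SELECT t.user_id, t.d_date,
       COALESCE(SUM(m.m_time <= '16:00'), 0) AS before16,
       COALESCE(SUM(m.m_time >  '16:00'), 0) AS after16
FROM (SELECT u.user_id, d.d_date
      FROM users AS u CROSS JOIN date_range AS d) AS t
LEFT JOIN messages AS m
       ON m.user_id = t.user_id AND m.m_date = t.d_date
GROUP BY t.user_id, t.d_date
ORDER BY t.user_id, t.d_date
""").fetchall()

for row in rows:
    print(row)
```

The result has one row per user and date, twelve in total, including the all-zero rows for 2011-01-21 and for user 1 on 2011-01-24.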
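It is not neat, but if you have a users table, then maybe something like this:
SELECT
user_id,
_date,
SUM(before16) AS before16,
SUM(after16) AS after16
FROM (
SELECT
user_id,
_date,
SUM(_time <= '16:00') AS before16,
SUM(_time > '16:00') AS after16
FROM messages
GROUP BY user_id, _date
UNION ALL
-- one zero row for every user/date pair, so that no pair is missing
SELECT
users.user_id,
date_range._date,
0 AS before16,
0 AS after16
FROM
users,
date_range
) AS t
-- the outer GROUP BY merges the zero rows with the real counts
GROUP BY user_id, _date
ORDER BY user_id, _date ASC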
chezy525's solution works great, I ported it to postgresql and removed/renamed some aliases:
select users_and_dates.user_id, users_and_dates._date,
SUM(case when _time <= '16:00' then 1 else 0 end) as before16,
SUM(case when _time > '16:00' then 1 else 0 end) as after16
from (
select messages.user_id, date_range._date
from messages
cross join date_range
group by messages.user_id, date_range._date
) users_and_dates
left join messages on users_and_dates.user_id=messages.user_id
and users_and_dates._date=messages._date
group by users_and_dates.user_id, users_and_dates._date;
I ran it on my machine and it worked perfectly.

MySQL JOIN based on highest date and non-unique columns

I need some help with a MySQL query I'm working on. I have data as follows.
Table 1
id date1 text number
---|------------|--------|-------
1 | 2012-12-12 | hi | 399
2 | 2011-11-11 | so | 399
5 | 2010-10-10 | what | 555
3 | 2009-09-09 | bye | 300
4 | 2008-08-08 | you | 300
Table 2
id number date2 ref
---|--------|------------|----
1 | 399 | 2012-06-06 | 40
2 | 399 | 2011-06-06 | 50
5 | 555 | 2011-03-03 | 60
For each row in Table 1, I want to get zero or one ref values from Table 2. There should be a row in the result for each row in Table 1. The number column isn't unique to either table, so the join must be made using the date1 & date2 columns, where date2 is the highest value for the number without exceeding date1 for that number.
The desired result from the above example would be like so.
date1 text number ref
------------|--------|--------|-----
2012-12-12 | hi | 399 | 40
2011-11-11 | so | 399 | 50
2010-10-10 | what | 555 | null
2009-09-09 | bye | 300 | null
2008-08-08 | you | 300 | null
You can see in the result's first row that ref 40 was chosen because, in table2, the record with ref=40 had a date2 that was less than date1 and was the highest date that met that condition.
In the result's second row, ref 50 was chosen because, in table2, the record with ref=50 had a date2 that was less than date1 and was the highest date that met that condition.
The rest of the results have null refs because date1 is always less or a corresponding number doesn't exist in table2.
I've got to a certain point but I'm stuck. The query I have so far is like this.
SELECT date1, text, number, ref
FROM table1
LEFT JOIN (
SELECT *
FROM (
SELECT *
FROM table2
WHERE date2 <= '2012-12-12'
ORDER BY date2 DESC
) tmp
GROUP BY msisdn
) tmp ON table1.number = table2.number;
The problem is that the hard coded date won't do, it should be based on date1, but I can't use date1 because it's in the outer query. Is there a way I can make this work?
I tried a similar example with different tables just now and was able to get what you wanted. Below is a similar query modified to fit your needs. You might want to change < to <= if that is what you are looking for.
SELECT a.date1, a.text, b.ref
FROM table1 a LEFT JOIN table2 b ON
( a.number = b.number
AND a.date1 > b.date2
AND b.date2 = ( SELECT MAX(x.date2)
FROM table2 x
WHERE x.number = b.number
AND x.date2 < a.date1)
)
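The query above can be run against the sample rows with the following sketch, which assumes Python's built-in sqlite3 module rather than MySQL and stores the dates as ISO text. The number column is added to the select list only so that the output lines up with the desired result.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, date1 TEXT, text TEXT, number INTEGER);
CREATE TABLE table2 (id INTEGER, number INTEGER, date2 TEXT, ref INTEGER);
INSERT INTO table1 VALUES
    (1, '2012-12-12', 'hi', 399), (2, '2011-11-11', 'so', 399),
    (5, '2010-10-10', 'what', 555), (3, '2009-09-09', 'bye', 300),
    (4, '2008-08-08', 'you', 300);
INSERT INTO table2 VALUES
    (1, 399, '2012-06-06', 40), (2, 399, '2011-06-06', 50),
    (5, 555, '2011-03-03', 60);
""")

# The correlated subquery in the ON clause picks, for each table1 row,
# the latest date2 for the same number that is earlier than date1.
rows = conn.execute("""
SELECT a.date1, a.text, a.number, b.ref
FROM table1 AS a
LEFT JOIN table2 AS b
  ON a.number = b.number
 AND b.date2 = (SELECT MAX(x.date2)
                FROM table2 AS x
                WHERE x.number = a.number
                  AND x.date2 < a.date1)
ORDER BY a.date1 DESC
""").fetchall()

for row in rows:
    print(row)
```

Rows of table1 with no qualifying table2 record keep a NULL ref (None in Python), because the LEFT JOIN preserves them.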
Untested:
SELECT t1.date1,
t1.text,
t1.number,
(SELECT a.ref
FROM TABLE_2 a
JOIN (SELECT t.number,
MAX(t.date2) AS max_date
FROM TABLE_2 t
WHERE t.number = t1.number
AND t.date2 <= t1.date1
GROUP BY t.number) b ON b.number = a.number
AND b.max_date = a.date2)
FROM TABLE_1 t1
The issue is the use of t1 in the derived table of the subselect...
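One way around referencing t1 inside a derived table is to keep the lookup as a single scalar subquery that orders by date2 and takes the first row. The sketch below shows that variant, assuming Python's built-in sqlite3 module in place of MySQL and using the table names table1 and table2 from the question. It is an alternative formulation for illustration, not the query from the answer above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER, date1 TEXT, text TEXT, number INTEGER);
CREATE TABLE table2 (id INTEGER, number INTEGER, date2 TEXT, ref INTEGER);
INSERT INTO table1 VALUES
    (1, '2012-12-12', 'hi', 399), (2, '2011-11-11', 'so', 399),
    (5, '2010-10-10', 'what', 555), (3, '2009-09-09', 'bye', 300),
    (4, '2008-08-08', 'you', 300);
INSERT INTO table2 VALUES
    (1, 399, '2012-06-06', 40), (2, 399, '2011-06-06', 50),
    (5, 555, '2011-03-03', 60);
""")

# The scalar subquery is correlated directly with t1, so no derived table
# needs to see the outer row. It yields NULL when nothing qualifies.
rows = conn.execute("""
SELECT t1.date1, t1.text, t1.number,
       (SELECT t2.ref
        FROM table2 AS t2
        WHERE t2.number = t1.number
          AND t2.date2 <= t1.date1
        ORDER BY t2.date2 DESC
        LIMIT 1) AS ref
FROM table1 AS t1
ORDER BY t1.date1 DESC
""").fetchall()

for row in rows:
    print(row)
```

This form returns exactly one row per table1 row even when table2 contains several records with the same number and date2, since LIMIT 1 picks a single one.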