Mysql join query - mysql

I'm using two tables in the database. These tables look like this:
Table A:
id | date
----------------------
12001 | 2011-01-01
13567 | 2011-01-04
13567 | 2011-01-04
11546 | 2011-01-07
13567 | 2011-01-07
18000 | 2011-01-08
Table B:
user | date | amount
----------------------------------
15467 | 2011-01-04 | 140
14568 | 2011-01-04 | 120
14563 | 2011-01-05 | 140
12341 | 2011-01-07 | 140
18000 | 2011-01-08 | 120
I need a query that will join these two tables.
The first query should return, for each date, the total number of users in table A and the number of unique users in table A. That query looks like:
SELECT COUNT(DISTINCT id) AS uniq, COUNT(*) AS total, DATE_FORMAT(date, '%Y-%m-%d') AS date FROM A GROUP BY date
From the second table I need the sum of the amounts grouped by dates.
That query looks like:
SELECT SUM(amount) AS total_amount FROM B GROUP BY DATE_FORMAT( date, '%Y-%m-%d' )
I want to merge these two queries into one on the "date" column, so that I get the following result:
date | unique | total | amount
-----------------------------------------------
2011-01-01 | 1 | 1 | 0
2011-01-04 | 1 | 2 | 260
2011-01-05 | 0 | 0 | 140
2011-01-07 | 2 | 2 | 140
2011-01-08 | 1 | 1 | 120
How can I do that using one query?
Thanks for all suggestions.

select date_format(a.date, '%Y-%m-%d') as date, a.uniq, a.total, ifnull(b.amount, 0) as amount
from (
select count(distinct id) as uniq, count(*) as total, date
from tablea
group by date
) a
left join (
select sum(amount) as amount, date
from tableb
group by date
) b on a.date = b.date
order by a.date
I assume that the date field is a datetime type. It's better to format output fields in the final result set (the date field in this case).
Your queries are fine; all they need is a join.
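One thing to watch with the LEFT JOIN above: a date that exists only in the second table (2011-01-05 in the sample data) has no row on the left side, so it would not appear in the result. A minimal sketch of one way around that, assuming it is acceptable to build the date list from both tables first. It runs against SQLite through Python's sqlite3 purely so it can be executed stand-alone; the table names tablea and tableb are taken from the answer, and COALESCE plays the role of IFNULL.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tablea (id INTEGER, date TEXT);
CREATE TABLE tableb (user INTEGER, date TEXT, amount INTEGER);
INSERT INTO tablea VALUES
  (12001,'2011-01-01'),(13567,'2011-01-04'),(13567,'2011-01-04'),
  (11546,'2011-01-07'),(13567,'2011-01-07'),(18000,'2011-01-08');
INSERT INTO tableb VALUES
  (15467,'2011-01-04',140),(14568,'2011-01-04',120),(14563,'2011-01-05',140),
  (12341,'2011-01-07',140),(18000,'2011-01-08',120);
""")

# Build the list of dates from BOTH tables, then left join each aggregate.
query = """
SELECT d.date,
       COALESCE(a.uniq, 0)   AS uniq,
       COALESCE(a.total, 0)  AS total,
       COALESCE(b.amount, 0) AS amount
FROM (SELECT date FROM tablea UNION SELECT date FROM tableb) AS d
LEFT JOIN (SELECT date, COUNT(DISTINCT id) AS uniq, COUNT(*) AS total
           FROM tablea GROUP BY date) AS a ON a.date = d.date
LEFT JOIN (SELECT date, SUM(amount) AS amount
           FROM tableb GROUP BY date) AS b ON b.date = d.date
ORDER BY d.date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data from the question this should list all five dates, including 2011-01-05 with zero users and an amount of 140.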

Related

Sum datetime difference for values of same column and group by day

I have a table with 'ON' and 'OFF' values in column activity and another column datetime.
id(AUTOINCREMENT) id_device activity datetime
1 a ON 2017-05-26 22:00:00
2 b ON 2017-05-26 05:00:00
3 a OFF 2017-05-27 04:00:00
4 b OFF 2017-05-26 08:00:00
5 a ON 2017-05-28 12:00:00
6 a OFF 2017-05-28 15:00:00
I need to get total ON time by day
day id_device total_minutes_on
2017-05-26 a 120
2017-05-26 b 180
2017-05-27 a 240
2017-05-27 b 0
2017-05-28 a 180
2017-05-28 b 0
I have searched and tried answers from other posts. I tried TimeDifference and I get the correct total time,
but I can't find a way to get the total time grouped by date.
I appreciate your help.
I'm not posting this as a definitive answer; it's more of an experiment for me, and hopefully you'll find it useful in your case. I should also mention that the MySQL version I'm working with is quite old, so the method I'm using is very manual, to say the least.
First of all, let's break down your expected output:
The date value in day needs to be repeated for each id_device, a and b.
Minutes are calculated based on the activity: if the activity is still 'ON' at the end of the day, minutes are counted until the day ends at 24:00:00, and the next day counts minutes from 00:00:00 until the activity goes 'OFF'.
What I come up with is this:
Creating condition (1):
SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY dtt,id_device;
The query above will return the following result:
+------------+-----------+
| dtt | id_device |
+------------+-----------+
| 2017-05-26 | a |
| 2017-05-26 | b |
| 2017-05-27 | a |
| 2017-05-27 | b |
| 2017-05-28 | a |
| 2017-05-28 | b |
+------------+-----------+
*The query above only covers dates that already appear in the table. If you want every date regardless of whether there's any activity, I suggest creating a calendar table (refer: Generating a series of dates).
So this becomes the base query. Then I added an outer query that left joins the query above with the original data table:
SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
What's new in this query is the GROUP_CONCAT operation on both activity and the time value extracted from the datetime column, which is converted into seconds. Notice that both GROUP_CONCAT calls use the same ORDER BY, which is important so that each activity lines up with its corresponding time value.
The query above will return the following result:
+------------+-----------+----------+-------------+
| dtt | id_device | activity | tr |
+------------+-----------+----------+-------------+
| 2017-05-26 | a | ON | 79200 |
| 2017-05-26 | b | ON OFF | 18000 28800 |
| 2017-05-27 | a | OFF | 14400 |
| 2017-05-27 | b | (NULL) | (NULL) |
| 2017-05-28 | a | ON OFF | 43200 54000 |
| 2017-05-28 | b | (NULL) | (NULL) |
+------------+-----------+----------+-------------+
From here, I wrapped it in one more outer query to calculate the minutes and get the expected result:
SELECT dtt,id_device,
CASE
WHEN SUBSTRING_INDEX(activity,' ',1)='ON' AND SUBSTRING_INDEX(activity,' ',-1)='OFF'
THEN (SUBSTRING_INDEX(tr,' ',-1)-SUBSTRING_INDEX(tr,' ',1))/60
WHEN activity='ON' THEN 1440-(tr/60)
WHEN activity='OFF' THEN tr/60
WHEN activity IS NULL AND tr IS NULL THEN 0
END AS 'total_minutes_on'
FROM
-- from the last query
(SELECT v.*,
GROUP_CONCAT(w.activity ORDER BY w.datetime SEPARATOR ' ') activity,
GROUP_CONCAT(TIME_TO_SEC(TIME(w.datetime)) ORDER BY w.datetime SEPARATOR ' ') tr
FROM
-- this was the first query
(SELECT * FROM
(SELECT DATE(datetime) dtt FROM mytable GROUP BY DATE(datetime)) a,
(SELECT id_device FROM mytable GROUP BY id_device) b
ORDER BY a.dtt,b.id_device) v
--
LEFT JOIN
mytable w
ON v.dtt=DATE(w.datetime) AND v.id_device=w.id_device
GROUP BY DATE(v.dtt),v.id_device
--
) z
The last step works like this: if the activity value has both ON and OFF on the same day, then (OFF - ON) / 60 gives the total minutes. If the activity is only ON, the day ends at '24:00:00', which is 24 hr * 60 min = 1440 minutes, so 1440 - (ON / 60) gives the total minutes. If the activity is only OFF, I just convert the seconds to minutes, because the day starts at 00:00:00 anyway.
+------------+-----------+------------------+
| dtt | id_device | total_minutes_on |
+------------+-----------+------------------+
| 2017-05-26 | a | 120 |
| 2017-05-26 | b | 180 |
| 2017-05-27 | a | 240 |
| 2017-05-27 | b | 0 |
| 2017-05-28 | a | 180 |
| 2017-05-28 | b | 0 |
+------------+-----------+------------------+
Hopefully this will give you some ideas. ;)
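As far as I can tell, the CASE expression above covers at most one ON and one OFF per device per day, since it only looks at the first and last token of each list. If the pairing were done in application code instead, the midnight split could be handled generally. A rough sketch in Python, assuming events arrive as (id_device, activity, datetime) tuples and that every OFF closes the most recent ON of the same device; the function name minutes_on_per_day is made up for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (id_device, activity, datetime).
events = [
    ("a", "ON",  "2017-05-26 22:00:00"),
    ("b", "ON",  "2017-05-26 05:00:00"),
    ("a", "OFF", "2017-05-27 04:00:00"),
    ("b", "OFF", "2017-05-26 08:00:00"),
    ("a", "ON",  "2017-05-28 12:00:00"),
    ("a", "OFF", "2017-05-28 15:00:00"),
]

def minutes_on_per_day(events):
    """Return {(day, device): minutes ON}, splitting intervals at midnight."""
    totals = defaultdict(int)
    started = {}  # device -> datetime of the pending ON event
    parsed = sorted(
        (datetime.strptime(ts, "%Y-%m-%d %H:%M:%S"), dev, act)
        for dev, act, ts in events
    )
    for ts, dev, act in parsed:
        if act == "ON":
            started[dev] = ts
        elif act == "OFF" and dev in started:
            start = started.pop(dev)
            # Walk the interval one calendar day at a time.
            while start < ts:
                next_midnight = datetime.combine(
                    start.date() + timedelta(days=1), datetime.min.time()
                )
                end = min(ts, next_midnight)
                minutes = int((end - start).total_seconds() // 60)
                totals[(start.date().isoformat(), dev)] += minutes
                start = end
    return dict(totals)

result = minutes_on_per_day(events)
for key in sorted(result):
    print(key, result[key])
```

Days on which a device was never ON are simply absent from the dictionary; the zero rows would still come from the date/device cross product shown in the first query.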

mysql need complete count of a column and group by some columns

I need a complete count for each person_id in the database, as a date-wise report.
SELECT date, person_id, count(person_id)
FROM visits
group by date, person_id
I tried this query, but it didn't give the result I expected.
Date | person_id| count(person_id)
2018-01-01 | 33000 | 10 |
2018-01-01 | 712000 | 111 |
2018-01-01 | 730000 | 30 |
2018-01-01 | 743000 | 5 |
2018-01-01 | 755000 | 123 |
Do you need a total appended to each row of your query result? For example:
Date | person_id| count(person_id) | total
2018-01-01 | 33000 | 10 | 1000
2018-01-01 | 712000 | 111 | 1000
Is that right? If so, I don't think doing it in a single SQL query is a good idea. In my case, I would run two queries asynchronously and then merge the results,
like this:
query1:
SELECT date, person_id, count(person_id)
FROM visits
group by date, person_id
query2:
SELECT count(person_id) as total
FROM visits
and then merge the results in the program.
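A small sketch of that merge step, assuming a Python client. It uses SQLite through sqlite3 only so it can be run stand-alone, the rows inserted into visits are made-up sample data, and the two queries run one after the other rather than asynchronously to keep the example short.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample rows (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (date TEXT, person_id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?)",
    [
        ("2018-01-01", 33000),
        ("2018-01-01", 33000),
        ("2018-01-01", 712000),
        ("2018-01-02", 33000),
    ],
)

# query1: count per date and person_id
per_group = conn.execute(
    "SELECT date, person_id, COUNT(person_id) FROM visits "
    "GROUP BY date, person_id ORDER BY date, person_id"
).fetchall()

# query2: overall total
(total,) = conn.execute("SELECT COUNT(person_id) FROM visits").fetchone()

# Merge in the program: append the total to every grouped row.
merged = [row + (total,) for row in per_group]
for row in merged:
    print(row)
```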

MySQL report -- fill in empty dates

I am building a query to return daily sales data. My current query returns a table similar to this:
----------------------------------
| DATE | SKU | TOTAL |
----------------------------------
| 2014-11-01 | AV155_A | 209.00 |
| 2014-11-02 | AV155_B | 627.00 |
| 2014-11-04 | AV155_C | 279.00 |
| 2014-11-05 | AV155 | 279.00 |
| 2014-11-08 | AV1556_A | 209.00 |
| 2014-11-09 | AV1556_B | 627.00 |
| 2014-11-10 | AV1556_C | 279.00 |
| 2014-11-12 | AV1556 | 279.00 |
What I would like is a results table that displays every day, even if there are no data points for that particular day. Something like this:
----------------------------------
| DATE | SKU | TOTAL |
----------------------------------
| 2014-11-01 | AV155_A | 209.00 |
| 2014-11-02 | AV155_B | 627.00 |
| 2014-11-03 | | 0 |
| 2014-11-04 | AV155_C | 279.00 |
| 2014-11-05 | AV155 | 279.00 |
| 2014-11-06 | | 0 |
| 2014-11-07 | | 0 |
| 2014-11-08 | AV1556_A | 209.00 |
| 2014-11-09 | AV1556_B | 627.00 |
| 2014-11-10 | AV1556_C | 279.00 |
| 2014-11-11 | | 0 |
| 2014-11-12 | AV1556 | 279.00 |
The query I currently have looks like this:
select
DATE_FORMAT(created_on, '%m-%d-%Y') as date,
sku,
SUM(price) as total
FROM order_items
WHERE created_on between FROM_UNIXTIME(1415577600) AND NOW()
GROUP BY MONTH(created_on), DAY(created_on), order_item_sku;
You need to use an outer join. The easiest way is if you have a calendar table, but you can make one on the fly:
select c.thedate, oi.sku, sum(price) as total
from (select date('2014-11-01') as thedate union all
select date('2014-11-02') union all
select date('2014-11-03') union all
select date('2014-11-04') union all
select date('2014-11-05') union all
select date('2014-11-06') union all
select date('2014-11-07') union all
select date('2014-11-08') union all
select date('2014-11-09') union all
select date('2014-11-10') union all
select date('2014-11-11') union all
select date('2014-11-12')
) c left join
order_items oi
on c.thedate = date(oi.created_on)
-- keep the range filter in the on clause: in a where clause it would discard the empty dates
and oi.created_on between FROM_UNIXTIME(1415577600) AND NOW()
group by c.thedate, oi.sku
Here's an answer that addresses the need for a flexible list of dates. You need to figure out a way to get a virtual table containing all the dates in the appropriate range, and then join them to the summary. Here’s a query that will get the dates in the range.
SELECT mintime + INTERVAL seq.seq DAY AS reportdate
FROM (
SELECT MIN(DATE(created_on)) AS mintime,
MAX(DATE(created_on)) AS maxtime
FROM order_items
WHERE created_on >= starting_time
AND created_on <= NOW()
) AS order_items
JOIN seq_0_to_999 AS seq
ON seq.seq <= TIMESTAMPDIFF(DAY,mintime,maxtime)
What’s going on here? Three things.
We have a subquery which determines the first and last day (min and max created_on) we care about reporting.
We apply a time range to that query. I like to avoid using BETWEEN for timestamp ranges because it often gets the ending time wrong in an off-by-one-second error.
We have a table called seq_0_to_999. It contains a sequence of a thousand cardinal numbers: the integers starting at zero. More about this in a moment.
Then, you can join that as a subquery to your aggregate query to get all the dates in the range listed, like so.
select DATE_FORMAT(d.reportdate, '%m-%d-%Y') as date,
sku,
SUM(price) as total
FROM (
SELECT mintime + INTERVAL seq.seq DAY AS reportdate
FROM (
SELECT MIN(DATE(created_on)) AS mintime,
MAX(DATE(created_on)) AS maxtime
FROM order_items
WHERE created_on >= starting_time
AND created_on <= NOW()
) AS order_items
JOIN seq_0_to_999 AS seq
ON seq.seq <= TIMESTAMPDIFF(DAY,mintime,maxtime)
) AS d
LEFT JOIN order_items ON d.reportdate = DATE(order_items.created_on)
AND order_items.created_on >= starting_time
AND order_items.created_on <= NOW()
GROUP BY d.reportdate, sku
ORDER BY d.reportdate, sku
It looks like a big nasty hairball of a query. But if you think of it as a sandwich made of various layers of queries, it really isn't that complicated.
It uses LEFT JOIN so it makes sure all the dates in the range are preserved even if there's no corresponding data in your order_items table.
Finally, what about this seq_0_to_999 table? Where do we get those integers starting with zero? The answer is this: we have to arrange to do that; those numbers aren’t built in to MySQL. (They are built into the MySQL fork called MariaDB.) Create a short table with the integers from 0-9 in it, like so:
DROP TABLE IF EXISTS seq_0_to_9;
CREATE TABLE seq_0_to_9 AS
SELECT 0 AS seq UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9;
Then create a view that joins that table with itself to generate 1000 combinations like this:
DROP VIEW IF EXISTS seq_0_to_999;
CREATE VIEW seq_0_to_999 AS (
SELECT (a.seq + 10 * (b.seq + 10 * c.seq)) AS seq
FROM seq_0_to_9 a
JOIN seq_0_to_9 b
JOIN seq_0_to_9 c
);
I wrote this up in some detail at http://www.plumislandmedia.net/mysql/filling-missing-data-sequences-cardinal-integers/
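To see the whole sandwich run end to end, here is a small sketch that builds the digits table and the seq_0_to_999 view, generates one row per day, and left joins the order data. It is an approximation rather than the MySQL query itself: it uses SQLite through Python's sqlite3 so it can be executed stand-alone, so date(x, '+N days') and julianday() stand in for INTERVAL arithmetic and TIMESTAMPDIFF, and the three order rows are made-up sample data.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE seq_0_to_9 (seq INTEGER);
INSERT INTO seq_0_to_9 VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);
CREATE VIEW seq_0_to_999 AS
  SELECT a.seq + 10 * (b.seq + 10 * c.seq) AS seq
  FROM seq_0_to_9 a CROSS JOIN seq_0_to_9 b CROSS JOIN seq_0_to_9 c;
CREATE TABLE order_items (created_on TEXT, sku TEXT, price REAL);
INSERT INTO order_items VALUES
  ('2014-11-01 10:00:00','AV155_A',209.0),
  ('2014-11-02 11:30:00','AV155_B',627.0),
  ('2014-11-04 09:15:00','AV155_C',279.0);
""")

query = """
SELECT d.reportdate, oi.sku, COALESCE(SUM(oi.price), 0) AS total
FROM (
  -- one row per day between the first and last order date, inclusive
  SELECT date(r.mintime, '+' || s.seq || ' days') AS reportdate
  FROM (SELECT MIN(date(created_on)) AS mintime,
               MAX(date(created_on)) AS maxtime
        FROM order_items) AS r
  JOIN seq_0_to_999 AS s
    ON s.seq <= julianday(r.maxtime) - julianday(r.mintime)
) AS d
LEFT JOIN order_items AS oi ON date(oi.created_on) = d.reportdate
GROUP BY d.reportdate, oi.sku
ORDER BY d.reportdate
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The day without orders (2014-11-03 in this sample) should come back with an empty sku and a total of 0.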

Sort sql results by Month with unix

Here's what I'm trying to achieve.
I have two SQL tables, transactions and payplans.
Below are the structures of the two tables:
transactions
uid | plan | date | payid | status
------------------------------------
12 | 3 | 1388534400 | 334 | 1
699 | 4 | 1388214400 | 335 | 1
payplans:
plan | plan_price
-------------------
3 | 9.99
4 | 19.99
With this query:
SELECT SUM(plan_price)
FROM transactions AS t
INNER JOIN payplans AS p
ON t.plan = p.plan
WHERE t.status = '1'
I was able to calculate the total sum of all "plan_price" rows,
but I would like to have the price sum for every month, starting from January 2013.
For example:
jan-13 | 9.99
feb-13 | 29.99
etc.
For MySQL
SELECT date_format(FROM_UNIXTIME(t.date), '%b-%y') as mnth,
SUM(plan_price)
FROM transactions AS t
INNER JOIN payplans AS p
ON t.plan = p.plan
WHERE t.status = '1'
GROUP BY mnth;
You convert the Unix timestamp to a date using FROM_UNIXTIME,
format it into 'Mon-YY' format with DATE_FORMAT,
then group by that month value.
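Since the title asks about sorting: a label such as 'Jan-14' sorts alphabetically, not chronologically, so it can help to group and order by a year-month key and format the label separately. A minimal sketch of that idea, assuming it is fine to run against SQLite through Python's sqlite3 as a stand-in; there strftime(..., 'unixepoch') plays the role of FROM_UNIXTIME plus DATE_FORMAT and works in UTC, whereas MySQL uses the session time zone.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (uid INTEGER, plan INTEGER, date INTEGER,
                           payid INTEGER, status INTEGER);
CREATE TABLE payplans (plan INTEGER, plan_price REAL);
INSERT INTO transactions VALUES
  (12, 3, 1388534400, 334, 1),
  (699, 4, 1388214400, 335, 1);
INSERT INTO payplans VALUES (3, 9.99), (4, 19.99);
""")

# Group by a sortable 'YYYY-MM' key so ORDER BY gives chronological order.
rows = conn.execute("""
SELECT strftime('%Y-%m', t.date, 'unixepoch') AS mnth,
       SUM(p.plan_price) AS total
FROM transactions AS t
JOIN payplans AS p ON t.plan = p.plan
WHERE t.status = 1
GROUP BY mnth
ORDER BY mnth
""").fetchall()
for row in rows:
    print(row)
```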

MySQL query based on time range, group users, and sum values over a sliding window

I want to create a new Table B based on the information from another existing Table A. I'm wondering if MySQL has the functionality to take into account a range of time and group column A values then only sum up the values in a column B based on those groups in column A.
Table A stores logs of events like a journal for users. There can be multiple events from a single user in a single day. Say hypothetically I'm keeping track of when my users eat fruit and I want to know how many fruit they eat in a week (7days) and also how many apples they eat.
So in Table B I want to count for each entry in Table A, the previous 7 day total # of fruit and apples.
EDIT:
I'm sorry, I oversimplified the information I gave and didn't think my example through thoroughly.
Initially I have only Table A. I'm trying to create Table B from a query.
Assume:
User/id can log an entry multiple times in a day.
sum counts should be for id between date and date - 7 days
fruit column stands for the total # of fruit during the 7 day interval ( apples and bananas are both fruit)
The data doesn't only start at 2013-9-5. It can date back to 2000, and I want to use the 7 day sliding window over all the dates between 2000 and 2013.
The sum count is over a sliding window of 7 days
Here's an example:
Table A:
| id | date-time | apples | banana |
---------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 1 |
| 2 | 2013-9-5 09:00:00 | 1 | 0 |
| 1 | 2013-9-5 16:00:00 | 1 | 0 |
| 1 | 2013-9-6 08:00:00 | 0 | 1 |
| 2 | 2013-9-9 08:00:00 | 1 | 1 |
| 1 | 2013-9-11 08:00:00 | 0 | 1 |
| 1 | 2013-9-12 08:00:00 | 0 | 1 |
| 2 | 2013-9-13 08:00:00 | 1 | 1 |
note: user 1 logged 2 entries on 2013-9-5
The result after the query should be Table B.
Table B
| id | date-time | apples | fruit |
--------------------------------------------
| 1 | 2013-9-5 08:00:00 | 1 | 2 |
| 2 | 2013-9-5 09:00:00 | 1 | 1 |
| 1 | 2013-9-5 16:00:00 | 2 | 3 |
| 1 | 2013-9-6 08:00:00 | 2 | 4 |
| 2 | 2013-9-9 08:00:00 | 2 | 3 |
| 1 | 2013-9-11 08:00:00 | 2 | 5 |
| 1 | 2013-9-12 08:00:00 | 0 | 3 |
| 2 | 2013-9-13 08:00:00 | 2 | 4 |
At 2013-9-12 the sliding window moves and only includes 9-6 to 9-12. That's why id 1 goes from a sum of 2 apples to 0 apples.
You need years in your data to be able to use date arithmetic correctly. I added them.
There's an odd thing in your data. You seem to have multiple log entries for each person for each day. You're assuming an implicit order setting the later log entries somehow "after" the earlier ones. If SQL and MySQL do that, it's only by accident: there's no implicit ordering of rows in a table. Plus if we duplicate date/id combinations, the self join (read on) has lots of duplicate rows and ruins the sums.
So we need to start by creating a daily summary table of your data, like so:
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
This summary will contain at most one row per id per day.
Next we need to do a limited cross product self-join, so we get seven days' worth of fruit eating.
select /* whatever columns you need */
from (
-- summary query --
) as a
join (
-- same summary query once again
) as b
on ( a.id = b.id
and b.`date` between a.`date` - interval 6 day AND a.`date` )
The between clause in the on gives us the seven days (today, and the six days prior). Notice that the table in the join with the alias b is the seven day stuff, and the a table is the today stuff.
Finally, we have to summarize that result according to your specification. The resulting query is this.
select a.id, a.`date`,
sum(b.apples) + sum(b.banana) as fruit_last_week,
a.apples as apple_today
from (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as a
join (
select id, `date`, sum(apples) as apples, sum(banana) as banana
from fruit
group by id, `date`
) as b on (a.id = b.id and
b.`date` between a.`date` - interval 6 day AND a.`date` )
group by a.id, a.`date`, a.apples
order by a.`date`, a.id
Here's a fiddle: http://sqlfiddle.com/#!2/670b2/15/0
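In case the fiddle link stops working, here is a self-contained sketch of the same summarize-then-self-join idea using the sample rows from the question. Several things in it are assumptions: it runs on SQLite through Python's sqlite3 rather than MySQL, the timestamp column is called logged_at, the daily summary is wrapped in a view named daily, and apples are summed over the window as well (unlike apple_today above) so the output can be compared with Table B.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fruit (id INTEGER, logged_at TEXT, apples INTEGER, banana INTEGER);
INSERT INTO fruit VALUES
  (1,'2013-09-05 08:00:00',1,1),(2,'2013-09-05 09:00:00',1,0),
  (1,'2013-09-05 16:00:00',1,0),(1,'2013-09-06 08:00:00',0,1),
  (2,'2013-09-09 08:00:00',1,1),(1,'2013-09-11 08:00:00',0,1),
  (1,'2013-09-12 08:00:00',0,1),(2,'2013-09-13 08:00:00',1,1);
-- daily summary: at most one row per id per day
CREATE VIEW daily AS
  SELECT id, date(logged_at) AS day,
         SUM(apples) AS apples, SUM(banana) AS banana
  FROM fruit GROUP BY id, date(logged_at);
""")

# Self-join: b holds the seven-day window (today and the six days prior).
rows = conn.execute("""
SELECT a.id, a.day,
       SUM(b.apples) AS apples_7d,
       SUM(b.apples) + SUM(b.banana) AS fruit_7d
FROM daily AS a
JOIN daily AS b
  ON b.id = a.id
 AND b.day BETWEEN date(a.day, '-6 days') AND a.day
GROUP BY a.id, a.day
ORDER BY a.day, a.id
""").fetchall()
for row in rows:
    print(row)
```

Because of the daily summary there is one output row per id per day, so the two 2013-09-05 entries for id 1 collapse into a single row carrying the end-of-day totals.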
Assumptions:
one row per id/date
the counts should be for id between date and date - 7 days
"fruit" = "banana"
the "date" column is actually a date (including year) and not just month/day
then this SQL should do the trick:
INSERT INTO B
SELECT a1.id, a1.date, SUM( a2.banana ), SUM( a2.apples )
FROM (SELECT DISTINCT id, date
FROM A
WHERE date > NOW() - INTERVAL 7 DAY
) a1
JOIN A a2
ON a2.id = a1.id
AND a2.date <= a1.date
AND a2.date >= a1.date - INTERVAL 7 DAY
GROUP BY a1.id, a1.date
Some questions:
Are the above assumptions correct?
Does table A contain more fruits than just Bananas and Apples? If so, what does the real structure look like?