Mysql...Impossible query? - mysql

Difficult query. Not sure if this is possible.
I am trying to figure out how to format a query that needs to accomplish several things.
(1) I need to parse the DATE field, which is a VARCHAR rather than a SQL date field, to get the month of each date.
(2) I then need to AVG all the PTS fields by NAME and month. With my example data below, John's row would have 2 in the JAN column and 3 in the APR column, and the AVG column would be an average of all the months. So each month column is the average of all the entries in that month, and the AVG column is the average of all the month columns in the row.
Table:
Name (VARCHAR)   PTS (INT)   DATE (VARCHAR)
---------------------------------------------
John             3           Tue Apr 14 17:56:02 2020
Chris            2           Tue Apr 14 19:44:03 2020
John             2           Mon Jan 30 15:23:03 2020
Chris            4           Fri Feb 28 16:15:15 2020
John             3           Tue Apr 14 17:56:02 2020
Table Layout on web page:
Name Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec Average

Not impossible, just convoluted. You can use STR_TO_DATE to convert your strings into DATETIME values, and then use MONTH to get the month number from each. Note though (as @DRapp commented) that you should be storing DATETIME values in their native form, not as VARCHAR; then you wouldn't have to deal with STR_TO_DATE. Having got the month number, you can use conditional aggregation to get the results you want:
SELECT name,
       COALESCE(AVG(CASE WHEN mth = 1 THEN PTS END), 0) AS Jan,
       COALESCE(AVG(CASE WHEN mth = 2 THEN PTS END), 0) AS Feb,
       COALESCE(AVG(CASE WHEN mth = 3 THEN PTS END), 0) AS Mar,
       COALESCE(AVG(CASE WHEN mth = 4 THEN PTS END), 0) AS Apr,
       -- repeat for May to November
       COALESCE(AVG(CASE WHEN mth = 12 THEN PTS END), 0) AS `Dec`,
       AVG(PTS) AS AVG
FROM (
    SELECT name, PTS AS PTS, MONTH(STR_TO_DATE(DATE, '%a %b %e %H:%i:%s %Y')) AS mth
    FROM data
) d
GROUP BY name
Output (for your sample data):
name    Jan   Feb   Mar   Apr      Dec   AVG
Chris   0     4     0     2        0     3
John    0     0     0     2.6667   0     2.6667
Demo on SQLFiddle
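For readers who want to check the arithmetic outside MySQL, here is a minimal Python sketch of the same idea: parse the month out of each date string, then average PTS per name and month. The names ROWS, month_of and monthly_averages are made up for this illustration, and the weekday token is simply dropped before parsing because it is not needed to find the month. Like the query, the overall figure is the average of all of a person's entries.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (name, pts, date string).
ROWS = [
    ("John", 3, "Tue Apr 14 17:56:02 2020"),
    ("Chris", 2, "Tue Apr 14 19:44:03 2020"),
    ("John", 2, "Mon Jan 30 15:23:03 2020"),
    ("Chris", 4, "Fri Feb 28 16:15:15 2020"),
    ("John", 3, "Tue Apr 14 17:56:02 2020"),
]


def month_of(text):
    """Return the month number (1-12) of a 'Tue Apr 14 17:56:02 2020' string."""
    # Drop the leading weekday token, then parse "Apr 14 17:56:02 2020".
    rest = text.split(" ", 1)[1]
    return datetime.strptime(rest, "%b %d %H:%M:%S %Y").month


def monthly_averages(rows):
    """Map each name to (per-month averages, average of all entries)."""
    # name -> month number -> list of points
    buckets = defaultdict(lambda: defaultdict(list))
    for name, pts, text in rows:
        buckets[name][month_of(text)].append(pts)

    result = {}
    for name, by_month in buckets.items():
        months = {m: sum(v) / len(v) for m, v in by_month.items()}
        all_pts = [p for v in by_month.values() for p in v]
        result[name] = (months, sum(all_pts) / len(all_pts))
    return result


report = monthly_averages(ROWS)
for name, (months, overall) in sorted(report.items()):
    print(name, months, round(overall, 4))
```

Months with no entries are simply absent from the inner dictionary; a page template could print 0 for them, which is the role COALESCE plays in the query.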

Related

ORDER BY MONTH(1) after GROUP_CONCAT as names?

If I don't use GROUP_CONCAT(), I have no difficulty grouping and ordering the rows by date-month-year.
With the following code:
SELECT FROM_UNIXTIME(orders.date_time,'%d %m %Y') AS date,
SUM(orders.net_amount) AS total_sales,
COUNT(FROM_UNIXTIME(orders.date_time,'%D %b %Y')) AS total_orders
FROM orders
JOIN users ON orders.user_id = users.id
WHERE FROM_UNIXTIME(orders.date_time,'%d %m %Y') != DATE_FORMAT(users.reg_date_time, '%d %m %Y')
GROUP BY date
ORDER BY Month(1)
O/P:
21 12 2019 1092 1 pinky
04 01 2020 1050 1 harshit
30 12 2019 21 1 robin
05 01 2020 987 2 chetan
31 12 2019 1239 2 rahul
30 11 2019 157.5 1 rahul
01 01 2020 651 1 rahul
15 12 2019 1575 1 isha
03 01 2020 598.5 1 manvi
Notice that the names are not concatenated.
But as soon as I add this line:
GROUP_CONCAT(users.firstname SEPARATOR '-') AS names
like this:
SELECT FROM_UNIXTIME(orders.date_time,'%d %m %Y') AS date,
SUM(orders.net_amount) AS total_sales,
GROUP_CONCAT(users.firstname SEPARATOR '-') AS names,
COUNT(FROM_UNIXTIME(orders.date_time,'%D %b %Y')) AS total_orders
FROM orders
JOIN users ON orders.user_id = users.id
WHERE FROM_UNIXTIME(orders.date_time,'%d %m %Y') != DATE_FORMAT(users.reg_date_time, '%d %m %Y')
GROUP BY date
ORDER BY Month(1)
O/P:
01 01 2020 651 1 rahul
03 01 2020 598.5 1 manvi
04 01 2020 1050 1 harshit
05 01 2020 987 2 chetan-saurabh
15 12 2019 1575 1 isha
21 12 2019 1092 1 pinky
30 11 2019 157.5 1 rahul
30 12 2019 21 1 robin
31 12 2019 1239 2 rahul-manvi
then the rows are ordered by day only (without proper month and year order), although the grouping is correct.
Am I doing something wrong?
Use ORDER BY MONTH(FROM_UNIXTIME(orders.date_time)). The problem is that your date column is a '%d %m %Y' string, which is not a valid MySQL date, so the month is not being extracted from it correctly.
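The difference between ordering formatted strings and ordering real dates can be seen in a small Python sketch. The four labels are taken from the output above; the variable names are only for illustration.

```python
from datetime import datetime

# A few of the 'dd mm yyyy' labels from the question's output.
labels = ["01 01 2020", "15 12 2019", "30 11 2019", "04 01 2020"]

# Sorting the text compares the day digits first, so the years get mixed up.
as_text = sorted(labels)

# Sorting on the parsed date gives chronological order.
as_dates = sorted(labels, key=lambda s: datetime.strptime(s, "%d %m %Y"))

print(as_text)
print(as_dates)
```

The same reasoning applies in SQL: order by the underlying date value, and use the formatted string only for display.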

How to add rows for missing combination of data and impute corresponding fields with 0

I have combinations of domain and month, with the total orders for each domain in that month. I would like to fill in the missing combinations with 0 values. What is the least expensive aggregation that can be used in PySpark to achieve this?
I have the following input table:
domain month year total_orders
google.com 01 2017 20
yahoo.com 02 2017 30
google.com 03 2017 30
yahoo.com 03 2017 40
a.com 04 2017 50
a.com 05 2017 50
a.com 06 2017 50
Expected Output:
domain month year total_orders
google.com 01 2017 20
yahoo.com 02 2017 30
google.com 03 2017 30
yahoo.com 03 2017 40
a.com 04 2017 50
a.com 05 2017 50
a.com 06 2017 50
google.com 02 2017 0
google.com 04 2017 0
yahoo.com 04 2017 0
google.com 05 2017 0
yahoo.com 05 2017 0
google.com 06 2017 0
yahoo.com 06 2017 0
The order of the rows in the expected output does not matter.
The simplest method is to combine all months and years for each domain:
select my.year, my.month, d.domain, coalesce(t.total_orders, 0) as total_orders
from (select distinct month, year from input) my cross join
     (select distinct domain from input) d left join
     input t
     on t.month = my.month and t.year = my.year and t.domain = d.domain;
Note: This assumes that each year/month combination occurs at least once, somewhere in the data.
Getting values within a range is a pain because you have split the date into multiple columns. Let me assume the years are all the same, as in your example:
select my.year, my.month, d.domain, coalesce(t.total_orders, 0) as total_orders
from (select distinct month, year from input) my join
     (select domain, min(month) as min_month, max(month) as max_month
      from input
      group by domain
     ) d
     on my.month >= d.min_month and my.month <= d.max_month left join
     input t
     on t.month = my.month and t.year = my.year and t.domain = d.domain
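As a rough illustration of the first query's logic (every distinct domain crossed with every distinct year/month, then a lookup that defaults to 0), here is a plain-Python sketch. It is not PySpark, and the name fill_missing is made up for this example. Because it pairs every domain with every period, it returns more rows than the expected output listed in the question.

```python
from itertools import product

# Input rows from the question: (domain, month, year, total_orders).
rows = [
    ("google.com", "01", "2017", 20),
    ("yahoo.com", "02", "2017", 30),
    ("google.com", "03", "2017", 30),
    ("yahoo.com", "03", "2017", 40),
    ("a.com", "04", "2017", 50),
    ("a.com", "05", "2017", 50),
    ("a.com", "06", "2017", 50),
]


def fill_missing(rows):
    """Return one row per (domain, period), using 0 where no data exists."""
    seen = {(d, m, y): n for d, m, y, n in rows}
    domains = sorted({d for d, _, _, _ in rows})
    periods = sorted({(y, m) for _, m, y, _ in rows})
    # Cross join of domains and periods, then a "left join" via dict lookup.
    return [
        (d, m, y, seen.get((d, m, y), 0))
        for d, (y, m) in product(domains, periods)
    ]


filled = fill_missing(rows)
for row in filled:
    print(*row)
```

As in the SQL version, a period only appears if at least one domain has data for it.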

MYSQL employee working hours for each day in date range

Hi, I have a MySQL query that looks into a table holding the hours each employee works on each day of the week. The query takes the weekday of a given date and returns the hours worked on that day; for example, 2015-11-24 is a Tuesday, so the result is 8.
How do I run the query, without PHP, for every day in a date range?
Thanks
E.g. $holidayStart = '2015-11-24';
$holidayEnd = '2015-11-30';
# Table: employees hours
id   Mon   Tue   Wed   Thu   Fri   Sat   Sun
1    8     8     0     8     8     8     0
$sql= "select empId,
'".$holidayStart."' as date,
case dayname('".$holidayStart."')
when 'Sunday' then Sun
when 'Monday' then Mon
when 'Tuesday' then Tue
when 'Wednesday' then Wed
when 'Thursday' then Thu
when 'Friday' then Fri
when 'Saturday' then Sat
else 0 end as hours
from employees" ;
Edit
I have one table that holds the employee id number (empId) and then a column for each day of the week: Mon, Tue, Wed, Thu, Fri, Sat, Sun.
--------------------------------------------------------------
| empid | Mon | Tue | Wed | Thu | Fri | Sat | Sun |
--------------------------------------------------------------
| 1     | 8   | 8   | 0   | 8   | 8   | 8   | 0   |
| 2     | 0   | 8   | 8   | 0   | 8   | 8   | 8   |
--------------------------------------------------------------
So employee 1 has a working shift of 8 hours on Mon, 8 on Tue, 0 on Wed, 8 on Thu, 8 on Fri, 8 on Sat and 0 on Sun. What I am trying to achieve is to look at, say, 2015-11-24 to 2015-11-26 and see how many hours employee 1 will work between the two dates, based on how many hours they are supposed to work on each day: Mon = 8 hour shift, Tue = 8 hour shift, Wed = 0 (day off), Thu = 8, Fri = 8, Sat = 8, Sun = day off.
FIDDLE DEMO
As you can see in the demo, 2015-10-10 is a Saturday, so empId 1 = 8 hours, empId 2 = 6 hours and empId 3 = 8 hours. How do I do this using a date range?
Assuming your table has the following columns:
empId - Employee id
workDate - the date they worked
hoursWorked - the number of hours they worked that day
Then something like this:
SELECT empId,
       SUM(IF(dayname(workDate) = 'Sunday', hoursWorked, 0)) AS Sun,
       SUM(IF(dayname(workDate) = 'Monday', hoursWorked, 0)) AS Mon,
       SUM(IF(dayname(workDate) = 'Tuesday', hoursWorked, 0)) AS Tues,
       SUM(IF(dayname(workDate) = 'Wednesday', hoursWorked, 0)) AS Wed,
       SUM(IF(dayname(workDate) = 'Thursday', hoursWorked, 0)) AS Thurs,
       SUM(IF(dayname(workDate) = 'Friday', hoursWorked, 0)) AS Fri,
       SUM(IF(dayname(workDate) = 'Saturday', hoursWorked, 0)) AS Sat
FROM employees
WHERE workDate BETWEEN :startDate AND :endDate
GROUP BY empId
Don't forget to use prepared statements to protect yourself from SQL injection!
OK, so this is a blind shot, because you haven't provided us with the structure of the employees table. Try the following:
$sql = mysql_query('SELECT * FROM employees WHERE `date` BETWEEN "'.$startDate.'" AND "'.$endDate.'"');
$arr = array();
while ($row = mysql_fetch_assoc($sql)) {
    // Index the rows by their date so each day can be looked up below.
    $arr[$row['date']] = $row;
}
$totalHours = 0;
$id = null;
// Walk the range one day (86400 seconds) at a time.
for ($i = strtotime($startDate); $i <= strtotime($endDate); $i += 86400) {
    $key = date('Y-m-d H:i:s', $i);
    $day = isset($arr[$key]) ? $arr[$key] : null;
    if ($day) {
        $id = $day['id'];
    }
    $hours = ($day) ? $day['hours'] : 0;
    $totalHours += $hours;
    echo "Employee $id Day ".date('D', $i)." Hours $hours";
}
echo "Total hours $totalHours";
Note that you shouldn't use the mysql library; use mysqli or PDO instead.
Try this dynamic Pivot query:
SET @sql = NULL;
SELECT empId,
  GROUP_CONCAT(DISTINCT
    CONCAT(case dayname(table_date_field)
      when 'Sunday' then Sun
      when 'Monday' then Mon
      when 'Tuesday' then Tue
      when 'Wednesday' then Wed
      when 'Thursday' then Thu
      when 'Friday' then Fri
      when 'Saturday' then Sat
      else 0 end as hours
    )
  ) INTO @sql
FROM employees;
SET @sql = CONCAT('SELECT empId,', @sql, ' FROM employees GROUP BY empId');
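For the table layout described in the question (one row per employee, one column per weekday), the total for a date range comes down to walking the range day by day and adding that weekday's scheduled hours. Here is a minimal Python sketch of that loop. The names SCHEDULE and scheduled_hours are invented for this example, and only employee 1's row from the question is included.

```python
from datetime import date, timedelta

# Weekly schedule for employee 1 from the question, Monday first so the
# list index matches date.weekday() (Monday == 0 ... Sunday == 6).
SCHEDULE = {1: [8, 8, 0, 8, 8, 8, 0]}


def scheduled_hours(emp_id, start, end):
    """Sum the scheduled hours for every day from start to end inclusive."""
    hours = SCHEDULE[emp_id]
    total = 0
    day = start
    while day <= end:
        total += hours[day.weekday()]
        day += timedelta(days=1)
    return total


print(scheduled_hours(1, date(2015, 11, 24), date(2015, 11, 30)))
```

To do this in SQL alone, you would need a source of one row per day in the range (for example a calendar table) joined to the schedule with the same weekday lookup.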

To display missing values [date] as range from Date Column

I have output as below
ID Date
NULL 2012-10-01
1 2012-10-02
2 2012-10-03
NULL 2012-10-04
3 2012-10-05
NULL 2012-10-06
4 2012-10-07
NULL 2012-10-08
5 2012-10-10
NULL 2012-10-11
NULL 2012-10-12
6 2012-10-13
NULL 2012-10-16
The missing dates have NULL as their ID value. I need to show the final output as:
2012-10-01 - 2012-10-01 (1 day )
2012-10-04 - 2012-10-04(1 day )
2012-10-06 - 2012-10-06(1 day )
2012-10-08 - 2012-10-08(1 day )
2012-10-11 - 2012-10-12(2 day )
2012-10-14 - 2012-10-14(1 day )
You can generate the date ranges using the following query:
select
    min(date) as start,
    max(date) as end,
    datediff(max(date), min(date)) + 1 as numDays
from
    (select @curRow := @curRow + 1 AS row_number, id, date
     from Table1 join (SELECT @curRow := 0) r where ID is null) T
group by
    datediff(date, '2012-10-01 00:00:00') - row_number;
The logic is based on a clever trick for grouping consecutive ranges. First, we filter and number the rows in the subquery. Then, the rows that are grouped together are found by comparing the number of days after 2012-10-01 to the row number. If any rows share this value, then they must be consecutive, otherwise there would be a "jump" between two rows and the expression datediff(date, '2012-10-01 00:00:00') - row_number would no longer match.
Sample output (DEMO):
START END NUMDAYS
October, 01 2012 00:00:00+0000 October, 01 2012 00:00:00+0000 1
October, 04 2012 00:00:00+0000 October, 04 2012 00:00:00+0000 1
October, 06 2012 00:00:00+0000 October, 06 2012 00:00:00+0000 1
October, 08 2012 00:00:00+0000 October, 08 2012 00:00:00+0000 1
October, 11 2012 00:00:00+0000 October, 12 2012 00:00:00+0000 2
October, 16 2012 00:00:00+0000 October, 16 2012 00:00:00+0000 1
From there I think it should be pretty trivial for you to get the exact output you are looking for.
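The grouping trick can be tried outside the database too. The Python sketch below applies the same idea to the NULL-ID dates from the question: within a run of consecutive days, the day number minus the row position stays constant, so that difference works as a group key. The names null_dates and consecutive_ranges are made up for this illustration.

```python
from datetime import date
from itertools import groupby

# Dates whose ID is NULL in the question's data.
null_dates = [date(2012, 10, d) for d in (1, 4, 6, 8, 11, 12, 16)]


def consecutive_ranges(dates):
    """Return (start, end, num_days) for each run of consecutive dates."""
    ordered = sorted(dates)
    ranges = []
    # Day ordinal minus position is constant within a consecutive run.
    groups = groupby(enumerate(ordered), key=lambda p: p[1].toordinal() - p[0])
    for _, members in groups:
        run = [d for _, d in members]
        ranges.append((run[0], run[-1], (run[-1] - run[0]).days + 1))
    return ranges


for start, end, num_days in consecutive_ranges(null_dates):
    print(start, "-", end, f"({num_days} day)")
```

The sketch sorts the dates first because the trick depends on the rows being numbered in date order.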

Get data from table where date between given date and 1 week back

I have a database dbadmin with a table tbl_empreimburse that has the fields emp_id, rem_amount and rem_date.
I want to retrieve the data from a given date back to one week earlier.
I tried this query:
SELECT SUM(rem_amount),DATEADD(dd, -7, "2012-01-10")
FROM tbl_empreimburse
GROUP BY emp_id
HAVING emp_id='5' AND rem_date BETWEEN DATEADD(dd, -7, "2012-01-10") AND "2012-01-10"
It gives me error "FUNCTION dbadmin.DATEADD does not exist". Do I need to convert "2012-01-10" to date format? Any Help, Please?
Try this:
This query gives result as you have specified for employee id 5 and date period of 7 days.
SELECT emp_id, SUM(rem_amount)
FROM tbl_empreimburse
WHERE emp_id='5' AND DATEDIFF('2012-12-31', rem_date) BETWEEN 0 AND 7;
OR
The query below gives you the data for all employees.
SELECT emp_id, SUM(rem_amount)
FROM tbl_empreimburse
WHERE DATEDIFF('2012-12-31', rem_date) BETWEEN 0 AND 7
GROUP BY emp_id;
Check this *SQLFIDDLE reference out. :)
I am not sure why you are using a GROUP BY clause here...
Sample data:
ID AMOUNT RDATE
1 3400 January, 01 2012 00:00:00+0000
2 5000 January, 10 2012 00:00:00+0000
3 3000 January, 02 2012 00:00:00+0000
5 1000 January, 05 2012 00:00:00+0000
5 2000 January, 04 2012 00:00:00+0000
2 2000 February, 10 2012 00:00:00+0000
Query:
select * from emp
where id = 5;
Here is the query to get the sum:
select id, sum(amount)
from emp
where rdate between '2012-01-10' - interval 7 day
and '2012-01-10'
and id = 5
;
Results:
all records by employee id = 5
ID AMOUNT RDATE
5 1000 January, 05 2012 00:00:00+0000
5 2000 January, 04 2012 00:00:00+0000
sum of amount by employee id = 5
ID SUM(AMOUNT)
5 3000
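The window logic of that last query (rows for one employee whose date falls between the given date minus 7 days and the given date, both ends inclusive) can be checked with a short Python sketch over the same sample data. The name weekly_total is invented for this example.

```python
from datetime import date, timedelta

# Sample data from the answer: (id, amount, rdate).
rows = [
    (1, 3400, date(2012, 1, 1)),
    (2, 5000, date(2012, 1, 10)),
    (3, 3000, date(2012, 1, 2)),
    (5, 1000, date(2012, 1, 5)),
    (5, 2000, date(2012, 1, 4)),
    (2, 2000, date(2012, 2, 10)),
]


def weekly_total(rows, emp_id, end):
    """Sum amounts for emp_id dated within the 7 days up to and including end."""
    start = end - timedelta(days=7)
    return sum(
        amount
        for row_id, amount, rdate in rows
        if row_id == emp_id and start <= rdate <= end
    )


print(weekly_total(rows, 5, date(2012, 1, 10)))
```

For employee 5 this agrees with the 3000 shown in the results above.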