Sorting some rows by average with SQL - mysql

All right, so here's a challenge for all you SQL pros:
I have a table with two columns of interest, group and birthdate. Only some rows have a group assigned to them.
I now want to print all rows sorted by birthdate, but I also want all rows with the same group to end up next to each other. The only semi-sensible way of doing this would be to use the groups' average birthdates for all the rows in the group when sorting. The question is, can this be done with pure SQL (MySQL in this instance), or will some scripting logic be required?
To illustrate, with the given table:
id | group | birthdate
---+-------+-----------
1 | 1 | 1989-12-07
2 | NULL | 1990-03-14
3 | 1 | 1987-05-25
4 | NULL | 1985-09-29
5 | NULL | 1988-11-11
and let's say that the "average" of 1987-05-25 and 1989-12-07 is 1988-08-30 (this can be found by averaging the UNIX timestamp equivalents of the dates and then converting back to a date. This average doesn't have to be completely correct!).
The output should then be:
id | group | birthdate | [sort_by_birthdate]
---+-------+------------+--------------------
4 | NULL | 1985-09-29 | 1985-09-29
3 | 1 | 1987-05-25 | 1988-08-30
1 | 1 | 1989-12-07 | 1988-08-30
5 | NULL | 1988-11-11 | 1988-11-11
2 | NULL | 1990-03-14 | 1990-03-14
Any ideas?
Cheers,
Jon

I normally program in T-SQL, so please forgive me if I don't translate the date functions perfectly to MySQL:
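SELECT
    T.id,
    T.`group`,
    T.birthdate
FROM
    Some_Table T
    LEFT OUTER JOIN (
        -- Average each group's birthdate as a day offset from 1970-01-01.
        -- DATEDIFF(a, b) returns a - b, so birthdate must come first.
        SELECT
            `group`,
            DATE('1970-01-01') +
                INTERVAL ROUND(AVG(DATEDIFF(birthdate, '1970-01-01'))) DAY AS avg_birthdate
        FROM
            Some_Table T2
        WHERE
            `group` IS NOT NULL
        GROUP BY
            `group`
    ) SQ ON SQ.`group` = T.`group`
ORDER BY
    -- Rows without a group fall back to their own birthdate.
    COALESCE(SQ.avg_birthdate, T.birthdate),
    T.`group`,
    T.birthdate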
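The question suggests averaging UNIX timestamps instead of day offsets. A minimal sketch of that variant follows; the table name people is hypothetical, and the data is copied from the question. One caveat: UNIX_TIMESTAMP() returns 0 for dates before 1970-01-01, so this version is only usable when no birthdate predates 1970, whereas the DATEDIFF approach has no such limit.

```sql
-- Hypothetical table name; columns follow the question.
CREATE TABLE people (
    id        INT PRIMARY KEY,
    `group`   INT NULL,
    birthdate DATE NOT NULL
);

INSERT INTO people VALUES
    (1, 1,    '1989-12-07'),
    (2, NULL, '1990-03-14'),
    (3, 1,    '1987-05-25'),
    (4, NULL, '1985-09-29'),
    (5, NULL, '1988-11-11');

SELECT
    p.id,
    p.`group`,
    p.birthdate,
    -- Grouped rows sort by the group average, others by their own date.
    COALESCE(g.avg_birthdate, p.birthdate) AS sort_by_birthdate
FROM people p
LEFT JOIN (
    -- Average the timestamps per group, then convert back to a date.
    SELECT
        `group`,
        DATE(FROM_UNIXTIME(AVG(UNIX_TIMESTAMP(birthdate)))) AS avg_birthdate
    FROM people
    WHERE `group` IS NOT NULL
    GROUP BY `group`
) g ON g.`group` = p.`group`
ORDER BY sort_by_birthdate, p.`group`, p.birthdate;
```

With the sample data this should list the ids in the order 4, 3, 1, 5, 2, matching the output requested in the question.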

Related

How to get multiple records from one record using a query in MySQL

How can I query this data:
parking_place | number_of_month | from_date | end_date | monthly_unit_price
A | 3 | 2018-01 | 2018-03 | 3000000
The desired result is:
parking_place | month | monthly_unit_price
A | 2018-01 | 3000000
A | 2018-02 | 3000000
A | 2018-03 | 3000000
Please suggest how to write this query.
You may join using a calendar table:
SELECT
t.parking_place,
months.month,
t.monthly_unit_price
FROM
(
SELECT '2018-01' AS month UNION ALL
SELECT '2018-02' UNION ALL
SELECT '2018-03'
) months
INNER JOIN yourTable t
ON months.month BETWEEN t.from_date AND t.end_date
ORDER BY
months.month;
Note that it would be better to store actual dates to represent each month. For example, instead of storing the text '2018-01', you could store 2018-01-01 in a DATE column.
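As a rough sketch of that suggestion, the calendar can be generated instead of hard-coded once the months are stored as real dates. This assumes MySQL 8.0 or later (recursive CTEs); the table name parking, the DATE column types, and the 2018 calendar range are all assumptions for illustration.

```sql
-- Hypothetical schema: each month is stored as its first day.
CREATE TABLE parking (
    parking_place      VARCHAR(10) NOT NULL,
    from_date          DATE NOT NULL,
    end_date           DATE NOT NULL,
    monthly_unit_price INT NOT NULL
);

INSERT INTO parking VALUES ('A', '2018-01-01', '2018-03-01', 3000000);

-- Generate one row per month of 2018 (range chosen for the example).
WITH RECURSIVE months (month_start) AS (
    SELECT DATE('2018-01-01')
    UNION ALL
    SELECT month_start + INTERVAL 1 MONTH
    FROM months
    WHERE month_start < '2018-12-01'
)
SELECT
    p.parking_place,
    DATE_FORMAT(m.month_start, '%Y-%m') AS month,
    p.monthly_unit_price
FROM months m
INNER JOIN parking p
    ON m.month_start BETWEEN p.from_date AND p.end_date
ORDER BY p.parking_place, m.month_start;
```

For the single sample row this should produce the three rows 2018-01, 2018-02 and 2018-03 for parking place A.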

MySQL IN function

class_table
+----+-------+--------------+
| id |teac_id| student_id |
+----+-------+--------------+
| 1 | 1 | 1,2,3,4 |
+----+-------+--------------+
student_mark
+----+----------+--------+
| id |student_id| marks |
+----+----------+--------+
| 1 | 1 | 12 |
+----+----------+--------+
| 2 | 2 | 80 |
+----+----------+--------+
| 3 | 3 | 20 |
+----+----------+--------+
I have these two tables and I want to calculate the total marks of the students. My SQL is:
SELECT SUM(`marks`)
FROM `student_mark`
WHERE `student_id` IN
(SELECT `student_id` FROM `class_table` WHERE `teac_id` = '1')
But this returns NULL. Please help!
Firstly, you should never store comma-separated data in a column; you should normalize your data. For example, you could have a many-to-many mapping table teacher_to_student with teac_id and student_id columns.
In this particular case, you can use the FIND_IN_SET() function.
From your current query, it seems that you are trying to get the total marks for a teacher (summing up the marks of all his/her students).
Try:
SELECT SUM(sm.`marks`)
FROM `student_mark` AS sm
JOIN `class_table` AS ct
ON FIND_IN_SET(sm.`student_id`, ct.`student_id`) > 0
WHERE ct.`teac_id` = '1'
If you want the total marks per student, you need to add a GROUP BY. The query would look like:
SELECT sm.`student_id`,
SUM(sm.`marks`)
FROM `student_mark` AS sm
JOIN `class_table` AS ct
ON FIND_IN_SET(sm.`student_id`, ct.`student_id`) > 0
WHERE ct.`teac_id` = '1'
GROUP BY sm.`student_id`
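For reference, a few standalone calls illustrate how FIND_IN_SET() behaves. It returns the 1-based position of the first argument within the comma-separated list, 0 when it is absent, and NULL when either argument is NULL. The literal values here are only illustrative.

```sql
SELECT FIND_IN_SET('2', '1,2,3,4');  -- 2: found as the second element
SELECT FIND_IN_SET('5', '1,2,3,4');  -- 0: not in the list
SELECT FIND_IN_SET('1', '21,12');    -- 0: only whole elements match, not substrings
SELECT FIND_IN_SET('2', NULL);       -- NULL: a NULL argument yields NULL
```

Because the function is applied to the list column in the join condition, MySQL typically cannot use an index for that lookup, which is one more reason to normalize the table.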
In case you want to know why: it returned NULL because the subquery returned '1,2,3,4' as a single string. What you need is for it to return 1, 2, 3 and 4 as separate values.
What your query effectively ran:
SELECT SUM(`marks`)
FROM `student_mark`
WHERE `student_id` IN ('1,2,3,4')
What you expected:
SELECT SUM(`marks`)
FROM `student_mark`
WHERE `student_id` IN (1,2,3,4)
The best way is to normalize, as @madhur said. In your case, you need to link the teacher to the students as a one-to-many relation, with one row per pair:
+----+-------+--------------+
| id |teac_id| student_id |
+----+-------+--------------+
| 1 | 1 | 1 |
+----+-------+--------------+
| 2 | 1 | 2 |
+----+-------+--------------+
| 3 | 1 | 3 |
+----+-------+--------------+
| 4 | 1 | 4 |
+----+-------+--------------+
If you want to filter your table based on a comma-separated list of IDs, my approach is to add extra commas at the beginning and end of the list, as well as at the beginning and end of the ID. For example, 1 becomes ,1, and the list becomes ,1,2,3,4,. This avoids ambiguous matches, such as 1 matching 21 or 12 in the list.
EXISTS is also well suited to this situation; together with the INSTR function it should work:
SELECT SUM(`marks`)
FROM `student_mark` sm
WHERE EXISTS(SELECT 1 FROM `class_table`
WHERE `teac_id` = '1' AND
INSTR(CONCAT(',', student_id, ','), CONCAT(',', sm.student_id, ',')) > 0)
BUT you shouldn't store related IDs in one cell as a comma-separated list. It should be a foreign key column that forms a proper relation; joins then become trivial.
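A minimal sketch of the normalized layout that the answers recommend might look like this. The mapping table name teacher_student and the column types are assumptions; the data is taken from the question.

```sql
-- Hypothetical mapping table replacing the comma-separated student_id column.
CREATE TABLE teacher_student (
    teac_id    INT NOT NULL,
    student_id INT NOT NULL,
    PRIMARY KEY (teac_id, student_id)
);

INSERT INTO teacher_student VALUES (1, 1), (1, 2), (1, 3), (1, 4);

CREATE TABLE student_mark (
    id         INT PRIMARY KEY,
    student_id INT NOT NULL,
    marks      INT NOT NULL
);

INSERT INTO student_mark VALUES (1, 1, 12), (2, 2, 80), (3, 3, 20);

-- Total marks of all students taught by teacher 1.
SELECT SUM(sm.marks) AS total_marks
FROM student_mark AS sm
JOIN teacher_student AS ts
    ON ts.student_id = sm.student_id
WHERE ts.teac_id = 1;
-- total_marks = 12 + 80 + 20 = 112 (student 4 has no marks row)
```

With this layout the join is a plain equality on an indexed column, so neither FIND_IN_SET nor string matching is needed.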

MySQL: getting zero values when counting

I'm trying to count the number of sales orders that have been canceled in a time period, but I've run into the problem that the query doesn't return rows whose count is zero.
My table
+---------------+------------+------------------+
| metrausername | signupdate | cancellationdate |
+---------------+------------+------------------+
| GLO00026 | 2017-06-22 | 2017-03-20 |
| GLO00055 | 2017-06-22 | 2017-04-18 |
| GLO00022 | 2017-06-27 | NULL |
| GLO00044 | 2017-06-24 | NULL |
| GLO00005 | 2017-06-26 | NULL |
+---------------+------------+------------------+
The statement I'm trying to count with:
SELECT metrausername, COUNT(*) AS count FROM salesdata2
WHERE cancellationdate IS NOT NULL
AND signupDate >= '2017-6-21' AND signupDate <= '2017-7-20'
GROUP BY metrausername;
Let me know if any additional information would help
If a metrausername is filtered out by the WHERE clause, it won't appear in the result. Left join to the aggregation to get around this:
select distinct a1.metrausername, coalesce(a2.counted,0) as counted -- coalesce replaces null with a value
from salesdata2 a1
left join
(
SELECT metrausername, COUNT(*) AS counted
FROM salesdata2
WHERE cancellationdate IS NOT NULL
AND signupDate >= '2017-6-21' AND signupDate <= '2017-7-20'
GROUP BY metrausername
) a2
on a1.metrausername = a2.metrausername
I would just do this by moving the filtering clause to the select. Assuming you really do want the date range (as opposed to having users outside the range), then:
SELECT metrausername, COUNT(cancellationdate ) AS count
FROM salesdata2
WHERE signupDate >= '2017-06-21' AND signupDate <= '2017-07-20'
GROUP BY metrausername;
COUNT(<colname>) counts the non-NULL values, so this seems like the simplest approach.
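To make the difference between COUNT(*) and COUNT(column) concrete, here is a small sketch using the data from the question. The column types are assumptions, and cancelled_alt only shows an equivalent way to write the same count.

```sql
CREATE TABLE salesdata2 (
    metrausername    VARCHAR(20) NOT NULL,
    signupdate       DATE NOT NULL,
    cancellationdate DATE NULL
);

INSERT INTO salesdata2 VALUES
    ('GLO00026', '2017-06-22', '2017-03-20'),
    ('GLO00055', '2017-06-22', '2017-04-18'),
    ('GLO00022', '2017-06-27', NULL),
    ('GLO00044', '2017-06-24', NULL),
    ('GLO00005', '2017-06-26', NULL);

SELECT
    metrausername,
    COUNT(*)                          AS total_rows,    -- counts every row
    COUNT(cancellationdate)           AS cancelled,     -- skips NULLs
    SUM(cancellationdate IS NOT NULL) AS cancelled_alt  -- same count via a boolean sum
FROM salesdata2
WHERE signupdate >= '2017-06-21' AND signupdate <= '2017-07-20'
GROUP BY metrausername;
```

Every user has total_rows = 1 here; cancelled is 1 for GLO00026 and GLO00055 and 0 for the other three, so the zero rows are kept in the result.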

Get specific values from same column within grouped rows

This is a problem for which I have a working query, but it feels horribly inefficient to me and I'd like some help constructing a better one. This is going into a live production environment, and the number of queries the db handles each day is incredibly high, so the more efficient this can be, the better. I have a table structured something like this (stripped to just the relevant parts):
id | type | datecolumn
1 | A | 2014-01-01
1 | B | 0000-00-00
2 | A | 2014-01-02
2 | B | 2014-01-10
3 | A | 2014-01-01
3 | B | 0000-00-00
There will always be two rows for each id, one of type A and one of type B. A will always have a valid date, and B will either have a date >= that of A, or all 0s. What I want is a query that will produce output similar to this:
id | date for A | date for B
1 | 2014-01-01 | None
2 | 2014-01-02 | 2014-01-10
3 | 2014-01-01 | None
The way I'm doing this now is as follows:
SELECT
id,
IF(MIN(datecolumn) > 0, MIN(datecolumn), MAX(datecolumn)) AS 'date for A',
IF(MIN(datecolumn) > 0, MAX(datecolumn), 'None') AS 'date for B'
GROUP BY id
But it really feels like I should be able to pluck the datecolumn value on a by-type basis somehow. I know the simplest solution should be to change the table structure so that each id only uses one row, but I'm afraid that is not possible in this case; there has to be two rows. Is there a way to leverage the type column properly in this query?
Edit: Also, this is on a table that will have upwards of 10,000,000 rows. So again, efficiency is key.
I'd stick with what you've got, but maybe write it this way...
CREATE TABLE my_table
(id INT NOT NULL
,type CHAR(1) NOT NULL
,datecolumn DATE NOT NULL DEFAULT '0000-00-00'
,PRIMARY KEY(id,type)
);
INSERT INTO my_table VALUES
(1 ,'A','2014-01-01'),
(1 ,'B','0000-00-00'),
(2 ,'A','2014-01-02'),
(2 ,'B','2014-01-10'),
(3 ,'A','2014-01-01'),
(3 ,'B','0000-00-00');
SELECT id
, MAX(CASE WHEN type = 'A' THEN datecolumn END) a
, MAX(REPLACE(CASE WHEN type='B' THEN datecolumn END,'0000-00-00','none')) b
FROM my_table
GROUP
BY id;
+----+------------+------------+
| id | a | b |
+----+------------+------------+
| 1 | 2014-01-01 | none |
| 2 | 2014-01-02 | 2014-01-10 |
| 3 | 2014-01-01 | none |
+----+------------+------------+
Make sure you have an index that covers both the id and type columns (e.g. ALTER TABLE tbl ADD INDEX (type, id)), then do:
SELECT
table_a.id,
table_a.datecolumn AS 'date for A',
IF(table_b.datecolumn > 0, table_b.datecolumn, 'None') AS 'date for B'
FROM tbl AS table_a
JOIN tbl AS table_b ON table_a.id = table_b.id AND table_b.type = 'B'
WHERE table_a.type = 'A';

Show all grouped results and sort

I have a table like this one:
| B | 1 |
| C | 2 |
| B | 2 |
| A | 2 |
| C | 3 |
| A | 2 |
I would like to fetch it, but sorted and grouped. That is, I would like it grouped by the letter, but sorted by the highest sum of the group. Also, I want to show all entries within the group:
| C | 3 |
| C | 2 |
| A | 2 |
| A | 2 |
| B | 2 |
| B | 1 |
The order is that way because C has 3 and 2. 3+2=5, which is higher than 2+2=4 for A which in turn is higher than 2+1=3 for B.
I need to show all "grouped" letters because there are other columns that are distinct all of which I need shown.
EDIT:
Thanks for the quick reply. I have the audacity, however, to inquire further.
I have this query:
SELECT * FROM `ip_log` WHERE `IP` IN
(SELECT `IP` FROM `ip_log` GROUP BY `IP` HAVING COUNT(DISTINCT `uid`) > 1)
GROUP BY `uid` ORDER BY `IP`
The letters in the upper description are ip (I need it grouped by the IP addresses) and the numbers are timestamp (I need it sorted by the sum (or just used as the sorting parameter)). Should I create a temporary table and then use the solution below?
select t.Letter, t.Value
from MyTable t
inner join (
select Letter, sum(Value) as ValueSum
from MyTable
group by Letter
) ts on t.Letter = ts.Letter
order by ts.ValueSum desc, t.Letter, t.Value desc
If your table's columns are letter and number, the way I would go about this is the following:
SELECT
letter,
GROUP_CONCAT(number ORDER BY number DESC),
SUM(number) AS total
FROM `table`
GROUP BY letter
ORDER BY total desc
What you will get, based on your example, is the following:
| C | 3,2 | 5
| A | 2,2 | 4
| B | 2,1 | 3
You can then process that data to get the actual information you want/need.
If you still want the data in the format you originally requested, a plain GROUP BY is not enough: you cannot sort individual rows by an aggregate (the SUM of the number column) unless that aggregate is computed somewhere. So you need a subquery that calculates the totals and is joined back to the original rows (disclaimer: untested query):
SELECT
    t.letter,
    t.number
FROM `table` AS t
-- One row per letter with its total, joined back to every original row.
JOIN (SELECT letter AS ltr, SUM(number) AS total FROM `table` GROUP BY letter) AS totals
    ON t.letter = totals.ltr
ORDER BY totals.total DESC, t.letter DESC, t.number DESC
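On MySQL 8.0 or later, a window function is another way to attach the group total to every row without a self-join. This is only a sketch: it reuses the MyTable, Letter and Value names from the first answer, and the column types are assumed.

```sql
CREATE TABLE MyTable (
    Letter CHAR(1) NOT NULL,
    Value  INT NOT NULL
);

INSERT INTO MyTable VALUES
    ('B', 1), ('C', 2), ('B', 2), ('A', 2), ('C', 3), ('A', 2);

SELECT Letter, Value
FROM (
    -- Attach each letter's total to all of its rows.
    SELECT
        Letter,
        Value,
        SUM(Value) OVER (PARTITION BY Letter) AS ValueSum
    FROM MyTable
) AS t
ORDER BY ValueSum DESC, Letter, Value DESC;
```

With the sample data this should return the rows in the order C 3, C 2, A 2, A 2, B 2, B 1, the same ordering requested in the question.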