I have a database table which has a person's id, name, and time (in milliseconds, stored as an int). For example:
| id | name | totalTime |
| --- | ------ | --------- |
| 1 | Bob | 16280 |
| 2 | Andy | 17210 |
| 3 | Bill | 15320 |
| 4 | Matt | 14440 |
| 5 | Steven | 17570 |
| 6 | Tom | NULL |
| 7 | Angus | 17210 |
| 8 | Will | NULL |
| 9 | Jack | 17410 |
| 10 | Alex | 16830 |
Not everyone has a time (hence the NULLs).
I would like to have another two columns - one which shows the rank/position of each person, and another which shows the difference in time (milliseconds) between the best (i.e. minimum) time and each row's time.
I have managed to write a MySQL 8.x query which does the ranks:
SELECT id, name, totalTime,
(CASE WHEN totalTime IS NOT NULL THEN RANK() OVER ( PARTITION BY (CASE WHEN totalTime IS NOT NULL THEN 1 ELSE 0 END) ORDER BY totalTime ) END) totalRank
FROM results
ORDER BY -totalRank DESC;
...and outputs this:
| id | name | totalTime | totalRank |
| --- | ------ | --------- | --------- |
| 4 | Matt | 14440 | 1 |
| 3 | Bill | 15320 | 2 |
| 1 | Bob | 16280 | 3 |
| 10 | Alex | 16830 | 4 |
| 2 | Andy | 17210 | 5 |
| 7 | Angus | 17210 | 5 |
| 9 | Jack | 17410 | 7 |
| 5 | Steven | 17570 | 8 |
| 6 | Tom | NULL | NULL |
| 8 | Will | NULL | NULL |
...but have not been able to figure out the SQL to add another column with the time difference.
Below is an example of what I would like, but can't figure out how to do:
| id | name | totalTime | totalRank | difference |
| --- | ------ | --------- | --------- | ---------- |
| 4 | Matt | 14440 | 1 | 0 |
| 3 | Bill | 15320 | 2 | 880 |
| 1 | Bob | 16280 | 3 | 1840 |
| 10 | Alex | 16830 | 4 | 2390 |
| 2 | Andy | 17210 | 5 | 2770 |
| 7 | Angus | 17210 | 5 | 2770 |
| 9 | Jack | 17410 | 7 | 2970 |
| 5 | Steven | 17570 | 8 | 3130 |
| 6 | Tom | NULL | NULL | NULL |
| 8 | Will | NULL | NULL | NULL |
I have this available as a DB Fiddle: https://www.db-fiddle.com/f/gQvSeij2EKSufYp9VjbDav/0
Thanks in advance for any help!
You can use a CTE to get the min totalTime and use it to calculate the difference:
WITH cte as (SELECT MIN(totalTime) minTotalTime FROM results)
SELECT id, name, totalTime,
CASE WHEN totalTime IS NOT NULL
THEN RANK() OVER (PARTITION BY (
CASE
WHEN totalTime IS NOT NULL THEN 1
ELSE 0
END
) ORDER BY totalTime)
END totalRank,
totalTime - (SELECT minTotalTime from cte) difference
FROM results
ORDER BY -totalRank DESC;
See the demo.
Results:
| id | name | totalTime | totalRank | difference |
| --- | ------ | --------- | --------- | ---------- |
| 4 | Matt | 14440 | 1 | 0 |
| 3 | Bill | 15320 | 2 | 880 |
| 1 | Bob | 16280 | 3 | 1840 |
| 10 | Alex | 16830 | 4 | 2390 |
| 2 | Andy | 17210 | 5 | 2770 |
| 7 | Angus | 17210 | 5 | 2770 |
| 9 | Jack | 17410 | 7 | 2970 |
| 5 | Steven | 17570 | 8 | 3130 |
| 6 | Tom | | | |
| 8 | Will | | | |
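For anyone who wants to try this without the fiddle, here is a minimal sketch that runs the same idea through Python's built-in sqlite3 module. SQLite is only a stand-in for MySQL 8 here (an assumption; it needs SQLite 3.25 or newer for window functions), and the `ranked` wrapper is a hypothetical name added just to make the sort order explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, name TEXT, totalTime INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "Bob", 16280), (2, "Andy", 17210), (3, "Bill", 15320), (4, "Matt", 14440),
     (5, "Steven", 17570), (6, "Tom", None), (7, "Angus", 17210), (8, "Will", None),
     (9, "Jack", 17410), (10, "Alex", 16830)],
)

query = """
WITH cte AS (SELECT MIN(totalTime) AS minTotalTime FROM results),
ranked AS (
    SELECT id, name, totalTime,
           CASE WHEN totalTime IS NOT NULL
                THEN RANK() OVER (
                    PARTITION BY CASE WHEN totalTime IS NOT NULL THEN 1 ELSE 0 END
                    ORDER BY totalTime)
           END AS totalRank,
           -- NULL minus anything stays NULL, so people without a time get no difference
           totalTime - (SELECT minTotalTime FROM cte) AS difference
    FROM results
)
SELECT * FROM ranked
ORDER BY totalRank IS NULL, totalRank, id  -- ranked rows first, NULL ranks last
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The CTE computes the minimum once, and every row subtracts it; MIN() skips NULLs, so the two people without a time do not affect the baseline.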
Add a MIN() window function:
SELECT id, name, totalTime,
(CASE WHEN totalTime IS NOT NULL THEN RANK() OVER ( PARTITION BY (CASE WHEN totalTime IS NOT NULL THEN 1 ELSE 0 END) ORDER BY totalTime ) END) totalRank
,totaltime - min(totaltime) over() diff
FROM results
ORDER BY -totalRank DESC;
I would do it this way in MS-SQL:
SELECT subtable.id,
       subtable.name,
       subtable.totalTime,
       subtable.diff,
       IIF(subtable.totalTime IS NULL, NULL, subtable.rowno) AS bisi
FROM (
    SELECT *,
           -- rank ascending so the best (lowest) time gets 1; NULL times sort last
           RANK() OVER (ORDER BY CASE WHEN totalTime IS NULL THEN 1 ELSE 0 END, totalTime) AS rowno,
           totalTime - (SELECT MIN(rst.totalTime) FROM results rst) AS diff
    FROM results) subtable;
or alternatively in MySQL, where IF replaces IIF:
SELECT subtable.id,
       subtable.name,
       subtable.totalTime,
       subtable.diff,
       IF(subtable.totalTime IS NULL, NULL, subtable.rowno) AS bisi
FROM (
    SELECT *,
           -- rank ascending so the best (lowest) time gets 1; NULL times sort last
           RANK() OVER (ORDER BY CASE WHEN totalTime IS NULL THEN 1 ELSE 0 END, totalTime) AS rowno,
           totalTime - (SELECT MIN(rst.totalTime) FROM results rst) AS diff
    FROM results) subtable;
Serg's answer is correct. I would write it as:
SELECT id, name, totalTime,
(CASE WHEN totalTime IS NOT NULL
THEN RANK() OVER (PARTITION BY (totalTime IS NULL) ORDER BY totalTime)
END) as totalRank,
totaltime - MIN(totaltime) OVER() as diff
FROM results
ORDER BY (totalTime IS NOT NULL) DESC, totalRank;
The differences are:
Simplifying the PARTITION BY. You use CASE, but MySQL conveniently treats booleans as "real" values.
Expressing the ORDER BY in a more intuitive fashion.
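As a quick check of the two points above, the sketch below runs a near-identical query with Python's built-in sqlite3 module on a five-row subset of the data. SQLite (3.25 or newer) is an assumption standing in for MySQL 8; it also treats `totalTime IS NULL` as a 0/1 value, so the boolean PARTITION BY and ORDER BY carry over. The extra `id` sort key and the outer wrapper are additions to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER PRIMARY KEY, name TEXT, totalTime INTEGER)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, "Bob", 16280), (2, "Andy", 17210), (4, "Matt", 14440),
     (6, "Tom", None), (7, "Angus", 17210)],
)

query = """
SELECT * FROM (
    SELECT id, name, totalTime,
           CASE WHEN totalTime IS NOT NULL
                THEN RANK() OVER (PARTITION BY (totalTime IS NULL) ORDER BY totalTime)
           END AS totalRank,
           -- an empty OVER () makes MIN() see the whole table on every row
           totalTime - MIN(totalTime) OVER () AS diff
    FROM results
)
ORDER BY (totalTime IS NOT NULL) DESC, totalRank, id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```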
Related
I have a table with data like this:
| id | wallet_id | wallet_name | deposit |
|----|-----------|-------------|---------|
| 1 | 12 | a_wallet | 10 |
| 2 | 14 | c_wallet | 12 |
| 3 | 12 | a_wallet | 24 |
| 4 | 15 | e_wallet | 50 |
| 5 | 14 | c_wallet | 10 |
| 6 | 15 | e_wallet | 22 |
I want to select and group rows with the same wallet_id, something like this:
| wallet_id | id | wallet_name |
|-----------|----|-------------|
| 12 | 1 | a_wallet |
| | 3 | a_wallet |
| 14 | 2 | c_wallet |
| | 5 | c_wallet |
| 15 | 4 | e_wallet |
| | 6 | e_wallet |
I already tried
select wallet_id, id, wallet_name from wallet group by wallet_id
but it behaves like a plain select query with no grouping.
I would appreciate your help, thanks.
We would generally handle your requirement from the presentation layer (e.g. PHP), but if you happen to be using MySQL 8+, here is a way to do this directly from MySQL:
SELECT
CASE WHEN ROW_NUMBER() OVER (PARTITION BY wallet_id ORDER BY id) = 1
THEN wallet_id END AS wallet_id,
id,
wallet_name
FROM wallet w
ORDER BY w.wallet_id, id;
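To see what the ROW_NUMBER() trick returns, here is a small sketch using Python's built-in sqlite3 module with the sample rows from the question. SQLite (3.25 or newer) is an assumption standing in for MySQL 8+, and the `rn` and `wallet_id_shown` names are hypothetical helpers that keep the displayed column separate from the one used for sorting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wallet (id INTEGER PRIMARY KEY, wallet_id INTEGER, wallet_name TEXT, deposit INTEGER)"
)
conn.executemany(
    "INSERT INTO wallet VALUES (?, ?, ?, ?)",
    [(1, 12, "a_wallet", 10), (2, 14, "c_wallet", 12), (3, 12, "a_wallet", 24),
     (4, 15, "e_wallet", 50), (5, 14, "c_wallet", 10), (6, 15, "e_wallet", 22)],
)

query = """
SELECT CASE WHEN rn = 1 THEN wallet_id END AS wallet_id_shown,  -- only on the first row of each wallet
       id,
       wallet_name
FROM (
    SELECT w.*, ROW_NUMBER() OVER (PARTITION BY wallet_id ORDER BY id) AS rn
    FROM wallet w
)
ORDER BY wallet_id, id  -- sort by the real wallet_id, not the blanked-out one
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The repeated wallet_id values come back as NULL (None in Python), which the presentation layer can render as an empty cell.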
I have an SQL table with roughly the following structure:
Employee| date | department | Country | Designation
What I would like is to get results with the following structure:
count_emp_per_department | count_emp_per_country | count_emp_per_designation |
Currently I am using UNION ALL, building a query similar to this one:
SELECT country, NULL, count(1)
FROM employee
GROUP BY country
UNION ALL
SELECT NULL, designation, count(1)
FROM employee
GROUP BY designation
Is this the most effective way to perform multiple aggregations and return all of them in a single result set in Hive?
Please share any other approach that could improve performance.
I'm not sure whether this is a real requirement, as the output isn't that useful. Anyway, here are the structure and query.
+-----------+------------+----------+
| col_name | data_type | comment |
+-----------+------------+----------+
| emp | int | |
| dt | date | |
| dept | string | |
| country | string | |
| desig | string | |
+-----------+------------+----------+
+--------+-------------+---------+------------+----------+
| t.emp | t.dt | t.dept | t.country | t.desig |
+--------+-------------+---------+------------+----------+
| 1 | 2020-02-02 | human | usa | hr |
| 2 | 2020-02-02 | dir | usa | hr |
| 3 | 2020-02-02 | dir | canada | it |
+--------+-------------+---------+------------+----------+
with q1 as (select dept,count(*) as deptcount from t group by dept),
q2 as (select country,count(*) as countrycount from t group by country),
q3 as (select desig,count(*) as desigcount from t group by desig)
select * from q1, q2, q3;
The output will look like this:
+----------+---------------+-------------+------------------+-----------+----------------+
| q1.dept | q1.deptcount | q2.country | q2.countrycount | q3.desig | q3.desigcount |
+----------+---------------+-------------+------------------+-----------+----------------+
| dir | 2 | canada | 1 | hr | 2 |
| dir | 2 | usa | 2 | hr | 2 |
| dir | 2 | canada | 1 | it | 1 |
| dir | 2 | usa | 2 | it | 1 |
| human | 1 | canada | 1 | hr | 2 |
| human | 1 | usa | 2 | hr | 2 |
| human | 1 | canada | 1 | it | 1 |
| human | 1 | usa | 2 | it | 1 |
+----------+---------------+-------------+------------------+-----------+----------------+
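The eight rows above come from the comma join acting as a cross join: 2 departments x 2 countries x 2 designations. The sketch below reproduces that with Python's built-in sqlite3 module, used here as an assumed stand-in for Hive on the same three sample rows; the ORDER BY is added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (emp INTEGER, dt TEXT, dept TEXT, country TEXT, desig TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?)",
    [(1, "2020-02-02", "human", "usa", "hr"),
     (2, "2020-02-02", "dir", "usa", "hr"),
     (3, "2020-02-02", "dir", "canada", "it")],
)

query = """
WITH q1 AS (SELECT dept, COUNT(*) AS deptcount FROM t GROUP BY dept),
     q2 AS (SELECT country, COUNT(*) AS countrycount FROM t GROUP BY country),
     q3 AS (SELECT desig, COUNT(*) AS desigcount FROM t GROUP BY desig)
SELECT * FROM q1, q2, q3            -- no join condition, so every combination is returned
ORDER BY q1.dept, q3.desig, q2.country
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
print(len(rows), "rows")
```

Each aggregate is computed once, but the result grows as the product of the group counts, which is worth keeping in mind before using this on columns with many distinct values.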
I'm stuck on a window function.
I have this table called task:
user_id VARCHAR
date DATE
balance INTEGER
+---------+------------+---------+
| user_id | date | balance |
+---------+------------+---------+
| 1 | 03.04.2020 | 0 |
| 1 | 04.04.2020 | 265 |
| 1 | 05.04.2020 | 140 |
| 1 | 06.04.2020 | 70 |
| 1 | 07.04.2020 | 0 |
| 2 | 03.04.2020 | 535 |
| 2 | 04.04.2020 | 115 |
| 2 | 05.04.2020 | 0 |
| 2 | 06.04.2020 | 0 |
| 2 | 07.04.2020 | 694 |
+---------+------------+---------+
I'm trying to find all the periods during which the balance was continuously positive.
So the output table should look like this:
+---------+------------+------------+-------------+-------------+
| user_id | start_date | end_date | avg_balance | date_length |
+---------+------------+------------+-------------+-------------+
| 1 | 04.04.2020 | 06.04.2020 | 158.3 | 3 |
| 2 | 03.04.2020 | 04.04.2020 | 325 | 2 |
| 2 | 07.04.2020 | 07.04.2020 | 694 | 1 |
+---------+------------+------------+-------------+-------------+
I've tried to implement the window function but got stuck.
Assign periods by counting the number of zeros before. Then aggregate:
select user_id, min(date), max(date), avg(balance), count(*) as date_length
from (select t.*,
sum( balance = 0 ) over (partition by user_id order by date) as grp
from task t
) t
where balance > 0
group by user_id, grp;
Here is a db<>fiddle.
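The grouping idea can be checked end to end with the sample data. This is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or newer, an assumption in place of the original database); dates are stored as ISO strings so they sort correctly, and the ROUND() and column aliases are additions for readable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (user_id TEXT, date TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO task VALUES (?, ?, ?)",
    [("1", "2020-04-03", 0), ("1", "2020-04-04", 265), ("1", "2020-04-05", 140),
     ("1", "2020-04-06", 70), ("1", "2020-04-07", 0),
     ("2", "2020-04-03", 535), ("2", "2020-04-04", 115), ("2", "2020-04-05", 0),
     ("2", "2020-04-06", 0), ("2", "2020-04-07", 694)],
)

query = """
SELECT user_id,
       MIN(date) AS start_date,
       MAX(date) AS end_date,
       ROUND(AVG(balance), 1) AS avg_balance,
       COUNT(*) AS date_length
FROM (
    SELECT t.*,
           -- running count of zero-balance days: it stays constant inside a positive streak
           SUM(balance = 0) OVER (PARTITION BY user_id ORDER BY date) AS grp
    FROM task t
)
WHERE balance > 0
GROUP BY user_id, grp
ORDER BY user_id, start_date
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every zero-balance day bumps the running sum, so consecutive positive days share the same `grp` value and collapse into one row per streak.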
I'm working on a query where I need to count distinct CarId values when LocationId is not null, but count every row when LocationId is null or 0. The query I tried applies DISTINCT to all CarId values, even when LocationId is null.
#LocId int
Select Count(distinct a.CarId) from VehicleDetails a
inner join VehicleDocuments b on a.DocId=b.DocId
left join VehicleShipmentDetails dpg on dpg.VehicleShipmentId= b.VehicleShipmentId
where b.LogicalDelete=0 and a.LogicalDelete=0
and (dpg.LocationId= #LocId or dpg.LocationId= 0 or dpg.LocationId is null)
| ID | CarId | LocationId | DateCreated |
|------+----------------+-----------------+---------------|
| 1 | 1 | 5 | 02/03/2019 |
| 2 | 2 | null | 01/14/2019 |
| 3 | 2 | 0 | 02/03/2019 |
| 4 | 2 | 5 | 12/30/2018 |
| 5 | 4 | 3 | 01/10/2019 |
| 6 | 3 | 5 | 02/14/2019 |
| 7 | 2 | 5 | 03/13/2019 |
Desired output:
| ID | CarId | LocationId | DateCreated |
+------+----------------+-----------------+---------------+
| 1 | 1 | 5 | 02/03/2019 |
| 2 | 2 | null | 01/14/2019 |
| 3 | 2 | 0 | 02/03/2019 |
| 4 | 2 | 5 | 03/13/2019 |
| 5 | 4 | 3 | 01/10/2019 |
| 6 | 3 | 5 | 02/14/2019 |
Current Output
| ID | CarId | LocationId | DateCreated |
+------+----------------+-----------------+---------------+
| 1 | 1 | 5 | 02/03/2019 |
| 2 | 2 | 5 | 01/14/2019 |
| 3 | 4 | 3 | 01/10/2019 |
| 4 | 3 | 5 | 02/14/2019 |
I'm getting a count of 4, but I need a count of 6.
EDIT: My goal is to exclude rows from the DISTINCT on CarId when LocationId is null or 0, but my current code applies DISTINCT to every CarId whose LocationId is null, 0, or equal to #LocId.
You can use a query like this; replace your_table with your actual data set.
SELECT ID, CarId, LocationId, DateCreated
FROM your_table as T
WHERE NOT EXISTS (SELECT *
FROM your_table as T1
WHERE T.ID > T1.ID AND T.CarID = T1.CarID)
In SQL, you can use a CASE expression to handle conditions (like "if then else" in other programming languages). It could help here because you have two different cases to handle.
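One way the CASE idea might look, as a sketch only: count a per-row key when LocationId is null or 0, and a per-car key otherwise. It runs on Python's built-in sqlite3 module with the sample rows flattened into a single hypothetical table `vehicle_rows` (the real query joins three tables), and it reflects just one reading of the requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vehicle_rows (ID INTEGER PRIMARY KEY, CarId INTEGER, LocationId INTEGER, DateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO vehicle_rows VALUES (?, ?, ?, ?)",
    [(1, 1, 5, "02/03/2019"), (2, 2, None, "01/14/2019"), (3, 2, 0, "02/03/2019"),
     (4, 2, 5, "12/30/2018"), (5, 4, 3, "01/10/2019"), (6, 3, 5, "02/14/2019"),
     (7, 2, 5, "03/13/2019")],
)

# Plain DISTINCT collapses every row of the same car, including the null/0 ones.
plain_count = conn.execute("SELECT COUNT(DISTINCT CarId) FROM vehicle_rows").fetchone()[0]

# CASE builds a different key per branch: one key per row when LocationId is
# null or 0, one key per car otherwise, so only the second branch is de-duplicated.
wanted_count = conn.execute(
    """
    SELECT COUNT(DISTINCT CASE
                     WHEN LocationId IS NULL OR LocationId = 0 THEN 'row:' || ID
                     ELSE 'car:' || CarId
                 END)
    FROM vehicle_rows
    """
).fetchone()[0]

print(plain_count, wanted_count)
```

The string prefixes keep the two kinds of key from colliding; in SQL Server the concatenation would need `CONCAT()` or an explicit cast rather than `||`.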
This is a continuation of my previous question.
Assume we have three tables. A main table and two id tables.
+-----+-----+------------+------------+------------+
| cid | pid | date1 | date2 | date3 |
+-----+-----+------------+------------+------------+
| 1 | 2 | NULL | 2014-03-24 | 2014-03-24 |
| 3 | 1 | 2014-06-13 | NULL | NULL |
| 4 | 3 | NULL | 2014-09-14 | NULL |
| 2 | 1 | NULL | NULL | 2014-08-15 |
| 4 | 3 | 2014-01-10 | NULL | NULL |
| 1 | 4 | 2014-02-15 | NULL | NULL |
| 4 | 2 | NULL | 2014-01-06 | 2014-01-12 |
+-----+-----+------------+------------+------------+
+----+----------+ +----+--------+
| id | city | | id | person |
+----+----------+ +----+--------+
| 1 | 'Dallas' | | 1 | 'John' |
| 2 | 'Berlin' | | 2 | 'Jack' |
| 3 | 'Topeka' | | 3 | 'Doug' |
| 4 | 'London' | | 4 | 'Pete' |
+----+----------+ +----+--------+
OK, now I'd like to write a select that returns one row per city. Each row has to contain the city, the max of each date (date1, date2, date3) for that city, and the person who belongs to the latest of the three max dates.
Result:
+--------+--------+------------+------------+------------+
| city | person | date1 | date2 | date3 |
+--------+--------+------------+------------+------------+
| Dallas | Jack | 2014-02-15 | 2014-03-24 | 2014-03-24 |
| Berlin | John | NULL | NULL | 2014-08-15 |
| Topeka | John | 2014-06-13 | NULL | NULL |
| London | Doug | 2014-01-10 | 2014-09-14 | 2014-01-12 |
+--------+--------+------------+------------+------------+
Hmm... I thought it wouldn't be that difficult.
see the fiddle
I think this might work.
select c.city, p.person, y.date1, y.date2, y.date3
from (select x.cid, x.date1, x.date2, x.date3, greatest(ifnull(x.date1, '0000-01-01'), ifnull(x.date2, '0000-01-01'), ifnull(x.date3, '0000-01-01')) as maxdate
from (select cid, max(date1) as date1, max(date2) as date2, max(date3) as date3
from main
group by cid) as x)
as y
join main m
on m.cid = y.cid and
(m.date1 = y.maxdate or m.date2 = y.maxdate or m.date3 = y.maxdate)
join city c
on y.cid = c.id
join person p
on m.pid = p.id
It starts by creating 'x' which is a table with the max dates for each city. Then it creates 'y' where it adds on the highest of the 3 dates. Then it joins with the main table to find the row for the city with the highest date. And then it joins the city and person table to get the names rather than the ids.
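The steps described above can be tried on the sample data with Python's built-in sqlite3 module. This is a sketch under a few assumptions: SQLite stands in for MySQL, its multi-argument scalar `max()` replaces `greatest()` (both return NULL if any argument is NULL, hence the `ifnull`), and the main table is called `main_rows` here, a hypothetical name chosen to stay clear of SQLite's `main` schema name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_rows (cid INTEGER, pid INTEGER, date1 TEXT, date2 TEXT, date3 TEXT);
CREATE TABLE city (id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE person (id INTEGER PRIMARY KEY, person TEXT);
INSERT INTO main_rows VALUES
  (1, 2, NULL, '2014-03-24', '2014-03-24'), (3, 1, '2014-06-13', NULL, NULL),
  (4, 3, NULL, '2014-09-14', NULL),         (2, 1, NULL, NULL, '2014-08-15'),
  (4, 3, '2014-01-10', NULL, NULL),         (1, 4, '2014-02-15', NULL, NULL),
  (4, 2, NULL, '2014-01-06', '2014-01-12');
INSERT INTO city VALUES (1, 'Dallas'), (2, 'Berlin'), (3, 'Topeka'), (4, 'London');
INSERT INTO person VALUES (1, 'John'), (2, 'Jack'), (3, 'Doug'), (4, 'Pete');
""")

query = """
SELECT c.city, p.person, y.date1, y.date2, y.date3
FROM (SELECT x.*,
             -- latest of the three per-city maxima; NULLs are replaced by a floor date
             max(ifnull(x.date1, '0000-01-01'),
                 ifnull(x.date2, '0000-01-01'),
                 ifnull(x.date3, '0000-01-01')) AS maxdate
      FROM (SELECT cid, max(date1) AS date1, max(date2) AS date2, max(date3) AS date3
            FROM main_rows
            GROUP BY cid) AS x) AS y
JOIN main_rows m
  ON m.cid = y.cid
 AND (m.date1 = y.maxdate OR m.date2 = y.maxdate OR m.date3 = y.maxdate)
JOIN city c ON y.cid = c.id
JOIN person p ON m.pid = p.id
ORDER BY c.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If two different rows of the same city share the overall latest date, the join to the main table returns both, so a tie-breaking rule would be needed in that case.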