Need to find the three most expensive paths on average having code 'A' with delivery in May 2021 - MySQL

Statement:
I need to find the three most expensive paths (from_to_ in the table) by average charge, restricted to code 'A' (with_ in the table) and to deliveries in May 2021. If paths tie, all tied paths should be included.
Schema:
'observed_on', date,
'from_', varchar(3),
'to_', varchar(3),
'from_to_', varchar(8),
'with_', varchar(3),
'cart_no', varchar(8),
'deliver_on', date,
'd_charge', double,
Sample data: (posted as an image; not reproduced here)
Solution I tried:
SELECT
from_to_
,avg_price
FROM
(
SELECT
from_to_
,ROUND(AVG(d_charge),2) AS avg_price
,DENSE_RANK() OVER(ORDER BY ROUND(AVG(d_charge),2) DESC) rank_by_avgp
FROM
(
SELECT
*
FROM DELIVERY
WHERE deliver_on BETWEEN '2021-05-01' AND '2021-05-30'
AND with_ = 'A'
) AS A
GROUP BY from_to_
) AS bb
WHERE bb.rank_by_avgp <=3;
I know it's a workaround so I am looking for a better solution

@nishant, this question is quite poorly asked.
The issues with your question
The sample data you reference is a picture. Why not give text values or, even better, the DML statements to set the example up?
In the referenced dataset (the picture), all the DELIVERY.deliver_on values are before 2010, so the condition deliver_on BETWEEN '2021-05-01' AND '2021-05-30' returns nothing for the example data. That range also stops at May 30, so it would miss any delivery on May 31.
If you say
I know it's a workaround so I am looking for a better solution
then based on what? What is it a workaround for? It appears to produce correct results, so what is the problem with it? Do you want it to perform better, or something else?
You are not specifying the DB version. Different MySQL versions can call for different solutions.
One possible solution
The example dataset setup:
DDL
CREATE TABLE DELIVERY (
observed_on DATE
, from_ VARCHAR(3)
, to_ VARCHAR(3)
, from_to_ VARCHAR(8)
, with_ VARCHAR(3)
, cart_no VARCHAR(8)
, deliver_on DATE
, d_charge DOUBLE
);
DML
INSERT INTO DELIVERY VALUES ('2012-01-19','Aus','Nzl','AusNzl','A','2118','2021-04-19',82.3);
INSERT INTO DELIVERY VALUES ('2012-01-19','Aus','Nzl','AusNzl','A','2118','2021-05-19',82.3);
INSERT INTO DELIVERY VALUES ('2012-01-19','Aus','Nzl','AusNzl','A','2118','2021-05-19',82.3);
INSERT INTO DELIVERY VALUES ('2013-01-19','Ind','Sla','IndSla','B','2233','2021-05-19',70.32);
INSERT INTO DELIVERY VALUES ('2013-01-19','Ind','Sla','IndSla','A','2233','2021-05-19',70.32);
INSERT INTO DELIVERY VALUES ('2013-01-19','Eur','Usa','EurUsa','C','2434','2021-05-19',67.53);
INSERT INTO DELIVERY VALUES ('2013-01-19','Eur','Usa','EurUsa','A','2434','2021-05-19',67.53);
INSERT INTO DELIVERY VALUES ('2013-01-19','Xyz','Usa','XyzUsa','A','2434','2021-05-19',67.53);
INSERT INTO DELIVERY VALUES ('2013-01-19','Xyz','Sla','XyzSla','A','2434','2021-05-19',67.51);
INSERT INTO DELIVERY VALUES ('2012-01-19','Aus','Nzl','AusNzl','A','2323','2021-05-19',82.3);
INSERT INTO DELIVERY VALUES ('2012-01-19','Aus','Nzl','AusNzl','A','2118','2021-06-19',82.3);
QUERY
SELECT from_to_
, avg_d_charge
, denserank_avg_d_charge
FROM /*SUB_to_calculate_the_denserank*/
(
SELECT from_to_
, ROUND(avg_d_charge, 2) AS avg_d_charge
, DENSE_RANK() OVER (
ORDER BY ROUND(avg_d_charge, 2) DESC
) denserank_avg_d_charge /*dense ranking*/
FROM /*SUB_to_calculate_the_averages*/
(
SELECT from_to_
, ROW_NUMBER() OVER (PARTITION BY from_to_) AS rownumber /*To filter for only one row per from_to_.*/
, AVG(d_charge) OVER (PARTITION BY from_to_) AS avg_d_charge /*Average calculation*/
FROM /*DELIVERY*/
DELIVERY
WHERE 1 = 1
AND with_ = 'A' /* The "code" filter*/
AND DATE_SUB(deliver_on, INTERVAL DAYOFMONTH(deliver_on) - 1 DAY) = '2021-05-01' /* The 2021-05 filter*/
) SUB_to_calculate_the_averages
WHERE 1 = 1
AND rownumber = 1
) SUB_to_calculate_the_denserank
WHERE 1 = 1
AND denserank_avg_d_charge < 4;
The main difference from your solution is that I do not use the aggregate GROUP BY here, only analytic (window) functions. I often prefer this because it lets me carry other attributes through the query later without having to apply aggregate functions to them. In the end, though, the choice comes down to performance and to what the requirements actually are.
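For comparison, here is a minimal sketch of the GROUP BY variant with a half-open date range, so that May 31 is included. It runs the SQL through Python's built-in sqlite3 module purely so the example is self-contained; SQLite standing in for MySQL 8 is an assumption of this sketch, and the trimmed-down table and its rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8 (assumption for this demo).
# Window functions need SQLite 3.25+ or MySQL 8.0+.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE delivery (from_to_ TEXT, with_ TEXT, deliver_on TEXT, d_charge REAL)"
)
# Hypothetical rows, loosely modelled on the DML above.
rows = [
    ("AusNzl", "A", "2021-04-19", 82.30),  # April: filtered out
    ("AusNzl", "A", "2021-05-19", 82.30),
    ("IndSla", "B", "2021-05-19", 70.32),  # code B: filtered out
    ("IndSla", "A", "2021-05-19", 70.32),
    ("EurUsa", "A", "2021-05-31", 67.53),  # last day of May must be kept
    ("XyzUsa", "A", "2021-05-19", 67.53),  # ties with EurUsa
    ("XyzSla", "A", "2021-05-19", 67.51),  # fourth-best average, dropped
]
conn.executemany("INSERT INTO delivery VALUES (?, ?, ?, ?)", rows)

TOP3_SQL = """
SELECT from_to_, avg_price
FROM (
    SELECT from_to_,
           ROUND(AVG(d_charge), 2) AS avg_price,
           DENSE_RANK() OVER (ORDER BY ROUND(AVG(d_charge), 2) DESC) AS rnk
    FROM delivery
    WHERE with_ = 'A'
      AND deliver_on >= '2021-05-01'
      AND deliver_on <  '2021-06-01'   -- half-open range covers all of May
    GROUP BY from_to_
) ranked
WHERE rnk <= 3
ORDER BY avg_price DESC, from_to_
"""
top3 = conn.execute(TOP3_SQL).fetchall()
for path, avg_price in top3:
    print(path, avg_price)
```

Because DENSE_RANK gives tied averages the same rank, the two paths tied for third place are both returned, so the result has four rows here.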

Related

MySQL add balance from previous rows

I’ve tried a few things I’ve seen on here but it doesn’t work in my case, the balance on each row seems to duplicate.
Anyway I have a table that holds mortgage transactions, that table has a Column that stores an interest added value or a payment value.
So I might have:
Balance: 100,000
Interest added 100 - balance 100,100
Payment made -500 - balance 99,600
Interest added 100 - balance 99,700
Payment made -500 - balance 99,200
What I’m looking for is a query to pull all of these in date order newest first and summing the balance in a column depending on whether it has interest or payment (the one that doesn’t will be null) so at the end of the rows it will have the current liability
I can’t remember what the query I tried was but it ended up duplicating rows and the balance was weird
Sample structure and data:
CREATE TABLE account(
id int not null primary key auto_increment,
account_name varchar(50),
starting_balance float(10,6)
);
CREATE TABLE account_transaction(
id int not null primary key auto_increment,
account_id int NOT NULL,
date datetime,
interest_amount int DEFAULT null,
payment_amount float(10,6) DEFAULT NULL
);
INSERT INTO account (account_name,starting_balance) VALUES('Test Account','100000');
INSERT INTO account_transaction (account_id,date,interest_amount,payment_amount) VALUES(1,'2020-10-01 00:00:00',300,null);
INSERT INTO account_transaction (account_id,date,interest_amount,payment_amount) VALUES(1,'2020-10-01 00:00:00',null,-500);
INSERT INTO account_transaction (account_id,date,interest_amount,payment_amount) VALUES(1,'2020-11-01 00:00:00',300,null);
INSERT INTO account_transaction (account_id,date,interest_amount,payment_amount) VALUES(1,'2020-11-05 00:00:00',-500,null);
So interest will be added on to the rolling balance, and the starting balance is stored against the account - if we have to have a transaction added for this then ok. Then when a payment is added it can be either negative or positive to decrease the balance moving to each row.
So above example i'd expect to see something along the lines of:
I hope this makes it clearer
WITH
starting_dates AS ( SELECT account_id, MIN(`date`) startdate
FROM account_transaction
GROUP BY account_id ),
combine AS ( SELECT 0 id,
starting_dates.account_id,
starting_dates.startdate `date`,
0 interest_amount,
account.starting_balance payment_amount
FROM account
JOIN starting_dates ON account.id = starting_dates.account_id
UNION ALL
SELECT id,
account_id,
`date`,
interest_amount,
payment_amount
FROM account_transaction )
SELECT DATE(`date`) `Date`,
CASE WHEN interest_amount = 0 THEN 'Balance Brought Forward'
WHEN payment_amount IS NULL THEN 'Interest Added'
WHEN interest_amount IS NULL THEN 'Payment Added'
ELSE 'Unknown transaction type'
END `Desc`,
CASE WHEN interest_amount = 0 THEN ''
ELSE COALESCE(interest_amount, 0)
END Interest,
COALESCE(payment_amount, 0) Payment,
SUM(COALESCE(payment_amount, 0) + COALESCE(interest_amount, 0))
OVER (PARTITION BY account_id ORDER BY id) Balance
FROM combine
ORDER BY id;
PS. The source data (the row with id=4) was altered to match the desired output. The source structure was altered too: FLOAT(10,6), which is not compatible with the provided values, was replaced with DECIMAL.
PPS. The presence of more than one account is allowed.
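The core of the query above is the windowed SUM, which can be sketched in isolation. This is a simplified illustration, not the full answer: it assumes a single signed amount column (a hypothetical tx table) instead of separate interest and payment columns, stores the opening balance as a synthetic row with id 0, and uses Python's sqlite3 module only to make the example runnable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8 (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tx (id INTEGER PRIMARY KEY, account_id INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO tx VALUES (?, ?, ?)",
    [
        (0, 1, 100000),  # opening balance as a synthetic first row
        (1, 1, 300),     # interest added
        (2, 1, -500),    # payment made
        (3, 1, 300),
        (4, 1, -500),
        (5, 2, 2000),    # a second account gets its own running total
        (6, 2, -250),
    ],
)

# SUM(...) OVER (PARTITION BY ... ORDER BY ...) accumulates row by row
# within each account, in id order.
balances = conn.execute("""
    SELECT id, amount,
           SUM(amount) OVER (PARTITION BY account_id ORDER BY id) AS balance
    FROM tx
    ORDER BY id
""").fetchall()

for row in balances:
    print(row)
```

Each account's balance restarts from its own first row, which is what PARTITION BY account_id provides.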

How do I SELECT a MySQL Table value that has not been updated on a given date?

I have a MySQL database named mydb in which I store daily share prices for
423 companies in a table named data. Table data has the following columns:
`epic`, `date`, `open`, `high`, `low`, `close`, `volume`
epic and date being primary key pairs.
I update the data table each day using a csv file which would normally have 423 rows
of data all having the same date. However, on some days prices may not be available
for all 423 companies and data for a particular epic and date pair will
not be updated. In order to determine the missing pair I have resorted
to comparing a full list of epics against the incomplete list of epics using
two simple SELECT queries with different dates and then using a file comparator, thus
revealing the missing epic(s). This is not a very satisfactory solution and so far
I have not been able to construct a query that would identify any epics that
have not been updated for any particular day.
SELECT `epic`, `date` FROM `data`
WHERE `date` IN ('2019-05-07', '2019-05-08')
ORDER BY `epic`, `date`;
Produces pairs of values:
`epic` `date`
"3IN" "2019-05-07"
"3IN" "2019-05-08"
"888" "2019-05-07"
"888" "2019-05-08"
"AA." "2019-05-07"
"AAL" "2019-05-07"
"AAL" "2019-05-08"
Where in this case AA. has not been updated on 2019-05-08. The problem with this is that it is not easy to spot a value that is not a pair.
Any help with this problem would be greatly appreciated.
You could COUNT the rows per epic, grouping by epic, for the dates in question, and then select the epics whose count is less than 2. Forgive me if the column-name syntax is not quite right (I work in SQL Server), but the logic of the query should still work for you.
SELECT x.epic
FROM
(
SELECT COUNT(*) AS UpdateCount, epic
FROM data
WHERE date IN ('2019-05-07', '2019-05-08')
GROUP BY epic
) AS x
WHERE x.UpdateCount < 2
Assuming you only want to check the last date uploaded, the following will return every item not updated on 2019-05-08:
SELECT last_updated.epic, last_updated.`date`
FROM (
SELECT epic, MAX(`date`) AS `date` FROM `data`
GROUP BY epic
) AS last_updated
WHERE last_updated.`date` <> '2019-05-08'
ORDER BY last_updated.epic
;
Or, for any upload date, the following compares against the entire table, so you don't rely on '2019-05-07' having a row for every epic. That is, any epic that has ever been in the table will be listed if it was not updated on the given date:
SELECT d.epic, max(d.date)
FROM data as d
WHERE d.epic NOT IN (
SELECT d2.epic
FROM data as d2
WHERE d2.date = '2019-05-08'
)
GROUP BY d.epic
ORDER BY d.epic
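The same anti-join can also be written with NOT EXISTS, which some people find easier to extend and which is not affected by NULLs in the subquery. The sketch below replays the sample rows from the question in an in-memory SQLite database (an assumption made only so the example runs on its own; the close values are invented). The `?` placeholder is the sqlite3 driver's style, and MySQL drivers typically use `%s` instead.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (epic TEXT, date TEXT, close REAL, PRIMARY KEY (epic, date))"
)
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("3IN", "2019-05-07", 1.0), ("3IN", "2019-05-08", 1.0),
        ("888", "2019-05-07", 1.0), ("888", "2019-05-08", 1.0),
        ("AA.", "2019-05-07", 1.0),  # no row for 2019-05-08
        ("AAL", "2019-05-07", 1.0), ("AAL", "2019-05-08", 1.0),
    ],
)

# Every epic ever seen in the table that has no row for the given date.
MISSING_SQL = """
SELECT DISTINCT d.epic
FROM data AS d
WHERE NOT EXISTS (
    SELECT 1
    FROM data AS d2
    WHERE d2.epic = d.epic
      AND d2.date = ?
)
ORDER BY d.epic
"""
missing = [row[0] for row in conn.execute(MISSING_SQL, ("2019-05-08",))]
print(missing)
```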

Count Distinct Date MySQL returning one row

Ok so be ready I'm working on a weird base :
Every table has 3 column only : varchar('Object'),varchar('Property'),varchar('Value')
Here is a fiddle I've built with examples of my attempts
http://sqlfiddle.com/#!9/de22eb/1
I need to extract the last time a server was updated. But I'm not interested in the server itself; it's more about the date. Once I know that there was an update on a given date, I want to count every update on that day.
To do so I'm using 2 tables : the server table
CREATE TABLE IF NOT EXISTS `server` (
`name` varchar(80) NOT NULL,
`field` varchar(80) NOT NULL,
`value` varchar(200) NOT NULL
);
And the event table :
CREATE TABLE IF NOT EXISTS `event` (
`name` varchar(80) NOT NULL,
`field` varchar(80) NOT NULL,
`value` varchar(80) NOT NULL
);
Please go watch the fiddle to have an idea of the content.
I want to have a result like this (based on my example) :
Date Number patched
2017-11-14 2
2017-11-04 1
The problem is that I don't know where I'm wrong on my query (I've separated the step for better understanding inside the fiddle) :
Select date_format(d.val, '%Y-%m-%d') as 'Date', COUNT(distinct
date_format(d.val, '%Y-%m-%d')) as 'Number'
FROM (
Select b.serv,b.val
FROM (
Select serv,val FROM (
Select name as serv, value as val FROM event
where field='server_vers' and
value!='None'
order by serv ASC,
val DESC LIMIT 18446744073709551615) a
group by a.serv) b,
server c
where b.serv = c.name and c.field = 'OS' and c.value = 'Fedora'
) d group by date_format(d.val, '%Y-%m-%d');
It's giving me only one row. Adding group by date_format(d.val, '%Y-%m-%d') at the end makes the Count useless. How can I fix that ?
I want to have for each server for a given OS type the last patch date and then sum the result by date.
Is that what you needed ?
SELECT dates.date, COUNT(dates.date) as patch_count
FROM (
SELECT MAX(date_format(event.value, '%Y-%m-%d')) as date
FROM event
JOIN server ON (event.name = server.name)
WHERE (server.field = 'OS' AND server.value = 'Fedora')
GROUP BY event.name ) as dates
GROUP BY 1
ORDER BY 2 DESC
Here's the fiddle :
http://sqlfiddle.com/#!9/de22eb/37/0
Explanation : We get the last date for every server name. That gives a list of last dates. Then we use this as a table, that we can group on to count each different value.
The datetimes are stored as strings. The first ten characters of that string represent the date. So you get the date with left(value, 10).
You get the last update per server by grouping by server and retrieving max(left(value, 10)), because alphabetic order works on 'yyyy-mm-dd'.
select name, max(left(value, 10))
from event
where field = 'server_vers'
and value <> 'None'
group by name
Build up on this to get the count of updates on those last-update dates:
select left(value, 10), count(*)
from event
where field = 'server_vers'
and left(value, 10) in
(
select max(left(value, 10))
from event
where field = 'server_vers'
and value <> 'None'
group by name
)
group by left(value, 10)
order by left(value, 10);
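Putting the two answers together, one possible shape is: take the last patch date per Fedora server, then count servers per date. The sketch below is only an illustration under assumptions: the rows are invented (the fiddle data is not reproduced here), SQLite stands in for MySQL, and SUBSTR(value, 1, 10) replaces LEFT(value, 10) because SQLite has no LEFT function, while both engines accept SUBSTR.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE server (name TEXT, field TEXT, value TEXT)")
conn.execute("CREATE TABLE event (name TEXT, field TEXT, value TEXT)")
# Hypothetical rows in the object/property/value layout described above.
conn.executemany("INSERT INTO server VALUES (?, ?, ?)", [
    ("srv1", "OS", "Fedora"), ("srv1", "location", "Paris"),
    ("srv2", "OS", "Fedora"),
    ("srv3", "OS", "Fedora"),
    ("srv4", "OS", "Debian"),
])
conn.executemany("INSERT INTO event VALUES (?, ?, ?)", [
    ("srv1", "server_vers", "2017-10-01 09:00:00"),
    ("srv1", "server_vers", "2017-11-14 10:00:00"),
    ("srv2", "server_vers", "2017-11-14 12:30:00"),
    ("srv3", "server_vers", "2017-11-04 08:00:00"),
    ("srv3", "server_vers", "None"),  # would win MAX() as a string if not filtered
    ("srv4", "server_vers", "2017-12-01 00:00:00"),  # not Fedora
])

COUNT_SQL = """
SELECT last_patch, COUNT(*) AS patched
FROM (
    SELECT e.name, MAX(SUBSTR(e.value, 1, 10)) AS last_patch
    FROM event AS e
    JOIN server AS s ON s.name = e.name
    WHERE e.field = 'server_vers' AND e.value <> 'None'
      AND s.field = 'OS' AND s.value = 'Fedora'
    GROUP BY e.name
) AS per_server
GROUP BY last_patch
ORDER BY last_patch DESC
"""
counts = conn.execute(COUNT_SQL).fetchall()
print(counts)
```

With these made-up rows the result has the same shape as the output requested in the question: one row per last-patch date with the number of servers last patched on that day.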

Mysql time spread

Sorry, I have difficulty explaining my question and search for a previous answer. This is my problem -- I have a MySQL table with events
CREATE TABLE events (
id INT,
event INT,
date DATETIME
);
Data is being added a few times a week or month. I would like to see the statistical spread of time between two adjacent events. Something like:
Time difference between two events
1 day apart - 4 occurrences
2 days apart - 2 occurrences
n days apart - x occurrences
It should be something like this, I guess, but calculating the time difference between events.
SELECT COUNT('id') AS 'no', ??? AS 'delta' GROUP BY FLOOR( 'delta' )
This piece of SQL code did it:
SET @old = NOW();
SELECT COUNT(`id`) AS `no`, query1.`delta` FROM
( SELECT `id`, `date`, DATEDIFF( @old, `date` ) AS `delta`, @old := `date` AS `old`
FROM `life`
ORDER BY `date` DESC ) query1
GROUP BY `delta`
ORDER BY `delta`
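On engines with window functions, the same distribution can be computed with LAG instead of a user variable (as far as I know, assigning user variables inside a SELECT is deprecated in recent MySQL 8.0 releases). The sketch below runs in SQLite for self-containment, which is an assumption, so the day difference is built from julianday(); in MySQL 8 the equivalent expression would be DATEDIFF(`date`, LAG(`date`) OVER (ORDER BY `date`)). The column name event_date and the rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 8 (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, event_date TEXT)")
conn.executemany(
    "INSERT INTO events (event_date) VALUES (?)",
    [
        ("2021-01-01 09:00:00",),
        ("2021-01-02 18:30:00",),
        ("2021-01-04 07:15:00",),
        ("2021-01-05 12:00:00",),
        ("2021-01-08 23:59:00",),
        ("2021-01-09 00:01:00",),
    ],
)

# LAG fetches the previous event's timestamp; date() drops the time part so
# the difference is in whole calendar days, like MySQL's DATEDIFF.
GAPS_SQL = """
SELECT delta, COUNT(*) AS occurrences
FROM (
    SELECT CAST(
               julianday(date(event_date))
             - julianday(date(LAG(event_date) OVER (ORDER BY event_date)))
           AS INTEGER) AS delta
    FROM events
) gaps
WHERE delta IS NOT NULL      -- the first event has no predecessor
GROUP BY delta
ORDER BY delta
"""
gaps = conn.execute(GAPS_SQL).fetchall()
for delta, occurrences in gaps:
    print(f"{delta} day(s) apart - {occurrences} occurrence(s)")
```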

How to sort a list of people by birth and death dates when data are incomplete

I have a list of people who may or may not have a birth date and/or a death date. I want to be able to sort them meaningfully - a subjective term - by birth date.
BUT - if they don't have a birth date but they do have a death date, I want to have them collated into the list proximal to other people who died then.
I recognize that this is not a discrete operation - there is ambiguity about where someone should go when their birth date is missing. But I'm looking for something that is a good approximation, most of the time.
Here's an example list of what I'd like:
Alice 1800 1830
Bob 1805 1845
Carol 1847
Don 1820 1846
Esther 1825 1860
In this example, I'd be happy with Carol appearing either before or after Don - that's the ambiguity I'm prepared to accept. The important outcome is that Carol is sorted in the list relative to her death date as a death date, not sorting the death dates in with the birth dates.
What doesn't work is if I coalesce or otherwise map birth and death dates together. For example, ORDER BY birth_date, death_date would put Carol after Esther, which is way out of place by my thinking.
I think you're going to have to calculate the average lifespan (from the people who have both birth and death dates), and then either subtract it from the death date or add it to the birth date for people who are missing the other one.
Doing this in one query may not be efficient, and is perhaps ugly, because MySQL before 8.0 doesn't have window functions. You may be better off precalculating the average lifespan beforehand. But let's try to do it in one query anyway:
SELECT name, birth_date, death_date
FROM people
ORDER BY COALESCE(
birth_date,
DATE_SUB(death_date, INTERVAL (
SELECT AVG(DATEDIFF(death_date, birth_date))
FROM people
WHERE birth_date IS NOT NULL AND death_date IS NOT NULL
) DAY)
)
N.B.: I've tried with a larger dataset, and it is not working completely as I'd expect.
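To see the idea on the five people from the question, here is a small sketch. It assumes birth and death are stored as integer years (a simplification of the DATE-based query above) and uses an in-memory SQLite database only so the example is runnable; the table layout is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (assumption for this demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, birth INTEGER, death INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Esther", 1825, 1860),
        ("Carol", None, 1847),  # birth year unknown
        ("Alice", 1800, 1830),
        ("Don", 1820, 1846),
        ("Bob", 1805, 1845),
    ],
)

# Sort key: the real birth year when known, otherwise the death year minus
# the average lifespan of everyone whose two years are both known.
ORDER_SQL = """
SELECT name, birth, death
FROM people
ORDER BY COALESCE(
             birth,
             death - (SELECT AVG(death - birth)
                      FROM people
                      WHERE birth IS NOT NULL AND death IS NOT NULL)
         ),
         name
"""
ordered = conn.execute(ORDER_SQL).fetchall()
names = [row[0] for row in ordered]
print(names)
```

The average lifespan here is 32.75 years, so Carol gets an estimated birth year of 1814.25 and lands between Bob and Don. On data where lifespans vary a lot, the estimate can misplace people, which may be what the N.B. above ran into.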
Try with this query (it needs an id primary key column):
SELECT * FROM people p
ORDER BY (
CASE WHEN birth IS NOT NULL THEN (
SELECT ord FROM (
SELECT id, @rnum := @rnum + 1 AS ord
FROM people, (SELECT @rnum := 0) r1
ORDER BY (CASE WHEN birth IS NOT NULL THEN 0 ELSE 1 END), birth, death
) o1
WHERE id = p.id
) ELSE (
SELECT ord FROM (
SELECT id, @rnum := @rnum + 1 AS ord
FROM people, (SELECT @rnum := 0) r2
ORDER BY (CASE WHEN death IS NOT NULL THEN 0 ELSE 1 END), death, birth
) o2
WHERE id = p.id
)
END)
;
What I've done is, basically, to sort the dataset two times, once by birth date and then by death date. Then I've used these two sorted lists to assign the final order to the original dataset, picking the place from the birth-sorted list at first, and using the place from the death-sorted list when a row has no birth date.
Here's a few problems with that query:
I didn't run it against lots of datasets, so I can't really guarantee it will work with any dataset;
I didn't check its performance, so it could be quite slow on large datasets.
This is the table I've used to write it, tested with MySQL 5.6.21 (I can't understand why, but SQL Fiddle is rejecting my scripts with a Create script error, so I can't provide you with a live example).
Table creation:
CREATE TABLE `people` (
`id` INT(11) NOT NULL AUTO_INCREMENT,
`name` VARCHAR(50) NOT NULL,
`birth` INT(11) NULL DEFAULT NULL,
`death` INT(11) NULL DEFAULT NULL,
PRIMARY KEY (`id`)
);
Data (I actually slightly changed yours):
INSERT INTO `people` (`name`, `birth`, `death`) VALUES ('Alice', 1800, NULL);
INSERT INTO `people` (`name`, `birth`, `death`) VALUES ('Bob', 1805, 1845);
INSERT INTO `people` (`name`, `birth`, `death`) VALUES ('Carol', NULL, 1847);
INSERT INTO `people` (`name`, `birth`, `death`) VALUES ('Don', 1820, 1846);
INSERT INTO `people` (`name`, `birth`, `death`) VALUES ('Esther', 1815, 1860);
You can use a subquery to pick a suitable birth date for sorting purposes,
and then a union to combine those rows with the records that do have a birth date.
For example:
select d1.name, null as birthdate, d1.deathdate, max(d2.birthdate) sort from
d as d1, d as d2
where d1.birthdate is null and d2.deathdate <=d1.deathdate
group by d1.name, d1.deathdate
union all
select name, birthdate, deathdate, birthdate from d
where birthdate is not null
order by 4
http://sqlfiddle.com/#!9/2d91c/1
Not sure if this will work, but it's worth a try (I can't test this on MySQL, so I'm guessing):
ORDER BY CASE WHEN birth_date IS NULL THEN death_date ELSE birth_date END