I have a query that SUMS different values and then orders the results using ORDER BY.
Whenever I format the result using FORMAT I get a different ordering than without format.
For example:
Ordering without format: 2827.0000, 1668.0000, 663.1000
Ordering with format: 663.10, 2,827.00, 1,668.00
What could be causing this behaviour?
This is the full query:
SELECT
FORMAT( ( (Sum(CASE WHEN YEAR(order_date) = 2015 THEN total END) / 100) - (SELECT COALESCE( ( SUM(total) / 100), 0)
FROM returns WHERE customer = orders.customer AND YEAR(return_dat) = 2015) ), 2) AS anual
FROM orders
WHERE 1 GROUP BY customer ORDER BY anual DESC
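FORMAT returns a string, so ORDER BY anual sorts the formatted values character by character (ASCII-betical order) rather than numerically. That is why '663.10' lands ahead of '2,827.00': the character '6' sorts after '2'. If you want numeric ordering, select two columns, one formatted and one unformatted, and order by the unformatted one. Keep in mind that formatting is usually best done in your application layer.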
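A minimal sketch of the two-column idea, run against an in-memory SQLite database from Python rather than MySQL. The table name totals and the columns anual_raw and anual_fmt are made up for illustration, and the formatting is done in Python because SQLite has no FORMAT function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the raw number next to its formatted string.
conn.execute("CREATE TABLE totals (anual_raw REAL, anual_fmt TEXT)")
for value in (2827.0, 1668.0, 663.1):
    # f"{value:,.2f}" mimics FORMAT(value, 2): thousands separator, 2 decimals.
    conn.execute("INSERT INTO totals VALUES (?, ?)", (value, f"{value:,.2f}"))

# Ordering by the formatted string compares text character by character.
by_string = [row[0] for row in conn.execute(
    "SELECT anual_fmt FROM totals ORDER BY anual_fmt DESC")]

# Ordering by the raw number, while still displaying the formatted string.
by_number = [row[0] for row in conn.execute(
    "SELECT anual_fmt FROM totals ORDER BY anual_raw DESC")]

print(by_string)  # ['663.10', '2,827.00', '1,668.00']
print(by_number)  # ['2,827.00', '1,668.00', '663.10']
```

The first list reproduces the unexpected ordering from the question; the second is the numeric ordering.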
I have a table containing stock market data (open, hi, lo, close prices) but in a random order of date:
Date Open Hi Lo Close
12/10/2019 313.82 314.54 312.81 313.58
11/22/2019 311.09 311.24 309.85 310.96
11/25/2019 311.98 313.37 311.98 313.37
11/26/2019 313.41 314.28 313.06 314.08
11/27/2019 314.61 315.48 314.37 315.48
11/29/2019 314.86 315.13 314.06 314.31
12/2/2019 314.59 314.66 311.17 311.64
12/3/2019 308.65 309.64 307.13 309.55
I have another value in a PHP variable (say $BaseValue),and a start date and end date ($startdt and $enddt).
1) My requirement is to pick-up the value from the HI column, if it exceeds the $BaseValue on the very FIRST date in a chronological order between the given start and end dates.
For example, if the $BaseValue=314, startdt=11/22, enddt=12/2, then I want to retrieve the Date (11/26/19) as it is the earliest date on which the Hi value (314.28) exceeded the $Basevalue within the given date range. The select statement should return both the Hi value (314.28) and the Date (11/26/19).
2) Additionally, I also need to retrieve the HIGHEST value and date from the HI column during the given date duration. In the above scenario, it should return 315.48 and corresponding date 11/27.
The table is NOT in chronological order - it's randomly filled.
I am unable to get the first query at all with the use of MAX function and its various combinations. Makes me wonder if that is possible at all in SQL or not.
While the second is straightforward, I was wondering if it is more efficient and less complex to club the two queries and get the four values in one single shot.
Any ideas on how I can approach this requirement, please?
Thanks
You could use two subqueries for filtering, one per criterion, like:
select t.*
from mytable t
where
t.date = (
select min(t1.date)
from mytable t1
where t1.date between :datedt and :enddt and t1.hi >= :basevalue
)
or t.hi = (
select max(t1.hi)
from mytable t1
where t1.date between :datedt and :enddt and t1.hi >= :basevalue
)
Another option is to union two queries with order by and limit:
(
select t.*
from mytable t
where t.date between :datedt and :enddt and t.hi >= :basevalue
order by t.date
limit 1
)
union
(
select t.*
from mytable t
where t.date between :datedt and :enddt and t.hi >= :basevalue
order by t.hi desc, t.date
limit 1
)
Please note that both queries do not do exactly the same thing. If there are ties for the highest hi in the period, the first query will return all ties, while the second will pick the earliest one. It's up to you to decide which solution better fits your use case.
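As a rough check of the order by / limit approach, here is a sketch that runs it on the question's sample rows in an in-memory SQLite database from Python. The table name prices and the column dt are made up, and the dates are stored as ISO strings (an assumption) so that they sort chronologically. Unlike the queries above, it uses a strict hi > base comparison for the first date and no base filter for the maximum, which is how I read the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices (dt TEXT, hi REAL)")  # hypothetical schema
conn.executemany("INSERT INTO prices VALUES (?, ?)", [
    # Inserted out of date order on purpose, as in the question.
    ("2019-12-10", 314.54), ("2019-11-22", 311.24), ("2019-11-25", 313.37),
    ("2019-11-26", 314.28), ("2019-11-27", 315.48), ("2019-11-29", 315.13),
    ("2019-12-02", 314.66), ("2019-12-03", 309.64),
])

start, end, base = "2019-11-22", "2019-12-02", 314

# Earliest date in the range on which hi exceeds the base value.
first_above = conn.execute(
    "SELECT dt, hi FROM prices "
    "WHERE dt BETWEEN ? AND ? AND hi > ? ORDER BY dt LIMIT 1",
    (start, end, base)).fetchone()

# Highest hi in the range; ties are broken by the earliest date.
highest = conn.execute(
    "SELECT dt, hi FROM prices "
    "WHERE dt BETWEEN ? AND ? ORDER BY hi DESC, dt LIMIT 1",
    (start, end)).fetchone()

print(first_above)  # ('2019-11-26', 314.28)
print(highest)      # ('2019-11-27', 315.48)
```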
I am trying to fetch the count of records entered in each month of the financial year.
For example, I have declared a column called issue as varchar, because the data I record are the issues of a particular machine. Say an issue is raised in July: I enter it as 'jul19-1'. If an issue is raised again in September, I go back to the row for the July issue and add 'sep19-2'.
So in the backend, the value is stored as jul19-1 sep19-2.
What query can I write to count the number of issues raised in each month?
I tried the query below:
SELECT COUNT(month_nc)
FROM `ncr`
WHERE month_nc='Jul18-1'
In some months there will be only one issue, so I can get the count for that month with the query above.
What would the query be if I want to fetch the count for each month?
id  issue    issue_month
1   bearing  jul18-1
             sep18-2
2   motor    jul18-2
3   battery  apr18-3
ps: issue_month is declared in varchar(10)
Here are two methods. One using strings:
select left(issue_month, 5), count(*)
from t
group by left(issue_month, 5)
This will not order the values correctly.
You can convert to a date to order properly:
order by str_to_date(concat('01', left(issue_month, 5)), '%d%b%y')
Or, represent the dates correctly:
select str_to_date(concat('01', left(issue_month, 5)), '%d%b%y') as yyyymm, count(*)
from t
group by yyyymm
order by yyyymm;
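The same grouping can be sketched outside the database. The Python below assumes each entry such as 'jul18-1' is available as its own string (the question suggests several may share one field, which would need splitting first); the helper names month_label and chrono_key are made up for illustration.

```python
from collections import Counter

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]

def month_label(entry):
    """Return the 'mmmyy' part of an entry such as 'jul18-1'."""
    return entry.split("-")[0].strip().lower()

def chrono_key(label):
    """Sort key (year, month index), so 'apr18' comes before 'jul18'."""
    return (int(label[3:]), MONTHS.index(label[:3]))

entries = ["jul18-1", "sep18-2", "jul18-2", "apr18-3"]  # values from the question
counts = Counter(month_label(e) for e in entries)
ordered = sorted(counts.items(), key=lambda kv: chrono_key(kv[0]))

print(ordered)  # [('apr18', 1), ('jul18', 2), ('sep18', 1)]
```

Sorting on (year, month index) plays the role of the str_to_date conversion: the plain strings would otherwise sort alphabetically.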
Here is what you can do to split your issue_month into "month_year" and "issue_count"
yourTable
select id,
issue,
issue_month,
REGEXP_SUBSTR(issue_month, '[^-]+', 1) as month_year,
REGEXP_SUBSTR(issue_month, '[^-]+', 1,2 ) as issue_count
from yourTable;
Now you can aggregate the issue_count across issues, month_year values, or any other field in your table.
For example, to get the sum of all the issues for any given month_year
select
month_year,
sum(issue_count) issue_count
from
(select
id, issue, issue_month,
REGEXP_SUBSTR(issue_month, '[^-]+', 1) as month_year,
REGEXP_SUBSTR(issue_month, '[^-]+', 1,2 ) as issue_count
from yourTable) foo
group by month_year;
I have users and orders tables with this structure (simplified for question):
USERS
userid
registered(date)
ORDERS
id
date (order placed date)
user_id
I need to get array of users (array of userid) who placed their 25th order during specified period (for example in May 2019), date of 25th order for each user, number of days to place 25th order (difference between registration date for user and date of 25th order placed).
For example if user registered in April 2018, then placed 20 orders in 2018, and then placed 21-30th orders in Jan-May 2019 - this user should be in this array, if he placed 25th (overall for his account) order in May 2019.
How can I do this with a MySQL query?
Sample data and structure: http://www.sqlfiddle.com/#!9/998358 (for testing you can get 3rd order as ex., not 25th, to not add a lot of sample data records).
A single query is not required - if this can't be done in one query, a few are acceptable.
You can use a correlated subquery to get the count of orders placed before the current one by a user. If that's 24 the current order is the 25th. Then check if the date is in the desired range.
SELECT o1.user_id,
o1.date,
datediff(o1.date, u1.registered)
FROM orders o1
INNER JOIN users u1
ON u1.userid = o1.user_id
WHERE (SELECT count(*)
FROM orders o2
WHERE o2.user_id = o1.user_id
AND (o2.date < o1.date
OR (o2.date = o1.date
AND o2.id < o1.id))) = 24
AND o1.date >= '2019-01-01'
AND o1.date < '2019-06-01';
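To try the counting approach on a small data set, here is a sketch using an in-memory SQLite database from Python. It looks for the 3rd order instead of the 25th, as the question suggests for testing. The sample rows are made up, and because SQLite has no datediff, the day difference is computed with julianday instead. The tie-break on id is explicitly parenthesised so that it only applies to orders of the same user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (userid INTEGER, registered TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER, date TEXT, user_id INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(1, "2018-04-01"), (2, "2019-01-01")])  # made-up sample data
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", [
    (1, "2018-05-01", 1), (2, "2018-06-01", 1),
    (3, "2019-02-01", 2), (4, "2019-03-01", 2), (5, "2019-03-01", 2),
    (6, "2019-05-10", 1), (7, "2019-05-05", 2), (8, "2019-05-20", 1),
])

nth = 3  # the question asks for 25; 3 keeps the sample data small
rows = conn.execute("""
    SELECT o1.user_id, o1.date,
           CAST(julianday(o1.date) - julianday(u1.registered) AS INTEGER)
    FROM orders o1
    JOIN users u1 ON u1.userid = o1.user_id
    WHERE (SELECT count(*)
           FROM orders o2
           WHERE o2.user_id = o1.user_id
             AND (o2.date < o1.date
                  OR (o2.date = o1.date AND o2.id < o1.id))) = ?
      AND o1.date >= ? AND o1.date < ?
""", (nth - 1, "2019-05-01", "2019-06-01")).fetchall()

print(rows)  # [(1, '2019-05-10', 404)]
```

User 2 does not appear because that user's 3rd order falls on 2019-03-01, outside the requested period.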
The basic inefficient way of doing this would be to get the user_id for every row in ORDERS where the date is in your target range AND the count of rows in ORDERS with the same user_id and a lower date is exactly 24.
This can get very ugly, very quickly, though.
If you're calling this from code you control, can't you do it from the code?
If not, there should be a way to assign to each row an index describing its rank among orders for its specific user_id, and select from this all user_id from rows with an index of 25 and a correct date. This will give you a select from select from select, but it should be much faster. The difficulty here is to control the order of the rows, so here are the selects I envision:
1. Select all rows, order by user_id asc, date asc, union-ed to nothing from a table made of two vars you'll initialize at 0.
2. From this, select all while updating a var to know if a row's user_id is the same as the last, and adding a field that will report so (so for each user_id the first line in order will have a specific value like 0 while the other rows for the same user_id will have a 1).
3. From this, select all plus a field that equals itself plus one in case the first added field is 1, else 0.
4. From this, select the user_id from the rows where the second added field is 25 and the date is in range.
The union thingy is only necessary if you need to do it all in one request (you have to initialize them in a lower select than the one they're used in).
Edit: Well if you need the date too you can just select it along with the user_id, but calculating the number of days in sql will be a pain. Just join the result table to the users table and get both the date of 25th order and their date of registration, you'll surely be able to do the difference in code.
I'll try building an actual request, however if you want to truly understand what you need to make this you gotta read up on mysql variables, unions, and conditional statements.
"Looks too complicated. I am sure that this can be done with current DB structure and 1-2 requests." Well, yeah. Use the COUNT request, it will be easy, and slow as hell.
For the complex answer, see http://www.sqlfiddle.com/#!9/998358/21
Since you can use multiple requests, you can just initialize the vars first.
It isn't actually THAT complicated, you just have to understand how to concretely express what you mean by "a user's 25th order" to a SQL engine.
See http://www.sqlfiddle.com/#!9/998358/24 for the difference in days, turns out there's a method for that.
Edit 5: seems you're going with the COUNT method. I'll pray your DB is small.
Edit 6: For posterity:
The count method will take years on very large databases. Since OP didn't come back, I'm assuming his is small enough to overlook query speed. If that's not your case and let's say it's 10 years from now and the sqlfiddle links are dead; here's the two-queries solution:
SET @PREV_USR:=0;
SELECT user_id, date_ FROM (
SELECT user_id, date_, SAME_USR AS IGNORE_SMUSR,
@RANK_USR:=(CASE SAME_USR WHEN 0 THEN 1 ELSE @RANK_USR+1 END) AS RANK FROM (
SELECT orders.*, CASE WHEN @PREV_USR = user_id THEN 1 ELSE 0 END AS SAME_USR,
@PREV_USR:=user_id AS IGNORE_USR FROM
orders
ORDER BY user_id ASC, date_ ASC, id ASC
) AS DERIVED_1
) AS DERIVED_2
WHERE RANK = 25 AND YEAR(date_) = 2019 AND MONTH(date_) = 4 ;
Just change RANK = ? and the conditions to fit your needs. If you want to fully understand it, start by the innermost SELECT then work your way high; this version fuses the points 1 & 2 of my explanation.
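If it helps to see the ranking logic outside SQL, the sketch below mirrors it in Python: sort by user, date and id, remember the previous user, and restart the counter whenever the user changes. The function name nth_orders and the sample rows are made up for illustration, and n is 3 rather than 25 to keep the data small.

```python
def nth_orders(orders, n):
    """Return (user_id, date) of each user's n-th order in one sorted pass.

    orders is an iterable of (id, date, user_id) tuples with ISO date strings.
    """
    result = []
    prev_user = None  # plays the role of the PREV_USR variable
    rank = 0          # plays the role of the RANK_USR variable
    for order_id, date, user_id in sorted(orders, key=lambda o: (o[2], o[1], o[0])):
        # Same user as the previous row: count up. New user: restart at 1.
        rank = rank + 1 if user_id == prev_user else 1
        prev_user = user_id
        if rank == n:
            result.append((user_id, date))
    return result

orders = [  # made-up sample data: (id, date, user_id)
    (1, "2018-05-01", 1), (2, "2018-06-01", 1),
    (3, "2019-02-01", 2), (4, "2019-03-01", 2), (5, "2019-03-01", 2),
    (6, "2019-05-10", 1), (7, "2019-05-05", 2), (8, "2019-05-20", 1),
]

third = nth_orders(orders, 3)
in_may_2019 = [(u, d) for u, d in third if "2019-05-01" <= d < "2019-06-01"]

print(third)        # [(1, '2019-05-10'), (2, '2019-03-01')]
print(in_may_2019)  # [(1, '2019-05-10')]
```

Each row is visited once after the sort, which is the reason this approach scales better than counting earlier orders for every row.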
Now sometimes you will have to use an API or something and it won't let you keep variable values in memory unless you commit it or some other restriction, and you'll need to do it in one query. To do that, you put the initialization one step lower and make it so it does not affect the higher statements. IMO the best way to do this is in a UNION with a fake table where the only row is excluded. You'll avoid the hassle of a JOIN and it's just better overall.
SELECT user_id, date_ FROM (
SELECT user_id, date_, SAME_USR AS IGNORE_SMUSR,
@RANK_USR:=(CASE SAME_USR WHEN 0 THEN 1 ELSE @RANK_USR+1 END) AS RANK FROM (
SELECT DERIVED_4.*, CASE WHEN @PREV_USR = user_id THEN 1 ELSE 0 END AS SAME_USR,
@PREV_USR:=user_id AS IGNORE_USR FROM
(SELECT * FROM orders
UNION
SELECT * FROM (
SELECT (@PREV_USR:=0) AS INIT_PREV_USR, 0 AS COL_2, 0 AS COL_3
) AS DERIVED_3
WHERE INIT_PREV_USR <> 0
) AS DERIVED_4
ORDER BY user_id ASC, date_ ASC, id ASC
) AS DERIVED_1
) AS DERIVED_2
WHERE RANK = 25 AND YEAR(date_) = 2019 AND MONTH(date_) = 4 ;
With that method, the thing to watch for is the amount and the type of columns in your basic table. Here orders' first field is an int, so I put INIT_PREV_USR in first then there are two more fields so I just add two zeroes with names and call it a day. Most types work, since the union doesn't actually do anything, but I wouldn't try this when your first field is a blob (worst comes to worst you can use a JOIN).
You'll note this is derived from a method of pagination in mysql. If you want to apply this to other engines, just check out their best pagination calls and you should be able to work things out.
I have a "transaction" table that has the following columns
ID TIMESTAMP USER ID DESCRIPTION AMOUNT REF_ID TYPE
The description column contains the payment platform used for example "STRIPE-ch_1745". We currently have 4 platforms all described in the reference as in the example above. What I want is to get the payment platform, the total amount processed by the platform and the count of transactions. Like this
Platform Amount Count
Stripe 100,000 78
iOS 78,000 50
My current code only gives me these values for one platform, I've been unable to structure this properly to give me the desired result. I assumed I needed nested select statements, so I wrote the code in that manner
SELECT txn_count, sum
FROM
(SELECT count(*) AS txn_count, sum(`transaction`.`amount`) AS `sum`
FROM `transaction`
WHERE (`transaction`.`type` = 'credit'
AND (`transaction`.`description` like 'stripe%')
AND str_to_date(concat(date_format(`transaction`.`timestamp`, '%Y-%m'), '-01'), '%Y-%m-%d') = str_to_date(concat(date_format(now(), '%Y-%m'), '-01'), '%Y-%m-%d'))) t1
What this gives me right now is
Txn Count Sum
311 501,000
Would appreciate some help on how to get the expected table
Try this (edited to remove the reference part, assuming the reference is always separated from the platform by '-'):
SELECT
LEFT(t.description,LOCATE('-',t.description) - 1) as 'Platform',
SUM(t.amount) as 'Amount',
COUNT(*) as 'Count'
FROM transaction t
GROUP BY Platform
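Here is a quick way to try the prefix grouping on a few rows, sketched with an in-memory SQLite database from Python. SQLite has no LEFT or LOCATE, so substr and instr stand in for them. The table name txn and every sample description other than 'STRIPE-ch_1745' are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (description TEXT, amount INTEGER)")  # hypothetical
conn.executemany("INSERT INTO txn VALUES (?, ?)", [
    ("STRIPE-ch_1745", 100),
    ("STRIPE-ch_1746", 50),   # made-up reference
    ("IOS-tx_1", 30),         # made-up platform prefix and reference
])

# Everything before the first '-' is treated as the platform name.
per_platform = conn.execute("""
    SELECT substr(description, 1, instr(description, '-') - 1) AS platform,
           SUM(amount) AS amount,
           COUNT(*)    AS txn_count
    FROM txn
    GROUP BY platform
    ORDER BY platform
""").fetchall()

print(per_platform)  # [('IOS', 30, 1), ('STRIPE', 150, 2)]
```

The type and current-month filters from the question's own query would still need to be added as a WHERE clause before the GROUP BY.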
Say I have this .csv file which holds data that describes sales of a product. Now say I want a monthly breakdown of number of sales. I mean I wanna see how many orders were received in JAN2005, FEB2005...JAN2008, FEB2008...NOV2012, DEC2012.
Now one very simple way I can think of is to count them one by one like this. (BTW I am using logparser to run my queries)
logparser -i:csv -o:csv "SELECT COUNT(*) AS NumberOfSales INTO 'C:\Users\blah.csv' FROM 'C:\User\whatever.csv' WHERE OrderReceiveddate LIKE '%JAN2005%'"
My question is if there is a smarter way to do this. I mean, instead of changing the month again and again and running my query, can I write one query which can produce the result in one excel all at once?
Yes.
If you add a group by clause to the statement, then the sql will return a separate count for each unique value of the group by column.
So if you write:
SELECT OrderReceiveddate, COUNT(*) AS NumberOfSales INTO 'C:\Users\blah.csv'
FROM 'C:\User\whatever.csv' GROUP BY OrderReceiveddate
you will get results like:
JAN2005 12
FEB2005 19
MAR2005 21
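What group by does here can be sketched in a few lines of Python: read the rows once and keep one counter per distinct value of the grouping column. The CSV content and the OrderId column below are invented for illustration; only the OrderReceiveddate column name comes from the question.

```python
import csv
import io
from collections import Counter

# Made-up stand-in for the sales file; a real script would open the CSV instead.
sample_csv = io.StringIO(
    "OrderId,OrderReceiveddate\n"
    "1,JAN2005\n"
    "2,FEB2005\n"
    "3,JAN2005\n"
)

# One counter per distinct OrderReceiveddate, like GROUP BY with COUNT(*).
sales_per_month = Counter(
    row["OrderReceiveddate"] for row in csv.DictReader(sample_csv))

print(dict(sales_per_month))  # {'JAN2005': 2, 'FEB2005': 1}
```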
Assuming OrderReceiveddate is a date, you would format the date to have a year and month and then aggregate:
SELECT date_format(OrderReceiveddate, '%Y-%m') as YYYYMM, COUNT(*) AS NumberOfSales
INTO 'C:\Users\blah.csv'
FROM 'C:\User\whatever.csv'
WHERE OrderReceiveddate >= '2015-01-01'
GROUP BY date_format(OrderReceiveddate, '%Y-%m')
ORDER BY YYYYMM
You don't want to use like on a date column. like expects string arguments. Use date functions instead.