Count Values with same Month Timestamp - mysql

I am struggling to count all the values whose timestamps fall in the same month. This is what my database looks like:
Let's say I would like to get the number of orders in May 2013. What is the right syntax to get this done?

To get a count for a timestamp range, we can compare the timestamp column to a lower and an upper bound, for example:
SELECT COUNT(*)
FROM mytable t
WHERE t.orderdate >= '2013-05-01 00:00:00'
AND t.orderdate < '2013-06-01 00:00:00'
(All orders on or after the first second of May 1st AND before the first second of June.)
We can also do a similar comparison in an expression in the SELECT list, a conditional aggregation pattern:
SELECT SUM(IF(t.orderdate >= '2013-05-01' AND t.orderdate < '2013-06-01',1,0)) AS cnt_may
FROM mytable t
or, equivalently:
SELECT SUM(CASE WHEN DATE_FORMAT(t.orderdate,'%Y-%m') = '2013-05' THEN 1 ELSE 0 END) AS cnt_may
FROM mytable t
Note that the first query (with conditions in the WHERE clause on the bare orderdate column) can take advantage of an index that has orderdate as the leading column, to perform an efficient range scan operation.
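As a rough sketch of the indexing point above, the following compares the two predicate styles with EXPLAIN. The index name ix_mytable_orderdate is made up for illustration, and the plan actually chosen depends on the server version and the data, so treat this as something to try rather than a guaranteed result.

```sql
-- Assumption: ix_mytable_orderdate is a hypothetical index name;
-- mytable and orderdate come from the question.
CREATE INDEX ix_mytable_orderdate ON mytable (orderdate);

-- Bare column compared to constants: the optimizer can consider
-- a range scan on the index.
EXPLAIN
SELECT COUNT(*)
FROM mytable t
WHERE t.orderdate >= '2013-05-01'
  AND t.orderdate <  '2013-06-01';

-- Column wrapped in a function: the predicate cannot be turned into
-- a range lookup on orderdate, so every row (or index entry) is examined.
EXPLAIN
SELECT COUNT(*)
FROM mytable t
WHERE DATE_FORMAT(t.orderdate, '%Y-%m') = '2013-05';
```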

Related

SQL - empty result set for checking if date is past the current date

I have a query in MariaDB 10.3 database where there is a field called "expiration_date" that stores a unix timestamp, but if there is no data in the field the default is set to "0".
I'm trying to use a WHERE clause to check the current date against the expiration_date to filter out any records that are past the expiration_date. Below is what I have.
SELECT entry_id, title, (CASE WHEN expiration_date = "0" THEN CURDATE() + INTERVAL 1 DAY ELSE FROM_UNIXTIME(expiration_date, "%Y-%m-%d") END) AS expiration_date
FROM channel_titles
WHERE CURDATE() < expiration_date
This returns an empty result set... what am I missing?
There's a very simple solution to this and it only requires you to change two things from your original query:
The first part is your column (CASE expression) alias: give it a name that differs from every column in the table. Your table has a column expiration_date, and you also alias the CASE expression as expiration_date. Because you are using WHERE, the query resolves expiration_date to the table column rather than to your CASE expression. Rename the alias to something like exp_date. However, WHERE exp_date ... will then return an error; see the second point below.
The second part is your WHERE: exp_date is an alias for a computed value (the CASE expression), so you can't reference it in WHERE unless you wrap the query in a subquery/derived table and filter outside it, which you don't need to do. Change WHERE to HAVING and you can use exp_date and get your result.
So, with those two changes, your query should be something like this:
SELECT entry_id, title,
(CASE WHEN expiration_date = "0" THEN CURDATE() + INTERVAL 1 DAY ELSE
FROM_UNIXTIME(expiration_date, "%Y-%m-%d") END) AS exp_date
FROM channel_titles
HAVING CURDATE() < exp_date;
You're trying to use an alias of expiration_date from your CASE statement in your WHERE clause.
Two problems with this:
You cannot use column aliases in the WHERE clause, because WHERE is evaluated before SELECT in the logical order of execution.
Your alias matches an actual column name in your table, so your WHERE clause does not throw an error about the alias; it compares the current date to the expiration_date column in the table, which throws off your expected result.
Solutions:
If you want to use the alias in your WHERE clause, there are a few options for you to force SQL to handle the SELECT before the WHERE clause.
You can use a subquery (or subselect) to force the logical order of operations by using parentheses:
SELECT
a.entry_id,
a.title,
a.expiration_date
FROM
(SELECT
entry_id,
title,
(CASE WHEN expiration_date = 0 THEN CURDATE() + INTERVAL 1 DAY ELSE FROM_UNIXTIME(expiration_date, '%Y-%m-%d') END) AS expiration_date
FROM channel_titles
) a
WHERE CURDATE() < a.expiration_date
You can declare your alias in a Common Table Expression (CTE), then SELECT it FROM the CTE:
WITH cte AS (SELECT
entry_id,
title,
(CASE WHEN expiration_date = 0 THEN CURDATE() + INTERVAL 1 DAY ELSE FROM_UNIXTIME(expiration_date, '%Y-%m-%d') END) AS expiration_date
FROM channel_titles)
SELECT
entry_id,
title,
expiration_date
FROM cte
WHERE CURDATE() < expiration_date
You can skip the alias entirely in your WHERE clause and repeat the logic from your SELECT list there. This is redundant from a readability perspective, and the duplicated expression costs some extra processing, but with a small data set this method works just fine:
SELECT
entry_id,
title,
(CASE WHEN expiration_date = 0 THEN CURDATE() + INTERVAL 1 DAY ELSE FROM_UNIXTIME(expiration_date, '%Y-%m-%d') END) AS expiration_date
FROM channel_titles
WHERE CURDATE() < (CASE WHEN expiration_date = 0 THEN CURDATE() + INTERVAL 1 DAY ELSE FROM_UNIXTIME(expiration_date, '%Y-%m-%d') END)
Input:
entry_id | title | expiration_date | expiration_date_date
1        | test1 | 1695513600      | 2023-09-24
2        | test2 | 0               | 2022-09-15
3        | test3 | 1662768000      | 2022-09-10
Output:
entry_id | title | expiration_date
1        | test1 | 2023-09-24
2        | test2 | 2022-09-15
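A further variant, not taken from the answers above, filters on the raw column instead of on the formatted value. It is a sketch that assumes expiration_date is stored as a numeric Unix timestamp and that 0 means "never expires"; exp_date is an illustrative alias.

```sql
-- Assumption: expiration_date is an integer Unix timestamp, 0 = no expiry.
-- Keeps rows whose expiration day is strictly after today,
-- plus rows with no expiration at all.
SELECT entry_id,
       title,
       CASE WHEN expiration_date = 0 THEN CURDATE() + INTERVAL 1 DAY
            ELSE FROM_UNIXTIME(expiration_date, '%Y-%m-%d')
       END AS exp_date
FROM channel_titles
WHERE expiration_date = 0
   OR expiration_date >= UNIX_TIMESTAMP(CURDATE() + INTERVAL 1 DAY);
```

Because the WHERE clause compares the bare column to values computed once per query, it should select the same rows as the CASE-based filters while leaving the optimizer free to use an index on expiration_date, if one exists.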

Avg function not returning proper value

I expect this query to give me the average number of daily active users to date, grouped by month (from October to December). But the result is approx. 164K when it should be 128K. Why is AVG not working? The average should be the SUM of the daily values divided by the number of days in the month so far.
SELECT sq.month_year AS 'month_year', AVG(number)
FROM
(
SELECT CONCAT(MONTHNAME(date), "-", YEAR(DATE)) AS 'month_year', count(distinct id_user) AS number
FROM table1
WHERE date between '2020-10-01' and '2020-12-31 23:59:59'
GROUP BY EXTRACT(year_month FROM date)
) sq
GROUP BY 1
Ok guys thanks for your help. The problem was that on the subquery I was pulling the info by month and not by day. So I should pull the info by day there and group by month in the outer query. This finally worked:
-- Outer query: average the daily counts within each month (ym = YYYYMM)
SELECT EXTRACT(YEAR_MONTH FROM sq.day_month) AS ym, AVG(sq.number)
FROM (SELECT DATE(date) AS day_month,
COUNT(DISTINCT id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY DATE(date) -- one row per day
) sq
GROUP BY ym
Do not use single quotes for column aliases!
SELECT sq.month_year, AVG(number)
FROM (SELECT CONCAT(MONTHNAME(date), '-', YEAR(DATE)) AS month_year,
count(distinct id_user) AS number
FROM table1
WHERE date >= '2020-10-01' AND
date < '2021-01-01'
GROUP BY month_year
) sq
GROUP BY 1;
Note the fixes to the query:
The GROUP BY uses the same columns as the SELECT. Your query should return an error (although it works in older versions of MySQL).
The date comparisons have been simplified.
No single quotes on column aliases.
Note that the outer query is not needed. I assume it is there just to illustrate the issue you are having.

Multiple COUNT() conditions for values on either side of a range

I currently have a query that finds all rows (with status=0) that have occurred before now:
SELECT id, COUNT(1) FROM tbl WHERE status = 0 AND date < UNIX_TIMESTAMP() GROUP BY id;
However, now I'd also like to be able to retrieve the values on the other side of this--i.e., I want to get all dates available after and before now, as two distinct values.
Is there any way to optimize this besides simply running two separate queries?
SELECT id
, SUM(date < UNIX_TIMESTAMP()) AS BeforeNow
, SUM(date > UNIX_TIMESTAMP()) AS AfterNow
FROM tbl
WHERE status = 0
GROUP BY id;
date < UNIX_TIMESTAMP() is a boolean expression, which evaluates to 1 or 0. The SUM of the expression therefore equals the number of rows for which it was true, i.e. its count.
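A quick way to see what SUM() is adding up is to evaluate a few comparisons on their own. This is only an illustration with literal values; the column aliases are made up.

```sql
-- In MySQL a comparison yields 1 (true), 0 (false) or NULL (unknown).
SELECT (5 < 10)    AS is_true,     -- 1
       (5 > 10)    AS is_false,    -- 0
       (NULL < 10) AS is_unknown;  -- NULL
```

SUM() skips NULLs, so a row whose date is NULL contributes to neither BeforeNow nor AfterNow.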
You can do a conditional count.
SELECT id,
COUNT(CASE WHEN date < UNIX_TIMESTAMP() THEN 1 ELSE null END) AS BeforeNow,
COUNT(CASE WHEN date > UNIX_TIMESTAMP() THEN 1 ELSE null END) AS AfterNow
FROM tbl
WHERE status = 0
GROUP BY id;

mySQL query with HAVING gives me an error. How to fix it?

When I run this query in phpMyAdmin I get this error message: Unknown column 'timestamp' in 'having clause'
My column name is timestamp
SELECT DISTINCT (
hash
) AS total
FROM behaviour
HAVING total =1 and date(timestamp) = curdate()
How do I get the number of hashes for today?
Use where. And parentheses are not appropriate for select distinct (distinct is not a function). I suspect that you intend:
SELECT COUNT(DISTINCT hash) AS total
FROM behaviour
WHERE date(timestamp) = curdate();
It is better to write the WHERE clause without using a function on the column:
SELECT COUNT(DISTINCT hash) AS total
FROM behaviour
WHERE timestamp >= curdate() AND timestamp < date_add(curdate(), interval 1 day);
Although more complicated, it allows the database engine to use an index on behaviour(timestamp) (or better yet, on behaviour(timestamp, hash)).
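The composite index mentioned above could be created roughly as follows. The index name ix_behaviour_ts_hash is made up, and the sketch assumes hash is a CHAR/VARCHAR column short enough to be indexed without a prefix length.

```sql
-- Assumption: hypothetical index name; hash is an indexable CHAR/VARCHAR.
-- timestamp comes first so the range condition can use the index;
-- including hash lets the query be answered from the index alone.
CREATE INDEX ix_behaviour_ts_hash ON behaviour (`timestamp`, `hash`);

EXPLAIN
SELECT COUNT(DISTINCT `hash`) AS total
FROM behaviour
WHERE `timestamp` >= CURDATE()
  AND `timestamp` <  CURDATE() + INTERVAL 1 DAY;
```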
EDIT:
If you want the hashes that appear only once, one method is a subquery:
select count(*)
from (select hash
from behaviour
where timestamp >= curdate() AND timestamp < date_add(curdate(), interval 1 day)
group by hash
having count(*) = 1
) t; -- MySQL requires an alias on every derived table
To count the hash values that exist only once:
select count(*)
from
(
select hash
from behaviour
where date(timestamp) = curdate()
group by hash
having count(*) = 1
) dt
The inner select (derived table) will return the hash values only existing once. The outer select will count those rows.

MySQL Query not selecting correct date range

I'm currently trying to run a SQL query to export data between certain dates. The query itself runs fine, but the date selection does not work and I can't figure out what's wrong.
SELECT
title AS Order_No,
FROM_UNIXTIME(entry_date, '%d-%m-%Y') AS Date,
status AS Status,
field_id_59 AS Transaction_ID,
field_id_32 AS Customer_Name,
field_id_26 AS Sub_Total,
field_id_28 AS VAT,
field_id_31 AS Discount,
field_id_27 AS Shipping_Cost,
(field_id_26+field_id_28+field_id_27-field_id_31) AS Total
FROM
exp_channel_data AS d NATURAL JOIN
exp_channel_titles AS t
WHERE
t.channel_id = 5 AND FROM_UNIXTIME(entry_date, '%d-%m-%Y') BETWEEN '01-05-2012' AND '31-05-2012' AND status = 'Shipped'
ORDER BY
entry_date DESC
As explained in the manual, date literals should be in YYYY-MM-DD format. Also, bearing in mind the point made by @ypercube in his answer, you want:
WHERE t.channel_id = 5
AND entry_date >= UNIX_TIMESTAMP('2012-05-01')
AND entry_date < UNIX_TIMESTAMP('2012-06-01')
AND status = 'Shipped'
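To see why the original BETWEEN condition misbehaves, note that FROM_UNIXTIME(entry_date, '%d-%m-%Y') returns a string, so the comparison is done character by character rather than chronologically. The literal dates below are made up purely to illustrate this.

```sql
-- Both operands are strings, so BETWEEN compares them lexicographically.
SELECT '15-06-2011' BETWEEN '01-05-2012' AND '31-05-2012' AS dd_mm_yyyy_match,  -- 1: wrongly matches
       '2011-06-15' BETWEEN '2012-05-01' AND '2012-05-31' AS yyyy_mm_dd_match;  -- 0: correctly excluded
```

A date from June 2011 falls "between" the two May 2012 strings in day-first format because '15' sorts between '01' and '31'; year-first strings sort in chronological order, which is one reason to prefer them.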
Besides the date format there is another issue. To use any index on entry_date effectively, you should not apply functions to that column when you use it in conditions in WHERE, GROUP BY or HAVING clauses (you can still apply the formatting in the SELECT list if you need a format other than the default to be shown). An effective way to write that part of the query would be:
( entry_date >= '2012-05-01'
AND entry_date < '2012-06-01'
)
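It works with DATE, DATETIME and TIMESTAMP columns. If entry_date is stored as an integer Unix timestamp, as the FROM_UNIXTIME() call in the question suggests, wrap each literal in UNIX_TIMESTAMP() instead.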