How to select the highest value for a given month? - mysql

+-------------------------------------------------+-----------------+---------------------+
| landing_page | all_impressions | dates |
+-------------------------------------------------+-----------------+---------------------+
| https://www.example.co.uk/url-1 | 53977 | 2018-08-19 13:59:40 |
| https://www.example.co.uk/url-1 | 610 | 2018-09-19 13:59:40 |
| https://www.example.co.uk/url-1 | 555 | 2018-10-19 13:59:40 |
| https://www.example.co.uk/url-1 | 23 | 2018-11-19 13:59:40 |
| https://www.example.co.uk/ | 1000 | 2018-06-19 13:59:40 |
| https://www.example.co.uk/ | 2 | 2018-07-19 13:59:40 |
| https://www.example.co.uk/ | 4 | 2018-08-19 13:59:40 |
| https://www.example.co.uk/ | 1563 | 2018-09-19 13:59:40 |
| https://www.example.co.uk/ | 1 | 2018-10-19 13:59:40 |
| https://www.example.co.uk/ | 9812 | 2018-11-19 13:59:40 |
+-------------------------------------------------+-----------------+---------------------+
With the above database table, I only want to select the landing_page if its impression count for the current month is its maximum. For example, from this data the select would return only https://www.example.co.uk/, because its all_impressions value is at its highest in the current month, November (https://www.example.co.uk/url-1 would not be selected, as its highest value was in August).
How might I do this with SQL?
index info:
mysql> show indexes from landing_pages_client_v3;
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| landing_pages_client_v3 | 0 | PRIMARY | 1 | id | A | 24279939 | NULL | NULL | | BTREE | | |
| landing_pages_client_v3 | 1 | profile_id | 1 | profile_id | A | 17 | NULL | NULL | YES | BTREE | | |
| landing_pages_client_v3 | 1 | profile_id | 2 | dates | A | 17 | NULL | NULL | | BTREE | | |
| landing_pages_client_v3 | 1 | profile_id_2 | 1 | profile_id | A | 17 | NULL | NULL | YES | BTREE | | |
| landing_pages_client_v3 | 1 | profile_id_2 | 2 | lp_id | A | 6069984 | NULL | NULL | YES | BTREE | | |
+-------------------------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

In a derived table, get the maximum value of all_impressions for every landing_page. Join back to the main table to get the row corresponding to that maximum all_impressions value.
We then keep that row only if it belongs to the current month. For sargability, we do not apply functions to the dates column. Instead, we compute the first day of the current month and the first day of the next month, and keep the rows whose dates fall within this half-open range. Details of the date and time functions are here: https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html
For performance, you may also need to add the following composite index: (landing_page, all_impressions, dates). (I am not sure which order these columns should be in; some benchmarking may be needed.)
SELECT
t.*
FROM
your_table AS t
JOIN
(
SELECT
landing_page,
MAX(all_impressions) AS max_all_impressions
FROM your_table
GROUP BY landing_page
) AS dt
ON dt.landing_page = t.landing_page AND
dt.max_all_impressions = t.all_impressions
WHERE
t.dates >= ((LAST_DAY(CURDATE()) + INTERVAL 1 DAY) - INTERVAL 1 MONTH) AND
t.dates < (LAST_DAY(CURDATE()) + INTERVAL 1 DAY)
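
The derived-table approach above can be tried out with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, a hypothetical table name landing_pages, and month boundaries computed in Python rather than with LAST_DAY() and CURDATE(), which SQLite does not provide. The date passed as "today" is fixed so that the result is reproducible.

```python
import sqlite3
from datetime import date


def month_bounds(today):
    """Return (first day of this month, first day of next month) as ISO strings."""
    start = today.replace(day=1)
    if start.month == 12:
        end = start.replace(year=start.year + 1, month=1)
    else:
        end = start.replace(month=start.month + 1)
    return start.isoformat(), end.isoformat()


def pages_peaking_this_month(conn, today):
    """Landing pages whose all-time maximum falls in the month of `today`."""
    start, end = month_bounds(today)
    sql = """
        SELECT t.landing_page
        FROM landing_pages AS t
        JOIN (
            SELECT landing_page, MAX(all_impressions) AS max_all_impressions
            FROM landing_pages
            GROUP BY landing_page
        ) AS dt
          ON dt.landing_page = t.landing_page
         AND dt.max_all_impressions = t.all_impressions
        WHERE t.dates >= ? AND t.dates < ?   -- half-open range, no function on dates
        ORDER BY t.landing_page
    """
    return [row[0] for row in conn.execute(sql, (start, end))]


# Sample data copied from the question; the table name is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE landing_pages (landing_page TEXT, all_impressions INTEGER, dates TEXT)"
)
url1 = "https://www.example.co.uk/url-1"
home = "https://www.example.co.uk/"
conn.executemany(
    "INSERT INTO landing_pages VALUES (?, ?, ?)",
    [
        (url1, 53977, "2018-08-19 13:59:40"),
        (url1, 610, "2018-09-19 13:59:40"),
        (url1, 555, "2018-10-19 13:59:40"),
        (url1, 23, "2018-11-19 13:59:40"),
        (home, 1000, "2018-06-19 13:59:40"),
        (home, 2, "2018-07-19 13:59:40"),
        (home, 4, "2018-08-19 13:59:40"),
        (home, 1563, "2018-09-19 13:59:40"),
        (home, 1, "2018-10-19 13:59:40"),
        (home, 9812, "2018-11-19 13:59:40"),
    ],
)

result = pages_peaking_this_month(conn, date(2018, 11, 25))
print(result)
```

With "today" set to a day in November 2018, only the home page qualifies, which matches the result the question asks for.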

You can select the landing_page URL and the maximum value of the all_impressions column like this. Use a WHERE clause to check that the dates column falls in the same month and year as CURRENT_DATE(). See Date and Time Functions.
SELECT landing_page, MAX(all_impressions)
FROM your_table_name_goes_here
WHERE MONTH(dates) = MONTH(CURRENT_DATE())
AND YEAR(dates) = YEAR(CURRENT_DATE())
GROUP BY landing_page
Or:
SELECT landing_page
FROM your_table_name_goes_here
WHERE MONTH(dates) = MONTH(CURRENT_DATE())
AND YEAR(dates) = YEAR(CURRENT_DATE())
ORDER BY all_impressions DESC LIMIT 1

In MySQL, you can do it like this:
SELECT landing_page,MAX(all_impressions) AS max_count
FROM your_table_name_goes_here
WHERE MONTH(dates) = MONTH(NOW()) AND YEAR(dates) = YEAR(NOW())
GROUP BY landing_page ORDER BY max_count DESC LIMIT 1

Related

Count null from joined table in MySQL

I need a count of NULL from 2 tables that are joined in MySQL. Sample data like this:
datefield FROM TABLE calendar (contain dates from start to end of this year)
-----------
TABLE value (data stored)
+------------+-------+
| date | keter |
+------------+-------+
| 2021-08-01 | 11 |
| 2021-08-04 | 0 |
| 2021-08-07 | 20 |
| 2021-08-08 | 15 |
| 2021-08-11 | 0 |
+------------+-------+
I am using the following query to combine and display data from calendar and value tables.
SELECT datefield,keter FROM calendar
LEFT JOIN kehadiran ON datefield=tgl AND id_kar IN ('110101')
WHERE datefield BETWEEN '2021-08-01' AND '2021-08-15' GROUP BY datefield;
result :
+------------+-------+
| datefield | keter |
+------------+-------+
| 2021-08-01 | 11 |
| 2021-08-02 | NULL |
| 2021-08-03 | NULL |
| 2021-08-04 | 0 |
| 2021-08-05 | NULL |
| 2021-08-06 | NULL |
| 2021-08-07 | 20 |
| 2021-08-08 | 15 |
| 2021-08-09 | NULL |
| 2021-08-10 | NULL |
| 2021-08-11 | 0 |
| 2021-08-12 | NULL |
| 2021-08-13 | NULL |
| 2021-08-14 | NULL |
| 2021-08-15 | NULL |
+------------+-------+
I tried a query based on this question (3 table join counting nulls), but I didn't get the result I wanted. The query is this:
SELECT SUM(k.keter) FROM kehadiran k
LEFT OUTER JOIN calendar c ON c.datefield = k.keter AND id_kar IN ('110101')
WHERE datefield BETWEEN '2021-08-01' AND '2021-08-12' AND k.keter is NULL;
result:
+--------------+
| SUM(k.keter) |
+--------------+
| NULL |
+--------------+
The result I wanted:
+--------------+
| SUM(k.keter) |
+--------------+
| 10 |
+--------------+
How should I count NULL from the joined table as mentioned above?
You swapped the tables in your last query which is incorrect. Use the query that worked and use COUNT(*) with WHERE right_table.any_notnull_column IS NULL:
SELECT COUNT(*)
FROM calendar
LEFT JOIN kehadiran k ON datefield=tgl AND id_kar IN ('110101')
WHERE datefield BETWEEN '2021-08-01' AND '2021-08-15'
AND k.keter is NULL
You sum up NULLs in your query, and a SUM of only NULL values is NULL. Note that COUNT(k.keter) will not help either, because it ignores NULLs and would return 0 here; count the rows instead with COUNT(*).
To count NULLs, you can use:
SUM(k.keter IS NULL)
Or:
COUNT(*) - COUNT(k.keter)
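
As a quick check of the counting expressions above, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL. The table and column names (calendar, kehadiran, datefield, tgl, id_kar, keter) follow the queries in the question; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calendar (datefield TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE kehadiran (tgl TEXT, id_kar TEXT, keter INTEGER)")

# One calendar row per day for 2021-08-01 .. 2021-08-15.
conn.executemany(
    "INSERT INTO calendar VALUES (?)",
    [(f"2021-08-{day:02d}",) for day in range(1, 16)],
)
# The five stored rows from the question.
conn.executemany(
    "INSERT INTO kehadiran VALUES (?, '110101', ?)",
    [
        ("2021-08-01", 11),
        ("2021-08-04", 0),
        ("2021-08-07", 20),
        ("2021-08-08", 15),
        ("2021-08-11", 0),
    ],
)

sql = """
    SELECT COUNT(*)                   AS total_days,
           COUNT(k.keter)             AS days_with_value,  -- COUNT(col) skips NULLs
           COUNT(*) - COUNT(k.keter)  AS null_days,
           SUM(k.keter IS NULL)       AS null_days_via_sum
    FROM calendar c
    LEFT JOIN kehadiran k
           ON c.datefield = k.tgl AND k.id_kar IN ('110101')
    WHERE c.datefield BETWEEN '2021-08-01' AND '2021-08-15'
"""
counts = conn.execute(sql).fetchone()
print(counts)
```

Both expressions count the calendar days that have no matching row, because the LEFT JOIN fills keter with NULL for those days.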

Comparing Row Values Based On User MySQL

I thought every MySQL query question would be answered by now, but I have a query that I can't find a good answer to.
I would like to run a query analyzing the differences between rows in one table, based on each SecState column and Timestamp. So loop through all rows, find the next (or previous) reading based on the Timestamp of that SecState and get the difference between the Timestamp, Location, and Days readings.
Input Table Example:
+----------+---------------+-------------+-------------+
| SecState | Timestamp | Location | Days |
+----------+---------------+-------------+-------------+
| 1 | 1574614810000 | 0.030520002 | 0.068209626 |
| 2 | 1574614810000 | 0.000491507 | 0.000124721 |
| 1 | 1574614780000 | 0.030519481 | 0.068209626 |
| 2 | 1574614780000 | 0.000491507 | 0.000124721 |
| 3 | 1574614752000 | 1 | 1 |
| 3 | 1574614731000 | 1 | 1 |
| 1 | 1574614750000 | 0.03051896 | 0.068209626 |
| 2 | 1574614750000 | 0.000491493 | 0.000124721 |
| 1 | 1574614720000 | 0.030518439 | 0.068206906 |
| 2 | 1574614720000 | 0.00049148 | 0.000124721 |
| 1 | 1574614690000 | 0.030517918 | 0.068206906 |
| 2 | 1574614690000 | 0.00049148 | 0.000124721 |
| 3 | 1574614671000 | 1 | 1 |
| 3 | 1574614631000 | 1 | 1 |
| 3 | 1574614571000 | 1 | 1 |
| 1 | 1574614660000 | 0.030517397 | 0.068206906 |
| 2 | 1574614660000 | 0.000491467 | 0.000124721 |
| 1 | 1574614630000 | 0.030516876 | 0.068206906 |
+----------+---------------+-------------+-------------+
Thanks!
If you are running MySQL 8.0, you can use window function lead() to access columns of the next record.
Something like this should be what you want:
select
t.*,
lead(Timestamp) over(partition by SecState order by Timestamp)
- Timestamp TimestampDiff,
lead(Location) over(partition by SecState order by Timestamp)
- Location LocationDiff,
lead(Days) over(partition by SecState order by Timestamp)
- Days DaysDiff
from mytable t
In earlier versions, you can self-join the table and use a correlated subquery with a not exists condition to locate the next record, like so:
select
t.*,
t1.Timestamp - t.Timestamp TimestampDiff,
t1.Location - t.Location LocationDiff,
t1.Days - t.Days DaysDiff
from mytable t
left join mytable t1
on t1.SecState = t.SecState
and t1.Timestamp > t.Timestamp
and not exists (
select 1
from mytable t2
where
t2.SecState = t.SecState
and t2.Timestamp > t.Timestamp
and t2.Timestamp < t1.Timestamp
)
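
To see that the two variants above agree, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL. The table name readings, the lowercase column names, and the integer sample values are hypothetical. The LEAD() variant is only run when the bundled SQLite is recent enough to support window functions (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sec_state INTEGER, ts INTEGER, location INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [(1, 100, 10), (1, 130, 12), (1, 160, 15), (2, 100, 5), (2, 140, 5)],
)

# Self-join: t1 is the next reading of the same sec_state, i.e. the later
# row for which no other row lies strictly between t and t1.
SELF_JOIN_SQL = """
    SELECT t.sec_state, t.ts,
           t1.ts - t.ts             AS ts_diff,
           t1.location - t.location AS location_diff
    FROM readings t
    LEFT JOIN readings t1
           ON t1.sec_state = t.sec_state
          AND t1.ts > t.ts
          AND NOT EXISTS (
              SELECT 1
              FROM readings t2
              WHERE t2.sec_state = t.sec_state
                AND t2.ts > t.ts
                AND t2.ts < t1.ts
          )
    ORDER BY t.sec_state, t.ts
"""

# Window-function variant of the same computation.
LEAD_SQL = """
    SELECT sec_state, ts,
           LEAD(ts) OVER (PARTITION BY sec_state ORDER BY ts) - ts AS ts_diff,
           LEAD(location) OVER (PARTITION BY sec_state ORDER BY ts) - location
               AS location_diff
    FROM readings
    ORDER BY sec_state, ts
"""

self_join_rows = conn.execute(SELF_JOIN_SQL).fetchall()
lead_rows = None
if sqlite3.sqlite_version_info >= (3, 25, 0):
    lead_rows = conn.execute(LEAD_SQL).fetchall()

for row in self_join_rows:
    print(row)
```

The last reading of each sec_state has no successor, so its differences come back as NULL (None in Python) in both variants.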

MySQL key-value store ordering with specific condition

I have the following structure:
+----+-------+--------+------------+
| id | gr_id | name   | value      |
+----+-------+--------+------------+
| 1  | 11    | name   | Burro      |
| 2  | 11    | submit | 2019/05/10 |
| 3  | 11    | date   | 2019/05/17 |
| 4  | 12    | name   | Ajax       |
| 5  | 12    | submit | 2019/05/10 |
| 6  | 12    | date   | 2019/05/18 |
+----+-------+--------+------------+
I have to order it by the date (the rows where name is date), from the highest to the lowest date, while keeping the groups (gr_id) together without mixing their elements.
The desired result would look like this:
+----+-------+--------+------------+
| id | gr_id | name   | value      |
+----+-------+--------+------------+
| 4  | 12    | name   | Ajax       |
| 5  | 12    | submit | 2019/05/10 |
| 6  | 12    | date   | 2019/05/18 |
| 1  | 11    | name   | Burro      |
| 2  | 11    | submit | 2019/05/10 |
| 3  | 11    | date   | 2019/05/17 |
+----+-------+--------+------------+
How can I implement this?
You'll have to associate the group ordering criteria with all the elements of the group. You can do it through a subquery, or a join.
Subquery version:
SELECT t.*
FROM (SELECT gr_id, value as `date` FROM t WHERE `name` = 'date') AS grpOrder
INNER JOIN t ON grpOrder.gr_id = t.gr_id
ORDER BY grpOrder.`date` DESC
, CASE `name`
WHEN 'name' THEN 1
WHEN 'submit' THEN 2
WHEN 'date' THEN 3
ELSE 4
END
Join version:
SELECT t1.*
FROM t AS t1
INNER JOIN t AS t2 ON t1.gr_id = t2.gr_id AND t2.`name` = 'date'
ORDER BY t2.value DESC
, CASE t1.`name`
WHEN 'name' THEN 1
WHEN 'submit' THEN 2
WHEN 'date' THEN 3
ELSE 4
END
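
The join version can be checked against the sample data with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The table name t follows the answer; the extra gr_id sort key is an addition of this sketch (not part of the answer) so that two groups sharing the same date would still stay together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, gr_id INTEGER, name TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 11, "name", "Burro"),
        (2, 11, "submit", "2019/05/10"),
        (3, 11, "date", "2019/05/17"),
        (4, 12, "name", "Ajax"),
        (5, 12, "submit", "2019/05/10"),
        (6, 12, "date", "2019/05/18"),
    ],
)

sql = """
    SELECT t1.id, t1.gr_id, t1.name, t1.value
    FROM t AS t1
    INNER JOIN t AS t2
            ON t1.gr_id = t2.gr_id AND t2.name = 'date'
    ORDER BY t2.value DESC,   -- group's date, newest first (YYYY/MM/DD sorts as text)
             t1.gr_id,        -- tie-breaker added in this sketch
             CASE t1.name
                 WHEN 'name'   THEN 1
                 WHEN 'submit' THEN 2
                 WHEN 'date'   THEN 3
                 ELSE 4
             END
"""
ordered_ids = [row[0] for row in conn.execute(sql)]
print(ordered_ids)
```

Every row of a group inherits the group's date through the join, which is what lets a single ORDER BY keep the groups intact.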

Select IDs (same table) that don't have any records/rows after a specific date

I have a table, let's say, like this one:
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| user_id | mortgage_id | value | classification | created_at |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 6c1e1f12-2e5d-488d-b02d-29fcffe783f2 | 1e76bcbb-70ee-4966-87fd-1d6024a04513 | 0 | initial | 2014-08-23 14:25:42 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 49dc3dab-d2d0-400b-b964-71e03339d475 | 59366911-f1a8-4a8c-b7ea-c3257d04478e | 1 | created | 2015-08-23 14:26:11 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 76ce889b-2f2c-435f-8754-7c5ec15cbfcb | b962e26b-1ba6-4547-8eb8-167989a0705e | 5 | created | 2016-08-23 14:26:11 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 5d9f1892-05c0-4b0a-b5d9-a501595fa351 | fb4be36e-e156-4c1b-bd40-422d30646f8e | 8 | created | 2016-08-23 14:26:11 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 49dc3dab-d2d0-400b-b964-71e03339d475 | 2cee0bc7-744f-4f51-a094-f5eb66ac482e | 2 | created | 2017-08-23 14:26:11 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
| 76ce889b-2f2c-435f-8754-7c5ec15cbfcb | b0d27c9e-907c-43df-abd2-5772785cb91c | 0 | created | 2017-08-23 14:26:11 |
|--------------------------------------+--------------------------------------+--------+----------------+---------------------|
I'm trying to fetch/get all the distinct/unique user_ids that don't have any records from a given moment in time and onwards.
For instance, if I choose that "time frame" to be: After 2017-01-01 00:00:00, the return would be:
|--------------------------------------+
| user_id |
|--------------------------------------+
| 6c1e1f12-2e5d-488d-b02d-29fcffe783f2 |
|--------------------------------------+
| 5d9f1892-05c0-4b0a-b5d9-a501595fa351 |
|--------------------------------------+
I have this query, but I think there should be a better way to do this:
SET @timestamp = '2017-01-01 00:00:00';
SELECT DISTINCT user_id
FROM mortgages
WHERE user_id NOT IN (SELECT DISTINCT user_id FROM mortgages WHERE created_at > @timestamp);
I would use group by:
select user_id
from mortgages
group by user_id
having max(created_at) <= @timestamp;
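
A minimal sketch comparing the GROUP BY / HAVING form with the NOT IN form from the question, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The short user ids (u1 to u4) are hypothetical replacements for the UUIDs, and the cutoff is passed as a bound parameter instead of a user variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mortgages (user_id TEXT, created_at TEXT)")
conn.executemany(
    "INSERT INTO mortgages VALUES (?, ?)",
    [
        ("u1", "2014-08-23 14:25:42"),
        ("u2", "2015-08-23 14:26:11"),
        ("u3", "2016-08-23 14:26:11"),
        ("u4", "2016-08-23 14:26:11"),
        ("u2", "2017-08-23 14:26:11"),
        ("u3", "2017-08-23 14:26:11"),
    ],
)
cutoff = "2017-01-01 00:00:00"

# Keep users whose most recent record is not after the cutoff.
having_sql = """
    SELECT user_id
    FROM mortgages
    GROUP BY user_id
    HAVING MAX(created_at) <= ?
    ORDER BY user_id
"""
# The question's original formulation, for comparison.
not_in_sql = """
    SELECT DISTINCT user_id
    FROM mortgages
    WHERE user_id NOT IN (SELECT user_id FROM mortgages WHERE created_at > ?)
    ORDER BY user_id
"""
having_users = [r[0] for r in conn.execute(having_sql, (cutoff,))]
not_in_users = [r[0] for r in conn.execute(not_in_sql, (cutoff,))]
print(having_users, not_in_users)
```

Both forms return the same users on this data; the HAVING form reads the table once instead of running a subquery.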

How can I treat NULL as the minimum value?

I have a table like this:
// notifications
+----+-----------+-------+---------+---------+------+
| id | score | type | post_id | user_id | seen |
+----+-----------+-------+---------+---------+------+
| 1 | 15 | 1 | 2342 | 342 | 1 |
| 2 | 5 | 1 | 2342 | 342 | 1 |
| 3 | NULL | 2 | 5342 | 342 | 1 |
| 4 | -10 | 1 | 2342 | 342 | NULL |
| 5 | 5 | 1 | 2342 | 342 | NULL |
| 6 | NULL | 2 | 8342 | 342 | NULL |
| 7 | -2 | 1 | 2342 | 342 | NULL |
+----+-----------+-------+---------+---------+------+
-- type: 1 means "it is a vote", 2 means "it is a comment (without score)"
Here is my query:
SELECT SUM(score), type, post_id, seen
FROM notifications
WHERE user_id = 342
GROUP BY type, post_id
ORDER BY (seen IS NULL) desc
As you can see, there is a SUM() function, and both the type and post_id columns are in the GROUP BY clause. My question is about the seen column. I don't want to put it into the GROUP BY clause, so I have to use either MAX() or MIN() for it. Right?
What I actually need is to select NULL as the seen column (in the query above) if there is even one row in the group with seen = NULL. My current query selects 1 as the value of seen, even when I use MIN(seen). So why is 1 the minimum when there is a NULL?
I also want to order the rows so that all rows with seen = NULL are at the top of the list. How can I do that?
Expected result:
// notifications
+-----------+-------+---------+------+
| score | type | post_id | seen |
+-----------+-------+---------+------+
| 13 | 1 | 2342 | NULL |
| NULL | 2 | 8342 | NULL |
| NULL | 2 | 5342 | 1 |
+-----------+-------+---------+------+
You could do this
case when sum(seen is null) > 0
then null
else min(seen)
end
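
The CASE expression above can be tried in a full query with a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL. MIN() ignores NULLs, which is why the explicit SUM(seen IS NULL) check is needed. The ORDER BY clause (unseen groups first, then by type) is an assumption of this sketch, chosen to reproduce the expected result from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE notifications "
    "(id INTEGER, score INTEGER, type INTEGER, post_id INTEGER, user_id INTEGER, seen INTEGER)"
)
conn.executemany(
    "INSERT INTO notifications VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 15, 1, 2342, 342, 1),
        (2, 5, 1, 2342, 342, 1),
        (3, None, 2, 5342, 342, 1),
        (4, -10, 1, 2342, 342, None),
        (5, 5, 1, 2342, 342, None),
        (6, None, 2, 8342, 342, None),
        (7, -2, 1, 2342, 342, None),
    ],
)

sql = """
    SELECT SUM(score) AS score, type, post_id,
           CASE WHEN SUM(seen IS NULL) > 0   -- any unseen row in the group?
                THEN NULL
                ELSE MIN(seen)
           END AS seen
    FROM notifications
    WHERE user_id = 342
    GROUP BY type, post_id
    ORDER BY (SUM(seen IS NULL) > 0) DESC, type
"""
grouped = conn.execute(sql).fetchall()
for row in grouped:
    print(row)
```

Groups with at least one unseen row report NULL for seen and sort to the top, as in the expected result.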
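You could use the following query. It maps NULL to 0 so that MIN() picks it up, which means unseen groups show 0 instead of NULL; sorting ascending puts them at the top:
SELECT SUM(score), type, post_id, min(IFNULL(seen, 0)) as seen
FROM notifications
WHERE user_id = 342
GROUP BY type, post_id
ORDER BY seen asc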