How can I select rows newer than a week? - mysql

Using MariaDB 10, I'd like to query the article table for articles from the past week.
Here is my query:
SELECT * FROM article WHERE category="News" AND created_at < NOW() - INTERVAL 1 WEEK ORDER BY created_at DESC;
But it returns all articles instead.
explain article ;
+-------------+-----------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+-----------------+------+-----+-------------------+----------------+
| id | int(6) unsigned | NO | PRI | NULL | auto_increment |
| title | varchar(150) | NO | | NULL | |
| content | mediumtext | NO | | NULL | |
| created_at | timestamp | NO | | CURRENT_TIMESTAMP | |
| category | varchar(64) | NO | | test | |
+-------------+-----------------+------+-----+-------------------+----------------+
How can I achieve this?

The logic is backwards: created_at < NOW() - INTERVAL 1 WEEK matches rows older than a week. You want > not <:
SELECT a.*
FROM article a
WHERE category = 'News' AND
created_at > NOW() - INTERVAL 1 WEEK
ORDER BY created_at DESC;
For performance, you would want an index on article(category, created_at).
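A minimal sketch of that index, assuming the table from the question. The index name idx_article_category_created is made up for illustration; any name works. Running EXPLAIN afterwards is one way to check whether the optimizer actually picks it up.

```sql
-- Composite index: equality column first (category), range column second (created_at).
-- The index name is hypothetical.
CREATE INDEX idx_article_category_created
    ON article (category, created_at);

-- Check the plan; the "key" column should name the new index if it is used.
EXPLAIN
SELECT a.*
FROM article a
WHERE category = 'News'
  AND created_at > NOW() - INTERVAL 1 WEEK
ORDER BY created_at DESC;
```

With category fixed by an equality, the rows inside the index are already ordered by created_at, so the ORDER BY can usually be satisfied without a separate sort.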

Related

SELECT Rows Older Than Date Only If Row Does Not Have Row Newer Than Date

I have the below table where I would like to 'prune' out nicks who have not gained points in 1 week. I'm new to MySQL and I'm not sure how to best SELECT these rows. Your help is greatly appreciated!
Here is what I have so far that is not yielding correct results. The results this yields are nicks who have earned points at any time it seems.
SELECT * FROM points_log p1
INNER JOIN points_log p2 ON p1.nick = p2.nick
AND p1.dt < NOW() - INTERVAL 1 WEEK
WHERE p2.dt > NOW() - INTERVAL 1 WEEK LIMIT 10;
Here is the table:
mysql> describe points_log;
+-------------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| nick | char(25) | NO | PRI | NULL | |
| amount | decimal(10,4) | YES | MUL | NULL | |
| stream_online | tinyint(1) | NO | MUL | NULL | |
| modification_type | tinyint(3) unsigned | NO | MUL | NULL | |
| dt | datetime | NO | PRI | NULL | |
+-------------------+-----------------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
You can get the nicks who have not scored in the past week using an aggregation:
SELECT pl.nick
FROM points_log pl
GROUP BY pl.nick
HAVING MAX(pl.dt) < NOW() - INTERVAL 1 WEEK;
I'm not sure what you want as final output, but this will return the nicks that have not scored in the past week, because the most recent dt for each of them is more than a week old.
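If the full rows are wanted rather than just the nicks, one possible sketch is a NOT EXISTS check against the same points_log table from the question. It keeps a row only when its nick has no entry in the last week.

```sql
-- Rows belonging to nicks with no activity in the past week.
SELECT p1.*
FROM points_log p1
WHERE NOT EXISTS (
    SELECT 1
    FROM points_log p2
    WHERE p2.nick = p1.nick
      AND p2.dt > NOW() - INTERVAL 1 WEEK
);
```

This differs from the self-join in the question, which required a recent row to exist for the nick and therefore returned the active nicks instead of the stale ones.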

How can I optimize this mysql query to find maximum simultaneous calls?

I'm trying to calculate maximum simultaneous calls. My query, which I believe to be accurate, takes way too long given ~250,000 rows. The cdrs table looks like this:
+---------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-----------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| CallType | varchar(32) | NO | | NULL | |
| StartTime | datetime | NO | MUL | NULL | |
| StopTime | datetime | NO | | NULL | |
| CallDuration | float(10,5) | NO | | NULL | |
| BillDuration | mediumint(8) unsigned | NO | | NULL | |
| CallMinimum | tinyint(3) unsigned | NO | | NULL | |
| CallIncrement | tinyint(3) unsigned | NO | | NULL | |
| BasePrice | float(12,9) | NO | | NULL | |
| CallPrice | float(12,9) | NO | | NULL | |
| TransactionId | varchar(20) | NO | | NULL | |
| CustomerIP | varchar(15) | NO | | NULL | |
| ANI | varchar(20) | NO | | NULL | |
| ANIState | varchar(10) | NO | | NULL | |
| DNIS | varchar(20) | NO | | NULL | |
| LRN | varchar(20) | NO | | NULL | |
| DNISState | varchar(10) | NO | | NULL | |
| DNISLATA | varchar(10) | NO | | NULL | |
| DNISOCN | varchar(10) | NO | | NULL | |
| OrigTier | varchar(10) | NO | | NULL | |
| TermRateDeck | varchar(20) | NO | | NULL | |
+---------------+-----------------------+------+-----+---------+----------------+
I have the following indexes:
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| cdrs | 0 | PRIMARY | 1 | id | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | id | 1 | id | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | call_time_index | 1 | StartTime | A | 269622 | NULL | NULL | | BTREE | | |
| cdrs | 1 | call_time_index | 2 | StopTime | A | 269622 | NULL | NULL | | BTREE | | |
+-------+------------+-----------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
The query I am running is this:
SELECT MAX(cnt) AS max_channels FROM
(SELECT cl1.StartTime, COUNT(*) AS cnt
FROM cdrs cl1
INNER JOIN cdrs cl2
ON cl1.StartTime
BETWEEN cl2.StartTime AND cl2.StopTime
GROUP BY cl1.id)
AS counts;
It seems like I might have to chunk this data for each day and store the results in a separate table like simultaneous_calls.
I'm sure you want to know not only the maximum simultaneous calls, but when that happened.
I would create a table containing the timestamp of every individual minute:
CREATE TABLE times (ts DATETIME NOT NULL PRIMARY KEY);
INSERT INTO times (ts) VALUES ('2014-05-14 00:00:00');
. . . until 1440 rows, one for each minute . . .
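Typing 1440 INSERT statements by hand is tedious, so here is one way the rows could be generated. This is only a sketch: the digits helper table is an assumption introduced for illustration, and it is a regular table because MySQL cannot reference a TEMPORARY table more than once in the same query.

```sql
-- Hypothetical helper table holding the digits 0..9.
CREATE TABLE digits (d TINYINT UNSIGNED NOT NULL PRIMARY KEY);
INSERT INTO digits (d) VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9);

-- Combine four digits into the numbers 0..1439 and add each as minutes to midnight.
INSERT INTO times (ts)
SELECT '2014-05-14 00:00:00'
       + INTERVAL (d0.d + 10 * d1.d + 100 * d2.d + 1000 * d3.d) MINUTE
FROM digits d0
CROSS JOIN digits d1
CROSS JOIN digits d2
CROSS JOIN digits d3
WHERE d0.d + 10 * d1.d + 100 * d2.d + 1000 * d3.d < 1440;
```

The WHERE clause keeps only the offsets 0 through 1439, which gives one row per minute of the day.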
Then join that to the calls.
SELECT ts, COUNT(*) AS count FROM times
JOIN cdrs ON times.ts BETWEEN cdrs.starttime AND cdrs.stoptime
GROUP BY ts ORDER BY count DESC LIMIT 1;
Here's the result in my test (MySQL 5.6.17 on a Linux VM running on a Macbook Pro):
+---------------------+----------+
| ts | count(*) |
+---------------------+----------+
| 2014-05-14 10:59:00 | 1001 |
+---------------------+----------+
1 row in set (1 min 3.90 sec)
This achieves several goals:
Reduces the number of rows examined by two orders of magnitude.
Reduces the execution time from 3 hours+ to about 1 minute.
Also returns the actual timestamp when the highest count was found.
Here's the EXPLAIN for my query:
explain select ts, count(*) from times join cdrs on times.ts between cdrs.starttime and cdrs.stoptime group by ts order by count(*) desc limit 1;
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
| 1 | SIMPLE | times | index | PRIMARY | PRIMARY | 5 | NULL | 1440 | Using index; Using temporary; Using filesort |
| 1 | SIMPLE | cdrs | ALL | starttime | NULL | NULL | NULL | 260727 | Range checked for each record (index map: 0x4) |
+----+-------------+-------+-------+---------------+---------+---------+------+--------+------------------------------------------------+
Notice the figures in the rows column, and compare to the EXPLAIN of your original query. You can estimate the total number of rows examined by multiplying these together (but that gets more complicated if your query is anything other than SIMPLE).
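As a rough worked example of that multiplication: the second figure below assumes the EXPLAIN of the original self-join reports approximately the full table (the 269622 cardinality shown in the index listing) on both sides, which is an assumption, since that EXPLAIN is not shown here.

```sql
-- Estimated rows examined: minutes table joined to cdrs, versus cdrs joined to itself.
SELECT 1440 * 260727   AS minutes_join_estimate,
       269622 * 269622 AS self_join_estimate;
```

That is roughly 3.8 × 10^8 against 7.3 × 10^10, which is where the "two orders of magnitude" comes from.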
The inline view isn't strictly necessary. (You're right that it takes a long time to run the EXPLAIN on the query with the inline view: the EXPLAIN will materialize the inline view, i.e. run the inline view query and populate the derived table, and only then give an EXPLAIN on the outer query.)
Note that this query will return an equivalent result:
SELECT COUNT(*) AS max_channels
FROM cdrs cl1
JOIN cdrs cl2
ON cl1.StartTime BETWEEN cl2.StartTime AND cl2.StopTime
GROUP BY cl1.id
ORDER BY max_channels DESC
LIMIT 1
It still has to do all the work and probably doesn't perform any better, but the EXPLAIN should run a lot faster. (We expect to see "Using temporary; Using filesort" in the Extra column.)
The number of rows in the resultset is going to be the number of rows in the table (~250,000 rows), and those are going to need to be sorted, so that's going to be some time there. The bigger issue (my gut is telling me) is that join operation.
I'm wondering if the EXPLAIN (or performance) would be any different if you swapped the cl1 and cl2 in the predicate, i.e.
ON cl2.StartTime BETWEEN cl1.StartTime AND cl1.StopTime
I mention that only because I'd be tempted to try a correlated subquery. That's ~250,000 executions of the subquery, so it's not likely to be any faster...
SELECT ( SELECT COUNT(*)
FROM cdrs cl2
WHERE cl2.StartTime BETWEEN cl1.StartTime AND cl1.StopTime
) AS max_channels
, cl1.StartTime
FROM cdrs cl1
ORDER BY max_channels DESC
LIMIT 11
You could run an EXPLAIN on that; we're still going to see a "Using temporary; Using filesort", and it will also show the "dependent subquery"...
Obviously, adding a predicate on the cl1 table to cut down the number of rows to be returned (for example, checking only the past 15 days) should speed things up, but it doesn't get you the answer you want.
WHERE cl1.StartTime > NOW() - INTERVAL 15 DAY
(None of my musings here are sure-fire answers to your question, or solutions to the performance issue; they're just musings.)
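A different technique from the ones discussed above, offered only as a sketch: where window functions are available (MySQL 8.0+ or MariaDB 10.2+, so not the MySQL 5.6 used in the test earlier), the problem can be treated as a sweep over start and stop events. Each start counts +1, each stop counts -1, and the running total is the number of calls in progress. The aliases events, sweep, delta and running are made up for illustration.

```sql
SELECT t AS peak_time, running AS max_channels
FROM (
    -- Running total of open calls; at equal timestamps starts (+1) sort before stops (-1),
    -- which matches the inclusive BETWEEN used in the original query.
    SELECT t, SUM(delta) OVER (ORDER BY t, delta DESC) AS running
    FROM (
        SELECT StartTime AS t, 1 AS delta FROM cdrs
        UNION ALL
        SELECT StopTime, -1 FROM cdrs
    ) AS events
) AS sweep
ORDER BY running DESC
LIMIT 1;
```

This reads each row of cdrs twice and sorts about 500,000 events, instead of comparing every call against every other call.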

MySQL - Select last data inserted in last 5 days skip missing records for days

I want to select the data that was inserted in the last 5 days. If rows are missing for a day, the query should move on to the previous day, so that it always returns rows from the 5 most recent days that have data.
The column which I'm trying to match is a DATETIME column.
I've tried using this query
select * from `thum_{ROH}` where date >= NOW() - INTERVAL 5 DAY;
Now this returns data from 2013-12-24 to 2013-12-22, because data for 2013-12-25 and 2013-12-26 is not available.
How can I modify the query to make it return the last 5 days of data irrespective of missing rows? In this case it would return data inserted on
2013-12-24
2013-12-23
2013-12-22
2013-12-19
2013-12-12
The days which are missing in between the dates above simply have no rows associated with them so they won't be returned.
I have also tried using
select distinct(date(date)), power from `thum_{ROH}` limit 5;
But this only selects some values for a specific date and skips the rest. What I mean is that there are around 30 or more rows for each day, and the above query returns only 2 or 3 rows per day.
I hope my question makes sense. I've been trying to find a solution without any success. Any advice on how I can achieve this would be appreciated.
Thanks in advance,
Maxx
EDIT
Here is the table structure in question.
+-----------------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+-------+
| thumType | int(11) | NO | PRI | 0 | |
| timestamp | int(10) unsigned | NO | PRI | 0 | |
| rune | char(15) | YES | | NULL | |
| date | datetime | YES | | NULL | |
| destruction | decimal(15,2) | YES | | NULL | |
| restoration | decimal(15,2) | YES | | NULL | |
| conjuration | decimal(15,2) | YES | | NULL | |
| alteration | decimal(15,2) | YES | | NULL | |
| illusion | int(10) unsigned | YES | | NULL | |
| power | decimal(15,2) | YES | | NULL | |
| magicka | decimal(15,2) | YES | | NULL | |
| health | decimal(15,2) | YES | | NULL | |
+-----------------+------------------+------+-----+---------+-------+
You can do this with a join:
select t.*
from `thum_{ROH}` t join
(select distinct date
from `thum_{ROH}`
order by date desc
limit 5
) as date5
on t.date = date5.date;
EDIT:
The above works if we assume that there is no time component.
We can fix that problem by doing:
select t.*
from `thum_{ROH}` t join
(select distinct date(date) as thedate
from `thum_{ROH}`
order by thedate desc
limit 5
) as date5
on date(t.date) = date5.thedate;
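A possible variant of the same join, sketched under the assumption that an index on the date column might be added later (the table in the question has none). Comparing t.date against a half-open day range, rather than wrapping it in date(), leaves the column bare so such an index could be used.

```sql
SELECT t.*
FROM `thum_{ROH}` t
JOIN (
    -- The 5 most recent calendar days that have at least one row.
    SELECT DISTINCT DATE(date) AS thedate
    FROM `thum_{ROH}`
    ORDER BY thedate DESC
    LIMIT 5
) AS date5
  ON t.date >= date5.thedate
 AND t.date <  date5.thedate + INTERVAL 1 DAY;
```

The result set is the same as with date(t.date) = date5.thedate; only the form of the join condition changes.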
Assuming that the 'date' column is of type date, we can solve this with a subquery that gets the 5 most recent distinct dates and then select only rows from those days. MySQL rejects LIMIT directly inside an IN subquery, so the limited query is wrapped in a derived table:
SELECT *
FROM tablename
WHERE date IN (
SELECT date FROM (
SELECT DISTINCT date FROM tablename ORDER BY date DESC LIMIT 5
) AS recent
)

WeatherStation : Mysql query join & average on tables

I have two tables like that :
temperature :
+---------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| date | datetime | YES | UNI | NULL | |
| capteur | int(11) | YES | | NULL | |
| valeur | float(3,1) | YES | | NULL | |
+---------+---------------------+------+-----+---------+----------------+
humidite :
+---------+---------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+---------------------+------+-----+---------+----------------+
| id | bigint(20) unsigned | NO | PRI | NULL | auto_increment |
| date | datetime | YES | UNI | NULL | |
| capteur | int(11) | YES | | NULL | |
| valeur | int(11) | YES | | NULL | |
+---------+---------------------+------+-----+---------+----------------+
Values are recorded in those two tables, but not at the same times (around 1 record each minute).
If I enter this command, I get average values for each hour for the last 24h (so, 24 rows):
$sql->query('SELECT hour(date) AS humhour,ROUND(AVG(valeur),1) AS avghum FROM humidite WHERE date >= (now() - INTERVAL 1 DAY) GROUP BY HOUR(date) ORDER BY DATE;');
Now I am trying to get the same thing, but with both tables. I.e., for all values between 0h00 and 0h59, I want the average of all temperature values and the average of all humidity values.
I tried this command:
$result = $sql->query('
SELECT hour(temperature.date) AS hourtemp,
hour(humidite.date) AS hourhum,
ROUND(AVG(temperature.valeur),1) AS avgtemp,
ROUND(AVG(humidite.valeur),1) AS avghum
FROM temperature
INNER JOIN humidite on hour(temperature.date) = hour(humidite.date)
WHERE temperature.date >= (now() - INTERVAL 1 DAY)
GROUP BY HOUR(date)
ORDER BY DATE;');
Any ideas?
Thank you!
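One possible approach, given only as a sketch: joining the raw tables on hour() pairs every temperature row with every humidity row of the same hour, so it may be simpler to average each table per hour on its own and join the two small results. The aliases t, h, hr and first_seen are made up for illustration.

```sql
SELECT t.hr, t.avgtemp, h.avghum
FROM (
    -- Hourly temperature averages for the last 24 hours.
    SELECT HOUR(date) AS hr,
           ROUND(AVG(valeur), 1) AS avgtemp,
           MIN(date) AS first_seen
    FROM temperature
    WHERE date >= NOW() - INTERVAL 1 DAY
    GROUP BY HOUR(date)
) AS t
JOIN (
    -- Hourly humidity averages for the same period.
    SELECT HOUR(date) AS hr,
           ROUND(AVG(valeur), 1) AS avghum
    FROM humidite
    WHERE date >= NOW() - INTERVAL 1 DAY
    GROUP BY HOUR(date)
) AS h ON h.hr = t.hr
ORDER BY t.first_seen;
```

As in the single-table query, the current hour mixes readings from today and from the same hour yesterday, and an hour with no rows in one of the tables is dropped by the inner join.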

Order by number of views in last hour [MySQL]

I have a table which holds all views for the last 24 hours. I want to pull all pages ordered by a rank. The rank should be calculated something like this:
rank = (0.3 * viewsInCurrentHour) * (0.7 * viewsInPreviousHour)
I'd prefer to do this in one single query. Is this possible, or do I need to make 2 queries (one for the current hour and one for the previous hour) and then aggregate them?
Here is the DESCRIBE of the table accesslog:
+-----------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+------------------+------+-----+---------+----------------+
| aid | int(11) | NO | PRI | NULL | auto_increment |
| sid | varchar(128) | NO | | | |
| title | varchar(255) | YES | | NULL | |
| path | varchar(255) | YES | | NULL | |
| url | text | YES | | NULL | |
| hostname | varchar(128) | YES | | NULL | |
| uid | int(10) unsigned | YES | MUL | 0 | |
| timer | int(10) unsigned | NO | | 0 | |
| timestamp | int(10) unsigned | NO | MUL | 0 | |
+-----------+------------------+------+-----+---------+----------------+
select
url,
-- views from 2 hours ago up to 1 hour ago
sum(timestamp <= unix_timestamp() - 3600) * .3 +
-- views in the last hour
sum(timestamp > unix_timestamp() - 3600) * .7 as page_rank
from accesslog
where timestamp > unix_timestamp() - 7200
group by url
order by page_rank desc;
The timestamp column is int(10) unsigned, so it is compared against unix_timestamp() values rather than against a DATETIME from subdate(now(), ...). The alias is page_rank because RANK is a reserved word in MySQL 8.0. This weights the previous hour at .3 and the current hour at .7 and adds the two; swap the weights if you want them as in your formula.
The sum(condition) works because in MySQL true is 1 and false is 0, so summing a condition is the same as the longer sum(case when condition then 1 else 0 end).
Edit:
Note the where clause restricting timestamp to the last 2 hours; it improves performance, because only these records contribute to the result.
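A tiny self-contained check of the sum(condition) idiom, using three literal rows instead of a real table. The alias names sample, via_boolean and via_case are made up for illustration.

```sql
-- Both columns count the rows where n > 1.
SELECT SUM(n > 1) AS via_boolean,
       SUM(CASE WHEN n > 1 THEN 1 ELSE 0 END) AS via_case
FROM (
    SELECT 1 AS n
    UNION ALL SELECT 2
    UNION ALL SELECT 3
) AS sample;
```

Both columns come out as 2 here. The boolean form is specific to MySQL and MariaDB; the CASE form is the portable spelling.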