MYSQL view not filling up empty spots to fill 1000 - mysql

The following query gets the popular questions from the questions asked in the last 2 days. It looks at a feed table to see whats talked about latest, then it searches a tag table to find which one of those is popular.
I only get about 60 results which is great, but I need 1000 results. This means I need to fill up the rest with random questions.
My sql query attempts to do this but does not fill in the rest of the view with more questions not in the feed table.
CREATE
ALGORITHM = UNDEFINED
DEFINER = `root`#`%`
SQL SECURITY DEFINER
VIEW `popular` AS
select
`q`.`name` AS `name`,
`q`.`questionUrl` AS `questionUrl`,
`q`.`miRating` AS `miRating`,
`q`.`imageUrl` AS `imageUrl`,
`q`.`foundOn` AS `foundOn`,
`q`.`myId` AS `myId`
from
(`question` `q`
join `feed` `f` ON ((`q`.`myId` = `f`.`question_id`))
join `tag` `t` ON ((`q`.`myId` = `t`.`question_id`)))
where
(`t`.`name` like '%popular%')
group by `q`.`name`
order by (max(`f`.`timeStamp`) >= (now() - interval 1 day)) desc , (`q`.`myId` is not null) desc
limit 0 , 1000comment

If you need random questions, remove the where clause and move the logic to the order by:
select
`q`.`name` AS `name`,
`q`.`questionUrl` AS `questionUrl`,
`q`.`miRating` AS `miRating`,
`q`.`imageUrl` AS `imageUrl`,
`q`.`foundOn` AS `foundOn`,
`q`.`myId` AS `myId`
from
(`question` `q`
join `feed` `f` ON ((`q`.`myId` = `f`.`question_id`))
join `tag` `t` ON ((`q`.`myId` = `t`.`question_id`)))
group by `q`.`name`
order by (max(`f`.`timeStamp`) >= (now() - interval 1 day)) desc ,
max(`t`.`name` like '%popular%') desc,
rand()
limit 0 , 1000;

Related

SQL query with a major NOT IN not working

Does anyone know what's wrong with this query?
This works perfectly on its own:
SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC
So does this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
But combining them like this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' AND car NOT IN (SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC) GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
Gives errors. Any ideas?
Please try the following...
$result = mysqli_query( $con,
"SELECT *
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND car NOT IN ( SELECT car
FROM ( SELECT car,
carvotes
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND carvotes NOT LIKE '0'
AND ( time > ( NOW( ) - INTERVAL 14 DAY ) )
GROUP BY car
ORDER BY carvotes DESC
LIMIT 10 ) X
)
GROUP BY car
ORDER BY time DESC
LIMIT 30" );
The main cause of your problem is that with car NOT IN ( SELECT * FROM ( SELECT *... you are trying to compare each record's value of car with each row returned by your subquery. IN requires you to have the same number of fields on both sides of the comparison. By using SELECT * at both levels of the subquery you were ensuring that the right side of the comparison had however many fields are in data versus your single field on the left, which confused MySQL.
Since you are aiming to compare to a single field, namely car, our subquery has to select just the car field from its dataset. Since the sort order of the subquery's results has no effect upon the IN comparison, and since our innermost query will be returning just car, I have removed the outer level of the subquery.
Beyond changing the first part of the subquery to SELECT car, the only other change that I have made to the subquery is to change LIMIT 0, 10 to LIMIT 10. The former means limit to the the 10 records that are offset by 0 from the first record. This is useful if you want records 6 to 15, but redundant for 1 to 10 as LIMIT 10 has the same affect and is slightly simpler. Ditto for LIMIT 0, 30 at the end of your overall statement.
As for the main body of the statement, I have not made any attempt to specify what fields (or aggregate functions of those fields) should be returned since you have made no statement indicating what your requirements / preferences are. If you are satisfied that GROUP BY has left you with a still valid set of values, then all the good, but if not then I recommend that you rewrite your Question to be specific about that detail.
By default, MySQL sorts the data subjected to a GROUP BY into ascending order, but if an ORDER BY clause is also present then it overrides the GROUP BY's sort pattern. As such, there is no benefit to specifying DESC after either of your GROUP BY car clauses, so I have removed it where it occurs.
Interesting Sidenote : You can override a GROUP BY's sort by specifying ORDER BY NULL.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html - on optimising your ORDER BY sorting
https://dev.mysql.com/doc/refman/5.7/en/select.html - on the SELECT statement's syntax - specifically the parts to do with LIMIT.
https://www.w3schools.com/php/php_mysql_select_limit.asp - a simpler explanation of LIMIT
This is your query:
SELECT *
FROM data
WHERE site = '".$id."' AND disabled = '0' AND
car NOT IN (SELECT *
FROM (SELECT *
FROM data
WHERE site = '".$id."' AND
disabled = '0' AND
carvotes NOT LIKE '0' AND
(time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car
ORDER BY carvotes DESC
LIMIT 0 , 10
) x
ORDER BY time DESC
)
GROUP BY car DESC
ORDER BY time desc
LIMIT 0 , 30 ;
Several comments:
Do not wrap integer constants in single quotes. This can mislead people. This can mislead optimizers.
Do not use string functions on integers (such as like). Same reason.
NOT IN with subqueries is dangerous. The construct does not handle NULL values the way you expect. Use NOT EXISTS or LEFT JOIN instead.
When using subqueries, ORDER BY is almost never appropriate.
Never use SELECT * with GROUP BY. It is just wrong. Happily, MySQL 5.7 has changed its defaults to reject this anti-pattern
So, a better way to write this query is something like this:
SELECT d.car, MAX(time) as time
FROM data d LEFT JOIN
(SELECT d2.*
FROM data d2
WHERE d2.site = '".$id."' AND
d2.disabled = 0 AND
d2.carvotes NOT LIKE 0 AND
(d2.time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY d2.car
ORDER BY carvotes DESC
LIMIT 0 , 10
) car10
ON d.car = car10.car
WHERE d.site = '".$id."' AND d.disabled = 0' AND
car10.car IS NOT NULL
GROUP BY car DESC
ORDER BY MAX(time) desc
LIMIT 0 , 30 ;
Alternatively, use SELECT * and remove the GROUP BY in the outer query.

MySQL nested select: am I able to use any data from outer select?

Let's imagine I have the following table users:
id name
1 John
2 Mike
3 Max
And table posts
id author_id date title
1 1 2014-12-12 Post 2
2 1 2014-12-10 Post 1
3 2 2014-10-01 Lorem ipsum
...and so on
And I'd like to have a query containing the following data:
user name
number of user's posts within last week
number of user's posts within last month
I can do it just for each individual user (with id 1 in the following example):
SELECT
`name`,
(SELECT COUNT(*) FROM `posts`
WHERE
`author` = 1 AND
UNIX_TIMESTAMP()-UNIX_TIMESTAMP(`date`) < 7*24*3600) AS `posts7`,
(SELECT COUNT(*) FROM `posts`
WHERE
`author` = 1 AND
UNIX_TIMESTAMP()-UNIX_TIMESTAMP(`date`) < 30*24*3600) AS `posts30`
FROM `users`
WHERE `id` = 1
I suspect that MySQL will allow do this for all users within one query if I could exchange the data between inner and outer SELECT's. I probably using wrong words, but I really hope that people here will understand my needs and I will have some help.
This is NOT the final SQL but gives you the jist...
Select name, sum(case when datewithin7 then 1 else 0 end) as posts7,
sum(case when datewithin30 then 1 else 0 end) as posts30
from name
left join posts on name.id = posts.nameid
GROUP BY name.
Note you need the group by. but I don't have the time to put the case statement together...
Try something like this
SELECT
`name`,
sum(IF(`date` between DATE(NOW()-INTERVAL 7 DAY) and now() , 1, 0) as posts7,
sum(IF(`date` between DATE(NOW()-INTERVAL 30 DAY) and now() , 1, 0) as posts30
FROM
`users` as u, posts as p
WHERE
u.id = p.author_id
GROUP BY
1
Certainly aggregating a non-nested query is the way to solve the problem although both Benni and xQbert have written unbounded queries - which, while satisfying the objective, are very innefficient. Consider (adapted from Benni's answer):
SELECT `name`
, SUM(IF(
`date` between DATE(NOW()-INTERVAL 7 DAY) and now()
, 1
, 0) as posts7
, SUM(IF(
p.author_id IS NULL
, 0
, 1) as posts30
FROM `users` as u
LEFT JOIN posts as p
ON u.id = p.author_id
AND p.date > NOW()-INTERVAL 30 DAY
GROUP BY name
Note that NOT using the conversion to a UNIX timestamp allows the database to use an index (if available) to resolve the query.
However there are scenarios where it is more effective / appropriate to use a nested query. So although it's not the best solution to this problem:
SELECT
`name`
, (SELECT COUNT(*)
FROM `posts` AS p7
WHERE p7.author = users.id
AND p7.`date` > NOW() - INTERVAL 7 DAY) AS `posts7`
, (SELECT COUNT(*)
FROM `posts` AS p30
WHERE p30.author = users.id
AND p30.`date` > NOW() - INTERVAL 30 DAY) AS `posts30`
FROM `users`
WHERE `id` = 1

MYSQL How to show most popular items in the last two days

I tried to show the most popular items in the last two days, but this view is letting in items that happened over two days ago.
It is made to find the most popular in the last two days (maybe 20-30 items) and fill the remaining ones with random items (need 1000 items on the view at all times)
How can I fix this?
Thank you
CREATE
ALGORITHM = UNDEFINED
DEFINER = `XX`#`XX`
SQL SECURITY DEFINER
VIEW `trending` AS
select
`question`.`name` AS `name`,
`question`.`questionUrl` AS `questionUrl`,
`question`.`miRating` AS `miRating`,
`question`.`imageUrl` AS `imageUrl`,
`question`.`miThumbnail` AS `miThumbnail`,
`question`.`foundOn` AS `foundOn`,
`question`.`myId` AS `myId`
from
(`question`
join `feed` ON ((`question`.`myId` = `feed`.`question_id`)))
group by `question`.`name`
order by (`feed`.`timeStamp` >= (now() - interval 1 day)) desc ,
(`feed`.`question_id` is not null) desc ,
(((`question`.`likesCount` * 0.8) + (`question`.`commentsCount` * 0.6)) + ((`question`.`sharesCount` * 1) / 2.4)) desc
limit 0 , 1000
The problem is that you are grouping by question_name, but you have a lot of other columns in the query, in both the select and the order by. MySQL chooses arbitrary values for these. One way to fix this is by using only the maximum of the time condition in the order by clause:
select q.`name` AS `name`, q.`questionUrl` AS `questionUrl`, q.`miRating` AS `miRating`,
q.`imageUrl` AS `imageUrl`, q.`miThumbnail` AS `miThumbnail`,
q.`foundOn` AS `foundOn`, q.`myId` AS `myId`
from `question` q join
`feed` f
ON q.`myId` = f.`question_id`
group by q.`name`
order by (max(f.`timeStamp`) >= (now() - interval 1 day)) desc ,
(f.`question_id` is not null) desc ,
(((q.`likesCount` * 0.8) + (q.`commentsCount` * 0.6)) + ((q.`sharesCount` * 1) / 2.4)) desc
limit 0 , 1000

MYSQL Query : How to get values per category?

I have huge table with millions of records that store stock values by timestamp. Structure is as below:
Stock, timestamp, value
goog,1112345,200.4
goog,112346,220.4
Apple,112343,505
Apple,112346,550
I would like to query this table by timestamp. If the timestamp matches,all corresponding stock records should be returned, if there is no record for a stock for that timestamp, the immediate previous one should be returned. In the above ex, if I query by timestamp=1112345 then the query should return 2 records:
goog,1112345,200.4
Apple,112343,505 (immediate previous record)
I have tried several different ways to write this query but no success & Im sure I'm missing something. Can someone help please.
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` = 1112345
UNION ALL
SELECT `Stock`, `timestamp`, `value`
FROM `myTable`
WHERE `timestamp` < 1112345
ORDER BY `timestamp` DESC
LIMIT 1
select Stock, timestamp, value from thisTbl where timestamp = ? and fill in timestamp to whatever it should be? Your demo query is available on this fiddle
I don't think there is an easy way to do this query. Here is one approach:
select tprev.*
from (select t.stock,
(select timestamp from t.stock = s.stock and timestamp <= <whatever> order by timestamp limit 1
) as prevtimestamp
from (select distinct stock
from t
) s
) s join
t tprev
on s.prevtimestamp = tprev.prevtimestamp and s.stock = t.stock
This is getting the previous or equal timestamp for the record and then joining it back in. If you have indexes on (stock, timestamp) then this may be rather fast.
Another phrasing of it uses group by:
select tprev.*
from (select t.stock,
max(timestamp) as prevtimestamp
from t
where timestamp <= YOURTIMESTAMP
group by t.stock
) s join
t tprev
on s.prevtimestamp = tprev.prevtimestamp and s.stock = t.stock

Sub query or Join which is the optimal solution?

i've 2 tables, 1 user table and a table where the
outgoing emails are queued. I want to select the users
that are not online for a certain amount of time
and send them an email. I also want that, if they
already received such an email in the last 7 days
or have an scheduled email for the next 7 days, that
they are not selected.
I have 2 queries, which i think would be great if
they are working with subqueries.
As an area of which i'm not an expert in, i would
like to kindly invite you to either,
Build a subquery of the second query
Make a JOIN and exclude the second query results.
I would be far more then happy :)
Thank you for reading
SELECT
`user_id`
FROM
`user`
WHERE
DATEDIFF( CURRENT_DATE(), date_seen ) >= 7
The results of the second query should be excluded
from the query above.
SELECT
`mail_queue_id`,
`mail_id`,
`user_id`,
`status`,
`date_scheduled`,
`date_processed`
FROM
`mail_queue`
WHERE
(
DATEDIFF( CURRENT_DATE(), date_scheduled ) >= 7
OR
DATEDIFF( date_scheduled, CURRENT_DATE() ) <= 7
)
AND
(
`mail_id` = 'inactive_week'
AND
(
`status` = 'AWAITING'
OR
`status` = 'DELIVERED'
)
)
SOLUTION
SELECT
`user_id`
FROM
`user` as T1
WHERE
DATEDIFF( CURRENT_DATE(), date_seen ) >= 7
AND NOT EXISTS
(
SELECT
`user_id`
FROM
`mail_queue` as T2
WHERE
T2.`user_id` = T1.`user_id`
AND
(
DATEDIFF( CURRENT_DATE(), date_scheduled ) >= 7
OR
DATEDIFF( date_scheduled, CURRENT_DATE() ) <= 7
AND
(
`mail_id` = 'inactive_week'
AND
(
`status` = 'AWAITING'
OR
`status` = 'DELIVERED'
)
)
)
)
YOu can select the users who match the first criterion (not having logged on in the past seven days) and then "AND" that criterion to another clause using "NOT EXISTS", aliasing the same table:
select * from T where {first criterion}
and not exists
(
select * from T as T2 where T2.userid = T.userid
and ABS( DATEDIFF(datescheduled, CURRENT_DATE()) ) <=7
)
I'm not familiar with the nuances of the mysql DATEDIFF, i.e. whether it matters which date value appears in which position, but the absolute value would make it so that if the user had been sent a notice in the past 7 days or is scheduled to receive a notice in the next seven days, they would satisfy the condition, and thereby fail the NOT EXISTS condition, excluding that user from your final set.