Count the number of retweeted tweets and the total number of times tweets have been retweeted - mysql

Consider the following tables:
users                               tweets
---------------------------------   ---------------------------
user_id num_retweets sum_retweets   tweet_id user_id retweeted
---------------------------------   ---------------------------
1                                   1        1       3
2                                   2        1       0
3                                   3        1       4
                                    4        2       0
                                    5        2       0
                                    6        3       1
                                    7        3       2
                                    8        3       0
I want to compute num_retweets, the number of a user's tweets that have been retweeted at least once, and sum_retweets, the total number of times all of a user's tweets have been retweeted. The expected users table after the UPDATE query is:
users
---------------------------------
user_id num_retweets sum_retweets
---------------------------------
1 2 7 <-- 3 + 4
2 0 0
3 2 3 <-- 1 + 2
Any help on building these two queries would be greatly appreciated :-) I keep having trouble performing UPDATEs across tables.

UPDATE
    users u
    JOIN (
        SELECT
            tweets.user_id,
            -- COUNT ignores NULLs, so only tweets with retweeted > 0 are counted
            COUNT(IF(tweets.retweeted > 0, 1, NULL)) AS num_retweets,
            SUM(tweets.retweeted) AS sum_retweets
        FROM tweets
        GROUP BY tweets.user_id
    ) AS t ON t.user_id = u.user_id
SET u.num_retweets = t.num_retweets, u.sum_retweets = t.sum_retweets;
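To check the aggregation against the sample rows, here is a minimal sketch that runs the same grouping in an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here: it does not accept MySQL's UPDATE ... JOIN syntax or IF(), so the sketch uses CASE and writes the totals back with a parameterized UPDATE. The function name retweet_stats is made up for illustration.

```python
import sqlite3


def retweet_stats():
    """Return {user_id: (num_retweets, sum_retweets)} after updating users."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY,
                            num_retweets INTEGER, sum_retweets INTEGER);
        CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY,
                             user_id INTEGER, retweeted INTEGER);
        INSERT INTO users (user_id) VALUES (1), (2), (3);
        INSERT INTO tweets VALUES
            (1, 1, 3), (2, 1, 0), (3, 1, 4),
            (4, 2, 0), (5, 2, 0),
            (6, 3, 1), (7, 3, 2), (8, 3, 0);
    """)
    # Same aggregation as the derived table t in the answer: COUNT skips
    # NULLs, so only tweets with retweeted > 0 are counted.
    per_user = con.execute("""
        SELECT COUNT(CASE WHEN retweeted > 0 THEN 1 END) AS num_retweets,
               SUM(retweeted) AS sum_retweets,
               user_id
        FROM tweets
        GROUP BY user_id
    """).fetchall()
    # Stand-in for MySQL's UPDATE ... JOIN: write the totals back per user.
    con.executemany(
        "UPDATE users SET num_retweets = ?, sum_retweets = ? WHERE user_id = ?",
        per_user,
    )
    rows = con.execute(
        "SELECT user_id, num_retweets, sum_retweets FROM users ORDER BY user_id"
    ).fetchall()
    con.close()
    return {user_id: (num, total) for user_id, num, total in rows}
```

On the sample data this reproduces the expected users table from the question (user 1: 2 and 7, user 2: 0 and 0, user 3: 2 and 3).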

Related

MySQL Query 2 records at top followed by the rest

I'm stuck with the following problem.
I have a website with for example supermarket shopping items. People can search the website for items.
On the search results page, I want the top 2 items to be ones that I have marked as being on offer. There can be many more items on offer.
So for example, if someone searches for shampoo, the query would return all the shampoo items in the database table, but with just 2 shampoo offer items at the top. If there are more than 2 shampoo offers in the table, the others would simply not be shown.
Example with names :
Table:
id name C D
----------------------------------
1 Jack 1 1
2 Joe 1 1
3 Dave 3 0
4 Sue 1 0
5 Mike 1 1
6 Steve 4 0
7 David 1 0
8 Susan 4 1
9 Marc 1 1
10 Ronald 4 1
11 Michael 4 1
EXAMPLE 1
Query :
WHERE C = 1 AND D = 1 (But only maximum of 2 'D' records, these 2 'D' records show at the top of the result)
Desired Query Result :
id name C D
----------------------------------
1 Jack 1 1
2 Joe 1 1
4 Sue 1 0
7 David 1 0
EXAMPLE 2
Query :
WHERE C = 4 AND D = 1 (But only maximum of 2 'D' records, these 2 'D' records show at the top of the result)
Desired Query Result :
id name C D
----------------------------------
8 Susan 4 1
10 Ronald 4 1
6 Steve 4 0
I hope this explains what I'm trying to achieve.
Many thanks for any help or suggestions!
That's two queries that you can combine with UNION ALL:
select * from mytable where c = 1 and d = 1 limit 2
union all
select * from mytable where c = 1 and d = 0
order by d desc;
UPDATE: If you want to have the two rows chosen randomly, then order by RAND(). (Without an ORDER BY the rows are chosen arbitrarily, which means it's not guaranteed that the same two rows get picked again when re-running the query, though it's quite likely.) As we need an ORDER BY for a partial query (the first query in the complete union-alled query), we must use parentheses, because otherwise only one ORDER BY would be allowed, namely for the complete query at the query's end.
(select * from mytable where c = 1 and d = 1 order by rand() limit 2)
union all
(select * from mytable where c = 1 and d = 0)
order by d desc;
SQL fiddle: http://sqlfiddle.com/#!9/ed53f6/3.
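The "two offers first, then the rest" pattern can be tried out with a small sketch in Python's sqlite3 module. Several things here are assumptions made for the sake of a reproducible example: SQLite stands in for MySQL, the two offer rows are picked by lowest id instead of RAND(), and because SQLite does not allow parenthesized SELECTs inside a compound query, the limited part is wrapped in a subquery instead. The helper name search is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, name, c, d)
ROWS = [
    (1, "Jack", 1, 1), (2, "Joe", 1, 1), (3, "Dave", 3, 0),
    (4, "Sue", 1, 0), (5, "Mike", 1, 1), (6, "Steve", 4, 0),
    (7, "David", 1, 0), (8, "Susan", 4, 1), (9, "Marc", 1, 1),
    (10, "Ronald", 4, 1), (11, "Michael", 4, 1),
]


def search(c):
    """Return ids for category c: at most two d = 1 rows, then all d = 0 rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT,"
        " c INTEGER, d INTEGER)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", ROWS)
    cursor = con.execute("""
        SELECT id FROM (
            -- limited part: at most two offers, chosen by lowest id
            SELECT * FROM (SELECT * FROM mytable
                           WHERE c = :c AND d = 1
                           ORDER BY id LIMIT 2)
            UNION ALL
            -- unlimited part: every non-offer row of the category
            SELECT * FROM mytable WHERE c = :c AND d = 0
        )
        ORDER BY d DESC, id
    """, {"c": c})
    ids = [row[0] for row in cursor]
    con.close()
    return ids
```

With the sample table, search(1) gives the ids of Jack, Joe, Sue and David, and search(4) gives Susan, Ronald and Steve, matching the two desired results in the question.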

MySQL: select when count = sum & group by

I need to select user_id and quiz_id for users whose count of questions in a quiz equals their sum of correct, meaning they answered 100% correctly.
answers table:
quiz_id question_id user_id answer_id correct
1 1 1 1 1
1 2 1 6 0
1 3 1 9 1
2 1 2 1 1
2 2 2 5 1
3 4 1 17 1
3 5 1 21 1
3 6 1 25 1
4 1 3 1 1
5 4 4 18 0
6 1 5 1 1
6 2 5 5 1
7 1 3 2 0
7 2 3 7 0
ex 1:
user 1 took "quiz_id" = 1
count of questions in "quiz_id = 1" = 3
sum of correct = 2
so it's not 100%
user_id = 1 with quiz_id = 1 => will not be selected
but user_id = 1 will be selected with quiz_id = 3 because he got 100%
expected results:
quiz_id user_id
2 2
3 1
4 3
6 5
notes:
a quiz can be taken by different users, with a different number of
questions each
(quiz_id, user_id) is unique (a user cannot take the same quiz twice)
thanks,
You should use an aggregate query with HAVING clause:
SELECT quiz_id, user_id
FROM quiz_answer -- or whatever the name is
GROUP BY quiz_id, user_id
HAVING COUNT(question_id) = SUM(correct)
Here you must use HAVING instead of WHERE because, as specified in the docs:
"The HAVING clause can refer to aggregate functions, which the WHERE
clause cannot."
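As a quick check of the HAVING approach, the following sketch loads the sample rows into an in-memory SQLite database (a stand-in for MySQL) and runs the same grouped query. The table name answers is taken from the question's wording, the ORDER BY is added only to make the output order predictable, and the function name perfect_scores is made up for this example.

```python
import sqlite3

# Sample rows: (quiz_id, question_id, user_id, answer_id, correct)
ANSWERS = [
    (1, 1, 1, 1, 1), (1, 2, 1, 6, 0), (1, 3, 1, 9, 1),
    (2, 1, 2, 1, 1), (2, 2, 2, 5, 1),
    (3, 4, 1, 17, 1), (3, 5, 1, 21, 1), (3, 6, 1, 25, 1),
    (4, 1, 3, 1, 1),
    (5, 4, 4, 18, 0),
    (6, 1, 5, 1, 1), (6, 2, 5, 5, 1),
    (7, 1, 3, 2, 0), (7, 2, 3, 7, 0),
]


def perfect_scores():
    """Return (quiz_id, user_id) pairs where every answer was correct."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE answers (quiz_id INTEGER, question_id INTEGER,"
        " user_id INTEGER, answer_id INTEGER, correct INTEGER)"
    )
    con.executemany("INSERT INTO answers VALUES (?, ?, ?, ?, ?)", ANSWERS)
    rows = con.execute("""
        SELECT quiz_id, user_id
        FROM answers
        GROUP BY quiz_id, user_id
        -- the filter runs after grouping, so aggregates are allowed here
        HAVING COUNT(question_id) = SUM(correct)
        ORDER BY quiz_id
    """).fetchall()
    con.close()
    return rows
```

The comparison works because correct is 0 or 1, so SUM(correct) counts the correct answers and equals COUNT(question_id) only when none were wrong.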

Most recent distinct record from a joined MySQL table

I have two tables, one of a list of competition results, and one with ELO ratings (based on previous competitions).
Fetching the list of competitors for an arbitrary competition is trivial enough, but I also need to get the most recent rating value for them.
score:
id | eventid | competitorid | position
1 1 1 1
2 1 2 2
3 1 3 3
4 2 2 1
5 2 3 2
6 3 1 1
7 3 3 2
8 3 2 3
rating:
id | competitorid | rating
1 1 1600
2 2 1500
3 3 1500
4 2 1600
5 3 1590
Expected output for a query against score.eventid = 3 would be
id | competitorid | position | rating
6 1 1 1600
7 3 2 1590
8 2 3 1600
At the moment my code looks like:
SELECT score.scoreID, score.competitorID, score.position,
rating.id, rating.rating
FROM score, rating
WHERE score.competitorid = rating.competitorid
AND score.eventid = 3
ORDER BY score.position
which gives an output of
id | competitorid | position | rating.id | rating
6 1 1 1 1600
7 3 2 3 1500
7 3 2 5 1590
8 2 3 2 1500
8 2 3 4 1600
Basically, it shows the data from the score table for the correct event, but gives me a row for every rating available for that competitorID. Unfortunately, I have no idea where to build in the DISTINCT statement or how to limit it to the most recent result.
MySQL noob, and managed DISTINCT statements, but not with joins. Unfortunately most previous questions seemed to deal with getting distinct results from a single table, which is not quite what I'm after. Thanks!
One way to get the rating is with a correlated subquery:
SELECT s.scoreID, s.eventID, s.competitorID, s.position,
       (SELECT r.rating
        FROM rating r
        WHERE s.competitorID = r.competitorID
        ORDER BY r.id DESC
        LIMIT 1
       ) AS rating
FROM score s
WHERE s.eventID = 3
ORDER BY s.position;
I'm not sure what ratingprid is, so this only includes the rating.
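The correlated subquery can be exercised against the sample rows with a short sketch using Python's sqlite3 module (SQLite standing in for MySQL). It assumes the column names shown in the question's table listings (id, eventid, competitorid, position) and that a higher rating.id means a more recent rating. The function name latest_ratings is hypothetical.

```python
import sqlite3


def latest_ratings(event_id):
    """Return (score id, competitorid, position, latest rating) for an event."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE score (id INTEGER PRIMARY KEY, eventid INTEGER,
                            competitorid INTEGER, position INTEGER);
        CREATE TABLE rating (id INTEGER PRIMARY KEY, competitorid INTEGER,
                             rating INTEGER);
        INSERT INTO score VALUES
            (1, 1, 1, 1), (2, 1, 2, 2), (3, 1, 3, 3), (4, 2, 2, 1),
            (5, 2, 3, 2), (6, 3, 1, 1), (7, 3, 3, 2), (8, 3, 2, 3);
        INSERT INTO rating VALUES
            (1, 1, 1600), (2, 2, 1500), (3, 3, 1500),
            (4, 2, 1600), (5, 3, 1590);
    """)
    rows = con.execute("""
        SELECT s.id, s.competitorid, s.position,
               -- evaluated once per score row: newest rating of that competitor
               (SELECT r.rating
                FROM rating r
                WHERE r.competitorid = s.competitorid
                ORDER BY r.id DESC
                LIMIT 1) AS rating
        FROM score s
        WHERE s.eventid = ?
        ORDER BY s.position
    """, (event_id,)).fetchall()
    con.close()
    return rows
```

Because the subquery returns a single value per score row, the join no longer multiplies rows, which is why no DISTINCT is needed.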

My sql query sum of highest points for each category

upload table
id category_id
1 1
2 2
3 3
4 1
In Ratings Table
id upload_id points
1 1 5
2 2 3
3 3 2
4 4 2
5 1 5
I want to display all the records from the ratings table except id 4, because its upload (id 4) is in the same category as upload id 1.
Expected result:
Uploaded_id 1 has sum of 10 points
Uploaded_id 2 has sum of 3 points
Uploaded_id 3 has sum of 2 points
Please help me.
Thanks In advance
Sasikumar
select upload_id, sum(points)
from rating r,
     (select min(id) id from upload group by category_id) a
where a.id = r.upload_id
group by upload_id
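To see what this query returns on the sample rows, here is a minimal sketch using an in-memory SQLite database as a stand-in for MySQL. It uses the table names from the query (upload and rating), writes the comma join as an explicit JOIN, and adds an ORDER BY for a predictable result. The function name points_per_first_upload is made up for illustration.

```python
import sqlite3


def points_per_first_upload():
    """Sum rating points for the first (lowest id) upload of each category."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE upload (id INTEGER PRIMARY KEY, category_id INTEGER);
        CREATE TABLE rating (id INTEGER PRIMARY KEY, upload_id INTEGER,
                             points INTEGER);
        INSERT INTO upload VALUES (1, 1), (2, 2), (3, 3), (4, 1);
        INSERT INTO rating VALUES
            (1, 1, 5), (2, 2, 3), (3, 3, 2), (4, 4, 2), (5, 1, 5);
    """)
    rows = con.execute("""
        SELECT r.upload_id, SUM(r.points)
        FROM rating r
        -- a keeps one upload per category: the one with the smallest id
        JOIN (SELECT MIN(id) AS id FROM upload GROUP BY category_id) a
          ON a.id = r.upload_id
        GROUP BY r.upload_id
        ORDER BY r.upload_id
    """).fetchall()
    con.close()
    return rows
```

Upload 4 shares category 1 with upload 1, so its rating row drops out and the totals come to 10, 3 and 2 points, as in the expected result.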

Joining data from 4 tables to calculate several weighted scores

Please consider the following tables.
users holds a couple of thousands of Twitter-users; their tweets are indexed with sp100_id, which is the id of the company (see sp100) the tweet was talking about. tweets.class holds the assigned sentiment class (1 = neutral, 2 = positive, 3 = negative) for each tweet. tweets.rt holds the amount of times the tweet has been retweeted. Finally, each user has been given a quality score and a follow score, as follows:
users
-------------------------
user_id quality follow
-------------------------
1       2.50    5.00
2       0.75    1.00

tweets
-----------------------------------------------
tweet_id sp100_id nyse_date  user_id class rt
-----------------------------------------------
1        1        2011-03-12 1       1     0
2        1        2011-03-13 1       2     2
3        1        2011-03-13 1       2     1
4        1        2011-03-13 2       2     0
5        1        2011-03-13 2       3     3
6        2        2011-03-12 2       2     3
7        2        2011-03-12 2       2     0
8        2        2011-03-12 1       3     5
9        2        2011-03-13 2       2     0

daterange
----------------
_date
----------------
2011-03-11
2011-03-12
2011-03-13

sp100
----------------
sp100_id _name
----------------
1 Alcoa
2 Apple
The desired output is a list per sp100_id per _date the amount of positive (class=2) and negative (class=3) tweets weighted per rt, 'quality' and follow:
sp100_id nyse_date pos-rt pos-quality pos-follow neg-rt neg-quality neg-follow
--------------------------------------------------------------------------------
1 2011-03-11 0 0 0 0 0 0
1 2011-03-12 0 0 0 0 0 0
1 2011-03-13 5 (1) 5.75 (2) 11.00 (3) 3 (4) 0.75 (5) 1.00 (6)
2 2011-03-11 0 0 0 0 0 0
2 2011-03-12 3 (7) 5.00 (8) 10.00 (9) 5.00 2.50 2.50
2 2011-03-13 0 0.75 1.00 0 0 0
--------------------------------------------------------------------------------
(1) On 2011-03-13, 3 positive tweets for sp100_id 1. 1 tweet retweeted 2 times,
1 tweets retweeted 1 time and 1 tweet retweeted 0 times = 2x2+1x1+1x0 = 5
(2) On 2011-03-13, 2 positive tweets made by user 1, who has quality 2.50 and
1 positive tweet made by user 2, who has quality 0.75 = 2x2.50+1x0.75 = 5.75
(3) On 2011-03-13, 2 positive tweets made by user 1, who has follow 5.00 and
1 positive tweet made by user 2, who has follow 1 = 2x5.00+1x1.00 = 11.00
(4) On 2011-03-13, 1 negative tweet made by user 2, retweeted 3 times = 1x3 = 3
(5) On 2011-03-13, 1 negative tweet made by user 2, who has quality 0.75, thus
1x0.75 = 0.75
(6) On 2011-03-13, 1 negative tweets made by user 2, who has follow 1.00 so
1x1.00 = 1.00
(7) 1 positive tweet which has been retweeted 3 times, 1 positive tweet without
any retweets = 1x3+1x0 = 3
(8) 2 positive tweets from user 2 x quality 2.50 = 5.00
(9) 2 positive tweets x follow 5 = 10.00
I've tried to explain myself as well as possible. Who can help me build the correct query? As you can see, dates for which there are no tweets (all values zero) also need to be included in the result set. I now have this, but am having trouble finishing the rest:
SELECT
s.sp100_id,
d._date,
COALESCE(c.pos-rt,0) AS pos-rt,
COALESCE(c.pos-quality,0) AS pos-quality,
COALESCE(c.pos-follow,0) AS pos-follow,
COALESCE(c.neg-rt,0) AS neg-rt,
COALESCE(c.neg-quality,0) AS neg-quality,
COALESCE(c.neg-follow,0) AS neg-follow
FROM sp100 s
CROSS JOIN daterange d
LEFT JOIN (
SELECT
sp100_id,
nyse_date,
COUNT(CASE class WHEN 2 THEN 1 END) * [rt] AS pos-rt,
COUNT(CASE class WHEN 2 THEN 1 END) * [quality] AS pos-quality,
COUNT(CASE class WHEN 2 THEN 1 END) * [follow] AS pos-follow,
COUNT(CASE class WHEN 3 THEN 1 END) * [rt] AS neg-rt,
COUNT(CASE class WHEN 3 THEN 1 END) * [quality] AS neg-quality,
COUNT(CASE class WHEN 3 THEN 1 END) * [follow] AS neg-follow
FROM tweets
GROUP BY sp100_id, nyse_date
) c ON s.sp100_id = c.sp100_id AND d._date = c.nyse_date
ORDER BY s.sp100_id, d._date ASC
Obviously, [rt], [quality] and [follow] need to be replaced by correct syntax and I'm not sure about the COUNT(...) either, because it now first counts the number of tweets, but it should take every tweet apart and multiply it by its own number of retweets ('rt').
Can anybody help me out?
Assuming that I've understood the problem correctly (see my comments above), you merely need to group the joined tables and SUM() the relevant fields where the tweets are of the desired class, which can be determined using IF():
SELECT sp100.sp100_id  AS `sp100_id`,
       daterange._date AS `nyse_date`,
       SUM(IF(tweets.class=2, tweets.rt,     0)) AS `pos-rt`,
       SUM(IF(tweets.class=2, users.quality, 0)) AS `pos-quality`,
       SUM(IF(tweets.class=2, users.follow,  0)) AS `pos-follow`,
       SUM(IF(tweets.class=3, tweets.rt,     0)) AS `neg-rt`,
       SUM(IF(tweets.class=3, users.quality, 0)) AS `neg-quality`,
       SUM(IF(tweets.class=3, users.follow,  0)) AS `neg-follow`
FROM sp100
CROSS JOIN daterange
LEFT JOIN tweets ON tweets.nyse_date = daterange._date
                AND tweets.sp100_id = sp100.sp100_id
LEFT JOIN users ON tweets.user_id = users.user_id
GROUP BY sp100.sp100_id, daterange._date
See it on sqlfiddle.
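For readers without access to the fiddle, the conditional-sum query can be reproduced on the sample rows with a small Python sketch. SQLite stands in for MySQL, so IF() is written as CASE, the hyphenated aliases become pos_rt and so on, and an ORDER BY is added; the function name weighted_scores is hypothetical. The values in the usage note are what this query yields on the sample data, and in a few cells they differ from the hand-computed table in the question (for example, pos-rt for sp100_id 1 on 2011-03-13 comes out as 2+1+0 = 3, not 5).

```python
import sqlite3


def weighted_scores():
    """Return {(sp100_id, date): (pos_rt, pos_quality, pos_follow,
                                  neg_rt, neg_quality, neg_follow)}."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE users (user_id INTEGER PRIMARY KEY,
                            quality REAL, follow REAL);
        CREATE TABLE sp100 (sp100_id INTEGER PRIMARY KEY, _name TEXT);
        CREATE TABLE daterange (_date TEXT PRIMARY KEY);
        CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, sp100_id INTEGER,
                             nyse_date TEXT, user_id INTEGER,
                             class INTEGER, rt INTEGER);
        INSERT INTO users VALUES (1, 2.50, 5.00), (2, 0.75, 1.00);
        INSERT INTO sp100 VALUES (1, 'Alcoa'), (2, 'Apple');
        INSERT INTO daterange VALUES
            ('2011-03-11'), ('2011-03-12'), ('2011-03-13');
        INSERT INTO tweets VALUES
            (1, 1, '2011-03-12', 1, 1, 0), (2, 1, '2011-03-13', 1, 2, 2),
            (3, 1, '2011-03-13', 1, 2, 1), (4, 1, '2011-03-13', 2, 2, 0),
            (5, 1, '2011-03-13', 2, 3, 3), (6, 2, '2011-03-12', 2, 2, 3),
            (7, 2, '2011-03-12', 2, 2, 0), (8, 2, '2011-03-12', 1, 3, 5),
            (9, 2, '2011-03-13', 2, 2, 0);
    """)
    rows = con.execute("""
        SELECT s.sp100_id, d._date,
               SUM(CASE WHEN t.class = 2 THEN t.rt      ELSE 0 END) AS pos_rt,
               SUM(CASE WHEN t.class = 2 THEN u.quality ELSE 0 END) AS pos_quality,
               SUM(CASE WHEN t.class = 2 THEN u.follow  ELSE 0 END) AS pos_follow,
               SUM(CASE WHEN t.class = 3 THEN t.rt      ELSE 0 END) AS neg_rt,
               SUM(CASE WHEN t.class = 3 THEN u.quality ELSE 0 END) AS neg_quality,
               SUM(CASE WHEN t.class = 3 THEN u.follow  ELSE 0 END) AS neg_follow
        FROM sp100 s
        -- every company paired with every date, so empty days still appear
        CROSS JOIN daterange d
        LEFT JOIN tweets t ON t.nyse_date = d._date
                          AND t.sp100_id = s.sp100_id
        LEFT JOIN users u ON u.user_id = t.user_id
        GROUP BY s.sp100_id, d._date
        ORDER BY s.sp100_id, d._date
    """).fetchall()
    con.close()
    return {(row[0], row[1]): row[2:] for row in rows}
```

For instance, weighted_scores()[(1, "2011-03-13")] is (3, 5.75, 11.0, 3, 0.75, 1.0), and days without positive or negative tweets come back as all zeros because the CASE falls through to 0 when the LEFT JOIN finds no matching tweet.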
[EDIT] Here's the EXPLAIN:
id  select_type  table      type    possible_keys             key        key_len  ref                         rows  extra
--------------------------------------------------------------------------------------------------------------------------------------------------------------
1   SIMPLE       sp100      index   NULL                      PRIMARY    4        NULL                        101   Using index; Using temporary; Using filesort
1   SIMPLE       daterange  index   NULL                      _date      3        NULL                        147   Using index; Using join buffer
1   SIMPLE       tweets     ref     query,nyse_date,sp100_id  nyse_date  3        sentimeter.daterange._date  3815
1   SIMPLE       users      eq_ref  PRIMARY                   PRIMARY    4        sentimeter.tweets.user_id   1