Retrieving sets with highest number of combined votes among objects

Retrieving sets with highest number of combined votes among objects - mysql

I have three tables relevant to this problem - rankset, item, and vote. Rankset is essentially the category the item is placed in, such as "Favorite sport". Item is what's actually being voted on, such as "Baseball". Vote is the log of the vote itself. What I want to do is display the 25 most active ranksets on a page. Here's what the tables themselves look like:
CREATE TABLE IF NOT EXISTS `rankset` (
`id` INT NOT NULL AUTO_INCREMENT ,
`name` TEXT NOT NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
CREATE TABLE IF NOT EXISTS `item` (
`id` BIGINT NOT NULL AUTO_INCREMENT ,
`name` VARCHAR(128) NOT NULL ,
`rankset` BIGINT NOT NULL ,
`image` VARCHAR(45) NULL ,
`description` VARCHAR(140) NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
CREATE TABLE IF NOT EXISTS `mydb`.`vote` (
`id` BIGINT NOT NULL AUTO_INCREMENT ,
`value` TINYINT NOT NULL ,
`item` BIGINT NOT NULL ,
`user` BIGINT NOT NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
This is what I've tried so far:
SELECT rankset.*, COALESCE(COUNT(vote.id), 0) AS votes
FROM rankset, item, vote
WHERE rankset.id = item.rankset
AND vote.item = item.id ORDER BY votes DESC LIMIT 25
For whatever reason, I seem to only be able to get the single most popular rankset with that. I've also tried this:
SELECT rankset.*, COALESCE(COUNT(vote.id), 0) AS votes
FROM rankset, vote, item
WHERE item.rankset = rankset.id
GROUP BY rankset ORDER BY votes DESC LIMIT 25
But that seems to ignore the "ORDER BY" part completely. What would be the correct way to go about this?
EDIT: Here's the fiddle: http://sqlfiddle.com/#!2/b57ac

Your queries are nearly right. In first one you should just add 'GROUP BY rankset'. Try that:
SELECT rankset.*, COALESCE(COUNT(vote.id), 0) AS votes
FROM rankset, item, vote
WHERE rankset.id = item.rankset
AND vote.item = item.id
GROUP by rankset.id
ORDER BY votes DESC
LIMIT 25;
Here's the fiddle: http://sqlfiddle.com/#!2/fe315/9.
UPDATE:
The case if some rankset doesn't have any votes:
SELECT rankset.*, COALESCE(COUNT(vote.id), 0) AS votes
FROM rankset, item
LEFT JOIN vote ON (vote.item = item.id)
WHERE rankset.id = item.rankset
GROUP by rankset.id
ORDER BY votes DESC
LIMIT 25;

Related

How to get NULL at the top of the sort?

Two tables are defined:
CREATE TABLE `users` (
`user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`score` tinyint(1) unsigned DEFAULT NULL,
PRIMARY KEY (`user_id`)
);
CREATE TABLE `online` (
`user_id` mediumint(6) unsigned NOT NULL AUTO_INCREMENT,
`url` varchar(255) NOT NULL,
PRIMARY KEY (`user_id`)
);
How to combine the tables so that the result would be sorted by the score field from the largest to the smallest but at the top there were records with the value NULL?
This query does not sort the second sample:
(SELECT * FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NULL)
UNION
(SELECT * FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NOT NULL ORDER BY `score` DESC)

Use two keys in the sort:
SELECT *
FROM `online` o JOIN
`users`
USING (user_id)
ORDER BY (`score` IS NULL) DESC, Score DESC;
MySQL treats booleans as numbers in a numeric context, with "1" for true and "0" for false. So, DESC puts the true values first.
Incidentally, your version would look like it works if you used UNION ALL rather than UNION. However, it is not guaranteed that the results are in any particular order unless you explicitly have an ORDER BY.
The UNION incurs overhead for removing duplicates and in doing so rearranges the data.

Try:
select * from online join users using (user_id) order by ifnull(score, 10) desc;

You can use order by Nulls Last in the end of your sql to show nulls on the first.

You can try below -
select * from
(
SELECT *,1 as ord FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NULL
UNION
SELECT *,2 FROM `online` JOIN `users` USING(`user_id`) WHERE `score` IS NOT NULL
)A ORDER BY ord asc,`score` DESC

Improve speed of MySQL query with 5 left joins

Working on a support ticketing system with not a lot of tickets (~3,000). To get a summary grid of ticket information, there are five LEFT JOIN statements on custom field table (j25_field_value) containing about 10,000 records. The query runs too long (~10 seconds) and in cases with a WHERE clause, it runs even longer (up to ~30 seconds or more).
Any suggestions for improving the query to reduce the time to run?
Four tables:
j25_support_tickets
CREATE TABLE `j25_support_tickets` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`category_id` int(11) NOT NULL DEFAULT '0',
`user_id` int(11) DEFAULT NULL,
`email` varchar(50) DEFAULT NULL,
`subject` varchar(255) DEFAULT NULL,
`message` text,
`modified_date` datetime DEFAULT NULL,
`priority_id` tinyint(3) unsigned DEFAULT NULL,
`status_id` tinyint(3) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3868 DEFAULT CHARSET=utf8
j25_support_priorities
CREATE TABLE `j25_support_priorities` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(100) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=14 DEFAULT CHARSET=utf8
j25_support_statuses
CREATE TABLE `j25_support_statuses` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id` (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=7 DEFAULT CHARSET=utf8
j25_field_value (id, ticket_id, field_id, field_value)
CREATE TABLE `j25_support_field_value` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`ticket_id` int(11) DEFAULT NULL,
`field_id` int(11) DEFAULT NULL,
`field_value` tinytext,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=10889 DEFAULT CHARSET=utf8
Also, ran this:
SELECT LENGTH(field_value) len FROM j25_support_field_value ORDER BY len DESC LIMIT 1
note: the result = 38
The query:
SELECT DISTINCT t.id as ID
, (select p.title from j25_support_priorities p where p.id = t.priority_id) as Priority
, (select s.title from j25_support_statuses s where s.id = t.status_id) as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5

ALTER TABLE j25_support_field_value
ADD INDEX (`ticket_id`,`field_id`,`field_value`(50))
This index will work as a covering index for your query. It will allow the joins to use only this index to look up the values. It should perform massively faster than without this index, since currently your query would have to read every row in the table to find what matches each combination of ticket_id and field_id.
I would also suggest converting your tables to InnoDB engine, unless you have a very explicit reason for using MyISAM.
ALTER TABLE tablename ENGINE=InnoDB

As above - a better index would help. You could probably then simplify your query into something like this too (join to the table only once):
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, case when v.field_id=1 then v.field_value else null end as IssueType
, case when v.field_id=2 then v.field_value else null end as Version
, case when v.field_id=3 then v.field_value else null end as UserType
, case when v.field_id=4 then v.field_value else null end as Company
, case when v.field_id=5 then v.field_value else null end as RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value v ON t.id = v.ticket_id
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;

You can do away with the subqueries for starters and just get them from another join. You can add an index to j25_support_field_value
alter table j25_support_field_value add key(id, field_type);
I assume there is an index on id in j25_support_tickets - if not and if they are unique, add a unique index alter table j25_support_tickets add unique key(id); If they're not unique, remove the word unique from that statement.
In MySQL, a join usually requires an index on the field(s) that you are using to join on. This will hold up and produce very reasonable results with huge tables (100m+), if you follow that rule, you will not go wrong.
are the ids in j25_support_tickets unique? If they are you can do away with the distinct - if not, or if you are getting exact dupicates in each row, still do away with the distinct and add a group by t.id to the end of this:
SELECT t.id as ID
, p.title as Priority
, s.title as Status
, t.subject as Subject
, t.email as SubmittedByEmail
, type.field_value AS IssueType
, ver.field_value AS Version
, utype.field_value AS UserType
, cust.field_value AS Company
, refno.field_value AS RefNo
, t.modified_date as Modified
FROM j25_support_tickets AS t
LEFT JOIN j25_support_field_value AS type ON t.id = type.ticket_id AND type.field_id =1
LEFT JOIN j25_support_field_value AS ver ON t.id = ver.ticket_id AND ver.field_id =2
LEFT JOIN j25_support_field_value AS utype ON t.id = utype.ticket_id AND utype.field_id =3
LEFT JOIN j25_support_field_value AS cust ON t.id = cust.ticket_id AND cust.field_id =4
LEFT JOIN j25_support_field_value AS refno ON t.id = refno.ticket_id AND refno.field_id =5
LEFT JOIN j25_support_priorities p ON p.id = t.priority_id
LEFT JOIN j25_support_statuses s ON s.id = t.status_id;

Switch to InnoDB.
After switching to InnoDB, make the PRIMARY KEY for j25_support_field_value be (ticket_id, field_id) (and get rid if id). (Tacking on field_value(50) will hurt, not help.)
A PRIMARY KEY is a UNIQUE KEY, so don't have both.
Use VARCHAR(255) instead of the nearly-equivalent TINYTEXT.
EAV schema sucks. My ran on EAV.

Selecting sets with the top 10% of combined votes among objects

For some background, I previously asked about retrieving sets with highest number of combined votes among objects. That works great for getting the top 25, but now I would like to get the top 10%, ordered by rankset's timestamp. Here are the tables in question:
CREATE TABLE IF NOT EXISTS `rankset` (
`id` INT NOT NULL AUTO_INCREMENT ,
`name` TEXT NOT NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
CREATE TABLE IF NOT EXISTS `item` (
`id` BIGINT NOT NULL AUTO_INCREMENT ,
`name` VARCHAR(128) NOT NULL ,
`rankset` BIGINT NOT NULL ,
`image` VARCHAR(45) NULL ,
`description` VARCHAR(140) NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
CREATE TABLE IF NOT EXISTS `mydb`.`vote` (
`id` BIGINT NOT NULL AUTO_INCREMENT ,
`value` TINYINT NOT NULL ,
`item` BIGINT NOT NULL ,
`user` BIGINT NOT NULL ,
PRIMARY KEY (`id`) )
ENGINE = InnoDB
I'd list what I've tried so far, but I honestly don't even know where to begin with this one. Here's the SQL fiddle:
http://sqlfiddle.com/#!2/fe315/9

From the comments:
Look at this answer: https://stackoverflow.com/a/4474389/97513, He generates a column for the ranking of his items, use a WHERE clause that uses rank <= (SELECT COUNT(1) / 10 FROM rankset) to identify the top 10%.
I put together an SQL fiddle to demonstrate: http://sqlfiddle.com/#!2/fe315/21 - Here's another with more results so you can see it scales up when you add more rows: http://sqlfiddle.com/#!2/a02ea/1
SQL Used:
SET #rn := 0;
SELECT (#rn:=#rn+1) AS rank, q.*
FROM (
SELECT rankset.*, COALESCE(COUNT(vote.id), 0) AS votes
FROM rankset, vote, item
WHERE item.rankset = rankset.id
AND vote.item = item.id
GROUP BY rankset.id
ORDER BY votes DESC
) q
WHERE
#rn <= (SELECT COUNT(1)/10 FROM rankset);

Need help creating SQL query

I have thus two tables:
CREATE TABLE `workers` (
`id` int(7) NOT NULL AUTO_INCREMENT,
`number` int(7) NOT NULL,
`percent` int(3) NOT NULL,
`order` int(7) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE `data` (
`id` bigint(15) NOT NULL AUTO_INCREMENT,
`workerId` int(7) NOT NULL,
PRIMARY KEY (`id`)
);
I want to return the first worker (order by order ASC) that his number of rows in the table data times percent(from table workers) /100 is smaller than number(from table workers.
I have tried this query:
SELECT workers.id, COUNT(data.id) AS `countOfData`
FROM `workers` as workers, `data` as data
WHERE data.workerId = workers.id
AND workers.percent * `countOfData` < workers.number
LIMIT 1
But I get the error:
#1054 - Unknown column 'countOfData' in 'where clause'

This should work:
SELECT A.id
FROM workers A
LEFT JOIN (SELECT workerId, COUNT(*) AS Quant
FROM data
GROUP BY workerId) B
ON A.id = B.workerId
WHERE (COALESCE(Quant,0) * `percent`)/100 < `number`
ORDER BY `order`
LIMIT 1

You could calculate the number of rows per worker in a subquery. The subquery can be joined to the worker table. If you use a left join, a worker with no data rows will be considered:
select *
from workers w
left join
(
select workerId
, count(*) as cnt
from data
group by
workerId
) d
on w.id = d.workerId
where coalesce(d.cnt, 0) * w.percent / 100 < w.number
order by
w.order
limit 1

mysql query with multiple groups?

I have a query that gets product IDs based on keywords.
SELECT indx_search.pid
FROM indx_search
LEFT JOIN word_index_mem ON (word_index_mem.word = indx_search.word)
WHERE indx_search.word = "phone"
GROUP BY indx_search.pid
ORDER BY indx_search.pid ASC
LIMIT 0,20
This works well but now I'm trying to go a step further and implement "price range" into this query.
CREATE TABLE `price_range` (
`pid` int(11) NOT NULL,
`range_id` tinyint(3) NOT NULL,
PRIMARY KEY (`pid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1
This table simply contains product IDs and a range_id. The price range values are stored here:
CREATE TABLE `price_range_values` (
`ID` tinyint(3) NOT NULL AUTO_INCREMENT,
`rangeFrom` float(10,2) NOT NULL,
`rangeTo` float(10,2) NOT NULL,
PRIMARY KEY (`ID`)
) ENGINE=MyISAM AUTO_INCREMENT=32 DEFAULT CHARSET=latin1
I want to GROUP price_range.range_id with COUNT() of how many products match the certain price range within the current query. I still want to receive my 20 results of product IDs.
So something along the lines of:
SELECT indx_search.pid, price_range.range_id as PriceRangeID, COUNT(price_range.range_id) as PriceGroupTotal
FROM indx_search
LEFT JOIN windex_mem ON ( windex_mem.word = indx_search.word )
LEFT JOIN price_range ON ( price_range.pid = indx_search.pid )
WHERE indx_search.word = "memory"
GROUP BY indx_search.pid, PriceRangeID
ORDER BY indx_search.pid ASC
LIMIT 0 , 20
Is this possible to accomplish without busting an additional query?

Try to use a subquery with a LIMIT clause, e.g. -
SELECT
i_s.pid, p_r.range_id as PriceRangeID, COUNT(p_r.range_id) as PriceGroupTotal
FROM
(SELECT * FROM indx_search WHERE i_s.word = 'memory' ORDER BY i_s.pid ASC LIMIT 0 , 20) i_s
LEFT JOIN
windex_mem w_m ON w_m.word = i_s.word
LEFT JOIN
price_range p_r ON p_r.pid = i_s.pid
GROUP BY
i_s.pid, PriceRangeID

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Retrieving sets with highest number of combined votes among objects - mysql

Related

How to get NULL at the top of the sort?

Improve speed of MySQL query with 5 left joins

Selecting sets with the top 10% of combined votes among objects

Need help creating SQL query

mysql query with multiple groups?

Categories

Resources