SQL Weighted averages of multiple rows -

SQL Weighted averages of multiple rows - - mysql

How would you go about to find the real weighted average of multiple rows:
By real weighted, I mean like this calculator: https://www.rapidtables.com/calc/math/weighted-average-calculator.html (and not by multiplying value with weight).
The weight for each answer is set in answers and values for each question in child-table answer_items. We want the weighted average for each question (a,b,c,d). We know what questions to look for in advance.
The query will include between 2 and 500k answers (so preferably a speedy solution :) )
CREATE TABLE `answers` (
`id` int(10) NOT NULL,
`weight` varchar(255) NOT NULL
);
INSERT INTO `answers` (`id`, `weight`) VALUES
(1, '0.7'),
(2, '1'),
(3, '0.7'),
(4, '0.9');
CREATE TABLE `answer_items` (
`id` int(11) NOT NULL,
`answer_id` int(11) NOT NULL,
`question` varchar(5) NOT NULL,
`value` int(11) NOT NULL
);
INSERT INTO `answer_items` (`id`, `answer_id`, `question`, `value`) VALUES
(1, 1, 'a', 2),
(2, 1, 'b', 4),
(3, 1, 'c', 2),
(4, 1, 'd', 3),
(5, 2, 'a', 4),
(6, 2, 'b', 2),
(7, 2, 'c', 4),
(8, 2, 'd', 1),
(9, 3, 'a', 3),
(10, 3, 'b', 4),
(11, 3, 'c', 1),
(12, 3, 'd', 5),
(13, 4, 'a', 5),
(14, 4, 'b', 2),
(15, 4, 'c', 3),
(16, 4, 'd', 3);
ALTER TABLE `answers`
ADD PRIMARY KEY (`id`);
ALTER TABLE `answer_items`
ADD PRIMARY KEY (`id`);
ALTER TABLE `answers`
MODIFY `id` int(10) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=5;
ALTER TABLE `answer_items`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=17;

You can multiply the value times the weight and then divide by the sum of the weights. For the weighted average by question:
select question, sum(ai.value * a.weight) / sum(a.weight)
from answer_items ai join
answers a
on ai.answer_id = a.id
group by question;
Here is a db<>fiddle.

Somethin like:
SELECT
SUM(value * weight) / SUM (weight)
FROM
answers
JOIN
answer_items
ON
answer_items.answer_id = answers.id
(edit: and obviously include the question and GROUP BY question, as in the previous answer for the SUM aggregation, as per the previous answer. D'oh)

Related

get two totals in one sql query

How can I get the total of hectar and total amount in one row rather than looping it in many rows with php and mysql?
I am using the following sql table:
-- Database: officedb
CREATE TABLE IF NOT EXISTS `agricollectcropedata` (
`kebele` varchar(50) NOT NULL,
`croptype` varchar(40) NOT NULL,
`hektar` int(40) NOT NULL,
`amount` int(20) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `agricollectcropedata` (`kebele`, `croptype`, `hektar`, `amount`) VALUES
('b', 'wheet', 2, 12),
('a', 'wheet', 1, 5),
('a', 'wheet', 2, 6),
('a', 'wheet', 3, 12),
('a', 'wheet', 0, 0),
('a', 'wheet', 0, 0);

SELECT
sum(hektar) as hektar,
sum(amount) as amount
FROM
agricollectcropedata

Condition based selecting from group_concat in Mysql query

Here is my table and sample data.
CREATE TABLE `articles`
(
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `tags`
(
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `article_tags`
(
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL
);
INSERT INTO `tags` (`id`, `name`) VALUES
(1, 'Wap Stories'),
(2, 'App Stories');
INSERT INTO `articles` (`id`, `title`) VALUES
(1, 'USA'),
(2, 'England'),
(3, 'Germany'),
(4, 'India'),
(5, 'France'),
(6, 'Dubai'),
(7, 'Poland'),
(8, 'Japan'),
(9, 'China'),
(10, 'Australia');
INSERT INTO `article_tags` (`article_id`, `tag_id`) VALUES
(1, 1),
(1, 2),
(4, 1),
(5, 1),
(2, 2),
(2, 1),
(6, 2),
(7, 2),
(8, 1),
(9, 1),
(3, 2),
(9, 2),
(10, 2);
How can I get the below output I have tried using group_concat function. It gives all the results. But my requirement is I need to groupconcat values as
a. Combination of 1,2 can be there, only 1 can be there but 2 alone cannot be there.
b. Combination of 2,1 can be there, only 2 can be there but 1 alone cannot be there
Below is the output I need
id, title, groupconcat
--------------------------
1, USA, 1,2
2, England, 1,2
4, India, 1
5, France, 1
8, Japan, 1
9, China, 1,2
SqlFiddle Link
The query which I am using is
select id, title, group_concat(tag_id order by tag_id) as 'groupconcat' from articles a
left join article_tags att on a.id = att.article_id
where att.tag_id in (1,2)
group by article_id order by id

You can try like this
SELECT id, title, GROUP_CONCAT(tag_id ORDER BY tag_id) AS 'groupconcat'
FROM articles a
LEFT join article_tags att on a.id = att.article_id
WHERE att.tag_id in (1,2)
GROUP BY article_id
HAVING SUBSTRING_INDEX(groupconcat,',',1) !='2'
ORDER BY id

mySQL Divide two values from Same Column

I wish to divide values from the same column using mySQL.
is there a better way of doing this??? Do I really need 3 select statements?
SELECT (SELECT FixAM From Fixes WHERE Id = 1) / (SELECT FixAM From Fixes WHERE Id = 2)
My table structure is as follows:
CREATE TABLE IF NOT EXISTS `Fixes` (
`Id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'PK',
`CurrencyId` int(11) NOT NULL COMMENT 'FK',
`MetalId` int(11) NOT NULL COMMENT 'FK',
`FixAM` decimal(10,5) NOT NULL,
`FixPM` decimal(10,5) DEFAULT NULL,
`TimeStamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`Id`),
KEY `CurrencyId` (`CurrencyId`),
KEY `MetalId` (`MetalId`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COLLATE=latin1_general_ci AUTO_INCREMENT=13 ;
--
-- Dumping data for table `Fixes`
--
INSERT INTO `Fixes` (`Id`, `CurrencyId`, `MetalId`, `FixAM`, `FixPM`, `TimeStamp`) VALUES
(1, 1, 1, '1592.50000', '1586.25000', '2013-02-25 15:10:21'),
(2, 2, 1, '1051.84900', '1049.59300', '2013-02-25 15:10:21'),
(3, 3, 1, '1201.88700', '1194.10600', '2013-02-25 15:10:21'),
(4, 1, 2, '29.17000', NULL, '2013-02-25 13:54:02'),
(5, 2, 2, '19.27580', NULL, '2013-02-25 13:54:02'),
(6, 3, 2, '21.98190', NULL, '2013-02-25 13:54:02'),
(7, 1, 3, '1627.00000', '1620.00000', '2013-02-25 14:28:59'),
(8, 2, 3, '1074.65000', '1072.50000', '2013-02-25 14:28:59'),
(9, 3, 3, '1229.30000', '1218.95000', '2013-02-25 14:28:59'),
(10, 1, 4, '747.00000', '748.00000', '2013-02-25 14:28:59'),
(11, 2, 4, '493.40000', '495.20000', '2013-02-25 14:28:59'),
(12, 3, 4, '564.40000', '562.85000', '2013-02-25 14:28:59');

I think this is what you're looking for:
SELECT MetalId,
MAX(CASE WHEN CurrencyId = 1 THEN FixAM END) /
MAX(CASE WHEN CurrencyId = 2 THEN FixAM ELSE 1 END) Output
FROM Fixes
GROUP BY MetalId
This produces 1592.50000 / 1051.849000. If you want the opposite, swap the currency ids.
SQL Fiddle Demo
In case you don't have a CurrencyId = 2, I defaulted the dividing value to 1 so you wouldn't receive an error.

search a set of values in other set of values for a row

Hello I am having issues with execution time on a query that searches for users ( from users table ) that are having at least one interest from one specified interests set and a location from a specified locations set. So I have this test DB:
CREATE TABLE IF NOT EXISTS `interests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `interests`
--
INSERT INTO `interests` (`id`, `name`) VALUES
(1, 'auto'),
(2, 'moto'),
(3, 'health'),
(4, 'garden'),
(5, 'house'),
(6, 'music'),
(7, 'video'),
(8, 'games'),
(9, 'it');
-- --------------------------------------------------------
--
-- Table structure for table `locations`
--
CREATE TABLE IF NOT EXISTS `locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `locations`
--
INSERT INTO `locations` (`id`, `name`) VALUES
(1, 'engalnd'),
(2, 'austia'),
(3, 'germany'),
(4, 'france'),
(5, 'belgium'),
(6, 'italy'),
(7, 'russia'),
(8, 'poland'),
(9, 'norway'),
(10, 'romania');
-- --------------------------------------------------------
--
-- Table structure for table `users`
--
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `users`
--
INSERT INTO `users` (`id`, `email`) VALUES
(1, 'email1#test.com'),
(2, 'email2#test.com'),
(3, 'email3#test.com'),
(4, 'email4#test.com'),
(5, 'email5#test.com'),
(6, 'email6#test.com'),
(7, 'email7#test.com'),
(8, 'email8#test.com'),
(9, 'email9#test.com'),
(10, 'email10#test.com');
-- --------------------------------------------------------
--
-- Table structure for table `users_interests`
--
CREATE TABLE IF NOT EXISTS `users_interests` (
`user_id` int(11) NOT NULL,
`interest_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`interest_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_interests`
--
INSERT INTO `users_interests` (`user_id`, `interest_id`) VALUES
(1, 1),
(1, 2),
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
-- --------------------------------------------------------
--
-- Table structure for table `users_locations`
--
CREATE TABLE IF NOT EXISTS `users_locations` (
`user_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`location_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_locations`
--
INSERT INTO `users_locations` (`user_id`, `location_id`) VALUES
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
Is there a better way to query it than this:
SELECT email,
GROUP_CONCAT( DISTINCT ui.interest_id ) AS interests,
GROUP_CONCAT( DISTINCT ul.location_id ) AS locations
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
GROUP BY u.id
HAVING IF( interests IS NOT NULL , FIND_IN_SET( 2, interests )
OR FIND_IN_SET( 3, interests ) , 1 )
AND IF( locations IS NOT NULL , FIND_IN_SET( 2, locations )
OR FIND_IN_SET( 3, locations ) , 1 )
This is the best solution I found but it still slow on a 500k and 1mil rows in the relational tables ( locations and interests ). Especially when you are matching against a large set of values ( let's say above 50 locations and interests ).
So I am trying to achieve the result this query produces, but a bit faster:
email interests locations
email1#test.com 1,2 [BLOB - 0B]
email5#test.com 1,2,8 1,2,8
email6#test.com 3,7,8 3,7,8
email9#test.com [BLOB - 0B] [BLOB - 0B]
email10#test.com [BLOB - 0B] [BLOB - 0B]
I also tried to join against an SELECT UNION table - for the matching set - but it was even slower. Like this:
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is` ON ui.interest_id = is.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls` ON ul.location_id = ls.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1) AND
IF(ul.user_id IS NOT NULL, ls.location IS NOT NULL,1)
GROUP BY u.id
I am using this for a basic targeting system.
I would appreciate very much, any suggestion! Thank you!

you have IS is reserved word for mysql
and also your group by can slow your query but i dont see any meaning to use group by u.id here since the u.id is already unique id.
look demo
try use backticks around it.
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is`
ON ui.interest_id = `is`.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls`
ON ul.location_id = `ls`.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1)
AND
IF(ul.user_id IS NOT NULL, `ls`.location IS NOT NULL,1)

Remove duplicate rows, except one, where rows contain NULL values

This question is similar to but different from other SO questions involving removal of duplicate rows in MySQL.
Chosen solutions in the referenced questions(below) fail where one or more of the columns contain NULL values. I'm including schema as well as two chosen answers in a sqlfiddle below, to illustrate where the previous solutions are not working.
Source schema:
CREATE TABLE IF NOT EXISTS `deldup` (
`or_id` int(11) NOT NULL AUTO_INCREMENT,
`order_id` varchar(5) NOT NULL,
`txt_value` varchar(20) NOT NULL,
`date_of_revision` date NOT NULL,
`status` int(3) DEFAULT NULL,
PRIMARY KEY (`or_id`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=16 ;
INSERT INTO `deldup` (`or_id`, `order_id`, `txt_value`, `date_of_revision`, `status`) VALUES
(1, '10001', 'ABC', '2003-03-06', NULL),
(2, '10001', 'RFE', '2003-03-11', NULL),
(3, '10002', 'ASE', '2009-08-05', NULL),
(4, '10003', 'PEF', '2001-11-03', NULL),
(5, '10004', 'OIU', '1999-10-29', NULL),
(6, '10005', 'FOO', '2002-03-01', NULL),
(7, '10006', 'RTY', '2005-08-19', NULL),
(8, '10001', 'NND', '2003-03-20', NULL),
(9, '10005', 'VBN', '2002-02-19', NULL),
(10, '10002', 'AAQ', '2009-08-13', NULL),
(11, '10002', 'EEW', '2009-08-07', NULL),
(12, '10001', 'ABC', '2003-03-06', 3),
(13, '10001', 'ABC', '2003-03-06', 3),
(14, '10001', 'ABC', '2003-03-06', NULL),
(15, '10001', 'ABC', '2003-03-06', NULL);
Solution Example 1:
http://sqlfiddle.com/#!2/983f3/1
create temporary table tmpTable (or_id int);
insert tmpTable
(or_id)
select or_id
from deldup yt
where exists
(
select *
from deldup yt2
where yt2.txt_value = yt.txt_value
and yt2.order_id = yt.order_id
and yt2.date_of_revision = yt.date_of_revision
and yt2.status = yt.status
and yt2.or_id > yt.or_id
);
delete
from deldup
where or_id in (select or_id from tmpTable);
Note that the rows containing non-null row values are successfully deleted, however rows containing null values are not removed (See row 14 and 15 in the resulting SELECT query).
Solution Example 2:
http://sqlfiddle.com/#!2/8a4f8/1
DELETE
n1
FROM
deldup n1,
deldup n2
WHERE
n1.or_id < n2.or_id AND
n1.order_id = n2.order_id AND
n1.txt_value = n2.txt_value AND
n1.date_of_revision = n2.date_of_revision AND
n1.status = n2.status
This solution involves less code and works just the same as Example 1, including the exclusion of rows containing null values.
How can I remove the duplicate rows where one of the column values contains NULL values?
References:
Delete all Duplicate Rows except for One in MySQL?
Remove duplicate rows in MySQL
How to remove duplicate entries from a mysql db?

How about simply adjusting the second solution with something like this:
WHERE
...
(n1.status IS NULL AND n2.status IS NULL
OR n1.status = n2.status)
I assume you consider the record a duplicate only if all values are repeated.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

SQL Weighted averages of multiple rows - - mysql

You can multiply the value times the weight and then divide by the sum of the weights. For the weighted average by question: select question, sum(ai.value * a.weight) / sum(a.weight) from answer_items ai join answers a on ai.answer_id = a.id group by question; Here is a db<>fiddle.

Somethin like: SELECT SUM(value * weight) / SUM (weight) FROM answers JOIN answer_items ON answer_items.answer_id = answers.id (edit: and obviously include the question and GROUP BY question, as in the previous answer for the SUM aggregation, as per the previous answer. D'oh)

Related

get two totals in one sql query

Condition based selecting from group_concat in Mysql query

mySQL Divide two values from Same Column

search a set of values in other set of values for a row

Remove duplicate rows, except one, where rows contain NULL values

Categories

Resources

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

SQL Weighted averages of multiple rows - - mysql

You can multiply the value times the weight and then divide by the sum of the weights. For the weighted average by question: select question, sum(ai.value * a.weight) / sum(a.weight) from answer_items ai join answers a on ai.answer_id = a.id group by question; Here is a db<>fiddle.

Somethin like: SELECT SUM(value * weight) / SUM (weight) FROM answers JOIN answer_items ON answer_items.answer_id = answers.id (edit: and obviously include the question and GROUP BY question, as in the previous answer for the SUM aggregation, as per the previous answer. D'oh)

Related

get two totals in one sql query

Condition based selecting from group_concat in Mysql query

mySQL Divide two values from Same Column

search a set of values in other set of values for a row

Remove duplicate rows, except one, ** where rows contain NULL values **

Categories

Resources

Remove duplicate rows, except one, where rows contain NULL values