MySQL compatibility or similarity ranking queries - mysql

G'day, I'm trying to develop a way to query compatibility or similarity between values and failing. It's not a highest or lowest AVG rating but rather the smallest difference between values over a number of rows. So the structure is something like the following, where RANK is the "rating" by the USER.
USER ITEM RANK
A x 5
B x 6
C x 2
A y 2
B y 3
C y 8
A z 7
B z 4
C z 4
At the end I'd like to be able to sort across the data like:
User A vs User B have avg rating difference of 3
User A vs User C have avg rating difference of 4
User B vs User C have avg rating difference of 5
My only thought so far is to build a temp table (huge) with every permutation:
col1 col2 dif item
A B 1 x
A C 3 x
etc...
And then SUM with a GROUP. But that still doesn't deal properly with occasions where User A and C match closer on some items and have greater diff on other items to outweigh the initial closeness. Any direction anyone can give?
Thanks!
This is a mysql 5.5 db so I'm missing any CTE or the like on query structure.

This could be done using a self join:
select a.user, b.user, abs(a.rank - b.rank) diff_rank, a.item
from my_table a
inner join my_table b on a.item = b.item and a.user <> b.user
order by item, diff_rank asc
To avoid duplicated values you can use DISTINCT:
select distinct a.user, b.user, abs(a.rank - b.rank) diff_rank, a.item
from my_table a
inner join my_table b on a.item = b.item and a.user <> b.user
order by item, diff_rank asc
And to obtain the users with the lowest diff first, you can change the ORDER BY:
select distinct a.user, b.user, abs(a.rank - b.rank) diff_rank, a.item
from my_table a
inner join my_table b on a.item = b.item and a.user <> b.user
order by diff_rank asc
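The queries above list one difference per item; to get the single per-pair figure the question asks for, the same self join can be aggregated. Below is a minimal sketch, run through Python's sqlite3 module as a stand-in for MySQL (an assumption; the SELECT itself uses only constructs that MySQL 5.5 also accepts). The table name `ratings` and the column names `user_name` and `rating` are hypothetical, chosen to stay clear of reserved words.

```python
import sqlite3

# Sample data from the question: (user, item, rating).
rows = [
    ("A", "x", 5), ("B", "x", 6), ("C", "x", 2),
    ("A", "y", 2), ("B", "y", 3), ("C", "y", 8),
    ("A", "z", 7), ("B", "z", 4), ("C", "z", 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ratings (user_name TEXT, item TEXT, rating INTEGER)")
conn.executemany("INSERT INTO ratings VALUES (?, ?, ?)", rows)

# a.user_name < b.user_name keeps each pair once (A-B but not B-A).
# AVG over the shared items gives one similarity score per pair.
PAIR_SQL = """
SELECT a.user_name AS user_a,
       b.user_name AS user_b,
       AVG(ABS(a.rating - b.rating)) AS avg_diff,
       COUNT(*) AS shared_items
FROM ratings a
JOIN ratings b ON a.item = b.item AND a.user_name < b.user_name
GROUP BY a.user_name, b.user_name
ORDER BY avg_diff ASC
"""

pairs = conn.execute(PAIR_SQL).fetchall()
for user_a, user_b, avg_diff, shared in pairs:
    print(user_a, user_b, round(avg_diff, 2), shared)
```

On this sample the closest pair is A and B (about 1.67), then B and C (3.0), then A and C (4.0). Because the average runs over every shared item, a pair that is close on one item but far apart on others is ranked by the overall picture, and `shared_items` shows how many items each score is based on.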

Related

Connecting two mySQL tables and summarising with unique values

I've got two mySQL tables, Table A and B. I need to get an output like in Table 3.
Below is the code I tried with FULL JOIN; it does not give me the intended result. I'd much appreciate your help.
SELECT DISTINCT(Table_A.Code) as 'Code', SUM(Table_A.Qty_On_Hand) as 'On Hand Qty', SUM(Table_B.Counted_Qty) as 'Counted Qty'
FULL JOIN Table_B ON Table_A.Code = Table_B.Code
FROM Table_A
Table A
Code | On Hand Qty
A | 20
B | 10
B | 20
B | 50
C | 60
Table B
Code | Counted Qty
A | 10
B | 0
C | 30
B | 0
C | 10
Output required:
Code | On Hand Qty | Counted Qty
A | 20 | 10
B | 80 | 0
C | 60 | 40
You need to use GROUP BY Table_A.Code, not DISTINCT.
SELECT a.Code, SUM(a.Qty_On_Hand) AS `On Hand Qty`, b.`Counted Qty`
FROM Table_A as a
JOIN (
SELECT Code, SUM(Counted_Qty) AS `Counted Qty`
FROM Table_B
GROUP BY Code
) AS b ON a.Code = b.Code
GROUP BY a.Code
You need to do one of the SUMs in a subquery, otherwise you'll multiply its sums by the number of rows in the other table. See Join tables with SUM issue in MYSQL.
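To see the multiplication effect on the question's own data, here is a small sketch that runs both forms side by side. It uses Python's sqlite3 module as a stand-in for MySQL (an assumption), and it pre-aggregates both tables rather than just one, which is a variation on the query above, not the same statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_A (Code TEXT, Qty_On_Hand INTEGER);
CREATE TABLE Table_B (Code TEXT, Counted_Qty INTEGER);
INSERT INTO Table_A VALUES ('A', 20), ('B', 10), ('B', 20), ('B', 50), ('C', 60);
INSERT INTO Table_B VALUES ('A', 10), ('B', 0), ('C', 30), ('B', 0), ('C', 10);
""")

# Naive version: join first, then sum. Each Table_A row is repeated once
# per matching Table_B row, so the on-hand sums are inflated.
naive = conn.execute("""
SELECT a.Code, SUM(a.Qty_On_Hand), SUM(b.Counted_Qty)
FROM Table_A a
JOIN Table_B b ON a.Code = b.Code
GROUP BY a.Code
ORDER BY a.Code
""").fetchall()

# Pre-aggregated version: collapse each table to one row per Code first,
# so the join is one-to-one and nothing is counted twice.
fixed = conn.execute("""
SELECT a.Code, a.on_hand, b.counted
FROM (SELECT Code, SUM(Qty_On_Hand) AS on_hand FROM Table_A GROUP BY Code) a
JOIN (SELECT Code, SUM(Counted_Qty) AS counted FROM Table_B GROUP BY Code) b
  ON a.Code = b.Code
ORDER BY a.Code
""").fetchall()

print("naive:", naive)
print("fixed:", fixed)
```

The naive form reports 160 on hand for B and 120 for C, because B and C each have two rows in Table_B; the pre-aggregated form returns the required 80 and 60.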

How to sum up two different table values

I have two tables:
data
id[int] balance[float] category[id]
1 10.2 1
2 0.12 2
3 112.42 1
4 2.3 3
categories
id[int] name[varchar] start_at[float]
1 high 10.5
2 low 105.2
3 mid 0.7
I want to query the categories and join the data. For each category I want the sum of all data balances added to the start_at value of the category.
This is where I started with:
select sum(d.balance) as balancesum, c.name
from data d
left join categories c on c.id = d.category
group by d.category
What I want to know is, how can I add the start_at value of categories to the balancesum value?
SELECT c.name, c.start_at + SUM(d.balance) as balancesum
FROM categories c
JOIN data d ON c.id = d.category
GROUP BY c.name, c.start_at
You can use the following approach:
select
c.name, balancesum, ifnull(balancesum, 0) + start_at
from categories c
left join (
-- calculate sum of balances per category
-- and join the sums to the categories table
select category, sum(d.balance) as balancesum
from data d
group by d.category
) b on b.category = c.id;
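The two answers differ in one case: a category with no rows in data. The sketch below checks that, using Python's sqlite3 module as a stand-in for MySQL (an assumption). The fourth category, 'empty', is a hypothetical row added only to show the difference; it is not in the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categories (id INTEGER, name TEXT, start_at REAL);
CREATE TABLE data (id INTEGER, balance REAL, category INTEGER);
INSERT INTO categories VALUES
  (1, 'high', 10.5), (2, 'low', 105.2), (3, 'mid', 0.7),
  (4, 'empty', 1.0);  -- hypothetical category with no data rows
INSERT INTO data VALUES (1, 10.2, 1), (2, 0.12, 2), (3, 112.42, 1), (4, 2.3, 3);
""")

# Inner join: categories without data rows drop out of the result.
inner = dict(conn.execute("""
SELECT c.name, c.start_at + SUM(d.balance)
FROM categories c
JOIN data d ON c.id = d.category
GROUP BY c.name, c.start_at
""").fetchall())

# Left join on a pre-summed subquery: every category is kept, and
# IFNULL turns a missing sum into 0 so start_at is returned on its own.
left = dict(conn.execute("""
SELECT c.name, IFNULL(b.balancesum, 0) + c.start_at
FROM categories c
LEFT JOIN (
  SELECT category, SUM(balance) AS balancesum
  FROM data
  GROUP BY category
) b ON b.category = c.id
""").fetchall())

print(sorted(inner))
print(sorted(left))
```

Both forms agree for categories that have data; only the left-join form also returns 'empty' with its bare start_at.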

How to correctly separate this query

I'm trying to sort through my table to find the frequent categories in my orders. After conducting this query
SELECT
ccd.cart_id,
mp.category_name,
ccd.quantity
FROM
customer_orders co
JOIN customer_cart_dtls ccd
ON co.order_cart = ccd.cart_id
JOIN merchant_products mp
ON ccd.product_id = mp.product_id
which yields this result
So from that query, cart #2006........63 has 9 items: 1 from eatables, 3 from fruits, 2 from cleaning, and 3 from snacks. All of them have quantity 1 except the second cleaning entry, which has quantity 2. How can I alter my query so that I get 10 rows, all with quantity 1?
Which would look like this
You want to split the individual rows into multiple rows. One method uses recursive CTEs:
WITH RECURSIVE t as (
SELECT ccd.cart_id, mp.category_name, ccd.quantity
FROM customer_orders co JOIN
customer_cart_dtls ccd
ON co.order_cart = ccd.cart_id JOIN
merchant_products mp
ON ccd.product_id = mp.product_id
),
cte as (
SELECT cart_id, category_name, quantity, 1 as n
FROM t
UNION ALL
SELECT cart_id, category_name, quantity, n + 1
FROM cte
WHERE n < quantity
)
SELECT cart_id, category_name, 1 as quantity
FROM cte;
EDIT:
You can join in a list of quantities -- easier if you have a tally table of some sort:
SELECT ccd.cart_id, mp.category_name, 1 as quantity
FROM customer_orders co JOIN
customer_cart_dtls ccd
ON co.order_cart = ccd.cart_id JOIN
merchant_products mp
ON ccd.product_id = mp.product_id JOIN
(SELECT 1 as n UNION ALL
SELECT 2 as n UNION ALL
SELECT 3 as n UNION ALL
SELECT 4 as n UNION ALL
SELECT 5 as n
) n
ON n.n <= ccd.quantity;
You can also construct the table using variables from an existing table (if it is big enough):
(select (@rn := @rn + 1) as n
from customer_orders cross join
(select @rn := 0) params
limit 100 -- say that 100 is big enough
) n
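The tally-table join can be tried on a cut-down example. This is a sketch only, run through Python's sqlite3 module as a stand-in for MySQL (an assumption); `cart_lines` and `numbers` are hypothetical tables standing in for the joined order data and the list of quantities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, already-joined cart rows: two cleaning lines and one snack.
CREATE TABLE cart_lines (cart_id INTEGER, category_name TEXT, quantity INTEGER);
INSERT INTO cart_lines VALUES (1, 'cleaning', 1), (1, 'cleaning', 2), (1, 'snacks', 1);
-- Tally table: one row per integer 1..5.
CREATE TABLE numbers (n INTEGER PRIMARY KEY);
INSERT INTO numbers VALUES (1), (2), (3), (4), (5);
""")

# Each cart line matches as many tally rows as its quantity,
# so a line with quantity 2 comes out as two rows of quantity 1.
expanded = conn.execute("""
SELECT l.cart_id, l.category_name, 1 AS quantity
FROM cart_lines l
JOIN numbers t ON t.n <= l.quantity
ORDER BY l.category_name
""").fetchall()

for row in expanded:
    print(row)
```

The three input lines become four output rows. One thing to watch: a quantity larger than the biggest number in the tally table is silently cut short, so the tally table has to cover the largest quantity that can occur.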
Are you trying to count how many items come from each category by splitting every item into an individual row and then using COUNT? If so, I don't think you necessarily need to go down that route. It will likely be a lot easier to simply use the SUM aggregate function after grouping by category_name. It might look something like this:
SELECT mp.category_name, SUM(ccd.quantity)
FROM customer_orders AS co
JOIN customer_cart_dtls AS ccd ON co.order_cart = ccd.cart_id
JOIN merchant_products AS mp ON ccd.product_id = mp.product_id
GROUP BY mp.category_name
If you also want to see cart IDs, just add the appropriate columns to your SELECT and GROUP BY clauses.

mysql retrieving unique id from table with division

In a table like this
ID | Category | Value
1 Device Computer
1 Location 1st Floor
2 Device Phone
2 Type Voip
2 Location 1st Floor
3 Device Computer
3 Location 2nd Floor
How do I get the ID of the rows where device='computer' and location='1st Floor'? The query is created programmatically, and there might be many of these criteria that together specify a single ID in one statement.
You can use a self-join query like this for your problem.
select a.ID from MYTABLE a, MYTABLE b where a.ID=b.ID and a.Category='Device'
and a.VALUE='Computer' and b.Category='Location' and b.VALUE='1st Floor';
If there are many categories like this, then you should split the table as below.
TABLES :
Category with columns (CATOGORY_ID, CATOGORY)
Value with columns (VALUE_ID, VALUE)
MYTABLE with columns (ID, CATOGORY_ID, VALUE_ID)
Then you should use a join query.
select Distinct a.id
from myTable a inner join myTable b
on a.Id = b.Id
Where a.value = 'Computer' And b.value = '1st Floor' and a.Category = 'Device' and b.Category = 'Location'
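Since the question says the query is built programmatically with a varying number of criteria, an alternative to adding one self join per criterion is to filter on all criteria at once and keep the IDs that matched every one of them (often called relational division). The sketch below is one way to do that, not the approach from the answers above. It uses Python's sqlite3 module as a stand-in for MySQL, the helper `ids_matching` is hypothetical, and it assumes each criterion names a different category.

```python
import sqlite3

def ids_matching(conn, criteria):
    """Return the IDs that satisfy every (category, value) pair in criteria.

    Assumes each pair names a different category.
    """
    if not criteria:
        return []
    # One "(Category = ? AND Value = ?)" clause per criterion, OR-ed together.
    where = " OR ".join("(Category = ? AND Value = ?)" for _ in criteria)
    sql = (
        "SELECT ID FROM MYTABLE WHERE " + where +
        " GROUP BY ID HAVING COUNT(DISTINCT Category) = ? ORDER BY ID"
    )
    # Flatten the pairs into bind parameters, then add the required match count.
    params = [p for pair in criteria for p in pair] + [len(criteria)]
    return [row[0] for row in conn.execute(sql, params)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MYTABLE (ID INTEGER, Category TEXT, Value TEXT)")
conn.executemany("INSERT INTO MYTABLE VALUES (?, ?, ?)", [
    (1, "Device", "Computer"), (1, "Location", "1st Floor"),
    (2, "Device", "Phone"), (2, "Type", "Voip"), (2, "Location", "1st Floor"),
    (3, "Device", "Computer"), (3, "Location", "2nd Floor"),
])

print(ids_matching(conn, [("Device", "Computer"), ("Location", "1st Floor")]))  # prints [1]
```

The statement text grows by one OR clause per criterion while the values travel as bind parameters, so the same helper handles two criteria or twenty.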

SQL subquery to return MIN of a column and corresponding values from another column

I'm trying to query, for each student who is not currently expelled:
1) number of courses passed,
2) the earliest course passed,
3) time taken to pass the first course.
The tricky part here is 2). I constructed a sub-query by mapping the course table onto itself but restricting matches only to datepassed=min(datepassed). The query appears to work for a very small sample, but when I try to apply it to my full data set (which would return ~1 million records) the query takes impossibly long to execute (I left it for >2 hours and it still wouldn't complete).
Is there a more efficient way to do this? Appreciate all your help!
Query:
SELECT
S.id,
COUNT(C.course) as course_count,
C2.course as first_course,
DATEDIFF(MIN(C.datepassed),S.dateenrolled) as days_to_first
FROM student S
LEFT JOIN course C
ON C.studentid = S.id
LEFT JOIN (SELECT * FROM course GROUP BY studentid HAVING datepassed IN (MIN(datepassed))) C2
ON C2.studentid = C.studentid
WHERE YEAR(S.dateenrolled)=2013
AND S.id NOT IN (SELECT id FROM expelled)
GROUP BY S.id
ORDER BY S.id
Student table
id status dateenrolled
1 graduated 1/1/2013
3 graduated 1/1/2013
Expelled table
id dateexpelled
2 5/1/2013
Course table
studentid course datepassed
1 courseA 5/1/2014
1 courseB 1/1/2014
1 courseC 2/1/2014
1 courseD 3/1/2014
3 courseA 1/1/2014
3 courseB 2/1/2014
3 courseC 3/1/2014
3 courseD 4/1/2014
3 courseE 5/1/2014
SELECT id, course_count, days_to_first, C2.course first_course
FROM (
SELECT S.id, COUNT(C.course) course_count,
DATEDIFF(MIN(datepassed),S.dateenrolled) as days_to_first,
MIN(datepassed) min_datepassed
FROM student S
LEFT JOIN course C ON C.studentid = S.id
WHERE S.dateenrolled BETWEEN '2013-01-01' AND '2013-12-31'
AND S.id NOT IN (SELECT id FROM expelled)
GROUP BY S.id
) t1 LEFT JOIN course C2
ON C2.studentid = t1.id
AND C2.datepassed = t1.min_datepassed
ORDER BY id
I would try something like:
SELECT s.id, f.course,
COALESCE( DATEDIFF( c.first_pass,s.dateenrolled), 0 ) AS days_to_pass,
COALESCE( c.num_courses, 0 ) AS courses
FROM student s
LEFT JOIN
( SELECT studentid, MIN(datepassed) AS first_pass, COUNT(*) AS num_courses
FROM course
GROUP BY studentid ) c
ON s.id = c.studentid
LEFT JOIN course f -- LEFT JOIN so students with no passed courses are kept
ON c.studentid = f.studentid AND c.first_pass = f.datepassed
LEFT JOIN expelled e
ON s.id = e.id
WHERE s.dateenrolled BETWEEN '2013-01-01' AND '2013-12-31'
AND e.id IS NULL
This query assumes a student can pass only one course on a given day; otherwise you can get more than one row for a student, as it's possible to have several first courses.
For performance it would help to have an index on dateenrolled in the student table and a composite index on (studentid, datepassed) in the course table.
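The derived-table approach and the suggested indexes can be checked against the question's sample data. The sketch below uses Python's sqlite3 module as a stand-in for MySQL (an assumption), so `julianday()` replaces MySQL's `DATEDIFF()`. The index names are hypothetical, and the sample dates are read as M/D/YYYY and stored in ISO format, which is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER PRIMARY KEY, status TEXT, dateenrolled TEXT);
CREATE TABLE expelled (id INTEGER, dateexpelled TEXT);
CREATE TABLE course (studentid INTEGER, course TEXT, datepassed TEXT);
-- Indexes suggested above (names are hypothetical).
CREATE INDEX idx_student_enrolled ON student (dateenrolled);
CREATE INDEX idx_course_student_date ON course (studentid, datepassed);
INSERT INTO student VALUES (1, 'graduated', '2013-01-01'), (3, 'graduated', '2013-01-01');
INSERT INTO expelled VALUES (2, '2013-05-01');
INSERT INTO course VALUES
  (1, 'courseA', '2014-05-01'), (1, 'courseB', '2014-01-01'),
  (1, 'courseC', '2014-02-01'), (1, 'courseD', '2014-03-01'),
  (3, 'courseA', '2014-01-01'), (3, 'courseB', '2014-02-01'),
  (3, 'courseC', '2014-03-01'), (3, 'courseD', '2014-04-01'),
  (3, 'courseE', '2014-05-01');
""")

# Aggregate course once per student, then join back to course on
# (studentid, first_pass) to pick up the name of the earliest course.
result = conn.execute("""
SELECT s.id,
       c.num_courses,
       f.course AS first_course,
       CAST(julianday(c.first_pass) - julianday(s.dateenrolled) AS INTEGER) AS days_to_first
FROM student s
LEFT JOIN (
  SELECT studentid, MIN(datepassed) AS first_pass, COUNT(*) AS num_courses
  FROM course
  GROUP BY studentid
) c ON c.studentid = s.id
LEFT JOIN course f ON f.studentid = c.studentid AND f.datepassed = c.first_pass
LEFT JOIN expelled e ON e.id = s.id
WHERE s.dateenrolled BETWEEN '2013-01-01' AND '2013-12-31'
  AND e.id IS NULL
ORDER BY s.id
""").fetchall()

for row in result:
    print(row)
```

The course table is grouped only once, instead of once per joined row as in the original HAVING sub-query, and the join back on (studentid, datepassed) is the lookup the composite index is meant to serve.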