Deleting cross referenced data - mysql

I have the following MySQL table:
id rid
----- ------
1 2
2 1
2 3
3 2
1 3
3 1
I want to change this so only one row per relation exists.
e.g:
id rid
----- ------
1 2
2 3
1 3

If you always have pairs (as in your example):
delete from table
where id > rid;
This keeps the record where id is smaller.
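To see the effect of this delete on the question's data, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name pairs is a hypothetical replacement for the placeholder name table.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; "pairs" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (id INTEGER, rid INTEGER)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    [(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1)],
)

# Every relation is stored in both directions, so drop the reversed copy.
conn.execute("DELETE FROM pairs WHERE id > rid")

remaining = conn.execute("SELECT id, rid FROM pairs ORDER BY id, rid").fetchall()
print(remaining)  # [(1, 2), (1, 3), (2, 3)]
```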
If there is the possibility that not all pairs exist, then:
delete t
from table t left outer join
(select least(id, rid) as lid, greatest(id, rid) as gid, count(*) as cnt
from table t2
group by least(id, rid), greatest(id, rid)
) t2
on least(t.id, t.rid) = t2.lid and greatest(t.id, t.rid) = t2.gid
where t.id > t.rid and t2.cnt > 1;
EDIT (explanation):
How does the second query work? Let me be honest, what I want to write is this:
delete t from table t
where id > rid and
exists (select 1 from table t2 where t2.id = t.rid and t2.rid = t.id
);
That is, I want to keep all records where id < rid. But I also want to keep all singleton records where id > rid, so a row with id > rid is deleted only when its reversed pair exists. I don't think MySQL allows this syntax, because the subquery in the where clause reads from the same table that is being deleted from.
Instead, the query in the answer counts the number of times that a pair exists, by looking at the smallest value and the largest value. For the data in the question, the result of the subquery is:
lid gid cnt
1 2 2
2 3 2
1 3 2
So, for all of these pairs, the row with id < rid is the one that is kept. If you had one more row, say (4, 1), the result would look like:
lid gid cnt
1 2 2
2 3 2
1 3 2
1 4 1
In this case, the first three pairs would keep the row with id < rid. The new row would also be kept, because its cnt is 1.
If you had duplicates in the table and a primary key, there would be a slight variation on the query that would do the same thing.
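The counting step and the singleton case can be checked with a small sketch. It is written against SQLite through Python's sqlite3 module, not MySQL, so several things are assumptions: the table name pairs is hypothetical, SQLite's two-argument MIN() and MAX() stand in for least() and greatest(), and the delete uses the EXISTS form because SQLite has no delete-with-join syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (id INTEGER, rid INTEGER)")  # hypothetical name
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?)",
    # The question's data plus the extra singleton row (4, 1).
    [(1, 2), (2, 1), (2, 3), (3, 2), (1, 3), (3, 1), (4, 1)],
)

# Count how often each unordered pair occurs.
# Two-argument MIN/MAX in SQLite behave like MySQL's least/greatest.
pair_counts = conn.execute(
    """
    SELECT MIN(id, rid) AS lid, MAX(id, rid) AS gid, COUNT(*) AS cnt
    FROM pairs
    GROUP BY MIN(id, rid), MAX(id, rid)
    ORDER BY lid, gid
    """
).fetchall()

# Delete a row only if it is the reversed copy AND its partner row exists.
conn.execute(
    """
    DELETE FROM pairs
    WHERE id > rid
      AND EXISTS (SELECT 1 FROM pairs AS p2
                  WHERE p2.id = pairs.rid AND p2.rid = pairs.id)
    """
)

remaining = conn.execute("SELECT id, rid FROM pairs ORDER BY id, rid").fetchall()
print(pair_counts)  # [(1, 2, 2), (1, 3, 2), (1, 4, 1), (2, 3, 2)]
print(remaining)    # [(1, 2), (1, 3), (2, 3), (4, 1)]
```

The singleton (4, 1) survives even though its id is larger than its rid, which is the case the simple delete would get wrong.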

Related

Retrieving data across two tables

I'm pretty new to MySQL and need to get data from a column where the id in another column of the same table matches the id in a second table and I'm not sure how to go about it.
I haven't tried anything yet as I'm too new to go about answering my own question, sorry.
So my first table looks like this
userid questionid score
-----------------------------
1 1 5
1 2 4
1 3 7
1 4 10
1 4 6
And my 2nd table looks like this
otherfields userid
---------------------
blah 1
blah 2 2
etc 3
you 4
get 5
the 6
idea 7
So what I need to do is select all the scores from table 1 where userid of table 1 matches the user id of table 2.
So you want to sum up the score for each user?
Then something like this could help:
SELECT t2.userid, t2.otherfields, SUM(t1.score) AS sum_score
FROM first_table t1
INNER JOIN second_table t2 ON t1.userid = t2.userid
GROUP BY t2.userid, t2.otherfields
First, you join the two tables on userid, which exists in both tables. Then you GROUP together all rows that belong to the same user (e.g. the five score rows of user 1), and finally you SUM up the scores of the rows in each group.
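A runnable sketch of the join, group, and sum steps, using the sample rows from the question. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL; the table names first_table and second_table come from the answer, and the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_table (userid INTEGER, questionid INTEGER, score INTEGER)")
conn.execute("CREATE TABLE second_table (otherfields TEXT, userid INTEGER)")
conn.executemany(
    "INSERT INTO first_table VALUES (?, ?, ?)",
    [(1, 1, 5), (1, 2, 4), (1, 3, 7), (1, 4, 10), (1, 4, 6)],
)
conn.executemany(
    "INSERT INTO second_table VALUES (?, ?)",
    [("blah", 1), ("blah 2", 2), ("etc", 3)],
)

# Join on userid, group the joined rows per user, then sum each group's scores.
totals = conn.execute(
    """
    SELECT t2.userid, t2.otherfields, SUM(t1.score) AS sum_score
    FROM first_table t1
    JOIN second_table t2 ON t1.userid = t2.userid
    GROUP BY t2.userid, t2.otherfields
    ORDER BY t2.userid
    """
).fetchall()
print(totals)  # [(1, 'blah', 32)]
```

Users 2 and 3 do not appear because they have no rows in first_table. If they should show up with an empty total, the join would have to start from second_table and be a LEFT JOIN.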

MySQL Putting in duplicate id's

I'm trying to run an UPDATE query that uses the same table and I'm getting an error saying "1093 - Table 'queues_monitor_times' is specified twice, both as a target for 'UPDATE' and as a separate source for data".
UPDATE queues_monitor_times
SET queue_id = IF((
SELECT id
FROM queues_monitor_times
INNER JOIN(
SELECT pcc_group, pcc, gds, queue, category, `name`
FROM queues_monitor_times
GROUP BY pcc_group, pcc, gds, queue, category, `name`
HAVING COUNT(id) > 1
)temp ON queues_monitor_times.pcc_group = temp.pcc_group AND
queues_monitor_times.pcc = temp.pcc AND
queues_monitor_times.gds = temp.gds AND
queues_monitor_times.queue = temp.queue AND
queues_monitor_times.category = temp.category AND
queues_monitor_times.`name` = temp.`name`), 1, id)
WHERE
id NOT IN (SELECT MIN(id) FROM queues_old GROUP BY pcc_group, pcc, gds, queue, category, `name`);
I ran the select query by itself and it showed all the rows that were duplicates, which is what I wanted. I want queue_id to be set with the lowest duplicate row's id if the row is a duplicate or the row id if it is not.
Example of what the query should do:
id dup_id name value
1 1 John 13
2 2 John 13
3 3 Sally 6
4 4 Frank 4
5 5 Sally 6
And after running the query it will turn into
id dup_id name value
1 1 John 13
2 1 John 13
3 3 Sally 6
4 4 Frank 4
5 3 Sally 6
Please advise and thank you for your help.
I was able to solve my problem. Thanks for all your help!
UPDATE queues_monitor_times
SET queue_id = (
SELECT
MIN(id)
FROM
queues_old
WHERE
queues_old.pcc_group = queues_monitor_times.pcc_group
AND queues_old.pcc = queues_monitor_times.pcc
AND queues_old.gds = queues_monitor_times.gds
AND queues_old.queue = queues_monitor_times.queue
AND queues_old.category = queues_monitor_times.category
AND queues_old.`name` = queues_monitor_times.`name`
GROUP BY pcc_group, pcc, gds, queue, category, `name`
HAVING COUNT(id) > 1)
WHERE
id NOT IN (SELECT MIN(id) FROM queues_old GROUP BY pcc_group, pcc, gds, queue, category, `name`);
For those who want to use this in the future: queues_old is a copy of queues_monitor_times with exactly the same data. Reading from the copy instead of from the table being updated is what avoids error 1093.
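The intended result can be reproduced on the small example table from the question. This is only a sketch under stated assumptions: it runs on SQLite through Python's sqlite3 module, the table name people is hypothetical, and duplicates are defined by name and value. SQLite lets the subquery read the table being updated, so no copy is needed here; MySQL rejects that with error 1093, which is why the solution above reads from a copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, dup_id INTEGER, name TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [(1, 1, "John", 13), (2, 2, "John", 13), (3, 3, "Sally", 6),
     (4, 4, "Frank", 4), (5, 5, "Sally", 6)],
)

# For every row, store the lowest id among the rows with the same name and value.
# A row without duplicates simply gets its own id back.
conn.execute(
    """
    UPDATE people
    SET dup_id = (SELECT MIN(p2.id)
                  FROM people AS p2
                  WHERE p2.name = people.name AND p2.value = people.value)
    """
)

dup_ids = conn.execute("SELECT id, dup_id FROM people ORDER BY id").fetchall()
print(dup_ids)  # [(1, 1), (2, 1), (3, 3), (4, 4), (5, 3)]
```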

Explaining results of a JOIN clause

I have mycash table in MySQL with the following data :
Test Case 1 :
When I run the following query,
SELECT t.id, SUM(prev.cash) AS cash_sum FROM mycash t JOIN mycash prev ON (t.id > prev.id)
I get :
id | cash_sum
2 | 1303.00
Summing the cash values of all the rows gives 1302, not 1303.
I change the comparison operator in the ON condition and get the following results:
Test Case 2 :
For ON (t.id < prev.id) , result is:
id | cash_sum
1 | 2603.00
Test Case 3 :
For ON (t.id >= prev.id) , result is:
id | cash_sum
1 | 2605.00
Test Case 4 :
For ON (t.id <= prev.id) , result is:
id | cash_sum
1 | 3905.00
What is the calculation behind each of the results? Step by step explanation would clarify them most.
The results are correct.
200 + 200 + 200 + 301 + 301 + 101 = 1303
Yes, I got the answer myself. It is quite simple.
For any comparison operator in the ON condition, each row in table t is compared with every row in table prev. Because the query uses SUM without a GROUP BY, all joined rows collapse into a single output row, and cash_sum is the sum of prev.cash over every joined row (the join can produce more rows than the 4 rows in the table).
Let me explain the test case 1:
SELECT t.id, SUM(prev.cash) AS cash_sum FROM mycash t JOIN mycash prev ON (t.id > prev.id)
We have to select those rows from table prev whose id is less than a particular id in table t.
From the first table, i.e. table t, when the id is 1, we get no matching rows from the second table, i.e. table prev. When the id is 2 in the first table, we get id 1 (cash = 200) from the second table. When the id is 3 in the first table, we get ids 1 and 2 (cash = 200 and 301 respectively) from the second table. When the id is 4 in the first table, we get ids 1, 2 and 3 (cash = 200, 301 and 101 respectively). So after adding all the cash values, cash_sum becomes 200 + 200 + 301 + 200 + 301 + 101 = 1303.
Other test cases can be explained in a similar way.
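The four test cases can be recomputed with a short sketch. The question's table data did not survive, so the rows below are an assumption: 200, 301 and 101 for ids 1 to 3 come from the explanation above, and 700 for id 4 is inferred from the stated total of 1302. The sketch runs on SQLite through Python's sqlite3 module, and join_sum is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mycash (id INTEGER PRIMARY KEY, cash INTEGER)")
# Assumed data: ids 1-3 are taken from the answer, id 4 is inferred from 1302.
conn.executemany(
    "INSERT INTO mycash VALUES (?, ?)",
    [(1, 200), (2, 301), (3, 101), (4, 700)],
)

def join_sum(op):
    """Sum prev.cash over every row produced by the self join for one operator."""
    if op not in {">", "<", ">=", "<="}:
        raise ValueError(f"unsupported operator: {op}")
    sql = f"SELECT SUM(prev.cash) FROM mycash t JOIN mycash prev ON t.id {op} prev.id"
    return conn.execute(sql).fetchone()[0]

# The individual joined rows behind test case 1: (t.id, prev.id, prev.cash).
rows_case_1 = conn.execute(
    "SELECT t.id, prev.id, prev.cash FROM mycash t "
    "JOIN mycash prev ON t.id > prev.id ORDER BY t.id, prev.id"
).fetchall()

print(rows_case_1)
for op in (">", "<", ">=", "<="):
    print(op, join_sum(op))
```

With this data the four sums match the values reported in the question, and the six joined rows for test case 1 are exactly the six terms of the addition above.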

common values b/w fields

This table lists user and item id's
user_id item_id
1 1
1 2
1 3
2 1
2 3
3 1
3 4
3 3
How can I run a query on this table to list all the items that are common between given users.
My guess is, this will need a self join, but I'm not sure.
I am trying this query, but it returns an error:
SELECT *
FROM recs 1
JOIN recs 2 ON 2.user_id='2' AND 2.item_id=1.item_id
WHERE 1.user_id='1'
Try using alias names that start with a letter:
SELECT r1.*
FROM recs r1
JOIN recs r2 ON r2.user_id='2' AND r2.item_id=r1.item_id
WHERE r1.user_id='1'
This returns
user_id item_id
------- -------
1 1
1 3
for your data. Demo on sqlfiddle.
Note: I kept single quotes in the query, because I assume that both IDs in your table are of character type. If that is not the case, remove single quotes around user ID values '1' and '2'.
I want it for n number of users ... I want the query to return all item_id's that are common among the users
SELECT DISTINCT(r1.item_id)
FROM recs r1
WHERE EXISTS (
SELECT *
FROM recs r2
WHERE r2.item_id=r1.item_id
AND r1.user_id <> r2.user_id
)
Demo #2.
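Note that the EXISTS query returns every item held by at least two different users, which coincides with "common to all users" for the sample data but is not the same thing in general. One possible way to require an item for every one of n given users is GROUP BY with HAVING COUNT(DISTINCT user_id) = n. The sketch below is an assumption, not the original answer: it runs on SQLite through Python's sqlite3 module, treats the ids as integers, and common_items is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recs (user_id INTEGER, item_id INTEGER)")
conn.executemany(
    "INSERT INTO recs VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 4), (3, 3)],
)

def common_items(user_ids):
    """Return the item ids that every one of the given users has."""
    ids = sorted(set(user_ids))  # ignore repeated user ids
    placeholders = ", ".join("?" for _ in ids)
    sql = f"""
        SELECT item_id
        FROM recs
        WHERE user_id IN ({placeholders})
        GROUP BY item_id
        HAVING COUNT(DISTINCT user_id) = ?
        ORDER BY item_id
    """
    # An item qualifies only if the number of distinct matching users
    # equals the number of users that were asked about.
    return [row[0] for row in conn.execute(sql, [*ids, len(ids)])]

print(common_items([1, 2, 3]))  # [1, 3]
print(common_items([3]))        # [1, 3, 4]
```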

Add a column to result, which is another SELECT with dependency

This one is a little bit tricky, at least to me to explain, so please, don't get mad if you don't get the point - it's likely caused by my poor explanation.
I want to get one more column from my main SELECT, holding the number of rows in another table that match the id of the main record.
So, imagine main table, which I am selecting from. I'll call it simply main.
What I want to select from main, basically is:
SELECT * FROM main ORDER BY c1 ASC LIMIT 5
Plus I need one extra column for each row returned, which says number of rows from side table, matching the id:
SELECT COUNT(*) FROM side WHERE m_id = main_id
Maybe an example will tell you a little bit more
main:
id data1 data2
----|-------|-------
1 aa ab
2 xx yy
3 az bz
side:
id m_id ...
-----|------|-----
1 1
2 2
3 1
4 1
5 3
6 2
7 1
8 1
9 2
expected result:
id data1 data2 num
----|-------|-------|------
1 aa ab 5
2 xx yy 3
3 az bz 1
A simple way to add the count is with a correlated subquery:
SELECT m.*,
(select count(*) from side s where s.m_id = m.main_id) as side_cnt
FROM main m
ORDER BY c1 ASC
LIMIT 5;
You can also do this by changing the from clause, i.e. joining side and grouping by the main row. The advantage of the correlated subquery is that it only affects the select part of the query.
You should be able to do it as a subquery:
SELECT m.*, ( SELECT COUNT(*) FROM side WHERE m_id = m.main_id ) as num FROM main m ORDER BY c1 ASC LIMIT 5
This basically runs a separate subquery for each row of the result, which counts the matching rows in side and shows that count in the "num" column.
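The correlated subquery can be checked against the example tables from the question. This sketch runs on SQLite through Python's sqlite3 module rather than MySQL. It assumes the key column of main is called id, as in the example table (the queries above call it main_id), and it orders by id because the column c1 is not part of the example data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE main (id INTEGER PRIMARY KEY, data1 TEXT, data2 TEXT)")
conn.execute("CREATE TABLE side (id INTEGER PRIMARY KEY, m_id INTEGER)")
conn.executemany(
    "INSERT INTO main VALUES (?, ?, ?)",
    [(1, "aa", "ab"), (2, "xx", "yy"), (3, "az", "bz")],
)
conn.executemany(
    "INSERT INTO side VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 1), (4, 1), (5, 3), (6, 2), (7, 1), (8, 1), (9, 2)],
)

# The subquery is evaluated once per main row and counts that row's matches in side.
result = conn.execute(
    """
    SELECT m.*,
           (SELECT COUNT(*) FROM side s WHERE s.m_id = m.id) AS num
    FROM main m
    ORDER BY m.id ASC
    LIMIT 5
    """
).fetchall()
print(result)  # [(1, 'aa', 'ab', 5), (2, 'xx', 'yy', 3), (3, 'az', 'bz', 1)]
```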