How can I join these MYSQL tables? - mysql

I have 2 tables. Table A contains a list of people who booked for an event; table B has a list of the people the booker from table A brings with him/her. Both tables have many columns with unique data that I need to do certain calculations on in PHP, and as of now I do so by querying the tables with a recursive PHP function. I want to simplify the PHP and reduce the number of queries that come from this recursive function by writing better MySQL queries, but I'm kind of stuck.
Because the table has way too many columns, I will give an excerpt of table A instead:
booking_id | A_customer | A_insurance
1 | 134 | 4
Excerpt of table B:
id | booking_id | B_insurance
1 | 1 | 0
2 | 1 | 1
3 | 1 | 1
4 | 1 | 3
The booking_id in table A is unique and set to auto increment; the booking_id in table B can occur many times (depending on how many guests the client from table A brings with him). Let's say I want to know every selected insurance from customer 134 and his guests; then I want the output like this:
booking_id | insurance
1 | 4
1 | 0
1 | 1
1 | 1
1 | 3
I have tried a couple of joins and this is the closest I've come yet; unfortunately it fails to show the row from A and only shows the matching rows in B.
SELECT a.booking_id,a.A_customer,a.A_insurance,b.booking_id,b.insurance FROM b INNER JOIN a ON (b.booking_id = a.booking_id) WHERE a.booking_id = 134
Can someone point me in the right direction?
Please note: I have altered the table and column names for stackoverflow so it's easy for you guys to read, so it's possible that there is a typo that would break the query in it right now.

I think you need a union all for this:
select a.booking_id, a.A_insurance as insurance
from a
where a.A_customer = 134
union all
select b.booking_id, b.B_insurance
from a join
b
on a.booking_id = b.booking_id
where a.A_customer = 134;
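To see why UNION ALL rather than plain UNION matters here, the query can be tried on the sample rows from the question. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, since the thread is about MySQL), and the helper name `insurances` is made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; UNION / UNION ALL behave the same way.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (booking_id INTEGER PRIMARY KEY, A_customer INTEGER, A_insurance INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, booking_id INTEGER, B_insurance INTEGER);
    INSERT INTO a VALUES (1, 134, 4);
    INSERT INTO b VALUES (1, 1, 0), (2, 1, 1), (3, 1, 1), (4, 1, 3);
""")

def insurances(customer_id, keep_duplicates=True):
    """Return (booking_id, insurance) rows for a booker and all guests."""
    union = "UNION ALL" if keep_duplicates else "UNION"
    sql = f"""
        SELECT a.booking_id, a.A_insurance AS insurance
        FROM a
        WHERE a.A_customer = ?
        {union}
        SELECT b.booking_id, b.B_insurance
        FROM a JOIN b ON a.booking_id = b.booking_id
        WHERE a.A_customer = ?
    """
    return conn.execute(sql, (customer_id, customer_id)).fetchall()

# UNION ALL keeps both guests with insurance 1; UNION merges them into one row.
print(sorted(insurances(134)))
print(sorted(insurances(134, keep_duplicates=False)))
```

With the sample data, the UNION ALL version returns the five rows the question asks for, while the UNION version returns only four because the two identical guest rows collapse.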

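The simplest way I can think of to achieve this is to use a UNION ALL (a plain UNION would merge guests that happen to have the same insurance value):
SELECT booking_id, A_insurance insurance
FROM A
WHERE A_customer = 134
UNION ALL
SELECT booking_id, B_insurance insurance
FROM B
WHERE booking_id IN (SELECT booking_id FROM A WHERE A_customer = 134)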

If my understanding of your issue is right, this should give you the result you need:
SELECT a.booking_id, a.A_insurance AS insurance FROM a WHERE a.A_customer = 134
UNION ALL
SELECT a.booking_id, b.B_insurance FROM b INNER JOIN a ON (b.booking_id = a.booking_id) WHERE a.A_customer = 134

Related

Combine MySQL SELECT COUNTs

I have 3 MySQL queries I'd like to combine as 1
SELECT pic FROM active,
SELECT pic FROM deleted,
SELECT alt_pic FROM active where alt_pic!=''
I've managed to get the first 2 as one
SELECT pic FROM active UNION SELECT pic FROM deleted
I think I've partially gotten through combining all 3 except I don't know where exactly to insert the 3rd query
SELECT COUNT(*) FROM (SELECT pic FROM active UNION SELECT pic FROM deleted)t
I am just studying for fun. If I am somehow breaking convention or introduce some security risks, please don't get mad :)
Edit 1: Newbie doesn't know, thanks Mureinik and Strawberry for pointing it out :)
alt_pic is just a very optional field, my table active has about 300+ rows but only 8 alt_pic fields filled
active
ID | name | pic | alt_pic
1 | Peter | pic5.jpg | alt1.jpg
2 | Mark | pic4.jpg | NULL
3 | John | pic3.jpg | alt2.jpg
deleted
ID | pic
1 | pic2.jpg
2 | pic1.jpg
The result I'd like to have is
pic_count | alt_pic_count
5 | 2
I'd perform an aggregate query on each table, and then, since they both return just one row, cross join them. Note you don't need a third query with a condition on alt_pic - since count ignores nulls, you could apply it directly to that column.
SELECT pcount + dcount AS pic_count, acount AS alt_pic_count
FROM (SELECT COUNT(pic) AS pcount, COUNT(alt_pic) AS acount
FROM active) a
CROSS JOIN (SELECT COUNT(*) AS dcount
FROM deleted) d
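As a quick check of the cross-join approach, here is a small sketch that runs the same query on the sample rows from the question. It uses Python's sqlite3 module in place of MySQL (an assumption made only so the example is self-contained); COUNT(column) skips NULLs in both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE active (ID INTEGER PRIMARY KEY, name TEXT, pic TEXT, alt_pic TEXT);
    CREATE TABLE deleted (ID INTEGER PRIMARY KEY, pic TEXT);
    INSERT INTO active VALUES
        (1, 'Peter', 'pic5.jpg', 'alt1.jpg'),
        (2, 'Mark',  'pic4.jpg', NULL),
        (3, 'John',  'pic3.jpg', 'alt2.jpg');
    INSERT INTO deleted VALUES (1, 'pic2.jpg'), (2, 'pic1.jpg');
""")

# Each derived table yields exactly one row, so the cross join yields one row too.
counts = conn.execute("""
    SELECT pcount + dcount AS pic_count, acount AS alt_pic_count
    FROM (SELECT COUNT(pic) AS pcount, COUNT(alt_pic) AS acount
          FROM active) a
    CROSS JOIN (SELECT COUNT(*) AS dcount
                FROM deleted) d
""").fetchone()

print(counts)  # (5, 2)
```

Note that COUNT(alt_pic) only skips NULLs. If some rows hold an empty string instead of NULL, something like COUNT(NULLIF(alt_pic, '')) would be needed to leave those out as well.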

How can I count rows in a 1:N:N relation in a faster way?

This question is a bit complicated to me, and I can't explain it in one sentence so the title may seem quite ambiguous.
I have 3 tables in my MySQL database, their structure is shown below:
word_list (5 million rows)
+-----+--------+
| wid | word |
+-----+--------+
| 1 | foo |
| 2 | bar |
| 3 | hello |
+-----+--------+
paper_word_relation (10 million rows)
+-----+-------+
| pid | word |
+-----+-------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 2 | 1 |
| 2 | 3 |
+-----+-------+
paper_citation_relation (80K rows)
+----------+--------+
| pid_from | pid_to |
+----------+--------+
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 2 | 1 |
| 2 | 3 |
+----------+--------+
For each word W in the list, I want to find out how many papers contain W and cite papers that also contain W.
I use two inner joins to do this, but it is extremely slow when the word is popular (above 50s), although quite fast when the word is rarely used (below 0.1s). Here is my code:
SELECT COUNT(*) FROM (
SELECT a.pid_from, a.pid_to, b.word FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2) AS d
How can I do this faster? Is my query not efficient enough, or is the amount of data the problem?
The only solution I can come up with is to delete the words that occur fewer than 2 times in the paper_word_relation table. (About 4 million words occur only once.)
Thanks!
If you are only concerned with getting the Count, you should not be first getting the results into a Derived Table, and then Count the rows out. This may create unnecessary temporary tables storing lots of data in-memory. You can directly count the number of rows.
I also think that you need to count unique number of papers. Because of Many-to-Many relationships in paper_citation_relation table, duplicate rows may be coming for a single paper.
SELECT COUNT(DISTINCT a.pid_from)
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
WHERE b.word = 2 AND c.word = 2
For performance, you will need following indexing:
Composite Index on (pid_from, pid_to) in the paper_citation_relation table.
Composite Index on (pid, word) in the paper_word_relation table.
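The indexes and the difference between COUNT(*) and COUNT(DISTINCT ...) can be sketched on the question's sample rows. Assumptions in this sketch: Python's sqlite3 stands in for MySQL, the index names are invented, and one extra row (paper 3 containing word 1) is added so that the two counts actually differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE paper_word_relation (pid INTEGER, word INTEGER);
    CREATE TABLE paper_citation_relation (pid_from INTEGER, pid_to INTEGER);
    INSERT INTO paper_word_relation VALUES (1,1),(1,2),(1,3),(2,1),(2,3);
    INSERT INTO paper_word_relation VALUES (3,1);  -- hypothetical row, not in the question
    INSERT INTO paper_citation_relation VALUES (1,2),(1,3),(1,4),(2,1),(2,3);
    -- Composite indexes as suggested above; the names are made up.
    CREATE INDEX idx_citation_from_to ON paper_citation_relation (pid_from, pid_to);
    CREATE INDEX idx_word_pid_word ON paper_word_relation (pid, word);
""")

# Shared FROM/WHERE part of both queries.
JOINS = """
    FROM paper_citation_relation AS a
    INNER JOIN paper_word_relation AS b ON a.pid_from = b.pid
    INNER JOIN paper_word_relation AS c ON a.pid_to = c.pid
    WHERE b.word = ? AND c.word = ?
"""

def count_citations(word):
    """Number of (citing, cited) pairs where both papers contain the word."""
    return conn.execute("SELECT COUNT(*)" + JOINS, (word, word)).fetchone()[0]

def count_citing_papers(word):
    """Number of distinct citing papers with at least one such citation."""
    return conn.execute(
        "SELECT COUNT(DISTINCT a.pid_from)" + JOINS, (word, word)
    ).fetchone()[0]

print(count_citations(1), count_citing_papers(1))
```

For word 1, papers 1 and 2 each cite two papers that also contain the word, so the pair count is 4 while the distinct count of citing papers is 2. Which of the two numbers is wanted depends on how the question is read.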
We may also possibly optimize the query further by reducing one join and use conditional AND/OR based filtering in HAVING. You will need to benchmark it though.
SELECT COUNT(*)
FROM (
SELECT a.pid_from
FROM paper_citation_relation AS a
INNER JOIN paper_word_relation AS b
ON (a.pid_from = b.pid OR
a.pid_to = b.pid)
GROUP BY a.pid_from
HAVING SUM(a.pid_from = b.pid AND b.word = 2) AND
SUM(a.pid_to = b.pid AND b.word = 2)
) AS dt
After the first 1:n join you get the same pid_to multiple times and your next join is no longer 1:n but n:m, creating a possibly huge intermediate result before the final DISTINCT. It's similar to a CROSS JOIN and it's getting worse for popular words, e.g. 10*10 vs. 1000*1000 rows.
You must remove the duplicates before the join; this should return the same number as @MadhurBhaiya's answer:
SELECT Count(*) -- no more DISTINCT needed
FROM
(
SELECT DISTINCT cr.pid_to -- reducing m to 1
FROM paper_citation_relation AS cr
JOIN paper_word_relation AS wr
ON cr.pid_from = wr.pid
WHERE wr.word = 2
) AS dt
JOIN paper_word_relation AS wr
ON dt.pid_to = wr.pid -- 1:n join again
WHERE wr.word = 2
If you want to count the number of papers which have been cited you need to get a distinct list of pid (either pid_from or pid_to) from paper_citation_relation first and then join to the specific word.
SELECT Count(*)
FROM
( -- get a unique list of cited or citing papers
SELECT pid_from AS pid -- citing
FROM paper_citation_relation
UNION -- DISTINCT by default
SELECT pid_to -- cited
FROM paper_citation_relation
) AS dt
JOIN paper_word_relation AS wr
ON wr.pid = dt.pid
WHERE wr.word = 2 -- now check for the searched word
The number returned by this might be slightly higher (it counts a paper regardless if cited or citing).

MySQL intersection of two tables

I need to implement a function which returns all the networks the installation is not part of.
Following is my table and for example if my installation id is 1 and I need all the network ids where the installation is not part of then the result will be only [9].
network_id | installation_id
-------------------------------
1 | 1
3 | 1
2 | 1
2 | 2
9 | 2
2 | 3
I know this could be solved with a join query but I'm not sure how to implement it for the same table. This is what I've tried so far.
select * from network_installations where installation_id = 1;
network_id | installation_id
-------------------------------
1 | 1
2 | 1
3 | 1
select * from network_installations where installation_id != 1;
network_id | installation_id
-------------------------------
9 | 2
2 | 2
2 | 3
The intersection of the two tables would give the expected answer, i.e. [9]. But although we have UNION, INTERSECT is not present in MySQL. A solution for finding the intersection of the above two queries, or a tip on implementing it with a single query using a join, would be much appreciated.
The best way to do this is to use a network table (which I presume exists):
select n.*
from network n
where not exists (select 1
from network_installations ni
where ni.network_id = n.network_id and
ni.installation_id = 1
);
If, somehow, you don't have a network table, you can replace the from clause with:
from (select distinct network_id from network_installations) n
EDIT:
You can do this in a single query with no subqueries, but a join is superfluous. Just use group by:
select ni.network_id
from network_installations ni
group by ni.network_id
having sum(ni.installation_id = 1) = 0;
The having clause counts the number of matches for the given installation for each network id. The = 0 is saying that there are none.
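Both variants (GROUP BY with HAVING, and NOT EXISTS over a distinct list of network ids) can be tried on the rows from the question. This is a rough sketch using Python's sqlite3 as a stand-in for MySQL; the function names are invented for illustration, and the table is called network_installations as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE network_installations (network_id INTEGER, installation_id INTEGER);
    INSERT INTO network_installations VALUES (1,1),(3,1),(2,1),(2,2),(9,2),(2,3);
""")

def networks_without_groupby(installation_id):
    """Networks with zero rows for the installation (GROUP BY / HAVING)."""
    rows = conn.execute("""
        SELECT ni.network_id
        FROM network_installations ni
        GROUP BY ni.network_id
        HAVING SUM(ni.installation_id = ?) = 0
        ORDER BY ni.network_id
    """, (installation_id,)).fetchall()
    return [r[0] for r in rows]

def networks_without_not_exists(installation_id):
    """Same result via NOT EXISTS over the distinct network ids."""
    rows = conn.execute("""
        SELECT n.network_id
        FROM (SELECT DISTINCT network_id FROM network_installations) n
        WHERE NOT EXISTS (SELECT 1
                          FROM network_installations ni
                          WHERE ni.network_id = n.network_id
                            AND ni.installation_id = ?)
        ORDER BY n.network_id
    """, (installation_id,)).fetchall()
    return [r[0] for r in rows]

print(networks_without_groupby(1))     # [9]
print(networks_without_not_exists(1))  # [9]
```

The comparison `ni.installation_id = ?` evaluates to 1 or 0, so the SUM is the number of rows in that network belonging to the given installation.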
Another solution using OUTER JOIN:
SELECT t1.network_id, t1.installation_id, t2.network_id, t2.installation_id
FROM tab t1 LEFT JOIN tab t2
ON t1.network_id = t2.network_id AND t2.installation_id = 1
WHERE t2.network_id IS NULL
You can check at http://www.sqlfiddle.com/#!9/4798d/2
select *
from network_installations
where network_id not in
(select network_id
from network_installations
where installation_id = 1
group by network_id )

MYSQL - How to increment fields in one row with values from another row

I have a table that we'll call 'Sales' with 4 columns: uid, date, count and amount. I want to increment the count and amount values for one row with the count/amount values from a different row in that table. Example:
UID | Date | Count | Amount|
1 | 2013-06-20 | 1 | 500 |
2 | 2013-06-24 | 2 | 1000 |
Ideal results would be uid 2's count/amount values being incremented by uid 1's values:
UID | Date | Count | Amount|
1 | 2013-06-20 | 1 | 500 |
2 | 2013-06-24 | 3 | 1500 |
Please note that my company's database is an older version of MYSQL (3.something) so subqueries are not possible. I am curious to know if this is possible outside of doing an "update sales set count = count + 1" and likewise for the amount columns. I have a lot of rows to update and incrementing the values individually is quite time consuming if you can imagine. Thanks for any help or suggestions!
Without using a subselect you may be able to do a JOIN. Not sure from your description on what columns you are linking the rows to each other to decide which to update, but the following might give you the idea
UPDATE Sales a
INNER JOIN Sales b
ON ..........
SET a.Count = a.Count + b.Count,
a.Amount = a.Amount + b.Amount
However not sure if this works on archaic versions of MySQL
If you are just updating row 2 based on the values in row 1 then the following should do it
UPDATE Sales a
INNER JOIN Sales b
ON a.uid = 2 AND b.uid = 1
SET a.Count = a.Count + b.Count,
a.Amount = a.Amount + b.Amount
Most of the time, subqueries could be rewritten as join ... and even MySQL 3.23 has multiple table UPDATE
Something like this would probably do the trick ... but I am unable to test it (since you're the only one still using such an old version of MySQL ;)
UPDATE Sales AS S1, Sales AS S2
SET S1.`count` = S1.`count` + S2.`count`, S1.Amount = S1.Amount + S2.Amount
WHERE S1.uid = 2 AND S2.uid = 1
For simplicity, I explicitly set S1.uid to "2" and S2.uid to "1" here. If that works for this row, you should be able to use a WHERE clause that corresponds to your specific needs.

How to filter duplicates within row using Distinct/group by with JOINS

For simplicity, I will give a quick example of what I am trying to achieve:
Table 1 - Members
ID | Name
--------------------
1 | John
2 | Mike
3 | Sam
Table 2 - Member_Selections
ID | planID
--------------------
1 | 1
1 | 2
1 | 1
2 | 2
2 | 3
3 | 2
3 | 1
Table 3 - Selection_Details
planID | Cost
--------------------
1 | 5
2 | 10
3 | 12
When I run my query, I want to return the sum of all member selections grouped by member. The issue I face, however (see the table 2 data), is that some members may have duplicate information in the system by mistake. While we do our best to filter this data up front, sometimes it slips through the cracks, so when I make the necessary calls to the system to pull information, I also want to filter this data.
the results SHOULD show:
Results Table
ID | Name | Total_Cost
-----------------------------
1 | John | 15
2 | Mike | 22
3 | Sam | 15
but instead have John as $20 because he has plan ID #1 inserted twice by mistake.
My query is currently:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
Adding DISTINCT s.planID filters the results incorrectly as it will only show a single PlanID 1 sold (even though members 1 and 3 bought it).
Any help is appreciated.
EDIT
There is also another table I forgot to mention which is the agent table (the agent who sold the plans to members).
the final group by statement groups ALL items sold by the agent ID (which turns the final results into a single row).
Perhaps the simplest solution is to put a unique composite key on the member_selections table:
alter table member_selections add unique key ms_key (ID, planID);
which would prevent any records from being added where the combination of ID/planID already exists elsewhere in the table. That'd allow only a single (1,1) row.
comment followup:
Just saw your comment about the 'alter ignore...'. That'd work fine, but you'd still be left with the bad duplicates in the table. I'd suggest adding the unique key, then manually cleaning up the table. The query I put in the comments should find all the duplicates for you, which you can then weed out by hand. Once the table's clean, there'll be no need for the duplicate-handling version of the query.
Use UNIQUE keys to prevent accidental duplicate entries. This will eliminate the problem at the source, instead of when it starts to show symptoms. It also makes later queries easier, because you can count on having a consistent database.
What about:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN
(select distinct ID, PlanID from member_selections) s
USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
By the way, is there a reason you don't have a primary key on member_selections that will prevent these duplicates from happening in the first place?
You can add a group by clause into the inner query, which groups by all three columns, basically returning only unique rows. (I also changed 'premium' to 'cost' to match your example tables, and dropped the agent part)
SELECT
sq.ID,
sq.name,
SUM(sq.Cost) AS total_cost
FROM
(
SELECT
m.id,
m.name,
g.Cost
FROM
members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
GROUP BY
m.ID,
m.NAME,
g.Cost
) sq
group by
sq.ID,
sq.NAME
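The idea of removing duplicate selections before summing can be checked against the example tables from the question. This is a minimal sketch under a few assumptions: Python's sqlite3 stands in for MySQL, the agent table is left out, and the duplicates are removed with a DISTINCT over (ID, planID) rather than by grouping on the cost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE member_selections (ID INTEGER, planID INTEGER);
    CREATE TABLE selection_details (planID INTEGER PRIMARY KEY, Cost INTEGER);
    INSERT INTO members VALUES (1,'John'),(2,'Mike'),(3,'Sam');
    -- (1,1) appears twice: the accidental duplicate from the question.
    INSERT INTO member_selections VALUES (1,1),(1,2),(1,1),(2,2),(2,3),(3,2),(3,1);
    INSERT INTO selection_details VALUES (1,5),(2,10),(3,12);
""")

# Collapse duplicate (ID, planID) pairs first, then join to the costs and sum.
totals = conn.execute("""
    SELECT m.ID, m.Name, SUM(g.Cost) AS total_cost
    FROM members m
    INNER JOIN (SELECT DISTINCT ID, planID FROM member_selections) s
            ON s.ID = m.ID
    INNER JOIN selection_details g
            ON g.planID = s.planID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

print(totals)  # [(1, 'John', 15), (2, 'Mike', 22), (3, 'Sam', 15)]
```

Deduplicating on (ID, planID) instead of on the cost avoids merging two different plans that happen to cost the same for one member.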