MySQL JOIN performance issue - mysql

I am trying to build a query whose result I will later insert into a table I'm building. I want the query to return 4 columns, which are 4 ids from different entities that relate to each other. I've got one table with the relations, stored as varchars, and four tables where I have already put the distinct occurrences, each with an ID:
TABLE RELATIONS:
A B C D
Sony Bravia 32" 1200€
JVC Whatever 15cm 200€
Samsung Galaxy 13" 500€
TABLE A:
id name
1 Sony
2 JVC
3 Samsung
TABLE B:
id name
1 Whatever
2 Galaxy
3 Bravia
TABLE C:
id name
1 13"
2 15cm
3 32"
TABLE D:
id name
1 200€
2 1200€
3 500€
Now, what I want to get with my query is:
QUERY RESULT:
A B C D
1 3 3 2
2 1 2 1
3 2 1 3
Firstly, I constructed this one:
SELECT DISTINCT A.id as A, B.id as B, C.id as C, D.id as D
FROM relations
INNER JOIN A ON A.name = relations.A
INNER JOIN B ON B.name = relations.B
INNER JOIN C ON C.name = relations.C
INNER JOIN D ON D.name = relations.D
It seems correct, but it takes far too long (maybe hours) to complete. The table sizes are (80, 65000, 1900, 15) rows for the 4 entities, and 65000 rows for the relations table.
If I perform just one of the joins it takes 15 ms, with two of them it takes 6-7 seconds, and with 3 or 4 the time increases exponentially. I think the JOIN solution might be overkill for my situation, as I only need to "translate" the strings...
I already created an index on every "name" field in the four entity tables, as well as indexes on relations.a, .b, .c and .d.
Curiously, it takes almost no time if I duplicate the a, b, c, d columns in the relations table and run 4 UPDATE queries that write the id into each duplicate field by matching its "parent"... but I'm sure there must be a better way to do that. Does anyone have an idea?
Many thanks!
EXPLAIN result http://www.redlanemedia.com/explain.png
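For reference, the sample data and the four-way join from the question can be reproduced with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (an assumption made only so the example runs anywhere), so it checks that the query returns the expected ids and says nothing about the timings described above.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE relations (A TEXT, B TEXT, C TEXT, D TEXT);
CREATE TABLE A (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE B (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE C (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE D (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO relations VALUES
  ('Sony', 'Bravia', '32"', '1200€'),
  ('JVC', 'Whatever', '15cm', '200€'),
  ('Samsung', 'Galaxy', '13"', '500€');
INSERT INTO A VALUES (1, 'Sony'), (2, 'JVC'), (3, 'Samsung');
INSERT INTO B VALUES (1, 'Whatever'), (2, 'Galaxy'), (3, 'Bravia');
INSERT INTO C VALUES (1, '13"'), (2, '15cm'), (3, '32"');
INSERT INTO D VALUES (1, '200€'), (2, '1200€'), (3, '500€');
""")

# The query from the question; ORDER BY 1 is added only to make the
# row order deterministic for the comparison below.
rows = conn.execute("""
SELECT DISTINCT A.id AS A, B.id AS B, C.id AS C, D.id AS D
FROM relations
INNER JOIN A ON A.name = relations.A
INNER JOIN B ON B.name = relations.B
INNER JOIN C ON C.name = relations.C
INNER JOIN D ON D.name = relations.D
ORDER BY 1
""").fetchall()

print(rows)  # [(1, 3, 3, 2), (2, 1, 2, 1), (3, 2, 1, 3)]
```

The output matches the QUERY RESULT table in the question, so the query is logically right and the remaining problem is purely one of execution plan and speed.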

Related

JOIN three tables not quite producing results needed

I have tried for a number of hours to get this. I am still quite new to mysql but have managed to achieve queries that I was impressed with after using the resources and examples I found. I am a bit stuck here. Apologies if I do not ask this very well.
Three tables that are used for managing categories and category membership within a project.
table a = project membership
id user_id project_id
== ======= ==========
1 1 10
2 1 12
3 3 45
4 5 12
table b = categories
id name project_id
== ==== ==========
1 cat1 10
2 cat4 12
3 cat8 45
table c = category members
id user_id_added category_id capability
== ============= =========== ==========
1 1 2 1
2 3 3 2
3 5 3 1
4 5 2 0
Required result
members of project 2
user_id category capability_in_category
======= ======== ======================
1 2 1
5 2 0
SELECT a.user_id
, c.capability
, b.id as category
FROM a
LEFT OUTER JOIN b
ON a.project_id = b.project_id
LEFT OUTER JOIN c
ON b.id = c.category_id
WHERE a.project_id = $project_id
AND c.category_id = $category_id;
It feels like I shouldn't need to join all three tables, but I do not see a way of joining the project membership table with the category members table without going through the category table (b). The query I am running nearly works, but the user capability is not returned correctly. I am using left outer joins because a member may not always be part of a category, but they still need to be shown as a member of the project. I have been trying various joins and subqueries, without success. I basically need a list of the members in the project and, if they are part of a category, the capability they have in that specific category. I feel there are potentially a few ways of doing this, but there is a gray area I am struggling to bridge.
The question is vague, so I might be solving the wrong problem, but if you want all members of a specific project listed (regardless of their capability), together with their capabilities in a specified category, then:
SELECT project_members.user_id
, category_members.category_id AS category
, category_members.capability AS capability
FROM project_members
LEFT OUTER JOIN categories
ON project_members.project_id = categories.project_id
LEFT OUTER JOIN category_members
ON categories.id = category_members.category_id
AND category_members.user_id_added = project_members.user_id
WHERE project_members.project_id = $project_id
AND (categories.id = $category_id OR categories.id IS NULL);
should get you that.
I altered three things compared to your original query:
I used descriptive table names, as they say more than "a, b, c".
I added the additional constraint category_members.user_id_added = project_members.user_id to the second join so as not to join category_members rows of a different user to a project_members record.
I loosened the WHERE condition so that members not having the desired capability are also displayed. category and capability will be NULL for those records.
As to your question regarding having to join the three tables the answer is yes, you need to do that.
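To check the three-table join against the sample rows from the question, here is a minimal sketch. It assumes the table names project_members, categories and category_members, uses Python's sqlite3 module with an in-memory database in place of MySQL, and binds the $project_id and $category_id placeholders as named parameters. Project 12 is chosen because category 2 belongs to it in the sample data; that choice is an assumption about what the question means by its required result.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_members (id INTEGER PRIMARY KEY, user_id INTEGER, project_id INTEGER);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT, project_id INTEGER);
CREATE TABLE category_members (id INTEGER PRIMARY KEY, user_id_added INTEGER,
                               category_id INTEGER, capability INTEGER);
INSERT INTO project_members VALUES (1, 1, 10), (2, 1, 12), (3, 3, 45), (4, 5, 12);
INSERT INTO categories VALUES (1, 'cat1', 10), (2, 'cat4', 12), (3, 'cat8', 45);
INSERT INTO category_members VALUES (1, 1, 2, 1), (2, 3, 3, 2), (3, 5, 3, 1), (4, 5, 2, 0);
""")

# Same shape as the answer's query; ORDER BY is added for a stable row order.
query = """
SELECT project_members.user_id,
       category_members.category_id AS category,
       category_members.capability AS capability
FROM project_members
LEFT OUTER JOIN categories
  ON project_members.project_id = categories.project_id
LEFT OUTER JOIN category_members
  ON categories.id = category_members.category_id
 AND category_members.user_id_added = project_members.user_id
WHERE project_members.project_id = :project_id
  AND (categories.id = :category_id OR categories.id IS NULL)
ORDER BY project_members.user_id
"""
rows = conn.execute(query, {"project_id": 12, "category_id": 2}).fetchall()

print(rows)  # [(1, 2, 1), (5, 2, 0)]
```

Because the user match sits in the join condition rather than in the WHERE clause, a project member without a row in category_members would still appear, with NULL in the last two columns.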

How to select records that match values in one column and don't in other column in MySql?

I'm trying to select MerchantIDs that are the same but have different Networks values, for example:
ID MerchantID Network
1 1 A
2 1 A
3 2 B
4 2 C
5 3 D
6 3 D
In that case I would like the query to return "2" (since it's the only MerchantID that has different Networks).
So far I have the following query:
SELECT a.MerchantID
FROM table a
JOIN table b
ON a.ID = b.ID
AND a.Network <> b.Network
AND a.MerchantID = b.MerchantID
GROUP BY a.MerchantID
The thing is, the table has around 43,000 records and that query takes a LOT of time (I haven't even been able to get the results).
Is there any better way to do it?
Thanks.
Try this:
SELECT MerchantID
FROM yourtable
GROUP BY MerchantID
HAVING COUNT(Distinct Network)>1
This should be faster; joins that use <> conditions are (usually) slower.
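A quick way to see the GROUP BY / HAVING approach at work on the sample rows is the sketch below. The table name merchants is hypothetical (the question only calls it "table"), and Python's sqlite3 module with an in-memory database is used in place of MySQL.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE merchants (ID INTEGER PRIMARY KEY, MerchantID INTEGER, Network TEXT);
INSERT INTO merchants VALUES
  (1, 1, 'A'), (2, 1, 'A'),
  (3, 2, 'B'), (4, 2, 'C'),
  (5, 3, 'D'), (6, 3, 'D');
""")

# One pass over the table: keep the merchants that have more than one
# distinct Network value.
rows = conn.execute("""
SELECT MerchantID
FROM merchants
GROUP BY MerchantID
HAVING COUNT(DISTINCT Network) > 1
""").fetchall()

print(rows)  # [(2,)]
```

No self-join is needed, so the work grows with the number of rows rather than with the number of row pairs per merchant.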

Getting value of Mapping table IDs when value is from same column

I'm really struggling to get my head around what should be simple.
I have two tables: one contains records, the other is a mapping table.
records
ID Title Description
1 record 1 desc 1
2 record 2 desc 2
3 record 3 desc 3
4 record 4 desc 4
mapping table
ID1 ID2
1 3
2 4
What I want to do is get the two titles of each row in the mapping table. So the above would output
record 1 record 3
record 2 record 4
I'm missing something really obvious; trying multiple joins results in errors when I try to link the same table twice.
The following returns NULL:
SELECT records.title FROM mapping
LEFT JOIN records
ON mapping.ID1 = records.id
AND mapping.ID2 = records.id
Try this one (untested):
SELECT b.Title as TitleA,
c.Title as TitleB
FROM mapping a
INNER JOIN records b
on a.ID1 = b.ID
INNER JOIN records c
on a.ID2 = c.ID
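The key point is that the records table is joined twice under two different aliases, one per mapping column. A small sketch with the sample rows, using Python's sqlite3 module and an in-memory database as a stand-in for MySQL (an assumption), looks like this:

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE records (ID INTEGER PRIMARY KEY, Title TEXT, Description TEXT);
CREATE TABLE mapping (ID1 INTEGER, ID2 INTEGER);
INSERT INTO records VALUES
  (1, 'record 1', 'desc 1'), (2, 'record 2', 'desc 2'),
  (3, 'record 3', 'desc 3'), (4, 'record 4', 'desc 4');
INSERT INTO mapping VALUES (1, 3), (2, 4);
""")

# records appears twice: alias b resolves ID1, alias c resolves ID2.
rows = conn.execute("""
SELECT b.Title AS TitleA, c.Title AS TitleB
FROM mapping a
INNER JOIN records b ON a.ID1 = b.ID
INNER JOIN records c ON a.ID2 = c.ID
ORDER BY a.ID1
""").fetchall()

print(rows)  # [('record 1', 'record 3'), ('record 2', 'record 4')]
```

The single-join attempt in the question finds no match because it asks for one records row whose id equals both ID1 and ID2 at the same time.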

Mysql joining two tables

I need to join the following two tables
table_A
id userId name score game
1 2343 me 45 Palo Alto
2 6575 other 21 SF
3 6575 other 2 miami
table_B
id userId pen mango
1 2343 3 4
2 2343 5 7
3 6575 1 2
Here is the join:
SELECT COUNT(a.userId), SUM(b.pen), SUM(b.mango)
FROM table_A AS a
LEFT JOIN table_B b ON a.userId = b.userId
WHERE a.userId = 2343;
The problem is I am getting count(userId) equals to 2, but I need it to be 1. What am I doing wrong?
Change it to the following:
count(distinct a.userId)
Your query will generate two rows, each row corresponding to one row in table_B (one for id 1, one for id 2).
I am not sure why you think the resulting count(a.userId) should be 1. Note that a GROUP BY clause à la GROUP BY b.userId would not enforce this: both joined rows fall into the same group, so the count stays 2.
I am confused about what you are doing. You can still get the SUM of pen and mango without joining the two tables. And another thing: why do you use the COUNT function at all when you know that you are querying ONLY ONE ID?
SELECT SUM(Pen) as TotalPen,
SUM(Mango) as TotalMango
FROM table_B
WHERE userId = 2343
But if you want to join the tables, you could write something like this:
SELECT SUM(COALESCE(b.pen,0)) as TotalPen,
SUM(COALESCE(b.mango,0)) as TotalMango
FROM table_A AS a LEFT JOIN table_B b ON a.userId = b.userId
WHERE a.userId = 2343;
"The problem is I am getting count(userId) equals to 2, but I need it to be 1." The query is correct but your understanding is wrong. Obviously there are two records with userId 2343 in table_B.
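The row duplication behind the count of 2 can be made visible by computing COUNT and COUNT(DISTINCT) side by side. This is a minimal sketch on the sample rows, using Python's sqlite3 module with an in-memory database in place of MySQL (an assumption):

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (id INTEGER PRIMARY KEY, userId INTEGER, name TEXT,
                      score INTEGER, game TEXT);
CREATE TABLE table_B (id INTEGER PRIMARY KEY, userId INTEGER, pen INTEGER, mango INTEGER);
INSERT INTO table_A VALUES
  (1, 2343, 'me', 45, 'Palo Alto'), (2, 6575, 'other', 21, 'SF'), (3, 6575, 'other', 2, 'miami');
INSERT INTO table_B VALUES (1, 2343, 3, 4), (2, 2343, 5, 7), (3, 6575, 1, 2);
""")

# The one table_A row for user 2343 is repeated once per matching table_B row,
# so COUNT sees two rows while COUNT(DISTINCT ...) sees one user.
row = conn.execute("""
SELECT COUNT(a.userId), COUNT(DISTINCT a.userId), SUM(b.pen), SUM(b.mango)
FROM table_A AS a
LEFT JOIN table_B AS b ON a.userId = b.userId
WHERE a.userId = 2343
""").fetchone()

print(row)  # (2, 1, 8, 11)
```

The sums are right here only because table_A has a single row for that user; with several table_A rows per user (as for 6575) the same join would also repeat the table_B rows and inflate the sums.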

MySQL selecting rows with a max id and matching other conditions

Using the tables below as an example and the listed query as a base query, I want to add a way to select only the rows with a max id, without having to do a second query!
TABLE VEHICLES
id vehicleName
----- --------
1 cool car
2 cool car
3 cool bus
4 cool bus
5 cool bus
6 car
7 truck
8 motorcycle
9 scooter
10 scooter
11 bus
TABLE VEHICLE NAMES
nameId vehicleName
------ -------
1 cool car
2 cool bus
3 car
4 truck
5 motorcycle
6 scooter
7 bus
TABLE VEHICLE ATTRIBUTES
nameId attribute
------ ---------
1 FAST
1 SMALL
1 SHINY
2 BIG
2 SLOW
3 EXPENSIVE
4 SHINY
5 FAST
5 SMALL
6 SHINY
6 SMALL
7 SMALL
And the base query:
select a.*
from vehicle a
join vehicle_names b using(vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
group
by a.id
having count(distinct c.attribute) = 2;
So what I want to achieve is to select rows with certain attributes, that match a name but only one entry for each name that matches where the id is the highest!
So a working solution in this example would return the below rows:
id vehicleName
----- --------
2 cool car
10 scooter
if it used some sort of max on the id.
At the moment I get all the entries for cool car and scooter.
My real-world database follows a similar structure and has tens of thousands of entries in it, so a query like the one above could easily return 3000+ results. I limit the results to 100 rows to keep execution time low, as the results are used in a search on my site. The reason I have repeats of "vehicles" with the same name and only a different ID is that new models are constantly added, but I keep the older ones around for those who want to dig them up! But on a search by car name I don't want to return the older cars, just the newest one, which is the one with the highest ID!
The correct answer would adapt the query I provided above that I'm currently using and have it only return rows where the name matches but has the highest id!
If this isn't possible, suggestions on how I can achieve what I want without massively increasing the execution time of a search would be appreciated!
If you want to keep your logic, here is what I would do:
select a.*
from vehicle a
left join vehicle a2 on (a.vehicleName = a2.vehicleName and a.id < a2.id)
join vehicle_names b on (a.vehicleName = b.vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct c.attribute) = 2;
Which yields:
+----+-------------+
| id | vehicleName |
+----+-------------+
| 2 | cool car |
| 10 | scooter |
+----+-------------+
2 rows in set (0.00 sec)
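The self-join with a2.id IS NULL is what keeps only the newest row per name: a row survives only if no other row with the same vehicleName has a larger id. The sketch below replays that query on the sample data, using Python's sqlite3 module and an in-memory database as a stand-in for MySQL (an assumption). The USING clause is written as an explicit ON, and vehicleName is added to the GROUP BY, purely for portability.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
CREATE TABLE vehicle_names (nameId INTEGER PRIMARY KEY, vehicleName TEXT);
CREATE TABLE vehicle_attribs (nameId INTEGER, attribute TEXT);
INSERT INTO vehicle VALUES
  (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
  (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
  (9, 'scooter'), (10, 'scooter'), (11, 'bus');
INSERT INTO vehicle_names VALUES
  (1, 'cool car'), (2, 'cool bus'), (3, 'car'), (4, 'truck'),
  (5, 'motorcycle'), (6, 'scooter'), (7, 'bus');
INSERT INTO vehicle_attribs VALUES
  (1, 'FAST'), (1, 'SMALL'), (1, 'SHINY'), (2, 'BIG'), (2, 'SLOW'),
  (3, 'EXPENSIVE'), (4, 'SHINY'), (5, 'FAST'), (5, 'SMALL'),
  (6, 'SHINY'), (6, 'SMALL'), (7, 'SMALL');
""")

rows = conn.execute("""
SELECT a.id, a.vehicleName
FROM vehicle a
-- a2 is any newer row with the same name; none exists for the newest row
LEFT JOIN vehicle a2 ON a.vehicleName = a2.vehicleName AND a.id < a2.id
JOIN vehicle_names b ON a.vehicleName = b.vehicleName
JOIN vehicle_attribs c ON c.nameId = b.nameId
WHERE c.attribute IN ('SMALL', 'SHINY')
  AND a.vehicleName LIKE '%coo%'
  AND a2.id IS NULL
GROUP BY a.id, a.vehicleName
HAVING COUNT(DISTINCT c.attribute) = 2
ORDER BY a.id
""").fetchall()

print(rows)  # [(2, 'cool car'), (10, 'scooter')]
```

'cool bus' also matches the name filter and has id 5 as its newest row, but it has neither SMALL nor SHINY, so it is dropped by the attribute conditions.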
As others said, normalization could be done on a few levels:
Keeping your current vehicle_names table as the primary lookup table, I would change:
update vehicle a
inner join vehicle_names b using (vehicleName)
set a.vehicleName = b.nameId;
alter table vehicle change column vehicleName nameId int;
create table attribs (
attribId int auto_increment primary key,
attribute varchar(20),
unique key attribute (attribute)
);
insert into attribs (attribute)
select distinct attribute from vehicle_attribs;
update vehicle_attribs a
inner join attribs b using (attribute)
set a.attribute=b.attribId;
alter table vehicle_attribs change column attribute attribId int;
Which leads to the following query:
select a.id, b.vehicleName
from vehicle a
left join vehicle a2 on (a.nameId = a2.nameId and a.id < a2.id)
join vehicle_names b on (a.nameId = b.nameId)
join vehicle_attribs c on (a.nameId=c.nameId)
inner join attribs d using (attribId)
where d.attribute in ('SMALL', 'SHINY')
and b.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct d.attribute) = 2;
The table does not seem normalized; however, this lets you do this:
select max(id), vehicleName
from VEHICLES
group by vehicleName
having count(*)>=2;
I'm not sure I completely understand your model, but the following query satisfies your requirements as they stand. The first sub query finds the latest version of the vehicle. The second query satisfies your "and" condition. Then I just join the queries on vehiclename (which is the key?).
select a.id
,a.vehiclename
from (select a.vehicleName, max(id) as id
from vehicle a
where vehicleName like '%coo%'
group by vehicleName
) as a
join (select b.vehiclename
from vehicle_names b
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
group by b.vehiclename
having count(distinct c.attribute) = 2
) as b on (a.vehicleName = b.vehicleName);
If this "latest vehicle" logic is something you will need to do a lot, a small suggestion would be to create a view (see below) which returns the latest version of each vehicle. Then you could use the view instead of the find-max-query. Note that this is purely for ease-of-use, it offers no performance benefits.
select *
from vehicle a
where id = (select max(b.id)
from vehicle b
where a.vehiclename = b.vehiclename);
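As a sketch of how such a view could be defined and used: the view name latest_vehicle is hypothetical, and Python's sqlite3 module with an in-memory database stands in for MySQL (an assumption). The view body is the correlated subquery shown above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
INSERT INTO vehicle VALUES
  (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
  (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
  (9, 'scooter'), (10, 'scooter'), (11, 'bus');

-- latest_vehicle is a hypothetical name: one row per vehicleName, the one
-- with the highest id.
CREATE VIEW latest_vehicle AS
SELECT a.id, a.vehicleName
FROM vehicle a
WHERE a.id = (SELECT MAX(b.id)
              FROM vehicle b
              WHERE a.vehicleName = b.vehicleName);
""")

rows = conn.execute(
    "SELECT id, vehicleName FROM latest_vehicle ORDER BY id"
).fetchall()

print(rows)
# [(2, 'cool car'), (5, 'cool bus'), (6, 'car'), (7, 'truck'),
#  (8, 'motorcycle'), (10, 'scooter'), (11, 'bus')]
```

Other queries can then join against latest_vehicle instead of repeating the find-max subquery.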
Without going into a proper redesign of your model, you could
1) Add a column IsLatest that your application could manage.
This is not perfect but will satisfy your question (until the next problem, see the note at the end).
All you need is, when you add a new entry, to issue queries such as
UPDATE vehicle
SET IsLatest = 0
WHERE IsLatest = 1 AND vehicleName = #name;
INSERT INTO vehicle (vehicleName) VALUES (#name);
UPDATE vehicle
SET IsLatest = 1
WHERE id = LAST_INSERT_ID();
in a transaction or a trigger
2) Alternatively, you can find out the max id before you issue your query
SELECT MAX(id)
FROM vehicle
WHERE vehicleName = #name
3) You can do it in a single SQL statement, and provided there are indexes on (vehicleName, id), it should actually have decent speed with
select a.*
from vehicle a
join vehicle_names b ON a.vehicleName = b.vehicleName
join vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
join vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
left join vehicle notmax ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
where a.vehicleName like '%coo%'
AND notmax.id IS NULL
I have removed your GROUP BY and HAVING and replaced them with another join (assuming that each attribute can occur only once per nameId).
I have also used one of the ways to find the max per group, which is to join a table to itself and keep only the rows for which there is no record with a bigger id for the same name.
There are other ways; search SO for 'max per group sql'. Also see here, though not complete.