MySQL simple subquery problem

table user
____________________________________________
id  name     nickname  info_id
1   john     apple     11
2   paul     banana    12
3   pauline  melon     13
table info
_____________________________________________
id  job       location
11  model     usa
12  engineer  russia
13  seller    brazil
result I want
______________________________________________
1   john     apple     model     usa
my queries
left join:
select * from user a left join info b on b.id = a.info_id where a.id=1
subquery (strictly speaking, an implicit comma join):
select a.*, b.* from (user a, info b) where b.id = a.info_id and a.id=1
Which is better?

SELECT a.`id`, a.`name`, a.`nickname`, b.`job`, b.`location`
FROM `user` AS a
LEFT JOIN `info` AS b
ON ( a.`info_id` = b.`id` )
WHERE a.`id` = 1
That should be pretty efficient. If you are concerned, run the query through MySQL's EXPLAIN, and make sure the ID columns used in the join and the filter (user.id, user.info_id and info.id) are indexed:
http://dev.mysql.com/doc/refman/5.1/en/using-explain.html
UPDATE
Since you are not having performance problems yet, I would not worry about it: "Don't fix what ain't broken". If the query slows down in the future, or turns out to be the bottleneck, then worry about it.
The query I gave should be pretty efficient.
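To check that both forms from the question return the wanted row, here is a minimal sketch that loads the sample data into an in-memory SQLite database from Python. SQLite stands in for MySQL purely so the example is runnable (an assumption, not the asker's setup); the `WHERE a.id = 1` filter is applied to both queries so they are comparable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE info (id INTEGER PRIMARY KEY, job TEXT, location TEXT);
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT, nickname TEXT,
                       info_id INTEGER REFERENCES info(id));
    INSERT INTO info VALUES (11, 'model', 'usa'), (12, 'engineer', 'russia'),
                            (13, 'seller', 'brazil');
    INSERT INTO user VALUES (1, 'john', 'apple', 11), (2, 'paul', 'banana', 12),
                            (3, 'pauline', 'melon', 13);
""")

# Explicit LEFT JOIN, filtered to user 1.
left_join_rows = conn.execute("""
    SELECT a.id, a.name, a.nickname, b.job, b.location
    FROM user AS a
    LEFT JOIN info AS b ON a.info_id = b.id
    WHERE a.id = 1
""").fetchall()

# Implicit comma join with the join condition in WHERE.
comma_join_rows = conn.execute("""
    SELECT a.id, a.name, a.nickname, b.job, b.location
    FROM user AS a, info AS b
    WHERE b.id = a.info_id AND a.id = 1
""").fetchall()

print(left_join_rows)
print(comma_join_rows)
```

With this data both queries return the same single row. They differ only for users whose info_id has no match in info: the LEFT JOIN keeps such users with NULLs, the comma join drops them.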

Related

Filtering data in tables

I have two tables, account_agent and bulk_index. The data in each table is as follows:
account_agent
id name
1 Tom
2 Brad
3 John
4 Jan
5 Bartosz
account_agent_to_filter
id name
1 Tom
3 John
5 Bartosz
Using these tables, I want to make a normalized view that contains all agents from the first table except those that also appear in the second table.
account_agent_filtered
id name
2 Brad
4 Jan
I have a lot of data. With a cross join I could do the opposite (find the matching rows), but here I do not want to match; I want to filter, and I have no idea how to do this.
Unfortunately, MySQL versions before 8.0.31 have no EXCEPT set operator, but you can emulate its behavior with NOT EXISTS:
SELECT *
FROM account_agent aa
WHERE NOT EXISTS (SELECT *
FROM account_agent_to_filter aatf
WHERE aa.id = aatf.id AND aa.name = aatf.name)
In addition to Mureinik's answer:
SELECT t1.*
FROM account_agent t1
LEFT JOIN account_agent_to_filter t2 USING (id, name)
WHERE t2.id IS NULL
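Both anti-join forms above can be compared on the sample data. The sketch below runs them in an in-memory SQLite database from Python (SQLite is a stand-in for MySQL, chosen only for runnability) and uses SQLite's own EXCEPT as a reference result. The join uses an explicit ON instead of USING, which is an equivalent spelling for this data.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account_agent (id INTEGER, name TEXT);
    CREATE TABLE account_agent_to_filter (id INTEGER, name TEXT);
    INSERT INTO account_agent VALUES
        (1, 'Tom'), (2, 'Brad'), (3, 'John'), (4, 'Jan'), (5, 'Bartosz');
    INSERT INTO account_agent_to_filter VALUES
        (1, 'Tom'), (3, 'John'), (5, 'Bartosz');
""")

# Anti-join via NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT aa.id, aa.name
    FROM account_agent aa
    WHERE NOT EXISTS (SELECT 1
                      FROM account_agent_to_filter aatf
                      WHERE aa.id = aatf.id AND aa.name = aatf.name)
    ORDER BY aa.id
""").fetchall()

# Anti-join via LEFT JOIN ... IS NULL.
left_join_rows = conn.execute("""
    SELECT t1.id, t1.name
    FROM account_agent t1
    LEFT JOIN account_agent_to_filter t2
           ON t1.id = t2.id AND t1.name = t2.name
    WHERE t2.id IS NULL
    ORDER BY t1.id
""").fetchall()

# Reference result using the EXCEPT operator that SQLite provides.
except_rows = conn.execute("""
    SELECT id, name FROM account_agent
    EXCEPT
    SELECT id, name FROM account_agent_to_filter
    ORDER BY id
""").fetchall()

print(not_exists_rows, left_join_rows, except_rows)
```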

MySQL join query where multiple rows have same key value

I have a third-party database with two tables like these:
HISTORY
========
ID | ORDERED
1 PEAS
1 CARROTS
1 SPINACH
2 CARROTS
3 PEAS
3 CARROTS
PEOPLE
=====
ID | NAME
1 Jamal
2 Sharon
3 Mark
I am trying to create a MySQL query that will return all the PEOPLE who ORDERED both PEAS and CARROTS. The results would be:
Jamal, Mark
When I try this with the OR operator, I get all three people:
SELECT a.ID from people a
INNER JOIN history b on a.ID=b.ID
WHERE b.ordered='PEAS' OR b.ordered='CARROTS'
When I try this with the AND operator, I get no people.
SELECT a.ID from people a
INNER JOIN history b on a.ID=b.ID
WHERE b.ordered='PEAS' AND b.ordered='CARROTS'
How can I write a query to get the names of the people who ordered peas and carrots given the table structure I have to work with?
JOIN twice, once for each condition:
SELECT a.ID
FROM people a
JOIN history b on a.ID=b.ID AND b.ordered='PEAS'
JOIN history c on a.ID=c.ID AND c.ordered='CARROTS'
If history can contain duplicates, or to be defensive, add DISTINCT:
SELECT DISTINCT a.ID
FROM ...
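The double join can be tried on the sample data as follows. This is a sketch using an in-memory SQLite database from Python (a stand-in for MySQL, assumed only for runnability). It also shows a GROUP BY / HAVING COUNT(DISTINCT ...) variant as an alternative; that second query is an addition for comparison, not part of the answer above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (ID INTEGER, NAME TEXT);
    CREATE TABLE history (ID INTEGER, ORDERED TEXT);
    INSERT INTO people VALUES (1, 'Jamal'), (2, 'Sharon'), (3, 'Mark');
    INSERT INTO history VALUES
        (1, 'PEAS'), (1, 'CARROTS'), (1, 'SPINACH'),
        (2, 'CARROTS'), (3, 'PEAS'), (3, 'CARROTS');
""")

# One join per required item: a person survives only if both joins match.
double_join_rows = conn.execute("""
    SELECT DISTINCT a.ID, a.NAME
    FROM people a
    JOIN history b ON a.ID = b.ID AND b.ORDERED = 'PEAS'
    JOIN history c ON a.ID = c.ID AND c.ORDERED = 'CARROTS'
    ORDER BY a.ID
""").fetchall()

# Alternative: keep rows for either item, then require both per person.
group_having_rows = conn.execute("""
    SELECT a.ID, a.NAME
    FROM people a
    JOIN history b ON a.ID = b.ID
    WHERE b.ORDERED IN ('PEAS', 'CARROTS')
    GROUP BY a.ID, a.NAME
    HAVING COUNT(DISTINCT b.ORDERED) = 2
    ORDER BY a.ID
""").fetchall()

print(double_join_rows)
print(group_having_rows)
```

The double join needs one extra join per required item, while the HAVING variant only needs the list and the count changed.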
Select all people from people and, for each of them, the history details, but only the people who ordered PEAS or CARROTS:
When you say "all" people, you begin your SQL with the people table. When you say "who have", you add the LEFT JOIN, i.e. to the left table (people) you join the wanted details from the right table (history). And when you say "but just the ones having the following details", you apply a filter, i.e. you use a WHERE clause. Note that filtering on a column of the right table in WHERE discards people without a matching history row, so the LEFT JOIN behaves like an INNER JOIN here.
SELECT
peo.ID, -- OPTIONAL
peo.NAME,
his.ID, -- OPTIONAL
his.ORDERED
FROM people AS peo
LEFT JOIN history AS his ON his.ID = peo.ID
WHERE
his.ORDERED = 'PEAS' OR
his.ORDERED = 'CARROTS'
;
Note: "--" starts a comment.
EDIT 1:
Important, and a principle you should always apply: rename the history.ID column to history.PEOPLE_ID and add a new column history.ID as the PRIMARY KEY column, so that each history record is uniquely identified.
HISTORY
-------
ID PEOPLE_ID ORDERED
1 1 PEAS
2 1 CARROTS
3 1 SPINACH
4 2 CARROTS
5 3 PEAS
6 3 CARROTS
PEOPLE table remains the same.
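A possible shape for that restructured schema, sketched with an in-memory SQLite database from Python (SQLite is a stand-in for MySQL, and the column types and constraints are assumptions, since the original does not give them):

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    PRAGMA foreign_keys = ON;
    CREATE TABLE people (ID INTEGER PRIMARY KEY, NAME TEXT);
    -- history.ID is now a surrogate key; PEOPLE_ID points at people.ID.
    CREATE TABLE history (
        ID        INTEGER PRIMARY KEY,
        PEOPLE_ID INTEGER NOT NULL REFERENCES people(ID),
        ORDERED   TEXT NOT NULL
    );
    INSERT INTO people VALUES (1, 'Jamal'), (2, 'Sharon'), (3, 'Mark');
    INSERT INTO history (PEOPLE_ID, ORDERED) VALUES
        (1, 'PEAS'), (1, 'CARROTS'), (1, 'SPINACH'),
        (2, 'CARROTS'), (3, 'PEAS'), (3, 'CARROTS');
""")

# Each history row now has its own identifier.
history_rows = conn.execute(
    "SELECT ID, PEOPLE_ID, ORDERED FROM history ORDER BY ID"
).fetchall()

# Joins go through PEOPLE_ID instead of the old shared ID column.
jamal_orders = conn.execute("""
    SELECT h.ORDERED
    FROM history h
    JOIN people p ON p.ID = h.PEOPLE_ID
    WHERE p.NAME = 'Jamal'
    ORDER BY h.ID
""").fetchall()

print(history_rows)
print(jamal_orders)
```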
EDIT 2:
No, it should be an AND in the WHERE clause. My fault. I corrected it.
EDIT 3:
No, it should be an OR in the WHERE clause after all. My fault. I corrected it. AGAIN :-)))

MySQL joining two tables

I need to join the following two tables
table_A
id userId name score game
1 2343 me 45 Palo Alto
2 6575 other 21 SF
3 6575 other 2 miami
table_B
id userId pen mango
1 2343 3 4
2 2343 5 7
3 6575 1 2
Here is the join:
SELECT COUNT(a.userId), SUM(b.pen), SUM(b.mango)
FROM table_A AS a
LEFT JOIN table_B b ON a.userId = b.userId
WHERE a.userId = 2343;
The problem is I am getting count(userId) equals to 2, but I need it to be 1. What am I doing wrong?
Change it to the following:
count(distinct a.userId)
Your query will generate two rows, one for each matching row in table_B (one for id 1, one for id 2), so count(a.userId) is 2.
I am not sure why you think the resulting count(a.userId) should be 1. A GROUP BY b.userId would not change it, since both rows fall into the same group; count(distinct a.userId) is what returns 1.
I am confused about what you are doing. You can get the SUM of pen and mango without joining the two tables. Also, why use the COUNT function when you already know that you are querying for only one ID?
SELECT SUM(Pen) as TotalPen,
SUM(Mango) as TotalMango
FROM table_B
WHERE userId = 2343
But if you want a joined tables you could write something like this:
SELECT SUM(COALESCE(b.pen,0)) as TotalPen,
SUM(COALESCE(b.mango,0)) as TotalMango
FROM table_A AS a LEFT JOIN table_B b ON a.userId = b.userId
WHERE a.userId = 2343;
Regarding "The problem is I am getting count(userId) equals to 2, but I need it to be 1": the query is correct, but your expectation is wrong. There are two rows with userId 2343 in table_B, so the join returns two rows.
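The row multiplication discussed in these answers can be seen directly on the sample data. The sketch below uses an in-memory SQLite database from Python (a stand-in for MySQL, assumed only for runnability). The last query, for user 6575, is an extra illustration not taken from the answers: there table_A has two rows, so the join inflates the sums as well.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A (id INTEGER, userId INTEGER, name TEXT,
                          score INTEGER, game TEXT);
    CREATE TABLE table_B (id INTEGER, userId INTEGER, pen INTEGER, mango INTEGER);
    INSERT INTO table_A VALUES (1, 2343, 'me', 45, 'Palo Alto'),
                               (2, 6575, 'other', 21, 'SF'),
                               (3, 6575, 'other', 2, 'miami');
    INSERT INTO table_B VALUES (1, 2343, 3, 4), (2, 2343, 5, 7), (3, 6575, 1, 2);
""")

JOIN = "FROM table_A AS a LEFT JOIN table_B AS b ON a.userId = b.userId"

# The join yields one row per matching table_B row, so COUNT sees two rows.
plain_count = conn.execute(
    f"SELECT COUNT(a.userId), SUM(b.pen), SUM(b.mango) {JOIN} WHERE a.userId = 2343"
).fetchone()

# COUNT(DISTINCT ...) counts the user once; the sums are unchanged.
distinct_count = conn.execute(
    f"SELECT COUNT(DISTINCT a.userId), SUM(b.pen), SUM(b.mango) {JOIN} "
    "WHERE a.userId = 2343"
).fetchone()

# User 6575 has two table_A rows and one table_B row: the sums are doubled.
inflated_sums = conn.execute(
    f"SELECT COUNT(a.userId), SUM(b.pen), SUM(b.mango) {JOIN} WHERE a.userId = 6575"
).fetchone()

print(plain_count, distinct_count, inflated_sums)
```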

MySQL selecting rows with a max id and matching other conditions

Using the tables below as an example and the listed query as a base, I want to select only the rows with the max id, without having to run a second query.
TABLE VEHICLES
id vehicleName
----- --------
1 cool car
2 cool car
3 cool bus
4 cool bus
5 cool bus
6 car
7 truck
8 motorcycle
9 scooter
10 scooter
11 bus
TABLE VEHICLE NAMES
nameId vehicleName
------ -------
1 cool car
2 cool bus
3 car
4 truck
5 motorcycle
6 scooter
7 bus
TABLE VEHICLE ATTRIBUTES
nameId attribute
------ ---------
1 FAST
1 SMALL
1 SHINY
2 BIG
2 SLOW
3 EXPENSIVE
4 SHINY
5 FAST
5 SMALL
6 SHINY
6 SMALL
7 SMALL
And the base query:
select a.*
from vehicle a
join vehicle_names b using(vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
group
by a.id
having count(distinct c.attribute) = 2;
What I want to achieve is to select rows that have certain attributes and match a name, but with only one entry per matching name: the one with the highest id.
So a working solution in this example would return the below rows:
id vehicleName
----- --------
2 cool car
10 scooter
if it used some sort of max on the id.
At the moment I get all the entries for cool car and scooter.
My real-world database follows a similar structure and has tens of thousands of entries, so a query like the one above could easily return 3000+ results. I limit the results to 100 rows to keep execution time low, as the results are used in a search on my site. The reason I have repeated "vehicles" with the same name and only a different ID is that new models are constantly added, but I keep the older ones around for those who want to dig them up. On a search by car name, though, I don't want to return the older ones, just the newest one, which is the one with the highest ID.
The correct answer would adapt the query above, which I'm currently using, so that it only returns rows where the name matches and the id is the highest.
If this isn't possible, suggestions on how I can achieve what I want without massively increasing the execution time of a search would be appreciated!
If you want to keep your logic, here is what I would do:
select a.*
from vehicle a
left join vehicle a2 on (a.vehicleName = a2.vehicleName and a.id < a2.id)
join vehicle_names b on (a.vehicleName = b.vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct c.attribute) = 2;
Which yields:
+----+-------------+
| id | vehicleName |
+----+-------------+
| 2 | cool car |
| 10 | scooter |
+----+-------------+
2 rows in set (0.00 sec)
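To reproduce that result outside MySQL, the following sketch loads the sample tables into an in-memory SQLite database from Python (SQLite is assumed here only so the example is runnable) and runs the same self-join: the row for which no a2 with a larger id exists is the newest one for its name.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
    CREATE TABLE vehicle_names (nameId INTEGER PRIMARY KEY, vehicleName TEXT);
    CREATE TABLE vehicle_attribs (nameId INTEGER, attribute TEXT);
    INSERT INTO vehicle VALUES
        (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
        (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
        (9, 'scooter'), (10, 'scooter'), (11, 'bus');
    INSERT INTO vehicle_names VALUES
        (1, 'cool car'), (2, 'cool bus'), (3, 'car'), (4, 'truck'),
        (5, 'motorcycle'), (6, 'scooter'), (7, 'bus');
    INSERT INTO vehicle_attribs VALUES
        (1, 'FAST'), (1, 'SMALL'), (1, 'SHINY'), (2, 'BIG'), (2, 'SLOW'),
        (3, 'EXPENSIVE'), (4, 'SHINY'), (5, 'FAST'), (5, 'SMALL'),
        (6, 'SHINY'), (6, 'SMALL'), (7, 'SMALL');
""")

latest_matching = conn.execute("""
    SELECT a.id, a.vehicleName
    FROM vehicle a
    -- a2 is any newer row with the same name; none exists for the newest row
    LEFT JOIN vehicle a2
           ON a.vehicleName = a2.vehicleName AND a.id < a2.id
    JOIN vehicle_names b ON a.vehicleName = b.vehicleName
    JOIN vehicle_attribs c ON b.nameId = c.nameId
    WHERE c.attribute IN ('SMALL', 'SHINY')
      AND a.vehicleName LIKE '%coo%'
      AND a2.id IS NULL
    GROUP BY a.id, a.vehicleName
    HAVING COUNT(DISTINCT c.attribute) = 2
    ORDER BY a.id
""").fetchall()

print(latest_matching)
```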
As others said, normalization could be done on a few levels:
Keeping your current vehicle_names table as the primary lookup table, I would change:
update vehicle a
inner join vehicle_names b using (vehicleName)
set a.vehicleName = b.nameId;
alter table vehicle change column vehicleName nameId int;
create table attribs (
attribId int auto_increment primary key,
attribute varchar(20),
unique key attribute (attribute)
);
insert into attribs (attribute)
select distinct attribute from vehicle_attribs;
update vehicle_attribs a
inner join attribs b using (attribute)
set a.attribute=b.attribId;
alter table vehicle_attribs change column attribute attribId int;
Which leads to the following query:
select a.id, b.vehicleName
from vehicle a
left join vehicle a2 on (a.nameId = a2.nameId and a.id < a2.id)
join vehicle_names b on (a.nameId = b.nameId)
join vehicle_attribs c on (a.nameId=c.nameId)
inner join attribs d using (attribId)
where d.attribute in ('SMALL', 'SHINY')
and b.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct d.attribute) = 2;
The table does not seem normalized; however, this lets you do the following:
select max(id), vehicleName
from VEHICLES
group by vehicleName
having count(*)>=2;
I'm not sure I completely understand your model, but the following query satisfies your requirements as they stand. The first sub query finds the latest version of the vehicle. The second query satisfies your "and" condition. Then I just join the queries on vehiclename (which is the key?).
select a.id
,a.vehiclename
from (select a.vehicleName, max(id) as id
from vehicle a
where vehicleName like '%coo%'
group by vehicleName
) as a
join (select b.vehiclename
from vehicle_names b
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
group by b.vehiclename
having count(distinct c.attribute) = 2
) as b on (a.vehicleName = b.vehicleName);
If this "latest vehicle" logic is something you will need a lot, a small suggestion would be to create a view (see below) that returns the latest version of each vehicle. You could then use the view instead of the find-max query. Note that this is purely for ease of use; it offers no performance benefit.
create view latest_vehicle as -- latest_vehicle is a placeholder name
select *
from vehicle a
where id = (select max(b.id)
from vehicle b
where a.vehiclename = b.vehiclename);
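As a rough illustration of the view idea, the sketch below wraps that correlated subquery in a view and queries it, using an in-memory SQLite database from Python (a stand-in for MySQL, assumed only for runnability). The view name latest_vehicle is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
    INSERT INTO vehicle VALUES
        (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
        (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
        (9, 'scooter'), (10, 'scooter'), (11, 'bus');

    -- latest_vehicle is a hypothetical name: one row per vehicleName,
    -- namely the row with the highest id for that name.
    CREATE VIEW latest_vehicle AS
    SELECT a.id, a.vehicleName
    FROM vehicle a
    WHERE a.id = (SELECT MAX(b.id)
                  FROM vehicle b
                  WHERE b.vehicleName = a.vehicleName);
""")

all_latest = conn.execute(
    "SELECT id, vehicleName FROM latest_vehicle ORDER BY id"
).fetchall()

# The view can then be filtered like a table.
latest_coo = conn.execute(
    "SELECT id, vehicleName FROM latest_vehicle "
    "WHERE vehicleName LIKE '%coo%' ORDER BY id"
).fetchall()

print(all_latest)
print(latest_coo)
```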
Without going into a proper redesign of your model, you could:
1) Add a column IsLatest that your application manages.
This is not perfect but will answer your question (until the next problem; see the note at the end).
All you need, when you add a new entry, is to issue queries such as
UPDATE vehicle
SET IsLatest = 0
WHERE IsLatest = 1 AND vehicleName = #name
INSERT new vehicle row
UPDATE vehicle
SET IsLatest = 1
WHERE id = #last_inserted_id
in a transaction or a trigger
2) Alternatively, you can find the max id before you issue your query
SELECT MAX(id)
FROM vehicle
WHERE vehicleName = #name
3) You can do it in a single SQL statement, and provided there is an index on (vehicleName, id) it should actually have decent speed with
select a.*
from vehicle a
join vehicle_names b ON a.vehicleName = b.vehicleName
join vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
join vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
left join vehicle notmax ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
where a.vehicleName like '%coo%'
AND notmax.id IS NULL
I have removed your GROUP BY and HAVING and replaced them with another join (assuming that each attribute appears at most once per nameId).
I have also used one of the ways to find the max per group: join the table to itself and keep only the rows for which no record with a bigger id exists for the same name.
There are other ways; search Stack Overflow for 'max per group sql'.
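The join-per-attribute variant from option 3 can be tried on the question's sample data. In this sketch (in-memory SQLite from Python, a stand-in for MySQL assumed only for runnability) the second attribute join is matched on d.nameId, and the self-join is a LEFT JOIN on the vehicle id, so that "no newer row exists" shows up as NULL.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
    CREATE TABLE vehicle_names (nameId INTEGER PRIMARY KEY, vehicleName TEXT);
    CREATE TABLE vehicle_attribs (nameId INTEGER, attribute TEXT);
    INSERT INTO vehicle VALUES
        (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
        (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
        (9, 'scooter'), (10, 'scooter'), (11, 'bus');
    INSERT INTO vehicle_names VALUES
        (1, 'cool car'), (2, 'cool bus'), (3, 'car'), (4, 'truck'),
        (5, 'motorcycle'), (6, 'scooter'), (7, 'bus');
    INSERT INTO vehicle_attribs VALUES
        (1, 'FAST'), (1, 'SMALL'), (1, 'SHINY'), (2, 'BIG'), (2, 'SLOW'),
        (3, 'EXPENSIVE'), (4, 'SHINY'), (5, 'FAST'), (5, 'SMALL'),
        (6, 'SHINY'), (6, 'SMALL'), (7, 'SMALL');
    -- Index suggested for the self-join on (vehicleName, id).
    CREATE INDEX idx_vehicle_name_id ON vehicle (vehicleName, id);
""")

per_attribute_join = conn.execute("""
    SELECT a.id, a.vehicleName
    FROM vehicle a
    JOIN vehicle_names b ON a.vehicleName = b.vehicleName
    -- one join per required attribute replaces GROUP BY / HAVING
    JOIN vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
    JOIN vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
    -- notmax is any newer row with the same name
    LEFT JOIN vehicle notmax
           ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
    WHERE a.vehicleName LIKE '%coo%'
      AND notmax.id IS NULL
    ORDER BY a.id
""").fetchall()

print(per_attribute_join)
```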

MySQL subselect alternative

I have a query that I know can be done using a subselect, but due to large table sizes (100k+ rows per table) I would like to find an alternative using a join. This is not a homework question, but it's easier to share an example in such terms.
Suppose there are two tables:
Students
:id :name
1 Tom
2 Sally
3 Ben
Books
:id :student_id :book
1 1 Math 101
2 1 History
3 2 NULL
4 3 Math 101
I want to find all students who don't have a history book. Working subselect is:
select name from students where id not in (select student_id from books where book = 'History');
This returns Sally and Ben.
Thanks for your replies!
Is performance the problem? Or is this just some theoretical (homework?) question to avoid a subquery? If it's performance then this:
SELECT *
FROM students s
WHERE NOT EXISTS
(SELECT id FROM books WHERE student_id = s.id AND book = 'History')
will perform a lot better than the NOT IN you're doing on MySQL, at least on older versions (on some other databases, the two perform equivalently). This can also be rephrased as a join:
SELECT s.*
FROM students s
LEFT JOIN books b ON s.id = b.student_id AND b.book = 'History'
WHERE b.id IS NULL
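The three formulations can be checked against each other on the sample data. The sketch below uses an in-memory SQLite database from Python (a stand-in for MySQL, assumed only for runnability), so it verifies that the results agree but says nothing about MySQL performance.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, student_id INTEGER, book TEXT);
    INSERT INTO students VALUES (1, 'Tom'), (2, 'Sally'), (3, 'Ben');
    INSERT INTO books VALUES (1, 1, 'Math 101'), (2, 1, 'History'),
                             (3, 2, NULL), (4, 3, 'Math 101');
""")

# Original subselect from the question.
not_in_rows = conn.execute("""
    SELECT name FROM students
    WHERE id NOT IN (SELECT student_id FROM books WHERE book = 'History')
    ORDER BY id
""").fetchall()

# Correlated NOT EXISTS.
not_exists_rows = conn.execute("""
    SELECT s.name FROM students s
    WHERE NOT EXISTS (SELECT b.id FROM books b
                      WHERE b.student_id = s.id AND b.book = 'History')
    ORDER BY s.id
""").fetchall()

# LEFT JOIN with the book filter in the ON clause, keeping unmatched students.
left_join_rows = conn.execute("""
    SELECT s.name FROM students s
    LEFT JOIN books b ON s.id = b.student_id AND b.book = 'History'
    WHERE b.id IS NULL
    ORDER BY s.id
""").fetchall()

print(not_in_rows, not_exists_rows, left_join_rows)
```

Note that the book filter has to stay in the ON clause: moving it to WHERE would remove the NULL rows the anti-join relies on.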