Tricky MySQL JOIN query - mysql

I have a text input, and on keyup I want to update the options in an adjacent select field.
I have these 2 tables in my database:
Models:
ID modelName brandID
1 NA140 3
1 SRL 1
1 SRS 1
1 SRF 1
1 SMS 2
1 SMU 2
Brands:
ID brandName
1 Samsung
2 Bosch
3 Panasonic
In the select field I want to list all the brandNames from the brands table, ordered by relevance to the text input.
So if 'SR' is typed, the order of modelNames would be SRF, SRL, SRS, SMU, SMS, NA140, and the corresponding brandName would then be taken for each result, with each brand listed only once.
How would I write this query?
I have this basic idea which I think is what I need...
JOIN models & brands ON m.brandID = b.ID
MATCH modelName to string%
SELECT UNIQUE brandName

The only way I can think of is to first do a select where it matches the user input, get those results, then do it again where it does not match, and combine the results with a union, like so:
(SELECT `t1`.`modelName`, `t1`.`brandID`,`t2`.`brandName` FROM `Models` as `t1` INNER JOIN `Brands` AS `t2` ON `t1`.`brandID`=`t2`.`ID` WHERE `t1`.`modelName` LIKE 'SR%' ORDER BY `t1`.`modelName`,`t2`.`brandName`)
UNION
(SELECT `t1`.`modelName`, `t1`.`brandID`,`t2`.`brandName` FROM `Models` as `t1` INNER JOIN `Brands` AS `t2` ON `t1`.`brandID`=`t2`.`ID` WHERE `t1`.`modelName` NOT LIKE 'SR%' ORDER BY `t1`.`modelName`,`t2`.`brandName`)
There may be a more efficient way to do this, this is what I could think of before I head out to the store :)

select distinct brandName
from Models m inner join Brands b
on m.brandID = b.ID
where m.modelName like 's%'

I came up with a similar query (simplified):
select brandName from (
select b.brandName as brandName, m.modelName
from brands b join models m
on m.brandid = b.id
where m.modelName like '%sr%'
union
select b.brandName as brandName, m.modelName
from brands b join models m
on m.brandid = b.id
where m.modelName not like '%sr%') as curb
group by brandName
SQLfiddle: http://sqlfiddle.com/#!2/593b36/30
(Not exact naming of brands and models)

SELECT DISTINCT b.brandName
FROM brands b
JOIN models m
ON m.brandID = b.id
WHERE m.modelName LIKE '%$variable%'
UNION
SELECT b.brandName
FROM brands b
JOIN models m
ON m.brandID = b.id
WHERE m.modelName NOT LIKE '%$variable%'

What is needed is a function that calculates the distance between the value in the field (the anchor) and the values in the table. This function can be the difference between the HEX of the anchor and the HEX of each table value.
SELECT m.id, modelName, brandName
FROM Model m
INNER JOIN Brand b ON m.brandID = b.ID
ORDER BY ABS(CONV(HEX(RPAD('SR', 5, ' ')), 16, 10)
- CONV(HEX(RPAD(modelName, 5, ' ')), 16, 10))
Before passing the values to HEX, the strings need to be padded so that they all have the same length; that is why RPAD is used.
To calculate the difference, the values are converted from base 16 to base 10; since a distance is never negative, ABS is applied.
If there are models with names longer than five characters, the RPAD length parameter needs to be changed to the length of the longest model name.
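To see why the padded HEX difference orders the names this way, the same arithmetic can be reproduced outside the database. This is only a sketch in Python: the helper names `padded_value` and `order_by_distance` are made up for illustration, and the width of 5 is the value used in the query above.

```python
def padded_value(text, width=5):
    """Read the space-padded ASCII bytes of `text` as one big-endian integer.

    Mirrors CONV(HEX(RPAD(text, width, ' ')), 16, 10) for names that are not
    longer than `width` (unlike RPAD, ljust does not truncate longer names).
    """
    return int.from_bytes(text.ljust(width).encode("ascii"), "big")


def order_by_distance(names, anchor, width=5):
    """Sort names by absolute numeric distance from the padded anchor."""
    target = padded_value(anchor, width)
    return sorted(names, key=lambda name: abs(padded_value(name, width) - target))


models = ["NA140", "SRL", "SRS", "SRF", "SMS", "SMU"]
ordered = order_by_distance(models, "SR")
print(ordered)  # ['SRF', 'SRL', 'SRS', 'SMU', 'SMS', 'NA140']
```

The leading characters end up in the most significant bytes, so names sharing a longer prefix with the anchor land numerically closer to it.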

Are you looking for something like this?
SELECT DISTINCT brandname
FROM models m JOIN brands b
ON m.brandid = b.id
ORDER BY modelname LIKE 'S%' DESC,
modelname LIKE 'SR%' DESC,
modelname
Output:
| BRANDNAME |
|-----------|
| Samsung |
| Bosch |
| Panasonic |
Here is SQLFiddle demo
UPDATE: For the input string SMU the query should look like this:
SELECT DISTINCT brandname
FROM models m JOIN brands b
ON m.brandid = b.id
ORDER BY modelname LIKE 'S%' DESC,
modelname LIKE 'SM%' DESC,
modelname LIKE 'SMU%' DESC,
modelname
Output:
| BRANDNAME |
|-----------|
| Bosch |
| Samsung |
| Panasonic |
Here is SQLFiddle demo
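The prefix-by-prefix ordering can also be expressed with GROUP BY and an aggregate, which avoids ordering a DISTINCT result by a column that is not selected (some SQL modes reject that). The following is a rough sketch, not the answer's own method: it runs against SQLite through Python's sqlite3 module so it is self-contained, and the helper names `make_demo_db` and `brands_by_relevance` are hypothetical. Characters such as % or _ in the input are not escaped here.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory copy of the question's two tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Brands (ID INTEGER PRIMARY KEY, brandName TEXT);
        CREATE TABLE Models (modelName TEXT, brandID INTEGER);
        INSERT INTO Brands VALUES (1, 'Samsung'), (2, 'Bosch'), (3, 'Panasonic');
        INSERT INTO Models VALUES
            ('NA140', 3), ('SRL', 1), ('SRS', 1), ('SRF', 1), ('SMS', 2), ('SMU', 2);
    """)
    return conn


def brands_by_relevance(conn, typed):
    """Return each brand once, best prefix match first."""
    # One LIKE pattern per leading prefix of the input: 'S%', 'SR%', ...
    patterns = [typed[:i] + "%" for i in range(1, len(typed) + 1)]
    # Each LIKE evaluates to 1 or 0, and a longer prefix can only match if every
    # shorter one does, so the sum is the length of the longest matching prefix.
    score = " + ".join("(m.modelName LIKE ?)" for _ in patterns) or "0"
    sql = f"""
        SELECT b.brandName
        FROM Brands b
        JOIN Models m ON m.brandID = b.ID
        GROUP BY b.ID, b.brandName
        ORDER BY MAX({score}) DESC, b.brandName
    """
    return [row[0] for row in conn.execute(sql, patterns)]


conn = make_demo_db()
print(brands_by_relevance(conn, "SR"))   # ['Samsung', 'Bosch', 'Panasonic']
print(brands_by_relevance(conn, "SMU"))  # ['Bosch', 'Samsung', 'Panasonic']
```

Only the placeholders are generated from the input; the typed text itself is passed as bound parameters.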

I believe the solution shouldn't be in the query part, at least not for each letter typed. Maybe you could do one query for the first letter, retrieve the data for that letter on the client, put the names in an array, and order them on the client side.
Doing it on the server side is obviously slower, and running a query for every letter typed, considering how often users type a wrong letter, delete it (a new query), and type again (another query), is terrible for performance.
Putting the calculations on the client side is much better. You would redo the query only if the first letter has been deleted and a new one typed.
Maybe you need some way to make new queries each time, but maybe a different approach can help find a better solution.
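As a rough illustration of ranking on the client after a single fetch, the logic could look like the sketch below. It is written in Python only to keep the example runnable; in a browser the same steps would be done in JavaScript. The function names and the shape of `rows` (pairs of model name and brand name) are assumptions.

```python
def common_prefix_len(a, b):
    """Count how many leading characters two strings share, ignoring case."""
    n = 0
    for x, y in zip(a.lower(), b.lower()):
        if x != y:
            break
        n += 1
    return n


def rank_brands(rows, typed):
    """Order brands by their best-matching model; each brand appears once."""
    best = {}
    for model_name, brand_name in rows:
        score = common_prefix_len(model_name, typed)
        best[brand_name] = max(best.get(brand_name, 0), score)
    # Highest score first, brand name as a stable tie-breaker.
    return sorted(best, key=lambda brand: (-best[brand], brand))


# Rows as they might come back from one query made up front.
rows = [("NA140", "Panasonic"), ("SRL", "Samsung"), ("SRS", "Samsung"),
        ("SRF", "Samsung"), ("SMS", "Bosch"), ("SMU", "Bosch")]
print(rank_brands(rows, "SR"))   # ['Samsung', 'Bosch', 'Panasonic']
print(rank_brands(rows, "SMU"))  # ['Bosch', 'Samsung', 'Panasonic']
```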

Related

Understanding use of multiple SUMs with LEFT JOINS in mysql

Using the GROUP BY command, it is possible to LEFT JOIN multiple tables and still get the desired number of rows from the first table.
For example,
SELECT b.title
FROM books `b`
LEFT JOIN orders `o`
ON o.bookid = b.id
LEFT JOIN authors `a`
ON b.authorid = a.id
GROUP BY b.id
However, since behind the scenes MYSQL is doing a cartesian product on the tables, if you include more than one SUM command you get incorrect values based on all the hidden rows. (The problem is explained fairly well here.)
SELECT b.title,SUM(o.id) as sales,SUM(a.id) as authors
FROM books `b`
LEFT JOIN orders `o`
ON o.bookid = b.id
LEFT JOIN authors `a`
ON b.authorid = a.id
GROUP BY b.id
There are a number of answers on SO about this, most using sub-queries in the JOINS but I am having trouble applying them to this fairly simple case.
How can you adjust the above so that you get the correct SUMs?
Edit
Example
books
id|title|authorid
1|Huck Finn|1
2|Tom Sawyer|1
3|Python Cookbook|2
orders
id|bookid
1|1
2|1
3|2
4|2
5|3
6|3
authors
id|author
1|Twain
2|Beazley
2|Jones
The "correct answer" for the total number of authors of the Python Cookbook is 2. However, because there are two joins and the dataset is expanded by the join on orders, the book ends up with 4 joined rows, so counting authors gives 4 (and SUM(a.id), which adds up the ID values, gives 8).
You are correct that by joining multiple tables you would not get the expected results.
But in this case you should use COUNT() instead of SUM() and count the distinct orders or authors.
Also by your design you should count the names of the authors and not the ids of the table authors:
SELECT b.title,
COUNT(DISTINCT o.id) as sales,
COUNT(DISTINCT a.author) as authors
FROM books `b`
LEFT JOIN orders `o` ON o.bookid = b.id
LEFT JOIN authors `a` ON b.authorid = a.id
GROUP BY b.id, b.title
See the demo.
Results:
| title | sales | authors |
| --------------- | ----- | ------- |
| Huck Finn | 2 | 1 |
| Tom Sawyer | 2 | 1 |
| Python Cookbook | 2 | 2 |
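The effect of the row multiplication can be checked directly by putting a plain COUNT next to COUNT(DISTINCT ...) in the same query. This is a small sketch using the question's sample data in SQLite via Python's sqlite3 module; the helper names `make_books_db` and `author_counts` are hypothetical.

```python
import sqlite3


def make_books_db():
    """Load the question's sample rows into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE books (id INTEGER, title TEXT, authorid INTEGER);
        CREATE TABLE orders (id INTEGER, bookid INTEGER);
        CREATE TABLE authors (id INTEGER, author TEXT);
        INSERT INTO books VALUES
            (1, 'Huck Finn', 1), (2, 'Tom Sawyer', 1), (3, 'Python Cookbook', 2);
        INSERT INTO orders VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3);
        INSERT INTO authors VALUES (1, 'Twain'), (2, 'Beazley'), (2, 'Jones');
    """)
    return conn


def author_counts(conn):
    """Per book: joined rows counted naively vs. distinct authors."""
    sql = """
        SELECT b.title,
               COUNT(a.id)              AS joined_rows,  -- inflated by the orders join
               COUNT(DISTINCT a.author) AS authors       -- each author counted once
        FROM books b
        LEFT JOIN orders o ON o.bookid = b.id
        LEFT JOIN authors a ON b.authorid = a.id
        GROUP BY b.id, b.title
        ORDER BY b.id
    """
    return conn.execute(sql).fetchall()


for row in author_counts(make_books_db()):
    print(row)
# ('Huck Finn', 2, 1)
# ('Tom Sawyer', 2, 1)
# ('Python Cookbook', 4, 2)
```

The middle column is what an aggregate sees without DISTINCT: every author row repeated once per order.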
When dealing with separate aggregates, it is good style to aggregate before joining.
Your data model is horribly confusing, making it look like a book is written by one author only (referenced by books.authorid), while this "ID" is not an author's ID at all.
Your main problem is: You don't count! We count with COUNT. But you are mistakenly adding up ID values with SUM.
Here is a proper query, where I am aggregating before joining and using alias names to fight confusion and thus enhance the query's readability and maintainability.
SELECT
b.title,
COALESCE(o.order_count, 0) AS sales,
COALESCE(a.author_count, 0) AS authors
FROM (SELECT title, id AS book_id, authorid AS author_group_id FROM books) b
LEFT JOIN
(
SELECT id as author_group_id, COUNT(*) as author_count
FROM authors
GROUP BY id
) a ON a.author_group_id = b.author_group_id
LEFT JOIN
(
SELECT bookid AS book_id, COUNT(*) as order_count
FROM orders
GROUP BY bookid
) o ON o.book_id = b.book_id
ORDER BY b.title;
I don't think your query would work the way you expected.
Assume one book could have 3 authors.
For authors:
You would then have three rows for that book in your books table, one for each author.
So a
SUM(b.authorid)
gives you the correct answer in your case.
For orders:
You must use a subselect like
LEFT JOIN (SELECT SUM(id) o_sum,bookid FROM orders GROUP BY bookid) `o`
ON o.bookid = b.id
You should really reconsider your approach with books and authors.

Concat foreign key values from a self related table

I have a products database which has a multi-tier category structure. Products are assigned to a category. The category table looks like this:
id name parent_id
================================
1 Electronics NULL
2 AV 1
3 Speakers 2
4 Wireless 3
What I want to do is, as part of my SELECT statement for products, output a concatenated string of the category tree.
The product is always assigned to the last category, so for example, Product "500w Wireless Speakers" would be assigned to category_id 4 (based on the above).
The output column should be Electronics-AV-Speakers-Wireless.
Is this possible to do? I have looked at GROUP_CONCAT() but I'm having trouble working out the correct syntax.
Join as many times as you need, and concat the names:
select concat(a.name, '-', b.name, '-', c.name, '-', d.name) name
from mytable a
join mytable b on a.id = b.parent_id
join mytable c on b.id = c.parent_id
join mytable d on c.id = d.parent_id;
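If the tree depth is not fixed, a recursive common table expression can walk from the product's category up to the root instead of joining a fixed number of times. This is a sketch under assumptions: it needs an engine with recursive CTE support (MySQL 8.0 or later, for instance), it is shown here on SQLite through Python's sqlite3 module, and the table name `category` and the helper names are made up. SQLite concatenates with `||`; in MySQL that part would use CONCAT.

```python
import sqlite3


def make_category_db():
    """Create the category table from the question in memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT, parent_id INTEGER);
        INSERT INTO category VALUES
            (1, 'Electronics', NULL), (2, 'AV', 1), (3, 'Speakers', 2), (4, 'Wireless', 3);
    """)
    return conn


def category_path(conn, category_id):
    """Return 'Root-...-Leaf' for the given category id, or None if it is unknown."""
    sql = """
        WITH RECURSIVE chain(id, parent_id, path) AS (
            -- start at the category the product is assigned to
            SELECT id, parent_id, name FROM category WHERE id = ?
            UNION ALL
            -- step to the parent and prepend its name
            SELECT c.id, c.parent_id, c.name || '-' || chain.path
            FROM category c
            JOIN chain ON c.id = chain.parent_id
        )
        SELECT path FROM chain WHERE parent_id IS NULL
    """
    row = conn.execute(sql, (category_id,)).fetchone()
    return row[0] if row else None


print(category_path(make_category_db(), 4))  # Electronics-AV-Speakers-Wireless
```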

Gather values from different tables based on a table field

Suppose I have the following table structure:
TABLE 1
main_id | type | information
first segway excellent
second car mercedes
third bike sliceofwind
TABLE segway
id | grade
1 excellent
2 bad
3 (...)
TABLE car
id | brand
1 mercedes
2 honda
3 (...)
TABLE bike
id | tires
1 sliceofwind
2 flatasfaque
3 (...)
What I'd like to do is dynamically obtain information from different tables based on type from table1.
Here's the generic example of a query that I've tried
SELECT (CASE
WHEN table1.type = 'segway' AND segway.grade = table1.information
THEN segway.id,
WHEN table1.type = 'car' AND car.brand = table1.information
THEN car.id,
WHEN table1.type = 'bike' AND bike.tires = table1.information
THEN bike.id
END) AS information
FROM table1,segway,bike,car WHERE table1.main_id IN ("ids")
The result of this query is a cartesian product because all the data from all tables will be retrieved despite the restrictions inside the case because not all tables have restrictions.
I'd like to know if there is a way to work around this without changing the table structure, and if not, I'd welcome some hints! (I'm up to some kinky SQL stuff; what I'm asking here is whether it is indeed possible to do this, regardless of whether it is advised, and why.)
This might be one way to do it.
SELECT t1.*
FROM table1 t1
LEFT JOIN segway s
on T1.main_id = s.id and T1.type= 'segway'
LEFT JOIN car c
on T1.main_id = c.id and T1.type= 'car'
LEFT JOIN bike b
on T1.main_id = b.id and T1.type= 'bike'
WHERE t1.main_ID in (SomeList)
segway, car, and bike table columns will be null when the Table1's type doesn't match.
However, this seems like it would give you back more data/columns than you need. I think you'd be better off writing separate queries outside the database and calling them depending on the value selected, or using a procedure within the database with conditional logic to return the desired result set (again 3 separate queries and conditional logic, but in the database). Without understanding the use case, I can't really say which would be better.
We could further coalesce the brand, tires and grade into a "value" field, as in
Select t1.*, coalesce(s.grade,c.brand,b.tires) as value but I'm not sure this offers any help either.
If we needed to only return table 1 values and set values from the other tables... you said kinky, not me.
I can't see how the Cartesian would occur this way.
This should give your expected result; try this query:
SELECT (CASE WHEN s.`id` IS NOT NULL THEN s.`id`
WHEN c.`id` IS NOT NULL THEN c.`id`
WHEN b.`id` IS NOT NULL THEN b.`id`
END) AS information FROM table1 AS t1
LEFT JOIN segway s ON t1.type= 'segway'
LEFT JOIN car c ON t1.type= 'car'
LEFT JOIN bike b ON t1.type= 'bike'
WHERE t1.main_ID IN (1, 2, 3) AND (t1.information = s.`grade` OR
t1.`information`=c.brand OR
t1.`information`=b.tires);
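One way to avoid the cartesian product is to put both the type check and the value match into each LEFT JOIN's ON clause, so every table1 row joins to at most the rows of the one table its type names. The sketch below assumes that `information` is meant to match `grade`, `brand`, or `tires` as in the question's CASE expression; it uses SQLite through Python's sqlite3 module, and the helper names are hypothetical.

```python
import sqlite3


def make_vehicle_type_db():
    """Load the question's sample rows (the '(...)' placeholder rows are left out)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table1 (main_id TEXT, type TEXT, information TEXT);
        CREATE TABLE segway (id INTEGER, grade TEXT);
        CREATE TABLE car (id INTEGER, brand TEXT);
        CREATE TABLE bike (id INTEGER, tires TEXT);
        INSERT INTO table1 VALUES
            ('first', 'segway', 'excellent'),
            ('second', 'car', 'mercedes'),
            ('third', 'bike', 'sliceofwind');
        INSERT INTO segway VALUES (1, 'excellent'), (2, 'bad');
        INSERT INTO car VALUES (1, 'mercedes'), (2, 'honda');
        INSERT INTO bike VALUES (1, 'sliceofwind'), (2, 'flatasfaque');
    """)
    return conn


def resolve_information(conn):
    """Map each table1 row to the id of the matching row in its type's table."""
    sql = """
        SELECT t1.main_id, COALESCE(s.id, c.id, b.id) AS information
        FROM table1 t1
        -- each join can only match when the row's type names that table
        LEFT JOIN segway s ON t1.type = 'segway' AND s.grade = t1.information
        LEFT JOIN car    c ON t1.type = 'car'    AND c.brand = t1.information
        LEFT JOIN bike   b ON t1.type = 'bike'   AND b.tires = t1.information
        ORDER BY t1.main_id
    """
    return conn.execute(sql).fetchall()


print(resolve_information(make_vehicle_type_db()))
# [('first', 1), ('second', 1), ('third', 1)]
```

Because the conditions live in the ON clauses rather than in WHERE, a table1 row with no match still comes back, with NULL as its information.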

How can I join these 3 tables?

MySQL "table_a":
+----+-------+
| id | title |
+----+-------+
Now I can do a search like so:
$term = mysql_real_escape_string($_GET['term']);
mysql_query("SELECT * FROM `table_a`
WHERE MATCH(`title`) AGAINST('$term' IN BOOLEAN MODE) LIMIT 0,5");
However, I want to add another component.
MySQL "table_b":
+----+----------+
| id | category |
+----+----------+
MySQL "table_c":
+------------+------------+
| table_a_id | table_b_id |
+------------+------------+
So, I want to look for a specific category in table_b that is linked to table_a according to table_c.
BUT, it isn't always the case that a category is linked to table_a (so it could happen that there just aren't any entries in table_c that are connected to table_a).
AND, if there is a category linked to table_a I want to be able to either search for the title in table_a or the category in table_b (so both should be possible, so category shouldn't overrule title, or the other way around). But if that's not possible, then title should overrule category.
Here's what I came up with so far, but the problem is
a) it doesn't work
b) it doesn't include the title OR category as I had just explained
$term = mysql_real_escape_string($_GET['term']);
mysql_query("SELECT a.*, LEFT JOIN (SELECT c.table_a_id FROM table_b AS b
, table_c AS c WHERE b.category = '$term' AND b.id = c.table_b_id)
AS d ON d.table_a_id = a.id FROM table_a AS a
WHERE MATCH(a.title) AGAINST('$term' IN BOOLEAN MODE) LIMIT 0,5");
Any help would be greatly appreciated.
You really want to join the table_b and table_c data together and then perform your left join against that. Something like this should work for you. If there are multiple category entries that match, you will get duplicate records, since you only want data from a.*. For debugging, I would add d.* as well so you can see why there are duplicates, if there are any:
SELECT a.*
from table_a a
LEFT JOIN (SELECT c.table_a_id, b.category
from table_b b,
table_c c
where c.table_b_id = b.id) d
on a.id = d.table_a_id
WHERE d.category = '$_GET[term]'
or MATCH(a.title) AGAINST('$_GET[term]' IN BOOLEAN MODE)
LIMIT 0,5
But if you are doing anything other than testing, you absolutely have to escape the $_GET terms or, even better, use parameterized SQL. If you aren't familiar with creating prepared statements, see this StackOverflow answer.
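As a minimal sketch of the parameterized form, the same title-or-category search can be written with a bound parameter instead of string interpolation. Several things here are assumptions for the sake of a runnable example: it uses Python's sqlite3 module rather than PHP and MySQL, `LIKE` stands in for `MATCH ... AGAINST` (which is MySQL full-text syntax), and the sample rows and helper names are invented.

```python
import sqlite3


def make_search_db():
    """Three tables shaped like table_a, table_b, table_c, with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_a (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE table_b (id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE table_c (table_a_id INTEGER, table_b_id INTEGER);
        INSERT INTO table_a VALUES (1, 'red apple'), (2, 'green pear'), (3, 'fruit basket');
        INSERT INTO table_b VALUES (1, 'fruit'), (2, 'tools');
        INSERT INTO table_c VALUES (1, 1), (2, 1);
    """)
    return conn


def search(conn, term):
    """Find rows whose title contains the term or whose linked category equals it."""
    sql = """
        SELECT DISTINCT a.id, a.title
        FROM table_a a
        LEFT JOIN table_c c ON c.table_a_id = a.id
        LEFT JOIN table_b b ON b.id = c.table_b_id
        WHERE b.category = :term
           OR a.title LIKE '%' || :term || '%'   -- stand-in for MATCH ... AGAINST
        ORDER BY a.id
        LIMIT 5
    """
    # The term is bound by the driver, never spliced into the SQL text.
    return conn.execute(sql, {"term": term}).fetchall()


conn = make_search_db()
print(search(conn, "fruit"))  # [(1, 'red apple'), (2, 'green pear'), (3, 'fruit basket')]
print(search(conn, "pear"))   # [(2, 'green pear')]
```

DISTINCT removes the duplicates that appear when one row is linked to several matching categories.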

MySQL selecting rows with a max id and matching other conditions

Using the tables below as an example and the listed query as a base query, I want to add a way to select only rows with a max id! Without having to do a second query!
TABLE VEHICLES
id vehicleName
----- --------
1 cool car
2 cool car
3 cool bus
4 cool bus
5 cool bus
6 car
7 truck
8 motorcycle
9 scooter
10 scooter
11 bus
TABLE VEHICLE NAMES
nameId vehicleName
------ -------
1 cool car
2 cool bus
3 car
4 truck
5 motorcycle
6 scooter
7 bus
TABLE VEHICLE ATTRIBUTES
nameId attribute
------ ---------
1 FAST
1 SMALL
1 SHINY
2 BIG
2 SLOW
3 EXPENSIVE
4 SHINY
5 FAST
5 SMALL
6 SHINY
6 SMALL
7 SMALL
And the base query:
select a.*
from vehicle a
join vehicle_names b using(vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
group
by a.id
having count(distinct c.attribute) = 2;
So what I want to achieve is to select rows with certain attributes, that match a name but only one entry for each name that matches where the id is the highest!
So a working solution in this example would return the below rows:
id vehicleName
----- --------
2 cool car
10 scooter
if it were using some sort of max on the id. At the moment I get all the entries for cool car and scooter.
My real-world database follows a similar structure and has tens of thousands of entries in it, so a query like the one above could easily return 3000+ results. I limit the results to 100 rows to keep execution time low, as the results are used in a search on my site. The reason I have repeats of "vehicles" with the same name but a different ID is that new models are constantly added, but I keep the older ones around for those who want to dig them up. On a search by car name, though, I don't want to return the older cars, just the newest one, which is the one with the highest ID!
The correct answer would adapt the query I provided above that I'm currently using and have it only return rows where the name matches but has the highest id!
If this isn't possible, suggestions on how I can achieve what I want without massively increasing the execution time of a search would be appreciated!
If you want to keep your logic, here is what I would do:
select a.*
from vehicle a
left join vehicle a2 on (a.vehicleName = a2.vehicleName and a.id < a2.id)
join vehicle_names b on (a.vehicleName = b.vehicleName)
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
and a.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct c.attribute) = 2;
Which yields:
+----+-------------+
| id | vehicleName |
+----+-------------+
| 2 | cool car |
| 10 | scooter |
+----+-------------+
2 rows in set (0.00 sec)
As others said, normalization could be done on a few levels:
Keeping your current vehicle_names table as the primary lookup table, I would change:
update vehicle a
inner join vehicle_names b using (vehicleName)
set a.vehicleName = b.nameId;
alter table vehicle change column vehicleName nameId int;
create table attribs (
attribId int auto_increment primary key,
attribute varchar(20),
unique key attribute (attribute)
);
insert into attribs (attribute)
select distinct attribute from vehicle_attribs;
update vehicle_attribs a
inner join attribs b using (attribute)
set a.attribute=b.attribId;
alter table vehicle_attribs change column attribute attribId int;
Which leads to the following query:
select a.id, b.vehicleName
from vehicle a
left join vehicle a2 on (a.nameId = a2.nameId and a.id < a2.id)
join vehicle_names b on (a.nameId = b.nameId)
join vehicle_attribs c on (a.nameId=c.nameId)
inner join attribs d using (attribId)
where d.attribute in ('SMALL', 'SHINY')
and b.vehicleName like '%coo%'
and a2.id is null
group by a.id
having count(distinct d.attribute) = 2;
The table does not seem normalized; however, this makes it easy to do this:
select max(id), vehicleName
from VEHICLES
group by vehicleName
having count(*)>=2;
I'm not sure I completely understand your model, but the following query satisfies your requirements as they stand. The first sub query finds the latest version of the vehicle. The second query satisfies your "and" condition. Then I just join the queries on vehiclename (which is the key?).
select a.id
,a.vehiclename
from (select a.vehicleName, max(id) as id
from vehicle a
where vehicleName like '%coo%'
group by vehicleName
) as a
join (select b.vehiclename
from vehicle_names b
join vehicle_attribs c using(nameId)
where c.attribute in('SMALL', 'SHINY')
group by b.vehiclename
having count(distinct c.attribute) = 2
) as b on (a.vehicleName = b.vehicleName);
If this "latest vehicle" logic is something you will need to do a lot, a small suggestion would be to create a view (see below) which returns the latest version of each vehicle. Then you could use the view instead of the find-max-query. Note that this is purely for ease-of-use, it offers no performance benefits.
select *
from vehicle a
where id = (select max(b.id)
from vehicle b
where a.vehiclename = b.vehiclename);
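For comparison, the latest-row condition and the attribute filter can also be combined in one statement by using a correlated MAX subquery in the WHERE clause. This is a sketch rather than a tuned solution: it runs the question's sample data in SQLite through Python's sqlite3 module, the helper names are hypothetical, and nothing is claimed about how it performs on the real tables.

```python
import sqlite3


def make_vehicle_db():
    """Load the question's three sample tables into memory."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE vehicle (id INTEGER PRIMARY KEY, vehicleName TEXT);
        CREATE TABLE vehicle_names (nameId INTEGER PRIMARY KEY, vehicleName TEXT);
        CREATE TABLE vehicle_attribs (nameId INTEGER, attribute TEXT);
        INSERT INTO vehicle VALUES
            (1, 'cool car'), (2, 'cool car'), (3, 'cool bus'), (4, 'cool bus'),
            (5, 'cool bus'), (6, 'car'), (7, 'truck'), (8, 'motorcycle'),
            (9, 'scooter'), (10, 'scooter'), (11, 'bus');
        INSERT INTO vehicle_names VALUES
            (1, 'cool car'), (2, 'cool bus'), (3, 'car'), (4, 'truck'),
            (5, 'motorcycle'), (6, 'scooter'), (7, 'bus');
        INSERT INTO vehicle_attribs VALUES
            (1, 'FAST'), (1, 'SMALL'), (1, 'SHINY'), (2, 'BIG'), (2, 'SLOW'),
            (3, 'EXPENSIVE'), (4, 'SHINY'), (5, 'FAST'), (5, 'SMALL'),
            (6, 'SHINY'), (6, 'SMALL'), (7, 'SMALL');
    """)
    return conn


def latest_matching(conn, name_part):
    """Newest row per name that has both the SMALL and SHINY attributes."""
    sql = """
        SELECT v.id, v.vehicleName
        FROM vehicle v
        JOIN vehicle_names n ON n.vehicleName = v.vehicleName
        JOIN vehicle_attribs a ON a.nameId = n.nameId
        WHERE a.attribute IN ('SMALL', 'SHINY')
          AND v.vehicleName LIKE '%' || ? || '%'
          -- keep only the highest id among rows sharing this name
          AND v.id = (SELECT MAX(v2.id) FROM vehicle v2
                      WHERE v2.vehicleName = v.vehicleName)
        GROUP BY v.id, v.vehicleName
        HAVING COUNT(DISTINCT a.attribute) = 2
        ORDER BY v.id
    """
    return conn.execute(sql, (name_part,)).fetchall()


print(latest_matching(make_vehicle_db(), "coo"))  # [(2, 'cool car'), (10, 'scooter')]
```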
Without going into a proper redesign of your model, you could
1) Add a column IsLatest that your application could manage.
This is not perfect but will satisfy your question (until the next problem, see the note at the end).
All you need, when you add a new entry, is to issue queries such as
UPDATE a
SET IsLatest = 0
WHERE IsLatest = 1
AND vehicleName = #name
INSERT new a
UPDATE a
SET IsLatest = 1
WHERE id = #last_inserted_id
in a transaction or a trigger
2) Alternatively you can find out the max_id before you issue your query
SELECT MAX(id)
FROM a
WHERE vehicleName = #name
3) You can do it in a single SQL statement, and with an index on (vehicleName, id) it should actually have decent speed:
select a.*
from vehicle a
join vehicle_names b ON a.vehicleName = b.vehicleName
join vehicle_attribs c ON b.nameId = c.nameId AND c.attribute = 'SMALL'
join vehicle_attribs d ON b.nameId = d.nameId AND d.attribute = 'SHINY'
left join vehicle notmax ON a.vehicleName = notmax.vehicleName AND a.id < notmax.id
where a.vehicleName like '%coo%'
AND notmax.id IS NULL
I have removed your GROUP BY and HAVING and replaced them with another join (assuming each attribute appears at most once per nameId).
I have also used one of the ways to find the max per group: join the table to itself and keep only the rows for which no record with a bigger id exists for the same name.
There are other ways; search SO for 'max per group sql'. Also see here, though not complete.