Performance improvement for SQL count query - mysql

What type of sql query would I use to turn the following;
| ID | SERIAL | LCN | INITLCN |
|------|----------|-------|---------|
| 1 | A | A1 | |
| 2 | B | A2 | |
| 3 | C | A3 | A1 |
| 4 | D | A4 | A2 |
| 5 | E | A5 | A1 |
|------|----------|-------|---------|
into a result similar to this;
| ID | COUNT |
|------|---------|
| 1 | 2 |
| 2 | 1 |
|------|---------|
Using my low SQL skills, I have managed to write the below query however it is extremely slow;
select
a.id,
count (b.id) as parent
from assets a
left join assets b
ON (a.lcn = b.initlcn)
group by a.id
order by a.id;

select
t1.ID,
t1.LCN,
COUNT(*)
from
Table1 t1
INNER JOIN Table1 t2 ON t1.LCN = t2.INITLCN
GROUP BY t1.LCN
See it working live in an sqlfiddle.

maybe two table join is not nesecerry .
try this
select
a.id
,b.cnt
from assets a
join (
select
initlcn
count(1) cnt
from assets
group by initlcn
)
b on (a.lcn=b.initlcn)

you can test following query also -
I have checked the execution plan for your posted query and the following query and I am getting good difference between both -
SELECT t_1.Id, t_2.Cnt
FROM Assets t_1,
(SELECT Initlcn, COUNT(*) Cnt FROM Assets GROUP BY Initlcn) t_2
WHERE t_1.Lcn = t_2.Initlcn

Related

how to perform an outer join in mysql

I have a table A that contains tree columns, id, users ids and vehicle id. And a table B that contains vehicleid, and vehicle name.
Table A
---------------------------
| Id | User_id |Vehicle_id|
---------------------------
| 1 | 1 | 2 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
| 4 | 2 | 2 |
| 5 | 2 | 3 |
| 6 | 4 | 5 |
---------------------------
Table B
-------------------
| Id |Vehicle_name|
-------------------
| 1 | Car |
| 2 | Bike |
| 3 | Plane |
| 4 | Boat |
| 5 | Rocket |
-------------------
Given a user id, I need to get all vehicle names, that doesn't match with table A. I've tried Outer joins, but I can't manage to do get the info that i need.
For example: Given user id 1, the query should return Car and Rocket.
thanks in advance
This is simple enough using not in or not exists:
select b.*
from b
where not exists (select 1
from a
where a.vehicle_id = b.id and a.user_id = #a_user_id
);
I also thought of using a cross join and was able to get the output in case you are more comfortable with join logic.
SELECT CJOIN.USER_ID, CJOIN.VEHICLE_ID, CJOIN.VEHICLE_NAME
FROM
(SELECT DISTINCT A.USER_ID, B.ID AS VEHICLE_ID, B.VEHICLE_NAME FROM TABLE_A A CROSS JOIN TABLE_B B) CJOIN
LEFT JOIN
TABLE_A D
ON CJOIN.USER_ID = D.USER_ID AND CJOIN.VEHICLE_ID = D.VEHICLE_ID
WHERE D.USER_ID IS NULL AND D.VEHICLE_ID IS NULL;
First, I got all possible combinations of USER_ID x VEHICLE_ID by a cross join and used this table in a left join to pull records for which there is no match.

Identifying the pairs of ID's in a column with the highest number of matches in SQL

I am trying to find the pairs of businesses with the highest number of common customers using MySQL.
The table is like the following:
+------------+------------+
| BusinessID | CustomerID |
+------------+------------+
| A | 1 |
| A | 2 |
| A | 3 |
| B | 4 |
| B | 1 |
| B | 3 |
| B | 2 |
| C | 3 |
| C | 4 |
| C | 5 |
+------------+------------+
And I want the output to be the pairs of businesses and the number of common customers, like this:
+-------------+-------------+------------------------+
| BusinessID | BusinessID | Common Customers Count |
+-------------+-------------+------------------------+
| A | B | 3 |
| A | C | 1 |
| B | C | 2 |
+-------------+-------------+------------------------+
This is the query I wrote:
SELECT a.BusinessID,b.BusinessID,COUNT(*) AS ncom
FROM (SELECT BusinessID, CustomerID FROM MYTABLE) AS a JOIN
(SELECT BusinessID,CustomerID FROM MYTABLE) AS b
ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
The problem is that my dataset has about 5m rows, and this seems to be too inefficient on large datasets. I tested the query on smaller datasets by limiting the data -- it took 8 seconds to process 10k rows and 30 seconds for 20k rows, so this query wouldn't be feasible to run for 5m rows. How else can I write the query to make it faster?
Don't use subqueries to get the columns from the table, that's probably preventing it from using indexes.
SELECT a.BusinessID, b.BusinessID, COUNT(*) as ncom
FROM MYTABLE AS a
JOIN MYTABLE AS b ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
Also, give the table the following index:
CREATE INDEX ix_cust_bus ON MYTABLE (CustomerID, BusinessID);

conditional mysql query multiple tables

I have 2 tables in mysql database as shown below. I am looking for a query that will select * from books but if preview_image = 'none' then preview_image = the hash_id of the row with the largest size where books.id = images.parentid. Hope this makes sense.
table books
+----------------+---------------+
| id | title | preview_image |
+----------------+---------------+
| 1 | book1 | 55859076d906 |
| 2 | book2 | 20a14f9fd7cf |
| 3 | book3 | none |
| 4 | book4 | ce805ecff5c9 |
| 5 | book5 | e60a7217b3e2 |
+----------------+---------------+
table images
+-------------+------+---------------+
| parentid | size | hash_id |
+--------------------+---------------+
| 2 | 100 | 55859076d906 |
| 1 | 200 | 20a14f9fd7cf |
| 3 | 300 | 34805fr5c9e5 |
| 3 | 400 | ce805ecff5c9 |
| 3 | 500 | e60a7217b3e2 |
+--------------------+---------------+
Thanks
You can use SUBSTRING_INDEX() to obtain the first record from a sorted GROUP_CONCAT(), and switch using a CASE expression:
SELECT books.id, books.title, CASE books.preview_image
WHEN 'none' THEN SUBSTRING_INDEX(
GROUP_CONCAT(images.hash_id ORDER BY images.size DESC SEPARATOR ',')
, ',', 1)
ELSE books.preview_image
END AS preview_image
FROM books LEFT JOIN images ON images.parentid = books.id
GROUP BY books.id
Write a subquery that finds the desired hash ID for each parent ID, using one of the techniques in SQL Select only rows with Max Value on a Column. Then join this with the books table.
SELECT b.id, b.title, IF(b.preview_image = 'none', i.hash_id, b.preview_image) AS image
FROM books AS b
LEFT JOIN (SELECT i1.parentid, i1.hash_id
FROM images AS i1
JOIN (SELECT parentid, MAX(size) AS maxsize
FROM images
GROUP BY parentid) AS i2
ON i1.parentid = i2.parentid AND i1.size = i2.size) AS i
ON b.id = i.parentid

MySQL select unique rows in two columns with the highest value in one column

I have a basic table:
+-----+--------+------+------+
| id, | name, | cat, | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 2 | jamie | 2 | 100 |
| 3 | jamie | 1 | 50 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
I tried using the "Left Joining with self, tweaking join conditions and filters" part of this answer: SQL Select only rows with Max Value on a Column but some reason when there are records with a value of 0 it breaks, and it also doesn't return every unique answer for some reason.
When doing the query on this table I'd like to receive the following values:
+-----+--------+------+------+
| id, | name, | cat, | time |
+-----+--------+------+------+
| 1 | jamie | 1 | 100 |
| 4 | jamie | 2 | 150 |
| 5 | bob | 1 | 100 |
| 6 | tim | 1 | 300 |
| 7 | alice | 4 | 100 |
+-----+--------+------+------+
Because they are unique on name and cat and have the highest time value.
The query I adapted from the answer above is:
SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
SELECT name, cat, id, MAX(time) as time
FROM data
WHERE extra_column = 1
GROUP BY name, cat
) b ON a.id = b.id AND a.time = b.time
The issue here is that ID is unique per row you can't get the unique value when getting the max; you have to join on the grouped values instead.
SELECT a.name, a.cat, a.id, a.time
FROM data A
INNER JOIN (
SELECT name, cat, MAX(time) as time
FROM data
WHERE extra_column = 1
GROUP BY name, cat
) b ON A.Cat = B.cat and A.Name = B.Name AND a.time = b.time
Think about it... So what ID is mySQL returning form the Inline view? It could be 1 or 3 and 2 or 4 for jamie. Hows does the engine know to pick the one with the max ID? it is "free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. " it could pick the wrong one resulting in incorrect results. So you can't use it to join on.
https://dev.mysql.com/doc/refman/5.0/en/group-by-handling.html
If you want to use a self join, you could use this query:
SELECT
d1.*
FROM
date d1 LEFT JOIN date d2
ON d1.name=d2.name
AND d1.cat=d2.cat
AND d1.time<d2.time
WHERE
d2.time IS NULL
It is very simple
SELECT MAX(TIME),name,cat FROM table name group by cat

Mysql join on 3 tables output

I am learning joins and have the following tables.
Student
| ID | NAME |
-------------
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
Pass
| ID | MARKS |
--------------
| 2 | 80 |
| 3 | 75 |
Fail
| ID | MARKS |
--------------
| 1 | 25 |
| 4 | 20 |
The output I want is this:
| NAME | MARKS |
----------------
| B | 80 |
| C | 75 |
| A | 25 |
| D | 20 |
I wrote a query like this:
select s.id,s.name,p.marks from student s
left join pass p on s.id=p.id
left join (select f.marks,f.id from fail f ) as nn on s.id=nn.id
order by marks desc;
The output I got is this:
| id | name | Marks|
--------------------
| 1 | B | 80 |
| 2 | C | 75 |
| 3 | A | Null |
| 4 | D | NUll |
Cant figure out why Null is coming. Any pointers?
You can use CASE statement for that:
SELECT Name,
CASE WHEN P.Marks IS NULL THEN f.Marks ELSE P.Marks END AS Marks
FROM Student s
LEFT JOIN Pass p ON s.ID = p.ID
LEFT JOIN Fail f ON s.ID = f.ID
ORDER BY Marks DESC;
Or you can also use IF statement:
SELECT Name,
IF(P.Marks IS NULL, F.Marks, P.Marks) AS Marks
FROM Student s
LEFT JOIN Pass p ON s.ID = p.ID
LEFT JOIN Fail f ON s.ID = f.ID
ORDER BY Marks DESC;
Output
| NAME | MARKS |
----------------
| B | 80 |
| C | 75 |
| A | 25 |
| D | 20 |
See this SQLFiddle
To learn more about JOINs see: A Visual Explanation of SQL Joins
You select only the passed marks, this is the reason of null-s appears near falied results.
If you want to select the failed marks you can use IF condition
select s.id,s.name,IF(p.marks = null, nn.marks, p.marks) as markss
from student s
left join pass p on s.id=p.id
left join fail nn on s.id=nn.id
order by markss desc;
Or you can use union of the passed and failed results.
select s.id,s.name, u.marks
from student s
left join ( (SELECT * FROM pass) UNION (SELECT * FROM fail) ) as n ON n.id = s.id
order by marks desc;
You need to understand how the different joins work to understand why you receive NULL for the marks column.
Take a look here:A Visual Explanation of SQL Joins
The relevant example for you is:
LEFT OUTER JOIN:
SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name
id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null
The Null values you received for the marks column are rows that have no match in the left joined tables.
(the left part of the Venn diagram) the values that does have a value are the cross section between the tow groups of the Venn Diagram.
specifics for your example:
select s.id,s.name,p.marks
from student s
left join pass p on s.id=p.id
left join (select f.marks,f.id from fail f ) as nn on s.id=nn.id
order by marks desc;
The output i got is this:
id | name | Marks
-------------------
1 | B | 80
2 | C | 75
3 | A | Null
4 | D | NUll
This will return all student rows when.
students that have a passing gtade will display the grade and thous who don't will display null.
Try the below Query, use COALESCE
select s.id,s.name,COALESCE(p.marks , nn.marks) as marks
from student s
left join pass p on s.id=p.id
left join fail nn on s.id=nn.id
order by marks desc;
SQL Fiddle