How to get the non-duplicates? - MySQL

I need to write a SQL query.
I have a field that contains IDs.
These IDs are written in two ways, prefixed with either 'C0' or 'E0', for example "C0121213" or "E0121213".
I would like a query that counts the IDs starting with C0 that have no duplicate starting with E0.
That is, I would like to find the IDs that do not have a C0/E0 counterpart.
Thank you in advance.
I started with this query:
SELECT *
FROM SBYN
WHERE ID IN (
SELECT LID
FROM SBYN
WHERE LEFT(ID,2) = 'C0'
OR LEFT(ID, 2) = 'E0'
GROUP BY LID HAVING COUNT(*) > 1
)
ORDER BY ID

NOT EXISTS comes to mind:
SELECT COUNT(*)
FROM SBYN s
WHERE s.ID LIKE 'C0%' AND
NOT EXISTS (SELECT 1
FROM SBYN s2
WHERE s2.ID LIKE 'E0%' AND
SUBSTRING(s2.ID, 2) = SUBSTRING(s.ID, 2)
);
If you want the IDs, then use SELECT ID rather than SELECT COUNT(*).
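As a quick sanity check of the NOT EXISTS idea, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database through Python's sqlite3 module. The sample IDs are made up, and SQLite's substr() stands in for MySQL's SUBSTRING(); position 3 is used so that only the part after the two-character prefix is compared.

```python
import sqlite3

# Hypothetical sample data; the table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SBYN (ID TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO SBYN (ID) VALUES (?)",
    [("C0121213",), ("E0121213",), ("C0999999",), ("E0555555",), ("C0777777",)],
)

# Keep C0 rows for which no E0 row shares the same suffix.
unpaired_c0 = [
    row[0]
    for row in conn.execute(
        """
        SELECT s.ID
        FROM SBYN s
        WHERE s.ID LIKE 'C0%'
          AND NOT EXISTS (SELECT 1
                          FROM SBYN s2
                          WHERE s2.ID LIKE 'E0%'
                            AND substr(s2.ID, 3) = substr(s.ID, 3))
        ORDER BY s.ID
        """
    )
]
print(unpaired_c0)
```

C0121213 is dropped because E0121213 exists; the other two C0 IDs have no E0 counterpart and are kept.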

Using NOT EXISTS:
SELECT
ID
FROM SBYN s1
WHERE
ID LIKE 'C0%' AND
NOT EXISTS (SELECT 1 FROM SBYN s2
WHERE s2.ID LIKE 'E0%' AND
SUBSTRING(s1.ID, 3) = SUBSTRING(s2.ID, 3));

With NOT EXISTS:
select * from sbyn s
where not exists (
select 1 from sbyn
where left(id, 2) <> left(s.id, 2) and
substring(id, 3) = substring(s.id, 3)
)
This will return all the non duplicates.
If you care only about those starting with C0 add to the where clause:
and left(s.id, 2) = 'C0'

Related

Is there any other way to use Alias in Sql?

I am working on the adjacent ID exchange problem. I found one solution, and it partly works.
SELECT tmp.id, tmp.student FROM
(
SELECT id-1 AS id, student FROM seat WHERE id%2 = 0 -- even id -1
UNION
SELECT id+1 AS id, student FROM seat WHERE id%2 = 1 -- odd id +1
) tmp
ORDER BY tmp.id
I know the basic idea, but I am still confused by the syntax. Where does tmp come from? Is there another way to use an alias in SQL?
In the sample below, the tmp alias is renamed to my_table_alias:
SELECT my_table_alias.id, my_table_alias.student FROM
(
SELECT id-1 AS id, student FROM seat WHERE id%2 = 0 -- even id -1
UNION
SELECT id+1 AS id, student FROM seat WHERE id%2 = 1 -- odd id +1
) my_table_alias
ORDER BY my_table_alias.id
The alias is introduced by the SELECT ... FROM ( subquery ) my_table_alias syntax.
When you use FROM ( subquery ), you need a table alias such as my_table_alias.
MySQL requires this alias for a FROM ( subquery ), and it is also how you refer to the subquery's columns in the outer part of the query.
You can just use a case expression:
SELECT (CASE WHEN s.id % 2 = 0 THEN s.id - 1 ELSE s.id + 1
END) AS id,
s.student
FROM seat s;
UNION is definitely not the right way to approach this problem.
Note that the s plays the same role as tmp in your question. It is a table alias that names a table or subquery in the from clause.
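To see that the derived-table version and the CASE version produce the same swap, here is a small sketch using Python's sqlite3 module. The student names are invented sample data, and an ORDER BY is added to the CASE query only to make the comparison deterministic. One caveat: SQLite, unlike MySQL, does not insist on an alias for a subquery in FROM, so this shows the equivalence of the two queries rather than the alias requirement itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE seat (id INTEGER PRIMARY KEY, student TEXT)")
conn.executemany(
    "INSERT INTO seat VALUES (?, ?)",
    [(1, "Abbot"), (2, "Doris"), (3, "Emerson"), (4, "Green")],  # made-up rows
)

# Derived table named tmp; the outer query refers to its columns via the alias.
union_rows = conn.execute(
    """
    SELECT tmp.id, tmp.student
    FROM (
        SELECT id - 1 AS id, student FROM seat WHERE id % 2 = 0
        UNION
        SELECT id + 1 AS id, student FROM seat WHERE id % 2 = 1
    ) tmp
    ORDER BY tmp.id
    """
).fetchall()

# Same result from one pass over seat; s is an alias for the base table.
case_rows = conn.execute(
    """
    SELECT (CASE WHEN s.id % 2 = 0 THEN s.id - 1 ELSE s.id + 1 END) AS id,
           s.student
    FROM seat s
    ORDER BY 1
    """
).fetchall()

print(union_rows)
print(case_rows)
```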

SQL - Split a column into two columns in Mysql

I have this table. Considering the id starts from 0.
Table 1
ID Letter
1 A
2 B
3 C
4 D
6 E
I need following output
Col1 Col2
NULL A
B C
D NULL
E NULL
I tried using union with id, id - 1 and id + 1, but I couldn't figure out how to get letter based on ids, also tried even odd logic but nothing worked.
Any help is appreciated.
Thank you
You didn't post the database engine, so I'll assume PostgreSQL, where the modulus operator is %.
The query should be:
select o.letter, e.letter
from (
select id, letter, id as base from my_table where id % 2 = 0
) o full outer join (
select id, letter, (id - 1) as base from my_table where id % 2 <> 0
) e on e.base = o.base
order by coalesce(o.base, e.base)
Please take the following option with a grain of salt since I don't have a way of testing it in MySQL 5.6. In the absence of a full outer join, you can perform two outer joins, and then you can union them, as in:
select * from (
select o.base, o.letter as col1, e.letter as col2
from (
select id, letter, id as base from my_table where id % 2 = 0
) o left join (
select id, letter, (id - 1) as base from my_table where id % 2 <> 0
) e on e.base = o.base
union
select e.base, o.letter, e.letter
from (
select id, letter, id as base from my_table where id % 2 = 0
) o right join (
select id, letter, (id - 1) as base from my_table where id % 2 <> 0
) e on e.base = o.base
) x
order by base
Just use conditional aggregation:
select max(case when id % 2 = 0 then letter end) as col1,
max(case when id % 2 = 1 then letter end) as col2
from t
group by floor(id / 2);
If you prefer, you can use mod() instead of %. MySQL supports both.
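Here is a minimal sketch of the conditional aggregation against the sample rows from the question, run through Python's sqlite3 module. Two things are assumptions of the sketch rather than part of the answer: SQLite's integer division id / 2 replaces floor(id / 2) (it already truncates for non-negative integers), and an ORDER BY is added so the rows come back in pair order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, letter TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (2, "B"), (3, "C"), (4, "D"), (6, "E")],
)

# Each group holds the ids 2k and 2k+1; the even id feeds col1, the odd id col2.
# A CASE with no ELSE yields NULL, which MAX ignores.
pairs = conn.execute(
    """
    SELECT MAX(CASE WHEN id % 2 = 0 THEN letter END) AS col1,
           MAX(CASE WHEN id % 2 = 1 THEN letter END) AS col2
    FROM t
    GROUP BY id / 2
    ORDER BY id / 2
    """
).fetchall()

for col1, col2 in pairs:
    print(col1, col2)
```

The four groups are {1}, {2, 3}, {4} and {6}, which gives the NULL/A, B/C, D/NULL, E/NULL layout asked for.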

Want to improve SQL query

I have a table called "world". It has some empty IDs:
id - data
1 - ...
2 - ...
(no 3,4 IDs after 2)
5 - ...
And I have a query to select the lowest unused ID in this table. It looks like:
SELECT MIN(t1.id)
FROM
(
SELECT 1 AS id
UNION ALL
SELECT id + 1
FROM world
) t1
LEFT OUTER JOIN world t2
ON t1.id = t2.id
WHERE t2.id IS NULL;
I want to find a way to make this query execute faster.
You can do something like this:
select (w.id + 1)
from world w left join
world w2
on w.id = w2.id - 1
where w2.id is null
order by w.id
limit 1
This should have reasonable performance with an index on world(id).
This will give you the first unused ID after the first used one (i.e. it won't give you ID 1 if it's unused but will work for the rest).
SELECT id + 1 FROM world WHERE (id + 1) NOT IN (
SELECT id FROM world
) ORDER BY id ASC LIMIT 1
To include ID 1 you could do a specific check first to see if it exists, or do something like:
SELECT IF(
NOT EXISTS(SELECT id FROM world WHERE id = 1),
1,
(SELECT id + 1 FROM world WHERE (id + 1) NOT IN (
SELECT id FROM world
) ORDER BY id ASC LIMIT 1)
) AS min_unused
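The difference between the approaches shows up when ID 1 itself is free. The sketch below, using Python's sqlite3 module with made-up ID sets, runs the question's UNION ALL query and the self-join query side by side. The helper name lowest_unused is hypothetical, and it assumes a non-empty table (the self-join returns no row for an empty one).

```python
import sqlite3

UNION_QUERY = """
    SELECT MIN(t1.id)
    FROM (SELECT 1 AS id UNION ALL SELECT id + 1 FROM world) t1
    LEFT OUTER JOIN world t2 ON t1.id = t2.id
    WHERE t2.id IS NULL
"""

SELF_JOIN_QUERY = """
    SELECT w.id + 1
    FROM world w
    LEFT JOIN world w2 ON w.id = w2.id - 1
    WHERE w2.id IS NULL
    ORDER BY w.id
    LIMIT 1
"""

def lowest_unused(ids):
    """Return (union_result, self_join_result) for a non-empty list of used ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE world (id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO world (id) VALUES (?)", [(i,) for i in ids])
    union_result = conn.execute(UNION_QUERY).fetchone()[0]
    self_join_result = conn.execute(SELF_JOIN_QUERY).fetchone()[0]
    conn.close()
    return union_result, self_join_result

print(lowest_unused([1, 2, 5]))  # both queries agree when id 1 is taken
print(lowest_unused([2, 3]))     # they differ when id 1 is free
```

With IDs 1, 2, 5 both queries find 3. With IDs 2, 3 the UNION ALL query finds 1, while the self-join only looks after existing rows and finds 4.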

Select inside math function

This is most likely a beginner's question in SQL. Is it possible to use a select within a math expression?
For example, I have two tables:
- table A with a column named id (primary key) and another column named val_A
- table B with a column named id (primary key) and another column named val_B
I want to do something like:
select ((select val_A from A where id = 1) +
(select val_B from B where id = 1)) as final_sum;
I'm using MySQL and it is throwing errors. I'm assuming that this is because the result of a select is a set, whereas I want the numeric values of val_A and val_B so that I can add them.
Is there any way of doing this?
Thanks!
The query that you have:
select ((select val_A from A where id = 1) +
(select val_B from B where id = 1)
) as final_sum
is correctly formed SQL in MySQL (assuming that the table and columns exist).
However, it assumes that each subquery only returns one row. If not, you can force it using limit or a function like min() or max():
select ((select val_A from A where id = 1 limit 1) +
(select max(val_B) from B where id = 1)
) as final_sum
Or, possibly, you are trying to get the sum of all the rows with id = 1 in both tables:
select ((select sum(val_A) from A where id = 1) +
(select sum(val_B) from B where id = 1)
) as final_sum
Yes you can do that, but a more proper query format would be:
SELECT (a.val_a + b.val_b) as final_sum
FROM a INNER JOIN b ON a.id = b.id
WHERE a.id = 1
I'm not sure why it's not working, but you could try something like:
select (val_A + val_B) as final_sum from A,B where A.id=1 and B.id=1;
Break the query down and test it piece by piece, starting with:
select 1+1
Your statement has the same shape, with a subquery in place of each operand. This would run:
select ((select sum(val_A) from A where id = 1) +
(select sum(val_B) from B where id = 1)) as final_sum;
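For a concrete check, here is a small sketch using Python's sqlite3 module with invented values. It compares the scalar-subquery form with the join form, including the case where one table has no matching row: the scalar subquery then yields NULL (None in Python), while the inner join yields no row at all. The helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE A (id INTEGER PRIMARY KEY, val_A INTEGER);
    CREATE TABLE B (id INTEGER PRIMARY KEY, val_B INTEGER);
    INSERT INTO A VALUES (1, 10), (2, 5);   -- made-up values
    INSERT INTO B VALUES (1, 32);           -- no row with id = 2
    """
)

def sum_with_subqueries(row_id):
    # Each parenthesised SELECT returns at most one row here, so it acts as a scalar.
    query = """
        SELECT ((SELECT val_A FROM A WHERE id = ?) +
                (SELECT val_B FROM B WHERE id = ?)) AS final_sum
    """
    return conn.execute(query, (row_id, row_id)).fetchone()[0]

def sum_with_join(row_id):
    query = """
        SELECT (a.val_A + b.val_B) AS final_sum
        FROM A a INNER JOIN B b ON a.id = b.id
        WHERE a.id = ?
    """
    return conn.execute(query, (row_id,)).fetchall()

print(sum_with_subqueries(1), sum_with_join(1))
print(sum_with_subqueries(2), sum_with_join(2))
```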

Fetch 2nd Highest value from MySQL DB with GROUP BY

I have a table tbl_patient, and I want to fetch the last 2 visits of each patient in order to compare whether the patient's condition is improving or degrading.
tbl_patient
id | patient_ID | visit_ID | patient_result
1 | 1 | 1 | 5
2 | 2 | 1 | 6
3 | 2 | 3 | 7
4 | 1 | 2 | 3
5 | 2 | 3 | 2
6 | 1 | 3 | 9
I tried the query below to fetch the last visit of each patient:
SELECT MAX(id), patient_result FROM `tbl_patient` GROUP BY `patient_ID`
Now I want to fetch the second-to-last visit of each patient, but the query below gives me an error
(#1242 - Subquery returns more than 1 row)
SELECT id, patient_result FROM `tbl_patient` WHERE id <(SELECT MAX(id) FROM `tbl_patient` GROUP BY `patient_ID`) GROUP BY `patient_ID`
Where am I going wrong?
select p1.patient_id, p2.maxid id1, max(p1.id) id2
from tbl_patient p1
join (select patient_id, max(id) maxid
from tbl_patient
group by patient_id) p2
on p1.patient_id = p2.patient_id and p1.id < p2.maxid
group by p1.patient_id
id1 is the ID of the last visit, id2 is the ID of the 2nd to last visit.
Your first query doesn't get the last visits, since it gives results 5 and 6 instead of 2 and 9.
You can try this query:
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
union
SELECT patient_ID,visit_ID,patient_result
FROM tbl_patient
where id in (
select max(id)
from tbl_patient
where id not in (
select max(id)
from tbl_patient
GROUP BY patient_ID)
GROUP BY patient_ID)
order by 1,2
SELECT id, patient_result FROM `tbl_patient` t1
JOIN (SELECT MAX(id) as max, patient_ID FROM `tbl_patient` GROUP BY `patient_ID`) t2
ON t1.patient_ID = t2.patient_ID
WHERE id <max GROUP BY t1.`patient_ID`
There are a couple of approaches to getting the specified resultset returned in a single SQL statement.
Unfortunately, most of those approaches yield rather unwieldy statements.
The more elegant-looking statements tend to come with poor (or unbearable) performance on large sets, and the statements that perform better tend to look less elegant.
Three of the most common approaches make use of:
correlated subquery
inequality join (nearly a Cartesian product)
two passes over the data
Here's an approach that uses two passes over the data, using MySQL user variables, which basically emulates the analytic RANK() OVER(PARTITION ...) function available in other DBMS:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM (
SELECT p.id
, p.patient_id
, p.visit_id
, p.patient_result
, @rn := if(@prev_patient_id = patient_id, @rn + 1, 1) AS rn
, @prev_patient_id := patient_id AS prev_patient_id
FROM tbl_patient p
JOIN (SELECT @rn := 0, @prev_patient_id := NULL) i
ORDER BY p.patient_id DESC, p.id DESC
) t
WHERE t.rn <= 2
Note that this involves an inline view, which means there's going to be a pass over all the data in the table to create a "derived table". Then the outer query runs against the derived table. So this is essentially two passes over the data.
This query can be tweaked a bit to improve performance, by eliminating the duplicated value of the patient_id column returned by the inline view. But I show it as above, so we can better understand what is happening.
This approach can be rather expensive on large sets, but is generally MUCH more efficient than some of the other approaches.
Note also that this query will return a row for a patient_id even if only one id value exists for that patient; it does not restrict the result to patients that have at least two rows.
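Where window functions are available (MySQL 8.0 and later), the ranking that the user variables emulate can be written directly. The sketch below swaps in ROW_NUMBER() for the user-variable technique; it is not the query from this answer. It runs through Python's sqlite3 module, which assumes the bundled SQLite is 3.25 or newer, and uses the sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tbl_patient (
        id INTEGER PRIMARY KEY,
        patient_id INTEGER,
        visit_id INTEGER,
        patient_result INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7),
     (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

# rn = 1 is the newest row per patient, rn = 2 the one before it.
last_two = conn.execute(
    """
    SELECT t.id, t.patient_id, t.visit_id, t.patient_result
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (PARTITION BY p.patient_id
                                  ORDER BY p.id DESC) AS rn
        FROM tbl_patient p
    ) t
    WHERE t.rn <= 2
    ORDER BY t.patient_id, t.id
    """
).fetchall()

for row in last_two:
    print(row)
```

As with the user-variable version, a patient with a single row still appears once in the result.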
It's also possible to get an equivalent resultset with a correlated subquery:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patient t
WHERE ( SELECT COUNT(1) AS cnt
FROM tbl_patient p
WHERE p.patient_id = t.patient_id
AND p.id >= t.id
) <= 2
ORDER BY t.patient_id ASC, t.id ASC
Note that this is making use of a "dependent subquery", which basically means that for each row returned from t, MySQL is effectively running another query against the database. So, this will tend to be very expensive (in terms of elapsed time) on large sets.
As another approach, if there are relatively few id values for each patient, you might be able to get by with an inequality join:
SELECT t.id
, t.patient_id
, t.visit_id
, t.patient_result
FROM tbl_patient t
LEFT
JOIN tbl_patient p
ON p.patient_id = t.patient_id
AND t.id < p.id
GROUP
BY t.id
, t.patient_id
, t.visit_id
, t.patient_result
HAVING COUNT(p.id) <= 1 -- at most one later row for the same patient
Note that this will create a nearly Cartesian product for each patient. For a limited number of id values per patient, this won't be too bad. But if a patient has hundreds of id values, the intermediate result can be huge, on the order of O(n**2).
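To compare the correlated-subquery and inequality-join approaches on the question's sample rows, here is a minimal sketch using Python's sqlite3 module. One detail is a choice of this sketch: the join version counts only matched rows with COUNT(p.id), because a LEFT JOIN row with no match still counts as one row under COUNT(1), so the newest row and the one before it would be indistinguishable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_patient (id INTEGER PRIMARY KEY, patient_id INTEGER,"
    " visit_id INTEGER, patient_result INTEGER)"
)
conn.executemany(
    "INSERT INTO tbl_patient VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 5), (2, 2, 1, 6), (3, 2, 3, 7),
     (4, 1, 2, 3), (5, 2, 3, 2), (6, 1, 3, 9)],
)

# Correlated subquery: keep a row if at most 2 rows of that patient
# have an id greater than or equal to its own.
correlated_ids = [r[0] for r in conn.execute(
    """
    SELECT t.id
    FROM tbl_patient t
    WHERE (SELECT COUNT(1)
           FROM tbl_patient p
           WHERE p.patient_id = t.patient_id
             AND p.id >= t.id) <= 2
    ORDER BY t.id
    """
)]

# Inequality join: keep a row if at most 1 later row exists for that patient.
join_ids = [r[0] for r in conn.execute(
    """
    SELECT t.id
    FROM tbl_patient t
    LEFT JOIN tbl_patient p
           ON p.patient_id = t.patient_id
          AND t.id < p.id
    GROUP BY t.id
    HAVING COUNT(p.id) <= 1
    ORDER BY t.id
    """
)]

print(correlated_ids)
print(join_ids)
```

Both lists contain ids 4 and 6 for patient 1 and ids 3 and 5 for patient 2.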
Try this..
SELECT id, patient_result FROM tbl_patient AS tp WHERE id < ((SELECT MAX(id) FROM tbl_patient AS tp_max WHERE tp_max.patient_ID = tp.patient_ID) - 1) GROUP BY patient_ID
Why not use simply...
GROUP BY `patient_ID` DESC LIMIT 2
... and do the rest in the next step?