How to check if a specific set of IDs exists? - mysql

I have a source table (piece of it):
+----+--------------+
| E M P L O Y E E   |
+----+--------------+
| ID | EQUIPMENT    |
+----+--------------+
| 1  | tv,car,phone |
| 2  | car,phone    |
| 3  | tv,phone     |
+----+--------------+
After the normalization process I ended up with two new tables:
+----+-----------+
| DICT_EQUIPMENT |
+----+-----------+
| ID | EQUIPMENT |
+----+-----------+
| 1  | tv        |
| 2  | car       |
| 3  | phone     |
+----+-----------+
+---------------------+
| SET_EQUIPMENT |
+----+--------+-------+
| ID | SET_ID | EQ_ID |
+----+--------+-------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 2 | 2 |
| 5 | 2 | 3 |
| 6 | 3 | 1 |
| 7 | 3 | 3 |
+----+--------+-------+
(a piece of it)
+----+------------+
| E M P L O Y E E |
+----+------------+
| ID | EQ_SET_ID  |
+----+------------+
| 1  | 1          |
| 2  | 2          |
| 3  | 3          |
+----+------------+
Now, when I want to find the correct SET_ID, I can write something like this:
SELECT SET_ID
FROM SET_EQUIPMENT S1,
SET_EQUIPMENT S2,
SET_EQUIPMENT S3
WHERE S1.SET_ID = S2.SET_ID
AND S2.SET_ID = S3.SET_ID
AND S1.EQ_ID = 1
AND S2.EQ_ID = 2
AND S3.EQ_ID = 3;
Any ideas for optimizing this query? How can I find the correct set?

First, you should use explicit join syntax for the method you are using:
SELECT S1.SET_ID
FROM SET_EQUIPMENT S1 JOIN
SET_EQUIPMENT S2
ON S1.SET_ID = S2.SET_ID JOIN
SET_EQUIPMENT S3
ON S2.SET_ID = S3.SET_ID
WHERE S1.EQ_ID = 1 AND
S2.EQ_ID = 2 AND
S3.EQ_ID = 3;
Commas in a from clause are quite outdated. (This version also fixes an error in your query: the unqualified SET_ID in the select list is ambiguous.)
An alternative method is to use group by with a having clause:
SELECT S.SET_ID
FROM SET_EQUIPMENT S
GROUP BY S.SET_ID
HAVING SUM(CASE WHEN S.EQ_ID = 1 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN S.EQ_ID = 2 THEN 1 ELSE 0 END) > 0 AND
SUM(CASE WHEN S.EQ_ID = 3 THEN 1 ELSE 0 END) > 0;
Which method works better depends on a number of factors -- for instance, the database engine you are using, the size of the tables, the indexes on the tables. You have to test which method works better on your system.
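As a rough sketch of the group by / having idea, the snippet below loads the sample SET_EQUIPMENT rows into an in-memory SQLite database (a stand-in for MySQL, used here only so the example can be run) and compares two variants. The helper name find_sets and the COUNT-based HAVING conditions are illustrative assumptions, not part of the original answer. Note that both queries in the answer above find sets that contain at least the listed items; the second variant here additionally requires that the set contains nothing else, assuming no duplicate (SET_ID, EQ_ID) rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SET_EQUIPMENT (ID INTEGER PRIMARY KEY, SET_ID INTEGER, EQ_ID INTEGER)"
)
conn.executemany(
    "INSERT INTO SET_EQUIPMENT (ID, SET_ID, EQ_ID) VALUES (?, ?, ?)",
    [(1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 2, 2), (5, 2, 3), (6, 3, 1), (7, 3, 3)],
)


def find_sets(wanted):
    """Hypothetical helper: return (sets containing all wanted items, exact matches)."""
    placeholders = ", ".join("?" for _ in wanted)

    # Sets that contain every wanted item (they may contain other items too).
    contains_all = conn.execute(
        f"""
        SELECT SET_ID
        FROM SET_EQUIPMENT
        WHERE EQ_ID IN ({placeholders})
        GROUP BY SET_ID
        HAVING COUNT(DISTINCT EQ_ID) = ?
        ORDER BY SET_ID
        """,
        (*wanted, len(wanted)),
    ).fetchall()

    # Sets made up of exactly the wanted items and nothing else.
    # Assumes a set never lists the same EQ_ID twice.
    exact = conn.execute(
        f"""
        SELECT SET_ID
        FROM SET_EQUIPMENT
        GROUP BY SET_ID
        HAVING COUNT(*) = ?
           AND SUM(CASE WHEN EQ_ID IN ({placeholders}) THEN 1 ELSE 0 END) = ?
        ORDER BY SET_ID
        """,
        (len(wanted), *wanted, len(wanted)),
    ).fetchall()

    return [r[0] for r in contains_all], [r[0] for r in exact]


print(find_sets([1, 2, 3]))
print(find_sets([2, 3]))
```

For the items car and phone (2 and 3), the first variant also returns set 1, because that set contains both items plus tv; the exact variant returns only set 2.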

You've normalised wrongly. Get rid of set_equipment and use three tables instead: employee, equipment, and employee_equipment.
If you're looking for the equipment for a given employee you want to use:
select eq.id, eq.equipment
from equipment eq
inner join employee_equipment ee on eq.id = ee.eq_id
inner join employee emp on emp.id = ee.emp_id
where emp.id = 2
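A minimal sketch of that three-table layout, run against in-memory SQLite rather than MySQL so it is self-contained. The column names emp_id and eq_id come from the query above; the name column on employee and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee  (id INTEGER PRIMARY KEY, name TEXT);  -- name is hypothetical
    CREATE TABLE equipment (id INTEGER PRIMARY KEY, equipment TEXT);
    -- One row per (employee, item); the composite key rules out duplicates.
    CREATE TABLE employee_equipment (
        emp_id INTEGER REFERENCES employee(id),
        eq_id  INTEGER REFERENCES equipment(id),
        PRIMARY KEY (emp_id, eq_id)
    );
    INSERT INTO employee  VALUES (1, 'emp1'), (2, 'emp2'), (3, 'emp3');
    INSERT INTO equipment VALUES (1, 'tv'), (2, 'car'), (3, 'phone');
    INSERT INTO employee_equipment VALUES
        (1, 1), (1, 2), (1, 3),   -- employee 1: tv, car, phone
        (2, 2), (2, 3),           -- employee 2: car, phone
        (3, 1), (3, 3);           -- employee 3: tv, phone
    """
)

# Equipment held by employee 2, with qualified column names.
rows = conn.execute(
    """
    SELECT eq.id, eq.equipment
    FROM equipment eq
    INNER JOIN employee_equipment ee ON eq.id = ee.eq_id
    INNER JOIN employee emp ON emp.id = ee.emp_id
    WHERE emp.id = ?
    ORDER BY eq.id
    """,
    (2,),
).fetchall()
print(rows)
```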

Related

Selecting COUNT and MAX columns with 2 tables and a bridge table

I have 3 tables (pictures, collections, and bridge) with the following columns:
Collections Table:
| id | name |
------------------
| 1 | coll1 |
| 2 | coll2 |
------------------
Pictures Table: (timestamps are unix timestamps)
| id | name | timestamp |
-------------------------
| 5 | Pic5 | 1 |
| 6 | Pic6 | 19 |
| 7 | Pic7 | 3 |
| 8 | Pic8 | 892 |
| 9 | Pic9 | 4 |
-------------------------
Bridge Table:
| id | collection | picture |
-----------------------------
| 1 | 1 | 5 |
| 2 | 1 | 6 |
| 3 | 1 | 7 |
| 4 | 1 | 8 |
| 5 | 2 | 5 |
| 6 | 2 | 9 |
| 7 | 2 | 7 |
-----------------------------
And the result should look like this:
| collection_name | picture_count | newest_picture |
----------------------------------------------------
| coll1 | 4 | 8 |
| coll2 | 3 | 9 |
----------------------------------------------------
newest_picture should always be the picture with the highest timestamp in that collection, and I also want to sort the result by it. picture_count is the number of pictures in that collection.
Can this be done in a single statement with table joins and if yes:
how can I do this the best way?
A simple method uses correlated subqueries:
select c.*,
(select count(*)
from bridge b
where b.collection = c.id
) as pic_count,
(select p.id
from bridge b join
pictures p
on b.picture = p.id
where b.collection = c.id
order by p.timestamp desc
limit 1
) as most_recent_picture
from collections c;
A more common approach would use window functions:
select c.id, c.name, count(bp.collection), bp.most_recent_picture
from collections c left join
(select b.*,
first_value(p.id) over (partition by b.collection order by p.timestamp desc) as most_recent_picture
from bridge b join
pictures p
on b.picture = p.id
) bp
on bp.collection = c.id
group by c.id, c.name, bp.most_recent_picture;
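Neither query above sorts the result, which the question also asks for. The sketch below is one possible way to add that, using the correlated-subquery form on in-memory SQLite (a stand-in for MySQL). Sorting by the newest timestamp in descending order is an assumption about what "sort the result by it" means.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE collections (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE pictures (id INTEGER PRIMARY KEY, name TEXT, timestamp INTEGER);
    CREATE TABLE bridge (id INTEGER PRIMARY KEY, collection INTEGER, picture INTEGER);
    INSERT INTO collections VALUES (1, 'coll1'), (2, 'coll2');
    INSERT INTO pictures VALUES
        (5, 'Pic5', 1), (6, 'Pic6', 19), (7, 'Pic7', 3), (8, 'Pic8', 892), (9, 'Pic9', 4);
    INSERT INTO bridge VALUES
        (1, 1, 5), (2, 1, 6), (3, 1, 7), (4, 1, 8), (5, 2, 5), (6, 2, 9), (7, 2, 7);
    """
)

rows = conn.execute(
    """
    SELECT c.name AS collection_name,
           -- number of pictures linked to this collection
           (SELECT COUNT(*)
            FROM bridge b
            WHERE b.collection = c.id) AS picture_count,
           -- id of the picture with the highest timestamp in this collection
           (SELECT p.id
            FROM bridge b JOIN pictures p ON b.picture = p.id
            WHERE b.collection = c.id
            ORDER BY p.timestamp DESC
            LIMIT 1) AS newest_picture
    FROM collections c
    -- assumed ordering: collection with the most recent picture first
    ORDER BY (SELECT MAX(p.timestamp)
              FROM bridge b JOIN pictures p ON b.picture = p.id
              WHERE b.collection = c.id) DESC
    """
).fetchall()
print(rows)
```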

Using JOIN to filter data

I have this data in a table called PROD
| Project | Position | Status |
|---------|----------|--------|
| 1 | 1 | A |
| 1 | 2 | A |
| 2 | 1 | A |
| 2 | 2 | B |
| 3 | 1 | B |
| 3 | 2 | B |
| 4 | 1 | A |
| 4 | 2 | A |
I'm trying to get all the Projects that have at least one Position with Status = B.
| Project | Position | Status |
|---------|----------|--------|
| 2 | 1 | A |
| 2 | 2 | B |
| 3 | 1 | B |
| 3 | 2 | B |
I've tried using a JOIN like this:
SELECT * FROM PROD A JOIN PROD B ON A.PROD-Project = B.PROD-Project WHERE B.PROD-Status = 'B'
This gives me an empty response.
With EXISTS:
SELECT p.* FROM PROD p
WHERE EXISTS (
SELECT 1 FROM PROD
WHERE Project = p.Project AND Status = 'B'
)
or with IN:
SELECT * FROM PROD
WHERE Project IN (SELECT Project FROM PROD WHERE Status = 'B')
If you want a solution with JOIN:
SELECT DISTINCT p.*
FROM PROD p JOIN PROD pp
ON pp.Project = p.Project
WHERE pp.Status = 'B'
Results:
> Project | Position | Status
> ------: | -------: | :-----
> 2 | 1 | A
> 2 | 2 | B
> 3 | 1 | B
> 3 | 2 | B
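To check that the three forms above agree, here is a small sketch that runs them against the sample data in in-memory SQLite (used instead of MySQL only so the example is runnable). The ORDER BY clauses are additions so that the results can be compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROD (Project INTEGER, Position INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO PROD VALUES (?, ?, ?)",
    [(1, 1, "A"), (1, 2, "A"), (2, 1, "A"), (2, 2, "B"),
     (3, 1, "B"), (3, 2, "B"), (4, 1, "A"), (4, 2, "A")],
)

queries = {
    "exists": """
        SELECT p.* FROM PROD p
        WHERE EXISTS (SELECT 1 FROM PROD
                      WHERE Project = p.Project AND Status = 'B')
        ORDER BY p.Project, p.Position
    """,
    "in": """
        SELECT * FROM PROD
        WHERE Project IN (SELECT Project FROM PROD WHERE Status = 'B')
        ORDER BY Project, Position
    """,
    # DISTINCT is needed because a project with several 'B' rows
    # would otherwise repeat each of its rows once per match.
    "join": """
        SELECT DISTINCT p.*
        FROM PROD p JOIN PROD pp ON pp.Project = p.Project
        WHERE pp.Status = 'B'
        ORDER BY p.Project, p.Position
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, rows in results.items():
    print(name, rows)
```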
You could try using a join with a subquery:
select * from PROD
INNER JOIN (
select distinct project
from PROD
where status = 'B'
) t on t.project = PROD.project
I'm trying to get all the Projects that has at least one Position with Status = B.
No need for a JOIN, just do:
SELECT DISTINCT Project FROM PROD WHERE Status = 'B'

How to determine what's changed between database records

Presume first, that the following table exists in a MySQL Database
|----|-----|-----|----|----|-----------|--------------|----|
| id | rid | ver | n1 | n2 | s1 | s2 | b1 |
|----|-----|-----|----|----|-----------|--------------|----|
| 1 | 1 | 1 | 0 | 1 | Hello | World | 0 |
| 2 | 1 | 2 | 1 | 1 | Hello | World | 0 |
| 3 | 1 | 3 | 0 | 0 | Goodbye | Cruel World | 0 |
| 4 | 2 | 1 | 0 | 0 | Hello | Doctor | 1 |
| 5 | 2 | 2 | 0 | 0 | Hello | Nurse | 1 |
| 6 | 3 | 1 | 0 | 0 | Dippity | Doo-Dah | 1 |
|----|-----|-----|----|----|-----------|--------------|----|
Question
How do I write a query to determine, for any given rid, what changed between the most recent version and the version immediately preceding it (if any), such that it produces something like this:
|-----|-----------------|-----------------|-----------------|
| rid | numbers_changed | strings_changed | boolean_changed |
|-----|-----------------|-----------------|-----------------|
| 1 | TRUE | TRUE | FALSE |
| 2 | FALSE | TRUE | FALSE |
| 3 | n/a | n/a | n/a |
|-----|-----------------|-----------------|-----------------|
I think that I should be able to do this by doing a cross-join between the table and itself but I can't resolve how to perform this join to get the desired output.
I need to generate this "report" for a table with tens of columns and 1-10 versions of hundreds of records (resulting in thousands of rows). Note that the design of the database is not my own, and altering its structure is not (at this time) an acceptable approach.
The actual format of the output isn't important - and if it simplifies the query getting a "full breakdown" of what changed for each "change set" would also be acceptable, for example
|-----|-----|-----|----|----|----|----|----|
| rid | old | new | n1 | n2 | s1 | s2 | b1 |
|-----|-----|-----|----|----|----|----|----|
| 1 | 1 | 2 | Y | N | N | N | N |
| 1 | 2 | 3 | Y | Y | Y | Y | N |
| 2 | 4 | 5 | N | N | N | Y | N |
|-----|-----|-----|----|----|----|----|----|
Note that it is also OK, in this case, to omit rid records which only have a single version: for the purposes of this report I only care about records that have changed, and getting a separate list of records that haven't changed is an easy query.
You can join every row with the following one with
select *
from history h1
join history h2
on h2.rid = h1.rid
and h2.id = (
select min(h.id)
from history h
where h.rid = h1.rid
and h.id > h1.id
);
Then you just need to compare every column from the two rows like h1.n1 <> h2.n1 as n1.
The full query would be:
select h1.rid, h1.id as old, h2.id as new
, h1.n1 <> h2.n1 as n1
, h1.n2 <> h2.n2 as n2
, h1.s1 <> h2.s1 as s1
, h1.s2 <> h2.s2 as s2
, h1.b1 <> h2.b1 as b1
from history h1
join history h2
on h2.rid = h1.rid
and h2.id = (
select min(h.id)
from history h
where h.rid = h1.rid
and h.id > h1.id
);
Result:
| rid | old | new | n1 | n2 | s1 | s2 | b1 |
|-----|-----|-----|----|----|----|----|----|
| 1 | 1 | 2 | 1 | 0 | 0 | 0 | 0 |
| 1 | 2 | 3 | 1 | 1 | 1 | 1 | 0 |
| 2 | 4 | 5 | 0 | 0 | 0 | 1 | 0 |
Demo: http://sqlfiddle.com/#!9/2e5d12/5
If the columns can contain NULLs, you might need something like NOT h1.n1 <=> h2.n1 as n1. <=> is a NULL-safe equality check.
If the version within a rid group is guaranteed to be consecutive, you can simplify the JOIN to
from history h1
join history h2
on h2.rid = h1.rid
and h2.ver = h1.ver + 1
Demo: http://sqlfiddle.com/#!9/2e5d12/7
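The first output format in the question (one row per rid, comparing only the latest version with its predecessor) could be sketched along the same lines. The example below runs on in-memory SQLite as a stand-in for MySQL and assumes consecutive ver values within each rid, as in the simplified join above. The aliases cur and prev and the grouping of columns into numbers/strings/boolean are illustrative choices. A LEFT JOIN keeps rids with a single version; their comparison columns come back as NULL, which plays the role of "n/a".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history (id INTEGER PRIMARY KEY, rid INTEGER, ver INTEGER, "
    "n1 INTEGER, n2 INTEGER, s1 TEXT, s2 TEXT, b1 INTEGER)"
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 0, 1, "Hello", "World", 0),
        (2, 1, 2, 1, 1, "Hello", "World", 0),
        (3, 1, 3, 0, 0, "Goodbye", "Cruel World", 0),
        (4, 2, 1, 0, 0, "Hello", "Doctor", 1),
        (5, 2, 2, 0, 0, "Hello", "Nurse", 1),
        (6, 3, 1, 0, 0, "Dippity", "Doo-Dah", 1),
    ],
)

rows = conn.execute(
    """
    SELECT cur.rid,
           (cur.n1 <> prev.n1 OR cur.n2 <> prev.n2) AS numbers_changed,
           (cur.s1 <> prev.s1 OR cur.s2 <> prev.s2) AS strings_changed,
           (cur.b1 <> prev.b1)                      AS boolean_changed
    FROM history cur
    -- previous version of the same record, if there is one
    LEFT JOIN history prev
           ON prev.rid = cur.rid
          AND prev.ver = cur.ver - 1
    -- keep only the most recent version of each rid
    WHERE cur.ver = (SELECT MAX(h.ver) FROM history h WHERE h.rid = cur.rid)
    ORDER BY cur.rid
    """
).fetchall()
print(rows)
```

The plain <> comparison is used on purpose here so that a missing previous row yields NULL; if the data columns themselves can be NULL, the comparison would need the NULL-safe form mentioned above.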

right join query return non-matching values

I am trying to optimize a query and I have it down to something like this,
select a.* from
(select id, count(oid) as cnt from stuff1 s1 inner join stuff2 s2 on s1.id=s2.id group by id) as a
right join
(select id,'0' as cnt from stuff2) as b
on a.id = b.id
Basically the goal was to get the count for each oid, where those having 0 count are also included. I had a query previous to this that worked fine but it took 30 seconds to execute. I am looking to optimize the old query with this one, but I am getting NULL values from table b. I need the values from table b to show up with id and 0. Any help would be greatly appreciated.
An example of the data set could be,
Stuff1
| oid | id |
|---- |----|
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 3 |
Stuff2
| id |
|----|
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
the query should produce
| id | cnt |
|----|-----|
| 1 | 2 |
| 2 | 1 |
| 3 | 1 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
Your query is syntactically incorrect (oid may not be defined; id in the select is ambiguous). However, I suspect you want a simple left join:
select s2.id, count(s1.id) as cnt
from stuff2 s2 left join
stuff1 s1
on s1.id = s2.id
group by s2.id;
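As a quick check, the left join can be run against the sample data from the question. This sketch uses in-memory SQLite instead of MySQL, and the ORDER BY is an addition to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE stuff1 (oid INTEGER PRIMARY KEY, id INTEGER);
    CREATE TABLE stuff2 (id INTEGER PRIMARY KEY);
    INSERT INTO stuff1 VALUES (1, 1), (2, 1), (3, 2), (4, 3);
    INSERT INTO stuff2 VALUES (1), (2), (3), (4), (5), (6), (7);
    """
)

# count(s1.id) ignores the NULLs produced for unmatched stuff2 rows,
# so ids without any stuff1 row get a count of 0 rather than 1.
rows = conn.execute(
    """
    SELECT s2.id, COUNT(s1.id) AS cnt
    FROM stuff2 s2
    LEFT JOIN stuff1 s1 ON s1.id = s2.id
    GROUP BY s2.id
    ORDER BY s2.id
    """
).fetchall()
print(rows)
```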

Querying across 6 tables, is there a better way of doing this?

I wanted each user to have their own "unique" numbering system. Instead of auto-incrementing the item number by 1 globally, I set it up so that Bob's first item would be #1 and Alice's first item would also be #1. The same goes for rooms and categories. I achieved this by creating "mapping" tables for items, rooms and categories.
The query below works, but I know it can definitely be refactored. I have primary keys in each table (on the "ids").
SELECT unique_item_id as item_id, item_name, category_name, item_value, room_name
FROM
users_items, users_map_item, users_room, users_map_room, users_category, users_map_category
WHERE
users_items.id = users_map_item.map_item_id AND
item_location = users_map_room.unique_room_id AND
users_map_room.map_room_id = users_room.room_id AND
users_map_room.map_user_id = 1 AND
item_category = users_map_category.unique_category_id AND
users_map_category.map_category_id = users_category.category_id AND
users_category.user_id = users_map_category.map_user_id AND
users_map_category.map_user_id = 1
ORDER BY item_name
users_items
| id | item_name | item_location |item_category |
--------------------------------------------------------
| 1 | item_a | 1 | 1 |
| 2 | item_b | 2 | 1 |
| 3 | item_c | 1 | 1 |
users_map_item
| map_item_id | map_user_id | unique_item_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
users_rooms
| id | room_name |
----------------------
| 1 | basement |
| 2 | kitchen |
| 3 | attic |
users_map_room
| map_room_id | map_user_id | unique_room_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
users_category
| id | room_name |
----------------------
| 1 | antiques |
| 2 | appliance |
| 3 | sporting goods |
users_map_category
| map_room_id | map_user_id | unique_category_id |
----------------------------------------------------
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 2 | 1 |
Rewriting your query with explicit JOIN conditions makes it more readable (while doing the same).
SELECT mi.unique_item_id AS item_id
, i.item_name
, c.category_name
, i.item_value
, r.room_name
FROM users_map_item mi
JOIN users_items i ON i.id = mi.map_item_id
JOIN users_map_room mr ON mr.unique_room_id = i.item_location
JOIN users_room r ON r.room_id = mr.map_room_id
JOIN users_map_category mc ON mc.unique_category_id = i.item_category
JOIN users_category c ON (c.user_id, c.category_id)
= (mc.map_user_id, mc.map_category_id)
WHERE mr.map_user_id = 1
AND mc.map_user_id = 1
ORDER BY i.item_name
The result is unchanged. Query plan should be the same. I see no way to improve the query further.
You should use LEFT [OUTER] JOIN instead of [INNER] JOIN if you want to keep rows in the result where no matching rows are found in the right hand table. You may want to move the additional WHERE clauses to the JOIN condition in this case, as it changes the outcome.
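The difference between filtering in WHERE and filtering in the JOIN condition can be seen in a small sketch. It uses in-memory SQLite rather than MySQL, a reduced version of users_items and users_map_room, and a made-up fourth item (item_d, in a room that has no mapping) added purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users_items (id INTEGER PRIMARY KEY, item_name TEXT, item_location INTEGER);
    CREATE TABLE users_map_room (map_room_id INTEGER, map_user_id INTEGER, unique_room_id INTEGER);
    INSERT INTO users_items VALUES
        (1, 'item_a', 1), (2, 'item_b', 2), (3, 'item_c', 1),
        (4, 'item_d', 3);  -- hypothetical item whose room has no mapping
    INSERT INTO users_map_room VALUES (1, 1, 1), (2, 1, 2), (3, 2, 1);
    """
)

# Filter in WHERE: rows without a match have NULL in mr.map_user_id,
# fail the test, and disappear, so the LEFT JOIN behaves like an INNER JOIN.
filter_in_where = conn.execute(
    """
    SELECT i.item_name, mr.map_room_id
    FROM users_items i
    LEFT JOIN users_map_room mr ON mr.unique_room_id = i.item_location
    WHERE mr.map_user_id = 1
    ORDER BY i.item_name
    """
).fetchall()

# Filter in ON: the condition only restricts which rows can match,
# so items without a match are kept with NULL on the right-hand side.
filter_in_on = conn.execute(
    """
    SELECT i.item_name, mr.map_room_id
    FROM users_items i
    LEFT JOIN users_map_room mr
           ON mr.unique_room_id = i.item_location
          AND mr.map_user_id = 1
    ORDER BY i.item_name
    """
).fetchall()

print(filter_in_where)
print(filter_in_on)
```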