Delete rows with null values mysql - mysql

I want to delete the rows that have null values in these columns.
How can I delete them?
SELECT employee.Name,
`department`.NUM,
SALARY
FROM employee
LEFT JOIN `department` ON employee.ID = `department`.ID
ORDER BY NUM;
+--------------------+-------+----------+
| Name | NUM | SALARY |
+--------------------+-------+----------+
| Gallegos | NULL | NULL |
| Lara | NULL | NULL |
| Kent | NULL | NULL |
| Lena | NULL | NULL |
| Flores | NULL | NULL |
| Alexandra | NULL | NULL |
| Hodge | 8001 | 973.45 |
+--------------------+-------+----------+
Should be like this
+--------------------+-------+----------+
| Name | NUM | SALARY |
+--------------------+-------+----------+
| | | |
| Hodge | 8001 | 973.45 |
+--------------------+-------+----------+

You are asking how to delete, but it looks to me like you want to remove the nulls from the result of a SELECT statement. If so, use:
SELECT employee.Name,
`department`.NUM,
SALARY
FROM employee
LEFT JOIN `department` ON employee.ID = `department`.ID
WHERE (`department`.NUM IS NOT NULL AND SALARY IS NOT NULL)
ORDER BY NUM;
Note: The parentheses are not required, but it's good practice to enclose grouped comparisons for better readability.
The above query excludes a row even if only one of the two columns is null, that is, when NUM is not null but SALARY is null, and vice versa.

If by deleting you mean that you don't want to see rows with null values in your table, you can use INNER JOIN instead of LEFT JOIN.
Use INNER JOIN when you want to return only the records that have a match on both sides, and use LEFT JOIN when you need all records from the "left" table, whether or not they have a match in the "right" table.
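To check that both suggestions return the same rows for data shaped like the question's, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table layout is an assumption: the question does not show it, so SALARY is placed in department and only Hodge is given a matching department row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the question does not show the real table definitions.
conn.execute("CREATE TABLE employee (ID INTEGER PRIMARY KEY, Name TEXT)")
conn.execute("CREATE TABLE department (ID INTEGER, NUM INTEGER, SALARY REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [(1, "Gallegos"), (2, "Lara"), (3, "Hodge")],
)
conn.execute("INSERT INTO department VALUES (?, ?, ?)", (3, 8001, 973.45))

# LEFT JOIN keeps every employee; the WHERE clause then drops the NULL rows.
left_join_filtered = conn.execute("""
    SELECT employee.Name, department.NUM, department.SALARY
    FROM employee
    LEFT JOIN department ON employee.ID = department.ID
    WHERE department.NUM IS NOT NULL AND department.SALARY IS NOT NULL
    ORDER BY department.NUM
""").fetchall()

# INNER JOIN never produces the unmatched rows in the first place.
inner_join = conn.execute("""
    SELECT employee.Name, department.NUM, department.SALARY
    FROM employee
    INNER JOIN department ON employee.ID = department.ID
    ORDER BY department.NUM
""").fetchall()

print(left_join_filtered)
print(inner_join)
```

The two results only coincide when every matched department row has non-null NUM and SALARY. If a department row can itself contain nulls, the INNER JOIN keeps it and the IS NOT NULL filter does not.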

Related

How to organize my query with so many ANDs

My query looks like:
SELECT SUM(ct_product_store_quantity.quantity) as quantity, `ct_product`.*
FROM `ct_product`
LEFT JOIN `ct_productLang` ON `ct_product`.`id` = `ct_productLang`.`product_id`
LEFT JOIN `ct_product_store_quantity` ON `ct_product`.`id` = `ct_product_store_quantity`.`product_id`
LEFT JOIN `ct_product_attribute` as cpa ON ct_product.id=cpa.product_id
WHERE cpa.attribute_id=10
AND cpa.attribute_value_id=36
AND cpa.attribute_id=2
AND cpa.attribute_value_id=5
AND cpa.attribute_id=7
AND cpa.attribute_value_id=31
AND cpa.attribute_id=9
AND cpa.attribute_value_id=28
AND cpa.attribute_id=8
AND cpa.attribute_value_id=25
GROUP BY `ct_product`.`id`
HAVING quantity > 0
ORDER BY `id` DESC
In simple words: each of the AND conditions evaluates to true on its own. If I execute them one by one, it is OK. But when I execute the query as posted above, no results are returned. I am sure I am not writing the multiple AND conditions correctly. The ct_product_attribute table:
+--------------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| product_id | int(11) | YES | MUL | NULL | |
| attribute_set_id | int(11) | YES | MUL | NULL | |
| attribute_id | int(11) | YES | MUL | NULL | |
| attribute_value_id | int(11) | YES | MUL | NULL | |
| value | varchar(255) | YES | | NULL | |
+--------------------+--------------+------+-----+---------+----------------+
Will post the other tables if needed. Just trying to not flood the post. Thank you!
EDIT
In ct_product I got products like ( just for example ):
id
1
2
3
In ct_product_attribute, each product can have more than one attribute/value pair. Some of the pairs are the same (I will show only the columns that I need).
id product_id attribute_id attribute_value_id
1 1 1 1
2 2 1 1
3 1 2 1
4 2 3 1
5 3 1 1
6 3 2 1
The values that I get from the request are:
attribute_id=1
attribute_value_id=1
attribute_id=2
attribute_value_id=1
And now I have to retrieve only the product with id=1. If I use OR it is retrieving both products id=1 and id=2. Not sure if it gets more clear now.
I'm pretty sure those are supposed to be ORs because you can't have all those IDs at the same time. With that in mind, you should be able to use IN.
WHERE cpa.attribute_id IN (10,2,7,9,8)
AND cpa.attribute_value_id IN (36,5,31,28,25)
I really don't know what you are trying to accomplish, but you could use WHERE ... IN. As everyone pointed out in the comments, you are testing a single field against multiple values at once.
As for the AND question, you could use IN, as in:
SELECT SUM(ct_product_store_quantity.quantity) as quantity, `ct_product`.*
FROM `ct_product`
LEFT JOIN `ct_productLang` ON `ct_product`.`id` = `ct_productLang`.`product_id`
LEFT JOIN `ct_product_store_quantity` ON `ct_product`.`id` = `ct_product_store_quantity`.`product_id`
LEFT JOIN `ct_product_attribute` as cpa ON ct_product.id=cpa.product_id
WHERE cpa.attribute_id IN (10, 2, 7, 9, 8)
AND cpa.attribute_value_id IN (36, 5, 31, 28, 25)
GROUP BY `ct_product`.`id`
HAVING quantity > 0
ORDER BY `id` DESC
You can try using (cpa.attribute_id,cpa.attribute_value_id) in ((10,36),(2,5),(7,31),(9,28),(8,25))
SELECT SUM(ct_product_store_quantity.quantity) as quantity, `ct_product`.*
FROM `ct_product`
LEFT JOIN `ct_productLang` ON `ct_product`.`id` = `ct_productLang`.`product_id`
LEFT JOIN `ct_product_store_quantity` ON `ct_product`.`id` = `ct_product_store_quantity`.`product_id`
LEFT JOIN `ct_product_attribute` as cpa ON ct_product.id=cpa.product_id
WHERE (cpa.attribute_id,cpa.attribute_value_id) in ((10,36),(2,5),(7,31),(9,28),(8,25)) and `ct_product`.`id`=1
GROUP BY `ct_product`.`id`
HAVING quantity > 0
ORDER BY `id` DESC
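The IN-based queries above match a product that has any one of the requested pairs, while the question's edit asks for products that have all of them. One common pattern for that, which is not taken from the answers above, is to filter on the pairs with OR and then keep only the groups whose number of distinct matched attributes equals the number of requested pairs. The sketch below uses Python's sqlite3 module as a stand-in for MySQL and the six sample rows from the edit. It assumes each attribute_id appears at most once in the request. Note that with the rows exactly as posted, product 3 also has both pairs, so it is returned together with product 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE ct_product_attribute (
        id INTEGER PRIMARY KEY,
        product_id INTEGER,
        attribute_id INTEGER,
        attribute_value_id INTEGER
    )
""")
conn.executemany(
    "INSERT INTO ct_product_attribute VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 2, 1, 1), (3, 1, 2, 1),
     (4, 2, 3, 1), (5, 3, 1, 1), (6, 3, 2, 1)],
)

# (attribute_id, attribute_value_id) pairs taken from the request
wanted = [(1, 1), (2, 1)]

# One "(attribute_id = ? AND attribute_value_id = ?)" term per requested pair.
pair_filter = " OR ".join(
    "(attribute_id = ? AND attribute_value_id = ?)" for _ in wanted
)
params = [value for pair in wanted for value in pair] + [len(wanted)]

# Keep only the products that matched every requested pair.
query = f"""
    SELECT product_id
    FROM ct_product_attribute
    WHERE {pair_filter}
    GROUP BY product_id
    HAVING COUNT(DISTINCT attribute_id) = ?
    ORDER BY product_id
"""
matching_products = [row[0] for row in conn.execute(query, params)]
print(matching_products)
```

The same WHERE / GROUP BY / HAVING shape could be applied to the cpa join in the original query, with the product columns selected from ct_product.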

Performant way to self-join and filter by revised rows

I'm trying to select all rows in this table, with the constraint that revised id's are selected instead of the original ones. So, if a row has a revision, that revision is selected instead of that row, if there are multiple revision numbers the highest revision number is preferred.
I think an example table, output, and query will explain this better:
Table:
+----+-------+-------------+-----------------+-------------+
| id | value | original_id | revision_number | is_revision |
+----+-------+-------------+-----------------+-------------+
| 1 | abcd | null | null | 0 |
| 2 | zxcv | null | null | 0 |
| 3 | qwert | null | null | 0 |
| 4 | abd | 1 | 1 | 1 |
| 5 | abcde | 1 | 2 | 1 |
| 6 | zxcvb | 2 | 1 | 1 |
| 7 | poiu | null | null | 0 |
+----+-------+-------------+-----------------+-------------+
Desired Output:
+----+-------+-------------+-----------------+
| id | value | original_id | revision_number |
+----+-------+-------------+-----------------+
| 3 | qwert | null | null |
| 5 | abcde | 1 | 2 |
| 6 | zxcvb | 2 | 1 |
| 7 | poiu | null | null |
+----+-------+-------------+-----------------+
View Called revisions_max:
SELECT
responses.original_id AS original_id,
MAX(responses.revision_number) AS revision
FROM
responses
WHERE
original_id IS NOT NULL
GROUP BY responses.original_id
My Current Query:
SELECT
responses.*
FROM
responses
WHERE
id NOT IN (
SELECT
original_id
FROM
revisions_max
)
AND
is_revision = 0
UNION
SELECT
responses.*
FROM
responses
INNER JOIN revisions_max ON revisions_max.original_id = responses.original_id
AND revisions_max.revision_number = responses.revision_number
This query works, but it takes 0.06 seconds to run on a table of only 2000 rows. The table will quickly grow to tens or hundreds of thousands of rows. The query under the UNION is what takes most of the time.
What can I do to improve this query's performance?
How about using coalesce()?
SELECT COALESCE(y.id, x.id) AS id,
COALESCE(y.value, x.value) AS value,
COALESCE(y.original_id, x.original_id) AS original_id,
COALESCE(y.revision_number, x.revision_number) AS revision_number
FROM responses x
LEFT JOIN (SELECT r1.*
FROM responses r1
INNER JOIN (SELECT responses.original_id AS
original_id,
Max(responses.revision_number) AS
revision
FROM responses
WHERE original_id IS NOT NULL
GROUP BY responses.original_id) rev
ON r1.original_id = rev.original_id
AND r1.revision_number = rev.revision) y
ON x.id = y.original_id
WHERE y.id IS NOT NULL
OR x.original_id IS NULL;
The approach I would take with any other DBMS is to use NOT EXISTS:
SELECT r1.*
FROM Responses AS r1
WHERE NOT EXISTS
( SELECT 1
FROM Responses AS r2
WHERE r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
);
This removes any row for which a higher revision number exists for the same id (or original_id, if it is populated). However, in MySQL, LEFT JOIN/IS NULL will perform better than NOT EXISTS (see note 1). As such, I would rewrite the above as:
SELECT r1.*
FROM Responses AS r1
LEFT JOIN Responses AS r2
ON r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
WHERE r2.id IS NULL;
I realise that you have said that you don't want to use LEFT JOIN and check for nulls, but I don't see that there is a better solution.
1. At least this was the case historically; I don't actively use MySQL, so I don't keep up to date with developments in the optimiser.
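As a quick check of the LEFT JOIN/IS NULL version against the sample table from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so it says nothing about MySQL's timings; it only confirms that the query returns the desired rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE responses (
        id INTEGER PRIMARY KEY,
        value TEXT,
        original_id INTEGER,
        revision_number INTEGER,
        is_revision INTEGER
    )
""")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?, ?, ?)",
    [
        (1, "abcd", None, None, 0),
        (2, "zxcv", None, None, 0),
        (3, "qwert", None, None, 0),
        (4, "abd", 1, 1, 1),
        (5, "abcde", 1, 2, 1),
        (6, "zxcvb", 2, 1, 1),
        (7, "poiu", None, None, 0),
    ],
)

# Anti-join: r2 looks for a newer revision of the same original row.
# Rows for which no such r2 exists are the latest versions.
latest_rows = conn.execute("""
    SELECT r1.id, r1.value, r1.original_id, r1.revision_number
    FROM responses AS r1
    LEFT JOIN responses AS r2
        ON r2.original_id = COALESCE(r1.original_id, r1.id)
        AND r2.revision_number > COALESCE(r1.revision_number, 0)
    WHERE r2.id IS NULL
    ORDER BY r1.id
""").fetchall()

for row in latest_rows:
    print(row)
```

The rows returned are ids 3, 5, 6 and 7, which matches the desired output in the question.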

Correctly join 1:n:1:1 relation in mysql database

I'm developing a system to manage rental processes right now and I'm wondering how to efficiently query all rentable objects with the person name, who is currently renting it, if the object is rented at the moment. Otherwise there should be NULL in that column.
My tables look like:
object
| object_id | object_name |
---------------------------
| 1 | Object A |
| 2 | Object B |
| 3 | Object C |
| 4 | Object D |
| 5 | Object E |
---------------------------
person
| person_id | person_name |
---------------------------
| 1 | John Doe |
| 2 | Jane Doe |
| 3 | Max Muster |
| 4 | Foobar |
---------------------------
rental
| rental_id | rental_state| person_person_id |
----------------------------------------------
| 1 | open | 1 |
| 2 | returned | 1 |
| 3 | returned | 2 |
| 4 | open | 3 |
| 5 | returned | 4 |
----------------------------------------------
rental2object
| rental_rental_id | object_object_id |
---------------------------------------
| 1 | 1 |
| 2 | 2 |
| 2 | 3 |
| 3 | 3 |
| 4 | 2 |
| 4 | 5 |
| 5 | 2 |
---------------------------------------
The result I want should look like this:
| object_id | object_name | rented_to |
-------------------------------------------
| 1 | Object A | John Doe |
| 2 | Object B | Max Muster |
| 3 | Object C | NULL |
| 4 | Object D | NULL |
| 5 | Object E | Max Muster |
-------------------------------------------
What I've got so far is:
SELECT `object_id`, `object_name`, `person_name` FROM `object`
LEFT JOIN `rental2object` ON `object_id` = `object_object_id`
LEFT JOIN `rental` ON `rental_id` = `rental_rental_id` AND `rental_state` = 'open'
LEFT JOIN `person` ON `person_id` = `person_person_id`
GROUP BY `object_id`
The obvious problem is that I don't know how to aggregate the right way while grouping.
What would be the most efficient way to achieve my goal? Appreciate your help.
EDIT
Corrected the expected result, so that Object B is also rented to Max Muster.
About your question
Objects #2 and #5 are both in rental #4, but in your expected results you handle them differently. Object E and Object B should behave the same way because they are in the same rental. If not, you should explain which criterion decides whether an object has a related person.
Group by
To be SQL92 compliant, you should list every nonaggregated column of the select clause in the GROUP BY clause:
SELECT `object_id`, `object_name`, `person_name` as rented_to
FROM `object`
...
GROUP BY `object_id`, `object_name`, `person_name`
To be SQL99 compliant, you only need to list in the GROUP BY clause the nonaggregated columns that are not functionally dependent on the grouped columns. In your case, object_name is functionally dependent on object_id (object_id -> object_name), while person_name is not, because the rental_state condition breaks the dependency. So you can just write:
SELECT `object_id`, `object_name`, `person_name` as rented_to
FROM `object`
...
GROUP BY `object_id`, `person_name`
MySQL 5.7.5 and up implements detection of functional dependence, so this last select is valid there, but I suggest using the first one for readability.
Read MySQL Handling of GROUP BY for more info and ONLY_FULL_GROUP_BY parameter details.
Performance
Be sure you have indexes for:
object: Object_id ( is primary key, then index is implicit )
rental2object: object_object_id ( may be a composite index with the other field, but be sure object_object_id is the first field on index )
rental : rental_id & rental_state ( a composite index with both fields )
person: person_id ( is primary key, then index is implicit )
Try this
SELECT
o.object_id,
o.object_name,
-- MAX() skips the NULLs produced by rentals that are not open
MAX(p.person_name) AS rented_to
FROM
rental2object ro
RIGHT JOIN object o ON ro.object_object_id = o.object_id
LEFT JOIN rental r ON ro.rental_rental_id = r.rental_id AND r.rental_state = 'open'
LEFT JOIN person p ON r.person_person_id = p.person_id
GROUP BY o.object_id, o.object_name
SELECT `object_id`, `object_name`,
case
when rental_state = 'Open' then `person_name`
when r1.rental_rental_id is null then null
else `rental_state`
end as RentedTo
FROM `object`
LEFT JOIN `rental2object` r1 ON `object_id` = r1.`object_object_id`
LEFT JOIN `rental` ON `rental_id` = r1.`rental_rental_id`
LEFT JOIN `person` ON `person_id` = `person_person_id`
where r1.rental_rental_id =
(select max(r2.`rental_rental_id`)
from `rental2object` r2
where r2.`object_object_id` = r1.`object_object_id`
group by r2.`object_object_id`)
or r1.rental_rental_id is null
GROUP BY `object_id`;
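Another way to get the expected result from the question, shown here only as a sketch and not taken from the answers above, is to build the set of currently open rentals in a derived table first and then LEFT JOIN it to object. This avoids GROUP BY entirely, under the assumption that an object is in at most one open rental at a time. The example uses Python's built-in sqlite3 module as a stand-in for MySQL, with the sample data from the question; the alias `cur` is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (object_id INTEGER PRIMARY KEY, object_name TEXT);
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, person_name TEXT);
    CREATE TABLE rental (rental_id INTEGER PRIMARY KEY, rental_state TEXT,
                         person_person_id INTEGER);
    CREATE TABLE rental2object (rental_rental_id INTEGER, object_object_id INTEGER);

    INSERT INTO object VALUES (1, 'Object A'), (2, 'Object B'), (3, 'Object C'),
                              (4, 'Object D'), (5, 'Object E');
    INSERT INTO person VALUES (1, 'John Doe'), (2, 'Jane Doe'),
                              (3, 'Max Muster'), (4, 'Foobar');
    INSERT INTO rental VALUES (1, 'open', 1), (2, 'returned', 1), (3, 'returned', 2),
                              (4, 'open', 3), (5, 'returned', 4);
    INSERT INTO rental2object VALUES (1, 1), (2, 2), (2, 3), (3, 3),
                                     (4, 2), (4, 5), (5, 2);
""")

# The derived table "cur" holds one row per object in an open rental.
rows = conn.execute("""
    SELECT o.object_id, o.object_name, cur.person_name AS rented_to
    FROM object o
    LEFT JOIN (
        SELECT ro.object_object_id, p.person_name
        FROM rental2object ro
        JOIN rental r ON r.rental_id = ro.rental_rental_id
        JOIN person p ON p.person_id = r.person_person_id
        WHERE r.rental_state = 'open'
    ) cur ON cur.object_object_id = o.object_id
    ORDER BY o.object_id
""").fetchall()

for row in rows:
    print(row)
```

If an object could appear in two open rentals at once, this query would return it twice, so that case would need an aggregate or a uniqueness rule in the data.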

create index on tables for mysql query that involves join,group/order-by,union and like

Below is the explain output for the slow query that has 10's of "copying to tmp table" state in mysql processlist.
explain SELECT distinct
(radgroupreply.groupname),
count(distinct (radusergroup.username)) AS users
FROM
radgroupreply
LEFT JOIN
radusergroup ON radgroupreply.groupname = radusergroup.groupname
WHERE
(radgroupreply.groupname NOT LIKE 'FB-%' AND radgroupreply.groupname NOT LIKE '%Dropped%')
GROUP BY radgroupreply.groupname
UNION SELECT distinct
(radgroupcheck.groupname),
count(distinct (radusergroup.username))
FROM
radgroupcheck
LEFT JOIN
radusergroup ON radgroupcheck.groupname = radusergroup.groupname
WHERE
(radgroupcheck.groupname NOT LIKE 'FB-%' AND radgroupcheck.groupname NOT LIKE '%Dropped%')
GROUP BY radgroupcheck.groupname
ORDER BY groupname asc;
+----+--------------+---------------+------+---------------+------+---------+------+-------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------+---------------+------+---------------+------+---------+------+-------+----------------------------------------------+
| 1 | PRIMARY | radgroupreply | ALL | NULL | NULL | NULL | NULL | 456 | Using where; Using temporary; Using filesort |
| 1 | PRIMARY | radusergroup | ALL | NULL | NULL | NULL | NULL | 10261 | |
| 2 | UNION | radgroupcheck | ALL | NULL | NULL | NULL | NULL | 167 | Using where; Using temporary; Using filesort |
| 2 | UNION | radusergroup | ALL | NULL | NULL | NULL | NULL | 10261 | |
|NULL| UNION RESULT | <union1,2> | ALL | NULL | NULL | NULL | NULL | NULL | Using filesort |
+----+--------------+---------------+------+---------------+------+---------+------+-------+----------------------------------------------+
5 rows in set (0.00 sec)
I can't get my head around how to create a compound or single-column index to optimize this query, since it has multiple joins, GROUP BY and LIKE operations.
Here are three observations to get started.
The select distinct is unnecessary (the group by takes care of that).
The left joins are unnecessary (the where clauses turn them into inner joins).
The UNION should probably be UNION ALL. I doubt you really want to incur the overhead of removing duplicates.
So, you can write the query as:
SELECT rr.groupname, count(distinct rg.username) AS users
FROM radgroupreply rr JOIN
radusergroup rg
ON rr.groupname = rg.groupname
WHERE rr.groupname NOT LIKE 'FB-%' AND rr.groupname NOT LIKE '%Dropped%'
GROUP BY rr.groupname
UNION ALL
SELECT rc.groupname, count(rg.username)
FROM radgroupcheck rc JOIN
radusergroup rg
ON rc.groupname = rg.groupname
WHERE rc.groupname NOT LIKE 'FB-%' AND rc.groupname NOT LIKE '%Dropped%'
GROUP BY rc.groupname
ORDER BY groupname asc;
This query can take advantage of an index on radusergroup(groupname). I am guessing that indexes on radgroupreply(groupname) and radgroupcheck(groupname) would be used as well.
I would also advise you to remove the DISTINCT in COUNT(DISTINCT) if it is not necessary.
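The EXPLAIN output in the question shows NULL under possible_keys for every table, so no index is available for the join on groupname. The sketch below shows what creating such indexes could look like. It runs on Python's built-in sqlite3 module as a stand-in, so it demonstrates the statements and the query result, not MySQL's execution plan. The index names, the reduced table definitions and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Reduced, assumed table definitions: only the columns the query uses.
    CREATE TABLE radgroupreply (groupname TEXT);
    CREATE TABLE radgroupcheck (groupname TEXT);
    CREATE TABLE radusergroup (username TEXT, groupname TEXT);

    -- Hypothetical index names; the indexed column is the join/GROUP BY column.
    CREATE INDEX idx_radusergroup_groupname ON radusergroup (groupname);
    CREATE INDEX idx_radgroupreply_groupname ON radgroupreply (groupname);
    CREATE INDEX idx_radgroupcheck_groupname ON radgroupcheck (groupname);

    INSERT INTO radgroupreply VALUES ('Gold'), ('FB-Trial'), ('Dropped-Users');
    INSERT INTO radusergroup VALUES ('alice', 'Gold'), ('bob', 'Gold'),
                                    ('carol', 'FB-Trial');
""")

# First half of the rewritten query, run against the made-up rows.
user_counts = conn.execute("""
    SELECT rr.groupname, COUNT(DISTINCT rg.username) AS users
    FROM radgroupreply rr
    JOIN radusergroup rg ON rr.groupname = rg.groupname
    WHERE rr.groupname NOT LIKE 'FB-%' AND rr.groupname NOT LIKE '%Dropped%'
    GROUP BY rr.groupname
    ORDER BY rr.groupname
""").fetchall()

index_names = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name"
)]
print(user_counts)
print(index_names)
```

On the real MySQL server, rerunning EXPLAIN after adding the indexes is the way to confirm whether the key column is actually populated; the leading wildcard in NOT LIKE '%Dropped%' cannot use an index either way.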

SQL Group by combination?

I am having problems selecting items from a table where a device_id can be in either the from_device_id column or the to_device_id column. I am trying to return all chats where the given device ID is in the from_device_id or to_device_id column, but only return the latest message.
select chat.*, (select screen_name from usr where chat.from_device_id=usr.device_id limit 1) as from_screen_name, (select screen_name from usr where chat.to_device_id=usr.device_id limit 1) as to_screen_name from chat where to_device_id="ffffffff-af28-3427-a2bc-83865900edbe" or from_device_id="ffffffff-af28-3427-a2bc-83865900edbe" group by from_device_id, to_device_id;
+----+--------------------------------------+--------------------------------------+---------+---------------------+------------------+----------------+
| id | from_device_id | to_device_id | message | date | from_screen_name | to_screen_name |
+----+--------------------------------------+--------------------------------------+---------+---------------------+------------------+----------------+
| 20 | ffffffff-af28-3427-a2bc-83860033c587 | ffffffff-af28-3427-a2bc-83865900edbe | ee | 2011-02-28 12:36:38 | kevin | handset |
| 1 | ffffffff-af28-3427-a2bc-83865900edbe | ffffffff-af28-3427-a2bc-83860033c587 | yyy | 2011-02-27 17:43:17 | handset | kevin |
+----+--------------------------------------+--------------------------------------+---------+---------------------+------------------+----------------+
2 rows in set (0.00 sec)
As expected, two rows are returned. How can I modify this query to only return one row?
mysql> describe chat;
+----------------+---------------+------+-----+-------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+---------------+------+-----+-------------------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| from_device_id | varchar(128) | NO | | NULL | |
| to_device_id | varchar(128) | NO | | NULL | |
| message | varchar(2048) | NO | | NULL | |
| date | timestamp | YES | | CURRENT_TIMESTAMP | |
+----------------+---------------+------+-----+-------------------+----------------+
5 rows in set (0.00 sec)
select chat.*,
(select screen_name
from usr
where chat.from_device_id=usr.device_id
limit 1
) as from_screen_name,
(select screen_name
from usr
where chat.to_device_id=usr.device_id
limit 1
) as to_screen_name
from chat
where to_device_id="ffffffff-af28-3427-a2bc-83865900edbe" or
from_device_id="ffffffff-af28-3427-a2bc-83865900edbe"
group by from_device_id, to_device_id
order by date DESC
limit 1;
You need to tell SQL that it should sort the returned data by date to get the most recent chat. Then you just limit the returned rows to 1.
You shouldn't need to use a GROUP BY at all. Rather, you can simply order by date and use the LIMIT clause to return the most recent row. In addition, you shouldn't need subqueries, as you can simply use joins. If chat.from_device_id and chat.to_device_id are both non-nullable and always have a matching usr row, then you can replace the Left Joins with Inner Joins.
Select chat.id
, chat.from_device_id
, chat.to_device_id
, chat.message
, chat.date
, FromUser.screen_name As from_screen_name
, ToUser.screen_name As to_screen_name
From chat
Left Join usr As FromUser
On FromUser.device_id = chat.from_device_id
Left Join usr As ToUser
On ToUser.device_id = chat.to_device_id
Where chat.to_device_id="ffffffff-af28-3427-a2bc-83865900edbe"
Or chat.from_device_id="ffffffff-af28-3427-a2bc-83865900edbe"
Order By chat.date Desc
Limit 1
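To try the join-based query on the two rows shown in the question, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The usr table layout and the shortened device ids (`dev-handset`, `dev-kevin`) are assumptions made for readability; the real ids are the UUID strings from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed usr layout: the question only shows that it has these two columns.
    CREATE TABLE usr (device_id TEXT, screen_name TEXT);
    CREATE TABLE chat (
        id INTEGER PRIMARY KEY,
        from_device_id TEXT NOT NULL,
        to_device_id TEXT NOT NULL,
        message TEXT NOT NULL,
        date TEXT
    );
    INSERT INTO usr VALUES ('dev-handset', 'handset'), ('dev-kevin', 'kevin');
    INSERT INTO chat VALUES
        (1, 'dev-handset', 'dev-kevin', 'yyy', '2011-02-27 17:43:17'),
        (20, 'dev-kevin', 'dev-handset', 'ee', '2011-02-28 12:36:38');
""")

device_id = "dev-handset"  # shortened stand-in for the UUID in the question

# Newest message first; chat.id breaks ties between equal timestamps.
latest = conn.execute("""
    SELECT chat.id, chat.message, chat.date,
           FromUser.screen_name AS from_screen_name,
           ToUser.screen_name AS to_screen_name
    FROM chat
    LEFT JOIN usr AS FromUser ON FromUser.device_id = chat.from_device_id
    LEFT JOIN usr AS ToUser ON ToUser.device_id = chat.to_device_id
    WHERE chat.to_device_id = ? OR chat.from_device_id = ?
    ORDER BY chat.date DESC, chat.id DESC
    LIMIT 1
""", (device_id, device_id)).fetchone()
print(latest)
```

Passing the device id as a bound parameter instead of a quoted literal also sidesteps quoting problems when the value comes from user input.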