I have a table called actions with the columns shown below. I want to extract only the ID_tracking values that have not performed a certain action. I tried:
SELECT id_tracking from table WHERE id_tracking NOT IN
( SELECT id_tracking FROM table where id_action = X ).
This method works, but it is extremely slow even on a small table, and there will be tables with millions of rows, so it is not a viable solution. How can this be done?
Sample data
ID_tracking | ID_action
1009 1
1009 2
1009 3
1009 5
1010 2
1010 3
1010 4
1011 5
I often approach this type of problem using GROUP BY and HAVING:
SELECT id_tracking
FROM table t
GROUP BY id_tracking
HAVING SUM(id_action = x) = 0;
One issue with your query is that you'll get multiple rows for each id_tracking that meets the condition.
In practice, though, a very reasonable approach would be:
SELECT t.id_tracking
FROM tracking t
WHERE NOT EXISTS (SELECT 1
FROM trackingaction ta
WHERE ta.id_tracking = t.id_tracking AND
ta.id_action = X
);
This uses two different tables, one where id_tracking is the unique key and one which is the table you describe. For best results, you want an index on trackingaction(id_tracking, id_action).
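For what it's worth, both approaches above can be tried against the sample data from the question. This is only a sketch: it uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for MySQL, the index name idx_ta and the helper function names are made up for illustration, and the table names tracking and trackingaction are the ones assumed in the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tracking (id_tracking INTEGER PRIMARY KEY);
    CREATE TABLE trackingaction (id_tracking INTEGER, id_action INTEGER);
    -- Composite index suggested in the answer; the name idx_ta is hypothetical.
    CREATE INDEX idx_ta ON trackingaction (id_tracking, id_action);
    INSERT INTO tracking VALUES (1009), (1010), (1011);
    INSERT INTO trackingaction VALUES
        (1009, 1), (1009, 2), (1009, 3), (1009, 5),
        (1010, 2), (1010, 3), (1010, 4),
        (1011, 5);
""")

def ids_without_action_group_by(action):
    """GROUP BY / HAVING: keep ids whose rows never match the action."""
    sql = """
        SELECT id_tracking
        FROM trackingaction
        GROUP BY id_tracking
        HAVING SUM(id_action = ?) = 0
        ORDER BY id_tracking
    """
    return [row[0] for row in conn.execute(sql, (action,))]

def ids_without_action_not_exists(action):
    """NOT EXISTS: one row per tracking id, probing the action table."""
    sql = """
        SELECT t.id_tracking
        FROM tracking t
        WHERE NOT EXISTS (SELECT 1
                          FROM trackingaction ta
                          WHERE ta.id_tracking = t.id_tracking
                            AND ta.id_action = ?)
        ORDER BY t.id_tracking
    """
    return [row[0] for row in conn.execute(sql, (action,))]

print(ids_without_action_group_by(5))    # [1010]
print(ids_without_action_not_exists(5))  # [1010]
```

Both queries return each id_tracking once, which is the difference from the original NOT IN query pointed out above.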
Why not do it like this:
Do you have a separate table for tracking IDs, i.e. ID_tracking? If yes, add a does_it_have_action column there to record whether the tracking ID has an action in the actions table.
When a new entry with ID_tracking and ID_action is added to your table, update the tracking IDs table and set does_it_have_action = 1; otherwise keep the default of 0.
When you want to check the tracking IDs for actions, just run a select query against this table.
You will then have to take care of the update and insert statements for this tracking IDs table. Whenever an action is added for a tracking ID, check whether that ID_tracking exists in the tracking IDs table. If it exists, update its does_it_have_action column to 1; otherwise insert it with does_it_have_action = 0.
I hope that will work for you when you have billions of rows one day.
P.S: Here is a rough structure for this new table tracking_ids
ID_tracking | does_it_have_action (default to 0) | created_at (current timestamp)
Using DISTINCT and a join should make your query much faster (when the inner query returns many results). I just added DISTINCT and translated your NOT IN into a join. Try this:
SELECT distinct yt.id_tracking
FROM table1 yt
left join (SELECT distinct id_tracking as idt from table1 where id_action = 3) mt
on yt.id_tracking=mt.idt
where mt.idt is null
Question - let's say I have 2 tables.
Table 1 - name is permission_list, columns are ID (unique ID), col_ID, user_ID
Table 2 - name is list_entries, Columns are ID (unique ID), title, description, status
I want to select all the rows from table 2 that have status of 'public' as well as all the rows from table 2 that the ID from table 2 shows up in table 1 (under the column col_ID) AND if the user_ID in table 1 matches a certain value. So, anything public, or anything that this specific user has listed under the permissions table. This query would also remove duplicates - in case the user gets a public entry listed in their permissions_list, it wouldn't show up twice.
Hope that makes sense!
Here you go:
SELECT DISTINCT list_entries.* FROM list_entries
LEFT JOIN permission_list ON permission_list.col_ID = list_entries.ID
WHERE list_entries.status = 'public'
OR permission_list.user_ID = 'someuser';
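Here is a small runnable sketch of the "public, or listed for this user, without duplicates" idea, using the table and column names from the question. It assumes the join is on list_entries.ID = permission_list.col_ID (my reading of the question), uses an in-memory SQLite database through Python's sqlite3 module rather than MySQL, and the sample rows and the helper name visible_entries are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE list_entries (
        ID INTEGER PRIMARY KEY, title TEXT, description TEXT, status TEXT);
    CREATE TABLE permission_list (
        ID INTEGER PRIMARY KEY, col_ID INTEGER, user_ID TEXT);
    -- Made-up sample rows: entry 1 is public, entries 2 and 3 are private.
    INSERT INTO list_entries VALUES
        (1, 'first',  'd1', 'public'),
        (2, 'second', 'd2', 'private'),
        (3, 'third',  'd3', 'private');
    -- u1 is listed for entry 1 (already public) and twice for entry 2.
    INSERT INTO permission_list (col_ID, user_ID) VALUES
        (1, 'u1'), (2, 'u1'), (2, 'u1'), (3, 'u2');
""")

def visible_entries(user_id):
    """IDs of entries that are public or explicitly permitted for user_id."""
    sql = """
        SELECT DISTINCT le.ID
        FROM list_entries le
        LEFT JOIN permission_list pl
               ON pl.col_ID = le.ID AND pl.user_ID = ?
        WHERE le.status = 'public' OR pl.ID IS NOT NULL
        ORDER BY le.ID
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]

print(visible_entries("u1"))  # [1, 2]
print(visible_entries("u2"))  # [1, 3]
```

Putting the user filter in the ON clause keeps the join from pulling in other users' permission rows; DISTINCT then collapses the repeated match for entry 2.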
You need to get some education on JOIN for your first thing, and the second thing is called DISTINCT.
Start here... https://www.google.com/
You have not specified your join condition so we can't give you code samples really. Also the way you worded your question, I'm not entirely sure you don't want a UNION. Read up on those concepts and come back here when you can improve the question.
SELECT table_2.status, table_2.ID, table_1.col_ID
FROM table_1 JOIN table_2
WHERE table_2.status = 'public'
AND table_2.ID = table_1.col_ID
AND table_1.user_ID = 'certain value'
;
Try this
There are 3 tables:
Users table
------------
|uid|username|
------------
Values table
------------------
|vid|values|checked|
------------------
Relations
-----------
|cid|uid|vid|
-----------
Relations table contains user ids related to value ids. How to select value id from values table that is not related to given user id in relations table?
EDIT:
What I tried so far:
SELECT vid FROM relations where uid=user_id //this gives me array of value ids
SELECT vid FROM values where vid!=vid1 AND vid!=vid2 .....
EDIT2:
Basic solution can be found here. But is there more efficient way? If table is very large for both values table and relations table basic solution is not efficient.
Which dbms are you using? Does it support the minus clause? If yes you can do something like this
select vid from values
minus
select vid from relations where uid = #user_id
this should give the vid's which are not mapped to a given user id
Another way to do this is through a not-exists clause (handy if your dbms doesn't support the minus clause)
select v.vid from values v where not exists (select 1 from relations r where
r.vid = v.vid and r.uid = #user_id)
I would caution against using the NOT IN clause with a sub-query, though. Its performance is questionable, and it returns no rows at all if the inner query returns a NULL value. That is not possible in your case, but you should make it a habit to avoid NOT IN with a sub-query and use it only with a list of literal values, e.g. '... vid not in (1, 2, 3, 4)'. Whenever you have to 'minus' something from one table based on values in another table, use NOT EXISTS rather than NOT IN.
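The NULL behaviour mentioned above is easy to reproduce. The following is a sketch only: it runs on an in-memory SQLite database via Python's sqlite3 module, the values table is renamed vals here because VALUES is a keyword, and the row with a NULL vid is hypothetical bad data added just to trigger the effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (vid INTEGER PRIMARY KEY);
    CREATE TABLE relations (cid INTEGER PRIMARY KEY, uid INTEGER, vid INTEGER);
    INSERT INTO vals VALUES (1), (2), (3);
    -- User 7 is related to value 1 and, hypothetically, to a NULL vid.
    INSERT INTO relations (uid, vid) VALUES (7, 1), (7, NULL);
""")

# NOT IN: comparing against a list containing NULL is never true,
# so every candidate row is filtered out.
not_in = [r[0] for r in conn.execute(
    "SELECT vid FROM vals WHERE vid NOT IN "
    "(SELECT vid FROM relations WHERE uid = ?) ORDER BY vid", (7,))]

# NOT EXISTS: the NULL row simply never matches r.vid = v.vid.
not_exists = [r[0] for r in conn.execute(
    "SELECT v.vid FROM vals v WHERE NOT EXISTS "
    "(SELECT 1 FROM relations r WHERE r.vid = v.vid AND r.uid = ?) "
    "ORDER BY v.vid", (7,))]

print(not_in)      # []
print(not_exists)  # [2, 3]
```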
I think you can execute a simple query like this (assuming that the data type of user identifier is int):
DECLARE @givenUserID int -- local variable where you store the given user identifier
SELECT vid
FROM [Values] -- Values is a reserved word in SQL Server, so it is bracketed
WHERE vid NOT IN (SELECT vid FROM Relations where uid = @givenUserID)
Is it ok for you ?
select vid from values where vid not in (select vid from relations where uid = user_id)
I think something simple like this query will suffice.
If there is no uid for a particular entry in the value table, then there shouldn't be an entry in the relations table either.
SELECT vid
FROM values
LEFT JOIN relations on values.vid = relations.vid
WHERE relations.uid IS NULL
select distinct v.vid
from values v
left join relations r on (r.vid=v.vid)
where r.uid != user_id
It's unfortunate that MySQL doesn't support with; this is not going to perform very well, unfortunately.
If a Value appears at most once in your Relations table, you can use a JOIN for that:
SELECT `Values`.`vid` FROM `Values`
LEFT JOIN `Relations` ON (`Values`.`vid` = `Relations`.`vid`)
WHERE `Relations`.`uid` != 1 OR `Relations`.`uid` IS NULL;
The IS NULL test keeps Values that have no row in Relations at all, because NULL != 1 is not true. This will not work if a Value appears more than once in the Relations table, because the WHERE would then match another row with a different uid. A NOT IN sub-query that is filtered on the given uid does not have this particular problem, since it compares against the set of vids for that user only.
If every Value is at least once in the Relations table, the most efficient way is to query only the Relations table:
SELECT DISTINCT `Relations`.`vid` FROM `Relations`
WHERE `Relations`.`uid` != 1;
If a Value can be 0, 1, or more times in the Relations table, the best way is to use an EXISTS (see also taimur's answer):
SELECT `Values`.`vid` FROM `Values`
WHERE NOT EXISTS (
SELECT * FROM `Relations`
WHERE `Relations`.`vid` = `Values`.`vid` AND `Relations`.`uid` = 1
);
However, EXISTS can be slower than IN or a JOIN in some cases, so you should compare the execution times in your case.
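To make the difference concrete, here is a small comparison of the join filtered with != and the NOT EXISTS version. It is a sketch under a few assumptions: SQLite in memory (through Python's sqlite3) instead of MySQL, a table named vals instead of Values, and made-up rows where value 1 is related to two users and value 3 to nobody.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vals (vid INTEGER PRIMARY KEY);
    CREATE TABLE relations (cid INTEGER PRIMARY KEY, uid INTEGER, vid INTEGER);
    INSERT INTO vals VALUES (1), (2), (3);
    -- Value 1 belongs to users 1 and 2, value 2 to user 2, value 3 to nobody.
    INSERT INTO relations (uid, vid) VALUES (1, 1), (2, 1), (2, 2);
""")

def via_join(uid):
    """LEFT JOIN filtered with != : row-by-row test on the joined result."""
    sql = """
        SELECT DISTINCT v.vid
        FROM vals v
        LEFT JOIN relations r ON r.vid = v.vid
        WHERE r.uid != ?
        ORDER BY v.vid
    """
    return [row[0] for row in conn.execute(sql, (uid,))]

def via_not_exists(uid):
    """NOT EXISTS: asks whether any relation row links the value to uid."""
    sql = """
        SELECT v.vid
        FROM vals v
        WHERE NOT EXISTS (SELECT 1 FROM relations r
                          WHERE r.vid = v.vid AND r.uid = ?)
        ORDER BY v.vid
    """
    return [row[0] for row in conn.execute(sql, (uid,))]

print(via_join(1))        # [1, 2]  (value 1 leaks in, value 3 is lost)
print(via_not_exists(1))  # [2, 3]
```

Value 1 shows up in the join result because of its row for user 2, and value 3 drops out because its joined uid is NULL, so only NOT EXISTS returns the values that user 1 is not related to.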
Here is the scenario. I have a MySQL table called modules which contains one or more entries each identified by a unique string - the module ID (mid).
There are a few other tables (scripts, images, sets...) which contain objects, each of which "belongs" to one of the modules, identified by the column 'mid' in each table.
Prior to allowing a user to drop a module entry, I need to check that the operation will not leave any orphaned objects in any of the other tables. Here is an example to make this clearer
Table modules
mname mid
Mod1 abcd1234
Mod2 wxyz9876
Table scripts
sname mid
A abcd1234
B wxyz9876
Table images
iname mid
A abcd1234
Table sets
sname mid
One or more of the tables may contain no, or no matching, entries.
I have written and tested a spot of SQL to handle this.
SELECT COUNT(*) FROM `images` WHERE mid = 'abcd1234'
UNION
SELECT COUNT(*) FROM `sets` WHERE mid = 'abcd1234'
UNION
SELECT COUNT(*) FROM `scripts` WHERE mid = 'abcd1234'
which very obligingly returns 1 implying that the module is "in use" and cannot be dropped. However, my SQL skills are pretty basic. I would much appreciate anyone who could tell me if this is a safe way to do things.
Not really a good way.
The UNION without ALL removes duplicate results. That would give you 1 if you had 3 rows returning 1. UNION ALL will make it return 3 rows with the count for each table, even when they are duplicate. After that you SUM them and you get the final count.
You should do:
SELECT SUM(cnt) FROM (
SELECT COUNT(*) as cnt FROM `images` WHERE mid = 'abcd1234'
UNION ALL
SELECT COUNT(*) FROM `sets` WHERE mid = 'abcd1234'
UNION ALL
SELECT COUNT(*) FROM `scripts` WHERE mid = 'abcd1234'
) a
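The effect of UNION versus UNION ALL can be seen with the example tables from the question. This is just a sketch on an in-memory SQLite database (via Python's sqlite3), not MySQL; the variable names are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scripts (sname TEXT, mid TEXT);
    CREATE TABLE images  (iname TEXT, mid TEXT);
    CREATE TABLE sets    (sname TEXT, mid TEXT);
    -- Rows taken from the question; sets is empty.
    INSERT INTO scripts VALUES ('A', 'abcd1234'), ('B', 'wxyz9876');
    INSERT INTO images  VALUES ('A', 'abcd1234');
""")

mid = "abcd1234"

# Plain UNION removes duplicate counts, so the two 1s collapse into one row.
union_sql = """
    SELECT COUNT(*) FROM images  WHERE mid = ?
    UNION
    SELECT COUNT(*) FROM sets    WHERE mid = ?
    UNION
    SELECT COUNT(*) FROM scripts WHERE mid = ?
"""
union_rows = sorted(r[0] for r in conn.execute(union_sql, (mid, mid, mid)))

# UNION ALL keeps every per-table count, and SUM adds them up.
sum_sql = """
    SELECT SUM(cnt) FROM (
        SELECT COUNT(*) AS cnt FROM images  WHERE mid = ?
        UNION ALL
        SELECT COUNT(*) FROM sets    WHERE mid = ?
        UNION ALL
        SELECT COUNT(*) FROM scripts WHERE mid = ?
    ) a
"""
total = conn.execute(sum_sql, (mid, mid, mid)).fetchone()[0]

print(union_rows)  # [0, 1]
print(total)       # 2
```

A total of 0 would mean no dependent objects exist, so the module could be dropped safely.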
You could build something around the following concept, given that there is a one-to-many relation between modules and the other tables.
select mid
,count(scripts.sname) as scripts
,count(images.iname) as images
,count(sets.sname) as sets
from modules
left join images using(mid)
left join sets using(mid)
left join scripts using(mid)
where mid = 'abcd1234'
group by mid;
You could, for example, add the count(..) values together, or include a HAVING clause.
I want to be able to limit the number of duplicate records in a MySQL database table to 2.
(Excluding the id field which is auto increment)
My table is set up like
id city item
---------------------
1 Miami 4
2 Detroit 5
3 Miami 4
4 Miami 18
5 Miami 4
So in that table, only row 5 would be deleted.
How can I do this?
MySQL has some foibles when reading and writing to the same table. So I don't actually know if this will work, the syntax is fine in many implementations of SQL, but I don't know if it's MySQL friendly...
DELETE FROM
yourTable
WHERE
1 < (SELECT COUNT(*)
FROM yourTable as Lookup
WHERE city = yourTable.city AND item = yourTable.item AND id < yourTable.id)
EDIT
Amazingly convoluted, but worth a try?
DELETE
yourTable
FROM
yourTable
INNER JOIN
(
SELECT
id
FROM
(
SELECT
id
FROM
yourTable
WHERE
1 < (SELECT COUNT(*)
FROM yourTable as Lookup
WHERE city = yourTable.city AND item = yourTable.item AND id < yourTable.id)
)
AS inner_deletes
)
AS deletes
ON deletes.id = yourTable.id
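The correlated-count condition used above (delete a row when more than one earlier row has the same city and item) can be checked on the sample data from the question. This sketch does not test MySQL's read/write restriction; it runs on in-memory SQLite via Python's sqlite3 and sidesteps the issue by selecting the ids first and deleting them in a second step. The variable names doomed and remaining are mine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id INTEGER PRIMARY KEY, city TEXT, item INTEGER);
    INSERT INTO yourTable VALUES
        (1, 'Miami', 4), (2, 'Detroit', 5), (3, 'Miami', 4),
        (4, 'Miami', 18), (5, 'Miami', 4);
""")

# Step 1: collect ids that have two or more earlier rows
# with the same (city, item) pair.
doomed = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM yourTable t
    WHERE 1 < (SELECT COUNT(*)
               FROM yourTable AS lookup
               WHERE lookup.city = t.city
                 AND lookup.item = t.item
                 AND lookup.id < t.id)
""")]

# Step 2: delete exactly those ids in a separate statement.
conn.executemany("DELETE FROM yourTable WHERE id = ?", [(i,) for i in doomed])

remaining = [r[0] for r in conn.execute("SELECT id FROM yourTable ORDER BY id")]
print(doomed)     # [5]
print(remaining)  # [1, 2, 3, 4]
```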
I think your problem here is that both your code and/or table structure allows inserting duplicates and you are asking this question when you should really fix your db and/or code.
I think a better solution is to avoid allowing more than 5 records in the first place: implement a validation so that if select count(*) > 3, the new insert is not accepted.
If you want to do this in the data tier, you have to use a stored procedure, because you first need to identify all the records that appear more than 3 times and then delete only the last one.
Regards
Due to MySQL being notoriously difficult when it comes to updating queried tables (see for example the answers from Dems), the best I can figure out is sadly more than one statement but on the plus side fairly readable;
CREATE TEMPORARY TABLE Dump AS SELECT id FROM table1 WHERE id NOT IN
(SELECT MIN(id) FROM table1 GROUP BY city,item UNION
SELECT MAX(id) FROM table1 GROUP BY city,item);
DELETE FROM table1 where id in (select * from Dump);
DROP TABLE Dump;
Not sure if it was important which duplicate was removed, this keeps the first and last.
In your reply to Joachim's answer, you ask about saving 3 or 5 rows, this is one way to accomplish it. Depending on how you are using this database, you could either call this in a loop, or you could turn it into a stored procedure. Either way, you would continue to run this entire block of code until Rows Affected = 0:
drop table if exists TempTable;
create table TempTable
select city, item,
count(*) as record_count,
min(id) as ItemToDrop -- this could be changed to max() if you
-- want to delete new stuff instead
from YourTable
group by city, item
having count(*) > 2; -- This value = number of rows you save
delete from YourTable
where id in (select ItemToDrop from TempTable);
I have an optimisation question here.
Background
I have 12,000 users in a user table, one record per user. Each user can be in zero or more groups. I have a groups table with 45 groups and a groups_link table with 75,000 records (to facilitate the many-to-many relationship).
I am making a querying screen which allows a user to filter users from the user list.
Aim
Specifically, I need help with: Selecting all users that are in one group but are not in another group.
DB Structure
Query
My current query which runs too slowly...
SELECT U.user_id,U.user_email
FROM (sc_module_users AS U)
JOIN sc_module_users_groups_links AS group_join ON group_join.user_id = U.user_id
LEFT JOIN sc_module_users_groups_links AS excluded_group_join ON group_join.user_id = U.user_id
WHERE group_join.group_id IN (27) AND excluded_group_join.group_id NOT IN (19) OR excluded_group_join.group_id IS NULL AND U.user_subscribed=1 AND U.user_active=1
GROUP BY U.user_id,U.user_id
This query takes 9 minutes to complete, it returns 11,000 records (out of 12,000).
Explain
Here's the explain on that query:
Can anyone help me optimise this to below the 1 minute mark...?
After 3 revisions, I changed it to this
SELECT U.user_id,U.user_email FROM (sc_module_users AS U) WHERE ( user_country LIKE '%australia%' ) AND
EXISTS (SELECT group_id FROM sc_module_users_groups_links l WHERE group_id in (31) AND l.user_id=U.user_id) AND
NOT EXISTS (SELECT group_id FROM sc_module_users_groups_links l WHERE group_id in (27) AND l.user_id=U.user_id)
AND U.user_subscribed=1 AND U.user_active=1 GROUP BY U.user_id
Much faster.
EDIT: removed my query suggestion but the index stuff should still apply:
The indexes on sc_module_users_groups_links could be improved by creating a composite index on just user_id and group_id. The order of the columns in the index can have an impact; I believe having user_id first should perform better.
You could also try removing the link_id and just using a composite primary key since the link_id doesn't seem to serve any other purpose.
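As a rough illustration of the composite-key idea combined with the EXISTS / NOT EXISTS form of the query, here is a sketch on an in-memory SQLite database (via Python's sqlite3), not the real MySQL schema. The column list is trimmed to what the query needs, the sample users and the helper name users_in_but_not_in are invented, and it says nothing about actual timings on the 75,000-row table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sc_module_users (
        user_id INTEGER PRIMARY KEY, user_email TEXT,
        user_subscribed INTEGER, user_active INTEGER);
    -- No link_id: the (user_id, group_id) pair is the primary key,
    -- which also gives a composite index with user_id first.
    CREATE TABLE sc_module_users_groups_links (
        user_id INTEGER, group_id INTEGER,
        PRIMARY KEY (user_id, group_id));
    INSERT INTO sc_module_users VALUES
        (1, 'a@example.com', 1, 1),
        (2, 'b@example.com', 1, 1),
        (3, 'c@example.com', 1, 1),
        (4, 'd@example.com', 0, 1);
    INSERT INTO sc_module_users_groups_links VALUES
        (1, 27), (1, 19), (2, 27), (3, 19), (4, 27);
""")

def users_in_but_not_in(included_group, excluded_group):
    """Subscribed, active users in one group and not in another."""
    sql = """
        SELECT U.user_id
        FROM sc_module_users AS U
        WHERE U.user_subscribed = 1 AND U.user_active = 1
          AND EXISTS (SELECT 1 FROM sc_module_users_groups_links l
                      WHERE l.user_id = U.user_id AND l.group_id = ?)
          AND NOT EXISTS (SELECT 1 FROM sc_module_users_groups_links l
                          WHERE l.user_id = U.user_id AND l.group_id = ?)
        ORDER BY U.user_id
    """
    return [r[0] for r in conn.execute(sql, (included_group, excluded_group))]

print(users_in_but_not_in(27, 19))  # [2]
```

Each sub-query looks up one (user_id, group_id) pair, which is exactly what the composite key covers.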
I believe the very first thing you need to do is to place parentheses:
-- should be
.. AND ( excluded_group_join.group_id NOT IN (19)
OR excluded_group_join.group_id IS NULL) AND ....
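The reason the parentheses matter is that AND binds more tightly than OR. A tiny sketch (SQLite through Python's sqlite3, with the five conditions of the WHERE clause reduced to 0/1 flags that I made up) shows the two readings giving different answers for the same inputs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Flags, in order: in_group, not_in_excluded, excluded_is_null,
# subscribed, active. Here the user is NOT in the wanted group.
flags = (0, 0, 1, 1, 1)

# Without parentheses this is read as (a AND b) OR (c AND d AND e).
loose_sql = "SELECT ? AND ? OR ? AND ? AND ?"
# With parentheses the OR only covers the two exclusion tests.
strict_sql = "SELECT ? AND (? OR ?) AND ? AND ?"

loose = conn.execute(loose_sql, flags).fetchone()[0]
strict = conn.execute(strict_sql, flags).fetchone()[0]

print(loose)   # 1: the row passes even though in_group is false
print(strict)  # 0: the row is rejected, as intended
```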