Retrofitting an Incremental ID column with UPDATE - mysql

I have a MySQL table with an ID column, intended to be imported from external data, but for this one case there were no IDs provided. Therefore, I have a column of pure zeros. I need to update this column to have unique values in each row, and these numbers have no significance. I can think of several ways to do it, but I'm wondering what the correct way would be. My first idea was something like UPDATE table SET id = (SELECT max(id)+1 FROM table) - but I like elegant solutions when available.
Edit: It turns out running this query triggers 1093: You can't specify target table for update in FROM clause - I guess that means I really need a more elegant solution, too.

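-- MySQL has no UPDATE ... FROM; its multi-table form is UPDATE ... JOIN ... SET.
-- Joining a derived table instead of using a direct subquery should avoid error 1093.
-- Note: q holds a single row, so every row of t receives the same value.
UPDATE `table` t
INNER JOIN
( SELECT MAX(id)+1 AS id
FROM `table`
)
q
ON 1 = 1
SET t.id = q.id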

Can't you generate a view of the table with an added expression column that generates a unique identifier, much like you are proposing, and then import the view?

I ended up doing this, since nothing was working:
update table set accountnumber = floor(rand() * 999999999)
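
Random values like those above are not guaranteed to be unique. Since the numbers carry no meaning, a running counter is one alternative. The following is only a minimal sketch, run against SQLite through Python's sqlite3 module as a stand-in for MySQL; the accounts table and its contents are hypothetical, and the MySQL idiom mentioned in the comment is an untested assumption.

```python
import sqlite3

# Hypothetical table standing in for the real one: every id starts at 0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER NOT NULL, name TEXT)")
conn.executemany(
    "INSERT INTO accounts (id, name) VALUES (0, ?)",
    [("alice",), ("bob",), ("carol",)],
)

# SQLite exposes an implicit rowid per row, used here only as a handle to
# address rows one at a time. MySQL has no rowid; there, a session counter
# such as
#   SET @i := 0; UPDATE accounts SET id = (@i := @i + 1);
# is a commonly cited equivalent (an assumption, not exercised by this sketch).
rowids = [r[0] for r in conn.execute("SELECT rowid FROM accounts ORDER BY rowid")]
conn.executemany(
    "UPDATE accounts SET id = ? WHERE rowid = ?",
    [(new_id, rowid) for new_id, rowid in enumerate(rowids, start=1)],
)

ids = [r[0] for r in conn.execute("SELECT id FROM accounts ORDER BY id")]
print(ids)
```

Numbering from 1 upward guarantees uniqueness by construction, which a random draw cannot.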

Related

Pull records from one table where 1 variable exists in a second table? Very large tables

I am completely new to database coding, and I've tried Googling but cannot seem to figure this out. I imagine there's a simple solution. I have a very large table with MemberIDs and a few other relevant variables that I want to pull records from (table1). I also have a second large table of distinct MemberIDs (table2). I want to pull rows from table 1 where the MemberID exists in table2.
Here’s how I tried to do it, and for some reason I suspect this isn’t working correctly, or there may be a much better way to do this.
proc sql;
create table tablewant as select
MemberID, var1, var2, var3
from table1
where exists (select MemberID from table2)
;
quit;
Is there anything wrong with the way I’m doing this? What's the best way to solve this when working with extremely large tables (over 100 million records)? Would doing some sort of join be better? Also, do I need to change
where exists (select MemberID from table2)
to
where exists (select MemberID from table2 where table1.MemberID = table2.MemberID)
?
You want to implement a "semi-join". Your second solution is correct:
select MemberID, var1, var2, var3
from table1
where exists (
select 1 from table2 where table1.MemberID = table2.MemberID
)
Notes:
There's no need to select anything special in the subquery since it's not checking for values, but for row existence instead. For example, 1 will do, as well as *, or even null. I tend to use 1 for clarity.
The query needs to access table2, and that access should be optimized, especially for tables this large. Consider adding the index below if you haven't created it already:
create index ix1 on table2 (MemberID);
The query has no filtering criteria. That means the engine will read all 100 million rows of table1 and check each one for a matching row in table2. This will unavoidably take a long time. Are you sure you want to read them all? Maybe you need to add a filtering condition, but I don't know your requirements in this respect.
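
To see why the correlation matters, here is a small sketch using Python's sqlite3 module as a stand-in engine (the question uses SAS proc sql, so this is an illustration under that assumption; the sample rows are made up). The uncorrelated form only asks whether table2 has any row at all, so it keeps every row of table1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (MemberID INTEGER, var1 TEXT);
    CREATE TABLE table2 (MemberID INTEGER);
    CREATE INDEX ix1 ON table2 (MemberID);
    INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO table2 VALUES (2), (3), (4);
""")

# Uncorrelated: true for every row of table1 as long as table2 is non-empty.
uncorrelated = conn.execute(
    "SELECT MemberID FROM table1 "
    "WHERE EXISTS (SELECT MemberID FROM table2) "
    "ORDER BY MemberID"
).fetchall()

# Correlated: the subquery is evaluated against the current table1 row.
correlated = conn.execute(
    "SELECT MemberID FROM table1 "
    "WHERE EXISTS (SELECT 1 FROM table2 "
    "              WHERE table1.MemberID = table2.MemberID) "
    "ORDER BY MemberID"
).fetchall()

print(uncorrelated)
print(correlated)
```

Only the correlated version behaves as a semi-join and drops MemberID 1, which has no match in table2.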

Query to see if row exists based on data without selecting fields

I know how to check whether a row exists for a given set of data, but a lot of the examples I find involve selecting a row and a bunch of fields, which I don't need in this case.
I just need to know that the row exists. Is there a way to check for a row's existence without selecting or fetching it, since that is somewhat redundant data?
If not, I will stick with my SELECT id approach, but I wanted to see whether I had missed a better way to just ping the existence of a row.
Currently I am doing:
SELECT uid FROM users WHERE sessionID = ? AND uid = ?
Then I check whether the row count is == 1 afterwards. But I am still needlessly fetching uid, which I technically already have. It seems inefficient. So perhaps there is a better way built into MySQL?
You can do:
select (exists (select 1 from t) ) as exists_flag
This returns 1 if the row exists or 0 if no row exists. You can add a where clause to the subquery if you want a particular row.
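
Applied to the users query from the question, the pattern might look like the sketch below. It runs against SQLite through Python's sqlite3 module as a stand-in for MySQL; the helper name session_exists and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (uid INTEGER, sessionID TEXT)")
conn.execute("INSERT INTO users VALUES (7, 'abc123')")


def session_exists(conn, session_id, uid):
    """Return True if a matching row exists, without fetching its columns."""
    row = conn.execute(
        "SELECT EXISTS (SELECT 1 FROM users WHERE sessionID = ? AND uid = ?)",
        (session_id, uid),
    ).fetchone()
    # The query always returns exactly one row holding 1 or 0.
    return bool(row[0])


found = session_exists(conn, "abc123", 7)
missing = session_exists(conn, "abc123", 8)
print(found, missing)
```

Because the result is always a single row, there is no need to inspect a row count afterwards.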

Finding the differences in two tables

I have two large tables in a database. They both contain a column called "name". My goal is to locate rows that contain names that are in one table but not the other.
I'm guessing there will be a join statement and a where, but I cannot figure out how to use the two in tandem in order to create a successful query.
Suggestions?
SELECT * FROM TABLE_A WHERE NAME NOT IN
( SELECT NAME FROM TABLE_B )
EXISTS might be faster than IN, see Difference between EXISTS and IN in SQL?.
You can use EXISTS like this. It's useful to know both approaches since they are not exactly equal. You can swap the EXISTS quantifier for SOME, ALL or ANY. I think you can figure out what would happen :)
select * from a1 where not exists(select 1 from a2 where name=a1.name);
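Note that the two are not 100% equal, because SQL has three-valued logic: if the subquery in the NOT IN version returns even one NULL name, NOT IN yields no rows at all, whereas NOT EXISTS still returns the unmatched names.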
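
The three-valued-logic difference shows up as soon as the subquery column contains a NULL. The sketch below uses Python's sqlite3 module with made-up sample names; SQLite is only a stand-in here, but it follows the standard NULL semantics for NOT IN and NOT EXISTS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TABLE_A (NAME TEXT);
    CREATE TABLE TABLE_B (NAME TEXT);
    INSERT INTO TABLE_A VALUES ('ann'), ('bob');
    INSERT INTO TABLE_B VALUES ('ann'), (NULL);
""")

# 'bob' NOT IN ('ann', NULL) evaluates to NULL (unknown), so the row is dropped.
not_in = conn.execute(
    "SELECT NAME FROM TABLE_A "
    "WHERE NAME NOT IN (SELECT NAME FROM TABLE_B)"
).fetchall()

# NOT EXISTS only asks whether a matching row was found, so NULLs do no harm.
not_exists = conn.execute(
    "SELECT NAME FROM TABLE_A a "
    "WHERE NOT EXISTS (SELECT 1 FROM TABLE_B b WHERE b.NAME = a.NAME)"
).fetchall()

print(not_in)
print(not_exists)
```

If the column can hold NULLs, either filter them out of the NOT IN subquery or prefer NOT EXISTS.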

MySQL: Grabbing the latest ID from duplicate records within a table

I'm trying to grab the latest ID from a duplicate record within my table, without using a timestamp to check.
SELECT *
FROM `table`
WHERE `title` = "bananas"
table:
id  title
--  -------
1   bananas
2   apples
3   bananas
Ideally, I want to grab the ID 3.
I'm slightly confused by the SELECT in your example, but hopefully you will be able to piece this out from my example.
If you want to return the latest row, you can simply use the MAX() function, restricted to the title you are interested in:
SELECT MAX(id) FROM `table` WHERE `title` = "bananas"
I definitely recommend trying to determine what makes that row the "latest", though. If it's just because it has the highest value in the [id] column, you may want to consider what happens down the road. What if you want to combine two databases that use the same data? Going off the [id] column might not be the best decision. If you can, I suggest adding a [LastUpdated] or [Added] datestamp column to your design.
I'm assuming the IDs are auto-incremented.
You can count how many rows you have, store that count in a variable, and then have the WHERE clause check the ID against that variable.
But this is a hack: if you delete a row, the remaining IDs are not renumbered, so the row count no longer matches the highest ID.
select max(a.id) from mydb.myTable a join mydb.myTable b on a.id <> b.id and a.title=b.title;
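
The MAX() suggestions above can be tried on the sample data from the question. This is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL; the table name items is hypothetical (chosen to avoid the reserved word table), and the GROUP BY ... HAVING variant is one possible way to get the latest ID for every duplicated title, not something taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (id INTEGER PRIMARY KEY, title TEXT);
    INSERT INTO items VALUES (1, 'bananas'), (2, 'apples'), (3, 'bananas');
""")

# Highest id for one specific title.
latest_bananas = conn.execute(
    "SELECT MAX(id) FROM items WHERE title = ?", ("bananas",)
).fetchone()[0]

# Highest id for every title that occurs more than once.
latest_per_duplicate = conn.execute(
    "SELECT title, MAX(id) FROM items "
    "GROUP BY title HAVING COUNT(*) > 1"
).fetchall()

print(latest_bananas)
print(latest_per_duplicate)
```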

In a subselect query, how do I refer to the id of the top-level select statement?

I wanted to automate my table population for testing purposes.
I needed to edit some columns in a certain table, but I must make sure that the values I put in a given column are not purely random.
The values actually come from another table, subject to a certain condition.
How can I do that? Just like this code:
update table_one set `some_id`=(select some_id from another_table where another_table.primary_id=table_one.primary_id order by rand() limit 1)
The WHERE clause is the condition for my subselect query: it should match the id of the current row I am updating.
I really forgot my SQL now. Thanks for the answers though.
You're almost there - all you need to do is qualify the column you're selecting in the subquery, so you know it comes from the correct table:
update table_one
set some_id = (
    select another_table.some_id
    from another_table
    where another_table.primary_id = table_one.primary_id
    order by rand()
    limit 1
)
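
A quick way to check that the subquery really is evaluated per outer row is to run the statement on a few rows. The sketch below uses Python's sqlite3 module as a stand-in for MySQL, so rand() becomes SQLite's random(); the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one (primary_id INTEGER, some_id INTEGER);
    CREATE TABLE another_table (primary_id INTEGER, some_id INTEGER);
    INSERT INTO table_one VALUES (1, NULL), (2, NULL);
    INSERT INTO another_table VALUES (1, 10), (1, 11), (2, 20);
""")

# The inner WHERE refers to table_one.primary_id of the row being updated,
# so each row only draws from the candidates that share its primary_id.
conn.execute("""
    UPDATE table_one
    SET some_id = (
        SELECT another_table.some_id
        FROM another_table
        WHERE another_table.primary_id = table_one.primary_id
        ORDER BY random()
        LIMIT 1
    )
""")

result = dict(conn.execute("SELECT primary_id, some_id FROM table_one"))
print(result)
```

Row 1 ends up with either 10 or 11, while row 2 can only receive 20.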