I've been trying to clean up a relational database for a month, but I can't find an efficient solution.
Here is my problem:
I have a relational database with about 534 M rows and lots of foreign keys (30).
I can handle normal duplicates with UNION ... GROUP BY ... HAVING COUNT(*) = 1 when inserting, but there are also duplicates with different keys.
Example:
table 1
id | key1 | value
1 | 11 | a1
2 | 22 | a1
table 2
key1 | value
11 | a2
22 | a2
Foreign key table1(key1) references table2(key1)
I'm trying to find and remove the duplicates and correct the parents.
I have tried 3 different ways:
1: PHP script, arrays
export tables (dump) --> array_unique, find duplicates, correct the parents array --> import tables
It's pretty fast, but it needs 80 GB of memory, which could be a problem in the future.
2: PHP script, SQL queries
export tables (dump) --> find duplicates --> send queries to parent table
No extra memory is needed, but the tables are really big and 5 queries take 1 second, so 50 M duplicates would take days, months, years.
3: ON DUPLICATE KEY UPDATE: I added a column 'duplicate' to store duplicate keys and defined all columns except the key as a unique key,
INSERT ... ON DUPLICATE KEY UPDATE duplicate = CONCAT(duplicate, ';', VALUES(key)).
But some tables have more than 1 key, and sometimes I would have to define 24 columns as a unique index, which brings back the memory problem.
I hope I have explained my problem clearly. Do you have any ideas?
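Why don't you simply create a unique key on the column? With the IGNORE keyword, MySQL drops the duplicate records while it builds the index. Your query would look something like this:
ALTER IGNORE TABLE testdb.table1
ADD UNIQUE INDEX column1 (column1 ASC);
Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions.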
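A unique index alone does not repoint the child rows, so here is a minimal set-based sketch of the "find duplicates, correct the parents" step from the question. It runs on Python's built-in sqlite3 as a stand-in for MySQL; the `key_map` helper table and the rule "keep the smallest key per value" are assumptions for illustration, and nothing here has been benchmarked at 534 M rows. MySQL would need its own syntax for some statements (for example, it does not allow a DELETE with a plain subquery on the same table).

```python
import sqlite3

# SQLite stands in for MySQL; table1/table2 follow the example in the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table2 (key1 INTEGER PRIMARY KEY, value TEXT);
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        key1 INTEGER REFERENCES table2(key1),
        value TEXT
    );
    INSERT INTO table2 VALUES (11, 'a2'), (22, 'a2');
    INSERT INTO table1 VALUES (1, 11, 'a1'), (2, 22, 'a1');

    -- Hypothetical helper: map every parent key to the smallest key
    -- that carries the same value.
    CREATE TABLE key_map AS
        SELECT t.key1 AS old_key, m.keep_key
        FROM table2 t
        JOIN (SELECT value, MIN(key1) AS keep_key
              FROM table2 GROUP BY value) m
          ON m.value = t.value;

    -- Repoint the children at the surviving parent, then drop the
    -- redundant parents.
    UPDATE table1
       SET key1 = (SELECT keep_key FROM key_map
                   WHERE old_key = table1.key1);
    DELETE FROM table2
     WHERE key1 NOT IN (SELECT keep_key FROM key_map);

    -- The children are now ordinary duplicates: keep the lowest id
    -- per (key1, value).
    DELETE FROM table1
     WHERE id NOT IN (SELECT MIN(id) FROM table1 GROUP BY key1, value);
""")

parents = conn.execute(
    "SELECT key1, value FROM table2 ORDER BY key1").fetchall()
children = conn.execute(
    "SELECT id, key1, value FROM table1 ORDER BY id").fetchall()
print(parents, children)
```

The point of the mapping table is that each level of the hierarchy is fixed with a handful of statements instead of one query per duplicate.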
How can I produce output like this from 4 tables (rate, cost, tools, car)?
|`RateID` | `Costing` | `Toolsfk` and `CarFK` |
|---------------------------------------------|
| 1 | 1000 | 1004 |
| 2 | 2000 | 2003 |
These are the tables.
I want to store 2 or more foreign keys in 1 column. Should I use CONCAT? As far as I know, CONCAT is for output only. So what should I do if I need this for both output and input of data in the database? Just use INSERT?
A polymorphic association is not possible. Polymorphic association means that a single column holds foreign keys for more than one table. A foreign key can target only one table at a time, so a single-column foreign key cannot refer to more than one table. If you want to refer to more than one table, there are two ways:
1- create an identical copy of the referenced table
2- use two columns in the table: one column refers to the tools table and one column refers to the car table
Below is a link you can use as a reference:
(Possible to do a MySQL foreign key to one of two possible tables?)
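As a rough illustration of the second option, the sketch below gives the rate table one nullable foreign key column per target table and merges them only for display. It uses Python's built-in sqlite3 as a stand-in for MySQL; the column names (`tools_fk`, `car_fk`, `tools_id`, `car_id`) and the guess that 1004 is a tool and 2003 is a car are assumptions, since the original table definitions are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
conn.executescript("""
    CREATE TABLE tools (tools_id INTEGER PRIMARY KEY);
    CREATE TABLE car   (car_id   INTEGER PRIMARY KEY);
    -- One foreign key column per referenced table; each may be NULL.
    CREATE TABLE rate (
        rate_id  INTEGER PRIMARY KEY,
        costing  INTEGER,
        tools_fk INTEGER REFERENCES tools(tools_id),
        car_fk   INTEGER REFERENCES car(car_id)
    );
    INSERT INTO tools VALUES (1004);
    INSERT INTO car   VALUES (2003);
    INSERT INTO rate  VALUES (1, 1000, 1004, NULL), (2, 2000, NULL, 2003);
""")

# COALESCE folds the two columns into one for output only.
rows = conn.execute("""
    SELECT rate_id, costing, COALESCE(tools_fk, car_fk)
    FROM rate ORDER BY rate_id
""").fetchall()
print(rows)

# A value that exists in neither parent table is rejected.
try:
    conn.execute("INSERT INTO rate VALUES (3, 3000, 9999, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(rejected)
```

Data goes in through the two separate columns, so each value is still checked against its own parent table; the combined column exists only in the SELECT.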
I have a few huge (10s of millions of rows in 1 table) legacy tables that have a few columns I should definitely have set up to be foreign keys when I created them, but without a time machine, I have to migrate them.
What's in these columns is like a short "key" for something else. A real example of one of these new tables would be:
+----+-----+---------+
| id | key | name |
+----+-----+---------+
| 1 | bf | BetFair |
| 2 | bd | BetDaq |
+----+-----+---------+
And a current row, in the current table has something like,
.
(bet_id=1234, odds=2.1, source='bf')
(bet_id=1235, odds=2.15, source='bd')
.
And what I want the eventual outcome to be,
.
(bet_id=1234, odds=2.1, source_id=1)
(bet_id=1235, odds=2.15, source_id=2)
.
I know how to do this in multiple steps: create the new tables, add all the data from the source tables to the new tables with GROUP BY / DISTINCT, and finally set the new foreign key id columns with commands like,
UPDATE BetsTable SET source_id=1 WHERE source='bf',
I'm just wondering if there's more of a "one-shot", efficient SQL command to update the entire table in one step, rather than multiple.
If you just want to change the data, try this (key is a reserved word in MySQL, so it has to be quoted with backticks):
UPDATE BetsTable b
JOIN NewTable n ON n.`key` = b.source
SET b.source = n.id
If you really need to make the source column a foreign key, you will need to change its data type first. But in that case I'm pretty sure it would be more efficient to rebuild the table.
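To end up with the source_id column from the question rather than overwriting source, one possible sequence is: add the new column, then fill it for all rows in a single statement. The sketch below runs on Python's built-in sqlite3 as a stand-in for MySQL, so it uses a correlated subquery; in MySQL the same step could be the UPDATE ... JOIN above with SET b.source_id = n.id. The table names BetsTable and NewTable come from the thread; everything else is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE NewTable (
        id    INTEGER PRIMARY KEY,
        "key" TEXT UNIQUE,
        name  TEXT
    );
    INSERT INTO NewTable VALUES (1, 'bf', 'BetFair'), (2, 'bd', 'BetDaq');

    CREATE TABLE BetsTable (
        bet_id INTEGER PRIMARY KEY,
        odds   REAL,
        source TEXT
    );
    INSERT INTO BetsTable VALUES (1234, 2.1, 'bf'), (1235, 2.15, 'bd');

    -- Step 1: add the new foreign key column (NULL until filled).
    ALTER TABLE BetsTable
        ADD COLUMN source_id INTEGER REFERENCES NewTable(id);

    -- Step 2: one statement fills it for every row via the lookup table.
    UPDATE BetsTable
       SET source_id = (SELECT n.id FROM NewTable n
                        WHERE n."key" = BetsTable.source);
""")

rows = conn.execute(
    "SELECT bet_id, odds, source_id FROM BetsTable ORDER BY bet_id"
).fetchall()
print(rows)
```

Once source_id is verified, the old source column can be dropped in a separate step.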
I have a table with two columns (idA and idB). The table assigns Bs to As, like this:
A | B
1 | 4
3 | 2
3 | 4
4 | 1
4 | 3 ...
So one A can have multiple Bs and thus shows up in more than one row. Hence, the table cannot have a primary key and I cannot use a unique column.
Is there a way to insert new rows only if an equal value pairing does not already exist, all in one query?
I tried REPLACE INTO and INSERT IGNORE INTO as mentioned here, but both seem to work for tables with primary keys only.
You can add a primary key! It just has to be over two columns and not just one.
ALTER TABLE your_table
ADD PRIMARY KEY(idA, idB)
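That will make sure each (idA, idB) combination appears only once, while each column on its own can still repeat. With that key in place, INSERT IGNORE skips pairs that already exist.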
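A small sketch of how the composite key behaves on insert, using Python's built-in sqlite3 as a stand-in for MySQL. SQLite spells the statement INSERT OR IGNORE where MySQL uses INSERT IGNORE; the sample pairs are taken from the question, with (3, 4) deliberately repeated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE your_table (
        idA INTEGER NOT NULL,
        idB INTEGER NOT NULL,
        PRIMARY KEY (idA, idB)   -- the key spans both columns
    )
""")

# (3, 4) appears twice; the second copy is silently skipped.
pairs = [(1, 4), (3, 2), (3, 4), (3, 4), (4, 1)]
conn.executemany(
    "INSERT OR IGNORE INTO your_table (idA, idB) VALUES (?, ?)", pairs)

rows = conn.execute(
    "SELECT idA, idB FROM your_table ORDER BY idA, idB").fetchall()
print(rows)
```

idA = 3 still shows up in two rows, because only the pair has to be unique.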
I am learning MySQL and have MariaDB installed in Fedora 19.
I have a scenario where I require a column to contain multiple values in order to reduce possible redundancy of column allocation.
In the example below, is it possible to have each value in the tags column of the log table reference the tag_id column in the tags table?
users
user_id |
1 |
activities
activity_id |
1 |
log
user_id | activity_id | tags
1 | 1 | 1,3,5 # multiple foreign keys?
tags
tag_id |
1 |
2 |
3 |
4 |
5 |
If it is not possible, could anyone provide the logic for the most feasible solution based on the data scenario above?
Similar Questions:
Are multiple foreign keys in a single field possible?
MySQL foreign key having multiple (conditional) possible values
it is possible to reference one column as multiple foreign keys
If you do not wish to create a "middle man" table for linking the two tables, you can keep a comma-separated value in the field. You would then need to use MySQL's FIND_IN_SET function when querying. Note that MySQL cannot enforce a foreign key on the individual values inside such a field.
Using FIND_IN_SET:
SELECT
log.user_id, log.activity_id, log.tags,
GROUP_CONCAT(tags.name) AS taggedNames -- this assumes there is a field called `name` in the tags table
FROM
log
LEFT JOIN tags
ON
FIND_IN_SET(tags.tag_id, log.tags)
GROUP BY
log.user_id, log.activity_id, log.tags
GROUP_CONCAT joins the values of a field within each group and separates them with a delimiter; the default is a comma.
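For comparison, here is a minimal sketch of the "middle man" table alternative, where each tag reference is its own row and can carry a real foreign key. It uses Python's built-in sqlite3 as a stand-in for MySQL. The `log_tags` table, the `name` column, and the tag names are hypothetical; only users, activities, log, and tags appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE log (
        user_id     INTEGER,
        activity_id INTEGER,
        PRIMARY KEY (user_id, activity_id)
    );
    -- Hypothetical link table: one row per (log entry, tag).
    CREATE TABLE log_tags (
        user_id     INTEGER,
        activity_id INTEGER,
        tag_id      INTEGER REFERENCES tags(tag_id),
        PRIMARY KEY (user_id, activity_id, tag_id),
        FOREIGN KEY (user_id, activity_id)
            REFERENCES log(user_id, activity_id)
    );
    INSERT INTO tags VALUES (1,'t1'), (2,'t2'), (3,'t3'), (4,'t4'), (5,'t5');
    INSERT INTO log VALUES (1, 1);
    INSERT INTO log_tags VALUES (1, 1, 1), (1, 1, 3), (1, 1, 5);
""")

# Same report as the FIND_IN_SET query, but through ordinary joins.
row = conn.execute("""
    SELECT l.user_id, l.activity_id, GROUP_CONCAT(t.name)
    FROM log l
    LEFT JOIN log_tags lt
           ON lt.user_id = l.user_id AND lt.activity_id = l.activity_id
    LEFT JOIN tags t ON t.tag_id = lt.tag_id
    GROUP BY l.user_id, l.activity_id
""").fetchone()
print(row)
```

The trade-off is one extra table in exchange for constraints the database can check and joins that can use indexes.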
I wasn't sure how to explain this in the title, but what I have is a table like this:
user_id | subscription_id
6 12
6 10
12 6
4 12
Each user can subscribe to all other users, but is it possible to prevent a user from subscribing to another user twice through an INSERT query?
As my subscription_id is not unique, this happens:
user_id | subscription_id
6 12
6 12
And I want to avoid that. As far as I know, INSERT IGNORE, INSERT UPDATE and ON DUPLICATE only work with unique keys.
You need to set up your database table to have a composite primary key on user_id AND subscription_id.
That way each row has to be unique across the combination of both columns.
See: How to properly create composite primary keys - MYSQL
The only reliable and easy ways to make sure that a tuple cannot occur more than once inside a single table are:
Use a spanning unique key
Use a spanning primary key
Maybe triggers
The first two are roughly the same, but unique keys treat NULL values as distinct from each other, so two rows that are NULL in a key column are not seen as duplicates. That might not work for you.
ALTER TABLE user_subscriptions ADD PRIMARY KEY (user_id, subscription_id);
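The NULL caveat for unique keys can be seen in a short sketch. It uses Python's built-in sqlite3 as a stand-in for MySQL; both treat NULLs as distinct inside a unique index. The `try_insert` helper is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE user_subscriptions (
        user_id         INTEGER,
        subscription_id INTEGER,
        UNIQUE (user_id, subscription_id)   -- spanning unique key
    )
""")

def try_insert(user_id, subscription_id):
    """Return True if the row was accepted, False if the key rejected it."""
    try:
        conn.execute(
            "INSERT INTO user_subscriptions VALUES (?, ?)",
            (user_id, subscription_id))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(6, 12),    # first (6, 12): accepted
    try_insert(6, 12),    # exact repeat: rejected by the unique key
    try_insert(6, None),  # NULL subscription: accepted
    try_insert(6, None),  # same again: also accepted, NULLs never collide
]
print(results)
```

A primary key avoids this case because its columns cannot be NULL in the first place.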