UPDATE/INSERT INTO/DELETE FROM table in MySQL

I have a table TABLE1 in a MySQL database with columns COLA, COLB and COLC, where (COLA, COLB) form the composite primary key. Something like this:
-----------------------
| COLA | COLB | COLC |
-----------------------
| A | B | C |
-----------------------
| A | Q | D |
-----------------------
| A | E | J |
-----------------------
| B | W | P |
-----------------------
There is also a background script that passes data to a Java program, which should update the table under the following conditions:
If a new row's primary key pair (COLA, COLB) is not in the table, INSERT it as a new row.
If a new row's primary key pair already exists in the table, UPDATE that row.
DELETE all other rows whose COLA value matches the new values.
If the new values are ('A','B','L'),('A','Y','D'),('A','Q','Z') then it should:
UPDATE the 1st and 2nd rows.
INSERT a new row ('A','Y','D').
DELETE only the 3rd row.
So the table should look like
-----------------------
| COLA | COLB | COLC |
-----------------------
| A | B | L |
-----------------------
| A | Q | Z |
-----------------------
| B | W | P |
-----------------------
| A | Y | D |
-----------------------
To implement this I was running two queries :
INSERT INTO TABLE1 VALUES('A','B','L'),('A','Y','D'),('A','Q','Z') ON DUPLICATE KEY UPDATE COLC=VALUES(COLC);
This works the way I want. But I run into a problem when I try to delete the other rows. What I am trying is:
DELETE FROM TABLE1 WHERE NOT ((COLA='A' AND COLB='B') OR (COLA='A' AND COLB='Y') OR (COLA='A' AND COLB='Q'));
But it does not work, as it deletes the last row ('B','W','P') as well.
So:
How should I write the DELETE query?
Can both steps be combined into one query?
THANKS IN ADVANCE :)

I also couldn't find a single-query solution to this issue, but the second query can be simplified to:
DELETE FROM TABLE1 WHERE COLA='A' AND COLB NOT IN ('B','Y','Q');
or
DELETE FROM TABLE1 WHERE COLA='A' AND COLC NOT IN ('L','Z','D');
Either works for this example, and both scale better than a long chain of OR conditions. The first is the safer choice because it filters on the key column COLB; the second would keep a stale row whose COLC happens to equal one of the new values.

I got the answer to the first question. The query should be
DELETE FROM TABLE1 WHERE COLA='A' AND NOT ((COLA='A' AND COLB='B') OR (COLA='A' AND COLB='Y') OR (COLA='A' AND COLB='Q'));
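The two steps discussed above (upsert, then delete the stale rows for the same COLA) can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for MySQL, so the upsert is written with SQLite's ON CONFLICT ... DO UPDATE instead of ON DUPLICATE KEY UPDATE, and the helper name sync_rows is hypothetical. Running both statements in one transaction is a design choice so that other readers never see the half-updated state.

```python
import sqlite3


def sync_rows(conn, cola, new_rows):
    """Upsert new_rows (all with COLA == cola), then delete stale rows for that COLA.

    sync_rows is a hypothetical helper; SQLite stands in for MySQL here.
    """
    with conn:  # commit both statements together, roll back on error
        # Step 1: insert new key pairs, update COLC for existing ones.
        conn.executemany(
            "INSERT INTO TABLE1 (COLA, COLB, COLC) VALUES (?, ?, ?) "
            "ON CONFLICT(COLA, COLB) DO UPDATE SET COLC = excluded.COLC",
            new_rows,
        )
        # Step 2: delete rows with the same COLA whose COLB was not supplied.
        keep = [colb for _, colb, _ in new_rows]
        placeholders = ", ".join("?" for _ in keep)
        conn.execute(
            f"DELETE FROM TABLE1 WHERE COLA = ? AND COLB NOT IN ({placeholders})",
            [cola, *keep],
        )


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TABLE1 (COLA TEXT, COLB TEXT, COLC TEXT, PRIMARY KEY (COLA, COLB))"
)
conn.executemany(
    "INSERT INTO TABLE1 VALUES (?, ?, ?)",
    [("A", "B", "C"), ("A", "Q", "D"), ("A", "E", "J"), ("B", "W", "P")],
)

sync_rows(conn, "A", [("A", "B", "L"), ("A", "Y", "D"), ("A", "Q", "Z")])
result = conn.execute(
    "SELECT COLA, COLB, COLC FROM TABLE1 ORDER BY COLA, COLB"
).fetchall()
print(result)
```

The row ('B','W','P') survives because the DELETE is restricted to COLA = 'A', which is exactly what the original NOT (...) query was missing.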


Exotic GROUP BY In MySQL

Consider a typical GROUP BY statement in SQL: you have a table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| B | 2 |
| A | 3 |
| B | 4 |
+------+-------+
And you ask for
SELECT Name, SUM(Value) as Value
FROM table
GROUP BY Name
You'll receive
+------+-------+
| Name | Value |
+------+-------+
| A | 4 |
| B | 6 |
+------+-------+
In your head, you can imagine that SQL generates an intermediate sorted table like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| A | 3 |
| B | 2 |
| B | 4 |
+------+-------+
and then aggregates together successive rows: the "Value" column has been given an aggregator (in this case SUM), so it's easy to aggregate. The "Name" column has been given no aggregator, and thus uses what you might call the "trivial partial aggregator": given two things that are the same (e.g. A and A), it aggregates them into a single copy of one of the inputs (in this case A). Given any other input it doesn't know what to do and is forced to begin aggregating anew (this time with the "Name" column equal to B).
I want to do a more exotic kind of aggregation. My table looks like
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| BC | 2 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BCR | 6 |
+------+-------+
And the intended output is
+------+-------+
| Name | Value |
+------+-------+
| A | 8 |
| B | 13 |
+------+-------+
Where does this come from? A and B are the "minimal prefixes" for this set of names: they occur in the data set and every Name has exactly one of them as a prefix. I want to aggregate data by grouping rows together when their Names have the same minimal prefix (and add the Values, of course).
In the toy grouping model from before, the intermediate sorted table would be
+------+-------+
| Name | Value |
+------+-------+
| A | 1 |
| AY | 3 |
| AZ | 4 |
| B | 5 |
| BC | 2 |
| BCR | 6 |
+------+-------+
Instead of using the "trivial partial aggregator" for Names, we would use one that can aggregate X and Y together iff X is a prefix of Y; in that case it returns X. So the first three rows would be aggregated together into a row with (Name, Value) = (A, 8), then the aggregator would see that A and B couldn't be aggregated and would move on to a new "block" of rows to aggregate.
The tricky thing is that the value we're grouping by is "non-local": if A were not a name in the dataset, then AY and AZ would each be a minimal prefix. It turns out that the AY and AZ rows are aggregated into the same row in the final output, but you couldn't know that just by looking at them in isolation.
Miraculously, in my use case the minimal prefix of a string can be determined without reference to anything else in the dataset. (Imagine that each of my names is one of the strings "hello", "world", and "bar", followed by any number of z's. I want to group all of the Names with the same "base" word together.)
As I see it I have two options:
1) The simple option: compute the prefix for each row and group by that value directly. Unfortunately I have an index on the Name, and computing the minimal prefix (whose length depends on the Name itself) prevents me from using that index. This forces a full table scan, which is prohibitively slow.
2) The complicated option: somehow convince MySQL to use the "partial prefix aggregator" for Name. This runs into the "non-locality" problem above, but that's fine as long as we scan the table according to my index on Name, since then every minimal prefix will be encountered before any of the other strings it is a prefix of; we would never try to aggregate AY and AZ together if A were in the dataset.
In an imperative programming language #2 would be rather easy: extract rows one at a time, in alphabetical order, keeping track of the current prefix. If your new row's Name has that as a prefix, it goes in the bucket you're currently using. Otherwise, start a new bucket with the new Name as your prefix. In MySQL I am lost as to how to do it. Note that the set of minimal prefixes is not known beforehand.
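To make the sequential scan concrete, here is a minimal Python sketch of the bucket logic just described. The function name group_by_minimal_prefix is hypothetical, and the rows are sorted in memory only to imitate reading them in index order on Name.

```python
def group_by_minimal_prefix(rows):
    """Sum values per minimal prefix by scanning (name, value) rows in sorted order."""
    totals = {}
    prefix = None
    for name, value in sorted(rows):
        # A prefix always sorts before the names it is a prefix of, so a name
        # that does not extend the current prefix must start a new bucket.
        if prefix is None or not name.startswith(prefix):
            prefix = name
            totals[prefix] = 0
        totals[prefix] += value
    return totals


rows = [("A", 1), ("BC", 2), ("AY", 3), ("AZ", 4), ("B", 5), ("BCR", 6)]
totals = group_by_minimal_prefix(rows)
print(totals)  # {'A': 8, 'B': 13}
```

If A were removed from the input, AY and AZ would each open their own bucket, which is the non-locality mentioned earlier.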
Edit 2
It occurred to me that if the table is ordered by Name, this would be a lot easier (and faster). Since I don't know if your data is sorted, I've included a sort in this query, but if the data is sorted, you can strip out (SELECT * FROM table1 ORDER BY Name) t1 and just use FROM table1:
SELECT prefix, SUM(`Value`)
FROM (SELECT Name, Value, @prefix := IF(Name NOT LIKE CONCAT(@prefix, '_%'), Name, @prefix) AS prefix
      FROM (SELECT * FROM table1 ORDER BY Name) t1
      JOIN (SELECT @prefix := '~') p
     ) t2
GROUP BY prefix
Updated SQLFiddle
Edit
Having slept on the problem, I realised that there is no need to do the IN, it's enough to just have a WHERE NOT EXISTS clause on the JOINed table:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE NOT EXISTS (SELECT *
FROM table1 t3
WHERE t1.Name LIKE CONCAT(t3.Name, '_%')
)
GROUP BY t1.Name
Updated Explain (Name changed to UNIQUE key from PRIMARY)
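+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| id | select_type        | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                                     |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+
| 1  | PRIMARY            | t1    | index | Name          | Name | 11      | NULL | 6    | Using where; Using index; Using temporary; Using filesort |
| 1  | PRIMARY            | t2    | ALL   | NULL          | NULL | NULL    | NULL | 6    | Using where; Using join buffer (Block Nested Loop)        |
| 3  | DEPENDENT SUBQUERY | t3    | index | NULL          | Name | 11      | NULL | 6    | Using where; Using index                                  |
+----+--------------------+-------+-------+---------------+------+---------+------+------+-----------------------------------------------------------+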
Updated SQLFiddle
Original Answer
Here is one way you could do it. First, you need to find all the unique prefixes in your table. You can do that by looking for all values of Name where it does not look like another value of Name with other characters on the end. This can be done with this query:
SELECT Name
FROM table1 t1
WHERE NOT EXISTS (SELECT *
FROM table1 t2
WHERE t1.Name LIKE CONCAT(t2.Name, '_%')
)
For your sample data, that will give
Name
A
B
Now you can sum all the values where the Name starts with one of those prefixes. Note we change the LIKE pattern in this query so that it also matches the prefix, otherwise we wouldn't count the values for A and B in your example:
SELECT t1.Name, SUM(t2.Value) AS `Value`
FROM table1 t1
JOIN table1 t2 ON t2.Name LIKE CONCAT(t1.Name, '%')
WHERE t1.Name IN (SELECT Name
FROM table1 t3
WHERE NOT EXISTS (SELECT *
FROM table1 t4
WHERE t3.Name LIKE CONCAT(t4.Name, '_%')
)
)
GROUP BY t1.Name
Output:
Name Value
A 8
B 13
An EXPLAIN says that both of these queries use the index on Name, so should be reasonably efficient. Here is the result of the explain on my MySQL 5.6 server:
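+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| id | select_type        | table | type   | possible_keys | key     | key_len | ref          | rows | Extra                                              |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+
| 1  | PRIMARY            | t1    | index  | PRIMARY       | PRIMARY | 11      | NULL         | 6    | Using index; Using temporary; Using filesort       |
| 1  | PRIMARY            | t3    | eq_ref | PRIMARY       | PRIMARY | 11      | test.t1.Name | 1    | Using where; Using index                           |
| 1  | PRIMARY            | t2    | ALL    | NULL          | NULL    | NULL    | NULL         | 6    | Using where; Using join buffer (Block Nested Loop) |
| 3  | DEPENDENT SUBQUERY | t4    | index  | NULL          | PRIMARY | 11      | NULL         | 6    | Using where; Using index                           |
+----+--------------------+-------+--------+---------------+---------+---------+--------------+------+----------------------------------------------------+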
SQLFiddle Demo
Here are some hints on how to do the task. This locates any prefixes that are useful. That's not what you asked for, but the flow of the query and the usage of @variables, plus the need for 2 (actually 3) levels of nesting, might help you.
SELECT DISTINCT `Prev`
FROM
(
SELECT @prev := @next AS 'Prev',
@next := IF(LEFT(city, LENGTH(@prev)) = @prev, @next, city) AS 'Next'
FROM ( SELECT @next := ' ' ) AS init
JOIN ( SELECT DISTINCT city FROM us ) AS dedup
ORDER BY city
) x
WHERE `Prev` = `Next` ;
Partial output:
+----------------+
| Prev |
+----------------+
| Alamo |
| Allen |
| Altamont |
| Ames |
| Amherst |
| Anderson |
| Arlington |
| Arroyo |
| Auburn |
| Austin |
| Avon |
| Baker |
Check the Al% cities:
mysql> SELECT DISTINCT city FROM us WHERE city LIKE 'Al%' ORDER BY city;
+-------------------+
| city |
+-------------------+
| Alabaster |
| Alameda |
| Alamo | <--
| Alamogordo | <--
| Alamosa |
| Albany |
| Albemarle |
...
| Alhambra |
| Alice |
| Aliquippa |
| Aliso Viejo |
| Allen | <--
| Allen Park | <--
| Allentown | <--
| Alliance |
| Allouez |
| Alma |
| Aloha |
| Alondra Park |
| Alpena |
| Alpharetta |
| Alpine |
| Alsip |
| Altadena |
| Altamont | <--
| Altamonte Springs | <--
| Alton |
| Altoona |
| Altus |
| Alvin |
+-------------------+
40 rows in set (0.01 sec)

MySQL Update Based on ID Slow (150,000 records)

I'm trying to update a table based on the id in another table, but I'm having performance issues. I have 150,000 rows in table_to_update and 400,000 rows in table_to_get_data.
Table to Update
+----+-----------------+
| id | field_to_update |
+----+-----------------+
| 1 | orange |
| 2 | apple |
| 3 | pear |
| 1 | orange |
+----+-----------------+
Table to Get Data
+----+-----------------+
| id | field |
+----+-----------------+
| 1 | orange |
| 2 | apple |
| 3 | pear |
+----+-----------------+
So I've tried 3 different ways:
Method 1:
UPDATE table_to_update t1, table_to_get_data t2
SET t1.field_to_update = t2.field
WHERE t1.id = t2.id
Method 2:
UPDATE table_to_update
JOIN table_to_get_data
ON table_to_update.id = table_to_get_data.id
SET table_to_update.field_to_update = table_to_get_data.field
Method 3:
UPDATE table_to_update
LEFT JOIN table_to_get_data
ON table_to_update.id = table_to_get_data.id
SET table_to_update.field_to_update = table_to_get_data.field
So far, Method 3 seems to be the fastest. However, extrapolating from the time it takes to update 1000 rows, it would take 12 hours to finish updating the entire table. Is there a more efficient method to update the table?
EDIT:
Added EXPLAIN table
Create an index on the columns you join on (id) in both tables.
It will work wonders for you.
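As a rough illustration of this advice, the sketch below creates an index on id in both tables and then checks that the query planner picks one up for the join. It makes several assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the index names idx_update_id and idx_get_data_id are made up, and EXPLAIN QUERY PLAN is SQLite's counterpart of MySQL's EXPLAIN. In MySQL the indexes themselves would be created with the same CREATE INDEX ... ON table (id) form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table_to_update (id INTEGER, field_to_update TEXT);
    CREATE TABLE table_to_get_data (id INTEGER, field TEXT);
    -- Hypothetical index names; the point is one index per join column.
    CREATE INDEX idx_update_id ON table_to_update (id);
    CREATE INDEX idx_get_data_id ON table_to_get_data (id);
    """
)

# Ask the planner how it would execute the join used by the UPDATE.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT t1.id, t2.field "
    "FROM table_to_update t1 "
    "JOIN table_to_get_data t2 ON t1.id = t2.id"
).fetchall()

# The human-readable description is the last column of each plan row.
plan_text = " | ".join(row[-1] for row in plan)
uses_index = "idx_" in plan_text
print(plan_text)
```

Without the indexes, each row of one table has to be matched against a scan of the other, which is consistent with the very slow update described in the question.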

Deleting almost duplicate rows in MySQL?

I have seen a few different answers to this question, but none really covered exactly what I need to do in MySQL.
I did find a thread for MS SQL that does exactly what I need, but nothing for MySQL.
Data Example
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Peaches| Outdoor | 2 |
| Apples | Indoor | 3 |
| Apples | Indoor | 4 |
+--------+----------+--------+
Desired Output
+--------+----------+--------+
| Col1 | Col2 | UniqueID |
+--------+----------+--------+
| Peaches| Outdoor | 1 |
| Apples | Indoor | 3 |
+--------+----------+--------+
Your way is OK. You only forgot the keyword TABLE:
CREATE TABLE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
but the structure of the new table can differ from the original table.
Do it this way instead:
CREATE TABLE NewTable LIKE t;
then add a unique key:
ALTER TABLE NewTable ADD UNIQUE KEY (Col1, Col2);
and now copy the old data into the new table with ON DUPLICATE KEY UPDATE:
INSERT INTO NewTable
SELECT *
from t
ON DUPLICATE KEY UPDATE Col3=GREATEST(Col3,VALUES(Col3));
This copies every row; when a duplicate (Col1, Col2) pair is found, the larger Col3 value is kept.
I'm going to post my version of the answer provided above so it's clear. It is just one simple query:
CREATE NewTable AS SELECT Col1,Col2 ,MAX(col3) FROM t GROUP BY Col1,col2
Just querying max was the trick...so simple.
Thank you!
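For reference, here is a small runnable sketch of the GROUP BY approach from this thread, applied to the sample data. It is written with Python's sqlite3 as a stand-in for MySQL, and it names the third column UniqueID as in the sample data rather than col3. Note one deliberate difference: the desired output in the question keeps IDs 1 and 3, which corresponds to MIN(UniqueID); the MAX used in the answers would keep 2 and 4 instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Col1 TEXT, Col2 TEXT, UniqueID INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("Peaches", "Outdoor", 1),
        ("Peaches", "Outdoor", 2),
        ("Apples", "Indoor", 3),
        ("Apples", "Indoor", 4),
    ],
)

# One row per (Col1, Col2) pair, keeping the lowest UniqueID of each group.
conn.execute(
    "CREATE TABLE NewTable AS "
    "SELECT Col1, Col2, MIN(UniqueID) AS UniqueID FROM t GROUP BY Col1, Col2"
)

deduped = conn.execute(
    "SELECT Col1, Col2, UniqueID FROM NewTable ORDER BY UniqueID"
).fetchall()
print(deduped)
```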

Joining Two Tables in MySQL with diff column name

I'm not very good at joining tables in MySQL and I'm still learning, so I wanted to ask about joining two tables.
I have 2 tables.
I want to join two columns of the first table (id & path) to the second table.
But the second table has no columns named id and path; it has columns named pathid & value. The values in the pathid column are the same as the ids in the first table.
It looks like this.
first table
| id | path |
---------------------
| 1 | country/usa |
| 2 | country/jpn |
| 3 | country/kor |
second table
| pathid | value |
-------------------
| 3 | 500 |
| 1 | 10000 |
| 2 | 2000 |
So the first table indicates that the id for usa is 1, japan is 2 and korea is 3.
And the second table says that for pathid 3 (which is the id for korea) the value is 500, and so on for the others.
I want it to look like the result below, with each path joined to its corresponding value from the second table. How can I do this in MySQL? Thank you.
Desired Result
| id | path | value |
------------------------------
| 1 | country/usa | 10000 |
| 2 | country/jpn | 2000 |
| 3 | country/kor | 500 |
You can join on the columns irrespective of the column names, as long as the data types match.
SELECT id, path, value
FROM firstTable, secondTable
WHERE id = pathid
If both tables have a column with the same name, you need to qualify the name with a table alias. Say the join column were named id in both tables: then whenever you use id you must state which table you are referring to, otherwise MySQL will complain about the ambiguity.
SELECT f.id, path, value
FROM firstTable f, secondTable s
WHERE f.id = s.id
Note that I omitted the alias prefix on the other columns in the SELECT list; that works as long as each of those column names exists in only one of the tables.
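The join on differently named columns can be tried out with the sample data from the question. The sketch below is an assumption-laden illustration: it uses Python's sqlite3 instead of MySQL and writes the join with the explicit JOIN ... ON form, which returns the same rows as the comma-separated FROM list with a WHERE condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE firstTable (id INTEGER PRIMARY KEY, path TEXT);
    CREATE TABLE secondTable (pathid INTEGER, value INTEGER);
    INSERT INTO firstTable VALUES (1, 'country/usa'), (2, 'country/jpn'), (3, 'country/kor');
    INSERT INTO secondTable VALUES (3, 500), (1, 10000), (2, 2000);
    """
)

# The column names differ (id vs. pathid); only the values have to match.
joined = conn.execute(
    "SELECT f.id, f.path, s.value "
    "FROM firstTable f "
    "JOIN secondTable s ON f.id = s.pathid "
    "ORDER BY f.id"
).fetchall()
print(joined)
```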

Retrieving rows based on a list of IDs in MySQL

I'm trying to retrieve all the rows in a table with children variables where the Foreign Key of those rows is equal to the Primary Key of rows in a table with parent variables.
Graphically it looks something like this:
Table 1. This table contains the parent rows.
ID | variable | variable | etc.
1 | XX | BB | ...
2 | YY | AA | ...
Table 2. This table contains the children rows.
ID | FK (parent) | variable | etc.
1 | 1 | BB | ...
2 | 1 | AA | ...
3 | 1 | AA | ...
4 | 2 | AA | ...
5 | 3 | AA | ...
I'm obviously not an expert in SQL. What I would normally do in another programming language is write a loop that cycles through every row in the parent table and then checks the children table for a match. However, I have no idea what the most efficient approach would be here. The parent table will have 50+ rows. The children table has 8000+ rows.
UPDATE: I want to dump the relevant data from the children table in a new table. So I do not want a combined table with data from the parent and children table, which is what a JOIN does I think.
UPDATE 2: I managed to get what I wanted through:
INSERT INTO NewTable
select columns
from ChildrenTable t
inner join ParentTable p
on t.parentId = p.Id
Thanks for the help!
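A small runnable sketch of this INSERT ... SELECT with an inner join, using the sample rows from the question, might look like the following. The assumptions are: Python's sqlite3 stands in for MySQL, the columns are reduced to Id, parentId and one variable column, and NewTable is created up front with the same shape as the children table. Only child columns are selected, so no parent data ends up in the new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE ParentTable (Id INTEGER PRIMARY KEY, variable TEXT);
    CREATE TABLE ChildrenTable (Id INTEGER PRIMARY KEY, parentId INTEGER, variable TEXT);
    CREATE TABLE NewTable (Id INTEGER PRIMARY KEY, parentId INTEGER, variable TEXT);
    INSERT INTO ParentTable VALUES (1, 'XX'), (2, 'YY');
    INSERT INTO ChildrenTable VALUES
        (1, 1, 'BB'), (2, 1, 'AA'), (3, 1, 'AA'), (4, 2, 'AA'), (5, 3, 'AA');
    """
)

with conn:
    # The join is used only as a filter: child 5 points at parent 3,
    # which does not exist, so it is not copied.
    conn.execute(
        "INSERT INTO NewTable (Id, parentId, variable) "
        "SELECT t.Id, t.parentId, t.variable "
        "FROM ChildrenTable t "
        "INNER JOIN ParentTable p ON t.parentId = p.Id"
    )

copied = conn.execute(
    "SELECT Id, parentId, variable FROM NewTable ORDER BY Id"
).fetchall()
print(copied)
```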
You can try it like this:
Select * from table1 left join table2 on table1.id = table2.fk
You wrote:
I want to dump the relevant data from the children table in a new
table. So I do not want a combined table with data from the parent and
children table, which is what a JOIN does I think.
Well, a JOIN just combines related data from the chosen tables. You can still SELECT only the columns you want. For example, suppose you have these tables (with column names slightly changed from your original ones):
parent-table
ID | variable1 | variable2 | etc.
1 | XX | BB | ...
2 | YY | AA | ...
child-table
ID | FK-ID | variable | etc.
1 | 1 | BB | ...
2 | 1 | AA | ...
3 | 1 | AA | ...
4 | 2 | AA | ...
5 | 3 | AA | ...
And you want to retrieve only ID and variable2 from the first table and variable from the second one, you would write
SELECT `parent-table`.ID, `parent-table`.variable2, `child-table`.variable
FROM `parent-table`
JOIN `child-table` ON `parent-table`.ID = `child-table`.`FK-ID`;
Or if you don't like joins you can skip the JOIN keyword, select from both tables and put the condition in a WHERE clause, i.e.
SELECT `parent-table`.ID, `parent-table`.variable2, `child-table`.variable
FROM `parent-table`, `child-table`
WHERE `parent-table`.ID = `child-table`.`FK-ID`;
The two queries above are equivalent. If you want, you can create a new table, let's call it parent-child-table, holding that data as a separate copy. Or, if you need to use it a lot, you can create a VIEW, which is a virtual table (it stores a query); let's call it parent-child-view. If you make changes in parent-table and child-table, they will be reflected in parent-child-view, but they won't be reflected in a separate parent-child-table because that is just a copy.
You can try using:
select * from table2, table1 where table2.fk = table1.id
or, with aliases:
select * from child_table_name as tblchild, parent_table_name as tblparent where tblchild.fk_column_name = tblparent.id