I have a table which has a column which contains a text blob. I want to update some of the values in the text fields by using a mapping table, but I'm not sure if there is a way to do it without cursors. Here is an example:
USE Temp;
CREATE TABLE `Temp`.`test_text` (
`some_text` text
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT=' ';
CREATE TABLE `Temp`.`mapping` (
`src` VARCHAR(1024) NULL,
`dest` VARCHAR(1024) NULL) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT=' ';
-- test_text has a single column with some_text
INSERT INTO `Temp`.`test_text`
(`some_text`)
VALUES
('There once was a man named BobFrank. He changed his name to BobDude.');
-- mapping has two columns, which contain all the mappings I'd like to do
INSERT INTO `Temp`.`mapping`
(`src`,
`dest`)
VALUES
('BobFrank', 'BobsNewFrank'),
('BobDude', 'BobsNewDude');
UPDATE `Temp`.`test_text` tt, `Temp`.`mapping` mp
SET tt.some_text = REPLACE(tt.some_text, mp.src, mp.dest);
SELECT *
FROM `Temp`.`test_text`;
Result:
There once was a man named BobsNewFrank. He changed his name to BobDude.
The code above only replaces BobFrank in the output.
After the update, I would like the value in test_text to be:
There once was a man named BobsNewFrank. He changed his name to BobsNewDude.
Your current table structure is less than ideal, because the test_text table has unnormalized CSV data in it. This will preclude using most of the database operations which we would think to use here. I was able to come up with a solution, but it required storing each CSV term in test_text in a separate row. That is, I used the following table:
CREATE TABLE test_text (some_text varchar(55));
INSERT INTO test_text (some_text)
VALUES
('"/tmp/BobFrank"'),
('"/tmp/BobDude/"');
I left the path_mapping identical to what you had. Then, I only needed a fairly simple query:
SELECT
GROUP_CONCAT(REPLACE(t1.some_text, t2.src, t2.dest)) AS output
FROM test_text t1
INNER JOIN path_mapping t2
ON t1.some_text LIKE CONCAT('%', t2.src, '%');
This generated the following output:
Demo
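On why only the first mapping is applied in the question's UPDATE: in a MySQL multi-table UPDATE, each matching row of the target table is updated once, even if it joins to several rows of the other table, so only one (src, dest) pair ever reaches REPLACE. One way around that without a cursor is to issue one UPDATE per mapping row from client code. The following is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for MySQL; the function name apply_mappings is hypothetical, and the table and column names follow the question.

```python
import sqlite3


def apply_mappings(conn):
    """Run one UPDATE per (src, dest) pair so every mapping is applied."""
    pairs = conn.execute("SELECT src, dest FROM mapping").fetchall()
    for src, dest in pairs:
        # Only touch rows that actually contain the source string.
        conn.execute(
            "UPDATE test_text SET some_text = REPLACE(some_text, ?, ?) "
            "WHERE INSTR(some_text, ?) > 0",
            (src, dest, src),
        )
    conn.commit()


# Sample data from the question (SQLite stand-in, an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test_text (some_text TEXT)")
conn.execute("CREATE TABLE mapping (src TEXT, dest TEXT)")
conn.execute(
    "INSERT INTO test_text VALUES "
    "('There once was a man named BobFrank. He changed his name to BobDude.')"
)
conn.executemany(
    "INSERT INTO mapping VALUES (?, ?)",
    [("BobFrank", "BobsNewFrank"), ("BobDude", "BobsNewDude")],
)

apply_mappings(conn)
result = conn.execute("SELECT some_text FROM test_text").fetchone()[0]
print(result)
```

The pairs are applied one after another, so if a dest value contains another src value the order of the mapping rows affects the result.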
Question: How do I query rows whose flag does not contain any of the given keywords?
The table looks like this.
CREATE TABLE `mangas` (
`id` bigint(20) UNSIGNED NOT NULL,
`flag` json NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
And then the table has data that looks like this
INSERT INTO `mangas` (`id`,`flag`) VALUES
(1,'[\"a\", \"b\", \"c\", \"d\"]'),
(2,'[\"b\", \"c\", \"d\", \"f\"]'),
(3,'[\"a\", \"b\", \"1\", \"2\"]'),
(4,'[\"a\", \"c\", \"1\", \"3\"]'),
(5,'[\"a\", \"9\", \"10\", \"2\"]');
COMMIT;
When I query with JSON_CONTAINS(mangas.flag, '[\"a\",\"b\"]', '$'), I get all rows that contain both a and b.
But I want to exclude every row that contains a or b. What should I do?
Maybe I didn't express my needs clearly.
Here is a normal NOT IN query:
select * from table1 where a not in('a','b','c')
How do I implement this NOT IN query on a JSON field?
You can use JSON_REMOVE() to exclude the data from the JSON. Have a look at this example,
https://database.guide/json_remove-remove-data-from-a-json-document-in-mysql/
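Note that JSON_REMOVE() edits a JSON document rather than filtering rows. The "not in" semantics asked for amount to keeping only the rows whose flag array shares no element with the keyword list. As far as I know, MySQL 8.0.17 and later can express this as WHERE NOT JSON_OVERLAPS(flag, '["a","b"]'). The sketch below only illustrates that predicate in Python on the question's data; the helper name rows_without_any and the sixth row are hypothetical.

```python
import json


def rows_without_any(rows, keywords):
    """Return ids of rows whose JSON array shares no element with keywords."""
    banned = set(keywords)
    return [row_id for row_id, flag in rows if banned.isdisjoint(json.loads(flag))]


rows = [
    (1, '["a", "b", "c", "d"]'),
    (2, '["b", "c", "d", "f"]'),
    (3, '["a", "b", "1", "2"]'),
    (4, '["a", "c", "1", "3"]'),
    (5, '["a", "9", "10", "2"]'),
    (6, '["x", "y"]'),  # hypothetical extra row, not part of the question
]

kept = rows_without_any(rows, ["a", "b"])
print(kept)
```

Every row from the question contains a or b, so only the hypothetical sixth row survives the filter.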
I know that deleting duplicates from MySQL is often discussed here, but none of the solutions works in my case.
I have a DB with address data, roughly like this:
ID; Anrede; Vorname; Nachname; Strasse; Hausnummer; PLZ; Ort; Nummer_Art; Vorwahl; Rufnummer
ID is the primary key and unique.
And I have entries like this, for example:
1;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Mobile;012345;67890
2;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Fixed;045678;877656
The different phone numbers are not the problem, because they are not relevant for me. I just want to delete the rows that are duplicates in Lastname, Street and Zipcode, which here means ID 1 or ID 2. Which of the two is deleted doesn't matter.
So far I have tried it like this with DELETE:
DELETE db
FROM Import_Daten db,
Import_Daten dbl
WHERE db.id > dbl.id AND
db.Lastname = dbl.Lastname AND
db.Strasse = dbl.Strasse AND
db.PLZ = dbl.PLZ;
And insert into a copy table:
INSERT INTO Import_Daten_1
SELECT MIN(db.id),
db.Anrede,
db.Firstname,
db.Lastname,
db.Branche,
db.Strasse,
db.Hausnummer,
db.Ortsteil,
db.Land,
db.PLZ,
db.Ort,
db.Kontaktart,
db.Vorwahl,
db.Durchwahl
FROM Import_Daten db,
Import_Daten dbl
WHERE db.lastname = dbl.lastname AND
db.Strasse = dbl.Strasse And
db.PLZ = dbl.PLZ;
The complete table contains over 10 million rows. The size is currently my problem. MySQL runs on a MAMP server on a MacBook with 1.5 GHz and 4 GB RAM, so it is not really fast. The SQL statements are run in phpMyAdmin. At the moment I have no other system available.
You can write a stored procedure that selects a different chunk of data on each run (for example, rows whose row number lies between two values) and deletes only from that range. This way you delete your duplicates bit by bit.
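The chunking idea can be sketched as follows. This is a minimal illustration in Python against an in-memory SQLite database, not a MySQL stored procedure; the function name delete_duplicates_in_chunks, the chunk size and the sample rows are assumptions. It keeps the lowest id per (Lastname, Strasse, PLZ) and commits after each id range, so no single statement has to touch the whole table.

```python
import sqlite3


def delete_duplicates_in_chunks(conn, chunk_size):
    """Delete duplicates one id range at a time, keeping the lowest id per key."""
    max_id = conn.execute(
        "SELECT COALESCE(MAX(id), 0) FROM Import_Daten"
    ).fetchone()[0]
    deleted = 0
    for start in range(1, max_id + 1, chunk_size):
        cur = conn.execute(
            """
            DELETE FROM Import_Daten
            WHERE id BETWEEN ? AND ?
              AND EXISTS (
                  SELECT 1 FROM Import_Daten AS older
                  WHERE older.Lastname = Import_Daten.Lastname
                    AND older.Strasse = Import_Daten.Strasse
                    AND older.PLZ = Import_Daten.PLZ
                    AND older.id < Import_Daten.id
              )
            """,
            (start, start + chunk_size - 1),
        )
        deleted += cur.rowcount
        conn.commit()  # keep each transaction small
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Import_Daten "
    "(id INTEGER PRIMARY KEY, Lastname TEXT, Strasse TEXT, PLZ TEXT)"
)
# An index on the duplicate key keeps the EXISTS lookup cheap.
conn.execute("CREATE INDEX idx_dup ON Import_Daten (Lastname, Strasse, PLZ)")
conn.executemany(
    "INSERT INTO Import_Daten VALUES (?, ?, ?, ?)",
    [
        (1, "Müller", "Testweg", "55555"),
        (2, "Müller", "Testweg", "55555"),
        (3, "Schmidt", "Hauptstr", "11111"),
        (4, "Müller", "Testweg", "55555"),
        (5, "Schmidt", "Hauptstr", "11111"),
    ],
)

deleted = delete_duplicates_in_chunks(conn, chunk_size=2)
remaining = [r[0] for r in conn.execute("SELECT id FROM Import_Daten ORDER BY id")]
print(deleted, remaining)
```

MySQL generally rejects a subquery on the table being deleted from, so there the same idea would more likely be written as the self-join DELETE from the question with an additional id range in the WHERE clause.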
A more efficient two-table solution can look like the following.
We store only the data we really need for the delete, and only the fields that contain duplicate information.
Let's assume we are looking for duplicate data in the Lastname, Branche and Hausnummer fields.
Create a table to hold the duplicate data, dropping any earlier copy first
DROP TABLE IF EXISTS data_to_delete;
Create and populate the table with the data we need to delete (I assume all fields have VARCHAR(255) type)
CREATE TABLE data_to_delete (
id BIGINT COMMENT 'this field will contain ID of row that we will not delete',
cnt INT,
Lastname VARCHAR(255),
Branche VARCHAR(255),
Hausnummer VARCHAR(255)
) AS SELECT
min(t1.id) AS id,
count(*) AS cnt,
t1.Lastname,
t1.Branche,
t1.Hausnummer
FROM Import_Daten AS t1
GROUP BY t1.Lastname, t1.Branche, t1.Hausnummer
HAVING count(*)>1 ;
Now let's delete duplicate data and leave only one record of all duplicate sets
DELETE Import_Daten
FROM Import_Daten LEFT JOIN data_to_delete
ON Import_Daten.Lastname=data_to_delete.Lastname
AND Import_Daten.Branche=data_to_delete.Branche
AND Import_Daten.Hausnummer = data_to_delete.Hausnummer
WHERE Import_Daten.id != data_to_delete.id;
DROP TABLE data_to_delete;
You can add a new column e.g. uq and make it UNIQUE.
ALTER TABLE Import_Daten
ADD COLUMN `uq` BINARY(16) NULL,
ADD UNIQUE INDEX `uq_UNIQUE` (`uq` ASC);
When this is done you can execute an UPDATE query like this
UPDATE IGNORE Import_Daten
SET
uq = UNHEX(
MD5(
CONCAT(
Import_Daten.Lastname,
Import_Daten.Strasse,
Import_Daten.PLZ
)
)
)
WHERE
uq IS NULL;
Once all entries are updated, running the query again leaves every duplicate with uq still NULL, and those rows can be removed.
The result then is:
0 row(s) affected, 1 warning(s): 1062 Duplicate entry...
For newly added rows, always create the uq hash, and consider using it as the primary key once all entries are unique.
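The hash-column approach can be tried out on a small scale. The sketch below uses Python with an in-memory SQLite database, where UPDATE OR IGNORE plays the role of MySQL's UPDATE IGNORE; the helper dedupe_key is hypothetical, and unlike the CONCAT above it joins the fields with a separator so that ('ab', 'c') and ('a', 'bc') do not hash alike. It assumes none of the key columns is NULL.

```python
import hashlib
import sqlite3


def dedupe_key(lastname, strasse, plz):
    """MD5 hex digest of the duplicate key, fields joined by a separator."""
    joined = "\x1f".join([lastname, strasse, plz])
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.create_function("dedupe_key", 3, dedupe_key)
conn.execute(
    "CREATE TABLE Import_Daten "
    "(id INTEGER PRIMARY KEY, Lastname TEXT, Strasse TEXT, PLZ TEXT, uq TEXT)"
)
# A unique index still allows any number of NULLs.
conn.execute("CREATE UNIQUE INDEX uq_UNIQUE ON Import_Daten (uq)")
conn.executemany(
    "INSERT INTO Import_Daten (id, Lastname, Strasse, PLZ) VALUES (?, ?, ?, ?)",
    [
        (1, "Müller", "Testweg", "55555"),
        (2, "Müller", "Testweg", "55555"),
        (3, "Schmidt", "Hauptstr", "11111"),
    ],
)

# Rows whose hash already exists violate the unique index and are skipped,
# so they keep uq = NULL.
conn.execute(
    "UPDATE OR IGNORE Import_Daten "
    "SET uq = dedupe_key(Lastname, Strasse, PLZ) WHERE uq IS NULL"
)
# Whatever is still NULL is a duplicate of a row that got its hash.
conn.execute("DELETE FROM Import_Daten WHERE uq IS NULL")
conn.commit()

remaining = conn.execute(
    "SELECT Lastname FROM Import_Daten ORDER BY Lastname"
).fetchall()
print(remaining)
```

Which row of a duplicate set keeps its hash depends on the order in which the database processes the rows, which matches the question's statement that it does not matter which duplicate survives.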
I've got a SQL 2008 R2 table defined like this:
CREATE TABLE [dbo].[Search_Name](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](300) NULL,
CONSTRAINT [PK_Search_Name] PRIMARY KEY CLUSTERED ([Id] ASC))
Performance querying the Name field using CONTAINS and FREETEXT works well.
However, I'm trying to keep the values of my Name column unique. Searching for an existing entry in the Name column is unbelievably slow for a large number of names (usually batches of 1,000), even with an index on the Name field. Query plans indicate I'm using the index as expected.
To search for an existing value, my query looks like this:
SELECT TOP 1 Id, Name from Search_Name where Name = 'My Name Value'
I've tried duplicating the Name column to another column and searching on the new column, but the net effect was the same.
At this point, I'm thinking I must be mis-using this feature.
Should I just stop trying to prevent duplication? I'm using a linking table to join these search name values to the underlying data. It seems somehow 'dirty' to just store a whole bunch of duplicate values...
...or is there faster way to take a list of 1,000 names and see which ones are already stored in the database?
The first change to make is to get the entire list to SQL Server at one time. Regardless of how you add the names to the existing table, doing it as a set operation will make a big difference in performance.
Passing the list as a table-valued parameter (TVP) is a clean way to handle it. You can still use an OUTPUT clause to track which rows did or didn't make the cut, for example:
-- Some sample existing names.
declare @Search_Name as Table ( Id Int Identity, Name VarChar(32) );
insert into @Search_Name ( Name ) values ( 'Bob' ), ( 'Carol' ), ( 'Ted' ), ( 'Alice' );
select * from @Search_Name;
-- Some (prospective) new names.
declare @New_Names as Table ( Name VarChar(32) );
insert into @New_Names ( Name ) values ( 'Ralph' ), ( 'Alice' ), ( 'Ed' ), ( 'Trixie' );
select * from @New_Names;
-- Add the unique new names.
declare @Inserted as Table ( Id Int, Name VarChar(32) );
insert into @Search_Name
output inserted.Id, inserted.Name into @Inserted
select New.Name
from @New_Names as New left outer join
@Search_Name as Old on Old.Name = New.Name
where Old.Id is NULL;
-- Results.
select * from @Search_Name;
-- The names that were added and their id's.
select * from @Inserted;
-- The names that were not added.
select New.Name
from @New_Names as New left outer join
@Inserted as I on I.Name = New.Name
where I.Id is NULL;
Alternatively, you could use a MERGE statement and OUTPUT the names that were added, those that weren't, or both.
Is there an easy and simple way to create a result table that has specified columns but zero rows? In set theory this is called an empty set, but since relational databases use multidimensional sets the term doesn't fit perfectly. I have tried these two queries, but both deliver exactly one row and not zero rows:
SELECT '' AS ID;
SELECT null AS ID;
But what I want is the same result as this query:
SELECT ID FROM sometable WHERE false;
I'm searching for a more elegant way because I don't want to have a table involved, so that the query is independent of any database schema. A generic query might also be a bit faster (not that it would matter for such a query).
SELECT "ID" LIMIT 0;
Without any real tables.
Do note that most (My)SQL clients simply will display "Empty set". However, it actually does what you want:
create table test.test_table
select "ID" limit 0;
show create table test.test_table\G
Table: test_table
Create Table: CREATE TABLE `test_table` (
`ID` varchar(2) character set latin1 NOT NULL default ''
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin
SELECT * FROM (SELECT NULL AS ID) AS x WHERE 1 = 0
You can use the DUAL pseudo-table.
SELECT whatever FROM DUAL WHERE 1 = 0
Check the documentation (look for the DUAL section).
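To confirm that such a query really yields a named column and zero rows, rather than relying on what a client prints, one can inspect the cursor metadata. This is a small sketch in Python using the derived-table form from above against an in-memory SQLite database (an assumption; the question is about MySQL).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.execute("SELECT * FROM (SELECT NULL AS ID) AS x WHERE 1 = 0")

# The column metadata is available even though no row is returned.
column_names = [col[0] for col in cur.description]
rows = cur.fetchall()
print(column_names, rows)
```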
Using strictly SQL (no PHP or anything else), is it possible to create a table and insert default data into that table only if that table doesn't exist?
Use the CREATE TABLE ... SELECT format:
create table if not exists tablename as
select * from defaultdata;
Here is one way of doing it:
CREATE TABLE IF NOT EXISTS T (
ID int(10) unsigned NOT NULL primary key,
NAME varchar(255) NOT NULL
);
REPLACE INTO T SELECT 1, 'John Doe';
REPLACE INTO T SELECT 2, 'Jane Doe';
REPLACE is a MySQL extension to the SQL standard that either inserts, or deletes and inserts.
You might do a select on the one of the meta data tables
if (not exists (select * from whatever_meta where table_name = 'whatever'))
begin
...
end
You would have to do some research to figure out how exactly...
Can you store the table status as a variable, then use that variable to determine whether to insert data? Ex:
@status = SHOW TABLES LIKE 'my_table';
INSERT INTO my_table VALUES (1,'hello'),(2,'world') WHERE @status <> false;
The problem with Paul Morgan's answer is that it expects the data to already exist in another table. Jonas' answer would be extremely resource intensive, especially if there are a lot of REPLACE statements (which are unnecessary if the table already exists).
Maybe I am missing the point, but why can't the default data be a set of INSERT statements? One simply creates the table if it does not exist and follows that with the INSERT statements. That way the default data does not have to exist in a different table.
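A minimal sketch of that last suggestion, in Python against an in-memory SQLite database: the table is created only if missing, and the defaults are inserted with INSERT OR IGNORE (MySQL spells this INSERT IGNORE), so rows that already exist are left untouched instead of being deleted and re-inserted as REPLACE would do. The function name ensure_defaults is hypothetical; the table T and its rows follow the REPLACE example above.

```python
import sqlite3

DEFAULT_ROWS = [(1, "John Doe"), (2, "Jane Doe")]


def ensure_defaults(conn):
    """Create T if it is missing and add default rows that are not yet present."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS T "
        "(ID INTEGER NOT NULL PRIMARY KEY, NAME TEXT NOT NULL)"
    )
    # Rows whose primary key already exists are skipped, not overwritten.
    conn.executemany(
        "INSERT OR IGNORE INTO T (ID, NAME) VALUES (?, ?)", DEFAULT_ROWS
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
ensure_defaults(conn)                                   # first run: creates and fills T
conn.execute("UPDATE T SET NAME = 'Edited' WHERE ID = 1")
ensure_defaults(conn)                                   # second run: changes nothing
names = conn.execute("SELECT ID, NAME FROM T ORDER BY ID").fetchall()
print(names)
```

This is not exactly "insert only if the table did not exist": a default row that was deleted later would be re-added on the next run.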