How to split a single row into multiple columns in MySQL - mysql

Simply asking: is there any function available in MySQL to split single-row elements into multiple columns?
I have a table row with the fields user_id, user_name, user_location.
Here a user can add multiple locations. I am imploding the locations and storing them in the table as a single row using PHP.
When I show the user records in a grid view, I get a problem with pagination because I show the records by splitting the user_locations. So I need to split the user_locations (a single row into multiple columns).
Is there any function available in MySQL to split and count the records by a character ( % )?
For example, the user_location contains US%UK%JAPAN%CANADA.
How can I split this record into 4 columns?
I also need to check the count value (4). Thanks in advance.

First normalize the string, removing empty locations and making sure there's a % at the end:
select replace(concat(user_location,'%'),'%%','%') as str
from YourTable where user_id = 1
Then we can count the number of entries with a trick. Replace '%' with '% ', and count the number of spaces added to the string. For example:
select length(replace(str, '%', '% ')) - length(str)
as LocationCount
from (
select replace(concat(user_location,'%'),'%%','%') as str
from YourTable where user_id = 1
) normalized
Using substring_index, we can add columns for a number of locations:
select length(replace(str, '%', '% ')) - length(str)
as LocationCount
, substring_index(substring_index(str,'%',1),'%',-1) as Loc1
, substring_index(substring_index(str,'%',2),'%',-1) as Loc2
, substring_index(substring_index(str,'%',3),'%',-1) as Loc3
from (
select replace(concat(user_location,'%'),'%%','%') as str
from YourTable where user_id = 1
) normalized
For your example US%UK%JAPAN%CANADA, this prints:
LocationCount Loc1 Loc2 Loc3
4 US UK JAPAN
So you see it can be done, but parsing strings isn't one of SQL's strengths.

The "right thing" would be splitting the locations off to another table and establish a many-to-many relationship between them.
create table users (
id int not null auto_increment primary key,
name varchar(64)
);
create table locations (
id int not null auto_increment primary key,
name varchar(64)
);
create table users_locations (
id int not null auto_increment primary key,
user_id int not null,
location_id int not null,
unique index user_location_unique_together (user_id, location_id)
);
Then, ensure referential integrity either using foreign keys (and InnoDB engine) or triggers.
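A minimal sketch of the foreign-key variant, assuming the InnoDB engine and the table names above (the constraint names are just placeholders):
alter table users_locations
add constraint fk_users_locations_user
foreign key (user_id) references users (id),
add constraint fk_users_locations_location
foreign key (location_id) references locations (id);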

This should do it:
DELIMITER $$
DROP PROCEDURE IF EXISTS `CSV2LST`$$
CREATE DEFINER=`root`@`%` PROCEDURE `CSV2LST`(IN csv_ TEXT)
BEGIN
SET @s=CONCAT('select \"',REPLACE(csv_,',','\" union select \"'),'\";');
PREPARE stmt FROM @s;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END$$
DELIMITER ;
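As a usage sketch (an assumption, not part of the original answer): the procedure splits on commas and returns the values as rows rather than columns, so the question's % separator would have to be replaced first:
CALL CSV2LST(REPLACE('US%UK%JAPAN%CANADA', '%', ','));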

You should do this in your client application, not in the database.
When you make a SQL query you must statically specify the columns you want to get, that is, you tell the DB the columns you want in your result set BEFORE executing it. For instance, if you have a datetime stored, you may do something like select month(birthday), year(birthday) from ..., so in this case we split the column birthday into 2 other columns, but the query specifies which columns we will have.
In your case, you would have to get exactly that US%UK%JAPAN%CANADA string from the database, and then split it later in your software, i.e.
/* get data from database */
/* ... */
$user_location = ... /* extract the field from the resultset */
$user_locations = explode("%", $user_location);

This is a bad design. If you can change it, store the data in 2 tables:
table users: id, name, surname ...
table users_location: user_id (fk), location
users_location would have a foreign key to users through the user_id field
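A minimal sketch of that layout (column names and types beyond those listed are assumptions):
create table users (
id int not null auto_increment primary key,
name varchar(64),
surname varchar(64)
);
create table users_location (
user_id int not null,
location varchar(64),
foreign key (user_id) references users (id)
);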

Related

Delete Duplicates from large mysql Address DB

I know deleting duplicates from MySQL is often discussed here. But none of the solutions works well in my case.
So, I have a DB with address data roughly like this:
ID; Anrede; Vorname; Nachname; Strasse; Hausnummer; PLZ; Ort; Nummer_Art; Vorwahl; Rufnummer
ID is the primary key and unique.
And I have entries, for example, like this:
1;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Mobile;012345;67890
2;Herr;Michael;Müller;Testweg;1;55555;Testhausen;Fixed;045678;877656
The different phone numbers are not the problem, because they are not relevant for me. So I just want to delete the duplicates in last name, street and zip code; in that case ID 1 or ID 2, and which of the two doesn't matter.
I actually tried it like this with DELETE:
DELETE db
FROM Import_Daten db,
Import_Daten dbl
WHERE db.id > dbl.id AND
db.Lastname = dbl.Lastname AND
db.Strasse = dbl.Strasse AND
db.PLZ = dbl.PLZ;
And insert into a copy table:
INSERT INTO Import_Daten_1
SELECT MIN(db.id),
db.Anrede,
db.Firstname,
db.Lastname,
db.Branche,
db.Strasse,
db.Hausnummer,
db.Ortsteil,
db.Land,
db.PLZ,
db.Ort,
db.Kontaktart,
db.Vorwahl,
db.Durchwahl
FROM Import_Daten db,
Import_Daten dbl
WHERE db.lastname = dbl.lastname AND
db.Strasse = dbl.Strasse And
db.PLZ = dbl.PLZ;
The complete table contains over 10 million rows. The size is actually my problem. MySQL runs on a MAMP server on a MacBook with 1.5 GHz and 4 GB RAM, so it is not really fast. The SQL statements run in phpMyAdmin; I have no other system available.
You can write a stored procedure that will each time select a different chunk of data (for example by row number between two values) and delete only from that range. This way you will slowly, bit by bit, delete your duplicates.
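A minimal sketch of that idea, reusing the self-join condition from the question and assuming the id values are roughly sequential; run it repeatedly with the next id range until no more rows are affected:
DELETE db
FROM Import_Daten db
JOIN Import_Daten dbl
ON db.id > dbl.id
AND db.Lastname = dbl.Lastname
AND db.Strasse = dbl.Strasse
AND db.PLZ = dbl.PLZ
WHERE db.id BETWEEN 1 AND 100000; -- next run: 100001 to 200000, and so on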
A more effective two-table solution can look like the following.
We can store only the data we really need to delete, and only the fields that contain the duplicate information.
Let's assume we are looking for duplicate data in the Lastname, Branche and Hausnummer fields.
Create a table to hold the duplicate data, dropping any leftover copy first:
DROP TABLE IF EXISTS data_to_delete;
Populate the table with the data we need to delete (I assume all fields have the VARCHAR(255) type):
CREATE TABLE data_to_delete (
id BIGINT COMMENT 'this field will contain ID of row that we will not delete',
cnt INT,
Lastname VARCHAR(255),
Branche VARCHAR(255),
Hausnummer VARCHAR(255)
) AS SELECT
min(t1.id) AS id,
count(*) AS cnt,
t1.Lastname,
t1.Branche,
t1.Hausnummer
FROM Import_Daten AS t1
GROUP BY t1.Lastname, t1.Branche, t1.Hausnummer
HAVING count(*)>1 ;
Now let's delete the duplicate data, leaving only one record of each duplicate set:
DELETE Import_Daten
FROM Import_Daten LEFT JOIN data_to_delete
ON Import_Daten.Lastname=data_to_delete.Lastname
AND Import_Daten.Branche=data_to_delete.Branche
AND Import_Daten.Hausnummer = data_to_delete.Hausnummer
WHERE Import_Daten.id != data_to_delete.id;
DROP TABLE data_to_delete;
You can add a new column, e.g. uq, and make it UNIQUE.
ALTER TABLE Import_Daten
ADD COLUMN `uq` BINARY(16) NULL,
ADD UNIQUE INDEX `uq_UNIQUE` (`uq` ASC);
When this is done you can execute an UPDATE query like this
UPDATE IGNORE Import_Daten
SET
uq = UNHEX(
MD5(
CONCAT(
Import_Daten.Lastname,
Import_Daten.Street,
Import_Daten.Zipcode
)
)
)
WHERE
uq IS NULL;
Once all entries are updated and the query is executed again, all remaining duplicates will still have uq = NULL and can be removed.
The result then is:
0 row(s) affected, 1 warning(s): 1062 Duplicate entry...
For newly added rows, always create the uq hash, and consider using it as the primary key once all entries are unique.
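A short closing sketch of that cleanup: once repeated runs of the UPDATE IGNORE leave only the duplicate rows with uq still NULL, they can be dropped in one statement.
DELETE FROM Import_Daten WHERE uq IS NULL;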

Select information from multiple tables with unique ID in the table name

Here is the scenario: I have tables stored in my monitoring application database using IDs as part of the table name. For instance, for 10 monitored devices, the device logs are stored in tables for each device with the device ID as part of the name, like this:
Device ID Table Name
1 device_logs.log_1
2 device_logs.log_2
I want to be able to :
select * from all device log tables where ID IN (a-list-of-IDs)
I reference this type of information a lot, and it would be easier to do it in a quick query and possibly a report. For a small list of devices, a union query works, but after about 4-5 devices it gets too long. Programmatically, I can do this in Python with string substitution, but how do you do it in MySQL as a query?
Adding a code segment I am trying to get to work, but struggling with the syntax:
drop table if exists tmp_logs;
create temporary table tmp_logs
(
device_name varchar(30) not null,
date datetime,
message varchar (255)
)
engine=innodb;
drop procedure if exists load_tmp_log_data;
delimiter #
create procedure load_tmp_log_data()
begin
declare vid int unsigned;
truncate table tmp_logs;
start transaction;
while vid in(4976,4956) do
insert into tmp_logs values
( SELECT
dev.device,
date,
message
FROM device_logs.logs_vid
inner join master_dev.legend_device dev on dev.id=vid
where date >= '2014-7-1'
and message not like '%HDMI%'
and message not like '%DVI%'
and message not like '%SIP%'
and message not like '%Completed%'
and message not like '%Template%'
and message not like '%collection%'
and message not like '%Cache%'
and message not like '%Disconnect%'
and message not like '%Row removed%'
and message not like '%detailed discovery%'
and message not like '%restarted%'
and message not like '%Auto Answer%'
);
end while;
commit;
end #
delimiter ;
call load_tmp_log_data();
select * from tmp_logs order by device_name;
You cannot dynamically specify the table name in a regular SQL query. You could use a prepared statement with dynamic SQL, as sketched below.
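A rough sketch of that route, using device id 4976 and the log_<id> naming from the question as assumptions; the statement text is built as a string and prepared at runtime:
SET @sql = CONCAT('SELECT * FROM device_logs.log_', 4976, ' WHERE date >= ''2014-7-1''');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;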
An alternative is to set up the query as:
select l.*
from ((select l.*
from device_logs.log_1 l
where device_id = 1 and id in (list of ids)
) union all
(select l.*
from device_logs.log_2 l
where device_id = 2 and id in (list of ids)
) . . .
) l;
You need to repeat the conditions in each subquery -- that makes them more efficient. And, use union all instead of union to avoid duplication.
Fundamentally, though, having tables of the same structure is often a sign of poor database design. It would be much better to have a single table with a column specifying the id. Then your query would be really easy:
select l.*
from logs l
where id in (list of ids);
You could generate such a table by changing the application that creates the tables. Or you could generate such a table by using an insert trigger on each of the subtables. Or, if the data can be a day or so out of date, run a job that re-creates the table every night.
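A minimal sketch of the insert-trigger idea, assuming a combined table named logs with a device_id column and per-device tables shaped like the temporary table in the question (both names are assumptions):
CREATE TRIGGER device_logs.log_1_to_combined
AFTER INSERT ON device_logs.log_1
FOR EACH ROW
INSERT INTO logs (device_id, date, message)
VALUES (1, NEW.date, NEW.message);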

Does my Full-Text Index already contain a particular value?

I've got a SQL 2008 R2 table defined like this:
CREATE TABLE [dbo].[Search_Name](
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](300) NULL,
CONSTRAINT [PK_Search_Name] PRIMARY KEY CLUSTERED ([Id] ASC))
Performance querying the Name field using CONTAINS and FREETEXT works well.
However, I'm trying to keep the values of my Name column unique. Searching for an existing entry in the Name column is unbelievably slow for a large number of names (usually batches of 1,000), even with an index on the Name field. Query plans indicate I'm using the index as expected.
To search for an existing value, my query looks like this:
SELECT TOP 1 Id, Name from Search_Name where Name = 'My Name Value'
I've tried duplicating the Name column to another column and searching on the new column, but the net effect was the same.
At this point, I'm thinking I must be mis-using this feature.
Should I just stop trying to prevent duplication? I'm using a linking table to join these search name values to the underlying data. It seems somehow 'dirty' to just store a whole bunch of duplicate values...
...or is there a faster way to take a list of 1,000 names and see which ones are already stored in the database?
The first change to make is to get the entire list to SQL Server at one time. Regardless of how you add the names to the existing table, doing it as a set operation will make a big difference in performance.
Passing the List as a table-valued parameter (TVP) is a clean way to handle it. Have a look here for an example. You can still use an OUTPUT clause to track which rows did or didn't make the cut, for example:
-- Some sample existing names.
declare @Search_Name as Table ( Id Int Identity, Name VarChar(32) );
insert into @Search_Name ( Name ) values ( 'Bob' ), ( 'Carol' ), ( 'Ted' ), ( 'Alice' );
select * from @Search_Name;
-- Some (prospective) new names.
declare @New_Names as Table ( Name VarChar(32) );
insert into @New_Names ( Name ) values ( 'Ralph' ), ( 'Alice' ), ( 'Ed' ), ( 'Trixie' );
select * from @New_Names;
-- Add the unique new names.
declare @Inserted as Table ( Id Int, Name VarChar(32) );
insert into @Search_Name
output inserted.Id, inserted.Name into @Inserted
select New.Name
from @New_Names as New left outer join
@Search_Name as Old on Old.Name = New.Name
where Old.Id is NULL;
-- Results.
select * from @Search_Name;
-- The names that were added and their id's.
select * from @Inserted;
-- The names that were not added.
select New.Name
from @New_Names as New left outer join
@Inserted as I on I.Name = New.Name
where I.Id is NULL;
Alternatively, you could use a MERGE statement and OUTPUT the names that were added, those that weren't, or both.
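A rough MERGE sketch under the same assumptions as the example above (the @Search_Name and @New_Names table variables); the OUTPUT clause reports the names that were inserted:
merge @Search_Name as Old
using @New_Names as New
on Old.Name = New.Name
when not matched by target then
insert ( Name ) values ( New.Name )
output $action, inserted.Id, inserted.Name;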

How can I create a query that shows children of an item recursively [duplicate]

This question already has answers here:
Recursive MySQL Query with relational innoDB
(2 answers)
Closed 9 years ago.
I have a MySQL table which has the following format:
CREATE TABLE IF NOT EXISTS `Company` (
`CompanyId` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
`Name` VARCHAR(45) NULL ,
`Address` VARCHAR(45) NULL ,
`ParentCompanyId` INT UNSIGNED NULL ,
PRIMARY KEY (`CompanyId`) ,
INDEX `fk_Company_Company_idx` (`ParentCompanyId` ASC) ,
CONSTRAINT `fk_Company_Company`
FOREIGN KEY (`ParentCompanyId` )
REFERENCES `Company` (`CompanyId` )
ON DELETE NO ACTION
ON UPDATE NO ACTION)
ENGINE = InnoDB;
So to clarify, I have companies which can have a parent company. This could result in the following example table contents:
CompanyId Name Address ParentCompanyId
1 Foo Somestreet 3 NULL
2 Bar Somelane 4 1
3 McD Someway 1337 1
4 KFC Somewhere 12 2
5 Pub Someplace 2 4
Now comes my question.
I want to retrieve all children of CompanyId 2 recursively. So the following result set should appear:
CompanyId Name Address ParentCompanyId
4 KFC Somewhere 12 2
5 Pub Someplace 2 4
I thought of using the WITH ... AS ... statement, but it is not supported by MySQL. Another solution I thought of was using a procedure or function which returns a result set and unioning it with the recursive call of that function. But MySQL functions only support column types as return values.
The last possible solution I thought about was to create a table with two fields: CompanyId and HasChildId. I could then write a procedure that loops recursively through the companies and fills the table with all recursive children of a CompanyId. In that case I could write a query which joins this table:
SELECT CompanyId, Name, Address
FROM Company C -- The child
INNER JOIN CompanyChildMappingTable M
ON M.CompanyId = C.HasChildId
INNER JOIN Company P -- The parent
ON P.CompanyId = M.CompanyId
WHERE P.CompanyId = 2;
This option should be fast if I called the procedure every 24 hours and filled the table on the fly when new records are inserted into Company. But this could be very tricky, and I would have to do it by writing triggers on the Company table.
I would like to hear your advice.
Solution: I've built the following procedure to fill my table (now it just returns the SELECT result).
DELIMITER $$
DROP PROCEDURE IF EXISTS CompanyFillWithSubCompaniesByCompanyId$$
CREATE PROCEDURE CompanyFillWithSubCompaniesByCompanyId(IN V_CompanyId BIGINT UNSIGNED, IN V_TableName VARCHAR(100))
BEGIN
DECLARE V_CONCAT_IDS VARCHAR(9999) DEFAULT '';
DECLARE V_CURRENT_CONCAT VARCHAR(9999) DEFAULT '';
SET V_CONCAT_IDS = (SELECT GROUP_CONCAT(CompanyId) FROM Company WHERE V_CompanyId IS NULL OR ParentCompanyId = V_CompanyId);
SET V_CURRENT_CONCAT = V_CONCAT_IDS;
IF V_CompanyId IS NOT NULL THEN
companyLoop: LOOP
IF V_CURRENT_CONCAT IS NULL THEN
LEAVE companyLoop;
END IF;
SET V_CURRENT_CONCAT = (SELECT GROUP_CONCAT(CompanyId) FROM Company WHERE FIND_IN_SET(ParentCompanyId, V_CURRENT_CONCAT));
SET V_CONCAT_IDS = CONCAT_WS(',', V_CONCAT_IDS, V_CURRENT_CONCAT);
END LOOP;
END IF;
SELECT * FROM Company WHERE FIND_IN_SET(CompanyId, V_CONCAT_IDS);
END$$
DELIMITER ;
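For the sample data in the question, calling the procedure for CompanyId 2 returns exactly the two expected child rows (the table-name argument is unused in this version, so the value here is just a placeholder):
CALL CompanyFillWithSubCompaniesByCompanyId(2, 'CompanyChildMappingTable');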
Refer to:
Recursive MySQL Query with relational innoDB
and
How to find all child rows in MySQL?
They should give an idea of how such a data structure can be handled in MySQL.
One quick way to search is to use company id values that are powers of 2: companyId = parentId * 2, then query the database like select * from company where ((CompanyId % $parentId) == 0 ).
I tried this approach; it's quick, but the problem is that it creates the child's id as parentId * 2, and if the tree gets deep, the int or float may go out of range. So I re-created my whole program.

adding '1' returns "BLOB"

A JobID goes as follows: ALC-YYYYMMDD-001. The first three characters are a company's initials; the last three are an incrementing number that resets daily and increments throughout the day as jobs are added, for a maximum of 999 jobs in a day. It is these last three that I am trying to work with.
I am trying to get a before-insert trigger to look for the max JobID of the day and add one, so the trigger can derive the proper JobID. For the first job of the day it will of course return null. So here is what I have so far.
Through the following I can get a result of '000'.
set @maxjobID =
(select SUBSTRING(
(Select MAX(
SUBSTRING((Select JobID FROM jobs WHERE SUBSTRING(JobID,5,8)=date_format(curdate(), '%Y%m%d')),4,12)
)
),14,3)
);
select lpad((select ifnull(@maxjobID,0)),3,'0')
But I really need to add one to this, keeping the leading zeros, to increment the first and subsequent jobs of the day. My problem is that as soon as I try to add '1' I get a return of 'BLOB'. That is:
select lpad((select ifnull(@maxjobID,0)+1),3,'0')
returns 'BLOB'
I need it to return '001' so I can concatenate that result with the CO initials and the current date.
Try casting the VARCHAR back to INTEGER:
SELECT LPAD(COALESCE(CAST(@maxjobID AS SIGNED), 0) + 1, 3, '0');
If you're using the MyISAM storage engine, you can implement exactly this with AUTO_INCREMENT, without denormalising your data into a delimited string:
For MyISAM tables, you can specify AUTO_INCREMENT on a secondary column in a multiple-column index. In this case, the generated value for the AUTO_INCREMENT column is calculated as MAX(auto_increment_column) + 1 WHERE prefix=given-prefix. This is useful when you want to put data into ordered groups.
In your case:
Normalise your schema:
ALTER TABLE jobs
ADD initials CHAR(3) NOT NULL FIRST,
ADD date DATE NOT NULL AFTER initials,
ADD seq SMALLINT(3) UNSIGNED NOT NULL AFTER date
;
Normalise your existing data:
UPDATE jobs SET
initials = SUBSTRING_INDEX(JobID, '-', 1),
date = STR_TO_DATE(SUBSTRING(JobID, 5, 8), '%Y%m%d'),
seq = SUBSTRING_INDEX(JobID, '-', -1)
;
Set up the AUTO_INCREMENT:
ALTER TABLE jobs
DROP PRIMARY KEY,
DROP JobID,
MODIFY seq SMALLINT(3) UNSIGNED NOT NULL AUTO_INCREMENT,
ADD PRIMARY KEY(initials, date, seq)
;
You can then recreate your JobID as required on SELECT (or even create a view from such a query):
SELECT CONCAT_WS(
'-',
initials,
DATE_FORMAT(date, '%Y%m%d'),
LPAD(seq, 3, '0')
) AS JobID,
-- etc.
If you're using InnoDB, whilst you can't generate sequence numbers in this fashion, I'd still recommend normalising your data as above.
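For InnoDB, one common workaround (only a sketch, with a hypothetical job_seq counter table; this is not from the answer above) is to keep a per-day counter row and claim numbers atomically with INSERT ... ON DUPLICATE KEY UPDATE:
CREATE TABLE job_seq (
initials CHAR(3) NOT NULL,
date DATE NOT NULL,
seq SMALLINT UNSIGNED NOT NULL,
PRIMARY KEY (initials, date)
) ENGINE=InnoDB;
-- Claim the next number for today; LAST_INSERT_ID() then holds it.
INSERT INTO job_seq (initials, date, seq)
VALUES ('ALC', CURDATE(), LAST_INSERT_ID(1))
ON DUPLICATE KEY UPDATE seq = LAST_INSERT_ID(seq + 1);
SELECT LPAD(LAST_INSERT_ID(), 3, '0') AS next_seq;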
So, I found a query that works (thus far).
Declare maxjobID VARCHAR(16);
Declare jobincrement SMALLINT;
SET maxjobID =
(Select MAX(
ifnull(SUBSTRING(
(Select JobID FROM jobs WHERE SUBSTRING(JobID,5,8)=date_format(curdate(), '%Y%m%d')),
5,
12),0)
)
);
if maxjobID=0
then set jobincrement=1;
else set jobincrement=(select substring(maxjobID,10,3))+1;
end if;
Set NEW.JobID=concat
(New.AssignedCompany,'-',date_format(curdate(), '%Y%m%d'),'-',(select lpad(jobincrement,3,'0')));
Thanks for the responses! Especially eggyal for pointing out the auto_increment capabilities in MyISAM.