MySQL export to CSV with related fields concatenated - mysql

I need to export my mysql db to a csv file. Where I'm going to be using it can't have related tables, so I need to concat related records into a single field. Is this possible to do? For example, assuming this table structure:
Items: id as INT, name as VARCHAR
ItemIdentifiers: id as INT, item_id as INT, identifier_id as INT
Identifiers: id as INT, identifier as VARCHAR
ItemColors: id as INT, item_id as INT, color_id as INT
Colors: id as INT, color as VARCHAR
and assuming this data:
Items: (1, 'some name')
ItemIdentifiers: (1, 1, 1), (2, 1, 2)
Identifiers: (1, 'ident1'), (2, 'ident2')
ItemColors: (1, 1, 1), (2, 1, 2)
Colors: (1, 'blue'), (2, 'green')
How would I get this:
'some name', 'ident1 ident2', 'blue green'
That's just a basic example, but I hope that conveys what I'm trying to do.

You can use the GROUP_CONCAT function in combination with SELECT ... INTO OUTFILE:
SELECT DISTINCT
items.name AS `Name`,
GROUP_CONCAT(DISTINCT identifiers.identifier
ORDER BY identifiers.identifier
SEPARATOR ' ') AS `Identifiers`,
GROUP_CONCAT(DISTINCT colors.color
ORDER BY colors.color
SEPARATOR ' ') AS `colors`
INTO OUTFILE '/tmp/data.csv'
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
FROM items
JOIN itemidentifiers
ON ( itemidentifiers.item_id = items.id )
JOIN identifiers
ON ( itemidentifiers.identifier_id = identifiers.id )
JOIN itemcolors
ON ( itemcolors.item_id = items.id )
JOIN colors
ON ( colors.id = itemcolors.color_id )
GROUP BY items.id
You might notice there are a lot of JOINs. This is because you have used relational (link) tables. Each relational table adds one additional JOIN.
Note: The above query is experimental. I haven't tested it yet.
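Since the query above is untested, here is a small sketch for checking the concatenation idea locally. It assumes an in-memory SQLite database as a stand-in for the MySQL server and writes the CSV on the client with Python's csv module (INTO OUTFILE writes the file on the server host instead). SQLite spells the separator as a second argument to group_concat, where MySQL uses the SEPARATOR keyword. It also uses one correlated subquery per related table rather than the joins above, so treat it as an illustration, not a drop-in replacement.

```python
import csv
import io
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL database from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Identifiers (id INTEGER PRIMARY KEY, identifier TEXT);
CREATE TABLE ItemIdentifiers (id INTEGER PRIMARY KEY, item_id INT, identifier_id INT);
CREATE TABLE Colors (id INTEGER PRIMARY KEY, color TEXT);
CREATE TABLE ItemColors (id INTEGER PRIMARY KEY, item_id INT, color_id INT);
INSERT INTO Items VALUES (1, 'some name');
INSERT INTO Identifiers VALUES (1, 'ident1'), (2, 'ident2');
INSERT INTO ItemIdentifiers VALUES (1, 1, 1), (2, 1, 2);
INSERT INTO Colors VALUES (1, 'blue'), (2, 'green');
INSERT INTO ItemColors VALUES (1, 1, 1), (2, 1, 2);
""")

# Each related table is concatenated in its own correlated subquery,
# so the two one-to-many relations never multiply each other's rows.
QUERY = """
SELECT i.name,
       (SELECT group_concat(d.identifier, ' ')
          FROM ItemIdentifiers ii
          JOIN Identifiers d ON d.id = ii.identifier_id
         WHERE ii.item_id = i.id) AS identifiers,
       (SELECT group_concat(c.color, ' ')
          FROM ItemColors ic
          JOIN Colors c ON c.id = ic.color_id
         WHERE ic.item_id = i.id) AS colors
FROM Items i
ORDER BY i.id
"""

rows = conn.execute(QUERY).fetchall()

# Write the result as CSV on the client side, quoting every text field.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_NONNUMERIC).writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

The order of values inside each concatenated field is not guaranteed here; in MySQL the ORDER BY inside GROUP_CONCAT, as used in the answer, pins it down.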

Related

Getting sum from a left table of leftjoined table

Below are the tables and the SQL query. I am doing a left join and trying to get SUM of a column that's in the left table and count from the right table.
Is it possible to get both in 1 query?
https://www.db-fiddle.com/f/3QuxG1DLgWJ8aGXNbnnwU1/1
select
s.test,
count(distinct s.name),
sum(s.score) score, -- need accurate score
count(a.id) attempts -- need accurate attempt count
from question s
left join attempts a on s.id = a.score_id
group by s.test
create table question (
id int auto_increment primary key,
test varchar(25),
name varchar(25),
score int
);
create table attempts (
id int auto_increment primary key,
score_id int,
attempt_no int
);
insert into question (test, name, score) values
('test1','name1', 10),
('test1','name2', 15),
('test1','name3', 20),
('test1','name4', 25),
('test2','name1', 15),
('test2','name2', 25),
('test2','name3', 30),
('test2','name4', 20);
insert into attempts (score_id, attempt_no) values
(1, 1),
(1, 2),
(1, 3),
(1, 4),
(2, 1),
(2, 2),
(2, 3),
(2, 4);
You need to pre-aggregate before the join:
select q.test, count(distinct q.name),
sum(q.score) score, -- need accurate score
sum(a.num_attempts) attempts -- need accurate attempt count
from question q left join
(select a.score_id, count(*) as num_attempts
from attempts a
group by a.score_id
) a
on q.id = a.score_id
group by q.test;
Here is a db-fiddle.
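To see why pre-aggregating matters, here is a quick sketch that runs both versions on the sample data from the question. It assumes an in-memory SQLite database in place of MySQL; the queries themselves are plain SQL. The variable names naive and pre_aggregated are only for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE question (id INTEGER PRIMARY KEY AUTOINCREMENT, test TEXT, name TEXT, score INT);
CREATE TABLE attempts (id INTEGER PRIMARY KEY AUTOINCREMENT, score_id INT, attempt_no INT);
INSERT INTO question (test, name, score) VALUES
  ('test1','name1',10), ('test1','name2',15), ('test1','name3',20), ('test1','name4',25),
  ('test2','name1',15), ('test2','name2',25), ('test2','name3',30), ('test2','name4',20);
INSERT INTO attempts (score_id, attempt_no) VALUES
  (1,1), (1,2), (1,3), (1,4), (2,1), (2,2), (2,3), (2,4);
""")

# Joining first repeats each question row once per attempt, inflating SUM(score).
naive = conn.execute("""
SELECT q.test, SUM(q.score), COUNT(a.id)
FROM question q
LEFT JOIN attempts a ON q.id = a.score_id
GROUP BY q.test
ORDER BY q.test
""").fetchall()

# Collapsing attempts to one row per score_id first keeps each question row single.
pre_aggregated = conn.execute("""
SELECT q.test, SUM(q.score), SUM(a.num_attempts)
FROM question q
LEFT JOIN (SELECT score_id, COUNT(*) AS num_attempts
           FROM attempts
           GROUP BY score_id) a
  ON q.id = a.score_id
GROUP BY q.test
ORDER BY q.test
""").fetchall()

print(naive)           # [('test1', 145, 8), ('test2', 90, 0)]
print(pre_aggregated)  # [('test1', 70, 8), ('test2', 90, None)]
```

Note that test2 has no attempts, so the pre-aggregated sum comes back as NULL (None in Python); wrapping it in COALESCE(..., 0) would turn that into 0 if you prefer.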
As Gordon said above, you can pre-aggregate, but unfortunately his answer will give you the incorrect number of attempts. This is due to how your DB schema is structured. It looks like your question table really records scores of attempts at questions, and your attempts table is unnecessary. You should really have a question table that simply contains an ID and a name for the question, and an attempts table that contains an attempt ID, question ID, name, and score.
create table question (
id int auto_increment primary key,
test varchar(25)
);
create table attempts (
id int auto_increment primary key,
question_id int,
name varchar(25),
score int
);
Then your query becomes as simple as:
select
q.id as question_id,
count(distinct a.name) as attempters,
sum(a.score) as total_score,
count(a.id) as total_attempts
from question q join attempts a on q.id = a.question_id
group by q.id

Delete all duplicate rows in mysql

I have MySQL data that was imported from a CSV file and contains many duplicate entries.
I picked out all the non-duplicates using DISTINCT.
Now I need to delete all the duplicates with an SQL command.
Note: I don't need any of the duplicates; I just need to fetch only the non-duplicates.
Thanks.
For example, if the number 0123332546666 is repeated 11 times, I want to delete 12 of them.
Mysql table format
ID, PhoneNumber
Just COUNT the number of duplicates (with GROUP BY) and filter by HAVING. Then supply the query result to DELETE statement:
DELETE FROM Table1 WHERE PhoneNumber IN (SELECT a.PhoneNumber FROM (
SELECT COUNT(*) AS cnt, PhoneNumber FROM Table1 GROUP BY PhoneNumber HAVING cnt>1
) AS a);
http://sqlfiddle.com/#!9/a012d21/1
complete fiddle:
schema:
CREATE TABLE Table1
(`ID` int, `PhoneNumber` int)
;
INSERT INTO Table1
(`ID`, `PhoneNumber`)
VALUES
(1, 888),
(2, 888),
(3, 888),
(4, 889),
(5, 889),
(6, 111),
(7, 222),
(8, 333),
(9, 444)
;
delete query:
DELETE FROM Table1 WHERE PhoneNumber IN (SELECT a.PhoneNumber FROM (
SELECT COUNT(*) AS cnt, PhoneNumber FROM Table1 GROUP BY PhoneNumber HAVING cnt>1
) AS a);
You could try using a left join against a subquery that finds the minimum id for each PhoneNumber, and delete the rows that do not match:
delete m
from m_table m
left join (
select min(id) as id, PhoneNumber
from m_table
group by PhoneNumber
) t on t.id = m.id
where t.PhoneNumber is null
Otherwise, if you want to delete all the duplicates without keeping even a single row, you could use:
delete m
from m_table m
INNER join (
select PhoneNumber
from m_table
group by PhoneNumber
having count(*) > 1
) t on t.PhoneNumber= m.PhoneNumber
Instead of deleting from the table, I would suggest creating a new one:
create table table2 as
select min(id) as id, phonenumber
from table1
group by phonenumber
having count(*) = 1;
Why? Deleting rows has a lot of overhead. If you are bringing the data in from an external source, then treat the first landing table as a staging table and the second as the final table.
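The answers above cover two different outcomes: dropping every number that occurs more than once, or keeping one row per number. A small sketch comparing them on the fiddle data follows. It assumes in-memory SQLite as a stand-in, which accepts a subquery on the table being deleted from directly; MySQL rejects that form, which is why the answers above wrap the subquery in a derived table or use a join. The helper remaining_ids is hypothetical.

```python
import sqlite3

# Sample rows from the fiddle above: (ID, PhoneNumber).
ROWS = [(1, 888), (2, 888), (3, 888), (4, 889), (5, 889),
        (6, 111), (7, 222), (8, 333), (9, 444)]

# Variant A: remove every number that occurs more than once (no copy is kept).
DELETE_ALL_DUPLICATED = """
DELETE FROM Table1
WHERE PhoneNumber IN (SELECT PhoneNumber FROM Table1
                      GROUP BY PhoneNumber HAVING COUNT(*) > 1)
"""

# Variant B: keep the row with the lowest ID for each number.
DELETE_KEEP_FIRST = """
DELETE FROM Table1
WHERE ID NOT IN (SELECT MIN(ID) FROM Table1 GROUP BY PhoneNumber)
"""

def remaining_ids(delete_sql):
    """Hypothetical helper: load the sample rows, run one DELETE, return surviving IDs."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Table1 (ID INT, PhoneNumber INT)")
    conn.executemany("INSERT INTO Table1 VALUES (?, ?)", ROWS)
    conn.execute(delete_sql)
    return [row[0] for row in conn.execute("SELECT ID FROM Table1 ORDER BY ID")]

print(remaining_ids(DELETE_ALL_DUPLICATED))  # [6, 7, 8, 9]
print(remaining_ids(DELETE_KEEP_FIRST))      # [1, 4, 6, 7, 8, 9]
```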

Mysql Sum or Group concat base on distinct id

I'm sure somebody has already asked this question somewhere, but I can't seem to find it.
Is it possible in MySQL to do a SUM or GROUP_CONCAT (aggregate function) combined with a DISTINCT?
Example: I have an order product which can have many options and many beneficiaries. How is it possible in one query (without using a subquery) to get the list of options, the sum of the option prices, and the list of beneficiaries?
I've constructed a sample data set:
CREATE TABLE `order`
(id INT NOT NULL PRIMARY KEY);
CREATE TABLE order_product
(id INT NOT NULL PRIMARY KEY
,order_id INT NOT NULL
);
CREATE TABLE order_product_options
(id INT NOT NULL PRIMARY KEY
,title VARCHAR(20) NOT NULL
,price INT NOT NULL
,order_product_id INT NOT NULL
);
CREATE TABLE order_product_beneficiary
(id INT NOT NULL PRIMARY KEY
,name VARCHAR(20) NOT NULL
,order_product_id INT NOT NULL
);
INSERT INTO `order` (`id`) VALUES (1);
INSERT INTO `order_product` (`id`, `order_id`) VALUES (1, 1);
INSERT INTO `order_product_options` (`id`, `title`, `price`, `order_product_id`)
VALUES (1,'option1', 1, 1), (2, 'option2', 2, 1), (3, 'option3', 3, 1), (4, 'option3', 3, 1);
INSERT INTO `order_product_beneficiary` (`id`, `name`, `order_product_id`)
VALUES (1,'mark', 1), (2, 'jack', 1), (3, 'jack', 1);
http://sqlfiddle.com/#!9/37e383/2
The result I would like to have is
id: 1
options: option1, option2, option3, option3
options price: 9
beneficiaries: mark, jack, jack
Is this possible in mysql without using subqueries ? (I know it is possible in oracle)
If it's possible, how would you do it ?
Thanks
Based on your description, I think you just want DISTINCT in the GROUP_CONCAT(). However, that won't work because of the duplicates (as explained in a comment but not the question).
One solution is to include the ids in the results:
SELECT op.id,
GROUP_CONCAT(DISTINCT opo.title, '(', opo.id, ')' SEPARATOR ', ') AS options,
SUM(opo.price) AS options_price,
GROUP_CONCAT(DISTINCT opb.name, '(', opb.id, ')' SEPARATOR ', ') AS 'beneficiaries'
FROM order_product op INNER JOIN
order_product_options opo
ON opo.order_product_id = op.id INNER JOIN
order_product_beneficiary opb
ON opb.order_product_id = op.id
GROUP BY op.id;
This is not exactly your results, but it might suffice.
EDIT:
Oh, I see. You are joining along two different dimensions and getting a Cartesian product. The solution is to aggregate before joining:
SELECT op.id, opo.options, opo.options_price,
opb.beneficiaries
FROM order_product op INNER JOIN
(SELECT opo.order_product_id,
GROUP_CONCAT(opo.title SEPARATOR ', ') AS options,
SUM(opo.price) AS options_price
FROM order_product_options opo
GROUP BY opo.order_product_id
) opo
ON opo.order_product_id = op.id INNER JOIN
(SELECT opb.order_product_id,
GROUP_CONCAT(opb.name SEPARATOR ', ') AS beneficiaries
FROM order_product_beneficiary opb
GROUP BY opb.order_product_id
) opb
ON opb.order_product_id = op.id;
Here is the SQL Fiddle.
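The Cartesian product is easy to confirm on the sample data: 4 options times 3 beneficiaries gives 12 joined rows, so the price sum is tripled. The sketch below assumes in-memory SQLite instead of MySQL (hence group_concat(x, ', ') rather than SEPARATOR) and leaves out the order table, which the queries never touch.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; data is from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_product (id INTEGER PRIMARY KEY, order_id INT NOT NULL);
CREATE TABLE order_product_options
  (id INTEGER PRIMARY KEY, title TEXT, price INT, order_product_id INT);
CREATE TABLE order_product_beneficiary
  (id INTEGER PRIMARY KEY, name TEXT, order_product_id INT);
INSERT INTO order_product VALUES (1, 1);
INSERT INTO order_product_options VALUES
  (1,'option1',1,1), (2,'option2',2,1), (3,'option3',3,1), (4,'option3',3,1);
INSERT INTO order_product_beneficiary VALUES
  (1,'mark',1), (2,'jack',1), (3,'jack',1);
""")

# Joining both child tables at once pairs every option with every beneficiary.
joined_rows, inflated_price = conn.execute("""
SELECT COUNT(*), SUM(opo.price)
FROM order_product op
JOIN order_product_options opo ON opo.order_product_id = op.id
JOIN order_product_beneficiary opb ON opb.order_product_id = op.id
""").fetchone()

# Aggregating each child table first leaves one row per order_product to join.
product_id, options, options_price, beneficiaries = conn.execute("""
SELECT op.id, opo.options, opo.options_price, opb.beneficiaries
FROM order_product op
JOIN (SELECT order_product_id,
             group_concat(title, ', ') AS options,
             SUM(price) AS options_price
      FROM order_product_options
      GROUP BY order_product_id) opo ON opo.order_product_id = op.id
JOIN (SELECT order_product_id,
             group_concat(name, ', ') AS beneficiaries
      FROM order_product_beneficiary
      GROUP BY order_product_id) opb ON opb.order_product_id = op.id
""").fetchone()

print(joined_rows, inflated_price)  # 12 27
print(options_price)                # 9
```

The duplicates (option3 twice, jack twice) survive because no DISTINCT is needed once each table is aggregated on its own.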
Something like this: do your price summation in an inner query and then join it with the order_product table.
SELECT
op.id,
MAX(opo.title) AS 'options',
MAX(opo.price) AS 'options price',
GROUP_CONCAT(opb.name SEPARATOR ', ') AS 'beneficiaries'
FROM
order_product op
INNER JOIN (
SELECT order_product_id, SUM(price) price, GROUP_CONCAT(title SEPARATOR ', ') title
FROM order_product_options
GROUP BY order_product_id
) opo ON opo.order_product_id = op.id
INNER JOIN order_product_beneficiary opb ON opb.order_product_id = op.id
GROUP BY op.id
Demo

Help with INSERT INTO..SELECT

I'm inserting a large number of rows into Table_A. Table_A includes a B_ID column which points to Table_B.B_ID.
Table B has just two columns: Table_B.B_ID (the primary key) and Table_B.Name.
I know the value for every Table_A field I'm inserting except B_ID. I only know the corresponding Table_B.Name. So how can I insert multiple rows into Table_A?
Here's a pseudocode version of what I want to do:
REPLACE INTO Table_A (Table_A.A_ID, Table_A.Field, Table_A.B_ID) VALUES
(1, 'foo', [SELECT B_ID FROM Table_B WHERE Table_B.Name = 'A'),
(2, 'bar', [SELECT B_ID FROM Table_B WHERE Table_B.Name = 'B'),...etc
I've had to do things like this when deploying scripts to a production environment where Ids differed in environments. Otherwise it's probably easier to type out the ID's
REPLACE INTO table_a (table_a.a_id, table_a.field, table_a.b_id)
SELECT 1, 'foo', b_id FROM table_b WHERE name = 'A'
UNION ALL SELECT 2, 'bar', b_id FROM table_b WHERE name = 'B'
If the values:
(1, 'foo', 'A'),
(2, 'bar', 'B'),
come from a (SELECT ...)
you can use this:
INSERT INTO Table_A
( A_ID, Field, B_ID)
SELECT Data.A_ID
, Data.Field
, Table_B.B_ID
FROM (SELECT ...) As Data
JOIN Table_B
ON Table_B.Name = Data.Name
If not, you can insert them into a temporary table and then use the above, replacing (SELECT ...) with TemporaryTable.
CREATE TABLE HelpTable
( A_ID int
, Fld varchar(200)
, Name varchar(200)
) ;
INSERT INTO HelpTable
VALUES
(1, 'foo', 'A'),
(2, 'bar', 'B'), etc...
;
INSERT INTO Table_A
( A_ID, Field, B_ID)
SELECT HelpTable.A_ID
, HelpTable.Fld
, Table_B.B_ID
FROM HelpTable
JOIN Table_B
ON Table_B.Name = HelpTable.Name
;
DROP TABLE HelpTable ;
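Here is a runnable sketch of the helper-table approach, assuming in-memory SQLite in place of MySQL and made-up B_ID values (10 and 20) so the lookup is visible in the result. The new rows carry only the Table_B name, and the join fills in B_ID during the insert.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the B_ID values are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table_B (B_ID INTEGER PRIMARY KEY, Name TEXT UNIQUE);
CREATE TABLE Table_A (A_ID INTEGER PRIMARY KEY, Field TEXT, B_ID INT);
INSERT INTO Table_B VALUES (10, 'A'), (20, 'B');
CREATE TEMP TABLE HelpTable (A_ID INT, Fld TEXT, Name TEXT);
""")

# Rows to insert, identified by Table_B.Name rather than B_ID.
new_rows = [(1, 'foo', 'A'), (2, 'bar', 'B')]
conn.executemany("INSERT INTO HelpTable VALUES (?, ?, ?)", new_rows)

# Resolve each Name to its B_ID while copying into Table_A.
conn.execute("""
INSERT INTO Table_A (A_ID, Field, B_ID)
SELECT h.A_ID, h.Fld, b.B_ID
FROM HelpTable h
JOIN Table_B b ON b.Name = h.Name
""")
conn.execute("DROP TABLE HelpTable")

inserted = conn.execute(
    "SELECT A_ID, Field, B_ID FROM Table_A ORDER BY A_ID"
).fetchall()
print(inserted)  # [(1, 'foo', 10), (2, 'bar', 20)]
```

One thing to watch: with an inner join, a helper row whose Name has no match in Table_B is silently skipped, so it may be worth comparing row counts after the insert.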

Find rows that has ALL the linked rows

I've got two tables:
User (id, name, etc)
UserRight (user_id, right_id)
I want to find the users who have rights 1, 2 and 3, but no users who only have one or two of these. Also, the number of rights will vary, so searches for (1,2,3) and (1,2,3,4,5,6,7) should work with much the same query.
Essentially:
SELECT *
FROM User
WHERE (
SELECT right_id
FROM tblUserRight
WHERE user_id = id
ORDER BY user_id ASC
) = (1,2,3)
Is this possible in MySQL?
SELECT u.id, u.name ...
FROM User u
JOIN UserRight r on u.id = r.user_id
WHERE right_id IN (1,2,3)
GROUP BY u.id, u.name ...
HAVING COUNT(DISTINCT right_id) = 3
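Since the number of rights varies, the IN list and the count in HAVING can both be derived from the same input. A minimal sketch, assuming in-memory SQLite in place of MySQL, a hypothetical helper named users_with_all_rights, and three made-up users:

```python
import sqlite3

def users_with_all_rights(conn, rights):
    """Hypothetical helper: return (id, name) of users holding every right in `rights`."""
    wanted = sorted(set(rights))
    if not wanted:
        return []
    # One placeholder per requested right; the HAVING count must match their number.
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT u.id, u.name
        FROM User u
        JOIN UserRight r ON u.id = r.user_id
        WHERE r.right_id IN ({placeholders})
        GROUP BY u.id, u.name
        HAVING COUNT(DISTINCT r.right_id) = ?
        ORDER BY u.id
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()

# Assumption: sample data invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE User (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE UserRight (user_id INT, right_id INT);
INSERT INTO User VALUES (1, 'Adam'), (2, 'Bono'), (3, 'Cher');
INSERT INTO UserRight VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 1), (3, 2);
""")

print(users_with_all_rights(conn, [1, 2, 3]))  # [(1, 'Adam')]
print(users_with_all_rights(conn, [2, 3]))     # [(1, 'Adam'), (2, 'Bono')]
```

Only the placeholders are interpolated into the SQL text; the right ids themselves are passed as bound parameters.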
You can also do this using PIVOT, especially if you want a visual representation. I did this on SQL Server - you may be able to translate it.
Declare @User Table (id Int, name Varchar (10))
Declare @UserRight Table (user_id Int, right_id Int)
Insert Into @User Values (1, 'Adam')
Insert Into @User Values (2, 'Bono')
Insert Into @User Values (3, 'Cher')
Insert Into @UserRight Values (1, 1)
Insert Into @UserRight Values (1, 2)
Insert Into @UserRight Values (1, 3)
--Insert Into @UserRight Values (2, 1)
Insert Into @UserRight Values (2, 2)
Insert Into @UserRight Values (2, 3)
Insert Into @UserRight Values (3, 1)
Insert Into @UserRight Values (3, 2)
--Insert Into @UserRight Values (3, 3)
SELECT *
FROM @User U
INNER JOIN @UserRight UR
ON U.id = UR.User_Id
PIVOT
(
SUM (User_Id)
FOR Right_Id IN ([1], [2], [3])
) as xx
WHERE 1=1
SELECT *
FROM @User U
INNER JOIN @UserRight UR
ON U.id = UR.User_Id
PIVOT
(
SUM (User_Id)
FOR Right_Id IN ([1], [2], [3])
) as xx
WHERE 1=1
AND [1] IS NOT NULL
AND [2] IS NOT NULL
AND [3] IS NOT NULL
In response to the errors pointed out in my answer, here is a solution with COUNT and a subquery:
SELECT *
FROM User
WHERE 3 = (
SELECT Count(user_id)
FROM tblUserRight
WHERE right_id IN (1,2,3)
AND user_id = User.id
)
An optimizer may of course change this to Martin Smith's solution (i.e. by using a group by).