Grouping, yet concatenating values from within the group in MySQL - mysql

I have a simple MySQL table as such:
| CUST_ID | VISIT | PROD_ID |
|---------|-------|---------|
| 1 | 1 | 3473 |
| 1 | 2 | 324 |
| 1 | 2 | 324 |
| 2 | 1 | 426 |
| 2 | 2 | 4418 |
| 3 | 1 | 4523 |
| 4 | 1 | 976 |
| 4 | 1 | 86 |
| 4 | 2 | 3140 |
| 4 | 3 | 1013 |
And I would like to transform it to this:
| CUST_ID | VISIT | PROD_IDs |
|---------|-------|----------|
| 1 | 1 | 3473 |
| 1 | 2 | 324, 324 |
| 2 | 1 | 426 |
| 2 | 2 | 4418 |
| 3 | 1 | 4523 |
| 4 | 1 | 976, 86 |
| 4 | 2 | 3140 |
| 4 | 3 | 1013 |
This is kinda an ugly hack, I get it.
I have no idea how to cleanly create such a thing. I've tried a variety of unsuccessful grouping strategies. Even a clue or hint in the right direction would be great. Thanks.

If you want to group by CUST_ID and VISIT, do exactly that and apply GROUP_CONCAT to the PROD_ID column, for example (replace your_table with your table name; TABLE itself is a reserved word):
SELECT
    CUST_ID,
    VISIT,
    GROUP_CONCAT(PROD_ID) AS PROD_IDS
FROM
    your_table
GROUP BY
    CUST_ID,
    VISIT
Reference: GROUP_CONCAT
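The desired output separates the IDs with ", ", whereas GROUP_CONCAT joins with a bare comma by default. As a minimal sketch, the following runs the same grouping against an in-memory SQLite database from Python, used here only as a self-contained stand-in for MySQL. The table name visits is an assumption. SQLite takes the separator as a second argument; in MySQL the equivalent is GROUP_CONCAT(PROD_ID SEPARATOR ', ').

```python
import sqlite3

# In-memory SQLite database as a stand-in for MySQL (assumption: table "visits").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (cust_id INTEGER, visit INTEGER, prod_id INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [(1, 1, 3473), (1, 2, 324), (1, 2, 324), (2, 1, 426), (2, 2, 4418),
     (3, 1, 4523), (4, 1, 976), (4, 1, 86), (4, 2, 3140), (4, 3, 1013)],
)

# SQLite: GROUP_CONCAT(expr, separator).
# MySQL:  GROUP_CONCAT(prod_id SEPARATOR ', ').
rows = conn.execute(
    """
    SELECT cust_id, visit, GROUP_CONCAT(prod_id, ', ') AS prod_ids
    FROM visits
    GROUP BY cust_id, visit
    ORDER BY cust_id, visit
    """
).fetchall()

for row in rows:
    print(row)
```

The order of values inside each concatenated string is not guaranteed unless you ask for one; MySQL accepts an ORDER BY inside GROUP_CONCAT for that. MySQL also truncates the result at group_concat_max_len (1024 bytes by default), which matters for large groups.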

Related

Is there a mySQL procedure that can merge duplicate rows of data into one, then allow me to manipulate that data as if it were one row?

I'm trying to come up with a stored procedure that takes multiple rows that are exactly identical, and combines them into one row while summing one column, which can then be run through more stored procedures based on the sum of that one column.
I've tried a GROUP BY statement, but that doesn't actually group the rows together, because if I run the table through another procedure it performs actions as if each row were not combined. Performing a SELECT * FROM mytable query shows that each row was not actually combined into one.
Is there any way to permanently combine multiple rows into one singular row?
To start, I've got a table like this:
+-------+-----+--------+---------+------+-----+-----------+
| RowID | pID | Name | Date | Code | QTY | Purchased |
+-------+-----+--------+---------+------+-----+-----------+
| 1 | 1 | bob | 9/29/20 | 123 | 1 | |
| 2 | 1 | bob | 8/10/20 | 456 | 1 | |
| 3 | 2 | rob | 9/15/20 | 123 | 1 | |
| 4 | 2 | rob | 9/15/20 | 123 | 1 | |
| 5 | 2 | rob | 9/15/20 | 123 | 1 | |
| 6 | 2 | rob | 9/15/20 | 123 | 1 | |
| 7 | 2 | rob | 9/15/20 | 123 | 1 | |
| 8 | 3 | john | 7/12/20 | 987 | 1 | |
| 9 | 3 | john | 7/12/20 | 987 | 1 | |
| 10 | 4 | george | 9/12/20 | 684 | 1 | |
| 11 | 5 | paul | 2/2/20 | 454 | 1 | |
| 12 | 6 | amy | 1/12/20 | 252 | 1 | |
| 13 | 7 | susan | 5/30/20 | 131 | 1 | |
| 14 | 7 | susan | 6/6/20 | 252 | 1 | |
| 15 | 7 | susan | 5/30/20 | 131 | 1 | |
+-------+-----+--------+---------+------+-----+-----------+
By the end, I'd like to have a table like this:
+-------+-----+--------+---------+------+-----+-----------+
| RowID | pID | Name | Date | Code | QTY | Purchased |
+-------+-----+--------+---------+------+-----+-----------+
| 1 | 1 | bob | 9/29/20 | 123 | 1 | |
| 2 | 1 | bob | 8/10/20 | 456 | 1 | |
| 3 | 2 | rob | 9/15/20 | 123 | 5 | |
| 4 | 3 | john | 7/12/20 | 987 | 2 | |
| 5 | 4 | george | 9/12/20 | 684 | 1 | |
| 6 | 5 | paul | 2/2/20 | 454 | 1 | |
| 7 | 6 | amy | 1/12/20 | 252 | 1 | |
| 8 | 7 | susan | 5/30/20 | 131 | 2 | |
| 9 | 7 | susan | 6/6/20 | 252 | 1 | |
+-------+-----+--------+---------+------+-----+-----------+
Here, exactly identical rows are combined into one row and the QTY field is summed, so that I can then add purchases to, or deduct from, the total quantity. A GROUP BY statement produces this result in a query, but when I go to alter the quantity or add purchases for each person, the table behaves like the first one, as if nothing had actually been grouped.
So you have this table:
| RowID | pID | Name | Date | Code | QTY | Purchased |
+-------+-----+--------+---------+------+-----+-----------+
| 1 | 1 | bob | 9/29/20 | 123 | 1 | |
| 2 | 1 | bob | 8/10/20 | 456 | 1 | |
| 3 | 2 | rob | 9/15/20 | 123 | 1 | |
| 4 | 2 | rob | 9/15/20 | 123 | 1 | |
| 5 | 2 | rob | 9/15/20 | 123 | 1 | |
| 6 | 2 | rob | 9/15/20 | 123 | 1 | |
| 7 | 2 | rob | 9/15/20 | 123 | 1 | |
| 8 | 3 | john | 7/12/20 | 987 | 1 | |
| 9 | 3 | john | 7/12/20 | 987 | 1 | |
| 10 | 4 | george | 9/12/20 | 684 | 1 | |
| 11 | 5 | paul | 2/2/20 | 454 | 1 | |
| 12 | 6 | amy | 1/12/20 | 252 | 1 | |
| 13 | 7 | susan | 5/30/20 | 131 | 1 | |
| 14 | 7 | susan | 6/6/20 | 252 | 1 | |
| 15 | 7 | susan | 5/30/20 | 131 | 1 | |
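A GROUP BY in a SELECT only shapes that query's result; it never changes the rows stored in the table. The most reliable approach, as has been suggested, is to create a new table from the result of your grouping query, rename the old table out of the way, and rename the new table to the original name. Once you have checked that the new table is correct, drop the old one.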
CREATE TABLE indata_new AS
WITH grp AS (
    SELECT
        MIN(rowid) AS orowid
      , pid
      , name
      , MAX(date) AS date
      , code
      , SUM(qty) AS qty
    FROM indata
    GROUP BY
        pid
      , name
      , code
)
SELECT
    ROW_NUMBER() OVER(ORDER BY orowid ASC) AS rowid
  , grp.*  -- MySQL rejects an unqualified * after other select items
FROM grp;
ALTER TABLE indata RENAME TO indata_old;
ALTER TABLE indata_new RENAME TO indata;
-- if "indata" now contains the data you want ...
SELECT * FROM indata;
-- out rowid | orowid | pid | name | date | code | qty
-- out -------+--------+-----+--------+------------+------+-----
-- out 1 | 1 | 1 | bob | 2020-09-29 | 123 | 1
-- out 2 | 2 | 1 | bob | 2020-08-10 | 456 | 1
-- out 3 | 3 | 2 | rob | 2020-09-15 | 123 | 5
-- out 4 | 8 | 3 | john | 2020-07-12 | 987 | 2
-- out 5 | 10 | 4 | george | 2020-09-12 | 684 | 1
-- out 6 | 11 | 5 | paul | 2020-02-02 | 454 | 1
-- out 7 | 12 | 6 | amy | 2020-01-12 | 252 | 1
-- out 8 | 13 | 7 | susan | 2020-05-30 | 131 | 2
-- out 9 | 14 | 7 | susan | 2020-06-06 | 252 | 1
-- you can ...
DROP TABLE indata_old;
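
To check that later changes really act on the merged rows, here is a minimal sketch of the same build-and-swap idea, run against an in-memory SQLite database from Python as a stand-in for MySQL. It makes a few assumptions that differ from the answer above: the id column is named row_id (rowid has a special meaning in SQLite), dates are stored as ISO text, and the grouping includes the date so that only exactly identical rows are merged. The window function needs SQLite 3.25 or later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE indata (row_id INTEGER, pid INTEGER, name TEXT, "
    "date TEXT, code INTEGER, qty INTEGER)"
)
data = (
    [(1, 1, "bob", "2020-09-29", 123, 1), (2, 1, "bob", "2020-08-10", 456, 1)]
    + [(i, 2, "rob", "2020-09-15", 123, 1) for i in range(3, 8)]
    + [(i, 3, "john", "2020-07-12", 987, 1) for i in range(8, 10)]
    + [(10, 4, "george", "2020-09-12", 684, 1),
       (11, 5, "paul", "2020-02-02", 454, 1),
       (12, 6, "amy", "2020-01-12", 252, 1),
       (13, 7, "susan", "2020-05-30", 131, 1),
       (14, 7, "susan", "2020-06-06", 252, 1),
       (15, 7, "susan", "2020-05-30", 131, 1)]
)
conn.executemany("INSERT INTO indata VALUES (?, ?, ?, ?, ?, ?)", data)

# Build the merged table: one row per identical (pid, name, date, code),
# with qty summed and row ids renumbered in original order.
conn.execute(
    """
    CREATE TABLE indata_new AS
    SELECT ROW_NUMBER() OVER (ORDER BY first_row_id) AS row_id,
           pid, name, date, code, qty
    FROM (
        SELECT MIN(row_id) AS first_row_id, pid, name, date, code,
               SUM(qty) AS qty
        FROM indata
        GROUP BY pid, name, date, code
    ) AS grp
    """
)

# Swap the tables, keeping the old one until the result is verified.
conn.execute("ALTER TABLE indata RENAME TO indata_old")
conn.execute("ALTER TABLE indata_new RENAME TO indata")

merged = conn.execute("SELECT * FROM indata ORDER BY row_id").fetchall()

# A later change now touches the single merged row.
conn.execute("UPDATE indata SET qty = qty - 1 WHERE name = 'rob'")
rob_after = conn.execute(
    "SELECT row_id, qty FROM indata WHERE name = 'rob'"
).fetchall()

for row in merged:
    print(row)
print(rob_after)
```

Because the merged rows are physically stored in the new table, the UPDATE at the end changes one row for rob rather than five.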

Output record without duplicate

I have a table like this and I want to output it without repeating the same user. If I use GROUP BY, it shows only one record for the same column value. I am also using a LEFT JOIN to get the location and user name. Any help would be appreciated.
+------+---------+----------+---------+
| user | work id | location | time |
+------+---------+----------+---------+
| 1 | 42 | 1 | 2hr |
| 1 | 42 | 1 | 10min |
| 1 | 42 | 1 | 30min |
| 2 | 42 | 1 | 4hr |
| 2 | 42 | 1 | 2.30min |
| 1 | 50 | 2 | 4min |
| 1 | 50 | 2 | 5min |
| 2 | 20 | 3 | 3hr |
| 1 | 20 | 3 | 6hr |
+------+---------+----------+---------+
I am looking for this:
+------+---------+----------+
| user | work id | location |
+------+---------+----------+
| 1 | 42 | 1 |
| 1 | 50 | 2 |
| 1 | 20 | 3 |
| 2 | 42 | 1 |
| 2 | 20 | 3 |
+------+---------+----------+
You simply need DISTINCT here:
SELECT DISTINCT user
,workid
,location
FROM YOUR_TABLE
ORDER BY user
,location

How to select multiple values for one table "cell" MySQL

For all players, I need to find the player number and a list of the numbers of teams for which they have ever played.
Here is the table "MATCHES":
+---------+--------+----------+-----+------+
| MATCHNO | TEAMNO | PLAYERNO | WON | LOST |
+---------+--------+----------+-----+------+
| 1 | 1 | 6 | 3 | 1 |
| 2 | 1 | 6 | 2 | 3 |
| 3 | 1 | 6 | 3 | 0 |
| 4 | 1 | 44 | 3 | 2 |
| 5 | 1 | 83 | 0 | 3 |
| 6 | 1 | 2 | 1 | 3 |
| 7 | 1 | 57 | 3 | 0 |
| 8 | 1 | 8 | 0 | 3 |
| 9 | 2 | 27 | 3 | 2 |
| 10 | 2 | 104 | 3 | 2 |
| 11 | 2 | 112 | 2 | 3 |
| 12 | 2 | 112 | 1 | 3 |
| 13 | 2 | 8 | 0 | 3 |
+---------+--------+----------+-----+------+
The best I could come up with was:
SELECT DISTINCT playerno, teamno
FROM matches
ORDER BY playerno;
which results in:
+----------+--------+
| playerno | teamno |
+----------+--------+
| 2 | 1 |
| 6 | 1 |
| 8 | 1 |
| 8 | 2 |
| 27 | 2 |
| 44 | 1 |
| 57 | 1 |
| 83 | 1 |
| 104 | 2 |
| 112 | 2 |
+----------+--------+
Notice how player 8 has played on two teams. How can I get the table to show only one row for player 8 and a list of teamno's (1 & 2)?
You could use the group_concat aggregate function:
SELECT playerno, GROUP_CONCAT(DISTINCT teamno)
FROM matches
GROUP BY playerno
ORDER BY playerno;
You could use group_concat (without DISTINCT, a team number is repeated once per match played):
SELECT playerno, group_concat( teamno)
FROM matches
GROUP BY playerno;
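
The two answers differ only in DISTINCT, and the difference shows up for players with several matches on the same team. As a minimal sketch, the following runs both queries against an in-memory SQLite database from Python, used here as a self-contained stand-in for MySQL; the data is the MATCHES table from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (matchno INTEGER, teamno INTEGER, "
    "playerno INTEGER, won INTEGER, lost INTEGER)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 6, 3, 1), (2, 1, 6, 2, 3), (3, 1, 6, 3, 0), (4, 1, 44, 3, 2),
     (5, 1, 83, 0, 3), (6, 1, 2, 1, 3), (7, 1, 57, 3, 0), (8, 1, 8, 0, 3),
     (9, 2, 27, 3, 2), (10, 2, 104, 3, 2), (11, 2, 112, 2, 3),
     (12, 2, 112, 1, 3), (13, 2, 8, 0, 3)],
)

# Without DISTINCT: one entry per match row.
plain = dict(conn.execute(
    "SELECT playerno, GROUP_CONCAT(teamno) FROM matches GROUP BY playerno"
))

# With DISTINCT: each team appears once per player.
distinct = dict(conn.execute(
    "SELECT playerno, GROUP_CONCAT(DISTINCT teamno) "
    "FROM matches GROUP BY playerno"
))

print(plain[6], "|", distinct[6])
print(plain[8], "|", distinct[8])
```

Player 6 played three matches for team 1, so the plain version lists that team three times, while the DISTINCT version lists it once. Player 8 gets both teams in either version.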

Creating a log having the date of purchase

I need to create a log having the purchase date of an item.
An item can be owned by only one buyer at a time. So, for example, if item1 was purchased by buyer2 in 2009 and later by buyer1 in 2015, then it was owned by buyer2 between 2009 and 2015.
Here is my table:
+--------+------------+-----------+----------+
| id_doc | date | id_item | id_buyer |
+--------+------------+-----------+----------+
| 11 | 2016-06-07 | 1 | 4 |
| 10 | 2016-06-06 | 1 | 4 |
| 1 | 2015-11-30 | 1 | 1 |
| 9 | 2009-01-01 | 1 | 2 |
| 4 | 2001-01-12 | 1 | 2 |
| 8 | 1996-06-06 | 1 | 2 |
| 3 | 1995-05-29 | 1 | 1 |
| 2 | 1998-05-23 | 2 | 2 |
| 7 | 2014-10-10 | 3 | 2 |
| 6 | 2003-12-12 | 3 | 3 |
| 5 | 1991-01-12 | 3 | 2 |
+--------+------------+-----------+----------+
Here is a kind of table/view I need:
+------------+------------+-----------+----------+--------+
| date_from | date_to | id_item | id_buyer | id_doc |
+------------+------------+-----------+----------+--------+
| 2016-06-07 | - | 1 | 4 | 11 |
| 2016-06-06 | 2016-06-07 | 1 | 4 | 10 |
| 2015-11-30 | 2016-06-06 | 1 | 1 | 1 |
| 2009-01-01 | 2015-11-30 | 1 | 2 | 9 |
| 2001-01-12 | 2009-01-01 | 1 | 2 | 4 |
| 1996-06-06 | 2001-01-12 | 1 | 2 | 8 |
| 1995-05-29 | 1996-06-06 | 1 | 1 | 3 |
| 1998-05-23 | - | 2 | 2 | 2 |
| 2014-10-10 | - | 3 | 2 | 7 |
| 2003-12-12 | 2014-10-10 | 3 | 3 | 6 |
| 1991-01-12 | 2003-12-12 | 3 | 2 | 5 |
+------------+------------+-----------+----------+--------+
I've tried a lot with GROUP BY, GROUP_CONCAT, trying to access the next record's date, etc., but I can't figure out how to solve the problem.
Thanks in advance.
I finally found a solution, but it only covers past purchases, that is, rows that have a later purchase of the same item:
SELECT
main.id_doc, main.id_item, main.date AS "date_from", bi.date AS "date_to", main.id_buyer
FROM
MyTable main, MyTable bi
WHERE
bi.id_doc =
(
SELECT sub.id_doc
FROM MyTable sub
WHERE sub.id_item = main.id_item AND sub.date > main.date ORDER BY sub.date ASC LIMIT 1
);
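
The self-join above drops the most recent purchase of each item, because there is no later row to join to. One possible way to keep those rows is to compute date_to with a correlated scalar subquery in the select list, which yields NULL when no later purchase exists. This is a minimal sketch, run against an in-memory SQLite database from Python as a stand-in for MySQL; the table name purchases is an assumption, and dates are stored as ISO text so that string comparison matches date order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (id_doc INTEGER, date TEXT, "
    "id_item INTEGER, id_buyer INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [(11, "2016-06-07", 1, 4), (10, "2016-06-06", 1, 4),
     (1, "2015-11-30", 1, 1), (9, "2009-01-01", 1, 2),
     (4, "2001-01-12", 1, 2), (8, "1996-06-06", 1, 2),
     (3, "1995-05-29", 1, 1), (2, "1998-05-23", 2, 2),
     (7, "2014-10-10", 3, 2), (6, "2003-12-12", 3, 3),
     (5, "1991-01-12", 3, 2)],
)

# date_to is the earliest later purchase date of the same item,
# or NULL (None in Python) when the row is the current owner.
log = conn.execute(
    """
    SELECT main.date AS date_from,
           (SELECT MIN(sub.date)
              FROM purchases AS sub
             WHERE sub.id_item = main.id_item
               AND sub.date > main.date) AS date_to,
           main.id_item, main.id_buyer, main.id_doc
    FROM purchases AS main
    ORDER BY main.id_item, main.date DESC
    """
).fetchall()

for row in log:
    print(row)
```

The same SELECT should work unchanged in MySQL. On MySQL 8.0 or later, the LEAD() window function partitioned by id_item and ordered by date is an alternative way to read the next row's date.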

Picking out specific values from a group in MySQL

This seems like such a simple problem, but I can't find a good solution. I'm trying to select information from a slightly misformatted table. Basically, wherever sequence=0, the person_id should actually be a company_id. This company_id then applies to all the rows which have the same group_id.
Someone thought it was a good idea to format things this way instead of simply having a company_id column, but it makes trying to select by company very difficult. It would make my programming much easier to simply add this extra column, and fix the formatting.
I want to turn something like this:
+----------+------------+-----------+----------+
| group_id | date | person_id | sequence |
+----------+------------+-----------+----------+
| 1 | 2012-08-31 | 10 | 0 |
| 1 | 2012-08-31 | 11 | 1 |
| 1 | 2012-08-31 | 12 | 2 |
| 2 | 1999-04-16 | 10 | 0 |
| 2 | 1999-04-16 | 21 | 1 |
| 2 | 1999-04-16 | 22 | 2 |
| 2 | 1999-04-16 | 23 | 3 |
| 2 | 1999-04-16 | 24 | 4 |
| 3 | 2001-01-09 | 30 | 0 |
| 3 | 2001-01-09 | 31 | 1 |
| 3 | 2001-01-09 | 11 | 2 |
| 3 | 2001-01-09 | 12 | 3 |
+----------+------------+-----------+----------+
Into this:
+------------+----------+------------+-----------+----------+
| company_id | group_id | date | person_id | sequence |
+------------+----------+------------+-----------+----------+
| 10 | 1 | 2012-08-31 | 11 | 1 |
| 10 | 1 | 2012-08-31 | 12 | 2 |
| 10 | 2 | 1999-04-16 | 21 | 1 |
| 10 | 2 | 1999-04-16 | 22 | 2 |
| 10 | 2 | 1999-04-16 | 23 | 3 |
| 10 | 2 | 1999-04-16 | 24 | 4 |
| 30 | 3 | 2001-01-09 | 31 | 1 |
| 30 | 3 | 2001-01-09 | 11 | 2 |
| 30 | 3 | 2001-01-09 | 12 | 3 |
+------------+----------+------------+-----------+----------+
The only way I can think of to achieve this is with nested SELECT statements, which are very inefficient considering I have about 100M rows. It's a one-time fix though, so I don't mind letting it run overnight.
If you permanently want to change your table to include a company_id column then do this:
First alter the table and add the new column:
alter table your_table add company_id int;
Then update all rows, setting company_id to the person_id of the row with sequence = 0 in the same group:
UPDATE your_table a
JOIN your_table b ON a.group_id = b.group_id
SET a.company_id = b.person_id
WHERE b.sequence = 0;
And finally remove the rows with sequence = 0:
DELETE FROM your_table WHERE sequence = 0;
The end result will be:
| group_id | date | person_id | sequence | company_id |
|----------|------------|-----------|----------|------------|
| 1 | 2012-08-31 | 11 | 1 | 10 |
| 1 | 2012-08-31 | 12 | 2 | 10 |
| 2 | 1999-04-16 | 21 | 1 | 10 |
| 2 | 1999-04-16 | 22 | 2 | 10 |
| 2 | 1999-04-16 | 23 | 3 | 10 |
| 2 | 1999-04-16 | 24 | 4 | 10 |
| 3 | 2001-01-09 | 31 | 1 | 30 |
| 3 | 2001-01-09 | 11 | 2 | 30 |
| 3 | 2001-01-09 | 12 | 3 | 30 |
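
The three steps can be tried end to end on the sample data. This is a minimal sketch against an in-memory SQLite database from Python, used as a stand-in for MySQL. Since the UPDATE ... JOIN form above is MySQL syntax, the sketch swaps in a correlated subquery for the update step; the add-column and delete steps are the same statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (group_id INTEGER, date TEXT, "
    "person_id INTEGER, sequence INTEGER)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [(1, "2012-08-31", 10, 0), (1, "2012-08-31", 11, 1),
     (1, "2012-08-31", 12, 2), (2, "1999-04-16", 10, 0),
     (2, "1999-04-16", 21, 1), (2, "1999-04-16", 22, 2),
     (2, "1999-04-16", 23, 3), (2, "1999-04-16", 24, 4),
     (3, "2001-01-09", 30, 0), (3, "2001-01-09", 31, 1),
     (3, "2001-01-09", 11, 2), (3, "2001-01-09", 12, 3)],
)

# Step 1: add the new column.
conn.execute("ALTER TABLE your_table ADD COLUMN company_id INTEGER")

# Step 2: copy the person_id of each group's sequence = 0 row
# (correlated subquery instead of MySQL's UPDATE ... JOIN).
conn.execute(
    """
    UPDATE your_table
    SET company_id = (SELECT b.person_id
                        FROM your_table AS b
                       WHERE b.group_id = your_table.group_id
                         AND b.sequence = 0)
    """
)

# Step 3: remove the rows that only carried the company id.
conn.execute("DELETE FROM your_table WHERE sequence = 0")

result = conn.execute(
    "SELECT company_id, group_id, date, person_id, sequence "
    "FROM your_table ORDER BY group_id, sequence"
).fetchall()

for row in result:
    print(row)
```

On a table of about 100M rows, an index covering group_id and sequence would probably help the update find each group's sequence = 0 row, though that is worth measuring on a copy of the data first.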