Removing repeating data in mysql

So I have a join query that produces the results I want, but it also includes repeating data that I don't want.
Here are the commands:
SELECT cust_id, cust_fname, cust_lname, street_address, apt, city, state, zip, h_phone, m_phone, o_phone, cu_o.order_id, order_date, s_notes, donut_id, donut_name, donut_des, donut_cost, li.donut_qty
FROM customer cu
RIGHT JOIN cust_order cu_o
ON cu.cust_id = cu_o.co_cust_id
JOIN line_item li
ON li.li_order_id = cu_o.order_id
JOIN donut
ON li.li_donut_id = donut.donut_id
;
And this is the output

Although I believe this is not SQL's job, I learned in another thread (sadly I cannot remember which one) that it can indeed be done with user-defined variables. Here is a minimal and verifiable example:
EDIT
I realised that the original answer was missing the proper ordering of the result, so the statement is a bit more involved (you must order your result set before applying the variable trick).
drop table if exists t;
drop table if exists u;
create table t (id int, name varchar(10));
create table u (id int, tid int, val varchar(10));
insert t values (1, 'A'), (2, 'B'), (3, 'C');
insert u values (1, 1, 'x'), (2, 1, 'y'), (3, 2, 'z'), (4,2,'w'),(5,3,'q');
select x.name, x.val from (
  select o.id, case when o.name <> @test then o.name else '' end as name,
         @test:=o.name, o.val
  from (select t.id, t.name, u.val from u join t on u.tid = t.id order by t.name) o
  join (select @test:='') test
) x
order by x.id;
This is the output for this example:
+-----+----+
|name | val|
+-----+----+
|A | x |
| | y |
|B | z |
| | w |
|C | q |
+-----+----+

Then don't show it to the customer.
It is totally up to you what's gonna get out, isn't it?
I don't really see what's the problem there.
And that's the way relational databases work; if you dislike it THAT much, you may opt for other DB types/paradigms.
PS: Well, basically there are ways to have it the way you like it.
Say, run two queries instead of one.
Or use GROUP_CONCAT as an aggregate function (that is, along with GROUP BY) to stick those donuts together for each customer.
But these are ugly & unproductive ways.
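As a rough illustration of the "don't show it to the customer" route, repeated customer values can be blanked out while rendering, after the query has returned its rows in customer order. The sketch below is Python; the helper name `blank_repeats` and the sample rows are made up for illustration, and only a few of the question's columns are used.

```python
def blank_repeats(rows, key_columns):
    """Return copies of rows where key columns repeating the previous row are blanked."""
    result = []
    previous_key = None
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        shown = dict(row)  # copy, so the caller's rows are left untouched
        if key == previous_key:
            for column in key_columns:
                shown[column] = ""
        previous_key = key
        result.append(shown)
    return result


# Hypothetical rows, already ordered by customer, as the join would return them.
order_rows = [
    {"cust_id": 1, "cust_lname": "Smith", "donut_name": "Glazed", "donut_qty": 2},
    {"cust_id": 1, "cust_lname": "Smith", "donut_name": "Jelly", "donut_qty": 1},
    {"cust_id": 2, "cust_lname": "Jones", "donut_name": "Plain", "donut_qty": 6},
]

for row in blank_repeats(order_rows, ["cust_id", "cust_lname"]):
    print(row["cust_id"], row["cust_lname"], row["donut_name"], row["donut_qty"])
```

This only works if the rows arrive sorted by the key columns, so the query would need an ORDER BY on the customer.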

Related

Mysql: How to join a query to find results from another table

I have two tables:
TABLE A
Unique_id | id | price
1 | 1 | 10.50
2 | 3 | 14.70
3 | 1 | 12.44
TABLE B
Unique_id | Date | Category | Store | Cost
1 | 2022/03/12 | Shoes | A | 13.24
2 | 2022/04/15 | Hats | A | 15.24
3 | 2021/11/03 | Shoes | B | 22.31
4 | 2000/12/14 | Shoes | A | 15.33
I need to filter TABLE A on a known id to get the Unique_id and average price to join to Table B.
Using this information I need to know which stores this item was sold in.
I then need to create a results table displaying the stores and the number of days on which sales were recorded in each store, regardless of whether the sales are associated with the id, along with the average cost.
To put it more simply I can break down the task into 2 separate commands:
SELECT AVG(price)
FROM table_a
WHERE id = 1
GROUP BY unique_id;
SELECT store, COUNT(date), AVG(cost)
FROM table_b
WHERE category = 'Shoes'
GROUP BY store;
The unique_id should inform the join but when I join the tables it messes up my COUNT function and only counts the days in which the id is connected - not the total store sales days.
The results should look something like this:
Store | AVG price | COUNT days | AVG cost
A | 10.50 | 3 | 14.60
B | 12.44 | 1 | 22.31

It was hard to grasp what you wanted, but after some thinking and your clarification, it can be solved as the following code shows:
CREATE TABLE TableA
(`Unique_id` int, `id` int, `price` DECIMAL(10,2))
;
INSERT INTO TableA
(`Unique_id`, `id`, `price`)
VALUES
(1, 1, 10.50),
(2, 3, 14.70),
(3, 1, 12.44)
;
CREATE TABLE TableB
(`Unique_id` int, `Date` datetime, `Category` varchar(5), `Store` varchar(1), `Cost` DECIMAL(10,2))
;
INSERT INTO TableB
(`Unique_id`, `Date`, `Category`, `Store`, `Cost`)
VALUES
(1, '2022-03-12 01:00:00', 'Shoes', 'A', 13.24),
(2, '2022-04-15 02:00:00', 'Hats', 'A', 15.24),
(3, '2021-11-03 01:00:00', 'Shoes', 'B', 22.31),
(4, '2000-12-14 01:00:00', 'Shoes', 'A', 15.33)
;
SELECT
    B.`Store`
    , AVG(A.`price`) price
    , (SELECT COUNT(*) FROM TableB WHERE `Store` = B.`Store`) count_
    , (SELECT AVG(`cost`) FROM TableB WHERE `Store` = B.`Store`) price
FROM TableA A
JOIN TableB B ON A.`Unique_id` = B.`Unique_id`
WHERE B.`Category` = 'Shoes'
GROUP BY B.`Store`
;
Store | price | count_ | price
:---- | --------: | -----: | --------:
A | 10.500000 | 3 | 14.603333
B | 12.440000 | 1 | 22.310000

This should be the query you are after. Mainly you simply join the rows using an outer join, because not every table_b row has a match in table_a.
Then, the only hindrance is that you only want to consider shoes in your average price. For this to happen you use conditional aggregation (a CASE expression inside the aggregation function).
select
b.store,
avg(case when b.category = 'Shoes' then a.price end) as avg_shoe_price,
count(b.unique_id) as count_b_rows,
avg(b.cost) as avg_cost
from table_b b
left outer join table_a a on a.unique_id = b.unique_id
group by b.store
order by b.store;
I must admit, it took me ages to understand what you want and where these numbers come from. The main reason is that you have WHERE table_a.id = 1 in your query, but this must not be applied to get the result you are showing. Next time, please make sure that your description, queries and sample data match.
(I also think that names like table_a, table_b and unique_id don't help understanding. If table_a were called prices, table_b costs, and unique_id cost_id, I wouldn't have had to wonder how the tables are related (by id? by unique id?). Nor would I have had to check again and again which table holds the cost, which holds the price, and which is the outer joined one while looking at the problem and the requested result and writing my query.)
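To check the conditional-aggregation query against the sample data, here is a small sketch using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption; the lowercase table and column names follow the query above, and dates are stored as plain text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (unique_id INTEGER, id INTEGER, price REAL);
INSERT INTO table_a VALUES (1, 1, 10.50), (2, 3, 14.70), (3, 1, 12.44);

CREATE TABLE table_b (unique_id INTEGER, date TEXT, category TEXT,
                      store TEXT, cost REAL);
INSERT INTO table_b VALUES
    (1, '2022-03-12', 'Shoes', 'A', 13.24),
    (2, '2022-04-15', 'Hats',  'A', 15.24),
    (3, '2021-11-03', 'Shoes', 'B', 22.31),
    (4, '2000-12-14', 'Shoes', 'A', 15.33);
""")

# The CASE expression yields NULL for non-shoe rows, and AVG ignores NULLs,
# while COUNT and AVG(cost) still see every table_b row of the store.
store_rows = conn.execute("""
    SELECT b.store,
           AVG(CASE WHEN b.category = 'Shoes' THEN a.price END) AS avg_shoe_price,
           COUNT(b.unique_id) AS count_b_rows,
           AVG(b.cost) AS avg_cost
    FROM table_b b
    LEFT OUTER JOIN table_a a ON a.unique_id = b.unique_id
    GROUP BY b.store
    ORDER BY b.store
""").fetchall()

for store, avg_price, day_count, avg_cost in store_rows:
    print(store, round(avg_price, 2), day_count, round(avg_cost, 2))
```

Store A keeps its count of 3 because the Hats row and the unmatched Shoes row still take part in COUNT and AVG(cost); only the price average is restricted to shoes.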

mysql self join with group_concat and without duplicates

I would like to get rid of duplicates in my DB. There can be several duplicates of one criterion, which are then grouped together.
Let's say B is duplicate of A, and C is also duplicate of A then there should be a result like
*id* | *duplicate*
A | B, C
But now the result is like:
*id* | *duplicate*
A | B, C
B | C
Which is correct, of course. The problem is that ids which already appear as duplicates in the results should not be listed again in the id column with their own duplicates.
Here is an example: http://sqlfiddle.com/#!9/61692/1/0
Any suggestions?
Thanks,
Paul
Edit:
And here the source of the example (as recommended by Zohar Peled):
CREATE TABLE duplicates
(`id` int, `Name` varchar(7))
;
INSERT INTO duplicates
(`id`, `Name`)
VALUES
(1, 'Bob'),
(2, 'Bob'),
(3, 'Bob'),
(4, 'Alice')
;
SELECT DISTINCT d1.`id`, GROUP_CONCAT(d2.`id`) as duplicates
FROM `duplicates` as d1, `duplicates` as d2
WHERE
d1.`id`< d2.`id` AND
d1.`Name` = d2.`Name`
GROUP BY d1.`id`

This is a rather unorthodox solution, but hey...
SELECT MIN(x.id) id
     , GROUP_CONCAT(DISTINCT y.id) duplicates
  FROM duplicates x
  JOIN duplicates y
    ON y.name = x.name
   AND y.id > x.id
 GROUP
    BY x.name;
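A quick way to try this query on the question's sample data is sketched below with Python's built-in sqlite3 module, used here as a stand-in for MySQL (an assumption). SQLite also provides GROUP_CONCAT(DISTINCT ...), but it does not promise any order for the concatenated ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE duplicates (id INTEGER, name TEXT);
INSERT INTO duplicates VALUES (1, 'Bob'), (2, 'Bob'), (3, 'Bob'), (4, 'Alice');
""")

# Grouping by name collapses all pairs of one name into a single row;
# MIN(x.id) picks the lowest id as the "original".
duplicate_rows = conn.execute("""
    SELECT MIN(x.id) AS id,
           GROUP_CONCAT(DISTINCT y.id) AS duplicates
    FROM duplicates x
    JOIN duplicates y
      ON y.name = x.name
     AND y.id > x.id
    GROUP BY x.name
""").fetchall()

for original_id, duplicate_ids in duplicate_rows:
    print(original_id, sorted(duplicate_ids.split(",")))
```

Alice does not appear in the result because the inner join only keeps names that have at least one higher id.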

Convert columns into rows with inner join in mysql

Please take a look at this fiddle.
I'm working on a search filter select box and I want to insert the field names of a table as rows.
Here's the table schema:
CREATE TABLE general
(`ID` int, `letter` varchar(21), `double-letters` varchar(21))
;
INSERT INTO general
(`ID`,`letter`,`double-letters`)
VALUES
(1, 'A','BB'),
(2, 'A','CC'),
(3, 'C','BB'),
(4, 'D','DD'),
(5, 'D','EE'),
(6, 'F','TT'),
(7, 'G','UU'),
(8, 'G','ZZ'),
(9, 'I','UU')
;
CREATE TABLE options
(`ID` int, `options` varchar(15))
;
INSERT INTO options
(`ID`,`options`)
VALUES
(1, 'letter'),
(2, 'double-letters')
;
The ID field in options table acts as a foreign key, and I want to get an output like the following and insert into a new table:
id field value
1 1 A
2 1 C
3 1 D
4 1 F
5 1 G
6 1 I
7 2 BB
8 2 CC
9 2 DD
10 2 EE
11 2 TT
12 2 UU
13 2 ZZ
My failed attempt:
select DISTINCT(a.letter),'letter' AS field
from general a
INNER JOIN
options b ON b.options = field
union all
select DISTINCT(a.double-letters), 'double-letters' AS field
from general a
INNER JOIN
options b ON b.options = field

Pretty sure you want this:
select distinct a.letter, 'letter' AS field
from general a
cross JOIN options b
where b.options = 'letter'
union all
select distinct a.`double-letters`, 'double-letters' AS field
from general a
cross JOIN options b
where b.options = 'double-letters'
Fiddle: http://sqlfiddle.com/#!2/bbf0b/18/0
A couple of things to point out: you can't join on a column alias. Because the column you're aliasing is a literal that you're selecting, you can specify that literal as the criterion in the WHERE clause.
You're not really joining on anything between GENERAL and OPTIONS, so what you really want is a CROSS JOIN; the criteria that you're putting into the ON clause actually belong in the WHERE clause.
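The CROSS JOIN approach can be tried with the sketch below, which uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). As a small variation on the query above, it selects the option's ID as the field so that the rows match the layout asked for in the question, and it generates the running id in Python with enumerate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE general (id INTEGER, letter TEXT, "double-letters" TEXT);
INSERT INTO general VALUES
    (1, 'A', 'BB'), (2, 'A', 'CC'), (3, 'C', 'BB'),
    (4, 'D', 'DD'), (5, 'D', 'EE'), (6, 'F', 'TT'),
    (7, 'G', 'UU'), (8, 'G', 'ZZ'), (9, 'I', 'UU');

CREATE TABLE options (id INTEGER, options TEXT);
INSERT INTO options VALUES (1, 'letter'), (2, 'double-letters');
""")

# Each half pairs every general row with exactly one options row (picked in
# WHERE), then DISTINCT removes the repeated values.
field_rows = conn.execute("""
    SELECT DISTINCT b.id AS field, a.letter AS value
    FROM general a
    CROSS JOIN options b
    WHERE b.options = 'letter'
    UNION ALL
    SELECT DISTINCT b.id AS field, a."double-letters" AS value
    FROM general a
    CROSS JOIN options b
    WHERE b.options = 'double-letters'
    ORDER BY field, value
""").fetchall()

# Running id for the target table, starting at 1.
numbered_rows = [(n, field, value) for n, (field, value) in enumerate(field_rows, 1)]
for numbered_row in numbered_rows:
    print(*numbered_row)
```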

I just made this query on Oracle.
It works and produces the output you described :
SELECT ID, CASE WHEN LENGTH(VALUE)=2 THEN 2 ELSE 1 END AS FIELD, VALUE
FROM (
  SELECT rownum AS ID, letter AS VALUE FROM (SELECT DISTINCT letter FROM general ORDER BY letter)
  UNION
  SELECT (SELECT COUNT(DISTINCT LETTER) FROM general) + rownum AS ID, double_letters AS VALUE
  FROM (
    SELECT DISTINCT double_letters FROM general ORDER BY double_letters)
)
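It should also run on MySQL once the Oracle-specific ROWNUM is replaced (for example with ROW_NUMBER() in MySQL 8.0 or later) and each derived table is given an alias.
I did not use the options table. I do not understand its role, and for this example and this type of output it seems unnecessary.
Hope this helps you too.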

SQL Query Two Tables at once and Loop Result as JSON objects

I'm not exactly sure what I am looking for here, so apologies if this has already been covered here, I'm not sure what I need to search for!
I have a MySQL database with a table called "Locations" which looks a bit like this
id | name | other parameters
1 | shop1 | blah
2 | shop2 | blah
etc
and a table of customer queries
id | customer | department
1 | john | shop2
2 | Joe | shop2
3 | James | shop1
4 | Sue | shop2
etc
I want to query this and return a JSON object that looks like this
{"location":"shop1","queryCount":"1"},{"location":"shop2","queryCount":"3"}
The location table can be added to over time, and obviously the customer queries will be too, so both need dynamic queries.
I tried this by getting a list of locations by a simple SELECT name from locations query, turning it into an array and then looping through that as follows:
For i = UBound(listofLocations) To 0 Step -1
locations.id = listofLocations(i)
locations.queryCount= RESULT OF: "SELECT COUNT(id) as recordCount from queries WHERE department=listofLocations(i)"
objectArray.Add(locations)
Next
This works, but it is inefficient calling the database through the loop, how do I avoid this?
Thanks

The inefficiency comes from running a separate query for every location inside the loop.
Use a LEFT JOIN instead, and you need only a single query.
Here is the SQL:
select l.name, count(c.id) as recordCount
from Locations as l
left join customer as c
on l.name = c.department
group by l.id;
Also, your schema is not well optimized.
Your schema for customer should be
id, customer, location_id <-- storing the name is redundant;
<-- reference the location by its ID instead

First, I must recommend that your "department" field be changed. If you change the spelling of a name in the locations table, your relationships break.
Instead, use the id from the location table, and set up a foreign key reference.
But that's an aside. Using your current structure, this is what I'd do in SQL...
SELECT
  locations.name,
  COUNT(queries.id) AS count_of_queries
FROM
  locations
LEFT JOIN
  queries
    ON queries.department = locations.name
GROUP BY
  locations.name
Using a LEFT JOIN ensures that you get EVERY location, even if there is no query for it.
Using COUNT(queries.id) instead of COUNT(*) gives 0 if there are no associated records in the queries table.
You can then loop through the result-set of one query, rather than looping multiple queries, and build your JSON string.
It is possible to build the string in SQL, but that's generally considered bad practice. Much better to keep your SQL about data, and your PHP/whatever about processing and presenting it.
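The single-query approach, with the JSON built in application code, might look like the sketch below. It uses Python with the built-in sqlite3 and json modules in place of the asker's own stack (an assumption), and adds a hypothetical shop3 without any queries to show the zero count that the LEFT JOIN produces.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE locations (id INTEGER PRIMARY KEY, name TEXT, other TEXT);
CREATE TABLE queries (id INTEGER PRIMARY KEY, customer TEXT, department TEXT);

-- shop3 is a hypothetical extra location with no queries.
INSERT INTO locations VALUES (1, 'shop1', 'blah'), (2, 'shop2', 'blah'),
                             (3, 'shop3', 'blah');
INSERT INTO queries VALUES (1, 'john', 'shop2'), (2, 'Joe', 'shop2'),
                           (3, 'James', 'shop1'), (4, 'Sue', 'shop2');
""")

# One round trip: every location with its number of matching queries.
count_rows = conn.execute("""
    SELECT locations.name, COUNT(queries.id) AS count_of_queries
    FROM locations
    LEFT JOIN queries ON queries.department = locations.name
    GROUP BY locations.name
    ORDER BY locations.name
""").fetchall()

# Build the JSON in application code rather than in SQL.
location_counts = [{"location": name, "queryCount": count}
                   for name, count in count_rows]
payload = json.dumps(location_counts)
print(payload)
```

Unlike the JSON in the question, the counts here are numbers rather than strings; wrapping them in str() would reproduce the quoted form.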

Try this:
SELECT T1.name, (SELECT COUNT(*) FROM queries AS T2 WHERE T2.department = T1.name) AS number FROM locations AS T1;
I agree that your schema is not ideal; you should reference the department by id, not by name.

I think you're just looking for "GROUP BY"
SELECT
department_id as location,
COUNT(id) as querycount
FROM
(table of customer queries)
GROUP BY
department_id
As for the database structure... If possible, I would change the tables a bit:
Location:
id | name | other parameters
Customer:
id | name | whatever
table_of_customer_queries:
id | customer_id | location_id
1 | 2 | 2
2 | 4 | 2
3 | 2 | 1
4 | 1 | 2
GROUP BY will only give you results for those departments that have queries. If you want all departments, the LEFT JOIN option mentioned earlier is the way to go.

First off, if your Locations table really stores Department information, rename it as such. Then, change Customer.department to be a fk reference to the Locations.id column (and renamed to department_id), not the name (so much less to get wrong). This may or may not give you a performance boost; however, keep in mind that the data in the database isn't really meant to be human readable - it's meant to be program readable.
In either case, the query can be written as so:
SELECT a.name, (SELECT COUNT(b.id)
FROM Customer as b
WHERE b.department_id = a.id) as count
FROM Location as a

If you don't change your schema, your answer is to do a group by clause that counts the number of queries to each location.
Creating tables like yours:
create table #locations (id integer, name varchar(20), other text)
create table #queries (id integer, customer varchar(20), department varchar(20))
insert into #locations (id, name, other) values (1, 'shop1', 'blah')
insert into #locations (id, name, other) values (2, 'shop2', 'blah')
insert into #queries (id, customer, department) values (1, 'john', 'shop2')
insert into #queries (id, customer, department) values (2, 'Joe', 'shop2')
insert into #queries (id, customer, department) values (3, 'James', 'shop1')
insert into #queries (id, customer, department) values (4, 'Sue', 'shop2')
Querying your data:
select
l.name as location,
count(q.id) as queryCount
from #locations as l
left join #queries as q on q.department = l.name
group by l.name
order by l.name
Results:
location queryCount
-------------------- -----------
shop1 1
shop2 3

Finding a users maximum score and the associated details

I have a table in which users store scores and other information about each score (for example notes on the score, time taken, etc.). I want a MySQL query that finds each user's personal best score and its associated notes, time, etc.
What I have tried to use is something like this:
SELECT *, MAX(score) FROM table GROUP BY (user)
The problem with this is that whilst you can extract the user's personal best from that query [MAX(score)], the returned notes, times, etc. are not associated with the maximum score but with a different score (specifically the one contained in *). Is there a way I can write a query that selects what I want? Or will I have to do it manually in PHP?

I'm assuming that you only want one result per player, even if they have scored the same maximum score more than once. I am also assuming that you want each player's first time that they got their personal best in the case that there are repeats.
There are a few ways of doing this. Here's a way that is MySQL specific:
SELECT user, scoredate, score, notes FROM (
    SELECT *, @prev <> user AS is_best, @prev := user
    FROM table1, (SELECT @prev := -1) AS vars
    ORDER BY user, score DESC, scoredate
) AS T1
WHERE is_best
Here's a more general way that uses ordinary SQL:
SELECT T3.* FROM table1 AS T3
JOIN (
    SELECT T1.user, T1.score, MIN(scoredate) AS scoredate
    FROM table1 AS T1
    JOIN (SELECT user, MAX(score) AS score FROM table1 GROUP BY user) AS T2
    ON T1.user = T2.user AND T1.score = T2.score
    GROUP BY T1.user, T1.score
) AS T4
ON T3.user = T4.user AND T3.score = T4.score AND T3.scoredate = T4.scoredate
Result:
1, '2010-01-01 17:00:00', 50, 'Much better'
2, '2010-01-01 14:00:00', 100, 'Perfect score'
Test data I used to test this:
CREATE TABLE table1 (user INT NOT NULL, scoredate DATETIME NOT NULL, score INT NOT NULL, notes NVARCHAR(100) NOT NULL);
INSERT INTO table1 (user, scoredate, score, notes) VALUES
(1, '2010-01-01 12:00:00', 10, 'First attempt'),
(1, '2010-01-01 17:00:00', 50, 'Much better'),
(1, '2010-01-01 22:00:00', 30, 'Time for bed'),
(2, '2010-01-01 14:00:00', 100, 'Perfect score'),
(2, '2010-01-01 16:00:00', 100, 'This is too easy');

You can join with a sub query, as in the following example:
SELECT t.*,
sub_t.max_score
FROM table t
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The above query can be explained as follows. It starts with:
SELECT t.* FROM table t;
... This by itself will obviously list all the contents of the table. The goal is to keep only the rows that represent a maximum score of a particular user. Therefore if we had the data below:
+------------------------+
| user | score | notes |
+------+-------+---------+
| 1 | 10 | note a |
| 1 | 15 | note b |
| 1 | 20 | note c |
| 2 | 8 | note d |
| 2 | 12 | note e |
| 2 | 5 | note f |
+------+-------+---------+
...We would have wanted to keep just the "note c" and "note e" rows.
To find the rows that we want to keep, we can simply use:
SELECT MAX(score), user FROM table GROUP BY user;
Note that we cannot get the notes attribute from the above query, because as you had already noticed, you would not get the expected results for fields not aggregated with an aggregate function, like MAX() or not part of the GROUP BY clause. For further reading on this topic, you may want to check:
Debunking GROUP BY Myths
How does MySQL decide which id to return in group by clause?
Why does MySql allow “group by” queries WITHOUT aggregate functions?
Now we only need to keep the rows from the first query that match the second query. We can do this with an INNER JOIN:
...
JOIN (SELECT MAX(score) as max_score,
user
FROM table
GROUP BY user) sub_t ON (sub_t.user = t.user AND
sub_t.max_score = t.score);
The sub query is given the name sub_t. It is the set of all the users with the personal best score. The ON clause of the JOIN applies the restriction to the relevant fields. Remember that we only want to keep rows that are part of this subquery.
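The join with the sub query can be checked against the six sample rows above with the sketch below, which uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption). The table is named scores here, a made-up name, because table is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scores (user INTEGER, score INTEGER, notes TEXT);
INSERT INTO scores VALUES
    (1, 10, 'note a'), (1, 15, 'note b'), (1, 20, 'note c'),
    (2, 8,  'note d'), (2, 12, 'note e'), (2, 5,  'note f');
""")

# sub_t holds one (user, max_score) pair per user; the join keeps only the
# rows of scores that match such a pair.
best_rows = conn.execute("""
    SELECT t.user, t.score, t.notes
    FROM scores t
    JOIN (SELECT MAX(score) AS max_score, user
          FROM scores
          GROUP BY user) sub_t
      ON sub_t.user = t.user
     AND sub_t.max_score = t.score
    ORDER BY t.user
""").fetchall()

for user, score, notes in best_rows:
    print(user, score, notes)
```

If a user reaches the same maximum score more than once, this join returns every tied row, so a tie-breaker such as the earliest date would be needed to get exactly one row per user.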

SELECT *
FROM table t
ORDER BY t.score DESC
GROUP BY t.user
LIMIT 1
Side note: It is better to specify the fields than use SELECT *
