MySQL 5.6
We have two tables: cars and views.
Cars
---+-------
id | desc
---+-------
1 | desc1
2 | Desc1
3 | desc2
Views
---+-------
id | car_id
---+-------
1 | 1
2 | 2
3 | 3
The problem is with the desc field in the table cars. That column was supposed to be unique, but we unfortunately allowed users to enter uppercase values, which left us (as in the example above) with two duplicate rows: desc1 and Desc1.
The way to fix that is to DELETE the duplicate cars and keep only the first one. We know how to do that.
Our problem comes before that, when updating the related table: some views are associated with a car that has a duplicated desc (that is, a car which is going to be removed). Those views should be reassigned to the first of the duplicated cars (in this case, car id #1).
After the UPDATE, we'd like this result in views:
Views
---+-------
id | car_id
1 | 1
2 | 1
3 | 3
We are able to get all the ids of the duplicated cars and deal with the deletion but we're stuck with this UPDATE.
The solution will be to create a mapping table with before/after values for description ids.
The result should look something like:
Before | After
---------------
1 | 1
2 | 1
3 | 3
That table can be created with something like this:
SELECT
    cars.id AS before_id,
    fixed.lowest_id AS after_id
FROM cars
JOIN (
    -- The lowest id value for each duplicate description.
    -- desc is a reserved word in MySQL, so it must be quoted with backticks.
    SELECT
        MIN(id) AS lowest_id,
        LOWER(`desc`) AS lower_desc
    FROM cars
    GROUP BY LOWER(`desc`)
) fixed
    ON LOWER(cars.`desc`) = fixed.lower_desc
You can then have your views match to that mapping table to pull the new "correct" id value.
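UPDATE Views AS v
JOIN (SELECT c1.id AS oldID, MIN(c2.id) AS newID
      FROM Cars AS c1
      JOIN Cars AS c2 ON LOWER(c1.`desc`) = LOWER(c2.`desc`)
      -- one row per car; without GROUP BY, MIN() would collapse everything into a single row
      GROUP BY c1.id
      HAVING oldID != newID) AS c
  ON v.car_id = c.oldID
SET v.car_id = c.newID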
The subquery finds the primary ID for each ID that contains a duplicate description. Joining this with the Views table provides the information needed to make the replacements.
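The mapping and the reassignment can be tried out on the sample data. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere), and because SQLite has no UPDATE ... JOIN syntax, the update is written as a correlated subquery instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cars  (id INTEGER PRIMARY KEY, "desc" TEXT);
    CREATE TABLE views (id INTEGER PRIMARY KEY, car_id INTEGER);
    INSERT INTO cars  VALUES (1, 'desc1'), (2, 'Desc1'), (3, 'desc2');
    INSERT INTO views VALUES (1, 1), (2, 2), (3, 3);
""")

# Map every car id to the lowest id that shares its case-insensitive description.
mapping = conn.execute("""
    SELECT c.id AS before_id, f.lowest_id AS after_id
    FROM cars AS c
    JOIN (SELECT MIN(id) AS lowest_id, LOWER("desc") AS lower_desc
          FROM cars
          GROUP BY LOWER("desc")) AS f
      ON LOWER(c."desc") = f.lower_desc
    ORDER BY c.id
""").fetchall()

# Point each view at the surviving car (SQLite stand-in for the UPDATE ... JOIN).
conn.execute("""
    UPDATE views
    SET car_id = (SELECT MIN(c2.id)
                  FROM cars AS c1
                  JOIN cars AS c2 ON LOWER(c1."desc") = LOWER(c2."desc")
                  WHERE c1.id = views.car_id)
    WHERE car_id IN (SELECT id FROM cars)
""")
views_after = conn.execute("SELECT id, car_id FROM views ORDER BY id").fetchall()

print(mapping)      # [(1, 1), (2, 1), (3, 3)]
print(views_after)  # [(1, 1), (2, 1), (3, 3)]
```

Once the views have been reassigned, the duplicate cars no longer have any views pointing at them and can be deleted.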
I have a table like this:
client_id | Product1 | Product2 | ... | Product170
--------------------------------------------------
4 | Null | 4 | ... | 5
32 | 5 | 3 | ... | Null
22 | 4 | 1 | ... | 3
I want to have the totals for each of my Products. I want a view, or something similar, like this:
product_id | Total
--------------------------------------------------
Product1 | 9
Product2 | 8
...
Preferably leaving out Products that have a sum of 0.
Am I able to do this? I have many columns so I would rather not have a SELECT statement calling each individual column by name.
(Context: This table holds orders for a business. A client will order some products and it is stored here. If you have a better way to organize this info, please let me know).
Am I able to do this?
Yes, by writing an extremely long query with UNION.
Example:
SELECT 'Product1' as product_id, SUM(Product1) as Total
FROM table
UNION
SELECT 'Product2' as product_id, SUM(Product2) as Total
FROM table
UNION
...
Obviously this is not practical, so...
If you have a better way to organize this info, please let me know
A better way to organize this info would be to normalize it using a products table (with a unique id) and a junction table (e.g. client_products). The junction table contains 3 columns: client_id, product_id and n (the quantity ordered, or whatever your number represents). Its primary key is (client_id, product_id), and you should also add an index on product_id.
You can very easily query this model with SELECT product_id, SUM(n) FROM client_products GROUP BY product_id.
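As a rough illustration of the normalized layout, here is a sketch using Python's built-in sqlite3 module in place of MySQL (an assumption made so it can run as-is). The index name and the numeric product ids 1 and 2, standing for Product1 and Product2, are made up for the example; only the non-NULL cells of those two columns from the sample table are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_products (
        client_id  INTEGER NOT NULL,
        product_id INTEGER NOT NULL,
        n          INTEGER NOT NULL,
        PRIMARY KEY (client_id, product_id)
    );
    CREATE INDEX idx_client_products_product ON client_products (product_id);

    -- A NULL cell in the wide table is simply a missing row here.
    INSERT INTO client_products VALUES
        (4, 2, 4),
        (32, 1, 5), (32, 2, 3),
        (22, 1, 4), (22, 2, 1);
""")

# One row per product; HAVING drops products whose total is 0.
totals = conn.execute("""
    SELECT product_id, SUM(n) AS total
    FROM client_products
    GROUP BY product_id
    HAVING SUM(n) > 0
    ORDER BY product_id
""").fetchall()

print(totals)  # [(1, 9), (2, 8)]
```

Adding a 171st product then means inserting rows, not altering the table or rewriting the query.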
You can write the query like
SELECT sum(Product1) as Product1, sum(Product2) as Product2 FROM `product`
It will give you the total of each product, but in a single row with the product names as columns. Note that a > 0 condition in the WHERE clause filters individual rows before they are summed; to filter on the totals themselves you would need HAVING.
I have 2 tables. Table A contains a list of people who booked an event; table B has a list of the people each booker from table A brings along. Both tables have many columns with unique data that I need for certain calculations in PHP, and at the moment I get it by querying the tables from a recursive PHP function. I want to simplify the PHP and reduce the number of queries coming from this recursive function by writing better MySQL queries, but I'm kind of stuck.
Because the table has way too many columns, I will give an excerpt of table A instead:
booking_id | A_customer | A_insurance
1 | 134 | 4
Excerpt of table B:
id | booking_id | B_insurance
1 | 1 | 0
2 | 1 | 1
3 | 1 | 1
4 | 1 | 3
The booking_id in table A is unique and set to auto-increment; the booking_id in table B can occur many times (depending on how many guests the client from table A brings along). Let's say I want to know every selected insurance for customer 134 and his guests; then I want output like this:
booking_id | insurance
1 | 4
1 | 0
1 | 1
1 | 1
1 | 3
I have tried a couple of joins, and this is the closest I've come so far. Unfortunately, it fails to show the row from A and only shows the matching rows in B.
SELECT a.booking_id,a.A_customer,a.A_insurance,b.booking_id,b.insurance FROM b INNER JOIN a ON (b.booking_id = a.booking_id) WHERE a.booking_id = 134
Can someone point me in the right direction?
Please note: I have altered the table and column names for stackoverflow so it's easy for you guys to read, so it's possible that there is a typo that would break the query in it right now.
I think you need a union all for this:
select a.booking_id, a.A_insurance as insurance
from a
where a.A_customer = 134
union all
select b.booking_id, b.B_insurance
from a join
     b
     on a.booking_id = b.booking_id
where a.A_customer = 134;
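The simplest way I can think of to achieve this is to use a UNION ALL (a plain UNION would remove duplicate rows, so the two guests who both chose insurance 1 would collapse into one). Since 134 is the customer and not the booking, the bookings are looked up through A_customer:
SELECT booking_id, A_insurance insurance
FROM A
WHERE A_customer = 134
UNION ALL
SELECT booking_id, B_insurance insurance
FROM B
WHERE booking_id IN (SELECT booking_id FROM A WHERE A_customer = 134)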
If my understanding of your issue is right, this should give you the result you need:
SELECT a.booking_id, a.A_insurance AS insurance FROM a WHERE a.A_customer = 134
UNION ALL
SELECT a.booking_id, b.B_insurance FROM b INNER JOIN a ON (b.booking_id = a.booking_id) WHERE a.A_customer = 134
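The difference between UNION and UNION ALL matters for this data, because two guests selected the same insurance. The sketch below runs both variants against the sample rows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption for the sake of a runnable example); the column names follow the excerpts in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (booking_id INTEGER PRIMARY KEY, A_customer INTEGER, A_insurance INTEGER);
    CREATE TABLE b (id INTEGER PRIMARY KEY, booking_id INTEGER, B_insurance INTEGER);
    INSERT INTO a VALUES (1, 134, 4);
    INSERT INTO b VALUES (1, 1, 0), (2, 1, 1), (3, 1, 1), (4, 1, 3);
""")

# The same query is run twice; only the set operator changes.
query = """
    SELECT a.booking_id, a.A_insurance AS insurance
    FROM a
    WHERE a.A_customer = 134
    {op}
    SELECT b.booking_id, b.B_insurance
    FROM a JOIN b ON a.booking_id = b.booking_id
    WHERE a.A_customer = 134
"""

rows_all = conn.execute(query.format(op="UNION ALL")).fetchall()
rows_distinct = conn.execute(query.format(op="UNION")).fetchall()

print(len(rows_all))       # 5: the booker plus all four guests
print(len(rows_distinct))  # 4: the two (1, 1) guest rows were merged
```

Neither variant guarantees an order, so add an ORDER BY if the booker's row has to come first.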
I have a join table named languages_services that basically joins the table services and languages.
I need to find a service that is able to serve both ENGLISH (language_id=1) and ESPANOL (language_id=2).
table languages_services
------------------------
service_id | language_id
------------------------
1 | 1
1 | 2
1 | 3
2 | 1
2 | 3
With the data provided above, I want to test for language_id=1 AND language_id=2 where the result would look like this
QUERY RESULT
------------
service_id
------------
1
Obviously it doesn't return the one with service_id=2 because it doesn't service Espanol.
Any tips on this is greatly appreciated!
SELECT
    service_id
FROM
    languages_services
WHERE
    language_id = 1
    OR language_id = 2
GROUP BY
    service_id
HAVING
    COUNT(*) = 2
Or...
WHERE
    language_id IN (1,2)
GROUP BY
    service_id
HAVING
    COUNT(*) = 2
If you're always looking at 2 languages you could do it with joins, but the aggregate version is easier to adapt to differing numbers of language_ids. (Add an OR, or add an item to the IN list, and change the COUNT(*) = 2 to COUNT(*) = 3, etc, etc).
Be aware, however, that this scales very poorly. And with this table structure there isn't much you can do about that.
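To show how the aggregate version adapts to a different number of languages, here is a small sketch that builds the IN list and the expected count from a Python list. It uses the built-in sqlite3 module rather than MySQL, and the helper name services_speaking is made up for this example. It counts DISTINCT language ids, an extra precaution not in the query above, so that an accidental duplicate row in the join table cannot produce a false match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE languages_services (service_id INTEGER, language_id INTEGER);
    INSERT INTO languages_services VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3);
""")

def services_speaking(conn, language_ids):
    """Return the service ids that offer every language in language_ids."""
    wanted = sorted(set(language_ids))
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT service_id
        FROM languages_services
        WHERE language_id IN ({placeholders})
        GROUP BY service_id
        HAVING COUNT(DISTINCT language_id) = ?
        ORDER BY service_id
    """
    # The last parameter is the number of languages that must all be present.
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]

print(services_speaking(conn, [1, 2]))     # [1]
print(services_speaking(conn, [1, 3]))     # [1, 2]
print(services_speaking(conn, [1, 2, 3]))  # [1]
```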
EDIT Example using a join for 2 languages
SELECT
    lang1.service_id
FROM
    languages_services AS lang1
INNER JOIN
    languages_services AS lang2
        ON lang1.service_id = lang2.service_id
WHERE
    lang1.language_id = 1
    AND lang2.language_id = 2
For simplicity, I will give a quick example of what I am trying to achieve:
Table 1 - Members
ID | Name
--------------------
1 | John
2 | Mike
3 | Sam
Table 2 - Member_Selections
ID | planID
--------------------
1 | 1
1 | 2
1 | 1
2 | 2
2 | 3
3 | 2
3 | 1
Table 3 - Selection_Details
planID | Cost
--------------------
1 | 5
2 | 10
3 | 12
When I run my query, I want to return the sum of all member selections, grouped by member. The issue I face, however (see the table 2 data), is that some members may have duplicate information in the system by mistake. While we do our best to filter this data up front, sometimes it slips through the cracks, so when I make the necessary calls to the system to pull information, I also want to filter out these duplicates.
The results SHOULD show:
Results Table
ID | Name | Total_Cost
-----------------------------
1 | John | 15
2 | Mike | 22
3 | Sam | 15
but instead they show John at $20, because he has plan ID #1 inserted twice by mistake.
My query is currently:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
Adding DISTINCT s.planID filters the results incorrectly as it will only show a single PlanID 1 sold (even though members 1 and 3 bought it).
Any help is appreciated.
EDIT
There is also another table I forgot to mention, which is the agent table (the agent who sold the plans to members).
The final GROUP BY groups ALL items sold by the agent ID (which turns the final results into a single row).
Perhaps the simplest solution is to put a unique composite key on the member_selections table:
alter table member_selections add unique key ms_key (ID, planID);
which would prevent any record from being added when its ID/planID combination already exists elsewhere in the table. That would allow only a single (1,1) row.
comment followup:
Just saw your comment about the 'alter ignore...'. That'd work fine, but you'd still be left with the bad duplicates in the table. I'd suggest doing the unique key, then manually cleaning up the table. The query I put in the comments should find all the duplicates for you, which you can then weed out by hand. Once the table's clean, there'll be no need for the duplicate-handling version of the query.
Use UNIQUE keys to prevent accidental duplicate entries. This will eliminate the problem at the source, instead of when it starts to show symptoms. It also makes later queries easier, because you can count on having a consistent database.
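A possible clean-up sequence is sketched below: find the duplicated pairs, delete the extra copies, add the unique key, and confirm that a repeat insert is then rejected. This is an illustration only; it uses Python's built-in sqlite3 module instead of MySQL, relies on SQLite's implicit rowid to tell the copies apart, and the duplicate-finding query is an assumed stand-in for the one mentioned in the comments, which is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE member_selections (ID INTEGER, planID INTEGER);
    INSERT INTO member_selections VALUES
        (1, 1), (1, 2), (1, 1), (2, 2), (2, 3), (3, 2), (3, 1);
""")

# 1. List every (ID, planID) pair that occurs more than once.
duplicates = conn.execute("""
    SELECT ID, planID, COUNT(*) AS copies
    FROM member_selections
    GROUP BY ID, planID
    HAVING COUNT(*) > 1
""").fetchall()

# 2. Keep the first copy of each pair and delete the rest.
conn.execute("""
    DELETE FROM member_selections
    WHERE rowid NOT IN (SELECT MIN(rowid)
                        FROM member_selections
                        GROUP BY ID, planID)
""")

# 3. With the table clean, the unique key can be created.
conn.execute("CREATE UNIQUE INDEX ms_key ON member_selections (ID, planID)")

# 4. A repeated pair is now refused by the database itself.
try:
    conn.execute("INSERT INTO member_selections VALUES (1, 1)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

remaining = conn.execute("SELECT COUNT(*) FROM member_selections").fetchone()[0]
print(duplicates, rejected, remaining)  # [(1, 1, 2)] True 6
```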
What about:
SELECT
    sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
    SELECT
        m.id, m.name, g.premium
    FROM members m
    INNER JOIN
        -- collapse repeated (ID, PlanID) rows before joining
        (select distinct ID, PlanID from member_selections) s
        USING(ID)
    INNER JOIN selection_details g USING(planid)
) sq group by sq.ID, sq.name
By the way, is there a reason you don't have a primary key on member_selections that will prevent these duplicates from happening in the first place?
You can add a GROUP BY clause to the inner query that groups by the member, the plan, and the cost, so that only unique rows are returned. Grouping by the plan as well keeps two different plans that happen to have the same cost from being merged. (I also changed 'premium' to 'cost' to match your example tables, and dropped the agent part.)
SELECT
    sq.ID,
    sq.name,
    SUM(sq.Cost) AS total_cost
FROM
(
    SELECT
        m.id,
        m.name,
        g.Cost
    FROM
        members m
        INNER JOIN member_selections s USING(ID)
        INNER JOIN selection_details g USING(planid)
    GROUP BY
        m.ID,
        m.NAME,
        s.planID,
        g.Cost
) sq
group by
    sq.ID,
    sq.NAME
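The effect of removing the duplicate selections can be checked against the sample tables. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption, so that it runs without a server) and compares a naive sum with one that first reduces member_selections to distinct (ID, planID) pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE member_selections (ID INTEGER, planID INTEGER);
    CREATE TABLE selection_details (planID INTEGER PRIMARY KEY, Cost INTEGER);
    INSERT INTO members VALUES (1, 'John'), (2, 'Mike'), (3, 'Sam');
    INSERT INTO member_selections VALUES
        (1, 1), (1, 2), (1, 1), (2, 2), (2, 3), (3, 2), (3, 1);
    INSERT INTO selection_details VALUES (1, 5), (2, 10), (3, 12);
""")

# Naive version: the repeated (1, 1) row is counted twice.
naive = conn.execute("""
    SELECT m.ID, m.Name, SUM(g.Cost) AS total_cost
    FROM members AS m
    JOIN member_selections AS s ON s.ID = m.ID
    JOIN selection_details AS g ON g.planID = s.planID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

# De-duplicated version: each member/plan pair contributes once.
deduped = conn.execute("""
    SELECT m.ID, m.Name, SUM(g.Cost) AS total_cost
    FROM members AS m
    JOIN (SELECT DISTINCT ID, planID FROM member_selections) AS s ON s.ID = m.ID
    JOIN selection_details AS g ON g.planID = s.planID
    GROUP BY m.ID, m.Name
    ORDER BY m.ID
""").fetchall()

print(naive)    # [(1, 'John', 20), (2, 'Mike', 22), (3, 'Sam', 15)]
print(deduped)  # [(1, 'John', 15), (2, 'Mike', 22), (3, 'Sam', 15)]
```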
Extending further from this question Query to find top rated article in each category -
Consider the same table -
id | category_id | rating
---+-------------+-------
1 | 1 | 10
2 | 1 | 8
3 | 2 | 7
4 | 3 | 5
5 | 3 | 2
6 | 3 | 6
There is a table articles, with fields id, rating (an integer from 1-10), and category_id (an integer representing the category it belongs to). And I have the same goal: to get the top rated article in each category (this should be the result):
Desired Result
id | category_id | rating
---+-------------+-------
1 | 1 | 10
3 | 2 | 7
6 | 3 | 6
Extension of original question
But, running the following query -
SELECT id, category_id, max( rating ) AS max_rating
FROM `articles`
GROUP BY category_id
results in the following, where everything except the id field is as desired. I know how to do this with a subquery, as answered in the same question (Using subquery).
id category_id max_rating
1 1 10
3 2 7
4 3 6
In generic terms
Excluding the grouped column (category_id) and the evaluated columns (columns returning results of aggregate function like SUM(), MAX() etc. - in this case max_rating), the values returned in the other fields are simply the first row under every grouped result set (grouped by category_id in this case). E.g. the record with id =1 is the first one in the table under category_id 1 (id 1 and 2 under category_id 1) so it is returned.
I am just wondering is it not possible to somehow overcome this default behavior to return rows based on conditions? If mysql can perform calculation for every grouped result set (does MAX() counting etc) then why can't it return the row corresponding to the maximum rating. Is it not possible to do this in a single query without a subquery? This looks to me like a frequent requirement.
Update
I could not figure out what I want from Naktibalda's solution too. And just to mention again, I know how to do this using a subquery, as again answered by OMG Ponies.
Use:
SELECT x.id,
x.category_id,
x.rating
FROM YOUR_TABLE x
JOIN (SELECT t.category_id,
MAX(t.rating) AS max_rating
FROM YOUR_TABLE t
GROUP BY t.category_id) y ON y.category_id = x.category_id
AND y.max_rating = x.rating
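Both the derived-table join above and a subquery-free alternative can be checked against the sample rows. The sketch below uses Python's built-in sqlite3 module in place of MySQL (an assumption for runnability). The second query, a LEFT JOIN that keeps only rows for which no higher-rated article exists in the same category, is not taken from the answer above; it is shown as one possible way to meet the "without a subquery" wish.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE articles (id INTEGER PRIMARY KEY, category_id INTEGER, rating INTEGER);
    INSERT INTO articles VALUES
        (1, 1, 10), (2, 1, 8), (3, 2, 7), (4, 3, 5), (5, 3, 2), (6, 3, 6);
""")

# Join each article to the maximum rating of its category.
via_derived_table = conn.execute("""
    SELECT x.id, x.category_id, x.rating
    FROM articles AS x
    JOIN (SELECT category_id, MAX(rating) AS max_rating
          FROM articles
          GROUP BY category_id) AS y
      ON y.category_id = x.category_id
     AND y.max_rating = x.rating
    ORDER BY x.category_id
""").fetchall()

# Self-join variant: keep an article only if nothing in its category beats it.
via_self_join = conn.execute("""
    SELECT a.id, a.category_id, a.rating
    FROM articles AS a
    LEFT JOIN articles AS b
      ON b.category_id = a.category_id
     AND b.rating > a.rating
    WHERE b.id IS NULL
    ORDER BY a.category_id
""").fetchall()

print(via_derived_table)  # [(1, 1, 10), (3, 2, 7), (6, 3, 6)]
print(via_self_join)      # [(1, 1, 10), (3, 2, 7), (6, 3, 6)]
```

Both forms return every tied article when a category has more than one row with the top rating.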