I want to fetch records from a table that contains duplicate records. From each set of duplicates, the output should contain only two records.
Example:
Name     Country
John     India
Mark     India
Chris    Russia
Feggy    England
Rain     Russia
Monesy   Russia
Bhumi    India
Peter    England
Bruice   England
Radhe    India
The output should contain only two records from each set of duplicates. As the output below shows, each country appears only twice, and only the first two records of each set are kept:
Name     Country
John     India
Mark     India
Chris    Russia
Feggy    England
Rain     Russia
Peter    England
You can number the rows within each group with a window function and select only the first N.
The sort order should be chosen according to the business logic of the query.
For example:
;WITH numbered_name AS
(
    SELECT *
         , ROW_NUMBER() OVER (PARTITION BY t.Country ORDER BY t.Name) rn
    FROM your_table t  -- placeholder: replace your_table with the real table name
)
SELECT Name
     , Country
FROM numbered_name
WHERE rn <= 2
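To get exactly the output shown in the question, the rows would have to be numbered in their original order rather than by name. The following is a minimal sketch of that, assuming an in-memory SQLite database (via Python's sqlite3, with a SQLite version that supports window functions) as a stand-in for the real DBMS. The table name people and the id column that records insertion order are assumptions, not part of the question.

```python
import sqlite3

# In-memory SQLite database standing in for the real DBMS (assumption).
conn = sqlite3.connect(":memory:")
# "people" and "id" are hypothetical names; id records insertion order.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
people = [
    ("John", "India"), ("Mark", "India"), ("Chris", "Russia"),
    ("Feggy", "England"), ("Rain", "Russia"), ("Monesy", "Russia"),
    ("Bhumi", "India"), ("Peter", "England"), ("Bruice", "England"),
    ("Radhe", "India"),
]
conn.executemany("INSERT INTO people (name, country) VALUES (?, ?)", people)

# Number the rows of each country by insertion order and keep the first two.
query = """
WITH numbered AS (
    SELECT id, name, country,
           ROW_NUMBER() OVER (PARTITION BY country ORDER BY id) AS rn
    FROM people
)
SELECT name, country
FROM numbered
WHERE rn <= 2
ORDER BY id
"""
first_two = conn.execute(query).fetchall()
for name, country in first_two:
    print(name, country)
```

Ordering by id instead of Name is the only change from the query above; with ORDER BY Name, India would yield Bhumi and John instead of John and Mark.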
I have a table named sales in a MySQL database that looks like this:
company    manufactured    shipped
Mercedes   Germany         United States
Mercedes   Germany         Germany
Mercedes   Germany         United States
Toyota     Japan           Canada
Toyota     Japan           England
Audi       Germany         United States
Audi       Germany         France
Audi       Germany         Canada
Tesla      United States   Mexico
Tesla      United States   Canada
Tesla      United States   United States
Here is a Fiddle: http://www.sqlfiddle.com/#!17/145ff/3
I would like to return the list of companies that ship ALL of their products internationally (that is, where the value in the manufactured column differs from the value in the shipped column for ALL records of a particular company).
Using the example above, the desired result set would be:
company
Toyota
Audi
Here is my (hackish) attempt:
WITH temp_table AS (
    SELECT
        s.company AS company
        , SUM(CASE
                WHEN s.manufactured != s.shipped THEN 1
                ELSE 0
              END
          ) AS count_international
        , COUNT(s.company) AS total_within_company
    FROM
        sales s
    GROUP BY
        s.company
)
SELECT
    company
FROM
    temp_table
WHERE count_international = total_within_company
Essentially, I count the instances where the columns do not match. Then I check whether the sum of those mismatched instances matches the number of records within a given group.
This approach works, but it's far from an elegant solution!
Can anyone offer advice as to a more idiomatic way to implement this query?
Thanks!
We can GROUP BY company and use a HAVING clause to require that every country in shipped differs from the country in manufactured:
SELECT company
FROM sales
GROUP BY company
HAVING COUNT(CASE WHEN manufactured = shipped THEN 1 END) = 0;
The fiddle linked in the question uses a Postgres DB, but the question is tagged MySQL.
In a MySQL DB, the above query can be simplified to:
SELECT company
FROM sales
GROUP BY company
HAVING SUM(manufactured = shipped) = 0;
In a Postgres DB, this is not possible, because SUM does not accept a boolean argument without an explicit cast.
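As a quick check, here is a minimal sketch that runs both HAVING variants against the sample data from the question. It assumes an in-memory SQLite database (via Python's sqlite3) as a stand-in; SQLite, like MySQL, evaluates a comparison to 0 or 1, so the SUM form runs there too. The ORDER BY is added only to make the output deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (company TEXT, manufactured TEXT, shipped TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Mercedes", "Germany", "United States"),
    ("Mercedes", "Germany", "Germany"),
    ("Mercedes", "Germany", "United States"),
    ("Toyota", "Japan", "Canada"),
    ("Toyota", "Japan", "England"),
    ("Audi", "Germany", "United States"),
    ("Audi", "Germany", "France"),
    ("Audi", "Germany", "Canada"),
    ("Tesla", "United States", "Mexico"),
    ("Tesla", "United States", "Canada"),
    ("Tesla", "United States", "United States"),
])

# Portable form: count the domestic shipments per company, keep groups with none.
count_case = conn.execute("""
    SELECT company FROM sales
    GROUP BY company
    HAVING COUNT(CASE WHEN manufactured = shipped THEN 1 END) = 0
    ORDER BY company
""").fetchall()

# Short form: relies on the comparison evaluating to 0 or 1.
sum_bool = conn.execute("""
    SELECT company FROM sales
    GROUP BY company
    HAVING SUM(manufactured = shipped) = 0
    ORDER BY company
""").fetchall()

print(count_case)
print(sum_bool)
```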
You have to think in sets: you want to display every company without a match, so find the companies that have a match and display the rest.
SELECT DISTINCT company
FROM sales
WHERE company NOT IN (
    SELECT company
    FROM sales
    WHERE manufactured = shipped
)
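One thing to watch with NOT IN: if the subquery returns a NULL, the comparison is never true and the outer query returns no rows. The sketch below illustrates this on the sample data, assuming an in-memory SQLite database (via Python's sqlite3) as a stand-in. The extra row with a NULL company is made up purely to trigger the effect.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (company TEXT, manufactured TEXT, shipped TEXT)")
conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", [
    ("Mercedes", "Germany", "United States"),
    ("Mercedes", "Germany", "Germany"),
    ("Mercedes", "Germany", "United States"),
    ("Toyota", "Japan", "Canada"),
    ("Toyota", "Japan", "England"),
    ("Audi", "Germany", "United States"),
    ("Audi", "Germany", "France"),
    ("Audi", "Germany", "Canada"),
    ("Tesla", "United States", "Mexico"),
    ("Tesla", "United States", "Canada"),
    ("Tesla", "United States", "United States"),
])

not_in = """
    SELECT DISTINCT company FROM sales
    WHERE company NOT IN (
        SELECT company FROM sales WHERE manufactured = shipped
    )
    ORDER BY company
"""
# Same query, but NULL companies are filtered out of the subquery.
not_in_null_safe = """
    SELECT DISTINCT company FROM sales
    WHERE company NOT IN (
        SELECT company FROM sales
        WHERE manufactured = shipped AND company IS NOT NULL
    )
    ORDER BY company
"""

clean = conn.execute(not_in).fetchall()

# Hypothetical row: a domestic shipment with a missing company name.
conn.execute("INSERT INTO sales VALUES (NULL, 'Spain', 'Spain')")
with_null = conn.execute(not_in).fetchall()
with_null_safe = conn.execute(not_in_null_safe).fetchall()

print(clean, with_null, with_null_safe)
```

If company is declared NOT NULL in the real table, the plain NOT IN form is safe as written.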
Name   Place visited
Ash    New york
Bob    New york
Ash    Chicago
Bob    Chicago
Carl   Chicago
Carl   Detroit
Dan    Detroit
Above is the sample table. The output should be the pairs of names who visited the same places. That is, the output should be Ash and Bob, since the places visited by Ash were also visited by Bob.
Output:
Name1 Name2
Ash Bob
What is a query for this using MySQL or even relational algebra?
The simplest method is to use group_concat(). Assuming no duplicates:
select places, group_concat(name) as names
from (select name, group_concat(place order by place) as places
      from t
      group by name
     ) t
group by places
having count(*) > 1;
This will return all the names with exactly the same places on a single row. The names will be in a comma-delimited list.
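To make the two grouping steps concrete, here is a rough sketch that mirrors them in Python on top of an in-memory SQLite table (an assumption; the original targets MySQL). It does the concatenation in Python rather than with group_concat, because older SQLite versions do not accept ORDER BY inside group_concat. The column name place is assumed, as in the query above.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite table standing in for the MySQL table t (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, place TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    ("Ash", "New york"), ("Bob", "New york"),
    ("Ash", "Chicago"), ("Bob", "Chicago"),
    ("Carl", "Chicago"), ("Carl", "Detroit"),
    ("Dan", "Detroit"),
])

# Step 1 (inner query): one sorted list of places per name.
places_by_name = defaultdict(list)
for name, place in conn.execute("SELECT name, place FROM t ORDER BY name, place"):
    places_by_name[name].append(place)

# Step 2 (outer query): group the names by their list of places.
names_by_places = defaultdict(list)
for name, places in places_by_name.items():
    names_by_places[tuple(places)].append(name)

# HAVING count(*) > 1: keep only place lists shared by several names.
shared = {places: names for places, names in names_by_places.items() if len(names) > 1}
print(shared)
```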
I am trying to learn GROUP BY and HAVING, but I can't seem to understand what happened here. I used the W3Schools SQL Tryit Editor.
The table I created is:
name age country
------------------------
Sara 17 America
David 21 America
Jared 27 America
Jane 54 Canada
Rob 32 Canada
Matthew 62 Canada
The Query I used:
select
sum(age), country
from
NewTable
group by
country
having
age>25;
I expected the query to categorize the information by country and use age>25 filter to create the results but here is the output:
sum(age) country
--------------------
65 America
148 Canada
What happened?! The result is sum of American and Canadian people in all ages.
The piece you're missing is specific to the having keyword. The having clause is applied to the dataset after the grouping occurs.
It sounds like you expected the records with an age of 25 or less to be excluded before grouping occurs. But having keeps or drops whole groups; it never removes individual rows from a group. Because age is not wrapped in an aggregate here, the database (where it accepts such a query at all) evaluates it against a single row picked from each group. Both groups evidently passed that test, so you got the sum over all ages.
If you want to exclude individual records before totaling the sum of the age, you could do something like this (using a where clause which is applied prior to grouping):
select sum(age), country from NewTable where age > 25 group by country;
A where clause puts a condition on which rows participate in the results.
A having clause is like a where, but puts a condition on which grouped (or aggregated) values participate in the results.
Either try this:
select sum(age), country
from NewTable
where age > 25 -- where puts condition on raw rows
group by country
or this:
select sum(age), country
from NewTable
group by country
having sum(age) > 25 -- having puts a condition on groups
depending on what you're trying to do.
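The difference is easy to see on the table from the question. Below is a minimal sketch, assuming an in-memory SQLite database (via Python's sqlite3) in place of the Tryit Editor. The third query, with a threshold of 100, is an added illustration of having dropping a whole group.

```python
import sqlite3

# In-memory SQLite database standing in for the Tryit Editor (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE NewTable (name TEXT, age INTEGER, country TEXT)")
conn.executemany("INSERT INTO NewTable VALUES (?, ?, ?)", [
    ("Sara", 17, "America"), ("David", 21, "America"), ("Jared", 27, "America"),
    ("Jane", 54, "Canada"), ("Rob", 32, "Canada"), ("Matthew", 62, "Canada"),
])

# where: rows with age <= 25 are removed before the groups are summed.
where_rows = conn.execute("""
    SELECT sum(age), country FROM NewTable
    WHERE age > 25
    GROUP BY country ORDER BY country
""").fetchall()

# having: every row is summed, then groups whose total is <= 25 are dropped.
having_rows = conn.execute("""
    SELECT sum(age), country FROM NewTable
    GROUP BY country
    HAVING sum(age) > 25 ORDER BY country
""").fetchall()

# A higher threshold (chosen for illustration) drops the whole America group.
having_100 = conn.execute("""
    SELECT sum(age), country FROM NewTable
    GROUP BY country
    HAVING sum(age) > 100 ORDER BY country
""").fetchall()

print(where_rows)   # America counts only Jared
print(having_rows)  # both totals exceed 25, so both groups stay
print(having_100)   # only Canada's total exceeds 100
```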
id   points   year   country
-----------------------------------
1    45       1998   Mexico
2    45       2000   Germany
3    47       2010   Russia
4    45       1970   China
5    49       2010   Austria
How can I select rows that match either of two values in the country column? For example, only the records where country is Germany or Mexico. When a single country is the criterion, it is easy:
SELECT * FROM List WHERE Country='Mexico';
the result is:
id   points   year   country
-----------------------------------
1    45       1998   Mexico
But when I use two countries as criteria, the problems start. I tried:
SELECT * FROM List WHERE country='Mexico' AND Country='Germany';
SELECT * FROM List WHERE country='Mexico' AND 'Germany';
SELECT * FROM List WHERE country='Mexico','Germany';
SELECT * FROM List WHERE country='Mexico'AND WHERE country='Germany';
but none of them gives the desired result:
id   points   year   country
-----------------------------------
1    45       1998   Mexico
2    45       2000   Germany
I understand that I may have made a logical error: there is no single record where country is Mexico and Germany at the same time, and SQL probably interprets the condition exactly that way. So how do I correctly write in SQL: give me the records where country is Mexico or Germany? Thanks.
You are looking for the IN operator:
SELECT * FROM List WHERE Country in ('Mexico','Germany');
Just use OR.
So instead of
SELECT * FROM List WHERE country='Mexico' AND Country='Germany';
it would be
SELECT * FROM List WHERE country='Mexico' OR country='Germany';
IN is also a good operator to use, especially if you have several values to check against, but that has been covered in the other answers.
You need to use OR or IN. You have been using AND, which asks MySQL to find a row where country is both Mexico and Germany, and that is never true.
SELECT * FROM List WHERE Country in ('Mexico','Germany');
try this:
SELECT * FROM List WHERE country='Mexico' OR Country='Germany';
SQL follows formal logic. Natural language does not.
When you want the results for a list of countries, you need to say so explicitly. This request corresponds to a logical OR: since the country can be one or the other, a row matching either is correct.
SELECT * FROM List WHERE Country = 'Mexico' OR Country = 'Germany'
To prevent further mistakes like these, I recommend that you look up logical operators in the docs (they are very good). Either the MySQL or the PostgreSQL docs should be fine.
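For comparison, here is a minimal sketch that runs the IN, OR, and AND variants against the sample table, assuming an in-memory SQLite database (via Python's sqlite3) as a stand-in for MySQL. The ORDER BY id is added only to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE List (id INTEGER, points INTEGER, year INTEGER, country TEXT)")
conn.executemany("INSERT INTO List VALUES (?, ?, ?, ?)", [
    (1, 45, 1998, "Mexico"),
    (2, 45, 2000, "Germany"),
    (3, 47, 2010, "Russia"),
    (4, 45, 1970, "China"),
    (5, 49, 2010, "Austria"),
])

# IN and OR are equivalent: a row qualifies if it matches either value.
in_rows = conn.execute(
    "SELECT * FROM List WHERE country IN ('Mexico', 'Germany') ORDER BY id"
).fetchall()
or_rows = conn.execute(
    "SELECT * FROM List WHERE country = 'Mexico' OR country = 'Germany' ORDER BY id"
).fetchall()

# AND requires one row to hold both values at once, which never happens.
and_rows = conn.execute(
    "SELECT * FROM List WHERE country = 'Mexico' AND country = 'Germany'"
).fetchall()

print(in_rows)
print(or_rows)
print(and_rows)
```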
I have a table of cities that all share the same area code:
367 01451 Harvard Worcester Massachusetts MA 978 Eastern
368 01452 Hubbardston Worcester Massachusetts MA 978 Eastern
369 01453 Leominster Worcester Massachusetts MA 978 Eastern
The table has multiple area codes, all with multiple cities.
What I'd like to do is only select one city from each area code and delete any extra cities from duplicate area codes. What would be the best query to accomplish this?
I believe this question comes close to what I need, but I didn't quite understand how its answers work:
Mysql4: SQL for selecting one or zero record
Note: the "978" column is the "area_code" column, and the table name is "zip_code".
DELETE c.*
FROM zip_code c
JOIN (
    SELECT area_code, MIN(id) AS mid
    FROM zip_code
    GROUP BY
        area_code
) co
ON c.area_code = co.area_code
AND c.id <> co.mid
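The idea is to keep the row with the smallest id in each area code and delete the rest. Below is a rough sketch of the same idea, assuming an in-memory SQLite database (via Python's sqlite3). SQLite has no multi-table DELETE, so the sketch uses a NOT IN subquery instead; MySQL would typically reject that form because the subquery reads the table being deleted from, which is why the JOIN is used above. The city column name and the two rows for area code 617 are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
# Only id and area_code come from the question; "city" is an assumed column name.
conn.execute("CREATE TABLE zip_code (id INTEGER PRIMARY KEY, city TEXT, area_code TEXT)")
conn.executemany("INSERT INTO zip_code VALUES (?, ?, ?)", [
    (367, "Harvard", "978"),
    (368, "Hubbardston", "978"),
    (369, "Leominster", "978"),
    (400, "Boston", "617"),      # hypothetical rows for a second area code
    (401, "Cambridge", "617"),
])

# Keep the lowest id per area code, delete every other row.
conn.execute("""
    DELETE FROM zip_code
    WHERE id NOT IN (
        SELECT MIN(id) FROM zip_code GROUP BY area_code
    )
""")

remaining = conn.execute("SELECT id, area_code FROM zip_code ORDER BY id").fetchall()
print(remaining)
```

Running the inner SELECT on its own first is a cheap way to confirm which rows will survive before deleting anything.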