2 columns. Merge duplicate values of A, sum duplicate values of B - mysql

I have a table that has 2 columns with data like this (from 1950 to 2015):
| Year | Count |
| 1994 | 10 |
| 1994 | 49 |
| 1994 | 2 |
| 1995 | 13 |
| 1995 | 6 |
I want my query result to be:
| Year | Count |
| 1994 | 61 |
| 1995 | 19 |
Things I have tried:
I began with a simple query like SELECT SUM(Count) FROM `population` WHERE `Year` = '1994', which was fine for fetching a specific year, but I wanted to fill an array with the population of every year in the database.
Doing something like SELECT Year, SUM(Count) FROM `population` is closer to what I want, except it only shows the first year.
I'm not sure what terms I need to search for to get close to my answer. Union? I tried applying it but I just blerghed.

Try using GROUP BY:
SELECT year, SUM(Count) FROM `population` GROUP BY year
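As a quick check of the GROUP BY suggestion, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for the asker's MySQL server (an assumption), so identifiers are quoted with double quotes here rather than backticks.

```python
import sqlite3

# In-memory SQLite stands in for the asker's MySQL database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE population ("Year" INTEGER, "Count" INTEGER)')
conn.executemany(
    "INSERT INTO population VALUES (?, ?)",
    [(1994, 10), (1994, 49), (1994, 2), (1995, 13), (1995, 6)],
)

# GROUP BY collapses the rows of each year; SUM adds up their counts.
rows = conn.execute(
    'SELECT "Year", SUM("Count") FROM population GROUP BY "Year" ORDER BY "Year"'
).fetchall()
print(rows)  # [(1994, 61), (1995, 19)]
```

Without the GROUP BY clause the aggregate runs over the whole table, which is why the earlier attempt returned a single row.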


How to check if a group has three consecutive values in a column?

I have a table games with values such as:
+----------+------+
| game | year |
+----------+------+
| Football | 1999 |
| Football | 2000 |
| Football | 2001 |
| Football | 2002 |
| Cricket | 1996 |
| Tennis | 2001 |
| Tennis | 2002 |
| Tennis | 2003 |
| Tennis | 2009 |
| Golf | 1994 |
| Golf | 1996 |
| Golf | 1997 |
+----------+------+
I am trying to see if a game has an entry with a minimum three consecutive years in the table. My expected output is:
+----------+
| game |
+----------+
| Football |
| Tennis |
+----------+
Because:
Football has four entries out of which four are consecutive years => 1999, 2000, 2001, 2002
Tennis has four entries out of which three are consecutive years => 2001, 2002, 2003
In order to find the rows with a minimum of three consecutive entries, I first partitioned the table on game and then checked the difference between the current and the previous row, as below:
select game, year, case
when (year - lag(year) over (partition by game order by year)) is null then 1
else year - lag(year) over (partition by game order by year)
end as diff
from games
Output of the above query:
+----------+------+------+
| game | year | diff |
+----------+------+------+
| Football | 1999 | 1 |
| Football | 2000 | 1 |
| Football | 2001 | 1 |
| Football | 2002 | 1 |
| Cricket | 1996 | 1 |
| Tennis | 2001 | 1 |
| Tennis | 2002 | 1 |
| Tennis | 2003 | 1 |
| Tennis | 2009 | 6 |
| Golf | 1994 | 1 |
| Golf | 1996 | 2 |
| Golf | 1997 | 1 |
+----------+------+------+
I am not able to proceed from here to get the output by filtering the data for each game on its difference.
Could anyone let me know if I am on the right track with this implementation? If not, how do I write the query to get the expected output?
You could use a self join approach here:
SELECT DISTINCT g1.Game
FROM games g1
INNER JOIN games g2
ON g2.Game = g1.Game AND g2.Year = g1.Year + 1
INNER JOIN games g3
ON g3.Game = g2.Game AND g3.Year = g2.Year + 1;
The above query requires any matching game to have at least one record whose year is followed by a record for the next year, and by another record for the year after that.
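The self join can be tried against the sample data with a small sketch like the following. It uses Python's sqlite3 module with an in-memory database as a stand-in for whatever engine the asker runs (an assumption); the query itself is the one from the answer, with an ORDER BY added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [
        ("Football", 1999), ("Football", 2000), ("Football", 2001),
        ("Football", 2002), ("Cricket", 1996), ("Tennis", 2001),
        ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
        ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
    ],
)

# g1 is the first year of a run, g2 the following year, g3 the year after that.
query = """
SELECT DISTINCT g1.game
FROM games g1
INNER JOIN games g2 ON g2.game = g1.game AND g2.year = g1.year + 1
INNER JOIN games g3 ON g3.game = g2.game AND g3.year = g2.year + 1
ORDER BY g1.game
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # ['Football', 'Tennis']
```

Golf is excluded because 1996 and 1997 have no 1998 to complete the run.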
You can use lag() and lead() and compare them to the current Year:
with u as
(select *, case
when lag(Year) over(partition by Game order by Year) = Year - 1
and lead(Year) over(partition by Game order by Year) = Year + 1
then 1 else 0
end as consec
from games)
select distinct Game
from u
where consec = 1;
Yes, your initial approach is correct. You were actually really close to fully figuring it out yourself.
What I would do is alter LAG a bit:
year - LAG(year, 2) OVER (
PARTITION BY game
ORDER BY year
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
)
For each row, this computes the difference between the year in the current row and the year two rows earlier within the same game.
If the row is the third of three consecutive years, the result is 2. Window functions cannot be used directly in a WHERE clause, so wrap the expression in a subquery or CTE and filter on it in the outer query.
If your data contains duplicates, you need to group by game, year first.
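Put together, the LAG(year, 2) idea might look like the sketch below. It runs on an in-memory SQLite database through Python's sqlite3 module, which is an assumption made for the sake of a runnable example: window functions need SQLite 3.25 or newer, and the frame clause is left out because LAG does not depend on a frame.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE games (game TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO games VALUES (?, ?)",
    [
        ("Football", 1999), ("Football", 2000), ("Football", 2001),
        ("Football", 2002), ("Cricket", 1996), ("Tennis", 2001),
        ("Tennis", 2002), ("Tennis", 2003), ("Tennis", 2009),
        ("Golf", 1994), ("Golf", 1996), ("Golf", 1997),
    ],
)

# The inner query computes the gap to the row two positions back;
# the outer query keeps the games where that gap is exactly 2.
query = """
SELECT DISTINCT game
FROM (
    SELECT game,
           year - LAG(year, 2) OVER (PARTITION BY game ORDER BY year) AS span
    FROM games
) AS t
WHERE span = 2
ORDER BY game
"""
result = [row[0] for row in conn.execute(query)]
print(result)  # ['Football', 'Tennis']
```

Rows without a row two positions back get a NULL span and are dropped by the filter.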
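Using a CTE (Common Table Expression) and the ROW_NUMBER window function, this can be solved as well. Subtracting the row number from the year gives the same value for every row in a run of consecutive years, so grouping on that difference finds the runs of three or more. Counting rows alone is not enough, since Golf has three entries that are not consecutive.
WITH CTE AS (
select game, year - ROW_NUMBER() OVER (PARTITION BY game order by year) AS grp
from games)
Select Distinct game
from CTE
group by game, grp
having count(*) >= 3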

How can one group by ranges and perform aggregation in mysql?

I have a table, shown below, with the year and the quantity of goods sold in each year. I want to group the year column into ranges of decades and sum the quantity sold in each decade, bearing in mind that the first decade is 1980 - 1989, the second decade is 1981 - 1990, and so on. The expected result is shown in the second table below.
sample: expected_result:
+------+----------+ +-----------+------------+
| year | qty | | Decades | Total_qnty |
+------+----------+ +-----------+------------+
| 1980 | 2 | | 1980-1989 | 13 |
| 1981 | 1 | | 1981-1990 | 12 |
| 1983 | 8 | | 1982-1991 | 11 |
| 1989 | 2 | | 1983-1992 | 12 |
| 1990 | 1 | | . | . |
| 1992 | 1 | | . | . |
| 1994 | 4 | | . | . |
+------+----------+ +-----------+------------+
Below is the sample code I tried with a couple of others but the result is not as expected,
SELECT t.range AS Decades, SUM(t.qty) as Total_qnty
FROM (
SELECT case
when s.Year between 1980 and 1989 then '1980 - 1989'
when s.Year between 1981 and 1990 then '1981 - 1990'
when s.Year between 1982 and 1991 then '1982 - 1991'
when s.Year between 1983 and 1992 then '1983 - 1992'
else '1993 - above'
end as range, s.qty
FROM sample s) t
group by t.range
I tried a couple of other approaches but still could not get the expected result. I also don't want to hardcode the ranges. Any help will be appreciated.
After getting insight from xObert's answer to hank99's question, I was able to work around the problem with a self join, as shown below. Note: the raw table contains the name of each product and the year it was sold, with repeated product names and years, which is why I was able to use COUNT(*) to obtain the total number of products sold in each decade. Thank you all!
SELECT CONCAT(year1, ' - ', year2) AS Decades, Count_of_qnty
FROM
(SELECT s1.year AS year1, s1.year + 9 AS year2, COUNT(*) AS Count_of_qnty
FROM
(SELECT DISTINCT year FROM sample) s1
JOIN sample s2
ON s2.year >= s1.year AND s2.year <= s1.year + 9
GROUP BY s1.year) d
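For the sample table from the question, which has a qty column, the same self join can sum quantities instead of counting rows. The sketch below is one way to check that, assuming an in-memory SQLite database via Python's sqlite3 module as a stand-in for MySQL; the range label is built in Python to sidestep dialect differences in string concatenation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sample (year INTEGER, qty INTEGER)")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?)",
    [(1980, 2), (1981, 1), (1983, 8), (1989, 2), (1990, 1), (1992, 1), (1994, 4)],
)

# Each distinct year starts a ten-year window; the join attaches every
# row that falls inside that window, and SUM adds up the quantities.
query = """
SELECT s1.year AS year1, s1.year + 9 AS year2, SUM(s2.qty) AS total_qty
FROM (SELECT DISTINCT year FROM sample) AS s1
JOIN sample AS s2
  ON s2.year BETWEEN s1.year AND s1.year + 9
GROUP BY s1.year
ORDER BY s1.year
"""
totals = {f"{y1}-{y2}": total for y1, y2, total in conn.execute(query)}
print(totals["1980-1989"], totals["1981-1990"], totals["1983-1992"])  # 13 12 12
```

Windows only start at years that actually occur in the table, so a range such as 1982-1991 does not appear in this output; producing it would need a separate list of start years.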

SQL insert into select from - insert the id instead of the data

I need to populate my fact table with data from lds_placement table. I have selected the records and here is what it looks like:
fk1_account_id | fk3_job_role_id | salary | no_of_placements | YEAR
---------------------------------------------------------------------
10 | 3 | 165000 | 5 | 2010
10 | 3 | 132000 | 4 | 2011
10 | 3 | 132000 | 4 | 2012
20 | 2 | 990000 | 3 | 2010
20 | 2 | 132000 | 2 | 2011
20 | 2 | 132000 | 2 | 2012
I want to insert time_id from a different table called time_dim into the column year and not the actual year itself.
The time_dim table looks like this:
time_id | year
---------------
5 | 2015
1 | 2013
2 | 2010
3 | 2014
4 | 2012
6 | 2011
What I need to insert into the "year" column is actually:
year
2
6
4
2
6
4
How can I insert time_id instead of the year into the table?
Here is the code I used to select the top-most table.
SELECT
fk1_account_id,
fk3_job_role_id,
Sum(actual_salary) AS salary,
Count(1) AS no_of_placements,
MAX(EXTRACT(YEAR FROM plt_estimated_end_date)) AS year
FROM lds_placement
GROUP BY fk1_account_id, fk3_job_role_id, EXTRACT(YEAR FROM plt_estimated_end_date)
ORDER BY fk1_account_id;
Use a left join if you want to keep records whose year doesn't exist in time_dim. Otherwise use an inner join.
select t.fk1_account_id,t.fk3_job_role_id,t.salary,t.no_of_placements
,d.time_id
from
(SELECT fk1_account_id, fk3_job_role_id, Sum(actual_salary) as salary, Count(1) as no_of_placements, MAX(EXTRACT(YEAR FROM plt_estimated_end_date)) AS YEAR
FROM lds_placement
GROUP BY fk1_account_id, fk3_job_role_id, EXTRACT(YEAR FROM plt_estimated_end_date)
)t
left join time_dim d
on t.year=d.year
order by t.fk1_account_id
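To actually load the fact table, the joined SELECT can feed an INSERT ... SELECT. The sketch below illustrates this under several assumptions: it uses an in-memory SQLite database through Python's sqlite3 module, the table names placements and fact_placement are hypothetical, and placements already holds the aggregated rows shown in the question so the EXTRACT step is skipped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# placements and fact_placement are hypothetical names for this sketch.
conn.executescript("""
CREATE TABLE placements (fk1_account_id INTEGER, fk3_job_role_id INTEGER,
                         salary INTEGER, no_of_placements INTEGER, year INTEGER);
CREATE TABLE time_dim (time_id INTEGER, year INTEGER);
CREATE TABLE fact_placement (fk1_account_id INTEGER, fk3_job_role_id INTEGER,
                             salary INTEGER, no_of_placements INTEGER, time_id INTEGER);
""")
conn.executemany(
    "INSERT INTO placements VALUES (?, ?, ?, ?, ?)",
    [(10, 3, 165000, 5, 2010), (10, 3, 132000, 4, 2011), (10, 3, 132000, 4, 2012),
     (20, 2, 990000, 3, 2010), (20, 2, 132000, 2, 2011), (20, 2, 132000, 2, 2012)],
)
conn.executemany(
    "INSERT INTO time_dim VALUES (?, ?)",
    [(5, 2015), (1, 2013), (2, 2010), (3, 2014), (4, 2012), (6, 2011)],
)

# The join swaps each year for its time_id before the rows are inserted.
conn.execute("""
INSERT INTO fact_placement
    (fk1_account_id, fk3_job_role_id, salary, no_of_placements, time_id)
SELECT p.fk1_account_id, p.fk3_job_role_id, p.salary, p.no_of_placements, d.time_id
FROM placements AS p
LEFT JOIN time_dim AS d ON d.year = p.year
ORDER BY p.fk1_account_id, p.year
""")
time_ids = [r[0] for r in conn.execute("SELECT time_id FROM fact_placement ORDER BY rowid")]
print(time_ids)  # [2, 6, 4, 2, 6, 4]
```

With the left join, a year missing from time_dim would be inserted with a NULL time_id rather than being dropped.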

UNION Query on Multiple Tables with New Columns

I just wanna say that this forum has been really helping me a lot in my current work.
I need help writing an SQL query for our MS Access database. The idea is to make a union query over all the tables (January to December) to get the unique ID numbers and show their "Item" value for each month as a column in the output table.
An example could be seen below. If the ID cannot be found in the table, the value will be returned null.
This seemed easy to do in Excel, but we would like to do it in our backend. So far I have only written the UNION query for all the tables.
Thanks in advance for the help.
Table 1: January
| ID | Item |
| 1 | Apple |
| 2 | Salad |
| 3 | Grapes |
Table 2: February
| ID | Item |
| 1 | Apple |
| 2 | Grapes |
| 4 | Grapes |
Output Table:
| ID | January | February |
| 1 | Apple | Apple |
| 2 | Salad | Grapes |
| 3 | Grapes | NULL |
| 4 | NULL | Grapes |
One way is with union all and group by:
select id, max(January) as January, max(February) as February
from (select id, item as January, NULL as February from January
union all
select id, NULL, item from February
) jf
group by id;
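The union all plus group by approach can be checked against the two sample tables with a sketch like this one. It assumes an in-memory SQLite database via Python's sqlite3 module as a stand-in for MS Access, so minor syntax details may differ in Access itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE January (id INTEGER, item TEXT)")
conn.execute("CREATE TABLE February (id INTEGER, item TEXT)")
conn.executemany("INSERT INTO January VALUES (?, ?)",
                 [(1, "Apple"), (2, "Salad"), (3, "Grapes")])
conn.executemany("INSERT INTO February VALUES (?, ?)",
                 [(1, "Apple"), (2, "Grapes"), (4, "Grapes")])

# Each branch fills only its own month column; MAX ignores the NULLs,
# so grouping by id merges the branches into one row per id.
query = """
SELECT id, MAX(January) AS January, MAX(February) AS February
FROM (
    SELECT id, item AS January, NULL AS February FROM January
    UNION ALL
    SELECT id, NULL, item FROM February
) AS jf
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Adding the remaining months means one more NULL-padded branch and one more MAX column per table.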
If id exists only once per table, why not use a left join instead? It leaves the February column NULL when the id exists in January but not in February. Note that ids that exist only in February (such as 4) are not returned, so this covers only part of the expected output.
SELECT Jan.id, Jan.Item AS January, Feb.Item AS February
FROM January Jan
LEFT JOIN February Feb
on Jan.id=Feb.id

simple unpivot on 2 rows table

I have a table that contains the result of a query. Sample data looks like this:
| YEAR | JAN | FEB | ... | DEC |
|------|-----|-----|-----|-----|
| 2013 | 90 | 40 | ... | 50 |
| 2014 | 30 | 20 | ... | 40 |
I'm trying to unpivot this table to have data like this:
| MONTH | 2013 | 2014 |
|-------|------|------|
| JAN | 90 | 30 |
| FEB | 40 | 20 |
| ... | ... | ... |
| DEC | 50 | 40 |
I've tried this:
select Month, 2013, 2014
from Data
unpivot
(
marks
for Month in (Jan, Feb, Mar, Apr)
) u
but all I get are months and years.
I will always have 12 months, but I can have multiple data rows.
Can this be done without dynamic sql?
You're not going to get what you want with a single unpivot statement. Start with this unpivot:
with cte(Year, Month, Orders)
as
(
select Year, Month, Orders
from Data d
unpivot
(
orders
for Month in (Jan, Feb, Mar, Apr)
) u
)
I'm going to use those results in the next part, so I store it as a CTE. This query gives you results like this:
| YEAR | MONTH | ORDERS |
|------|-------|--------|
| 2013 | JAN | 90 |
| 2013 | FEB | 40 |
| 2013 | MAR | 30 |
etc...
I don't know what the numbers in your table represent, but I just called them orders. You can rename that column to whatever is appropriate. The next step is to pivot those results so that we can get the year displayed as columns:
select Month, [2013], [2014]
from cte
pivot
(
sum(orders)
for year in ([2013], [2014])
) p
order by datepart(mm, Month+'1900')
If you need to add more years, it should be obvious where to do that. Note the clever order by that sorts the months chronologically instead of alphabetically.
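For engines without UNPIVOT and PIVOT, the same two steps can be approximated with UNION ALL and conditional aggregation. This is a swapped-in technique, not the SQL Server syntax used above. The sketch assumes an in-memory SQLite database via Python's sqlite3 module, includes only the three months whose values appear in the question, and uses a hypothetical month_no column for chronological ordering.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only JAN, FEB and DEC are given in the question, so only those are modelled.
conn.execute('CREATE TABLE data (year INTEGER, "JAN" INTEGER, "FEB" INTEGER, "DEC" INTEGER)')
conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)",
                 [(2013, 90, 40, 50), (2014, 30, 20, 40)])

# Step 1 (unpivot): one UNION ALL branch per month column.
# Step 2 (pivot): one CASE expression per year column.
query = """
WITH unpivoted AS (
    SELECT year, 'JAN' AS month, 1 AS month_no, "JAN" AS orders FROM data
    UNION ALL
    SELECT year, 'FEB', 2, "FEB" FROM data
    UNION ALL
    SELECT year, 'DEC', 12, "DEC" FROM data
)
SELECT month,
       SUM(CASE WHEN year = 2013 THEN orders END) AS y2013,
       SUM(CASE WHEN year = 2014 THEN orders END) AS y2014
FROM unpivoted
GROUP BY month, month_no
ORDER BY month_no
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('JAN', 90, 30), ('FEB', 40, 20), ('DEC', 50, 40)]
```

As with the PIVOT version, the year columns are still listed by hand, so new years need a new CASE expression.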