I have the following data in one of my Teradata tables, and I would like to delete duplicate rows based on three different columns.
Have
ID CDate DDate Seq
101 4/25/2020 5/24/2020 1201
101 4/25/2020 5/26/2020 1201
101 4/26/2020 5/24/2020 1202
101 4/26/2020 5/26/2020 1202
Want
ID CDate DDate Seq
101 4/25/2020 5/24/2020 1201
101 4/26/2020 5/26/2020 1202
Using the following query:
Qualify row_number() Over (Partition By ID, CDate, DDate ORDER BY Seq)=1
I still get the same 4 rows as output.
Any help is appreciated.
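One possible reading of why all four rows come back: every (ID, CDate, DDate) combination in the sample is already unique, so each row sits alone in its partition and gets row number 1. The sketch below reproduces that with Python's built-in sqlite3 module instead of Teradata (an assumption made only so the example runs anywhere; it needs SQLite 3.25 or later for window functions). The table name `have` and the ISO date strings are hypothetical, and since SQLite has no QUALIFY the row number is simply selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; dates are stored as ISO text for simplicity.
conn.execute("CREATE TABLE have (ID INTEGER, CDate TEXT, DDate TEXT, Seq INTEGER)")
conn.executemany(
    "INSERT INTO have VALUES (?, ?, ?, ?)",
    [
        (101, "2020-04-25", "2020-05-24", 1201),
        (101, "2020-04-25", "2020-05-26", 1201),
        (101, "2020-04-26", "2020-05-24", 1202),
        (101, "2020-04-26", "2020-05-26", 1202),
    ],
)

# Same window definition as in the question.
row_numbers = [
    rn
    for (rn,) in conn.execute(
        "SELECT ROW_NUMBER() OVER (PARTITION BY ID, CDate, DDate ORDER BY Seq) "
        "FROM have"
    )
]

# Number of distinct partitions produced by the three columns.
(partition_count,) = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT ID, CDate, DDate FROM have)"
).fetchone()

print(row_numbers, partition_count)
```

Every row is numbered 1 because there are as many partitions as rows, so a filter on row number = 1 keeps everything. Partitioning on fewer columns would be needed to drop rows; which columns to use depends on the rule behind the wanted output.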
I have a table where each quiz ID is repeated multiple times. Each row has a date next to the quiz ID. For each user and quiz ID, I want to select the entire row with the latest date. The date format is mm/dd/YYYY.
Sample -
USER_ID Quiz_id Name Date Marks .. .. ..
1 2 poly 4/3/2020 27
1 2 poly 4/3/2019 98
1 4 moro 4/3/2020 09
2 5 cat 4/12/2015 87
2 4 moro 4/3/2009 56
2 6 PP 4/3/2011 76
3 2 poly 4/3/2020 12
3 2 poly 5/3/2020 09
3 7 dog 4/3/2011 23
I want the result to look like this:
USER_ID Quiz_id Name Date Marks .. .. ..
1 2 poly 4/3/2020 27
1 4 moro 4/3/2020 09
2 5 cat 4/12/2015 87
2 4 moro 4/3/2009 56
2 6 PP 4/3/2011 76
3 2 poly 5/3/2020 09
3 7 dog 4/3/2011 23
You can use the RANK() function to get the desired result:
-- Note: if DATE is stored as text in mm/dd/YYYY form, ORDER BY compares strings, not dates.
SELECT A.*
FROM (
    SELECT T.*, RANK() OVER (PARTITION BY USER_ID, QUIZ_ID, NAME ORDER BY DATE DESC) AS RN
    FROM Table1 T
) A
WHERE RN = 1
ORDER BY USER_ID, QUIZ_ID;
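To check the ranking approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module (an assumption for the sake of a runnable example; it needs SQLite 3.25 or later). The table name `quiz`, the column name `taken_on`, and storing the dates as ISO text are all hypothetical choices, made so that ordering by the column orders by date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema; taken_on holds ISO dates (YYYY-MM-DD) so text order equals date order.
conn.execute(
    "CREATE TABLE quiz (user_id INTEGER, quiz_id INTEGER, name TEXT, "
    "taken_on TEXT, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO quiz VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, "poly", "2020-04-03", 27),
        (1, 2, "poly", "2019-04-03", 98),
        (1, 4, "moro", "2020-04-03", 9),
        (2, 5, "cat", "2015-04-12", 87),
        (2, 4, "moro", "2009-04-03", 56),
        (2, 6, "PP", "2011-04-03", 76),
        (3, 2, "poly", "2020-04-03", 12),
        (3, 2, "poly", "2020-05-03", 9),
        (3, 7, "dog", "2011-04-03", 23),
    ],
)

# Rank rows within each (user, quiz) pair, newest first, and keep rank 1.
latest = conn.execute(
    """
    SELECT user_id, quiz_id, name, taken_on, marks
    FROM (
        SELECT q.*,
               RANK() OVER (PARTITION BY user_id, quiz_id
                            ORDER BY taken_on DESC) AS rn
        FROM quiz AS q
    ) AS ranked
    WHERE rn = 1
    ORDER BY user_id, quiz_id
    """
).fetchall()

for row in latest:
    print(row)
```

Note that RANK() keeps ties, so two rows with the same latest date for one user and quiz would both be returned; ROW_NUMBER() would keep only one of them.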
I don't have MySQL installed, so you will need to test this and report back. The general idea is to identify the row of interest using max() and a group by (derived table t). As the Date column appears to be a text column (MySQL uses the format YYYY-MM-DD for dates), you will need to convert it to a date with str_to_date() before applying the max() aggregate function. Finally, join back to the original table (here derived table t2, which also does the date conversion), because only the aggregate column(s) and the columns named in the group by are well defined in t, i.e.:
select USER_ID, Quiz_id, Date, Marks from (
    select USER_ID, Quiz_id, max(str_to_date(Date, '%m/%d/%Y')) as Date2 from quiz group by 1, 2
) as t natural join (
    select *, str_to_date(Date, '%m/%d/%Y') as Date2 from quiz
) as t2;
I don't recall off-hand, but Date might be a reserved word, in which case you will need to quote the column name or, ideally, rename the column to something more descriptive.
Also, the original table is not in third normal form, as Name depends on Quiz_id. Quiz_id should, as its name implies, be a foreign key to a lookup table that holds the Name.
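The convert-then-aggregate-then-join idea can be sketched as follows. This uses Python's built-in sqlite3 module rather than MySQL (an assumption; SQLite has no str_to_date(), so the conversion is done with datetime.strptime before inserting). The helper `to_iso`, the table name `quiz`, and the column name `taken_on` are hypothetical.

```python
import sqlite3
from datetime import datetime


def to_iso(us_date: str) -> str:
    """Convert an mm/dd/YYYY string to ISO YYYY-MM-DD."""
    return datetime.strptime(us_date, "%m/%d/%Y").date().isoformat()


# Why conversion matters: comparing the raw text picks the wrong "latest" date.
text_max = max("4/3/2020", "5/3/2019")                   # string comparison
date_max = max(to_iso("4/3/2020"), to_iso("5/3/2019"))   # date order

raw = [
    (1, 2, "4/3/2020", 27), (1, 2, "4/3/2019", 98), (1, 4, "4/3/2020", 9),
    (2, 5, "4/12/2015", 87), (2, 4, "4/3/2009", 56), (2, 6, "4/3/2011", 76),
    (3, 2, "4/3/2020", 12), (3, 2, "5/3/2020", 9), (3, 7, "4/3/2011", 23),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE quiz (user_id INTEGER, quiz_id INTEGER, taken_on TEXT, marks INTEGER)"
)
conn.executemany(
    "INSERT INTO quiz VALUES (?, ?, ?, ?)",
    [(u, q, to_iso(d), m) for u, q, d, m in raw],
)

# Find the newest date per (user, quiz), then join back to fetch the full row.
latest = conn.execute(
    """
    SELECT q.user_id, q.quiz_id, q.taken_on, q.marks
    FROM quiz AS q
    JOIN (
        SELECT user_id, quiz_id, MAX(taken_on) AS taken_on
        FROM quiz
        GROUP BY user_id, quiz_id
    ) AS newest
      ON newest.user_id = q.user_id
     AND newest.quiz_id = q.quiz_id
     AND newest.taken_on = q.taken_on
    ORDER BY q.user_id, q.quiz_id
    """
).fetchall()

print(text_max, date_max)
print(latest)
```

The explicit ON clause is used here instead of a natural join only to make the join columns visible; the logic is the same.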
I have the following data,
id emp_id csa_taken
1 100 2
2 100 2
3 100 0
4 100 2
5 101 2
6 101 2
7 101 0
8 101 0
I expect a result with the count of rows where csa_taken=2 for each employee.
expected result:
emp_id count_csa_taken
100 3
101 2
I tried the following query, but it did not work:
Select count(employee_id) From $employeeCSA where csa_taken=2
Please suggest as I am new to sql.
If I understand you correctly, you would like to count the rows with a csa_taken of two for each employee. As there are multiple csa_taken entries per employee, you need to group them.
E.g.:
SELECT emp_id, COUNT(*) AS count_csa_taken FROM $employeeCSA WHERE csa_taken = 2 GROUP BY emp_id
Please note that COUNT(*) counts the rows (not the fields).
You also need a group by. Try:
Select emp_id, count(emp_id) From $employeeCSA where csa_taken=2
group by emp_id
If I understand correctly, you can try this:
SELECT emp_id, COUNT(emp_id) FROM dbo.Sample WHERE csa_taken = 2 GROUP BY emp_id
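To confirm that filtering and then grouping gives the expected counts, here is a small sketch run against the sample data with Python's built-in sqlite3 module (an assumption; the table name `employee_csa` is hypothetical and stands in for $employeeCSA).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name standing in for $employeeCSA.
conn.execute(
    "CREATE TABLE employee_csa (id INTEGER PRIMARY KEY, emp_id INTEGER, csa_taken INTEGER)"
)
conn.executemany(
    "INSERT INTO employee_csa VALUES (?, ?, ?)",
    [
        (1, 100, 2), (2, 100, 2), (3, 100, 0), (4, 100, 2),
        (5, 101, 2), (6, 101, 2), (7, 101, 0), (8, 101, 0),
    ],
)

# WHERE filters the rows first; GROUP BY then yields one count per employee.
counts = conn.execute(
    """
    SELECT emp_id, COUNT(*) AS count_csa_taken
    FROM employee_csa
    WHERE csa_taken = 2
    GROUP BY emp_id
    ORDER BY emp_id
    """
).fetchall()

print(counts)
```

One caveat with this form: an employee with no csa_taken = 2 rows at all does not appear in the result, because the WHERE clause removes all of that employee's rows before grouping.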
version = MySQL 8.0
MRE:
create table test_table(
item_id int,
price decimal,
transaction_time datetime
);
insert into test_table(item_id, price, transaction_time)
Values (1, 5500, "2020-01-01 00:11:11")
, (1, 1000, "2020-01-07 01:11:11")
, (3, 1100, "2020-01-06 18:10:10")
, (3, 7700, "2020-01-03 18:10:10")
, (4, 1900, "2020-01-02 12:00:11");
Using a window function to get the cumulative price for each item_id, I run:
select *
, sum(price) over(partition by item_id) as cum_fee
from test_table;
which outputs:
item_id price transaction_time cum_fee
1 5500 2020-01-01 00:11:11 6500
1 1000 2020-01-07 01:11:11 6500
3 1100 2020-01-06 18:10:10 8800
3 7700 2020-01-03 18:10:10 8800
4 1900 2020-01-02 12:00:11 1900
Now I want to get rid of the duplicate item_id rows. I added the window function because I want to remove the duplicates but keep their cumulative price, "cum_fee".
My initial attempt was to group by item_id at the end:
select *
, sum(price) over(partition by item_id) as cum_fee
from test_table
group by item_id;
This seems to group by item_id first and then run the window function, outputting:
item_id price transaction_time cum_fee
1 5500 2020-01-01 00:11:11 5500
3 1100 2020-01-06 18:10:10 1100
4 1900 2020-01-02 12:00:11 1900
I know people compare group by vs. window functions, which probably means we use one or the other but not both. Is that true?
If yes, what is an alternative method to achieve my goal?
You seem to want aggregation. Perhaps this?
select item_id, min(price), min(transaction_time), sum(price)
from test_table
group by item_id;
Window functions do not change the number of rows. That is what group by does.
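The difference in row counts is easy to see side by side. The sketch below loads the MRE into an in-memory SQLite database through Python's built-in sqlite3 module (an assumption; the question targets MySQL 8.0, and price is stored as INTEGER here for simplicity) and runs both forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_table (item_id INTEGER, price INTEGER, transaction_time TEXT)"
)
conn.executemany(
    "INSERT INTO test_table VALUES (?, ?, ?)",
    [
        (1, 5500, "2020-01-01 00:11:11"),
        (1, 1000, "2020-01-07 01:11:11"),
        (3, 1100, "2020-01-06 18:10:10"),
        (3, 7700, "2020-01-03 18:10:10"),
        (4, 1900, "2020-01-02 12:00:11"),
    ],
)

# Window function: one output row per input row, total repeated on each.
windowed = conn.execute(
    "SELECT item_id, SUM(price) OVER (PARTITION BY item_id) AS cum_fee FROM test_table"
).fetchall()

# Aggregation: one output row per item_id.
grouped = conn.execute(
    """
    SELECT item_id, MIN(transaction_time) AS first_time, SUM(price) AS cum_fee
    FROM test_table
    GROUP BY item_id
    ORDER BY item_id
    """
).fetchall()

print(len(windowed), len(grouped))
print(grouped)
```

The windowed query returns five rows and the grouped one returns three, each carrying the per-item total.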
Trying to wrap my mind around how to write this SQL query.
Table X has 3 Columns: Year, ID, Value and looks like so
Year | ID | Value
2013 101 10000
2014 101 11000
2015 101 12000
2013 102 7000
2014 102 8000
2015 102 9000
And table Y has 3 Columns: ID, Curr_Year_Val, Next_Year_Val and looks like this
ID | Curr_Year_Val | Next_Year_Val
101 13000 14000
102 6000 5000
I would like to write a select statement to join these two tables together, but maintain the layout of Table X, like so:
Year | ID | Value
2013 101 10000
2014 101 11000
2015 101 12000
Curr_Year_Val 101 13000
Next_Year_Val 101 14000
Is there a way to achieve this result? I've figured out how to just do a left join to add the columns from table y to table x, but would rather have the columns from table y unpivoted to the rows of table x. Thanks much in advance - this seems like it should be so easy, I've been googling for hours but I'm probably not using the proper terminology for what I'm trying to do in my searches.
Thanks!
Sounds like you should use union all:
select year, id, value from x
union all
select 'curr_year_val', id, curr_year_val from y
union all
select 'next_year_val', id, next_year_val from y
order by 2, 1
BTW, many other databases require the corresponding columns to have compatible data types when using union, so there you would need to cast year to a string. MySQL converts the types implicitly, so this works as written.
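Here is a sketch of the unpivot-by-union approach run on the sample data with Python's built-in sqlite3 module (an assumption; the question does not require SQLite). The explicit CAST of year to text is an addition for portability, so that the first column has one type in every branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE x (year INTEGER, id INTEGER, value INTEGER)")
conn.execute(
    "CREATE TABLE y (id INTEGER, curr_year_val INTEGER, next_year_val INTEGER)"
)
conn.executemany(
    "INSERT INTO x VALUES (?, ?, ?)",
    [
        (2013, 101, 10000), (2014, 101, 11000), (2015, 101, 12000),
        (2013, 102, 7000), (2014, 102, 8000), (2015, 102, 9000),
    ],
)
conn.executemany(
    "INSERT INTO y VALUES (?, ?, ?)",
    [(101, 13000, 14000), (102, 6000, 5000)],
)

# Each column of y becomes its own SELECT branch, labelled with a literal.
combined = conn.execute(
    """
    SELECT CAST(year AS TEXT) AS year, id, value FROM x
    UNION ALL
    SELECT 'curr_year_val', id, curr_year_val FROM y
    UNION ALL
    SELECT 'next_year_val', id, next_year_val FROM y
    ORDER BY 2, 1
    """
).fetchall()

for row in combined:
    print(row)
```

Because digits sort before lowercase letters, ordering by id and then by the label column places the year rows first and the two unpivoted rows after them for each id.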
Use union:
select year, id, value
from tableX
where id ='101'
union
select 'curr_year_val', id, curr_year_val
from tableY
where id ='101'
union
select 'next_year_val', id, next_year_val
from tableY
where id ='101'
I have table with name orders:
id id_o value date
1 1 400 2014-09-30
2 1 300 2014-09-30
3 1 200 2014-09-30
4 2 100 2014-09-30
5 2 200 2014-09-30
6 3 50 2014-09-29
7 3 100 2014-09-29
8 4 300 2014-09-29
9 5 600 2014-09-28
I need to select every order, grouped by id_o, with sum(value) < 700, and from that result I need to display the data grouped by date.
I use a nested select:
select date, sum(mno) as mn
from (
select date,sum(value) as 'mno'
from orders
group by id_o
having sum(value)<700
) table_alias
group by date
This is result:
date mn
2014-09-30 300
2014-09-29 450
2014-09-28 600
Is there any way to replace or simplify this nested query?
Your inner query is invalid. It groups by id_o but selects date, which is neither grouped nor aggregated. To fix this, add date to the inner query's GROUP BY (assuming date is always the same for every id_o). You can enforce strict checking by enabling ONLY_FULL_GROUP_BY in sql_mode.
SELECT
date,
SUM(mno) AS mn
FROM (
SELECT
id_o,
date,
SUM(value) AS mno
FROM orders
GROUP BY
id_o,
date
HAVING
SUM(value) < 700
) totalPerOrder
GROUP BY date
MySQL allows this type of query when ONLY_FULL_GROUP_BY is disabled, but most other database systems reject it. Consider the following data:
id id_o value date
1 1 400 2014-09-29
2 1 300 2014-09-30
3 1 200 2014-09-30
What date(s) would SELECT date, SUM(value) FROM orders GROUP BY id_o return? It could be the first, the last, the average, or the most common one, so it is better to make it explicit. Most other DBMSs wouldn't let you execute this query.
Other than that, I would rename some of the columns to be more expressive; mn, mno and id_o are examples of this. Also, value describes nothing, as anything can be a value. Even the date field could have been called value. The query itself seems fine (watch out for possibly missing indexes, though).
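As a quick check, the corrected query (grouping the inner query by both id_o and date) can be run on the sample orders. This sketch uses Python's built-in sqlite3 module rather than MySQL (an assumption), with the table and column names taken from the question and the derived-table alias `total_per_order` chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, id_o INTEGER, value INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 1, 400, "2014-09-30"), (2, 1, 300, "2014-09-30"),
        (3, 1, 200, "2014-09-30"), (4, 2, 100, "2014-09-30"),
        (5, 2, 200, "2014-09-30"), (6, 3, 50, "2014-09-29"),
        (7, 3, 100, "2014-09-29"), (8, 4, 300, "2014-09-29"),
        (9, 5, 600, "2014-09-28"),
    ],
)

# Inner query: total per order, keeping only orders under 700.
# Outer query: sum those order totals per date.
per_date = conn.execute(
    """
    SELECT date, SUM(mno) AS mn
    FROM (
        SELECT id_o, date, SUM(value) AS mno
        FROM orders
        GROUP BY id_o, date
        HAVING SUM(value) < 700
    ) AS total_per_order
    GROUP BY date
    ORDER BY date DESC
    """
).fetchall()

print(per_date)
```

Order 1 totals 900 and is filtered out by the HAVING clause, which is why 2014-09-30 shows only the 300 from order 2.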