SQL query to clean up junk records - MySQL

Somehow a table has ended up with junk data. I need to clean it up and generate a new table.
I think it should use CASE or ROW_NUMBER() OVER, but I tried a few variations and they failed.
The database is MySQL.
Original table:
Student  Registration  Course
John                   CS
John     2018
John     2017
Peter    2019          MATH
Mary     2016          MATH
Mary     2016          CS
The rule is: if a student has duplicate records, merge them into one. For Registration, take the maximum year. If no column is missing, as with Mary, order by Course ascending and take the first record. So the result will be:
Student  Registration  Course
John     2018          CS
Peter    2019          MATH
Mary     2016          CS

It looks like you want aggregation:
select student
     , max(registration) as registration
     , min(course) as course
from original
group by student;

SELECT Student, MAX(Registration), MAX(Course)
-- or MIN(Course) if you want the first alphabetical
FROM YourTable
GROUP BY Student
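The aggregation answers above rely on MAX and MIN skipping NULLs. As a minimal sketch, assuming the blank cells might be stored as empty strings rather than NULL (the question does not say), NULLIF can turn them into NULL first. The column types and the target table name cleaned are assumptions for illustration:

```sql
-- Hypothetical setup: blank Course cells are stored as '' here,
-- blank Registration cells as NULL.
CREATE TABLE original (
    student      VARCHAR(20),
    registration INT,
    course       VARCHAR(20)
);

INSERT INTO original VALUES
    ('John',  NULL, 'CS'),
    ('John',  2018, ''),
    ('John',  2017, ''),
    ('Peter', 2019, 'MATH'),
    ('Mary',  2016, 'MATH'),
    ('Mary',  2016, 'CS');

-- Build the cleaned table (the name "cleaned" is an assumption).
-- NULLIF(course, '') maps empty strings to NULL so MIN ignores them.
CREATE TABLE cleaned AS
SELECT student,
       MAX(registration)      AS registration,
       MIN(NULLIF(course, '')) AS course
FROM original
GROUP BY student;

-- cleaned now holds: John 2018 CS, Peter 2019 MATH, Mary 2016 CS
SELECT * FROM cleaned ORDER BY student;
```

Without the NULLIF, MIN(course) would return the empty string for John, because '' sorts before 'CS'.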

Related

Trouble with Group By and Having in SQL

I am trying to learn Group By and Having but I can't seem to understand what happened here. I used the W3Schools SQL Tryit Editor.
The table I created is:
name age country
------------------------
Sara 17 America
David 21 America
Jared 27 America
Jane 54 Canada
Rob 32 Canada
Matthew 62 Canada
The Query I used:
select
sum(age), country
from
NewTable
group by
country
having
age>25;
I expected the query to categorize the information by country and use age>25 filter to create the results but here is the output:
sum(age) country
--------------------
65 America
148 Canada
What happened?! The result is sum of American and Canadian people in all ages.
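The piece you're missing is specific to the having keyword. The having clause in your query is applied to the dataset after the grouping occurs.
It sounds like you are expecting the records with age 25 or less to be excluded from your query before grouping occurs. But the having clause filters whole groups, not individual rows: a group is kept only when the condition holds for it. Because age is not an aggregate here, a lenient engine typically checks it against a single row picked from each group (stricter engines reject the query instead), so both groups passed and their full sums were returned.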
If you want to exclude individual records before totaling the sum of the age, you could do something like this (using a where clause which is applied prior to grouping):
select sum(age), country from NewTable where age > 25 group by country;
A where clause puts a condition on which rows participate in the results.
A having clause is like a where, but puts a condition on which grouped (or aggregated) values participate in the results.
Either try this:
select sum(age), country
from NewTable
where age > 25 -- where puts condition on raw rows
group by country
or this:
select sum(age), country
from NewTable
group by country
having sum(age) > 25 -- having puts a condition on groups
depending on what you're trying to do.
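The two clauses can also be combined. The following is a small sketch using the question's sample data; the threshold of 100 in the HAVING clause is an arbitrary value chosen for illustration, and the column types are assumptions:

```sql
CREATE TABLE NewTable (
    name    VARCHAR(20),
    age     INT,
    country VARCHAR(20)
);

INSERT INTO NewTable VALUES
    ('Sara',    17, 'America'),
    ('David',   21, 'America'),
    ('Jared',   27, 'America'),
    ('Jane',    54, 'Canada'),
    ('Rob',     32, 'Canada'),
    ('Matthew', 62, 'Canada');

-- WHERE drops individual rows before grouping;
-- HAVING then drops whole groups based on the aggregate.
SELECT country, SUM(age) AS total_age
FROM NewTable
WHERE age > 25
GROUP BY country
HAVING SUM(age) > 100;

-- America: only Jared (27) survives WHERE, total 27, removed by HAVING
-- Canada:  54 + 32 + 62 = 148, kept
```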

Distinct is not working in crystal reports and in mysql

id_no doc_id item_no product customer
123 2 1 A Daisy
123 2 9 A Ben
123 4 3 A Daisy
123 4 4 A Ben
123 6 11 B Daisy
123 6 13 B Ben
When I put it in my report, the result is:
Daisy Daisy
Ben
I get the same result in MySQL:
select distinct customer from receipt where id_no like '123'
result:
Daisy
Daisy
Ben
Another query that I tried:
select distinct id_no, customer, product from receipt where id_no like '123'
result:
123 Daisy A
123 Daisy B
123 Daisy A
123 Ben A
123 Ben B
desired result:
Daisy
Ben
Please help me please.
Thank you all for the help. I found out why the other one keeps showing: the other Daisy is spelled Daissy.
Most likely the Customer value differs between the two records by some additional characters. Depending on how the data type is implemented, spaces could matter and may have contributed to the difference.
Try concatenating a character before and after customer.
I am unfamiliar with the concepts in Crystal Reports, but from what I understand, you would have to create a formula like so:
"XXX" & {Receipt.Customer} & "XXX"
If you run it again, you might notice an additional space, like so:
XXXDaisyXXX
XXXDaisy XXX
^____ Additional Space
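The same check can be run directly in MySQL. This is a minimal sketch, assuming a simplified receipt table with only the two relevant columns (the column types are guesses); it wraps each value in visible markers and shows its length and raw bytes, so stray spaces or a misspelling stand out:

```sql
-- Hypothetical, reduced version of the receipt table.
CREATE TABLE receipt (
    id_no    VARCHAR(10),
    customer VARCHAR(50)
);

INSERT INTO receipt VALUES
    ('123', 'Daisy'),
    ('123', 'Daissy'),  -- misspelled value that looks like a duplicate
    ('123', 'Ben');

-- DISTINCT is applied to the marked-up value, so any hidden
-- difference produces a separate row that is easy to inspect.
SELECT DISTINCT
       CONCAT('[', customer, ']') AS marked,
       CHAR_LENGTH(customer)      AS len,
       HEX(customer)              AS bytes_hex
FROM receipt
WHERE id_no = '123';
```

Two rows that look alike but differ in len or bytes_hex are not actually duplicates, which is why DISTINCT keeps both.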
DISTINCT itself should not be the problem: it returns distinct values. Anyway, you can try another way:
SELECT customer FROM receipt WHERE id_no like '123' GROUP BY customer
I don't see why you are getting three records. I recreated your table and ran your query, and it returned the expected result.
There may be some issue with the data type you used. You may try grouping by customer, but I don't think it should affect your result anyway.
Also check that the data types match.
The combination of customer and id_no you selected is unique, so with distinct it should return only two rows.
Please try this code; it worked for me:
select distinct `customer` from receipt where `id_no`='123'
I used this in a past project.
Best of luck.

Getting data from multiple tables into single row while concatenating some values

I'm trying to retrieve data from tables and combine multiple rows into a single column, without repeating any information.
I have the following tables: profile, qualification, projects.
Profile
pro_id surname firstname
------ ------- ----------
1 John James
2 King Fred
3 Luxury-Yachts Raymond
Qualification
pro_id Degree School Year
------ ------ ------ -----
1 MBA Wharton university 2002
1 LLB Yale University 2001
2 BSc Covington University 1998
2 BEd Kellog University 1995
Projects
pro_id Title Year
------ ------ ------
1 Social Networking 2003
1 Excavation of aquatic debris 2007
2 Design of solar radios 1992
2 Development of expert systems 2011
I want to retrieve all of the information for each person, with each person appearing only once in the result. The info on qualifications and projects should each be in their own column (one column for qualifications, another for projects), separated by commas. For example, the results for the above sample data should be:
1 John James MBA Wharton university 2002, LLB Yale University 2001 Social Networking 2003, Excavation of aquatic debris 2007, Design of Solar panels 2008
2 King Fred BSc Covington University 1998, BEd Kellog University 1995, Msc MIT 2011 Design of solar radios 1992, Development of expert systems 2011
3 Raymond Luxury-Yachts
Currently, I have the query:
SELECT pro_id,
surname,
firstname,
group_concat(degree,school,year) AS qual,
concat(Title,year) AS work
FROM profile,
LEFT JOIN qualification
ON qualification.pro_id = profile.pro_id
JOIN projects
ON projects.pro_id = profile.pro_id
GROUP BY pro_id
For the sample data, this query results in:
1 John James MBA Wharton university 2002, Social Networking 2003
1 John James LLB Yale University 2001, Excavation of aquatic debris 2007
1 John James MBA Wharton university 2002, Social Networking 2003, Excavation of aquatic debris 2007
etc
Note: Raymond Luxury-Yachts isn't present in the current result.
I don't want duplicate result records. Also if the surname does not have any entry in the qualification and projects table, I want the query to return the name and display an empty field in the qualification and projects table instead of omitting them altogether.
Replace the plain JOIN on projects with a LEFT JOIN:
Select profile.pro_id, surname, firstname, group_concat(degree,school,qualification.year) as qual,concat(Title,projects.year) as work
from profile
left join qualification on qualification.pro_id = profile.pro_id
left join projects on projects.pro_id = profile.pro_id group by profile.pro_id
What is the difference between "INNER JOIN" and "OUTER JOIN"?
Using LEFT JOIN will fix the issue of people not being displayed when they have no records in the projects table.
For the first question, you can try making a stored function and calling it from the select statement. This function will take pro_id as parameter, create the concatenated string and return it. That's the only solution for MySQL that I can think of at the moment.
I think you are close with your idea of using group_concat. However, people with no matching rows leave nulls, which can cause problems. I would pre-concatenate each secondary table by person's ID and join to THAT result. This eliminates the problem of nulls:
SELECT
p.pro_id,
p.surname,
p.firstname,
PreQConcat.UserQual,
PrePJConcat.UserWork
FROM
profile p
LEFT JOIN
( select q.pro_id,
group_concat( concat_ws(' ', q.degree, q.school, q.year) separator ', ') AS UserQual
from
qualification q
group by
q.pro_id ) PreQConcat
ON p.Pro_ID = PreQConcat.pro_id
LEFT JOIN
( select pj.pro_id,
group_concat( concat_ws(' ', pj.Title, pj.year) separator ', ') AS UserWork
from
projects pj
group by
pj.pro_id ) PrePJConcat
ON p.Pro_ID = PrePJConcat.pro_id
You are going through all people anyhow, and want all their respective elements (when they exist) grouped, so why group on rows that might not exist? Let each joined subquery run once, producing a single grouped row for only those people it has data for, then join back to the original profile row.
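The pre-aggregation idea can be tried end to end with a subset of the question's sample data. This is a sketch under a few assumptions: the column types are guesses, COALESCE is added to show an empty field instead of NULL (as the question asks), and the ORDER BY inside GROUP_CONCAT is only there to make the output order predictable:

```sql
CREATE TABLE profile (pro_id INT PRIMARY KEY, surname VARCHAR(50), firstname VARCHAR(50));
CREATE TABLE qualification (pro_id INT, degree VARCHAR(20), school VARCHAR(50), year INT);
CREATE TABLE projects (pro_id INT, title VARCHAR(100), year INT);

INSERT INTO profile VALUES
    (1, 'John', 'James'),
    (3, 'Luxury-Yachts', 'Raymond');
INSERT INTO qualification VALUES
    (1, 'MBA', 'Wharton university', 2002),
    (1, 'LLB', 'Yale University', 2001);
INSERT INTO projects VALUES
    (1, 'Social Networking', 2003),
    (1, 'Excavation of aquatic debris', 2007);

-- Each child table is collapsed to one row per person before joining,
-- so the two lists cannot multiply each other's rows.
SELECT p.pro_id, p.surname, p.firstname,
       COALESCE(q.quals, '')  AS qual,
       COALESCE(pj.works, '') AS work
FROM profile p
LEFT JOIN (
    SELECT pro_id,
           GROUP_CONCAT(CONCAT_WS(' ', degree, school, year)
                        ORDER BY year DESC SEPARATOR ', ') AS quals
    FROM qualification
    GROUP BY pro_id
) q ON q.pro_id = p.pro_id
LEFT JOIN (
    SELECT pro_id,
           GROUP_CONCAT(CONCAT_WS(' ', title, year)
                        ORDER BY year DESC SEPARATOR ', ') AS works
    FROM projects
    GROUP BY pro_id
) pj ON pj.pro_id = p.pro_id
ORDER BY p.pro_id;
```

Person 1 comes back once with both lists filled in, and person 3 comes back with two empty strings rather than being dropped.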

Extract values of a sorted request with SQL

I have this sql table with people's name and ages.
Bob 28
Bryan 30
Jim 25
John 42
Bill 22
Sam 28
Tom 26
I would like to write a SQL command that orders all people by age desc, finds a name in the result, and returns the preceding row, the matching row, and the next row, each with its position.
For example, suppose I would like to find Tom; my request should return:
Name Age Rank
Jim 25 2
Tom 26 3
Bob 28 4
Jim has the number 2 because Bill is the youngest
Is it possible to do something like this ?
Thanks in advance for any help
SQL isn't suited for row-based operations. There's no easy way to do "find a row where some condition(s) = true, then return the previous row" in a single query. You can do it in a couple steps, though:
a) Run one query to retrieve 'Tom' and his age (26).
b) Run another query to get the next older person
SELECT name, age FROM ... WHERE age > 26 ORDER BY age ASC LIMIT 1
c) Repeat but for next younger:
SELECT name, age FROM ... WHERE age < 26 ORDER BY age DESC LIMIT 1
This'll fetch people who are at least 1 year older/younger. You don't specify what happens if there are multiple people of the same age (e.g. there's Fred who's also 26, or Doug and Elmer who are both 25), so I'm ignoring those conditions.
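On MySQL 8.0 or later, window functions offer a single-query alternative to the steps above. This is a sketch, not what the answer proposes: the table name people and the column types are assumptions, and ties in age are broken by name so that every row gets a distinct position:

```sql
CREATE TABLE people (name VARCHAR(20), age INT);

INSERT INTO people VALUES
    ('Bob', 28), ('Bryan', 30), ('Jim', 25), ('John', 42),
    ('Bill', 22), ('Sam', 28), ('Tom', 26);

-- Number every row by age (youngest first, as in the question's example),
-- then keep the target row and its two neighbours.
WITH ranked AS (
    SELECT name, age,
           ROW_NUMBER() OVER (ORDER BY age, name) AS rnk
    FROM people
)
SELECT r.name, r.age, r.rnk
FROM ranked r
JOIN ranked t ON t.name = 'Tom'
WHERE r.rnk BETWEEN t.rnk - 1 AND t.rnk + 1
ORDER BY r.rnk;

-- Jim 25 2
-- Tom 26 3
-- Bob 28 4
```

The alias is rnk rather than rank because RANK is a reserved word in MySQL 8.0.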

How can I write a SQL query with multiple COUNTs for different GROUP BYs?

One of my coworkers is working on a SQL query. After several joins (workers to accounts to tasks), she's got some information sort of like this:
Worker Account Date Task_completed
Bob Smith 12345 01/01/2010 Received
Bob Smith 12345 01/01/2010 Received
Bob Smith 12345 01/01/2010 Processed
Sue Jones 23456 01/01/2010 Received
...
Ultimately what she wants is something like this - for each date, for each account, how many tasks did each worker complete for that account?
Worker Account Date Received_count Processed_count
Bob Smith 12345 01/01/2010 2 1
... and there are several other statuses to count.
Getting one of these counts is pretty easy:
SELECT
COUNT(Task_completed)
FROM
(the subselect)
WHERE
Task_completed = 'Received'
GROUP BY
worker, account, date
But I'm not sure the best way to get them all. Essentially we want multiple COUNTs using different GROUP BYs. The best thing I can figure out is to copy and paste the subquery several times, change the WHERE to "Processed", etc, and join all those together, selecting just the count from each one.
Is there a more obvious way to do this?
SELECT worker, account, date,
SUM(task_completed = 'Received') AS received_count,
SUM(task_completed = 'Processed') AS processed_count
FROM mytable
GROUP BY
worker, account, date
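SUM over a comparison works in MySQL because a true comparison evaluates to 1 and a false one to 0. A more portable spelling of the same conditional aggregation uses CASE. The sketch below assumes a table shaped like the question's sample; the column name task_date and the column types are hypothetical:

```sql
CREATE TABLE mytable (
    worker         VARCHAR(50),
    account        INT,
    task_date      DATE,
    task_completed VARCHAR(20)
);

INSERT INTO mytable VALUES
    ('Bob Smith', 12345, '2010-01-01', 'Received'),
    ('Bob Smith', 12345, '2010-01-01', 'Received'),
    ('Bob Smith', 12345, '2010-01-01', 'Processed'),
    ('Sue Jones', 23456, '2010-01-01', 'Received');

-- One pass over the data, one GROUP BY, one counter per status.
SELECT worker, account, task_date,
       SUM(CASE WHEN task_completed = 'Received'  THEN 1 ELSE 0 END) AS received_count,
       SUM(CASE WHEN task_completed = 'Processed' THEN 1 ELSE 0 END) AS processed_count
FROM mytable
GROUP BY worker, account, task_date;

-- Bob Smith  12345  2010-01-01  2  1
-- Sue Jones  23456  2010-01-01  1  0
```

Adding another status to count only needs one more SUM(CASE ...) column.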