I have some messy data in text files (2 tables). I'd like to merge it into 1 table but there are duplication issues. My data looks like the following:
Status table
+-----------+---------+
| Last Name | Status  |
+-----------+---------+
| Jones     | On Time |
| Jones     | On Time |
| Jones     | On Time |
| Jones     | On Time |
| Jones     | Missing |
| Hoinski   | On Time |
| Hoinski   | Late    |
| Hoinski   | Late    |
| Hoinski   | Missing |
+-----------+---------+
Risk table
+-----------+--------+
| Last Name | Risk   |
+-----------+--------+
| Jones     | High   |
| Jones     | High   |
| Jones     | Low    |
| Jones     | Medium |
| Jones     | Medium |
| Jones     | Medium |
| Jones     | Medium |
| Smith     | Low    |
| Smith     | Medium |
| Smith     | Medium |
| Smith     | Medium |
| Hoinski   | High   |
| Hoinski   | High   |
| Hoinski   | Low    |
+-----------+--------+
How can I use SQL to aggregate these two tables into 1 table? Is it possible? I know I do not have a proper relationship (it is many to many), so it doesn't quite make sense. But what if I aggregate the data using GROUP BY statements on the [Last Name] field?
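You're correct that GROUP BY can resolve your problem, as long as every selected column also appears in the GROUP BY clause (SELECT * combined with a partial GROUP BY is rejected by most databases). Here's the query:
SELECT Status.[Last Name], Status.Status, Risk.Risk
FROM Status
INNER JOIN Risk ON Status.[Last Name] = Risk.[Last Name]
GROUP BY Status.[Last Name], Status.Status, Risk.Risk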
Resolve the duplicates with DISTINCT:
SELECT DISTINCT S.[Last Name], S.Status, R.Risk
FROM Status S
INNER JOIN Risk R
ON R.[Last Name] = S.[Last Name]
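To see what the DISTINCT join actually returns, here is a minimal sketch that runs it with Python's sqlite3 module against a reduced copy of the sample data. SQLite is an assumption made only so the example is runnable (it happens to accept the [bracket] identifier quoting); the question itself does not name a database.

```python
import sqlite3

# In-memory SQLite database; a reduced copy of the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Status ([Last Name] TEXT, Status TEXT)")
conn.execute("CREATE TABLE Risk ([Last Name] TEXT, Risk TEXT)")
conn.executemany(
    "INSERT INTO Status VALUES (?, ?)",
    [("Jones", "On Time"), ("Jones", "On Time"), ("Jones", "Missing"),
     ("Hoinski", "On Time"), ("Hoinski", "Late"), ("Hoinski", "Late"),
     ("Hoinski", "Missing")],
)
conn.executemany(
    "INSERT INTO Risk VALUES (?, ?)",
    [("Jones", "High"), ("Jones", "High"), ("Jones", "Low"), ("Jones", "Medium"),
     ("Smith", "Low"), ("Smith", "Medium"),
     ("Hoinski", "High"), ("Hoinski", "Low")],
)

# DISTINCT collapses identical (name, status, risk) combinations produced by the join.
rows = conn.execute(
    """
    SELECT DISTINCT S.[Last Name], S.Status, R.Risk
    FROM Status S
    INNER JOIN Risk R ON R.[Last Name] = S.[Last Name]
    ORDER BY S.[Last Name], S.Status, R.Risk
    """
).fetchall()

for row in rows:
    print(row)
```

Two things are worth noticing in the output: each name still yields every status/risk combination (the many-to-many relationship is not resolved, only de-duplicated), and Smith disappears because an INNER JOIN drops names that exist in only one table.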
This is what my database tables look like:
regular_employee :
+----+---------------+-----------------+-----------+----------------+
| id | employee_name | employee_job | join_date | permanent_date |
+----+---------------+-----------------+-----------+----------------+
| 1 | ZELDA | ACCOUNTANT | 2020/02 | 2020/03 |
| 2 | YUGO | QA | 2020/02 | 2020/04 |
| 3 | XAVIER | GENERAL MANAGER | 2020/02 | 2020/05 |
| 4 | WAVY | FACTORY MANAGER | 2020/01 | 2020/02 |
+----+---------------+-----------------+-----------+----------------+
contract_employee :
+----+---------------+--------------+-----------+
| id | employee_name | employee_job | join_date |
+----+---------------+--------------+-----------+
| 1 | ANTONIO | ACCOUNTANT | 2020/01 |
| 2 | BAGGIO | ENGINEER | 2020/02 |
| 3 | CHARLES | QA | 2020/02 |
| 4 | DAVID | QA | 2020/02 |
+----+---------------+--------------+-----------+
My goal:
Select the employee_job of every employee who joined the company in 2020/02, with duplicates removed (DISTINCT or GROUP BY on employee_job).
What I've tried:
SELECT re.employee_job,ce.employee_job
FROM regular_employee AS re,contract_employee AS ce
WHERE re.join_date = '2020/02' AND ce.join_date = '2020/02'
GROUP BY re.employee_job,ce.employee_job
The result was :
+-----------------+--------------+
| employee_job | employee_job |
+-----------------+--------------+
| ACCOUNTANT | ENGINEER |
| ACCOUNTANT | QA |
| GENERAL MANAGER | ENGINEER |
| GENERAL MANAGER | QA |
| QA | ENGINEER |
| QA | QA |
+-----------------+--------------+
What I was expecting:
+-----------------+
| employee_job |
+-----------------+
| ACCOUNTANT |
| ENGINEER |
| GENERAL MANAGER |
| QA |
+-----------------+
How do I write this query?
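Your query lists both tables in FROM without a join condition, so it returns every combination of the matching rows (a cross product). A UNION query makes more sense here: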
SELECT employee_job FROM regular_employee WHERE join_date = '2020/02'
UNION
SELECT employee_job FROM contract_employee WHERE join_date = '2020/02';
UNION removes duplicate rows by default, so a job that appears in both tables, or several times in one table, is returned only once. Use UNION ALL if you want to keep the duplicates.
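A minimal sketch that runs the UNION query on the sample data, using Python's sqlite3 module. SQLite and the added ORDER BY are assumptions made for a reproducible example; the UNION itself is standard SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE regular_employee (id INTEGER, employee_name TEXT, "
    "employee_job TEXT, join_date TEXT, permanent_date TEXT)"
)
conn.execute(
    "CREATE TABLE contract_employee (id INTEGER, employee_name TEXT, "
    "employee_job TEXT, join_date TEXT)"
)
conn.executemany(
    "INSERT INTO regular_employee VALUES (?, ?, ?, ?, ?)",
    [(1, "ZELDA", "ACCOUNTANT", "2020/02", "2020/03"),
     (2, "YUGO", "QA", "2020/02", "2020/04"),
     (3, "XAVIER", "GENERAL MANAGER", "2020/02", "2020/05"),
     (4, "WAVY", "FACTORY MANAGER", "2020/01", "2020/02")],
)
conn.executemany(
    "INSERT INTO contract_employee VALUES (?, ?, ?, ?)",
    [(1, "ANTONIO", "ACCOUNTANT", "2020/01"),
     (2, "BAGGIO", "ENGINEER", "2020/02"),
     (3, "CHARLES", "QA", "2020/02"),
     (4, "DAVID", "QA", "2020/02")],
)

# UNION de-duplicates across both SELECTs; ORDER BY applies to the combined result.
jobs = [
    row[0]
    for row in conn.execute(
        """
        SELECT employee_job FROM regular_employee WHERE join_date = '2020/02'
        UNION
        SELECT employee_job FROM contract_employee WHERE join_date = '2020/02'
        ORDER BY employee_job
        """
    )
]
print(jobs)  # ['ACCOUNTANT', 'ENGINEER', 'GENERAL MANAGER', 'QA']
```

QA appears three times in the filtered input (once in regular_employee, twice in contract_employee) but only once in the result, which matches the expected output in the question.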
I have created a table in MySQL, but when I display the table, some records are displayed misaligned.
Here's the table displayed:
select * from air_passenger_profile;
+------------+----------+------------+-----------+---------------------------------+---------------+---------------------+
| profile_id | password | first_name | last_name | address | mobile_number | email_id |
+------------+----------+------------+-----------+---------------------------------+---------------+---------------------+
| PFL001 | PFL001 | LATHA | SANKAR | 123 BROAD CROSS ST,CHENNAI-48 | 9876543210 | LATHA#GMAIL.COM |
| PFL002 | PFL002 | ARUN | PRAKASH | 768 2ND STREET,BENGALURU-20 | 8094564243 | ARUN#AOL.COM |
| PFL003 | PFL003 | AMIT | VIKARAM | 43 5TH STREET,KOCHI-84 | 9497996990 | AMIT#AOL.COM |
| PFL004 | PFL004 | AARTHI | RAMESH | 343 6TH STREET,HYDERABAD-76 | 9595652530 | AARTHI#GMAIL.COM |
| PFL005 | PFL005 | SIVA | KUMAR | 125 8TH STREET,CHENNAI-46 | 9884416986 | SIVA#GMAIL.COM |
| PFL006 | PFL006 | RAMESH | BABU | 109 2ND CROSS ST,KOCHI-12 | 9432198760 | RAMESH#GMAIL.COM |
| PFL007 | PFL007 | GAYATHRI | RAGHU | 23 2ND CROSS ST,BENGALURU-12 | 8073245678 | GAYATHRI#GMAIL.COM |
| PFL008 | PFL008 | GANESH | KANNAN | 45 3RD ST,HYDERABAD-21 | 9375237890 | GANESH#GMAIL.COM |
+------------+----------+------------+-----------+---------------------------------+---------------+---------------------+
The values are stored in your database with surrounding spaces. At the point where you insert your variables into the database, you could use PHP's trim() function, or MySQL's TRIM(), to store them without the spaces.
To correct your current values:
UPDATE air_passenger_profile SET first_name = TRIM(first_name), last_name = TRIM(last_name);
Repeat the assignment for any other column that is affected.
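As a quick check of the clean-up step, here is a small sketch using Python's sqlite3 module. SQLite stands in for MySQL here (an assumption for runnability; both provide a TRIM() that strips spaces by default), and the padded sample values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE air_passenger_profile "
    "(profile_id TEXT, first_name TEXT, last_name TEXT)"
)
# Hypothetical rows with stray leading/trailing spaces, as the answer suspects.
conn.executemany(
    "INSERT INTO air_passenger_profile VALUES (?, ?, ?)",
    [("PFL001", "LATHA  ", " SANKAR"),
     ("PFL002", "  ARUN ", "PRAKASH")],
)

# Strip the padding in place, one assignment per affected column.
conn.execute(
    "UPDATE air_passenger_profile "
    "SET first_name = TRIM(first_name), last_name = TRIM(last_name)"
)

cleaned = conn.execute(
    "SELECT first_name, last_name FROM air_passenger_profile ORDER BY profile_id"
).fetchall()
print(cleaned)  # [('LATHA', 'SANKAR'), ('ARUN', 'PRAKASH')]
```

Note that a plain TRIM() only removes spaces. If the stray characters turn out to be tabs or carriage returns (for example from a file import), they would need to be removed explicitly.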
I'm working on a MySQL database and I need to find the users with more than one order. I tried using COUNT() but I cannot get it right. Can you please explain the correct way to do this?
Here are my tables:
User
+-------------+----------+------------------+------------+
| userID | fName | email | phone |
+-------------+----------+------------------+------------+
| adele012 | Adele | aash#gmail.com | 0123948498 |
| ana022 | Anna | ashow#gmail.com | 0228374847 |
| david2012 | David | north#gmail.com | 902849302 |
| jefAlan | Jeffery | jefal#gmail.com | 0338473837 |
| josquein | Joseph | jquein#gmail,com | 0098374678 |
| jweiz | John | jwei#gmail.com | 3294783784 |
| jwick123 | John | jwik#gmail.com | 0998398390 |
| kenwipp | Kenneth | kwip#gmail.com | 0112938394 |
| mathCler | Maththew | matc#gmail.com | 0238927483 |
| natalij2012 | Natalie | nj#gmail.com | 1129093210 |
+-------------+----------+------------------+------------+
Orders
+---------+------------+-------------+-------------+
| orderID | date | User_userID | orderStatus |
+---------+------------+-------------+-------------+
| 1 | 2012-01-10 | david2012 | Delivered |
| 2 | 2012-01-15 | jweiz | Delivered |
| 3 | 2013-08-15 | david2012 | Delivered |
| 4 | 2013-03-15 | natalij2012 | Delivered |
| 5 | 2014-03-04 | josquein | Delivered |
| 6 | 2014-01-15 | jweiz | Delivered |
| 7 | 2014-02-15 | josquein | Delivered |
| 8 | 2015-10-12 | jwick123 | Delivered |
| 9 | 2015-02-20 | ana022 | Delivered |
| 10 | 2015-11-20 | kenwipp | Processed |
+---------+------------+-------------+-------------+
select User_userID, count(*) as orders_count from Orders
group by User_userID having orders_count > 1;
If you want additional data from your User table, you can do:
select * from User where userID in (
select User_userID from Orders
group by User_userID having count(*) > 1
);
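Here is a runnable sketch of both queries against the sample data, using Python's sqlite3 module (SQLite is an assumption for the sake of a self-contained example; the question uses MySQL). It uses HAVING COUNT(*) > 1 rather than an alias in HAVING, since that form is portable across databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE User (userID TEXT, fName TEXT)")
conn.execute(
    "CREATE TABLE Orders (orderID INTEGER, date TEXT, "
    "User_userID TEXT, orderStatus TEXT)"
)
conn.executemany(
    "INSERT INTO User VALUES (?, ?)",
    [("adele012", "Adele"), ("ana022", "Anna"), ("david2012", "David"),
     ("jefAlan", "Jeffery"), ("josquein", "Joseph"), ("jweiz", "John"),
     ("jwick123", "John"), ("kenwipp", "Kenneth"), ("mathCler", "Maththew"),
     ("natalij2012", "Natalie")],
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [(1, "2012-01-10", "david2012", "Delivered"),
     (2, "2012-01-15", "jweiz", "Delivered"),
     (3, "2013-08-15", "david2012", "Delivered"),
     (4, "2013-03-15", "natalij2012", "Delivered"),
     (5, "2014-03-04", "josquein", "Delivered"),
     (6, "2014-01-15", "jweiz", "Delivered"),
     (7, "2014-02-15", "josquein", "Delivered"),
     (8, "2015-10-12", "jwick123", "Delivered"),
     (9, "2015-02-20", "ana022", "Delivered"),
     (10, "2015-11-20", "kenwipp", "Processed")],
)

# Order counts per user, keeping only users with more than one order.
counts = conn.execute(
    """
    SELECT User_userID, COUNT(*) AS orders_count
    FROM Orders
    GROUP BY User_userID
    HAVING COUNT(*) > 1
    ORDER BY User_userID
    """
).fetchall()

# Full user rows for those same users, via an IN subquery.
repeat_users = conn.execute(
    """
    SELECT userID, fName FROM User
    WHERE userID IN (
        SELECT User_userID FROM Orders
        GROUP BY User_userID
        HAVING COUNT(*) > 1
    )
    ORDER BY userID
    """
).fetchall()

print(counts)        # [('david2012', 2), ('josquein', 2), ('jweiz', 2)]
print(repeat_users)  # [('david2012', 'David'), ('josquein', 'Joseph'), ('jweiz', 'John')]
```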
I have an Access table containing over 100,000 records. My problem is that many of the records contain duplicate information. I would like to merge each set of duplicate records into a single record.
I have a field (CommonField) that can be used to identify the duplicates (sometimes more than two records). Each field needs to be considered on an individual basis. For instance:
If the date fields are not equal, I would prefer to keep the most recent date.
If the count fields are not equal, I would prefer to keep the larger value.
If the company names are not equal, I would prefer to keep both names unless one is within the other.
Here is a sample of the data:
+------------------+-------------+-------+-------+------------------+-----------+------------+--------+-----------------------------+
| Existing Records | | | | | | | | |
+------------------+-------------+-------+-------+------------------+-----------+------------+--------+-----------------------------+
| ID | CommonField | First | Last | Email | Date | Currency | Count | Company |
| 1 | AA123 | John | | | | $465,000 | | ABC Company Ltd |
| 2 | AA123 | John | | John#gmail.com | 1-Mar-78 | $465,000 | 87,000 | ABC Company |
| 3 | AA123 | | Doe | | 14-Mar-78 | $465,000 | 88,000 | |
| 4 | BB456 | Dave | Smith | | 1-Apr-92 | $1,200,000 | 5,000 | Carter Company |
| 5 | BB456 | | Smith | Dave#aol.com | 1-Apr-92 | $1,200,000 | 5,000 | Simpson Ltd |
| 6 | CC568 | | | Jane#hotmail.com | 1-Sep-05 | $60,000 | | Woods Holdings |
| 7 | CC568 | | Woods | Jane#hotmail.com | | | 40,000 | Woods |
| 8 | CC568 | Jane | Woods | | 1-Sep-05 | | | |
| 9 | DD211 | Bob | Burns | Bob#gmail.com | 5-Aug-01 | $678,100 | 21,400 | |
| | | | | | | | | |
| Desired Result | | | | | | | | |
| ID | CommonField | First | Last | Email | Date | Currency | Count | Company |
| 10 | AA123 | John | Doe | John#gmail.com | 14-Mar-78 | $465,000 | 88,000 | ABC Company Ltd |
| 11 | BB456 | Dave | Smith | Dave#aol.com | 1-Apr-92 | $1,200,000 | 5,000 | Carter Company, Simpson Ltd |
| 12 | CC568 | Jane | Woods | Jane#hotmail.com | 1-Sep-05 | $60,000 | 40,000 | Woods Holdings |
| 13 | DD211 | Bob | Burns | Bob#gmail.com | 5-Aug-01 | $678,100 | 21,400 | |
+------------------+-------------+-------+-------+------------------+-----------+------------+--------+-----------------------------+
I am interested in hearing your suggestions as to the best way of tackling this project.
Ugly.
I think for the name fields, you may need another table to combine names. I'd start by making a new table from a group by query on both the common id and the company name. Add an extra field for the standardized name to the table, then use a find duplicates query to look at all the common ids with more than one name and manually assign a standardized name.
Then you can bring both the original data table and the company names table into a group by query and pull the standardized name into the final result. For the date and count fields, you can use max(date) and max(count). This should work for the first, last and email text fields also - but you will want to manually examine the results pretty carefully.
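The max() part of this suggestion can be sketched as follows, using Python's sqlite3 module rather than Access (an assumption made so the example is runnable). The table name Records and the column names first_name, last_name, record_date and record_count are hypothetical stand-ins for the fields in the question, blanks are stored as NULL, and dates are assumed to be stored in ISO format so that MAX picks the most recent one. The company name field is deliberately left out, since it needs the manual standardization step described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Records (id INTEGER, CommonField TEXT, first_name TEXT, "
    "last_name TEXT, email TEXT, record_date TEXT, record_count INTEGER)"
)
# A subset of the sample rows; empty cells become NULL (None).
conn.executemany(
    "INSERT INTO Records VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(1, "AA123", "John", None, None, None, None),
     (2, "AA123", "John", None, "John#gmail.com", "1978-03-01", 87000),
     (3, "AA123", None, "Doe", None, "1978-03-14", 88000),
     (4, "BB456", "Dave", "Smith", None, "1992-04-01", 5000),
     (5, "BB456", None, "Smith", "Dave#aol.com", "1992-04-01", 5000),
     (9, "DD211", "Bob", "Burns", "Bob#gmail.com", "2001-08-05", 21400)],
)

# MAX ignores NULLs, so each column keeps a non-empty value when one exists:
# the latest date, the largest count, and the alphabetically last text value.
merged = conn.execute(
    """
    SELECT CommonField,
           MAX(first_name), MAX(last_name), MAX(email),
           MAX(record_date), MAX(record_count)
    FROM Records
    GROUP BY CommonField
    ORDER BY CommonField
    """
).fetchall()

for row in merged:
    print(row)
```

For text fields, MAX simply picks the alphabetically last non-empty value, so it only gives the intended result when the duplicates agree or all but one are blank. That is why the results need to be checked by hand.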
I'm stuck with some tables in MySQL. I don't really know how to join the info from three tables. I'd be very thankful if anyone could help me. Thanks.
This is what I have:
Table1.Users
+----+--------+--------------+
| id | name | lastname |
+----+--------+--------------+
| 1 | Peter | Elk |
| 2 | Amy | Lee |
| 3 | James | Ride |
| 4 | Andrea | Thompson |
+----+--------+--------------+
Table2.Projects
+-----+-------------+
| id | name |
+-----+-------------+
| 13 | Lmental |
| 26 | Comunica |
| 28 | Ecobalear |
| 49 | Puigpunyent |
+-----+-------------+
Table3.Users_Projects
+----------+-------------+
| id_users | id_projects |
+----------+-------------+
| 1 | 13 |
| 1 | 28 |
| 2 | 13 |
| 2 | 28 |
| 2 | 49 |
| 3 | 28 |
| 3 | 49 |
| 4 | 49 |
+----------+-------------+
And I would like to print something like this:
+--------+--------------+----------------------------------+
| name | lastname | project |
+--------+--------------+----------------------------------+
| Peter | Elk | Lmental,Ecobalear |
| Amy | Lee | Lmental,Ecobalear,Puigpunyent |
| James | Ride | Ecobalear,Puigpunyent |
| Andrea | Thompson | Puigpunyent |
+--------+--------------+----------------------------------+
Something like...
SELECT Users.name, Users.lastname, Projects.name
FROM (Users, Projects, Users_Projects)
WHERE Users_Projects.id_users=Users.id AND Users_Projects.id_projects=Projects.id
ORDER BY ...
...will output a single user/project pair per line, which you'll then have to combine in your chosen language.
Performing the concatenation in SQL is possible in MySQL with the GROUP_CONCAT() aggregate; in a database without such a function it is liable to lead to a pretty horrendous query.
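For comparison, here is a minimal sketch of doing the concatenation in the database with a GROUP_CONCAT aggregate, which MySQL provides. It is run through Python's sqlite3 module, which is an assumption for runnability (SQLite has a similar group_concat with a comma as the default separator). The aliases u, p and up are introduced here for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT, lastname TEXT)")
conn.execute("CREATE TABLE Projects (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE Users_Projects (id_users INTEGER, id_projects INTEGER)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?, ?)",
    [(1, "Peter", "Elk"), (2, "Amy", "Lee"),
     (3, "James", "Ride"), (4, "Andrea", "Thompson")],
)
conn.executemany(
    "INSERT INTO Projects VALUES (?, ?)",
    [(13, "Lmental"), (26, "Comunica"), (28, "Ecobalear"), (49, "Puigpunyent")],
)
conn.executemany(
    "INSERT INTO Users_Projects VALUES (?, ?)",
    [(1, 13), (1, 28), (2, 13), (2, 28), (2, 49), (3, 28), (3, 49), (4, 49)],
)

# One row per user; the project names of each group are joined with commas.
rows = conn.execute(
    """
    SELECT u.name, u.lastname, GROUP_CONCAT(p.name) AS project
    FROM Users u
    JOIN Users_Projects up ON up.id_users = u.id
    JOIN Projects p ON p.id = up.id_projects
    GROUP BY u.id, u.name, u.lastname
    ORDER BY u.id
    """
).fetchall()

for name, lastname, project in rows:
    print(name, lastname, project)
```

The order of the names inside each concatenated string is not guaranteed by this query. MySQL accepts an ORDER BY inside GROUP_CONCAT(...) if a fixed order matters.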