How to loop through this specific relation model in SQL? - mysql

I have a database with a table called "Relations" that looks as follows:
Relations (PersonId1, PersonId2, RelationTypeId)
The primary key is (PersonId1, PersonId2, RelationTypeId)
There are two other tables that the foreign keys reference, but that does not really matter here.
So a relation is defined, for example, as (Mary, Andre, 3), where 3 references another table and would mean, for example, "a friend".
My requirement is to see all friends of a specific person, but also the friends of that person's friends, so not only the first layer but also the second.
For example this would be the relation table
Andre Mary 3
Mary Carl 3
Chris James 3 (irrelevant in our case)
So I want a query where I have the PersonId of Andre and the RelationTypeId. The result should be this:
Andre Mary 3
Mary Carl 3
In my understanding it is not possible to build a query that would give this result, but I am not sure, which is why I am asking.
Hope you understand my question, thanks in advance.

The query below returns the friends of the given person and the friends of those friends.
select
distinct personId2
from
relations
where
personId1 in (select distinct personId2 from relations where personId1 = <person_name>)
or personId1 = <person_name>

This can be done with a recursive CTE (common table expression). A CTE is the part of a query that starts with WITH. In a recursive CTE, the recursive part refers back to the CTE itself, so it is evaluated repeatedly until no new rows are produced. A query of this kind returns the data subsets you're looking for.
I use it to make data access more efficient, e.g. when I need to select, paginate, or display rows linked to a specific page. It works in MySQL 8.
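As a rough illustration of the recursive CTE idea, here is a minimal sketch using the sample rows from the question. It runs against an in-memory SQLite database through Python's sqlite3 module purely so that it is self-contained; the `WITH RECURSIVE` statement itself should carry over to MySQL 8, though the named-parameter style (`:person`) depends on the driver. The CTE name `friends`, the `depth` column and the `max_depth` parameter are assumptions made for this example, not part of the original schema.

```python
import sqlite3

# SQLite is only a stand-in here; MySQL 8 also accepts WITH RECURSIVE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Relations (PersonId1 TEXT, PersonId2 TEXT, RelationTypeId INTEGER,"
    " PRIMARY KEY (PersonId1, PersonId2, RelationTypeId))"
)
conn.executemany(
    "INSERT INTO Relations VALUES (?, ?, ?)",
    [("Andre", "Mary", 3), ("Mary", "Carl", 3), ("Chris", "James", 3)],
)

query = """
WITH RECURSIVE friends (PersonId1, PersonId2, RelationTypeId, depth) AS (
    -- anchor part: direct relations of the starting person
    SELECT PersonId1, PersonId2, RelationTypeId, 1
    FROM Relations
    WHERE PersonId1 = :person AND RelationTypeId = :rel_type
    UNION ALL
    -- recursive part: relations of the people found so far,
    -- stopping once max_depth layers have been collected
    SELECT r.PersonId1, r.PersonId2, r.RelationTypeId, f.depth + 1
    FROM Relations r
    JOIN friends f ON r.PersonId1 = f.PersonId2
    WHERE r.RelationTypeId = :rel_type AND f.depth < :max_depth
)
SELECT PersonId1, PersonId2, RelationTypeId
FROM friends
ORDER BY depth
"""

rows = conn.execute(
    query, {"person": "Andre", "rel_type": 3, "max_depth": 2}
).fetchall()
for row in rows:
    print(row)
```

The depth limit keeps the query to the two layers asked for and also stops it from looping forever if the relations contain a cycle.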

Related

Data inconsistencies between two tables

I have an SQL question in which I am struggling to understand and find relevant resources to help me.
The question is:
"Write an SQL query to identify data inconsistencies between two tables."
I need to compare the following tables of data:
AssetManager
AssetManagerName
John Doe
Joe Smith
Dave Grey
Lisa Sparks
Kate Green
Trip
PropertyCode AssetManagerName Date
P001 John Doe 2022-01-22
P001 Joe Smith 2022-01-19
P002 Dave Grey 2022-02-25
P002 John Doe 2022-04-23
P003 Kate Greens 2022-02-25
P004 Joe Smith 2022-05-29
P002 Dave Grey 2022-01-25
P001 John Doe 2022-02-24
Image translated to text from Original Source
What are the inconsistencies in this case? Is it maybe that "Kate Green" is in the AssetManager table, and you have "Kate Greens" in the Trip table? That's the only thing I can see.
What MySQL commands could I use that would help me to achieve this query?
In SQL, when we talk about inconsistencies, we generally mean data that would not translate correctly into a normalised form: when we try to join the tables, the result has missing data or orphaned rows. Inconsistencies commonly arise when there are no referential constraints in the schema to maintain consistency. In such cases simple spelling mistakes can easily creep into the dataset, but entirely wrong values could also be used. In this case, if there is a table that represents all the possible Asset Managers, then we would expect other tables that refer to Asset Managers to use only values from that table. Spelling mistakes and entirely missing names are treated the same.
In the Trip Table we can identify inconsistency with the AssetManager table by looking for any records in Trip that do not have a match in AssetManager using the AssetManagerName column.
One simple way to do this is to use an OUTER JOIN and to exclude all the matches:
SELECT Trip.*
FROM Trip
LEFT OUTER JOIN AssetManager ON Trip.AssetManagerName = AssetManager.AssetManagerName
WHERE AssetManager.AssetManagerName IS NULL
This returns the following result: (See db-fiddle)
PropertyCode AssetManagerName Date
P003 Kate Greens 2022-02-25
The LEFT OUTER JOIN (or LEFT JOIN) will return all the rows from the Trip table, even if there is no corresponding match in the AssetManager table on the AssetManagerName column. For the rows that do not match, all the values for the AssetManager table in the result set will be NULL.
We can then use a WHERE clause to exclude all the matching records and return only those records that DO NOT MATCH. We do this by only allowing rows where AssetManager.AssetManagerName is null.
There are no records in Trip with a legitimate null value in AssetManagerName; the null only exists in the recordset as a result of the LEFT OUTER JOIN evaluation.
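To see the outer-join approach end to end, the following is a small sketch that loads the sample data from the question into an in-memory SQLite database (used here only as a convenient stand-in for MySQL) and runs the same LEFT OUTER JOIN query. The variable name `orphans` is just an illustrative choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AssetManager (AssetManagerName TEXT)")
conn.execute(
    'CREATE TABLE Trip (PropertyCode TEXT, AssetManagerName TEXT, "Date" TEXT)'
)
conn.executemany(
    "INSERT INTO AssetManager VALUES (?)",
    [("John Doe",), ("Joe Smith",), ("Dave Grey",), ("Lisa Sparks",), ("Kate Green",)],
)
conn.executemany(
    "INSERT INTO Trip VALUES (?, ?, ?)",
    [
        ("P001", "John Doe", "2022-01-22"),
        ("P001", "Joe Smith", "2022-01-19"),
        ("P002", "Dave Grey", "2022-02-25"),
        ("P002", "John Doe", "2022-04-23"),
        ("P003", "Kate Greens", "2022-02-25"),
        ("P004", "Joe Smith", "2022-05-29"),
        ("P002", "Dave Grey", "2022-01-25"),
        ("P001", "John Doe", "2022-02-24"),
    ],
)

# Keep every Trip row, then retain only those with no matching AssetManager.
orphans = conn.execute(
    """
    SELECT Trip.*
    FROM Trip
    LEFT OUTER JOIN AssetManager
        ON Trip.AssetManagerName = AssetManager.AssetManagerName
    WHERE AssetManager.AssetManagerName IS NULL
    """
).fetchall()
print(orphans)
```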
You could also use a NOT EXISTS clause. This syntax is sometimes easier to read and makes the intent clear: we want to find the records that DO NOT MATCH. But specifically in MySQL, its execution plan is generally less efficient than the LEFT OUTER JOIN expression above.
SELECT Trip.*
FROM Trip
WHERE NOT EXISTS (
SELECT AssetManager.AssetManagerName
FROM AssetManager
WHERE AssetManager.AssetManagerName = Trip.AssetManagerName
)
Another variation of this is to use NOT IN. For this query we first evaluate a list of possible values for AssetManagerName and use that to identify the values that do not match.
Be careful when there might be NULL values in AssetManagerName in either table, as NOT IN handles NULL values differently to NOT EXISTS: if the subquery returns even one NULL, NOT IN matches no rows at all.
SELECT Trip.*
FROM Trip
WHERE Trip.AssetManagerName NOT IN (
SELECT AssetManager.AssetManagerName
FROM AssetManager
)
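The difference in NULL handling between NOT IN and NOT EXISTS is easy to overlook, so here is a small sketch of it. It uses an in-memory SQLite database as a stand-in for MySQL (both follow the standard three-valued logic here) and a deliberately tiny, made-up data set in which AssetManager contains one NULL name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE AssetManager (AssetManagerName TEXT)")
conn.execute("CREATE TABLE Trip (PropertyCode TEXT, AssetManagerName TEXT)")
# Hypothetical data: one valid manager plus one NULL name.
conn.executemany("INSERT INTO AssetManager VALUES (?)", [("Kate Green",), (None,)])
conn.executemany("INSERT INTO Trip VALUES (?, ?)", [("P003", "Kate Greens")])

# 'Kate Greens' NOT IN ('Kate Green', NULL) evaluates to NULL, not TRUE,
# so the row is filtered out.
not_in_rows = conn.execute(
    """
    SELECT PropertyCode FROM Trip
    WHERE AssetManagerName NOT IN (SELECT AssetManagerName FROM AssetManager)
    """
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so the NULL is harmless.
not_exists_rows = conn.execute(
    """
    SELECT PropertyCode FROM Trip
    WHERE NOT EXISTS (
        SELECT 1 FROM AssetManager
        WHERE AssetManager.AssetManagerName = Trip.AssetManagerName
    )
    """
).fetchall()

print("NOT IN:", not_in_rows)
print("NOT EXISTS:", not_exists_rows)
```

If NULLs are possible in the subquery column, filtering them out with `WHERE AssetManagerName IS NOT NULL` inside the subquery is one way to keep NOT IN usable.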
For an interesting analysis of these options and their performance considerations, have a read of this article:
NOT IN vs. NOT EXISTS vs. LEFT JOIN / IS NULL: MySQL

Access: Counting Number of Occurrences in 2 Columns [closed]

I'm working on a database for work, and I need to figure out a way for Access to count the number of projects that each employee is assigned. Projects have 1 or 2 employees assigned, and my boss needs to be able to quickly figure out how many projects each person is working on. Below is an example table:
Project Employee 1 Employee 2
Project A John Doe Jane Doe
Project B Jane Doe Sam Smith
Project C Jane Doe John Doe
Project D Sam Smith Anna Smith
Project E Anna Smith John Doe
And here is the result I'm looking for:
**Employee # of Projects**
John Doe 3
Jane Doe 3
Sam Smith 2
Anna Smith 2
The table you described is probably not the best way to store the data and I think it's only making your job more difficult. The value of a relational database is that you can have data living in different tables but related based on primary/ foreign keys which makes it significantly easier to pull reports like the one you described. It seems to me like this table might have previously lived in Excel, and I would spend some time now establishing relationships in Access which will save you time and headaches later. I would suggest creating 3 separate tables: employees, projects, and project employee assignments.
The employee table should have 3 fields: EmployeeID, which should be set to AutoNumber in Design view and then selected as the primary key, and FirstName and LastName, both short text fields. This EmployeeID field will be referenced in the project employee assignments table.
The projects table should have 2 fields: ProjectID, also set to AutoNumber in Design view and selected as the primary key, and ProjectName which will also be a short text field. You can also add other fields, perhaps a text field for ProjectDescription would be helpful later on.
The Project-Employee Assignments table should have 2 fields: EmployeeID and ProjectID. If you aren't familiar with one-to-one, one-to-many, and many-to-many relationships, I would suggest looking them up. You are describing a many-to-many relationship between the projects and employees: one project can have many employees and one employee can be involved in many projects. This table exists to establish those relationships between employees and projects.
From here, go to the database tools tab and select Relationships. You'll need to establish a one-to-many relationship between the Employees table and the Assignments table on the EmployeeID field. You'll also need to establish a one-to-many relationship between the Projects table and the Project-Employee Assignments table on the ProjectID field.
Enter each relationship between projects and employees in the Assignments table. If you have a short list of projects and employees, you can do this directly in the table, but I'd suggest creating a form to do this with 2 combo boxes that each select from the lists of existing projects and employees, respectively. There are many tutorials about creating combo boxes that show informative columns, like employee name, but save the ID numbers to the table. Search "Bind Combo Box to Primary Key but display a Description field" for one example.
Finally, create a query to count projects per employee. You should include your Employees table, as well as your Project-Employee Assignments table. Select FirstName and LastName from the Employees table. Select both columns (EmployeeID and ProjectID) from the Project-Employee Assignments table. Unclick "show" for EmployeeID. Right-click anywhere in the query to get a menu of more options and click the sigma for totals. Set the total for EmployeeID, FirstName, and LastName to "Group By" and for ProjectID to "Count" then save the query. Run the query and enjoy having your totals!
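For readers who prefer to see the normalized design as SQL rather than Design view steps, here is a rough sketch of the three tables and the totals query. It uses an in-memory SQLite database instead of Access so that it can be run as-is; the table name `Assignments` and the exact column types are assumptions for this example, and the rows are transcribed from the sample table in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Employees (
        EmployeeID INTEGER PRIMARY KEY,
        FirstName  TEXT,
        LastName   TEXT
    );
    CREATE TABLE Projects (
        ProjectID   INTEGER PRIMARY KEY,
        ProjectName TEXT
    );
    -- junction table for the many-to-many relationship
    CREATE TABLE Assignments (
        EmployeeID INTEGER REFERENCES Employees (EmployeeID),
        ProjectID  INTEGER REFERENCES Projects (ProjectID),
        PRIMARY KEY (EmployeeID, ProjectID)
    );
    """
)
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "John", "Doe"), (2, "Jane", "Doe"), (3, "Sam", "Smith"), (4, "Anna", "Smith")],
)
conn.executemany(
    "INSERT INTO Projects VALUES (?, ?)",
    [(1, "Project A"), (2, "Project B"), (3, "Project C"), (4, "Project D"), (5, "Project E")],
)
conn.executemany(
    "INSERT INTO Assignments VALUES (?, ?)",
    [(1, 1), (2, 1), (2, 2), (3, 2), (2, 3), (1, 3), (3, 4), (4, 4), (4, 5), (1, 5)],
)

# Group by employee and count the assigned projects.
totals = conn.execute(
    """
    SELECT e.FirstName, e.LastName, COUNT(a.ProjectID) AS ProjectCount
    FROM Employees e
    JOIN Assignments a ON a.EmployeeID = e.EmployeeID
    GROUP BY e.EmployeeID, e.FirstName, e.LastName
    ORDER BY e.EmployeeID
    """
).fetchall()
for row in totals:
    print(row)
```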
Elizabeth Ham's answer is very thorough and I recommend following her advice, but knowing that sometimes we don't have time to do a complete overhaul, here are some instructions on how to get results from the given table structure. As Elizabeth and I pointed out (in my comment), a single query could have gotten the requested data if the tables were complete and properly normalized.
Because there are multiple employee columns for which you want statistics, you need to join the given table at least twice, each time grouping on a different column and using a different alias. It is possible to do this using the visual Design View, however it is usually easier to post questions and answers on StackOverflow using SQL text, so that's what follows. Just paste the following code into the SQL view of a query, then you should be able to switch between SQL view and Design View.
Save the following SQL statements as two separate, named queries: [ProjectCount1] and [ProjectCount2]. Saving them allows you to refer to these queries multiple times in other queries (without embedding redundant subqueries):
SELECT P.[Employee 1] As Employee, Count(P.[Project]) As ProjectCount
FROM Project As P
GROUP BY P.[Employee 1];
SELECT P.[Employee 2] As Employee, Count(P.[Project]) As ProjectCount
FROM Project As P
GROUP BY P.[Employee 2];
Now create a UNION query for the purpose of creating a unique list of employees from the two source columns. The UNION will automatically keep only distinct values (i.e. remove duplicates). (By the way, UNION ALL would return all rows from both tables including duplicates.) Save this query as [Employees]:
SELECT Employee FROM [ProjectCount1]
UNION
SELECT Employee FROM [ProjectCount2]
Finally, combine them all into a list of unique employees with a total sum of projects for each:
SELECT
E.Employee As Employee, nz(PC1.ProjectCount, 0) + nz(PC2.ProjectCount, 0) As ProjectCount
FROM
([Employees] AS E LEFT JOIN [ProjectCount1] As PC1
ON E.[Employee] = PC1.[Employee])
LEFT JOIN [ProjectCount2] As PC2
ON E.[Employee] = PC2.Employee
ORDER BY E.[Employee]
Note 1: The function nz() converts null values to the given non-null value, in this case 0 (zero). This ensures that you'll get a valid sum even when an employee appears in only one column (and as such has a null value in the other column).
Note 2: This solution will double count an employee who is listed as both [Employee 1] and [Employee 2] for the same project in the original table. I assume that there are proper constraints to exclude that case, but if needed, one could do a self join on the second query [ProjectCount2] to exclude such double entries.
Note 3: If you do decide to follow Elizabeth's advice and you already have a lot of data in the existing structure, the above queries can also be useful in generating data for the new, normalized table structure. For instance, you could insert the unique list of employees from the above UNION query directly into a newly normalized [Employee] table.
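To check the logic of the saved queries outside Access, the sketch below rebuilds the same steps against an in-memory SQLite database. This is a stand-in, not Access SQL: the three saved queries become CTEs with the same names, and `nz()` is replaced by the standard `COALESCE()`, which plays the same role here. The data is the sample table from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Project (Project TEXT, [Employee 1] TEXT, [Employee 2] TEXT)")
conn.executemany(
    "INSERT INTO Project VALUES (?, ?, ?)",
    [
        ("Project A", "John Doe", "Jane Doe"),
        ("Project B", "Jane Doe", "Sam Smith"),
        ("Project C", "Jane Doe", "John Doe"),
        ("Project D", "Sam Smith", "Anna Smith"),
        ("Project E", "Anna Smith", "John Doe"),
    ],
)

totals = conn.execute(
    """
    WITH ProjectCount1 AS (      -- counts per name in the first column
        SELECT [Employee 1] AS Employee, COUNT(Project) AS ProjectCount
        FROM Project GROUP BY [Employee 1]
    ),
    ProjectCount2 AS (           -- counts per name in the second column
        SELECT [Employee 2] AS Employee, COUNT(Project) AS ProjectCount
        FROM Project GROUP BY [Employee 2]
    ),
    Employees AS (               -- distinct list of names from both columns
        SELECT Employee FROM ProjectCount1
        UNION
        SELECT Employee FROM ProjectCount2
    )
    SELECT E.Employee,
           COALESCE(PC1.ProjectCount, 0) + COALESCE(PC2.ProjectCount, 0) AS ProjectCount
    FROM Employees E
    LEFT JOIN ProjectCount1 PC1 ON E.Employee = PC1.Employee
    LEFT JOIN ProjectCount2 PC2 ON E.Employee = PC2.Employee
    ORDER BY E.Employee
    """
).fetchall()
for row in totals:
    print(row)
```

The totals agree with the result the question asks for; only the row order differs, because the final query sorts by employee name.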

MYSQL select query on multiple tables

I'm not seeing a clean way to write this query without subselects which I avoid because they are generally not portable, and harder to read and debug than individual queries.
Table A has exactly 2 foreign keys to table B, which are always different, but always defined. Sort of like:
MARRIAGE_TABLE
M_KEY
LAST_NAME
PERSON_HUSBAND_FK
PERSON_WIFE_FK
PERSON_TABLE
PERSON_KEY
SEX
FIRST_NAME
The PERSON_HUSBAND_FK will always point at a SEX=MALE, and the WIFE_FK will always point at a female. There will always be one of each. (this is in no way a statement on same-sex marriage BTW I'm all for it)..
I want to create a result like:
MARRIAGE HUSBAND WIFE
-------- ------- ----
SMITH TOM KATHY
JONES BILL EVE
My current approach is to get all records from the MARRIAGE TABLE and store them in a hash. Then I augment the hash with names {wife_name} and {husband_name} using 2 more queries using the husband and wife FK's. Then I format and print the hash. It works, but I'm not wild about 3 queries per row.
I'm not sure I ever encountered a table having >1 FK to another table. I've done years of table-design, but I'm not really sure this design even meets normalization. It seems like no, to me. Like they created a many-many without an intermediate table; a cheat?
Just join table PERSON_TABLE twice:
SELECT m.last_name AS marriage, p1.first_name AS husband, p2.first_name AS wife
FROM marriage_table m
INNER JOIN person_table p1 ON p1.person_key = m.person_husband_fk
INNER JOIN person_table p2 ON p2.person_key = m.person_wife_fk
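A quick way to convince yourself that joining the same table twice behaves as expected is to try it on the sample rows. The sketch below does that with an in-memory SQLite database as a stand-in for MySQL; the key values (1 to 4) and the `ORDER BY m.m_key` are assumptions added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person_table (
        person_key INTEGER PRIMARY KEY,
        sex        TEXT,
        first_name TEXT
    );
    CREATE TABLE marriage_table (
        m_key             INTEGER PRIMARY KEY,
        last_name         TEXT,
        person_husband_fk INTEGER REFERENCES person_table (person_key),
        person_wife_fk    INTEGER REFERENCES person_table (person_key)
    );
    """
)
conn.executemany(
    "INSERT INTO person_table VALUES (?, ?, ?)",
    [(1, "MALE", "TOM"), (2, "FEMALE", "KATHY"), (3, "MALE", "BILL"), (4, "FEMALE", "EVE")],
)
conn.executemany(
    "INSERT INTO marriage_table VALUES (?, ?, ?, ?)",
    [(1, "SMITH", 1, 2), (2, "JONES", 3, 4)],
)

# person_table appears twice under different aliases, one per foreign key.
couples = conn.execute(
    """
    SELECT m.last_name AS marriage, p1.first_name AS husband, p2.first_name AS wife
    FROM marriage_table m
    INNER JOIN person_table p1 ON p1.person_key = m.person_husband_fk
    INNER JOIN person_table p2 ON p2.person_key = m.person_wife_fk
    ORDER BY m.m_key
    """
).fetchall()
for row in couples:
    print(row)
```

Each alias is an independent reference to person_table, so this is one query for the whole result set rather than three queries per row.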

How to avoid duplicates in following SQL scenario

I have a table called LIKES as follows.
As you can see, it has two columns: UserName1 and UserName2.
The table records when one person follows another person's Facebook page, etc.
For example, if Jon follows Bob's page, then there is an entry in the table (Jon, Bob). If Bob follows Jon's Facebook page, then there is an entry (Bob, Jon).
So I want to find all the users who follow each other's profiles, and I want the result without duplicates.
I have the following query, which finds the users who follow each other's profiles, but I am not able to remove the duplicates:
SELECT L1.USERNAME1, L2.USERNAME2
FROM LIKES L1,
LIKES L2
WHERE L1.USERNAME1=L2.USERNAME2
AND L1.USERNAME2=L2.USERNAME1
The final output from the given table should be either (Jon, Bob) or (Bob, Jon), not both.
My query gives both results. How can I remove the duplicates in the results?
First, don't use comma-style joins. That syntax has been outdated for a long time. Second, one way you can avoid duplicates in this case is to require that the first name you report in your result set occur before the second alphabetically. You can do this safely because any pair of names that will appear in your result set must appear in the source table in both orders (e.g. ("Bob", "Jon") and ("Jon", "Bob")). I am assuming here that you don't need to deal with the case of a user who follows his own page. For instance:
select *
from likes L1
where
L1.username1 < L1.username2 and
exists (select 1 from likes L2 where L1.username1 = L2.username2 and L1.username2 = L2.username1);
Result:
username1 username2
Bob Jon
Click here for a SQL fiddle that demonstrates this approach using your sample data.
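For a self-contained check of the ordering trick, here is a minimal sketch run against an in-memory SQLite database (a stand-in for MySQL). The question only names Jon and Bob, so the one-way row ("Amy", "Bob") is made up for this example to show that non-mutual follows are dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE likes (username1 TEXT, username2 TEXT)")
conn.executemany(
    "INSERT INTO likes VALUES (?, ?)",
    [
        ("Jon", "Bob"),
        ("Bob", "Jon"),
        ("Amy", "Bob"),  # hypothetical one-way follow, should not be reported
    ],
)

mutual = conn.execute(
    """
    SELECT L1.username1, L1.username2
    FROM likes L1
    WHERE L1.username1 < L1.username2      -- keep one ordering of each pair
      AND EXISTS (                         -- and require the reverse row
          SELECT 1 FROM likes L2
          WHERE L1.username1 = L2.username2
            AND L1.username2 = L2.username1
      )
    """
).fetchall()
print(mutual)
```

Note that `<` compares strings according to the column's collation, so with a case-insensitive collation names that differ only in case would need extra care.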
It looks a little crazy, but this actually works:
select min(t.username1) as username1,
max(t.username2) as username2
from likes t
group by least(t.username1, t.username2),
greatest(t.username1, t.username2)
having count(distinct t.username1) = 2
SQLFiddle
EDIT Added the having clause to deal with my misunderstanding of OP's question

Newsletter Categories in one row like 1,2 - Mysql Simple Database Design

I'm using a simple newsletter script where different categories for one user are possible. But I want to get the different categories in one row, like 1,2,3.
The tables:
newsletter_emails
id email category
1 test#test.com 1
2 test#test.com 2
newsletter_categories
id name
1 firstcategory
2 secondcategory
But what I am looking for is like this:
newsletter_emails
user_id email category
1 test#test.com 1,2
2 person#person.com 1
What's the best solution for this?
PS: The user can select their own categories on the profile page (maybe with a MySQL UPDATE?).
SQL and the relational data model aren't exactly made for this kind of thing. You can do either of the following:
use a simple SELECT query on the first table, then in your consuming code, iterate over the result, fetching the corresponding rows from the second table and combining them into a string (how you'd do this exactly depends on the language you're using)
use a JOIN on both tables, iterate over the result set and accumulate values from table 2 as long as the ID from table 1 remains the same. This is harder to code than the first solution, and the result set you're pulling from the DB is larger, but you'll get away with just one query.
use DBMS-specific extensions to the SQL standard (e.g. GROUP_CONCAT) to achieve this. You'll get exactly what you asked for, but your SQL queries won't be as portable.
This is a many-to-many relationship case. Instead of storing comma-separated category ids, make an associative table between newsletter_emails and newsletter_categories, e.g. user_category, with the following schema:
user_id category
1 1
1 2
2 1
This way you won't have to do string processing if a user unsubscribes from a category. You will just have to remove the row from the user_category table.
Try this (completely untested):
SELECT MIN(id) AS user_id, email, GROUP_CONCAT(category) AS category FROM newsletter_emails GROUP BY email ORDER BY user_id ASC;
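As a rough check of the GROUP_CONCAT idea, the sketch below runs a grouped query on rows like those in the question. It uses an in-memory SQLite database as a stand-in: SQLite also has a `group_concat()` aggregate with a comma as the default separator, but unlike MySQL it makes no promise about the order of the concatenated values unless asked. The third row (person#person.com) and the use of `MIN(id)` to pick one id per email are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE newsletter_emails (id INTEGER PRIMARY KEY, email TEXT, category INTEGER)"
)
conn.executemany(
    "INSERT INTO newsletter_emails VALUES (?, ?, ?)",
    [
        (1, "test#test.com", 1),
        (2, "test#test.com", 2),
        (3, "person#person.com", 1),  # hypothetical extra subscriber
    ],
)

# One row per email, with all of its categories collapsed into one string.
rows = conn.execute(
    """
    SELECT MIN(id) AS user_id, email, GROUP_CONCAT(category) AS category
    FROM newsletter_emails
    GROUP BY email
    ORDER BY user_id
    """
).fetchall()
for user_id, email, category in rows:
    print(user_id, email, category)
```

In MySQL, `GROUP_CONCAT(category ORDER BY category)` can be used when the order of the ids inside the string matters.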