mysql table problem? - mysql

I have these two tables for a chat app:
users(user_id, username, pictures)
chat_data(con_id, chat_text)
I used this SQL query:
SELECT c.chat_text, u.username
FROM chat_data c, users u
WHERE c.con_id =1
but it's giving me duplicate results, when I know there's only one row with con_id = 1. What is the problem with the query?

You need to "join" the tables to avoid duplicates. For example, assuming chat_data also has a user_id column (the schema in the question does not show one):
SELECT c.chat_text, u.username
FROM chat_data c, users u
WHERE c.con_id = 1
AND u.user_id = c.user_id
You can read a bit about relational algebra, which is the theory behind relational databases.

The users and chat_data tables should be joined in order to get a unique tuple as the result.
Since the query has no join condition between users and chat_data, you simply get the Cartesian product of the two tables.
Cartesian Products
If two tables in a join query have no join condition, then Oracle Database returns their Cartesian product. Oracle combines each row of one table with each row of the other. A Cartesian product always generates many rows and is rarely useful. For example, the Cartesian product of two tables, each with 100 rows, has 10,000 rows. Always include a join condition unless you specifically need a Cartesian product. If a query joins three or more tables and you do not specify a join condition for a specific pair, then the optimizer may choose a join order that avoids producing an intermediate Cartesian product.
Refer: http://www.stanford.edu/dept/itss/docs/oracle/10g/server.101/b10759/queries006.htm

With a query like that, you will get as many rows in the results as there are rows in the table users.
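The row count described here can be checked with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE chat_data (con_id INTEGER PRIMARY KEY, chat_text TEXT);
    INSERT INTO users VALUES (1, 'john'), (2, 'jack'), (3, 'jill');
    INSERT INTO chat_data VALUES (1, 'hello'), (2, 'bye');
""")

# No join condition: the single chat row with con_id = 1 is paired
# with every row of users, so the same chat_text repeats per user.
rows = conn.execute("""
    SELECT c.chat_text, u.username
    FROM chat_data c, users u
    WHERE c.con_id = 1
""").fetchall()

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(len(rows), user_count)
```

Both printed numbers are equal: one result row per user, each repeating the same chat_text.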

It's because of the type of join the SQL is doing. Is it returning a row for each user you have? This is what I expect it is doing, i.e. if you have 2 users John and Jack then are you getting a row for both of these users being returned?
Are you just trying to get the data related to the users involved in a conversation? If so you need some link between the 2 tables, like foreign key references from the chat_data table referencing users.

As previously stated, you're missing a link between the two tables. If you are trying to retrieve the user associated with a particular chat, you will need to add a foreign key in chat_data that references users.user_id. But if you are trying to get multiple users associated with a chat, you would need to add a new table. Your new tables would look something like this:
users(user_id, username, pictures)
chat_data(con_id, chat_text)
user_chat(user_id, con_id) // By adding this new table you can have multiple users per chat
Then the query would look something like:
SELECT u.username, c.chat_text
FROM users u, chat_data c, user_chat uc
WHERE u.user_id = uc.user_id
AND c.con_id = uc.con_id
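A minimal sketch of the junction-table design, again using Python's sqlite3 in place of MySQL. The sample data, the composite primary key, and the filter on one conversation are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE chat_data (con_id INTEGER PRIMARY KEY, chat_text TEXT);
    -- Junction table: one row per (user, conversation) pair.
    CREATE TABLE user_chat (
        user_id INTEGER REFERENCES users(user_id),
        con_id  INTEGER REFERENCES chat_data(con_id),
        PRIMARY KEY (user_id, con_id)
    );
    INSERT INTO users VALUES (1, 'john'), (2, 'jack'), (3, 'jill');
    INSERT INTO chat_data VALUES (1, 'hello'), (2, 'bye');
    INSERT INTO user_chat VALUES (1, 1), (2, 1), (3, 2);
""")

# Every table is linked by an explicit join condition, so only the
# users who actually take part in conversation 1 come back.
participants = conn.execute("""
    SELECT u.username, c.chat_text
    FROM users u
    JOIN user_chat uc ON uc.user_id = u.user_id
    JOIN chat_data c  ON c.con_id = uc.con_id
    WHERE c.con_id = ?
    ORDER BY u.username
""", (1,)).fetchall()
print(participants)
```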

Related

MySQL Select statement based on multiple joins

I have a select statement that requires me to join multiple tables (4 tables).
My tables are the following:
Teams
Team_User
Tournament_User
Tournaments
I need to get the teams from a certain tournament. My logic is as follows:
In the Tournament_User table I can find the users that are in a tournament. In Team_User I can find the users that are in a team.
To get the teams from a certain tournament I tried the following query:
SELECT t.id FROM Teams t
JOIN Team_User tu on tu.team_id = t.id
JOIN Tournament_User tru on tru.user_id = tu.user_id
JOIN Tournaments tr on tr.id = tru.tournament_id
WHERE tr.id = "tournamentId";
It gets me the correct teams, but it duplicates them.
I also added DISTINCT, which gets me the correct teams without duplicates, but I wonder whether I can retrieve the records as expected using only joins, without DISTINCT.
Also, my records can't contain duplicates and there are none in the tables; my query somehow produces them.
I presume there is a Users table in your schema. There is a many-to-many relation between Teams and Users as well as a many-to-many relation between Users and Tournaments. That means each tournament will be related to many users, which in turn means that even if all users are from the same team, your query result will contain each team as many times as there are users from it in the given tournament. With this chain of joins, the nature of the relations between these tables necessitates that you use DISTINCT (or an equivalent GROUP BY) to collapse the repeats.
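The effect described in this answer can be reproduced with a small sketch. It uses Python's sqlite3 as a stand-in for MySQL; the table names follow the question, while the column layout and sample ids are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Teams (id INTEGER PRIMARY KEY);
    CREATE TABLE Tournaments (id INTEGER PRIMARY KEY);
    CREATE TABLE Team_User (team_id INTEGER, user_id INTEGER);
    CREATE TABLE Tournament_User (tournament_id INTEGER, user_id INTEGER);
    INSERT INTO Teams VALUES (1), (2);
    INSERT INTO Tournaments VALUES (100);
    -- Team 1 has two users, team 2 has one; all three play in tournament 100.
    INSERT INTO Team_User VALUES (1, 10), (1, 11), (2, 12);
    INSERT INTO Tournament_User VALUES (100, 10), (100, 11), (100, 12);
""")

query = """
    SELECT {distinct} t.id
    FROM Teams t
    JOIN Team_User tu ON tu.team_id = t.id
    JOIN Tournament_User tru ON tru.user_id = tu.user_id
    JOIN Tournaments tr ON tr.id = tru.tournament_id
    WHERE tr.id = ?
    ORDER BY t.id
"""

# Without DISTINCT, each team appears once per participating member.
plain = [r[0] for r in conn.execute(query.format(distinct=""), (100,))]
# With DISTINCT, the repeats are collapsed to one row per team.
unique = [r[0] for r in conn.execute(query.format(distinct="DISTINCT"), (100,))]
print(plain, unique)
```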

Inner join three tables out of which one is not connected with other

I have three tables: client_invoices, contract_additional_info and contract.
client_invoices is connected with contract_additional_info, and contract_additional_info is connected with the contract table.
The contract table and the client_invoices table don't have any direct relation.
Now I am running the following query:
SELECT client_invoices.markup_type,
client_invoices.supplier_invoice_number,
client_invoices.client_payment_req_id,
client_invoices.net_amount,
client_invoices.markup_value,
client_invoices.net_qty,
client_invoices.markup_value,
contract.clientId as buyerClientId,
contract_additional_info.buyer_contract_id as contract_id
FROM client_invoices
INNER JOIN contract_additional_info ON contract_additional_info.contract_id =client_invoices.contract_id
INNER JOIN contract ON contract_additional_info.buyer_contract_id = contract.id
WHERE client_invoices.status=3 ;
It is giving me duplicate records. How can I fix the query so that it only gives unique records (unique client_invoices.supplier_invoice_number)?
Have you tried using SELECT DISTINCT? This should give you only unique records.
Try a GROUP BY with the columns that are identified as primary items to show.
You can start by assuming that one of the joins is returning multiple results which causes the duplicates:
Either multiple contract_additional_info returned for each invoice or multiple contracts for each contract_additional_info.
Based on their names, I would say the former is causing this. If this is true, ask yourself if that is correct. Maybe the database structure is your problem.
If it is correct and you can have the same invoice for multiple contract_additional_info entries, then GROUP BY the column(s) you expect to be unique (check that they are unique at the column level as well). E.g. supplier invoice number and/or client_invoices.contract_id.
You could also join with a SELECT DISTINCT sub query from the contract_additional_info table.
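As a rough illustration of the last suggestion, the sketch below joins against a SELECT DISTINCT subquery. It runs on Python's sqlite3 rather than MySQL, uses a reduced set of columns, and the sample rows (including the duplicated contract_additional_info entry) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_invoices (
        id INTEGER PRIMARY KEY, contract_id INTEGER,
        supplier_invoice_number TEXT, status INTEGER);
    CREATE TABLE contract_additional_info (
        id INTEGER PRIMARY KEY, contract_id INTEGER, buyer_contract_id INTEGER);
    CREATE TABLE contract (id INTEGER PRIMARY KEY, clientId INTEGER);
    INSERT INTO client_invoices VALUES (1, 7, 'INV-1', 3);
    -- Two additional-info rows for the same contract cause the duplicate.
    INSERT INTO contract_additional_info VALUES (1, 7, 70), (2, 7, 70);
    INSERT INTO contract VALUES (70, 500);
""")

# Plain join: one output row per matching contract_additional_info row.
dup = conn.execute("""
    SELECT ci.supplier_invoice_number, c.clientId
    FROM client_invoices ci
    JOIN contract_additional_info cai ON cai.contract_id = ci.contract_id
    JOIN contract c ON c.id = cai.buyer_contract_id
    WHERE ci.status = 3
""").fetchall()

# Join against a de-duplicated view of contract_additional_info instead.
dedup = conn.execute("""
    SELECT ci.supplier_invoice_number, c.clientId
    FROM client_invoices ci
    JOIN (SELECT DISTINCT contract_id, buyer_contract_id
          FROM contract_additional_info) cai
      ON cai.contract_id = ci.contract_id
    JOIN contract c ON c.id = cai.buyer_contract_id
    WHERE ci.status = 3
""").fetchall()
print(len(dup), dedup)
```

This only helps when the repeated rows agree on the joined columns; if the same contract_id maps to different buyer_contract_id values, the duplicates reflect real data and the schema question raised above still applies.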

Multi-level query in MySQL - is an extra foreign key necessary?

Let's say that I have 3 tables:
departments, each of which has 0..n
jobs, each of which has 0..n
people
Given a department, how do I get all the people who work in that department? Can I do it with a single SELECT?
I have set up a fiddle with some sample data, but I can't formulate the correct query.
Can it be done with some JOIN magic? Or do I need to add a foreign key in the peeps table, pointing back to the department_id?
A simple join
SELECT people.*
FROM departments
INNER JOIN jobs ON departments.department_id = jobs.department_id
INNER JOIN people ON jobs.job_id = people.job_id
WHERE departments.department_id = 1
You will need to amend the column names in the join conditions to the ones used in your tables.
Note that your sample tables on SQL fiddle do not have any indexes. Adding these is VERY important for performance.
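A sketch of this join with indexes on the joining columns, using Python's sqlite3 in place of MySQL. The column names, index names, and sample rows are assumptions, since the original fiddle is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (department_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, department_id INTEGER, title TEXT);
    CREATE TABLE people (person_id INTEGER PRIMARY KEY, job_id INTEGER, name TEXT);
    -- Index the columns used in the join conditions.
    CREATE INDEX idx_jobs_department ON jobs (department_id);
    CREATE INDEX idx_people_job ON people (job_id);
    INSERT INTO departments VALUES (1, 'IT'), (2, 'HR');
    INSERT INTO jobs VALUES (1, 1, 'dev'), (2, 1, 'ops'), (3, 2, 'recruiter');
    INSERT INTO people VALUES (1, 1, 'ann'), (2, 2, 'bob'), (3, 3, 'cat');
""")

# people -> jobs -> departments: no extra foreign key on people is needed,
# because the department is reachable through the job.
staff = [r[0] for r in conn.execute("""
    SELECT people.name
    FROM departments
    INNER JOIN jobs ON departments.department_id = jobs.department_id
    INNER JOIN people ON jobs.job_id = people.job_id
    WHERE departments.department_id = ?
    ORDER BY people.name
""", (1,))]
print(staff)
```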

MySQL SELECT from two tables with COUNT

I have two tables as below:
Table 1 "customer" with fields "Cust_id", "first_name", "last_name" (10 customers)
Table 2 "cust_order" with fields "order_id", "cust_id" (26 orders)
I need to display "Cust_id", "first_name", "last_name" and "order_id",
where I need the count of order_id grouped by cust_id, i.e. the total number of orders placed by each customer.
I am running the query below; however, it counts all 26 orders and applies that total of 26 to each customer.
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order, customer cus
GROUP BY cust_id;
Could you please advise what is wrong with the query?
Your issue here is that you have not told the database how these two tables are 'connected', or what they should be connected by.
Specifying that link effectively allows you to 'join' the two tables together and run a query across them.
So you might want to use something like:
SELECT COUNT(B.order_id), A.cust_id, A.first_name, A.last_name
FROM customer A
LEFT JOIN cust_order B -- this is using a left join, but an inner join may be appropriate also
ON (A.cust_id = B.cust_id) -- what links them together
GROUP BY A.cust_id; -- the group by clause
As per your comment requesting some further info:
Left Join (right joins are almost identical, only the other way around):
The SQL LEFT JOIN returns all rows from the left table, even if there are no matches in the right table. This means that if the ON clause matches 0 (zero) records in right table, the join will still return a row in the result, but with NULL in each column from right table. ~Tutorials Point.
This means that a left join returns all the values from the left table, plus matched values from the right table or NULL in case of no matching join predicate.
LEFT joins will be used in the cases where you wish to retrieve all the data from the table in the left hand side, and only data from the right that match.
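The difference between the two join types shows up when a customer has no orders. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with made-up sample data; table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, cust_id INTEGER);
    INSERT INTO customer VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Poe');
    -- Customer 3 has placed no orders.
    INSERT INTO cust_order VALUES (1, 1), (2, 1), (3, 2);
""")

count_query = """
    SELECT a.cust_id, COUNT(b.order_id)
    FROM customer a
    {join} cust_order b ON a.cust_id = b.cust_id
    GROUP BY a.cust_id
    ORDER BY a.cust_id
"""

# LEFT JOIN keeps customers without orders; COUNT(b.order_id) ignores
# the NULL from the unmatched right side, so their count is 0.
left_counts = conn.execute(count_query.format(join="LEFT JOIN")).fetchall()
# INNER JOIN drops customers that have no matching order row.
inner_counts = conn.execute(count_query.format(join="INNER JOIN")).fetchall()
print(left_counts, inner_counts)
```

Counting b.order_id rather than using COUNT(*) matters here: COUNT(*) would count the NULL-padded row and report 1 for a customer with no orders.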
Execution Time
While the accepted answer in this case may work well on small datasets, it may become 'heavy' on larger databases, because it was not actually designed for this type of operation.
Joins were introduced for exactly this purpose.
Much work in database-systems has aimed at efficient implementation of joins, because relational systems commonly call for joins, yet face difficulties in optimising their efficient execution. The problem arises because inner joins operate both commutatively and associatively. ~Wikipedia
In practice, this means that the user merely supplies the list of tables for joining and the join conditions to use, and the database system has the task of determining the most efficient way to perform the operation. A query optimizer determines how to execute a query containing joins. So, by allowing the dbms to choose the way your data is queried, you can save a lot of time.
Other Joins/Summary
An INNER JOIN will return data from both tables where the keys in each table match.
A LEFT JOIN or RIGHT JOIN will return all the rows from one table and matching data from the other table.
Use a join when you want to query multiple tables.
Joins are much faster than other ways of querying >=2 tables (speed can be seen much better on larger datasets).
You could try this one:
SELECT COUNT(cus_order.order_id), cus.cust_id, cus.first_name, cus.last_name
FROM cust_order cus_order, customer cus
WHERE cus_order.cust_id = cus.cust_id
GROUP BY cus.cust_id;
Maybe a left join will help you:
SELECT COUNT(order_id), cus.cust_id, cus.first_name, cus.last_name
FROM customer cus
LEFT JOIN cust_order co
ON (co.cust_id= cus.Cust_id )
GROUP BY cus.cust_id;

Scalable way of doing self join with many to many table

I have a table structure like the following:
user
id
name
profile_stat
id
name
profile_stat_value
id
name
user_profile
user_id
profile_stat_id
profile_stat_value_id
My question is:
How do I evaluate a query where I want to find all users with profile_stat_id and profile_stat_value_id for many stats?
I've tried doing an inner self join, but that quickly gets crazy when searching for many stats. I've also tried doing a count on the actual user_profile table, and that's much better, but still slow.
Is there some magic I'm missing? I have about 10 million rows in the user_profile table and want the query to take no longer than a few seconds. Is that possible?
Typically, databases are able to handle 10 million records in a decent manner. I have mostly used Oracle in our professional environment with large amounts of data (about 30-40 million rows), and even join queries on those tables have never taken more than a second or two to run.
One IMPORTANT lesson I learned whenever query performance was bad was to check whether indexes are defined properly on the join fields. E.g. here, profile_stat_id and profile_stat_value_id should have indexes defined (I am assuming user_id is the primary key). This will definitely give you a good performance increase if you have not done that.
After defining the indexes, run the query once or twice to give the DB a chance to build the index tree and query plan before measuring the gain.
Superficially, you seem to be asking for this, which includes no self-joins:
SELECT u.name, u.id, s.name, s.id, v.name, v.id
FROM User_Profile AS p
JOIN User AS u ON u.id = p.user_id
JOIN Profile_Stat AS s ON s.id = p.profile_stat_id
JOIN Profile_Stat_Value AS v ON v.id = p.profile_stat_value_id
Any of the joins listed can be changed to a LEFT OUTER JOIN if the corresponding table need not have a matching entry. All this does is join the central User_Profile table with each of the other three tables on the appropriate joining column.
Where do you think you need a self-join?
[I have not included anything to filter on 'the many stats'; it is not at all clear to me what that part of the question means.]
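If 'many stats' means "users who have every one of a given set of (profile_stat_id, profile_stat_value_id) pairs", one possible reading is the count-based approach the question mentions. The sketch below is written under that assumption, using Python's sqlite3 instead of MySQL; the index name, the sample rows, and the wanted list are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_profile (
        user_id INTEGER, profile_stat_id INTEGER, profile_stat_value_id INTEGER);
    -- Composite index covering the filter columns and the grouped column.
    CREATE INDEX idx_profile_lookup
        ON user_profile (profile_stat_id, profile_stat_value_id, user_id);
    INSERT INTO user_profile VALUES
        (1, 1, 10), (1, 2, 20),   -- user 1 matches both wanted pairs
        (2, 1, 10), (2, 2, 21),   -- user 2 has a different value for stat 2
        (3, 1, 10);               -- user 3 has only one of the stats
""")

# Hypothetical filter: (stat id, stat value id) pairs a user must all have.
wanted = [(1, 10), (2, 20)]

conditions = " OR ".join(
    "(profile_stat_id = ? AND profile_stat_value_id = ?)" for _ in wanted
)
params = [value for pair in wanted for value in pair] + [len(wanted)]

# Keep rows matching any wanted pair, then keep only the users
# that matched as many distinct stats as were asked for.
matching = [r[0] for r in conn.execute(f"""
    SELECT user_id
    FROM user_profile
    WHERE {conditions}
    GROUP BY user_id
    HAVING COUNT(DISTINCT profile_stat_id) = ?
    ORDER BY user_id
""", params)]
print(matching)
```

This avoids one self-join per stat, but whether it meets a few-seconds target on 10 million rows depends on the data and indexes, so it would need measuring on the real table.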