count number of repeating entries - mysql

I am fairly new to databases and am just beginning to understand DML and queries. I have two tables: one named customer, which contains customer data, and one named requested_games, which contains the games requested by customers. I would like to write a query that returns the customers who have requested more than two games. So far, when I run the query I don't get the desired result, and I'm not sure I'm doing it right.
Can anyone assist with this? Thanks.
Below is a snippet of the query:
select customers.customer_name, wants_list.requested_game, wants_list.wantslists_id,count(wants_list.customers_ID)
from customers, wants_list
where customers.customers_ID = wants_list.customers_id
and wants_list.wantslists_id = wants_list.wantslists_id
and wants_list.requested_game > '2';

Just include a HAVING clause:
GROUP BY customers_ID
HAVING COUNT(*) > 2
Depending on how your data is set up, you may need to use:
HAVING COUNT(wants_list.requested_game) > 2
This is how I like to describe how a query works; maybe it'll help you visualize how the query executes :)
SELECT is making an order at a restaurant....
FROM is the menu you want to order from....
JOIN is what sections of the menu you want to include
WHERE is any customization you want to make to your order (aka no mushrooms)....
GROUP BY (and anything after) is after the order has been completed and is at your table...
GROUP BY tells your server to bring your types of food together in groups
ORDER BY is saying which dishes you want first (aka I want my entree, then dessert, then appetizer).
HAVING can be used to pick out any mushrooms that were accidentally left on the plate....
etc..
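To make the WHERE versus HAVING part of the analogy concrete, here is a minimal sketch. It runs the SQL through Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# SQLite stands in for MySQL; the rows below are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE wants_list (customers_ID INTEGER, requested_game TEXT);
    INSERT INTO wants_list VALUES
        (1, 'Chess'), (1, 'Go'), (1, 'Risk'),
        (2, 'Chess'), (2, 'Risk');
""")

# WHERE drops individual rows before they are grouped and counted.
where_rows = conn.execute("""
    SELECT customers_ID, COUNT(*)
    FROM wants_list
    WHERE requested_game <> 'Risk'
    GROUP BY customers_ID
    ORDER BY customers_ID
""").fetchall()

# HAVING drops whole groups after the counts have been computed.
having_rows = conn.execute("""
    SELECT customers_ID, COUNT(*)
    FROM wants_list
    GROUP BY customers_ID
    HAVING COUNT(*) > 2
    ORDER BY customers_ID
""").fetchall()

print(where_rows)   # [(1, 2), (2, 1)]
print(having_rows)  # [(1, 3)]
```

The first query still returns both customers, just with smaller counts; the second returns only the customer whose group passed the count test.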

I would like to write a query that will return the customers that
have requested more than two games
For this to happen you need to do the following:
First, use GROUP BY to group the games by customer (customers_id).
Then use a HAVING clause to keep only the customers who requested more than two games.
Then make this a subquery if you need more information on the customer, such as the name.
Finally, use a JOIN between customers and the subquery (temp) to display more information on the customer.
Like the following query:
SELECT customers.customers_ID, customers.customer_name, temp.game_count
FROM (SELECT customers_ID, COUNT(wantslists_id) AS game_count
FROM wants_list
GROUP BY customers_ID
HAVING COUNT(requested_game) > 2) AS temp
JOIN customers ON customers.customers_ID = temp.customers_ID;
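Here is a small runnable sketch of the steps above, using Python's sqlite3 module in place of MySQL. The table and column names follow the question's query; the three customers and their requests are invented sample data.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customers_ID INTEGER PRIMARY KEY, customer_name TEXT);
    CREATE TABLE wants_list (
        wantslists_id INTEGER PRIMARY KEY,
        customers_ID INTEGER,
        requested_game TEXT
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO wants_list (customers_ID, requested_game) VALUES
        (1, 'Chess'), (1, 'Go'), (1, 'Risk'),
        (2, 'Chess'), (2, 'Go'),
        (3, 'Chess');
""")

rows = conn.execute("""
    SELECT customers.customers_ID, customers.customer_name, temp.game_count
    FROM (SELECT customers_ID, COUNT(requested_game) AS game_count
          FROM wants_list
          GROUP BY customers_ID               -- one group per customer
          HAVING COUNT(requested_game) > 2    -- keep groups with more than two games
         ) AS temp
    JOIN customers ON customers.customers_ID = temp.customers_ID
""").fetchall()

print(rows)  # [(1, 'Ann', 3)]
```

Only Ann has requested more than two games, so she is the only row returned.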

Related

Why are Duplicates not being filtered out

I am working on some practice interview questions and am struggling with this:
You are working with a company that sells goods to customers, and they'd like to keep track
of the unique items each customer has bought. The database is composed of two tables:
Customers and Orders. The two table schemas are given below. We want to know what
unique items were purchased by a specific customer, Wilbur, and when they were
purchased. What is the correct query that returns the customer first name, item
purchased, and purchase date with recent purchases first?
Tables: https://imgur.com/a/D47R1KU
My answer so far is
However, it is marked as incorrect because it prints both wilbur,oranges,2019-06-10
and wilbur,oranges,2018-06-10 instead of just the one with the more recent date. Please see the picture for the two tables referenced by the question. Thanks!
Between the WHERE clause and ORDER BY, try:
GROUP BY FirstName, Item
And to get the most recent date, select MAX(PurchaseDate).
The query you are looking for is as follows.
This uses GROUP BY to say which columns define each group and, for the column that is not grouped, an aggregate to choose which of its many values to return, in this case the maximum.
Note also the use of explicit, clear, SQL-92 modern join syntax and meaningful column aliases to show which table each column originates from. Distinct is not needed since each group is already unique.
Select c.FirstName, o.Item, Max(o.PurchaseDate) PurchaseDate
from Customers c
join Orders o on o.PersonId=c.PersonId
where c.FirstName = 'Wilbur'
group by c.firstName, o.Item
order by Max(o.PurchaseDate) desc;
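A quick way to check this query is to run it against a few rows. The sketch below uses Python's sqlite3 module instead of MySQL. The real schemas are only in the linked image, so the column lists here (PersonId, FirstName, OrderId, Item, PurchaseDate) are assumptions based on the names used in the query, and the rows are made up.

```python
import sqlite3

# SQLite stands in for MySQL; schema and rows are assumptions for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (PersonId INTEGER PRIMARY KEY, FirstName TEXT);
    CREATE TABLE Orders (
        OrderId INTEGER PRIMARY KEY,
        PersonId INTEGER,
        Item TEXT,
        PurchaseDate TEXT  -- ISO dates, so text comparison matches date order
    );
    INSERT INTO Customers VALUES (1, 'Wilbur'), (2, 'Greta');
    INSERT INTO Orders VALUES
        (1, 1, 'oranges', '2018-06-10'),
        (2, 1, 'oranges', '2019-06-10'),
        (3, 1, 'apples',  '2019-01-15'),
        (4, 2, 'pears',   '2019-07-01');
""")

rows = conn.execute("""
    SELECT c.FirstName, o.Item, MAX(o.PurchaseDate) AS PurchaseDate
    FROM Customers c
    JOIN Orders o ON o.PersonId = c.PersonId
    WHERE c.FirstName = 'Wilbur'
    GROUP BY c.FirstName, o.Item
    ORDER BY MAX(o.PurchaseDate) DESC
""").fetchall()

print(rows)  # [('Wilbur', 'oranges', '2019-06-10'), ('Wilbur', 'apples', '2019-01-15')]
```

The two oranges rows collapse into one group, and only the latest date survives.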

Have to enter mySQL criteria twice?

Say I have two tables:
Table: customers
Fields: customer_id, first_name, last_name
Table: customer_cars
Fields: car_id, customer_id, car_brand, car_active
Say I am trying to write a query that shows all customers with a first name of "Karl," and the brands of the active cars they have. Not all customers will have an active car. Some cars are active, some are inactive.
Please keep in mind that this is a representative example that I just made up, for the sake of clarity and simplicity. Please don't reply with questions about why we would do it this way, that I could use table aliases, how it's possible to have an inactive car, or that my field names could be better written. It's a fake example that is intended to be very simple in order to illustrate the point. It has a structure and issue that I encounter all the time.
It seems like this would be best done with a LEFT JOIN and subquery.
SELECT
customer_id,
first_name,
last_name,
car_brand
FROM
customers
LEFT JOIN
(SELECT
customer_id,
car_brand
FROM
customer_cars
INNER JOIN customers ON customer_cars.customer_id = customers.customer_id
WHERE
first_name = 'Karl' AND
customer_cars.car_active = '1') car_query ON customers.customer_id = car_query.customer_id
WHERE
first_name = 'Karl'
The results might look like this:
first_name  last_name  car_brand
Karl        Johnson    Dodge
Karl        Johnson    Jeep
Karl        Smith      NULL
Karl        Davis      Chrysler
Notice the duplication of 'Karl' in both WHERE clauses, and the INNER JOIN in the subquery to the same table that the outer query uses. My understanding of MySQL is that this duplication is necessary because it processes the subquery first before processing the outer query. Therefore, the subquery must be properly limited so that it doesn't scan all records before the outer query tries to match on the resulting records.
I am aware that removing the car_active = '1' condition would change things, but this is a requirement.
I am wondering if a query like this can be done in a different way that only causes the criteria and joins to be entered once. Is there a recommended way to prioritize the outer query first, then match to the inner one?
I am aware that two different queries could be written (find all records with Karl, then do another that finds matching cars). However, this would cause multiple connections to the database (one for every record returned) and would be very taxing and inefficient.
I am also aware of correlated subqueries, but from my understanding and experience, these are for returning one field per customer (e.g., an aggregate field such as how much money Karl spent) within the fieldset. I am looking for a similar approach, but one where a customer could be matched to multiple other records, as in the sample output above.
In your response, if you have a recommended query structure that solves this problem, it would be really helpful if you could write a clear example instead of just describing it. I really appreciate your time!
First, is a simple, straightforward query not enough?
Say I am trying to write a query that shows all customers with a first
name of "Karl," and the brands of the ** active ** cars they have. Not
all customers will have an active car. Some cars are active, some are
inactive.
Following this requirement, I can just do something like:
SELECT C.first_name
, C.last_name
, CC.car_brand
FROM customers C
LEFT JOIN customer_cars CC ON CC.customer_id = C.customer_id
AND CC.car_active = 1
WHERE C.first_name = 'Karl'
Take a look at the SQL Fiddle sample.
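In case the fiddle is unavailable, here is a minimal sketch of the same idea using Python's sqlite3 module in place of MySQL. The table and column names are from the question; the rows are invented to match the sample output. It also shows why the car_active test belongs in the ON clause: moved to WHERE, it would discard the customer with no active car.

```python
import sqlite3

# SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE customer_cars (
        car_id INTEGER PRIMARY KEY, customer_id INTEGER, car_brand TEXT, car_active INTEGER
    );
    INSERT INTO customers VALUES
        (1, 'Karl', 'Johnson'), (2, 'Karl', 'Smith'), (3, 'Karl', 'Davis'), (4, 'Bob', 'Jones');
    INSERT INTO customer_cars VALUES
        (1, 1, 'Dodge', 1), (2, 1, 'Jeep', 1),
        (3, 2, 'Ford', 0),                      -- Karl Smith only has an inactive car
        (4, 3, 'Chrysler', 1), (5, 4, 'Fiat', 1);
""")

# Condition in ON: unmatched customers are kept, with NULL for car_brand.
on_rows = conn.execute("""
    SELECT C.first_name, C.last_name, CC.car_brand
    FROM customers C
    LEFT JOIN customer_cars CC ON CC.customer_id = C.customer_id
                              AND CC.car_active = 1
    WHERE C.first_name = 'Karl'
    ORDER BY C.customer_id, CC.car_brand
""").fetchall()

# Condition in WHERE: the NULL-extended row fails the test and disappears.
where_rows = conn.execute("""
    SELECT C.first_name, C.last_name, CC.car_brand
    FROM customers C
    LEFT JOIN customer_cars CC ON CC.customer_id = C.customer_id
    WHERE C.first_name = 'Karl' AND CC.car_active = 1
    ORDER BY C.customer_id, CC.car_brand
""").fetchall()

print(on_rows)
print(where_rows)
```

With this data, on_rows contains four rows including ('Karl', 'Smith', None), while where_rows contains only the three rows with active cars. The name filter is written once, and no subquery is needed.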

Finding the sum of a field in a linked table per group with additional search criteria

My actual tables are much more complex but here is a simplified example of the problem I am trying to work out.
Table contact: ContactID, ContactName, Pending
Table purchase: PurchaseID, ContactID, Amount, Pending, Date
Table contact_purchase_link: ContactID, PurchaseID (although it may seem like the link table is not necessary in this simplified example it is necessary in the large table schema)
Here is the query that I currently have:
SELECT DISTINCT contact.ContactID,
( SELECT SUM(Amount)
FROM purchase
WHERE purchase.ContactID = contact.ContactID
AND purchase.Pending = 0
) totalpurchase
FROM contact
INNER JOIN ( contact_purchase_link JOIN purchase
ON (contact_purchase_link.PurchaseID = purchase.PurchaseID
))
USING (ContactID)
WHERE purchase.Date > '2013-12-06' AND
AND contact.Pending =0
The problem is that I want the totalpurchase (the sum of the amount field) to be limited to the search criteria of the purchase table - meaning the query should only return the sum of the purchases after the specified date per contact. I think in order to use a group by clause the query would have to be based off the purchase table but I need the query to use the contact table so that all contacts are listed with their total purchase amounts and other relevant client data.
Is there any way to do this within one query?
To further clarify:
This query is being generated as part of a search engine. An example of why a query like this would be done is if a user wanted to generate a contact list of lastnames starting with A with purchases of a specific item or as in this example of purchases for a specific date. So that in general the query would have to generate a list of all contacts and their data (with possible search criteria on the type of contact such as all lastnames starting with 'A' etc.) and the query can also include search criteria on the purchase table such as the date of the purchase and whether the purchase was for specific items etc.
I am trying to add in the option to also list the sum of the purchases for the contact however that sum has to be limited to the search criteria for the purchase table as well and not the sum of all the contacts purchases.
If I understand your question correctly, you need to move the date comparison inside the first subquery:
SELECT DISTINCT contact.ContactID,
( SELECT SUM(Amount)
FROM purchase
WHERE purchase.ContactID = contact.ContactID
AND purchase.Pending = 0
AND purchase.Date > '2013-12-06'
) totalpurchase
FROM contact
INNER JOIN ( contact_purchase_link JOIN purchase
ON (contact_purchase_link.PurchaseID = purchase.PurchaseID
))
USING (ContactID)
WHERE purchase.Date > '2013-12-06'
AND contact.Pending =0
But the comments are right: I corrected a couple of what appear to be syntax errors, and I'm not sure about the join to contact_purchase_link. Improve your question and my answer will be less like guesswork.
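To illustrate the effect of moving the date test into the subquery, here is a simplified sketch run through Python's sqlite3 module instead of MySQL. It leaves out contact_purchase_link and joins contact to purchase directly, which is an assumption made to keep the example short; the rows are invented.

```python
import sqlite3

# SQLite stands in for MySQL; simplified schema (no link table) and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (ContactID INTEGER PRIMARY KEY, ContactName TEXT, Pending INTEGER);
    CREATE TABLE purchase (
        PurchaseID INTEGER PRIMARY KEY, ContactID INTEGER,
        Amount INTEGER, Pending INTEGER, Date TEXT
    );
    INSERT INTO contact VALUES (1, 'Avery', 0), (2, 'Blake', 0);
    INSERT INTO purchase VALUES
        (1, 1, 100, 0, '2013-12-01'),  -- before the cutoff
        (2, 1,  50, 0, '2013-12-10'),  -- after the cutoff, not pending
        (3, 1,  25, 1, '2013-12-20'),  -- after the cutoff, but pending
        (4, 2,  70, 0, '2013-11-01');  -- before the cutoff
""")

rows = conn.execute("""
    SELECT DISTINCT contact.ContactID,
           (SELECT SUM(Amount) FROM purchase
             WHERE purchase.ContactID = contact.ContactID
               AND purchase.Pending = 0) AS total_all,
           (SELECT SUM(Amount) FROM purchase
             WHERE purchase.ContactID = contact.ContactID
               AND purchase.Pending = 0
               AND purchase.Date > '2013-12-06') AS total_after_date
    FROM contact
    JOIN purchase ON purchase.ContactID = contact.ContactID
    WHERE purchase.Date > '2013-12-06'
      AND contact.Pending = 0
""").fetchall()

print(rows)  # [(1, 150, 50)]
```

Without the date test the subquery sums every non-pending purchase for the contact (150); with it, only the purchases that match the search criteria are summed (50).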

Query using two tables with DISTINCT

I have two tables: clients and group.
I need to get county and zip from clients and group-assigned from group.
When I search, I cannot get distinct results; that is, the output shows 100 clients with zip code 12345 in Jones county in the Main St group.
Instead, I need to have each zip and county listed once per group. I have googled and attempted many ways but it is just beyond me.
Can anyone steer me toward the correct way?
Adding GROUP BY group, county, zip to the end of your query should get you what you need. It will only return unique combinations of the three.
Presumably you have something like:
select g.*, c.county, c.zip
from clients c join groups g on <some join condition>
You want one result per group. So, add a group by clause such as:
group by g.id -- assuming id uniquely identifies each group
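This will give an arbitrary value for the other fields, which may be sufficient for what you are doing. (This relies on a MySQL feature called Hidden Columns; with ONLY_FULL_GROUP_BY enabled, which is the default since MySQL 5.7.5, the query is rejected instead.)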
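For the "each zip and county listed once per group" requirement, grouping by all three output columns avoids arbitrary values altogether. Below is a minimal sketch using Python's sqlite3 module in place of MySQL. The question does not show the schema, so the table client_groups, the join column group_id and the sample rows are all assumptions.

```python
import sqlite3

# SQLite stands in for MySQL. Schema and rows are hypothetical; the group table
# is named client_groups here to avoid clashing with SQL keywords.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client_groups (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE clients (
        id INTEGER PRIMARY KEY, group_id INTEGER, county TEXT, zip TEXT
    );
    INSERT INTO client_groups VALUES (1, 'Main St'), (2, 'Oak Ave');
    INSERT INTO clients (group_id, county, zip) VALUES
        (1, 'Jones', '12345'), (1, 'Jones', '12345'), (1, 'Jones', '12345'),
        (1, 'Jones', '12346'),
        (2, 'Smith', '54321'), (2, 'Smith', '54321');
""")

rows = conn.execute("""
    SELECT g.name, c.county, c.zip
    FROM clients c
    JOIN client_groups g ON g.id = c.group_id
    GROUP BY g.name, c.county, c.zip   -- one row per distinct combination
    ORDER BY g.name, c.county, c.zip
""").fetchall()

print(rows)
# [('Main St', 'Jones', '12345'), ('Main St', 'Jones', '12346'), ('Oak Ave', 'Smith', '54321')]
```

Six client rows collapse to three distinct group, county and zip combinations. SELECT DISTINCT over the same three columns would give the same rows.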

What is the best way to count rows in a mySQL complex table

I have a table with the following fields (for example);
id, reference, customerId.
Now, I often want to log an enquiry for a customer, but in some cases I need to filter the enquiries based on the customer's country, which is in the customer table:
id, Name, Country..for example
At the moment, my application shows 15 enquiries per page and I am SELECTing all enquiries, and for each one, checking the country field in customerTable based on the customerId to filter the country. I would also count the number of enquiries this way to find out the total number of enquiries and be able to display the page (Page 1 of 4).
As the database is growing, I am starting to notice a bit of lag, and I think my methodology is a bit flawed!
My first guess at how this should be done, is I can add the country to the enquiryTable. Problem solved, but does anyone else have a suggestion as to how this might be done? Because I don't like the idea of having to update each enquiry every time the country of a contact is changed.
Thanks in advance!
It looks to me like this data should be spread over 3 tables:
customers
enquiries
countries
Then by using joins you can bring out the customer and country data and filter by either. Something like:
SELECT
enquiries.enquiryid,
enquiries.enquiredetails,
customers.customerid,
customers.reference,
customers.countryid,
countries.name AS countryname
FROM
enquiries
INNER JOIN customers ON enquiries.customerid = customers.customerid
INNER JOIN countries ON customers.countryid = countries.countryid
WHERE countries.name='United Kingdom'
You should definitely be only touching the database once to do this.
Depending on how you are accessing your data, you may be able to get a row count without issuing a second COUNT(*) query. You haven't mentioned what programming language or data access strategy you use, so it is difficult to be more helpful with the count. If you have no easy way of determining the row count from within the data access layer of your code, you could use a stored procedure with an output parameter to give you the row count without making two round trips to the database. It all depends on your architecture, your data access strategy and how close you are to your database.
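As a rough sketch of letting the database do the country filtering and paging, the example below uses Python's sqlite3 module in place of MySQL. It takes the simpler route of a separate COUNT(*) query rather than a stored procedure, so it is not the single-round-trip variant described above. Table and column names follow the query above; the helper fetch_page, the page size constant and the sample rows are assumptions for illustration.

```python
import math
import sqlite3

PAGE_SIZE = 15  # enquiries per page, as in the question

# SQLite stands in for MySQL; the rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (countryid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customers (customerid INTEGER PRIMARY KEY, reference TEXT, countryid INTEGER);
    CREATE TABLE enquiries (
        enquiryid INTEGER PRIMARY KEY, enquiredetails TEXT, customerid INTEGER
    );
    INSERT INTO countries VALUES (1, 'United Kingdom'), (2, 'France');
    INSERT INTO customers VALUES (1, 'C-1', 1), (2, 'C-2', 2);
""")
# 40 enquiries for the UK customer, 5 for the French one.
conn.executemany(
    "INSERT INTO enquiries (enquiredetails, customerid) VALUES (?, ?)",
    [(f"enquiry {i}", 1 if i < 40 else 2) for i in range(45)],
)

# Shared FROM/WHERE so the count and the page use the same filter.
FILTER = """
    FROM enquiries
    INNER JOIN customers ON enquiries.customerid = customers.customerid
    INNER JOIN countries ON customers.countryid = countries.countryid
    WHERE countries.name = ?
"""

def fetch_page(country, page):
    """Return (rows for the 1-based page, total number of pages)."""
    total = conn.execute("SELECT COUNT(*)" + FILTER, (country,)).fetchone()[0]
    rows = conn.execute(
        "SELECT enquiries.enquiryid, enquiries.enquiredetails" + FILTER
        + " ORDER BY enquiries.enquiryid LIMIT ? OFFSET ?",
        (country, PAGE_SIZE, (page - 1) * PAGE_SIZE),
    ).fetchall()
    return rows, math.ceil(total / PAGE_SIZE)

rows, pages = fetch_page("United Kingdom", 3)
print(len(rows), pages)  # 10 3
```

Only the 15 (or fewer) rows for the requested page travel to the application, and the country check happens in the join instead of once per enquiry in application code.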