Using COUNT(Distinct ) in SQL


Here is a database schema:
customers (custID, firstname, familyname, town, state)
orders (orderID, custID, date)
lineitems (orderID, itemID, quantity, despatched)
items (itemID, description, unitcost, stocklevel)
itemSupplier (supplierID, itemID)
supplier (supplierID, sName, sAddress, telephoneNo, delivers)
Here is the question:
If we assume that the description field in the items table uses a set of pre-defined categories (e.g., tents, spades, etc.), then we can answer questions like, 'how many different kinds of tent does the shop sell?' Write a SQL query to list all items for which more than one of the same kind of item is sold and to find how many different types of that item are sold (i.e., a list with column headings: description; 'how many types are sold').
I tried to use DISTINCT in COUNT() function like this:
SELECT description, count(DISTINCT itemID) AS how_many_types_are_sold
FROM items
GROUP BY description
I'm not sure whether this is the right use of these functions. Is there any advice on solving this question? Thanks in advance :D

This query:
SELECT description, count(DISTINCT itemID) AS how_many_types_are_sold
FROM items
GROUP BY description;
Based on your description, this does what you want to do. However, you should use DISTINCT only when necessary.
There is a really good chance that a column called itemID in a table called items is actually the primary key for the table. If so, then each row has a distinct value, and the COUNT(DISTINCT) is redundant. Based on this assumption, you can just count the number of rows that match:
SELECT description, count(*) AS how_many_types_are_sold
FROM items
GROUP BY description;
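This is much preferred (because count(*) generally performs better than count(distinct)), assuming that itemID is unique in the table. To list only the descriptions sold in more than one type, as the question asks, add HAVING count(*) > 1 after the GROUP BY.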
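To check the two points above (that the two counts agree when itemID is unique, and that a HAVING clause filters out single-type descriptions), here is a minimal sketch. It uses Python's sqlite3 module as a stand-in for MySQL, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (itemID INTEGER PRIMARY KEY, description TEXT, "
    "unitcost REAL, stocklevel INTEGER)"
)
# Hypothetical sample rows: three tents, two spades, one stove.
conn.executemany(
    "INSERT INTO items (itemID, description, unitcost, stocklevel) "
    "VALUES (?, ?, ?, ?)",
    [
        (1, "tent", 120.0, 4),
        (2, "tent", 250.0, 2),
        (3, "tent", 80.0, 7),
        (4, "spade", 15.0, 20),
        (5, "spade", 22.5, 9),
        (6, "stove", 45.0, 3),
    ],
)

# itemID is the primary key here, so COUNT(*) and COUNT(DISTINCT itemID) agree.
both_counts = conn.execute(
    "SELECT description, COUNT(*), COUNT(DISTINCT itemID) "
    "FROM items GROUP BY description ORDER BY description"
).fetchall()

# HAVING keeps only the descriptions that are sold in more than one type.
multi_type = conn.execute(
    "SELECT description, COUNT(*) AS how_many_types_are_sold "
    "FROM items GROUP BY description HAVING COUNT(*) > 1 "
    "ORDER BY description"
).fetchall()

print(both_counts)
print(multi_type)
```

With this sample data the stove row drops out of the second result because only one stove item exists.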

As I understand it, you do not need DISTINCT in your SELECT statement, because GROUP BY already groups the items by description:
SELECT description, count(*) AS how_many_types_are_sold FROM items GROUP BY description
However, this works only if all the categories are predefined, as you mention in your question. Otherwise you should expect some descriptions that match no category, in which case you need a mapping table that maps every description to one of your categories.

Related

Why are Duplicates not being filtered out

I am working on some practice interview Questions and am struggling with this:
You are working with a company that sells goods to customers, and they'd like to keep track
of the unique items each customer has bought. The database is composed of two tables:
Customers and Orders. The two table schemas are given below. We want to know what
unique items were purchased by a specific customer, Wilbur, and when they were
purchased. What is the correct query that returns the customer first name, item
purchased, and purchase date with recent purchases first?
Tables: https://imgur.com/a/D47R1KU
My answer so far is
However, my answer is marked incorrect because it prints wilbur,oranges,2019-06-10
and wilbur,oranges,2018-06-10 instead of just the row with the more recent date. Please see the picture for the two tables referenced by the question. Thanks!

Between the where clause and ORDER BY, try:
GROUP BY FirstName, Item
And to get the most recent date, select MAX(PurchaseDate).

The query you are looking for is as follows.
This uses GROUP BY to say which columns are grouped together and, for the column that is not grouped, an aggregate to choose which of its many values to use (in this case, the maximum).
Note also the use of explicit, clear, SQL-92 modern join syntax and meaningful column aliases to show which table each column originates from. Distinct is not needed since each group is already unique.
Select c.FirstName, o.Item, Max(o.PurchaseDate) PurchaseDate
from Customers c
join Orders o on o.PersonId=c.PersonId
where c.FirstName = 'Wilbur'
group by c.firstName, o.Item
order by Max(o.PurchaseDate) desc;
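A minimal sketch of the GROUP BY plus MAX approach, run against SQLite through Python's sqlite3 module. The real table layouts are only in the linked image, so the columns used here (PersonId, FirstName, OrderId, Item, PurchaseDate) and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed table layouts; the actual schemas are in the question's image.
conn.executescript("""
CREATE TABLE Customers (PersonId INTEGER PRIMARY KEY, FirstName TEXT);
CREATE TABLE Orders (OrderId INTEGER PRIMARY KEY, PersonId INTEGER,
                     Item TEXT, PurchaseDate TEXT);
INSERT INTO Customers VALUES (1, 'Wilbur'), (2, 'Ada');
INSERT INTO Orders VALUES
    (1, 1, 'oranges', '2018-06-10'),
    (2, 1, 'oranges', '2019-06-10'),
    (3, 1, 'apples',  '2019-01-05'),
    (4, 2, 'oranges', '2019-03-01');
""")

# One row per (customer, item); MAX picks the latest purchase date.
# ISO-formatted date strings sort correctly as text.
rows = conn.execute("""
    SELECT c.FirstName, o.Item, MAX(o.PurchaseDate) AS PurchaseDate
    FROM Customers c
    JOIN Orders o ON o.PersonId = c.PersonId
    WHERE c.FirstName = 'Wilbur'
    GROUP BY c.FirstName, o.Item
    ORDER BY MAX(o.PurchaseDate) DESC
""").fetchall()

print(rows)
```

The two orange purchases collapse into a single row carrying the later date, which is the behavior the question was after.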

Error code 1111: Invalid use of group function sql

I'm trying to compute the total cost per location by multiplying the summed quantity of each item used by its price. This code works for now:
create view location_costs as
select
City as location,
participant_item.SederStaple as item,
sum(ItemAmount) as quantity,
item.ItemPrice*sum(ItemAmount) as price
from participant_item
inner join item
on participant_item.SederStaple=item.SederStaple
group by item, location
order by location;
but I need to compute the sum of all of the total prices by location, so I'm adding the sum statement to my code :
create view location_costs as
select
City as location,
participant_item.SederStaple as item,
sum(ItemAmount) as quant,
sum(item.ItemPrice*sum(ItemAmount)) as price
from participant_item
inner join item
on participant_item.SederStaple=item.SederStaple
group by item, location
order by location;
and it's not working anymore:
error code 1111, invalid use of group function sql

I think you want the multiplication before the sum():
select City as location, pi.SederStaple as item,
sum(ItemAmount) as quantity,
sum(ItemAmount * i.ItemPrice) as price
from participant_item pi inner join
item i
on pi.SederStaple = i.SederStaple
group by item, location
order by location;
Note that I introduced table aliases and used them for the columns that you qualified. You should qualify all column references in a query that references multiple tables.

Don't put the sum inside the sum:
select
pi.City as location,
pi.SederStaple as item,
sum(pi.ItemAmount) as quant,
sum(i.ItemPrice * pi.ItemAmount) as price
from
participant_item pi
inner join item i on pi.SederStaple = i.SederStaple
group by pi.City, pi.SederStaple
order by location;
In tidying up your aliasing I've guessed which table each column comes from (I figured an item wouldn't have a city or a quantity, but a participant would), so you might need to tweak things a bit. Always fully qualify the columns in your queries. It stops them mysteriously failing if someone later adds a column with the same name to another table.
Also, while MySQL is quite happy to use aliases (location, item) in its GROUP BY, some other databases, SQL Server among them, aren't. Keep the original column names in your GROUP BY so that you aren't caught out when your next job uses SQL Server.
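Here is a small sketch of the "multiply first, then sum" idea, again using Python's sqlite3 module in place of MySQL. It assumes City and ItemAmount live in participant_item (the same guess as above), and the sample data is invented. The second query groups by city alone, which is one way to get a single total per location.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column placement: City and ItemAmount belong to participant_item.
conn.executescript("""
CREATE TABLE item (SederStaple TEXT PRIMARY KEY, ItemPrice REAL);
CREATE TABLE participant_item (City TEXT, SederStaple TEXT, ItemAmount INTEGER);
INSERT INTO item VALUES ('matzah', 3.0), ('wine', 10.0);
INSERT INTO participant_item VALUES
    ('Boston',  'matzah', 2),
    ('Boston',  'matzah', 3),
    ('Boston',  'wine',   1),
    ('Chicago', 'wine',   2);
""")

# Multiply inside the aggregate: SUM(price * amount), never SUM(... SUM(...)).
per_item = conn.execute("""
    SELECT pi.City, pi.SederStaple,
           SUM(pi.ItemAmount) AS quantity,
           SUM(i.ItemPrice * pi.ItemAmount) AS price
    FROM participant_item pi
    INNER JOIN item i ON pi.SederStaple = i.SederStaple
    GROUP BY pi.City, pi.SederStaple
    ORDER BY pi.City, pi.SederStaple
""").fetchall()

# Grouping by city only gives one total per location.
per_location = conn.execute("""
    SELECT pi.City, SUM(i.ItemPrice * pi.ItemAmount) AS total_price
    FROM participant_item pi
    INNER JOIN item i ON pi.SederStaple = i.SederStaple
    GROUP BY pi.City
    ORDER BY pi.City
""").fetchall()

print(per_item)
print(per_location)
```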

Filtering Queries, Adding a field to a query that isn't in the table

I need help with sorting and adding a field to a query that hasn't been made yet.
For the first bullet, I'm confused by the last part, where it asks you to sort by one field and then sort again within that field: "Sort the records in ascending order by Region, and within Region, by Product Name." How would I sort only within Region? Or am I not understanding the question?
And for the second bullet, how would I create the field "Extended Price" in a query when it doesn't exist in the table? I'm sure I could handle the rest, but I need to know whether there is a way to create a field in a query without it existing in the table the query is based on.
Thank you. (BTW this is a practice question. This practice assignment in no way, shape, or form will benefit my grade)

Use the query builder to construct SQL. The result should be something like:
SELECT ProductID, ProductName, Category, UnitsInStock, UnitPrice,
Products.SupplierID, SupplierName, Region,
UnitsInStock * UnitPrice AS ExtendedPrice
FROM Products INNER JOIN Suppliers
ON Products.SupplierID = Suppliers.SupplierID
ORDER BY Region, ProductName;
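To see both ideas in action (a computed column that exists only in the query, and a two-level sort), here is a rough sketch using Python's sqlite3 module rather than Access. The table layouts and rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down versions of the Products and Suppliers tables.
conn.executescript("""
CREATE TABLE Suppliers (SupplierID INTEGER PRIMARY KEY,
                        SupplierName TEXT, Region TEXT);
CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT,
                       UnitsInStock INTEGER, UnitPrice REAL,
                       SupplierID INTEGER);
INSERT INTO Suppliers VALUES (1, 'Acme', 'West'), (2, 'Bolt', 'East');
INSERT INTO Products VALUES
    (1, 'Widget', 10,  2.5,  1),
    (2, 'Anvil',   2,  40.0, 1),
    (3, 'Nut',    100, 0.25, 2);
""")

# ExtendedPrice is calculated in the query; it is not stored in any table.
# ORDER BY Region, ProductName sorts by Region first, and rows that share
# a Region are then sorted by ProductName.
rows = conn.execute("""
    SELECT s.Region, p.ProductName,
           p.UnitsInStock * p.UnitPrice AS ExtendedPrice
    FROM Products p
    INNER JOIN Suppliers s ON p.SupplierID = s.SupplierID
    ORDER BY s.Region, p.ProductName
""").fetchall()

print(rows)
```

Within the West region, Anvil comes before Widget, which is what "within Region, by Product Name" means.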

select p.product_name,p.category,p.supplier_id,s.supplier_name,s.region,s.unitprice
from products p join supplier s
on p.supplier_id=s.supplier_id
This may work if you adapt it to your own table and column names.

SELECT, Count & Insert in a single query?

I'm not entirely sure if this is possible, but I suspect it is.
I'm trying to gather some very basic statistics, so I have a 'tracker' table that stores info on an ongoing basis, like so:
ID, IP, itemid
Each time an item is viewed, the visitor's IP address and the item ID are logged.
On a daily basis, I'd like to summarize this data and insert it into another table, like so:
ID, itemid, views
Now, I want the 'views' element to count unique visitors, ignoring duplicate IP addresses (counting each only once).
I know I could simply loop through them all and do it that way, but is it possible to do the entire process with just a single query?
I'm using MySQL

If you group the tracker table by itemid, the number of distinct IP addresses should be the number of views you want:
INSERT INTO newtable (itemid, views)
SELECT itemid, COUNT(DISTINCT IP)
FROM tracker
GROUP BY itemid;
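A quick way to try this out is the sketch below, which runs the same INSERT ... SELECT against SQLite through Python's sqlite3 module. The table name newtable comes from the query above; the auto-incrementing ID columns and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed layouts: both tables get an auto-incrementing ID column.
conn.executescript("""
CREATE TABLE tracker (ID INTEGER PRIMARY KEY AUTOINCREMENT,
                      IP TEXT, itemid INTEGER);
CREATE TABLE newtable (ID INTEGER PRIMARY KEY AUTOINCREMENT,
                       itemid INTEGER, views INTEGER);
""")
# Item 1 is viewed three times by two distinct IPs; item 2 twice by one IP.
conn.executemany(
    "INSERT INTO tracker (IP, itemid) VALUES (?, ?)",
    [
        ("10.0.0.1", 1),
        ("10.0.0.1", 1),
        ("10.0.0.2", 1),
        ("10.0.0.3", 2),
        ("10.0.0.3", 2),
    ],
)

# Summarize and insert in a single statement.
conn.execute("""
    INSERT INTO newtable (itemid, views)
    SELECT itemid, COUNT(DISTINCT IP)
    FROM tracker
    GROUP BY itemid
""")

summary = conn.execute(
    "SELECT itemid, views FROM newtable ORDER BY itemid"
).fetchall()
print(summary)
```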

In other RDBMSs it can be done in this manner:
insert into othertable (field_views, field_itemid)
select count(distinct t.IP), t.itemid from tracker t
group by t.itemid;
See also http://dev.mysql.com/doc/refman/5.0/en/insert-select.html
Note that this solution assumes othertable.id is an auto-increment column.

Try this:
insert into newtable(itemid,views)
select itemid,count(*)
from (
select itemid
from tracker
group by itemid,ip
)
as a
group by a.itemid;

mysql group by and filtering the values in each grouped record

I have a table of users that I'm grouping by age, but each user also has a nationality. If one of the users in a group has the nationality US, I want that to be the value in the grouped record. Currently it seems to take the first nationality it finds. How can I write this query?

One way to do it would be:
SELECT *, IF(INSTR(GROUP_CONCAT('--', nationality, '--'), '--US--'),
'US', nationality)
FROM table GROUP BY age;
What this does is that GROUP_CONCAT combines all the nationalities of one age and if it finds the string 'US' among them, it returns 'US' and otherwise it returns the nationality as it would normally do. It also adds '--' in the beginning and end of a nationality to make 'US' become '--US--'. If you didn't do that, the query would also think that any other nationality which contains the consecutive characters 'US' would mean US. But those '--' characters are only used internally and are not shown in the final result.
Edit: Another (cleaner but longer) way came to mind:
SELECT * FROM (SELECT * FROM table WHERE nationality='US'
UNION
SELECT * FROM table WHERE nationality!='US') AS tmp
GROUP BY age;
So, first select the persons whose nationality is US, then select the persons whose nationality is not US, and combine those two sets so that the US persons come first and the others follow. Then perform the GROUP BY on that combined table, and the nationality should come out as US whenever at least one person of that age is from the US, because that row comes first. Be aware that this relies on MySQL taking the first row it encounters for the non-aggregated columns, which is not guaranteed, and the query is rejected when ONLY_FULL_GROUP_BY is enabled.
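A different technique that avoids depending on row order is conditional aggregation: flag the US rows with a CASE expression and aggregate the flag. This is not what either query above does. It is a sketch only, run against SQLite via Python's sqlite3 module, with a hypothetical users table and made-up rows. For groups with no US member it falls back to MIN(nationality), which is an arbitrary but deterministic choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows; 'AUS' and 'RUS' contain the letters 'US'
# but must not be treated as US.
conn.executescript("""
CREATE TABLE users (name TEXT, age INTEGER, nationality TEXT);
INSERT INTO users VALUES
    ('a', 30, 'FR'),
    ('b', 30, 'US'),
    ('c', 30, 'DE'),
    ('d', 25, 'AUS'),
    ('e', 25, 'RUS');
""")

# MAX over the 0/1 flag is 1 exactly when the group has at least one US row.
rows = conn.execute("""
    SELECT age,
           CASE
               WHEN MAX(CASE WHEN nationality = 'US' THEN 1 ELSE 0 END) = 1
                   THEN 'US'
               ELSE MIN(nationality)
           END AS nationality
    FROM users
    GROUP BY age
    ORDER BY age
""").fetchall()

print(rows)
```

Because every selected column is either grouped or aggregated, the result does not depend on which row the database happens to read first.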
