I am currently using the following query to find the "category_id" that is used most often, based on the "name" of an item.
SELECT Name,category_id,COUNT(*) as count
FROM ex.item
where name LIKE '%living%'
GROUP BY category_id ORDER by count DESC;
However, I hit situations in which the counts are equal, so I modified my query to pick one of the tied results at random:
SELECT Name,category_id,COUNT(*) as count
FROM ex.item
where name LIKE '%living%'
GROUP BY category_id ORDER BY count DESC, RAND() LIMIT 1;
This works, but I want to improve the query and remove the rand() altogether by doing the following:
1) Take subcategory_id (in the same table) into account. In the example above, if category_id 550 is the most prevalent category_id and its rows have three subcategory_id values, two of 800 and one of 900, then return category_id 550 and subcategory_id 800 as the most common result.
2) If the counts are still equal even after including subcategory_id (the scenario in the picture above), check whether the query string also appears in the item's description field (in the same table). If it appears in both the description and the name, return that row as the prevalent result.
Thanks
1) You can group by more than one value:
SELECT Name,category_id, subcategory_id, COUNT(*) as count
FROM ex.item
where name LIKE '%living%'
GROUP BY category_id, subcategory_id ORDER by count DESC;
2) Use a CASE expression to compute a score/weight/boost that you can combine with other factors:
SELECT category_id, COUNT(*) AS count,
SUM(CASE WHEN description LIKE '%living%' THEN 1 ELSE 0 END) AS boost
FROM ex.item
WHERE name LIKE '%living%'
GROUP BY category_id ORDER BY count DESC, boost DESC;
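Suggestions 1) and 2) can be combined into a single ranking. Here is a minimal sketch, using Python's built-in sqlite3 as a stand-in for MySQL. The sample rows, the column layout of item, and the helper name top_category are assumptions made up for illustration; boost counts how many of the matching names also match in the description.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (name TEXT, description TEXT, "
    "category_id INTEGER, subcategory_id INTEGER)"
)
# Hypothetical sample data: (550, 800) and (560, 810) tie on count.
conn.executemany(
    "INSERT INTO item VALUES (?, ?, ?, ?)",
    [
        ("living room sofa", "sofa for the living room", 550, 800),
        ("living room lamp", "floor lamp", 550, 800),
        ("living room rug", "wool rug", 550, 900),
        ("living chair", "armchair", 560, 810),
        ("living table", "coffee table", 560, 810),
    ],
)


def top_category(term):
    """Return (category_id, subcategory_id, count, boost) for the best group."""
    pattern = f"%{term}%"
    return conn.execute(
        """
        SELECT category_id, subcategory_id,
               COUNT(*) AS cnt,
               SUM(CASE WHEN description LIKE ? THEN 1 ELSE 0 END) AS boost
        FROM item
        WHERE name LIKE ?
        GROUP BY category_id, subcategory_id
        ORDER BY cnt DESC, boost DESC   -- boost only breaks ties on count
        LIMIT 1
        """,
        (pattern, pattern),
    ).fetchone()


print(top_category("living"))
```

Both tied groups have two matching names, and the description match decides between them. If ties can still remain after that, one more deterministic sort key (for example category_id) would make the result stable.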
Stuff you didn't ask about:
If these categories are "tied", why not break the tie in favor of the one that makes you the most money (highest margins, highest sale percentage, etc.)?
If you use wildcards at the beginning of your values, the query can't use an index on that column. Look into a full-text search function (MySQL has one, or use something like solr/elasticsearch/cloudsearch). This will also help you with the scoring.
I want to count the ugyfel_email field in this select.
If I run it, I don't get any errors, but it shows only one row out of all the records in the table.
What am I doing wrong?
SELECT id,ugyfel_nev,ugyfel_email,parkolo_tipus,ugyfel_tel,rendszam,erkezes_datum,
erkezes_ideje,allapot, utasok, COUNT(ugyfel_nev) AS ennyiszer FROM foglalas WHERE allapot = 'Feldolgozva' ORDER BY id DESC
Because the query has an aggregate (COUNT) and no GROUP BY, all matching rows collapse into a single group, which is why you get only one row. To count a column, the count has to be per something, that is, grouped by some other column. I assume you want to count how many ugyfel_email values there are for a specific allapot, which would look like:
SELECT allapot, COUNT(ugyfel_nev), COUNT(ugyfel_email)
FROM foglalas
WHERE allapot = 'Feldolgozva' -- drop this line to get a count for every allapot
GROUP BY allapot
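To make the difference visible, here is a small sketch using Python's built-in sqlite3 in place of MySQL. The sample rows are invented, only three of the original columns are kept, and grouping by ugyfel_nev is my guess at what the ennyiszer column was meant to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foglalas (id INTEGER, ugyfel_nev TEXT, allapot TEXT)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO foglalas VALUES (?, ?, ?)",
    [
        (1, "Anna", "Feldolgozva"),
        (2, "Anna", "Feldolgozva"),
        (3, "Bela", "Feldolgozva"),
        (4, "Bela", "Uj"),
    ],
)

# Aggregate without GROUP BY: the whole filtered set is one group, so one row.
total = conn.execute(
    "SELECT COUNT(ugyfel_nev) FROM foglalas WHERE allapot = 'Feldolgozva'"
).fetchall()

# One row per customer, with the number of processed bookings for each.
per_customer = conn.execute(
    """
    SELECT ugyfel_nev, COUNT(*) AS ennyiszer
    FROM foglalas
    WHERE allapot = 'Feldolgozva'
    GROUP BY ugyfel_nev
    ORDER BY ugyfel_nev
    """
).fetchall()

print(total)
print(per_customer)
```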
I am trying to do a simple test where I'm pulling from a table the information of a specific part number as such:
SELECT *
FROM table_name
WHERE part_no IN ('abc123')
This returns 25 rows. Now I want to count the rows that meet the "accepted" condition in a specific column, but only among the 10 most recent rows. My approach is to write it as follows:
Select Count(*)
FROM table_name
WHERE part_no IN ('abc123') AND lot IN ('accepted')
ORDER BY date DESC
LIMIT 10
I'm having a hard time getting the ORDER BY and LIMIT operations to work. I could use help just getting it to limit appropriately, and I can figure out the rest from there.
Edit: I understand that the operations are happening on the COUNT which only returns one row with a value; but I put the second clip to show where I am stuck in my thought process.
Your query SELECT Count(*) FROM ... will always return exactly one row.
It's not 100% clear what exactly you want to do, but if you want to know how many of the last 10 have been accepted, you could use a subquery - something like:
SELECT COUNT(*) FROM (
    SELECT lot
    FROM table_name
    WHERE part_no IN ('abc123')
    ORDER BY date DESC
    LIMIT 10
) AS recent -- MySQL requires an alias on a derived table
WHERE lot IN ('accepted')
The inner query will return the 10 most recent rows for part abc123, then the outer query will count the accepted ones.
There are also other solutions (for example, you could have the inner query output a field that is 0 when the part is not accepted and 1 when the part is accepted, then take the sum). Depending on which exact dialect/database you are using, you may also have more elegant options.
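The 0/1 sum variant mentioned above could look roughly like this. It is a sketch run through Python's built-in sqlite3 rather than your actual database, and the rows are invented: 12 rows for abc123, accepted on even days only, plus one row for a different part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (part_no TEXT, lot TEXT, date TEXT)")

# Hypothetical data: 12 rows for abc123, accepted on even days only,
# plus one row for another part that must be ignored.
rows = [
    ("abc123", "accepted" if day % 2 == 0 else "rejected", f"2020-01-{day:02d}")
    for day in range(1, 13)
]
rows.append(("xyz999", "accepted", "2020-01-31"))
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", rows)

accepted_in_last_10 = conn.execute(
    """
    SELECT COALESCE(SUM(CASE WHEN lot = 'accepted' THEN 1 ELSE 0 END), 0)
    FROM (
        SELECT lot
        FROM table_name
        WHERE part_no = 'abc123'
        ORDER BY date DESC
        LIMIT 10            -- limit first, inside the derived table
    ) AS recent
    """
).fetchone()[0]

print(accepted_in_last_10)
```

The COALESCE is only there so that a part with no rows at all gives 0 instead of NULL.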
SELECT COUNT(*) returns ONE ROW, so ORDER BY and LIMIT have nothing left to act on in the result.
Suppose my query is -
SELECT * FROM products ORDER BY is_featured DESC, created_date DESC
Here is_featured is a flag field in the table that holds either 1 or 0.
Obviously, the above query returns all featured products first (latest first), followed by the normal products (also latest first).
My question: how can we rewrite the above query so that featured products come first (in random order), followed by the normal products (sorted by created date)?
I sense that one possible answer is to write two separate queries, join the result sets, and loop over them to display the products. But can it be achieved with a single query?
One way I can think of is to add another calculated expression to the ORDER BY clause that returns a random value for featured products and a constant for the other products, so it doesn't affect their order:
SELECT *
FROM products
ORDER BY is_featured DESC,
         CASE is_featured
             WHEN 1 THEN RAND()
             ELSE 1 -- Or some other constant
         END,
         created_date DESC
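A quick way to convince yourself that this ordering behaves as described is to run it on a toy table. The sketch below uses Python's built-in sqlite3, so RANDOM() stands in for MySQL's RAND(); the six rows and the helper name listing are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER, is_featured INTEGER, created_date TEXT)"
)
# Hypothetical rows: odd ids are featured, even ids are normal products.
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, 1, "2020-01-01"),
        (2, 0, "2020-01-02"),
        (3, 1, "2020-01-03"),
        (4, 0, "2020-01-04"),
        (5, 1, "2020-01-05"),
        (6, 0, "2020-01-06"),
    ],
)


def listing():
    """Featured products first in random order, then the rest, newest first."""
    rows = conn.execute(
        """
        SELECT id
        FROM products
        ORDER BY is_featured DESC,
                 CASE is_featured
                     WHEN 1 THEN RANDOM()  -- SQLite spelling of RAND()
                     ELSE 1                -- constant: no effect on normal rows
                 END,
                 created_date DESC
        """
    ).fetchall()
    return [row[0] for row in rows]


print(listing())
```

The first three ids come back in a different order from run to run, while the last three are always 6, 4, 2.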
I have a database with 1 table with the following rows:
id name date
-----------------------
1 Mike 2012-04-21
2 Mike 2012-04-25
3 Jack 2012-03-21
4 Jack 2012-02-12
I want to extract only distinct values, so that I will only get Mike and Jack once.
I have this code for a search script:
SELECT DISTINCT name FROM table WHERE name LIKE '%$string%' ORDER BY id DESC
But it doesn't work. It outputs Mike, Mike, Jack, Jack.
Why?
Because of the ORDER BY id DESC clause, the query is treated roughly as if it were written:
SELECT DISTINCT name, id
FROM table
ORDER BY id DESC;
except that the id columns are not returned to the user (you). The result set has to include the id to be able to order by it. Obviously, this result set has four rows, so that's what is returned. (Moral: don't order by hidden columns — unless you know what it is going to do to your query.)
Try:
SELECT DISTINCT name
FROM table
ORDER BY name;
(with or without DESC according to whim). That will return just the two rows.
If you need to know an id for each name, consider:
SELECT name, MIN(id)
FROM table
GROUP BY name
ORDER BY MIN(id) DESC;
You could use MAX to equally good effect.
All of this applies to all SQL databases, including MySQL. MySQL has some rules which allow you to omit GROUP BY clauses with somewhat non-deterministic results. I recommend against exploiting the feature.
For a long time (maybe even now) the SQL standard did not allow you to order by columns that were not in the select-list, precisely to avoid confusions such as this. When the result set does not include the ordering data, the ordering of the result set is called 'essential ordering'; if the ordering columns all appear in the result set, it is 'inessential ordering' because you have enough data to order the data yourself.
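As a concrete version of the GROUP BY suggestion applied to the search script, here is a sketch run through Python's built-in sqlite3. The table name people is a stand-in I chose because table is a reserved word, the helper name search is hypothetical, and the search string is passed as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "people" is a hypothetical stand-in for the table in the question.
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        (1, "Mike", "2012-04-21"),
        (2, "Mike", "2012-04-25"),
        (3, "Jack", "2012-03-21"),
        (4, "Jack", "2012-02-12"),
    ],
)


def search(term):
    """One row per matching name, ordered by that name's highest id."""
    return conn.execute(
        """
        SELECT name, MAX(id) AS latest_id
        FROM people
        WHERE name LIKE ?
        GROUP BY name
        ORDER BY latest_id DESC
        """,
        (f"%{term}%",),
    ).fetchall()


print(search(""))
print(search("mi"))
```

Every ordering column now appears in the select-list, so each name shows up exactly once and the order is well defined.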
I'm writing a query where I group a selection of rows to find the MIN value for one of the columns.
I'd also like to return the other column values associated with the MIN row returned.
e.g
ID QTY PRODUCT TYPE
--------------------
1 2 Orange Fruit
2 4 Banana Fruit
3 3 Apple Fruit
If I GROUP this table by the column 'TYPE' and select the MIN qty, it won't return the corresponding product for the MIN row, which in the case above is 'Orange'.
Adding an ORDER BY clause before grouping seems to solve the problem. However, before I go ahead and include this query in my application, I'd just like to know whether this method will always return the correct value. Is this the correct approach? I've seen some examples where subqueries are used; however, I have also read that this is inefficient.
Thanks in advance.
No, this is not the correct approach.
I believe you are talking about a query like this:
SELECT product.*, MIN(qty)
FROM product
GROUP BY
type
ORDER BY
qty
What you are doing here is using MySQL's extension that allows you to select unaggregated/ungrouped columns in a GROUP BY query.
This is mostly used in the queries containing both a JOIN and a GROUP BY on a PRIMARY KEY, like this:
SELECT `order`.id, `order`.customer, SUM(price)
FROM `order` -- order is a reserved word, so it has to be quoted
JOIN orderline
ON orderline.order_id = `order`.id
GROUP BY
`order`.id
Here, order.customer is neither grouped nor aggregated, but since you are grouping on order.id, it is guaranteed to have the same value within each group.
In your case, the rows within a group have different values of qty (and of the other columns).
It is not guaranteed from which record within the group the engine will take the value.
You should do this:
SELECT p.*
FROM (
    SELECT DISTINCT type
    FROM product
) pd
JOIN product p
ON p.id =
(
    SELECT pi.id
    FROM product pi
    WHERE pi.type = pd.type
    ORDER BY
        type, qty, id
    LIMIT 1
)
If you create an index on product (type, qty, id), this query will work fast.
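Here is the same approach run end to end on the table from the question, as a sketch using Python's built-in sqlite3 instead of MySQL. The two Vegetable rows and the index name are additions of mine so that there is more than one group to look at.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product "
    "(id INTEGER PRIMARY KEY, qty INTEGER, product TEXT, type TEXT)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        (1, 2, "Orange", "Fruit"),
        (2, 4, "Banana", "Fruit"),
        (3, 3, "Apple", "Fruit"),
        (4, 5, "Carrot", "Vegetable"),  # hypothetical extra rows
        (5, 1, "Leek", "Vegetable"),
    ],
)
# Index suggested above; the name is arbitrary.
conn.execute("CREATE INDEX ix_product_type_qty_id ON product (type, qty, id)")

cheapest_per_type = conn.execute(
    """
    SELECT p.id, p.qty, p.product, p.type
    FROM (SELECT DISTINCT type FROM product) pd
    JOIN product p
      ON p.id = (
          SELECT p2.id
          FROM product p2
          WHERE p2.type = pd.type
          ORDER BY p2.qty, p2.id   -- lowest qty first, id breaks ties
          LIMIT 1
      )
    ORDER BY p.type
    """
).fetchall()

print(cheapest_per_type)
```

Each type contributes exactly one whole row, so the product name always belongs to the row that actually holds the minimum qty.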
It's difficult to follow you properly without an example of the query you tried.
From your comments, I guess your query is something like:
SELECT PRODUCT_TYPE, COUNT(*) AS QTY
FROM PRODUCTS
GROUP BY PRODUCT_TYPE
ORDER BY COUNT(*) ASC;
My advice: group by the concept (in this case PRODUCT_TYPE) and order by the number of times it appears, COUNT(*). The query above would do what you want.
Subqueries are mostly for sorting or for discarding rows that are not of interest.
The MIN you are looking for is not exactly a MIN; it is a number of occurrences, and you want to see first the one with the fewest occurrences (meaning it appears the fewest times, I guess).
Cheers,