SQL select distinct on one field MySQL server - mysql

I know there are several questions like this one, but I feel I've spent more than enough time trying different examples for a simple little hobby project. Yes, I'm being lazy, but in my defense, it's Saturday morning...
So I've got a table, Items, which has the following fields (and more irrelevant ones):
id (varchar), product (varchar), provider (varchar)
Items are part of products, and have a provider.
I want to write a query where I can get one item per product a given provider supplies.
So even if a provider supplies all items for a product, I just want one of those.
I know I can't do distinct on only one field, but I've tried different variants of joins.
Should be simple for those of you who actually know what you're doing.

MySQL has gone to the trouble of creating a website that contains a manual and references to all of its features.
For DISTINCT:
http://dev.mysql.com/doc/refman/5.0/en/distinct-optimization.html
select DISTINCT(columnname) from MYTABLE where RULESAPPLY

You could use the DISTINCT operator:
SELECT DISTINCT * FROM ...
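Neither answer above spells out a full query. One common way to get a single item per product for a given provider is to group by product and pick a deterministic representative, for example the smallest id. The sketch below is only an illustration: it runs the query against an in-memory SQLite database through Python so that it is self-contained, the table and column names come from the question, and the sample rows and the provider name 'acme' are made up. The same GROUP BY / MIN() query should also be valid in MySQL.

```python
import sqlite3

# In-memory database standing in for the MySQL server (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Items (id TEXT PRIMARY KEY, product TEXT, provider TEXT)")
conn.executemany(
    "INSERT INTO Items VALUES (?, ?, ?)",
    [
        ("i1", "chair", "acme"),   # hypothetical sample data
        ("i2", "chair", "acme"),
        ("i3", "table", "acme"),
        ("i4", "table", "other"),
        ("i5", "lamp", "other"),
    ],
)

# One row per product supplied by the provider; MIN(id) picks the representative.
query = """
SELECT product, MIN(id) AS id
FROM Items
WHERE provider = ?
GROUP BY product
ORDER BY product
"""
rows = conn.execute(query, ("acme",)).fetchall()
print(rows)
```

If the other columns of the chosen item are needed as well, this result can be joined back to Items on id.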

Related

SQL only get rows that matches full number split by a comma

I'm working on something that shows shops under a specific category. However, I have an issue because I store the categories of a shop in a single record as a comma-separated list of category ids, like this: "1,5,12". Now, the problem is that if I want to show shops with category 2, it mistakes 12 for category 2. This is the SQL right now:
SELECT * FROM shops WHERE shop_cats LIKE '%".$sqlid."%' LIMIT 8
Is there a way to split the record "shop_cats" by a comma in SQL, so it checks the full number? The only way I can think of is to get all the shops, and do it with PHP, but I don't like that as it will take too many resources.
This is a really, really bad way to store categories, for many reasons:
You are storing numbers as strings.
You cannot declare proper foreign key relationships.
A (normal) column in a table should have only one value.
SQL has poor string functions.
The resulting queries cannot take advantage of indexes.
The proper way to store this information in a database is using a junction table, with one row per shop and per category.
Sometimes, we are stuck with other people's really bad design decisions. If this is your case, then you can use FIND_IN_SET():
WHERE FIND_IN_SET($sqlid, shop_cats) > 0
But you should really fix the data structure.
If you can, the correct solution should be to normalize the table, i.e. have a separate row per category, not with commas.
If you can't, this should do the work:
SELECT * FROM shops WHERE CONCAT(',' , shop_cats , ',') LIKE '%,".$sqlid.",%' LIMIT 8
The table shops does not follow 1NF (first normal form), i.e. every column should hold exactly one value. To fix that, you need to create another table, called a pivot table, which relates the two tables (or entities). But to answer your question, the SQL query below should do the trick.
SELECT * FROM shops WHERE concat(',',shop_cats,',') LIKE '%,".$sqlid.",%' LIMIT 8
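To see why wrapping both the column and the search term in delimiters fixes the false matches, here is a small sketch. It uses Python's sqlite3 module purely so it can run anywhere; SQLite concatenates with `||` where MySQL uses CONCAT(), and the shop rows are invented for the demo. The category id is passed as a bound parameter instead of being spliced into the SQL string, which is an addition of mine and not part of the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shops (name TEXT, shop_cats TEXT)")
conn.executemany(
    "INSERT INTO shops VALUES (?, ?)",
    [("a", "1,5,12"), ("b", "2,7"), ("c", "12"), ("d", "3,2")],  # hypothetical rows
)

# Naive substring match: category 2 also hits every list containing "12".
naive = [
    r[0]
    for r in conn.execute(
        "SELECT name FROM shops WHERE shop_cats LIKE ? ORDER BY name", ("%2%",)
    )
]

def shops_in_category(cat_id):
    """Match the whole number by surrounding list and needle with commas."""
    pattern = f"%,{int(cat_id)},%"
    sql = "SELECT name FROM shops WHERE ',' || shop_cats || ',' LIKE ? ORDER BY name"
    return [r[0] for r in conn.execute(sql, (pattern,))]

print(naive)
print(shops_in_category(2))
print(shops_in_category(12))
```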

Extremely basic SQL Misunderstanding

I'm preparing for an exam in databases and SQL and I'm solving an exercise:
We have a database of 4 tables that represent a human resources company. The tables are:
applicant(a-id,a-name,a-city,years-of-study),
job(job-name,job-id),
qualified(a-id,job-id)
wish(a-id,job-id).
The table applicant obviously represents the applicants, and job is the table of available jobs. The table qualified shows which jobs a person is qualified for, and the table wish shows which jobs a person is interested in.
The question was to write a query that displays, for each job-id, the number of applicants who are both qualified for and interested in that job.
Here is the solution the teacher wrote:
Select q1.job_id
, count(q1.a_id)
from qualified as q1
, wish as w1
Where q1.a_id = w1.a_id
and q1.job_id = w1.job_id
Group by job_id;
That's all well and good. I'm not sure why we needed that "as q1" and "as w1", but I can see why it works.
And here is the solution I wrote:
SELECT job-id,COUNT(a-id) FROM job,qualified,wish WHERE (qualified.a-id=wish.a-id)
GROUP BY job-id
Why is my solution wrong? And also, from which table will it select the information? Suppose I write SELECT job-id FROM job,qualified,wish. From which table will it take the information, given that job-id exists in all 3 of these tables?
You can only refer to tables mentioned in the FROM clause. If it's ambiguous (because more than one has a column of the same name) then you need to be explicit by qualifying the name. Usually the qualifier is an alias but it could also be the table name itself if an alias wasn't specified.
There's a concept of a "natural join" which joins tables on common column(s) between two tables. Not all systems support that notation but I think MySQL does. I believe these systems usually collapse the joined pairs into a single column.
select q1.job_id, count(q1.a_id) from qualified as q1, wish as w1
where q1.a_id = w1.a_id and q1.job_id = w1.job_id
group by job_id;
I don't think I've worked on any systems that would have accepted the query above because the grouping column would have been strictly unclear even though the intention really is not. So if it truly does work correctly on MySQL then my guess is that it recognizes the equivalence of the columns and cuts you some slack on the syntax.
By the way, your query appears to be incorrect because you only included a single column in a join that requires two columns. You also included a third table without any join condition, which means that your result will effectively be cross joined with every row in that table. The grouping is still going to reduce it to one row per job_id, but the count is going to be multiplied by the number of rows in the job table. Perhaps you added that table thinking it wouldn't hurt to add it just in case you need it, but that is not what it means at all.
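The multiplication effect described above is easy to reproduce. The following sketch runs both queries on a tiny made-up data set using Python's sqlite3 module; column names use underscores (as in the teacher's solution), every column is qualified so that no engine has to guess, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job (job_id INTEGER, job_name TEXT);
CREATE TABLE qualified (a_id INTEGER, job_id INTEGER);
CREATE TABLE wish (a_id INTEGER, job_id INTEGER);
INSERT INTO job VALUES (1, 'dev'), (2, 'qa'), (3, 'ops');
INSERT INTO qualified VALUES (10, 1), (10, 2), (11, 1);
INSERT INTO wish VALUES (10, 1), (11, 1), (11, 2);
""")

# Teacher's query: join on BOTH columns, with the grouping column qualified.
teacher_rows = conn.execute("""
SELECT q1.job_id, COUNT(q1.a_id)
FROM qualified AS q1, wish AS w1
WHERE q1.a_id = w1.a_id AND q1.job_id = w1.job_id
GROUP BY q1.job_id
ORDER BY q1.job_id
""").fetchall()

# Student's query: join on a_id only, plus an unconstrained third table.
student_rows = conn.execute("""
SELECT qualified.job_id, COUNT(qualified.a_id)
FROM job, qualified, wish
WHERE qualified.a_id = wish.a_id
GROUP BY qualified.job_id
ORDER BY qualified.job_id
""").fetchall()

print(teacher_rows)  # only job 1 has applicants who are qualified AND interested
print(student_rows)  # inflated: wrong pairings, each counted once per row in job
```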
Your query will list non-existing jobs in case the database has orphan records in applicant and qualified, and might also omit jobs that have no qualified and willing candidates.
I'm not exactly sure, because I have no idea if there's any database that will accept COUNT(a-id) when there's no information about the table from which to take this value.
Edit: Interestingly, it looks like both of these problems are shared by both of the solutions, but shawnt00 has a point: your solution builds a huge, pointless Cartesian product of three tables. To see it, run the query without the GROUP BY.
My current best guess for a working answer would therefore be http://sqlfiddle.com/#!9/09d0c/6

How to select number with LIKE?

I have a bunch of products and a bunch of category pages. One product can be in multiple categories, so in my database I have a products table with a "categories" column. In this column I store the IDs of all the categories that the product belongs to, as a string separated by semicolons.
Example: 1;5;23;35;49;.
When I browse to Category Page ID 5, I want to see all products that have 5; in its categories-column. I currently do this by
SELECT * FROM products WHERE categories LIKE "%".category.";%"
The problem is that this matches more than just 5. It matches 15; or 25; as well.
So questions:
How do I make sure that I only select the number I want? If category is "5" I do not want it to match 15, 25, 35 and so on.
Maybe this is a very bad way of storing the category-ids. Do you have any suggestions of a different way of storing what products that belong to what category?
Others have mentioned that a junction table is the right way to design the database. SQL has a very nice data structure for storing lists. It is not called a "string", it is called a "table".
But, sometimes one is stuck with data in this format and needs to work with it. In that case, the key is to put the delimiters on both side to prevent the problem you are having:
SELECT *
FROM products
WHERE concat(';', categories) LIKE "%;".category.";%"
Your list already ends in a semicolon, so appending one at the end is not necessary.
Another more typical MySQL solution is find_in_set():
SELECT *
FROM products
WHERE find_in_set(category, replace(categories, ';', ',')) > 0;
It is designed for comma-delimited lists. Odd that MySQL supports such a function when storing lists this way is generally a bad idea, but it does. Still, a junction table is better for performance reasons (and for other reasons).
Answers/comments to your two questions:
1. The only way I can think of that you could do this without modifying your schema (see #2) is to use a MySQL regular expression, but this is really not a good idea. See http://dev.mysql.com/doc/refman/5.1/en/regexp.html for documentation, though.
2. You are right: this is not a good way to store categories. What you want is a join table, also known as a junction table (see http://en.wikipedia.org/wiki/Junction_table). One way would be to have three tables: product, category, and a product_categories table. Product and category would have a unique ID, as you already have, and the product_categories table would have two columns: product_id and category_id. If product 1 belongs to categories 10 and 11, you would have two rows in the product_categories table: 1,10 and 1,11.
I can elaborate if you need more help but this should get you started in re-architecting your database (more) correctly.
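As a rough illustration of the three-table layout described above, here is a minimal sketch run through Python's sqlite3 module. The table names product, category and product_categories follow the answer; the name columns, the sample rows and the helper function are assumptions added for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
-- Junction table: one row per (product, category) pair.
CREATE TABLE product_categories (
    product_id INTEGER REFERENCES product(id),
    category_id INTEGER REFERENCES category(id),
    PRIMARY KEY (product_id, category_id)
);
INSERT INTO product VALUES (1, 'kettle'), (2, 'toaster');
INSERT INTO category VALUES (5, 'kitchen'), (15, 'electric'), (25, 'sale');
INSERT INTO product_categories VALUES (1, 5), (1, 15), (2, 15), (2, 25);
""")

def products_in_category(category_id):
    """Exact match on an integer column: 5 can never match 15 or 25."""
    sql = """
    SELECT p.name
    FROM product AS p
    JOIN product_categories AS pc ON pc.product_id = p.id
    WHERE pc.category_id = ?
    ORDER BY p.name
    """
    return [r[0] for r in conn.execute(sql, (category_id,))]

print(products_in_category(5))
print(products_in_category(15))
```

Because category_id is a plain integer column, the lookup can use an index, which the LIKE-based queries cannot.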
You can try changing your LIKE criteria to "%;".category.";%"

Filter a MySQL Result in Delphi

I'm having an issue with a certain requirement of one of my homework assignments. I am required to take a list of students and print out all of the students with 12 or more credit hours. The credit hours are stored in a separate table and referenced through a third table.
Basically, there is a students table, a classes table with hours, and an enrolled table matching student ID to course ID.
I used a SUM aggregate grouped by first name from the tables and that all works great, but I don't quite understand how to filter out the people with fewer than 12 hours, since the SQL doesn't know how many hours each person is taking until it's done with the query.
My string looks like this:
'SELECT Students.Fname, SUM(Classes.Crhrs) AS Credits
FROM Students, Classes, Enrolled
WHERE Students.ID = Enrolled.StudentID AND Classes.ID = Enrolled.CourseID
GROUP BY Students.Fname;'
It works fine and shows the grid in the Delphi Project, but I don't know where to go from here to filter the results, since each query run deletes the previous.
Since it's a homework exercise, I'm going to give a very short answer: look up the documentation for HAVING.
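For readers who just want to see the clause in action: HAVING filters groups after the aggregates have been computed, whereas WHERE filters rows before grouping. The sketch below reuses the table and column names from the question and runs in SQLite via Python so it is self-contained; the student names and credit hours are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (ID INTEGER PRIMARY KEY, Fname TEXT);
CREATE TABLE Classes (ID INTEGER PRIMARY KEY, Crhrs INTEGER);
CREATE TABLE Enrolled (StudentID INTEGER, CourseID INTEGER);
INSERT INTO Students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO Classes VALUES (1, 4), (2, 4), (3, 4), (4, 3), (5, 5);
INSERT INTO Enrolled VALUES (1, 1), (1, 2), (1, 3),   -- Ann: 12 hours
                            (2, 4), (2, 1),           -- Bob: 7 hours
                            (3, 5), (3, 1), (3, 2);   -- Cy: 13 hours
""")

# Same query as in the question, plus a HAVING clause on the aggregate.
full_time = conn.execute("""
SELECT Students.Fname, SUM(Classes.Crhrs) AS Credits
FROM Students, Classes, Enrolled
WHERE Students.ID = Enrolled.StudentID AND Classes.ID = Enrolled.CourseID
GROUP BY Students.Fname
HAVING SUM(Classes.Crhrs) >= 12
ORDER BY Students.Fname
""").fetchall()
print(full_time)
```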
Besides getting the desired result directly from SQL as Martijn suggested, Delphi datasets also have ways to filter data on the "client side". Check the Filter property and the OnFilterRecord event.
Anyway, remember it is usually better to apply the best "filter" on the database side using the proper SQL, and then use client side "filters" only to allow for different views on an already obtained data set, without re-querying the same data, thus saving some database resources and bandwidth (as long as the data on the server didn't change meanwhile...)

I need some sort of full text search on mysql database

I'm stuck on one quite tricky problem.
I have a list of products from different warehouses, where each product has a Brand and a Model plus some extra details. The Model for the same product can differ quite a lot between warehouses, but the Brand is always the same.
I store the whole list of products in one table; let's call it the Product table.
Then I have another table, Model, with the CORRECT Model Name, Brand and additional details like image, description, etc. It also has a keywords column where I try to add all keywords manually.
And here is the problem: I need to associate each product that I receive from a warehouse with one record from my Model table. Right now I'm using full-text search in boolean mode, but that's quite painful and does not work very well, so I need to do a lot of manual work.
Here are just few examples of names that I have:
WINT.SPORT3D
WINT.SPORT3D XL
WINT.SPORT 3D
WINT.SPORT3D MO
WINTER SPORT 3D
The correct name for all of these items would be: WINTER SPORT 3D, so they should all be assigned to the same model.
So, is there any way to improve full text search or some other technique to solve my problem?
Database that I'm using is MySQL, I would prefer not to change it.
I'll start by putting together a more formal definition of the tables:
warehouse:
warehouse_id,
warehouse_product_id,
product_brand,
product_name,
local_id
Here I'm using local_id as a foreign key to your 'Model' table, but to avoid further confusion, I'll call it 'local':
local:
id,
product_brand,
product_name
It seems like the table you describe as 'product' is redundant.
Obviously, until the data is cross-referenced, local_id will be null. But after it is populated it won't have to change, and given a warehouse_id, a brand and a product, you can find your local descriptor easily:
SELECT local.*
FROM local, warehouse
WHERE local.id=warehouse.local_id
AND warehouse.product_brand=local.product_brand
AND warehouse_id=_____
AND warehouse.product_brand=____
AND warehouse.product_name=____
So all you need to do is populate the links. Soundex is a rather crude tool; a better solution for this would be the Levenshtein distance algorithm. There's a mysql implementation here
Given a set of rows in the warehouse table which need to be populated:
SELECT w.*
FROM warehouse w
WHERE w.local_id IS NULL;
...for each row identify the best match as (using the values from the previous query as w.*)....
SELECT local.id
FROM local
WHERE local.product_brand=w.product_brand
ORDER BY levenstein(local.product_name, w.product_name) ASC
LIMIT 0,1
But this will find the best match, even if the 2 strings are completely different! Hence....
SELECT local.id
FROM local
WHERE local.product_brand=w.product_brand
AND levenstein(local.product_name, w.product_name)<
IF(LENGTH(local.product_name)<LENGTH(w.product_name),
LENGTH(local.product_name), LENGTH(w.product_name))/2
ORDER BY levenstein(local.product_name, w.product_name) ASC
LIMIT 0,1
...requires at least half the string to match.
So this can be implemented in a single update statement:
UPDATE warehouse w
SET local_id=(
SELECT local.id
FROM local
WHERE local.product_brand=w.product_brand
AND levenstein(local.product_name, w.product_name)<
IF(LENGTH(local.product_name)<LENGTH(w.product_name),
LENGTH(local.product_name), LENGTH(w.product_name))/2
ORDER BY levenstein(local.product_name, w.product_name) ASC
LIMIT 0,1
)
WHERE local_id IS NULL;
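To make the matching rule concrete outside the database, here is a minimal Python sketch of the same idea: compute the Levenshtein distance to every candidate name, discard candidates whose distance is not below half the length of the shorter string, and keep the closest one. The function names and the candidate list are made up for illustration; this is not the MySQL stored function referred to above.

```python
def levenshtein(a, b):
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))          # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,         # delete ca
                            curr[j - 1] + 1,     # insert cb
                            prev[j - 1] + cost)) # substitute or keep
        prev = curr
    return prev[-1]

def best_match(name, candidates):
    """Closest candidate, or None if nothing is within half the shorter length."""
    best, best_dist = None, None
    for cand in candidates:
        dist = levenshtein(cand, name)
        if dist >= min(len(cand), len(name)) / 2:
            continue                        # too different to count as a match
        if best_dist is None or dist < best_dist:
            best, best_dist = cand, dist
    return best

models = ["WINTER SPORT 3D", "SUMMER GRIP"]   # hypothetical Model names
print(best_match("WINT.SPORT3D", models))
print(best_match("ZZZZ", models))
```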
Try Soundex. All of your examples resolve to W532 except the last one, which resolves to W536. So, you could:
Add a column to PRODUCT and MODEL called SoundexValue and calculate the Soundex value for each product and model
Compare the Soundex values in the PRODUCT table to the ones in the Model Table. You may have to use a range (+/- 5) to get a higher rate of matching.
Follow the 80/20 rule. That is, spend 80% of your manual effort on the 20% that don't easily fall out.
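To check the Soundex values quoted above, here is a small sketch of the classic four-character American Soundex in Python. It is an illustration only; MySQL's SOUNDEX() can return longer strings for long inputs, so its output may not be identical to this version.

```python
# Letter groups of classic Soundex; vowels, H, W and Y carry no code.
CODES = {
    letter: digit
    for digit, letters in {
        "1": "BFPV", "2": "CGJKQSXZ", "3": "DT",
        "4": "L", "5": "MN", "6": "R",
    }.items()
    for letter in letters
}

def soundex(name):
    """First letter plus up to three digits, padded with zeros."""
    letters = [c for c in name.upper() if c.isalpha()]  # drop dots, digits, spaces
    if not letters:
        return ""
    first = letters[0]
    digits = []
    prev = CODES.get(first, "")
    for c in letters[1:]:
        code = CODES.get(c, "")
        if code and code != prev:
            digits.append(code)     # new consonant sound
        if c not in "HW":
            prev = code             # H and W do not separate equal codes
    return (first + "".join(digits) + "000")[:4]

for name in ["WINT.SPORT3D", "WINT.SPORT 3D", "WINTER SPORT 3D"]:
    print(name, soundex(name))
```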