I'm working in PHP and MySQL
I feel like punching a wall. I have no idea how to do this and it's making me angry. I've done plenty of things like it before and this isn't working. UGH. Please help!!!!
I'm doing a search. Customer will type in something like "88 brake pads". I want to return all the parts that say "brake" all the parts that say "pads" and all the parts that have the year 88. My issue here is, the years and the parts are in different tables.
Example:
Years Table
ID | part_id | year |
============================
1 | 15 | 1945 |
2 | 15 | 1946 |
3 | 16 | 1984 |
4 | 18 | 1987 |
Parts Table
ID | part_name |
=====================
15 | brakes |
16 | hose |
17 | crank |
18 | muffler |
This SQL Works, when pulling out of parts table
SELECT DISTINCT * FROM hj_parts WHERE part_name LIKE "%brakes"
I will get the output
ID | part_name |
=====================
15 | brakes |
So what I need is if someone enters "brakes 87" I want to get back all the info for part 15 (brakes) and part 18 (because it's from 1987).
I am so totally lost.
I've done something like this to pull out all the parts from a certain year, and I thought I could just tweak it, and it's not helping.
$query = "SELECT hj_parts.* FROM hj_parts JOIN(SELECT DISTINCT part_id FROM hj_years WHERE (year BETWEEN $starter AND $ender) OR (year='all')) hj_years ON hj_parts.id = hj_years.part_id";
Here's how I'd do it. First create a query that brings everything together, and we'll put it in a view for convenience sake
create view AllParts as
SELECT Parts.*,Years.* from Parts inner join Years on Years.part_id=Parts.ID
Then you do a search like this
Select * from AllParts where part_name like '%brake' or year=1987
If your database is big this might take a while. It's not necessarily a perfect solution, but it can work for you.
If you are interested in doing a keyword search, perhaps a nosql database would be better
Related
I wanted to ask you which could be the best approach creating my MySQL database structure having the following case.
I've got a table with items, which is not needed to describe as the only important field here is the ID.
Now, I'd like to be able to assign some attributes to each item - by its ID, of course. But I don't know exactly how to do it, as I'd like to keep it dynamic (so, I do not have to modify the table structure if I want to add a new attribute type).
What I think
I think - and, in fact, is the structure that I have right now - that I can make a table items_attributes with the following structure:
+----+---------+----------------+-----------------+
| id | item_id | attribute_name | attribute_value |
+----+---------+----------------+-----------------+
| 1 | 1 | place | Barcelona |
| 2 | 2 | author_name | Matt |
| 3 | 1 | author_name | Kate |
| 4 | 1 | pages | 200 |
| 5 | 1 | author_name | John |
+----+---------+----------------+-----------------+
I put data as an example for you to see that those attributes can be repeated (it's not a relation 1 to 1).
The problem with this approach
I have the need to make some querys, some of them for statistic purpouses, and if I have a lot of attributes for a lot of items, this can be a bit slow.
Furthermore - maybe because I'm not an expert on MySQL - everytime I want to make a search and find "those items that have 'place' = 'Barcelona' AND 'author_name' = 'John'", I end up having to make multiple JOINs for every condition.
Repeating the example before, my query would end up like:
SELECT *
FROM items its
JOIN items_attributes attr
ON its.id = attr.item_id
AND attr.attribute_name = 'place'
AND attr.attribute_value = 'Barcelona'
AND attr.attribute_name = 'author_name'
AND attr.attribute_value = 'John';
As you can see, this will return nothing, as an attribute_name cannot have two values at once in the same row, and an OR condition would not be what I'm searching for as the items MUST have both attributes values as stated.
So the only possibility is to make a JOIN on the same repeated table for every condition to search, which I think it's very slow to perform when there are a lot of terms to search for.
What I'd like
As I said, I'd like to be able to keep the attributes types dynamical, so by adding a new input on 'attribute_name' would be enough, without having to add a new column to a table. Also, as they are 1-N relationship, they cannot be put in the 'items' table as new columns.
If the structure, in your opinion, is the only one that can acheive my interests, if you could light up some ideas so the search queries are not a ton of JOINs it would be great, too.
I don't know if it's quite hard to get it as I've been struggling my head until now and I haven't come up with a solution. Hope you guys can help me with that!
In any case, thank you for your time and attention!
Kind regards.
You're thinking in the right direction, the direction of normalization. The normal for you would like to have in your database is the fifth normal form (or sixth, even). Stackoverflow on this matter.
Table Attribute:
+----+----------------+
| id | attribute_name |
+----+----------------+
| 1 | place |
| 2 | author name |
| 3 | pages |
+----+----------------+
Table ItemAttribute
+--------+----------------+
| item_id| attribute_id |
+--------+----------------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
+--------+----------------+
So for each property of an object (item in this case) you create a new table and name it accordingly. It requires lots of joins, but your database will be highly flexible and organized. Good luck!
In my Opinion it should be something like this, i know there are a lot of table, but actually it normilizes your DB
Maybe that is why because i cant understant where you get your att_value column, and what should contains this columns
I have a table with the following structure:
id | workerID | materialID | date | materialGathered
Different workers contribute different amounts of different material per day. A single worker can only contribute once a day, but not necessarily every day.
What I need to do is to figure out which of them was the most productive and which of them was the least productive, while it is supposed to be measured as AVG() material gathered per day.
I honestly have no idea how to do that, so I'll appreciate any help.
EDIT1:
Some sample data
1 | 1 | 2013-01-20 | 25
2 | 1 | 2013-01-21 | 15
3 | 1 | 2013-01-22 | 17
4 | 1 | 2013-01-25 | 28
5 | 2 | 2013-01-20 | 23
6 | 2 | 2013-01-21 | 21
7 | 3 | 2013-01-22 | 17
8 | 3 | 2013-01-24 | 15
9 | 3 | 2013-01-25 | 19
Doesn't really matter how the output looks, to be honest. Maybe a simple table like that:
workerID | avgMaterialGatheredPerDay
And I didn't really attempt anything because I literally have no idea, haha.
EDIT2:
Any time period that is in the table (from earliest to latest date in the table) is considered.
Material doesn't matter at the moment. Only the arbitrary units in the materialGathered column matter.
As in your comments you say that we look at each worker and consider their avarage daily working skill, rather than checking which worked most in a given time, the answer is rather easy: Group by workerid to get a result record per worker, use AVG to get their avarage amount:
select workerid, avg(materialgathered) as avg_gathered
from work
group by workerid;
Now to the best and worst workers. These can be more than two. So you cannot just take the first or last record, but need to know the maximum and the minimum avg_gathered.
select max(avg_gathered) as max_avg_gathered, min(avg_gathered) as min_avg_gathered
from
(
select avg(materialgathered) as avg_gathered
from work
group by workerid
);
Now join the two queries to get all workers that worked the avarage minimum or maximum:
select work.*
from
(
select workerid, avg(materialgathered) as avg_gathered
from work
group by workerid
) as worker
inner join
(
select max(avg_gathered) as max_avg_gathered, min(avg_gathered) as min_avg_gathered
from
(
select avg(materialgathered) as avg_gathered
from work
group by workerid
)
) as worked on worker.avg_gathered in (worked.max_avg_gathered, worked.min_avg_gathered)
order by worker.avg_gathered;
There are other ways to do this. For example with HAVING avg(materialgathered) IN (select min(avg_gathered)...) OR avg(materialgathered) IN (select max(avg_gathered)...) instead of a join. The join is very effective though, because you need just one select for both min and max.
I have two tables, and I want to create a SELECT pulling a single record from one table based on multiple records from another table.
Perhaps it will be clearer if I just give a sort of example.
Table: dates. Fields: dosID, personID, name, dateIn
1 | 10 | john smith | 2013-09-05
2 | 10 | john smith | 2013-01-25
Table: cards. Fields: cardID, personID, cardColor, cardDate
1 | 10 | red | 2013-09-05
2 | 10 | orange | 2013-09-05
3 | 10 | black | 2013-09-05
4 | 10 | green | 2013-01-25
5 | 10 | orange | 2013-01-25
So what I want is to only select a record from the dates table if a person did not receive a "red" card. The closest I have come is something like:
SELECT name, dateIn FROM dates, cards
WHERE dates.personID = cards.personID AND cardColor != 'red' AND dateIn = cardDate;
But for this query the 2013-09-05 date-of-service would still be pulled out because of the "orange" and "black" cards given on the same day.
I have tried searching but I am not even sure how to properly describe this issue and so my Google-fu has failed me. Any help or suggestions would be very much appreciated.
The easiest to understand version would filter using NOT EXISTS:
SELECT name, dateIn
FROM dates
WHERE NOT EXISTS(
SELECT *
FROM cards
WHERE cards.personID = dates.personID
AND cards.cardDate = dates.dateIn
AND cards.cardColor = 'red'
)
See it on sqlfiddle.
I'm a bit confused, you're getting the non-red 2013-09-05 rows back.
http://sqlfiddle.com/#!2/89260
I tried it using an inner join (as you wrote it), and an outer join. Same results.
EDIT:
Sorry, misunderstood your post.
eggyal's answer looks like a winner to me.
I know the title makes no sense at first glance. But here's the situation: the DB table is named 'teams'. In it, there are a bunch of columns for positions in a soccer team (gk1, def1, def2, ... , st2). Each column is type VARCHAR and contains a player's name. There is also a column named 'captain'. The content of that column (not the most fortunate solution) is not the name of the captain, but rather the position.
So if the content of 'st1' is Zlatan Ibrahimovic and he's the captain, then the content of 'captain' is the string 'st1', and not the string 'Zlatan Ibrahimovic'.
Now, I need to write a query which gets a row form the 'teams' table, but only if the captain is Zlatan Ibrahimovic. Note that at this point I don't know if he plays st1, st2 or some other position. So I need to use just the name in the query, to check if the position he plays on is set as captain. Logically, it would look like:
if(Zlatan is captain)
get row content
In MySQL, the if condition would actually be the 'where' clause. But is there a way to write it?
$query="select * from teams where ???";
The "Teams" table structure is:
-----------------------------------------------------------------
| gk1 | def1 | def2 | ... | st2 | captain |
-----------------------------------------------------------------
| player1 | player2 | player3 | ... | playerN | captainPosition |
-----------------------------------------------------------------
Whith all fields being of VARCHAR type.
Because the content of the captain column is the position and not the name, and you want to choose based on the position, this is trivial.
$query="select * from teams where captain='st1'";
Revised following question edit:
Your database design doesn't allow this to be done very efficiently. You are looking at a query like
SELECT * FROM teams WHERE
(gk1='Zlatan' AND captain='gk1') OR
(de1='Zlatan' AND captain='de1') OR
...
The design mandates this sort of query for many functions: how you can find the team which a particular player plays for without searching every position? [Actually you could do that by finding the name in a concatenation of all the positions, but it's still not very efficient or flexible]
A better solution would be to normalise your data so you had a single table showing which player was playing where:
Situation
Team | Player | Posn | Capt
-----+--------+------+------
1 | 12 | 1 | 0
1 | 11 | 2 | 1
1 | 13 | 10 | 0
...with other tables which allow you to identify the Team, Player and Postion referenced here. There would need to be some referential checks to ensure that each team had only one captain, and only plays one goalkeeper, etc.
You could then easily see that the captain of Team 1 is Player 11 who plays in position 2; or find the team (if any) for which player 11 is captain.
SELECT Name FROM Teams
WHERE Situation.Team = Teams.id
AND Situation.Capt = 1
AND Situation.Player = Players.id
AND Players.Name = 'Zlatan';
A refinement on that idea might be
Situation
Team | Player | Posn | Capt | Playing
-----+--------+------+------+--------
1 | 12 | 1 | 0 | 1
1 | 11 | 2 | 1 | 1
1 | 13 | 10 | 0 | 0
1 | 78 | 1 | 0 | 0
...so that you could have two players who are goalkeepers (for example) but only field of them.
Redesigning the database may be a lot of work; but it's nowhere near as complicated or troublesome as using your existing design. And you will find that the performance is better if you don't need to use inefficient queries.
By what have you exposed, you just need to put the two conditions and check if the query returned 1 record. If it returns no records, he is not the captain:
SELECT *
FROM Teams
WHERE name = 'Zlatan Ibrahimovic' AND position = 'st1';
Just after some opinions on the best way to achieve the following outcome:
I would like to store in my MySQL database products which can be voted on by users (each vote is worth +1). I also want to be able to see how many times in total a user has voted.
To my simple mind, the following table structure would be ideal:
table: product table: user table: user_product_vote
+----+-------------+ +----+-------------+ +----+------------+---------+
| id | product | | id | username | | id | product_id | user_id |
+----+-------------+ +----+-------------+ +----+------------+---------+
| 1 | bananas | | 1 | matthew | | 1 | 1 | 2 |
| 2 | apples | | 2 | mark | | 2 | 2 | 2 |
| .. | .. | | .. | .. | | .. | .. | .. |
This way I can do a COUNT of the user_product_vote table for each product or user.
For example, when I want to look up bananas and the number of votes to show on a web page I could perform the following query:
SELECT p.product AS product, COUNT( v.id ) as votes
FROM product p
LEFT JOIN user_product_vote v ON p.id = v.product_id
WHERE p.id =1
If my site became hugely successful (we can all dream) and I had thousands of users voting on thousands of products, I fear that performing such a COUNT with every page view would be highly inefficient in terms of server resources.
A more simple approach would be to have a 'votes' column in the product table that is incremented each time a vote is added.
table: product
+----+-------------+-------+
| id | product | votes |
+----+-------------+-------+
| 1 | bananas | 2 |
| 2 | apples | 5 |
| .. | .. | .. |
While this is more resource friendly - I lose data (eg. I can no longer prevent a person from voting twice as there is no record of their voting activity).
My questions are:
i) am I being overly worried about server resources and should just stick with the three table option? (ie. do I need to have more faith in the ability of the database to handle large queries)
ii) is their a more efficient way of achieving the outcome without losing information
You can never be over worried about resources, when you first start building an application you should always have resources, space, speed etc. in mind, if your site's traffic grew dramatically and you never built for resources then you start getting into problems.
As for the vote system, personally I would keep the votes like so:
table: product table: user table: user_product_vote
+----+-------------+ +----+-------------+ +----+------------+---------+
| id | product | | id | username | | id | product_id | user_id |
+----+-------------+ +----+-------------+ +----+------------+---------+
| 1 | bananas | | 1 | matthew | | 1 | 1 | 2 |
| 2 | apples | | 2 | mark | | 2 | 2 | 2 |
| .. | .. | | .. | .. | | .. | .. | .. |
Reasons:
Firstly user_product_vote does not contain text, blobs etc., it's purely integer so it takes up less resources anyways.
Secondly, you have more of a doorway to new entities within your application such as Total votes last 24 hr, Highest rated product over the past 24 hour etc.
Take this example for instance:
table: user_product_vote
+----+------------+---------+-----------+------+
| id | product_id | user_id | vote_type | time |
+----+------------+---------+-----------+------+
| 1 | 1 | 2 | product |224.. |
| 2 | 2 | 2 | page |218.. |
| .. | .. | .. | .. | .. |
And a simple query:
SELECT COUNT(id) as total FROM user_product_vote WHERE vote_type = 'product' AND time BETWEEN(....) ORDER BY time DESC LIMIT 20
Another thing is if a user voted at 1AM and then tried to vote again at 2PM, you can easily check when the last time they voted and if they should be allowed to vote again.
There are so many opportunities that you will be missing if you stick with your incremental example.
In regards to your count(), no matter how much you optimize your queries it would not really make a difference on a large scale.
With an extremely large user-base your resource usage will be looked at from a different perspective such as load balancers, mainly server settings, Apache, catching etc., there's only so much you can do with your queries.
If my site became hugely successful (we can all dream) and I had thousands of users voting on thousands of products, I fear that performing such a COUNT with every page view would be highly inefficient in terms of server resources.
Don't waste your time solving imaginary problems. mysql is perfectly able to process thousands of records in fractions of a second - this is what databases are for. Clean and simple database and code structure is far more important than the mythical "optimization" that no one needs.
Why not mix and match both? Simply have the final counts in the product and users tables, so that you don't have to count every time and have the votes table , so that there is no double posting.
Edit:
To explain it a bit further, product and user table will have a column called "votes". Every time the insert is successfull in user_product_vote, increment the relevant user and product records. This would avoid dupe votes and you wont have to run the complex count query every time as well.
Edit:
Also i am assuming that you have created a unique index on product_id and user_id, in this case any duplication attempt will automatically fail and you wont have to check in the table before inserting. You will just to make sure the insert query ran and you got a valid value for the "id" in the form on insert_id
You have to balance the desire for your site to perform quickly (in which the second schema would be best) and the ability to count votes for specific users and prevent double voting (for which I would choose the first schema). Because you are only using integer columns for the user_product_vote table, I don't see how performance could suffer too much. Many-to-many relationships are common, as you have implemented with user_product_vote. If you do want to count votes for specific users and prevent double voting, a user_product_vote is the only clean way I can think of implementing it, as any other could result in sparse records, duplicate records, and all kinds of bad things.
You don't want to update the product table directly with an aggregate every time someone votes - this will lock product rows which will then affect other queries which are using products.
Assuming that not all product queries need to include the votes column, you could keep a separate productvotes table which would retain the running totals, and keep your userproductvote table as a means to enforce your user voting per product business rules / and auditing.