I need to know the fastest way to count the view rows for each product. I tried a query that joins two tables, 'product_db' and 'product_views', but it took about a minute to complete.
Here is my code:
select *,
count(product_views.vwr_id) as product_viewer
from product_db
inner join product_views on product_db.id=product_views.vwr_cid
where product_id='$pid' order by id desc
Where '$pid' is a product id.
This is my product_views table.
I need to include a viewer count column in my results, but it takes a very long time to load. I also tried counting in a separate query, but had no luck. Can you guys suggest a better way?
Regards,
It sounds like your query is slow, not the counting. Two things you could try:
Make sure the product_id field has an index on it.
If product_id is a numeric field, remove the single quotes around it. In other words, change where product_id='$pid' to where product_id=$pid. MySQL could be doing a conversion on the product_id column to compare it as a string and ignoring the index even if it does exist.
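For example, a rough sketch assuming product_id lives on product_db and the viewer table is product_views joined on vwr_cid (names taken from the question; adjust to your real schema):

-- index the filter column and the join column (assumed names)
ALTER TABLE product_db ADD INDEX idx_product_id (product_id);
ALTER TABLE product_views ADD INDEX idx_vwr_cid (vwr_cid);

-- grouped count, with the quotes dropped if product_id is numeric
SELECT product_db.*, COUNT(product_views.vwr_id) AS product_viewer
FROM product_db
INNER JOIN product_views ON product_db.id = product_views.vwr_cid
WHERE product_db.product_id = $pid
GROUP BY product_db.id
ORDER BY product_db.id DESC

Note the GROUP BY: mixing COUNT() with non-aggregated columns without one gives unpredictable results.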
Okay, I know that the query works, as it runs just fine in the DB that I use for practice. However, I am still extremely new to MySQL and would just like to understand it a little bit more.
Here is my query...
SELECT
photos.id, photos.image_url, COUNT(*) as total
FROM photos
JOIN likes
ON likes.photo_id = photos.id
GROUP BY photos.id
ORDER BY total DESC
LIMIT 1;
So my question is: how does the aggregate function "COUNT(*)" know to count from the correct table? I want it to count from the "likes" table, and it does, but how does it understand that this is what I am asking?
I was originally thinking I would need to do a "COUNT(likes.photo_id)", but it was unnecessary.
So how does it know?
Am I just going down a rabbit hole that in the long run just does not matter?
Count() counts the number of rows returned by the query as a whole. If you run the query without the count, it returns a specific number of rows. That's what Count() is counting.
It isn't counting rows in either photos or likes. It's counting rows in the joined result set.
Here's a cool thing about SQL: the result of relational operations (for example JOIN or UNION) between tables is... another table! It isn't a table that is stored in your database, but it's a table.
You can think of an analogy in arithmetic: the sum of two positive integers is another positive integer. In mathematics, this property is called closure.
It's the same in relational algebra. When you combine two tables with one of the relational operators, the result is another thing that could be a table itself.
So COUNT(*) is not counting rows in either table. It's counting the rows in the table produced as the result of the JOIN.
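Here is a small illustration of the difference, reusing the photos/likes tables from the question. With your INNER JOIN the two forms count the same thing, but switch to a LEFT JOIN and they diverge, because COUNT(*) counts every row of the joined result while COUNT(likes.photo_id) skips rows where that column is NULL:

-- a photo with no likes still produces one joined row, so it counts as 1 here
SELECT photos.id, COUNT(*) AS total
FROM photos
LEFT JOIN likes ON likes.photo_id = photos.id
GROUP BY photos.id;

-- the same photo counts as 0 here, because likes.photo_id is NULL on that row
SELECT photos.id, COUNT(likes.photo_id) AS total
FROM photos
LEFT JOIN likes ON likes.photo_id = photos.id
GROUP BY photos.id;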
I have an issue using GROUP BY when selecting all the columns from the table, and the result is poor performance in terms of speed.
Select * from employee
group by customer_id;
The query above can't be changed; it is mandatory and fixed. It takes 17720 ms, which is too long; it needs to come back faster, with under one minute as my desired result. Since the table has many columns and records, the query takes a lot of time. Is there any solution to this problem? Thanks.
For as simple as your query is, it appears almost pointless... You would not have duplicate employee IDs within an employee table, and doing a group by would still result in returning every row, every column.
However, that said, to optimize a GROUP BY, you would need an index on that column ... which I would think would already exist as the employee ID would probably be the primary key to the table.
Additionally, you don't have any aggregate columns that would warrant a GROUP BY. Are you instead just trying to LOOK for a specific employee? If so, that would be a different query using a WHERE clause for the criteria you are looking for.
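For instance (the column name here is purely illustrative), such a lookup would be something like:

select *
from employee
where employee_id = 123;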
FEEDBACK...
You updated your question and did a GROUP BY on CUSTOMER ID (not employee ID). OK, but what do you really mean to group by?
OR... Did you want to ORDER by a customer... In other words, I want a list of all employees, but want them sorted by the customer they are associated with... If this is the case, you would want something like...
select *
from employees
ORDER BY
customerID,
employeeLastName,
employeeFirstName
Without seeing your table structure(s), if the employee table DOES have a column for the customer ID each employee is associated with, this query would put all employees for the same customer together in the output, sorted by customer and then, within each customer, by the employee's name (last, first).
If you have another table(s) with relationships between employees and customers, we would need to see that too to better offer an answer.
Columns with heavy types like BLOB, TEXT, or NVARCHAR(200 or more) will slow down your query by a lot if you have a lot of records. I suggest checking whether it is really necessary to load them all from the start.
Also, your GROUP BY seems weird. What exactly are you trying to achieve with it?
The GROUP BY is not just weird, it is wrong. If you don't specify all the non-aggregate columns in the GROUP BY, you get seemingly random values for each column. Remove the GROUP BY or explain why you think you need it.
Or maybe the "*" is not correct. OK, you cannot show us your real column names, but at least show us the real shape of the SELECT, even if it uses bogus column names.
I'm also confused as to why you call it a "search". There is no WHERE clause, which is where "search" criteria goes.
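If the goal really is one row per customer, a GROUP BY only makes sense together with aggregates. A minimal sketch, assuming the table has a customer_id column and a per-customer head count is what's wanted:

SELECT customer_id, COUNT(*) AS employee_count
FROM employee
GROUP BY customer_id;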
I have a table with nearly 30 M records, 6.6 GB in size. I need to query some data from it using GROUP BY and ORDER BY. It takes too long to query the data, and I have lost the connection to the DB many times...
I have indexes on all the necessary fields, both single-column and composite. What else can I do to make the query faster?
Example query:
select id, max(price), avg(order) from table group by id, date order by id, location.
Use EXPLAIN query, where query is your query. For example: EXPLAIN select * from table group by id, date order by id, location.
You'll see a table where MySQL analyses your query and shows which indexes it considers using. Possibly you don't have sufficient (good enough) indexes.
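If the EXPLAIN output shows "Using temporary; Using filesort", a composite index matching the GROUP BY columns can sometimes help. A sketch, assuming the real table and column names match the placeholder query (backticks are needed because order and table are reserved words):

EXPLAIN SELECT id, MAX(price), AVG(`order`) FROM `table` GROUP BY id, date;

ALTER TABLE `table` ADD INDEX idx_id_date (id, date);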
I don't think you can. With no filter (WHERE clause) and AVG, the entire table has to be read.
The only thing I can think of is to have a new table with ID, AVG_ORDER, MAX_PRICE (or whatever you need) and update that using a trigger or stored procedure when you insert/update new rows.
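A rough sketch of that idea, with hypothetical names (the big table is called orders here; AVG can't be maintained incrementally, so the summary keeps a running sum and count instead):

CREATE TABLE order_summary (
    id        INT PRIMARY KEY,
    order_sum DECIMAL(18,2) NOT NULL DEFAULT 0,
    order_cnt INT NOT NULL DEFAULT 0,
    max_price DECIMAL(18,2) NOT NULL DEFAULT 0  -- assumes price is never NULL
);

CREATE TRIGGER orders_summary_ai AFTER INSERT ON orders
FOR EACH ROW
    INSERT INTO order_summary (id, order_sum, order_cnt, max_price)
    VALUES (NEW.id, NEW.`order`, 1, NEW.price)
    ON DUPLICATE KEY UPDATE
        order_sum = order_sum + NEW.`order`,
        order_cnt = order_cnt + 1,
        max_price = GREATEST(max_price, NEW.price);

-- per-id figures now come from the small summary table instead of 30 M rows
SELECT id, max_price, order_sum / order_cnt AS avg_order FROM order_summary;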
An index on (ID, PRICE) might help you if you didn't need that pesky average.
Indexing isn't going to do you any good. You're averaging a column, so you have to read every row in the table. That's going to take time.
SELECT DISTINCT `Stock`.`ProductNumber`,`Stock`.`Description`,`TComponent_Status`.`component`, `TComponent_Status`.`certificate`,`TComponent_Status`.`status`,`TComponent_Status`.`date_created`
FROM Stock , TBOM , TComponent_Status
WHERE `TBOM`.`Component` = `TComponent_Status`.`component`
AND `Stock`.`ProductNumber` = `TBOM`.`Product`
Basically, table TBOM has:
24,588,820 rows
The query is ridiculously slow, and I'm not too sure what I can do to make it better. I have indexed all the other tables in the query, but TBOM has a few duplicates in its columns so I can't even run that command. I'm a little baffled.
To start, index the following fields:
TBOM.Component
TBOM.Product
TComponent_Status.component
Stock.ProductNumber
Not all of the above indexes may be necessary (e.g., the last two), but it is a good start.
Also, remove the DISTINCT if you don't absolutely need it.
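For example (index names are arbitrary), these can be created as plain non-unique indexes, so the duplicate values in TBOM don't get in the way:

ALTER TABLE TBOM ADD INDEX idx_tbom_component (Component);
ALTER TABLE TBOM ADD INDEX idx_tbom_product (Product);
ALTER TABLE TComponent_Status ADD INDEX idx_tcs_component (component);
ALTER TABLE Stock ADD INDEX idx_stock_productnumber (ProductNumber);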
The only thing I can really think of is having an index on your Stock table on
(ProductNumber, Description)
This can help in two ways. Since you are only using those two fields from Stock in the query, the engine won't need to go to the full data row of each stock record; both columns are in the index, so it can read them from there. Additionally, you are doing DISTINCT, so having the index available to help optimize the DISTINCT should also help.
Now, the other issue with the time. Since you are joining from stock to product to product status with DISTINCT, you are asking for all 24 million TBOM items (assuming bill of materials), and since each BOM component could have multiple status records, you are getting every BOM for EVERY component that changed.
If what you are really looking for is something like the most recent change of any component item, you might want to do it in reverse... Something like...
SELECT DISTINCT
Stock.ProductNumber,
Stock.Description,
JustThese.component,
JustThese.certificate,
JustThese.`status`,
JustThese.date_created
FROM
( select DISTINCT
TCS.Component,
TCS.Certificate,
TCS.`status`,
TCS.date_created
from
TComponent_Status TCS
where
TCS.date_created >= 'some date you want to limit based upon' ) as JustThese
JOIN TBOM
on JustThese.Component = TBOM.Component
JOIN Stock
on TBOM.Product = Stock.ProductNumber
If this is the case, I would ensure an index on the component status table, something like
( date_created, component, certificate, status ) as the index. This way, the WHERE clause would be optimized, and the DISTINCT would be too, since its pieces are already part of the index.
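As a sketch (index name arbitrary):

ALTER TABLE TComponent_Status
    ADD INDEX idx_tcs_date_component (date_created, component, certificate, `status`);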
But, how you currently have it, if you have 10 TBOM entries for a single "component", and that component has 100 changes, you now have 10 * 100 = 1,000 entries in your result set. Take that and span it across 24 million rows, and it's definitely not going to look good.
I have this query (I didn't write it) that was working fine for a client until the table got more than a few thousand rows in it; now it's taking 40+ seconds on only 4200 rows.
Any suggestions on how to optimize it and get the same result?
I've tried a few other methods but didn't get the correct result that this slower query returned...
SELECT COUNT(*) AS num
FROM `fl_events`
WHERE id IN(
SELECT DISTINCT (e2.id)
FROM `fl_events` AS e1, fl_events AS e2
WHERE e1.startdate >= now() AND e1.startdate = e2.startdate
)
ORDER BY `startdate`
Any help would be greatly appreciated!
Apart from the obvious indexes needed, I don't really get why you are joining the table with itself to build the IN condition. The ORDER BY is also not needed. Are you sure your query can't be written just like this?:
SELECT COUNT(*) AS num
FROM `fl_events` AS e1
WHERE e1.startdate >= now()
I don't think rewriting the query will help. The key to your question is "until the table got more than a few thousand rows." This implies that important columns aren't indexed. Below a certain number of records, all the data fits in a single memory block; beyond that point, it takes more blocks. An index is the only way to speed up the search.
First, check that the id in fl_events is actually marked as a primary key. That physically orders the records, and without it you can see data corruption and occasionally super-slow results. The use of DISTINCT in the query makes it look like it might NOT be a unique value. That will pose a problem.
Then, make sure to add an index on startdate.
The slowness is probably related to the join of the event table with itself, and possibly startdate not having an index.
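If startdate has no index yet, adding one is a one-liner (sketch, index name arbitrary, assuming the table is exactly as shown):

ALTER TABLE fl_events ADD INDEX idx_startdate (startdate);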