Need to improve sql performance - mysql

Table temporary_search_table
post_id, property_status, property_address, ... 30 more fields
Table search_meta
meta_id, search_id, status, created_date
I need the total number of rows whose latest created_date is yesterday. Each row in temporary_search_table may have multiple entries in search_meta, so we need to pick the most recent entry from search_meta and check that its created_date is yesterday and that property_status is pending. If so, the row is counted. If there is no entry in search_meta for a row in temporary_search_table, that row should not be counted.
Here is my SQL. It works, but for 30,000 rows it takes a long time.
SELECT COUNT(id) FROM temporary_search_table
WHERE property_status = 'pending' AND (1 = (SELECT DATEDIFF(NOW(), created_date)
FROM search_meta WHERE post_id = search_id ORDER BY created_date DESC LIMIT 0,1 ))
Thanks in advance.

Apart from checking the indexes on your tables, it would probably be better to avoid the correlated subquery and use a plain join instead.
SELECT COUNT(id)
FROM temporary_search_table
INNER JOIN search_meta ON post_id = search_id
WHERE property_status = 'pending' AND DATEDIFF(NOW(), created_date) = 1
ORDER BY created_date DESC
LIMIT 1
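Note that a plain join counts every search_meta row from yesterday rather than only the latest one per entry, and ORDER BY with LIMIT has no effect on a single COUNT row. A minimal sketch of a join restricted to the most recent entry, assuming created_date is a DATE or DATETIME column and that search_id refers to post_id as in the question (the aliases t, m and last_created are illustrative, not from the original):

```sql
-- Sketch: count pending rows whose most recent search_meta entry is from yesterday.
SELECT COUNT(*)
FROM temporary_search_table t
INNER JOIN (
    -- latest created_date per search_id; rows without any meta entry drop out of the join
    SELECT search_id, MAX(created_date) AS last_created
    FROM search_meta
    GROUP BY search_id
) m ON m.search_id = t.post_id
WHERE t.property_status = 'pending'
  -- half-open range covering yesterday, instead of DATEDIFF(NOW(), ...) = 1
  AND m.last_created >= CURDATE() - INTERVAL 1 DAY
  AND m.last_created <  CURDATE();
```

An index on search_meta (search_id, created_date) would likely help the derived table; that index is an assumption, not something the question confirms.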

Related

How to get the last rows per group in mysql

I have a query that retrieves the reservations made by a team.
The query runs and computes correctly, but I only want to retrieve the latest reservation made by each team, and my query shows their first reservation instead.
Here is the complete query:
select
tbl_lab_reservations.id,
tbl_lab_reservations.full_desc,
serial_number,
rsvn_owner,
reservation_id,
reservation_date_end,
reservation_date_start,
(SELECT DATEDIFF( if(reservation_date_end = '0000-00-00', CURDATE(), reservation_date_end),
reservation_date_start)+1) as totalNumberOfDaysReserve
from tbl_lab_reservations
join tbl_lab_assets on tbl_lab_assets.id = tbl_lab_reservations.lab_id
where tbl_lab_reservations.full_desc = 'Dell Optiplex 380'
and tbl_lab_reservations.asset_status = 'Idle'
group by serial_number, rsvn_owner
ORDER BY tbl_lab_reservations.id ASC
The query you have given lists the records from first to latest because the ORDER BY clause is ASC. To retrieve the latest record first, change the ORDER BY clause to DESC, so that the latest record comes first in the result (this only holds if tbl_lab_reservations.id is unique for all records).
To fetch only the latest record and omit the others, use the LIMIT keyword. LIMIT goes at the end of the query and sets the number of records to fetch.
Syntax: LIMIT N, where N specifies the number of records.
An example:
select id, events from msql
where id < '05'
group by id
order by id desc
limit 1
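ORDER BY ... DESC with LIMIT 1 returns a single row overall, not one row per team. If the latest reservation per (serial_number, rsvn_owner) group is wanted, one common pattern is to join back on the maximum id of each group. A sketch, assuming a larger tbl_lab_reservations.id means a later reservation (the aliases r, a, r2, a2 and latest are illustrative):

```sql
SELECT
    r.id,
    r.full_desc,
    serial_number,
    rsvn_owner,
    reservation_id,
    reservation_date_start,
    reservation_date_end,
    DATEDIFF(IF(reservation_date_end = '0000-00-00', CURDATE(), reservation_date_end),
             reservation_date_start) + 1 AS totalNumberOfDaysReserve
FROM tbl_lab_reservations r
JOIN tbl_lab_assets a ON a.id = r.lab_id
JOIN (
    -- highest reservation id per (serial_number, rsvn_owner) group
    SELECT MAX(r2.id) AS latest_id
    FROM tbl_lab_reservations r2
    JOIN tbl_lab_assets a2 ON a2.id = r2.lab_id
    WHERE r2.full_desc = 'Dell Optiplex 380'
      AND r2.asset_status = 'Idle'
    GROUP BY serial_number, rsvn_owner
) latest ON latest.latest_id = r.id
ORDER BY r.id DESC;
```

Unlike a bare GROUP BY, every selected column here comes from the same row as the group's latest id.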

DISTINCT ON query w/ ORDER BY max value of a column

I've been tasked with converting a Rails app from MySQL to Postgres asap and ran into a small issue.
The active record query:
current_user.profile_visits.limit(6).order("created_at DESC").where("created_at > ? AND visitor_id <> ?", 2.months.ago, current_user.id).distinct
Produces the SQL:
SELECT visitor_id, MAX(created_at) as created_at, distinct on (visitor_id) *
FROM "profile_visits"
WHERE "profile_visits"."social_user_id" = 21
AND (created_at > '2015-02-01 17:17:01.826897' AND visitor_id <> 21)
ORDER BY created_at DESC, id DESC
LIMIT 6
I'm pretty confident when working with MySQL but I'm honestly new to Postgres. I think this query is failing for multiple reasons.
I believe the distinct on needs to be first.
I don't know how to order by the results of max function
Can I even use the max function like this?
The high level goal of this query is to return the 6 most recent profile views of a user. Any pointers on how to fix this ActiveRecord query (or its resulting SQL) would be greatly appreciated.
"The high level goal of this query is to return the 6 most recent profile views of a user."
That would be simple. You don't need max() nor DISTINCT for this:
SELECT *
FROM profile_visits
WHERE social_user_id = 21
AND created_at > (now() - interval '2 months')
AND visitor_id <> 21 -- ??
ORDER BY created_at DESC NULLS LAST, id DESC NULLS LAST
LIMIT 6;
I suspect your question is incomplete. If you want:
the 6 latest visitors with their latest visit to the page
then you need a subquery. You cannot get this sort order in one query level, neither with DISTINCT ON, nor with window functions:
SELECT *
FROM (
SELECT DISTINCT ON (visitor_id) *
FROM profile_visits
WHERE social_user_id = 21
AND created_at > (now() - interval '2 months')
AND visitor_id <> 21 -- ??
ORDER BY visitor_id, created_at DESC NULLS LAST, id DESC NULLS LAST
) sub
ORDER BY created_at DESC NULLS LAST, id DESC NULLS LAST
LIMIT 6;
The subquery sub gets the latest visit per user (but not older than two months and not for a certain visitor, 21). ORDER BY must have the same leading columns as DISTINCT ON.
You need the outer query to get the 6 latest visitors then.
Consider the sequence of events:
Best way to get result count before LIMIT was applied
Why NULLS LAST? To be sure, you did not provide the table definition.
PostgreSQL sort by datetime asc, null first?

This slow MySQL Query needs improvement

This query works and provides me with the information I need, but it is very slow: it takes 18 seconds to aggregate a table of only 4,000 records.
I'm bringing it here to see if anyone has any advice on how to improve it.
SELECT COUNT( status ) AS quantity, status
FROM log_table
WHERE time_stamp
IN (SELECT MAX( time_stamp ) FROM log_table GROUP BY userid )
GROUP BY status
Here's what it does/what it needs to do in plain text:
I have a table full of logs, each log contains a "userid", "status" (integer between 1-12) and "time_stamp" (a time stamp of when the log was created). There may be many entries for a particular userid, but with a different time stamp and status. I'm trying to get the most recent status (based on time_stamp) for each userid, then count the occurrences of each most-recent status among all the users.
My initial idea was to use a subquery with GROUP BY userid. That was fast, but it always returned the first entry for each userid, not the most recent. If I could do GROUP BY userid using time_stamp DESC to identify which row should represent the group, that would be great. But of course ORDER BY inside a group does not work.
Any suggestions?
The first thing to try is to make this an explicit join:
SELECT COUNT(status) AS quantity, status
FROM log_table lg join
(select userid, MAX( time_stamp ) as maxts
from log_table
GROUP BY userid
) lgu
on lgu.userid = lg.userid and lgu.maxts = lg.time_stamp
GROUP BY status;
Another approach is to use a different where clause. This will work best if you have an index on log_table(userid, time_stamp). This approach is doing the filtering by saying "there is no timestamp bigger than this one for a given user":
SELECT COUNT(status) AS quantity, status
FROM log_table lg
WHERE not exists (select 1
from log_table lg2
where lg2.userid = lg.userid and lg2.time_stamp > lg.time_stamp
)
GROUP BY status;
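The index mentioned for the NOT EXISTS approach could be created roughly as follows. This is a sketch: the index name idx_log_user_ts is hypothetical, and whether the optimizer actually uses the index depends on the MySQL version and the data, which is why the EXPLAIN is included.

```sql
-- Hypothetical index name; the column list comes from the answer.
CREATE INDEX idx_log_user_ts ON log_table (userid, time_stamp);

-- Inspect the plan for the "no later timestamp for this user" form.
EXPLAIN
SELECT COUNT(status) AS quantity, status
FROM log_table lg
WHERE NOT EXISTS (SELECT 1
                  FROM log_table lg2
                  WHERE lg2.userid = lg.userid
                    AND lg2.time_stamp > lg.time_stamp)
GROUP BY status;
```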

MySQL: Getting average or sum from 200,000 rows

I would like to get the average, or at least the sum, over 200,000 rows from a MySQL database. This is how I am querying the database, but the result set is too large for me to fetch because I cannot afford to overload the server.
SELECT user_id, total_email FROM email_users
WHERE email_code = 1
LIMIT 200000
SELECT SUM(total_email), AVG(total_email) FROM email_users
WHERE user_id IN
(
01, 02,..., 200000-th user_id
)
My question is: is there a way to combine the two queries into one so that I can get just the sum or average over 200,000 email_users rows that have email_code = 1?
EDIT: Thanks to all that have answered. I didn't realise the answer was so easy - nested select statement.
You can do this with a subquery:
SELECT SUM(total_email), AVG(total_email)
from (SELECT eu.*
FROM email_users eu
WHERE eu.email_code = 1
LIMIT 200000
) eu
Some notes. First, using limit without an order by gives indeterminate results. You could (in theory) run this query twice and get different results. Second, this assumes that there is a field called total_email in email_users.
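To make the first note concrete, here is a sketch that pins down which 200,000 rows are used by adding an ORDER BY. It assumes user_id is unique and an acceptable ordering key; the alias first_200k is illustrative.

```sql
SELECT SUM(total_email), AVG(total_email)
FROM (SELECT eu.total_email
      FROM email_users eu
      WHERE eu.email_code = 1
      ORDER BY eu.user_id      -- assumed ordering key, makes the LIMIT repeatable
      LIMIT 200000
     ) AS first_200k;
```

Sorting adds cost, so this only makes sense when repeatable results matter more than raw speed.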
SELECT SUM(total_email), AVG(total_email)
FROM (SELECT total_email
FROM email_users
WHERE email_code = 1
LIMIT 200000) x
How about something like this, assuming you just want any 200,000 records where email_code = 1? MySQL does not allow LIMIT directly inside an IN subquery, so the limited SELECT is wrapped in a derived table:
SELECT SUM(total_email), AVG(total_email) FROM email_users
WHERE user_id IN
(
SELECT user_id FROM
(SELECT user_id
FROM email_users
WHERE email_code = 1 LIMIT 200000) limited
)
or
SELECT SUM(total_email), AVG(total_email) FROM
(SELECT user_id , total_email
FROM email_users
WHERE email_code = 1 LIMIT 200000) t

Sql Query to count same date entries

I want to count entries by date (i.e. entries with the same date).
My table is
You can see that the 5th and 6th entries have the same date.
The real problem, I think, is that entries with the same date have different times, so I am not getting what I want.
I am using this SQL:
SELECT COUNT( created_at ) AS entries, created_at
FROM wp_frm_items
WHERE user_id =1
GROUP BY created_at
LIMIT 0 , 30
What I am getting is this.
I want entries to be 2 for the date 2012-02-22.
The reason you get this result is that you are also comparing the time, down to the second. Only entries created in the same second are grouped together.
To achieve what you actually want, you need to apply a date function to the created_at column:
SELECT COUNT(1) AS entries, DATE(created_at) as date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30
This removes the time part from the column value and so groups together any entries created on the same day. You could take this further by removing the day part to group entries created in the same month of the same year, and so on.
To restrict the query to entries created in the current month, you add a WHERE-clause to the query to only select entries that satisfy that condition. Here's an example:
SELECT COUNT(1) AS entries, DATE(created_at) as date
FROM wp_frm_items
WHERE user_id = 1
AND created_at >= DATE_FORMAT(CURDATE(),'%Y-%m-01')
GROUP BY DATE(created_at)
Note: The COUNT(1) part of the query simply means "count each row", and you could just as well have written COUNT(*), COUNT(id) or any other non-nullable column. Historically, the most efficient approach was to count the primary key, since that is always available in whatever index the query engine could utilize. COUNT(*) used to have to leave the index and retrieve the corresponding row in the table, which was sometimes inefficient. In more modern query planners this is probably no longer the case. COUNT(1) is another variant of this that didn't force the query planner to retrieve the rows from the table.
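One caveat when swapping these forms: COUNT(column) skips rows where that column is NULL, while COUNT(*) and COUNT(1) count every row. A small self-contained sketch using an inline derived table (the alias t and the column v are made up for illustration):

```sql
SELECT COUNT(*)  AS all_rows,
       COUNT(1)  AS also_all_rows,
       COUNT(v)  AS non_null_v
FROM (SELECT 1 AS v
      UNION ALL SELECT NULL
      UNION ALL SELECT 3) AS t;
-- all_rows = 3, also_all_rows = 3, non_null_v = 2
```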
Edit: The query to group by month can be created in a number of different ways. Here is an example:
SELECT COUNT(1) AS entries, DATE_FORMAT(created_at,'%Y-%c') as month
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE_FORMAT(created_at,'%Y-%c')
You must eliminate the time in the GROUP BY:
SELECT COUNT(*) AS entries, DATE(created_at) AS created_date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30
Oops, misread it.
Use GROUP BY DATE(created_at)
Try:
SELECT COUNT( created_at ) AS entries, DATE(created_at) AS created_date
FROM wp_frm_items
WHERE user_id =1
GROUP BY DATE(created_at)
LIMIT 0 , 30