Is it ok to use COUNT queries? - mysql

Is it OK to use the number of rows returned by a query (the COUNT() function in MySQL) for checks and so on?
For example, if I want to check how many posts a user has made today to decide whether he can create another (in other words, whether the user has reached his daily limit), is it good practice to just send a query like this:
SELECT COUNT(post_text) FROM posts WHERE (date_published = CURDATE() AND userId = 115);
or is there a better approach? I have faced this a couple of times (I don't write database logic often), and I was never sure whether I was getting it wrong. I hope you can clarify this for me once and for all, thanks.

If a user has daily limits, then you probably want to do this check in the database.
In that case, you would implement this restriction using a trigger rather than at the application level. This ensures that the restriction is always applied, regardless of competing threads, table locks, or who is doing the update.
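As a rough illustration of the trigger idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. The limit of 3, the table layout, and the helper name try_post are assumptions made for the example. In MySQL the trigger body would typically reject the row with SIGNAL SQLSTATE '45000' instead of SQLite's RAISE(ABORT, ...).

```python
import sqlite3

DAILY_LIMIT = 3  # hypothetical limit, not taken from the question

conn = sqlite3.connect(":memory:")
# Trigger bodies cannot take bound parameters, so the integer constant is
# formatted into the DDL directly.
conn.executescript(f"""
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    date_published TEXT NOT NULL,
    post_text TEXT NOT NULL
);

-- Reject the insert when the user already has DAILY_LIMIT posts that day.
CREATE TRIGGER posts_daily_limit
BEFORE INSERT ON posts
WHEN (SELECT COUNT(*) FROM posts
      WHERE user_id = NEW.user_id
        AND date_published = NEW.date_published) >= {DAILY_LIMIT}
BEGIN
    SELECT RAISE(ABORT, 'daily post limit reached');
END;
""")


def try_post(user_id, day, text):
    """Insert a post; return False if the database rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO posts (user_id, date_published, post_text) "
                "VALUES (?, ?, ?)",
                (user_id, day, text),
            )
        return True
    except sqlite3.DatabaseError:
        return False


# Four attempts by the same user on the same day; the last one is refused.
results = [try_post(115, "2024-01-01", f"post {i}") for i in range(4)]
print(results)
```

Because the check lives in the database, every code path that inserts into posts is subject to it, which is the point made above.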
If you do want to implement the restriction at the application level, then you would use a query, presumably with count(). I would expect the query to include the user id:
SELECT COUNT(*)
FROM posts p
WHERE p.date_published = CURDATE() AND p.user_id = ?;
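For the application-level variant, the query above could be wrapped roughly like this. It is only a sketch: it uses Python's sqlite3 module in place of a MySQL driver, passes the date as a parameter instead of calling CURDATE(), and the names can_post and DAILY_LIMIT are made up for the example.

```python
import sqlite3
from datetime import date

DAILY_LIMIT = 5  # hypothetical limit


def can_post(conn, user_id, today=None):
    """Return True if the user is still below the daily limit."""
    today = today or date.today().isoformat()
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM posts "
        "WHERE date_published = ? AND user_id = ?",
        (today, user_id),
    ).fetchone()
    return count < DAILY_LIMIT


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, "
    "date_published TEXT, post_text TEXT)"
)
# User 115 has already used up the whole quota for this day.
conn.executemany(
    "INSERT INTO posts (user_id, date_published, post_text) VALUES (?, ?, ?)",
    [(115, "2024-01-01", f"post {i}") for i in range(DAILY_LIMIT)],
)

print(can_post(conn, 115, "2024-01-01"))  # user at the limit
print(can_post(conn, 7, "2024-01-01"))    # user with no posts
```

Note that two concurrent requests can both pass this check before either inserts, which is why the trigger approach is safer under competing threads.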

Related

Optimize query from view with UNION ALL

So I'm facing a difficult scenario: I have a legacy app, badly written and designed, with a table, t_booking. This app has a calendar view which, for every hall and for every day in the month, shows its reservation status using this query:
SELECT mr1b.id, mr1b.idreserva, mr1b.idhotel, mr1b.idhall, mr1b.idtiporeserva, mr1b.date, mr1b.ampm, mr1b.observaciones, mr1b.observaciones_bookingarea, mr1b.tipo_de_navegacion, mr1b.portal, r.estado
FROM t_booking mr1b
LEFT JOIN a_reservations r ON mr1b.idreserva = r.id
WHERE mr1b.idhotel = '$sIdHotel' AND mr1b.idhall = '$hall' AND mr1b.date = '$iAnyo-$iMes-$iDia'
AND IF (r.comidacena IS NULL OR r.comidacena = '', mr1b.ampm = 'AM', r.comidacena = 'AM' AND mr1b.ampm = 'AM')
AND (r.estado <> 'Cancelled' OR r.estado IS NULL OR r.estado = '')
LIMIT 1;
(at first there was also an ORDER BY r.estado DESC, which I took out)
This query, after proper (I think) indexing, takes 0.004 seconds per execution, and the overall calendar view is presented in a reasonable time. There are indexes over idhotel, idhall, and date.
Now I have a new module, well written ;-), which stores reservations in another table, but I must present both types of reservations in the same calendar view. My first approach was to create a view joining the content of both tables, and to select the data for the calendar view from this view instead of from t_booking.
The view is defined like this:
CREATE OR REPLACE VIEW
t_booking_hall_reservation
AS
SELECT id,
idreserva,
idhotel,
idhall,
idtiporeserva,
date,
ampm,
observaciones,
observaciones_bookingarea,
tipo_de_navegacion, portal
FROM t_booking
UNION ALL
SELECT HR.id,
HR.convention_id as idreserva,
H.id_hotel as idhotel,
HR.hall_id as idhall,
99 as idtiporeserva,
date,
session as ampm,
observations as observaciones,
'new module' as observaciones_bookingarea,
'events' as tipo_de_navegacion,
'new module' as portal
FROM new_hall_reservation HR
JOIN a_halls H on H.id = HR.hall_id
;
(table new_hall_reservation has same indexes)
I tried UNION ALL instead of UNION, as I read this is much more efficient.
Well, the former query, with t_booking replaced by t_booking_hall_reservation, takes 1.5 seconds, multiplied by each hall and each day, which makes the calendar view impossible to finish.
The app is spaghetti code, so looping twice, once over t_booking and then over new_hall_reservation, and combining the results is rather difficult.
Is it possible to tune the view to make this query fast enough? Or is there another approach?
Thanks
PS: the less I modify the original query, the less I'll need to modify the legacy app, which is, at the very least, risky to modify.
This is too long for a comment.
A view is (almost) never going to help performance. Yes, they make queries simpler. Yes, they incorporate important logic. But no, they don't help performance.
One key problem is the execution of the view: it doesn't generally take the filters of the outer query into account (although the most recent versions of MySQL are better at this).
One suggestion, which might be a fair bit of work, is to materialize the view as a table. When the underlying tables change, you need to change t_booking_hall_reservation using triggers. Then you can create indexes on the table to achieve your performance goals.
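A rough sketch of that materialization idea follows, using Python's sqlite3 module as a stand-in for MySQL. The column set is trimmed down, the source column and the trigger and index names are hypothetical, and only INSERT and DELETE are handled. A real version would also need UPDATE triggers and an initial backfill of existing rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t_booking (
    id INTEGER PRIMARY KEY, idhotel INTEGER, idhall INTEGER,
    date TEXT, ampm TEXT);
CREATE TABLE a_halls (id INTEGER PRIMARY KEY, id_hotel INTEGER);
CREATE TABLE new_hall_reservation (
    id INTEGER PRIMARY KEY, hall_id INTEGER, date TEXT, session TEXT);

-- Real table replacing the UNION ALL view. 'source' (hypothetical) keeps
-- ids from the two origins from colliding.
CREATE TABLE t_booking_hall_reservation (
    source TEXT, id INTEGER, idhotel INTEGER, idhall INTEGER,
    date TEXT, ampm TEXT,
    PRIMARY KEY (source, id));
CREATE INDEX idx_calendar
    ON t_booking_hall_reservation (idhotel, idhall, date);

CREATE TRIGGER booking_ai AFTER INSERT ON t_booking
BEGIN
    INSERT INTO t_booking_hall_reservation
    VALUES ('legacy', NEW.id, NEW.idhotel, NEW.idhall, NEW.date, NEW.ampm);
END;
CREATE TRIGGER booking_ad AFTER DELETE ON t_booking
BEGIN
    DELETE FROM t_booking_hall_reservation
    WHERE source = 'legacy' AND id = OLD.id;
END;

CREATE TRIGGER hall_res_ai AFTER INSERT ON new_hall_reservation
BEGIN
    INSERT INTO t_booking_hall_reservation
    SELECT 'new', NEW.id, H.id_hotel, NEW.hall_id, NEW.date, NEW.session
    FROM a_halls H WHERE H.id = NEW.hall_id;
END;
CREATE TRIGGER hall_res_ad AFTER DELETE ON new_hall_reservation
BEGIN
    DELETE FROM t_booking_hall_reservation
    WHERE source = 'new' AND id = OLD.id;
END;
""")


def calendar_rows(idhotel, idhall, day):
    """Rows for one calendar cell, read from the indexed table."""
    return conn.execute(
        "SELECT source, id, ampm FROM t_booking_hall_reservation "
        "WHERE idhotel = ? AND idhall = ? AND date = ? "
        "ORDER BY source, id",
        (idhotel, idhall, day),
    ).fetchall()


conn.execute("INSERT INTO a_halls VALUES (1, 10)")
conn.execute("INSERT INTO t_booking VALUES (1, 10, 1, '2024-05-01', 'AM')")
conn.execute("INSERT INTO new_hall_reservation VALUES (1, 1, '2024-05-01', 'PM')")
rows = calendar_rows(10, 1, "2024-05-01")
print(rows)
```

The calendar query then reads from a single indexed table, at the cost of extra work on every write to either source table.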
t_booking, unless it is a VIEW, needs
INDEX(idhotel, idhall, date)
VIEWs are syntactic sugar; they do not enhance performance; sometimes they are slower than the equivalent SELECT.

MySql Triggers and performance

I have the following requirement. I have 4 MySQL databases and an application in which the user needs to get the number of records in the tables of each of these databases. The issue is that the count may change every minute or second. So whenever the user hovers the mouse over a particular UI area, I need to make a call to all these databases and get the count. I don't think this is the best approach, as these tables contain millions of records, and on every mouseover a DB call goes to all these databases.
A trigger is the one approach I found. Rather than pulling data from the database, whenever an insert/update/delete happens on these tables, a trigger would execute and increment/decrement the count in another table (which contains only the counts of these tables). But I have read that triggers affect database performance, and also that in some situations a trigger is the only solution.
So please advise: are triggers the solution in my situation? If they affect database performance, I don't want them. Is there a better approach to this problem?
Thanks
What I understood is that you have 4 databases with n tables in each of them, and when the user hovers over a particular area in your application, the user should see the number of rows in that table.
I would suggest using count(*) to return the number of rows in each table in the database. Triggers are used to do something when a particular event such as an update, delete, or insert occurs in a database. It's not a good idea to invoke triggers to react to user interactions like hovering. If you can tell me which language you are designing the front end in, I can be more specific.
Example:
SELECT COUNT(*) FROM tablename where condition
OR
SELECT SQL_CALC_FOUND_ROWS * FROM tablename
WHERE condition
LIMIT 5;
SELECT FOUND_ROWS();
The second one is used when you want to limit the results but still return the total number of rows found (note that SQL_CALC_FOUND_ROWS is deprecated as of MySQL 8.0.17). Hope it helps.
Please don't use count(*). This is inefficient, possibly to the point of causing a table scan. If you can get to the information schema, this should return the result you need in under a second (note that for InnoDB tables, table_rows is an estimate rather than an exact count):
select table_rows from information_schema.tables where table_name = 'tablename'
If you can't for some reason, and your table has a primary key, try:
SELECT COUNT(field) FROM tablename
...where field is part of the primary key. This will be slower, especially on large tables, and on InnoDB it is unlikely to be meaningfully faster than count(*), since both end up scanning an index.
Definitely don't use a trigger.

Update statistic counter or just count(*) - Performance

What is the faster/better way to keep track of statistical data on a message board?
-> number of posts/topics
Update a column like 'number_of_posts' for each incoming post or after a post gets deleted.
Or just count(*) on the posts matching a topicId?
Just use count(*) - it's built into the database. It's well tested, and already written.
Having a special column to do this for you means you need to write the code to manage it, keep it in sync with the actual value (on adds and deletes). Why make more work for yourself?
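To make the trade-off concrete, here is a small sketch of what maintaining such a column involves, next to the plain count. It uses Python's sqlite3 module as a stand-in for MySQL. The table layout and trigger names are assumptions, and the sketch does not handle a post being moved to another topic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE topics (
    id INTEGER PRIMARY KEY,
    number_of_posts INTEGER NOT NULL DEFAULT 0);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    topic_id INTEGER NOT NULL REFERENCES topics(id));

-- Extra code that exists only to keep the stored counter in sync.
CREATE TRIGGER posts_ai AFTER INSERT ON posts
BEGIN
    UPDATE topics SET number_of_posts = number_of_posts + 1
    WHERE id = NEW.topic_id;
END;
CREATE TRIGGER posts_ad AFTER DELETE ON posts
BEGIN
    UPDATE topics SET number_of_posts = number_of_posts - 1
    WHERE id = OLD.topic_id;
END;
""")


def stored_count(topic_id):
    """Read the maintained counter column."""
    return conn.execute(
        "SELECT number_of_posts FROM topics WHERE id = ?", (topic_id,)
    ).fetchone()[0]


def live_count(topic_id):
    """Count the matching posts on demand."""
    return conn.execute(
        "SELECT COUNT(*) FROM posts WHERE topic_id = ?", (topic_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO topics (id) VALUES (1)")
conn.executemany("INSERT INTO posts (topic_id) VALUES (?)", [(1,), (1,), (1,)])
conn.execute("DELETE FROM posts WHERE id = 2")

print(stored_count(1), live_count(1))
```

Both functions return the same number here. The difference is that live_count needs no supporting code, while stored_count depends on every write path being covered by a trigger or equivalent application logic.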

Is it better to store list of each user's Blocked users for query exclusion in $_SESSION var, or to exclude in "real-time" with sub-query?

On one of my PHP/MySQL sites, every user can block every other user on the site. These blocks are stored in a Blocked table with each row representing who did the blocking and who is the target of the block. The columns are indexed for faster retrieval of a user's entire "block list".
For each user, we must exclude from any search results any user that appears in their block list.
In order to do that, is it better to:
1) Generate the "block list" whenever the user logs in by querying the Blocked table once at login and saving it to the $_SESSION (and re-querying any time they make a change to their "block list" and re-saving it to the $_SESSION), and then querying as such:
NOT IN ($commaSeparatedListFromSession)
or
2) Exclude the blocked users in "real-time" directly in the query by using a sub-query for each user's search query as such:
NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID) ?
If the website is PHP and the block list has fewer than, say, 100 entries per user, I would store it in a table and load it into $_SESSION when it changes or when the user logs in. You could just as easily load it from SQL into a local variable on each page load, however.
What I would store in $_SESSION is a flag 'has_blocklist_contents' that would decide whether or not you should load or check the blocklist on page load.
Instead of then using NOT IN with the list in all of your queries, I think it might be smarter to filter the blocked users out using PHP.
I have two reasons for wanting to implement this way:
Your database can re-use the SQL for all users on the system resulting in a performance boost for retrieving comments and such.
Your block list will most of the time be empty, so you're not adding any processing time for the majority of users.
I think there is a third solution to this. In my opinion, it would be the better way to go.
If you can write this
NOT IN (SELECT userid FROM Blocked WHERE Blocked.from = $currentUserID)
Then you can surely write this.
....
SomeTable st
LEFT JOIN
Blocked b
ON (st.userid = b.userid AND b.from = $currentUserID)
WHERE b.primaryKey IS NULL;
I hope you understand what I mean by the above query.
This way you get the best of both worlds, i.e. you don't have to run 2 queries, and you don't have to save data in $_SESSION.
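The sketch below checks that the LEFT JOIN ... IS NULL form returns the same rows as the NOT IN subquery. It uses Python's sqlite3 module as a stand-in for MySQL. The users table, the sample data, and the function names are made up for the example, and b.userid stands in for the primaryKey column mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, name TEXT);
-- `from` is a reserved word, so it is quoted MySQL-style.
CREATE TABLE Blocked (
    `from` INTEGER, userid INTEGER,
    PRIMARY KEY (`from`, userid));
INSERT INTO users VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
-- User 1 blocks users 2 and 4; user 3 blocks user 1.
INSERT INTO Blocked VALUES (1, 2), (1, 4), (3, 1);
""")


def visible_not_in(current_user_id):
    """Option 2 from the question: exclude with a NOT IN subquery."""
    rows = conn.execute(
        "SELECT u.userid FROM users u "
        "WHERE u.userid NOT IN "
        "  (SELECT userid FROM Blocked WHERE `from` = ?) "
        "ORDER BY u.userid",
        (current_user_id,),
    )
    return [r[0] for r in rows]


def visible_anti_join(current_user_id):
    """The LEFT JOIN variant: keep only rows with no matching block."""
    rows = conn.execute(
        "SELECT u.userid FROM users u "
        "LEFT JOIN Blocked b "
        "  ON u.userid = b.userid AND b.`from` = ? "
        "WHERE b.userid IS NULL "
        "ORDER BY u.userid",
        (current_user_id,),
    )
    return [r[0] for r in rows]


print(visible_not_in(1), visible_anti_join(1))
```

Passing the user id as a bound parameter, rather than interpolating $currentUserID into the SQL string, also avoids injection problems.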
Don't use the $_SESSION as a substitute for a proper caching system. The more junk you pile into $_SESSION, the more you'll have to load for each and every request.
Using a sub-select for exclusions can be brutally slow if you're not careful to keep your database tuned. Make sure your indexes are covering all your WHERE conditions.

What is more efficient(speed/memory): a join or multiple selects

I have the following tables:
users
userId|name
items
itemId|userId|description
What I want to achieve: I want to read all users and their items from the database (a user can have multiple items). I want all this data stored in a structure like the following:
User {
id
name
array<Item>
}
where Item is
Item {
itemId
userId
description
}
My first option would be to call a SELECT * from users, partially fill an array with users and after that for each user do a SELECT * from items where userId=wantedId and complete the array of items.
Is this approach correct, or should I use a join for this?
A reason that I don't want to use join is that I have a lot of redundant data:
userId1|name1|ItemId11|description11
userId1|name1|ItemId12|description12
userId1|name1|ItemId13|description13
userId1|name1|ItemId14|description14
userId2|name2|ItemId21|description21
userId2|name2|ItemId22|description22
userId2|name2|ItemId23|description23
userId2|name2|ItemId24|description24
by redundant I mean: userId1,name1 and userId2,name2
Is my reason justified?
LATER EDIT: I added to the title speed or memory when talking about efficiency
You're trading off network roundtrips for bytes on the wire and in RAM. Network latency is usually the bigger problem, since memory is cheap and networks have gotten faster. It gets worse as the size of the first result set grows - Google for "(n+1) query problem".
I'd prefer the JOIN. Don't write it using SELECT *; that's a bad idea in almost every case. You should spell out precisely what columns you want.
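As a sketch of the JOIN approach, the following reads everything in one round trip and folds the repeated user columns back into the User/Item structure from the question. It uses Python's sqlite3 module as a stand-in for the real database. The class and function names are assumptions, and a LEFT JOIN is used so that users without items still appear.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    item_id: int
    user_id: int
    description: str


@dataclass
class User:
    id: int
    name: str
    items: List[Item] = field(default_factory=list)


def load_users(conn):
    """Build User objects (with their items) from a single joined query."""
    rows = conn.execute(
        "SELECT u.userId, u.name, i.itemId, i.description "
        "FROM users u "
        "LEFT JOIN items i ON i.userId = u.userId "
        "ORDER BY u.userId, i.itemId"
    )
    users = {}
    for user_id, name, item_id, description in rows:
        # The user columns repeat on every row; keep one User per id.
        user = users.setdefault(user_id, User(user_id, name))
        if item_id is not None:  # NULL means the user has no items
            user.items.append(Item(item_id, user_id, description))
    return list(users.values())


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items (
    itemId INTEGER PRIMARY KEY, userId INTEGER, description TEXT);
INSERT INTO users VALUES (1, 'name1'), (2, 'name2'), (3, 'name3');
INSERT INTO items VALUES
    (11, 1, 'description11'), (12, 1, 'description12'),
    (31, 3, 'description31');
""")

users = load_users(conn)
for u in users:
    print(u.id, u.name, [i.item_id for i in u.items])
```

The redundant userId and name values cost some bytes on the wire, but the whole structure is built from one query instead of one query per user.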
A join is the best-performing way. It reduces overhead, and you can use the related indexes. You can test it, but I'm sure that joins are faster and better optimized than multiple selects.
The answer is: it depends.
Multiple SELECT:
If you end up issuing lots of queries to populate the description, then you have to take into account that you'll end up with a lot of round trips to the database.
Using a JOIN:
Yes, you'll be returning more data, but you've only got one round trip.
You've mentioned that you'll partially fill an array with users. Do you know in advance how many users you'll want to fill? In that case I would use the following (I'm using Oracle here):
select *
from item a,
     (select *
      from (select *
            from user
            order by user_id)
      where rownum <= 10) b
where a.user_id = b.user_id
order by a.user_id
That would return all the items for the first 10 users only (that way most of the work is done on the database itself, rather than getting all the users back, discarding all but the first ten...)