SQL - Reduce the time request - mysql

I've got this table (eod) :
| eod_id | company_symbol | date | open | close | high | low |
| 1 | AAA | 01-01-2000 | 40.00 | 42.00 | 43.00 | 39.00 |
I use those 3 requests :
1. SELECT COUNT(*) FROM eod WHERE company_symbol="AAA" AND CLOSE>OPEN
AND DATE BETWEEN "0000-00-00" AND "0000-00-00";
2. SELECT COUNT(*) FROM eod WHERE company_symbol="AAA" AND CLOSE<OPEN
AND DATE BETWEEN "0000-00-00" AND "0000-00-00";
3. SELECT min(date), max(date) FROM eod WHERE company_symbol="AAA"
Each request takes around 0.7 seconds to execute, and I would like to reduce that. How should I proceed? Is it possible to combine the first two requests into one?
Thanks in advance for your help,
Steve

Combining first two:
SELECT
SUM(CASE WHEN CLOSE>OPEN THEN 1 ELSE 0 END) as Higher,
SUM(CASE WHEN CLOSE<OPEN THEN 1 ELSE 0 END) as Lower
FROM eod WHERE company_symbol="AAA"
AND DATE BETWEEN "0000-00-00" AND "0000-00-00";

As you suspect the first two queries can be combined:
SELECT COUNT(CASE WHEN CLOSE < OPEN THEN 1 END),
COUNT(CASE WHEN CLOSE > OPEN THEN 1 END)
FROM eod
WHERE company_symbol="AAA"
AND DATE BETWEEN "0000-00-00" AND "0000-00-00";
(I assume your DATE BETWEEN clause is just an example, but if not, it could be changed to DATE = "0000-00-00".)
I'd say a nonclustered (secondary) index on company_symbol is a must. If your DBMS supports non-key columns in an index, include OPEN, CLOSE and DATE in it as well. Depending on how frequently you insert or update data, it may also be worth indexing your date column.
As always with performance questions, you are in a much better position to help yourself than we are to help you. You can view execution plans and IO statistics and run various tests to determine what is slowing your query down. Once you have identified the problem more specifically, you can add targeted indexes to resolve it.
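To make the two suggestions above concrete, here is a minimal sketch that runs the combined query against a small in-memory table. It uses Python's built-in sqlite3 module purely as a stand-in for MySQL; the index name idx_eod_symbol_date, the sample rows and the date range are assumptions made up for illustration, and the composite index is only one possible layout to try.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE eod (eod_id INTEGER PRIMARY KEY, company_symbol TEXT, "
    "date TEXT, open REAL, close REAL, high REAL, low REAL)"
)
# Hypothetical composite index: the equality column first, then the range
# column, then the two compared columns so the index covers the query.
conn.execute(
    "CREATE INDEX idx_eod_symbol_date ON eod (company_symbol, date, open, close)"
)

# Made-up sample rows (not from the original question).
rows = [
    ("AAA", "2000-01-03", 40.0, 42.0, 43.0, 39.0),  # close > open
    ("AAA", "2000-01-04", 42.0, 41.0, 42.5, 40.0),  # close < open
    ("AAA", "2000-01-05", 41.0, 41.0, 41.5, 40.5),  # unchanged: counted in neither
    ("AAA", "2000-01-06", 41.0, 44.0, 44.0, 41.0),  # close > open
    ("BBB", "2000-01-03", 10.0, 12.0, 12.0, 10.0),  # other symbol: filtered out
]
conn.executemany(
    "INSERT INTO eod (company_symbol, date, open, close, high, low) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    rows,
)

# One pass over the matching rows yields both counts.
higher, lower = conn.execute(
    """
    SELECT SUM(CASE WHEN close > open THEN 1 ELSE 0 END) AS higher,
           SUM(CASE WHEN close < open THEN 1 ELSE 0 END) AS lower
    FROM eod
    WHERE company_symbol = ?
      AND date BETWEEN ? AND ?
    """,
    ("AAA", "2000-01-01", "2000-12-31"),
).fetchone()

print(higher, lower)
```

In MySQL the same SELECT should work unchanged; running it with EXPLAIN before and after creating the index is a reasonable way to check whether the index is actually used.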

My first suggestion to improve performance would be to never use COUNT(*), but to count a single column instead, such as COUNT(eod_id) in your case.

Related

MYSQL use LEAST() or GREATEST() functions in WHERE CLAUSE

I have a lot of columns to process in a query (column values can also be NULL), and at the end I need a single list of all pieces for a timetable (e.g. "which part of which piece should I work on first").
My table looks something like this:
piece type | deadline for first check | deadline for second check | deadline for third check | deadline for n. check
---------------------------------------------------------------------------------------------------------------------
FIRST | NULL | 2022-02-01 | 2022-01-18 | 2022-04-01
SECOND | 2022-03-01 | 2022-01-15 | 2022-03-15 | 2022-05-01
The current query plus PHP processing (slow) gives me something like:
2022-01-15 SECOND (second check)
2022-03-01 SECOND (first check)
2022-05-01 SECOND (n. check)
2022-01-18 FIRST (third check)
...
As I have more than 600 pieces (of different types) and 6-7 checks in total (roughly 4 for one piece type, 2 for another, and so on), I would like to know whether there is a way to limit the result in SQL (say, "least of the deadlines < today" or "least of the deadlines within 10 days") when the PHP side applies no filter on piece type.
Any help appreciated!
Consider something like:
SELECT piece_type, check_date, which
FROM (
SELECT piece_type, deadline1 AS check_date, 1 AS which
FROM tbl WHERE deadline1 IS NOT NULL
UNION ALL
SELECT piece_type, deadline2 AS check_date, 2 AS which
FROM tbl WHERE deadline2 IS NOT NULL
UNION ALL
SELECT piece_type, deadline3 AS check_date, 3 AS which
FROM tbl WHERE deadline3 IS NOT NULL
) AS x
ORDER BY check_date;
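Once the deadline columns are unpivoted like this, the "within 10 days" limit from the question becomes an ordinary WHERE condition on check_date. Below is a minimal sketch, using Python's sqlite3 module as a stand-in for MySQL; the column names deadline1..deadline3 follow the answer above, the fixed "today" value is an assumption chosen for reproducibility, and in MySQL the filter would be written as check_date <= CURDATE() + INTERVAL 10 DAY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (piece_type TEXT, deadline1 TEXT, deadline2 TEXT, deadline3 TEXT)"
)
# First three deadline columns of the two sample rows from the question.
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("FIRST", None, "2022-02-01", "2022-01-18"),
        ("SECOND", "2022-03-01", "2022-01-15", "2022-03-15"),
    ],
)

today = "2022-01-10"  # assumed reference date, fixed so the output is stable

due_soon = conn.execute(
    """
    SELECT piece_type, check_date, which
    FROM (
        SELECT piece_type, deadline1 AS check_date, 1 AS which FROM tbl
        UNION ALL
        SELECT piece_type, deadline2, 2 FROM tbl
        UNION ALL
        SELECT piece_type, deadline3, 3 FROM tbl
    ) AS x
    WHERE check_date IS NOT NULL
      AND check_date <= date(?, '+10 days')  -- SQLite date arithmetic
    ORDER BY check_date
    """,
    (today,),
).fetchall()

for row in due_soon:
    print(row)
```

Overdue checks are included as well, because the filter only has an upper bound; add a lower bound on check_date if those should be excluded.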

Is it possible in MySQL to find the Min/Max but remove outliers first?

I have a table that holds scan datetime values. I want to find the start and stop times of each user's main block of scanning. The issue is that a user may perform some checks before or after the bulk of the scanning and generate a few extra scans. The data might look like this:
....
| 2020-04-01 19:48:05 |
| 2020-04-01 19:48:22 |
| 2020-04-01 19:48:23 |
| 2020-04-01 19:48:48 |
| 2020-04-01 19:48:49 |
| 2020-04-01 20:45:33 |
+---------------------+
If I group by the date and take the min/max of these values, the elapsed time will be much larger than the actual time. In the case above, the max would add almost 1 hour of extra time that was not really spent scanning.
SELECT date, MIN(datetime), MAX(datetime) FROM table GROUP BY date
There might be 1 extra scan or there might be several scans at the beginning or the end of the data so throwing out the first and last data points is not really an option.
Hmmm . . . I think this is a gap and islands problem. You need some definition of when an outlier occurs. Say it is 5 minutes:
select min(datetime), max(datetime), count(*) as num_scans
from (select t.*,
sum(case when prev_datetime > datetime - interval 5 minute then 0 else 1 end) over (order by datetime) as grp
from (select t.*,
lag(datetime) over (order by datetime) as prev_datetime
from t
) t
) t
group by grp;
I'm not sure how you distinguish actual scans from the outliers. Perhaps a group counts as real scanning if it contains more than one row or so. If that is the case, you can remove the outliers with logic such as having count(*) > 1.
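Here is a small runnable sketch of the same gaps-and-islands idea, including the having count(*) > 1 filter, applied to the sample rows from the question. It uses Python's sqlite3 module as a stand-in for MySQL 8 (it needs an SQLite build with window functions, 3.25 or later). The column name scanned_at is a made-up replacement for datetime, and the interval arithmetic is rewritten with strftime('%s', ...) because SQLite has no INTERVAL syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (scanned_at TEXT)")
scans = [
    "2020-04-01 19:48:05",
    "2020-04-01 19:48:22",
    "2020-04-01 19:48:23",
    "2020-04-01 19:48:48",
    "2020-04-01 19:48:49",
    "2020-04-01 20:45:33",  # the late outlier from the question
]
conn.executemany("INSERT INTO t VALUES (?)", [(s,) for s in scans])

GAP_SECONDS = 300  # assumed outlier threshold: 5 minutes

sessions = conn.execute(
    """
    SELECT MIN(scanned_at), MAX(scanned_at), COUNT(*) AS num_scans
    FROM (
        SELECT scanned_at,
               -- A new group starts whenever the gap to the previous scan
               -- is at least GAP_SECONDS (or there is no previous scan).
               SUM(CASE WHEN prev_scanned_at IS NOT NULL
                         AND strftime('%s', scanned_at)
                             - strftime('%s', prev_scanned_at) < ?
                        THEN 0 ELSE 1 END)
                   OVER (ORDER BY scanned_at) AS grp
        FROM (
            SELECT scanned_at,
                   LAG(scanned_at) OVER (ORDER BY scanned_at) AS prev_scanned_at
            FROM t
        ) AS with_prev
    ) AS with_grp
    GROUP BY grp
    HAVING COUNT(*) > 1  -- drop single-scan islands as outliers
    """,
    (GAP_SECONDS,),
).fetchall()

print(sessions)
```

With this data the five scans around 19:48 form one group and the 20:45:33 scan is dropped, so the elapsed time is 44 seconds rather than almost an hour.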

Is it possible to use the current dayname in the where clause in MySQL?

I am trying to create a view in MySQL based on the current day of the week. I am creating a table to keep track of tasks based on the day of the week. For example, some tasks will happen every Tuesday, some will happen on Wednesday and Friday, etc.
I decided to set the table up with a column for each day of the week. If the task needs to be executed on that day I will store a 1 in the column, otherwise it will be a 0. The table looks like this:
| ID | Monday | Tuesday | Wednesday | Thursday | Friday | Task |
-----------------------------------
| 1 | 0 | 1 | 0 | 0 | 0 | "SomeTask" |
| 2 | 0 | 0 | 1 | 0 | 1 | "SomeTask" |
| 3 | 0 | 1 | 0 | 0 | 0 | "SomeTask" |
I would like to create a SELECT statement that will be used in a view to show the tasks that need to be executed on the current day. In other words, today is Tuesday, so I would like a query that returns the rows with IDs 1 and 3.
I tried the following, but it didn't work:
SELECT * FROM MyTasks WHERE DAYNAME(curdate()) = 1
Is there a better way to structure the table? Is there any way to use DAYNAME in the WHERE clause? Any suggestions?
You can use case like this:
SELECT * FROM `MyTasks` WHERE (CASE DAYNAME(NOW())
WHEN 'Monday' THEN `Monday`=1
WHEN 'Tuesday' THEN `Tuesday`=1
WHEN 'Wednesday' THEN `Wednesday`=1
WHEN 'Thursday' THEN `Thursday`=1
WHEN 'Friday' THEN `Friday`=1
END)
Apart from that, I don't see any way of accomplishing this, as the column names are static and can't be built dynamically from other functions.
You can get the day name using the DAYNAME(curdate()) function.
This returns Thursday (today is 2015-03-05), but
given your table structure you have to use one of the following queries:
1. SELECT * FROM MyTasks WHERE (
CASE DAYNAME(curdate())
WHEN 'Monday' THEN `Monday`=1
WHEN 'Tuesday' THEN `Tuesday`=1
WHEN 'Wednesday' THEN `Wednesday`=1
WHEN 'Thursday' THEN `Thursday`=1
WHEN 'Friday' THEN `Friday`=1
END)
2. SELECT * FROM MyTasks WHERE (
CASE weekday(curdate())
WHEN 0 THEN `Monday`=1
WHEN 1 THEN `Tuesday`=1
WHEN 2 THEN `Wednesday`=1
WHEN 3 THEN `Thursday`=1
WHEN 4 THEN `Friday`=1
END)
DAYNAME returns the name of the day of the week, so your query should be:
SELECT * FROM MyTasks WHERE DAYNAME(NOW()) = 'Saturday';
I think you need DAYOFWEEK function to get week day index.
Use the WEEKDAY() function to get what you are looking for (0 = Monday, so 1 = Tuesday).
SELECT * FROM MyTasks WHERE WEEKDAY(curdate()) = 1
It would be better to define a single column named Day that is an enum of the days of the week, instead of the separate day columns you have, like this:
`Day ENUM('1', '2', '3', '4', '5', '6', '7')`
You can then convert the current day into the matching value (from 1 to 7) and use it in your SQL query, like this using PHP:
$sql = 'select * from table where Day = ' . date('N');
date('N') will return a value from 1 to 7 depending on the current day of the week.
Note: this will use the server time of the machine running PHP.
Here is an example of the table:
CREATE TABLE IF NOT EXISTS `enumtest` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`Day` enum('1','2','3','4','5','6','7') NOT NULL,
`Task` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;
You can keep your model as it is and use one of two solutions: interpolate the column name in your application language, as you do with column values (hacky), or use a stored procedure.
But you can also do this the usual RDBMS way, with two tables and a junction table. You would have a weekday, a task and a weekday_task table in your schema.
The task table would only hold data about the task itself, maybe with a surrogate serial id. The weekday table would only hold data about the weekday itself: not much, just its name and probably a working_day attribute.
The junction table would just contain the task PK and the weekday PK. It is an ordinary n:m relation, and a record exists only where a task is scheduled on that weekday.
This model is probably missing a lot of what the real domain needs and appears to be a learning exercise, so if it is about learning, you should go with the n:m solution.
The real problem will probably grow to require validity periods (start and end) for both task and weekday_task, and the weekday will probably gain a specific-date companion, as a more complex solution is needed to deal with real-world frequencies anyway. This is not trivial, and the GoF may already have mapped this for you as both a persistence model and a domain model.
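For what it's worth, the n:m layout described above could look roughly like the sketch below. It uses Python's sqlite3 module as a stand-in for MySQL; the column names (task_id, weekday_id, name) and the helper tasks_for are made up for illustration, and weekday ids follow MySQL's WEEKDAY() numbering (0 = Monday) so that in MySQL the parameter could simply be WEEKDAY(CURDATE()).

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE weekday (
        weekday_id INTEGER PRIMARY KEY,  -- 0 = Monday ... 6 = Sunday
        name       TEXT NOT NULL
    );
    CREATE TABLE task (
        task_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    -- Junction table: one row per (task, weekday) on which the task runs.
    CREATE TABLE weekday_task (
        task_id    INTEGER NOT NULL REFERENCES task (task_id),
        weekday_id INTEGER NOT NULL REFERENCES weekday (weekday_id),
        PRIMARY KEY (task_id, weekday_id)
    );
    INSERT INTO weekday VALUES
        (0, 'Monday'), (1, 'Tuesday'), (2, 'Wednesday'), (3, 'Thursday'),
        (4, 'Friday'), (5, 'Saturday'), (6, 'Sunday');
    INSERT INTO task VALUES (1, 'SomeTask'), (2, 'SomeTask'), (3, 'SomeTask');
    -- Same schedule as the question: 1 and 3 on Tuesday, 2 on Wednesday and Friday.
    INSERT INTO weekday_task VALUES (1, 1), (2, 2), (2, 4), (3, 1);
    """
)


def tasks_for(weekday_id):
    """Return the ids of the tasks scheduled on the given weekday (0 = Monday)."""
    rows = conn.execute(
        """
        SELECT t.task_id
        FROM task t
        JOIN weekday_task wt ON wt.task_id = t.task_id
        WHERE wt.weekday_id = ?
        ORDER BY t.task_id
        """,
        (weekday_id,),
    )
    return [r[0] for r in rows]


# date.weekday() also uses 0 = Monday; 2015-03-03 was a Tuesday.
print(tasks_for(date(2015, 3, 3).weekday()))
```

Adding a task on another day is then just one more row in weekday_task, with no schema change and no CASE expression over column names.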

Graph per-day from ranges in MySQL

I am trying to make a graph that has a point for each day showing the number of horses present per-day.
This is example of data I have (MySQL)
horse_id | start_date | end_date |
1 | 2011-04-02 | 2011-04-03 |
2 | 2011-04-02 | NULL |
3 | 2011-04-04 | 2014-07-20 |
4 | 2012-05-11 | NULL |
So a graph of that data should output one row per day, starting on 2011-04-02 and ending on CURDATE(). For each day it should return how many horses are registered.
I can't quite wrap my head around how to do this, since I only have a start date and an end date for each item, and I want to know how many were present on each day.
Right now, I do a loop and a SQL query per day, but that is - as you might have guessed - thousands of queries, and I was hoping it could be done smarter.
If a day between 2011-04-02 and now contains nothing, I still want it out but with a 0.
If possible I would like to avoid having a table with a row for each day containing a count.
I hope it makes sense, I am very stuck here.
What you should have is a table containing just dates, from at least the earliest date in your current table up to the current date.
Then you can left join this table to your data, something like this:
SELECT
dt.date,
COUNT(yt.horse_id)
FROM
dates_table dt
LEFT JOIN your_table yt ON dt.date BETWEEN yt.start_date AND COALESCE(end_date, CURDATE())
GROUP BY dt.date
Be sure to put a column of your_table in the COUNT() function. COUNT(*) would also count the all-NULL row that the left join produces for a day without horses, giving 1 instead of 0.
The COALESCE() function returns the first of its parameters that isn't NULL, so if no end_date is specified, the current date is used instead.
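A quick way to try this out is the sketch below, which builds a tiny dates table and runs the left join against the sample horses from the question. It uses Python's sqlite3 module as a stand-in for MySQL; the table names dates_table and your_table come from the answer above, while the date range and the fixed "today" value (passed as a parameter in place of CURDATE()) are assumptions chosen so the output is reproducible.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table (horse_id INTEGER, start_date TEXT, end_date TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        (1, "2011-04-02", "2011-04-03"),
        (2, "2011-04-02", None),
        (3, "2011-04-04", "2014-07-20"),
        (4, "2012-05-11", None),
    ],
)

# Calendar table: one row per day in the (assumed) reporting window.
conn.execute("CREATE TABLE dates_table (date TEXT PRIMARY KEY)")
first_day, today = date(2011, 4, 1), date(2011, 4, 5)
num_days = (today - first_day).days + 1
conn.executemany(
    "INSERT INTO dates_table VALUES (?)",
    [((first_day + timedelta(days=i)).isoformat(),) for i in range(num_days)],
)

per_day = conn.execute(
    """
    SELECT dt.date, COUNT(yt.horse_id)
    FROM dates_table dt
    LEFT JOIN your_table yt
      ON dt.date BETWEEN yt.start_date AND COALESCE(yt.end_date, ?)
    GROUP BY dt.date
    ORDER BY dt.date
    """,
    (today.isoformat(),),  # stands in for CURDATE()
).fetchall()

for day, horses in per_day:
    print(day, horses)
```

The first day in the window has no horses yet and still shows up with a count of 0, which is exactly what the left join plus COUNT(yt.horse_id) is there for.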

MySQL: Return 0 if row doesn't exist

I've been bashing my head on this for a while, so now I'm here :) I'm a SQL beginner, so maybe this will be easy for you guys...
I have this query:
SELECT COUNT(*) AS counter, recur,subscribe_date
FROM paypal_subscriptions
WHERE recur='monthly' and subscribe_date > "2010-07-16" and subscribe_date < "2010-07-23"
GROUP BY subscribe_date
ORDER BY subscribe_date
The dates shown above are hard-coded; my application will supply a variable date range.
Right now the result only contains rows for dates that actually have subscriptions:
counter |recur | subscribe_date
2 | Monthly | 2010-07-18
3 | Monthly | 2010-07-19
4 | Monthly | 2010-07-20
6 | Monthly | 2010-07-22
I'd like to return 0 in the counter column if the date doesn't exist:
counter |recur | subscribe_date
0 | Monthly | 2010-07-16
0 | Monthly | 2010-07-17
2 | Monthly | 2010-07-18
3 | Monthly | 2010-07-19
4 | Monthly | 2010-07-20
0 | Monthly | 2010-07-21
6 | Monthly | 2010-07-22
0 | Monthly | 2010-07-23
Is this possible?
You will need a table of dates (new table added), and then you will have to do an outer join between that table and your query.
This question is also similar to another question. Answers can be quite similar.
Insert Dates in the return from a query where there is none
You will need a table of dates to group against. This is quite easy in MSSQL using CTEs, but I'm not sure whether MySQL has something similar.
Otherwise you will need to create a permanent table as a one-off exercise.
EDIT : Give this a try:
SELECT COUNT(pp.subscribe_date) AS counter, dl.date, MIN(pp.recur)
FROM date_lookup dl
LEFT OUTER JOIN paypal_subscriptions pp
on (pp.subscribe_date = dl.date AND pp.recur ='monthly')
WHERE dl.date >= '2010-07-16' and dl.date <= '2010-07-23'
GROUP BY dl.date
ORDER BY dl.date
The subject of the query needs to be changed to the date_lookup table
(the order of the left outer join becomes important).
COUNT(*) isn't going to work, since the 'date' record always exists; you need to count something in the PayPal table.
pp.recur = 'monthly' is now a join condition, not a filter, because of the left outer join.
Finally, showing pp.recur in the select list isn't going to work.
I've used an aggregate, but MIN(pp.recur) will return NULL if there are no PayPal records.
What you could do when you parameterize your query is simply repeat the recur type filter in the select list.
Again, please excuse the MSSQL syntax:
SELECT COUNT(pp.subscribe_date) AS counter, dl.date, @ppRecur
FROM date_lookup dl
LEFT OUTER JOIN paypal_subscriptions pp
on (pp.subscribe_date = dl.date AND pp.recur = @ppRecur)
WHERE dl.date >= @DateFrom and dl.date <= @DateTo
GROUP BY dl.date
ORDER BY dl.date
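On the CTE question: MySQL 8.0 and later do support recursive CTEs (WITH RECURSIVE), so the date list can also be generated on the fly instead of stored in a table. The sketch below shows the idea with Python's sqlite3 module as a stand-in for MySQL. The subscription rows are made up to reproduce the counts from the question, the CTE column is named day rather than date purely for readability, and in MySQL the step would be written as day + INTERVAL 1 DAY instead of SQLite's date(day, '+1 day').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE paypal_subscriptions (recur TEXT, subscribe_date TEXT)")

# Made-up rows that reproduce the per-day counts shown in the question.
monthly_counts = {"2010-07-18": 2, "2010-07-19": 3, "2010-07-20": 4, "2010-07-22": 6}
for day, n in monthly_counts.items():
    conn.executemany(
        "INSERT INTO paypal_subscriptions VALUES ('monthly', ?)", [(day,)] * n
    )
# A row with another recur type, to show that the join condition excludes it.
conn.execute("INSERT INTO paypal_subscriptions VALUES ('yearly', '2010-07-17')")

rows = conn.execute(
    """
    WITH RECURSIVE date_lookup (day) AS (
        SELECT :date_from
        UNION ALL
        SELECT date(day, '+1 day') FROM date_lookup WHERE day < :date_to
    )
    SELECT COUNT(pp.subscribe_date) AS counter, :recur AS recur, dl.day
    FROM date_lookup dl
    LEFT JOIN paypal_subscriptions pp
      ON pp.subscribe_date = dl.day AND pp.recur = :recur
    GROUP BY dl.day
    ORDER BY dl.day
    """,
    {"date_from": "2010-07-16", "date_to": "2010-07-23", "recur": "monthly"},
).fetchall()

for counter, recur, day in rows:
    print(counter, recur, day)
```

Every day in the range comes back, with 0 for the days that have no monthly subscriptions. For long ranges a real date_lookup table is probably still the simpler and faster option.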
Since there was no easy way to do this, I had to have the application fill in the blanks for me rather than have the database return the data I wanted. I do get a performance hit for this, but it was necessary for the completion of the report.
I will definitely look into making this return what I want from the DB in the near future. I'll give nonnb's solution a try.
thanks everyone!