MySQL - Abysmal Performance

I am trying to run a relatively simple query on a table that has half a million rows. It's just a small fragment I'm using to test that the values I get back are correct. The problem is that this query takes over 20 minutes to complete, which seems unusually slow even for 500,000 records.
DROP VIEW IF EXISTS view_temp_sortie_stats;
CREATE VIEW view_temp_sortie_stats AS
SELECT server_id, session_id, ucid, role, sortie_id,
(
SELECT COUNT(sortie_id)
FROM raw_gameevents_log
WHERE sortie_id = l.sortie_id
  AND server_id = l.server_id
  AND session_id = l.session_id
  AND target_player_ucid = l.ucid
  AND event = "HIT"
) AS HitsReceived
FROM raw_gameevents_log l
WHERE ucid IS NOT NULL AND sortie_id IS NOT NULL
GROUP BY server_id, session_id, ucid, role, sortie_id;
SELECT * FROM view_temp_sortie_stats;
Here is my table:
Next I tried to add indexes on server_id, session_id, and sortie_id to see if performance would improve - applying them took more than 10 minutes and timed out, so I could not add them.
This seems abnormally slow; it shouldn't take this much time to add indexes or to run this query.
My innodb_buffer_pool_size is 5 GB, yet the mysqld process only consumes 300 MB of memory while these queries run.
I am running on Windows Server 2012 R2 Standard with 12 GB RAM and 2x Intel Haswell CPUs, so I should be seeing much better performance than this from MySQL.
No one else is connected to this MySQL instance, and no other operations are occurring.
EDIT - Here is the query EXPLAIN:
Does anyone know what might be wrong?
EDIT2 - Partial Fix
After some googling I found out why the ADD INDEX was taking forever - the original query was still running in the background after more than 2 hours. Once I killed that query, the ADD INDEX took about 30 seconds.
Now when I run the above query it takes 27 seconds - a drastic improvement for sure, but that still seems pretty slow for 500,000 records. Here is the new query explain plan:

Your subquery is:
SELECT COUNT(sortie_id)
FROM raw_gameevents_log
WHERE sortie_id = l.sortie_id AND server_id = l.server_id
AND session_id = l.session_id AND target_player_ucid = l.ucid
AND event = "HIT"
and the main query is:
SELECT server_id, session_id, ucid, role, sortie_id, [...]
FROM raw_gameevents_log l
WHERE ucid IS NOT NULL AND sortie_id IS NOT NULL
GROUP BY server_id, session_id, ucid, role, sortie_id;
Let's start with the subquery. COUNT can count anything, so we don't need to worry about the selected fields. The WHERE fields:
WHERE sortie_id = l.sortie_id AND server_id = l.server_id
AND session_id = l.session_id AND target_player_ucid = l.ucid
AND event = "HIT"
You create an index beginning with the constant fields, then the others:
CREATE INDEX subqindex ON raw_gameevents_log(
event,
sortie_id, server_id, session_id, target_player_ucid
);
Then the main query:
WHERE ucid IS NOT NULL AND sortie_id IS NOT NULL
GROUP BY server_id, session_id, ucid, role, sortie_id;
Here you need an index on
ucid, sortie_id, server_id, session_id, role
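For example, spelled out as DDL (the name mainindex is just a placeholder):
CREATE INDEX mainindex ON raw_gameevents_log(
ucid, sortie_id, server_id, session_id, role
);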
Finally, you might try getting rid of the correlated subquery (even if the optimizer probably already does a good job with it). Note that the derived table has to return and group by the join columns, and the ON conditions are combined with AND:
SELECT l.server_id, l.session_id, l.ucid, l.role, l.sortie_id,
COALESCE(h.hits, 0) AS HitsReceived
FROM raw_gameevents_log l
LEFT JOIN
(
SELECT sortie_id, server_id, session_id, target_player_ucid,
COUNT(1) AS hits
FROM raw_gameevents_log
WHERE event = 'HIT'
GROUP BY sortie_id, server_id, session_id, target_player_ucid
) AS h
ON h.sortie_id = l.sortie_id
AND h.server_id = l.server_id
AND h.session_id = l.session_id
AND h.target_player_ucid = l.ucid
WHERE l.ucid IS NOT NULL AND l.sortie_id IS NOT NULL
GROUP BY l.server_id, l.session_id, l.ucid, l.role, l.sortie_id;

Related

Tips to optimize query, with many subqueries in MySQL

I have ~6 tables where I have to count or sum fields based on matching site_ids and dates. I have the following query, with many subqueries, which takes an extraordinary amount of time to run. I am certain there is an easier, more efficient way, but I am rather new to these more complex queries. I have read about optimizations, specifically using JOIN ... ON, but I am struggling to understand and implement them.
The goal is to speed this up and not bring my small server to its knees when it runs. Any assistance or direction would be VERY much appreciated!
SELECT date(date_added) as dt_date,
site_id as dt_site_id,
(SELECT site_id from branch_mappings bm WHERE mark_id_site = dt.site_id) as site_id,
(SELECT parent_id from branch_mappings bm WHERE mark_id_site = dt.site_id) as main_site_id,
(SELECT corp_owned from branch_mappings bm WHERE mark_id_site = dt.site_id) as corp_owned,
count(id) as dt_calls,
(SELECT count(date_submitted) FROM mark_unbounce ub WHERE date(date_submitted) = dt_date AND ub.site_id = dt.site_id) as ub,
(SELECT count(timestamp) FROM mark_wordpress_contact wp WHERE date(timestamp) = dt_date AND wp.site_id = dt.site_id) as wp,
(SELECT count(added_on) FROM m_shrednations sn WHERE date(added_on) = dt_date AND sn.description = dt.site_id) as sn,
(SELECT sum(users) FROM mark_ga ga WHERE date(ga.date) = dt_date AND channel LIKE 'Organic%' AND ga.site_id = dt.site_id) as ga_organic
FROM mark_dialogtech dt
WHERE site_id is not null
GROUP BY site_name, dt_date
ORDER BY site_name, dt_date;
What you're doing is the equivalent of asking your server to query 7+ different tables every time you run this query. Personally, I use joins and nested queries because I can whittle down to what I need.
The first 3 subqueries can be replaced with...
SELECT date(date_added) as dt_date,
dt.site_id as dt_site_id,
bm.site_id as site_id,
bm.parent_id as main_site_id,
bm.corp_owned as corp_owned
FROM mark_dialogtech dt
INNER JOIN branch_mappings bm
ON bm.mark_id_site = dt.site_id
I'm not sure why you are running the remaining aggregation subqueries. Is there a business requirement? If so, consider how often this needs to run and when.
If absolutely necessary, add those to the joins like...
FROM mark_dialogtech dt
INNER JOIN
(SELECT site_id, count(date_submitted) AS ub_count FROM mark_unbounce GROUP BY site_id) ub
on ub.site_id = dt.site_id
This should limit the results to only records where the site_id exists in both mark_dialogtech and mark_unbounce (or whatever table). In my experience, this method speeds things up.
Still, my concern is the number of aggregations you're performing. If they can be cached to a dashboard and pulled during slow times, that would be best.
It's hard to analyze how big your query is (no data examples), but in your case I highly recommend using CTEs (Common Table Expressions). Check this:
https://www.sqlpedia.pl/cte-common-table-expressions/
CTEs do not have a physical representation in tempdb the way temporary tables or table variables do. A CTE can be viewed as a temporary, non-materialized view. When MSSQL executes a query and encounters a CTE, it replaces the reference to that CTE with its definition. Therefore, if the CTE data is used several times in a given query, the same code will be executed several times, and MSSQL does not optimize it. So it works well only for small amounts of data, like what you want to do here.
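For example, the three branch_mappings subqueries could be folded into a CTE along these lines (just a sketch, assuming MySQL 8.0+ / MariaDB 10.2+, which support CTEs):
WITH mappings AS (
SELECT mark_id_site, site_id, parent_id, corp_owned
FROM branch_mappings
)
SELECT date(dt.date_added) as dt_date,
dt.site_id as dt_site_id,
m.site_id as site_id,
m.parent_id as main_site_id,
m.corp_owned as corp_owned,
count(dt.id) as dt_calls
FROM mark_dialogtech dt
JOIN mappings m ON m.mark_id_site = dt.site_id
WHERE dt.site_id is not null
GROUP BY dt_date, dt.site_id, m.site_id, m.parent_id, m.corp_owned;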
Appreciate all the responses.
I ended up creating a Python script to run the queries separately and insert the results into the table for the respective KPI, so I scrapped the idea of a single query due to performance. I concatenated each date and site_id to create the id, then leveraged ON DUPLICATE KEY UPDATE with each INSERT statement.
The Python dictionaries look like this, and I simply looped over them. Again, thanks for the help.
SELECT STATEMENTS (Python Dict)
"dt":"SELECT date(date_added) as dt_date, site_id as dt_site, count(site_id) as dt_count FROM mark_dialogtech WHERE site_id is not null GROUP BY dt_date, dt_site ORDER BY dt_date, dt_site;",
"ub":"SELECT date_submitted as ub_date, site_id as ub_site, count(site_id) as ub_count FROM mark_unbounce WHERE site_id is not null GROUP BY ub_date, ub_site;",
"wp":"SELECT date(timestamp) as wp_date, site_id as wp_site, count(site_id) as wp_count FROM mark_wordpress_contact WHERE site_id is not null GROUP BY wp_date, wp_site;",
"sn":"SELECT date(added_on) as sn_date, description as sn_site, count(description) as sn_count FROM m_shrednations WHERE description <> '' GROUP BY sn_date, sn_site;",
"ga":"SELECT date as ga_date, site_id as ga_site, sum(users) as ga_count FROM mark_ga WHERE users is not null GROUP BY ga_date, ga_site;"
INSERT STATEMENTS (Python Dict)
"dt":f"INSERT INTO mark_helper_rollup (id, on_date, site_id, dt_calls, added_on) VALUES ('{dbdata[0]}','{dbdata[1]}',{dbdata[2]},{dbdata[3]},'{dbdata[4]}') ON DUPLICATE KEY UPDATE dt_Calls={dbdata[3]}, added_on='{dbdata[4]}';",
"ub":f"INSERT INTO mark_helper_rollup (id, on_date, site_id, ub, added_on) VALUES ('{dbdata[0]}','{dbdata[1]}',{dbdata[2]},{dbdata[3]},'{dbdata[4]}') ON DUPLICATE KEY UPDATE ub={dbdata[3]}, added_on='{dbdata[4]}';",
"wp":f"INSERT INTO mark_helper_rollup (id, on_date, site_id, wp, added_on) VALUES ('{dbdata[0]}','{dbdata[1]}',{dbdata[2]},{dbdata[3]},'{dbdata[4]}') ON DUPLICATE KEY UPDATE wp={dbdata[3]}, added_on='{dbdata[4]}';",
"sn":f"INSERT INTO mark_helper_rollup (id, on_date, site_id, sn, added_on) VALUES ('{dbdata[0]}','{dbdata[1]}',{dbdata[2]},{dbdata[3]},'{dbdata[4]}') ON DUPLICATE KEY UPDATE sn={dbdata[3]}, added_on='{dbdata[4]}';",
"ga":f"INSERT INTO mark_helper_rollup (id, on_date, site_id, ga_organic, added_on) VALUES ('{dbdata[0]}','{dbdata[1]}',{dbdata[2]},{dbdata[3]},'{dbdata[4]}') ON DUPLICATE KEY UPDATE ga_organic={dbdata[3]}, added_on='{dbdata[4]}';",
It would be very difficult to analyze the query without the data. Anyway,
try joining the tables and grouping; that should improve the performance.
Here is a LEFT JOIN sample:
SELECT column names
FROM table1
LEFT JOIN table2
ON table1.common_column = table2.common_column;
check this for more detailed inform https://learnsql.com/blog/how-to-left-join-multiple-tables/

How to improve the execution efficiency of this SQL?

The SQL throws a timeout exception in the PRD environment.
SELECT
COUNT(*) totalCount,
SUM(IF(t.RESULT_FLAG = 'success', 1, 0)) successCount,
SUM(IF(b.ERROR_CODE = 'Y140', 1, 0)) unrecognizedCount,
SUM(IF(b.ERROR_CODE LIKE 'Y%' OR b.ERROR_CODE = 'E008', 1, 0)) connectCall,
SUM(IF(b.ERROR_CODE = 'N004', 1, 0)) hangupUnconnect,
SUM(IF(b.ERROR_CODE = 'Y001', 1, 0)) hangupConnect
FROM
lbl_his b LEFT JOIN lbl_error_code t ON b.TASK_ID = t.TASK_ID AND t.CODE = b.ERROR_CODE
WHERE
b.TASK_ID = "5f460e4ffa99f51697ad4ae3"
AND b.CREATE_TIME BETWEEN "2020-07-01 00:00:00" AND "2020-10-28 00:00:00"
The table lbl_his is super large: about 20,000,000 rows of data occupying 20 GB of disk.
The table lbl_error_code is small: only 305 rows.
The indexes of table lbl_his:
TASK_ID
UPDATE_TIME
CREATE_TIME
RECORD_ID
The composite indexes of table lbl_his:
TASK_ID, ERROR_CODE, UPDATE_TIME
TASK_ID, CREATE_TIME
There are no indexes created on table lbl_error_code.
I ran EXPLAIN SELECT and found the SQL hit the lbl_his TASK_ID index and the lbl_error_code primary key.
How can I avoid the execution timeout?
For an index solution on lbl_his, try putting a non-clustered index on
firstly the things you filter on by exact match
then the things you filter on as ranges (or inexact matches)
e.g., the initial part of the index should be TASK_ID then CREATE_TIME. Putting these first is very important as it means the engine can do one seek to get the data.
Then include any other fields in use (either as part of index, or includes - doesn't matter) - in this case, ERROR_CODE. This makes your index a covering index.
Therefore your final new non-clustered index on lbl_his should be (TASK_ID, CREATE_TIME, ERROR_CODE)
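In MySQL DDL that would be something like (the index name is just a placeholder):
CREATE INDEX idx_task_create_error ON lbl_his (TASK_ID, CREATE_TIME, ERROR_CODE);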

Performance issue on query with math calculations

This my query with its performance (slow_query_log):
SELECT j.`offer_id`, o.`offer_name`, j.`success_rate`
FROM
(
SELECT
t.`offer_id`,
(
SUM(CASE WHEN `offer_id` = t.`offer_id` AND `sales_status` = 'SUCCESS' THEN 1 ELSE 0 END) / COUNT(*)
) AS `success_rate`
FROM `tblSales` AS t
WHERE DATE(t.`sales_time`) = CURDATE()
GROUP BY t.`offer_id`
ORDER BY `success_rate` DESC
) AS j
LEFT JOIN `tblOffers` AS o
ON j.`offer_id` = o.`offer_id`
LIMIT 5;
# Time: 180113 18:51:19
# User@Host: root[root] @ localhost [127.0.0.1] Id: 71
# Query_time: 10.472599 Lock_time: 0.001000 Rows_sent: 0 Rows_examined: 1156134
Here, tblOffers has all the offers listed, and tblSales contains all the sales. What I am trying to find out is the top-selling offers, based on the success rate (i.e. those sales which are SUCCESS).
The query works fine and provides the output I need, but it appears to be a bit slow.
offer_id and sales_status are already indexed in tblSales. So do you have any suggestions for improving the inner query (where it calculates the success rate) so that performance can be improved? I have been playing with the math for more than 2 hours but couldn't find a better way.
By the way, tblSales has lots of data. It contains sales which are SUCCESSFUL, FAILED, PENDING, etc.
Thank you
EDIT
As requested, I am including the table design (only relevant fields are included):
tblSales
`sales_id` bigint UNSIGNED NOT NULL AUTO_INCREMENT,
`offer_id` bigint UNSIGNED NOT NULL DEFAULT '0',
`sales_time` DATETIME NOT NULL DEFAULT '0000-00-00 00:00:00',
`sales_status` ENUM('WAITING', 'SUCCESS', 'FAILED', 'CANCELLED') NOT NULL DEFAULT 'WAITING',
PRIMARY KEY (`sales_id`),
KEY (`offer_id`),
KEY (`sales_status`)
There are some other fields also in this table, that holds some other info. Amount, user_id, etc. which are not relevant for my question.
Numerous 'problems', none of which involve "math".
JOINs make things difficult. LEFT JOIN says "I don't care whether the row exists in the 'right' table" (I suspect you don't need LEFT??). But it also says "there may be multiple rows in the right table". Based on the column names, I will guess that there is only one offer_name for each offer_id. If this is correct, then here is my first recommendation. (This will convince the Optimizer that there is no issue with the JOIN.) Change from
SELECT ..., o.offer_name, ...
LEFT JOIN `tblOffers` AS o ON j.`offer_id` = o.`offer_id`
...
to
SELECT ...,
( SELECT offer_name FROM tbloffers WHERE offer_id = j.offer_id
) AS offer_name, ...
It also gets rid of a bug wherein you are assuming that the inner ORDER BY will be preserved for the LIMIT. This used to be the case, but in newer versions of MariaDB / MySQL, it is not. The ORDER BY in a "derived table" (your subquery) is now ignored.
2 down, a few more to go.
"Don't hide an indexed column in a function." I am referring to DATE(t.sales_time) = CURDATE(). Assuming you have no sales_time values for the 'future', then that test can be changed to t.sales_time >= CURDATE(). If you really need to restrict to just today, then do this:
AND sales_time >= CURDATE()
AND sales_time < CURDATE() + INTERVAL 1 DAY
The ORDER BY and the LIMIT should usually be put together. In your case, you may as well add the LIMIT to the "derived table", thereby leading to only 5 rows for the outer query to work with. But... There is still the question of getting them sorted correctly. So change from
SELECT ...
FROM ( SELECT ...
ORDER BY ... )
LIMIT ...
to
SELECT ...
FROM ( SELECT ...
ORDER BY ...
LIMIT 5 ) -- trim sooner
ORDER BY ... -- deal with the loss of ordering from derived table
Rolling it all together, I have
SELECT j.`offer_id`,
( SELECT offer_name
FROM tbloffers
WHERE offer_id = j.offer_id
) AS offer_name,
j.`success_rate`
FROM
( SELECT t.`offer_id`,
AVG(t.sales_status = 'SUCCESS') AS `success_rate`
FROM `tblSales` AS t
WHERE t.sales_time >= CURDATE()
GROUP BY t.`offer_id`
ORDER BY `success_rate` DESC
LIMIT 5
) AS j
ORDER BY `success_rate` DESC;
(I took the liberty of shortening the SUM(...) in two ways.)
Now for the indexes...
tblSales needs at least (sales_time), but let's go for a "covering" (with sales_time specifically first):
INDEX(sales_time, sales_status, offer_id)
If tbloffers has PRIMARY KEY(offer_id), then no further index is worth adding. Else, add this covering index (in this order):
INDEX(offer_id, offer_name)
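Spelled out as DDL, those two would look something like this (index names are placeholders):
CREATE INDEX idx_sales_cover ON tblSales(sales_time, sales_status, offer_id);
CREATE INDEX idx_offers_cover ON tbloffers(offer_id, offer_name);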
(Apologies to other Answerers; I stole some of your ideas.)
Here, tblOffers has all the offers listed, and tblSales contains all the sales. What I am trying to find out is the top-selling offers, based on the success rate (i.e. those sales which are SUCCESS).
Approach this with a simple JOIN and GROUP BY:
SELECT s.offer_id, o.offer_name,
AVG(s.sales_status = 'SUCCESS') as success_rate
FROM tblSales s JOIN
tblOffers o
ON o.offer_id = s.offer_id
WHERE s.sales_time >= CURDATE() AND
s.sales_time < CURDATE() + INTERVAL 1 DAY
GROUP BY s.offer_id, o.offer_name
ORDER BY success_rate DESC;
Notes:
The use of date arithmetic allows the query to make use of an index on tblSales(sales_time) -- or better yet, tblSales(sales_time, offer_id, sales_status).
The arithmetic for success_rate has been simplified -- although this has minimal impact on performance.
I added offer_name to the GROUP BY. If you are learning SQL, you should always have all the unaggregated keys in the GROUP BY clause.
A LEFT JOIN is only needed if you have offers in tblSales which are not in tblOffers. I am guessing you have proper foreign key relationships defined, and this is not the case.
Based on the limited information you have provided (I mean the table schema), you could try the following.
SELECT `o`.`offer_id`, `o`.`offer_name`, SUM(CASE WHEN `t`.`sales_status` = 'SUCCESS' THEN 1 ELSE 0 END) AS `success_rate`
FROM `tblOffers` `o`
INNER JOIN `tblSales` `t`
ON `o`.`offer_id` = `t`.`offer_id`
WHERE DATE(`t`.`sales_time`) = CURDATE()
GROUP BY `o`.`offer_id`
ORDER BY `success_rate` DESC
LIMIT 0,5;
You can find a sample of this query in this SQL Fiddle example
Without knowing your schema, the lowest hanging fruit I see is this part....
WHERE DATE(t.`sales_time`) = CURDATE()
Try changing that to a range on the raw column, from midnight of the current date up to (but not including) midnight of the next day:
WHERE t.sales_time >= CURDATE()
AND t.sales_time < CURDATE() + INTERVAL 1 DAY

Stored procedure is very slow when using COUNT DISTINCT

When I run the stored procedure for the first time, it is very slow; the process takes about 1 minute. When I run it again, it takes 10 seconds.
Following is my main SQL statement; please help me check it out. Thank you very much!
example 1
SELECT sql_no_cache view_address.is_facility,
count(DISTINCT view_address.provider_id) as totalCount
FROM pv_mview_provider_address view_address
WHERE view_address.network_group_id = 5047
AND view_address.carrier_group_id = 93
GROUP BY view_address.is_facility;
explain:
example 2:
SELECT SQL_NO_CACHE is_facility, count(distinct provider_id)
FROM (
SELECT view_address.provider_id, view_address.is_facility
FROM pv_mview_provider_address view_address
WHERE view_address.network_group_id = 5047
AND view_address.carrier_group_id = 93
) as p
GROUP BY is_facility;
explain:
This SQL takes 10 seconds to load the data.
The table stores 40,000,000 rows.
Thank you very much!
For this query:
select sql_no_cache a.is_facility,
count(distinct a.provider_id) as totalCount
from pv_mview_provider_address a
where a.network_group_id = 5047 and
a.carrier_group_id = 93
group by a.is_facility;
You want an index. The best index is pv_mview_provider_address(network_group_id, carrier_group_id, is_facility). However, if the reference in the from clause is a view and not a table, then you need to figure out what is happening with the view.
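Assuming pv_mview_provider_address is a base table rather than a view, the DDL would be along these lines (the index name is a placeholder):
CREATE INDEX idx_group_facility ON pv_mview_provider_address
(network_group_id, carrier_group_id, is_facility);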

Indexes don't affect execution time in MS SQL 2014 vs MySQL (MariaDB 10)

I'm porting a statistics analyzer system from MySQL (MariaDB 10) to MS SQL 2014, and I found a strange thing. Normally I use single- and multi-field indexes for most operations: the statistics database holds about 60 million events on a 4-core PC, and analysis includes funnels, event segmentation, cohort analysis, KPIs and more, so it can be slow sometimes.
But I was quite surprised when I executed several query sequences on MS SQL and then removed all indexes (except the main clustered id): execution time actually decreased! I restarted the server (cache cleared), but after each restart the result was similar - my queries work faster without indexes (actually the speed is the same, but no time is spent on manual index creation).
I suppose MS SQL creates implicit indexes for me, but in that case it looks like I should remove all index creation from my queries? In MySQL you can clearly see that adding indexes really works. Does this MS SQL behaviour mean that I don't need to care about indexes anymore? I've made several tests with my queries and it seems that indexes almost don't affect execution time. The last time I dealt with MS SQL was long ago, and it was MS SQL 2000, so maybe MSFT developed f**n' AI during the last 15 years? :)
Just in case, the test SQL code (generated by the back-end for the front-end) is below.
In short, it produces graph data for a particular type of event for the last 3 months over time, then does segmentation by one parameter. It creates a temp table from the main events table with user-set constraints (time period, parameters), creates several more temp tables and indexes, does several joins, and returns the final select result:
select min(tmstamp), max(tmstamp)
from evt_db.dbo.events
where ( ( source = 3 )
and ( event_id=24 )
and tmstamp > 1451606400
AND tmstamp < 1458000000
);
select min(param1), max(param1), count(DISTINCT(param1))
from evt_db.dbo.events
WHERE ( ( source = 3 )
AND ( event_id=24 )
AND tmstamp > 1451606400
AND tmstamp < 1458000000
);
create table #_tmp_times_calc_analyzer_0_0 (
tm_start int,
tm_end int,
tm_origin int,
tm_num int
);
insert into #_tmp_times_calc_analyzer_0_0 values
( 1451606400, 1452211200, 1451606400, 0 ),
( 1452211200, 1452816000, 1452211200, 1 ),
( 1452816000, 1453420800, 1452816000, 2 ),
( 1453420800, 1454025600, 1453420800, 3 ),
( 1454025600, 1454630400, 1454025600, 4 ),
( 1454630400, 1455235200, 1454630400, 5 ),
( 1455235200, 1455840000, 1455235200, 6 ),
( 1455840000, 1456444800, 1455840000, 7 ),
( 1456444800, 1457049600, 1456444800, 8 ),
( 1457049600, 1457654400, 1457049600, 9 ),
( 1457654400, 1458259200, 1457654400, 10 );
And...
CREATE INDEX tm_num ON #_tmp_times_calc_analyzer_0_0 (tm_num);
SELECT id, t1.uid, tmstamp, floor((tmstamp - 1451606400) / 604800) period_num,
param1 into #_tmp_events_view_analyzer_0_0
FROM evt_db.dbo.events t1
WHERE ( ( source = 3 )
AND ( event_id=24 )
AND tmstamp > 1451606400
AND tmstamp < 1458000000
);
CREATE INDEX uid ON #_tmp_events_view_analyzer_0_0 (uid);
CREATE INDEX period_num ON #_tmp_events_view_analyzer_0_0 (period_num);
CREATE INDEX tmstamp ON #_tmp_events_view_analyzer_0_0 (tmstamp);
CREATE INDEX _index_param1 ON #_tmp_events_view_analyzer_0_0 (param1);
create table #_tmp_median_analyzer_0_0 (ts int );
insert into #_tmp_median_analyzer_0_0
select distinct(param1) v
from #_tmp_events_view_analyzer_0_0
where param1 is not null
order by v ;
select tm_origin, count(distinct uid), count(distinct id)
from #_tmp_times_calc_analyzer_0_0
left join #_tmp_events_view_analyzer_0_0 ON period_num = tm_num
GROUP BY tm_origin;
select top 600 (param1) seg1, count(distinct uid), count(distinct id)
from #_tmp_events_view_analyzer_0_0
GROUP BY param1
order by 1 asc;
And...
select seg1, tm_origin, count(distinct uid), count(distinct id)
from
( SELECT (param1) seg1, tm_origin, uid, id
from #_tmp_times_calc_analyzer_0_0
left join #_tmp_events_view_analyzer_0_0 ON period_num = tm_num
group by param1, tm_origin, uid, id
) t
GROUP BY seg1, tm_origin;
select min(param1), max(param1), round(avg(param1),0)
from #_tmp_events_view_analyzer_0_0;
DECLARE @c BIGINT = (SELECT COUNT(*) FROM #_tmp_median_analyzer_0_0);
SELECT round(AVG(1.0 * ts),0)
FROM
( SELECT ts
FROM #_tmp_median_analyzer_0_0
ORDER BY ts OFFSET (@c - 1) / 2 ROWS
FETCH NEXT 1 + (1 - @c % 2) ROWS ONLY
) AS median_val;
evt_db.dbo.events needs INDEX(source, event_id, tmstamp), with tmstamp 3rd. In the case of MySQL, those first 2 SELECTs will run entirely in the index (because it is a "covering" index). source and event_id can be in either order.
Later, you have a similar SELECT but it also has id and t1.uid. You could make a covering index for it: INDEX(source, event_id, tmstamp, uid, id). Again, tmstamp must be third in the list.
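In T-SQL, that second covering index could be written something like this (the name and the use of INCLUDE instead of key columns are my choices, not the only option):
CREATE NONCLUSTERED INDEX idx_events_cover
ON evt_db.dbo.events (source, event_id, tmstamp)
INCLUDE (uid, id);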
select top 600 (param1) seg1, count(distinct uid), count(distinct id) ... might benefit from INDEX(param1, uid, id), where param1 must be first.
The other indexes you list are possibly not useful at all. What indexes did you try?
One difference between MySQL and other Databases -- MySQL almost never uses more than one index in a query. And, in my experience, MySQL's choice is 'wise'. Perhaps MSSql is trying too hard to use two indexes, when simply scanning the table would be less work.