What is a better way to process data? - json

I have a table with around 10 million rows and 47 columns in Oracle. I do some processing on the data before converting it into JSON and sending it to the view layer. The processing is mostly SELECTs grouping by various columns. This SELECT processing is done 5 times, each time with differently grouped columns. This is taking a lot of time. Is there any way to speed up the process?
I was thinking about pumping the data from the table into a CSV file, processing it, and then converting it into JSON to send. Am I thinking in the right direction? Please help.
The 5 queries I use are below for better understanding.
select sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*)
from tablename where (TIME_STAMP between 'startTime' and 'endTime');
select column2, sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*)
from tablename where (TIME_STAMP between 'startTime' and 'endTime') group by column2;
select column2, column3, sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*)
from tablename where (TIME_STAMP between 'startTime' and 'endTime') group by column2, column3;
select column4, column3, sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*)
from tablename where (TIME_STAMP between 'startTime' and 'endTime') group by column4, column3;
select column5, column4, column3, sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*)
from tablename where (TIME_STAMP between 'startTime' and 'endTime') group by column5, column4, column3;
The result sets are combined into a JSON document and sent to the view layer.
EDIT1: There are going to be multiple connections (5-20) to this database, each connection executing these same queries.
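For reference, these five groupings could in principle be computed in a single pass over the table using Oracle's GROUPING SETS; the following is only a sketch against the placeholder column names used above:
select column2, column3, column4, column5,
       sum(case when LOWER(column1) LIKE 'succeeded' then 1 else 0 end)/count(*) as success_ratio
from tablename
where (TIME_STAMP between 'startTime' and 'endTime')
group by grouping sets (
    (),                          -- overall ratio (query 1)
    (column2),                   -- query 2
    (column2, column3),          -- query 3
    (column4, column3),          -- query 4
    (column5, column4, column3)  -- query 5
);
The GROUPING() function can be used to tell which grouping set each result row belongs to.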

Related

How to fetch multiple fields from two tables in a SQL query

There are two tables: one is the egg table and the other is the rate_disabled table.
Below I have shared a screenshot so you can understand.
I want to fetch the egg table data for every field whose value is greater than 0 and whose rate_status is not disabled.
The output should look like this:
desi_egg = 108, small_egg = 55
(Only two fields should be returned, because the double_keshar_egg and medium_egg rates are not greater than 0 and large_egg's rate_status is disabled.)
Here merchant_id is common to both tables.
Does anyone have an idea how to solve this problem using a SQL or HQL query?
I am using a MySQL database.
What you are asking for requires a somewhat cumbersome query like this:
select concat_ws(', ',
       (case when e.desi_egg > 0 and
                  not exists (select 1
                              from testdb.rate_disabled rd
                              where rd.merchant_id = e.merchant_id and
                                    rd.productName = 'desi_egg'
                             )
             then concat('desi_egg=', e.desi_egg)
        end),
       (case when e.double_kesher_egg > 0 and
                  not exists (select 1
                              from testdb.rate_disabled rd
                              where rd.merchant_id = e.merchant_id and
                                    rd.productName = 'double_kesher_egg'
                             )
             then concat('double_kesher_egg=', e.double_kesher_egg)
        end),
       . . .
      ) as all_my_eggs
from testdb.egg e;

How to use CASE to select MAX(date) WHEN active=1? (without subquery)

I'm trying to optimize some code. If this is possible it's not only more elegant, but it would also save me from running several other queries to get the same data and would speed up my while loop considerably.
How would I use CASE to select the MAX(date) where the flag is also 1, from a dataset like this?
0 2020-06-30
0 2020-06-26
1 2020-06-25 <---- I want this guy
0 2020-06-24
0 2020-06-24
0 2020-06-23
0 2020-06-22
0 2020-06-22
0 2020-06-16
0 2020-06-16
0 2020-06-12
1 2020-06-12
0 2020-06-11
0 2020-06-01
0 2020-06-01
I tried something like this, but obviously this doesn't work.
CASE
WHEN aty.type_count = '1' AND ac.activity_date = MAX(ac.activity_date)
THEN ac.activity_date
ELSE 0
END
AS max_date_active
I can't just sort by both columns as sometimes there are no 1 results. I guess I could make the result set a query, but I am running other SUM(CASE())'s on the same data set, so I'm trying to make it all work together as a single, elegant query.
Any ideas?
EDIT: I updated the title to "without subquery", as once I'm using a subquery I might as well just create a separate query to get the results. I'm currently thinking I'll just get the entire data set back and figure out what I want using a PHP loop. Not as elegant, but at least it saves several complex joined queries.
A LIMIT query might be the easiest option here:
SELECT *
FROM yourTable
WHERE type_count = '1'
ORDER BY activity_date DESC
LIMIT 1;
If there might be more than one record with a type count of 1, tied for the latest date, then we can use a subquery:
SELECT *
FROM yourTable t1
WHERE
type_count = '1' AND
activity_date = (SELECT MAX(activity_date) FROM yourTable WHERE type_count = '1');
As far as I can tell it's not possible to do exactly what I wanted. Subqueries are possible but if I'm processing the query twice inside itself I'd rather handle them separately.
In the end I just kept the result set shown in my question, and then did a basic loop in PHP to extract the info I wanted.

Return a row even if no value is found in the table (MySQL)

I have a select where I am trying to return a row even if there is nothing to be found by the select.
Here is the select:
select
1 as risk_management,
0 as Borrow,
0 as Interest,
IFNULL(d.symbol,'E') as symbol,
IFNULL(d.Abbreviation,'EUR') as Abbreviation,
IFNULL(sum(round((a.amount_financed - a.amount_invested - a.amount_withdrawn) * i.average_rate / j.average_rate, 2)),0) as LendingOffers,
IFNULL( min(a.Interest),0) as InterestLend,
0 as VolumePerDay,
0 as LatestId,
0 as InterestLatestRealized,
0 as InterestBorrowLow,
IFNULL(max(a.Interest),0) as InterestLendHigh
from market_cap a
where ........more statements here...
But when I run this select I still get nothing returned.
I would like MySQL to generate a row that has 0 for the numbers, and 'E' and 'EUR' if the value is missing. I thought IFNULL would work for that after searching other Stack Overflow questions, but it's not working in my case.
Since I don't have your data I cannot test the query for you, but I can demonstrate you the basic idea.
You need to create a buffer table with your default data as the main subselect of your query. In my example, it is called "dv" as in "default values".
The query which fetches the real values is also a subquery in the from clause. In my example, it is called "rv" as in "real values".
I use a left (outer) join to combine both select statements, with a condition which is always true (on 1 = 1).
Therefore, when the query which fetches the real values cannot find any results, we can still use the values in the default table.
select
IFNULL(rv.risk_management, dv.risk_management) as risk_management,
IFNULL(rv.Borrow, dv.Borrow) as Borrow,
IFNULL(rv.Interest, dv.Interest) as Interest,
IFNULL(rv.symbol, dv.symbol) as symbol,
IFNULL(rv.Abbreviation, dv.Abbreviation) as Abbreviation,
IFNULL(rv.LendingOffers, dv.LendingOffers) as LendingOffers,
IFNULL(rv.InterestLend, dv.InterestLend) as InterestLend,
IFNULL(rv.VolumePerDay, dv.VolumePerDay) as VolumePerDay,
IFNULL(rv.LatestId, dv.LatestId) as LatestId,
IFNULL(rv.InterestLatestRealized, dv.InterestLatestRealized) as InterestLatestRealized,
IFNULL(rv.InterestBorrowLow, dv.InterestBorrowLow) as InterestBorrowLow,
IFNULL(rv.InterestLendHigh, dv.InterestLendHigh) as InterestLendHigh
from (
select
1 as risk_management,
0 as Borrow,
0 as Interest,
'E' as symbol,
'EUR' as Abbreviation,
0 as LendingOffers,
0 as InterestLend,
0 as VolumePerDay,
0 as LatestId,
0 as InterestLatestRealized,
0 as InterestBorrowLow,
0 as InterestLendHigh
) as dv
LEFT JOIN (
select
-- the constant columns are copied from the original query
1 as risk_management,
0 as Borrow,
0 as Interest,
d.symbol,
d.Abbreviation,
sum(round((a.amount_financed - a.amount_invested - a.amount_withdrawn) * i.average_rate / j.average_rate, 2)) as LendingOffers,
min(a.Interest) as InterestLend,
0 as VolumePerDay,
0 as LatestId,
0 as InterestLatestRealized,
0 as InterestBorrowLow,
max(a.Interest) as InterestLendHigh
from market_cap a
where ........more statements here...
) AS rv
ON 1 = 1
Good success and have a nice day, Masnad Nihit.

Why is my SQL query slow?

I am trying to create a view which joins 4 tables (tb_user has 200 rows, tb_transaction has 250,000 rows, tb_transaction_detail has 250,000 rows, tb_ms_location has 50 rows).
When I render it with DataTables server-side it takes 13 seconds, even when I filter it.
I don't know why it takes so long...
Here is my SQL query:
CREATE VIEW `vw_cashback` AS
SELECT
`tb_user`.`nik` AS `nik`,
`tb_user`.`full_name` AS `nama`,
`tb_ms_location`.`location_name` AS `lokasi`,
`tb_transaction`.`date_transaction` AS `tanggal_setor`,
sum(CASE WHEN `tb_transaction_detail`.`vehicle_type`=1 THEN 1 ELSE 0 END) AS `mobil`,
sum(CASE WHEN `tb_transaction_detail`.`vehicle_type`=2 THEN 1 ELSE 0 END) AS `motor`,
sum(CASE WHEN `tb_transaction_detail`.`vehicle_type`=3 THEN 1 ELSE 0 END) AS `truck`,
sum(CASE WHEN `tb_transaction_detail`.`vehicle_type`=4 THEN 1 ELSE 0 END) AS `speda`,
sum(`tb_transaction_detail`.`total`) AS `total_global`,
(sum(`tb_transaction_detail`.`total`) * 0.8) AS `total_user`,
(sum(`tb_transaction_detail`.`total`) * 0.2) AS `total_tgr`,
((sum(`tb_transaction_detail`.`total`) * 0.2) / 2) AS `total_cashback`,
(curdate() - cast(`tb_user`.`created_at` AS date)) AS `status`
FROM `tb_user`
JOIN `tb_transaction` ON `tb_user`.`id` = `tb_transaction`.`user_id`
JOIN `tb_transaction_detail` ON `tb_transaction`.`id` = `tb_transaction_detail`.`transaction_id`
JOIN `tb_ms_location` ON `tb_ms_location`.`id` = `tb_transaction`.`location_id`
GROUP BY
`tb_user`.`id`,
`tb_transaction`.`date_transaction`,
`tb_user`.`nik`,
`tb_user`.`full_name`,
`tb_user`.`created_at`,
`tb_ms_location`.`location_name`
thanks
The unfiltered query must be slow, because it takes all records from all tables, joins and aggregates them.
But you say the view is still slow when you filter. The question is: How do you filter? As you are aggregating by user, location and transaction date, it should be one of these. However, you don't have the user ID or the transaction ID in your result list. This doesn't feel natural and I'd suggest you add them, so a query like
select * from vw_cashback where user_id = 5
or
select * from vw_cashback where transaction_id = 12345
would be possible.
As is, you'd have to filter by location name or by the user's nik / name. If that is what you want, then create indexes for the lookups:
CREATE INDEX idx_location_name ON tb_ms_location(location_name, id);
CREATE INDEX idx_user_name ON tb_user(full_name, id);
CREATE INDEX idx_user_nik ON tb_user(nik, id);
The latter two can even be turned into covering indexes (i.e. indexes containing all columns used in the query) that may speed up the process further:
CREATE INDEX idx_user_name ON tb_user(full_name, id, nik, created_at);
CREATE INDEX idx_user_nik ON tb_user(nik, id, full_name, created_at);
If you access the view via the IDs instead, you may also want covering indexes:
CREATE INDEX idx_location_id ON tb_ms_location(id, location_name);
CREATE INDEX idx_user_id ON tb_user(id, nik, full_name, created_at);
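Regarding the suggestion to expose the IDs in the view, here is a minimal sketch: the view from the question with the user ID added (CREATE OR REPLACE is used for convenience; transaction_id would additionally have to go into the GROUP BY and change the granularity, so it is left out here):
CREATE OR REPLACE VIEW `vw_cashback` AS
SELECT
  `tb_user`.`id` AS `user_id`,  -- added: tb_user.id is already in the GROUP BY, so exposing it is safe
  `tb_user`.`nik` AS `nik`,
  `tb_user`.`full_name` AS `nama`,
  `tb_ms_location`.`location_name` AS `lokasi`,
  `tb_transaction`.`date_transaction` AS `tanggal_setor`,
  SUM(CASE WHEN `tb_transaction_detail`.`vehicle_type` = 1 THEN 1 ELSE 0 END) AS `mobil`,
  SUM(CASE WHEN `tb_transaction_detail`.`vehicle_type` = 2 THEN 1 ELSE 0 END) AS `motor`,
  SUM(CASE WHEN `tb_transaction_detail`.`vehicle_type` = 3 THEN 1 ELSE 0 END) AS `truck`,
  SUM(CASE WHEN `tb_transaction_detail`.`vehicle_type` = 4 THEN 1 ELSE 0 END) AS `speda`,
  SUM(`tb_transaction_detail`.`total`) AS `total_global`,
  SUM(`tb_transaction_detail`.`total`) * 0.8 AS `total_user`,
  SUM(`tb_transaction_detail`.`total`) * 0.2 AS `total_tgr`,
  SUM(`tb_transaction_detail`.`total`) * 0.2 / 2 AS `total_cashback`,
  (CURDATE() - CAST(`tb_user`.`created_at` AS DATE)) AS `status`
FROM `tb_user`
JOIN `tb_transaction` ON `tb_user`.`id` = `tb_transaction`.`user_id`
JOIN `tb_transaction_detail` ON `tb_transaction`.`id` = `tb_transaction_detail`.`transaction_id`
JOIN `tb_ms_location` ON `tb_ms_location`.`id` = `tb_transaction`.`location_id`
GROUP BY
  `tb_user`.`id`,
  `tb_transaction`.`date_transaction`,
  `tb_user`.`nik`,
  `tb_user`.`full_name`,
  `tb_user`.`created_at`,
  `tb_ms_location`.`location_name`;

-- Filtering by user ID then works directly:
SELECT * FROM vw_cashback WHERE user_id = 5;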

Searching a large (6 million row) MySQL table with stored queries?

I have a database with roughly 6 million entries - and it will grow - where I'm running queries to return data for HighCharts charting functionality. I need to read longitudinally over the years, so I'm running queries like this:
foreach($states as $state_id) { //php code
SELECT //MySQL pseudocode
sum(case when mytable.Year = '2003' then 1 else 0 end) Year_2003,
sum(case when mytable.Year = '2004' then 1 else 0 end) Year_2004,
sum(case when mytable.Year = '2005' then 1 else 0 end) Year_2005,
sum(case when mytable.Year = '2006' then 1 else 0 end) Year_2006,
sum(case when mytable.Year = '2007' then 1 else 0 end) Year_2007,
sum(case when mytable.Year = '$more_years' then 1 else 0 end) Year_$whatever_year
FROM mytable
WHERE State='$state_id'
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
} //end php code
But I run this for various states at once... so it returns, let's say, 5 states, each with the above statement but with a different state ID substituted. Meanwhile the years can be any number of years, and the Sex (male/female/other), Age segment, and other modifiers keep changing based on filters. The queries are long (at minimum 30-40 seconds) a piece. So a thought I had - unless I'm totally doing it wrong - is to actually store the above query in a second table with its results, first check that "meta query" to see if it was "cached", and then return the results without reading the db (which won't be updated very often); a sketch of this idea follows the table structure below.
Is this a good method or are there potential problems I'm not seeing?
EDIT: changed to table, not db (duh).
Table structure is:
id | Year | Sex | Age_segment | Another_filter | Etc
Nothing more complicated than that and no joining anything else. There are keys on id, Year, Sex, and Age_segment right now.
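For what it's worth, a minimal sketch of the caching idea described above - the table and column names here are illustrative, not part of the actual schema:
CREATE TABLE query_cache (
    cache_key   VARCHAR(64) NOT NULL PRIMARY KEY,  -- e.g. MD5 of the filter combination
    result_json TEXT NOT NULL,                     -- the aggregated per-year counts, serialized
    created_at  DATETIME NOT NULL
);

-- Before running the expensive aggregation, check whether this filter combination was already computed:
SELECT result_json
FROM query_cache
WHERE cache_key = MD5('state=5&sex=0,1&age_segment=1,2,3,4,5&years=2003-2007');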
Proper indexing is what is needed to speed up the query. Start by doing an "EXPLAIN" on the query and post the results here.
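For example, a minimal sketch against one state's filters (the values here are illustrative):
EXPLAIN SELECT
    sum(case when Year = '2003' then 1 else 0 end) Year_2003,
    sum(case when Year = '2004' then 1 else 0 end) Year_2004
FROM mytable
WHERE State = '5'
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1);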
I would suggest the following to start off. This approach avoids the foreach loop and returns the data in one query. Not knowing the number of rows or the cardinality of each column, I suggest a composite index on State and Year.
SELECT mytable.State, mytable.Year, count(*)
FROM mytable
WHERE Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
AND other_filters IN (etc, etc, etc)
GROUP BY mytable.State, mytable.Year
The above query can be further optimised by checking the cardinality of some of the columns. Run the following to get the cardinality:
SELECT Age_segment FROM mytable GROUP BY Age_segment;
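A sketch of the composite index suggested above (the index name is illustrative):
-- Composite index on State and Year to support the per-state, per-year aggregation
CREATE INDEX idx_state_year ON mytable (State, Year);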
Pseudo code...
SELECT Year
, COUNT(*) total
FROM my_its_not_a_database_its_a_table
WHERE State = $state_id
AND Sex IN (0,1)
AND Age_segment IN (5,4,3,2,1)
GROUP
BY Year;