Combine 2 SELECTs into one SELECT in my RAILS application - mysql

I have a table called ORDEREXECUTIONS that stores all orders that have been executed. It's a multi currency application hence the table has two columns CURRENCY1_ID and CURRENCY2_ID.
To get the totals of all orders for a specific currency pair (e.g. EUR/USD) I need two lines:
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c1,c2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c2,c1,Time.now()-24.hours).sum("unitprice*quantity").to_d
Note that my SUM() formula differs depending on the order of the currencies.
E.g. if I want the total ordered quantity for the currency pair USD/EUR, it executes the following (assuming the currency ID for USD is 1 and for EUR is 2):
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",1,2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",2,1,Time.now()-24.hours).sum("unitprice*quantity").to_d
How do I write this in RoR so that it triggers only one single SQL statement to MySQL?

I guess this would do:
v = Orderexecution.where(
  "is_master=1
  and ( (currency1_id, currency2_id) = (?,?)
  or (currency1_id, currency2_id) = (?,?)
  )
  and created_at>=?",
  c1, c2, c2, c1, Time.now()-24.hours
).sum(
  # sum takes one SQL expression and no bind values, so c1 is interpolated
  # after being forced to an integer
  "CASE WHEN currency1_id=#{c1.to_i}
  THEN quantity
  ELSE unitprice*quantity
  END"
).to_d

So you could do
SELECT SUM(IF(currency1_id = 1 and currency2_id = 2, quantity,0)) as quantity,
SUM(IF(currency2_id = 1 and currency1_id = 2, unitprice * quantity,0)) as unitprice_quantity from orderexecutions
WHERE created_at > ? and (currency1_id = 1 or currency1_id = 2)
If you plug that into find_by_sql you should get one object back, with 2 attributes, quantity and unitprice_quantity (they won't show up in the output of inspect in the console but they should be there if you inspect the attributes hash or call the accessor methods directly)
But depending on your indexes, that might actually be slower, because it might not be able to use indexes as efficiently. The seemingly redundant condition on currency1_id means that this query can use an index on [currency1_id, created_at]. Do benchmark before and after: sometimes 2 fast queries are better than one slow one!
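As a rough illustration of the single-statement idea, here is a minimal sketch that runs a CASE-based SUM against an in-memory SQLite database from Python. The table layout, the sample rows and the helper name pair_volume are assumptions made for the example, and the created_at filter is left out to keep it short; it is not the Rails code itself.

```python
import sqlite3

# Hypothetical, simplified copy of the orderexecutions table (no created_at).
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE orderexecutions (
    is_master INTEGER, currency1_id INTEGER, currency2_id INTEGER,
    quantity REAL, unitprice REAL)""")
conn.executemany(
    "INSERT INTO orderexecutions VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, 2, 10.0, 1.5),   # pair 1/2
        (1, 2, 1, 4.0, 2.0),    # pair 2/1 (reverse direction)
        (1, 1, 3, 100.0, 1.0),  # different pair, ignored
        (0, 1, 2, 50.0, 1.0),   # not a master row, ignored
    ],
)


def pair_volume(conn, c1, c2):
    """Total for the pair in both directions, computed by one SELECT."""
    sql = """
        SELECT COALESCE(SUM(CASE WHEN currency1_id = ?
                                 THEN quantity
                                 ELSE unitprice * quantity END), 0)
        FROM orderexecutions
        WHERE is_master = 1
          AND ((currency1_id = ? AND currency2_id = ?)
            OR (currency1_id = ? AND currency2_id = ?))
    """
    return conn.execute(sql, (c1, c1, c2, c2, c1)).fetchone()[0]


print(pair_volume(conn, 1, 2))  # 10 (quantity) + 4 * 2.0 (unitprice*quantity)
```

The CASE only has to test currency1_id because the WHERE clause has already restricted the rows to the two directions of the pair.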

Related

How to select sql with if else use column

I'm a beginner with MySQL and this is my graduation project, so please help me.
//AMOUNT_PEOPLE is a variable in nodejs
//place_id is a variable in nodejs received from the front end.
SELECT
(IF AMOUNT_PEOPLE >= 10
RETURN COLUMN package_tb.price_group*(AMOUNT_PEOPLE-1)
ELSE IF AMOUNT_PEOPLE >= 6
RETURN COLUMN package_tb.price_group*(AMOUNT_PEOPLE-1) - (SELECT option_tb.price_group FROM option_tb WHERE option_tb.place_id = place_id)
ELSE
RETURN COLUMN price_normal*AMOUNT_PEOPLE
END IF) AS price,name,detail
FROM package_tb WHERE package_tb.place_id = place_id
This is a ticket booking program.
The logic is:
Check the number of tourists.
If tourists >= 10, use the group price with 1 person free, and include the food option for the free person.
If tourists >= 6, use the group price
with 1 person free, but do not include the food option for the free person.
Finally, for 0-5 tourists, use the normal price.
For example, if a customer says "I want tickets for 10 tourists", the system checks as explained above.
package_tb
-package_id
-place_id
-name
-detail
-price_group
-price_normal
option_tb
-option_id
-place_id
-name
-price_group
-price_normal
place_tb
-place_id
-name
If the tourists get the group price, the option must use its group price only.
If the tourists get the normal price, the option must use its normal price only.
Sorry for my bad english.
SQL has a CASE expression,
which is the standard SQL construct for switch logic.
Based on the updated question I'm assuming that there can be multiple options per place.
So with node.js variables:
SET @place_id = ${PLACE_ID};
SET @amount_people = ${AMOUNT_PEOPLE};
SELECT
CASE
WHEN @amount_people >= 10
THEN (p.price_group * (@amount_people - 1))
WHEN @amount_people >= 6
THEN (p.price_group * (@amount_people - 1)) - SUM(o.price_group)
ELSE p.price_normal * @amount_people
END AS price,
p.name,
p.detail
FROM package_tb p
LEFT JOIN option_tb o ON o.place_id = p.place_id
WHERE p.place_id = @place_id
GROUP BY p.package_id, p.place_id, p.price_normal, p.price_group, p.name, p.detail;
A test on rextester here
-- Which table should price_normal come from, p or o? It is taken from p here.
SELECT
CASE WHEN p.AMOUNT_PEOPLE >= 4
THEN p.price_group
WHEN p.AMOUNT_PEOPLE >= 3
THEN p.price_group - o.price_food
ELSE p.price_normal
END AS price,
p.name, p.detail
FROM package_tb p JOIN option_tb o ON o.package_id = p.package_id;
MySQL has two main control-flow constructs: IF() and CASE.
Since you are not comparing AMOUNT_PEOPLE against a list of fixed values, CASE is a bit of overkill, and the logic can be simplified slightly by using IF.
The syntax for IF is IF(<expr>, <true_result>, <false_result>). You can express an else-if by chaining another IF() as the false_result:
IF(<expr>, ..., IF(<expr>, ..., ...))
Instead of using else if, you only need to remove the option_tb.price_group when AMOUNT_PEOPLE is fewer than 10 to get your desired pricing.
/* groups with 6 or more use group price and one person free */
IF(AMOUNT_PEOPLE >= 6,
/* groups with fewer than 10 people remove option */
p.price_group*(AMOUNT_PEOPLE-1) - IF(AMOUNT_PEOPLE < 10,
o.price_group,
0
),
p.price_normal*AMOUNT_PEOPLE
) AS price
This reduces the amount of code needed to decide when to subtract the option price.
A nested sub-query would be executed for each row returned, so if option_tb.place_id is unique, a JOIN is preferable.
If option_tb.place_id is not unique, you would need to use a GROUP BY. One approach is to JOIN against a sub-query that does the grouping, which avoids false matches on the joined table's groupings.
To ensure results are not excluded when no row in the option_tb table matches a place_id, you would use a LEFT JOIN, which returns NULL instead of excluding the row.
Then you can use COALESCE(<column>, 0) to retrieve the value from the column, or 0 if the column value is NULL.
In NodeJS you can use ${var} inside a template literal (a string delimited by backticks) to inject a variable into a string.
For example:
var place_id = 1;
var query = `SELECT ${place_id};`;
console.log(query);
Results in
SELECT 1;
Putting it all together.
Example: db-fiddle
SELECT
IF(${AMOUNT_PEOPLE} >= 6,
p.price_group*(${AMOUNT_PEOPLE}-1) - IF(${AMOUNT_PEOPLE} < 10, COALESCE(o.price_group, 0), 0),
p.price_normal*${AMOUNT_PEOPLE}
) AS price,
p.name,
p.detail
FROM package_tb AS p
LEFT JOIN (
SELECT
place_id,
SUM(price_group) AS price_group
FROM option_tb
GROUP BY place_id
) AS o
ON o.place_id = p.place_id
WHERE p.place_id = ${place_id};
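To make the pricing rules easier to check, here is a small sketch that runs the same shape of query against an in-memory SQLite database from Python. The sample rows and the helper name quote are invented for the illustration, CASE stands in for MySQL's IF(), and the two inputs are passed as bound parameters rather than being interpolated into the string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package_tb (package_id INTEGER, place_id INTEGER, name TEXT,
                         price_group INTEGER, price_normal INTEGER);
CREATE TABLE option_tb (option_id INTEGER, place_id INTEGER, name TEXT,
                        price_group INTEGER, price_normal INTEGER);
-- Hypothetical sample data: place 1 has two options, place 2 has none.
INSERT INTO package_tb VALUES (1, 1, 'Tour', 100, 120), (2, 2, 'Hike', 50, 60);
INSERT INTO option_tb VALUES (1, 1, 'food', 30, 40), (2, 1, 'drink', 10, 15);
""")

PRICE_SQL = """
SELECT CASE
         WHEN :n >= 6 THEN
           p.price_group * (:n - 1)
           - CASE WHEN :n < 10 THEN COALESCE(o.price_group, 0) ELSE 0 END
         ELSE p.price_normal * :n
       END AS price,
       p.name
FROM package_tb AS p
LEFT JOIN (SELECT place_id, SUM(price_group) AS price_group
           FROM option_tb
           GROUP BY place_id) AS o
  ON o.place_id = p.place_id
WHERE p.place_id = :place_id
"""


def quote(conn, place_id, amount_people):
    """Return (price, name) rows for every package at the given place."""
    params = {"n": amount_people, "place_id": place_id}
    return conn.execute(PRICE_SQL, params).fetchall()


for people in (3, 6, 10):
    print(people, quote(conn, 1, people))
```

Place 2 has no options, so its row only survives because of the LEFT JOIN, and COALESCE turns the missing option total into 0.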

How to Find First Valid Row in SQL Based on Difference of Column Values

I am trying to find a reliable query which returns the first instance of an acceptable insert range.
Research:
Some of the links below address similar questions, but I could get none of them to work for me.
Find first available date, given a date range in SQL
Find closest date in SQL Server
MySQL difference between two rows of a SELECT Statement
How to find a gap in range in SQL
and more...
Objective Query Function:
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
Where InsertRange(1) is the value the query should return. In other words, this would be the first instance where the above condition is satisfied.
Table Structure:
Primary Key: StartRange
StartRange(i-1) < StartRange(i)
StartRange(i-1) + EndRange(i-1) < StartRange(i)
Example Dataset
Below is an example User table (3 columns), with a set range distribution. StartRanges are always ordered in a strictly ascending way, UserID are arbitrary strings, only the sequences of StartRange and EndRange matters:
StartRange EndRange UserID
312 6896 user0
7134 16268 user1
16877 22451 user2
23137 25142 user3
25955 28272 user4
28313 35172 user5
35593 38007 user6
38319 38495 user7
38565 45200 user8
46136 48007 user9
My current Query
I am trying to use this query at the moment:
SELECT t2.StartRange, t2.EndRange
FROM user AS t1, user AS t2
WHERE (t1.StartRange - t2.StartRange+1) > NewValue
ORDER BY t1.EndRange
LIMIT 1
Example Case
Given the table, if NewValue = 800, then the returned answer should be 23137. This means, the first available slot would be between user3 and user4 (with an actual slot size = 813):
InsertRange(1) = (StartRange(i) - EndRange(i-1)) > NewValue
InsertRange = (StartRange(6) - EndRange(5)) > NewValue
23137 = 25955 - 25142 > 800
More Comments
My query above seemed to work for the special case where the StartRanges were tightly packed (i.e. StartRange(i) = StartRange(i-1) + EndRange(i-1) + 1). It no longer works with a less tightly packed set of StartRanges.
Keep in mind that SQL tables have no implicit row order. It seems fair to order your table by StartRange value, though.
We can start to solve this by writing a query to obtain each row paired with the row preceding it. In MySQL before 8.0 it's hard to do this elegantly, because those versions lack a row-numbering function such as ROW_NUMBER().
This works (http://sqlfiddle.com/#!9/4437c0/7/0). It may have nasty performance because it generates O(n^2) intermediate rows. There's no row for user0; it can't be paired with any preceding row because there is none.
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
Then, you can use that as a subquery, and apply your conditions, which are:
gap >= 800: WHERE SB-EA >= 800
first matching row (lowest StartRange value): ORDER BY SB
just one row: LIMIT 1
Here's the query (http://sqlfiddle.com/#!9/4437c0/11/0)
SELECT SB-EA Gap,
EA+1 Beginning_of_gap, SB-1 Ending_of_gap,
UserId UserID_after_gap
FROM (
select MAX(a.StartRange) SA, MAX(a.EndRange) EA,
b.StartRange SB, b.EndRange EB , b.UserID
from user a
join user b ON a.EndRange <= b.StartRange
group by b.StartRange, b.EndRange, b.UserID
) pairs
WHERE SB-EA >= 800
ORDER BY SB
LIMIT 1
Notice that you may actually want the smallest matching gap instead of the first matching gap. That's called best fit, rather than first fit. To get that you use ORDER BY SB-EA instead.
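For anyone who wants to try the first-fit query without a MySQL server, here is a minimal sketch that loads the example dataset from the question into an in-memory SQLite database via Python. The helper name first_gap is made up for the example, and the query is a trimmed version of the one above (it drops the unused SA and EB columns).

```python
import sqlite3

# Example dataset from the question.
ROWS = [
    (312, 6896, "user0"), (7134, 16268, "user1"), (16877, 22451, "user2"),
    (23137, 25142, "user3"), (25955, 28272, "user4"), (28313, 35172, "user5"),
    (35593, 38007, "user6"), (38319, 38495, "user7"), (38565, 45200, "user8"),
    (46136, 48007, "user9"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user (StartRange INTEGER PRIMARY KEY, EndRange INTEGER, UserID TEXT)"
)
conn.executemany("INSERT INTO user VALUES (?, ?, ?)", ROWS)

FIRST_FIT = """
SELECT SB - EA AS Gap, EA + 1 AS Beginning_of_gap, SB - 1 AS Ending_of_gap,
       UserID AS UserID_after_gap
FROM (
    -- pair every row b with the largest EndRange that precedes it
    SELECT MAX(a.EndRange) AS EA, b.StartRange AS SB, b.UserID AS UserID
    FROM user AS a
    JOIN user AS b ON a.EndRange <= b.StartRange
    GROUP BY b.StartRange, b.UserID
) AS pairs
WHERE SB - EA >= ?
ORDER BY SB
LIMIT 1
"""


def first_gap(conn, new_value):
    """Return the first gap of at least new_value, or None if there is none."""
    return conn.execute(FIRST_FIT, (new_value,)).fetchone()


print(first_gap(conn, 800))
```

With NewValue = 800 this reports the gap of 813 between user3 and user4, matching the example case in the question.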
Edit: There is another way to use MySQL to join adjacent rows, that doesn't have the O(n^2) performance issue. It involves employing user variables to simulate a row_number() function. The query involved is a hairball (that's a technical term). It's described in the third alternative of the answer to this question. How do I pair rows together in MYSQL?

query optimization for mysql

I have the following query which takes about 28 seconds on my machine. I would like to optimize it and know if there is any way to make it faster by creating some indexes.
select rr1.person_id as person_id, rr1.t1_value, rr2.t0_value
from (select r1.person_id, avg(r1.avg_normalized_value1) as t1_value
from (select ma1.person_id, mn1.store_name, avg(mn1.normalized_value) as avg_normalized_value1
from matrix_report1 ma1, matrix_normalized_notes mn1
where ma1.final_value = 1
and (mn1.normalized_value != 0.2
and mn1.normalized_value != 0.0 )
and ma1.user_id = mn1.user_id
and ma1.request_id = mn1.request_id
and ma1.request_id = 4 group by ma1.person_id, mn1.store_name) r1
group by r1.person_id) rr1
,(select r2.person_id, avg(r2.avg_normalized_value) as t0_value
from (select ma.person_id, mn.store_name, avg(mn.normalized_value) as avg_normalized_value
from matrix_report1 ma, matrix_normalized_notes mn
where ma.final_value = 0 and (mn.normalized_value != 0.2 and mn.normalized_value != 0.0 )
and ma.user_id = mn.user_id
and ma.request_id = mn.request_id
and ma.request_id = 4
group by ma.person_id, mn.store_name) r2
group by r2.person_id) rr2
where rr1.person_id = rr2.person_id
Basically, it aggregates data depending on the request_id and final_value (0 or 1). Is there a way to simplify it for optimization? And it would be nice to know which columns should be indexed. I created an index on user_id and request_id, but it doesn't help much.
There are about 4907424 rows on matrix_report1 and 335740 rows on matrix_normalized_notes table. These tables will grow as we have more requests.
First, the other commenters are right that you should format your samples better. Explaining in plain language what you are trying to do also helps, and sample data with the expected results is even better.
That said, I think the query can be significantly simplified. Your two subqueries are almost identical, except that one filters on final_value = 1 and the other on final_value = 0. Since each subquery produces one record per person_id, you can compute both averages with CASE/WHEN and remove the rest.
To help optimize the query, your matrix_report1 table should have an index on ( request_id, final_value, user_id ). Your matrix_normalized_notes table should have an index on ( request_id, user_id, store_name, normalized_value ).
Since your outer query averages the per-store averages, you do need to keep the nesting. The following should help.
SELECT
r1.person_id,
avg(r1.ANV1) as t1_value,
avg(r1.ANV0) as t0_value
from
( select
ma1.person_id,
mn1.store_name,
avg( case when ma1.final_value = 1
then mn1.normalized_value end ) as ANV1,
avg( case when ma1.final_value = 0
then mn1.normalized_value end ) as ANV0
from
matrix_report1 ma1
JOIN matrix_normalized_notes mn1
ON ma1.request_id = mn1.request_id
AND ma1.user_id = mn1.user_id
AND NOT mn1.normalized_value in ( 0.0, 0.2 )
where
ma1.request_id = 4
AND ma1.final_Value in ( 0, 1 )
group by
ma1.person_id,
mn1.store_name) r1
group by
r1.person_id
Notice that the inner query pulls all rows whose final_value is either zero or one. The AVG is then computed over a CASE/WHEN that yields the normalized value only for the respective final_value. When the row has the other final_value, the CASE result is NULL and is thus not considered when the average is computed.
So at this point, the data is already grouped per person and store, with ANV1 and ANV0 set for each. Now, roll these values up per person regardless of the store. Again, NULL values are not considered in the average computation. So, if Store "A" doesn't have a value in ANV1, it does not skew the results. Similarly if Store "B" doesn't have a value in ANV0.
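The point about NULLs can be checked on a toy dataset. The sketch below uses an in-memory SQLite database from Python with a single hypothetical table, notes, standing in for the joined matrix_report1 / matrix_normalized_notes rows; the values are invented so the averages are easy to verify by hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE notes (
    person_id INTEGER, store_name TEXT,
    final_value INTEGER, normalized_value REAL)""")
conn.executemany(
    "INSERT INTO notes VALUES (?, ?, ?, ?)",
    [
        (1, "A", 1, 2.0),
        (1, "A", 1, 4.0),
        (1, "A", 0, 8.0),
        (1, "B", 1, 6.0),  # store B has no final_value = 0 rows
    ],
)

SQL = """
SELECT person_id, AVG(anv1) AS t1_value, AVG(anv0) AS t0_value
FROM (
    SELECT person_id, store_name,
           AVG(CASE WHEN final_value = 1 THEN normalized_value END) AS anv1,
           AVG(CASE WHEN final_value = 0 THEN normalized_value END) AS anv0
    FROM notes
    GROUP BY person_id, store_name
) AS per_store
GROUP BY person_id
"""

rows = conn.execute(SQL).fetchall()
print(rows)
```

Store B contributes NULL for anv0, so t0_value stays at 8.0; if the NULL were counted as zero it would drop to 4.0.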

Zend Framework 1 query, ordering results by satisfied criteria

I've to extract all products from "product" table (DBMS: MySQL, Adapter: PDO), ordering result by the number of filtering criteria that are matched.
This is an example of raw SQL query (but I'll use Zend_DB classes and adapters):
SELECT *
FROM product
WHERE price < 300
AND price > 100
AND discount = TRUE
AND used = FALSE
AND type = MEAL
and a lot of other optional filter criteria that the end user could enter from the UI.
All the filter criteria (the WHERE conditions in the query) can optionally be set by the user in the web app's form, and the goal of my algorithm is to order the results from the product matching the most criteria down to the products that match at least 2 criteria.
I'm using Zend Framework 1, and my question is:
Is there any Zend class that could help me in this particular algorithm?
If no, could anyone suggest a solution for this problem?
I've tried a crude solution where I compose the query from all the possible combinations of the criteria, but since there are a lot of criteria, the complexity of the algorithm grows very quickly, so I suppose that a better alternative may exist.
Thanks
Something like...
SELECT * FROM (
SELECT p.*,
case when price < 300 and price > 100 then 1 else 0 end +
case when discount = true then 1 else 0 end +
case when used = false then 1 else 0 end +
case when type = 'MEAL' then 1 else 0 end +
... (for each possible outcome) as Matches
FROM product p) AS scored
WHERE Matches >= 2
ORDER BY Matches DESC
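Because the criteria are optional, the sum of CASE terms can be assembled from whichever filters the user actually filled in. The following is only a sketch of that idea in Python against an in-memory SQLite database, not Zend_Db code; the helper rank_by_matches, the sample products and the criteria list are all assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE product (
    name TEXT, price INTEGER, discount INTEGER, used INTEGER, type TEXT)""")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?, ?)",
    [
        ("lunch", 200, 1, 0, "MEAL"),
        ("snack", 50, 1, 0, "MEAL"),
        ("old tv", 150, 0, 1, "TV"),
        ("chair", 500, 0, 0, "FURNITURE"),
        ("dinner", 250, 0, 1, "MEAL"),
    ],
)


def rank_by_matches(conn, criteria, min_matches=2):
    """criteria: (sql_condition, values) pairs; the conditions are written by
    the developer, only the values come from the user."""
    terms, params = [], []
    for condition, values in criteria:
        terms.append(f"(CASE WHEN {condition} THEN 1 ELSE 0 END)")
        params.extend(values)
    sql = f"""
        SELECT name, matches FROM (
            SELECT p.name, {' + '.join(terms)} AS matches
            FROM product AS p
        ) AS scored
        WHERE matches >= ?
        ORDER BY matches DESC, name
    """
    return conn.execute(sql, params + [min_matches]).fetchall()


criteria = [
    ("price < ? AND price > ?", [300, 100]),
    ("discount = ?", [1]),
    ("used = ?", [0]),
    ("type = ?", ["MEAL"]),
]
print(rank_by_matches(conn, criteria))
```

The query grows by one term per criterion rather than one branch per combination of criteria, which is what keeps the complexity manageable.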

Select data which have same letters

I'm having trouble with this SQL:
$sql = mysql_query("SELECT $menucompare ,
(COUNT($menucompare ) * 100 / (SELECT COUNT( $menucompare )
FROM data WHERE $ww = $button )) AS percentday FROM data WHERE $ww >0 ");
$menucompare is the name of the selected table field; it contains data like the examples below
$button is the selected week number (let's say week '6')
$ww is the name of the table field that holds the week number (e.g. '6')
For example, I have data in $menucompare like that:
123456bool
521478bool
122555heel
147788itoo
and I want to group the entries that end with the same word and compute a percentage for each group.
The output should be like that:
bool -- 50% (2 entries)
heel -- 25% (1 entry)
itoo -- 25% (1 entry)
Any help clarifying my SQL would be much appreciated.
I didn't find anything like that around.
Well, keeping data in such a format is probably not the best way; if possible, split the field into 2 separate ones.
First, you need to extract the string part from the end of the field.
If the length of the string / numeric parts is fixed, then it's quite easy;
if not, you need a regular expression replace, which, unfortunately, MySQL (before 8.0) does not provide by default. There's a solution; check this question: How to do a regular expression replace in MySQL?
I'll assume that the numeric part has a fixed length:
SELECT s.str, CAST(count(s.str) AS decimal) / t.cnt * 100 AS pct
FROM (SELECT substr(entry, 7) AS str FROM data) AS s
JOIN (SELECT count(*) AS cnt FROM data) AS t ON 1=1
GROUP BY s.str, t.cnt;
If you have a regexp_replace function, then substr(entry, 7) should be replaced with regexp_replace(entry, '^[0-9]*', '') to achieve the required result.
Variant with substr can be tested here.
When sorting out problems like this, I would do it in two steps:
Sort out the SQL independently of the presentation language (PHP?).
Sort out the parameterization of the query and the presentation of the results after you know you've got the correct query.
Since this question is tagged 'SQL', I'm only going to address the first question.
The first step is to unclutter the query:
SELECT menucompare,
(COUNT(menucompare) * 100 / (SELECT COUNT(menucompare) FROM data WHERE ww = 6))
AS percentday
FROM data
WHERE ww > 0;
This removes the $ signs from most of the variable bits, and substitutes 6 for the button value. That makes it a bit easier to understand.
Your desired output seems to need the last four characters of the string held in menucompare for grouping and counting purposes.
The data to be aggregated would be selected by:
SELECT SUBSTR(MenuCompare, -4) AS Last4
FROM Data
WHERE ww = 6
The divisor in the percentage is the count of such rows, but the sub-stringing isn't necessary to count them, so we can write:
SELECT COUNT(*) FROM Data WHERE ww = 6
This is exactly what you have anyway.
The dividend in the percentage will be the group count of each substring.
SELECT Last4, COUNT(Last4) * 100.0 / (SELECT COUNT(*) FROM Data WHERE ww = 6)
FROM (SELECT SUBSTR(MenuCompare, -4) AS Last4
FROM Data
WHERE ww = 6
) AS Week6
GROUP BY Last4
ORDER BY Last4;
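One way to demonstrate that the query works before re-parameterizing it is to run it on the four sample values. The sketch below does that with an in-memory SQLite database from Python; the column names menucompare and ww and the helper suffix_percentages are assumptions for the example, and one extra row for another week is added to show that the filter applies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (menucompare TEXT, ww INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        ("123456bool", 6),
        ("521478bool", 6),
        ("122555heel", 6),
        ("147788itoo", 6),
        ("999999zzzz", 5),  # different week, must not affect week 6
    ],
)

SQL = """
SELECT Last4,
       COUNT(Last4) * 100.0 / (SELECT COUNT(*) FROM data WHERE ww = ?) AS pct
FROM (SELECT SUBSTR(menucompare, -4) AS Last4
      FROM data
      WHERE ww = ?) AS week
GROUP BY Last4
ORDER BY Last4
"""


def suffix_percentages(conn, week):
    """Share of each 4-character suffix among the rows of the given week."""
    return conn.execute(SQL, (week, week)).fetchall()


print(suffix_percentages(conn, 6))
```

The week number is bound twice because it appears both in the divisor sub-query and in the derived table.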
When you've demonstrated that this works, you can re-parameterize the query and deal with the presentation of the results.