SQLAlchemy - Find highest value less than specified value - sqlalchemy

I have a SQLAlchemy Model which holds a list of products. In addition to the primary key, each product is allocated a sequential numeric product identifier (prod_id).
prod_id values >= 10000 are reserved.
When a new product is added I therefore need to allocate it the next available prod_id, which should be 1 greater than the highest current prod_id, ignoring any > 9999.
I can probably construct something in SQL but I'm struggling to form a SQLAlchemy query that will return that value.
I'm not worried about race conditions - this is a simplified example to illustrate the problem.
The Model:
class Product(db.Model):
    __tablename__ = 'products'
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(64))
    prod_id = db.Column(db.Integer)
Where prod_id is the sequential product identifier (with prod_ids >=10000 reserved).
In pseudo code, I'm looking for a SQLAlchemy query equivalent to:
next_prod_id = 1 + max(Product.prod_id where Product.prod_id < 10000)
So, as an example, if the existing table is ...
id  name        prod_id
1   Product A   1
2   Product B   2
3   Reserved A  10000
4   Product C   3
5   Reserved B  10001
6   Product D   4
... then next_prod_id needs to be 5.
Here's my stab at a query, but it isn't working (syntax error that I can't sort out):
next_prod_id = Product.query(func.max(Product.prod_id)) \
.filter_by(Product.prod_id < 10000) + 1

OK, managed to get func.max to do the job.
Here's my solution - not sure if it is the most efficient but it works.
next_prod_id = db.session.query(db.func.max(Product.prod_id)) \
    .filter(Product.prod_id < 10000) \
    .scalar() + 1
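One caveat with this kind of lookup: when no row has a prod_id below 10000 (for example on an empty table), MAX yields NULL, so .scalar() returns None and the + 1 fails. Below is a minimal sketch of the same lookup at the SQL level, using Python's built-in sqlite3 module rather than Flask-SQLAlchemy so that it runs on its own. The next_prod_id and make_demo_db helpers and the in-memory table are illustrative assumptions, not part of the original model. In SQLAlchemy the analogous change would presumably be to wrap the aggregate in db.func.coalesce(..., 0).

```python
import sqlite3

RESERVED_START = 10000  # prod_id values >= 10000 are reserved


def next_prod_id(conn):
    """Return 1 + the highest non-reserved prod_id, or 1 if there is none."""
    row = conn.execute(
        "SELECT COALESCE(MAX(prod_id), 0) + 1 FROM products WHERE prod_id < ?",
        (RESERVED_START,),
    ).fetchone()
    return row[0]


def make_demo_db(rows):
    """Build an in-memory products table (hypothetical demo helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, prod_id INTEGER)"
    )
    conn.executemany("INSERT INTO products VALUES (?, ?, ?)", rows)
    return conn


# Same rows as the example table in the question.
sample = [
    (1, "Product A", 1),
    (2, "Product B", 2),
    (3, "Reserved A", 10000),
    (4, "Product C", 3),
    (5, "Reserved B", 10001),
    (6, "Product D", 4),
]

print(next_prod_id(make_demo_db(sample)))  # 5
print(next_prod_id(make_demo_db([])))      # 1 (no products yet)
```

COALESCE keeps the arithmetic inside the database, so the caller never has to special-case None.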

Related

mysql arbitrary index query performance

Say you have an inventory journal table with these fields:
ID, ProductID, WarehouseID, etc.
ID = PK
ProductID & WarehouseID are both FK and indexed.
Then let's say we populate 5 million rows of data into the table. I ran 2 queries.
The first query used both FKs ProductID and WarehouseID
SELECT inventoryjournals.id,inventoryjournals.ProductID
FROM zenlite.inventoryjournals
where productid = 1 && WarehouseID = 1
limit 30 offset 2500000
This took 5.75s to return the result, which is understandable because it has to go from the first record to the 2.5 millionth. But then I ran another query with an arbitrary ID constraint:
SELECT inventoryjournals.id,inventoryjournals.ProductID
FROM zenlite.inventoryjournals
where productid = 1 && WarehouseID = 1 && id <10000000
limit 30 offset 2500000
or even this
SELECT inventoryjournals.id,inventoryjournals.ProductID
FROM zenlite.inventoryjournals
where productid = 1 && WarehouseID = 1 && id > 0
limit 30 offset 2500000
This shrank the time down to 1.5 ~ 1.6s?! Does this mean it's always better to add the PK constraints in all read queries? Like id > 0 (always true)
My question is, will doing this pose any risk?
There is no way to make offset 2500000 run fast. It must skip over that many rows (unless it hits the end of the table).
All 3 of your queries could benefit from
INDEX(ProductID, WarehouseID, id)
Large offsets are a poor way to do "pagination". It is better to "remember where you left off". Or are you using the large OFFSET for some other purpose?
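A minimal sketch of "remember where you left off" (often called keyset pagination), using Python's built-in sqlite3 module so it runs standalone. The fetch_page and make_journal_db helpers, the index name and the demo data are assumptions for illustration. Only the table and column names come from the question. Instead of an OFFSET, each call passes the last id of the previous page, so the database can seek straight to the next rows through the index.

```python
import sqlite3


def fetch_page(conn, product_id, warehouse_id, last_seen_id, page_size):
    """Return the ids of the next page, starting after last_seen_id."""
    rows = conn.execute(
        "SELECT id FROM inventoryjournals "
        "WHERE ProductID = ? AND WarehouseID = ? AND id > ? "
        "ORDER BY id LIMIT ?",
        (product_id, warehouse_id, last_seen_id, page_size),
    ).fetchall()
    return [r[0] for r in rows]


def make_journal_db(n_rows):
    """Build a small in-memory journal table (hypothetical demo data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE inventoryjournals ("
        "id INTEGER PRIMARY KEY, ProductID INTEGER, WarehouseID INTEGER)"
    )
    conn.execute(
        "CREATE INDEX idx_prod_wh_id "
        "ON inventoryjournals (ProductID, WarehouseID, id)"
    )
    # Odd ids belong to product 1, even ids to product 2; one warehouse.
    conn.executemany(
        "INSERT INTO inventoryjournals VALUES (?, ?, 1)",
        [(i, 1 if i % 2 else 2) for i in range(1, n_rows + 1)],
    )
    return conn


conn = make_journal_db(100)
first = fetch_page(conn, 1, 1, 0, 3)           # start before the first id
second = fetch_page(conn, 1, 1, first[-1], 3)  # continue after the last id seen
print(first, second)  # [1, 3, 5] [7, 9, 11]
```

The trade-off is that this style supports "next page" cheaply but not jumping to an arbitrary page number.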

SQL: How to check "if this record exists then that record must also exist" for given ID set

my database table (DWInfo) looks like this:
InstanceID | AttributeID
1 | 1
1 | 2
1 | 3
2 | 1
2 | 4
3 | 1
3 | 2
There are several instances and every instance has multiple attributes.
What I want to achieve is this: for a given set/rule of IDs, I want to get all InstanceIDs that violate the condition. For example, let the given IDs be 1 and 2, which means that if an instance has AttributeID=1, then AttributeID=2 should also exist for it. In this case the result would be instance 2, because it violates the condition.
I tried it with JOINS but this only seemed effective for 2 attributes and not more.
Select * from DWInfo dw1 INNER JOIN DWInfo dw2 ON dw1.InstanceID = dw2.InstanceID where dw1.AttributeID != dw2.AttributeID and dw1.AttributeID = 1 AND dw2.AttributeID != 2
Is it possible to solve this problem with a SQL query?
Assuming that each InstanceId can have only one of each different AttributeId, i.e. a unique composite index (InstanceId, AttributeId):
SELECT InstanceID
FROM DWInfo
WHERE AttributeID IN (1,2)
GROUP BY InstanceID
HAVING SUM(AttributeId = 1) = 1
AND COUNT(*) < 2 /* Or SUM(AttributeId = 2) = 0 */
Note that if having an AttributeId of 2 also means that the instance requires an AttributeId of 1 (slightly different logic), this is neater:
SELECT InstanceID
FROM DWInfo
WHERE AttributeID IN (1,2)
GROUP BY InstanceID
HAVING COUNT(*) < 2
Where there exists Attribute 1 find the ones that don't have Attribute 2.
select InstanceID
from DWInfo
group by InstanceID
having
count(case when AttributeID = 1 then 1 end) > 0
and count(case when AttributeID = 2 then 1 end) = 0
This answer is basically the same as Arth's. You might find it beneficial to filter the Attributes in the where clause but it's not strictly necessary. I prefer the standard syntax using case expressions even though the shorthand would be handy if it were portable. I also prefer count over sum in these scenarios.
It's not clear whether you can have duplicates (probably not) and whether Attribute 2 can appear alone. You might have to tweak the numbers a bit but you should be able to follow the pattern.
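To make the pattern concrete, here is a small sketch that runs the CASE-based query against the sample data from the question, using Python's built-in sqlite3 module. The instances_missing and make_dwinfo_db helpers are hypothetical names introduced for illustration. The two attribute IDs are passed as parameters so the same query can check other "if X then Y" rules.

```python
import sqlite3


def instances_missing(conn, if_attr, then_attr):
    """Return InstanceIDs that have if_attr but lack then_attr."""
    rows = conn.execute(
        "SELECT InstanceID FROM DWInfo "
        "GROUP BY InstanceID "
        "HAVING COUNT(CASE WHEN AttributeID = ? THEN 1 END) > 0 "
        "AND COUNT(CASE WHEN AttributeID = ? THEN 1 END) = 0 "
        "ORDER BY InstanceID",
        (if_attr, then_attr),
    ).fetchall()
    return [r[0] for r in rows]


def make_dwinfo_db():
    """Load the sample rows from the question into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DWInfo (InstanceID INTEGER, AttributeID INTEGER)")
    conn.executemany(
        "INSERT INTO DWInfo VALUES (?, ?)",
        [(1, 1), (1, 2), (1, 3), (2, 1), (2, 4), (3, 1), (3, 2)],
    )
    return conn


conn = make_dwinfo_db()
print(instances_missing(conn, 1, 2))  # [2]
```

Instance 2 is the only one that has attribute 1 without attribute 2, which matches the expected result stated in the question.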
I think this does what you want:
select instanceid
from dwinfo
where attributeid in (1, 2)
group by instanceid
having count(*) = 2;
This guarantees that you have two matching rows for each instance. If you can have duplicates, then use:
having count(distinct attributeid) = 2
EDIT:
For the conditional version (if 1 --> 2):
having max(attributeid = 2) > 0
That is, if it has 1 or 2, then it has to have 2, and everything is ok.

Mysql SELECT where 2 column match set of values

I am building a product filter. The point is that the more specific the user's selection is, the fewer results should appear. At the moment I am running multiple queries, storing the results in arrays and checking for the array intersection, but the result is the opposite: when the user applies more filters, I show more products.
So I am thinking there could be a SQL command which I don't know!
simplified example:
------------
table "filter"
------------
product
Spec
value
------------
Sample data
------------
book1,page,200
book1,cover,leather
book1,language,en
book2,page,300
book2,cover,paper
book2,language,de
book3,page,150
book3,cover,hard
book3,language,en
SELECT `product` FROM `filter` where ...
How do I select (page=200 and language=en)?
If I understand correctly, you are probably looking for something like this:
SELECT product
FROM filter
WHERE (spec = 'page' AND value = '200')
OR (spec = 'language' AND value = 'en')
GROUP BY product
HAVING COUNT(*) = 2 -- 2 here represents number of spec-value pairs
Output:
| PRODUCT |
-----------
| book1 |
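The same GROUP BY / HAVING COUNT idea can be generalized to any number of selected filters. The sketch below, which uses Python's built-in sqlite3 module, builds the OR list from the chosen spec/value pairs and compares the count against the number of pairs. The products_matching and make_filter_db helpers are hypothetical, and the approach assumes each product has at most one row per spec.

```python
import sqlite3


def products_matching(conn, criteria):
    """Return products that match ALL (spec, value) pairs in criteria."""
    if not criteria:
        return []
    pair_clause = " OR ".join("(spec = ? AND value = ?)" for _ in criteria)
    params = [item for pair in criteria for item in pair]
    params.append(len(criteria))  # every pair must be matched once
    sql = (
        'SELECT product FROM "filter" '
        f"WHERE {pair_clause} "
        "GROUP BY product "
        "HAVING COUNT(*) = ? "
        "ORDER BY product"
    )
    return [r[0] for r in conn.execute(sql, params)]


def make_filter_db():
    """Load the sample data from the question into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE "filter" (product TEXT, spec TEXT, value TEXT)')
    conn.executemany(
        'INSERT INTO "filter" VALUES (?, ?, ?)',
        [
            ("book1", "page", "200"), ("book1", "cover", "leather"),
            ("book1", "language", "en"), ("book2", "page", "300"),
            ("book2", "cover", "paper"), ("book2", "language", "de"),
            ("book3", "page", "150"), ("book3", "cover", "hard"),
            ("book3", "language", "en"),
        ],
    )
    return conn


conn = make_filter_db()
print(products_matching(conn, [("language", "en")]))                   # ['book1', 'book3']
print(products_matching(conn, [("language", "en"), ("page", "200")]))  # ['book1']
```

Adding a filter can only shrink the result, which is the behaviour the question asks for.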
Another alternative, but less elegant. I just wanted to show another way of doing it.
SELECT DISTINCT product
FROM filter f
WHERE
    EXISTS (SELECT 1 FROM filter WHERE spec = 'language' AND value = 'en' AND product = f.product)
    AND EXISTS (SELECT 1 FROM filter WHERE spec = 'page' AND value = '200' AND product = f.product);

Combine 2 SELECTs into one SELECT in my RAILS application

I have a table called ORDEREXECUTIONS that stores all orders that have been executed. It's a multi currency application hence the table has two columns CURRENCY1_ID and CURRENCY2_ID.
To get a list of all orders for a specific currency pair (e.g. EUR/USD) I need two lines to get the totals:
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c1,c2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",c2,c1,Time.now()-24.hours).sum("unitprice*quantity").to_d
Note that my SUM() formula is different depending on the sequence of the currencies.
E.g. if I want the total ordered quantities for the currency pair USD/EUR, it then executes the following (assuming the currency ID for USD is 1 and for EUR is 2):
v = Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",1,2,Time.now()-24.hours).sum("quantity").to_d
v+= Orderexecution.where("is_master=1 and currency1_id=? and currency2_id=? and created_at>=?",2,1,Time.now()-24.hours).sum("unitprice*quantity").to_d
How do I write this in RoR so that it triggers only one single SQL statement to MySQL?
I guess this would do:
v = Orderexecution.where("is_master=1
and ( (currency1_id, currency2_id) = (?,?)
or (currency1_id, currency2_id) = (?,?)
)
and created_at>=?"
,c1, c2, c2, c1, Time.now()-24.hours
)
.sum("CASE WHEN currency1_id=?
THEN quantity
ELSE unitprice*quantity
END"
,c1
)
.to_d
So you could do
SELECT SUM(IF(currency1_id = 1 and currency2_id = 2, quantity,0)) as quantity,
SUM(IF(currency2_id = 1 and currency1_id = 2, unitprice * quantity,0)) as unitprice_quantity
FROM orderexecutions
WHERE created_at > ? and (currency1_id = 1 or currency1_id = 2)
If you plug that into find_by_sql you should get one object back, with 2 attributes, quantity and unitprice_quantity (they won't show up in the output of inspect in the console but they should be there if you inspect the attributes hash or call the accessor methods directly).
But depending on your indexes that might actually be slower, because it might not be able to use indexes as efficiently. The seemingly redundant condition on currency1_id means that this would be able to use an index on [currency1_id, created_at]. Do benchmark before and after - sometimes 2 fast queries are better than one slow one!
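To show the conditional-aggregation idea in isolation, here is a small sketch using Python's built-in sqlite3 module rather than Rails. It folds both directions of the currency pair into one SUM with a CASE expression. The pair_total and make_orders_db helpers and the demo rows are assumptions for illustration, and the created_at filter from the question is left out to keep the example short.

```python
import sqlite3


def pair_total(conn, c1, c2):
    """Sum quantity for (c1, c2) rows and unitprice*quantity for (c2, c1) rows."""
    row = conn.execute(
        "SELECT COALESCE(SUM(CASE "
        "  WHEN currency1_id = ? AND currency2_id = ? THEN quantity "
        "  WHEN currency1_id = ? AND currency2_id = ? THEN unitprice * quantity "
        "  ELSE 0 END), 0) "
        "FROM orderexecutions "
        "WHERE is_master = 1 "
        "AND ((currency1_id = ? AND currency2_id = ?) "
        "  OR (currency1_id = ? AND currency2_id = ?))",
        (c1, c2, c2, c1, c1, c2, c2, c1),
    ).fetchone()
    return row[0]


def make_orders_db():
    """Build an in-memory orderexecutions table with hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orderexecutions (is_master INTEGER, currency1_id INTEGER, "
        "currency2_id INTEGER, quantity REAL, unitprice REAL)"
    )
    conn.executemany(
        "INSERT INTO orderexecutions VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 2, 10, 2.0),   # counted as quantity for pair (1, 2)
            (1, 2, 1, 5, 4.0),    # counted as unitprice * quantity for pair (1, 2)
            (1, 3, 1, 100, 1.0),  # other pair, ignored
            (0, 1, 2, 7, 1.0),    # not a master row, ignored
        ],
    )
    return conn


conn = make_orders_db()
print(pair_total(conn, 1, 2))  # 30.0  (10 + 5 * 4.0)
```

As noted above, whether a single statement like this beats two simpler queries depends on the available indexes, so it is worth measuring both.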

MySQL - What's wrong with the query?

I am trying to query a database to find the following.
If a customer searches for a hotel in a city between dates A and B, find and return the hotels in which rooms are free between the two dates.
There will be more than one room in each room type (i.e. 5 Rooms in type A, 10 rooms in Type B, etc.) and we have to query the database to find only those hotels in which there is at least one room free in at least one type.
This is my table structure:
**Structure for table 'reservations'**
reservation_id
hotel_id
room_id
customer_id
payment_id
no_of_rooms
check_in_date
check_out_date
reservation_date
**Structure for table 'hotels'**
hotel_id
hotel_name
hotel_description
hotel_address
hotel_location
hotel_country
hotel_city
hotel_type
hotel_stars
hotel_image
hotel_deleted
**Structure for table 'rooms'**
room_id
hotel_id
room_name
max_persons
total_rooms
room_price
room_image
agent_commision
room_facilities
service_tax
vat
city_tax
room_description
room_deleted
And this is my query:
$city_search = '15';
$check_in_date = '29-03-2010';
$check_out_date = '31-03-2010';
$dateFormat_check_in = "DATE_FORMAT('$reservations.check_in_date','%d-%m-%Y')";
$dateFormat_check_out = "DATE_FORMAT('$reservations.check_out_date','%d-%m-%Y')";
$dateCheck = "$dateFormat_check_in >= '$check_in_date' AND $dateFormat_check_out <= '$check_out_date'";
$query = "SELECT $rooms.room_id,
$rooms.room_name,
$rooms.max_persons,
$rooms.room_price,
$hotels.hotel_id,
$hotels.hotel_name,
$hotels.hotel_stars,
$hotels.hotel_type
FROM $hotels,$rooms,$reservations
WHERE $hotels.hotel_city = '$city_search'
AND $hotels.hotel_id = $rooms.hotel_id
AND $hotels.hotel_deleted = '0'
AND $rooms.room_deleted = '0'
AND $rooms.total_rooms - (SELECT SUM($reservations.no_of_rooms) as tot
FROM $reservations
WHERE $dateCheck
GROUP BY $reservations.room_id) > '0'";
The number of rooms already reserved in each room type in each hotel will be stored in the reservations table.
The thing is the query doesn't return any result at all. Even though it should if I calculate it myself manually.
I tried running the sub-query alone and I don't get any result. And I have lost quite some amount of hair trying to de-bug this query from yesterday. What's wrong with this? Or is there a better way to do what I mentioned above?
Edit: Code edited to remove a bug. Thanks to Mark Byers.
Sample Data in reservation table
1 1 1 2 1 3 2010-03-29 2010-03-31 2010-03-17
2 1 2 3 3 8 2010-03-29 2010-03-31 2010-03-18
5 1 1 5 5 4 2010-03-29 2010-03-31 2010-03-12
The sub-query should return
Room ID : 1 Rooms Booked : 7
Room ID : 2 Rooms Booked : 8
But it does not return any value at all. If I remove the dateCheck condition it returns
Room ID : 2 Rooms Booked : 8
Your problem is here:
$rooms.total_rooms - (SELECT SUM($reservations.no_of_rooms) as tot,
$rooms.room_id as id
FROM $reservations,$rooms
WHERE $dateCheck
GROUP BY $reservations.room_id) > '0'"
You are doing a subtraction total_rooms - (tot, id) where the first operand is a scalar value and the second is a table with two columns. Remove one of the columns in the result set and make sure you return only one row.
You also should use the JOIN keyword to make joins instead of separating the tables with commas. That way you won't forget to add the join condition.
You probably want something along these lines:
SELECT column1, column2, etc...
FROM $hotels
JOIN $rooms
    ON $hotels.hotel_id = $rooms.hotel_id
JOIN (
    SELECT SUM($reservations.no_of_rooms) as tot,
           $rooms.room_id as id
    FROM $reservations
    JOIN $rooms
        ON ??? /* Aren't you missing something here? */
    WHERE $dateCheck
    GROUP BY $reservations.room_id
) AS T1
    ON T1.id = room_id
WHERE $hotels.hotel_city = '$city_search'
    AND $hotels.hotel_deleted = '0'
    AND $rooms.room_deleted = '0'
    AND $rooms.total_rooms - T1.tot > '0'
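A runnable sketch of the join-against-a-grouped-subquery approach, using Python's built-in sqlite3 module instead of PHP/MySQL. Several things here are assumptions rather than part of the original answer: the free_rooms and make_hotel_db helpers, the total_rooms figures, a LEFT JOIN with COALESCE so that room types with no reservations still appear, and a date-overlap test on ISO-formatted dates (YYYY-MM-DD compares correctly as text, unlike dd-mm-YYYY). The hotel and city filters are omitted for brevity. The reservation rows are the sample data from the question.

```python
import sqlite3


def free_rooms(conn, check_in, check_out):
    """Return (room_id, free_count) for room types with availability."""
    return conn.execute(
        "SELECT r.room_id, r.total_rooms - COALESCE(b.booked, 0) AS free "
        "FROM rooms r "
        "LEFT JOIN ("
        "  SELECT room_id, SUM(no_of_rooms) AS booked "
        "  FROM reservations "
        "  WHERE check_in_date < ? AND check_out_date > ? "  # stay overlaps request
        "  GROUP BY room_id"
        ") b ON b.room_id = r.room_id "
        "WHERE r.room_deleted = 0 "
        "AND r.total_rooms - COALESCE(b.booked, 0) > 0 "
        "ORDER BY r.room_id",
        (check_out, check_in),
    ).fetchall()


def make_hotel_db():
    """In-memory tables; total_rooms values are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rooms (room_id INTEGER PRIMARY KEY, hotel_id INTEGER, "
        "total_rooms INTEGER, room_deleted INTEGER)"
    )
    conn.execute(
        "CREATE TABLE reservations (room_id INTEGER, no_of_rooms INTEGER, "
        "check_in_date TEXT, check_out_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO rooms VALUES (?, ?, ?, ?)",
        [(1, 1, 10, 0), (2, 1, 8, 0), (3, 1, 5, 0)],
    )
    conn.executemany(
        "INSERT INTO reservations VALUES (?, ?, ?, ?)",
        [
            (1, 3, "2010-03-29", "2010-03-31"),
            (2, 8, "2010-03-29", "2010-03-31"),
            (1, 4, "2010-03-29", "2010-03-31"),
        ],
    )
    return conn


conn = make_hotel_db()
print(free_rooms(conn, "2010-03-29", "2010-03-31"))  # [(1, 3), (3, 5)]
```

Room type 1 has 7 of 10 rooms booked, type 2 is fully booked and drops out, and type 3 has no reservations at all, which is the case an inner join would silently lose.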