I have this db structure
and these are my joined tables (sample data)
I want to filter for cars where (key = 'price' and value > 4000) and (key = 'top-speed' and value > 200).
Thanks for the help.
Try this:
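SELECT
c.id,
c.name,
s.key,
csv.value
FROM
car c
LEFT JOIN car_specification_value csv ON c.id = csv.car_id
LEFT JOIN specification s ON csv.specification_id = s.id
WHERE
-- a single row holds only one key, so the two conditions are combined with OR;
-- requiring both per car additionally needs GROUP BY ... HAVING COUNT(DISTINCT s.key) = 2
(s.key = 'price' AND csv.value > 4000)
OR (s.key = 'top-speed' AND csv.value > 200);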
A common practice is to give your tables aliases so the join conditions can be written more concisely.
One method uses aggregation:
select csv.car_id
from car_specification_value csv
where (csv.key = 'price' and (csv.value + 0) > 4000) or
(csv.key = 'top-speed' and (csv.value + 0) > 200)
group by csv.car_id
having count(distinct csv.key) = 2; -- both match
Note the + 0. This uses implicit conversion to change the value to a number, so it can be properly compared to a number. One challenge of key/value data structures is that all the values are strings, and that is tricky for other data types.
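A minimal sketch of the aggregation approach, run against an in-memory SQLite database from Python so it can be tried without a server. The table layout (key and value stored directly on car_specification_value) and the sample rows are assumptions made for illustration. Each row can satisfy only one of the two conditions, so they are OR'ed and the HAVING clause then requires that both keys matched for the same car.

```python
import sqlite3


def cars_matching_all(conn):
    """Return ids of cars with price > 4000 and top-speed > 200."""
    rows = conn.execute(
        """
        SELECT car_id
        FROM car_specification_value
        WHERE ("key" = 'price'     AND (value + 0) > 4000)
           OR ("key" = 'top-speed' AND (value + 0) > 200)
        GROUP BY car_id
        HAVING COUNT(DISTINCT "key") = 2   -- both conditions matched
        ORDER BY car_id
        """
    ).fetchall()
    return [car_id for (car_id,) in rows]


conn = sqlite3.connect(":memory:")
# Hypothetical schema: values are stored as text, as is typical for key/value tables.
conn.execute(
    'CREATE TABLE car_specification_value (car_id INTEGER, "key" TEXT, value TEXT)'
)
conn.executemany(
    "INSERT INTO car_specification_value VALUES (?, ?, ?)",
    [
        (1, "price", "5000"), (1, "top-speed", "250"),  # matches both
        (2, "price", "5000"), (2, "top-speed", "180"),  # too slow
        (3, "price", "3000"), (3, "top-speed", "220"),  # too cheap
    ],
)
matching = cars_matching_all(conn)
print(matching)
```

Only car 1 satisfies both conditions in this sample data.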
I opened up this new question because I'm not sure the user's request and wording matched each other: pandas left join where right is null on multiple columns
What is the equivalent pandas code to this SQL? Contextually we're finding entries from a column in table_y that aren't in table_x with respect to several columns.
SELECT
table_x.column,
table_x.column2,
table_x.column3,
table_y.column,
table_y.column2,
table_y.column3
FROM table_x
LEFT JOIN table_y
ON table_x.column = table_y.column
AND table_x.column2 = table_y.column2
WHERE
table_y.column2 is NULL
Is this it?
columns_join = ['column', 'column2']
data_y = data_y.set_index(columns_join)
data_x = data_x.set_index(columns_join)
data_diff = pandas.concat([data_x, data_y]).drop_duplicates(keep=False) # any row not in both
# Select the diff representative from each dataset - in case datasets are too large
x1 = data_x[data_x.index.isin(data_diff.index)]
x2 = data_y[data_y.index.isin(data_diff.index)]
# Perform an outer join with the joined indices from each set,
# then remove the entries only contributed from table_x
data_compare = x1.merge(x2, how = 'outer', indicator=True, left_index=True, right_index=True)
data_compare_final = (
data_compare
.query('_merge == "left_only"')
.drop('_merge', axis=1)
)
I don't think that's equivalent because we only removed entries from table_x that aren't in the join based on multiple columns. I think we have to continue and compare the column against table_y.
data_compare = data_compare.reset_index().set_index('column2')
data_y = data_y.reset_index().set_index('column2')
mask_column2 = data_y.index.isin(data_compare.index)
result = data_y[~mask_column2]
Without test data it is a bit difficult to be sure that this helps but you can try:
# Only if columns to join on in the right dataframe have the same name as columns in left
table_y[['col_join_1', 'col_join_2']] = table_y[['column', 'column2']] # Else this is not needed
# Merge left (LEFT JOIN)
table_merged = table_x.merge(
table_y,
how='left',
left_on=['column', 'column2'],
right_on=['col_join_1', 'col_join_2'],
suffixes=['_x', '_y']
)
# Filter dataframe
table_merged = table_merged.loc[
table_merged.column2_y.isna(),
['column_x', 'column2_x', 'column3_x', 'column_y', 'column2_y', 'column3_y']
]
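As a rough sketch of the same left join with a "right side is NULL" filter, the merge can also be done with indicator=True, which adds a _merge column that marks rows found only in the left frame. The sample frames below are made up for illustration, and the join columns are assumed to have the same names in both frames.

```python
import pandas as pd

# Hypothetical sample data; only the (1, "a") key pair exists in both frames.
table_x = pd.DataFrame({
    "column": [1, 2, 3],
    "column2": ["a", "b", "c"],
    "column3": [10, 20, 30],
})
table_y = pd.DataFrame({
    "column": [1, 3],
    "column2": ["a", "z"],
    "column3": [100, 300],
})

# LEFT JOIN on both key columns; _merge records where each row came from.
merged = table_x.merge(
    table_y,
    how="left",
    on=["column", "column2"],
    suffixes=("_x", "_y"),
    indicator=True,
)

# Keep rows of table_x without a partner in table_y (WHERE table_y.column2 IS NULL).
anti = merged.loc[merged["_merge"] == "left_only"].drop(columns="_merge")
print(anti)
```

Filtering on _merge avoids relying on a NaN in one particular right-hand column, which could also be a genuine missing value in the data.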
I found an equivalent that amounts to setting the index to the join column(s), unioning the tables, dropping the duplicates, and performing an outer join between the contributions to the union. From there, one can select
left_only for this equivalent SQL
SELECT
table_x.*,
table_y.*
FROM table_x
LEFT JOIN table_y
ON table_x.column = table_y.column
AND table_x.column2 = table_y.column2
WHERE
table_y.column2 is NULL
right_only for this equivalent SQL
SELECT
table_x.*,
table_y.*
FROM table_y
LEFT JOIN table_x
ON table_y.column = table_x.column
AND table_y.column2 = table_x.column2
WHERE
table_x.column2 is NULL
import pandas

def create_dataframe_joined_diffs(dataframe_prod, dataframe_new, columns_join):
    """
    Set the indices to columns_join
    Concat the dataframes and remove duplicates
    Select the diff representative from each dataset
    Perform an outer join on the indices
    Pseudo-SQL:
    SELECT
    UNIQUE(*)
    FROM dataframe_prod
    OUTER JOIN dataframe_new
    ON columns_join
    """
    data_new = dataframe_new.set_index(columns_join)
    data_prod = dataframe_prod.set_index(columns_join)
    # Get any row not in both (may be removing too many)
    data_diff = pandas.concat([data_prod, data_new]).drop_duplicates(keep=False)
    # Select the diff representative from each dataset
    x1 = data_prod[data_prod.index.isin(data_diff.index)]
    x2 = data_new[data_new.index.isin(data_diff.index)]
    # Perform an outer join and keep the joined indices from each set
    # Sort the columns to make them easier to compare
    data_compare = x1.merge(x2, how='outer', indicator=True, left_index=True, right_index=True).sort_index(axis=1)
    return data_compare
# dataframe_compare is the frame returned by create_dataframe_joined_diffs
mask_left = dataframe_compare['_merge'] == 'left_only'
mask_right = dataframe_compare['_merge'] == 'right_only'
I have a sql like this:
SELECT
userAddress.user_address_complete,
userAddress.user_address_point,
deliveryZone.delivery_zone_id,
St_contains(deliveryZone.delivery_zone_polygon,
Geomfromtext('POINT(userAddress.user_address_point)')) AS cnt
FROM user_addresses userAddress
LEFT JOIN delivery_zones deliveryZone
ON (deliveryZone.restaurants_id = 154
AND St_contains(deliveryZone.delivery_zone_polygon,
Geomfromtext('POINT(userAddress.user_address_point)'))
> 0)
WHERE userAddress.user_address_user_id = 1
The problem is that POINT(userAddress.user_address_point) should use the data in the userAddress.user_address_point field, but SQL treats it as part of a string literal rather than as a field name, so we get no result.
Any suggestions?
Try moving the column name out of the string literal and concatenating it in. In MySQL that means replacing the literal with
CONCAT('POINT(', userAddress.user_address_point, ')')
SELECT
userAddress.user_address_complete,
userAddress.user_address_point,
deliveryZone.delivery_zone_id,
St_contains(deliveryZone.delivery_zone_polygon,
Geomfromtext(CONCAT('POINT(', userAddress.user_address_point, ')'))) AS cnt
FROM user_addresses userAddress
LEFT JOIN delivery_zones deliveryZone
ON (deliveryZone.restaurants_id = 154
AND St_contains(deliveryZone.delivery_zone_polygon,
Geomfromtext(CONCAT('POINT(', userAddress.user_address_point, ')')))
> 0)
WHERE userAddress.user_address_user_id = 1
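To see why the column name has to sit outside the quotes, here is a small sketch using Python's built-in sqlite3 module. It only illustrates literal versus concatenated text and does not call any spatial function. SQLite concatenates with ||, where MySQL would use CONCAT. The stored value '51.5 -0.12' is an assumption about how the point is kept (as "x y" text).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_addresses (user_address_point TEXT)")
# Hypothetical row: the point is assumed to be stored as "x y" text.
conn.execute("INSERT INTO user_addresses VALUES ('51.5 -0.12')")

literal, built = conn.execute(
    """
    SELECT 'POINT(user_address_point)',             -- column name is just text here
           'POINT(' || user_address_point || ')'    -- column value is spliced in
    FROM user_addresses
    """
).fetchone()

print(literal)  # the quoted column name comes back unchanged
print(built)    # well-known text built from the stored value
```

The first expression never reads the column, which is why the original query found no matching zone.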
I'm facing a problem and can't find the answer. I'm querying a MySQL table from my Java process and I would like to exclude some rows from the result of my query.
Here is the query:
SELECT
o.offer_id,
o.external_cat,
o.cat,
o.shop,
o.item_id,
oa.value
FROM
offer AS o,
offerattributes AS oa
WHERE
o.offer_id = oa.offer_id
AND (cat = 1200000 OR cat = 12050200
OR cat = 13020304
OR cat = 3041400
OR cat = 3041402)
AND (oa.attribute_id = 'status_live_unattached_pregen'
OR oa.attribute_id = 'status_live_attached_pregen'
OR oa.attribute_id = 'status_dead_offer_getter'
OR oa.attribute_id = 'most_recent_status')
AND (oa.value = 'OK'
OR oa.value='status_live_unattached_pregen'
OR oa.value='status_live_attached_pregen'
OR oa.value='status_dead_offer_getter')
The trick here is that I need the value to be 'OK' in order to continue my process, but I don't need MySQL to return it in its response; I only need the other values to be returned. At the moment each query returns two rows, one with the 'OK' value and another with one of the other values.
I would like the return value to be like this:
'000005261383370', '10020578', '1200000', '562', '1000000_157795705', 'status_live_attached_pregen'
for my query, but it returns:
'000005261383370', '10020578', '1200000', '562', '1000000_157795705', 'OK'
'000005261383370', '10020578', '1200000', '562', '1000000_157795705', 'status_live_attached_pregen'
Some help would really be appreciated.
Thank you !
I think you can solve this with an INNER JOIN of the table to itself:
SELECT o.offer_id
,o.external_cat
,o.cat
,o.shop
,o.item_id
,oa.value
FROM offer AS o
INNER JOIN offerattributes AS oa
ON o.offer_id = oa.offer_id
INNER JOIN offerattributes AS oaOK
ON oaOK.offer_id = oa.offer_id
AND oaOK.value = 'OK'
WHERE o.cat IN (1200000,12050200,13020304,3041400,3041402)
AND oa.attribute_id IN ('status_live_unattached_pregen','status_live_attached_pregen','status_dead_offer_getter','most_recent_status')
AND oa.value IN ('status_live_unattached_pregen','status_live_attached_pregen','status_dead_offer_getter');
The self-JOIN restricted to the value 'OK' limits the result set to offer_ids that have an 'OK' row, while the WHERE clause still retrieves only the values you need. Based on your description, I think this is what you were looking for.
I also converted your implicit (comma) join to an explicit INNER JOIN and changed your ORs to IN, which is easier to read and should perform at least as well.
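The self-join idea can be tried on a toy dataset with Python's built-in sqlite3 module. This is only a sketch: the two offers and their attribute rows are invented, and the tables are cut down to the columns needed to show the effect. Offer A has an 'OK' row and a status row, offer B has only a status row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE offer (offer_id TEXT, cat INTEGER);
    CREATE TABLE offerattributes (offer_id TEXT, attribute_id TEXT, value TEXT);
    INSERT INTO offer VALUES ('A', 1200000), ('B', 1200000);
    INSERT INTO offerattributes VALUES
        ('A', 'most_recent_status', 'OK'),
        ('A', 'status_live_attached_pregen', 'status_live_attached_pregen'),
        ('B', 'status_dead_offer_getter', 'status_dead_offer_getter');
    """
)

rows = conn.execute(
    """
    SELECT o.offer_id, oa.value
    FROM offer AS o
    INNER JOIN offerattributes AS oa   ON oa.offer_id = o.offer_id
    -- second copy of the table: keeps only offers that also have an 'OK' row
    INNER JOIN offerattributes AS oaOK ON oaOK.offer_id = oa.offer_id
                                      AND oaOK.value = 'OK'
    WHERE oa.value <> 'OK'             -- but do not return the 'OK' row itself
    ORDER BY o.offer_id
    """
).fetchall()
print(rows)
```

Offer B is dropped because it has no 'OK' row, and offer A contributes only its non-'OK' row. If an offer could have several 'OK' rows, the join would repeat its other rows, so an EXISTS subquery may be a safer variant in that case.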
I want to use a Select query from mysql database in C:
mysql_query(conn,"SELECT SI AS SUBSCRIBER_ID ,TG2 AS TAG_ID, SUM(CTR) AS NBR FROM (SELECT H.SUBSCRIBER_ID AS SI, TG.TAG_ID AS TG1,T.TAG_ID AS TG2, COUNT(TG.TAG_ID) AS COUNTER,CASE WHEN (TG.TAG_ID = T.TAG_ID) THEN COUNT(TG.TAG_ID) ELSE 0 END AS CTR from content_hits H left join CONTENT_TAG TG ON TG.CONTENT_ID = H.CONTENT_ID LEFT JOIN TAG T ON 1= 1 GROUP BY H.SUBSCRIBER_ID, TG.TAG_ID,T.TAG_ID) AS TAB GROUP BY SI,TG2");
After that, I want to use 'NBR' to fill a one-dimensional array.
I tried this:
result = mysql_store_result(conn);
while ((row = mysql_fetch_row(result)))
{
t[i]=*row['NBR'];
printf("%d",t[i]);
}
But it didn't work.
You cannot access the row columns by name the way you do in t[i]=*row['NBR'];. Use, for example, fields = mysql_fetch_fields(result); to get the column names, then iterate through the fields array to find the index of the 'NBR' column. That index can then be used to read the value. Each entry of the row is a string (char *), so convert it, e.g. t[i]=atoi(row[id]);, and remember to increment i in the loop. This is all covered in the MySQL connector documentation: http://dev.mysql.com/doc/refman/5.0/en/mysql-fetch-fields.html
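The same pattern (look up the column's position from the result metadata once, then index each row by that position) can be sketched in Python with the built-in sqlite3 module. This is an analogy rather than the MySQL C API: cursor.description plays the role of mysql_fetch_fields, and the hits table with its rows is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for the aggregated query in the question.
conn.execute("CREATE TABLE hits (subscriber_id INTEGER, tag_id INTEGER, ctr INTEGER)")
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 10, 3), (2, 20, 4)],
)

cursor = conn.execute(
    "SELECT subscriber_id AS SUBSCRIBER_ID, tag_id AS TAG_ID, SUM(ctr) AS NBR "
    "FROM hits GROUP BY subscriber_id, tag_id ORDER BY subscriber_id, tag_id"
)

# Find the position of the NBR column from the result metadata.
column_names = [column[0] for column in cursor.description]
nbr_index = column_names.index("NBR")

# Fill a one-dimensional list with the NBR value of every row.
t = [int(row[nbr_index]) for row in cursor]
print(t)
```

In the C version the lookup loop over the fields array and the atoi conversion take the place of the list operations used here.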
I've written a simple linq query as follows:
var query = from c in context.ViewDeliveryClientActualStatus
join b in context.Booking on c.Booking equals b.Id
join bg in context.BookingGoods on c.Booking equals bg.BookingId
select new { c, b, bg };
I have filtered the previous query by a number of conditions and then needed to group by a set of fields and get the sum of some of them, like so:
var rows = from a in query
group a by new {h = a.c.BookingRefex, b = a.c.ClientRefex, c = a.b.PickupCity, d = a.b.PickupPostalCode} into g
select new
{
Booking_refex = g.Key.h,
Client_refex = g.Key.b,
//Local = g.
Sum_Quan = g.Sum(p => p.bg.Quantity),
};
I'd like to get a few values from a which I haven't included in the group by clause. How can I get those values? They're not accessible through g.
The g in your LINQ expression is an IEnumerable containing a's with an extra property Key. If you want to access fields of a that are not part of Key you will have to perform some sort of aggregation or selection. If you know that a particular field is the same for all elements in the group you can pick the value of the field from the first element in the group. In this example I assume that c has a field named Value:
var rows = from a in query
group a by new {
h = a.c.BookingRefex,
b = a.c.ClientRefex,
c = a.b.PickupCity,
d = a.b.PickupPostalCode
} into g
select new {
BookingRefex = g.Key.h,
ClientRefex = g.Key.b,
SumQuantity = g.Sum(p => p.bg.Quantity),
Value = g.First().c.Value
};
However, if c.Value is the same within a group you might as well include it in the grouping and access it using g.Key.cValue.
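For readers who want to experiment with the idea outside LINQ, here is a rough Python analogue: rows are grouped by a composite key, one field is summed per group, and a non-key field is read from the first element of each group. The record layout and values are invented for illustration, and taking the first element is only meaningful under the same assumption as above, namely that the field is constant within a group.

```python
from collections import defaultdict

# Hypothetical flattened rows standing in for the joined query result.
rows = [
    {"booking_refex": "B1", "client_refex": "C1", "quantity": 2, "value": "x"},
    {"booking_refex": "B1", "client_refex": "C1", "quantity": 3, "value": "x"},
    {"booking_refex": "B2", "client_refex": "C1", "quantity": 5, "value": "y"},
]

# Group by the composite key (the part that LINQ exposes as g.Key).
groups = defaultdict(list)
for row in rows:
    groups[(row["booking_refex"], row["client_refex"])].append(row)

summary = [
    {
        "booking_refex": key[0],
        "client_refex": key[1],
        "sum_quantity": sum(member["quantity"] for member in members),
        "value": members[0]["value"],  # analogue of g.First().c.Value
    }
    for key, members in groups.items()
]
print(summary)
```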
Just add those fields to the grouping key:
new {h = a.c.BookingRefex, b = a.c.ClientRefex, c = a.b.PickupCity, d = a.b.PickupPostalCode}
They will then be accessible through g.Key.