Not the same result between SQLite and MariaDB SELECT

I have two tables on a Debian server, rides and steps.
From a Xamarin app I fetch data from this server (using the Refit, Newtonsoft Json and SQLite-net PCL packages) to populate local tables.
When I run this query on MariaDB:
SELECT 1_steps.*
FROM 1_rides, 1_steps
WHERE 1_rides.id=1_steps.ride_id
AND 1_rides.start=1
GROUP BY 1_rides.id
I get the correct result (the first step of each ride, so it starts with 1).
But when I use the equivalent for SQLite:
SELECT Steps.*
FROM Rides,Steps
WHERE Rides.Id=Steps.RideId
AND Rides.Start=1
GROUP BY Rides.Id
In the result, I get the last step of each (same) ride!
On both MariaDB and SQLite, each table has a primary key (the id field).
I checked: the data is sent, received and saved in the same order.
The rows are simply inserted in the mobile app with:
foreach (var step in await App.RestClient.getSteps())
if (dbCon.InsertOrReplace(step) != 1)
....
I tried adding ORDER BY Rides.Id but that does not change anything.

You are relying on something that is not allowed by strict SQL standards: whenever you have a group by clause, the fields in the select clause must either appear in the group by clause as well, or must be aggregations (e.g. min, count), or must be functionally dependent on the group by fields.
In your case those conditions are not met, so if the DB engine allows the query at all, it has to decide which value to pick within each group: the first, the last, or something else entirely.
The way to deal with this is to be explicit about what you want in such a case, by specifying an aggregation:
SELECT 1_rides.id,
min(1_steps.step),
max(1_steps.whatever),
avg(1_steps.some_number)
FROM 1_rides
INNER JOIN 1_steps
ON 1_rides.id=1_steps.ride_id
WHERE 1_rides.start=1
GROUP BY 1_rides.id
You did not specify the fields of your table, but the idea should be clear: list the fields separately (not *), and apply to each one the type of aggregation you need.
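As a rough illustration of the explicit-aggregation approach, here is a small sketch that runs against an in-memory SQLite database through Python's sqlite3 module. The table names rides and steps (without the 1_ prefix) and the step column are assumptions, since the real schema was not posted.

```python
import sqlite3

# Hypothetical schema: the question does not list the real columns,
# so `step` (an ordinal within a ride) is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rides (id INTEGER PRIMARY KEY, start INTEGER);
    CREATE TABLE steps (id INTEGER PRIMARY KEY, ride_id INTEGER, step INTEGER);
    INSERT INTO rides VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO steps (ride_id, step) VALUES
        (1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (3, 1);
""")

# Every selected column is either the grouping key or an aggregate,
# so the result does not depend on which row the engine happens to pick.
first_and_last = conn.execute("""
    SELECT rides.id, MIN(steps.step), MAX(steps.step), COUNT(*)
    FROM rides
    INNER JOIN steps ON rides.id = steps.ride_id
    WHERE rides.start = 1
    GROUP BY rides.id
    ORDER BY rides.id
""").fetchall()

print(first_and_last)
```

Because nothing in the select list is left for the engine to choose, the same statement should give the same answer on MariaDB and SQLite.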
Alternative
If you are not interested in aggregating anything, but just want one particular record from steps per ride, then don't use group by, but specify the condition that filters exactly that one record from steps:
SELECT 1_steps.*
FROM 1_rides
INNER JOIN 1_steps
ON 1_rides.id=1_steps.ride_id
WHERE 1_rides.start=1
AND 1_steps.step = 1
ORDER BY 1_rides.id
Note the condition 1_steps.step = 1: you'll have to decide what that condition should be of course.
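For what it's worth, one possible shape of that condition is "the lowest step number of the same ride", written as a correlated subquery. The sketch below is only an illustration on an in-memory SQLite database; the table names rides and steps and the step column are assumptions, not the asker's actual schema.

```python
import sqlite3

# Hypothetical data: ride 3 has start = 0 and must not appear in the result.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rides (id INTEGER PRIMARY KEY, start INTEGER);
    CREATE TABLE steps (id INTEGER PRIMARY KEY, ride_id INTEGER, step INTEGER);
    INSERT INTO rides VALUES (1, 1), (2, 1), (3, 0);
    INSERT INTO steps (id, ride_id, step) VALUES
        (10, 1, 1), (11, 1, 2), (12, 1, 3), (20, 2, 1), (21, 2, 2), (30, 3, 1);
""")

# No GROUP BY: the WHERE clause itself selects exactly one step per ride.
# The subquery is one possible form of the filter: the lowest step number
# recorded for the same ride.
first_steps = conn.execute("""
    SELECT steps.*
    FROM rides
    INNER JOIN steps ON rides.id = steps.ride_id
    WHERE rides.start = 1
      AND steps.step = (SELECT MIN(s.step) FROM steps AS s
                        WHERE s.ride_id = rides.id)
    ORDER BY rides.id
""").fetchall()

print(first_steps)
```

If the first step always carries a known value, the plain comparison from the query above is simpler than the subquery.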

Related

Can SqlAlchemy's array_agg function accept more than one column?

I want to return arrays with data from the entire row (so all columns), not just a single column. I can do this with a raw sql statement in Postgresql,
SELECT
array_agg(users.*)
FROM users
WHERE
l_name LIKE 'Br%'
GROUP BY f_name;
but when I try to do it with SqlAlchemy, I'm getting
sqlalchemy.exc.ProgrammingError: (psycopg2.ProgrammingError) can't adapt type 'InstrumentedAttribute'
For example, when I execute this query, it works fine
query: Query[User] = session.query(array_agg(self.user.f_name))
But with this I get arrays of rows with only one column value in them (in this example, the first name of a user) whereas I want the entire row (all columns for a user).
I've tried explicitly listing multiple columns, but to no avail. For example I've tried this:
query: Query[User] = session.query(array_agg((self.user.f_name, self.user.l_name)))
But it doesn't work. I get the above error message.
You could use Python's argument unpacking to build one array_agg per column:
example = [func.array_agg(column) for column in self.example.__table__.columns]
query = self.dbsession.query(*example)
And afterwards join the results.

MySQL 5.7 RAND() and IF() without LIMIT leads to unexpected results

I have the following query
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table LIMIT 20) t
which returns something like this:
That's exactly what you would expect. However, as soon as I remove the LIMIT 20 I receive highly unexpected results (there are more rows returned than 20, I cut it off to make it easier to read):
SELECT t.res, IF(t.res=0, "zero", "more than zero")
FROM (
SELECT table.*, IF (RAND()<=0.2,1, IF (RAND()<=0.4,2, IF (RAND()<=0.6,3,0))) AS res
FROM table) t
Side notes:
I'm using MySQL 5.7.18-15-log and this is a highly abstracted example (real query is much more difficult).
I'm trying to understand what is happening. I do not need answers that offer workarounds without any explanation of why the original version is not working. Thank you.
Update:
Instead of using LIMIT, GROUP BY id also works in the first case.
Update 2:
As requested by zerkms, I added t.res = 0 and t.res + 1 to the second example
The problem is caused by a change introduced in MySQL 5.7 on how derived tables in (sub)queries are treated.
Basically, in order to optimize performance, some subqueries are executed at different times and / or multiple times leading to unexpected results when your subquery returns non-deterministic results (like in my case with RAND()).
There are two easy (and equally ugly) workarounds to get MySQL to "materialize" these subqueries (that is, evaluate them once and reuse the stored result): use LIMIT <high number> or GROUP BY id, both of which force MySQL to materialize the subquery and return the expected results.
The last option is to turn off derived_merge in the optimizer_switch variable: derived_merge=off (make sure to leave all the other parameters as they are).
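To make the difference between a merged and a materialized derived table more concrete, here is a toy model in plain Python. It is not MySQL's actual executor, just a sketch under the assumption that merging replaces every outer reference to t.res with the RAND() expression itself; the function names are made up for illustration.

```python
import random

def res_expr(rng):
    # Mirrors IF(RAND()<=0.2, 1, IF(RAND()<=0.4, 2, IF(RAND()<=0.6, 3, 0)))
    if rng.random() <= 0.2:
        return 1
    if rng.random() <= 0.4:
        return 2
    if rng.random() <= 0.6:
        return 3
    return 0

def label(res):
    # Mirrors IF(t.res=0, "zero", "more than zero")
    return "zero" if res == 0 else "more than zero"

def materialized(rng, n):
    # The derived table is computed once per row; both outer columns
    # read the same stored value.
    rows = []
    for _ in range(n):
        res = res_expr(rng)
        rows.append((res, label(res)))
    return rows

def merged(rng, n):
    # Each outer reference to t.res is replaced by the expression itself,
    # so the two columns come from independent random draws.
    return [(res_expr(rng), label(res_expr(rng))) for _ in range(n)]

rng = random.Random(0)
consistent = materialized(rng, 1000)
inconsistent = merged(rng, 1000)
mismatches = sum(1 for res, text in inconsistent if label(res) != text)
print(mismatches, "of 1000 merged rows have a label that contradicts res")
```

In the materialized variant the label always agrees with res, while in the merged variant a sizeable share of rows pairs a non-zero res with "zero" or the other way round, which is the kind of output described in the question.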
Further readings:
https://mysqlserverteam.com/derived-tables-in-mysql-5-7/
Subquery's rand() column re-evaluated for every repeated selection in MySQL 5.7/8.0 vs MySQL 5.6

Mysql multiple AND and one OR

This is my query
SELECT time FROM logs WHERE (pstatus!=\"6\" OR ptype!=\"7\") AND uid=\"$id\" AND project=\"$pid\" GROUP BY project;
I want to get only the values (time) from rows that have neither pstatus=6 nor ptype=7.
What am I doing wrong? It currently prints all values.
Have you tried using single quotes? Like
SELECT time FROM logs WHERE pstatus!='6' AND ptype!='7'
AND uid='$id' AND project='$pid' GROUP BY project;
Also, if you want to exclude rows where "pstatus=6 or ptype=7" then the negated form must use AND.
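The OR versus AND point can be checked quickly on a throwaway table. The following is a minimal sketch using Python's sqlite3 module; the four sample rows and the times helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE logs (time INTEGER, pstatus INTEGER, ptype INTEGER);
    INSERT INTO logs VALUES (1, 6, 7), (2, 6, 1), (3, 1, 7), (4, 1, 1);
""")

def times(where):
    # Hypothetical helper: returns the time values matching a fixed WHERE clause.
    sql = "SELECT time FROM logs WHERE " + where + " ORDER BY time"
    return [row[0] for row in conn.execute(sql)]

# OR only drops rows where BOTH pstatus = 6 and ptype = 7.
with_or = times("pstatus != 6 OR ptype != 7")
# AND drops rows where EITHER pstatus = 6 or ptype = 7.
with_and = times("pstatus != 6 AND ptype != 7")
# Same as with_and, by De Morgan's law.
negated = times("NOT (pstatus = 6 OR ptype = 7)")

print(with_or, with_and, negated)
```

With OR, only the row that matches both values is filtered out, which is why the original query appeared to return almost everything.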

SUM(IF(COND,EXPR,NULL)) and IF(COND, SUM(EXPR),NULL)

I'm working on generating SQL requests by parsing Excel-like formulas.
For a given formula, I get this request:
SELECT IF(COL1='Y', SUM(EXPR),NULL)
FROM Table
I don't get the results I want. If I manually rewrite the request like this it works :
SELECT SUM(IF(COL1='Y', EXPR, NULL))
FROM Table
Also, the first request produces the right value if I add a GROUP BY statement, for COL1='Y' row :
SELECT IF(COL1='Y', SUM(EXPR),NULL)
FROM Table
GROUP BY COL1
Is there a way to keep the first syntax IF(COND, SUM(EXPR), NULL) and slightly edit it to make it work without a GROUP BY statement?
You have to use GROUP BY because you mix SUM with the non-aggregated column COL1: without it, the whole table collapses into a single group and the SQL engine cannot tell which row's COL1 value the IF should test.
Alternatively you could summarize this column only:
SELECT SUM(EXPR)
FROM Table
WHERE COL1='Y'
But then you would have to run a separate query for each such column, which is not recommended for performance reasons.
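For comparison, the SUM(IF(...)) form from the question can compute several conditional sums in one pass. Below is a small sketch on SQLite via Python's sqlite3 module; it uses CASE instead of MySQL's IF, and the table name t and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (col1 TEXT, expr INTEGER);
    INSERT INTO t VALUES ('Y', 10), ('N', 5), ('Y', 7), ('N', 1);
""")

# SQLite-compatible spelling of SUM(IF(COL1='Y', EXPR, NULL)):
# CASE without ELSE yields NULL, and SUM ignores NULLs.
sum_y, sum_n = conn.execute("""
    SELECT SUM(CASE WHEN col1 = 'Y' THEN expr END),
           SUM(CASE WHEN col1 = 'N' THEN expr END)
    FROM t
""").fetchone()

# The WHERE form gives the same value for a single condition.
(sum_y_where,) = conn.execute(
    "SELECT SUM(expr) FROM t WHERE col1 = 'Y'"
).fetchone()

print(sum_y, sum_n, sum_y_where)
```

Here the condition sits inside the aggregate, so it is evaluated per row and no GROUP BY is needed.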

AS clause in influx DB

How to use AS clause in influxDB?
SELECT os_family AS OsName, os_Image AS PlatformIcon FROM statistics
When I run this query I get the following error:
ERROR: syntax error, unexpected AS, expecting FROM
SELECT os_family AS OsName, os_Image AS PlatformIcon FROM statistics
^^
How to use SQL like AS clause in influx DB?
The AS clause in InfluxDB is meant to be used in two cases (described below). That said, we can definitely add this as a new feature if it is a common use case. I don't know if that's necessary though: if you want to get the columns back as OsName, why did you name the column os_family to begin with? We can discuss this further on the mailing list (you can find information about the mailing list, among other means to reach us, here: http://influxdb.com/community/)
The two use cases are for joining multiple time series:
select * from foo as f inner join bar as b where f.somecolumn > 0
and to alias aggregators:
select count(value) as c from foo where foo.value > 100