SQL select everything with arbitrary IN clause - mysql

This will sound silly, but trust me it is for a good (i.e. over-engineered) cause.
Is it possible to write a SQL query using an IN clause which selects everything in that table without knowing anything about the table? Keep in mind this would mean you can't use a subquery that references the table.
In other words I would like to find a statement to replace "SOMETHING" in the following query:
SELECT * FROM table_a WHERE table_a.id IN (SOMETHING)
so that the results are identical to:
SELECT * FROM table_a
by doing nothing beyond changing the value of "SOMETHING"
To satisfy the curious I'll share the reason for the question.
1) I have a FactoryObject abstract class which grants all models that extend it some glorious factory method magic using two template methods: getData() and load()
2) Models must implement the template methods. getData is a static method that accepts ID constraints, pulls rows from the database, and returns a set of associative arrays. load is not static, accepts an associative array, and populates the object based on that array.
3) The non-abstract part of FactoryObject implements a getObject() and a getObjects() method. These call getData, create objects, and loads() the array responses from getData to create and return populated objects.
getObjects() requires ID constraints as an input, either in the form of a list or in the form of a subquery, which are then passed to getData(). I wanted to make it possible to pass in no ID constraints to get all objects.
The problem is that only the models know about their tables. getObjects() is implemented at a higher level and so it doesn't know what to pass getData(), unless there was a universal "return everything" clause for IN.
There are other solutions. I can modify the API to require getData to accept a special parameter and return everything, or I can implement a static getAll[ModelName]s() method at the model level which calls:
static function getAllModelObjects() {
return getObjects("select [model].id from [model]");
}
This is reasonable and may fit the architecture anyway, but I was curious so I thought I would ask!

Works on SQL Server:
SELECT * FROM table_a WHERE table_a.id IN (table_a.id)

Okay, I hate saying no so I had to come up with another solution for you.
Since mysql is opensource you can get the source and incorporate a new feature that understands the infinity symbol. Then you just need to get the mysql community to buy into the usefulness of this feature (steer the conversation away from security as much as possible in your attempts to do so), and then get your company to upgrade their dbms to the new version once this feature has been implemented.
Problem solved.

The answer is simple. The workaround is to add some criteria like these:
# to query on a number column
AND (-1 in (-1) OR sample_table.sample_column in (-1))
# or to query on a string column
AND ('%' in ('%') OR sample_table.sample_column in ('%'))
Therefore, in your example, two following queries should return the same result as soon as you pass -1 as the parameter value.
SELECT * FROM table_a;
SELECT * FROM table_a WHERE (-1 in (-1) OR table_a.id in (-1));
And whenever you want to filter something out, you can pass it as a parameter. For example, in the following query, the records with id of 1, 2 and 6 are filtered.
SELECT * FROM table_a WHERE (-1 in (1, 2, 6) OR table_a.id in (1, 2, 6));
In this case, we have a default value like -1 or % and we have a parameter that can be anything. If the parameter is the default value, nothing is filtered.
I suggest % character as the default value if you are querying over a text column or -1 if you are querying over the PK of the table. But it totally depends to you to substitute % or -1 with any reserved character or number that you decide on.

similiar to #brandonmoore:
select * from table_a where table_a.id not in ('0')

How about:
select * from table_a where table_a.id not ine ('somevaluethatwouldneverpossiblyexistintable_a.id')
EDIT:
As much as I would like to continue thinking of a way to solve your problem, I know there isn't a way to so I figure I'll go ahead and be the first person to tell you so I can at least get credit for the answer. It's truly a bittersweet victory though :/
If you provide more info though maybe I or someone else can help you think of another workaround.

Related

how to check if a column has truthy value using laravel's eloquent

Suppose I have Post model that has is_verified column with smallint datatype, how can I get all records that is verified? One thing to do this is using this:
Post::where('is_verified', true)->get();
The code above will produce the following query:
select * from `posts` where `posts`.`is_verified` = true
... which will get me all verified Post records; in note that is_verified on all existing records is either 0 or 1.
However, after I get myself curious and try to manually change some is_verified's record value from 1 to another truthy number e.g. 2, the above eloquent query didn't work as expected anymore: records with is_verified value of 2 didn't get retrieved.
I tried to execute the sql query directly from HeidiSQL as well, but it was just the same. Then I tried to change the = in the sql query to is, and now it's working as expected i.e. all records with truthy is_verified get retrieved:
select * from `posts` where `posts`.`is_verified` is true
So my questions are:
Does the above behaviour is correct and expected?
How can I execute the last sql query in eloquent? One thing I can think of is where('is_verified', '!=', 0) but that feels weird in terms of readability especially when the query is pretty long and a bit complicated
As I stated before, the is_verified column is a smallint. Does this affects the behaviour? Because this conversation here states that boolean column datatype is typically tinyint, not smallint.
And that's it. Thank you in advance!
It is not the correct way to handle boolean values, you shouldn't save boolean columns as smallint, you can use the explicit boolean column type as described in the documentation.
Once you setup the boolean field correctly the logic you have in place will work. So Post::where('is_verified', true)->get(); will return the expected results.
Yes, the problem is the smallint column type, if you put tinyint it also should work like the boolean column. You can read more about the differences here.
After doing some deeper digging, I would like to write down the things I've found:
I have updated my mysql to the newest version as of now (v8) and boolean datatype defined in migration results in tinyint(1) in the db. This is happening turns out because in mysql bool or boolean are actually just the synonyms of tinyint(1), so that was a totally normal behaviour, not due to lower-version issues.
I found #dz0nika answer that states that smallint and tinyint results in different behaviour in the query to be quite incorrect. The two datatypes simply differ in terms of byte-size while storing integer value.
As of mysql documentation, it is stated that:
A value of zero is considered false. Nonzero values are considered true.
But also that:
However, the values TRUE and FALSE are merely aliases for 1 and 0, respectively.
Meaning that:
select * from `posts` where `posts`.`is_verified` = true;
Is the same as
select * from `posts` where `posts`.`is_verified` = 1;
Thus the query will only get Post records with is_verified value of 1.
To get Post records with truthy is_verified value, wether 1, or 2, or 3, etc; use is instead of = in the query:
select * from `posts` where `posts`.`is_verified` is true;
You can read more about these informations here and here (look for the "boolean" part)
So, how about the eloquent query? How can we get Post with truthy is_verified using eloquent?
I still don't know what's best. But instead of using where('is_verified', '!=', 0) as I stated in my question, I believe it's better to use whereRaw() instead:
Post::whereRaw('posts.is_verified is true')->get();
If you found this information to be quite missing or incorrect, please kindly reply. Your opinion is much appreciated.

Best way in Doctrine to load only entities which have attached entities

I have an entity, let's call it Foo and a second one Bar
Foo can (but doesn't have to) have one or multiple Bar entries assigned. It looks something like this:
/**
* #ORM\OneToMany(targetEntity="Bar", mappedBy="foo")
* #ORM\OrderBy({"name" = "ASC"})
*/
private $bars;
I now would like to load in one case only Foo entities that have at least one Bar entity assigned. Previously, there was one foreach loop to traverse all Foo entries and if it had assigned entries, the Foo entry got assigned to an array.
My current implementation is in the FooRepository a function called findIfTheresBar which looks like this:
$qb = $this->_em->createQueryBuilder()
->select('e')
->from($this->_entityName, 'e')
/* some where stuff here */
->addOrderBy('e.name', 'ASC')
->join('e.bars', 'b')
->groupBy('e.id');
Is this the correct way to load such entries? Is there a better (faster) way? It kind of feels as if it should have a having(...) in the query.
EDIT:
I've investigated it a little further. The query should return 373 out of 437 entries.
Version 1: only using join(), this loaded 373 entries in 7.88ms
Version 2: using join() and having(), this loaded 373 entries in 8.91ms
Version 3: only using leftJoin(), this loaded all 437 entries (which isn't desired) in 8.05ms
Version 4: using leftJoin() and having(), this loaded 373 entries in 8.14ms
Since Version 1 which only uses an innerJoin as #Chausser pointed out, is the fastest, I will stick to that one.
Note: I'm not saying Version 1 will be the fastest in all scenarios and on every hardware, so kind of a follow up question, does anybody know about a performance comparison?
Please take a look at this answer for more information on how SQL JOINs work: https://stackoverflow.com/a/16598900/1307183
Using a join, which is an alias of innerJoin, is exactly what you want. This only returns records where entries exist in both Foo and Bar - aka where the association/attached entity exists. This calls INNER JOIN in SQL, which, if your database structure is defined correctly, is the absolute best and fastest way to get the data you want.
Using a leftJoin calls LEFT JOIN in SQL, which returns all records from Foo, even if there is no Bar associated with it (for example, where bar_id in your foo table would be null).
You have no reason to use having() in any of the above scenarios you described. If you want to filter further you would do that with a ->addWhere() function. Using the having() clause is something you would only want to do if you were selecting aggregate data in your original query (like SELECT SUM(field) AS sum_field).

Multiple, unknown number of fields passed into a query

Is it possible to create a generic query that would work for different types of documents? For example I have "cases" and "factories",
They have different set of fields. e.g:
{
id: 'case_o1',
name: 'Case numero uno',
amount: 40
}
{
id: 'factory_002',
location: 'Venezuela',
workers: 200,
operating: true
}
Is it possible to create a generic query where I would pass the type of an entity (case or factory) and additional parameters and it would filter results based on those?
I could of course use javascript view, but it doesn't allow me to filter by multiple fields. Let's say I want to fetch all factories located in Venezuela, with number of workers between 20 and 55.
I started with this, but then I got stuck:
select * from `mybucket` as entity
where position(meta(entity).id, $entity_type) == 0
How do I pass multiple predicates and have the query to recognize them?
I can of course list fields like this:
where position(meta(entity).id, $entity_type) == 0
and entity.location == 'Venezuela'
and entity.workers > $workers_min
and entity.workers < $workers_max
but then
I'm gonna have to create a separate query for each entity
And even then it won't solve my problem - I have no idea how to ignore predicates, what if next time $workers_min and $workers_max are not passed, does it mean I have to create a query for every single predicate (column)?
For security reasons I cannot generate free-form queries and pass them to Couchbase server, all the queries are already stored in the database, our api just picks them up out of a document and executes them
I think it's possible to create a query that would be "short-circuiting" for args that's undefined (e.g. WHERE $location IS MISSING OR entity.location == $location or something like that)
Is it possible at all to create a query that would be able to effectively filter and order a dataset based on arbitrary parameters? Or there's no way?
#Agzam. Sorry. I were writting my comment when you said it. But anyway. What you are asking for is possible by using coalesces in a not too complex expressions, but it is a REALLY bad idea because this will drastically throw down most of internal database optimizations. Including the use of any existing index. So, except if you are dealing with a relatively small database (and you are sure it will remain being approximately the same size), I suggest you to better try distinct approach… This is, in fact, the reason I implmented sqlapi.
If you need to have all querys previously stored in database, it probably could be much better to sort given arguments by its name and precalculate and store precalculated querys for each possible combination.
You can do it by assigning a default value to the variable when is not used. For instance if $location is not used you can set it to -1 as default value.
Then the where condition would be:
WHERE ($location=-1 OR entity.location = $location)

Why does MySQL permit non-exact matches in SELECT queries?

Here's the story. I'm testing doing some security testing (using zaproxy) of a Laravel (PHP framework) application running with a MySQL database as the primary store for data.
Zaproxy is reporting a possible SQL injection for a POST request URL with the following payload:
id[]=3-2&enabled[]=on
Basically, it's an AJAX request to turn on/turn off a particular feature in a list. Zaproxy is fuzzing the request: where the id value is 3-2, there should be an integer - the id of the item to update.
The problem is that this request is working. It should fail, but the code is actually updating the item where id = 3.
I'm doing things the way I'm supposed to: the model is retrieved using Eloquent's Model::find($id) method, passing in the id value from the request (which, after a bit of investigation, was determined to be the string "3-2"). AFAIK, the Eloquent library should be executing the query by binding the ID value to a parameter.
I tried executing the query using Laravel's DB class with the following code:
$result = DB::select("SELECT * FROM table WHERE id=?;", array("3-2"));
and got the row for id = 3.
Then I tried executing the following query against my MySQL database:
SELECT * FROM table WHERE id='3-2';
and it did retrieve the row where id = 3. I also tried it with another value: "3abc". It looks like any value prefixed with a number will retrieve a row.
So ultimately, this appears to be a problem with MySQL. As far as I'm concerned, if I ask for a row where id = '3-2' and there is no row with that exact ID value, then I want it to return an empty set of results.
I have two questions:
Is there a way to change this behaviour? It appears to be at the level of the database server, so is there anything in the database server configuration to prevent this kind of thing?
This looks like a serious security issue to me. Zaproxy is able to inject some arbitrary value and make changes to my database. Admittedly, this is a fairly minor issue for my application, and the (probably) only values that would work will be values prefixed with a number, but still...
SELECT * FROM table WHERE id= ? AND ? REGEXP "^[0-9]$";
This will be faster than what I suggested in the comments above.
Edit: Ah, I see you can't change the query. Then it is confirmed, you must sanitize the inputs in code. Another very poor and dirty option, if you are in an odd situation where you can't change query but can change database, is to change the id field to [VAR]CHAR.
I believe this is due to MySQL automatically converting your strings into numbers when comparing against a numeric data type.
https://dev.mysql.com/doc/refman/5.1/en/type-conversion.html
mysql> SELECT 1 > '6x';
-> 0
mysql> SELECT 7 > '6x';
-> 1
mysql> SELECT 0 > 'x6';
-> 0
mysql> SELECT 0 = 'x6';
-> 1
You want to really just put armor around MySQL to prevent such a string from being compared. Maybe switch to a different SQL server.
Without re-writing a bunch of code then in all honesty the correct answer is
This is a non-issue
Zaproxy even states that it's possibly a SQL injection attack, meaning that it does not know! It never said "umm yeah we deleted tables by passing x-y-and-z to your query"
// if this is legal and returns results
$result = DB::select("SELECT * FROM table WHERE id=?;", array("3"));
// then why is it an issue for this
$result = DB::select("SELECT * FROM table WHERE id=?;", array("3-2"));
// to be interpreted as
$result = DB::select("SELECT * FROM table WHERE id=?;", array("3"));
You are parameterizing your queries so Zaproxy is off it's rocker.
Here's what I wound up doing:
First, I suspect that my expectations were a little unreasonable. I was expecting that if I used parameterized queries, I wouldn't need to sanitize my inputs. This is clearly not the case. While parameterized queries eliminate some of the most pernicious SQL injection attacks, this example shows that there is still a need to examine your inputs and make sure you're getting the right stuff from the user.
So, with that said... I decided to write some code to make checking ID values easier. I added the following trait to my application:
trait IDValidationTrait
{
/**
* Check the ID value to see if it's valid
*
* This is an abstract function because it will be defined differently
* for different models. Some models have IDs which are strings,
* others have integer IDs
*/
abstract public static function isValidID($id);
/**
* Check the ID value & fail (throw an exception) if it is not valid
*/
public static function validIDOrFail($id)
{
...
}
/**
* Find a model only if the ID matches EXACTLY
*/
public static function findExactID($id)
{
...
}
/**
* Find a model only if the ID matches EXACTLY or throw an exception
*/
public static function findExactIDOrFail($id)
{
...
}
}
Thus, whenever I would normally use the find() method on my model class to retrieve a model, instead I use either findExactID() or findExactIDOrFail(), depending on how I want to handle the error.
Thank you to everyone who commented - you helped me to focus my thinking and to understand better what was going on.

when using union that uses values from a form it creates a error?

I have this union statement when I try to take parameters from a form and pass it to a union select statement it says too many parameters. This is using MS ACCESS.
SELECT Statement FROM table 1 where Date = Between [Forms]![DateIN]![StartDate]
UNION
SELECT Statement FROM table 2 where Date = Between [Forms]![DateIN]![StartDate]
This is the first time I am using windows DB applications to do Database apps. I am Linux type of person and always use MySQL for my projects but for this one have to use MS Access.
Is there anther way to pass parameters to UNION Statement because this method of defining values in a form can work on Single SELECT statements. But I don't know why this problem exist.
Between "Determines whether the value of an expression falls within a specified range of values" like this ...
expr [Not] Between value1 And value2
But your query only gives it one value ... Between [Forms]![DateIN]![StartDate]
So you need to add And plus another date value ...
Between [Forms]![DateIN]![StartDate] And some_other_date
Also Date is a reserved word. If you're using it as a field name, enclose it in brackets to avoid confusing the db engine: [Date]
If practical, rename the field to avoid similar problems in the future.
And as Gord pointed out, you must also bracket table names which include a space. The same applies to field names.
Still getting problems when using this method of calling the values or dates from the form to be used on the UNION statement. Here is the actual query that I am trying to use.
I don't want to recreate the wheel but I was thinking that if the Date() can be used with between Date() and Date()-6 to represent a 7 days range then I might have to right a module that takes the values from the for and then returns the values that way I can do something like Sdate() and Edate() then this can be used with Between Sdate() and Edate().
I have not tried this yet but this can be my last option I don't even know if it will work but it is worth a try. But before i do that i want to try all the resources that Access can help me make my life easy such as its OO Stuff it has for helping DB programmers.
SELECT
"Expenditure" as [TransactionType], *
FROM
Expenditures
WHERE
(((Expenditures.DateofExpe) Between [Forms]!Form1![Text0] and [Forms]![Form1]![Text11]))
UNION
SELECT
"Income" as [TransactionType], *
FROM
Income
WHERE
(((Income.DateofIncom) Between [Forms]!Form1![Text0] and [Forms]![Form1]![Text11] ));
Access VBA has great power but I don't want to use it as of yet as it will be hard to modify changes for a user that does not know how to program. trying to keep this DB app simple as possible for a dumb user to fully operate.
Any comments is much appreciated.