Store results of expensive function calls in a MySQL table - mysql

Let's suppose I have a set of integers of a variable length. I apply a function on this set of integers and I obtain a result.
myFunction(setOfIntegers) => myResult
Let's suppose a call to myFunction is very expensive and I would like to store the results of these function calls.
In my application I am already using MySQL and what I was thinking was to somehow create a table with the setOfIntegers as a PK and myResult as an additional field.
I was thinking that I could do this by transforming the setOfIntegers to a string before storing it in the DB.
Can this be done in any other way? Or would there be a better way to store results of such function calls in order to avoid calling them a 2nd time with the same set of integers?

I don't know about Java, but Perl has my $str = join(',', @array) and PHP has $str = implode(',', $array). Then the string $str could be used as the PRIMARY KEY (assuming it is not too long), and the result would go in the other column.
Your app code (in Java) would need to first do an implode and SELECT to see if the function has already been evaluated for the given array. If not, then perform the function and end by INSERTing a new row.
If this will be multi-threaded, you could use INSERT IGNORE to deal with dups. (There are other solutions, too.)
Another note: If your set-of-integers is ordered, then what I describe is 'complete'. If it is unordered, then sort it before imploding. This will provide a canonical representation.
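The lookup-then-insert flow described above can be sketched roughly as follows. This is only an illustration in Python, using the standard-library sqlite3 module as a stand-in for MySQL; the table name fn_cache, its columns, and the helper names are made up for the example. SQLite spells the duplicate-tolerant insert INSERT OR IGNORE, where MySQL uses INSERT IGNORE.

```python
import sqlite3

def canonical_key(ints):
    # Sort first so the same unordered set always maps to the same string.
    return ",".join(str(i) for i in sorted(ints))

def cached_call(conn, func, ints):
    key = canonical_key(ints)
    row = conn.execute(
        "SELECT result FROM fn_cache WHERE arg_key = ?", (key,)
    ).fetchone()
    if row is not None:
        return row[0]          # already evaluated: skip the expensive call
    result = func(ints)
    # If another thread inserted the same key meanwhile, the duplicate is ignored.
    conn.execute(
        "INSERT OR IGNORE INTO fn_cache (arg_key, result) VALUES (?, ?)",
        (key, result),
    )
    conn.commit()
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fn_cache (arg_key TEXT PRIMARY KEY, result INTEGER)")

calls = []                     # records how often the expensive function really runs

def my_function(ints):
    calls.append(tuple(ints))
    return sum(i * i for i in ints)

first = cached_call(conn, my_function, [3, 1, 2])
second = cached_call(conn, my_function, [2, 3, 1])   # same set, different order
```

Here the second call is answered from the table because both inputs canonicalize to the same key. If the input is meant to be a true set, duplicates could also be removed before sorting.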

If the function can be implemented in MySQL directly, I would suggest using Views.
https://www.mysqltutorial.org/mysql-views-tutorial.aspx/

Related

Searching for multiple values in 1 query

I have a table with 2 fields, Roll no and name, and a list (of n values) of roll numbers for which I have to look up the corresponding names.
Can this be done using just one query in SQL or HQL?
SELECT name FROM [table] WHERE id IN ([list of ids])
where [list of ids] is for example 2,3,5,7.
Use the IN operator and separate your Roll no's by a comma.
SELECT name
FROM yourtable
WHERE [Roll no] IN (1, 2, 3, 4, etc)
You can use the IN statement as shown above.
There are a couple of minor issues with this. The first is that it can perform poorly if the number of values in the clause gets too large.
The second issue is that in many development environments you end up needing to dynamically create the query with a variable number of items (or a variable number of placeholders if using parameterised queries). While not difficult, it does make your code look messy and means you haven't got a nice neat piece of SQL that you can copy out and use to test.
Here are a couple of examples (using PHP).
Here the IN is just dynamically created with the SQL. Assuming the roll numbers can only be integers it is applying intval() to each member of the array to avoid any non integer values being used in the SQL.
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
$sql = "SELECT name FROM some_table WHERE `Roll no` IN (".implode(", ", array_map('intval', $list_of_roll_no)).")";
?>
Using mysqli bound parameters is a bit messy. This is because the bind parameter statement expects a variable number of parameters. The 2nd parameter onwards are the values to be bound, and it expects them to be passed by reference. So the foreach here is used to generate an array of references:-
<?php
$list_of_roll_no = array(1,2,3,4,5,6,7,8,9);
if ($stmt = $mysqli->prepare("SELECT name FROM some_table WHERE `Roll no` IN (".implode(",", array_fill(0, count($list_of_roll_no), '?')).")"))
{
    $bind_arguments = [];
    $bind_arguments[] = str_repeat("i", count($list_of_roll_no)); # one "i" (integer) type per placeholder
    foreach ($list_of_roll_no as $list_of_roll_no_key => $list_of_roll_no_value)
    {
        $bind_arguments[] = & $list_of_roll_no[$list_of_roll_no_key]; # bind to the array element by reference, not to the temporary $list_of_roll_no_value
    }
    call_user_func_array(array($stmt, 'bind_param'), $bind_arguments);
    $stmt->execute();
}
?>
Another solution is to push all the values into another table. Can be a temp table. Then you use an INNER JOIN between your table and your temp table to find the matching values. Depending on what you already have in place then this is quite easy to do (eg, I have a php class to insert multiple records easily - I just keep passing them across and the class batches them up and inserts them occasionally to avoid repeatedly hitting the database).
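For comparison, here is a rough sketch of both approaches (placeholders in an IN list, and a temp table with an INNER JOIN) in Python, using the standard-library sqlite3 module as a stand-in database. The table names students and wanted_rolls and the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Di")],
)

wanted = [2, 4, 7]   # roll number 7 does not exist in the table

# Variant 1: build one "?" placeholder per value and bind the whole list.
placeholders = ", ".join("?" for _ in wanted)
sql = f"SELECT name FROM students WHERE roll_no IN ({placeholders}) ORDER BY roll_no"
names_in = [row[0] for row in conn.execute(sql, wanted)]

# Variant 2: push the values into a temp table and join against it.
conn.execute("CREATE TEMP TABLE wanted_rolls (roll_no INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO wanted_rolls VALUES (?)", [(r,) for r in wanted])
names_join = [
    row[0]
    for row in conn.execute(
        "SELECT s.name FROM students s "
        "INNER JOIN wanted_rolls w ON w.roll_no = s.roll_no "
        "ORDER BY s.roll_no"
    )
]
```

Note that the first variant needs a guard for an empty list, since IN () with nothing inside is not valid SQL. The join variant keeps the SQL text fixed regardless of how many values there are.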

Multiple, unknown number of fields passed into a query

Is it possible to create a generic query that would work for different types of documents? For example I have "cases" and "factories",
They have different set of fields. e.g:
{
id: 'case_o1',
name: 'Case numero uno',
amount: 40
}
{
id: 'factory_002',
location: 'Venezuela',
workers: 200,
operating: true
}
Is it possible to create a generic query where I would pass the type of an entity (case or factory) and additional parameters and it would filter results based on those?
I could of course use javascript view, but it doesn't allow me to filter by multiple fields. Let's say I want to fetch all factories located in Venezuela, with number of workers between 20 and 55.
I started with this, but then I got stuck:
select * from `mybucket` as entity
where position(meta(entity).id, $entity_type) == 0
How do I pass multiple predicates and have the query to recognize them?
I can of course list fields like this:
where position(meta(entity).id, $entity_type) == 0
and entity.location == 'Venezuela'
and entity.workers > $workers_min
and entity.workers < $workers_max
but then
I'm gonna have to create a separate query for each entity
And even then it won't solve my problem - I have no idea how to ignore predicates, what if next time $workers_min and $workers_max are not passed, does it mean I have to create a query for every single predicate (column)?
For security reasons I cannot generate free-form queries and pass them to Couchbase server, all the queries are already stored in the database, our api just picks them up out of a document and executes them
I think it's possible to create a query that would be "short-circuiting" for args that's undefined (e.g. WHERE $location IS MISSING OR entity.location == $location or something like that)
Is it possible at all to create a query that would be able to effectively filter and order a dataset based on arbitrary parameters? Or there's no way?
@Agzam, sorry, I was writing my comment when you said it. Anyway, what you are asking for is possible by using COALESCE in not-too-complex expressions, but it is a REALLY bad idea because it will defeat most of the database's internal optimizations, including the use of any existing index. So, unless you are dealing with a relatively small database (and you are sure it will stay approximately the same size), I suggest you try a different approach. This is, in fact, the reason I implemented sqlapi.
If you need to have all queries stored in the database beforehand, it would probably be much better to sort the given arguments by name and to precompute and store a query for each possible combination.
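To make the "one stored query per combination" idea concrete, here is a small sketch in Python. The predicate texts, the base query, and the function names are assumptions made for the illustration; in the real setup the generated strings would be written into the stored-query documents ahead of time, and only the lookup would happen at request time.

```python
from itertools import combinations

# Hypothetical predicate fragments, one per optional argument.
PREDICATES = {
    "location": "entity.location = $location",
    "workers_max": "entity.workers < $workers_max",
    "workers_min": "entity.workers > $workers_min",
}

def precompute_queries(base, predicates):
    """Build one query per subset of arguments, keyed by the sorted names."""
    queries = {}
    names = sorted(predicates)
    for size in range(len(names) + 1):
        for combo in combinations(names, size):
            where = " AND ".join(predicates[name] for name in combo)
            queries[combo] = base + (" WHERE " + where if where else "")
    return queries

STORED = precompute_queries("SELECT * FROM mybucket AS entity", PREDICATES)

def pick_query(args):
    # Sorting the argument names gives a canonical key for the lookup.
    return STORED[tuple(sorted(args))]

chosen = pick_query({"workers_min": 20, "location": "Venezuela"})
```

With three optional arguments this yields eight stored queries. The count doubles with every additional argument, so this only stays practical for a handful of optional predicates.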
You can do it by assigning a default value to the variable when it is not used. For instance, if $location is not used you can set it to -1 as the default value.
Then the where condition would be:
WHERE ($location=-1 OR entity.location = $location)
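A quick way to see the default-value trick in action is the sketch below. It is written in Python against the standard-library sqlite3 module rather than Couchbase, so the table, the sample rows, and the :name parameter syntax are stand-ins for the real bucket and N1QL parameters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE factories (id TEXT, location TEXT, workers INTEGER)")
conn.executemany(
    "INSERT INTO factories VALUES (?, ?, ?)",
    [
        ("factory_001", "Venezuela", 40),
        ("factory_002", "Venezuela", 200),
        ("factory_003", "Chile", 30),
    ],
)

# One fixed query: each predicate is switched off when its argument is -1.
SQL = """
SELECT id FROM factories
WHERE (:location = -1 OR location = :location)
  AND (:workers_min = -1 OR workers > :workers_min)
  AND (:workers_max = -1 OR workers < :workers_max)
ORDER BY id
"""

def find(location=-1, workers_min=-1, workers_max=-1):
    params = {
        "location": location,
        "workers_min": workers_min,
        "workers_max": workers_max,
    }
    return [row[0] for row in conn.execute(SQL, params)]

everything = find()
venezuela_mid = find(location="Venezuela", workers_min=20, workers_max=55)
chile_only = find(location="Chile")
```

The sentinel has to be a value that can never be a legitimate argument, and, as noted in the other answer, OR-ed predicates like these may keep the database from using an index.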

mysql prepared statements which statement i should use

Is there any major difference between the following prepared statements? Which one is more preferred, if so why?
1:
$stmt = $db->prepare("INSERT INTO users(userName) VALUES (:user)");
$user = "Steve";
$stmt->bindParam(':user', $user);
$stmt->execute();
2:
$stmt2 = $db->prepare('INSERT into users(userName) VALUES(:user)');
$stmt2->execute(array(':user'=>'Steve'));
bindParam binds the variable by reference. That means the variable's value MIGHT be modified, depending on what you do (for example, invoking a stored procedure that alters the value of variables passed to it).
That's why you should be using bindValue instead, unless you expect MySQL to alter the value of your variable.
The only actual MAJOR difference is that you cannot specify the variable type in your second scenario. Every parameter is treated as a string, whereas with bindParam / bindValue you are free to define whether the parameter is a string or an integer.
So which should you use? Well, neither is wrong. If you find the second approach easier when inserting a lot of string data, there's nothing wrong with it.

Using fetchrow_hashref to store data

I am trying to take information out of a MySQL database, which I will then manipulate in perl:
use strict;
use DBI;
my $dbh_m= DBI->connect("dbi:mysql:Populationdb","root","LisaUni")
or die("Error: $DBI::errstr");
my $Genotype = 'Genotype'.1;
#The idea here is eventually I will ask the database how many Genotypes there are, and then loop it round to complete the following for each Genotype:
my $sql =qq(SELECT TransNo, gene.Gene FROM gene JOIN genotypegene ON gene.Gene = genotypegene.Gene WHERE Genotype like '$Genotype');
my $sth = $dbh_m-> prepare($sql);
$sth->execute;
my %hash;
my $transvalues = $sth->fetchrow_hashref;
my %hash= %$transvalues;
$sth ->finish();
$dbh_m->disconnect();
my $key;
my $value;
while (($key, $value) = each(%hash)){
print $key.", ".$value."\n"; }
This code doesn't produce any errors, but the %hash only stores the last row taken from the database (I got the idea of writing it this way from this website). If I type:
while(my $transvalues = $sth->fetchrow_hashref){
print "Gene: $transvalues->{Gene}\n";
print "Trans: $transvalues->{TransNo}\n";
}
Then it does print off all the rows, but I need all this information to be available once I've closed the connection to the database.
I also have a related question: in my MySQL database the row consists of e.g 'Gene1'(Gene) '4'(TransNo). Once I have taken this data out of the database as I am doing above, will the TransNo still know which Gene it is associated with? Or do I need to create some kind of hash of hash structure for that?
You are calling the "wrong" function
fetchrow_hashref returns one row as a hashref, so you should wrap its use inside a loop that ends when fetchrow_hashref returns undef.
It seems like you are looking for fetchall_hashref, which gives you all of the returned rows as a hash, with the first parameter specifying which field to use as the key.
$hash_ref = $sth->fetchall_hashref ($key_field);
Each row will be inserted into $hash_ref as a nested hashref, and the row's value in the $key_field column is the key under which you can find it in $hash_ref.
What does the documentation say?
The fetchall_hashref method can be used to fetch all the data to be returned from a prepared and executed statement handle.
It returns a reference to a hash containing a key for each distinct value of the $key_field column that was fetched.
For each key the corresponding value is a reference to a hash containing all the selected columns and their values, as returned by fetchrow_hashref().
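The structure fetchall_hashref hands back is a hash of hashes. The sketch below mimics that shape in Python with the standard-library sqlite3 module. It is not the DBI API: the helper fetchall_keyed and the simplified gene table are made up to show the resulting structure, and that the data is still usable after the connection is closed.

```python
import sqlite3

def fetchall_keyed(cursor, key_field):
    """Return {key value: {column: value, ...}} for every row of the cursor."""
    columns = [d[0] for d in cursor.description]
    result = {}
    for row in cursor:
        record = dict(zip(columns, row))
        result[record[key_field]] = record
    return result

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gene (Gene TEXT PRIMARY KEY, TransNo INTEGER)")
conn.executemany("INSERT INTO gene VALUES (?, ?)", [("Gene1", 4), ("Gene2", 7)])

cur = conn.execute("SELECT Gene, TransNo FROM gene")
genes = fetchall_keyed(cur, "Gene")
conn.close()

# Each TransNo stays attached to its Gene, even with the connection closed.
trans_for_gene1 = genes["Gene1"]["TransNo"]
```

As with fetchall_hashref, rows that share the same key value overwrite each other, so the key field should be unique in the result set.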
Documentation links
DBI - search.cpan.org #fetchrow_hashref
DBI - search.cpan.org #fetchall_hashref

How efficient is it to call a UDF and sproc from within my LINQ to SQL?

I ran into an issue where I need to call a UDF within my LINQ to SQL and then another stored procedure within that. Here's the code.
public IQueryable<DataDTO> GetLotsaData(string dataId, DateTime date, string custIDs)
{
var data = (from rs in _context.spXI_GetData(dataId, date, custIDs)
select new DataDTO
{
Time = rs.Time,
TimeZone = _context.GetTimezone(postDate, _context.GetDetailedData(rs.PKID, custIDs).FirstOrDefault().Zip),
CompletedTime = rs.Completed_Time,
});
return data.AsQueryable<DataDTO>();
}
The line I'm worried about is the one where I'm calling the GetTimezone UDF. Is it inefficient to call a UDF in the middle of a LINQ query and then another stored procedure (GetDetailedData) to get a single value for that UDF? What kind of SQL would this generate?
It looks a bit convoluted to me, but still better than the alternative which would be a sub-select or join in my stored procedure. (I'm trying to avoid having my stored procedure return the new field - TimeZone - instead just having it returned in my DTO.) And yes, I realize this could all be avoided if we were using UTC. Sadly, I have no control over that.
Why can't spXI_GetData return the complete result set? I'd say that would be optimal in this situation.
The GetTimezone and GetDetailedData functions will be called for every row in the spXI_GetData set. It would be better if the GetTimezone function could return an inline table, and then you could join with it instead.
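The cost difference between per-row calls and a join can be illustrated with a toy example. This is a Python sketch against the standard-library sqlite3 module, not LINQ to SQL. The tables data and zip_timezone, their contents, and the run helper that counts round trips are all invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (pkid INTEGER PRIMARY KEY, zip TEXT)")
conn.execute("CREATE TABLE zip_timezone (zip TEXT PRIMARY KEY, tz TEXT)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [(1, "10001"), (2, "60601"), (3, "94105")],
)
conn.executemany(
    "INSERT INTO zip_timezone VALUES (?, ?)",
    [("10001", "Eastern"), ("60601", "Central"), ("94105", "Pacific")],
)

counter = {"queries": 0}

def run(sql, params=()):
    # Every call stands for one round trip to the database.
    counter["queries"] += 1
    return conn.execute(sql, params).fetchall()

# Per-row lookups: one query for the rows, then one more per row.
rows = run("SELECT pkid, zip FROM data ORDER BY pkid")
per_row = [
    (pkid, run("SELECT tz FROM zip_timezone WHERE zip = ?", (z,))[0][0])
    for pkid, z in rows
]
per_row_queries = counter["queries"]

# Join: the same result in a single query.
counter["queries"] = 0
joined = run(
    "SELECT d.pkid, t.tz FROM data d "
    "JOIN zip_timezone t ON t.zip = d.zip ORDER BY d.pkid"
)
joined_queries = counter["queries"]
```

Both variants produce the same rows, but the per-row version issues one extra query for every row returned, which is the pattern to watch for when a function or procedure is called inside a projection.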