How to handle concurrency in Propel (MySQL)?

I'm new to the world of web & databases, and what I don't understand is how web application programmers usually keep database information in sync when using an ORM.
Let's say that we have this code:
$q = new AuthorQuery();
$firstAuthor = $q->findPk(1);
// Now someone changes the value of "LastViewedBy" in the database,
// but our PHP code still thinks that the value is "Me"
if( $firstAuthor->getLastViewedBy() === "Me" )
{
    //Do some stuff...
    //Change information that affects the flow of this method
    $firstAuthor->save();
}
It's a classic synchronization problem (let's say that 10,000 people are executing this script at the same time).
How do I solve it when using Propel and MySQL (InnoDB)?
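For reference, the textbook InnoDB fix is pessimistic locking: read the row inside a transaction with SELECT ... FOR UPDATE, so concurrent writers block until the transaction commits. Below is a minimal sketch using the PDO connection that Propel exposes (assuming Propel 1.x's Propel::getConnection(); adjust for your version, and note the table and column names are illustrative, not from a real schema):

<?php
// Minimal pessimistic-locking sketch; table/column names are illustrative.
$con = Propel::getConnection();
$con->beginTransaction();
try {
    // FOR UPDATE makes InnoDB hold a row lock until commit/rollback,
    // so concurrent scripts serialize on this author row.
    $stmt = $con->prepare('SELECT last_viewed_by FROM author WHERE id = ? FOR UPDATE');
    $stmt->execute(array(1));
    $lastViewedBy = $stmt->fetchColumn();

    if ($lastViewedBy === 'Me') {
        // Do some stuff, then persist while still holding the lock.
        $update = $con->prepare('UPDATE author SET last_viewed_by = ? WHERE id = ?');
        $update->execute(array('You', 1));
    }
    $con->commit();
} catch (Exception $e) {
    $con->rollBack();
    throw $e;
}

The alternative is optimistic locking (a version column checked in the UPDATE's WHERE clause), which avoids holding locks but requires retry logic when the row has changed underneath you.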

Related

Use MySQL Query Execution Plan for Detecting SQL Injections

I have a project that requires we allow users to create custom columns, enter custom values, and use these custom values to execute user defined functions.
Google Data Studio offers similar functionality.
We have exhausted all implementation strategies we can think of (executing formulas on the front end, in isolated execution environments, etc.).
Short of writing our own interpreter, the only implementation we could find that meets the performance, functionality, and scalability requirements is to execute these functions directly within MySQL. So basically taking the expressions that have been entered by the user, and dynamically rolling up a query that computes results server side in MySQL.
This obviously opens a can of worms security wise.
Quick aside: I expect to get the "you shouldn't do it that way" response. Trust me, I hate that this is the best solution we can find. The resources online describing similar problems are remarkably scarce, so if there are any suggestions for where to find information on analogous problems/solutions/implementations, I would greatly appreciate it.
With that said, assuming that we don't have alternatives, my question is: How do we go about doing this safely?
We have a few current safeguards set up:
Executing the user defined expressions against a tightly controlled subquery that limits the "inner context" that the dynamic portion of the query can pull from.
Blacklisting certain phrases that should never be used (SELECT, INSERT, UNION, etc.). This introduces issues, because a user should be able to enter something like: CASE WHEN {{var}} = "union pacific railroad" THEN... but that is a tradeoff we are willing to make.
Limiting the access of the MySQL connection making the query to only have access to the tables/functionality needed for the feature.
This gets us pretty far. But I'm still not comfortable with it. One additional option that I couldn't find any info online about was using the query execution plan as a means of detecting if the query is going outside of its bounds.
So prior to actually executing the query/getting the results, you would wrap it within an EXPLAIN statement to see what the dynamic query is doing. From the results of the EXPLAIN query, you should be able to detect any operations (subqueries, key references, UNIONs, etc.) that fall outside of the bounds of what the query is allowed to do.
Is this a useful validation method? It seems to me that this would be a powerful tool for protecting against a suite of SQL injections, but I couldn't seem to find any information online.
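To make the idea concrete, a sketch of that validation step might look like this (PHP/PDO purely for illustration; $pdo, $generatedSql, and the table whitelist are placeholders). It wraps the generated query in EXPLAIN FORMAT=JSON and walks the resulting plan for table names outside the allowed set:

<?php
// Illustrative sketch: reject a generated query if its execution plan
// touches tables outside a whitelist. $pdo and $generatedSql are assumed.
$allowedTables = array('context');

$stmt = $pdo->query('EXPLAIN FORMAT=JSON ' . $generatedSql);
$plan = json_decode($stmt->fetchColumn(), true);

// MySQL's JSON plan nests each table reference under a "table_name" key;
// collect every occurrence recursively.
function collect_tables($node, &$found)
{
    if (!is_array($node)) { return; }
    if (isset($node['table_name'])) { $found[] = $node['table_name']; }
    foreach ($node as $child) { collect_tables($child, $found); }
}

$found = array();
collect_tables($plan, $found);

if (array_diff($found, $allowedTables)) {
    throw new RuntimeException('Query plan touches tables outside the allowed set');
}

Whether the plan exposes enough to catch every abuse is exactly the open question.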
Thanks in advance!
(from Comment)
Some examples show the actual autogenerated queries being used; there are both visual and list examples showing the query execution plan for both malicious and valid custom functions.
GRANT only SELECT on the table(s) that they are allowed to manipulate. This allows arbitrarily complex SELECT queries to be run. (The one flaw: Such queries may run for a long time and/or take a lot of resources. MariaDB has more facilities for preventing run-away selects.)
Provide limited "write" access via Stored Routines with expanded privileges, but do not pass arbitrary values into them. See SQL SECURITY: DEFINER gives the routine the privileges of the person who created it (as opposed to INVOKER, which stays limited to the SELECT grants on the tables mentioned above).
Another technique that may or may not be useful is creating VIEWs with select privileges. This, for example, can let the user see most information about employees while hiding the salaries.
Related to that is the ability to GRANT different permissions on different columns, even in the same table.
(I have implemented a similar web app, and released it to everyone in the company. And I could 'sleep at night'.)
I don't see subqueries and Unions as issues. I don't see the utility of EXPLAIN other than to provide more info in case the user is a programmer trying out queries.
EXPLAIN can help in discovering long-running queries, but it is imperfect. Ditto for LIMIT.
More
I think "UDF" is either "normalization" or "EAV"; it is hard to tell which. Please provide SHOW CREATE TABLE.
This is inefficient because it builds a temp table before removing the 'NULL' items:
FROM ( SELECT ...
FROM ...
LEFT JOIN ...
) AS context
WHERE ... IS NULL
This is better because it can do the filtering sooner:
FROM ( SELECT ...
FROM ...
LEFT JOIN ...
WHERE ... IS NULL
) AS context
I wanted to share a solution I found for anyone who comes across this in the future.
To prevent someone from entering some malicious SQL injection in a "custom expression" we decided to preprocess and analyze the SQL prior to sending it to the MySQL database.
Our server is running NodeJS, so we used a parsing library to construct an abstract syntax tree from their custom SQL. From here we can traverse the tree and identify any operations that shouldn't be taking place.
The mock code (it won't run as-is) would look something like:
// Assumes a SQL parsing library such as node-sql-parser (a hypothetical choice here).
const { Parser } = require('node-sql-parser');
const parser = new Parser();

const valid_types = [ "case", "when", "else", "column_ref", "binary_expr", "single_quote_string", "number"];
const valid_tables = [ "context" ];
// Create a mock sql expression and parse the AST
var exp = YOUR_CUSTOM_EXPRESSION;
var ast = parser.astify(exp);
// Check for attempted multi-statement injections
if(Array.isArray(ast) && ast.length > 1){
    throw new Error("Multiple statements detected");
}
// Recursively check the AST for unallowed operations
recursive_ast_check([], "columns", ast.columns);

function recursive_ast_check(path, p_key, ast_node){
    // Ignore null or empty nodes
    if(ast_node == null) { return; }
    // If parent key is the "type" of operation, check it against allowed values
    if(p_key === "type") {
        if(valid_types.indexOf(ast_node) == -1){
            throw new Error("Invalid type '" + ast_node + "' found at following path: " + JSON.stringify(path));
        }
        return;
    }
    // If parent key is a table reference, the value should always be "context"
    if(p_key === "table") {
        if(valid_tables.indexOf(ast_node) == -1){
            throw new Error("Invalid table reference '" + ast_node + "' found at following path: " + JSON.stringify(path));
        }
        return;
    }
    // Recursively search array values down the chain
    if(Array.isArray(ast_node)){
        for(var i = 0; i < ast_node.length; i++) {
            recursive_ast_check([...path, p_key], i, ast_node[i]);
        }
        return;
    }
    // Recursively search object keys down the chain
    if(typeof ast_node === 'object'){
        for(let key of Object.keys(ast_node)){
            recursive_ast_check([...path, p_key], key, ast_node[key]);
        }
    }
}
This is just a mockup adapted from our implementation, but hopefully it will provide some guidance. I should also note that it is best to implement the other strategies discussed above as well. Many safeguards are better than just one.

ASP.NET Core EF Performance loss due to migration?

I have just migrated my ASP.NET Core web app's database from SQL Server to MySQL. It went very smoothly, but I am noticing a loss in performance. When I change my database back to SQL Server, the page loads in less than one second. When I'm using the MySQL database, it takes about 15 seconds to load the page. They both have the same data. It is not just this page; I just picked this one out as an example.
I use Entity Framework and I query for all the Sales in my database (about 100 records). I use IQueryable and then return it to the View. When I am debugging I can tell the queries go pretty fast, but when I return my View, that is where it starts to slow down.
public IActionResult SalesOverview()
{
    var sales = _repo.QueryAsNoTracking()
        .Select(result => new
        {
            Id = result.Id,
            ClientId = result.Client.Id,
            ClientName = result.Client.Name,
            DeliveryDate = result.DeliveryDate,
            NumberOfDetails = result.NumberOfDetails.Count
        });
    return View(sales);
}
Did I miss out on anything during the migration? I'm pretty sure it is not my code, because it runs fine with SQL Server.
Kind regards,
Brian

Distribute records on different MySQL databases - MySQL Proxy alternative

My scenario is the following:
Right now I am using one big MySQL database with multiple tables to store user data. Many tables contain auto increment columns.
I would like to split this into 2 or more databases. The distribution should be done by user_id and is predetermined (it cannot be randomized). E.g. users 1 and 2 should be on database1, user 3 on database2, user 4 on database3.
Since I don't want to change my whole frontend, I would like to still use one db adapter and kind of add a layer between the query generation (frontend) and the query execution (on the right database). This layer should distribute the queries to the right database based on the user_id.
I have found MySQL Proxy which sounds exactly like what I need. Unfortunately, it's in alpha and not recommended to be used in a production environment.
For PHP there is the MySQL Native Driver Plugin API, which sounds promising, but then I would need a layer that supports at least PHP and Java.
Is there any other way I can achieve my objectives? Thanks!
This site seems to offer the service you're looking for (for a price).
http://www.sqlparser.com/
It lets you parse and modify queries and results. However, what you're looking to do seems like it will only require a couple of lines of code to distinguish between different user ids, so even though mysql-proxy is still in alpha, your needs are simple enough that I would just use the proxy.
Alternatively, you could use whatever server-side language you're using to grab the user id, and then create a MySQL connection to the appropriate database based on that info. Here's some PHP I cobbled together which in spirit does what I think you're looking to do.
<?php
// grab user.id from wherever you store it
$userID = get_user_id($clientUserName);
$userPass = get_user_pass($clientUserName);
if ($userID % 4 == 0) { // every 4th user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db4');
}
else if ($userID % 3 == 0) { // every 3rd user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db3');
}
else if ($userID % 2 == 0) { // every 2nd user
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db2');
}
else { // all remaining users
    $db = new mysqli('localhost', $clientUserName, $userPass, 'db1');
}
$db->query('SELECT * FROM ...;');
?>
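Since the goal is to keep one adapter in front of the routing logic, the same idea can be wrapped in a small class so the frontend never sees the shard selection. A minimal sketch (the class name, mapping rule, and credentials are all hypothetical; the deterministic user_id-to-database mapping would be whatever your scheme dictates):

<?php
// Hypothetical single-adapter wrapper that routes queries per user_id.
class ShardRouter
{
    private $host;
    private $user;
    private $pass;
    private $connections = array();

    public function __construct($host, $user, $pass)
    {
        $this->host = $host;
        $this->user = $user;
        $this->pass = $pass;
    }

    // Deterministic user_id => database-name mapping; replace with your scheme.
    private function shardFor($userId)
    {
        if ($userId <= 2) { return 'database1'; }
        if ($userId == 3) { return 'database2'; }
        return 'database3';
    }

    // The frontend only ever calls this; connections are opened lazily.
    public function queryForUser($userId, $sql)
    {
        $dbName = $this->shardFor($userId);
        if (!isset($this->connections[$dbName])) {
            $this->connections[$dbName] = new mysqli($this->host, $this->user, $this->pass, $dbName);
        }
        return $this->connections[$dbName]->query($sql);
    }
}

$router = new ShardRouter('localhost', 'app_user', 'secret');
$result = $router->queryForUser(3, 'SELECT name FROM users WHERE id = 3');

One caveat either way: your auto increment columns will collide across shards unless you offset them (e.g. distinct auto_increment_increment/auto_increment_offset settings per server) or move ID generation out of the databases entirely.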

Convert Plain password into .NET Membership HASH password in T-SQL

We have a system that stores the username/password as plain text. I have been asked to convert this to Membership.
I'd like a SQL function to convert the plain text password into .NET Membership Hashed password. The only HashBytes function I found doesn't even come close to what I see in the .NET Membership table.
I am desperate. Please help.
I learnt that you cannot reproduce the same hash algorithm in T-SQL; the best way is to simply read your existing table on the ASP.NET side, then call the Membership API to insert users. This cuts down on time as well.
DataSet dtData = DivDatabase.ExecuteSQLString("SELECT * FROM Users");
foreach (DataRow row in dtData.Tables[0].Rows)
{
MembershipUser mUser = Membership.GetUser(row["Username"].ToString());
if (mUser == null)
{
mUser = Membership.CreateUser(row["Username"].ToString(), row["Password"].ToString(), row["Email"].ToString() );
}
}
I first check that the username is not in the system already, because I had duplicate usernames and wanted to eliminate them. For additional information from the old table that doesn't exist in the Membership tables, we agreed with senior management that we are not going to use the Membership Profile, as it's not easy to write queries against it. But I have added sample code for reference.
var profile = System.Web.Profile.ProfileBase.Create(row["Username"].ToString());
profile.SetPropertyValue("FirstName", row["FirstName"].ToString());
profile.SetPropertyValue("LastName", row["LastName"].ToString());
profile.Save();
I hope you find this useful.

How are IQueryables dealt with in ASP.NET MVC Views?

I have some tables in a MySQL database to represent records from a sensor. One of the features of the system I'm developing is to display these records to the web user, so I used the ADO.NET Entity Data Model to create an ORM, used LINQ to SQL to get the data from the database, and stored it in a ViewModel I designed, so I can display it using the MVCContrib Grid Helper:
public IQueryable<TrendSignalRecord> GetTrends()
{
var dataContext = new SmgerEntities();
var trendSignalRecords = from e in dataContext.TrendSignalRecords
select e;
return trendSignalRecords;
}
public IQueryable<TrendRecordViewModel> GetTrendsProjected()
{
var projectedTrendRecords = from t in GetTrends()
select new TrendRecordViewModel
{
TrendID = t.ID,
TrendName = t.TrendSignalSetting.Name,
GeneratingUnitID = t.TrendSignalSetting.TrendSetting.GeneratingUnit_ID,
//{...}
Unit = t.TrendSignalSetting.Unit
};
return projectedTrendRecords;
}
I call the GetTrendsProjected method and then use LINQ to SQL to select only the records I want. It works fine in my development scenario, but when I test it in a real scenario, where the number of records is far greater (around a million), it stops working.
I put in some debug messages to test it, and everything works fine until it reaches the return View() statement, where it simply stops, throwing a MySQLException: Timeout expired. That left me wondering whether the data I send to the page is retrieved by the page itself (it only queries the database for the displayed items when the view needs them, or something like that).
All of my other pages use the same set of tools: MVCContrib Grid Helper, ADO.NET, Linq to SQL, MySQL, and everything else works alright.
You absolutely should paginate your data set before executing your query if you have millions of records. This can be done using the .Skip and .Take extension methods, and those should be applied before the query runs against your database.
Trying to fetch millions of records from a database without pagination will very likely cause a timeout.
Well, assuming the information in this blog is correct, the .AsPagination method requires you to sort your data by a particular column. It's possible that doing an OrderBy on a table with millions of records is simply a time-consuming operation and times out.