I have an array of objects related to users and want to get all objects related to one user. I can't store the userid as a parent node, only as a child, so I want to use the equalTo method.
ref.orderByChild("userid").equalTo(uid).on("child_added", function(snapshot) {
  console.log(snapshot.val());
});
Does this first query all objects (slow) and then select only the required ones or does firebase optimize the query itself on the server? I come from SQL and I am a bit unsure how to handle where queries in firebase.
Edit: there is also a security concern. Could a user receive all objects by tampering with the JS code? I assume the security rules are meant to solve this?
Example JSON:
{
Objectkey1: { userid: 'uid', ... },
Objectkey2: { userid: 'uid', ... },
...
}
Does this first query all objects (slow) and then select only the required ones or does firebase optimize the query itself on the server?
Yup, that's pretty much what happens. So this operation will always get slower as you add more items to the location identified by ref.
If that type of performance is a concern (i.e. if you care about scalability), consider adding an inverted/secondary index that maps each user, identified by uid, to the things that belong to that user.
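A minimal sketch of such an inverted index, assuming a hypothetical `userObjects/<uid>/<objectKey>: true` node kept next to the objects. The node names `objects` and `userObjects` and both helper functions are made up for illustration; they are not from the question. The first helper builds a single multi-path update so that the object and its index entry are written together, and the result would be passed to `update()` on the database root reference.

```javascript
// Build a multi-path update that writes an object and its index entry together.
// "objects" and "userObjects" are hypothetical node names.
function buildFanOutUpdate(objectKey, object) {
  if (!object || typeof object.userid !== "string") {
    throw new Error("object must carry a userid");
  }
  return {
    [`objects/${objectKey}`]: object,
    [`userObjects/${object.userid}/${objectKey}`]: true,
  };
}

// Given the value stored at userObjects/<uid>, list the object keys to load directly.
function objectKeysForUser(indexValue) {
  return Object.keys(indexValue || {});
}

const update = buildFanOutUpdate("Objectkey1", { userid: "uid", title: "example" });
console.log(update);
console.log(objectKeysForUser({ Objectkey1: true, Objectkey2: true }));
```

With an index like this, reading one user's objects means reading `userObjects/<uid>` and then each listed key, instead of scanning everything under ref.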
I have a project that requires we allow users to create custom columns, enter custom values, and use these custom values to execute user defined functions.
Similar Functionality In Google Data Studio
We have exhausted all implementation strategies we can think of (executing formulas on the front end, in isolated execution environments, etc.).
Short of writing our own interpreter, the only implementation we could find that meets the performance, functionality, and scalability requirements is to execute these functions directly within MySQL. So basically taking the expressions that have been entered by the user, and dynamically rolling up a query that computes results server side in MySQL.
This obviously opens a can of worms security wise.
Quick aside: I expect to get the "you shouldn't do it that way" response. Trust me, I hate that this is the best solution we can find. Online resources describing similar problems are remarkably scarce, so if there are any suggestions for where to find information on analogous problems/solutions/implementations, I would greatly appreciate it.
With that said, assuming that we don't have alternatives, my question is: How do we go about doing this safely?
We have a few current safeguards set up:
Executing the user defined expressions against a tightly controlled subquery that limits the "inner context" that the dynamic portion of the query can pull from.
Blacklisting certain phrases that should never be used (SELECT, INSERT, UNION, etc.). This introduces issues, because a user should be able to enter something like: CASE WHEN {{var}} = "union pacific railroad" THEN... but that is a tradeoff we are willing to make.
Limiting the access of the MySQL connection making the query to only have access to the tables/functionality needed for the feature.
This gets us pretty far. But I'm still not comfortable with it. One additional option that I couldn't find any info online about was using the query execution plan as a means of detecting if the query is going outside of its bounds.
So prior to actually executing the query and getting the results, you would wrap it in an EXPLAIN statement to see what the dynamic query was doing. From the results of the EXPLAIN query, you should be able to detect any operations (subqueries, key references, UNIONs, etc.) that fall outside the bounds of what the query is allowed to do.
Is this a useful validation method? It seems to me that this would be a powerful tool for protecting against a suite of SQL injections, but I couldn't seem to find any information online.
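To make the idea concrete, here is a rough sketch of the check, assuming the EXPLAIN result has already been fetched as an array of row objects keyed by the standard `select_type` and `table` columns. The function name, the allow-lists, and the sample rows are illustrative only; the rows are hand-written, not real EXPLAIN output.

```javascript
// Inspect rows returned by EXPLAIN and report anything outside the allowed bounds.
// allowedSelectTypes and allowedTables are assumptions to be tuned per query.
function findExplainViolations(rows, allowedSelectTypes, allowedTables) {
  const violations = [];
  for (const row of rows) {
    if (!allowedSelectTypes.includes(row.select_type)) {
      violations.push(`unexpected select_type ${row.select_type}`);
    }
    // Rows that touch no table carry no table name, so only check named tables.
    if (row.table != null && !allowedTables.includes(row.table)) {
      violations.push(`unexpected table ${row.table}`);
    }
  }
  return violations;
}

// Hand-written sample rows for illustration.
const validPlan = [{ id: 1, select_type: "SIMPLE", table: "context" }];
const suspiciousPlan = [
  { id: 1, select_type: "PRIMARY", table: "context" },
  { id: 2, select_type: "UNION", table: "users" },
];

console.log(findExplainViolations(validPlan, ["SIMPLE"], ["context"]));
console.log(findExplainViolations(suspiciousPlan, ["SIMPLE"], ["context"]));
```

One limitation I can already see: a check like this only looks at plan rows, so it would presumably not notice a dangerous scalar function call inside an expression.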
Thanks in advance!
(from Comment)
Some Examples showing the actual autogenerated queries being used. There are both visual and list examples showing the query execution plan for both malicious and valid custom functions.
GRANT only SELECT on the table(s) that they are allowed to manipulate. This allows arbitrarily complex SELECT queries to be run. (The one flaw: Such queries may run for a long time and/or take a lot of resources. MariaDB has more facilities for preventing run-away selects.)
Provide limited "write" access via Stored Routines with expanded privileges, but do not pass arbitrary values into them. See SQL SECURITY: DEFINER has the privileges of the person creating the routine. (By contrast, INVOKER is limited to SELECT on the tables mentioned above.)
Another technique that may or may not be useful is creating VIEWs with select privileges. This, for example, can let the user see most information about employees while hiding the salaries.
Related to that is the ability to GRANT different permissions on different columns, even in the same table.
(I have implemented a similar web app, and released it to everyone in the company. And I could 'sleep at night'.)
I don't see subqueries and Unions as issues. I don't see the utility of EXPLAIN other than to provide more info in case the user is a programmer trying out queries.
EXPLAIN can help in discovering long-running queries, but it is imperfect. Ditto for LIMIT.
More
I think "UDF" is either "normalization" or "EAV"; it is hard to tell which. Please provide SHOW CREATE TABLE.
This is inefficient because it builds a temp table before removing the 'NULL' items:
FROM ( SELECT ...
FROM ...
LEFT JOIN ...
) AS context
WHERE ... IS NULL
This is better because it can do the filtering sooner:
FROM ( SELECT ...
FROM ...
LEFT JOIN ...
WHERE ... IS NULL
) AS context
I wanted to share a solution I found for anyone who comes across this in the future.
To prevent someone from entering some malicious SQL injection in a "custom expression" we decided to preprocess and analyze the SQL prior to sending it to the MySQL database.
Our server is running NodeJS, so we used a parsing library to construct an abstract syntax tree from their custom SQL. From here we can traverse the tree and identify any operations that shouldn't be taking place.
The mock code (it won't run in this example) would look something like:
const valid_types = ["case", "when", "else", "column_ref", "binary_expr", "single_quote_string", "number"];
const valid_tables = ["context"];

// Create a mock SQL expression and parse it into an AST.
// `parser` is the SQL parsing library instance; YOUR_CUSTOM_EXPRESSION is a placeholder.
const exp = YOUR_CUSTOM_EXPRESSION;
const ast = parser.astify(exp);

// Check for attempted multi-statement injections
if (Array.isArray(ast) && ast.length > 1) {
    throw new Error("Multiple statements detected");
}

// Recursively check the AST for disallowed operations
recursive_ast_check([], "columns", ast.columns);

function recursive_ast_check(path, p_key, ast_node) {
    // If the parent key is the "type" of operation, check it against the allowed values
    if (p_key === "type") {
        if (!valid_types.includes(ast_node)) {
            throw new Error("Invalid type '" + ast_node + "' found at following path: " + JSON.stringify(path));
        }
        return;
    }
    // If the parent key is "table", then the value should always be "context"
    if (p_key === "table") {
        if (!valid_tables.includes(ast_node)) {
            throw new Error("Invalid table reference '" + ast_node + "' found at following path: " + JSON.stringify(path));
        }
        return;
    }
    // Ignore null or empty nodes
    if (!ast_node) { return; }
    // Recursively search array values down the chain
    if (Array.isArray(ast_node)) {
        for (let i = 0; i < ast_node.length; i++) {
            recursive_ast_check([...path, p_key], i, ast_node[i]);
        }
        return;
    }
    // Recursively search object keys down the chain
    if (typeof ast_node === "object") {
        for (const key of Object.keys(ast_node)) {
            recursive_ast_check([...path, p_key], key, ast_node[key]);
        }
    }
}
This is just a mockup adapted from our implementation, but hopefully it will provide some guidance. I should also note that it is best to implement all of the strategies discussed above as well. Many safeguards are better than just one.
What's the best way to store data in a database to keep track of whether a user has clicked a like/favorite button?
I've tried the following data structure:
postId:{
...
like_count:123,
user_who_liked:[uid1,uid2,uid3...]
}
When the data above is downloaded to the client, we can check if the client liked the post or not by checking if the array user_who_liked contains the client's uid or not. However, if there are more than 1000 items in the user_who_liked array, this approach consumes too much redundant bandwidth.
I found the following json data from instagram:
{
...
viewer_has_liked: false
viewer_has_saved: false
viewer_has_saved_to_collection: false
viewer_in_photo_of_you: false
}
This approach seems more efficient, but how does the database know whether the user has liked the post? Do they store a bunch of uids and check whether the user's uid is among them? Are those has_liked/has_saved booleans derived attributes?
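To illustrate what I mean by a derived attribute, here is one way such a flag could be computed on the server. This is purely my assumption and not a description of how Instagram works; `likesByPost` and `postForViewer` are hypothetical names.

```javascript
// likesByPost maps postId -> Set of uids that liked it (kept on the server).
const likesByPost = new Map([
  ["post1", new Set(["uid1", "uid2", "uid3"])],
]);

// Build the payload for one viewer: the count plus a derived boolean,
// never the full list of uids.
function postForViewer(postId, viewerUid) {
  const likers = likesByPost.get(postId) || new Set();
  return {
    postId,
    like_count: likers.size,
    viewer_has_liked: likers.has(viewerUid),
  };
}

console.log(postForViewer("post1", "uid2"));
console.log(postForViewer("post1", "uid9"));
```

The client would then receive only the count and one boolean per post, regardless of how many users liked it.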
I'm trying to create a database (json) with Firebase.
I searched the docs and the net but couldn't find a clear way to start.
I want to have a database of users.
Each user (represented by a UID) should have a nickname and a list of friends.
I tried making a .json file that looks like this:
{
users:{
}
}
and adding it to the Firebase console to get started but it wouldn't work.
How can I do it?
the database should look like this:
{
users:{
UID:{
nickname: hello
friends: UID2
}
UID2:{
nickname: world
friends: UID
}
}
I don't know if I got that right, so I would really appreciate any help you could give me on this subject.
Thanks in advance!
Seems like a good place to start. I would make two changes though.
keep the list of friends separate
keep the friends as a set, instead of a single value or array
keep the list of friends separate
A basic recommendation when using the Firebase Database is to keep your data structure shallow/flat. There are many reasons for this, and you have at least two.
With your current data structure, say that you want to show a list of user names. You can only get that list by listening to /users. And that means you don't just get the user name for each user, but also their list of friends. Chances that you're going to show all that data to the user are minimal, so that means that you've just wasted some of their bandwidth.
Say that you want to allow everyone to read the list of user names, but you only want each user to be able to read their own list of friends. Your current data structure makes that hard, since permissions cascade and rules are not filters.
A better structure is to keep the list of user profiles (currently just their name) separate from the list of friends for each user.
keep the friends as a set
You currently have just a single value for the friends property. As you start building the app you will need to store multiple friends. The most common approach is then to store an array or list of UIDs:
[ UID1, UID2, UID3 ]
Or
{
"-K.......1": "UID1"
"-K.......5": "UID2"
"-K.......9": "UID3"
}
These are unfortunately the wrong type for this data structure. Both the array and the second collection are lists: an ordered collection of (potentially) non-unique values. But a collection of friends doesn't have to be ordered, it has to be unique. I'm either in the collection or I'm not in there, I can't be in there multiple times and the order typically doesn't matter. That's why you often end up looking for friends.contains("UID1") or ref.orderByValue().equalTo("UID1") operations with the above models.
A much better model is to store the data as a set. A set is a collection of unordered values, which have to be unique. Perfect for a collection of friends. To store that in Firebase, we use the UID as the key of the collection. And since we can't store a key without a value, we use true as the dummy value.
So this leads to this data model:
{
  users:{
    UID:{
      nickname: hello
    }
    UID2:{
      nickname: world
    }
  }
  friends:{
    UID:{
      UID2: true
    }
    UID2:{
      UID: true
    }
  }
}
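As a small illustration of why the set shape is convenient, here is a sketch that uses a plain in-memory object as a stand-in for the `friends` node. The helper names `addFriendship` and `areFriends` are made up for this example and no Firebase calls are involved.

```javascript
// In-memory stand-in for the `friends` node from the model above.
const friends = { UID: { UID2: true }, UID2: { UID: true } };

// Record the friendship in both directions; the UID is the key, true is the dummy value.
function addFriendship(db, a, b) {
  db[a] = { ...(db[a] || {}), [b]: true };
  db[b] = { ...(db[b] || {}), [a]: true };
}

// Membership is a direct key lookup, no scanning of a list.
function areFriends(db, a, b) {
  return Boolean(db[a] && db[a][b] === true);
}

addFriendship(friends, "UID", "UID3");
addFriendship(friends, "UID", "UID3"); // adding the same friend twice changes nothing
console.log(areFriends(friends, "UID", "UID3"));
console.log(Object.keys(friends.UID));
```

Because the UID is the key, duplicates cannot occur, and checking a friendship maps to reading a single path such as `friends/UID/UID3`.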
There is a lot more to say/learn about NoSQL data modeling in general and Firebase specifically. To learn about that, I recommend reading NoSQL data modeling and watching Firebase for SQL developers.
I keep a collection of Friends where the users field is an array of 2 user ids: ['user1', 'user2'].
Getting the friends of a user is easy:
friendsCollection.where("users", "array-contains", "user1").get()
This should get you all documents where user1 appears.
Now the tricky part was on how to query a single friend. Ideally, firebase would support multiple values in array-contains, but they won't do that: https://github.com/firebase/firebase-js-sdk/issues/1169
So the way I get around this is to normalize the users list before adding the document. Basically I use JavaScript's string comparison to determine which userId is greater and which is smaller, and then build the list in that order.
when adding a friend:
const user1 = sentBy > sentTo ? sentBy : sentTo
const user2 = sentBy > sentTo ? sentTo : sentBy
const friends = { users: [user1, user2] }
await friendsCollection.add(friends)
This basically ensures that whoever is part of the friendship will always be listed in the same order, so when querying, you can just:
await friendsCollection.where("users", "==", [user1, user2]).get()
This obviously only works because I trust that the list will always have 2 items and that the string comparison is deterministic, but it's a great solution for this specific problem.
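The ordering step can be pulled into a small helper so that writes and queries share it. This is just a sketch; `friendshipUsers` is a name I made up, and the self-friendship check is an extra assumption that is not part of my original code.

```javascript
// Order the two ids the same way no matter who sent the request.
function friendshipUsers(sentBy, sentTo) {
  if (sentBy === sentTo) {
    throw new Error("a user cannot befriend themselves");
  }
  // Plain string comparison: the greater id always comes first.
  return sentBy > sentTo ? [sentBy, sentTo] : [sentTo, sentBy];
}

console.log(friendshipUsers("alice", "bob"));
console.log(friendshipUsers("bob", "alice"));
```

Both calls produce the same array, which is what makes the equality query on the users field work.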
I found this great node mysql boilerplate:
https://github.com/ocastillo/nodejs-mysql-boilerplate
it works terrific! However, now I need to hook it in to my existing user table, and my key field is named userID, not simply id, and changing the key fieldname in mysql breaks the example. So my question is, where in the project do I need to specify a different id field name? I see user.id in /util/auth.js passport.serializeUser and id in passport.deserializeUser functions, but it seems it must be specified elsewhere too. I'm hoping this is a simple question for users of passportjs!
Yes, you should only need to change the code in the serializeUser and deserializeUser functions. You control those two functions, and within them you state what you'd like to serialize into the session cookie (when the user logs in) and deserialize from the session cookie (when the user revisits the site after logging in). Think of them as ways to remember who this person is once they return. The passport.use function is only used to define the authentication strategy and, within that, the manner in which you'll "log the user in".
So this should work (assuming I've followed what you've said above):
passport.serializeUser(function(user, done) {
  done(null, user.userID);
});
passport.deserializeUser(function(user_id, done) {
  new data.ApiUser({userID: user_id}).fetch().then(function(user) {
    return done(null, user);
  }, function(error) {
    return done(error);
  });
});
You might benefit from more examples, here's a gist I put together on passport configuration within Node (however this one uses Mongo): https://gist.github.com/dylants/8030433
I have a small app where users create things that are assigned to them.
There are multiple users but all the things are in the same table.
I show the things belonging to a user by retrieving all the things with that user's id, but nothing prevents a user from seeing another user's things by manually typing the thing's ID in the URL.
Also when a user wants to create a new thing, I have a validation rule set to unique but obviously if someone else has a thing with the same name, that's not going to work.
Is there a way in my Eloquent Model to specify that all interactions should only be allowed for things belonging to the logged in user?
This would mean that when a user tries to go to /thing/edit for a thing they don't own, they would get an error message.
The best way to do this would be to check that a "thing" belongs to a user in the controller for the "thing".
For example, in the controller, you could do this:
// Assumes that the controller receives $thing_id from the route.
$thing = Things::find($thing_id); // Or however you retrieve the requested thing.
// Assumes that you have a 'user_id' column in your "things" table.
// find() returns null for an unknown id, so check that the thing exists first.
if ($thing && $thing->user_id == Auth::user()->id) {
    // Thing belongs to the user, display thing.
} else {
    // Thing does not exist or does not belong to the current user, display error.
}
The same could also be accomplished using relational tables.
// Get the thing based on current user, and a thing id
// from somewhere, possibly passed through route.
// This assumes that the controller receives $thing_id from the route.
$thing = Users::find(Auth::user()->id)->things()->where('id', '=', $thing_id)->first();
if ($thing) {
    // Display Thing
} else {
    // Display access denied error.
}
The 3rd Option:
// Same as the second option, but with firstOrFail().
$thing = Users::find(Auth::user()->id)->things()->where('id', '=', $thing_id)->firstOrFail();
// No if statement is needed, as the app will throw a 404 error
// (or exception if errors are on)
Correct me if I am wrong; I am still a novice with Laravel myself. But I believe this is what you are looking to do. I can't help much more without seeing the code for your "thing", the "thing" route, the "thing" controller, or how your "thing" model is set up using Eloquent (if you use Eloquent).
I think the functionality you're looking for can be achieved using Authority (this package is based off of the rails CanCan gem by Ryan Bates): https://github.com/machuga/authority-l4.
First, you'll need to define your authority rules (see the examples in the docs) and then you can add filters to specific routes that have an id in them (edit, show, destroy) and inside the filter you can check your authority permissions to determine if the current user should be able to access the resource in question.