Which data can I unhesitatingly send over a GET request? - mysql

I am building a VueJS application with express and sequelize to access a mysql database (currently running with XAMPP). I have a database which consists of 3 tables:
users: [id (primary key), name, email, family_id (foreign key)]
families: [id (primary key), name]
hobbies: [id (primary key), name, user_id (foreign key)]
All of these IDs are auto_increment so the first user registered gets the ID 1 and so on.
Every user within the same family (so with equal family_id) is allowed to see the hobbies of the other family members. I have a SQL query which gives me all the family members. On my website I have a simple drop-down menu where I can select the member. With a GET request I then want to retrieve all hobbies of the selected member.
Now I can basically decide whether I use the id or the email for the request parameter, e.g. /api/hobbies/:id or /api/hobbies/:email. Email reveals more private information, while id reveals information about my internal structure, like "At least (id) number of users exist." I think it is better to use the id.
Maybe there is also the possibility to assign a random id (not auto increment) in the database? But I don't know how to do this.

Nothing you send as a parameter to a GET request is private. Those parameters are part of the URL you GET, and those URLs can be logged in various proxy servers, etc, all over the internet without your consent or your users' consent.
It seems to me that family members' hobbies can be sensitive data. What if the whole family likes, say, golf? A cybercreep could easily figure out that a good time for burglary would be Saturday afternoons.
And if your app does GET operations with autoincrementing id values, it's child's play for a cybercreep to examine any record they want. Check out the Panera Bread data breach for example. https://krebsonsecurity.com/2018/04/panerabread-com-leaks-millions-of-customer-records/
At a minimum use POST for that kind of data.
Better yet, use a good authentication / session token system on your app, and conceal data from users if they're not members of that family.
And, if you want to use REST style GET parameters, you need to do these things to be safe:
Use randomized id values. It must be very difficult for a cybercreep to guess a second value from knowing a first value. Serial numbers won't do. Neither will email addresses.
Make sure unauthenticated users can see no data.
Make sure authenticated users can only see the subset of data for which they're authorized.
My suggestion to avoid REST-style GET parameters comes from multiple security auditors who told me: you must change that.
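For the randomized id values suggested above, here is a minimal sketch of one way to generate them. It uses Python's `secrets` module and an in-memory SQLite database as a stand-in for MySQL; the `public_id` column and the `create_user` helper are hypothetical names, not part of the original schema. The auto-increment `id` stays as the internal key and only the random value appears in URLs.

```python
import secrets
import sqlite3

# SQLite stands in for MySQL; public_id is a hypothetical extra column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # internal key, never sent to clients
    " public_id TEXT NOT NULL UNIQUE,"        # random value used in URLs
    " name TEXT NOT NULL)"
)


def create_user(name):
    # 16 bytes from the OS random source, URL-safe base64 encoded (22 characters).
    public_id = secrets.token_urlsafe(16)
    conn.execute(
        "INSERT INTO users (public_id, name) VALUES (?, ?)", (public_id, name)
    )
    return public_id


alice = create_user("Alice")
bob = create_user("Bob")

# Look a user up by the random value, e.g. for /api/hobbies/<public_id>.
row = conn.execute(
    "SELECT name FROM users WHERE public_id = ?", (alice,)
).fetchone()
print(row[0])
```

Knowing one `public_id` gives no practical hint about another, but this only makes guessing harder. The authentication and authorization checks listed above are still required.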

Related

Securing MySQL id numbers so they are not sequential

I am working on a little package using PHP and MySQL to handle entries for events. After completing an entry form the user will see all his details on a page called something like website.com/entrycomplete.php?entry_id=15 where the entry_id is a sequential number. Obviously it will be laughably easy for a nosey person to change the entry_id number and look at other people's entries.
Is there a simple way of camouflaging the entry_id? Obviously I'm not looking to secure the Bank of England so something simple and easy will do the job. I thought of using MD5 but that produces quite a long string so perhaps there is something better.
Security through obscurity is no security at all.
Even if the id's are random, that doesn't prevent a user from requesting a few thousand random id's until they find one that matches an entry that exists in your database.
Instead, you need to secure the access privileges of users, and disallow them from viewing data they shouldn't be allowed to view.
Then it won't matter if the id's are sequential.
If the users do have some form of authentication/login, use that to determine if they are allowed to see a particular entry id.
If not, instead of using a url parameter for the id, store it in and read it from a cookie. And be aware that this is still not secure. An additional step you could take (short of requiring user authentication) is to cryptographically sign the cookie.
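To illustrate the signed cookie mentioned above, here is a rough sketch using an HMAC from Python's standard library. The key, the `value.signature` cookie layout, and the function names `sign_cookie` and `read_cookie` are assumptions made for this example.

```python
import hashlib
import hmac

# Hypothetical key; in practice a long random value kept only on the server.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"


def _mac(value):
    return hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()


def sign_cookie(entry_id):
    # Cookie layout (an assumption for this sketch): "<entry_id>.<hex signature>"
    value = str(entry_id)
    return f"{value}.{_mac(value)}"


def read_cookie(cookie):
    # Returns the entry id, or None if the signature does not match.
    value, _, mac = cookie.partition(".")
    if not hmac.compare_digest(mac.encode(), _mac(value).encode()):
        return None
    return int(value)


cookie = sign_cookie(15)
forged = "16." + cookie.split(".")[1]  # id changed, old signature reused
print(read_cookie(cookie), read_cookie(forged))
```

A visitor can still read the entry id in the cookie, but cannot change it to another id without knowing the server-side key.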
A better way to implement this is to show only the records that belong to that user. Say the id is the unique identifier for each user. Now store both entry_id and id in your table (say table name is entries).
Now when the user requests a record, add another condition to the MySQL query like this:
select * from entries where entry_id=5 and id=30;
So if entry_id 5 does not belong to this user, it will not have any result at all.
As for preventing the user from changing his own id, you can use JWT tokens. Issue a token on login and send it with every call. The back end then verifies the token's signature and reads the user's actual id from it.
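The ownership condition in the query above can be sketched as a parameterized lookup. This example uses SQLite in place of MySQL, and the `details` column and the `get_entry` helper are hypothetical. The important assumption is that `authenticated_user_id` comes from the verified session or token, never from the URL.

```python
import sqlite3

# SQLite stands in for MySQL; "details" is a hypothetical payload column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " entry_id INTEGER PRIMARY KEY,"
    " id INTEGER NOT NULL,"  # the owning user's id, as in the answer above
    " details TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [(5, 30, "Entry of user 30"), (6, 31, "Entry of user 31")],
)


def get_entry(entry_id, authenticated_user_id):
    # Placeholders keep user input out of the SQL text; the second
    # condition makes other users' entries invisible.
    return conn.execute(
        "SELECT entry_id, details FROM entries WHERE entry_id = ? AND id = ?",
        (entry_id, authenticated_user_id),
    ).fetchone()


print(get_entry(5, 30))  # user 30 asks for their own entry
print(get_entry(6, 30))  # user 30 asks for someone else's entry
```

The second call returns no row, so the caller can respond exactly as it would for an entry that does not exist.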

Grouping with associated variables

I have a table as below:
Account no. Login Name Numbering
1234 rty234 1
1234 bhoin1 1
3456 rty234 2
3456 0hudp 2
9876 cfrdk 3
From the table above, you can see that rty234 and bhoin1 registered the same account no of 1234, thus I know that rty234 and bhoin1 are related and I numbered them as 1. The numbering field was based on the account no.
Then I found that rty234 also registered another account no of 3456, and the same account no was registered by 0hudp as well. Thus, I concluded that rty234, bhoin1 and 0hudp are related. Therefore, I want to renumber the third and fourth rows to 1. Rows that are not further related should keep their numbering. How can I achieve that using mysql?
The expected output will be as follow:
Account no. Login Name Numbering New_Numbering
1234 rty234 1 1
1234 bhoin1 1 1
3456 rty234 2 1
3456 0hudp 2 1
9876 cfrdk 3 3
You need to understand how to design a relational database.
These groupings that you want to make with the New_Numbering field should be done at the time the accounts are registered. I see two pieces of arbitrary information that needs to be tracked: account number and login name. Seems like the people registering the account can type whatever they want here, effectively, perhaps account numbers must be numerical. That detail doesn't matter.
What you want here is one account which can have multiple account numbers associated with it, and multiple logins. I would also assume that future development may add more to this, for example - why do people need multiple logins? Maybe different people are using them, or different applications. Presumably, we could collect additional information about the login names that stores additional details about each login. The same could be said about account numbers - certainly they contain more detail than just an account number.
First, you need one main login table.
You describe rty234 and bhoin1 as if they are unique people. So make this a login_name column with a unique index in a login table. This table should have an auto-increment login_id as the primary key. Probably this table also has a password field and additional information about that person.
Second, create an account table.
After creating their login, make them register an account with that login. Make this a two-step process. When they offer a new account number, create a record for it in the account table with additional identifying information that only the account-holder would know. Somehow you have to validate that this is actually their account in order to create this record, I would think. This table would also contain an auto-incremented primary key called account_id in addition to account_no and probably other details about the account.
Third, create a login_account table.
Once you validate that a login actually should have access to an account, create a record here. This should contain a login_id and an account_id which connects these two tables. Additionally, it might be good to include the information provided which shows that this login should have access to this account.
Now, when you want to query this data, you can find groups of data that have the same login_id or account_id, or even that share either a login or an account with a specific registration. Beyond that, it gets hairy to do in an SQL query. So if you really want to be able to go through the data and see who is in the same organization or something, because they share either a login or an account with the same group, you have to have some sort of script.
Create an organization table.
This table should contain an organization_id so you can track it, but probably once you identify the group you'll want to add a name or additional notes, or link it to additional functionality. You can then also add this organization_id field to the login or account tables, so you can fill them once you know the organization. You have to think about if it's possible for two organizations to share accounts, and maybe there's a more complicated design necessary. But I'm going to keep it simple here.
Your script should load up all of the login_id and account_id values and cache them somewhere. Then go through them all and if they have an organization_id, put their login_id or account_id in a hashmap with the value as the organization_id. Then load up all of the login_account records. If either the login_id or account_id has an organization_id in its hashmap, then add the other to its hashmap with the same organization_id. (if there's already one there, it would violate the simple organization uniqueness assumption I made, but this is where you would handle complexity - so I would just throw an exception and see if it happens when I run the script)
Hopefully this is enough example to get you started. When you properly design a database like this, you allow the information to connect naturally. This makes column additions and future updates much easier. Good luck!
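As a rough illustration of the kind of script described above, here is a sketch that groups the question's sample rows. Note that it swaps in a union-find (disjoint-set) structure rather than the hashmap passes described in the answer, and it works on the original flat table rather than the redesigned one. Taking the lowest existing Numbering in each group as New_Numbering is an assumption read off the expected output.

```python
# Sample rows from the question: (account_no, login_name, numbering)
rows = [
    (1234, "rty234", 1),
    (1234, "bhoin1", 1),
    (3456, "rty234", 2),
    (3456, "0hudp", 2),
    (9876, "cfrdk", 3),
]

parent = {}


def find(node):
    # Return the representative of node's group, shortening the path on the way.
    parent.setdefault(node, node)
    while parent[node] != node:
        parent[node] = parent[parent[node]]
        node = parent[node]
    return node


def union(a, b):
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a


# Each row links an account number to a login name.
for account_no, login, _ in rows:
    union(("account", account_no), ("login", login))

# Assumption: a group's new number is the lowest old number inside it.
lowest = {}
for account_no, _, numbering in rows:
    root = find(("account", account_no))
    lowest[root] = min(numbering, lowest.get(root, numbering))

new_numbering = [lowest[find(("account", a))] for a, _, _ in rows]
print(new_numbering)  # [1, 1, 1, 1, 3]
```

Accounts and logins are tagged with their kind so that an account number and a login name that happen to look alike are never merged by accident.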

Storing userID and other data and using it to query database

I am developing an app with PhoneGap and have been storing the user id and user level in local storage, for example:
window.localStorage["userid"] = "20";
This populates once the user has logged in to the app. This is then used in ajax requests to pull in their information and things related to their account (some of it quite private). The app is also been used in web browser as I am using the exact same code for the web. Is there a way this can be manipulated? For example user changes the value of it in order to get info back that isnt theirs?
If, for example another app in their browser stores the same key "userid" it will overwrite and then they will get someone elses data back in my app.
How can this be prevented?
Before going further into attack vectors: storing this kind of sensitive data on the client side is not a good idea. Use a token instead, because any data stored on the client side can be spoofed by attackers.
Your concerns are justified. A possible attack vector here is Insecure Direct Object Reference. Let me show an example.
You are storing the userID on the client side, which means you cannot trust that data anymore.
window.localStorage["userid"] = "20";
Attackers can change that value to anything they want. Most likely they will change it to a value lower than 20, because a value like 20 usually comes from an auto-increment column, which means there are probably valid users with userid 19, 18, and so on.
Let's assume your application has a module for getting products by userid. The backend query would then look something like this:
SELECT * FROM products WHERE owner_id = 20
When attackers change that value to something else, they get data that belongs to someone else. They may also be able to remove or update data that belongs to someone else.
The possible attack vectors really depend on your application and its features. As I said before, you need to figure this out and avoid exposing sensitive data like the userID.
Using a token instead of the userID addresses these attempts. All you need to do is create one more column named "token" and use it instead of userid. (Don't forget to generate long and unpredictable token values.)
SELECT * FROM products WHERE owner_id = 'iZB87RVLeWhNYNv7RV213LeWxuwiX7RVLeW12'
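One way the token approach described above could look on the server is sketched below. The in-memory `sessions` dictionary stands in for a token column or sessions table, and `log_in` and `products_for` are hypothetical helpers. The client only ever stores the token, and the server derives the user id from it.

```python
import secrets

# token -> user id; an in-memory stand-in for a token column or sessions table.
sessions = {}


def log_in(user_id):
    # Called after the password check succeeds; 32 random bytes, URL-safe.
    token = secrets.token_urlsafe(32)
    sessions[token] = user_id
    return token


def products_for(token, products):
    # The user id is looked up from the token, never taken from the client.
    user_id = sessions.get(token)
    if user_id is None:
        raise PermissionError("unknown or expired token")
    return [p for p in products if p["owner_id"] == user_id]


products = [
    {"name": "lamp", "owner_id": 20},
    {"name": "desk", "owner_id": 19},
]
token = log_in(20)
print(products_for(token, products))
```

Changing the stored token to another value no longer selects user 19's data; it simply fails the lookup.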

data type for emails

I have a program where the user can enter multiple email addresses to get notification. I'm creating a field in the database to keep track of this and I'm not sure what would be the best data type to choose for all the email addresses. At this point I believe we will limit it to 4 email addresses.
What data type would be appropriate here for mysql?
Not sure this is relevant but I plan to serialize the data (with php function) When processing the email addresses. Interested in any feedback on my plans and if there is a better way to do this.
This indicates that you have 1:many relation of user:email addresses. Create another table with user_id and email columns and link it up to your users table via user_id.
Never serialize data and stick it in a column, you'll regret it later.
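A minimal sketch of that second table follows, using SQLite as a stand-in for MySQL. The table name `user_emails`, the `VARCHAR(255)` length, and the `add_email` helper are assumptions for illustration. The limit of 4 from the question is enforced in application code here.

```python
import sqlite3

MAX_EMAILS = 4  # the limit mentioned in the question

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE user_emails ("
    " user_id INTEGER NOT NULL REFERENCES users(user_id),"
    " email VARCHAR(255) NOT NULL,"    # one address per row, no serialization
    " PRIMARY KEY (user_id, email))"   # blocks duplicates per user
)


def add_email(user_id, email):
    (count,) = conn.execute(
        "SELECT COUNT(*) FROM user_emails WHERE user_id = ?", (user_id,)
    ).fetchone()
    if count >= MAX_EMAILS:
        raise ValueError("email limit reached")
    conn.execute("INSERT INTO user_emails VALUES (?, ?)", (user_id, email))


conn.execute("INSERT INTO users VALUES (1, 'Pat')")
for n in range(4):
    add_email(1, f"pat{n}@example.com")

emails = [
    row[0]
    for row in conn.execute(
        "SELECT email FROM user_emails WHERE user_id = 1 ORDER BY email"
    )
]
print(emails)
```

Raising the limit later only means changing `MAX_EMAILS`, and each address can be queried or indexed on its own.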

Database Design: User Profiles like in Meetup.com

In Meetup.com, when you join a meetup group, you are usually required to complete a profile for that particular group. For example, if you join a movie meetup group, you may need to list the genres of movies you enjoy, etc.
I'm building a similar application, wherein users can join various groups and complete different profile details for each group. Assume the 2 possibilities:
Users can create their own groups and define what details to ask users that join that group (so, something a bit dynamic -- perhaps suggesting that at least an EAV design is required)
The developer decides now which groups to create and specify what details to ask users who join that group (meaning that the profile details will be predefined and "hard coded" into the system)
What's the best way to model such data?
More elaborate example:
The "Movie Goers" group request their members to specify the following:
Name
Birthdate (to be used to compute member's age)
Gender (must select from "male" or "female")
Favorite Genres (must select 1 or more from a list of specified genres)
The "Extreme Sports" group request their member to specify the following:
Name
Description of Activities Enjoyed (narrative form)
Postal Code
The bottom line is that each group may require different details from members joining their group. Ideally, I would like anyone to create a group (ala MeetUp.com). However, I also need the ability to query for members fairly well (e.g. find all women movie goers between the ages of 25 and 30).
For something like this, you'd want maximum normalization, so you wouldn't have duplicate data anywhere. Because your user-defined tables could possibly contain the same type of record, I think that you might have to go above 3NF for this.
My suggestion would be this - explode your tables so that you have something close to 6NF with EAV, so that each question that users must answer will have its own table. Then, your user-created tables will all reference one of your question tables. This avoids the duplication of data issue. (For instance, you don't want an entry in the "MovieGoers" group with the name "John Brown" and one in the "Extreme Sports" group with the name "Johnny B." for the same user; you also don't want his "what is your favorite color" answer to be "Blue" in one group and "Red" in another. Any data that can span across groups, like common questions, would be normalized in this form.)
The main drawback to this is that you'd end up with a lot of tables, and you'd probably want to create views for your statistical queries. However, in terms of pure data integrity, this would work well.
Note that you could probably get away with only factoring out the common fields, if you really wanted to. Examples of common fields would include Name, Location, Gender, and others; you could also do the same for common questions, like "what is your favorite color" or "do you have pets" or something to that extent. Group-specific questions that don't span across groups could be stored in a separate table for that group, un-exploded. I wouldn't advise this because it wouldn't be as flexible as the pure 6NF option and you run the risk of duplication (how do you predetermine which questions won't be common questions?) but if you really wanted to, you could do this.
There's a good question about 6NF here: Would like to Understand 6NF with an Example
I hope that made some sense and I hope it helps. If you have any questions, leave a comment.
Really, this is exactly a problem for which SQL is not the right solution. Forget normalization. This is exactly the job for NoSQL document stores. Every user is a document with some essential fields like id, name, pwd etc., and every group can add some fields of its own. Fields unique to a group can have names prefixed with the group id, while shared fields (that capture a more general concept) can keep the plain field name.
Besides users (and groups), you will also have field descriptions with name, type, possible values, and so on, which is also a very good fit for a document store.
If you use key-value document store from the beginning, you gain this freeform possibility of structuring your data plus querying them (though not by SQL, but by the means this or that NoSQL database provides).
First, I'd like to note that the following structure is just a basis for your DB and you will need to expand or reduce it.
There are the following entities in DB:
user (just user)
group (any group)
template (list of requirements grouped into a template to simplify assignment)
requirement (single requirement. For example: date of birth, gender, favorite sport)
"Modeling":
**User**
user_id
user_name
**Group**
name
group_id
user_group
user_id (FK)
group_id (FK)
**requirement**:
requirement_id
requirement_name
requirement_type (FK) (the type: combo, free string, date; should refer to a dictionary table)
**template**
template_id
template_name
**template_requirement**
r_id (FK)
t_id (FK)
The next step is to model an appropriate schema for storing restrictions, i.e. the validation rule for any requirement in any template. We have to separate them because the same requirement can have different restrictions in different groups (for example: "age"). You can use the following table:
**restrictions**
group_id
template_id
requirement_id (should be here alongside template_id because the same requirement can exist in different templates and any group can consist of many templates)
restriction_type (FK) (points to another dictionary: value, length, regexp, at_least_one_value_chosen and so on)
So, as I said, this is just the basis. Feel free to simplify this schema (remove tables, drop multiple templates per group). Or you can make it more general by adding the ability to create and publish templates, requirements and so on.
Hope you find this idea useful
You could save such data as JSON or XML (Structure, Data)
User Table
Userid
Username
Password
Groups -> JSON Array of all Groups
GroupStructure Table
Groupid
Groupname
Groupstructure -> JSON Structure (with specified Fields)
GroupData Table
Userid
Groupid
Groupdata -> JSON Data
I think this covers most of your constraints:
users
user_id, user_name, password, birth_date, gender
1, Robert Jones, *****, 2011-11-11, M
group
group_id, group_name
1, Movie Goers
2, Extreme Sports
group_membership
user_id, group_id
1, 1
1, 2
group_data
group_data_id, group_id, group_data_name
1, 1, Favorite Genres
2, 2, Favorite Activities
group_data_value
id, group_data_id, group_data_value
1,1,Comedy
2,1,Sci-Fi
3,1,Documentaries
4,2,Extreme Cage Fighting
5,2,Naked Extreme Bike Riding
user_group_data
user_id, group_id, group_data_id, group_data_value_id
1,1,1,1
1,1,1,2
1,2,2,4
1,2,2,5
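To check that a relational layout like the one above supports the search from the question (women movie goers between the ages of 25 and 30), here is a small sketch using SQLite in place of MySQL. The sample users and the fixed reference date are invented for illustration, the table is called `meetup_group` because GROUP is a reserved word, and `age_on` is a hypothetical helper registered as an SQL function. In MySQL, `TIMESTAMPDIFF(YEAR, birth_date, CURDATE())` could compute the age instead.

```python
import sqlite3
from datetime import date


def age_on(birth_iso, today_iso):
    # Whole years between two ISO dates (birthday not yet reached -> minus 1).
    born, today = date.fromisoformat(birth_iso), date.fromisoformat(today_iso)
    return today.year - born.year - ((today.month, today.day) < (born.month, born.day))


conn = sqlite3.connect(":memory:")
conn.create_function("age_on", 2, age_on)
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, user_name TEXT,
                    birth_date TEXT, gender TEXT);
CREATE TABLE meetup_group (group_id INTEGER PRIMARY KEY, group_name TEXT);
CREATE TABLE group_membership (user_id INTEGER, group_id INTEGER);
-- Invented sample data
INSERT INTO users VALUES
  (1, 'Ann Lee', '1996-05-20', 'F'),
  (2, 'Robert Jones', '1990-01-01', 'M'),
  (3, 'Cara Diaz', '1980-03-03', 'F'),
  (4, 'Dana Wu', '1997-07-07', 'F');
INSERT INTO meetup_group VALUES (1, 'Movie Goers'), (2, 'Extreme Sports');
INSERT INTO group_membership VALUES (1, 1), (2, 1), (3, 1), (4, 2);
""")

names = [
    row[0]
    for row in conn.execute(
        "SELECT u.user_name FROM users u"
        " JOIN group_membership m ON m.user_id = u.user_id"
        " JOIN meetup_group g ON g.group_id = m.group_id"
        " WHERE g.group_name = 'Movie Goers' AND u.gender = 'F'"
        " AND age_on(u.birth_date, ?) BETWEEN 25 AND 30"
        " ORDER BY u.user_name",
        ("2024-01-01",),  # fixed reference date so the result is repeatable
    )
]
print(names)  # ['Ann Lee']
```

Because birth_date and gender live in ordinary columns, the filter is a plain join with no parsing of stored blobs.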
I've had similar issues to this. I'm not sure if this would be the best recommendation for your specific situation but consider this.
Provide a means of storing data as XML, or JSON, or some other format that delimits the data, but basically stores it in field that has no specific format.
Provide a way to store the definition of that data
Provide a lookup/index table for the data.
This is a combination of techniques indicated already.
Essentially, you would create some interface for your clients to create a "form" for what they want saved. This form would indicate what pieces of information they want from the user. It would also indicate what pieces of information you want to search on.
Save this information to the definition table.
The definition table is then used to describe the user interface for entering data.
Once user data is entered, save the data (as xml or whatever) to one table with a unique id. At the same time, another table will be populated as an index with
id where the xml data was saved
name of field data is stored in
value of field data stored.
id of data definition.
Now when a search commences, there should be no issue in searching the index table by name, value and definition id and getting back the id of the xml/json (or whatever) data stored in the table that holds the form data.
That data should be transformable once it is retrieved.
I was seriously sketchy on the details here, I hope this is enough of an answer to get you started. If you would like any explanation or additional details, let me know and I'll be happy to help.
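The data-plus-index idea above could be sketched roughly as follows, with JSON as the storage format and SQLite standing in for MySQL. The table names `form_data` and `form_index` and the helpers `save_form` and `search` are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE form_data (data_id INTEGER PRIMARY KEY,
                        definition_id INTEGER, body TEXT);
CREATE TABLE form_index (data_id INTEGER, definition_id INTEGER,
                         field_name TEXT, field_value TEXT);
""")


def save_form(definition_id, fields, searchable):
    # Store the whole form as one JSON document...
    cur = conn.execute(
        "INSERT INTO form_data (definition_id, body) VALUES (?, ?)",
        (definition_id, json.dumps(fields)),
    )
    data_id = cur.lastrowid
    # ...and copy only the fields marked as searchable into the index table.
    conn.executemany(
        "INSERT INTO form_index VALUES (?, ?, ?, ?)",
        [(data_id, definition_id, name, str(fields[name])) for name in searchable],
    )
    return data_id


def search(definition_id, name, value):
    rows = conn.execute(
        "SELECT d.body FROM form_index i"
        " JOIN form_data d ON d.data_id = i.data_id"
        " WHERE i.definition_id = ? AND i.field_name = ? AND i.field_value = ?",
        (definition_id, name, str(value)),
    ).fetchall()
    return [json.loads(body) for (body,) in rows]


profile = {"name": "Ann", "postal_code": "12345", "activities": "climbing"}
save_form(1, profile, searchable=["postal_code"])
print(search(1, "postal_code", "12345"))
```

Only fields listed as searchable can be found through the index; a search on `activities` returns nothing here because that field was never indexed.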
If you're not tied to MySQL, I suggest using PostgreSQL, which provides built-in array datatypes.
You can define an array-of-varchar field in your groups table to store group-specific fields. To store the values, you can do the same in the membership table.
Compared to XML types that rely on string parsing, this array approach will be really fast.
If you don't like the array approach, you can check out the xml datatype and the optional hstore datatype, which is a key-value store.