Grouping with associated variables - mysql

I have a table as below:
Account no.  Login Name  Numbering
1234         rty234      1
1234         bhoin1      1
3456         rty234      2
3456         0hudp       2
9876         cfrdk       3
From the table above, you can see that rty234 and bhoin1 registered the same account no., 1234, so I know that rty234 and bhoin1 are related, and I numbered them 1. The Numbering field is based on the account no.
Then I found that rty234 also registered another account no., 3456, and the same account no. was registered by 0hudp as well. Thus, I concluded that rty234, bhoin1 and 0hudp are all related. Therefore, I want to renumber the third and fourth rows to 1. If rows are not further related, their numbering should stay as it is. How can I achieve that using MySQL?
The expected output is as follows:
Account no.  Login Name  Numbering  New_Numbering
1234         rty234      1          1
1234         bhoin1      1          1
3456         rty234      2          1
3456         0hudp       2          1
9876         cfrdk       3          3

You need to understand how to design a relational database.
These groupings that you want to make with the New_Numbering field should be done at the time the accounts are registered. I see two pieces of arbitrary information that need to be tracked: account number and login name. It seems that the people registering the account can type whatever they want here, although perhaps account numbers must be numerical. That detail doesn't matter.
What you want here is one account which can have multiple account numbers associated with it, and multiple logins. I would also assume that future development may add more to this. For example, why do people need multiple logins? Maybe different people are using them, or different applications. Presumably, we could store additional details about each login. The same could be said about account numbers: certainly they carry more detail than just an account number.
First, you need one main login table.
You describe rty234 and bhoin1 as if they are unique people. So make this a login_name column with a unique index in a login table. This table should have an auto-increment login_id as the primary key. This table probably also has a password field and additional information about that person.
Second, create an account table.
After creating their login, make them register an account with that login. Make this a two-step process. When they offer a new account number, create a record for it in the account table with additional identifying information that only the account-holder would know. Somehow you have to validate that this is actually their account in order to create this record, I would think. This table would also contain an auto-incremented primary key called account_id in addition to account_no and probably other details about the account.
Third, create a login_account table.
Once you validate that a login actually should have access to an account, create a record here. This should contain a login_id and an account_id which connects these two tables. Additionally, it might be good to include the information provided which shows that this login should have access to this account.
Now, when you want to query this data, you can find groups of data that have the same login_id or account_id, or even that share either a login or an account with a specific registration. Beyond that, it gets hairy to do in an SQL query. So if you really want to be able to go through the data and see who is in the same organization or something, because they share either a login or an account with the same group, you have to have some sort of script.
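To make the three tables concrete, here is a minimal sketch. It uses SQLite through Python's sqlite3 module as a stand-in for MySQL (INTEGER PRIMARY KEY plays the role of an auto-increment key), and the helper names link and shared_logins are hypothetical. The validation step described above is deliberately left out, so treat this as an illustration of the shape of the tables rather than a finished design.

```python
import sqlite3

# SQLite stands in for MySQL; INTEGER PRIMARY KEY acts like AUTO_INCREMENT.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE login (
    login_id   INTEGER PRIMARY KEY,
    login_name TEXT NOT NULL UNIQUE
);
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    account_no TEXT NOT NULL UNIQUE
);
CREATE TABLE login_account (
    login_id   INTEGER NOT NULL REFERENCES login(login_id),
    account_id INTEGER NOT NULL REFERENCES account(account_id),
    PRIMARY KEY (login_id, account_id)
);
""")


def link(login_name, account_no):
    """Create the login and account if missing, then connect them.

    Hypothetical helper: a real system would validate ownership first.
    """
    conn.execute("INSERT OR IGNORE INTO login (login_name) VALUES (?)", (login_name,))
    conn.execute("INSERT OR IGNORE INTO account (account_no) VALUES (?)", (account_no,))
    conn.execute(
        """INSERT OR IGNORE INTO login_account (login_id, account_id)
           SELECT l.login_id, a.account_id
           FROM login AS l CROSS JOIN account AS a
           WHERE l.login_name = ? AND a.account_no = ?""",
        (login_name, account_no),
    )


def shared_logins(login_name):
    """Return the other logins that share at least one account with login_name."""
    rows = conn.execute(
        """SELECT DISTINCT other.login_name
           FROM login AS me
           JOIN login_account AS mine   ON mine.login_id = me.login_id
           JOIN login_account AS theirs ON theirs.account_id = mine.account_id
           JOIN login AS other          ON other.login_id = theirs.login_id
           WHERE me.login_name = ? AND other.login_id <> me.login_id
           ORDER BY other.login_name""",
        (login_name,),
    ).fetchall()
    return [row[0] for row in rows]


# The rows from the question.
for name, number in [("rty234", "1234"), ("bhoin1", "1234"),
                     ("rty234", "3456"), ("0hudp", "3456"), ("cfrdk", "9876")]:
    link(name, number)

print(shared_logins("rty234"))
```

This only finds direct neighbours (logins one shared account away). Following the chain any further is where a single query stops being comfortable.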
Create an organization table.
This table should contain an organization_id so you can track it, but probably once you identify the group you'll want to add a name or additional notes, or link it to additional functionality. You can then also add this organization_id field to the login or account tables, so you can fill them once you know the organization. You have to think about if it's possible for two organizations to share accounts, and maybe there's a more complicated design necessary. But I'm going to keep it simple here.
Your script should load all of the login_id and account_id values and cache them somewhere. Then go through them, and for each one that already has an organization_id, put its login_id or account_id in a hashmap as the key, with the organization_id as the value. Then load all of the login_account records. If either the login_id or the account_id of a record has an organization_id in its hashmap, add the other to its hashmap with the same organization_id. (If there's already a different one there, it would violate the simple organization uniqueness assumption I made, but this is where you would handle complexity, so I would just throw an exception and see if it happens when I run the script.)
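As a rough sketch of such a script, the version below swaps the hashmap propagation for a union-find (disjoint-set) structure, which handles chains of any length in one pass. It works directly on the rows from the question, and it assumes New_Numbering should be the smallest existing Numbering in each connected group, which is what the expected output suggests. The function name renumber is hypothetical.

```python
def renumber(rows):
    """rows: list of (account_no, login_name, numbering) tuples.

    Returns the new numbering for each row, in the same order.
    """
    parent = {}

    def find(node):
        # Follow parent links to the group's representative, shortening the path as we go.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every row says "this account and this login belong together".
    for account_no, login_name, _ in rows:
        union(("account", account_no), ("login", login_name))

    # Assumption: each group takes the smallest original numbering found in it.
    smallest = {}
    for account_no, _, numbering in rows:
        root = find(("account", account_no))
        smallest[root] = min(numbering, smallest.get(root, numbering))

    return [smallest[find(("account", account_no))] for account_no, _, _ in rows]


rows = [
    (1234, "rty234", 1),
    (1234, "bhoin1", 1),
    (3456, "rty234", 2),
    (3456, "0hudp", 2),
    (9876, "cfrdk", 3),
]
print(renumber(rows))  # [1, 1, 1, 1, 3]
```

The result could then be written back to a New_Numbering column, or to an organization_id column in the design above.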
Hopefully this is enough example to get you started. When you properly design a database like this, you allow the information to connect naturally. This makes column additions and future updates much easier. Good luck!

Related

Securing MySQL id numbers so they are not sequential

I am working on a little package using PHP and MySQL to handle entries for events. After completing an entry form the user will see all his details on a page called something like website.com/entrycomplete.php?entry_id=15 where the entry_id is a sequential number. Obviously it will be laughably easy for a nosey person to change the entry_id number and look at other people's entries.
Is there a simple way of camouflaging the entry_id? Obviously I'm not looking to secure the Bank of England so something simple and easy will do the job. I thought of using MD5 but that produces quite a long string so perhaps there is something better.
Security through obscurity is no security at all.
Even if the id's are random, that doesn't prevent a user from requesting a few thousand random id's until they find one that matches an entry that exists in your database.
Instead, you need to secure the access privileges of users, and disallow them from viewing data they shouldn't be allowed to view.
Then it won't matter if the id's are sequential.
If the users do have some form of authentication/login, use that to determine if they are allowed to see a particular entry id.
If not, instead of using a url parameter for the id, store it in and read it from a cookie. And be aware that this is still not secure. An additional step you could take (short of requiring user authentication) is to cryptographically sign the cookie.
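For the signing step, a minimal sketch using Python's standard hmac module (the question uses PHP, so treat this as an illustration of the idea only). SECRET_KEY, sign_value and verify_value are hypothetical names, and the key would have to be a long random value kept on the server.

```python
import hashlib
import hmac

# Hypothetical server-side secret; never send this to the client.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"


def sign_value(value: str) -> str:
    """Return 'value.signature', suitable for storing in a cookie."""
    mac = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify_value(signed: str):
    """Return the original value if the signature matches, otherwise None."""
    value, sep, mac = signed.rpartition(".")
    if not sep:
        return None  # no signature part at all
    expected = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    if hmac.compare_digest(mac.encode("utf-8"), expected.encode("utf-8")):
        return value
    return None


cookie = sign_value("15")
print(verify_value(cookie))             # 15
print(verify_value("16" + cookie[2:]))  # None, the id was changed but not the signature
```

A signed cookie stops someone from editing the id, but anyone who copies the cookie can still replay it, which is why this is a step short of real authentication.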
A better way to implement this is to show only the records that belong to that user. Say id is the unique identifier for each user. Store both entry_id and id in your table (say the table name is entries).
Now when the user requests a record, add another condition to the MySQL query like this:
select * from entries where entry_id=5 and id=30;
So if entry_id 5 does not belong to this user, it will not have any result at all.
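Here is a small sketch of that check with bound parameters, using SQLite through Python's sqlite3 module in place of PHP and MySQL. The details column and the fetch_entry helper are made up for the example, and user_id is assumed to come from the server-side session rather than from the URL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (entry_id INTEGER PRIMARY KEY, id INTEGER NOT NULL, details TEXT)"
)
conn.executemany(
    "INSERT INTO entries (entry_id, id, details) VALUES (?, ?, ?)",
    [(5, 30, "entry of user 30"), (6, 31, "entry of user 31")],
)


def fetch_entry(entry_id, user_id):
    """Return the entry only if it belongs to user_id, otherwise None.

    user_id must come from the authenticated session, never from the request URL.
    """
    return conn.execute(
        "SELECT entry_id, id, details FROM entries WHERE entry_id = ? AND id = ?",
        (entry_id, user_id),
    ).fetchone()


print(fetch_entry(5, 30))  # (5, 30, 'entry of user 30')
print(fetch_entry(6, 30))  # None, entry 6 belongs to someone else
```

Binding the values as parameters, rather than pasting them into the SQL string, also keeps the query safe from SQL injection.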
As for preventing the user from changing his own id, you can use JWT tokens. Issue a token on login and send it with every call. You can then verify the token's signature on the back end and read the user's actual id from it.

Which data can I unhesitatingly send over a GET request?

I am creating a VueJS application with express and sequelize to access a MySQL database (currently running with XAMPP). I have a database which consists of 3 tables:
users: [id (primary key), name, email, family_id (foreign key)]
families: [id (primary key), name]
hobbies: [id (primary key), name, user_id (foreign key)]
All of these IDs are auto_increment so the first user registered gets the ID 1 and so on.
Every user within the same family (so with equal family_id) is allowed to see the hobbies of the other family members. I have a SQL query which gives me all the family members. On my website I have a simple drop-down menu where I can select the member. With a GET request I then want to retrieve all hobbies of the selected member.
Now I can basically decide whether I use the id or the email for the request parameter, e.g. /api/hobbies/:id or /api/hobbies/:email. Email reveals more private information, while id reveals information about my internal structure like "At least (id) number of users exist.". I think it is better to use the id.
Maybe there is also the possibility to assign a random id (not auto increment) in the database? But I don't know how to do this.
Nothing you send as a parameter to a GET request is private. Those parameters are part of the URL you GET, and those URLs can be logged in various proxy servers, etc, all over the internet without your consent or your users' consent.
It seems to me that family members' hobbies can be sensitive data. What if the whole family likes, say, golf? A cybercreep could easily figure out that a good time for burglary would be Saturday afternoons.
And if your app does GET operations with autoincrementing id values, it's child's play for a cybercreep to examine any record they want. Check out the Panera Bread data breach for example. https://krebsonsecurity.com/2018/04/panerabread-com-leaks-millions-of-customer-records/
At a minimum use POST for that kind of data.
Better yet, use a good authentication / session token system on your app, and conceal data from users if they're not members of that family.
And, if you want to use REST style GET parameters, you need to do these things to be safe:
Use randomized id values. It must be very difficult for a cybercreep to guess a second value from knowing a first value. Serial numbers won't do. Neither will email addresses.
Make sure unauthenticated users can see no data.
Make sure authenticated users can only see the subset of data for which they're authorized.
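The first and third points could look roughly like the sketch below. It is written in Python with SQLite rather than express and sequelize, the families table is left out, and the public_id column and both helper functions are assumptions made for the example. In a real app the requester would be identified by the session or token, not by a request parameter.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id        INTEGER PRIMARY KEY,      -- internal, never exposed
    public_id TEXT NOT NULL UNIQUE,     -- random, safe to put in a URL (assumed column)
    name      TEXT NOT NULL,
    family_id INTEGER NOT NULL
);
CREATE TABLE hobbies (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    user_id INTEGER NOT NULL REFERENCES users(id)
);
""")


def add_user(name, family_id):
    """Insert a user with an unguessable public id and return that id."""
    public_id = secrets.token_urlsafe(16)  # 16 random bytes, URL-safe text
    conn.execute(
        "INSERT INTO users (public_id, name, family_id) VALUES (?, ?, ?)",
        (public_id, name, family_id),
    )
    return public_id


def hobbies_for(requester_public_id, member_public_id):
    """Return the member's hobbies only if the requester is in the same family."""
    rows = conn.execute(
        """SELECT h.name
           FROM hobbies AS h
           JOIN users AS member    ON member.id = h.user_id
           JOIN users AS requester ON requester.family_id = member.family_id
           WHERE requester.public_id = ? AND member.public_id = ?
           ORDER BY h.name""",
        (requester_public_id, member_public_id),
    ).fetchall()
    return [row[0] for row in rows]


alice = add_user("Alice", 1)
bob = add_user("Bob", 1)
eve = add_user("Eve", 2)
conn.execute(
    "INSERT INTO hobbies (name, user_id) SELECT 'golf', id FROM users WHERE public_id = ?",
    (bob,),
)

print(hobbies_for(alice, bob))  # ['golf']
print(hobbies_for(eve, bob))    # []
```

The random id only makes guessing hard. The family check in the query is what actually keeps other people's data out of the response.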
My suggestion to avoid REST-style GET parameters comes from multiple security auditors saying, you must change that.

MySql DB structure with overriding group settings and allowing settings

how are you?
I'm working on a project that contains accounts with mailing list.
The account has 3 packages he can buy. Each package has its own settings, e.g. with the first package the user gets 1 email per day, and with the second package he gets 5 emails per day.
Another feature that I want is the opportunity to override some of the package settings. That means that for one account I'll set the daily email limit to 7.
One more feature I need in this system is email providers. I want the first package to get emails only from first provider, second package from 2 providers and so on.
So I have a problem designing my DB.
I created table emailSubscriptions which has EmailID and name.
I created table accountsGroup which only contains GroupId and name.
I created table accounts which has AccountID, GroupID (foreign key), Email, password and investment. (According to his investment he gets his package).
I've created table accountsSubscriptions which has SUBSCRIPTION ID, AccountID, EmailID and IsActive.
I created table packages which contains PackageID, GroupID, from investment and to investment, and all other package settings e.g. maxEmailsPerDay ....
Of course the end user has a GUI where he can see his settings and edit what he can according to his current package. The admin of the users has a GUI too.
Anyway, now I got stuck.
I thought about adding all package columns to accounts, and then when I want to send emails, I'll take the settings from the group and override them wherever the account's value is not 0 / empty. The problem is that some settings are 0 / 1: the column defaults to 0, so if the group setting is 1 for something and I want to turn it off, I can't. So this is the first problem.
The second problem is with allowed email subscriptions ... Same problem actually.
I thought about adding the allowedEmails to the package, but then whenever I send the emails I need to use the LIKE operator, and this is not good for runtime.
So I really need your help... Hope you can help me.
Thanks !!
The requirements part lacks clarity, I'd say.
But let's go for it anyway.
Let's extract entities from this messy field of things.
Each entity would generally mean one table.
Start from Account.
Account has Subscriptions. It is not clear what's the relation here: if it is 1:1 ("account can have only one subscription") - then reference to it is a part of the Account entity, if it is 1:n - then you'll need a special Account-Subscriptions relation table.
Now Subscription: it is defined by a SubscriptionType, or Package, so there must be a table that contains these records (these limits and whatever else you want). The Account or Account-Subscription table would refer to it to define what subscription(s) the Account has.
Then Providers: they're referred to by SubscriptionType/Packages. If there could be more than one Provider per Package/SubscriptionType, then you need an additional Package-Provider relation table.
And finally, the Overrides. That's a trickier part because of the weak requirements on it, but since they're overriding the Package parameters, I suggest keeping the entity structure the same as the package.
You may even place it into the same Package table, sorting 'em out by date, or assign them weights, always keeping the default Package record with the lowest weight.
Then, when you create an override, you copy the whole default record except for the overridden fields, and assign it next weight (or current date), and when you query it - group it and get the MAX().
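One possible reading of the weight idea, sketched with SQLite through Python's sqlite3 module. The account_id column (NULL for the default record), the tracking_enabled setting and both helper functions are assumptions added for the example. Because an override is a full copy of the record, a 0 in an override really means "off", which sidesteps the 0 / 1 problem from the question.

```python
import sqlite3

SETTINGS = ("max_emails_per_day", "tracking_enabled")

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE packages (
    record_id          INTEGER PRIMARY KEY,
    package_id         INTEGER NOT NULL,
    account_id         INTEGER,           -- NULL marks the package default (assumed column)
    weight             INTEGER NOT NULL,  -- default record keeps the lowest weight
    max_emails_per_day INTEGER NOT NULL,
    tracking_enabled   INTEGER NOT NULL
);
""")


def effective_settings(package_id, account_id):
    """Return the record with the highest weight that applies to this account."""
    row = conn.execute(
        """SELECT max_emails_per_day, tracking_enabled
           FROM packages
           WHERE package_id = ?
             AND (account_id IS NULL OR account_id = ?)
             AND weight = (SELECT MAX(weight) FROM packages
                           WHERE package_id = ?
                             AND (account_id IS NULL OR account_id = ?))""",
        (package_id, account_id, package_id, account_id),
    ).fetchone()
    return dict(zip(SETTINGS, row)) if row else None


def add_override(package_id, account_id, **changes):
    """Copy the current effective record, apply the changes, store it with the next weight."""
    settings = effective_settings(package_id, account_id)
    settings.update(changes)
    next_weight = conn.execute(
        "SELECT MAX(weight) + 1 FROM packages WHERE package_id = ?", (package_id,)
    ).fetchone()[0]
    conn.execute(
        """INSERT INTO packages
               (package_id, account_id, weight, max_emails_per_day, tracking_enabled)
           VALUES (?, ?, ?, ?, ?)""",
        (package_id, account_id, next_weight,
         settings["max_emails_per_day"], settings["tracking_enabled"]),
    )


# Default record for package 2: 5 emails per day, tracking on.
conn.execute(
    """INSERT INTO packages
           (package_id, account_id, weight, max_emails_per_day, tracking_enabled)
       VALUES (2, NULL, 0, 5, 1)"""
)
add_override(2, 42, max_emails_per_day=7)
add_override(2, 42, tracking_enabled=0)

print(effective_settings(2, 42))  # {'max_emails_per_day': 7, 'tracking_enabled': 0}
print(effective_settings(2, 43))  # {'max_emails_per_day': 5, 'tracking_enabled': 1}
```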
There's no Email entity itself - but you didn't mention it in your requirements sections whatsoever.
So, that's pretty much it: Accounts, Subscriptions, (optional) Account-Subscription, Packages, Providers, (optional) Package-Provider, (optional, may be incorporated into Packages) Overrides.
Works for you?

When to use a relational database structure?

I'm creating a file hosting service, but right now I am creating the account email activation part of registering. So I had to come up with a database structure.
And right now it's:
users
id
first_name
last_name
email
password
since
active
hash_activate
But I can do it like a relational database too:
users
id
first_name
last_name
email
password
since
activation
id
user_id
hash
active
What would be the best way to go about it? And why?
If every person has only one activation hash active at a time, then it's feasible to store it in the same table as the users.
However, one advantage of separating it is that users only have an activation hash for a brief period of time, so to keep the user records smaller, you could store the hashes in a separate table. Keeping the user records small keeps it more performant. In this case, you wouldn't have active column. You'd just delete inactive hashes.
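A minimal sketch of the separate-table variant, using SQLite through Python's sqlite3 module. It assumes one pending hash per user and treats a user as active once the hash row has been deleted. The function names are hypothetical, and most user columns are omitted for brevity.

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE activation (
    user_id INTEGER PRIMARY KEY REFERENCES users(id),  -- at most one pending hash per user
    hash    TEXT NOT NULL UNIQUE
);
""")


def register(email):
    """Create the user plus a pending activation hash; return (user_id, hash)."""
    user_id = conn.execute("INSERT INTO users (email) VALUES (?)", (email,)).lastrowid
    token = secrets.token_hex(16)
    conn.execute("INSERT INTO activation (user_id, hash) VALUES (?, ?)", (user_id, token))
    return user_id, token


def activate(token):
    """Delete the matching hash; True if there was one, False otherwise."""
    cursor = conn.execute("DELETE FROM activation WHERE hash = ?", (token,))
    return cursor.rowcount == 1


def is_active(user_id):
    """An existing user is active when no pending hash remains for it."""
    row = conn.execute("SELECT 1 FROM activation WHERE user_id = ?", (user_id,)).fetchone()
    return row is None


user_id, token = register("someone@example.com")
print(is_active(user_id))  # False
print(activate(token))     # True
print(is_active(user_id))  # True
```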
If you do store the activation columns in the user table, just be sure to select the columns by name. E.g. in most cases, you'll want to do this:
SELECT id, first_name, last_name, email, password
FROM users
Instead of:
SELECT *
FROM users
You'd only want to select the activation columns when you needed them.
The second would only be sensible if one user could have multiple activations. You don't say whether this is true or false, so I couldn't possibly advise you.
If activations are a temporary thing, or having a hash defines someone as active, then make them different. Otherwise, that really won't matter.
However, neither is necessarily more or less relational than the other, without much more information. If you put a unique constraint on the combination of values in each row, and set each column up with a NOT NULL constraint, your first one would be quite relational.
You use a relational design when correctness of data, over time, is as important, if not more important, than what the application does with that data, and/or when data structure correctness/consistency is critical to the correct operation of an application, but might not necessarily be guaranteed by the application's own operation.

Access: Entering multiple subform values with one entry in the form

I've been using Access to create simple databases for a while with great success, but have run into a problem I can't find an answer to.
We ship individualized serialized units to various end-users, and occasionally to resellers that stock them for end-users. I must keep track of which serial numbers end up with each end-user.
The first database I created to handle this recorded company information in one table using their account number as primary key, order information in a second table using the order number as the primary key and linked via the company name, and unit information in a third table with the serial number as the primary key and linked via the order number.
This worked very well until I had to account for these stock orders with a reseller. As it was structured, every unit was linked to one company via the sales order. The issue is that I may ship 20 units on one order to Company A, who then sells 5 to Company B and 3 to Company C.
I realized I needed to link the company name directly to the units, not the orders and have fixed that.
My issue now is simplicity in entering information in the form. My previous database involved the employee in our shipping department merely entering the sales order, selecting the customer name from a drop down menu, then scanning the serial numbers in a subform. This was to ensure simplicity and try to eliminate human error. He had only three things to input, and most of the input was done by scanning barcodes.
As it is currently structured, the employees out in shipping would have to populate the company name for every record in the subform with the serial number, and that complicates things in a way that is unacceptable. At the point of shipping, the company name will always be the same for every unit in the subform.
So.
How would I go about creating a form where the company name is entered once in the form, and automatically populates itself for every record in the subform? The caveat here is that I must also be able to go back occasionally and change the company name of individual units in an order without necessarily affecting the rest of the order. I suppose it starts out as a one-to-many relationship that then must be able to change.
I hope that makes sense.
I have looked for answers using various approaches with auto-fill and relationships and not preserving data integrity, but I feel the answer is just beyond my reach.
The only solution I can think of is to create another field in the unit table for the end-user, and perhaps write a formula that sets this default value as the company name from the order that shipped it. This seems unnecessarily complicated and redundant, there has to be a better way.