I'm looking for some guidance on the best way to store user-specific data in an SQL database. I'm a little new to SQL, so I'm hoping this is a fairly easy concept for those familiar with it.
I've been reading about normalisation and other good practices, as I'm aware that setting a good foundation for the database is crucial and hard to change later.
I think an easy way to explain my scenario is this:
Each website user can choose to create one or more "projects".
Within each project a user will set an "object". This object can be created by the user or it can be chosen from a list of objects which have been created by other users.
Each object has a variable number of settings. Let's say an object could have between 5 and 25 settings. Each setting could simply be an integer value between 0 and 100.
Originally I thought about doing it this way:
Project Table
+-----------+-------------+------+---------+----------+---------+----------+------+--------+
| ProjectID | ProjectName | User | Object1 | Object2 | SetID | Notes | Date | Photo |
+-----------+-------------+------+---------+----------+---------+----------+------+--------+
| PID0001 | My Project | Bob | OBJ0001 | OBJ00056 | SID0045 | my notes | | |
+-----------+-------------+------+---------+----------+---------+----------+------+--------+
Each user can create a project and reference different objects and object settings profiles within that project.
Object Table
+---------+------------+--------+---------+-------+--------+----------+-------+-------+---------+---------+--------+
| ObjID | ObjName | ObjVer | Date | User | Set1ID | Set1Name | Set1X | Set1Y | Set1Min | Set1Max | Set2ID |
+---------+------------+--------+---------+-------+--------+----------+-------+-------+---------+---------+--------+
| OBJ0001 | My Object | Bob | | Bob | S00013 | Volts | 12 | 52 | 1 | 80 | S00032 |
+---------+------------+--------+---------+-------+--------+----------+-------+-------+---------+---------+--------+
This table would define all the configurable settings for the object. It could range from 1 setting to 25 settings. In this example, each setting the user adds to the object would have 6 parameters, such as min/max allowed values, name, ID, etc.
If I do it this way, I would end up with over 100 columns, many of which could be empty...
Object Settings Table
+---------+-------------+---------+------------+------+---------+---------+---------+
| SetID | Setname | ObjID | Date | User | Set1Val | Set2Val | Set3Val |
+---------+-------------+---------+------------+------+---------+---------+---------+
| SID0045 | My Settings | OBJ0001 | 12-12-2017 | Bob | 12 | 32 | 98 |
+---------+-------------+---------+------------+------+---------+---------+---------+
In this table, each row would define a user's settings profile for that object - basically just the value for the settings which were defined in the object table. Each user could have a different set of settings for the same object when it's used in their project.
So, the above method seems bad to me. It makes sense in my head but the number of columns will get out of control when allowing multiple settings.
I suppose the better way of doing this would be to go vertical by adding a row for each setting or setting column but I'm just not sure how this would look. How can I structure it this way while still allowing the "sharing" of objects between user projects?
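To make the question concrete, here is roughly what I imagine the "vertical" layout might look like (table and column names are just placeholders, and I'm not sure this is right):
-- Placeholder sketch of the "vertical" idea: one row per setting
-- definition, and one row per setting value chosen within a project.
CREATE TABLE object_setting (
    SettingID  INT PRIMARY KEY,
    ObjID      INT NOT NULL,      -- which object this setting belongs to
    SetName    VARCHAR(50),       -- e.g. 'Volts'
    MinValue   INT,
    MaxValue   INT
);

CREATE TABLE project_setting_value (
    ProjectID  INT NOT NULL,      -- the user's project
    SettingID  INT NOT NULL,      -- references object_setting
    SetValue   INT,               -- 0 - 100
    PRIMARY KEY (ProjectID, SettingID)
);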
Related
I am trying to build one wide schema that makes it easier for data users to query. To achieve that, the streaming events have to be joined with user metadata on USER_ID and ID. In data engineering, this operation is called "data enrichment", right? The tables below are an example.
# `Event` (Stream)
+---------+--------------+---------------------+
| USER_ID | EVENT        | TIMESTAMP           |
+---------+--------------+---------------------+
| 1 | page_view | 2020-04-10T12:00:11 |
| 2 | button_click | 2020-04-10T12:01:23 |
| 3 | page_view | 2020-04-10T12:01:44 |
+---------+--------------+---------------------+
# `User Metadata` (Static)
+----+-------+--------+
| ID | NAME | GENDER |
+----+-------+--------+
| 1 | Matt | MALE |
| 2 | John | MALE |
| 3 | Alice | FEMALE |
+----+-------+--------+
==> # Result
+---------+--------------+---------------------+-------+--------+
| USER_ID | EVENT        | TIMESTAMP           | NAME  | GENDER |
+---------+--------------+---------------------+-------+--------+
| 1 | page_view | 2020-04-10T12:00:11 | Matt | MALE |
| 2 | button_click | 2020-04-10T12:01:23 | John | MALE |
| 3 | page_view | 2020-04-10T12:01:44 | Alice | FEMALE |
+---------+--------------+---------------------+-------+--------+
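In SQL terms, the enrichment I want is essentially just this join (assuming the tables are named event and user_metadata):
-- Roughly the join that produces the result above.
SELECT e.USER_ID, e.EVENT, e.TIMESTAMP, u.NAME, u.GENDER
FROM event e
JOIN user_metadata u ON u.ID = e.USER_ID;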
I was developing this using Spark, and the user metadata is stored in MySQL. Then I realized it would waste Spark's parallelism if the Spark code joined against the MySQL table directly, right?
The bottleneck would be on MySQL if traffic increases, I guess...
Should I store those tables in a key-value store and update them periodically?
Can you give me some ideas for tackling this problem? How do you usually handle this type of operation?
Solution 1:
As you suggested, you can keep a local cached copy as key-value pairs and update the cache at a regular interval.
Solution 2:
You can use a MySQL-to-Kafka connector such as the one below:
https://debezium.io/documentation/reference/1.1/connectors/mysql.html
For every DML or table-alter operation on your user metadata table, a corresponding event will be fired to a Kafka topic (e.g. db_events). You can run a thread in parallel in your Spark streaming job which polls db_events and updates your local key-value cache.
This solution would make your application a near-real-time application in the true sense.
One overhead I can see is that you will need to run a Kafka Connect service with the MySQL connector (i.e. Debezium) as a plugin.
I've recently tried to create an executable with Python 2.7 which can read a MySQL database.
The database (named 'montre') contains two tables: patient and proto_1.
Here is the content of those tables:
mysql> select * from proto_1;
+----+------------+---------------------+-------------+-------------------+---------------+----------+
| id | Nom_Montre | Date_Heure          | Temperature | Pulsion_cardiaque | Taux_oxy_sang | Humidite |
+----+------------+---------------------+-------------+-------------------+---------------+----------+
|  1 | montre_1   | 2017-11-27 19:33:25 |       22.30 |              NULL |          NULL |     NULL |
|  2 | montre_1   | 2017-11-27 19:45:12 |       22.52 |              NULL |          NULL |     NULL |
+----+------------+---------------------+-------------+-------------------+---------------+----------+
mysql> select * from patient;
+----+-----------+--------+------+------+---------------------+------------+--------------+
| id | nom       | prenom | sexe | age  | date_naissance      | Nom_Montre | commentaires |
+----+-----------+--------+------+------+---------------------+------------+--------------+
|  2 | RICHEMONT | Robert | M    |   37 | 1980-04-05 23:43:00 | montre_3   | essaye2      |
|  3 | PIERRET   | Mandy  | F    |   22 | 1995-04-05 10:43:00 | montre_4   | essaye3      |
| 14 | PIEKARZ   | Allan  | M    |   22 | 1995-06-01 10:32:56 | montre_1   | Healthy man  |
+----+-----------+--------+------+------+---------------------+------------+--------------+
As I'm only used to coding in C (no OOP), I didn't create classes in the Python project (shame on me...). But I managed, in two files, to create something (with mysql.connector) which can print my database to the console and execute subroutines like looking-for(), etc.
Now, I want to create a GUI for users with PyQt. Unfortunately, I saw that the structure is totally different, with classes etc. But okay, I worked through this and created a GUI which can display the "patient" table. However, I couldn't find in the Qt documentation how to reuse the programs I've already created for the display, nor how to display in a tableWidget only certain rows of my patient table, for example (using QtSql).
For example, if I want to display the whole patient table, I use this line (PyQt):
self.model.setTable("patient")
That one I got working, but it bothers me that no MySQL code is needed to display my table, so I don't know how to select only the rows we want to see and display them. If we only want to see, for example, ID 2, how do I display only Robert in the tableWidget?
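In plain SQL, the kind of filtering I mean would simply be:
-- Just an illustration of the filter I want the widget to apply.
SELECT * FROM patient WHERE id = 2;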
To recap, I want to know:
whether I can take the code I've created and combine it with PyQt;
how to display (in a tableWidget) only the rows selected by MySQL. Is that possible?
Please find my code at the URL below for a better understanding of my problem:
https://drive.google.com/file/d/1nxufjJfF17P5hN__CBEcvrbuHF-23aHN/view?usp=sharing
I hope I was clear. Thank you all for your help!
I am creating a mobile application. In this app, I have created a Login and a Register activity. I have also created an online database using AWS (Amazon Web Services) to store all the login information of the user upon registering.
In my database, I have a table called 'users'. This table holds the following fields: "fname", "lname", "username", "password". This part works and successfully stores data from my phone to the database.
for example,
| fname | lname | username | password |
| ------ | ------ | -------- | -------- |
| john | doe | jhon123 | 1234 |
Inside the app, I have an option where the user may click on "Start Log", which will record start and end values from a seekBar.
How can I create a table under the user who is logged in?
(Essentially, I want to be able to create multiple tables under a user.)
for example,
This table should be under the user "john123":
| servo | Start | End |
| ------ | ------ | --- |
| 1 | 21 | 30 |
| 2 | 30 | 11 |
| 3 | 50 | 41 |
| 4 | 0 | 15 |
I know it's a confusing question, but
I am essentially just trying to have multiple tables linked to a user.
As to:
How to create a table for a user in a Database
Here are some GUI tools you might find useful:
MySQL - MySQL Workbench
PostgreSQL - pgAdmin
As for creating a separate table for each user, refer to #Polymath's answer. There is no benefit in creating a separate table for each user (you might as well use a JSON file).
What you should do is create a logs table that has a user_id attribute referencing the id in the users table.
-------------------------------------------------------
| id | fname | lname | username | password |
| -- | ------ | ------ | -------- | ------------------- |
| 1 | john | doe | jhon123 | encrypted(password) |
-------------------------------------------------------
|______
|
V
---------------------------------------
| id | user_id | servo_id | start | end |
| -- | ------- | -------- | ----- | --- |
| 1 | 1 | 1 | 21 | 30 |
| 2 | 1 | 2 | 30 | 11 |
---------------------------------------
You should also look into database normalization, as your "john123" table is not in 3NF. The servo should be decomposed out of the logs table if it will be logged by multiple users or multiple times (which I'm guessing is the case for you).
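A minimal DDL sketch of that layout might look like this (column names follow the diagrams above; the types are assumptions):
-- Sketch only: adjust types and constraints to your needs.
CREATE TABLE users (
    id       INT AUTO_INCREMENT PRIMARY KEY,
    fname    VARCHAR(50),
    lname    VARCHAR(50),
    username VARCHAR(50) NOT NULL UNIQUE,
    password VARCHAR(255) NOT NULL   -- store a hash, never the plain password
);

CREATE TABLE logs (
    id        INT AUTO_INCREMENT PRIMARY KEY,
    user_id   INT NOT NULL,
    servo_id  INT NOT NULL,           -- could move to its own servos table (see note above)
    `start`   INT,
    `end`     INT,                    -- backticked to avoid keyword clashes
    FOREIGN KEY (user_id) REFERENCES users(id)
);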
In reading this, I wonder if your design is right. It sounds like you are trying to create a table for each user. I also wonder how scalable it is to have a unique table per user: if you scale to millions of users, you will have millions of tables to manage and will probably need a separate index table to find the right table for each user. Why a table for each? Why not a single table with the UserID as a key value? You can extract the data just by filtering on the UserID:
SELECT * FROM UsersData WHERE UserID = ? ORDER BY DateTime;
However I will leave that for you to ponder.
You mentioned that this is a mobile app. I think what you need to do is look at AWS federated access and Cognito, which will allow you to identify a user using federated identities: pass the user's unique ID plus temporary (one-use) credentials linked to an access role. Combined this way, you can scale to millions of users with full authentication without managing millions of accounts.
RL
Edit for future viewers: Aside from the accepted answer, which helped me, I found some really good info here.
I've got a database with a single table for displaying inventory on a website (RVs). It stores the typical info: year, make, model, etc. I originally made it with 6 extra columns for storing "special features", but I don't like having such a hard limit on what options can be listed. Since I've never messed with more than a single table my gut instinct was to just add 24 or so more columns to cover everything, but something in my head told me that there might be a better way. So when do I decide N columns is too many? The data in these columns will commonly not be unique.
(Sorry for crappy diagram)
Current table design:
-----------------------------------------------------------------------
| id | year | make | model | price | ft_1 | ft_2 | ft_3 | ft_4 | ft_5 |
-----------------------------------------------------------------------
| | | | | | | | | | |
-----------------------------------------------------------------------
Possible better design:
table #1
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
table #2
---------------------------------------------
| unique_id(?) | feature | unit_ref |
---------------------------------------------
| 0 | "Diesel Pusher" | 2,6,14 |
---------------------------------------------
I feel like a bonus of the second table might be that I could more easily populate a dropdown containing all the previously entered features, to speed up adding new units to inventory.
Is this the right way to go about it, or should I just add more columns and be content?
Thanks.
Believe it or not, your best option would likely be to add a third table.
Since each record in your rvs table can be linked to multiple rows in the features table, and each feature can correspond to multiple RVs, you have a many-to-many relationship, which is inherently difficult to maintain in a relational DBMS. By adding a third "intersection" table you convert it into two one-to-many relationships, which can be enforced declaratively by the DBMS.
Your table structure would then become something like
rvs
------------------------------------
| id | year | make | model | price |
------------------------------------
| | | | | |
------------------------------------
features
--------------------------
| id | feature |
--------------------------
| 1192 | "Diesel Pusher" |
--------------------------
rv_features
----------------------
| rv_id | feature_id |
----------------------
| | |
----------------------
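If it helps, a rough DDL sketch of those three tables might be the following (the column types are just assumptions):
-- Sketch only: types chosen to match the examples below.
CREATE TABLE rvs (
    id    CHAR(4) PRIMARY KEY,      -- arbitrary identifier, e.g. '0231'
    year  INT,
    make  VARCHAR(50),
    model VARCHAR(50),
    price DECIMAL(10,2)
);

CREATE TABLE features (
    id      CHAR(4) PRIMARY KEY,    -- e.g. '1192'
    feature VARCHAR(100)
);

CREATE TABLE rv_features (
    rv_id      CHAR(4) NOT NULL,
    feature_id CHAR(4) NOT NULL,
    PRIMARY KEY (rv_id, feature_id),
    FOREIGN KEY (rv_id)      REFERENCES rvs(id),
    FOREIGN KEY (feature_id) REFERENCES features(id)
);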
How do you make use of this? Suppose you want to record the fact that the 2016 Travelmore CampMaster has a 25kW diesel generator. You would first add a record to rvs like
--------------------------------------------------
| id | year | make | model | price |
--------------------------------------------------
| 0231 | 2016 | Travelmore | CampMaster | 750000 |
| 2101 | 2016 | Travelmore | Domestant | 650000 |
--------------------------------------------------
(Note the value in the id column is entirely arbitrary; its sole purpose is to serve as the primary key which uniquely identifies the record. It can encode meaningful information, but it must be something that will not change throughout the life of the record it identifies.)
You then add (or already have) the generator in the features table:
--------------------------------
| id | feature |
--------------------------------
| 1192 | Diesel Pusher 450hp |
| 3209 | diesel generator 25kW |
--------------------------------
Finally, you associate the rv to the feature with a record in rv_features:
----------------------
| rv_id | feature_id |
----------------------
| 0231 | 3209 |
| 0231 | 1192 |
| 2101 | 3209 |
----------------------
(I've added a few other records to each table for context.)
Now, to retrieve the features of the 2016 CampMaster, you use the following SQL query:
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rf.feature_id = f.id
AND r.id = '0231';
to get
----------------------------------------------------------
| year | make | model | feature |
----------------------------------------------------------
| 2016 | Travelmore | CampMaster | diesel generator 25kW |
| 2016 | Travelmore | CampMaster | Diesel Pusher 450hp |
----------------------------------------------------------
To see the rvs with a 25kW generator, change the query to
SELECT r.year, r.make, r.model, f.feature
FROM rvs r, features f, rv_features rf
WHERE r.id = rf.rv_id
AND rf.feature_id = f.id
AND f.id = '3209';
Sherantha's link to A Quick-Start Tutorial on Relational Database Design actually looks like a good intro to table design and normalization; you might find it useful.
There is a thing called "third normal form": roughly, it says that everything other than the unique IDs should not be stored redundantly. Taken to the extreme, that means a table for year, a table for make, a table for models, etc., and a table where you combine all these IDs into one connected data set.
But this is not always practical. I think the best approach is something in between: separate tables for entries that repeat very often, but there doesn't need to be an extra table with unique IDs for price; that would be overkill, I think.
Based upon your scenario: if you believe the number of feature columns will stay the same, then there is no need for a second table. If there is any possibility that features can be added in the future, then you should break your table up into two (RVS & Features), and then create a third table that links RVs and features, since this seems to be a many-to-many relationship. So I suggest you use three tables.
I think it would be better for you to become more familiar with relational database design. This is a short but great article I found earlier.
I am currently in the process of converting the player-saving features of a game's multiplayer engine into an SQL database, for the integration of a webpage to display/modify/sell characters. The original system stored all data in text files, which was an awful way of dealing with this data as it was tied to the game only. Within the text files the user's username, password, ID, and player data were stored, allowing for only one character. I have separated this into tables and can successfully save and load character data. The tables I have are quite large, so for example purposes I will use the following:
account:
+----+----------+----------+
| ID | Username | Password |
+----+----------+----------+
| 1 | Player1 | 123456 | (Secure passwords much?)
| 2 | Player2 | password | (These are actually hashed in the real db)
+----+----------+----------+
account_character:
+------------+--------------+
| Account_ID | Character_ID |
+------------+--------------+
| 1 | 1 |
| 1 | 2 |
| 2 | 3 |
+------------+--------------+
character:
+----+-----------+-----------+-----------+--------+--------+
| ID | PositionX | PositionY | PositionZ | Gender | Energy | etc....
+----+-----------+-----------+-----------+--------+--------+
| 1 | 100 | 150 | 1 | 1 | 100 |
| 2 | 30 | 90 | 0 | 1 | 100 |
| 3 | 420 | 210 | 2 | 0 | 53.5 |
+----+-----------+-----------+-----------+--------+--------+
These tables are linked using relationships.
What I have so far: the user logs in and the server queries their username and matches the password. If the passwords match, the server begins to load the character data based on the ID loaded from the account during login.
This is where I am stuck. I have successfully done this through phpMyAdmin using the SQL command interface, but as it was around 4 AM I was tired and accidentally closed the tab that contained the command before I saved it. I have tried to replicate it, but I simply cannot obtain the data I require with the query.
I've recently completed a course in databases at college and got a distinction, but for the life of me I cannot get this to work again... I have followed tutorials, but as the situations usually differ from mine I cannot apply them until I understand them. I know I'm going to kick myself once I have a working command.
Tl;dr - I wish to query all character data linked to an account using the account's 'ID'.
I think this should work:
SELECT
    *
FROM
    account_character ac
    INNER JOIN account a ON ac.Account_ID = a.ID
    INNER JOIN `character` c ON ac.Character_ID = c.ID
WHERE
    a.Username = ? AND
    a.Password = ?
;
We start by joining together all the relevant tables, and then filter to get characters just for the current user.
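Since the tl;dr asks for all character data linked to an account by the account's ID, you can also filter on the ID directly once you have it from the login step (a sketch along the same lines):
-- Same join, but keyed on the account's ID instead of the credentials.
SELECT c.*
FROM account_character ac
INNER JOIN `character` c ON ac.Character_ID = c.ID
WHERE ac.Account_ID = ?;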