I am using Redis to store refresh tokens under keys of the form userId:refreshToken.
However, this approach prevents one user from logging in on multiple devices.
So I tried changing the key format to userId_accessToken:refreshToken.
However, with this approach the entry must be deleted and re-inserted whenever the access token or refresh token changes.
So I'm debating between two methods:
1. Save it in Redis as above.
2. Save it in the DB as [id, userId, refreshToken, accessToken, expDate].
In MySQL, I would create a cron job that deletes rows after their expDate.
In Redis, I would apply a TTL when creating the key.
Which is the better way?
Our server's memory is 3969424.
The database uses RDS with MySQL.
If there's another good way, that's great too!
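For reference, the per-device key layout described in the question can be sketched as follows. This is a minimal in-memory simulation (a plain dict standing in for Redis); in real Redis the same idea is a SET with an EX/TTL option, and the helper names here are invented for illustration:

```python
import time

# In-memory stand-in for Redis: key -> (value, expires_at).
# In real Redis this would be SET key value EX ttl, which stores
# the value and applies the TTL atomically.
store = {}

def save_refresh_token(user_id, access_token, refresh_token, ttl_seconds):
    # One key per (user, access token) pair, so a user can stay
    # logged in on several devices at the same time.
    key = f"{user_id}_{access_token}"
    store[key] = (refresh_token, time.time() + ttl_seconds)

def get_refresh_token(user_id, access_token):
    key = f"{user_id}_{access_token}"
    entry = store.get(key)
    if entry is None:
        return None
    value, expires_at = entry
    if time.time() >= expires_at:  # expired: Redis would have evicted it
        del store[key]
        return None
    return value

save_refresh_token("u1", "tokA", "refreshA", ttl_seconds=3600)
save_refresh_token("u1", "tokB", "refreshB", ttl_seconds=3600)  # second device
print(get_refresh_token("u1", "tokA"))  # refreshA
```

Note that with a TTL per key there is nothing to clean up manually, which is the main argument for the Redis variant.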
I would choose whichever is simpler to implement.
Another thought: you can use the MyRocks engine to delete old rows automatically (MyRocks TTL):
CREATE TABLE t1 (a INT, b INT, c INT, PRIMARY KEY (a), KEY(b)) ENGINE=ROCKSDB COMMENT "ttl_duration=3600;";
In the above example, we set ttl_duration to 3600, meaning that we expect rows older than 3600 seconds to be removed from the database.
We're using MariaDB in production and we've added a MariaDB slave so that our data team can perform some ETL tasks from this slave to our data warehouse. However, MariaDB lacks a proper Change Data Capture feature (i.e. they want to know which rows of a production table changed since yesterday in order to query only the rows that actually changed).
I saw that MariaDB 10.3 has an interesting feature that allows performing a SELECT on an older version of a table. However, I haven't found resources supporting the idea that it could be used for CDC. Any feedback on this feature?
If not, we'll probably resort to streaming the slave's binlogs to our data warehouse, but that looks challenging.
Thanks for your help!
(As a supplement to Stefan's answer)
Yes, system versioning can be used for CDC, because from the validity period in ROW_START (when the row version became valid) and ROW_END (when it became invalid) you can infer whether an INSERT, UPDATE or DELETE query happened. But it's more cumbersome than alternative CDC variants.
INSERT:
- the object appears for the first time
- ROW_START is the insertion time

UPDATE:
- the object has appeared before
- ROW_START is the update time

DELETE:
- ROW_END lies in the past
- there is no newer version of this object
I'll add a picture to clarify this.
You can see that this versioning saves space, because the information about the INSERT and DELETE of an object is combined in one row, but checking for DELETEs is costly.
In the example above I used a table with a clear primary key, so checking for the same object is easy: just look at the id. If you want to capture changes in tables identified by a key combination, the whole process gets more annoying.
Edit: another point is that the history data is kept in the same table as the "real" data. Maybe this makes an INSERT faster than known alternative solutions like tracking via a TRIGGER (like here), but if changes to the table are quite frequent and you want to process/analyse the CDC data, this can cause performance problems.
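The decision rules above can be sketched in code. This is a hedged sketch, not MariaDB itself: it assumes you have already fetched (id, row_start, row_end) tuples via FOR SYSTEM_TIME ALL, and the function name classify_changes is invented:

```python
from datetime import datetime

# MariaDB marks the current row version with a sentinel "end of time"
# timestamp; we model that with a far-future constant.
FAR_FUTURE = datetime(2038, 1, 19)

def classify_changes(versions, now):
    """versions: list of (obj_id, row_start, row_end) from a system-versioned
    table. Returns (obj_id, change_type, when) events using the rules above."""
    by_id = {}
    for obj_id, start, end in sorted(versions, key=lambda v: (v[0], v[1])):
        by_id.setdefault(obj_id, []).append((start, end))
    events = []
    for obj_id, rows in by_id.items():
        first_start, _ = rows[0]
        events.append((obj_id, "INSERT", first_start))    # first appearance
        for start, _ in rows[1:]:
            events.append((obj_id, "UPDATE", start))      # later versions
        last_end = rows[-1][1]
        if last_end < now:  # no current version left -> object was deleted
            events.append((obj_id, "DELETE", last_end))
    return events

versions = [
    (1, datetime(2024, 1, 1, 9, 0), datetime(2024, 1, 1, 12, 0)),   # old version of row 1
    (1, datetime(2024, 1, 1, 12, 0), FAR_FUTURE),                   # current version of row 1
    (2, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 0)),  # row 2, later deleted
]
events = classify_changes(versions, now=datetime(2024, 1, 2))
print([(obj_id, kind) for obj_id, kind, _ in events])
# [(1, 'INSERT'), (1, 'UPDATE'), (2, 'INSERT'), (2, 'DELETE')]
```

The costly part mentioned above shows up here as well: detecting DELETEs requires scanning all versions of every object, not just the changed rows.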
MariaDB supports system-versioned tables since version 10.3.4. System-versioned tables are specified in the SQL:2011 standard. They can be used to automatically capture previous versions of rows; those versions can then be queried to retrieve the values as they were set at a specific point in time.
The following text and code example are from the official MariaDB documentation:
With system-versioned tables, MariaDB Server tracks the points in time
when rows change. When you update a row on these tables, it creates a
new row to display as current without removing the old data. This
tracking remains transparent to the application. When querying a
system-versioned table, you can retrieve either the most current
values for every row or the historic values available at a given point
in time.
You may find this feature useful in efficiently tracking the time of
changes to continuously-monitored values that do not change
frequently, such as changes in temperature over the course of a year.
System versioning is often useful for auditing.
By adding SYSTEM VERSIONING to a newly created table or to an existing table (using ALTER TABLE), the table is extended with row_start and row_end timestamp columns, which allow retrieving the record version that was valid between the start and end timestamps.
CREATE TABLE accounts (
id INT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(255),
amount INT
) WITH SYSTEM VERSIONING;
It is then possible to retrieve data as it was at a specific time (with SELECT * FROM accounts FOR SYSTEM_TIME AS OF '2019-06-18 11:00';), all versions within a specific time range:
SELECT * FROM accounts
FOR SYSTEM_TIME
BETWEEN (NOW() - INTERVAL 1 YEAR)
AND NOW();
or all versions at once:
SELECT * FROM accounts
FOR SYSTEM_TIME ALL;
I'm developing a home automation system using MySQL. I have some Arduinos connected through Ethernet shields and a Raspberry Pi that manages them via an MQTT server. This server handles the communication between all the devices (each Arduino talks only to the Raspberry Pi, which processes the request and sends another request to the same or another Arduino).
Also, each Arduino is identified by its MAC address.
I have an input system (for reading switches and sensors) and an output system (for turning lamps on and off) using the Arduinos. Each value is stored in the input and output tables.
device
- id : CHAR(12) PK NOT NULL // The MAC Address
- type : VARCHAR(5) NOT NULL // I also manage a door lock system
input
- device : CHAR(12) NOT NULL // FK from device table
- selection : TINYINT NOT NULL // Selects input port
- value : INT // Stores the input value
The output table is very similar. Both tables have other fields that are not important to my question.
When someone presses a switch a message is sent to the server, the server processes the request, updates the database and sends back other messages to other arduinos according to a set of tables that manages triggers.
I started noticing some delay in turning on the lamp, and after some profiling I found out that the majority of the time is spent on the database query.
Would it be better if, instead of using the MAC address as the PK, I created another field (INT AUTO_INCREMENT)? Which engine is fastest or best for this situation?
PS: The server runs a long-running PHP script (it was the language I knew best when I started developing this, and I was using the web UI as a reference; I know that Python may be better for this case).
No, the difference between CHAR(12) and some size of INT cannot explain a performance problem. Sure, a 1-byte TINYINT UNSIGNED would probably be better, but not worth it for such a 'small' project.
Please provide SHOW CREATE TABLE and the queries, plus EXPLAIN SELECT for any slow queries.
The PRIMARY KEY is accessed via a BTree (see Wikipedia); it is very efficient, regardless of the size of the table, and regardless of the size of the column(s) in the PK.
Here's one reason why I insist on seeing the schema. If, for example, the CHAR is a different CHARACTER SET or different COLLATION on a pair of tables, the JOIN between the tables would not be able to use the index, thereby slowing down the query by orders of magnitude.
From Primary Key Tutorial
Because MySQL works faster with integers, the data type of the primary
key column should be the integer e.g., INT, BIGINT. You can choose a
smaller integer type: TINYINT, SMALLINT, etc. However, you should make
sure that the range of values of the integer type for the primary key
is sufficient for storing all possible rows that the table may have.
Without seeing your full schema for the entire database, it would be hard to give you a bunch of recommendations. But in my experience, I always like to just let my PK be an auto-increment integer. I would then make my MAC address an index (possibly unique) to make joining efficient.
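To make the suggestion concrete, here is a sketch of that layout: a surrogate integer PK, with the MAC address kept as a UNIQUE column for lookups and joins. It is written against SQLite (via Python's sqlite3) purely so it is runnable; column names beyond those in the question are invented:

```python
import sqlite3

# Hypothetical sketch: surrogate integer PK, MAC address as a UNIQUE key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE device (
    id   INTEGER PRIMARY KEY AUTOINCREMENT,
    mac  CHAR(12) NOT NULL UNIQUE,   -- still enforced, still indexable
    type VARCHAR(5) NOT NULL
);
CREATE TABLE input (
    device_id INTEGER NOT NULL REFERENCES device(id),
    selection TINYINT NOT NULL,
    value     INT
);
""")
conn.execute("INSERT INTO device (mac, type) VALUES ('AABBCCDDEEFF', 'relay')")
# Look up the surrogate id once, then use it for all child rows.
device_id = conn.execute(
    "SELECT id FROM device WHERE mac = 'AABBCCDDEEFF'").fetchone()[0]
conn.execute("INSERT INTO input (device_id, selection, value) VALUES (?, 1, 42)",
             (device_id,))
value = conn.execute("""
    SELECT i.value FROM input i
    JOIN device d ON d.id = i.device_id
    WHERE d.mac = 'AABBCCDDEEFF'""").fetchone()[0]
print(value)  # 42
```

As the other answer notes, the CHAR(12) key itself is unlikely to be the bottleneck; this layout mainly keeps the foreign keys small and uniform.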
While building my app, I came across a problem. I have some database tables with information I want to reuse across different applications, mainly for authentication and user privileges.
That is why I decided to split my database into two: one for user data (data I will need for other applications) and another for application-related data (data I will need only for this one).
In some cases, I need to reference a foreign key from one database in another database. I had no problem doing so while the databases were on the same connection. I did it like so:
CREATE TABLE `database1`.`table1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`foreign_key` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `table1_foreign_key_foreign` (`foreign_key`),
CONSTRAINT `table1_foreign_key_foreign` FOREIGN KEY (`foreign_key`) REFERENCES `database2`.`table2` (`id`)
);
Now here is my problem. I am getting to know Docker and I would like to create a container for each database. If my understanding is correct, each container acts as a different connection.
Is it even possible to reference a foreign key on different database connection?
Is there another way of referencing a foreign key from one Docker container on another?
Any suggestions or comments would be much appreciated.
Having a foreign key cross database boundaries is a bad idea for multiple reasons.
Scaling out: You are tying the databases to the same instance. Moving a database to a new instance becomes much more complicated, and you definitely do not want to end up with a FK constraint running over a linked server. Please, no. Don't.
Disaster Recovery: Your DR process has a significant risk. Are your backups capturing the data at the exact same point in time? If not, there is the risk that the related data will not match after a restore. Even a difference of a few seconds can invalidate the integrity of the relationship.
Different subsystems: Each database requires resources. Some are explicit, others are shared, but there is overhead for each database running in your instance.
Security: Each database has its own security implementation. Different logins and access permissions. If a user in your DATA database needs to lookup a value against the USER database, you'll need to manage permissions in both. Segregating the data by database doesn't solve or enhance your security, it just makes it more complicated. The overhead to manage the security for the sensitive data doesn't change, you'll still need to review and manage users and permissions based on the data (not the location of the data). You should be able to implement exactly the same security controls within the single database.
No, that is not possible. You cannot create an FK to a different DB instance (or another Docker container, in your case).
You may try to implement this check at the application level.
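A sketch of what such an application-level check could look like, using two separate SQLite connections (via Python's sqlite3) to stand in for the two containers. Table and column names follow the question; insert_with_check is a made-up helper:

```python
import sqlite3

# Two independent connections, standing in for two Docker containers /
# MySQL instances. No engine-level FK can span them.
users_db = sqlite3.connect(":memory:")  # "database2" with the referenced table
app_db = sqlite3.connect(":memory:")    # "database1" with the referencing table

users_db.execute("CREATE TABLE table2 (id INTEGER PRIMARY KEY)")
users_db.execute("INSERT INTO table2 (id) VALUES (1)")
app_db.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, foreign_key INT)")

def insert_with_check(foreign_key):
    # Emulate the FK constraint in application code: look up the parent
    # row on the other connection before inserting the child row.
    exists = users_db.execute(
        "SELECT 1 FROM table2 WHERE id = ?", (foreign_key,)).fetchone()
    if exists is None:
        raise ValueError(f"no row with id {foreign_key} in database2.table2")
    app_db.execute("INSERT INTO table1 (foreign_key) VALUES (?)", (foreign_key,))

insert_with_check(1)    # succeeds
# insert_with_check(2)  # would raise ValueError
```

Unlike a real FK, this check is not atomic across the two databases: the parent row can be deleted between the check and the insert, so you still need a cleanup or reconciliation strategy.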
Is there a way to keep a timestamped record of every change to every column of every row in a MySQL table? That way I would never lose any data and would keep a history of the transitions. Row deletion could be just setting a "deleted" column to true, so rows would be recoverable.
I was looking at HyperTable, an open source implementation of Google's BigTable, and this feature really whetted my appetite. It would be great if I could have it in MySQL, because my apps don't handle the huge amounts of data that would justify deploying HyperTable. More details about how this works can be seen here.
Is there any configuration, plugin, fork or whatever that would add just this one piece of functionality to MySQL?
I've implemented this in the past in a PHP model similar to what chaos described.
If you're using MySQL 5, you could also accomplish this with triggers that fire on the UPDATE and DELETE events of your table.
http://dev.mysql.com/doc/refman/5.0/en/stored-routines.html
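A runnable sketch of the trigger idea, written against SQLite via Python's sqlite3 so it can be executed here (MySQL's CREATE TRIGGER syntax is close but not identical; the table names are invented):

```python
import sqlite3

# Trigger-based change capture: every UPDATE on items is recorded in
# items_log with old and new values plus a timestamp.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE items_log (
    item_id INTEGER,
    old_name TEXT,
    new_name TEXT,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TRIGGER items_audit AFTER UPDATE ON items
BEGIN
    INSERT INTO items_log (item_id, old_name, new_name)
    VALUES (OLD.id, OLD.name, NEW.name);
END;
""")
conn.execute("INSERT INTO items (name) VALUES ('lamp')")
conn.execute("UPDATE items SET name = 'ceiling lamp' WHERE id = 1")
row = conn.execute("SELECT old_name, new_name FROM items_log").fetchone()
print(row)  # ('lamp', 'ceiling lamp')
```

An AFTER DELETE trigger on the same table would cover the "deleted" flag case; the history then stays queryable with plain SQL, which the cron-backup approach below does not give you.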
I do this in a custom framework. Each table definition also generates a Log table related many-to-one with the main table, and when the framework does any update to a row in the main table, it inserts the current state of the row into the Log table. So I have a full audit trail on the state of the table. (I have time records because all my tables have LoggedAt columns.)
No plugin, I'm afraid, more a method of doing things that needs to be baked into your whole database interaction methodology.
Create a table that stores the following info...
CREATE TABLE MyData (
ID INT IDENTITY,
DataID INT )
CREATE TABLE Data (
ID INT IDENTITY,
MyID INT,
Name VARCHAR(50),
Timestamp DATETIME DEFAULT CURRENT_TIMESTAMP)
Now create a sproc that does this...
INSERT INTO Data (MyID, Name)
VALUES (@MyID, @Name)
UPDATE MyData SET DataID = @@IDENTITY
WHERE ID = @MyID
In general, the MyData table is just a key table. You then point it at the record in the Data table that is the most current. Whenever you need to change data, you simply call the sproc, which inserts the new data into the Data table and then updates MyData to point to the most recent record. All of the other tables in the system would key themselves off of MyData.ID for foreign key purposes.
This arrangement sidesteps the need for a second log table (and keeping them in sync when the schema changes), but at the cost of an extra join and some overhead when creating new records.
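The pattern above can be sketched end to end. This uses SQLite via Python's sqlite3 so it is runnable (the original snippets are T-SQL flavoured); the set_name helper plays the role of the sproc:

```python
import sqlite3

# Key-table pattern: MyData points at the current row in Data; every
# change inserts a new Data row and repoints MyData at it.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MyData (id INTEGER PRIMARY KEY, data_id INTEGER);
CREATE TABLE Data (
    id INTEGER PRIMARY KEY,
    my_id INTEGER,
    name TEXT,
    ts TEXT DEFAULT CURRENT_TIMESTAMP
);
""")

def set_name(conn, my_id, name):
    # Insert the new version, then repoint the key row at it.
    cur = conn.execute("INSERT INTO Data (my_id, name) VALUES (?, ?)",
                       (my_id, name))
    conn.execute("UPDATE MyData SET data_id = ? WHERE id = ?",
                 (cur.lastrowid, my_id))

conn.execute("INSERT INTO MyData (id, data_id) VALUES (1, NULL)")
set_name(conn, 1, "first")
set_name(conn, 1, "second")
# Current value follows the pointer; the full history stays in Data.
current = conn.execute("""
    SELECT d.name FROM MyData m JOIN Data d ON d.id = m.data_id
    WHERE m.id = 1""").fetchone()[0]
print(current)  # second
```

Every historical version remains in Data (with its timestamp), while foreign keys elsewhere stay stable because they reference MyData.id, not a specific version.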
Do you need it to remain queryable, or will this just be for recovering from bad edits? If the latter, you could just set up a cron job to back up the actual files where MySQL stores the data and send them to a version control server.
I am currently "forced" to create a database in MS Access 2007.
Due to replication issues I have decided to use an AutoNumber of type Replication ID for my Users table.
On my Venues table I would like to use this ID to record which user created the row.
I have tried to use the UserID in textboxes across the main form, but it outputs
{guid {BF40D0A0-A1F3-4C98-A9B6-D9D075F0BBA3}}
and when using this value to insert into my Venues table, it generates a new Replication ID.
Am I missing some setting where it will use the GUID provided, or do you have any other suggestion?
Regards.
I don't pretend to understand replication in Access. However, the two best resources are:
- "INFO: Replication and GUIDs, the Good, the Bad, and the Ugly". Basically, don't use a GUID as a primary key.
- The Jet Replication Wiki. David Fenton has stated it would be fine if those without experience created sections for issues they wanted explained.
OK, I found it. What happens is that you cannot insert a GUID string in MS Access; you have to use the actual GUID object. Then all was fine.
Don't use StringFromGUID/GUIDFromString; just insert the object as is and all is well.
Any questions and I will gladly explain. X-)
Don't expose autonumber values to users, especially of the GUID flavour!