Couchbase per-user data approach

I'm having a bit of trouble finding the correct way to model per-user data in Couchbase and sync that user-specific data via Couchbase Mobile. In CouchDB you have a separate database per user. What is the best approach in Couchbase?

In Couchbase there is no such thing as "user data". It's generic and open to your own design.
Normally, when you design your object domain model for Couchbase, you encode metadata in the key structure.
For example:
Key for an account: "Acc#123456789", where the prefix "Acc#" identifies the type of the key and "123456789" identifies the particular instance, resulting in a unique key.
Similarly, if you want to store the address associated with the account, you would use the key "Acc#123456789#Addr", where the postfix "#Addr" marks the key as belonging to an address object.
Now, if you want user-specific keys, you simply inject the user identifier into the key (using the examples above):
key "Acc#123456789" will transform to "usr#12345#Acc#123456789"
key "Acc#123456789#Addr" will transform to "usr#12345#Acc#123456789#Addr"
Read more in the Couchbase documentation on data modeling and on keys and metadata.
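The key scheme described above can be sketched as a couple of small helpers. This is only an illustration: the function names `make_key` and `user_scoped` are hypothetical, and nothing here talks to Couchbase; it just builds the key strings.

```python
SEP = "#"  # separator used in the answer's example keys


def make_key(*parts: str) -> str:
    """Join key segments, e.g. ("Acc", "123456789") -> "Acc#123456789"."""
    for part in parts:
        if SEP in part:
            # A separator inside a segment would make the key ambiguous.
            raise ValueError(f"segment {part!r} must not contain {SEP!r}")
    return SEP.join(parts)


def user_scoped(user_id: str, key: str) -> str:
    """Prefix an existing key with the owning user's identifier."""
    return f"usr{SEP}{user_id}{SEP}{key}"


account_key = make_key("Acc", "123456789")
address_key = make_key("Acc", "123456789", "Addr")

print(user_scoped("12345", account_key))
print(user_scoped("12345", address_key))
```

Keeping key construction in one place makes it harder for different parts of an application to drift into slightly different key formats.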

You can structure your document keys so that all of a user's related entities are easy to retrieve. For example:
Create the user document with the key user_{Guid}, treating the Guid as the user ID.
Key all other related documents with the same Guid, such as credential_{Guid}, so that once the user has logged in and the user ID is in the session, you can fetch all of that user's information.
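As a rough sketch of that idea, the snippet below derives every related document ID from one Guid and fetches whatever exists. A plain dict stands in for the bucket, and the names `doc_ids_for` and `load_user_bundle` are made up for illustration; a real application would do key lookups against Couchbase instead.

```python
import uuid

# Document types named in the answer; extend as needed.
RELATED_PREFIXES = ("user", "credential")


def doc_ids_for(user_guid: str) -> list:
    """Build the IDs of all documents that belong to one user."""
    return [f"{prefix}_{user_guid}" for prefix in RELATED_PREFIXES]


def load_user_bundle(store: dict, user_guid: str) -> dict:
    """Fetch every related document that exists for this user."""
    return {
        doc_id: store[doc_id]
        for doc_id in doc_ids_for(user_guid)
        if doc_id in store
    }


guid = str(uuid.uuid4())
store = {
    f"user_{guid}": {"name": "Alice"},
    f"credential_{guid}": {"scheme": "placeholder"},
    "user_someone-else": {"name": "Bob"},
}

bundle = load_user_bundle(store, guid)
print(sorted(bundle))
```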

Related

Best approach for migrating the "wrong" foreign key to the "correct" foreign key?

For context, I have a Laravel 6 project which made a rather odd choice, to put it mildly, on how to manage relationships when I inherited it.
I have a user object which has its usual autoincrement id, as well as a "system_id" which is provided by an external system.
For most of the project, relationships involving a user object make use of their "id" field as the foreign key in the belongsTo() part of the relationship which is all well and good.
However, one many-to-many relationship, the one between the user model and the group model, uses the user model's "system_id" field as the foreign key instead of the "id" field used everywhere else. This is already in production and is beginning to cause all kinds of development headaches.
So as part of a cleanup project of the system, I intend to migrate the pivot table to use the user model's "id" field. The challenge now is the following:
In a database-agnostic way, how to copy the matching id to the "user_id" foreign key field in the pivot table given a known "system_id".
How will it look in a migration? Is a migration even a good option or should it be done directly in the database instead?
Anything else I should account for?
Is this even a good idea in the first place or should we just live with it?
Obviously, a backup will be made and the whole thing will be tested in a test environment first before it's attempted in production.

Migrate data from MySQL with auto-increment Ids to the Google Datastore?

I am trying to migrate some data from MySQL to the Datastore. I have a table called User with auto-increment primary keys (Bigint(20)). Now I want to move the data from the User table to the Datastore.
My plan was to let the Datastore generate new IDs for the migrated users and for all new users created after the migration is done. However, we have many services (notifications, URLs, etc.) that depend on the old IDs, so I want to keep the old IDs for the migrated users. How can I guarantee that newly generated IDs won't collide with the migrated IDs?
Record the maximum and minimum ids before migrating. Migrate all the sql rows to datastore entities, setting entity.key.id = sql.row.id.
To prevent new datastore ids from colliding with the old ones, always call AllocateIds() to allocate new ids. In C#, the code looks like this:
    Key key;
    Key incompleteKey = _db.CreateKeyFactory("Task").CreateIncompleteKey();
    do
    {
        key = _db.AllocateId(incompleteKey);
    } while (key.Path[0].Id >= minOldId && key.Path[0].Id <= maxOldId);
    // Use new key for new entity.
In reality, you are more likely to win the lottery than to see a key collide, so the loop will almost never run more than once and checking against the range of old ids costs next to nothing.
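The same retry logic, written as a small Python sketch so it can be run without a Datastore client. The allocator is passed in as a plain callable; `allocate_outside_range` and the fake ID sequence are assumptions made for illustration, not part of any Google client library.

```python
from typing import Callable


def allocate_outside_range(
    allocate_id: Callable[[], int], min_old_id: int, max_old_id: int
) -> int:
    """Keep allocating until the ID falls outside the migrated range."""
    while True:
        candidate = allocate_id()
        if not (min_old_id <= candidate <= max_old_id):
            return candidate


# Stand-in allocator: the first two IDs land inside the old range on purpose.
fake_ids = iter([150, 999, 5001])

new_id = allocate_outside_range(
    lambda: next(fake_ids), min_old_id=1, max_old_id=1000
)
print(new_id)
```

In a real service the callable would wrap the Datastore ID allocation call, and `min_old_id` and `max_old_id` would be the values recorded before the migration.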
You cannot hint/tell the Datastore to reserve specific IDs. So, if you manually set IDs when inserting existing data, and later have the Datastore assign an ID, it may pick an ID that you have already used. Depending on the operation you are using (e.g. INSERT or UPSERT), the operation may fail or overwrite the existing entity.
You need to come up with a migration plan to map existing IDs to Datastore IDs. Depending on the number of tables you have and the complexity of relations between them, this could become a time consuming project, but you should still be able to do it.
Let's take a simple example and assume you have two tables:
USER (USER_ID is primary key)
USER_DATA (USER_ID is foreign key)
You could add another column to USER (or keep the mapping some other way) to map the USER_ID to a DATASTORE_ID. To fill it, call Datastore's allocateIds method for the Kind you want to use and store the returned ID in the new column.
Now you can move the USER data to Cloud Datastore, ignoring the MySQL USER_ID and using the ID from the new column instead.
To migrate the data from USER_DATA, do a join between the two tables and push the data using datastore ID.
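A minimal sketch of the mapping-column approach, using an in-memory SQLite database in place of MySQL so it runs anywhere. The table is named `users` here rather than USER, the sample rows are invented, and `fake_allocator` is a hypothetical stand-in for the Datastore ID allocation call.

```python
import itertools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_data (user_id INTEGER REFERENCES users(user_id), payload TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO user_data VALUES (1, 'a-settings'), (2, 'b-settings'), (1, 'a-profile');
    """
)

# Step 1: add the mapping column and fill it with allocated IDs.
conn.execute("ALTER TABLE users ADD COLUMN datastore_id INTEGER")
fake_allocator = itertools.count(5_000_000_001)  # stand-in for real allocation
for (user_id,) in conn.execute("SELECT user_id FROM users ORDER BY user_id").fetchall():
    conn.execute(
        "UPDATE users SET datastore_id = ? WHERE user_id = ?",
        (next(fake_allocator), user_id),
    )

# Step 2: export users keyed by the new ID, ignoring the MySQL user_id.
users = conn.execute(
    "SELECT datastore_id, name FROM users ORDER BY user_id"
).fetchall()

# Step 3: join so that child rows carry the new ID as well.
user_data = conn.execute(
    """
    SELECT u.datastore_id, d.payload
    FROM user_data AS d
    JOIN users AS u ON u.user_id = d.user_id
    ORDER BY d.rowid
    """
).fetchall()

print(users)
print(user_data)
```

The rows returned by steps 2 and 3 are what would be written to Datastore entities; the write itself is left out because it depends on the client library in use.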
Also, note that using sequential IDs (referred to as monotonically increasing values) could cause performance issues with Datastore. So, you probably want to use IDs that are generated by the Datastore.

Using sAMAccountName as uid in MySQL database

I have an application that authenticates with LDAP and returns a JWT with the sAMAccountName of the logged-in user.
This application has a MySQL database where I'd like to store the user in different tables (fields like createdBy, updatedBy, etc.) and I was wondering what the correct way of handling this is:
using the sAMAccount name as identifier (so the createdBy will be a VARCHAR(25))
using a link table to match the sAMAccountname with an autoincremented identifier
Normally I would choose the "id" way, it's faster and easier to read in my opinion, but I'm not really into linking users from LDAP dictionary and changing their id in my database, so honestly I would choose the first option.
What are the pros and cons of using a string as uid? In my case it's likely to be only for statuses like updatedBy, createdBy, deletedBy etc., so I won't have hard links between multiple tables using a user identifier.
I think you should create a user table with a surrogate primary key (an autoincrementing one) and put a unique index on the sAMAccountName column.
Natural primary keys are good because they naturally describe the record they point to. But the downside of using them is that they consume more space in the index, so index lookups and rebuilds are slower. Tables consume more space as well.
I'd connect everything using an id as primary key.
One thing is that the sAMAccountName is not necessarily a stable identifier. Think of a user changing her or his name. The sAMAccountName might then change, but it's still the same user. When you connect everything via an ID, you can change the sAMAccountName field without breaking everything.
But that's just my 2 cents.
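To make the surrogate-key suggestion concrete, here is a small sketch using SQLite in place of MySQL. The table and column names (`app_user`, `document`, `created_by`) and the helper `user_id_for` are invented for the example; the point is only that audit columns reference the numeric id while the account name stays unique and can change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        sam_account_name VARCHAR(25) NOT NULL UNIQUE
    );
    CREATE TABLE document (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        title TEXT NOT NULL,
        created_by INTEGER NOT NULL REFERENCES app_user(id)
    );
    """
)


def user_id_for(conn: sqlite3.Connection, sam_account_name: str) -> int:
    """Return the surrogate id for an LDAP account, creating the row if needed."""
    conn.execute(
        "INSERT OR IGNORE INTO app_user (sam_account_name) VALUES (?)",
        (sam_account_name,),
    )
    row = conn.execute(
        "SELECT id FROM app_user WHERE sam_account_name = ?",
        (sam_account_name,),
    ).fetchone()
    return row[0]


jane_id = user_id_for(conn, "jdoe")
conn.execute(
    "INSERT INTO document (title, created_by) VALUES (?, ?)", ("Report", jane_id)
)

# The account is renamed in the directory; only one row has to change.
conn.execute(
    "UPDATE app_user SET sam_account_name = ? WHERE id = ?", ("jsmith", jane_id)
)

author = conn.execute(
    """
    SELECT u.sam_account_name
    FROM document AS d
    JOIN app_user AS u ON u.id = d.created_by
    WHERE d.title = ?
    """,
    ("Report",),
).fetchone()[0]
print(author)
```

Detecting that "jdoe" and "jsmith" are the same person still has to come from the directory (for example from an immutable LDAP attribute); the sketch assumes the application is told about the rename.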

Creating User Accounts with unique uid in OpenLDAP

I'm trying to create a number of user accounts (class is OpenLDAPperson) in OpenLDAP and the problem I'm having is that the uid property is not enforced as UNIQUE by LDAP.
It looks like only the object class "account" enforces uid uniqueness. Unfortunately, I cannot seem to combine it with object classes like OpenLDAPperson or OrganizationalPerson.
Can anyone recommend a best practice for creating users and have uid uniqueness also enforced in LDAP?
Thank you
I ran into the same problem and switched to OpenDJ (formerly OpenDS). It supports a unique index on uid.

Using Database Primary Key in HTML ID

Just wanted to ask.
I have a site where each user is linked to an ID in the database, and this primary key is included in many tables. The fastest way for me to pull a user's information is to have this ID.
Would it be considered bad practice to put this ID in the website's HTML code? e.g. id="theIDnumber"
Otherwise I can just use the username and then look up the ID in the database from that, which is fine, but using the ID would be faster I believe.
Thoughts?
I'd say don't do it if your keys are predictable. A trivial example: if you are using sequentially incrementing primary keys, users can extract information from them that could be a privacy concern, e.g. they can infer which accounts were created before their own. Life also becomes easy for those trying to systematically leech information from your site.
Some related reading
https://stackoverflow.com/a/7452072/781695
You give your end users the opportunity to mess with those variables
and pass any data that they like. The counter measure to mitigate this
vulnerability is to create indirect object references instead. This
may sound like a big change, but it does not necessarily have to be.
You don't have to go and rekey all your tables or anything, you can do
it just by being clever with your data through the use of an indirect
reference map.
https://security.stackexchange.com/a/33524/37949
Hiding database keys isn't exactly required, but it does make life
more difficult if an attacker is trying to reference internal IDs in
an attack. Direct references to file names and other such internal
identifiers can allow attackers to map the internal structure of the
server, which might be useful in other attacks. This also invites path
injection and directory traversal problems.
https://www.owasp.org/index.php/Insecure_Direct_Object_Reference_Prevention_Cheat_Sheet
An object reference map is first populated with a list of authorized
values which are temporarily stored in the session. When the user
requests a field (ex: color=654321), the application does a lookup in
this map from the session to determine the appropriate column name. If
the value does not exist in this limited map, the user is not
authorized. Reference maps should not be global (i.e. include every
possible value), they are temporary maps/dictionaries that are only
ever populated with authorized values.
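A per-session reference map along the lines of the quoted description might look like the sketch below. The class name `ReferenceMap` and its methods are hypothetical; a real application would keep one such map in each user's server-side session and only add IDs that user is authorized to see.

```python
import secrets


class ReferenceMap:
    """Maps random tokens to internal IDs for a single session."""

    def __init__(self) -> None:
        self._token_to_id = {}
        self._id_to_token = {}

    def token_for(self, internal_id: int) -> str:
        """Return the opaque token to put in the HTML instead of the real ID."""
        token = self._id_to_token.get(internal_id)
        if token is None:
            token = secrets.token_urlsafe(16)
            self._token_to_id[token] = internal_id
            self._id_to_token[internal_id] = token
        return token

    def resolve(self, token: str) -> int:
        """Translate a token from the client back to the internal ID."""
        try:
            return self._token_to_id[token]
        except KeyError:
            # Anything not in this session's map is treated as unauthorized.
            raise PermissionError("unknown or unauthorized reference") from None


session_map = ReferenceMap()
token = session_map.token_for(42)  # 42 stands for a database primary key
print(session_map.resolve(token))
```

Because the tokens are random and scoped to one session, a user who edits the value in the page cannot walk through other users' primary keys.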