How to deal with NGSI-LD tenants ? Create ? List ? Delete? - fiware

I have a hard time trying to find concrete information about how to deal with tenants in a NGSI-LD Context Broker.
The ETSI specification defines multi-tenancy but it seems it doesn't specify any operations to create a tenant, list available tenants and remove a tenant.
I presume each broker is free to implement multi-tenancy its own way but I've searched the tenant keyword in broker documentations (at least for Orion-LD, Stellio and Scorpio) with no success.
Thanks to this stackoverflow post I've successfully created a tenant in Orion-LD.
I'd like to know if there are some tenants operations (documented or undocumented) exposed by brokers.
Especially any facilities to remove a tenant - along with all resources that have been created "inside".
Thanks.

So, first of all, tenants are created "on the fly". In comes a request to create an entity, subscription, or registration, and the HTTP header "NGSILD-Tenant" specifies a tenant that does not already exists, then the tenant is created and the entity/sub/reg is created "under it". For ALL operations, the HTTP header is used. jsonldContexts are different, those are "omnipresent" and exist regardless of the tenant used.
GET operations (like all other operations) can use the NGSILD-Tenant header to indicate on which tenant the GET is to be performed, but if that tenant does not exist, it will naturally not be created - an error is retuirned instead.
There are no official endpoints to list tenants (nor delete - that would be a bit dangerous!), but in the case of Orion-LD, I implemented an endpoint for debugging purposes: GET /ngsi-ld/ex/v1/tenants. That one you can use if you please. Just remember, no other NGSI-LD broker is supporting that endpoint

Related

I want to know what Orion's Federation feature is

I want to know what Orion's Federation feature is.
I have read the Orion documentation and tried the federation function, and the data was registered in all three Orions.
I thought that the second Orion acts as a Proxy and does not store data, but is that not the case?
If all three Orions store data, is it correct to say that the Federation does not have its own function, but is a concatenation of Subscriptions?
The Federation mechanism you refer is the one on the documentation. In this case, it is based on subscriptions that copy the data among the brokers on entities changes.
Orion also has the registration and request forwarding mechanism. This case, once the register is done, the Context Broker forward the requests to the one registered. This approach sounds closer to the one you are describing but I encourage you to use the first method (based on subscriptions) since all the advanced operations like filtering are working without issues.
A well-designed Federation should not allow value shadowing by a local database. Unfortunately, that's not a feature offered by design by the Orion Broker. That is up to your design and nothing prevents the shadowing, as far as I understand i.e. even you have registered a provider for an Entity, someone can create locally such an Entity, which may be a mess and a source of conflict.
An alternative that may work is to perform the Federation by a REST Service custom built by you that just acts as a proxy against different Orion Brokers. If this custom service is a single entry point you could guarantee a pure Federated system with no shadowing of values.

Migrating previously collected datasets to FIWARE backend

Having at hand, the task of migrating previously collected environmental datasets (weather, airquality, noise etc) from sensors deployed in different locations, and stored in several tables of MySQL database, to my instance of fiware Orion CB, and thus persisted to fiware backend.
The challenges are many:
the data isn't stored in fiware standards, so must be transformed according to the fiware data models.
not all tables are a good candidates of being transformed to an Entity.
some Entities need have field values from several tables as attributes. For instance, defining AirQualityObserved Entity-type would have attributes from these tables: airquality, co, co2, no2 and deployment. So mapping these attributes to a particular Entity-type is a challenge.
As this is a one-time upload (not live data), I am thinking of two possibilities to go about it.
Add an LwM2M client, to keep sending data to an IoTAgent and eventually passed to Orion CB until the last record.
Create a Python script that "pretends" to be a contextProvider to the Orion instance, sending data (say every 5sec) until the last record.
I have not come across a case in my literature search that addresses such a situation. Is there any recommendations from FIWARE Foundation for situations similar to this?
How would you suggest about data fields --> Entity's attributes mapping that actually need be combined from several tables?
IOTA usage makes sense when you have live data (I mean, a real device sending information to the FIWARE platform). However, you say this is a one-time upload, so the Python script option seems better this case.
(A little terminological comment here: your script will take the role of context producer. A context provider is a different actor, related with registrations and query/update forwarding. See this piece of documentation for additional detail).
With regards to the data fields to Entity's attributes mapping I don't have any particular suggestion. This is just a matter of analyzing the data model (i.e. entity attributes) and find how to set that information from your data in the tables.

Microservice Architecture design

I have few doubts on Microservice architecture.
Lets say there are microservices A, B and C.
A maintains the context of a job apart from other things it does and B,C work to fulfill that job by doing respective tasks for that job.
Here I have questions.
1. DB design
I am talking about SQL here.
Usage of foreign keys simplifies lot of things.
But as I understand microservice architecture, every microservice maintaines its own data and data has to be queried from that service if required.
Does it mean no foreign keys referring to tables in another microservices?
2. Data Flow
As I see here are two ways.
All the queries are done using jobId maintained uniquely in all microservices for a job.
Client requests go directly to individual service for a task. To get summary of the job, client queries individual microservices collects the data and passes to user.
Do everything through coordinating microservice. Client requests go to service A and in tern service A will gather info from all other microservices for that jobId and passed that to user.
Which of the above two has to be followed and why?
You're correct thinking that microservices should ideally have their own data structure so they can be deployed independently. However there are several design patterns that help you, and that doesn't necessarily translates in "No FK". Please refer to:
Database per service
Sagas
API Composition
CQRS
The patterns listed above answer both your questions.
Does it mean no foreign keys referring to tables in another microservices?
Not in the database sense. One microservice may hold IDs of remote entities but should not assume anything about the remote microservice persistence (i.e. the database type, it could be anything from SQL to NoSQL).
Which of the above two has to be followed and why?
This really depends. There are two types of architectures: choreography and orchestration. Both of them are good. Which one to use? Only you can decide. Here are a few blog posts about them:
Microservices — When to React Vs. Orchestrate
Benefits of Microservices - Choreography over Orchestration, Low Coupling and High Cohesion
Also, the solution to this SO question might be useful.

Protecting client-side database IDs in a web app [duplicate]

I've heard that exposing database IDs (in URLs, for example) is a security risk, but I'm having trouble understanding why.
Any opinions or links on why it's a risk, or why it isn't?
EDIT: of course the access is scoped, e.g. if you can't see resource foo?id=123 you'll get an error page. Otherwise the URL itself should be secret.
EDIT: if the URL is secret, it will probably contain a generated token that has a limited lifetime, e.g. valid for 1 hour and can only be used once.
EDIT (months later): my current preferred practice for this is to use UUIDS for IDs and expose them. If I'm using sequential numbers (usually for performance on some DBs) as IDs I like generating a UUID token for each entry as an alternate key, and expose that.
There are risks associated with exposing database identifiers. On the other hand, it would be extremely burdensome to design a web application without exposing them at all. Thus, it's important to understand the risks and take care to address them.
The first danger is what OWASP called "insecure direct object references." If someone discovers the id of an entity, and your application lacks sufficient authorization controls to prevent it, they can do things that you didn't intend.
Here are some good rules to follow:
Use role-based security to control access to an operation. How this is done depends on the platform and framework you've chosen, but many support a declarative security model that will automatically redirect browsers to an authentication step when an action requires some authority.
Use programmatic security to control access to an object. This is harder to do at a framework level. More often, it is something you have to write into your code and is therefore more error prone. This check goes beyond role-based checking by ensuring not only that the user has authority for the operation, but also has necessary rights on the specific object being modified. In a role-based system, it's easy to check that only managers can give raises, but beyond that, you need to make sure that the employee belongs to the particular manager's department.
There are schemes to hide the real identifier from an end user (e.g., map between the real identifier and a temporary, user-specific identifier on the server), but I would argue that this is a form of security by obscurity. I want to focus on keeping real cryptographic secrets, not trying to conceal application data. In a web context, it also runs counter to widely used REST design, where identifiers commonly show up in URLs to address a resource, which is subject to access control.
Another challenge is prediction or discovery of the identifiers. The easiest way for an attacker to discover an unauthorized object is to guess it from a numbering sequence. The following guidelines can help mitigate that:
Expose only unpredictable identifiers. For the sake of performance, you might use sequence numbers in foreign key relationships inside the database, but any entity you want to reference from the web application should also have an unpredictable surrogate identifier. This is the only one that should ever be exposed to the client. Using random UUIDs for these is a practical solution for assigning these surrogate keys, even though they aren't cryptographically secure.
One place where cryptographically unpredictable identifiers is a necessity, however, is in session IDs or other authentication tokens, where the ID itself authenticates a request. These should be generated by a cryptographic RNG.
While not a data security risk this is absolutely a business intelligence security risk as it exposes both data size and velocity. I've seen businesses get harmed by this and have written about this anti-pattern in depth. Unless you're just building an experiment and not a business I'd highly suggest keeping your private ids out of public eye. https://medium.com/lightrail/prevent-business-intelligence-leaks-by-using-uuids-instead-of-database-ids-on-urls-and-in-apis-17f15669fd2e
It depends on what the IDs stand for.
Consider a site that for competitive reason don't want to make public how many members they have but by using sequential IDs reveals it anyway in the URL: http://some.domain.name/user?id=3933
On the other hand, if they used the login name of the user instead: http://some.domain.name/user?id=some they haven't disclosed anything the user didn't already know.
The general thought goes along these lines: "Disclose as little information about the inner workings of your app to anyone."
Exposing the database ID counts as disclosing some information.
Reasons for this is that hackers can use any information about your apps inner workings to attack you, or a user can change the URL to get into a database he/she isn't suppose to see?
We use GUIDs for database ids. Leaking them is a lot less dangerous.
If you are using integer IDs in your db, you may make it easy for users to see data they shouldn't by changing qs variables.
E.g. a user could easily change the id parameter in this qs and see/modify data they shouldn't http://someurl?id=1
When you send database id's to your client you are forced to check security in both cases. If you keep the id's in your web session you can choose if you want/need to do it, meaning potentially less processing.
You are constantly trying to delegate things to your access control ;) This may be the case in your application but I have never seen such a consistent back-end system in my entire career. Most of them have security models that were designed for non-web usage and some have had additional roles added posthumously, and some of these have been bolted on outside of the core security model (because the role was added in a different operational context, say before the web).
So we use synthetic session local id's because it hides as much as we can get away with.
There is also the issue of non-integer key fields, which may be the case for enumerated values and similar. You can try to sanitize that data, but chances are you'll end up like little bobby drop tables.
My suggestion is to implement two stages of security.
"Security through obscurity": You can have integer Id as primary key and Gid as GUID as surrogate key in tables. Whereas integer Id column is used for relations and other database back-end and internal purposes (and even for select list keys in web apps to avoid unnecessary mapping between Gid and Id while loading and saving) and Gid is used for REST Urls i.e for GET,POST, PUT, DELETE etc. So that one cannot guess the other record id. This gives first level of protection against guess-based attacks. (i.e. number series guessing)
Access based control at Server side : This is most important, and you have various way to validate the request based on roles and rights defined in application. Its up to you to decide.
From the perspective of code design, a database ID should be considered a private implementation detail of the persistence technology to keep track of a row. If possible, you should be designing your application with absolutely no reference to this ID in any way. Instead, you should be thinking about how entities are identified in general. Is a person identified with their social security number? Is a person identified with their email? If so, your account model should only ever have a reference to those attributes. If there is no real way to identify a user with such a field, then you should be generating a UUID before hitting the DB.
Doing so has a lot of advantages as it would allow you to divorce your domain models from persistence technologies. That would mean that you can substitute database technologies without worrying about primary key compatibility. Leaking your primary key to your data model is not necessarily a security issue if you write the appropriate authorization code but its indicative of less than optimal code design.

FIWARE Orion: Multitenancy and Federation

I would like to know how can I create tenants in Orion. I understand that you need to inform Orion of the multitenancy by adding the -multiservice parameter. However, I do not know what is necessary to create, indeed, a tenant.
From what I have seen in the documentation, my guess is that you just need to add the Fiware-Service header in each HTTP request. Therefore, if you send a updateContext with a Fiware-Service=t_01 Orion will automatically create that tenant. Is this right?
In addition to this, I also would like to know if multitenancy works correctly (or is planned to work correctly) in the following scenario:
In other words, is it possible to have several Gateways, each one with an Orion with one or more tenants, and all the Orions of all the Gateways federated to a global Orion in the cloud? Will the tenancy be respected in the global orion?
Thanks.
Your are right: Orion creates tenants "on the fly", the first time a create entity/registry operation in the given tenant is processed.
Regarding the scenario in the picture I understand that you refer to the federation approach described in this section in the manual, based on notifyContext. Given that Orion includes the Fiware-Service header in notifications, it should work.