I have a few related web sites and it seems rather unfortunate that they have completely separate user databases. I've been contemplating different options on how to unify the databases:
1. Rework the sites to run on one copy of my content management system rather than on independent software. Pros: Seems clean. Cons: Complicated by needing to rewrite a lot of the backend of one of the sites to support the different features of the other site.
2. Use the OAuth backend I wrote to interface with Facebook to authenticate back and forth between the sites. Pros: Seems to be using OAuth for what it was meant to do. Cons: It requires at least some redundancy, since I'd need to store duplicate user data on both sites, and this could get out of sync. It also seems like overkill for two sites running on the same server.
3. Connect to both databases whenever an account is created or modified on either site and apply the modifications to the other site. Pros: Seems to avoid the risk of falling out of sync and avoids the complications of having to create and receive OAuth data between the sites. Cons: It requires full duplication of user information between the sites.
4. Choose one of the sites as having the canonical database, and have the user authentication mechanism of the other site connect to the first site's MySQL database, while still connecting to a separate database for the rest of its functionality (roughly sketched below).
I'm not totally happy with any of these options, although #4 feels like the simplest to implement as I'm thinking about it. Nonetheless, before I embark on such a project, I thought I'd ask about potential pitfalls I might be overlooking, since none of the ideas are entirely trivial. I'd appreciate advice on which might be considered "best practices" and, perhaps more importantly, which would have the biggest impact on server resources. I'm using Perl's DBD::mysql to interact with the databases.
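To make option 4 concrete, here is roughly what I have in mind, assuming both schemas live on the same MySQL server; the schema, account, and column names (site_a, site_b, site_b_app, password_hash) are made up for illustration:

    -- Site A keeps the canonical users table; site B's existing MySQL account
    -- is simply granted read access to it.
    GRANT SELECT ON site_a.users TO 'site_b_app'@'localhost';

    -- Optionally, a view inside site B's schema lets its auth code keep
    -- querying a local-looking table name.
    CREATE VIEW site_b.users AS
        SELECT id, username, password_hash, email
        FROM site_a.users;

With DBD::mysql, site B's auth code could then either query site_a.users by its fully qualified name over its existing connection, or keep using its own table name via the view.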
For a client I'm going to deliver a SaaS solution. SaaS in the sense that it's still closed to a limited set of clients who have to sign a contract with us, so it isn't shared worldwide, and the client base will be around 5-10 companies.
Our first client, the pilot client so to speak, has a requirement that they can run SQL queries (read-only) against the data, so they can do their own analysis in Excel alongside what our application serves.
For maintenance reasons I would prefer to serve everything from the same codebase, but I'm wondering how I can make sure clients can't access other clients' SQL records.
I'm using Laravel, so the alternative of separate installations per client would mean packaging everything into maintainable packages and upgrading every installation from there, but that can grow into a lot of work.
I'm still not sure how to do this with a single installation. Maybe a separate database per client? That would of course require a central database to point each client to the right database, or maybe only some of the tables would live in another database, but it already sounds like a mess to me.
In Laravel it is possible to have multiple database connections, so your idea of giving each client their own database is going to be the most secure option.
Have your default database be your main application database, which holds settings, auth, etc.
For each client, store their data in a separate per-client database, and only allow them to query that database.
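On the MySQL side, a rough sketch of that setup (all names here, such as client_acme and acme_readonly, are hypothetical), which also covers the pilot client's requirement for read-only SQL access:

    -- One database per client.
    CREATE DATABASE client_acme;

    -- The application's own account can read and write this client's data.
    GRANT SELECT, INSERT, UPDATE, DELETE ON client_acme.* TO 'app'@'localhost';

    -- A read-only account handed to the client for their Excel/reporting
    -- queries; it can see nothing outside its own database.
    CREATE USER 'acme_readonly'@'%' IDENTIFIED BY 'change-me';
    GRANT SELECT ON client_acme.* TO 'acme_readonly'@'%';

In Laravel that maps to one extra connection entry per client, selected at runtime based on which tenant is authenticated.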
Although I don't know the specifics of your application, my real suggestion is to avoid direct SQL access completely and build an API.
Your SaaS clients should not have to be concerned with the internal implementation of your database structure. A well-built API gives you the freedom to modify the database as needed, and gives the SaaS client peace of mind that their "interface" is not in a perpetual state of flux.
Something I have searched for but cannot find a straight answer to is this:
For a given service, if there are two instances of that service deployed to two machines, do they share the same persistent store or do they have separate stores with some syncing mechanism (master/slave, clustering)?
E.g. I have an OrderService backed by MySQL. We're getting a lot of orders in, so I need to scale this service up, so we deploy a second OrderService. Where does its data come from?
It may sound silly but, to me, every discussion makes it seem like the service and database are a packaged unit that are deployed together. But few discussions mention what happens when you deploy a second service.
Posting this as an answer because it's too long for a comment.
Microservices are self-contained components and as such are responsible for their own data. If you want to get to the data you have to talk to the service API. This applies mainly to different kinds of services (i.e. you don't share a database among services that offer different kinds of business functionality). That's bad practice because you couple services at the hip through the database, and it then becomes easy to couple more things that would normally be done at the API level but are more convenient to do through the database, so you risk losing componentization.
But if you have multiple instances of the same kind of service, then there are, as you mentioned, two obvious choices: share a database or have each service contain its own database.
Now you have to ask yourself which solution to choose:
Are these OrderServices of yours truly capable of working on their own, or do you need to have all the orders in the same database for reporting or access by other applications?
Determine what your actual bottleneck is. Is it the database? If not, then share the database. Is it the services? If not, then distribute your data.
If you need to distribute the data, what are your choices and what are your needs? Do you need to be consistent all the time, or is eventual consistency good enough? Do you need to have separate databases and synchronize them manually, or does your database installation handle replication and partitioning out of the box?
Etc.
What I'm trying to say is that in these kinds of situations the answer is: it depends. And something that we tech geeks often forget to do before embarking on such distributed/scalability/architecture journeys is to talk to the business. Often the business can handle a certain degree of inconsistency, suboptimal processes, or looking up data in more places instead of one (i.e. what you think is important might not necessarily be important for the business). So talk to them and see what they can tolerate. It might be cheaper to resolve something operationally than to invest a lot in trying to build a highly distributed system.
For a site that has a sandbox mode, such as a payment site, would a separate database be used, or the same one?
I am considering two possible schemes for the production and sandbox environments. Here are the two options.
OPTION 1:
Clone database, route requests to the correct database based upon sandbox mode.
OPTION 2:
Single database, 'main tables' have an is_sandbox boolean.
What would be the pros and cons of each method?
In most situations, you'd want to keep two separate databases. There's no good reason to have the two intermingled in the same database, and a lot of very good reasons to keep them separated:
Keeping track of which entities are in which "realm" (production vs. sandbox) is extra work for your code, and you'll likely have to include it in a lot of places.
You'll need that logic in your database schema as well. UNIQUE indexes all have to include the realm, for instance (see the sketch after this list).
If you forget any of that code, you've got yourself a potential security vulnerability. A malicious user could cause data from one realm to influence the other. Depending on what your application is, this could range anywhere from annoying to terrifying. (If it's a payment application, for instance, the potential consequences are incredibly dire: "pretend" money from the sandbox could be converted into real money!)
Even if your code is all perfect, there'll still be some information unavoidably leaked between the realms. For instance, if your application uses any sequential identifiers (AUTO_INCREMENT in MySQL, for instance), gaps in values seen in the sandbox will correspond with values used in production. Whether this matters is debatable, though.
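To make the index point above concrete, here's a rough sketch of what Option 2 forces on the schema and on every query (the payments table and its columns are hypothetical):

    -- With a single shared database, the realm has to live in the schema itself...
    CREATE TABLE payments (
        id          BIGINT AUTO_INCREMENT PRIMARY KEY,
        is_sandbox  BOOLEAN       NOT NULL,
        invoice_no  VARCHAR(32)   NOT NULL,
        amount      DECIMAL(10,2) NOT NULL,
        -- "invoice numbers are unique" silently becomes "unique per realm":
        UNIQUE KEY uq_invoice_realm (invoice_no, is_sandbox)
    );

    -- ...and in every query; omit the predicate once and the realms bleed into each other.
    SELECT id, amount FROM payments
    WHERE invoice_no = 'INV-1001' AND is_sandbox = FALSE;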
Using two separate databases neatly solves all these problems. It also means you can easily clean out the sandbox when needed.
Exception: If your application is an almost entirely public web site (e.g, like Stack Overflow or Wikipedia), or involves social aspects that are difficult to replicate in a sandbox (like Facebook), more integrated sandboxes may make more sense.
I've tried to find an answer to my question but I couldn't find the right one yet (I'd be glad if you pointed me to one). I'm a newbie when it comes to running services (websites, forums, wikis, email); I'm mostly experimenting.
I have a couple of websites (mainly WordPress), a mail server, a forum, wikis, and file sharing (ownCloud) hosted on one server.
Until now, every time I installed a new service I created a new MySQL database, just as the install READMEs advise. I would now like to connect some of the services together, mainly with a unified user database.
What is the best way to do it? Is having multiple databases versus one database heavier on my server's CPU load? Is it secure? Is it easy to administer?
If CPU load isn't an issue with multiple databases, is it possible to create a user database and link it to the databases of the services I'd like to connect?
Having multiple applications (forum, wiki, ...) access the same database is not likely to have any effect on CPU usage, but there are other drawbacks:
Table names used by the applications might conflict (many of them will have a "session" or "posts" table, for example). Some web apps have a feature to prefix table names with a string, like "wp_session" and "wp_posts", to get around such conflicts.
Yes, it's less secure. When one of the applications has a security hole and someone manages to access the shared database, the data of all applications is compromised (see the sketch after this list).
Multiple databases are likely to be easier to manage when doing application upgrades, backups, or adding and removing applications from the mix.
With a single shared database, accidentally break that one database and you break all the apps.
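If you do keep the services in separate databases on the one MySQL server, giving each application its own MySQL account limits the damage when one of them is compromised. A rough sketch, with hypothetical names:

    -- Each application gets its own database and its own account,
    -- scoped to that database only.
    CREATE DATABASE wordpress_db;
    CREATE DATABASE forum_db;

    CREATE USER 'wordpress'@'localhost' IDENTIFIED BY 'change-me';
    CREATE USER 'forum'@'localhost'     IDENTIFIED BY 'change-me-too';

    GRANT ALL PRIVILEGES ON wordpress_db.* TO 'wordpress'@'localhost';
    GRANT ALL PRIVILEGES ON forum_db.*     TO 'forum'@'localhost';

    -- A hole in the forum now exposes forum_db, but not the WordPress data.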
To get the applications to use the same authentication database, it's usually not enough to simply point them at the same database: they're likely to use different schemas for storing user information (different columns in the user table), different password hashing, and so on.
The question is quite broad, and the specific answer depends a lot on the actual applications you're using. The best approach in general is probably to pick applications which support a protocol such as OpenID or OAuth, or an authentication backend such as an LDAP database or PAM (Pluggable Authentication Module). These methods allow you to use a single user database managed by a single method. The apps all need to work with the same backend. In any case, it's likely to be quite a learning experience to get it running smoothly.
I've just started working on a project that will involve multiple people entering data from multiple geographic locations. I've been asked to prepare forms in Access 2003 to facilitate this data entry. Right now, copies of the DB (with my tables and forms) will be distributed to each of the sites, returned to me, and then I get to hammer them all together. I can do that, but I'm hoping that there is a better way - if not for this project, then for future projects.
We don't have any funding for real programming support, so it's up to me. I am comfortable with HTML, CSS, and SQL, have played around with Django a fair bit, and am a decently fast learner. I don't have much time to design forms, but they don't have to actually function for a few months.
I think there are some substantial benefits to web-based forms (primary keys are set centrally, I can monitor data entry, form changes are immediately and universally deployed, I don't have to do tech support for different versions of Access). But I'd love to hear from voices of experience about the actual benefits and hazards of this stuff.
This is very lightweight data entry - three forms attached to three tables, linked by person ID, certainly under 5000 total records. While this is hardly bank account-type information, I do take the security of these data seriously, so that's an additional consideration. Any specific technology recommendations?
Options that involve Access:
1. Use Jet replication. If the machines where the data editing is being done can be connected via wired LAN to the central network, synchronization would be very easy to implement (via simple Direct Synchronization, only a couple of lines of code). If not (as seems to be the case here), it's an order of magnitude more complex and requires significant setup of the remote systems. For an ongoing project it can be a very good solution; for a one-off, not so much. See the Jet Replication Wiki for lots of information on Jet Replication. One advantage of this solution is that it works completely offline (i.e., no Internet connection needed).
2. Use Access for the front end and SQL Server (or some other server database) for the back end. Provide a mechanism for remote users to connect to the centrally-hosted database server, either over VPN (preferred) or by exposing a non-standard port to the open Internet (not recommended). For lightweight editing, this shouldn't require much optimization of the Access app to get a usable application, but it isn't going to be as fast as a local connection, and how slow it is will depend on the users' Internet connections. This solution does require an Internet connection.
3. Host the Access app on a Windows Terminal Server. If the infrastructure is available and there's a budget for CALs (or the CALs are already in place), this is a very, very easy way to share an Access app. Like #2, this requires an Internet connection, but it puts all the administration in one central location and requires no development beyond what's already been done to create the existing Access app.
For non-Access solutions, it's a matter of building a web front end. For the size app you've outlined, that sounds pretty simple for the person who already knows how to do that, not so much for the person who doesn't!
Even though I'm an Access developer, based on what you've outlined, I'd probably recommend a light-weight web-based front end, as simple as possible with no bells and whistles. I use PHP, but obviously any web scripting environment would be appropriate.
I agree with David: a web-based solution sounds the most suitable.
I use CodeCharge Studio for that: it has a very Access-like interface, lots of wizards to create online forms etc. CCS offers a number of different programming languages; I use PHP, as part of a LAMP stack.