After reaching some bottlenecks with the MySQL cluster my application uses (too many connections, deadlocks), I've been advised to queue all the writes using Kafka or Redis. A near real-time solution sounds, indeed, much better than having database connection errors in production, so I went for this recommendation.
However, I'm having trouble understanding where Kafka (or a Redis stream) fits in my architecture. I currently use multiple MySQL servers with HAProxy in front, and split read and write queries across different ports to prevent the cluster from going out of sync. Each port is then directed to one server (for write queries) or many servers (for read queries).
I assume I would have to configure the write endpoint so that all write queries go through Kafka first, but it seems like I simply know too little to connect the dots based on the tutorials I've been watching (most examples turn certain MySQL tables into streams, but that would be too late for my use case).
My question would be: is there a simple way to redirect SQL queries to Kafka (something like switching the MySQL host and port configuration to Kafka or using some kind of plug-in) or do I have to write my own adapter?
(I'm using PHP with Laravel for my app)
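For readers trying to picture the "queue all the writes" idea independently of Kafka or Redis, here is a minimal sketch. It assumes an in-process `queue.Queue` as a stand-in for the topic/stream and an in-memory SQLite database as a stand-in for the MySQL write endpoint; the `events` table and the `enqueue_write` / `drain` names are hypothetical. It only illustrates the shape of the pattern (the request enqueues, a single consumer applies) and is not a drop-in adapter.

```python
import queue
import sqlite3

# Stand-in for a Kafka topic or Redis stream (assumption: in-process queue).
write_queue = queue.Queue()


def enqueue_write(sql, params=()):
    """Called from the web request: records the write and returns immediately."""
    write_queue.put((sql, params))


def drain(conn):
    """Called by the single consumer: applies all queued writes in order."""
    applied = 0
    while True:
        try:
            sql, params = write_queue.get_nowait()
        except queue.Empty:
            break
        conn.execute(sql, params)
        applied += 1
    conn.commit()
    return applied


# Stand-in for the MySQL write endpoint (assumption: in-memory SQLite).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (user_id INTEGER, action TEXT)")

enqueue_write("INSERT INTO events VALUES (?, ?)", (1, "login"))
enqueue_write("INSERT INTO events VALUES (?, ?)", (2, "purchase"))
applied_count = drain(db)
row_count = db.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

Because only the consumer talks to the database, the number of write connections stays fixed no matter how many requests enqueue work. The trade-off is that a request no longer knows whether its write has been applied yet.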
Related
I am starting to plan a multi-region (us-east and us-west) web app backed by an AWS RDS MySQL database. I am going to host this on AWS. Can any AWS guru clarify my concern?
I will use Multi-AZ for redundancy/high availability, and read replicas across regions for faster read request processing.
My concern/question:
If the master DB instance is in us-west, and write requests from app servers in us-east are routed to the DB endpoint in us-west, will this cause lag in the app, or is this how most AWS users do it?
The read instances local to the app servers are not for writes.
You can't defeat the speed of light.
Having a server write to the database that may be 80ms away may not result in acceptable performance. Only you can determine this.
You run into the same issue if you use MySQL replication across regions.
Now, if you just want to have read replicas across regions, with all writes directed to a single region, you can probably make that work.
If you really need a fast, globally distributed database, consider using something like DynamoDB.
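The "read replicas in each region, all writes to a single region" arrangement mentioned above can be sketched as a small routing function. This is a minimal illustration only: the host names are hypothetical, and classifying a statement by checking for a leading `SELECT` is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Endpoints:
    primary: str    # the single region that accepts writes
    replicas: dict  # region name -> read replica local to that region


def pick_endpoint(sql, app_region, endpoints):
    """Send SELECTs to the replica in the caller's region, everything else to the primary."""
    is_read = sql.lstrip().upper().startswith("SELECT")
    if is_read:
        # Fall back to the primary if this region has no replica.
        return endpoints.replicas.get(app_region, endpoints.primary)
    return endpoints.primary


# Hypothetical host names.
endpoints = Endpoints(
    primary="db-primary.us-west.example.internal",
    replicas={
        "us-west": "db-replica.us-west.example.internal",
        "us-east": "db-replica.us-east.example.internal",
    },
)

read_host = pick_endpoint("SELECT * FROM users", "us-east", endpoints)
write_host = pick_endpoint("INSERT INTO users (name) VALUES ('a')", "us-east", endpoints)
```

Writes from us-east still pay the cross-region round trip, and reads from a replica can lag behind the primary, so a read issued right after a write may not see it.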
I'm using the autobahn server in twisted to provide an RPC API. Some calls require queries to the database and multiple clients may be connected via websocket to the server.
I am using the SqlAlchemy ORM to access the database.
What are the pros and cons of the two following approaches for dealing with SqlAlchemy sessions.
1. Create and destroy a session for every RPC call
2. Create a single session when the server starts and use it in every RPC call
Which would you recommend and why? (I'm leaning towards 2)
The recommended way of doing SQL-based database access from Twisted (and Autobahn) with databases like PostgreSQL, Oracle or SQLite would be twisted.enterprise.adbapi.
twisted.enterprise.adbapi will run queries on a background thread pool, which is required, since most database drivers are blocking.
Sidenote: for PostgreSQL, there is also a natively asynchronous, non-blocking driver: txpostgres.
Now, if you put an ORM like SQLAlchemy on top of the native SQL driver, I'm not sure how this will work together (if at all) with twisted.enterprise.adbapi.
So, from the options you mention:
1. Is a no-go, since most drivers are blocking (and Autobahn's RPCs run on the main thread, i.e. the Twisted reactor thread, which you MUST NOT block).
2. With this, you need to put the database session(s) in background threads (again, so as not to block).
Also see here.
If you're using SQLAlchemy and Twisted together, consider using Alchimia rather than the built-in adbapi.
So I have a small game in Node.js (only the server, of course) which has map data and player accounts stored in a MySQL database. Right now it's built to minimize the number of queries: it loads data from the database, keeps it in JavaScript objects/arrays or whatever seems appropriate, and only writes to the database when needed.
Now I was thinking: is this really worth it? In many cases it would be a lot better (the data would be safer and WAY more up to date) to keep hardly any data in the server and just load it from the database when needed (and write to it whenever something changes).
My question is: is it efficient/safe/advisable to have the server read from and write to the database often, rather than keeping data from the database in JavaScript variables in the server?
Additional info:
-The Node.js server and my MySQL server are on the same machine, and a query usually takes less than 1 ms, or maybe 3 ms for big queries like loading room data.
-I am using a module simply called mysql.
-If needed I will include extra info, just ask in a comment.
It really depends on your use case. Generally speaking, I would not add another layer of caching in Node.js but handle that in your database with a bigger cache and optimized queries.
Question 1:
I am using MySQL Connector/J to connect to MySQL. I am creating a connection for every request, so I need a connection pool. Should I choose c3p0, or can I use the MysqlConnectionPool class provided by the connector library?
Question 2:
I may need to load balance / fail over between two MySQL database servers. I could use jdbc:mysql://host,host2/dbname to fail over automatically. I want to use a connection pool and failover in combination. How should I achieve this?
I'd recommend using c3p0 or something similar. It'll integrate into a Java EE app server better, and it's database agnostic.
Your second question is a good deal more complicated. Load balancing is usually done with an appliance of some kind, like an F5 or ACE, that stands between the client and the load balanced instances. Is that how you're doing it? How do you plan to keep the data in synch if you load balance between the two? If the connections aren't "sticky", you'll expect to find INSERTed data in both instances.
Maybe this reference can help you get started:
http://www.howtoforge.com/loadbalanced_mysql_cluster_debian
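As a rough illustration of combining a pool with failover, here is a generic sketch. It is not how c3p0 or Connector/J implement it. It assumes a fixed-size pool whose connections are opened against the first host that accepts them; `FailoverPool` and `fake_connect` are hypothetical names, and the "connection" is just a dict so the example runs without a database.

```python
import queue


class FailoverPool:
    """Fixed-size pool whose connections come from the first reachable host."""

    def __init__(self, hosts, connect, size=2):
        self._hosts = list(hosts)
        self._connect = connect  # callable: host -> connection, raises ConnectionError on failure
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(self._open())

    def _open(self):
        last_error = None
        for host in self._hosts:  # try hosts in priority order
            try:
                return self._connect(host)
            except ConnectionError as exc:
                last_error = exc
        raise ConnectionError("no database host reachable") from last_error

    def acquire(self):
        """Borrow a connection; blocks while all connections are in use."""
        return self._idle.get()

    def release(self, conn):
        """Return a borrowed connection to the pool."""
        self._idle.put(conn)


# Hypothetical connect function: "host1" is down, "host2" is up.
def fake_connect(host):
    if host == "host1":
        raise ConnectionError(f"{host} is down")
    return {"host": host}


pool = FailoverPool(["host1", "host2"], fake_connect, size=2)
conn = pool.acquire()
connected_host = conn["host"]
pool.release(conn)
```

This only fails over when a connection is opened. A real pool would also need to detect connections that die while idle and to decide when to fall back to the preferred host.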
I come from the cliché land of PHP and MySQL on Dreamhost. BUT! I am also a JavaScript genie and I've been dying to get on the Node.js train. In my reading I've inadvertently discovered a NoSQL solution called Redis!
With my shared web host and limited server experience (I know how to install Linux on one of my old Dells and do some basic server admin), how can I get started using Redis and Node.js? And the next best question is: what does one even use Redis for? In what situations would Redis be better suited than MySQL? And does Node.js remove the necessity for Apache? If so, why do developers recommend using an NGINX server?
Lots of questions, but there doesn't seem to be a solid source out there with this info all in one place!
Thanks again for your guidance and feedback!
NoSQL is just an inadequate buzz word.
I'll attempt to answer the latter part of the question.
Redis is a key-value store database system. Speed is its primary objective, so most of its use comes from event driven implementations (as it goes over in its reddit tutorial).
It excels at areas like logging, message transactions, and other reactive processes.
Node.js on the other hand is mainly for independent HTTP transactions. It is basically used to serve content (much like a web server, but Node.js really wouldn't be necessarily public facing) very fast which makes it useful for backend business logic applications.
For example, having a C program calculate stock values and having Node.js serve the content for another internal application to retrieve or using Node.js to serve a web page one is developing so one's coworkers can view it internally.
It really excels as a middleman between applications.
Redis
Redis is an in-memory datastore: all your data is stored in memory, meaning that a huge database means huge memory usage, but with really fast access and lookup.
It is also a key-value store: you don't have any relationships, or queries to retrieve your data. You can only set a key-value pair and retrieve it by its key. (Redis also provides useful types such as sets and hashes.)
These particularities make Redis really well suited for storing sessions in a web application, creating indexes on a database, and handling real-time data like analytics.
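To make the key-value model and the session use case concrete, here is a minimal in-memory stand-in written with the standard library only. It is not Redis and does not use a Redis client; the `SessionStore` class and the `session:abc123` key are hypothetical. It mimics setting a key with an expiry and getting it back by key, which in Redis corresponds roughly to SET with an expiry followed by GET.

```python
import time


class SessionStore:
    """In-memory key-value store with per-key expiry (a stand-in, not Redis)."""

    def __init__(self):
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_seconds):
        self._data[key] = (time.monotonic() + ttl_seconds, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # lazily evict expired entries
            return None
        return value


store = SessionStore()
store.set("session:abc123", {"user_id": 42}, ttl_seconds=1800)
session = store.get("session:abc123")
missing = store.get("session:does-not-exist")
```

There are no joins or ad hoc queries here: the only way to reach a value is through its key, which is what makes lookups so cheap.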
So if you need something that will "replace" MySQL for storing your basic application models, I suggest you try something like MongoDB, Riak or CouchDB, which are document stores.
Document stores manage your data as something analogous to JSON objects (I know it's a huge shortcut).
Read this article if you want to know more about popular nosql databases.
Node.js
Node.js provides asynchronous I/O for the V8 JavaScript engine.
When you run a Node server, it listens on a port on your machine (e.g. 3000). It does not do any sort of domain name resolution or virtual host handling, so you have to put an HTTP server acting as a proxy, such as Apache or nginx, in front of it.
Choosing nginx over Apache in production is a matter of performance, and I find it easier to use. But I suggest you use the one you're most comfortable with.
To get started with it just install them and start playing with it. HowToNode
You can get a free plan from https://redistogo.com/ - it is a hosted redis database instance.
Quick intro to redis data types and basic commands is available here - http://redis.io/topics/data-types-intro.
A good comparison of when to use what is here - http://playbook.thoughtbot.com/choosing-platforms/databases/