Measuring scalability of the web app hosted on cloud - mysql

I have developed a social networking site using the elgg framework and I am hosting it on amazon cloud (Amazon EC2, the free tier micro instance service) and thus develop a benchmark for it.I am creating around 200 columns describing each user (most of them dummy) and after that I should create around a million users with each users profile updated with some data.This is done to reflect the image of big data.When hosted on cloud we should measure the cloud's performance based on a query and an update action for all users. The problem is how to create so many users? Which tool would be optimal to choose? Done this, I should also consider storage on a file system(HDFS) and do the same with some modifications (The output should be a row and the input should be an unstructured data).
For elgg framework we are using mysql as backend. I have no idea how to start with it. Any suggestions would be really helpful.
Thank you.

I had to perform a similar task recently and came up with this script.. maybe it can help you

Related

Will GoogleBot's indexing cause CloudSQL to be expensive on a low traffic website (Afraid of Google's CloudSQL pricing)

here is the issue:
I used the CloudSQL price calculator to estimate the price of running a website, my website has 1000-2000 URLs and each URL will use the DB in some way, I don't have more than 1GB of data, and I mostly deal with reads, for a small 50k record table, nothing super-complicated, I don't currently have very complex queries either, and I write into the db only once a week maybe a couple records here and there, I've even considered SQLITE tbh.
I don't currently have a lot of traffic, maybe people come visit once a day, however, GoogleBot will continuously try to index the website via the sitemap, which causes some times lots of requests on the server.
Currently, I have a normal php+mysql website which does the job on a DigitalOcean instance, which doesn't take a lot of resources, however, I want to move to Cloud Run in order to try the Cloud Run technology, but running MySQL directly on the VM is discouraged (as per this question Should I run mysql on google cloud run? (or any database))
So I'm kind of afraid of using CloudSQL and then having GoogleBot destroying my credit card by doing lots of concurrent requests into the CloudSQL Database during daily indexation.
Traffic doesn't scare me (I don't have any), but crawlers do.
Should I use CloudSQL for this usecase?
Will my credit card be destroyed?
Are these valid concerns?
Any opinion from experienced CloudSQL Users would be greatly appreciated.
If you consider fully managed database instance Google Cloud is definitely good choice for you.
If you want to optimize GoogleBot crawling, you can do it from here
However, if you experience high server load from specific sites/services you may consider blocking them or using Google Cloud CDN caching
Please read this article will explain how to deal with heavy bot load on the website
Your concerns do not sound valid to me, since you can limit GoogleBot crawling rate.
Since Cloud Run is compute platform STATELESS container service, it is not suited to install MySQL. If you are searching to install your own MySQL server and manage it, you can do it on Cloud Compute Engine using one click solution from Marketplace

What should be concern when deploy an large-scale application?

I tried to set up a micro-blog web site with simple function.
And in the future, I would set some API for the mobile app.
The main feature is simple. People can register, post blog , tag articles, and comments.
Currently, I am using laravel framework + Mysql + Apache and host on VPS.
(Hardwere spec is HD:160 GB,CPU:8core,RAM:8 GB.)
The database tables are basic,including user,comments,article,tags,and tags pivot table.
Everything works fine.
But
I had a little concern about scalability and performance.
Since I have no any expereince about scale a web site.
Could someone give me some key concepts of what should I concern if the users numbers increase to 10,000 ~100,000?
I am OK with change my host platform or even change the framework and database at the begining.
All I try to avoid is that the web site might be crash after deploy a period of time.
The update and transfer would be a disaster.Thanks
Look into scalable cloud hosting, such as digital ocean or amazon, where you can scale your capacity as you grow.
These companies allow you to start small with a "slice" of a server and as you grow you can grow into multiple servers. Load balancing is usually done on their end so all you need to do is focus on your application

Hosted Database v Cloud Database

I have looked everywhere...
whats the difference between a hosted database and a cloud database they seem like the same things?
Thanks
Both "hosted database" and "cloud database" mean that the database lives on the servers of some external provider/hoster.
The hoster might even be the same in both cases.
The main difference is that the "cloud" plans are usually meant to scale more (at a higher monthly fee), so you'd use them when you expect your site to get huge soon and need to quickly adjust server capacity when needed.
On the other hand, "hosted" plans are not that expensive, but have more limitations (server speed, database size...) and are more suited for "small" websites.
Of course this isn't by any means an "official" description of the two terms, but that's the impression that I get every time I see "cloud" or "hosted" webspaces/databases/services/whatever.
It depends on the context in which they're being used, but, yes, they usually mean the same thing. When I see the term cloud database being used they are usually referencing some cloud platform like Amazon EC2 or Microsoft Azure instead of GoDaddy or HostGator or something. Plus, cloud is the new buzz word - I'm sure it sells better. Lol.
As Christian Specht said, the cloud servers really scale more. So why you need more scaling? and why there are many featured options in cloud database service selection?
Things are not like before. Before smartphones and earlier pc operating systems, users gets information from the server only when they log on the specific web page using their credentials. But now apps like facebook shows notifications, provide ads etc and collect/push other data in parallel while we are looking at something else irrelevant.
Hosted database are reliable to access the database when users log onto the web page. But in case of the lastest smart phone applications, it needs to access the database everytime starting from its birth (installation on the device). So for each installation, the minimum workload over the server is expected to raise up.
So more scalability is required here. More simultaneous connections, Input/Output operation requests are expected daily. So with the dedicated servers with the core purpose, and with the configurable package selection based on your expectation of user count and bandwidth usage, Cloud Service is not yet another marketing term, but is a helpful service.

Cloud service for large number of small MySQL databses?

I have an application which is going to be distributed to a hosting platform, most probably phpfog.
It is very similar to how WordPress.com operates, where each customer can host their own individual installation of the app on our servers. We host the 'work' files and provide the database (However, it is NOT WordPress; it's a custom app).
Each user of the application has their own separate MySQL database.
I am wondering what the most cost-effective service would be to provide this. It seems that most cloud services offer, for instance, one massive 50GB database. It is definitely conceivable that instead of an individual database, we have one huge one and prefix all the tables per user. But that seems really bloated and unwieldy. It's also not really possible without major structural changes to have one big database for everyone (And the same tables inside it for everyone) as the app is primarily designed to be standalone.
Each database really won't get that big. We are talking low GB - I'd suggest the biggest would be 5GB. However, there will be a LOT of them as obviously it's one per customer.
What would be the most cost- and performance-effective way of handling this?
Amazon RDS in fact provides a database server rather than an individual sales page; I misunderstood their offerings.
In this case, RDS is a drop-in replacement for existing MySQL databases and will work perfectly.

MySQL AND Filemaker Pro?

I have a client that wants to use Filemaker for a few things in their office, and may have me building a web app.
The last time I used, or thought about, or even heard of, Filemaker was about 10 years ago, and I seem to remember that I don't want to use it as the back end of a sophisticated web app, so I am thinking to try to sell them on MySQL.
However, will their Filemaker database talk to MySQL? Any idea how best to talk them down from Filemaker?
You may have a hard time talking them out of FileMaker, because it was actually a pretty clever tool for making small, in-house database applications, and it had a very loyal user base. But you're right--it's not a good tool for making a web application.
I had a similar problem with a client who was still using a custom dBase IV application. Fortunately, Perl's CPAN archive has modules for talking to anything. So I wrote a script that exported the entire dBase IV database every night, and uploaded it into MySQL as a set of read-only tables.
Unfortunately, this required taking MySQL down for 30 minutes every night. (It was a big database, and we had to convert free-form text to HTML.) So we switched to PostgreSQL, and performed the entire database update as a single transaction.
But what if you need read-write access to the FileMaker database? In that case, you've got several choices, most of them bad:
Build a bi-directional synchronization tool.
Get rid of FileMaker entirely. If the client's FileMaker databases are trivial, this may be relatively easy. I'd begin by writing a quick-and-dirty clone of their most important databases and demoing it to them in a web browser.
The client may actually be best served by a FileMaker-based web application. If so, refer them to Google.
But how do you sell the client on a given choice? It's probably best to lay out the costs and benefits of each choice, and let the client decide which is best for their business. You might lose the job, but you'll maintain a reputation for honest advice, and you won't get involved in a project that's badly suited to your client.
We develop solutions with both FileMaker and PHP/MySQL. Our recommendation is to do the web app in a web app optimised technology like MySQL.
Having said that, FileMaker does have a solid PHP API so if the web app has relatively lightweight demands (e.g. in house use) then use that and save yourself the trouble of synchronisation.
FileMaker's ESS technology let's FileMaker use an SQL db as the backend data source, which gives you 2 options:
Use ESS as a nice tight way to synchronise right within FileMaker - that way you'd have a "native" data source to work with within the FileMaker solution per se.
Use ESS to allow FileMaker to be used as a reporting/data mining/casual query and edit tool directly on the MySQL tables - it works sweet.
We've found building a sophisticated application in FileMaker with ESS/MySQL backend to be very tricky, so whether you select 1 or 2 from above depends on how sophisticated and heavy duty that FileMaker usage is.
Otherwise, SyncDek has a good reputation as a third party solution for automating Synchronisation.
I've been tackling similar problems and found a couple of solutions that emk hasn't mentioned...
FileMaker can link to external SQL data sources (ESS) so you can use ODBC to connect to a MySQL (or other) database and share data. You can find more information here. we tried it and found it to be pretty slow to be honest
Syncdek is a product that claims to allow you to perform data replication and data transmission between Filemaker, MySQL and other structured sources.
It is possible to use Filemaker's Instant Web Publishing as a web service that your app can then push and pull data through. We found a couple of wrappers for this in python and php
you can put a trigger in the FileMaker database so that every time a record is changed (or part of a record you are interest in) you can call a web service that updates a MySQL or memcached version of that data that your website can access.
I found that people like FileMaker because it gives them a very visual interface onto their data - it's very easy to make quite large self-contained applications without too much development knowledge. But, when it comes to collaboration with many users or presenting this data in a format other than the FileMaker application we found performance a real problem.