How to deploy an eth2 RPC node to obtain transactions?

I want to deploy an Ethereum RPC node, and I only need to obtain transaction data related to my own addresses. Is Geth alone enough, or do I also need Prysm?
I currently get the transaction data this way:
https://github.com/Adamant-im/ETH-transactions-storage
I found this method in "How to get Ethereum transaction list by address".
I also read the introduction to synchronization modes here: https://ethereum.org/en/developers/docs/nodes-and-clients/#sync-modes
There are several modes as follows:
Full sync / Fast sync / Light sync / Snap sync / Optimistic sync / Checkpoint sync
After reading the descriptions, I still don't know which one is the most suitable for only obtaining transaction data.
I'm a novice, so please give me an example of how to start the functionality I need.
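Since the merge, Geth on its own can no longer follow the chain; it has to be paired with a consensus client such as Prysm, so in practice you run both. For transaction data you generally don't need an archive node: Geth's default snap sync keeps all block bodies, which is what an indexer like the one linked above reads. Once the node is synced and HTTP-RPC is enabled (Geth's --http flag), getting "transactions related to me" boils down to walking blocks and filtering by address. A minimal sketch with web3.py, where the RPC URL and address are placeholders:

    from web3 import Web3

    # Assumes a locally synced Geth node with HTTP-RPC enabled (--http) on the
    # default port 8545; the address below is a placeholder for your own.
    w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
    my_address = "0x0000000000000000000000000000000000000000".lower()

    def my_transactions(start_block, end_block):
        """Walk blocks and keep only transactions sent from or to my_address."""
        for number in range(start_block, end_block + 1):
            block = w3.eth.get_block(number, full_transactions=True)
            for tx in block.transactions:
                sender = tx["from"].lower()
                receiver = (tx["to"] or "").lower()  # "to" is None for contract creation
                if my_address in (sender, receiver):
                    yield tx

    latest = w3.eth.block_number
    for tx in my_transactions(latest - 100, latest):
        print(tx["hash"].hex(), tx["from"], "->", tx["to"], tx["value"])

This is essentially what ETH-transactions-storage automates, storing the results in its own database so you can query them by address later.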

Related

Is AWS Aurora Multi-Master cluster suitable for WordPress

I have a situation where, during peak moments, my writer database becomes maxed out even on the largest 96-core AWS instance (due to limited-edition promotions where we process hundreds of orders per second).
I have seen that Aurora offers a multi-master setup where all nodes of the cluster are able to write - https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-multi-master.html
In the docs they mention:
If two DB instances attempt to modify the same data page at almost the same instant, a write conflict occurs. The earliest change request is approved using a quorum voting mechanism. That change is saved to permanent storage. The DB instance whose change isn't approved rolls back the entire transaction containing the attempted change. Rolling back the transaction ensures that data is kept in a consistent state, and applications always see a predictable view of the data. Your application can detect the deadlock condition and retry the entire transaction.
I am not really sure what they mean here by "data page". I am pretty sure WordPress doesn't use transactions at all, but when thousands of orders are coming in and being inserted into the same table, will this cause write errors that make orders fail?
I have looked online and cannot find anyone talking about using WordPress with Aurora multi-master cluster. Is it compatible?
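The quoted paragraph is the key point: conflicting writes surface to the application as deadlock errors, and the docs expect the application to retry the whole transaction. As far as I know, stock WordPress/WooCommerce does not do this for you, so you would need it in a plugin or custom code. A rough sketch of that retry pattern with PyMySQL; the endpoint, credentials, and the wp_orders table and columns are all made up:

    import time
    import pymysql

    # Hypothetical connection and schema: host, credentials, and the wp_orders
    # table/columns are placeholders, not real WordPress internals.
    conn = pymysql.connect(host="my-cluster-endpoint", user="app", password="secret",
                           database="wordpress", autocommit=False)

    def insert_order_with_retry(order_id, total, retries=3):
        """Retry the whole transaction when Aurora reports a write conflict,
        which the client sees as a deadlock (MySQL error 1213)."""
        for attempt in range(retries):
            try:
                with conn.cursor() as cur:
                    cur.execute(
                        "INSERT INTO wp_orders (order_id, total) VALUES (%s, %s)",
                        (order_id, total),
                    )
                conn.commit()
                return
            except pymysql.MySQLError as exc:
                conn.rollback()
                if exc.args[0] != 1213 or attempt == retries - 1:
                    raise                        # a different error, or out of retries
                time.sleep(0.05 * (attempt + 1)) # brief backoff before retrying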

Push Data onto Queue vs Pull Data by Workers

I am building a web site backend that involves a client submitting a request to perform some expensive (in time) operation. The expensive operation also involves gathering some set of information for it to complete.
The work that the client submits can be fully described by a uuid. I am hoping to use a service-oriented architecture (SOA), i.e. multiple microservices.
The client communicates with the backend using RESTful communication over HTTP. I plan to use a queue that the workers performing the expensive operation can poll for work. The queue has persistence and offers decent reliability semantics.
One consideration is whether I gather all of the data needed for the expensive operation upstream and then enqueue all of that data or whether I just enqueue the uuid and let the worker fetch the data.
The two architectures under consideration are push-based (i.e. gather the data upstream and enqueue it with the work item) and pull-based (i.e. enqueue only the uuid and let the worker gather the data); the diagrams are omitted here.
Some things that I have thought of:
In the push-based case, I would likely be blocking while I gathered the needed data, so the client's HTTP request would not be responded to until the data is gathered and then enqueued. From a UI standpoint, the request would be pending until the response comes back.
In the pull-based scenario, only the worker needs to know what data is required for the work. That means I can have multiple types of clients talking to various backends. If the data needs change, I update just the workers and not each of the upstream services.
Anything else that I am missing here?
Another benefit of the pull-based approach is that you don't have to worry about the data getting stale in the queue.
I think you already pretty much explained that the second (pull-based) approach is better.
If a user's request is going to be processed asynchronously anyway, why wait for the data to be gathered before returning a response? You just need to enqueue a work item and return an HTTP response.
Passing data via the queue is not a good option either. If you gather the data upstream, you will have to pass it to the worker somehow other than via the queue (usually blob storage). That is additional work that is not really needed in your case.
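To make the pull-based flow concrete, here is a rough sketch. The queue, db, and message objects stand in for whatever queue client and storage you end up using; their method names here are invented:

    import json
    import uuid

    def handle_submit(payload, queue, db):
        """HTTP handler: persist the request, enqueue only its id, respond immediately."""
        request_id = str(uuid.uuid4())
        db.save_request(request_id, payload)                    # source of truth
        queue.publish(json.dumps({"request_id": request_id}))   # tiny message, no payload
        return {"status": "accepted", "request_id": request_id}

    def fetch_request_data(request_id, db):
        # Worker-side gathering: only the worker knows what data it needs.
        return db.load_request(request_id)

    def worker_loop(queue, db, run_expensive_operation):
        """Worker: pull a uuid, gather the data itself, then do the work."""
        while True:
            message = queue.receive()                            # blocks/polls for work
            request_id = json.loads(message.body)["request_id"]
            data = fetch_request_data(request_id, db)
            run_expensive_operation(data)
            message.ack()                                        # acknowledge only after success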
I would recommend Cadence Workflow instead of queues, as it supports long-running operations and state management out of the box.
Cadence offers a lot of other advantages over using queues for task processing:
Built-in exponential retries with an unlimited expiration interval.
Failure handling. For example, it allows executing a task that notifies another service if both updates couldn't succeed within a configured interval.
Support for long-running, heartbeating operations.
Ability to implement complex task dependencies. For example, chaining of calls or compensation logic in case of unrecoverable failures (SAGA).
Complete visibility into the current state of the update. With queues, all you know is whether there are messages in the queue, and you need an additional DB to track overall progress; with Cadence, every event is recorded.
Ability to cancel an update in flight.
See the presentation that goes over the Cadence programming model.

Node.js system requirements for 50,000 concurrent connections

The situation is that about 50,000 electronic devices are going to connect to a web service created in Node.js once per minute. Each one is going to send a POST request containing some JSON data.
All this data should be secured.
The web service is going to receive those requests, saving the data to a database.
Read requests to fetch some data from the DB are also possible.
I think to build up a system based on the following infrastructure:
Node.js + memcached + (mysql cluster OR Couchbase)
So, what memory requirements do I need for my web server to be able to handle all these connections? Suppose that, in the pessimistic case, I would have 50,000 concurrent requests.
And what if I use SSL to secure the connections? Does that add too much overhead per connection?
Should I scale the system to handle them?
What do you suggest?
Many thanks in advance!
Of course, it is impossible to provide any precise calculations, since this is always very specific. I would recommend that you simply design a scalable and expandable system architecture from the very beginning, and use JMeter (https://jmeter.apache.org/) for load testing. Then you will be able to scale from thousands of connections upwards.
Here is an article about handling 1,000,000 connections: http://www.slideshare.net/sh1mmer/a-million-connections-and-beyond-nodejs-at-scale
Remember that your Node.js application will be single-threaded, meaning your performance will degrade badly as you increase the number of concurrent requests.
What you can do to increase performance is create a Node process for each core on your machine, all of them behind a proxy (say nginx), and you can also use multiple machines for your app.
If you make requests only to memcached, your API won't degrade. But once you start querying MySQL, it will start throttling your other requests.
Edit:
As suggested in the comments, you could also use the cluster module to fork worker processes and let them compete amongst each other for incoming requests. (Workers run as separate processes, thereby allowing you to use all cores.)
Node.js on multi-core machines

Starting MySQL Cluster

I'm new to clustering and I'm doing a project on clustered databases. I want to make use of MySQL Cluster. I'm using it for a small-scale database, and this is my plan:
5 nodes:
1 management node,
2 SQL nodes,
2 API nodes.
My questions are:
1) Is my plan for the node process alright?
2) What should I do when I get the error "Failed to allocate node id..."?
3) Is it a requirement to use multi-threaded data node?
4) Where do I place my web server page for the user to access the database?
Please reply. Thank you so much.
This answer might be a little late but can be helpful for someone else getting started:
1) Is my plan for the node process alright?
Your plan for the node processes is OK for a small cluster. I would recommend adding an additional management node and two more data nodes if the number of replicas you have configured is 2. The reason is that, with only two data nodes, your cluster will not be functional should one of those nodes die. This is because of the two-phase commit that takes place: in the case of a failure, only one data node would be able to persist the data; the other one would be unreachable, and therefore the transaction would be marked as incomplete.
2) What should I do when I get the error "Failed to allocate node id..."?
This error is usually thrown if you have assigned the same node id to more than one node in your configuration file. Each node should have a unique id.
3) Is it a requirement to use multi-threaded data node?
It is not a requirement, but it is recommended. Using the multi-threaded data node (ndbmtd) lets you take advantage of modern multi-CPU hardware so that your data-processing queries are handled in parallel. As a result, updates and queries will complete much faster.
4) Where do I place my web server page for the user to access the database?
Hmm, I'm not sure what you want to achieve here; that would really be a separate question. Whether you are using PHP or any other language, you will need a web server configured. Place your pages in the root of the HTTP document directory to get started.

Handling database create / update operation for concurrent REST API calls in Django

This is more of a follow-up thread, though the question is very generic. I have a Django web application with a REST API (implemented with Tastypie), accessed through an Apache web server.
I am adding API call logging functionality so that, for every call to the web application, I will create an entry in a specific application log table in a MySQL database.
User base of this application is limited. I am not expecting large volume of concurrent API calls at this point or near future.
I have the following options:
1. multithread locking
2. multiprocess locking mechanisms
3. ORM transaction or database locking
I am not sure whether I should use any locking feature at all for wrapping the MySQL entry creation / update operations.
How are these kinds of cases typically dealt with for large volumes of concurrent POST API calls in Django web applications?
Thanks,
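For what it's worth, a plain per-request insert usually needs no application-level locking at all: each ORM create() is a single INSERT that MySQL serializes on its own, and every request runs on its own database connection. Locking or atomic UPDATEs only matter if concurrent requests modify the same row (for example an aggregate counter). A minimal sketch along those lines; all model and field names here are made up:

    from django.db import models, transaction
    from django.db.models import F

    class ApiCallLog(models.Model):
        # Hypothetical log table: one row per API call.
        path = models.CharField(max_length=255)
        user_id = models.IntegerField(null=True)
        created_at = models.DateTimeField(auto_now_add=True)

    class ApiCallCounter(models.Model):
        # Hypothetical aggregate row that concurrent requests may both update.
        path = models.CharField(max_length=255, unique=True)
        calls = models.IntegerField(default=0)

    def log_api_call(request):
        # A plain insert: safe under concurrent requests without explicit locking.
        ApiCallLog.objects.create(path=request.path,
                                  user_id=getattr(request.user, "id", None))

        # If you also keep an aggregate, push the increment into the database as a
        # single atomic UPDATE (F expression) instead of read-modify-write in Python.
        with transaction.atomic():
            ApiCallCounter.objects.filter(path=request.path).update(calls=F("calls") + 1)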