firebase functions postgresql max_connections best practice - google-cloud-functions

Could not find any answer to this question.
A Google Cloud Function eats up all available PostgreSQL connections.
Library used: pg#8.8.0
Firebase reports 20 active users and PostgreSQL is up to 100 parallel connections.
// `pool` is assumed to be a pg Pool created at module scope
const client = await pool.connect();
try {
    await client.query(`INSERT INTO ${table} (id, update_time, doc)
        VALUES
        ($1, NOW(), $2)
        ON CONFLICT (id) DO UPDATE
        SET update_time = excluded.update_time,
            doc = excluded.doc;`, [documentId, document]);
} finally {
    // Release even if the query throws, otherwise the connection leaks.
    client.release();
}
Am I doing something wrong?
How can I detect if Google spins up multiple instances of the same function?
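A minimal sketch of one common mitigation, assuming the pool lives at module scope: cap the pool size per instance with pg's max option, and log an id generated once per instance to see how many instances Google is running. The function name upsertDoc, the DATABASE_URL variable, and the numbers are illustrative placeholders, not a definitive fix.

const crypto = require('crypto');
const { Pool } = require('pg');

// One id and one pool per function instance (module scope), reused across invocations.
const instanceId = crypto.randomUUID();
const pool = new Pool({
    connectionString: process.env.DATABASE_URL, // assumed to be configured
    max: 3,                   // at most 3 connections per instance (pg default is 10)
    idleTimeoutMillis: 30000  // drop idle connections so they are returned sooner
});

exports.upsertDoc = async (req, res) => {
    // Distinct instanceId values in the logs mean Google spun up multiple instances,
    // each holding its own pool of up to `max` connections.
    console.log('instance', instanceId);
    const { rows } = await pool.query('SELECT NOW() AS now'); // placeholder query
    res.send({ instanceId, now: rows[0].now });
};

With max: 3 and N concurrent instances, the upper bound on PostgreSQL connections is roughly 3 * N, instead of 10 * N with the pg default.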

Related

Pyramid + SQLAlchemy + Zope App returns wrong results with raw SQL

I have a Pyramid 2.X + SQLAlchemy + Zope App created using the official CookieCutter.
There is a table called "schema_b.table_a" with 0 records.
In the view below, count(*) should be more than 0, but it returns 0:
@view_config(route_name='home', renderer='myproject:templates/home.jinja2')
def my_view(request):
    # Call external REST API. This uses HTTP requests. The API inserts in schema_b.table_a
    call_thirdparty_api()
    mark_changed(request.dbsession)
    sql = "SELECT count(*) FROM schema_b.table_a"
    total = request.dbsession.execute(sql).fetchone()
    print(total) # Total is 0
    return {}
On the other hand, the following code returns the correct count(*):
@view_config(route_name='home', renderer='myproject:templates/home.jinja2')
def my_view(request):
    engine = create_engine(request.registry.settings.get("sqlalchemy.url"), poolclass=NullPool)
    connection = engine.connect()
    # Call external REST API. This uses HTTP requests. The API inserts in table_a
    call_thirdparty_api()
    sql = "SELECT count(*) FROM schema_b.table_a"
    total = connection.execute(sql).fetchone()
    print(total) # Total is not 0
    connection.invalidate()
    engine.dispose()
    return {}
It seems that request.dbsession is not able to see the data inserted by the external REST API, but it is not clear to me why or how to correct it.
Pyramid and Zope provide transaction managers that extend transactions far beyond databases. In your example, I think a transaction was started in MySQL when the request was received on the server, via the pyramid_tm package; its documentation states:
"At the beginning of a request a new transaction is started using the request.tm.begin() function."
https://docs.pylonsproject.org/projects/pyramid_tm/en/latest/index.html
Because MySQL supports consistent nonblocking reads, the transaction you join when calling request.dbsession.execute queries a snapshot of the database taken at the start of that transaction. When you use a plain SQLAlchemy engine to execute the query, a new transaction is created and the expected result is returned.
https://dev.mysql.com/doc/refman/8.0/en/innodb-consistent-read.html
This is very confusing in this situation. But I must admit it's impressive how well it seems to work.

Import CSV MySQL Workbench Mac Catalina Very Slow

I have a MacBook Pro 2019 (Touch Bar), i5, 8 GB RAM, 256 GB HD with 75 GB used, OS X 10.15.2. Installed AMPPS 3.9 and MySQL Workbench 8.0.
I'm importing into localhost a MySQL table from a 1.9 GB CSV with 15 million records; after 12 hours only 3 million have been imported.
I already tried the XAMPP VM, and in the MariaDB console LOAD DATA INFILE failed; it disconnects from the database.
I'm going to process the table with PHP. Any help or recommendation to make these processes faster?
I use Sequel Pro to read remote databases, very fast, but it no longer works for localhost.
I had a similar problem importing a CSV file (approx. 5 million records) into a MySQL table. I managed to do that using Node.js, as it is able to open and read a file line by line (without loading the entire file into RAM). The script reads 100 lines, then makes an INSERT, then reads another 100 lines, and so on. It processed the entire file in about 9 minutes. It uses the readline module. The main part of the script looks something like this:
const Fs = require('fs');
const Mysql = require('mysql2');
const Readline = require('readline');

const fileStream = Fs.createReadStream('/path/to/file');

// .promise() wraps the connection so query() can be awaited
var dbConnection = Mysql.createConnection({
    host     : "yourHost",
    user     : "yourUser",
    password : "yourPassword",
    database : "yourDatabase"
}).promise();

const rl = Readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity
});

async function run() {
    var lineElements = [];
    for await (const line of rl) {
        // Each line of the file is successively available here as `line`.
        lineElements = line.split(",");
        // now lineElements is an array containing all the values of the current line
        // here you can queue up multiple lines, to make a bigger insert
        // or just insert line by line
        await dbConnection.query('INSERT INTO ..........');
    }
    await dbConnection.end();
}
run();
The script above inserts one line per query. Feel free to modify it if you want a query to insert 100 lines or any other number, as in the sketch below.
As a side note: because my file was "trusted" I did not use prepared statements, as I think a simple query is faster. I do not know if the speed gain was significant, as I did not run any tests.
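A minimal sketch of that batched variant, reusing dbConnection and rl from the script above; the table name my_table and its columns are hypothetical placeholders, and it relies on the multi-row VALUES ? placeholder form that mysql2 supports:

async function runBatched() {
    const batchSize = 100;
    let batch = [];

    const flush = async () => {
        if (batch.length === 0) return;
        // mysql2 expands `VALUES ?` with a nested array into a multi-row insert
        await dbConnection.query(
            'INSERT INTO my_table (col_a, col_b, col_c) VALUES ?',
            [batch]
        );
        batch = [];
    };

    for await (const line of rl) {
        batch.push(line.split(','));    // one array of column values per row
        if (batch.length >= batchSize) {
            await flush();
        }
    }
    await flush();                      // insert whatever is left over
    await dbConnection.end();
}
runBatched();

Fewer round trips usually dominate the import time, so 100 rows per INSERT is a reasonable starting point to tune.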

Loading data to Big Query from Local Database

I want to start a data warehouse in Google BigQuery, but I'm not sure how to actually schedule jobs to get the data into the cloud.
To give some background:
I have a MySQL database hosted on-prem which I currently take a dump of each night as a backup. My idea is that I can send this dump to Google Cloud and have the data imported into BigQuery.
I thought I could send the dump and probably use a Cloud Scheduler function to then run something that opens the dump and does this, but I'm unsure how these services all fit together.
I'm a bit of a newbie with Google Cloud, so if there is a better way to achieve this then I'm happy to change my plan of action.
Thanks in advance.
As the new EXTERNAL_QUERY function has been launched, letting you query a Cloud SQL instance from BigQuery, your best shot right now is:
Set up a replica from your current instance to a Cloud SQL instance, following this guide.
Understand how Cloud SQL federated queries let you query Cloud SQL instances from BigQuery.
This way you get live access to your relational database:
Example query that you run on BigQuery:
SELECT * FROM EXTERNAL_QUERY(
    'connection_id',
    '''SELECT * FROM mysqltable AS c ORDER BY c.customer_id''');
You can even join a BigQuery table with a Cloud SQL table:
Example:
SELECT c.customer_id, c.name, SUM(t.amount) AS total_revenue,
    rq.first_order_date
FROM customers AS c
INNER JOIN transaction_fact AS t ON c.customer_id = t.customer_id
LEFT OUTER JOIN EXTERNAL_QUERY(
    'connection_id',
    '''SELECT customer_id, MIN(order_date) AS first_order_date
    FROM orders
    GROUP BY customer_id''') AS rq ON rq.customer_id = c.customer_id
GROUP BY c.customer_id, c.name, rq.first_order_date;
In order to achieve this you will need to create a Cloud Storage bucket by running
gsutil mb gs://BUCKET_NAME
After creating the bucket you need to create a Cloud Function triggered by the bucket using the finalize event.
You can follow this sample function:
'use strict';

const { Storage } = require('@google-cloud/storage');
const { BigQuery } = require('@google-cloud/bigquery');

// Instantiates the clients
const storage = new Storage();
const bigquery = new BigQuery();

/**
 * Creates a BigQuery load job to load a file from Cloud Storage and write the data into BigQuery.
 *
 * @param {object} data The event payload.
 * @param {object} context The event metadata.
 */
exports.loadFile = (data, context) => {
    const datasetId = 'Your_Dataset_name';
    const tableId = 'Your_Table_ID';
    const jobMetadata = {
        skipLeadingRows: 1,
        writeDisposition: 'WRITE_APPEND'
    };

    // Loads data from a Google Cloud Storage file into the table
    bigquery
        .dataset(datasetId)
        .table(tableId)
        .load(storage.bucket(data.bucket).file(data.name), jobMetadata)
        .catch(err => {
            console.error('ERROR:', err);
        });

    console.log(`Loading from gs://${data.bucket}/${data.name} into ${datasetId}.${tableId}`);
};
Then create your BigQuery dataset and table using your desired schema.
Now you can upload your CSV file into the bucket and you will see the uploaded data in BigQuery.
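A minimal sketch of that dataset and table setup step with the Node.js client, assuming an illustrative CSV layout; the column names and types below are placeholders, not the schema from the question:

const { BigQuery } = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

async function setup() {
    // Dataset the Cloud Function above loads into
    const [dataset] = await bigquery.createDataset('Your_Dataset_name');

    // Destination table with a schema matching the CSV columns
    const [table] = await dataset.createTable('Your_Table_ID', {
        schema: [
            { name: 'id', type: 'INTEGER' },
            { name: 'name', type: 'STRING' },
            { name: 'created_at', type: 'TIMESTAMP' }
        ]
    });

    console.log(`Created ${dataset.id}.${table.id}`);
}

setup().catch(console.error);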

Passing C++ variable in MySQL Statement using MySQL Driver in MySQL 8.0+

I have looked through Stack Overflow's plethora of answers and questions, but they are all for older versions of MySQL. I have also scoured the bowels of the internet for an answer to this and tried numerous different methods, to no avail. So the question: how do I use a C++ variable in a MySQL query?
For instance:
pstmt = con->prepareStatement("SELECT balance FROM accounts WHERE id = [C++ Var]");
Where [C++ Var] would be the C++ variable. I am not sure if it is a good idea to use older methods from MySQL 5 in MySQL 8 or not. I think the most effective way would be to use the SET @var_name = data method, but I cannot find any way to implement that in C++. Currently I am using MySQL Connector/C++ 8.0 with the driver. Thank you!
I don't see why you need to use a MySQL @xxx variable here.
Using the old MySQL C++ API (which seems to be modeled after the Java JDBC API), the prepareStatement call should look like:
pstmt = con->prepareStatement("SELECT balance FROM accounts WHERE id = ?");
And later on you should use something like
pstmt->setInt(1, userId); // where userId is a C++ int
ResultSet *res = pstmt->executeQuery();
Using the newer C++ API (X Dev API) the calls would look similar to the following:
Table accounts = db.getTable("accounts");
auto query = accounts.select("balance").where("id = :account_id");
RowResult res = query.bind("account_id", account_id).execute(); // where account_id is the name of an int variable.

Pass custom variables in MySQL connection

I am setting up a MySQL connection (in my case PDO but it shouldn't matter) in a REST API.
The REST API uses internal authentication (username / password). There are multiple user groups accessing the REST API, e.g. customers, IT, backend, customer service. They all use the same MySQL connection in the end because they also use the same endpoints most of the time.
In the MySQL database I would like to save the user who is responsible for a change in a data set.
I would like to implement this in the MySQL layer through a trigger. So I have to pass the user information from the REST API to this trigger somehow. There are some MySQL calls like CURRENT_USER() or status that allow querying for meta-information. My idea was to somehow pass additional information in the connection string to MySQL, so that I don't have to use different database users but am still able to retrieve this information from within the trigger.
I have done some research and don't think it is possible, but since it would facilitate my task a lot, I still wanted to ask on SO if someone knows a solution to my problem.
I would set a session variable on connect.
Thanks to the comment from @Álvaro González for reminding me about running a command on PDO init.
The suggestion of adding data to a temp table isn't necessary. It's just as good to set one or more session variables, assuming you just need a few scalars.
$pdo = new PDO($dsn, $user, $password, [
    PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    PDO::MYSQL_ATTR_INIT_COMMAND => "SET @myvar = 'myvalue', @myothervar = 'othervalue'"
]);
It's also possible to set session variables at any time after connect, with a call to $pdo->exec().
$pdo->exec("SET @thirdvar = 1234");
You can read session variables in your SQL queries:
$stmt = $pdo->query("SELECT @myvar, @myothervar");
foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
    print_r($row);
}
You can also read session variables in triggers:
CREATE TRIGGER mytrig BEFORE INSERT ON mytable
FOR EACH ROW
    SET NEW.somecolumn = @myvar;