Using sqlalchemy to generate: SELECT * ... INTO OUTFILE "file"; - mysql

I have recently started using SQLAlchemy to query a MySQL database. I want to generate a SELECT statement that uses the "INTO OUTFILE <file>" syntax to export query results to a text file. For example:
SELECT *
FROM table
INTO OUTFILE '/tmp/export.txt';
Is there a way to generate the "INTO OUTFILE ..." clause using SQLAlchemy?
If not, can I subclass one of the SQLAlchemy classes so I can build that clause myself?
Thanks.

I did some thinking, poked around the examples on the SQLAlchemy site, and figured it out. (Also posted to the SQLAlchemy user recipes.)
from sqlalchemy import *
from sqlalchemy.sql.expression import Executable, ClauseElement
from sqlalchemy.ext import compiler

class SelectIntoOutfile(Executable, ClauseElement):
    def __init__(self, select, file):
        self.select = select
        self.file = file

@compiler.compiles(SelectIntoOutfile)
def compile_select_into_outfile(element, compiler, **kw):
    return "%s INTO OUTFILE '%s'" % (
        compiler.process(element.select), element.file
    )

# s (table metadata holder) and eng (the engine) come from elsewhere in my setup
e = SelectIntoOutfile(select([s.dim_date_table]).where(s.dim_date_table.c.Year == 2009), '/tmp/test.txt')
print(e)
eng.execute(e)


Hybrid property expression concatenation for both MySQL and SQLite

flask-sqlalchemy model:
from sqlalchemy import extract
from sqlalchemy.ext.hybrid import hybrid_property
from sqlalchemy.sql import func

from app import db

class Entry(db.Model):
    date_order = db.Column(db.Date, nullable=False)
    version_number = db.Column(db.Integer, nullable=False, default=1)

    @hybrid_property
    def display_name(self):
        return f"{self.date_order.year} ({self.version_number})"

    @display_name.expression
    def display_name(cls):
        return func.concat(extract("year", cls.date_order), " (", cls.version_number, ")")
This works with MySQL. I had to use func.concat because with simple addition MySQL would cast the year to an integer and add the values together instead of concatenating them. I tested with my custom API and the Flask shell:
In [1]: dp = Entry.query.first().display_name
In [2]: Entry.query.filter_by(display_name=dp).all()
Out[2]: [...returns a bunch of entries with that display name...]
But my testing environment runs an SQLite instance. I have this unit test:
# `create_entry` fixture commits the instance to db
def test_user_display_name_expression(create_entry):
    entry = create_entry(date_order=date(2022, 11, 11), version_number=3)
    filtered = Entry.query.filter_by(display_name="2022 (3)").one()
    assert filtered == entry
This returns an error:
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such function: concat
Is there any way to write this concatenation expression so that both SQL implementations can query on it?
After much input from PChemGuy I managed to find the issue:
Neither func.concat nor adding non-cast properties works as concatenation on both backends.
Instead we need sqlalchemy.sql.expression.cast to cast every property to sqlalchemy.String:
from sqlalchemy import extract, String
from sqlalchemy.sql.expression import cast

# inside the Entry model:
@display_name.expression
def display_name(cls):
    return (
        cast(extract("year", cls.date_order), String)
        + " ("
        + cast(cls.version_number, String)
        + ")"
    )
Both MySQL and SQLite understand this.

Glue Job Succeeded but no data inserted into the target table (Aurora Mysql)

I created a Glue job using the visual tab as below. First I connected to a MySQL table as the data source, which is already in my Data Catalog. Then in the transform node I wrote a custom SQL query to select only one column from the source table. I validated it with the data preview feature, and the transformation node works fine. Now I want to write the data to an existing database table that has only one column with the 'string' data type. The Glue job succeeded, but I don't see the data in the table.
Below is the automatic script generated from Glue Job Visual.
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue import DynamicFrame

def sparkSqlQuery(glueContext, query, mapping, transformation_ctx) -> DynamicFrame:
    for alias, frame in mapping.items():
        frame.toDF().createOrReplaceTempView(alias)
    result = spark.sql(query)
    return DynamicFrame.fromDF(result, glueContext, transformation_ctx)

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node MySQL
MySQL_node1650299412376 = glueContext.create_dynamic_frame.from_catalog(
    database="glue_rds_test",
    table_name="test_customer",
    transformation_ctx="MySQL_node1650299412376",
)

# Script generated for node SQL
SqlQuery0 = """
select CUST_CODE from customer
"""
SQL_node1650302847690 = sparkSqlQuery(
    glueContext,
    query=SqlQuery0,
    mapping={"customer": MySQL_node1650299412376},
    transformation_ctx="SQL_node1650302847690",
)

# Script generated for node MySQL
MySQL_node1650304163076 = glueContext.write_dynamic_frame.from_catalog(
    frame=SQL_node1650302847690,
    database="glue_rds_test",
    table_name="test_customer2",
    transformation_ctx="MySQL_node1650304163076",
)

job.commit()
For me the problem was double quotes around the selected fields in the SQL query. Dropping the double quotes solved it: Spark SQL treats double-quoted values as string literals (identifiers are quoted with backticks), so the query selects a constant rather than the column. I found no mention of this pitfall in the Spark SQL syntax documentation.
For example, I "wrongly" used this query syntax:
select "CUST_CODE" from customer
instead of this "correct" one :
select CUST_CODE from customer
Your shared sample code does not seem to have this syntax issue, but I thought putting the answer here might be of help to others.

How to use MySQL's standard deviation (STD, STDDEV, STDDEV_POP) function inside SQLAlchemy?

I need to use the STD function of MySQL through SQLAlchemy, but after a couple of minutes of searching, it looks like there is no func.<name> way of using this one in SQLAlchemy. Is it not supported, or am I missing something?
Found this issue while coding some aggregates on SQLAlchemy.
Citing the docs:
Any name can be given to func. If the function name is unknown to SQLAlchemy, it will be rendered exactly as is. For common SQL functions which SQLAlchemy is aware of, the name may be interpreted as a generic function which will be compiled appropriately to the target database.
Basically, func will generate a function matching the attribute name (func.<name>) if it's not a common function SQLAlchemy is aware of (like func.count).
To keep the advantages of RDBMS abstraction that come with any ORM, I always suggest using ANSI functions to decouple the code from the DB engine.
For a working sample you can add a connection string and execute the following code:
from sqlalchemy.orm import sessionmaker
from sqlalchemy import func, create_engine, Column
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.types import DateTime, Integer, String
# Add your connection string
engine = create_engine('My Connection String')
Base = declarative_base(engine)
Session = sessionmaker(bind=engine)
db_session = Session()
# Make sure to have a table foo in the db with foo_id, bar, baz columns
class Foo(Base):
    __tablename__ = 'foo'
    __table_args__ = { 'autoload' : True }

query = db_session.query(
    func.count(Foo.bar).label('count_agg'),
    func.avg(Foo.foo_id).label('avg_agg'),
    func.stddev(Foo.foo_id).label('stddev_agg'),
    func.stddev_samp(Foo.foo_id).label('stddev_samp_agg')
)
print(query.statement.compile())
It will generate the following SQL
SELECT count(foo.bar) AS count_agg,
avg(foo.foo_id) AS avg_agg,
stddev(foo.foo_id) AS stddev_agg,
stddev_samp(foo.foo_id) AS stddev_samp_agg
FROM foo

How to use the Django ORM to apply a function to a field

This question is a follow-up to this one.
I'm running a Django application on top of a MySQL (actually MariaDB) database.
My Django Model looks like this:
import logging

from django.db import models
from django.db.models import Count, Sum

logger = logging.getLogger(__name__)

class myModel(models.Model):
    my_string = models.CharField(max_length=32,)
    my_date = models.DateTimeField()

    @staticmethod
    def get_stats():
        logger.info(myModel.objects.values('my_string').annotate(
            count=Count("my_string"),
            sum1=Sum('my_date'),
            sum2=Sum(# WHAT GOES HERE?? #),
        ))
When I call get_stats, it gives me the count and the sum1.
However, for sum2, I want the sum of the following Database expression for each matching row: my_date + 0 (this converts it to a true integer value).
What should I put in the expression above to get that sum returned in sum2?
When I change it to sum2=Sum(F('my_date')), I get the following exception: http://gist.github.com/saqib-zmi/50bf572a972bae5d2871
Not sure, but try an F() expression:
from datetime import timedelta
from django.db.models import Count, F, Sum

myModel.objects.annotate(
    count=Count("my_string"),
    sum1=Sum('my_date'),
    sum2=Sum(F('my_date') + timedelta(days=1))
).values('my_string', 'count', 'sum1', 'sum2')
https://docs.djangoproject.com/en/dev/ref/models/queries/#f-expressions

Need example of use of PreparedStatement with Anorm scala

I am using Anorm to query a MySQL database from Play Framework 2.1. I created a prepared statement like this:
import play.api.db.DB
import anorm._
val stat = DB.withConnection(implicit c => SQL("SELECT name, email FROM user WHERE id=?").filledStatement)
Now how do I use it? Am I even doing this right? I am totally ignorant of the Anorm API, and I already went through the source code without gaining much insight.
Code examples are more than welcome.
A good example of Anorm usage is given in the respective tutorial. It also contains some examples that pass dynamic parameters to queries. You should start by writing your query and declaring placeholders like {somePlaceholder} in the query string. You can then assign values using the .on() method like this:
SQL(
  """
  select * from Country c
  join CountryLanguage l on l.CountryCode = c.Code
  where c.code = {countryCode};
  """
).on("countryCode" -> "FRA")
Or in your case:
import play.api.db.DB
import anorm._

val stat = DB.withConnection(implicit c =>
  SQL("SELECT name, email FROM user WHERE id={id}").on("id" -> 42)
)