import pandas as pd
from sqlalchemy import create_engine

sqlEngine = create_engine('mysql+pymysql://root:root@127.0.0.1:3306/tick')
dbConnection = sqlEngine.connect()

def fractal1(fln):
    fl = pd.read_sql_table(fln + '1min', dbConnection, index_col=0)
    return fl
This is the base code to read a SQL table, and it executes without any errors.
But when I import this function into another Python file and execute it, it returns the same table, not the updated one (my table updates every 1 minute, and I tried using time.sleep(1)).
from base_code import fractal1  # module name is a placeholder for the file above
print(fractal1(fln))
Is this just a bug, or am I doing something wrong?
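For anyone hitting the same symptom, a minimal sketch of one thing worth trying, assuming the staleness comes from reusing the module-level connection (under MySQL's default REPEATABLE READ isolation, a long-lived connection sitting in an open transaction can keep returning the same snapshot); opening a short-lived connection per call rules that out:

import pandas as pd
from sqlalchemy import create_engine

sqlEngine = create_engine('mysql+pymysql://root:root@127.0.0.1:3306/tick')

def fractal1(fln):
    # A fresh connection (and transaction) per call, so each read sees current data
    with sqlEngine.connect() as conn:
        return pd.read_sql_table(fln + '1min', conn, index_col=0)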
I have connected to Spotify's API in Python to extract the top twenty tracks of a searched artist. I am trying to store the data in MySQL Workbench, in a table I created called 'spotify' inside a database named 'spotify_api'. Before I added my code to connect to MySQL Workbench, my code worked correctly and was able to extract the list of tracks, but I have run into issues getting my code to connect to my database. Below is the code I have written to both extract the data and store it in my database:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import mysql.connector

mydb = mysql.connector.connect(
    host="localhost",
    user="root",
    password="(removed for question)",
    database="spotify_api"
)

mycursor = mydb.cursor()

sql = 'DROP TABLE IF EXISTS spotify_api.spotify;'
mycursor.execute(sql)

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id="(removed for question)",
                                                           client_secret="(removed for question)"))

results = sp.search(q='sza', limit=20)

for idx, track in enumerate(results['tracks']['items']):
    print(idx, track['name'])
    sql = "INSERT INTO spotify_api.spotify (tracks, items) VALUES (" + \
        str(idx) + ", '" + track['name'] + "');"
    mycursor.execute(sql)
    mydb.commit()
    print(mycursor.rowcount, "record inserted.")

mycursor.execute("SELECT * FROM spotify_api.spotify;")
myresult = mycursor.fetchall()

for x in myresult:
    print(x)

mycursor.close()
Every time I run my code in the VS Code terminal, I receive an error stating that my table doesn't exist. This is what it states:
"mysql.connector.errors.ProgrammingError: 1146 (42S02): Table 'spotify_api.spotify' doesn't exist"
I'm not sure what I need to fix in order to eliminate this error and get my data stored in my table. The table has two columns, 'tracks' and 'items', but I don't know whether the issue lies in my database or in my code.
Well, it seems pretty clear. You ran
DROP TABLE IF EXISTS spotify_api.spotify;
...
INSERT INTO spotify_api.spotify (tracks, items) VALUES ...
We won't even raise the spectre of the Chuck Berry track titled little ol' Bobby Tables here.
You DROP'd it, then tried to INSERT into it.
That won't work.
You'll need to CREATE TABLE prior to the INSERT.
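For illustration, a minimal sketch of that fix, recreating the table right after the DROP; the column types are assumptions, since the question never shows the original schema. A parameterized INSERT also sidesteps the Bobby Tables problem alluded to above:

mycursor.execute("DROP TABLE IF EXISTS spotify_api.spotify;")
# Recreate the table before inserting; the types are guesses at a sensible schema
mycursor.execute("""
    CREATE TABLE spotify_api.spotify (
        tracks INT,
        items VARCHAR(255)
    );
""")

# Then, inside the loop, a parameterized insert lets the driver handle quoting
sql = "INSERT INTO spotify_api.spotify (tracks, items) VALUES (%s, %s)"
mycursor.execute(sql, (idx, track['name']))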
I created a Glue job using the visual tab, as shown below. First I connected to a MySQL table as the data source, which is already in my Data Catalog. Then, in the transform node, I wrote a custom SQL query to select only one column from the source table. I validated it with the data preview feature, and the transformation node works fine. Now I want to write the data to the existing database table that has only one column with a 'string' data type. The Glue job succeeded, but I don't see the data in the table.
Below is the automatic script generated from Glue Job Visual.
import sys
from awsglue.transforms import *
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue import DynamicFrame

def sparkSqlQuery(glueContext, query, mapping, transformation_ctx) -> DynamicFrame:
    for alias, frame in mapping.items():
        frame.toDF().createOrReplaceTempView(alias)
    result = spark.sql(query)
    return DynamicFrame.fromDF(result, glueContext, transformation_ctx)

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
sc = SparkContext()
glueContext = GlueContext(sc)
spark = glueContext.spark_session
job = Job(glueContext)
job.init(args["JOB_NAME"], args)

# Script generated for node MySQL
MySQL_node1650299412376 = glueContext.create_dynamic_frame.from_catalog(
    database="glue_rds_test",
    table_name="test_customer",
    transformation_ctx="MySQL_node1650299412376",
)

# Script generated for node SQL
SqlQuery0 = """
select CUST_CODE from customer
"""
SQL_node1650302847690 = sparkSqlQuery(
    glueContext,
    query=SqlQuery0,
    mapping={"customer": MySQL_node1650299412376},
    transformation_ctx="SQL_node1650302847690",
)

# Script generated for node MySQL
MySQL_node1650304163076 = glueContext.write_dynamic_frame.from_catalog(
    frame=SQL_node1650302847690,
    database="glue_rds_test",
    table_name="test_customer2",
    transformation_ctx="MySQL_node1650304163076",
)

job.commit()
For me the problem was the double quotes around the selected fields in the SQL query; dropping the double quotes solved it. In Spark SQL, double quotes delimit string literals rather than identifiers, so a double-quoted column name selects a constant string instead of the column. There is no mention of this in the Spark SQL syntax documentation.
For example, I "wrongly" used this query syntax:
select "CUST_CODE" from customer
instead of this "correct" one:
select CUST_CODE from customer
Your shared sample code does not seem to have this syntax issue, but I thought putting the answer here might be of help to others.
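To make the difference concrete, a small sketch assuming default (non-ANSI) Spark SQL parsing and a customer temp view like the one registered in the Glue script above; backticks are the identifier-quoting mechanism if you do need to quote a column name:

# Double quotes produce a string literal: every row returns the text 'CUST_CODE'
spark.sql('select "CUST_CODE" from customer').show()

# Backticks quote the identifier: this returns the actual column values
spark.sql('select `CUST_CODE` from customer').show()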
I am switching from SQLite to MySQL and getting some problems with my queries. At the moment I am trying to SELECT from a table where the PackageName matches a value from a text input in my GUI.
The query looks like this:
test = self.ms.ExecQuery("SELECT PackageID,PackageName,ServiceFee,Cost,LatestChanges FROM Package WHERE PackageName=?", (self.search_entry.GetValue(),))
and my ExecQuery looks like this:
def ExecQuery(self, sql):
    cur = self._Getconnect()
    cur.execute(sql)
    relist = cur.fetchall()
    cur.close()
    self.conn.close()
    return relist
And the error I am getting:
TypeError: ExecQuery() takes 2 positional arguments but 3 were given
What do I need to change to get this running?
Your ExecQuery is defined to take only self and sql, yet you are calling it with both a query string and a parameter tuple, so Python sees three positional arguments:
test = self.ms.ExecQuery("SELECT PackageID,PackageName,ServiceFee,Cost,LatestChanges FROM Package WHERE PackageName=?", (self.search_entry.GetValue(),))
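One way to fix it, sketched under the assumption that the question's _Getconnect helper returns a cursor as shown: let ExecQuery accept an optional parameter tuple and forward it to execute(). Note that MySQL drivers such as mysql.connector and PyMySQL use %s placeholders, not SQLite's ?:

def ExecQuery(self, sql, params=None):
    # Forward the optional parameter tuple to the driver
    cur = self._Getconnect()
    cur.execute(sql, params or ())
    relist = cur.fetchall()
    cur.close()
    self.conn.close()
    return relist

# Call it with %s placeholders, which MySQL drivers expect instead of ?
test = self.ms.ExecQuery(
    "SELECT PackageID, PackageName, ServiceFee, Cost, LatestChanges "
    "FROM Package WHERE PackageName = %s",
    (self.search_entry.GetValue(),),
)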
I need to:
1. Run a SELECT query on a MySQL DB and fetch the records.
2. Process the records with a Python script.
I am unsure how to proceed. Is XCom the way to go here? Also, MySqlOperator only executes the query and doesn't fetch the records. Is there any built-in transfer operator I can use? How can I use a MySQL hook here?
You may want to use a PythonOperator that uses the hook to get the data, applies the transformation, and ships the (now scored) rows back to some other place.
Can someone explain how to proceed regarding the same?
Refer to http://markmail.org/message/x6nfeo6zhjfeakfe
def do_work():
    mysqlserver = MySqlHook(connection_id)
    sql = "SELECT * from table where col > 100 "
    row_count = mysqlserver.get_records(sql, schema='testdb')
    print(row_count[0][0])

callMYSQLHook = PythonOperator(
    task_id='fetch_from_testdb',
    python_callable=do_work,
    dag=dag
)
Is this the correct way to proceed?
Also, how do we use XComs to store the records for the following MySqlOperator?
t = MySqlOperator(
    mysql_conn_id='mysql_default',
    task_id='basic_mysql',
    sql="SELECT count(*) from table1 where id > 10",
    dag=dag)
I was really struggling with this for the past 90 minutes; here is a more declarative way to follow for newcomers:
from airflow.hooks.mysql_hook import MySqlHook
from airflow.operators.python_operator import PythonOperator

def fetch_records():
    request = "SELECT * FROM your_table"
    mysql_hook = MySqlHook(mysql_conn_id='the_connection_name_sourced_from_the_ui', schema='specific_db')
    connection = mysql_hook.get_conn()
    cursor = connection.cursor()
    cursor.execute(request)
    sources = cursor.fetchall()
    print(sources)

# ...inside your `with DAG(...) as dag:` block:
task = PythonOperator(
    task_id='fetch_records',
    python_callable=fetch_records
)
This prints the contents of your DB query to the task logs.
I hope this is of use to someone else.
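On the XCom part of the question: a minimal sketch, assuming Airflow's default do_xcom_push behavior, where the value returned by the callable is pushed to XCom automatically and a downstream task pulls it by task_id. The table and connection names are the placeholders from above:

from airflow.hooks.mysql_hook import MySqlHook

def fetch_records():
    mysql_hook = MySqlHook(mysql_conn_id='the_connection_name_sourced_from_the_ui')
    # The returned rows are pushed to XCom under the key 'return_value'
    return mysql_hook.get_records("SELECT * FROM your_table")

def process_records(**context):
    # Pull whatever the fetch_records task returned
    rows = context['ti'].xcom_pull(task_ids='fetch_records')
    print(rows)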
Sure, just create a hook or operator and call the get_records() method: https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/hooks/dbapi.html
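For example, a sketch reusing the connection id and query from the snippets above; get_records() comes from the DB-API base hook that MySqlHook inherits:

from airflow.hooks.mysql_hook import MySqlHook

hook = MySqlHook(mysql_conn_id='mysql_default')
records = hook.get_records("SELECT count(*) FROM table1 WHERE id > 10")
print(records)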
I'm trying to call a stored procedure on a multi-db Django installation, but am not having any luck getting results. The stored procedure (which is on the secondary database) always returns an empty array in Django, but the expected result does appear when executed in a mysql client.
My view.py file:
from SomeDBModel import models
from django.db import connection

def index(request, someid):
    # Some related django-style query that works here
    loc = getLocationPath(someid, 1)
    print(loc)

def getLocationPath(id, someval):
    cursor = connection.cursor()
    cursor.callproc("SomeDB.spGetLocationPath", [id, someval])
    results = cursor.fetchall()
    cursor.close()
    return results
I have also tried:
from SomeDBModel import models
from django.db import connections

def index(request, someid):
    # Some related Django-style query that works here
    loc = getLocationPath(someid, 1)
    print(loc)

def getLocationPath(id, someval):
    cursor = connections["SomeDB"].cursor()
    cursor.callproc("spGetLocationPath", [id, someval])
    results = cursor.fetchall()
    cursor.close()
    return results
Each time I print out the results, I get:
[]
Example of data that should be retrieved:
{
    Path: '/some/path/',
    LocalPath: 'S:\Some\local\Path',
    Folder: 'SomeFolderName',
    Code: 'SomeCode'
}
One thing I also tried was to print the result of cursor.callproc. I get:
(id, someval)
Also, printing the result of cursor._executed gives:
b'SELECT #_SomeDB.spGetLocationPath_arg1, #_SomeDB.spGetLocationPath_arg2'
This seems to have no reference at all to the stored procedure I want to run. I have even tried this as a last resort:
cursor.execute("CALL spGetLocationPath("+str(id)+","+str(someval)+")")
but I get an error about needing multi=True, and putting it in the execute() call doesn't seem to work the way some sites have suggested; I don't know where else to put it in Django.
So...any ideas what I missed? How can I get stored procedures to work?
These are the steps that I took:
1. Made my stored procedure dump its results into a temporary table, flattening everything into a single result set. This got rid of the need for multi=True.
2. Made sure the user at my IP address had permission to call stored procedures in the database itself.
3. Continued to research the callproc function. Eventually someone on another site suggested the following code, which worked:
cur = connections["SomeDB"].cursor()
cur.callproc("spGetLocationPath", [id, someval])
# mysql.connector exposes a procedure's result sets through stored_results(),
# not through fetchall() on the calling cursor
res = next(cur.stored_results()).fetchall()
cur.close()