How do you set up a MySQL connection for Orange?

Has anyone ever set up a SQL connection for Orange? The API docs (https://docs.biolab.si//3/data-mining-library/reference/data.sql.html) do not provide any usable examples, as far as I can tell. If you could point me to a link or show me an example connection object in Python, that would be great. I am trying to do some CN2 classification on a table in my MySQL database.

It is possible using an ODBC connector:
from pyodbc import connect

# Open the connection; fill in your own server, database, and credentials.
connector = connect('Driver={MySQL ODBC 5.3 Unicode Driver};'
                    'Server=server name or IP;'
                    'Database=database name;'
                    'UID=User;'
                    'PWD=password;')
cursor = connector.cursor()  # Create a cursor to fetch the data.
# Execute the SQL query.
cursor.execute("SELECT id, data1, data2 FROM table1")
# All rows of "table1" are saved in "data".
data = cursor.fetchall()
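
To get from those rows to CN2, the data still has to be wrapped in an Orange Table. Here is a minimal sketch, assuming data1 is numeric and data2 is a two-valued class; the variable names and class values are assumptions, so adjust the domain to your actual columns:

from Orange.data import Domain, ContinuousVariable, DiscreteVariable, Table
from Orange.classification import CN2Learner

# Describe the columns: numeric features plus a discrete class variable.
# These names and values are placeholders, not your real schema.
domain = Domain(
    [ContinuousVariable("id"), ContinuousVariable("data1")],
    DiscreteVariable("data2", values=["0", "1"]),
)

# pyodbc returns Row objects; Table.from_list takes plain value sequences.
table = Table.from_list(domain, [list(row) for row in data])

learner = CN2Learner()
classifier = learner(table)      # induce the CN2 rule list
predictions = classifier(table)  # classify (here, on the training data)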

Related

Why does R upload data much faster than KNIME or Workbench?

What I want to know is: what happens under the hood when I upload data through R that makes it so much faster than MySQL Workbench or KNIME?
I work with data and, every day, I upload data into a MySQL server. I used to upload data using KNIME, since it was much faster than uploading with MySQL Workbench (select the table -> "import data").
Some info: the CSV has 4000 rows and 15 columns. The library I used in R is RMySQL. The node I used in KNIME is Database Writer.
library('RMySQL')

df <- read.csv('C:/Users/my_user/Documents/file.csv', encoding = 'UTF-8', sep = ';')

connection <- dbConnect(
  RMySQL::MySQL(),
  dbname = "db_name",
  host = "yyy.xxxxxxx.com",
  user = "vitor",
  password = "****"
)

dbWriteTable(connection, "table_name", df, append = TRUE, row.names = FALSE)
So, to test, I did the exact same process, using the same file. It took 2 minutes in KNIME and only seconds in R.
Everything happens under the hood! Upload speed depends on several factors: the interface between the tool and the database, network connectivity, the configured batch size, the memory available to the tool, and the tool's own processing speed, among others. In your case the RMySQL package uses a batch size of 500 by default, while KNIME defaults to 1, so that is probably where the difference comes from. Try setting the batch size to 500 in KNIME and then compare. I have no clue how MySQL Workbench works...
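
To make the batch-size point concrete, here is a rough Python sketch in the style of the pyodbc answer above; the table and column names are made up, and the two loops show the alternatives (you would use one or the other). With a batch size of 1, every row costs a network round trip; batching sends many rows per trip:

rows = [(i, 'a', 'b') for i in range(4000)]  # stand-in for the CSV contents
cursor = connector.cursor()

# Batch size 1 (KNIME's default): one network round trip per row.
for row in rows:
    cursor.execute("INSERT INTO table1 (id, data1, data2) VALUES (?, ?, ?)", row)

# Batch size 500 (RMySQL's default): one multi-row statement per 500 rows.
batch_size = 500
for start in range(0, len(rows), batch_size):
    batch = rows[start:start + batch_size]
    placeholders = ", ".join(["(?, ?, ?)"] * len(batch))
    flat = [value for row in batch for value in row]
    cursor.execute("INSERT INTO table1 (id, data1, data2) VALUES " + placeholders, flat)

connector.commit()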

R MySQL to non-default schema

I am trying to use dbWriteTable to write from R to MySQL. To connect to MySQL I created an ODBC connection, so I can just run:
abc <- DBI::dbConnect(odbc::odbc(),
                      dsn = "mysql_conn")
Through this connection I can see all the schemas in the MySQL instance. This is great when I want to read in data, such as:
test_query <- dbSendQuery(abc, "SELECT * FROM test_schema.test_file")
test_query <- dbFetch(test_query)
The problem I have is when I want to create a new table in one of the schemas: how do I declare which schema to write to in
dbWriteTable(abc, value = new_file, name = "new_file", overwrite = T)
I imagine I have to specify test_schema somewhere in the dbWriteTable call, but I haven't been able to get it to work. Thoughts?
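
The question is about DBI, but the underlying idea, naming the schema explicitly instead of relying on the connection default, is the same everywhere. For comparison, a sketch of a schema-qualified write in Python with pandas and SQLAlchemy (the connection URL, credentials, and names are placeholders):

import pandas as pd
from sqlalchemy import create_engine

# Hypothetical connection; swap in your host, credentials, and database.
engine = create_engine("mysql+pymysql://user:password@host/db_name")
new_file = pd.DataFrame({"a": [1, 2], "b": ["x", "y"]})

# pandas exposes the target schema as an explicit argument.
new_file.to_sql("new_file", engine, schema="test_schema",
                if_exists="replace", index=False)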

How can I connect to a MySQL database into Apache Spark using SparkR?

I am working with Spark 2.0 and the SparkR libs. I would like some sample code showing how to do the following in SparkR:
Connect to a MySQL or any other SQL database using SparkR.
Write SQL queries like SELECT, UPDATE, etc. to modify a table in that database.
I know how to do this in plain R; however, I need some help with Spark sessions or the SparkSQL context. I am using RStudio for development.
Moreover, how do we submit this R code as a Spark batch job to run continuously at regular intervals?
jdbcurl <- "jdbc:mysql://xxx.xxx.x.x:xxxx/database"
data <- read.jdbc(jdbcurl, "tablename", user = "user", password = "password")
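
For reference, the same read in PySpark (the URL, table, and credentials are the same placeholders as above). Either way, the MySQL JDBC driver jar has to be on the Spark classpath, e.g. via spark-submit --jars, and scheduling is usually handled outside Spark, for example by triggering spark-submit from cron:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mysql-read").getOrCreate()

df = (spark.read.format("jdbc")
      .option("url", "jdbc:mysql://xxx.xxx.x.x:xxxx/database")
      .option("dbtable", "tablename")
      .option("user", "user")
      .option("password", "password")
      .load())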

How to parsimoniously refer to a data frame in RMySQL

I have a MySQL table that I am reading with the RMySQL package in R. I would like to be able to refer directly to the data frame stored in the table, so I can interact with it seamlessly rather than having to execute an RMySQL statement every time I want to do something. Is there a way to accomplish this? I tried:
data <- dbReadTable(conn = con, name = 'tablename')
For example, if I now want to check how many rows I have in this table I would run:
nrow(data)
Does this go through the database connection, or am I now storing the object "data" locally, defeating the whole purpose of using an external database?
data <- dbReadTable(conn = con, name = 'tablename')
This command downloads all the data into a local R data frame (assuming you have enough RAM). Any operations on data from that point forward do not go through the SQL connection; you are working on a local copy.
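
If the point is to keep the work in the database rather than in the local session, send the computation as SQL instead of fetching the whole table. A sketch of the row-count example in Python, reusing the pyodbc connection style from the first answer (the table name is a placeholder):

cursor = connector.cursor()
cursor.execute("SELECT COUNT(*) FROM tablename")
row_count = cursor.fetchone()[0]  # only a single number crosses the wire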

Cannot switch a connection string in LLBL Gen Pro

I have two databases with the same schema inside a SQL Server 2008 R2 instance, named Database1 and Database2. I connected and ran queries against Database1, and then switched to Database2 to fetch my entities using the following code:
this.ConnectionString = "Server=TestServer; Database=Database2; Trusted_Connection=true";

using (IDataAccessAdapter adapter = new DataAccessAdapter(this.ConnectionString))
{
    var entities = new EntityCollection<T>();
    adapter.FetchEntityCollection(entities, null);
    return entities;
}
(The connection string was set before executing the code).
I debugged the application and looked at the value of the connection string; it pointed to Database2.
However, when I executed the above code, the result was returned from Database1. And if I looked at SQL Profiler, the statement was executed against Database1.
Does anyone know what is going on? Why is the query executed against Database1 and not Database2?
PS: If I use the same connection string with plain ADO.NET, I am able to retrieve data from Database2.
Thanks in advance.
I have figured out what was going on. By default, LLBL Gen Pro uses fully qualified names like [database1].[dbo].[Customer] to access database objects, and the catalog name is baked in when the entities are generated. So you can't switch databases just by changing the connection string.
Hence, to target another database you have to override the default catalog, using code like the following:
var adapter = new DataAccessAdapter(ConnectionString, false,
                                    CatalogNameUsage.ForceName, DbName)
{
    CommandTimeOut = TenMinutesTimeOut
};
More information can be found at the following link