Use SparkSession.sql() with JDBC - MySQL

Problem:
I would like to use a JDBC connection to run a custom query with Spark.
The goal of this query is to optimize memory allocation on the workers, which is why I can't use:
ss.read
.format("jdbc")
.option("url", "jdbc:postgresql:dbserver")
.option("dbtable", "schema.tablename")
.option("user", "username")
.option("password", "password")
.load()
Currently:
I am currently trying to run:
val ss = SparkSession
  .builder()
  .appName(appName)
  .master("local")
  .config(conf)
  .getOrCreate()
ss.sql("some custom query")
Configuration:
url=jdbc:mysql://127.0.0.1/database_name
driver=com.mysql.jdbc.Driver
user=user_name
password=xxxxxxxxxx
Error:
[info] Exception encountered when attempting to run a suite with class name: db.TestUserProvider *** ABORTED ***
[info] org.apache.spark.sql.AnalysisException: Table or view not found: users; line 1 pos 14
[info] at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupTableFromCatalog(Analyzer.scala:459)
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:478)
[info] at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$8.applyOrElse(Analyzer.scala:463)
[info] at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
[info] at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:61)
[info] at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
[info] at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:60)
[info] at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
[info] at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:58)
Assumption:
I suspect there is a configuration error, but I can't figure out where.

Spark can read and write data to/from relational databases using the JDBC data source (as you did in your first code example).
In addition (and completely separately), Spark lets you use SQL to query views that were created over data already loaded into a DataFrame from some source. For example:
val df = Seq(1,2,3).toDF("a") // could be any DF, loaded from file/JDBC/memory...
df.createOrReplaceTempView("my_spark_table")
spark.sql("select a from my_spark_table").show()
Only "tables" (called views, as of Spark 2.0.0) created this way can be queried using SparkSession.sql.
If your data is stored in a relational database, Spark will have to read it from there first, and only then would it be able to execute any distributed computation on the loaded copy. Bottom line - we can load the data from the table using read, create a temp view, and then query it:
ss.read
.format("jdbc")
.option("url", "jdbc:mysql://127.0.0.1/database_name")
.option("dbtable", "schema.tablename")
.option("user", "username")
.option("password", "password")
.load()
.createOrReplaceTempView("my_spark_table")
// and then you can query the view:
val df = ss.sql("select * from my_spark_table where ... ")
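Since the stated goal is to limit how much data is pulled onto the workers, it is also worth noting that a custom query can be pushed down to MySQL by wrapping it as a subquery in the dbtable option, so Spark only loads the rows the database returns. A minimal sketch (the inner query, column names, view name and credentials below are placeholders, not taken from the question):
// Sketch: push a custom query down to MySQL instead of loading the whole table.
// "pushed_down" and the inner query are placeholders for your own schema.
val pushedDownQuery = "(select id, name from schema.tablename where active = 1) as pushed_down"
ss.read
  .format("jdbc")
  .option("url", "jdbc:mysql://127.0.0.1/database_name")
  .option("dbtable", pushedDownQuery) // MySQL runs the inner query; Spark loads only its result
  .option("user", "username")
  .option("password", "password")
  .load()
  .createOrReplaceTempView("pushed_down_view")
val filtered = ss.sql("select name from pushed_down_view")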

Related

Apache Spark SQL get_json_object java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String

I am trying to read a JSON stream from an MQTT broker in Apache Spark with Structured Streaming, read some properties of the incoming JSON, and output them to the console. My code looks like this:
val spark = SparkSession
.builder()
.appName("BahirStructuredStreaming")
.master("local[*]")
.getOrCreate()
import spark.implicits._
import org.apache.spark.sql.functions.get_json_object
import org.apache.spark.sql.types.StringType
val topic = "temp"
val brokerUrl = "tcp://localhost:1883"
val lines = spark.readStream
.format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
.option("topic", topic).option("persistence", "memory")
.load(brokerUrl)
.toDF().withColumn("payload", $"payload".cast(StringType))
val jsonDF = lines.select(get_json_object($"payload", "$.eventDate").alias("eventDate"))
val query = jsonDF.writeStream
.format("console")
.start()
query.awaitTermination()
However, when the JSON arrives I get the following error:
Exception in thread "main" org.apache.spark.sql.streaming.StreamingQueryException: Writing job aborted.
=== Streaming Query ===
Identifier: [id = 14d28475-d435-49be-a303-8e47e2f907e3, runId = b5bd28bb-b247-48a9-8a58-cb990edaf139]
Current Committed Offsets: {MQTTStreamSource[brokerUrl: tcp://localhost:1883, topic: temp clientId: paho7247541031496]: -1}
Current Available Offsets: {MQTTStreamSource[brokerUrl: tcp://localhost:1883, topic: temp clientId: paho7247541031496]: 0}
Current State: ACTIVE
Thread State: RUNNABLE
Logical Plan:
Project [get_json_object(payload#22, $.id) AS eventDate#27]
+- Project [id#10, topic#11, cast(payload#12 as string) AS payload#22, timestamp#13]
+- StreamingExecutionRelation MQTTStreamSource[brokerUrl: tcp://localhost:1883, topic: temp clientId: paho7247541031496], [id#10, topic#11, payload#12, timestamp#13]
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:300)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189)
Caused by: org.apache.spark.SparkException: Writing job aborted.
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:92)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan.getByteArrayRdd(SparkPlan.scala:247)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:296)
at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3384)
at org.apache.spark.sql.Dataset.$anonfun$collect$1(Dataset.scala:2783)
at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:3365)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3365)
at org.apache.spark.sql.Dataset.collect(Dataset.scala:2783)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$15(MicroBatchExecution.scala:537)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$14(MicroBatchExecution.scala:533)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:349)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:532)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:198)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:349)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
... 1 more
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 8, localhost, executor driver): java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String(rows.scala:46)
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String$(rows.scala:46)
at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$2(WriteToDataSourceV2Exec.scala:117)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:116)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.$anonfun$doExecute$2(WriteToDataSourceV2Exec.scala:67)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:405)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:1887)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:1875)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:1874)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1874)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:926)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2108)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2057)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2046)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:64)
... 34 more
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to org.apache.spark.unsafe.types.UTF8String
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String(rows.scala:46)
at org.apache.spark.sql.catalyst.expressions.BaseGenericInternalRow.getUTF8String$(rows.scala:46)
at org.apache.spark.sql.catalyst.expressions.GenericInternalRow.getUTF8String(rows.scala:195)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:619)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$2(WriteToDataSourceV2Exec.scala:117)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:116)
at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.$anonfun$doExecute$2(WriteToDataSourceV2Exec.scala:67)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:121)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:405)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
I am sending the JSON records using the mosquitto broker, and they look like this:
mosquitto_pub -m '{"eventDate": "2020-11-11T15:17:00.000+0200"}' -t "temp"
It seems that every string coming from the Bahir stream source provider raises this error. For instance, the following code also raises it:
spark.readStream
.format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
.option("topic", topic).option("persistence", "memory")
.load(brokerUrl)
.select("topic")
.writeStream
.format("console")
.start()
It looks like Spark does not recognize strings coming from Bahir, maybe some kind of weird String class version issue. I've tried the following to make the code work:
setting the Java version to 8
upgrading Spark from 2.4.0 to 2.4.7
setting the Scala version to 2.11.12
using the decode function with every possible encoding instead of .cast(StringType) to convert the "payload" column to a String
using the substring function on the "payload" column to recreate a compatible String.
Finally, I got working code by recreating the String with its constructor, going through a typed Dataset:
val lines = spark.readStream
.format("org.apache.bahir.sql.streaming.mqtt.MQTTStreamSourceProvider")
.option("topic", topic).option("persistence", "memory")
.load(brokerUrl)
.select("payload")
.as[Array[Byte]]
.map(payload => new String(payload))
.toDF("payload")
This solution is rather ugly, but at least it works.
I believe there is nothing wrong with the code provided in the question, and I suspect a bug on the Bahir or Spark side that prevents Spark from handling Strings coming from the Bahir source.

com.mysql.jdbc.Driver not found in spark2 scala

I am using Jupyter Notebook with a Scala kernel; below is my code to import a MySQL table into a DataFrame:
val sql="""select * from customer"""
val df_customer = spark.read
.format("jdbc")
.option("url", "jdbc:mysql://localhost:3306/ccfd")
.option("driver", "com.mysql.jdbc.Driver")
.option("dbtable", s"( $sql ) t")
.option("user", "root")
.option("password", "xxxxxxx")
.load()
Below is the error:
Name: java.lang.ClassNotFoundException
Message: com.mysql.jdbc.Driver
StackTrace: at scala.reflect.internal.util.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:62)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:45)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions$$anonfun$6.apply(JDBCOptions.scala:79)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:79)
at org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions.<init>(JDBCOptions.scala:35)
at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:34)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:340)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:239)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:227)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:164)
Can anyone share a working code snippet here? I am using Spark 2, and a session named spark is already available when I start the kernel in a new notebook.
Thank you in advance.
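For reference (this is not from the original thread, and the exact mechanism depends on how the Jupyter Scala kernel launches Spark): java.lang.ClassNotFoundException: com.mysql.jdbc.Driver means the MySQL connector JAR is not on Spark's classpath, so it has to be supplied when Spark starts rather than in the read options. A hedged sketch:
// Sketch, not a confirmed fix: add the MySQL connector JAR at launch time.
// Outside a notebook it is usually supplied like this (the version is an example):
//   spark-shell --packages mysql:mysql-connector-java:5.1.49
// or
//   spark-shell --jars /path/to/mysql-connector-java-5.1.49.jar
// With the driver on the classpath, the read from the question works unchanged:
val sql = """select * from customer"""
val df_customer = spark.read
  .format("jdbc")
  .option("url", "jdbc:mysql://localhost:3306/ccfd")
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", s"( $sql ) t")
  .option("user", "root")
  .option("password", "xxxxxxx")
  .load()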

Testing a DSPComplex ROM

I'm still working on building a DSPComplex ROM and have hit what I think may be an actual Chisel problem.
I've built the ROM and can generate Verilog output from the code that looks reasonable, but I can't seem to test the module with even the most basic of testers. I've simplified it below to the most basic check.
The error is a stack overflow like the following:
$ sbt 'testOnly taylor.TaylorTest'
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=8G; support was removed in 8.0
[info] Loading settings from plugins.sbt ...
[info] Loading project definition from /home/jcondley/Zendar/kodo/ZenFPGA/chisel/project
[info] Loading settings from build.sbt ...
[info] Set current project to zen-chisel (in build file:/home/jcondley/Zendar/kodo/ZenFPGA/chisel/)
[info] Compiling 1 Scala source to /home/jcondley/Zendar/kodo/ZenFPGA/chisel/target/scala-2.12/classes ...
[warn] there were 5 feature warnings; re-run with -feature for details
[warn] one warning found
[info] Done compiling.
[info] Compiling 1 Scala source to /home/jcondley/Zendar/kodo/ZenFPGA/chisel/target/scala-2.12/test-classes ...
[warn] there were two deprecation warnings (since chisel3, will be removed by end of 2017); re-run with -deprecation for details
[warn] there were two feature warnings; re-run with -feature for details
[warn] two warnings found
[info] Done compiling.
[info] [0.004] Elaborating design...
[deprecated] DspComplex.scala:22 (1029 calls): isLit is deprecated: "isLit is deprecated, use litOption.isDefined"
[deprecated] DspComplex.scala:22 (1029 calls): litArg is deprecated: "litArg is deprecated, use litOption or litTo*Option"
[deprecated] DspComplex.scala:23 (1029 calls): isLit is deprecated: "isLit is deprecated, use litOption.isDefined"
[deprecated] DspComplex.scala:23 (1029 calls): litArg is deprecated: "litArg is deprecated, use litOption or litTo*Option"
[warn] There were 4 deprecated function(s) used. These may stop compiling in a future release - you are encouraged to fix these issues.
[warn] Line numbers for deprecations reported by Chisel may be inaccurate; enable scalac compiler deprecation warnings via either of the following methods:
[warn] In the sbt interactive console, enter:
[warn] set scalacOptions in ThisBuild ++= Seq("-unchecked", "-deprecation")
[warn] or, in your build.sbt, add the line:
[warn] scalacOptions := Seq("-unchecked", "-deprecation")
[info] [1.487] Done elaborating.
Total FIRRTL Compile Time: 1887.8 ms
Total FIRRTL Compile Time: 770.6 ms
End of dependency graph
Circuit state created
[info] TaylorTest:
[info] TaylorWindow
[info] taylor.TaylorTest *** ABORTED ***
[info] java.lang.StackOverflowError:
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.evaluate(LoFirrtlExpressionEvaluator.scala:264)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.$anonfun$resolveDependency$1(LoFirrtlExpressionEvaluator.scala:453)
[info] at firrtl_interpreter.Timer.apply(Timer.scala:40)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.resolveDependency(LoFirrtlExpressionEvaluator.scala:445)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.getValue(LoFirrtlExpressionEvaluator.scala:81)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.evaluate(LoFirrtlExpressionEvaluator.scala:304)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.$anonfun$resolveDependency$1(LoFirrtlExpressionEvaluator.scala:453)
[info] at firrtl_interpreter.Timer.apply(Timer.scala:40)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.resolveDependency(LoFirrtlExpressionEvaluator.scala:445)
[info] at firrtl_interpreter.LoFirrtlExpressionEvaluator.getValue(LoFirrtlExpressionEvaluator.scala:81)
[info] ...
This looks suspiciously like another ROM issue from here:
https://github.com/freechipsproject/chisel3/issues/642
but trying Chick's response here:
export SBT_OPTS="-Xmx2G -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=2G -Xss2M -Duser.timezone=GMT"
does not seem to solve the issue (and one of the options, MaxPermSize, is ignored).
Is this a legitimate Chisel bug with ROMs, or is something else going on here?
Actual module with the ROM:
package taylor
import chisel3._
import chisel3.util._
import chisel3.experimental.FixedPoint
import dsptools.numbers._
import scala.io.Source
class TaylorWindow(len: Int, window: Seq[FixedPoint]) extends Module {
val io = IO(new Bundle {
val d_valid_in = Input(Bool())
val sample = Input(DspComplex(FixedPoint(16.W, 8.BP), FixedPoint(16.W, 8.BP)))
val windowed_sample = Output(DspComplex(FixedPoint(32.W, 8.BP), FixedPoint(32.W, 8.BP)))
val d_valid_out = Output(Bool())
})
val win_coeff = VecInit(window.map(x=>DspComplex.wire(x, FixedPoint(0, 16.W, 8.BP))).toSeq) // ROM storing our coefficients.
io.d_valid_out := io.d_valid_in
val counter = RegInit(UInt(10.W), 0.U)
// Implicit reset
io.windowed_sample:= io.sample * win_coeff(counter)
when(io.d_valid_in) {
counter := counter + 1.U
}
}
object TaylorDriver extends App {
val filename = "src/test/test_data/taylor_coeffs"
val coeff_file = Source.fromFile(filename).getLines
val double_coeffs = coeff_file.map(x => x.toDouble)
val fp_coeffs = double_coeffs.map(x => FixedPoint.fromDouble(x, 16.W, 8.BP))
val fp_seq = fp_coeffs.toSeq
chisel3.Driver.execute(args, () => new TaylorWindow(1024, fp_seq))
}
Tester Code:
package taylor
import chisel3._
import chisel3.util._
import chisel3.experimental.FixedPoint
import dsptools.numbers.implicits._
import scala.io.Source
import chisel3.iotesters
import chisel3.iotesters.{ChiselFlatSpec, Driver, PeekPokeTester}
class TaylorWindowUnitTest(dut: TaylorWindow) extends PeekPokeTester(dut) {
val filename = "src/test/test_data/taylor_coeffs"
val coeff_file = Source.fromFile(filename).getLines
val double_coeffs = coeff_file.map(x => x.toDouble)
val fp_coeffs = double_coeffs.map(x => FixedPoint.fromDouble(x, 16.W, 8.BP))
val fp_seq = fp_coeffs.toSeq
poke(dut.io.d_valid_in, Bool(false))
expect(dut.io.d_valid_out, Bool(false))
}
class TaylorTest extends ChiselFlatSpec {
behavior of "TaylorWindow"
backends foreach {backend =>
it should s"test the basic Taylow Window" in {
Driver(() => new TaylorWindow(1024, getSeq()), backend)(c => new TaylorWindowUnitTest(c)) should be (true)
}
}
def getSeq() : Seq[FixedPoint] = {
val filename = "src/test/test_data/taylor_coeffs"
val coeff_file = Source.fromFile(filename).getLines
val double_coeffs = coeff_file.map(x => x.toDouble)
val fp_coeffs = double_coeffs.map(x => FixedPoint.fromDouble(x, 16.W, 8.BP))
fp_coeffs.toSeq
}
}
This looks like a failure in the firrtl-interpreter (one of the Scala-based Chisel simulators), which can have problems with a number of large FIRRTL constructs. If you have Verilator installed, can you try changing
backends foreach {backend =>
to
Seq("verilator") foreach {backend =>
and see what happens. Another thing to try is Treadle, which is the new version of the interpreter; it is not in full release yet, but it is available in snapshot versions of the Chisel ecosystem. It should be able to handle this as well.
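For completeness, here is a minimal sketch of running the existing tester against Verilator only via iotesters.Driver.execute and its --backend-name option (this assumes Verilator is installed and that your chisel-iotesters version accepts that flag; it reuses TaylorWindow and TaylorWindowUnitTest from the question):
package taylor
import chisel3._
import chisel3.experimental.FixedPoint
import chisel3.iotesters
import scala.io.Source
// Sketch: drive the PeekPokeTester with the Verilator backend only.
object TaylorVerilatorRunner extends App {
  // Same coefficient loading as in the question's TaylorTest.
  def getSeq(): Seq[FixedPoint] = {
    Source.fromFile("src/test/test_data/taylor_coeffs").getLines
      .map(line => FixedPoint.fromDouble(line.toDouble, 16.W, 8.BP))
      .toSeq
  }
  val ok = iotesters.Driver.execute(
    Array("--backend-name", "verilator"),
    () => new TaylorWindow(1024, getSeq())
  ) { dut => new TaylorWindowUnitTest(dut) }
  assert(ok, "TaylorWindow verilator test failed")
}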

Using JDBC to read a MySQL table in Spark Scala, getting a Warehouse error

I am trying to read a MySQL table using Spark with Scala. Following is the code I tried:
val dataframe_mysql = sqlContext.read.format("jdbc")
.option("url","jdbc:mysql://xx.xx.xx.xx:xx")
.option("driver", "com.mysql.jdbc.Driver")
.option("dbtable", "schema.xxxx")
.option("user", "xxxx").option("password", "xxxxx").load()
but I am getting the following warehouse path error:
Warehouse path is 'file:/C:/Users/Owner/eclipse-workspace/stProject/spark-warehouse/'. Exception in thread "main" java.lang.NullPointerException at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:72) at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113) at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45) at

Best way to pass the schema name as a variable to a query

I have a Play Framework server (with Anorm) that operates against a database with several schemas, all of them with the same tables.
Most of my database-access functions look like this:
def findById(zoneName: String, id: Long): Option[Employee] = {
  DB.withConnection { implicit connection =>
    SQL("""select *
           from """ + zoneName + """employee
           where employee._id = {id}""")
      .on(
        'id -> id
      ).as(simpleParser.singleOpt)
  }
}
But I know this is the wrong approach, because it is not SQL-injection-safe and, of course, it is tedious to write in every function.
I want to use string interpolation to fix this; it works well with my id variable, but it doesn't with zoneName:
def findById(zoneName: String, id: Long): Option[Employee] = {
DB.withConnection { implicit connection =>
SQL"""select *
from $zoneName.employee
where employee._id = 1"""
.as(simpleParser.singleOpt)
}
}
Gives me:
[info] ! #6lenhal6c - Internal server error, for (GET) [/limbo/br/employee/1] ->
[info]
[info] play.api.Application$$anon$1: Execution exception[[PSQLException: ERROR: syntax error at or near «$1»
[info] Position: 25]]
[info] at play.api.Application$class.handleError(Application.scala:296) ~[play_2.11-2.3.8.jar:2.3.8]
[info] at play.api.DefaultApplication.handleError(Application.scala:402) [play_2.11-2.3.8.jar:2.3.8]
[info] at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$3$$anonfun$applyOrElse$4.apply(PlayDefaultUpstreamHandler.scala:320) [play_2.11-2.3.8.jar:2.3.8]
[info] at play.core.server.netty.PlayDefaultUpstreamHandler$$anonfun$3$$anonfun$applyOrElse$4.apply(PlayDefaultUpstreamHandler.scala:320) [play_2.11-2.3.8.jar:2.3.8]
[info] at scala.Option.map(Option.scala:146) [scala-library-2.11.5.jar:na]
[info] Caused by: org.postgresql.util.PSQLException: ERROR: syntax error at or near «$1»
[info] Position: 25
[info] at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2198) ~[postgresql-9.3-1102.jdbc4.jar:na]
[info] at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1927) ~[postgresql-9.3-1102.jdbc4.jar:na]
[info] at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255) ~[postgresql-9.3-1102.jdbc4.jar:na]
[info] at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:561) ~[postgresql-9.3-1102.jdbc4.jar:na
I also tested with ${zoneName}, with the same result.
Any help or advice on how to write this would be appreciated; thank you in advance!
With Anorm string interpolation, any $expression is passed as a parameter; that is, if it is a string it will be quoted/escaped by the JDBC driver.
If you want to substitute part of the SQL statement itself with a string (e.g. a dynamic schema), you can either use concatenation or, since the latest versions (2.4.0-M3 or 2.3.8), the #$expr syntax.
val table = "myTable"
SQL"SELECT * FROM #$table WHERE id=$id"
// SELECT * FROM myTable WHERE id=?
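Applied to the question's findById, that would look roughly like this (a sketch following the second snippet's zoneName.employee form; note that #$ performs raw substitution, so zoneName must come from a trusted whitelist, never from user input):
def findById(zoneName: String, id: Long): Option[Employee] = {
  DB.withConnection { implicit connection =>
    // #$zoneName is spliced into the SQL text as-is; $id is bound as a JDBC parameter.
    SQL"""select *
          from #$zoneName.employee
          where employee._id = $id"""
      .as(simpleParser.singleOpt)
  }
}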