How are reactive streams used in Slick for inserting data - mysql

In Slick's documentation, examples of using Reactive Streams are presented only for reading data, by means of a DatabasePublisher. But what happens when you want to use your database as a Sink and backpressure based on your insertion rate?
I've looked for an equivalent DatabaseSubscriber, but it doesn't exist. So the question is: if I have a Source, say:
val source = Source(0 to 100)
how can I create a Sink with Slick that writes those values into a table with the schema:
create table NumberTable (value INT)

Serial Inserts
The easiest way would be to do inserts within a Sink.foreach.
Assuming you've used the schema code generation and further assuming your table is named "NumberTable"
//Tables file was auto-generated by the schema code generation
import Tables.{Numbertable, NumbertableRow}
val numberTableDB = Database forConfig "NumberTableConfig"
We can write a function that does the insertion
def insertIntoDb(num: Int) =
  numberTableDB run (Numbertable += NumbertableRow(num))
And that function can be placed in the Sink
val insertSink = Sink.foreach[Int](insertIntoDb)
Source(0 to 100) runWith insertSink
Batched Inserts
You could further extend the Sink methodology by batching N inserts at a time:
def batchInsertIntoDb(nums: Seq[Int]) =
  numberTableDB run (Numbertable ++= nums.map(NumbertableRow.apply))
val batchInsertSink = Sink.foreach[Seq[Int]](batchInsertIntoDb)
This batched Sink can be fed by a Flow which does the batch grouping:
val batchSize = 10
Source(0 to 100).via(Flow[Int].grouped(batchSize))
.runWith(batchInsertSink)
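If the source is unbounded (e.g. a live feed rather than 0 to 100), you may not want to wait for a full batch before inserting. As a small sketch, assuming the same batchInsertSink as above, grouped can be swapped for groupedWithin so a partial batch is also flushed after a time window (the 1-second window is just an illustrative value):
import scala.concurrent.duration._
Source(0 to 100)
  .via(Flow[Int].groupedWithin(batchSize, 1.second)) // emit when the batch is full OR after 1 second
  .runWith(batchInsertSink)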

Although you can use a Sink.foreach to achieve this (as mentioned by Ramon), it is safer and likely faster (by running the inserts in parallel) to use the mapAsync Flow. The problem you will face with Sink.foreach is that it has no return value: inserting into the database via Slick's db.run method returns a Future, which escapes the stream's materialized Future[Done], and that Future[Done] completes as soon as the Sink.foreach finishes, not when the inserts do.
implicit val system = ActorSystem("system")
implicit val materializer = ActorMaterializer()
class Numbers(tag: Tag) extends Table[Int](tag, "NumberTable") {
  def value = column[Int]("value")
  def * = value
}
val numbers = TableQuery[Numbers]
val db = Database.forConfig("postgres")
Await.result(db.run(numbers.schema.create), Duration.Inf)
val streamFuture: Future[Done] = Source(0 to 100)
.runWith(Sink.foreach[Int] { (i: Int) =>
  db.run(numbers += i).foreach(_ => println(s"stream 1 insert $i done"))
})
Await.result(streamFuture, Duration.Inf)
println("stream 1 done")
//// sample 1 output: ////
// stream 1 insert 1 done
// ...
// stream 1 insert 99 done
// stream 1 done <-- stream Future[Done] returned before inserts finished
// stream 1 insert 100 done
On the other hand, the def mapAsync[T](parallelism: Int)(f: Out ⇒ Future[T]) Flow allows you to run the inserts in parallel via the parallelism parameter and accepts a function from the upstream out value to a Future of some type. This matches our i => db.run(numbers += i) function. The great thing about this Flow is that it then feeds the results of those Futures downstream.
val streamFuture2: Future[Done] = Source(0 to 100)
.mapAsync(1) { (i: Int) =>
  db.run(numbers += i).map { r => println(s"stream 2 insert $i done"); r }
}
.runWith(Sink.ignore)
Await.result(streamFuture2, Duration.Inf)
println("stream 2 done")
//// sample 2 output: ////
// stream 2 insert 1 done
// ...
// stream 2 insert 100 done
// stream 2 done <-- stream Future[Done] returned after inserts finished
To prove the point, you can even return a real result from the stream rather than a Future[Done] (with Done representing Unit). This stream also uses a higher parallelism value and batching for extra performance.
val streamFuture3: Future[Int] = Source(0 to 100)
.via(Flow[Int].grouped(10)) // Batch in size 10
.mapAsync(2)((ints: Seq[Int]) => db.run(numbers ++= ints).map(_.getOrElse(0))) // Insert batches in parallel, return insert count
.runWith(Sink.fold(0)(_+_)) // count all inserts and return total
val rowsInserted = Await.result(streamFuture3, Duration.Inf)
println(s"stream 3 done, inserted $rowsInserted rows")
// sample 3 output:
// stream 3 done, inserted 101 rows
Note: You probably won't see better performance for such a small data set, but when I was dealing with a 1.7M-row insert I got the best performance on my machine with a batch size of 1000 and a parallelism value of 8, against a local PostgreSQL instance. That was about twice as fast as not running in parallel. As always with performance, your results may vary and you should measure for yourself.
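As a sketch, that configuration is just the stream 3 pipeline with those values plugged in (largeSource is a placeholder for the real 1.7M-element source; 1000 and 8 are only the values that happened to work best for my data set and machine):
val tunedStreamFuture: Future[Int] = largeSource // placeholder for the real 1.7M-element Source[Int, _]
  .via(Flow[Int].grouped(1000)) // batch size 1000
  .mapAsync(8)((ints: Seq[Int]) => db.run(numbers ++= ints).map(_.getOrElse(0))) // 8 batches in flight
  .runWith(Sink.fold(0)(_ + _))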

I find Alpakka's documentation to be excellent, and its DSL makes it really easy to work with Reactive Streams.
These are the docs for the Slick connector: https://doc.akka.io/docs/alpakka/current/slick.html
Inserting example:
Source(0 to 100)
.runWith(
// add an optional first argument to specify the parallelism factor (Int)
Slick.sink(value => sqlu"INSERT INTO NumberTable VALUES(${value})")
)
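For completeness (this is an assumption about your setup, not part of the snippet above): the Alpakka Slick connector expects an implicit SlickSession in scope, created from a Slick config block, and an ActorSystem named system is assumed to be available. "slick-mysql" below is just a placeholder config path:
import akka.stream.alpakka.slick.scaladsl.{Slick, SlickSession}
implicit val session = SlickSession.forConfig("slick-mysql") // placeholder config path in application.conf
import session.profile.api._ // brings the sqlu interpolator into scope
system.registerOnTermination(session.close())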

Related

Inconsistent behaviour when attempting to write Dataframe to CSV in Apache Spark

I'm trying to output the optimal hyperparameters for a decision tree classifier I trained using Spark's MLlib to a csv file using Dataframes and spark-csv. Here's a snippet of my code:
// Split the data into training and test sets (10% held out for testing)
val Array(trainingData, testData) = assembledData.randomSplit(Array(0.9, 0.1))
// Define cross validation with a hyperparameter grid
val crossval = new CrossValidator()
.setEstimator(classifier)
.setEstimatorParamMaps(paramGrid)
.setEvaluator(new BinaryClassificationEvaluator)
.setNumFolds(10)
// Train model
val model = crossval.fit(trainingData)
// Find best hyperparameter combination and create an RDD
val bestModel = model.bestModel
val hyperparamList = new ListBuffer[(String, String)]()
bestModel.extractParamMap().toSeq.foreach(pair => {
val hyperparam: Tuple2[String,String] = (pair.param.name,pair.value.toString)
hyperparamList += hyperparam
})
val hyperparameters = sqlContext.sparkContext.parallelize(hyperparamList.toSeq)
// Print the best hyperparameters
println(bestModel.extractParamMap().toSeq.foreach(pair => {
println(s"${pair.param.parent} ${pair.param.name}")
println(pair.value)
}))
// Define csv path to output results
var csvPath: String = "/root/results/decision-tree"
val hyperparametersPath: String = csvPath+"/hyperparameters"
val hyperparametersFile: File = new File(hyperparametersPath)
val results = (hyperparameters, hyperparametersPath, hyperparametersFile)
// Convert RDD to Dataframe and write it as csv
val dfToSave = spark.createDataFrame(results._1.map(x => Row(x._1, x._2)))
dfToSave.write.format("csv").mode("overwrite").save(results._2)
// Stop spark session
spark.stop()
After finishing a Spark job, I can see the part-00*... and _SUCCESS files inside the path as expected. However, though there are 13 hyperparameters total in this case (confirmed by printing them on screen), cat-ing the csv files shows not every hyperparameter was written to csv:
user#master:~$ cat /root/results/decision-tree/hyperparameters/part*.csv
checkpointInterval,10
featuresCol,features
maxDepth,5
minInstancesPerNode,1
Also, the hyperparameters that do get written change in every execution. This is executed on an HDFS-based Spark cluster with 1 master and 3 workers that have exactly the same hardware. Could it be a race condition? If so, how can I solve it?
Thanks in advance.
I think I figured it out. I expected dfToSave.write.format("csv").save(path) to write everything to the master node, but since the tasks are distributed to all workers, each worker saves its part of the hyperparameters to a local CSV in its own filesystem. Because in my case the master node is also a worker, I can see its part of the hyperparameters. The "inconsistent behaviour" (i.e. seeing different parts in each execution) is caused by whatever algorithm Spark uses for distributing partitions among workers.
My solution will be to collect the CSVs from all workers using something like scp or rsync to build the full results.
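For a result set this small, an alternative worth considering (not what I ended up doing) is to collapse everything into a single partition before writing, so that only one part file is produced; a sketch, assuming the hyperparameters RDD from above and a save path on shared storage such as HDFS:
// Coalescing to one partition is only sensible here because the hyperparameter list is tiny.
spark.createDataFrame(hyperparameters).toDF("param", "value")
  .coalesce(1)
  .write.format("csv").mode("overwrite").save(hyperparametersPath)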

Why my Spark Streaming program processes so slow?

I am currently writing a Spark Streaming application. My task is pretty simple: just receive JSON messages from Kafka and do some text filtering (contains TEXT1, TEXT2, TEXT3, TEXT4). The code looks like:
val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
ssc, kafkaParams, topics)
messages.foreachRDD { rdd =>
val originrdd = rdd.count()
val record = rdd.map(_._2).filter(x=>x.contains(TEXT1)).filter( x=>x.contains(TEXT2)).filter(x=>x.contains(TEXT3)).filter(x=>x.contains(TEXT4))
val afterrdd = record.count()
println("original number of record: ", originrdd)
println("after filtering number of records:", afterrdd)
}
Each JSON message is about 4 KB, and around 50000 records arrive from Kafka every second.
The processing time for the above task is 3 seconds per batch, so it can't achieve real-time performance. I have Storm doing the same task, and it performs much faster.
Well, you have created 3 unnecessary RDDs in this process.
val record = rdd.map(_._2).filter(x => {
  x.contains(TEXT1) &&
  x.contains(TEXT2) &&
  x.contains(TEXT3) &&
  x.contains(TEXT4)
})
Also, worth reading: https://spark.apache.org/docs/latest/streaming-programming-guide.html#performance-tuning

Shortest path performance in Graphx with Spark

I am creating a graph from gz-compressed JSON files of edge and vertex types.
I have put the files in a Dropbox folder here.
I load and map these JSON records to create the vertex and edge types required by GraphX like this:
val vertices_raw = sqlContext.read.json("path/vertices.json.gz")
val vertices = vertices_raw.rdd.map(row=> ((row.getAs[String]("toid").stripPrefix("osgb").toLong),row.getAs[Long]("index")))
val verticesRDD: RDD[(VertexId, Long)] = vertices
val edges_raw = sqlContext.read.json("path/edges.json.gz")
val edgesRDD = edges_raw.rdd.map(row=>(Edge(row.getAs[String]("positiveNode").stripPrefix("osgb").toLong, row.getAs[String]("negativeNode").stripPrefix("osgb").toLong, row.getAs[Double]("length"))))
val my_graph: Graph[(Long),Double] = Graph.apply(verticesRDD, edgesRDD).partitionBy(PartitionStrategy.RandomVertexCut)
I then use this Dijkstra implementation I found to compute the shortest path between two vertices:
def dijkstra[VD](g: Graph[VD, Double], origin: VertexId) = {
  var g2 = g.mapVertices(
    (vid, vd) => (false, if (vid == origin) 0 else Double.MaxValue, List[VertexId]())
  )
  for (i <- 1L to g.vertices.count - 1) {
    val currentVertexId: VertexId = g2.vertices.filter(!_._2._1)
      .fold((0L, (false, Double.MaxValue, List[VertexId]())))(
        (a, b) => if (a._2._2 < b._2._2) a else b)
      ._1
    val newDistances: VertexRDD[(Double, List[VertexId])] =
      g2.aggregateMessages[(Double, List[VertexId])](
        ctx => if (ctx.srcId == currentVertexId) {
          ctx.sendToDst((ctx.srcAttr._2 + ctx.attr, ctx.srcAttr._3 :+ ctx.srcId))
        },
        (a, b) => if (a._1 < b._1) a else b
      )
    g2 = g2.outerJoinVertices(newDistances)((vid, vd, newSum) => {
      val newSumVal = newSum.getOrElse((Double.MaxValue, List[VertexId]()))
      (
        vd._1 || vid == currentVertexId,
        math.min(vd._2, newSumVal._1),
        if (vd._2 < newSumVal._1) vd._3 else newSumVal._2
      )
    })
  }
  g.outerJoinVertices(g2.vertices)((vid, vd, dist) =>
    (vd, dist.getOrElse((false, Double.MaxValue, List[VertexId]()))
      .productIterator.toList.tail
    ))
}
I take two random vertex id's:
val v1 = 4000000028222916L
val v2 = 4000000031019012L
and compute the path between them:
val results = dijkstra(my_graph, v1).vertices.map(_._2).collect
I am unable to compute this locally on my laptop without getting a StackOverflowError. I can see that it is using 3 of the 4 available cores. I can load this graph and compute 10 shortest paths per second with the igraph library in Python on exactly the same graph. Is this an inefficient way of computing paths? At scale, on multiple nodes, the paths do compute (no StackOverflowError), but it still takes 30-40 seconds per path computation.
As you can read on the python-igraph github
"It is intended to be as powerful (ie. fast) as possible to enable the
analysis of large graphs."
In order to explain why it is taking 4000x longer on Apache Spark than in local Python, you may take a look here (a deep dive into performance bottlenecks with Spark PMC member Kay Ousterhout) to see that it is probably due to a bottleneck:
... beginning with the idea that network and disk I/O are major bottlenecks ...
You may not need to store your data in memory, because the job may not get that much faster. This is saying that even if you moved the serialized compressed data from disk to memory...
You may also see here & here for some information, but the best final step is to benchmark your code to find out where the bottleneck is.
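As a starting point for that benchmarking, a minimal wall-clock timing wrapper (nothing Spark-specific, just reusing the names from your snippet) to separate graph construction cost from the path computation itself:
// Simple wall-clock timer around the stages you suspect.
def timed[T](label: String)(block: => T): T = {
  val start = System.nanoTime()
  val result = block
  println(f"$label took ${(System.nanoTime() - start) / 1e9}%.2f s")
  result
}
val graph = timed("graph construction") {
  val g = Graph.apply(verticesRDD, edgesRDD).partitionBy(PartitionStrategy.RandomVertexCut).cache()
  g.vertices.count() // force materialization so the timing is meaningful
  g
}
timed("one shortest path") {
  dijkstra(graph, v1).vertices.map(_._2).collect()
}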

Slick 3.0 (scala) queries don't return data till they are run multiple times (I think)

So I'm very (extremely) new to databases, Slick and Scala, so I was using the example code from their documentation at http://slick.typesafe.com/doc/3.0.0/gettingstarted.html to create two tables, shown below.
My problem is that for some reason I have to run a query multiple times before it returns data; I have to rerun it at least 3-4 times before it returns results. I use a for-loop to rerun the query, and the runs don't necessarily give me exactly the same results each time either.
The two tables:
class Patients(tag: Tag) extends Table[(String, String, Int, String)](tag, "Patientss") {
  def PID = column[String]("Patient Id", O.PrimaryKey)
  def Gender = column[String]("Gender")
  def Age = column[Int]("Age")
  def Ethnicity = column[String]("Ethnicity")
  def * = (PID, Gender, Age, Ethnicity)
}
val patientsss = TableQuery[Patients]
class DrugEffect(tag: Tag) extends Table[(String, String, Double)](tag, "DrugEffectss") {
  def DrugID = column[String]("Drug ID", O.PrimaryKey)
  def PatientID = column[String]("Patient_ID")
  def DrugEffectssss = column[Double]("Drug Effect")
  def * = (DrugID, PatientID, DrugEffectssss)
  def Patient = foreignKey("Patient_FK", PatientID, patientsss)(_.PID)
}
val d_effects = TableQuery[DrugEffect]
I then create these tables using
val create_empty = DBIO.seq((patientsss.schema ++ d_effects.schema).create)
val setup_1 = db.run(create_empty)
I have actual data in two text files, which I parse through using a buffered reader.
I store all the Drug ID's in a list creatively named DrugIds
Then, I start filling in the tables in the following way
I first fill in the Patients table:
while (switch != 1) {
  val Patient = CurPatient.split("\\s+")
  if (Patient(2).toUpperCase() == "NA" || (Patient(2).toFloat % 1 != 0))
    age = -1
  else age = Patient(2).toInt
  val insertPatient: DBIO[Option[Int]] = patientsss ++= Seq(
    (Patient(0), Patient(1), age, Patient(3))
  )
  var future = db.run(insertPatient)
  CurPatient = PatientReader.readLine()
  if (CurPatient == null)
    switch = 1 //switch to 1
}
For the DrugEffects table, I do the following:
while (switch != 1) {
  val Effect = CurEffect.split("\\s+")
  for (i <- 1 until DrugIds.size - 1) {
    if (Effect(i).toUpperCase() == "NA")
      d_ef = -1.00
    else d_ef = (Effect(i).toFloat).asInstanceOf[Double]
    val insertEffect: DBIO[Option[Int]] = d_effects ++= Seq(
      (DrugIds(i), Effect(0), d_ef)
    )
    var future2 = db.run(insertEffect)
  }
  CurEffect = EffectReader.readLine()
  if (CurEffect == null)
    switch = 1
}
Then I run a query with the following piece of code
val q1 = for {
c <- patientsss
} yield (c.PID, c.Gender, c.Age, c.Ethnicity)
db.stream(q1.result).foreach(println)
This should just give me all the data in the Patients table, but it doesn't necessarily do that.
Sometimes, I get the following error (but not always):
java.util.concurrent.RejectedExecutionException: Task slick.backend.DatabaseComponent$DatabaseDef$$anon$3#47089c2c rejected from java.util.concurrent.ThreadPoolExecutor#6453123[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 215]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
at scala.concurrent.impl.ExecutionContextImpl$$anon$1.execute(ExecutionContextImpl.scala:136)
at slick.backend.DatabaseComponent$DatabaseDef$class.scheduleSynchronousStreaming(DatabaseComponent.scala:253)
at slick.jdbc.JdbcBackend$DatabaseDef.scheduleSynchronousStreaming(JdbcBackend.scala:38)
at slick.backend.DatabaseComponent$BasicStreamingActionContext.restartStreaming(DatabaseComponent.scala:516)
at slick.backend.DatabaseComponent$BasicStreamingActionContext.request(DatabaseComponent.scala:531)
at slick.backend.DatabasePublisher$$anon$3$$anonfun$onNext$2.apply(DatabasePublisher.scala:50)
at slick.backend.DatabasePublisher$$anon$3$$anonfun$onNext$2.apply(DatabasePublisher.scala:49)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at scala.concurrent.impl.ExecutionContextImpl$AdaptedForkJoinTask.exec(ExecutionContextImpl.scala:121)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
If I run a more complex query, the data I get back is accurate to the parameters of the query, but the same problems occur: when I rerun the query multiple times, the results are either duplicated, non-existent, or incomplete.
Explain like I'm 5 if you can, or point me to a resource that can help me solve these problems
EDIT:
bjfletcher's answer worked (Thanks!), but now I have another problem:
Every now and again, the code will fail with the error:
Exception in thread "main" org.h2.jdbc.JdbcSQLException: Table "Patientss" not found; SQL statement:
insert into "Patientss" ("Patient Id","Gender","Age","Ethnicity") values (?,?,?,?) [42102-162]
at org.h2.message.DbException.getJdbcSQLException(DbException.java:329)
at org.h2.message.DbException.get(DbException.java:169)
at org.h2.message.DbException.get(DbException.java:146)
at org.h2.command.Parser.readTableOrView(Parser.java:4758)
at org.h2.command.Parser.readTableOrView(Parser.java:4736)
at org.h2.command.Parser.parseInsert(Parser.java:954)
at org.h2.command.Parser.parsePrepared(Parser.java:375)
at org.h2.command.Parser.parse(Parser.java:279)
at org.h2.command.Parser.parse(Parser.java:251)
at org.h2.command.Parser.prepareCommand(Parser.java:217)
at org.h2.engine.Session.prepareLocal(Session.java:415)
at org.h2.engine.Session.prepareCommand(Session.java:364)
at org.h2.jdbc.JdbcConnection.prepareCommand(JdbcConnection.java:1121)
at org.h2.jdbc.JdbcPreparedStatement.<init>(JdbcPreparedStatement.java:71)
at org.h2.jdbc.JdbcConnection.prepareStatement(JdbcConnection.java:267)
at slick.jdbc.JdbcBackend$SessionDef$class.prepareStatement(JdbcBackend.scala:252)
at slick.jdbc.JdbcBackend$BaseSession.prepareStatement(JdbcBackend.scala:386)
at slick.jdbc.JdbcBackend$SessionDef$class.withPreparedStatement(JdbcBackend.scala:301)
at slick.jdbc.JdbcBackend$BaseSession.withPreparedStatement(JdbcBackend.scala:386)
at slick.driver.JdbcInsertInvokerComponent$BaseInsertInvoker.preparedInsert(JdbcInsertInvokerComponent.scala:177)
at slick.driver.JdbcInsertInvokerComponent$BaseInsertInvoker$$anonfun$internalInsertAll$1.apply(JdbcInsertInvokerComponent.scala:201)
at slick.jdbc.JdbcBackend$BaseSession.withTransaction(JdbcBackend.scala:422)
at slick.driver.JdbcInsertInvokerComponent$BaseInsertInvoker.internalInsertAll(JdbcInsertInvokerComponent.scala:198)
at slick.driver.JdbcInsertInvokerComponent$BaseInsertInvoker.insertAll(JdbcInsertInvokerComponent.scala:194)
at slick.driver.JdbcInsertInvokerComponent$InsertInvokerDef$class.$plus$plus$eq(JdbcInsertInvokerComponent.scala:73)
at slick.driver.JdbcInsertInvokerComponent$BaseInsertInvoker.$plus$plus$eq(JdbcInsertInvokerComponent.scala:152)
at slick.driver.JdbcActionComponent$InsertActionComposerImpl$$anonfun$$plus$plus$eq$1.apply(JdbcActionComponent.scala:459)
at slick.driver.JdbcActionComponent$InsertActionComposerImpl$$anonfun$$plus$plus$eq$1.apply(JdbcActionComponent.scala:459)
at slick.driver.JdbcActionComponent$InsertActionComposerImpl$$anon$8.run(JdbcActionComponent.scala:449)
at slick.driver.JdbcActionComponent$InsertActionComposerImpl$$anon$8.run(JdbcActionComponent.scala:447)
at slick.backend.DatabaseComponent$DatabaseDef$$anon$2.liftedTree1$1(DatabaseComponent.scala:231)
at slick.backend.DatabaseComponent$DatabaseDef$$anon$2.run(DatabaseComponent.scala:231)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
It doesn't happen all the time, but very often, and I have no clue what it means.
All the DB calls return to you immediately with Futures, even if their operations have not finished. This is asynchronous, not synchronous.
You can change your code to accommodate the Futures in one of two ways:
you can use Await.result with all DB calls, to wait at that point until they complete, for example: Await.result(db.run(insertEffect), Duration.Inf)
you can use .map (or .flatMap if you're using another Future from within), with code that you want to run when the DB operation is complete. For example: db.run(insertEffect).map(_ => ... do stuff... )
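As a sketch of the second approach (collecting the insert Futures and only querying once they have all completed; the names here are illustrative and reuse insertPatient and patientsss from your code):
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
// Collect every Future returned by db.run(...) while inserting
val insertFutures = scala.collection.mutable.ListBuffer[Future[Option[Int]]]()
// inside your insert loops: insertFutures += db.run(insertPatient)
// Only query once all inserts have completed
val patientsAfterInserts = Future.sequence(insertFutures.toList)
  .flatMap(_ => db.run(patientsss.result))
patientsAfterInserts.foreach(_.foreach(println))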
Have a look at another Stack Overflow thread regarding the exception, with some ideas as to the cause.

How to call Stored Procedures and defined functions in MySQL with Slick 3.0

I have defined in my db something like this
CREATE FUNCTION fun_totalInvestorsFor(issuer varchar(30)) RETURNS INT
NOT DETERMINISTIC
BEGIN
RETURN (SELECT COUNT(DISTINCT LOYAL3_SHARED_HOLDER_ID)
FROM stocks_x_hldr
WHERE STOCK_TICKER_SIMBOL = issuer AND
QUANT_PURCHASES > QUANT_SALES);
END;
Now I have received an answer from Stefan Zeiger (Slick lead) redirecting me here: User defined functions in Slick
I have tried (having the following object in scope):
lazy val db = Database.forURL("jdbc:mysql://localhost:3306/mydb",
driver = "com.mysql.jdbc.Driver", user = "dev", password = "root")
val totalInvestorsFor = SimpleFunction.unary[String, Int]("fun_totalInvestorsFor")
totalInvestorsFor("APPLE") should be (23)
Result: Rep(slick.lifted.SimpleFunction$$anon$2#13fd2ccd fun_totalInvestorsFor, false) was not equal to 23
I have also tried while having an application.conf in src/main/resources like this:
tsql = {
  driver = "slick.driver.MySQLDriver$"
  db {
    connectionPool = disabled
    driver = "com.mysql.jdbc.Driver"
    url = "jdbc:mysql://localhost/mydb"
  }
}
Then, in my code, with @StaticDatabaseConfig("file:src/main/resources/application.conf#tsql"):
tsql"select fun_totalInvestorsFor('APPLE')" should be (23)
Result: Error:(24, 9) Cannot load #StaticDatabaseConfig("file:src/main/resources/application.conf#tsql"): No configuration setting found for key 'tsql'
tsql"select fun_totalInvestorsFor('APPLE')" should be (23)
^
I am also planning to call stored procedures that return one tuple of three values, via sql"call myProc(v1)".as[(Int, Int, Int)].
Any ideas?
EDIT: When making
sql"""SELECT COUNT(DISTINCT LOYAL3_SHARED_HOLDER_ID)
FROM stocks_x_hldr
WHERE STOCK_TICKER_SIMBOL = issuer AND
QUANT_PURCHASES > QUANT_SALES""".as[(Int)]
results in a SqlStreamingAction[Vector[Int], Int, Effect] instead of the DBIO[Int] (from what I infer) suggested by the documentation.
I've been running into exactly the same problem for the past week. After some extensive research (see my post here; I'll be adding a complete description of what I've done as a solution), I decided it can't be done in Slick... not strictly speaking.
But, I'm resistant to adding pure JDBC or Anorm into our solution stack, so I did find an "acceptable" fix, IMHO.
The solution is to get the session object from Slick, and then use common JDBC to manage the stored function / stored procedure calls. At that point you can use any third party library that makes it easier... although in my case I wrote my own function to set up the call and return a result set.
val db = Database.forDataSource(DB.getDataSource)
var response: Option[GPInviteResponse] = None
db.withSession {
  implicit session => {
    // Set up your call here... (See my other post for a more detailed
    // answer with an example;
    // procedure is e.g. "{?=call myfunction(?,?,?,?)}")
    val cs = session.conn.prepareCall(procedure.toString)
    // Register and set your in and out parameters here
    // eg. cs.setLong(index, value)
    cs.execute()
    val rc = cs.getInt(1) // read the return value (assumes it was registered as OUT parameter 1 above)
    response = rc match {
      // Package up the response to the caller
    }
  }
}
db.close()
I know that's pretty terse, but as I said, see the other thread for a more complete posting. I'm putting it together right now and will post the answer shortly.
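For what it's worth, a slightly more filled-in sketch of that CallableStatement approach, wired to the fun_totalInvestorsFor function and the connection settings from the question (the withSession API is the same deprecated-but-available one used above; the JDBC calls themselves are standard):
val db = Database.forURL("jdbc:mysql://localhost:3306/mydb",
  driver = "com.mysql.jdbc.Driver", user = "dev", password = "root")
db.withSession { implicit session =>
  // JDBC escape syntax: the first "?" receives the function's return value
  val cs = session.conn.prepareCall("{? = call fun_totalInvestorsFor(?)}")
  cs.registerOutParameter(1, java.sql.Types.INTEGER)
  cs.setString(2, "APPLE")
  cs.execute()
  val totalInvestors = cs.getInt(1)
  println(s"fun_totalInvestorsFor('APPLE') = $totalInvestors")
  cs.close()
}
db.close()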