I'm having some difficulty with delete.many and update.many using the new builders whilst trying to convert my previous version's (working) code into reactivemongo 0.16.5 ("org.reactivemongo" %% "play2-reactivemongo" % "0.16.5-play26", "org.reactivemongo" %% "reactivemongo-akkastream" % "0.16.5". As you'll see; I'm using this within the Play plugin so dealing with JSON (rather than BSON)
I'm going from the official documentation here. My errors are similar for both update & delete so I'll just post for update here to keep it trim.
Update command
def updateMany(collName: String)(quJsa: JsArray)(orderedBool: Boolean = false): Future[MultiBulkWriteResult] = {
lazy val updateBuilder = getCollection(collName).map(_.update(orderedBool))
quJsa.asOpt[Seq[JsObject]].map(
_.map(
x => x.as[MongoUpdateBuilder]
)
).map(
_.map(
x => updateBuilder.flatMap(
_.element(q = x.q, u = x.getSetUpdate, upsert = x.upsertBool, multi = x.multiBool)
)
)
).map(
x => Future.sequence(x)
).map(
_.flatMap(
x => updateBuilder.flatMap(
_.many(x)
)
)
).getOrElse(getMultiBulkWriteResultErrorF("input not recognised as jsarr"))
}
Custom update builder model
case class MongoUpdateBuilder(q: JsObject, u: JsObject, upsertBool: Boolean, multiBool: Boolean) {
def getSetUpdate = Json.obj("$set" -> u)
}
object MongoUpdateBuilder {
implicit val mongoUpdateBuilderFormat = Json.format[MongoUpdateBuilder]
}
Error container
def getMultiBulkWriteResultErrorF(errStr: String): Future[MultiBulkWriteResult] = {
val mbwr = MultiBulkWriteResult(
ok = false,
n = 0,
nModified = 0,
upserted = Seq(),
writeErrors = Seq(WriteError(index = 0, code = 404, errmsg = errStr)),
writeConcernError = None,
code = Some(404),
errmsg = Some(errStr),
totalN = 0
)
Future.successful(mbwr)
}
And the main issue:
no type parameters for method flatMap: (f: reactivemongo.play.json.collection
.JSONCollection#UpdateBuilder => scala.concurrent.Future[S])(implicit executor: scala.concurrent.ExecutionContext)scala.concurrent.Future[S] exist so that it can be applied to arguments (reactivemongo.play.json.collection.JSONCollection#UpdateBuilder => scala.concurrent.Future[_1.UpdateCommand.UpdateElement] forSome { val _1: reactivemongo.play.json.collection.J
SONCollection }) [error] --- because ---
[error] argument expression's type is not compatible with formal parameter type;
[error] found : reactivemongo.play.json.collection.JSONCollection#UpdateBuilder => scala.concurrent.Future[_1.UpdateCommand.UpdateElement] forSome { val _1: reactivemongo.play.jso
n.collection.JSONCollection }
[error] required: reactivemongo.play.json.collection.JSONCollection#UpdateBuilder => scala.concurrent.Future[?S]
[error] x => updateBuilder.flatMap(
So the issue seems to be this line - updateBuilder.flatMap. The Future cannot be flattened with these types (JSONCollection#UpdateBuilder & JSONCollection#UpdateCommand.UpdateElement). So I'm struggling with this one. Please reach out if you can see the issue here. Many thanks!
I'm trying to build a power analysis tool in Scala. In particular, I'm trying to create a function which does return the required sample size given some restrictions (statistical power, effect size, the ratio between the two sample sizes, significance level and the type of hypothesis test we would like to run).
I wrote the following code:
import org.apache.commons.math3.analysis.UnivariateFunction
import org.apache.commons.math3.analysis.solvers.BrentSolver
import org.apache.commons.math3.distribution.NormalDistribution
import scala.math._
object normal_sample_size extends App {
val theta1 = 0.0025
val theta2 = theta1 * 1.05
val split = 0.50
val power = 0.80
val significance = 0.05
val alternative = "two sided"
val result = requiredSizeCalculationNormal(effectSizeCalculation(theta1, theta2), split, power, significance, alternative)
println(result)
def effectSizeCalculation(mu1: Double, mu2: Double): Double = {
2 * math.asin(math.sqrt(mu1)) - 2 * math.asin(math.sqrt(mu2))
}
def requiredSizeCalculationNormal(effect_size: Double, split_ratio: Double, power: Double, significance: Double, alternative: String): Unit = {
if (alternative == "two sided") {
val z_score = new NormalDistribution().inverseCumulativeProbability(significance / 2.0)
val func = new UnivariateFunction {def value(n: Double) = (1 - new NormalDistribution().cumulativeProbability(z_score.abs - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))) +
new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))}
val solver = new BrentSolver()
try {solver.solve(500, func, 0.001, 1000000)} catch {case _:Throwable => None}
} else if (alternative == "less") {
//val z_score = new NormalDistribution().inverseCumulativeProbability(significance)
//new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))
} else if (alternative == "greater") {
//val z_score = new NormalDistribution().inverseCumulativeProbability(significance).abs
//1 - new NormalDistribution().cumulativeProbability(z_score - effect_size * sqrt(n * split_ratio * n * (1 - split_ratio)))
} else {
println("Error message: Specify the right alternative. Possible values are 'two sided', 'less', 'greater'")
}
}
}
requiredSizeCalculationNormal() recognises what type of hypothesis testing is dealing with ("two sided", "greater" or "less") and then computes the required sample size. This last point is where I'm struggling! Using Brent's Method (from math3 library) I would like to find the value of n which solve the function.
Going through the code, focusing only on the "two sided" part of my function (I'll complete "greater" and "less" in a second stage), I've tried to create a function named func. Then using the command try {solver.solve(500, func, 0.001, 1000000)} catch {case _:Throwable => None} I'm trying to solve it using max 500 iterations and within the interval (0.001, 1000000). Unfortunately the command doesn't return neither a result nor an error message.
EDIT: I've removed from the code quoted above the //catch {case _:Throwable => None} (None was redundant anyway). That piece of code was actually catching all exceptions and doing nothing with them.
The exceptions I see running the code are:
Exception in thread "main" org.apache.commons.math3.exception.NoBracketingException: function values at endpoints do not have different signs, endpoints: [2, 10,000,000], values: [0.05, 1]
at org.apache.commons.math3.analysis.solvers.BrentSolver.doSolve(BrentSolver.java:118)
at org.apache.commons.math3.analysis.solvers.BaseAbstractUnivariateSolver.solve(BaseAbstractUnivariateSolver.java:190)
at org.apache.commons.math3.analysis.solvers.BaseAbstractUnivariateSolver.solve(BaseAbstractUnivariateSolver.java:195)
at normal_sample_size$.requiredSizeCalculationNormal(normal_sample_size.scala:46)
at normal_sample_size$.delayedEndpoint$normal_sample_size$1(normal_sample_size.scala:31)
at normal_sample_size$delayedInit$body.apply(normal_sample_size.scala:22)
at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.App$$anonfun$main$1.apply(App.scala:76)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:35)
at scala.App$class.main(App.scala:76)
at normal_sample_size$.main(normal_sample_size.scala:22)
at normal_sample_size.main(normal_sample_size.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:483)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Process finished with exit code 1
So the problem seems to be that the function doesn't change sign within the specified interval.
What is a standard way of profiling Scala method calls?
What I need are hooks around a method, using which I can use to start and stop Timers.
In Java I use aspect programming, aspectJ, to define the methods to be profiled and inject bytecode to achieve the same.
Is there a more natural way in Scala, where I can define a bunch of functions to be called before and after a function without losing any static typing in the process?
Do you want to do this without changing the code that you want to measure timings for? If you don't mind changing the code, then you could do something like this:
def time[R](block: => R): R = {
val t0 = System.nanoTime()
val result = block // call-by-name
val t1 = System.nanoTime()
println("Elapsed time: " + (t1 - t0) + "ns")
result
}
// Now wrap your method calls, for example change this...
val result = 1 to 1000 sum
// ... into this
val result = time { 1 to 1000 sum }
In addition to Jesper's answer, you can automatically wrap method invocations in the REPL:
scala> def time[R](block: => R): R = {
| val t0 = System.nanoTime()
| val result = block
| println("Elapsed time: " + (System.nanoTime - t0) + "ns")
| result
| }
time: [R](block: => R)R
Now - let's wrap anything in this
scala> :wrap time
wrap: no such command. Type :help for help.
OK - we need to be in power mode
scala> :power
** Power User mode enabled - BEEP BOOP SPIZ **
** :phase has been set to 'typer'. **
** scala.tools.nsc._ has been imported **
** global._ and definitions._ also imported **
** Try :help, vals.<tab>, power.<tab> **
Wrap away
scala> :wrap time
Set wrapper to 'time'
scala> BigDecimal("1.456")
Elapsed time: 950874ns
Elapsed time: 870589ns
Elapsed time: 902654ns
Elapsed time: 898372ns
Elapsed time: 1690250ns
res0: scala.math.BigDecimal = 1.456
I have no idea why that printed stuff out 5 times
Update as of 2.12.2:
scala> :pa
// Entering paste mode (ctrl-D to finish)
package wrappers { object wrap { def apply[A](a: => A): A = { println("running...") ; a } }}
// Exiting paste mode, now interpreting.
scala> $intp.setExecutionWrapper("wrappers.wrap")
scala> 42
running...
res2: Int = 42
This what I use:
import System.nanoTime
def profile[R](code: => R, t: Long = nanoTime) = (code, nanoTime - t)
// usage:
val (result, time) = profile {
/* block of code to be profiled*/
}
val (result2, time2) = profile methodToBeProfiled(foo)
There are three benchmarking libraries for Scala that you can avail of.
Since the URLs on the linked site are likely to change, I am pasting the relevant content below.
SPerformance - Performance Testing framework aimed at automagically comparing performance tests and working inside Simple Build Tool.
scala-benchmarking-template - SBT template project for creating Scala (micro-)benchmarks based on Caliper.
Metrics - Capturing JVM- and application-level metrics. So you know what's going on
testing.Benchmark might be useful.
scala> def testMethod {Thread.sleep(100)}
testMethod: Unit
scala> object Test extends testing.Benchmark {
| def run = testMethod
| }
defined module Test
scala> Test.main(Array("5"))
$line16.$read$$iw$$iw$Test$ 100 100 100 100 100
I use a technique that's easy to move around in code blocks. The crux is that the same exact line starts and ends the timer - so it is really a simple copy and paste. The other nice thing is that you get to define what the timing means to you as a string, all in that same line.
Example usage:
Timelog("timer name/description")
//code to time
Timelog("timer name/description")
The code:
object Timelog {
val timers = scala.collection.mutable.Map.empty[String, Long]
//
// Usage: call once to start the timer, and once to stop it, using the same timer name parameter
//
def timer(timerName:String) = {
if (timers contains timerName) {
val output = s"$timerName took ${(System.nanoTime() - timers(timerName)) / 1000 / 1000} milliseconds"
println(output) // or log, or send off to some performance db for analytics
}
else timers(timerName) = System.nanoTime()
}
Pros:
no need to wrap code as a block or manipulate within lines
can easily move the start and end of the timer among code lines when being exploratory
Cons:
less shiny for utterly functional code
obviously this object leaks map entries if you do not "close" timers,
e.g. if your code doesn't get to the second invocation for a given timer start.
ScalaMeter is a nice library to perform benchmarking in Scala
Below is a simple example
import org.scalameter._
def sumSegment(i: Long, j: Long): Long = (i to j) sum
val (a, b) = (1, 1000000000)
val execution_time = measure { sumSegment(a, b) }
If you execute above code snippet in Scala Worksheet you get the running time in milliseconds
execution_time: org.scalameter.Quantity[Double] = 0.260325 ms
The recommended approach to benchmarking Scala code is via sbt-jmh
"Trust no one, bench everything." - sbt plugin for JMH (Java
Microbenchmark Harness)
This approach is taken by many of the major Scala projects, for example,
Scala programming language itself
Dotty (Scala 3)
cats library for functional programming
Metals language server for IDEs
Simple wrapper timer based on System.nanoTime is not a reliable method of benchmarking:
System.nanoTime is as bad as String.intern now: you can use it,
but use it wisely. The latency, granularity, and scalability effects
introduced by timers may and will affect your measurements if done
without proper rigor. This is one of the many reasons why
System.nanoTime should be abstracted from the users by benchmarking
frameworks
Furthermore, considerations such as JIT warmup, garbage collection, system-wide events, etc. might introduce unpredictability into measurements:
Tons of effects need to be mitigated, including warmup, dead code
elimination, forking, etc. Luckily, JMH already takes care of many
things, and has bindings for both Java and Scala.
Based on Travis Brown's answer here is an example of how to setup JMH benchmark for Scala
Add jmh to project/plugins.sbt
addSbtPlugin("pl.project13.scala" % "sbt-jmh" % "0.3.7")
Enable jmh plugin in build.sbt
enablePlugins(JmhPlugin)
Add to src/main/scala/bench/VectorAppendVsListPreppendAndReverse.scala
package bench
import org.openjdk.jmh.annotations._
#State(Scope.Benchmark)
#BenchmarkMode(Array(Mode.AverageTime))
class VectorAppendVsListPreppendAndReverse {
val size = 1_000_000
val input = 1 to size
#Benchmark def vectorAppend: Vector[Int] =
input.foldLeft(Vector.empty[Int])({ case (acc, next) => acc.appended(next)})
#Benchmark def listPrependAndReverse: List[Int] =
input.foldLeft(List.empty[Int])({ case (acc, next) => acc.prepended(next)}).reverse
}
Execute benchmark with
sbt "jmh:run -i 10 -wi 10 -f 2 -t 1 bench.VectorAppendVsListPreppendAndReverse"
The results are
Benchmark Mode Cnt Score Error Units
VectorAppendVsListPreppendAndReverse.listPrependAndReverse avgt 20 0.024 ± 0.001 s/op
VectorAppendVsListPreppendAndReverse.vectorAppend avgt 20 0.130 ± 0.003 s/op
which seems to indicate prepending to a List and then reversing it at the end is order of magnitude faster than keep appending to a Vector.
I took the solution from Jesper and added some aggregation to it on multiple run of the same code
def time[R](block: => R) = {
def print_result(s: String, ns: Long) = {
val formatter = java.text.NumberFormat.getIntegerInstance
println("%-16s".format(s) + formatter.format(ns) + " ns")
}
var t0 = System.nanoTime()
var result = block // call-by-name
var t1 = System.nanoTime()
print_result("First Run", (t1 - t0))
var lst = for (i <- 1 to 10) yield {
t0 = System.nanoTime()
result = block // call-by-name
t1 = System.nanoTime()
print_result("Run #" + i, (t1 - t0))
(t1 - t0).toLong
}
print_result("Max", lst.max)
print_result("Min", lst.min)
print_result("Avg", (lst.sum / lst.length))
}
Suppose you want to time two functions counter_new and counter_old, the following is the usage:
scala> time {counter_new(lst)}
First Run 2,963,261,456 ns
Run #1 1,486,928,576 ns
Run #2 1,321,499,030 ns
Run #3 1,461,277,950 ns
Run #4 1,299,298,316 ns
Run #5 1,459,163,587 ns
Run #6 1,318,305,378 ns
Run #7 1,473,063,405 ns
Run #8 1,482,330,042 ns
Run #9 1,318,320,459 ns
Run #10 1,453,722,468 ns
Max 1,486,928,576 ns
Min 1,299,298,316 ns
Avg 1,407,390,921 ns
scala> time {counter_old(lst)}
First Run 444,795,051 ns
Run #1 1,455,528,106 ns
Run #2 586,305,699 ns
Run #3 2,085,802,554 ns
Run #4 579,028,408 ns
Run #5 582,701,806 ns
Run #6 403,933,518 ns
Run #7 562,429,973 ns
Run #8 572,927,876 ns
Run #9 570,280,691 ns
Run #10 580,869,246 ns
Max 2,085,802,554 ns
Min 403,933,518 ns
Avg 797,980,787 ns
Hopefully this is helpful
I like the simplicity of #wrick's answer, but also wanted:
the profiler handles looping (for consistency and convenience)
more accurate timing (using nanoTime)
time per iteration (not total time of all iterations)
just return ns/iteration - not a tuple
This is achieved here:
def profile[R] (repeat :Int)(code: => R, t: Long = System.nanoTime) = {
(1 to repeat).foreach(i => code)
(System.nanoTime - t)/repeat
}
For even more accuracy, a simple modification allows a JVM Hotspot warmup loop (not timed) for timing small snippets:
def profile[R] (repeat :Int)(code: => R) = {
(1 to 10000).foreach(i => code) // warmup
val start = System.nanoTime
(1 to repeat).foreach(i => code)
(System.nanoTime - start)/repeat
}
You can use System.currentTimeMillis:
def time[R](block: => R): R = {
val t0 = System.currentTimeMillis()
val result = block // call-by-name
val t1 = System.currentTimeMillis()
println("Elapsed time: " + (t1 - t0) + "ms")
result
}
Usage:
time{
//execute somethings here, like methods, or some codes.
}
nanoTime will show you ns, so it will hard to see. So I suggest that you can use currentTimeMillis instead of it.
While standing on the shoulders of giants...
A solid 3rd-party library would be more ideal, but if you need something quick and std-library based, following variant provides:
Repetitions
Last result wins for multiple repetitions
Total time and average time for multiple repetitions
Removes the need for time/instant provider as a param
.
import scala.concurrent.duration._
import scala.language.{postfixOps, implicitConversions}
package object profile {
def profile[R](code: => R): R = profileR(1)(code)
def profileR[R](repeat: Int)(code: => R): R = {
require(repeat > 0, "Profile: at least 1 repetition required")
val start = Deadline.now
val result = (1 until repeat).foldLeft(code) { (_: R, _: Int) => code }
val end = Deadline.now
val elapsed = ((end - start) / repeat)
if (repeat > 1) {
println(s"Elapsed time: $elapsed averaged over $repeat repetitions; Total elapsed time")
val totalElapsed = (end - start)
println(s"Total elapsed time: $totalElapsed")
}
else println(s"Elapsed time: $elapsed")
result
}
}
Also worth noting you can use the Duration.toCoarsest method to convert to the biggest time unit possible, although I am not sure how friendly this is with minor time difference between runs e.g.
Welcome to Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_60).
Type in expressions to have them evaluated.
Type :help for more information.
scala> import scala.concurrent.duration._
import scala.concurrent.duration._
scala> import scala.language.{postfixOps, implicitConversions}
import scala.language.{postfixOps, implicitConversions}
scala> 1000.millis
res0: scala.concurrent.duration.FiniteDuration = 1000 milliseconds
scala> 1000.millis.toCoarsest
res1: scala.concurrent.duration.Duration = 1 second
scala> 1001.millis.toCoarsest
res2: scala.concurrent.duration.Duration = 1001 milliseconds
scala>
adding on => method with name & seconds
profile[R](block: => R,methodName : String): R = {
val n = System.nanoTime()
val result = block
val n1 = System.nanoTime()
println(s"Elapsed time: ${TimeUnit.MILLISECONDS.convert(n1 - n,TimeUnit.NANOSECONDS)}ms - MethodName: ${methodName}")
result
}