How to find all maximum elements in a scalaz.Foldable container - scalaz

scalaz.Foldable has a maximumBy method that finds a maximum element in a container. But is there an elegant way to find them all using scalaz? i.e.:
Vector(Person("Ben", 1), Person("Jil", 3), Person("Bob", 3)).maximumsBy(_.age)
== Vector(Person("Jil", 3), Person("Bob", 3))
I have a problem where, if there are several equal maximum values, I want to select among these candidates randomly.

You can do something like this:
implicit def MaxNonEmptyListSemigroup[A: Order]: Semigroup[NonEmptyList[A]] =
  new Semigroup[NonEmptyList[A]] {
    def append(l1: NonEmptyList[A], l2: => NonEmptyList[A]): NonEmptyList[A] =
      Order[A].apply(l1.head, l2.head) match {
        case GT => l1
        case LT => l2
        case EQ => l1 append l2
      }
  }
// returns None if the list is empty,
// otherwise returns Some(non-empty list of maximum elements)
list.foldMap1Opt(a => NonEmptyList.nels(a)): Option[NonEmptyList[A]]

Ideally, maximumsBy will return the maximums in the same type of container as was provided. To do this efficiently seems to require scalaz.Reducer, a typeclass that models append- and prepend- to a container.
import scalaz._
import Ordering._
import std.AllInstances._
object Maximums extends App {
  def maximumsBy[F[_]: Foldable, A, B: Order](fa: F[A])(f: A => B)
                (implicit r: Reducer[A, F[A]]): Option[F[A]] =
    Foldable[F].foldMapLeft1Opt(fa)(a => (f(a), r.unit(a))) {
      case (curr @ (max, maxes), a) =>
        val next = f(a)
        Order[B].apply(next, max) match {
          case GT => (next, r.unit(a))
          case LT => curr
          case EQ => (max, r.snoc(maxes, a))
        }
    }.map(_._2)

  println(maximumsBy(Vector(("a", 1), ("c", 3), ("c", 3)))(_._2))
  println(maximumsBy(List(("a", 1), ("c", 3), ("c", 3)))(_._2))
  // Output:
  // Some(Vector((c,3), (c,3)))
  // Some(List((c,3), (c,3)))
}
I was slightly dismayed by how complex maximumsBy ended up. Are there any ways to simplify it, while keeping the same behavior?

Related

Better way to create dictionary of functions

#attempt 1: works
f(x::Int64) = x +1
my_functions = Dict("f" => f)
#attempt 2: does not work, something is wrong
new_functions = Dict("g" => g(x::Int64) = x + 5)
I'm a novice and new to Julia. Is there a way to accomplish this similar to my 2nd attempt above? Thanks
You can use anonymous function syntax like this:
new_functions = Dict("g" => x::Int64 -> x + 5)
You can read the details of how they are used in the Julia manual: https://docs.julialang.org/en/latest/manual/functions/#man-anonymous-functions-1.
Edit: notice that if you initially add only one function to the dictionary its type will be too restrictive, like: Dict{String,getfield(Main, Symbol("##3#4"))}, e.g.:
julia> new_functions = Dict("g" => x::Int64 -> x + 5)
Dict{String,getfield(Main, Symbol("##15#16"))} with 1 entry:
"g" => ##15#16()
So you probably should specify the type explicitly like:
julia> new_functions = Dict{String, Function}("g" => x::Int64 -> x + 5)
Dict{String,Function} with 1 entry:
"g" => ##23#24()
or add at least two entries to the dictionary initially:
julia> new_functions = Dict("g" => x::Int64 -> x + 5, "h" => x -> x+1)
Dict{String,Function} with 2 entries:
"g" => ##11#13()
"h" => ##12#14()
For completeness: there's also the possibility to use the normal multi-line function syntax as an expression, which will create a function object with a name (like a "named function expression" in JavaScript; this is handy if you need recursion):
julia> Dict("g" => function g(x::Int); x + 5; end)
Dict{String,typeof(g)} with 1 entry:
"g" => g
The first ; in the line is necessary here. @BogumiƂ's caveats about typing the Dict apply as well, as you can see.
Using the short-form syntax is possible, too, but you have to put the expression into parentheses:
Dict("g" => (g(x::Int) = x + 5))

Play JSON Reads[T]: split a JsArray into multiple subsets

I have a JSON structure that contains an array of events. The array is "polymorphic" in the sense that there are three possible event types A, B and C:
{
...
"events": [
{ "eventType": "A", ...},
{ "eventType": "B", ...},
{ "eventType": "C", ...},
...
]
}
The three event types don't have the same object structure, so I need different Reads for them. And apart from that, the target case class of the whole JSON document distinguishes between the events:
case class Doc(
...,
aEvents: Seq[EventA],
bEvents: Seq[EventB],
cEvents: Seq[EventC],
...
)
How can I define the internals of Reads[Doc] so that the json array events is split into three subsets which are mapped to aEvents, bEvents and cEvents?
What I tried so far (without being successful):
First, I defined a Reads[JsArray] to transform the original JsArray to another JsArray that only contains events of a particular type:
def eventReads(eventTypeName: String) = new Reads[JsArray] {
  override def reads(json: JsValue): JsResult[JsArray] = json match {
    case JsArray(seq) =>
      val filtered = seq.filter { jsVal =>
        (jsVal \ "eventType").asOpt[String].contains(eventTypeName)
      }
      JsSuccess(JsArray(filtered))
    case _ => JsError("Must be an array")
  }
}
Then the idea is to use it like this within Reads[Doc]:
implicit val docReads: Reads[Doc] = (
...
(__ \ "events").read[JsArray](eventReads("A")).andThen... and
(__ \ "events").read[JsArray](eventReads("B")).andThen... and
(__ \ "events").read[JsArray](eventReads("C")).andThen... and
...
)(Doc.apply _)
However, I don't know how to go on from here. I assume the andThen part should look something like this (in case of event a):
.andThen[Seq[EventA]](EventA.reads)
But that doesn't work, since I'd expect the API to create a Seq[EventA] from an explicitly passed Reads[EventA], whereas it wants a Reads[Seq[EventA]]. And apart from that, since I've never got it running, I'm not sure whether this whole approach is reasonable in the first place.
Edit: if the original JsArray contains unknown event types (e.g. D and E), these types should be ignored and left out of the final result (instead of making the whole Reads fail).
Put an implicit Reads for every event type, like:
def eventRead[A](et: String, er: Reads[A]) = (__ \ "eventType").read[String].filter(_ == et).andKeep(er)
implicit val eventARead = eventRead("A", Json.reads[EventA])
implicit val eventBRead = eventRead("B", Json.reads[EventB])
implicit val eventCRead = eventRead("C", Json.reads[EventC])
and use a Reads[Doc] that folds the event list into separate sequences by type and applies the result to Doc:
implicit val docReads: Reads[Doc] = (__ \ "events").read[List[JsValue]].map(
  _.foldLeft[JsResult[(Seq[EventA], Seq[EventB], Seq[EventC])]](
    JsSuccess((Seq.empty[EventA], Seq.empty[EventB], Seq.empty[EventC]))
  ) {
    case (JsSuccess(a, _), v) =>
      v.validate[EventA].map(e => a.copy(_1 = e +: a._1)) or
        v.validate[EventB].map(e => a.copy(_2 = e +: a._2)) or
        v.validate[EventC].map(e => a.copy(_3 = e +: a._3))
    case (e, _) => e
  }
).flatMap(p => Reads[Doc] { js => p.map(Doc.tupled) })
It will create the Doc in one pass through the events list:
JsSuccess(Doc(List(EventA(a)),List(EventB(b2), EventB(b1)),List(EventC(c))),)
given the source data:
val json = Json.parse("""{"events": [
  | { "eventType": "A", "e": "a"},
  | { "eventType": "B", "ev": "b1"},
  | { "eventType": "C", "event": "c"},
  | { "eventType": "B", "ev": "b2"}
  | ]
  |}
  |""".stripMargin)
case class EventA(e: String)
case class EventB(ev: String)
case class EventC(event: String)
I would model the fact that you store different event types in your JSON array as a class hierarchy, to keep it type-safe.
sealed abstract class Event
case class EventA() extends Event
case class EventB() extends Event
case class EventC() extends Event
Then you can store all your events in a single collection and use pattern matching later to refine them. For example:
case class Doc(events: Seq[Event]) {
  def getEventsA: Seq[EventA] = events.flatMap {
    case e: EventA => Some(e)
    case _ => None
  }
}
Doc(Seq(EventA(), EventB(), EventC())).getEventsA // res0: Seq[EventA] = List(EventA())
For implementing your Reads, Doc will be naturally mapped to the case class, you only need to provide a mapping for Event. Here is what it could look like:
implicit val eventReads = new Reads[Event] {
  override def reads(json: JsValue): JsResult[Event] = json \ "eventType" match {
    case JsDefined(JsString("A")) => JsSuccess(EventA())
    case JsDefined(JsString("B")) => JsSuccess(EventB())
    case JsDefined(JsString("C")) => JsSuccess(EventC())
    case _ => JsError("???")
  }
}
implicit val docReads = Json.reads[Doc]
You can then use it like this:
val jsValue = Json.parse("""
{
"events": [
{ "eventType": "A"},
{ "eventType": "B"},
{ "eventType": "C"}
]
}
""")
val docJsResults = docReads.reads(jsValue) // docJsResults: play.api.libs.json.JsResult[Doc] = JsSuccess(Doc(List(EventA(), EventB(), EventC())),/events)
docJsResults.get.events.length // res1: Int = 3
docJsResults.get.getEventsA // res2: Seq[EventA] = List(EventA())
Hope this helps.

Mapping a sequence of results from Slick monadic join to Json

I'm using Play 2.4 with Slick 3.1.x, specifically the Slick-Play plugin v1.1.1. Firstly, some context... I have the following search/filter method in a DAO, which joins together 4 models:
def search(
  departureCity: Option[String],
  arrivalCity: Option[String],
  departureDate: Option[Date]
) = {
  val monadicJoin = for {
    sf <- slickScheduledFlights.filter(a =>
      departureDate.map(d => a.date === d).getOrElse(slick.lifted.LiteralColumn(true))
    )
    fl <- slickFlights if sf.flightId === fl.id
    al <- slickAirlines if fl.airlineId === al.id
    da <- slickAirports.filter(a =>
      fl.departureAirportId === a.id &&
      departureCity.map(c => a.cityCode === c).getOrElse(slick.lifted.LiteralColumn(true))
    )
    aa <- slickAirports.filter(a =>
      fl.arrivalAirportId === a.id &&
      arrivalCity.map(c => a.cityCode === c).getOrElse(slick.lifted.LiteralColumn(true))
    )
  } yield (fl, sf, al, da, aa)

  db.run(monadicJoin.result)
}
The output from this is a Vector of tuples, e.g.:
Vector(
(
Flight(Some(1),123,216,2013,3,1455,2540,3,905,500,1150),
ScheduledFlight(Some(1),1,2016-04-13,90,10),
Airline(Some(216),BA,BAW,British Airways,United Kingdom),
Airport(Some(2013),LHR,Heathrow,LON,...),
Airport(Some(2540),JFK,John F Kennedy Intl,NYC...)
),
(
etc ...
)
)
I'm currently rendering the JSON in the controller by calling .toJson on a Map and inserting this Vector (the results param below), like so:
flightService.search(departureCity, arrivalCity, departureDate).map(results => {
  Ok(
    Map[String, Any](
      "status" -> "OK",
      "data" -> results
    ).toJson
  ).as("application/json")
})
While this sort of works, it produces output in an unusual format: within each result object in the array (the rows), the joined models are nested inside objects with the keys "_1", "_2" and so on.
So the question is: How should I go about restructuring this?
There doesn't appear to be anything that specifically covers this sort of scenario in the Slick docs. Therefore I would be grateful for some input on the best way to refactor this Vector of tuples, with a view to naming each of the joined models, or even flattening the structure and keeping only certain fields.
Is this best done in the DAO search method before it's returned (by mapping it somehow?) or in the controller after I get back the Future results Vector from the search method?
Or I'm wondering whether it would be preferable to abstract this sort of mutation out somewhere else entirely, using a transformer perhaps?
You need JSON Reads/Writes/Format Combinators
In the first place you must have Writes[T] for all your classes (Flight, ScheduledFlight, Airline, Airport).
The simple way is to use the Json macros:
implicit val flightWrites: Writes[Flight] = Json.writes[Flight]
implicit val scheduledFlightWrites: Writes[ScheduledFlight] = Json.writes[ScheduledFlight]
implicit val airlineWrites: Writes[Airline] = Json.writes[Airline]
implicit val airportWrites: Writes[Airport] = Json.writes[Airport]
You must also implement an OWrites[(Flight, ScheduledFlight, Airline, Airport, Airport)] for the Vector items. For example:
val itemWrites: OWrites[(Flight, ScheduledFlight, Airline, Airport, Airport)] = (
(__ \ "flight").write[Flight] and
(__ \ "scheduledFlight").write[ScheduledFlight] and
(__ \ "airline").write[Airline] and
(__ \ "airport1").write[Airport] and
(__ \ "airport2").write[Airport]
).tupled
For writing the whole Vector as a JsArray, use Writes.seq[T]:
val resultWrites: Writes[Seq[(Flight, ScheduledFlight, Airline, Airport, Airport)]] = Writes.seq(itemWrites)
Now we have everything we need to write the response:
flightService.search(departureCity, arrivalCity, departureDate).map(results =>
  Ok(
    Json.obj(
      "status" -> "Ok",
      "data" -> resultWrites.writes(results)
    )
  )
)

Why does JsArray parsing behave differently depending on code structure, even though the logic stayed the same?

I'm doing a small refactoring, trying to keep the logical outcome intact. From:
val mapped: Seq[Option[String]] =
  (mr.getNormalizedValue(1) \ "getapworkflowinfo1").as[JsArray].value.map { v =>
    (v \ "Description").as[String] match {
      case value if List("referral to electrophysiology").exists(value.toLowerCase.equals) =>
        Some("true")
      case _ =>
        None
    }
  }
mapped.flatten.lastOption
To:
val referralIndicators: Seq[Boolean] =
(mr.getNormalizedValue(1) \ "getapworkflowinfo1").as[JsArray].value
// Step 1.1 Extracting and checking description
.map(d => (d \ "Description").as[String].toLowerCase().equals("referral to electrophysiology"))
// Step 2. Returning if at least once there was referral to electrophysiology
Some(referralIndicators.exists(v => v)).map(v => v.toString)
These should be logically equivalent (and therefore should generate the same outputs for the same inputs).
Yet the refactored code effectively parses better, and the results it returns are better than before.
Can someone explain what the difference between the two is?

find function matlab in numpy/scipy

Is there an equivalent of find(A>9,1) from matlab in numpy/scipy? I know there is the nonzero function in numpy, but what I need is the first index, so that I can use it in another extracted column.
Ex: A = [ 1 2 3 9 6 4 3 10 ]
find(A>9,1) would return index 4 in matlab
The equivalent of find in numpy is nonzero, but it does not support a second parameter.
But you can do something like this to get the behavior you are looking for.
B = nonzero(A >= 9)[0]
But if all you are looking for is finding the first element that satisfies a condition, you are better off using max.
For example, in matlab, find(A >= 9, 1) would be the same as [~, idx] = max(A >= 9). The equivalent function in numpy would be the following.
idx = (A >= 9).argmax()
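On the sample array from the question this can be checked directly (a minimal sketch; `(A >= 9)` is a boolean mask and `argmax` returns the index of its first True entry):

```python
import numpy as np

A = np.array([1, 2, 3, 9, 6, 4, 3, 10])

# The mask is [False, False, False, True, False, False, False, True];
# argmax returns the index of the first True entry.
idx = (A >= 9).argmax()
print(idx)  # 3 (0-based), i.e. MATLAB's index 4
```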
matlab's find(X, K) is roughly equivalent to numpy.nonzero(X)[0][:K] in python. @Pavan's argmax method is probably a good option if K == 1, but unless you know a priori that there will be a value in A >= 9, you will probably need to do something like:
idx = (A >= 9).argmax()
if (idx == 0) and (A[0] < 9):
    # No value in A is >= 9
    ...
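A small helper wrapping this guard (the name `find_first` and the `None`-on-no-match convention are my own, not from any library) might look like:

```python
import numpy as np

def find_first(mask, k=1):
    """Return the first k indices where mask is True, like MATLAB's
    find(X, k), or None when no element matches."""
    idx = np.nonzero(mask)[0][:k]
    return idx if idx.size else None

A = np.array([1, 2, 3, 9, 6, 4, 3, 10])
print(find_first(A >= 9))   # [3]
print(find_first(A > 100))  # None
```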
I'm sure these are all great answers but I wasn't able to make use of them. However, I found another thread that partially answers this:
MATLAB-style find() function in Python
John posted the following code, which accounts for the first argument of find (in your case A>9 in find(A>9,1)) but not the second argument.
I altered John's code in a way that I believe accounts for the second argument ",1":
def indices(a, func):
    return [i for (i, val) in enumerate(a) if func(val)]

a = [1, 2, 3, 9, 6, 4, 3, 10]
threshold = indices(a, lambda y: y >= 9)[0]
This returns threshold = 3. My understanding is that Python's indexing starts at 0, so it's the equivalent of matlab saying 4. You can change which index is returned by changing the number in the brackets, i.e. [1], [2], etc. instead of [0].
John's original code:
def indices(a, func):
    return [i for (i, val) in enumerate(a) if func(val)]

a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
inds = indices(a, lambda x: x > 2)
which returns inds == [2, 5, 8].
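For plain Python lists there is also a lazy variant of this idea (not from the original answers, just a stdlib alternative): `next` over `enumerate` stops at the first match instead of building the whole list.

```python
a = [1, 2, 3, 9, 6, 4, 3, 10]

# Scan lazily and stop at the first index whose value satisfies the
# predicate; the default (None here) is returned when nothing matches.
first = next((i for i, val in enumerate(a) if val >= 9), None)
print(first)  # 3
```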
Consider using argwhere in Python to replace MATLAB's find function. For example,
import numpy as np
A = [1, 2, 3, 9, 6, 4, 3, 10]
np.argwhere(np.asarray(A)>=9)[0][0] # Return first index
returns 3.
You can do this by first converting the list to an ndarray, then using the function numpy.where() to get the desired index:
import numpy
A = numpy.array([1, 2, 3, 9, 6, 4, 3, 10])
index = numpy.where(A >= 9)