Is there an already available library that can convert a JSON string (most probably more than one row of data) to a CSV file?
I googled a lot for such libraries in Scala, but I could find none.
What I am required to do is retrieve data from a DB source; the result set is in JSON format, and I need to convert it into CSV.
Previously, what I did was convert the JSON into a relevant Seq[case-class] and try to use libraries like:
scala-csv (tototoshi)
mighty-csv
But these didn't prove very useful in the case of case classes containing deep hierarchies.
Any suggestions?
product-collections will convert a Seq[case class] to csv.
case class Foo(a:Int,b:String)
Seq(Foo(1,"aa"),Foo(2,"bb")).csvIterator.mkString("\n")
res27: String =
1,"aa"
2,"bb"
To deserialize the JSON I'd probably use Scala Pickling.
"Deep hierarchies" are likely to be problematic.
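For the deep-hierarchy case, one pragmatic workaround (plain Scala, not part of product-collections; the names below are illustrative) is to flatten each nested case class into a flat row type before producing CSV:

```scala
// Hypothetical nested model, flattened by hand before CSV output.
case class Inner(x: Int, y: String)
case class Outer(id: Int, inner: Inner)
case class FlatRow(id: Int, x: Int, y: String)

def flatten(o: Outer): FlatRow = FlatRow(o.id, o.inner.x, o.inner.y)

val rows = Seq(Outer(1, Inner(2, "a")), Outer(3, Inner(4, "b")))
rows.map(flatten).map(r => s"${r.id},${r.x},${r.y}").mkString("\n")
// 1,2,a
// 3,4,b
```

A Seq[FlatRow] can then be fed to any of the flat-row CSV libraries mentioned above.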
Not entirely sure what you mean by deep hierarchies. Something like this?
case class Baz(i: Int)
case class Bar(baz: Baz)
case class Foo(bar: Bar)
In which case kantan.csv can answer part of the equation: turning deep case class hierarchies into CSV data. It's fairly trivial, provided you don't mind a shapeless dependency:
import java.io.File
import kantan.csv.ops._
import kantan.csv.generic.codecs._
val fs: List[Foo] = ???
fs.foldLeft(new File("output.csv").asCsvWriter[Foo])(_ write _).close()
If you have an issue with shapeless, you can provide encoders for your case classes yourself:
import kantan.csv._
implicit val bazCodec = Codec.caseCodec1(Baz.apply, Baz.unapply)(0)
implicit val barCodec = Codec.caseCodec1(Bar.apply, Bar.unapply)(0)
implicit val fooCodec = Codec.caseCodec1(Foo.apply, Foo.unapply)(0)
A bit more boilerplatey, but still acceptable, I think.
I've got deeply nested JSON parsers (using json4s.jackson) that I'm trying to simplify using case classes.
My problem is that some of the fields start with numbers, but Scala cannot have an argument name that starts with a numeric character.
Example:
import org.json4s.jackson.JsonMethods._
import org.json4s._
implicit val formats = DefaultFormats
val jsonStr = """{"5gLog":{"i":99}}""" // <--- note the field "5gLog"
val jval = parse(jsonStr)
case class Raw5gLog(i: Int)
val raw5gLog = (jval \ "5gLog").extract[Raw5gLog]
This works. But what I need to do, because these fields are nested deep within the JSON... is something like this...
val jsonStr = """{"xgLog":{"i":99}}"""
val jval = parse(jsonStr)
case class RawRecord(xgLog: Raw5gLog)
val rawRecord = jval.extract[RawRecord]
This would work... if the fields were named like xgLog, but the fields are actually named like 5gLog as above, and I can't give an arg name to a class like 5gLog...
case class RawRecord(5gLog: Raw5gLog)
// error: Invalid literal number
I thought about something like
parse(jsonStr.replace("\"5g", "\"fiveg"))
But there's real data, beyond the field names, in the JSON that can be impacted.
The best solution I can figure is to add extra apply methods to the affected case classes...
case class RawRecord(fivegLog: Raw5gLog)
object RawRecord {
def apply(jval: JValue): RawRecord =
RawRecord( (jval \ "5gLog").extract[Raw5gLog] )
}
val rawRecord = RawRecord(jval)
But I feel like every time I make some structurally different workaround like this for an edge case, it's always the beginning of my code turning into a mess. I could give every case class a new apply method and use it for everything, but it seems like a lot of extra code for a small subset of the data.
Is there a better way?
Scala can use any string as a variable name, but you may have to quote it with backticks:
case class RawRecord(`5gLog`: Raw5gLog)
You also need to do this if you have a field called type or any other reserved word. This is how - can be used as a function name, for example.
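Combining the backtick-quoted name with the json4s extraction from the question, a minimal sketch:

```scala
import org.json4s._
import org.json4s.jackson.JsonMethods._

implicit val formats = DefaultFormats

case class Raw5gLog(i: Int)
// Backticks let the constructor parameter match the JSON field "5gLog".
case class RawRecord(`5gLog`: Raw5gLog)

val rawRecord = parse("""{"5gLog":{"i":99}}""").extract[RawRecord]
// rawRecord.`5gLog`.i == 99
```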
I have used Circe previously for case class serialization/deserialization, and love how it can be used without the boilerplate code required by other Scala JSON libraries, but I'm running into an issue now I'm not sure how to resolve. I have an ADT (a sealed trait with several case class instances) that I would like to treat generically from my Akka Http service (using akka-http-json), i.e. return a List[Foo], where Foo is the trait type. But when I do so using Circe's auto-derivation (via Shapeless), it serializes the instances using the specific case class name as a 'discriminator': if my List[Foo] contains instances of Foo1, then each element in the resulting serialized list will have the key Foo1.
I would like to eliminate the type name as a discriminator. Instead of each element in the sequence being prefixed with the type name, e.g. "Foo1": {"id": "1", "name": "First", ...}, I just want to serialize the case class instances to contain the fields of the case class, e.g. {"id": "1", "name": "First", ...}. Essentially, I'd like to eliminate the type-name keys (I don't want the front-end to have to know what concrete case class each element belongs to on the back-end). All elements in the list to be serialized will be of the same concrete type, all of which are subtypes of my ADT (trait) type.
I believe this can be done using Circe's semi-auto derivation, though I haven't had a chance to figure out exactly how. Basically, I would like to use as much of Circe's auto-derivation as possible, but eliminate outer-level class names from appearing in the resulting JSON. Any help / suggestions would be very much appreciated! Thanks!
You can do it by following the instructions in the docs: https://circe.github.io/circe/codecs/adt.html
import cats.syntax.functor._
import io.circe.{ Decoder, Encoder }, io.circe.generic.auto._
import io.circe.syntax._
sealed trait Event
case class Foo(i: Int) extends Event
case class Bar(s: String) extends Event
case class Baz(c: Char) extends Event
case class Qux(values: List[String]) extends Event
object GenericDerivation {
implicit val encodeEvent: Encoder[Event] = Encoder.instance {
case foo @ Foo(_) => foo.asJson
case bar @ Bar(_) => bar.asJson
case baz @ Baz(_) => baz.asJson
case qux @ Qux(_) => qux.asJson
}
implicit val decodeEvent: Decoder[Event] =
List[Decoder[Event]](
Decoder[Foo].widen,
Decoder[Bar].widen,
Decoder[Baz].widen,
Decoder[Qux].widen
).reduceLeft(_ or _)
}
import GenericDerivation._
import io.circe.parser.decode
decode[Event]("""{ "i": 1000 }""")
// res0: Either[io.circe.Error,Event] = Right(Foo(1000))
(Foo(100): Event).asJson.noSpaces
// res1: String = {"i":100}
This may not be the best answer, but after some more searching this is what I've been able to find. Instead of having the class name as a key in the JSON produced, it can be serialized as a field, as follows:
implicit val genDevConfig: Configuration = Configuration.default.withDiscriminator("type")
(you can use whatever field name here you'd like; Travis Brown's previous example for a similar issue used a field named what_am_i). So my apologies-- I do not yet know if there is a canonical or widely accepted solution to this problem, especially one that will easily work with Akka Http, using libraries such as akka-http-json, where I still seem to be encountering some issues, though I'm sure I'm probably overlooking something obvious! Anyway, my apologies for asking a question that seems to come up repeatedly!
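For reference, a minimal sketch of that discriminator configuration, assuming the circe-generic-extras module is on the classpath (the tiny Event/Foo ADT here is illustrative):

```scala
import io.circe.generic.extras.Configuration
import io.circe.generic.extras.auto._
import io.circe.syntax._

sealed trait Event
case class Foo(i: Int) extends Event

// The type name becomes a regular "type" field rather than a wrapping key.
implicit val config: Configuration = Configuration.default.withDiscriminator("type")

val json = (Foo(100): Event).asJson.noSpaces
// json contains "i":100 and "type":"Foo" at the top level
```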
I have some scala code that requires the use of implicits for serializing and deserializing json.
We previously had something that worked by putting these implicit statements (simplified with dummies):
(in some class SomeClass1)
implicit val some1format = Json.format[SomeItem1]
implicit val some2format = Json.format[SomeItem2]
...
All as class-level variables. Any method within the class was then able to convert from Json just fine.
However, we are trying to move the implicit definitions of these formats to a separate object.
So we created an object (for example: SomeFormatters), which only contains these implicits:
object SomeFormatters {
implicit val some1format = Json.format[SomeItem1]
implicit val some2format = Json.format[SomeItem2]
}
When I try to import this object into SomeClass1, I get a compilation error saying that no deserializer was found for SomeItem1 or SomeItem2, even though I am importing SomeFormatters. (The IDE says the import of SomeFormatters is unused though, so I already knew something was off.)
What's the proper way to get SomeClass1 to know about the implicit definitions in SomeFormatters?
The issue was that the implicit values had no explicit type annotations.
Instead of:
implicit val some1format = Json.format[SomeItem1]
I needed to put:
implicit val some1format: Format[SomeItem1] = Json.format[SomeItem1]
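Put together, the shared object looks like this (a minimal sketch; SomeItem1 and SomeItem2 stand in for the real model classes):

```scala
import play.api.libs.json.{Format, Json}

case class SomeItem1(a: String)
case class SomeItem2(b: Int)

object SomeFormatters {
  // Explicit Format annotations make the implicits resolvable at the import site.
  implicit val some1format: Format[SomeItem1] = Json.format[SomeItem1]
  implicit val some2format: Format[SomeItem2] = Json.format[SomeItem2]
}
```

With import SomeFormatters._ inside SomeClass1, the formats are then in implicit scope.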
I have an HTTP client written in Scala that uses json4s/jackson to serialize and deserialize HTTP payloads. So far I was using only Scala case classes as the model and everything was working fine, but now I have to communicate with a third-party service. They provided me with their own model, but it's written in Java, so now I need to deserialize JSON also into Java classes. It seems to work fine with simple classes, but when a class contains collections like Lists or Maps, json4s has problems and sets all such fields to null.
Is there any way to handle such cases? Maybe I should use different formats (I'm using DefaultFormats plus a few custom ones). Example of the problem, with a test:
import org.json4s.DefaultFormats
import org.json4s.jackson.Serialization.read
import org.scalatest.{FlatSpec, Matchers}
class JavaListTest extends FlatSpec with Matchers{
implicit val formats = DefaultFormats
"Java List" should "be deserialized properly" in {
val input = """{"list":["a", "b", "c"]}"""
val output = read[ObjectWithList](input)
output.list.size() shouldBe 3
}
}
And sample Java class:
import java.util.List;
public class ObjectWithList {
List<String> list;
}
I have also noticed that when I try to deserialize into a Scala case class that contains a field of type java.util.List[String], I get an exception: org.json4s.package$MappingException: Expected collection but got List[String]
The key to solving your issue is composition of formats. Basically, you want to define a JList format as the List format composed with a toJList function.
Unfortunately, json4s Formats are extremely difficult to compose, so I used Readers instead, to give you the idea. I also simplified the example to only a Java list:
import org.json4s._
import org.json4s.DefaultReaders._
import org.json4s.jackson.JsonMethods.parse
import scala.collection.JavaConverters._
implicit def javaListReader[A: Reader]: Reader[java.util.List[A]] = new Reader[java.util.List[A]] {
override def read(value: JValue) = DefaultReaders.traversableReader[List, A].read(value).asJava
}
val input = """["a", "b", "c"]"""
val output = Formats.read[java.util.List[String]](parse(input))
To my knowledge, json4s Readers will not work with Java classes out of the box, so you might either need to implement a Serializer[JList[_]] the same way, or mirror your Java classes with case classes and use those inside your domain.
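The case-class mirroring route can be sketched like this (ObjectWithListMirror is a hypothetical Scala twin of the Java class):

```scala
import scala.collection.JavaConverters._
import org.json4s.DefaultFormats
import org.json4s.jackson.Serialization.read

// Hypothetical Scala mirror of the Java ObjectWithList.
case class ObjectWithListMirror(list: List[String])

implicit val formats = DefaultFormats

val mirror = read[ObjectWithListMirror]("""{"list":["a","b","c"]}""")
// Convert back to the Java shape where the third-party API needs it.
val javaList: java.util.List[String] = mirror.list.asJava
```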
P.S.
I highly recommend switching to circe or argonaut; then you will forget about most of the problems with JSON.
Given lift-json 2.0 and the following Scala classes & sealed trait:
sealed trait Location
case class Coordinate(latitude: Double,
longitude: Double) extends Location
case class Address(...) extends Location
I'd like to be able to deserialize a Location object without determining the concrete implementation:
deserialize[Location](""" { "latitude":0.0, "longitude":0.0 } """)
should produce the equivalent of:
val location: Location = Coordinate(0.0, 0.0)
Any way of doing this?
This may not be what you want, but with:
implicit val formats = net.liftweb.json.DefaultFormats
.withHints(ShortTypeHints(List(classOf[Geo], classOf[Address])))
enables you to write
val loc: Location = read(write(Geo(0.0, 0.0)))
However, your JSON then gets a type hint:
{"jsonClass":"Geo","latitude":0.0,"longitude":0.0}
These formats can be tweaked somewhat; here is a nice post about type hints.
Lift-Json won't automatically detect a subclass to deserialize to, because there is no straightforward way to do that. You might have two subclasses of Location that accept latitude and longitude constructor parameters, or some other ambiguity.
Type hints are definitely one way to go. If you don't want to muddy up your JSON with Scala-specific info, though, you can also deserialize the string representation to a JValue, then inspect the structure to determine which class to bind to.
def location(str: String): Location = {
  val jv = JsonParser.parse(str)
  jv match {
    case jo: JObject if jo.children.size == 2 => Extraction.extract[Coordinate](jo)
    case _ => Extraction.extract[Address](jv)
  }
}
If you can't choose your type based on arity, that would be more complicated, but you get the general idea.
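If arity is ambiguous, matching on a characteristic field is one alternative; a sketch with lift-json, assuming an implicit Formats in scope:

```scala
import net.liftweb.json._

def location(jv: JValue)(implicit formats: Formats): Location =
  (jv \ "latitude") match {
    // The presence of a "latitude" key marks a Coordinate.
    case JNothing => Extraction.extract[Address](jv)
    case _ => Extraction.extract[Coordinate](jv)
  }
```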
@Dave Whittaker and @Rin malavi are right in general. Also, if you want a temporary solution, you may use:
jv.extractOpt[Address] getOrElse jv.extract[Coordinate]