I'm trying to parse a CSV file. Currently I have this code:
alias NimbleCSV.RFC4180, as: CSV

defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")
    stream = File.stream!("name.csv")
    original_line = CSV.parse_stream(stream)

    filter_line =
      Stream.filter(original_line, fn
        ["JeremyGuthrie" | _] -> true
        _ -> false
      end)

    map =
      Stream.map(filter_line, fn [name, team, position, height, weight, age] ->
        %{name: name, team: team, position: position,
          height: String.to_integer(height),
          weight: String.to_integer(weight),
          age: Float.parse(age) |> elem(0)}
      end)
  end
end
As I understand it, I build a stream that handles each line of my name.csv file. With the NimbleCSV library I parse each line and skip the header line. Then I filter the lines to keep only the one corresponding to JeremyGuthrie. And finally I store the line's elements in a structured map. But now, how do I print just the name from my filtered line, here JeremyGuthrie?
And I have another question: I'm having trouble filtering my stream by a number such as age, height, or weight.
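To illustrate both points, here is a minimal sketch (not from the original thread, and assuming the same six-column layout): CSV fields arrive as strings, so they must be converted before a numeric comparison, and a field can then be read from the resulting map to print it.
# Sketch: filter on a numeric column (example threshold), then print one field.
# Parsed CSV fields are strings until converted.
[player | _] =
  "name.csv"
  |> File.stream!()
  |> CSV.parse_stream()
  |> Stream.filter(fn [_name, _team, _position, height, _weight, _age] ->
    String.to_integer(height) > 72
  end)
  |> Stream.map(fn [name, team | _] -> %{name: name, team: team} end)
  |> Enum.to_list()

IO.puts(player.name)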
Here I apply Aleksei's advice in another version of the code:
NimbleCSV.define(MyParser, separator: ";", escape: "\"")

defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")

    "ActeursEOF.csv"
    |> File.stream!()
    |> MyParser.parse_stream()
    |> Stream.filter(fn
      ["RAZEL BEC" | _] -> true
      ["" | _] -> false
      _ -> false
    end)
    |> Stream.map(fn [name, description, enr_competences] ->
      %{name: name, description: description, enr_competences: enr_competences}
    end)
    |> Enum.to_list()
    |> IO.inspect()
  end
end
My output:
Compiling 1 file (.ex)
Let's parse CSV file!
[%{description: "Génie Civil", enr_competences: "Oui", name: "RAZEL BEC"}]
But now, to close this subject, I would like to access and store just the description, for instance, and I don't see how to do that... And finally, display this data.
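One possible approach (a sketch building on the single-row output above): pattern match the one-element result list, then read the :description key from the map.
# Sketch: grab the single matching row, then read one field from it.
[row] =
  "ActeursEOF.csv"
  |> File.stream!()
  |> MyParser.parse_stream()
  |> Stream.filter(&match?(["RAZEL BEC" | _], &1))
  |> Stream.map(fn [name, description, enr_competences] ->
    %{name: name, description: description, enr_competences: enr_competences}
  end)
  |> Enum.to_list()

IO.puts(row.description) # => Génie Civil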
Producing intermediate variables is redundant; in Elixir we have Kernel.|>/2, a.k.a. the pipe operator, to pipe a function's output into the first argument of the next function.
"name.csv"
|> File.stream!()
|> CSV.parse_stream()
|> Stream.filter(fn
["JeremyGuthrie" | _] -> true
_ -> false
end)
|> Stream.map(fn
[name, team, position, height, weight, age] ->
%{name: name, team: team, position: position,
height: String.to_integer(height),
weight: String.to_integer(weight),
age: Float.parse(age) |> elem(0)
}
end)
|> Enum.to_list() # THIS
Note the last line in the chain. Streams must be terminated to retrieve the result. Until that termination happens, the stream is lazily constructed but not evaluated at all. That is what makes it possible to, e.g., produce and operate on infinite streams.
Any greedy function from the Enum module would do: Enum.take/2 or, as I pointed out above, Enum.to_list/1.
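A tiny illustration of that laziness (not part of the original answer):
# Nothing runs until a greedy Enum call terminates the stream.
evens = Stream.iterate(0, &(&1 + 2)) # an infinite stream of even numbers
Enum.take(evens, 5)                  # => [0, 2, 4, 6, 8]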
For the sake of reference: in the future, when you feel fully familiar with Elixir, you might use Flow instead of Stream to parallelize the mapping. For now (and for relatively small files) Stream is good enough.
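As a sketch of what that might look like (assuming a flow dependency is added to mix.exs; Flow.from_enumerable/1 and Flow.filter/2 come from that library):
# Sketch only: Flow spreads the filtering work across stages/cores.
"name.csv"
|> File.stream!()
|> CSV.parse_stream()
|> Flow.from_enumerable()
|> Flow.filter(&match?(["JeremyGuthrie" | _], &1))
|> Enum.to_list()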
Hello, I'm a beginner in Elixir, and I want to parse a CSV file and store it in an Elixir data structure.
But it displays this:
** (FunctionClauseError) no function clause matching in anonymous fn/1 in Siren.parseCSV/0
The following arguments were given to anonymous fn/1 in Siren.parseCSV/0:
# 1
["41", "5", "59", "N", "80", "39", "0", "W", "Youngstown", "OH"]
anonymous fn/1 in Siren.parseCSV/0
(elixir 1.10.3) lib/stream.ex:482: anonymous fn/4 in Stream.filter/2
(elixir 1.10.3) lib/stream.ex:1449: Stream.do_element_resource/6
(elixir 1.10.3) lib/stream.ex:1609: Enumerable.Stream.do_each/4
(elixir 1.10.3) lib/enum.ex:959: Enum.find/3
(mix 1.10.3) lib/mix/task.ex:330: Mix.Task.run_task/3
(mix 1.10.3) lib/mix/cli.ex:82: Mix.CLI.run_task/2
Here is my code:
defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file...")

    File.stream!("../name.csv")
    |> Stream.map(&String.trim(&1))
    |> Stream.map(&String.split(&1, ","))
    |> Stream.filter(fn
      ["LatD" | _] -> false
    end)
    |> Enum.find(fn State -> String
      [LatD, LatM, LatS, NS, LonD, LonM, LonS, EW, City, State] ->
        IO.puts("find -> #{State}")
        true
    end)
  end
end
And the CSV file:
LatD,LatM,LatS,NS,LonD,LonM,LonS,EW,City,State
41,5,59,N,80,39,0,W,Youngstown,OH
42,52,48,N,97,23,23,W,Yankton,SD
46,35,59,N,120,30,36,W,Yakima,WA
42,16,12,N,71,48,0,W,Worcester,MA
43,37,48,N,89,46,11,W,WisconsinDells,WI
36,5,59,N,80,15,0,W,Winston-Salem,NC
49,52,48,N,97,9,0,W,Winnipeg,MB
39,11,23,N,78,9,36,W,Winchester,VA
34,14,24,N,77,55,11,W,Wilmington,NC
39,45,0,N,75,33,0,W,Wilmington,DE
48,9,0,N,103,37,12,W,Williston,ND
41,15,0,N,77,0,0,W,Williamsport,PA
37,40,48,N,82,16,47,W,Williamson,WV
33,54,0,N,98,29,23,W,WichitaFalls,TX
37,41,23,N,97,20,23,W,Wichita,KS
40,4,11,N,80,43,12,W,Wheeling,WV
26,43,11,N,80,3,0,W,WestPalmBeach,FL
47,25,11,N,120,19,11,W,Wenatchee,WA
41,25,11,N,122,23,23,W,Weed,CA
The first issue is here:
|> Stream.filter(fn
  ["LatD" | _] -> false
end)
All the lines must pass through this function, but only the first line matches the single clause given; every other line raises the FunctionClauseError above. This would fix the issue:
|> Stream.filter(fn
  ["LatD" | _] -> false
  _ -> true
end)
or
|> Stream.reject(&match?(["LatD" | _], &1))
The Enum.find(fn State -> String part that follows looks unclear and would surely be the next issue. I failed to understand what you were trying to achieve here.
The general advice would be: don't reinvent the wheel; use NimbleCSV, written by José Valim, to parse CSVs, because there are a lot of corner cases (like commas inside quotes in any field, etc.) handled properly in the aforementioned library.
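A minimal sketch of what that could look like for this file (assuming the layout shown above):
# NimbleCSV handles quoting/escaping corner cases;
# parse_stream/1 also skips the header row by default.
alias NimbleCSV.RFC4180, as: CSV

"../name.csv"
|> File.stream!()
|> CSV.parse_stream()
|> Enum.take(1)
# => [["41", "5", "59", "N", "80", "39", "0", "W", "Youngstown", "OH"]]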
Aleksei Matiushkin gave you the right answer, but you also have this function:
fn
  State ->
    String
  [LatD, LatM, LatS, NS, LonD, LonM, LonS, EW, City, State] ->
    IO.puts("find -> #{State}")
    true
end
It accepts two possible values: either State, which is an atom, or a list of 10 specific atoms.
What you want to do is use variables, and variables in Elixir start with a lowercase letter or an underscore if it has to be ignored.
fn
  state ->
    String
  [latd, latm, lats, ns, lond, lonm, lons, ew, city, state] ->
    IO.puts("find -> #{state}")
    true
end
But in this case, the first clause of the function will always match anything because it acts like a catch-all clause.
What you probably want is:
fn
  [_latd, _latm, _lats, _ns, _lond, _lonm, _lons, _ew, _city, state] ->
    IO.puts("find -> #{state}")
    # here decide if you want to return true or false,
    # for instance `state == "NC"` (the fields are strings after splitting)
    true
end
I have a set of Strings and I'm using them as keys to get JValues from a Map:
val keys: Set[String] = Set("Metric_1", "Metric_2", "Metric_3", "Metric_4")
val logData: Map[String, JValue] = Map("Metric_1" -> JInt(0), "Metric_2" -> JInt(1), "Metric_3" -> null)
In the method below I'm parsing values for each metric: first getting all the values, then filtering out the nulls, and then transforming the remaining values to Booleans.
val metricsMap: Map[String, Boolean] = keys
.map(k => k -> logData(k).extractOpt[Int]).toMap
.filter(_._2.isDefined)
.collect {
case (str, Some(0)) => str -> false
case (str, Some(1)) => str -> true
}
I've run into a problem when one of the keys is not found in the logData Map: I'm getting a java.util.NoSuchElementException: key not found: Metric_4.
Here I'm using extractOpt to extract a value from the JSON and don't need default values, so extractOrElse probably won't help: I only need to get values for existing keys and skip the non-existing ones.
What could be a correct approach to handle a case when a key is not present in the logData Map?
UPD: I've achieved the desired result with .map(k => k -> apiData.getOrElse(k, null).extractOpt[Int]).toMap. However, I'm still not sure it's the best approach.
That the values are JSON is a red herring: it's the missing key that's throwing the exception. There's a method called get which retrieves a value from a map wrapped in an Option. If we use Ints as the values we have:
val logData = Map("Metric_1" -> 1, "Metric_2" -> 0, "Metric_3" -> null)
keys.flatMap(k => logData.get(k).map(k -> _)).toMap
> Map(Metric_1 -> 1, Metric_2 -> 0, Metric_3 -> null)
Using flatMap instead of map unwraps the Some results and drops the Nones. Now, if we go back to your actual example, we have another layer, and that flatMap will also eliminate the Metric_3 -> null item:
keys.flatMap(k => logData.get(k).flatMap(_.extractOpt[Int]).map(k -> _)).toMap
You can also rewrite this using a for comprehension:
(for {
k <- keys
jv <- logData.get(k)
v <- jv.extractOpt[Int]
} yield k -> v).toMap
I used Success and Failure in place of the JSON values to avoid having to set up a shell with json4s to make an example:
val logData = Map("Metric_1" -> Success(1), "Metric_2" -> Success(0), "Metric_3" -> Failure(new RuntimeException()))
scala> for {
| k <- keys
| v <- logData.get(k)
| r <- v.toOption
| } yield k -> r
res2: scala.collection.immutable.Set[(String, Int)] = Set((Metric_1,1), (Metric_2,0))
I am a total beginner with Erlang and functional programming in general. For fun, to get me started, I am converting an existing Ruby Sinatra REST(ish) API that queries PostgreSQL and returns JSON.
On the Erlang side I am using Cowboy, Epgsql and Jiffy as the JSON library.
Epgsql returns results in the following format:
{ok, [{column,<<"column_name">>,int4,4,-1,0}], [{<<"value">>}]}
But Jiffy expects the following format when encoding to JSON:
{[{<<"column_name">>,<<"value">>}]}
The following code works to convert epgsql output into suitable input for jiffy:
Assuming Data is the Epgsql output and Key is the name of the JSON object being created:
{_, C, R} = Data,
Columns = [X || {_, X, _, _, _, _} <- C],
Rows = tuple_to_list(hd(R)),
Result = {[{atom_to_binary(Key, utf8), {lists:zip(Columns, Rows)}}]}.
However, I am wondering whether this is efficient Erlang.
I've looked into the documentation for Epgsql and Jiffy and can't see any more obvious ways to perform the conversion.
Thank you.
Yes, you need to parse it.
For example, a function to parse the result:
parse_result({error, #error{ code = <<"23505">>, extra = Extra }}) ->
{match, [Column]} =
re:run(proplists:get_value(detail, Extra),
"Key \\(([^\\)]+)\\)", [{capture, all_but_first, binary}]),
throw({error, {non_unique, Column}});
parse_result({error, #error{ message = Msg }}) ->
throw({error, Msg});
parse_result({ok, Cols, Rows}) ->
to_map(Cols, Rows);
parse_result({ok, Counts, Cols, Rows}) ->
{ok, Counts, to_map(Cols, Rows)};
parse_result(Result) ->
Result.
And a function to convert the result to a map:
to_map(Cols, Rows) ->
[ maps:from_list(lists:zipwith(fun(#column{name = N}, V) -> {N, V} end,
Cols, tuple_to_list(Row))) || Row <- Rows ].
Then encode it to JSON. You can adapt my code to make the output a proplist instead.
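For completeness, a sketch of that final encoding step, using the jiffy EJSON form shown earlier (the exact shape of your data may differ):
%% Encode one column/value pair as JSON with jiffy.
%% jiffy:encode/1 returns the encoded JSON (a binary or iodata).
Json = jiffy:encode({[{<<"column_name">>, <<"value">>}]}).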