Elixir: how to display a structured data element?

I'm trying to parse a CSV file. So far I have this code:
alias NimbleCSV.RFC4180, as: CSV
defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")
    stream = File.stream!("name.csv")
    original_line = CSV.parse_stream(stream)
    filter_line = Stream.filter(original_line, fn
      ["JeremyGuthrie" | _] -> true
      _ -> false
    end)
    map = Stream.map(filter_line,
      fn [name, team, position, height, weight, age] ->
        %{name: name, team: team, position: position,
          height: String.to_integer(height),
          weight: String.to_integer(weight),
          age: Float.parse(age) |> elem(0)
        }
      end)
  end
end
As I understand it, I build a stream that handles each line of my name.csv file. The NimbleCSV library parses each line and skips the header line. Then I filter the lines to keep only the one for JeremyGuthrie. Finally, I store the fields of that line in a map. But how do I now print just the name from the filtered line (here, JeremyGuthrie)?
I also have another question: I'm having trouble filtering my stream by a number such as age, height, or weight.
Here I applied Aleksei's advice to another piece of code:
NimbleCSV.define(MyParser, separator: ";", escape: "\"")
defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")
    "ActeursEOF.csv"
    |> File.stream!()
    |> MyParser.parse_stream()
    |> Stream.filter(fn
      ["RAZEL BEC" | _] -> true
      ["" | _] -> false
      _ -> false
    end)
    |> Stream.map(fn [name, description, enr_competences] ->
      %{name: name, description: description, enr_competences: enr_competences}
    end)
    |> Enum.to_list()
    |> IO.inspect()
  end
end
My output:
Compiling 1 file (.ex)
Let's parse CSV file!
[%{description: "Génie Civil", enr_competences: "Oui", name: "RAZEL BEC"}]
But now, to close this subject, I would like to access and store just the description, for instance, and finally display it. I don't see how to do that.

Intermediate variables are redundant here. Elixir has Kernel.|>/2, the pipe operator, which passes the output of one function as the first argument of the next.
"name.csv"
|> File.stream!()
|> CSV.parse_stream()
|> Stream.filter(fn
  ["JeremyGuthrie" | _] -> true
  _ -> false
end)
|> Stream.map(fn
  [name, team, position, height, weight, age] ->
    %{name: name, team: team, position: position,
      height: String.to_integer(height),
      weight: String.to_integer(weight),
      age: Float.parse(age) |> elem(0)
    }
end)
|> Enum.to_list() # THIS
Note the last line in the chain. A stream must be terminated to retrieve the result. Until that happens, it is constructed lazily and not evaluated at all. That is what makes it possible to, e.g., produce and operate on infinite streams.
Any eager function from the Enum module would do: Enum.take/2 or, as shown above, Enum.to_list/1.
For future reference: once you feel fully familiar with Elixir, you might use Flow instead of Stream to parallelize the mapping. For now (and for relatively small files), Stream is good enough.
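As a rough sketch of the remaining two questions (filtering on a number and pulling out a single field), the following uses made-up in-memory rows instead of a real CSV file. The module name StreamSketch, the three-column layout, and the values are assumptions for illustration only. The idea is to convert the strings to numbers in Stream.map/2 first, filter on the numeric field afterwards, and let Enum.map/2 both terminate the stream and keep just the field you want.

```elixir
defmodule StreamSketch do
  # Made-up rows standing in for the output of a CSV parser (assumption).
  @rows [
    ["JeremyGuthrie", "73", "28.05"],
    ["PlayerB", "74", "21.58"],
    ["PlayerC", "69", "29.39"]
  ]

  # Lazily turn each row into a map with numeric fields.
  def players do
    Stream.map(@rows, fn [name, height, age] ->
      {age, _rest} = Float.parse(age)
      %{name: name, height: String.to_integer(height), age: age}
    end)
  end

  # Filter on a number after conversion, then keep a single field.
  # Enum.map/2 is eager, so it also terminates the stream.
  def names_older_than(min_age) do
    players()
    |> Stream.filter(fn %{age: age} -> age > min_age end)
    |> Enum.map(& &1.name)
  end
end

IO.inspect(StreamSketch.names_older_than(25.0))
# prints ["JeremyGuthrie", "PlayerC"]
```

The same Enum.map(& &1.description) call would extract the description from the list of maps printed in the question.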

Related

Seeding the Database from a CSV file in Phoenix/Elixir

When I try to run: mix run priv/repo/seeds.exs, I have a problem: (FunctionClauseError) no function clause matching in anonymous fn/1 in :elixir_compiler_1.__FILE__/1 The following arguments were given to anonymous fn/1 in :elixir_compiler_1.__FILE__/1:
This is my seeds.exs file:
alias FlightsList.Repo
alias FlightsList.Management.Flights
File.stream!("C:/Users/vukap/phx_projects/flights_list/priv/repo/flights.csv")
|> Stream.drop(1)
|> CSV.decode(headers: [:Id, :Origin, :Destination, :DepartureDate, :DepartureTime, :ArrivalDate, :ArrivalTime, :Number])
|> Enum.each(fn {:ok, map} ->
  Flights.changeset(
    %Flights{},
    %{
      Id: String.to_integer(map[:Id]),
      Origin: map[:Origin],
      Destination: map[:Destination],
      DepartureDate: String.to_integer(map[:DepartureDate]),
      DepartureTime: String.to_integer(map[:DepartureTime]),
      ArrivalDate: String.to_integer(map[:ArrivalDate]),
      ArrivalTime: String.to_integer(map[:ArrivalTime]),
      Number: map[:Number]
    })
  |> Repo.insert!()
end)
How can I fix it?
It's impossible to answer precisely until you specify which CSV library you use, or at least what the error says after "The following arguments were given to anonymous fn/1". Still, the issue is almost certainly that CSV.decode/2 returns something other than the {:ok, map} your function clause expects.
To fix this and similar issues, add a catch-all clause to the processing and examine the outcome.
...
|> Enum.each(fn
{:ok, map} -> Flights.changeset(...)
other -> IO.inspect(other, label: "Unexpected")
end)
Check what the above would print out and fix it accordingly.
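A minimal, self-contained sketch of that catch-all idea, without any CSV library or database: the module SeedSketch, the tuples in the sample list, and the error text are all made up for illustration. The first clause handles the expected shape; the second reports anything else instead of crashing with a FunctionClauseError.

```elixir
defmodule SeedSketch do
  # Expected shape: a successfully decoded row.
  def handle({:ok, row}), do: {:inserted, row}

  # Catch-all: print whatever else arrived so it can be examined.
  def handle(other) do
    IO.inspect(other, label: "Unexpected")
    :skipped
  end
end

# Hypothetical decoder output: one good row and one failure.
decoded = [{:ok, %{Id: "1"}}, {:error, "made-up decode error"}]

IO.inspect(Enum.map(decoded, &SeedSketch.handle/1))
```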
I guess you are missing the separator option in the CSV.decode call. Here is an example of how I do it; you can call stream_csv from the seed file.
def store_it(row) do
  {:ok, result} = row
  %Segments{
    id: result.id,
    name: result.name
  } |> Repo.insert!
end
def stream_csv do
  Path.expand("~/Project/segmments.csv")
  |> File.stream!
  |> CSV.decode(separator: ?;, headers: [:id, :name])
  |> Enum.each(&store_it/1)
end

Beginner in Elixir : parse CSV file

Hello, I'm a beginner in Elixir and I want to parse a CSV file and store it in an Elixir data structure.
But it displays this:
** (FunctionClauseError) no function clause matching in anonymous fn/1 in Siren.parseCSV/0
The following arguments were given to anonymous fn/1 in Siren.parseCSV/0:
# 1
["41", "5", "59", "N", "80", "39", "0", "W", "Youngstown", "OH"]
anonymous fn/1 in Siren.parseCSV/0
(elixir 1.10.3) lib/stream.ex:482: anonymous fn/4 in Stream.filter/2
(elixir 1.10.3) lib/stream.ex:1449: Stream.do_element_resource/6
(elixir 1.10.3) lib/stream.ex:1609: Enumerable.Stream.do_each/4
(elixir 1.10.3) lib/enum.ex:959: Enum.find/3
(mix 1.10.3) lib/mix/task.ex:330: Mix.Task.run_task/3
(mix 1.10.3) lib/mix/cli.ex:82: Mix.CLI.run_task/2
Here is my code:
defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file...")
    File.stream!("../name.csv")
    |> Stream.map(&String.trim(&1))
    |> Stream.map(&String.split(&1, ","))
    |> Stream.filter(fn
      ["LatD" | _] -> false
    end)
    |> Enum.find(fn State -> String
      [LatD, LatM, LatS, NS, LonD, LonM, LonS, EW, City, State] ->
        IO.puts("find -> #{State}")
        true
    end)
  end
end
And the CSV file:
LatD,LatM,LatS,NS,LonD,LonM,LonS,EW,City,State
41,5,59,N,80,39,0,W,Youngstown,OH
42,52,48,N,97,23,23,W,Yankton,SD
46,35,59,N,120,30,36,W,Yakima,WA
42,16,12,N,71,48,0,W,Worcester,MA
43,37,48,N,89,46,11,W,WisconsinDells,WI
36,5,59,N,80,15,0,W,Winston-Salem,NC
49,52,48,N,97,9,0,W,Winnipeg,MB
39,11,23,N,78,9,36,W,Winchester,VA
34,14,24,N,77,55,11,W,Wilmington,NC
39,45,0,N,75,33,0,W,Wilmington,DE
48,9,0,N,103,37,12,W,Williston,ND
41,15,0,N,77,0,0,W,Williamsport,PA
37,40,48,N,82,16,47,W,Williamson,WV
33,54,0,N,98,29,23,W,WichitaFalls,TX
37,41,23,N,97,20,23,W,Wichita,KS
40,4,11,N,80,43,12,W,Wheeling,WV
26,43,11,N,80,3,0,W,WestPalmBeach,FL
47,25,11,N,120,19,11,W,Wenatchee,WA
41,25,11,N,122,23,23,W,Weed,CA
The first issue is here:
|> Stream.filter(fn
  ["LatD" | _] -> false
end)
Every line has to pass through this function, but only the first one matches the given clause; any other line raises the FunctionClauseError. This would fix the issue:
|> Stream.filter(fn
  ["LatD" | _] -> false
  _ -> true
end)
or
|> Stream.reject(&match?(["LatD" | _], &1))
The Enum.find(fn State -> String part that follows looks unclear and would surely be the next issue. I failed to understand what you tried to achieve there.
The general advice would be: don't reinvent the wheel; use NimbleCSV, written by José Valim, to parse CSVs, because there are a lot of corner cases (such as commas inside quotes in a field) that the library handles properly.
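To see the difference between the single-clause filter and the Stream.reject/2 fix, here is a small sketch that runs on in-memory lines. The module name HeaderSketch is made up, and the rows are a shortened four-column version of the file from the question, so treat it as an illustration rather than a drop-in replacement.

```elixir
defmodule HeaderSketch do
  # Shortened, four-column version of the file from the question (assumption).
  @lines ["LatD,LatM,City,State", "41,5,Youngstown,OH", "42,52,Yankton,SD"]

  def rows, do: Enum.map(@lines, &String.split(&1, ","))

  # A single-clause function raises as soon as a non-header row arrives.
  def broken do
    try do
      Enum.filter(rows(), fn ["LatD" | _] -> false end)
    rescue
      FunctionClauseError -> :no_clause_matched
    end
  end

  # Stream.reject/2 with match?/2 drops the header and keeps every other row.
  def fixed do
    rows()
    |> Stream.reject(&match?(["LatD" | _], &1))
    |> Enum.to_list()
  end
end

IO.inspect(HeaderSketch.broken())
IO.inspect(HeaderSketch.fixed())
```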
Aleksei Matiushkin gave you the right answer, but you also have this function:
fn
  State ->
    String
  [LatD, LatM, LatS, NS, LonD, LonM, LonS, EW, City, State] ->
    IO.puts("find -> #{State}")
    true
end
It accepts two possible values: either State, which is an atom, or a list of 10 specific atoms.
What you want is to use variables, and variables in Elixir start with a lowercase letter, or with an underscore if the value is to be ignored.
fn
  state ->
    String
  [latd, latm, lats, ns, lond, lonm, lons, ew, city, state] ->
    IO.puts("find -> #{state}")
    true
end
But in this case, the first clause of the function always matches, because a bare variable acts as a catch-all clause, so the second clause is never reached.
What you probably want is:
fn
  [_latd, _latm, _lats, _ns, _lond, _lonm, _lons, _ew, _city, state] ->
    IO.puts("find -> #{state}")
    # here decide if you want to return true or false,
    # for instance `state == "NC"`
    true
end
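The atom-versus-variable distinction can be checked with a tiny sketch. The module name AtomSketch and the two-element list are made up for illustration; the point is only that a capitalized name in a pattern is a literal atom, while a lowercase name binds whatever value is there.

```elixir
defmodule AtomSketch do
  # `State` is an alias, i.e. the atom :"Elixir.State", so this pattern
  # only matches that exact atom.
  def literal?(value), do: match?(State, value)

  # Lowercase names are variables: `state` is bound to the second element.
  def state_of([_city, state]), do: state
end

IO.inspect(is_atom(State))
# prints true
IO.inspect(AtomSketch.literal?("OH"))
# prints false
IO.inspect(AtomSketch.state_of(["Youngstown", "OH"]))
# prints "OH"
```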

In Elixir, How can I extract a lambda to a named function when the lambda is in a closure?

I have the following closure:
def get!(Item, id) do
  Enum.find(
    @items,
    fn(item) -> item.id == id end
  )
end
As I believe this looks ugly and difficult to read, I'd like to give this a name, like:
def get!(Item, id) do
  defp has_target_id?(item), do: item.id = id
  Enum.find(@items, has_target_id?/1)
end
Unfortunately, this results in:
== Compilation error in file lib/auction/fake_repo.ex ==
** (ArgumentError) cannot invoke defp/2 inside function/macro
(elixir) lib/kernel.ex:5238: Kernel.assert_no_function_scope/3
(elixir) lib/kernel.ex:4155: Kernel.define/4
(elixir) expanding macro: Kernel.defp/2
lib/auction/fake_repo.ex:28: Auction.FakeRepo.get!/2
Assuming it is possible, what is the correct way to do this?
The code you posted has an enormous amount of syntax errors/glitches. I would suggest you start by getting accustomed to the syntax, rather than trying to improve Elixir by inventing things nobody uses.
Here is the correct version that does what you wanted. The task can be accomplished with an anonymous function bound to a name, although I hardly see a reason to make perfectly good idiomatic Elixir look ugly.
defmodule Foo do
  @items [%{id: 1}, %{id: 2}, %{id: 3}]
  def get!(id) do
    has_target_id? = fn item -> item.id == id end
    Enum.find(@items, has_target_id?)
  end
end
Foo.get! 1
#⇒ %{id: 1}
Foo.get! 4
#⇒ nil
You can do this:
def get!(Item, id) do
  Enum.find(
    @items,
    &compare_ids(&1, id)
  )
end
defp compare_ids(%Item{}=item, id) do
  item.id == id
end
But that's equivalent to:
Enum.find(
  @items,
  fn item -> compare_ids(item, id) end
)
and may not pass your "looks ugly and difficult to read" test.
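Another option, offered here only as a sketch and not taken from the answers above, is a named private function that returns the predicate, so the closure over id gets a readable name at the call site. The module name FakeRepoSketch and the sample items are made up.

```elixir
defmodule FakeRepoSketch do
  # Made-up items standing in for the real @items attribute (assumption).
  @items [%{id: 1, title: "first"}, %{id: 2, title: "second"}]

  def get!(id), do: Enum.find(@items, has_id(id))

  # Returns a one-argument predicate that closes over `id`.
  defp has_id(id), do: fn item -> item.id == id end
end

IO.inspect(FakeRepoSketch.get!(1))
IO.inspect(FakeRepoSketch.get!(4))
```

The first call returns the item with id 1 and the second returns nil, the same behaviour as Enum.find/2 with an inline anonymous function.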
I was somehow under the impression Elixir supports nested functions?
Easy enough to test:
defmodule A do
  def go do
    def greet do
      IO.puts "hello"
    end
    greet()
  end
end
Same error:
$ iex a.ex
Erlang/OTP 20 [erts-9.2] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]
** (ArgumentError) cannot invoke def/2 inside function/macro
(elixir) lib/kernel.ex:5150: Kernel.assert_no_function_scope/3
(elixir) lib/kernel.ex:3906: Kernel.define/4
(elixir) expanding macro: Kernel.def/2
a.ex:3: A.go/0
Wouldn't:
defp compare_ids(item, id), do: item.id == id
be enough? Is there any advantage to including %Item{} or to making separate functions for returning the true and false conditions?
What you gain by specifying the first parameter as:
func(%Item{} = item, target_id)
is that only an Item struct will match the first parameter. Here is an example:
defmodule Item do
  defstruct [:id, :name, :description]
end
defmodule Dog do
  defstruct [:id, :name, :owner]
end
defmodule A do
  def go(%Item{} = item), do: IO.inspect(item.id, label: "id: ")
end
In iex:
iex(1)> item = %Item{id: 1, name: "book", description: "old"}
%Item{description: "old", id: 1, name: "book"}
iex(2)> dog = %Dog{id: 1, name: "fido", owner: "joe"}
%Dog{id: 1, name: "fido", owner: "joe"}
iex(3)> A.go item
id: : 1
1
iex(4)> A.go dog
** (FunctionClauseError) no function clause matching in A.go/1
The following arguments were given to A.go/1:
# 1
%Dog{id: 1, name: "fido", owner: "joe"}
a.ex:10: A.go/1
iex(4)>
You get a function clause error if you call the function with a non-Item, and the earlier an error occurs, the better, because it makes debugging easier.
Of course, by preventing the function from accepting other structs, you make the function less general--but because it's a private function, you can't call it from outside the module anyway. On the other hand, if you wanted to call the function on both Dog and Item structs, then you could simply specify the first parameter as:
      |
      V
func(%{}=thing, target_id)
then both an Item and a Dog would match--but not non-maps.
What you gain by specifying the first parameter as:
           |
           V
func(%Item{id: id}, target_id)
is that you let Erlang's pattern matching engine extract the data you need, rather than calling item.id as you would need to do with this definition:
func(%Item{}=item, target_id)
In Erlang, pattern matching in a parameter list is the most efficient, convenient, and stylish way to write functions. You use pattern matching to extract the data that you want to use in the function body.
Going even further, if you write the function definition like this:
                same variable name
                   |           |
                   V           V
func(%Item{id: target_id}, target_id)
then Erlang's pattern matching engine not only extracts the value of the id field from the Item struct, but also checks that it is equal to the value of the target_id variable in the second argument.
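A small sketch of that repeated-variable match follows. The module name MatchSketch, the function same_id?/2, and the two-field Item struct are made up for illustration; the second clause is added so that a non-matching id returns false instead of raising.

```elixir
defmodule Item do
  defstruct [:id, :name]
end

defmodule MatchSketch do
  # Matches only when the struct's id equals the second argument.
  def same_id?(%Item{id: target_id}, target_id), do: true
  # Any other Item falls through to this clause.
  def same_id?(%Item{}, _target_id), do: false

  def demo do
    book = %Item{id: 1, name: "book"}
    {same_id?(book, 1), same_id?(book, 2)}
  end
end

IO.inspect(MatchSketch.demo())
# prints {true, false}
```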
Defining multiple function clauses is a common idiom in Erlang, and it is considered good style because it takes advantage of pattern matching rather than logic inside the function body. Here's an Erlang example:
get_evens(List) ->
    get_evens(List, []).
get_evens([Head|Tail], Results) when Head rem 2 == 0 ->
    get_evens(Tail, [Head|Results]);
get_evens([Head|Tail], Results) when Head rem 2 =/= 0 ->
    get_evens(Tail, Results);
get_evens([], Results) ->
    lists:reverse(Results).
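For comparison, here is a rough Elixir translation of the Erlang example. The module name Evens is made up, and the second recursive clause drops the explicit odd guard because it is only reached when the even guard has failed.

```elixir
defmodule Evens do
  # Public entry point: start with an empty accumulator.
  def get_evens(list), do: get_evens(list, [])

  # Even head: keep it.
  defp get_evens([head | tail], results) when rem(head, 2) == 0,
    do: get_evens(tail, [head | results])

  # Odd head: skip it.
  defp get_evens([_head | tail], results), do: get_evens(tail, results)

  # End of list: the accumulator was built in reverse order.
  defp get_evens([], results), do: Enum.reverse(results)
end

IO.inspect(Evens.get_evens([1, 2, 3, 4]))
# prints [2, 4]
```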

What is the simplest way to do upsert with Ecto (MySQL)

Upserts are common in my app and I want the cleanest, simplest way to implement them.
Should I use fragments to implement a native SQL upsert?
Is there an idiomatic Ecto way to do an upsert?
You can use Ecto.Repo.insert_or_update/2. Note that for this to work, you have to load existing models from the database first.
model = %Post{id: 'existing_id', ...}
MyRepo.insert_or_update changeset
# => {:error, "id already exists"}
Example:
result =
  case MyRepo.get(Post, id) do
    nil -> %Post{id: id} # Post not found, we build one
    post -> post # Post exists, using it
  end
  |> Post.changeset(changes)
  |> MyRepo.insert_or_update
case result do
  {:ok, model} -> # Inserted or updated with success
  {:error, changeset} -> # Something went wrong
end
In my case, insert_or_update raised an error due to the unique index constraint 🤔
What did work for me was the Postgres v9.5 upsert through the on_conflict option
(assuming the unique column is called user_id):
changeset
|> MyRepo.insert(
  on_conflict: :replace_all,
  conflict_target: :user_id
)
If you're looking to upsert by something other than id, you can swap in get_by for get like this:
model = %User{email: "existing_or_new_email@heisenberg.net", name: "Cat", ...}
model |> User.upsert_by(:email)
# => {:ok, %User{...}} or {:error, %Ecto.Changeset{...}}
defmodule App.User do
  alias App.{Repo, User}
  def upsert_by(%User{} = record_struct, selector) do
    # Repo.get_by/2 expects the schema plus a keyword list (or map) of clauses
    case Repo.get_by(User, [{selector, Map.get(record_struct, selector)}]) do
      nil -> %User{} # build new user struct
      user -> user # pass through existing user struct
    end
    |> User.changeset(record_struct |> Map.from_struct)
    |> Repo.insert_or_update
  end
end
On the off chance you're looking for a flexible approach that works across models and for multiple selectors (e.g., country + passport number), check out my hex package EctoConditionals!

Parsing file with multiple JSON entries in Scala

I have a JSON file that I am trying to parse using Scala. I have figured out how to use the Scala JSON parsing library to parse one entry in this format:
{"name":"John","number":"005","fav_colour":"blue"}
this is the code that works:
val result = JSON.parseFull("""{"name":"John","number":"005","fav_colour":"blue"}""")
result match {
case Some(e) => println(e)
case None => println("Failed.")
}
This prints Map(name -> John, number -> 005, fav_colour -> blue)
The code is based on this: https://gist.github.com/takezoe/1540223
However, I am working with a file like this:
""" {"name":"John","number":"005","fav_colour":"blue"}
{"name":"Mary","number":"010","fav_colour":"yellow"}
{"name":"Anna","number":"007","fav_colour":"pink"}
{"name":"Dave","number":"003","fav_colour":"purple"}
"""
Note: I also tried separating the entries with commas, and it still did not work.
I am wondering whether I have to write a function to separate each {bracketed entry}, or whether there is some functionality of the JSON library that I am missing. So far, when I pass in my file, it returns None instead of Some(valid information).
Thanks!
You don't have a valid JSON file. This would be valid:
[
{"name":"John","number":"005","fav_colour":"blue"},
{"name":"Mary","number":"010","fav_colour":"yellow"},
{"name":"Anna","number":"007","fav_colour":"pink"},
{"name":"Dave","number":"003","fav_colour":"purple"}
]
Result:
Some(List(Map(name -> John, number -> 005, fav_colour -> blue), Map(name -> Mary, number -> 010, fav_colour -> yellow), Map(name -> Anna, number -> 007, fav_colour -> pink), Map(name -> Dave, number -> 003, fav_colour -> purple)))
http://www.scalakata.com/522bdbfeebb25c7f5d823c7d
The format you use is convenient for gathering information over time, e.g. keeping logs.
You can parse it by reusing the parser combinators!
For example:
import scala.util.parsing.json.JSON
val parseResult = JSON.rep1(JSON.root)(new JSON.lexical.Scanner("{\"a\": 1} {\"b\": 2}"))
parseResult match {case JSON.Success (result, _) => result; case _ => Nil}
returns
List({"a" : 1.0}, {"b" : 2.0})