Asserting timestamp with microseconds equals mysql database value in Ecto/Phoenix - mysql

I've been playing around with Elixir Phoenix and have a simple integration test that checks that a json response of a model is the same as the json-rendered representation of that model.
The test looks like this:
test "#show renders a single link" do
conn = get_authenticated_conn()
link = insert(:link)
conn = get conn, link_path(conn, :show, link)
assert json_response(conn, 200) == render_json(LinkView, "show.json", link: link)
end
This used to work fine, but following a recent mix deps.update the test has broken due to a precision problem with the model's timestamps. Here is the output from the test:
Assertion with == failed
code: json_response(conn, 200) == render_json(LinkView, "show.json", link: link)
lhs: %{"id" => 10, "title" => "A handy site to find stuff on the internet", "url" => "http://google.com", "inserted_at" => "2017-01-09T19:27:57.000000", "updated_at" => "2017-01-09T19:27:57.000000"}
rhs: %{"id" => 10, "title" => "A handy site to find stuff on the internet", "url" => "http://google.com", "inserted_at" => "2017-01-09T19:27:56.606085", "updated_at" => "2017-01-09T19:27:56.606093"}
We can see that the timestamps in the controller's response do not match those in the JSON rendering of the model. This is because the MySQL database (5.7) rounds the timestamps to whole seconds, whilst the in-memory Ecto representation keeps microsecond precision. My migration just uses Ecto's timestamps function.
What's the best way to get these tests to pass? I don't particularly care about microsecond precision for my timestamps, but clearly Ecto has made it more accurate in a recent update. I have a feeling it might be a problem with Mariaex, but I'm not sure how to solve it.

As mentioned in the Ecto v2.1 CHANGELOG, to get the old behavior of not keeping usec in automatic timestamps (as was the case before v2.1), you can add the following module attribute just before the call to schema in the relevant model(s):
@timestamps_opts [usec: false]

Related

Pyspark explain difference with and without custom schema for reading csv

I am reading a CSV file that has a header, but I am supplying my own schema for the read. I wanted to understand whether explain shows any difference when I provide a schema versus when I don't. My curiosity was raised by this statement about read.csv in the docs:
Loads a CSV file and returns the result as a DataFrame.
This function will go through the input once to determine the input schema if inferSchema is enabled. To avoid going through the entire data once, disable inferSchema option or specify the schema explicitly using schema.
I could see the time difference at my prompt when I provide the schema compared to when inferSchema is used, but I don't see any difference in the explain output. Below are my code and the output with the schema provided:
>> friends_header_df = spark.read.csv(path='resources/fakefriends-header.csv',schema=custom_schems, header='true', sep=',')
>> print(friends_header_df._jdf.queryExecution().toString())
== Parsed Logical Plan ==
Relation[id#8,name#9,age#10,numFriends#11] csv
== Analyzed Logical Plan ==
id: int, name: string, age: int, numFriends: int
Relation[id#8,name#9,age#10,numFriends#11] csv
== Optimized Logical Plan ==
Relation[id#8,name#9,age#10,numFriends#11] csv
== Physical Plan ==
FileScan csv [id#8,name#9,age#10,numFriends#11] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[file:/Users/sgudisa/Desktop/python data analysis workbook/spark-workbook/resour..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int,name:string,age:int,numFriends:int>
And below is the output when reading with the inferSchema option:
>> friends_noschema_df = spark.read.csv(path='resources/fakefriends-header.csv',header='true',inferSchema='true',sep=',')
>> print(friends_noschema_df._jdf.queryExecution().toString())
== Parsed Logical Plan ==
Relation[userID#32,name#33,age#34,friends#35] csv
== Analyzed Logical Plan ==
userID: int, name: string, age: int, friends: int
Relation[userID#32,name#33,age#34,friends#35] csv
== Optimized Logical Plan ==
Relation[userID#32,name#33,age#34,friends#35] csv
== Physical Plan ==
FileScan csv [userID#32,name#33,age#34,friends#35] Batched: false, DataFilters: [], Format: CSV, Location: InMemoryFileIndex[file:/Users/sgudisa/Desktop/python data analysis workbook/spark-workbook/resour..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<userID:int,name:string,age:int,friends:int>
Except for the column expression IDs changing in the Parsed Logical Plan, I don't see anything in the plans that explains Spark reading all the data once.
inferSchema = false is the default option; in that case every column in the DataFrame is read as a string. If you provide a schema, you get the types you declared.
Inferring a schema means Spark kicks off an extra job behind the scenes to do exactly that, and you can in fact see it (for example in the Spark UI). It takes longer, but, as you note, nothing shows up in the explained plan, because the inference happens before the plan for your query is built.

How to parse JSON response in Ruby

The end goal for this is to be part of a chatbot that returns an airport's weather.
Using import.io, I built an endpoint to query the weather service, which provides this response:
{"extractorData"=>
{"url"=>
"https://www.aviationweather.gov/metar/data?ids=kokb&format=decoded&hours=0&taf=off&layout=on&date=0",
"resourceId"=>"66ca907842aabb6b08b8bc12049ad533",
"data"=>
[{"group"=>
[{"Timestamp"=>[{"text"=>"Data at: 2135 UTC 12 Dec 2016"}],
"Airport"=>[{"text"=>"KOKB (Oceanside Muni, CA, US)"}],
"FullText"=>
[{"text"=>
"KOKB 122052Z AUTO 24008KT 10SM CLR 18/13 A3006 RMK AO2 SLP179 T01780133 58021"}],
"Temperature"=>[{"text"=>"17.8°C ( 64°F)"}],
"Dewpoint"=>[{"text"=>"13.3°C ( 56°F) [RH = 75%]"}],
"Pressure"=>
[{"text"=>
"30.06 inches Hg (1018.0 mb) [Sea level pressure: 1017.9 mb]"}],
"Winds"=>
[{"text"=>"from the WSW (240 degrees) at 9 MPH (8 knots; 4.1 m/s)"}],
"Visibility"=>[{"text"=>"10 or more sm (16+ km)"}],
"Ceiling"=>[{"text"=>"at least 12,000 feet AGL"}],
"Clouds"=>[{"text"=>"sky clear below 12,000 feet AGL"}]}]}]},
"pageData"=>
{"resourceId"=>"66ca907842aabb6b08b8bc12049ad533",
"statusCode"=>200,
"timestamp"=>1481578559306},
"url"=>
"https://www.aviationweather.gov/metar/data?ids=kokb&format=decoded&hours=0&taf=off&layout=on&date=0",
"runtimeConfigId"=>"2ddb288f-9e57-4b58-a690-1cd409f9edd3",
"timestamp"=>1481579246454,
"sequenceNumber"=>-1}
I seem to be running into two issues. How do I:
pull each field and write it into its own variable
ignore the "text" modifier in the response.
If you're getting a response object, you might want to do something like
parsed_json = JSON.parse(response.body)
Then you can do things like parsed_json["some_field"] (JSON.parse returns string keys by default; pass symbolize_names: true if you'd rather work with symbols).
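Building on that, here is a rough sketch for the specific structure in the question: walk down to the group entry and collapse the [{"text" => ...}] wrappers in one pass (report, fields, temperature and winds are just illustrative names, and Hash#dig needs Ruby 2.3+):
# The single decoded report lives under extractorData -> data -> group.
report = parsed_json.dig("extractorData", "data", 0, "group", 0)

# Collapse each field's [{"text" => ...}] wrapper into a plain string.
fields = report.each_with_object({}) do |(name, values), memo|
  memo[name] = values.first["text"]
end

temperature = fields["Temperature"]  # => "17.8°C ( 64°F)"
winds       = fields["Winds"]        # => "from the WSW (240 degrees) at 9 MPH (8 knots; 4.1 m/s)"
That covers both issues above: each field ends up addressable by name, and the "text" wrapper disappears.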
The simple answer is:
require 'json'
foo = JSON['{"a":1}']
foo # => {"a"=>1}
JSON is smart enough to look at the parameter and, based on whether it's a string or an Array or Hash, parse it or serialize it. In the above case it parsed it back into a Hash.
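The same shortcut works in the other direction; give JSON[] an Array or Hash and it serializes it:
JSON[{"a" => 1}] # => "{\"a\":1}"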
From that point it takes normal Ruby to dive into the hash you got back and access particular values:
foo = JSON['{"a":1, "b":[{"c":3}]}']
foo # => {"a"=>1, "b"=>[{"c"=>3}]}
foo['b'][0]['c'] # => 3
How to walk through a hash is covered extensively on the internet and here on Stack Overflow, so search around and see what you can find.
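For deeper structures, Hash#dig (Ruby 2.3+) saves the repeated indexing; with the foo from above:
foo.dig('b', 0, 'c') # => 3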

Rails query objects by key value of hash saved to column?

I have 2 objects, Visitors and Events. Visitors have multiple Events. An event stores parameters like this...
#<Event id: 5466, event_type: "Visit", visitor_token: "c26a6098-64bb-4652-9aa0-e41c214f42cb", contact_id: 657, data: {"url"=>"http://widget.powerpress.co/", "title"=>"Home (light) | Widget"}, created_at: "2015-12-17 14:51:53", updated_at: "2015-12-17 14:51:53", website_id: 2>
As you can see, there is a serialized text column called data that stores a hash with more data.
I need to find out if a visitor has visited a certain page, which would be very simple if the url parameter were its own column, or if the hash were an hstore column; however, it wasn't originally set up that way and it's part of the saved hash.
Here are my attempted Rails queries...
visitor.events.where("data -> url = :value", value: 'http://widget.powerpress.co/')
visitor.events.where("data like ?", "{'url' => 'http://widget.powerpress.co/'}")
visitor.events.where("data -> :key LIKE :value", :key => 'url', :value => "%http://widget.powerpress.co/%")
How does one properly query postgres to find objects that have a hash that contains a key with a specific value?
I suspect you're not looking for the right string. It should be "url"=>"http://widget.powerpress.co/", so:
visitor.events.where("data like ?", '%"url"=>"http://widget.powerpress.co/"%')
Check the actual value stored directly in the DB.
If you are storing the hash in a text column, try the following:
visitor.events.select{|ve| eval(ve.data)["url"] == "http://widget.powerpress.co/"}
Hope it helps!
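One caveat: eval executes whatever string happens to be in that column, so it is worth avoiding. Assuming the column was written by Rails' default YAML serializer (an assumption; check the serialize declaration on Event), a safer version of the same in-memory filter might look like this sketch:
require 'yaml'

target = "http://widget.powerpress.co/"
matching = visitor.events.select do |event|
  # Event#data may already be deserialized by `serialize :data`; handle both cases.
  data = event.data.is_a?(String) ? YAML.safe_load(event.data) : event.data
  data.is_a?(Hash) && data["url"] == target
end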
It worked for me.
visitor.events.select { |n| n.data && n.data['url'] == "http://widget.powerpress.co/"}

How can I get ruby's JSON to follow object references like Pry/PP?

I've stared at this so long I'm going in circles...
I'm using the rbvmomi gem, and in Pry, when I display an object, it recurses down through the structure showing me the nested objects, but to_json seems to "dig down" into some objects and just dump the reference for others. Here's an example:
[24] pry(main)> g
=> [GuestNicInfo(
connected: true,
deviceConfigId: 4000,
dynamicProperty: [],
ipAddress: ["10.102.155.146"],
ipConfig: NetIpConfigInfo(
dynamicProperty: [],
ipAddress: [NetIpConfigInfoIpAddress(
dynamicProperty: [],
ipAddress: "10.102.155.146",
prefixLength: 20,
state: "preferred"
)]
),
macAddress: "00:50:56:a0:56:9d",
network: "F5_Real_VM_IPs"
)]
[25] pry(main)> g.to_json
=> "[\"#<RbVmomi::VIM::GuestNicInfo:0x000000085ecc68>\"]"
Pry apparently just uses a souped-up pp, and while "pp g" gives me close to what I want, I'm kinda steering as hard as I can toward json so that I don't need a custom parser to load up and manipulate the results.
The question is - how can I get the json module to dig down like pp does? And if the answer is "you can't" - any other suggestions for achieving the goal? I'm not married to json - if I can get the data serialized and read it back later (without writing something to parse pp output... which may already exist and I should look for it), then it's all win.
My "real" goal here is to slurp up a bunch of info from our vsphere stuff via rbvmomi so that I can do some network/vm analysis on it, which is why I'd like to get it in a nice machine-parsed format. If I'm doing something stupid here and there's an easier way to go about this - lay it on me, I'm not proud. Thank you all for your time and attention.
Update: Based on Arnie's response, I added this monkeypatch to my script:
class RbVmomi::BasicTypes::DataObject
  def to_json(*args)
    h = self.props
    m = h.merge({ JSON.create_id => self.class.name })
    m.to_json(*args)
  end
end
and now my to_json recurses down nicely. I'll see about submitting this (or the def, really) to the project.
.to_json works in a recursive manner; the default behavior is defined as:
Converts this object to a string (calling to_s), converts it to a JSON string, and returns the result. This is a fallback, if no special method to_json was defined for some object.
The json library adds implementations for some common classes (check the left-hand side of its documentation), such as Array, Range, and DateTime.
For an array, to_json first converts each element to its JSON representation, concatenates them together, and then adds the array brackets [ and ].
In your case, you need to define a customized to_json method for GuestNicInfo, NetIpConfigInfo and NetIpConfigInfoIpAddress. I don't know the implementation of these three classes, so I wrote an example to demonstrate how to achieve this:
require 'json'

class MyClass
  attr_accessor :a, :b

  def initialize(a, b)
    @a = a
    @b = b
  end
end

data = [MyClass.new(1, "foobar")]
puts data.to_json
#=> ["#<MyClass:0x007fb6626c7260>"]

class MyClass
  def to_json(*args)
    {
      JSON.create_id => self.class.name,
      :a => a,
      :b => b
    }.to_json(*args)
  end
end

puts data.to_json
#=> [{"json_class":"MyClass","a":1,"b":"foobar"}]

GetMapping not working for Nest client in Elasticsearch

Perhaps some of the documentation at http://nest.azurewebsites.net/ is old, because I'm running into at least a few issues...
I've got a JSON object 'search'. I am getting null returned from the GetMapping function. Well, it returns a Nest.RootObjectMapping object, but all fields within are null. I can get the mapping fine using Sense or regular curl.
var mapping = elasticClient.GetMapping<MyJsonPOCO>();
Any ideas?
Also, just as an example of other things going wrong, this search works, but adding 'fields' to it does not (I got the fields declaration from the documentation):
var result = elasticClient.Search<MyJsonPOCO>(s => s
    .Query(q => q
        .QueryString(qs => qs
            .OnField(e => e.Title)
            .Query("my search term"))));
If I use this query with the fields added (to return just 'title'), I get a JSON parser issue.
var result = elasticClient.Search<MyJsonPOCO>(s => s
    .Fields(f => f.Title)
    .Query(q => q
        .QueryString(qs => qs
            .OnField(e => e.Title)
            .Query("my search term"))));
Here's the error for that one:
An exception of type 'Newtonsoft.Json.JsonReaderException' occurred in Newtonsoft.Json.dll but was not handled in user code
Additional information: Error reading string. Unexpected token: StartArray. Path 'hits.hits[0].fields.title', line 1, position 227.
Elasticsearch 1.0 changed the way fields are returned in the search response.
You need the NEST 1.0.0-beta1 release to work with Elasticsearch 1.0:
http://www.elasticsearch.org/blog/introducing-elasticsearch-net-nest-1-0-0-beta1/
See also this GitHub issue for more background on why, and on how to work with fields from 1.0 onward:
https://github.com/elasticsearch/elasticsearch-net/issues/590