Read data from MySQL via Vec<Row> - mysql

What's the best (simple, safe & efficient) way to read data from MySQL via Vec<Row>?
Here is an example function that uses a vector of (specific) tuples:
fn read_recs(tx: &mut Transaction) -> Result<HashMap<i64, String>> {
let q = format!("SELECT id, name FROM some_table");
let rows: Vec<(i64, String)> = tx.query(q)?;
let mut res = HashMap::new();
for r in rows {
res.insert(r.0, r.1);
}
Ok(res)
}
For simple queries I've found that approach good enough (as well as query_map()), but for queries returning more columns, I'd prefer to directly read into Vec<Row> and then somehow read the values in a simple (not too verbose) fashion like this:
let rows: Vec<Row> = tx.query(q)?;
for r in rows {
let id: i64 = extract(r, 0)?;
let name: String = extract(r, 1)?;
}
or maybe even this:
let rows: Vec<Row> = tx.query(q)?;
for r in rows {
let id: i64 = extract(r, "id")?;
let name: String = extract(r, "name")?;
}
Question is, what would that extract look like so that it either reads the data successfully or returns an error? I do not want it to panic in case, for example, the query returns NULL in a situation where I only expect not-null values. And obviously I would prefer not to have to deal with Option<...> values every time I read data from a field, as that would make my code way too verbose.
BTW I use Anyhow if that matters.

I don't know if I quite understand the question, but wouldn't this work?
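fn extract<I, T>(row: &Row, index: I) -> anyhow::Result<T>
where
    I: ColumnIndex,
    T: FromValue,
{
    // get_opt returns None if the column is missing or its value was already taken,
    // and Some(Err(..)) if the value cannot be converted to T (for example, NULL into i64).
    row.get_opt(index)
        .ok_or_else(|| anyhow::anyhow!("no such column, or value already taken"))?
        .map_err(anyhow::Error::from)
}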
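To make the idea concrete without a database, here is a minimal, self-contained sketch of the same pattern. Everything in it (Cell, ExtractError, FromCell) is a hypothetical stand-in I made up for illustration, not part of the mysql crate; it only shows how an extract helper can turn "column missing" and "NULL or wrong type" into errors, so the caller never has to deal with Option and nothing panics.

```rust
// Stand-in for a database cell; NOT the mysql crate's Value type.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    Null,
    Int(i64),
    Text(String),
}

// Hypothetical error type covering the two failure modes.
#[derive(Debug, PartialEq)]
enum ExtractError {
    NoSuchColumn(usize),
    WrongType { column: usize, found: Cell },
}

// Hypothetical conversion trait, playing the role FromValue plays in the answer.
trait FromCell: Sized {
    fn from_cell(cell: &Cell) -> Option<Self>;
}

impl FromCell for i64 {
    fn from_cell(cell: &Cell) -> Option<Self> {
        match cell {
            Cell::Int(n) => Some(*n),
            _ => None, // NULL and text are rejected, not unwrapped
        }
    }
}

impl FromCell for String {
    fn from_cell(cell: &Cell) -> Option<Self> {
        match cell {
            Cell::Text(s) => Some(s.clone()),
            _ => None,
        }
    }
}

// Either returns the converted value or an error; it never panics.
fn extract<T: FromCell>(row: &[Cell], index: usize) -> Result<T, ExtractError> {
    let cell = row.get(index).ok_or(ExtractError::NoSuchColumn(index))?;
    T::from_cell(cell).ok_or_else(|| ExtractError::WrongType {
        column: index,
        found: cell.clone(),
    })
}
```

With this, let id: i64 = extract(&row, 0)?; reads cleanly, and a NULL in a column expected to be non-null surfaces as an Err rather than a panic.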


How to query using an IN clause and a `Vec` as parameter in Rust sqlx for MySQL?

Note: this is similar to, but NOT a duplicate of, How to use sqlx to query mysql IN a slice?. I'm asking about the Rust version.
This is what I'm trying to do:
let v = vec![..];
sqlx::query("SELECT something FROM table WHERE column IN (?)").bind(v)
...
Then I got the following error
the trait bound `std::vec::Vec<u64>: sqlx::Encode<'_, _>` is not satisfied
The answer is the first entry in the FAQ: https://github.com/launchbadge/sqlx/blob/master/FAQ.md
How can I do a SELECT ... WHERE foo IN (...) query? In 0.6 SQLx will
support binding arrays as a comma-separated list for every database,
but unfortunately there's no general solution for that currently in
SQLx itself. You would need to manually generate the query, at which
point it cannot be used with the macros.
The error says that Vec<u64> does not implement Encode, the trait a type must implement to be bound as a query parameter. The Encode docs list all the Rust types that implement it; Vec<u64> is not one of them for MySQL.
You can bind the values of a vector to an IN clause as follows. First, expand the IN expression so that it contains one '?' per value. Then call bind once per value, in order.
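let v = vec![1, 2];
// One "?" per value. Note that v.len() - 1 underflows if v is empty, so handle that case first.
let params = format!("?{}", ", ?".repeat(v.len() - 1));
let query_str = format!("SELECT id FROM test_table WHERE id IN ({})", params);
let mut query = sqlx::query(&query_str);
// Bind the values one by one, in placeholder order.
for i in v {
    query = query.bind(i);
}
let rows = query.fetch_all(&pool).await?;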
Please note that the placeholder syntax depends on the target database: PostgreSQL, for example, uses numbered placeholders such as $1, $2 instead of ?.
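As a small sketch of the placeholder step on its own, the helper below builds the "?, ?, ?" list and refuses an empty input instead of underflowing. The names in_placeholders and select_by_ids_sql are made up for this example (they are not sqlx functions), and the table and column names are just the ones used above.

```rust
/// Builds the comma-separated `?` list for an IN clause with `n` values.
/// Returns None for n == 0, because an empty `IN ()` list is a syntax error in MySQL.
fn in_placeholders(n: usize) -> Option<String> {
    if n == 0 {
        return None;
    }
    // vec!["?"; n] holds n copies of "?", joined as "?, ?, ?".
    Some(vec!["?"; n].join(", "))
}

/// Builds the full query text for `n` ids, or None if there is nothing to look up.
fn select_by_ids_sql(n: usize) -> Option<String> {
    in_placeholders(n)
        .map(|p| format!("SELECT id FROM test_table WHERE id IN ({})", p))
}
```

The caller can then treat None as "return an empty result without querying", and otherwise pass the string to sqlx::query and bind each value in order as shown above.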

Why are these two function types in typescript different?

Using TypeScript, I'm trying to understand why two functions assigned to different local variables end up with different signatures. I thought one was just more explicit.
let a: (number)=>number =
function(x: number): number {return 42;};
let z = function(x:number): number { return 42; };
> .type a
let a: (number: any) => number
> .type z
let z: (x: number) => number
I thought a was just a more explicit version of writing z, but somehow it gets typed more liberally as accepting any.
Using Typescript version 2.5.2
let a: (number)=>number
The parameter name is required. This is exactly equivalent to:
let a: (number: any)=>number
In other words, the first number here defines a parameter named "number", and since it has no type annotation its type is any.
What you need is:
let a: (x: number)=>number =
function(x: number): number {return 42;};
The name, x, doesn't matter.

Pass Function to reduce duplicate code

I'm trying to learn F# and I feel like I can write / rewrite this block of code to be more "idiomatic" F# but I just can't figure out how I can accomplish it.
My simple program will be loading values from 2 csv files: A list of Skyrim potion effects, and a list of Skyrim Ingredients. An ingredient has 4 Effects. Once I have the Ingredients, I can write something to process them - right now, I just want to write the CSV load in a way that makes sense.
Code
Here are my types:
type Effect(name:string, id, description, base_cost, base_mag, base_dur, gold_value) =
member this.Name = name
member this.Id = id
member this.Description = description
member this.Base_Cost = base_cost
member this.Base_Mag = base_mag
member this.Base_Dur = base_dur
member this.GoldValue = gold_value
type Ingredient(name:string, id, primary, secondary, tertiary, quaternary, weight, value) =
member this.Name = name
member this.Id = id
member this.Primary = primary
member this.Secondary = secondary
member this.Tertiary = tertiary
member this.Quaternary = quaternary
member this.Weight = weight
member this.Value = value
Here is where I parse an individual comma-separated string, per type:
let convertEffectDataRow (csvLine:string) =
let cells = List.ofSeq(csvLine.Split(','))
match cells with
| name::id::effect::cost::mag::dur::value::_ ->
let effect = new Effect(name, id, effect, Decimal.Parse(cost), Int32.Parse(mag), Int32.Parse(dur), Int32.Parse(value))
Success effect
| _ -> Failure "Incorrect data format!"
let convertIngredientDataRow (csvLine:string) =
let cells = List.ofSeq(csvLine.Split(','))
match cells with
| name::id::primary::secondary::tertiary::quaternary::weight::value::_ ->
Success (new Ingredient(name, id, primary, secondary, tertiary, quaternary, Decimal.Parse(weight), Int32.Parse(value)))
| _ -> Failure "Incorrect data format!"
So I feel like I should be able to build a function that accepts one of these functions or chains them or something, so that I can recursively go through the lines in the CSV file and pass those lines to the correct function above. Here is what I've tried so far:
type csvTypeEnum = effect=1 | ingredient=2
let rec ProcessStuff lines (csvType:csvTypeEnum) =
match csvType, lines with
| csvTypeEnum.effect, [] -> []
| csvTypeEnum.effect, currentLine::remaining ->
let parsedLine = convertEffectDataRow2 currentLine
let parsedRest = ProcessStuff remaining csvType
parsedLine :: parsedRest
| csvTypeEnum.ingredient, [] -> []
| csvTypeEnum.ingredient, currentLine::remaining ->
let parsedLine = convertIngredientDataRow2 currentLine
let parsedRest = ProcessStuff remaining csvType
parsedLine :: parsedRest
| _, _ -> Failure "Error in pattern matching"
But this (predictably) has a compile error on second instance of recursion and the last pattern. Specifically, the second time parsedLine :: parsedRest shows up does not compile. This is because the function is attempting to both return an Effect and an Ingredient, which obviously won't do.
Now, I could just write 2 entirely different functions to handle the different CSVs, but that feels like extra duplication. This might be a harder problem than I'm giving it credit for, but it feels like this should be rather straightforward.
Sources
The CSV parsing code I took from chapter 4 of this book: https://www.manning.com/books/real-world-functional-programming
Since the line types aren't interleaved into the same file and they refer to different csv file formats, I would probably not go for a Discriminated Union and instead pass the processing function to the function that processes the file line by line.
In terms of doing things idiomatically, I would use a Record rather than a standard .NET class for this kind of simple data container. Records provide automatic equality and comparison implementations which are useful in F#.
You can define them like this:
type Effect = {
Name : string; Id: string; Description : string; BaseCost : decimal;
BaseMag : int; BaseDuration : int; GoldValue : int
}
type Ingredient= {
Name : string; Id: string; Primary: string; Secondary : string; Tertiary : string;
Quaternary : string; Weight : decimal; GoldValue : int
}
That requires a change to the conversion function, e.g.
let convertEffectDataRow (csvLine:string) =
    let cells = List.ofSeq(csvLine.Split(','))
    match cells with
    | name::id::effect::cost::mag::dur::value::_ ->
        Success {Name = name; Id = id; Description = effect; BaseCost = Decimal.Parse(cost);
                 BaseMag = Int32.Parse(mag); BaseDuration = Int32.Parse(dur); GoldValue = Int32.Parse(value)}
    | _ -> Failure "Incorrect data format!"
Hopefully it's obvious how to do the other one.
Finally, cast aside the enum and simply replace it with the appropriate line function (I've also swapped the order of the arguments).
let rec processStuff f lines =
    match lines with
    | [] -> []
    | current::remaining -> f current :: processStuff f remaining
The argument f is just a function that is applied to each string line. Suitable f values are the functions we created above, e.g. convertEffectDataRow. So you can simply call processStuff convertEffectDataRow to process an effects file and processStuff convertIngredientDataRow to process an ingredients file.
However, now we've simplified the processStuff function, we can see it has type: f:('a -> 'b) -> lines:'a list -> 'b list. This is the same as the built-in List.map function so we can actually remove this custom function entirely and just use List.map.
let processEffectLines lines = List.map convertEffectDataRow lines
let processIngredientLines lines = List.map convertIngredientDataRow lines
(optional) Convert Effect and Ingredient to records, as s952163 suggested.
Think carefully about the return types of your functions. ProcessStuff returns a list from one case, but a single item (Failure) from the other case. Thus compilation error.
You haven't shown what Success and Failure definitions are. Instead of generic success, you could define the result as
type Result =
| Effect of Effect
| Ingredient of Ingredient
| Failure of string
And then the following code compiles correctly:
let convertEffectDataRow (csvLine:string) =
let cells = List.ofSeq(csvLine.Split(','))
match cells with
| name::id::effect::cost::mag::dur::value::_ ->
let effect = new Effect(name, id, effect, Decimal.Parse(cost), Int32.Parse(mag), Int32.Parse(dur), Int32.Parse(value))
Effect effect
| _ -> Failure "Incorrect data format!"
let convertIngredientDataRow (csvLine:string) =
let cells = List.ofSeq(csvLine.Split(','))
match cells with
| name::id::primary::secondary::tertiary::quaternary::weight::value::_ ->
Ingredient (new Ingredient(name, id, primary, secondary, tertiary, quaternary, Decimal.Parse(weight), Int32.Parse(value)))
| _ -> Failure "Incorrect data format!"
type csvTypeEnum = effect=1 | ingredient=2
let rec ProcessStuff lines (csvType:csvTypeEnum) =
match csvType, lines with
| csvTypeEnum.effect, [] -> []
| csvTypeEnum.effect, currentLine::remaining ->
let parsedLine = convertEffectDataRow currentLine
let parsedRest = ProcessStuff remaining csvType
parsedLine :: parsedRest
| csvTypeEnum.ingredient, [] -> []
| csvTypeEnum.ingredient, currentLine::remaining ->
let parsedLine = convertIngredientDataRow currentLine
let parsedRest = ProcessStuff remaining csvType
parsedLine :: parsedRest
| _, _ -> [Failure "Error in pattern matching"]
csvTypeEnum type looks fishy, but I'm not sure what you were trying to achieve, so just fixed the compilation errors.
Now you can refactor your code to reduce duplication by passing functions as parameters when needed. But always start with types!
You can certainly pass a function to another function and use a DU as a return type, for example:
type CsvWrapper =
| CsvA of string
| CsvB of int
let csvAfunc x =
CsvA x
let csvBfunc x =
CsvB x
let csvTopFun x =
x
csvTopFun csvBfunc 5
csvTopFun csvAfunc "x"
As for the type definitions, you can just use records, will save you some typing:
type Effect = {
name:string
id: int
description: string
}
let eff = {name="X";id=9;description="blah"}

Deedle Frame From Database, What is the Schema?

I am new to Deedle, and I can't find in the documentation how to solve my problem.
I bind an SQL Table to a Deedle Frame using the following code:
namespace teste
open FSharp.Data.Sql
open Deedle
open System.Linq
module DatabaseService =
[<Literal>]
let connectionString = "Data Source=*********;Initial Catalog=*******;Persist Security Info=True;User ID=sa;Password=****";
type bd = SqlDataProvider<
ConnectionString = connectionString,
DatabaseVendor = Common.DatabaseProviderTypes.MSSQLSERVER >
type Database() =
static member contextDbo() =
bd.GetDataContext().Dbo
static member acAgregations() =
Database.contextDbo().AcAgregations |> Frame.ofRecords
static member acBusyHourDefinition() =
Database.contextDbo().AcBusyHourDefinition
|> Frame.ofRecords "alternative_reference_table_scan", "formula"]
static member acBusyHourDefinitionFilterByTimeAgregationTipe(value:int) =
Database.acBusyHourDefinition()
|> Frame.getRows
These things are not working properly because I can't understand the data frame schema; to my surprise, it is not a representation of the table.
My question is:
How can I access my database elements by rows instead of columns (columns are the Deedle default)? I tried what is shown in the documentation, but unfortunately the column names are not recognized the way they are in the CSV example on the Deedle website.
With Frame.ofRecords you can extract the table into a dataframe and then operate on its rows or columns. In this case I have a very simple table. This is for SQL Server but I assume MySQL will work the same. If you provide more details in your question, the solution can be narrowed down.
This is the table, indexed by ID, which is Int64:
You can work with the rows or the columns:
#if INTERACTIVE
#load @"..\..\FSLAB\packages\FsLab\FsLab.fsx"
#r "System.Data.Linq.dll"
#r "FSharp.Data.TypeProviders.dll"
#endif
//open FSharp.Data
//open System.Data.Linq
open Microsoft.FSharp.Data.TypeProviders
open Deedle
[<Literal>]
let connectionString1 = @"Data Source=(LocalDB)\MSSQLLocalDB;AttachDbFilename=C:\Users\userName\Documents\tes.sdf.mdf"
type dbSchema = SqlDataConnection<connectionString1>
let dbx = dbSchema.GetDataContext()
let table1 = dbx.Table_1
query { for row in table1 do
select row} |> Seq.takeWhile (fun x -> x.ID < 10L) |> Seq.toList
// check if we can connect to the DB.
let df = table1 |> Frame.ofRecords // pull the table into a df
let df = df.IndexRows<System.Int64>("ID") // if you need an index
df.GetRows(2L) // Get the second row, but this can be any kind of index/key
df.["Number"].GetSlice(Some 2L, Some 5L) // get the 2nd to 5th row from the Number column
Will get you the following output:
val it : Series<System.Int64,float> =
2 -> 2
>
val it : Series<System.Int64,float> =
2 -> 2
3 -> 3
4 -> 4
5 -> 5
Depending on what you're trying to do Selecting Specific Rows in Deedle might also work.
Edit
From your comment you appear to be working with a large table. Depending on how much memory you have and how large the table is, you still might be able to load it. If not, these are some of the things you can do, in increasing order of complexity:
Use a query { } expression like above to narrow the dataset on the database server and convert just part of the result into a dataframe. You can do quite complex transformations so you might not even need the dataframe in the end. This is basically Linq2Sql.
Use lazy loading in Deedle. This works with series so you can get a few series and reassemble a dataframe.
Use Big Deedle which is designed for this sort of thing.

how to time an arbitrary function in f#

Here's the problem: I need to time a function in F# using another function. I have this piece of code:
let time f a =
let start = System.DateTime.Now in
let res = (fun f a -> f(a)) in
let finish = System.DateTime.Now in
(res, finish - start)
which I'm trying to call saying
time ackermann (2,9);;
I have a function ackermann that takes a tuple (s,n) as argument
There is probably something fundamentally wrong with this, but I don't think I'm far from a solution that looks somewhat like it.
Any suggestions?
Oh btw. the error message I'm getting is saying :
stdin(19,1): error FS0030: Value restriction. The value 'it' has been inferred to have generic type
val it : (('_a -> '_b) -> '_a -> '_b) * System.TimeSpan
Either define 'it' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.
You have at least two issues:
Try let res = f a. You already have values f and a in scope, but you're currently defining res as a function which takes a new f and applies it to a new a.
Don't use DateTimes (which are appropriate for representing dates and times, but not short durations). Instead, you should be using a System.Diagnostics.Stopwatch.
You can do something like this:
let time f =
    let sw = System.Diagnostics.Stopwatch.StartNew()
    let r = f()
    sw.Stop()
    printfn "%O" sw.Elapsed
    r
Usage
time (fun () -> System.Threading.Thread.Sleep(100))
I usually keep the following in my code files when sending a bunch of stuff to fsi.
#if INTERACTIVE
#time "on"
#endif
That turns on fsi's built-in timing, which provides more than just execution time:
Real: 00:00:00.099, CPU: 00:00:00.000, GC gen0: 0, gen1: 0, gen2: 0
I would do it like this:
let time f = fun a ->
    let start = System.DateTime.Now
    let res = f a
    (res, System.DateTime.Now - start)
You can then use it to create timed functions e.g.
let timedAckermann = time ackermann
let (res, period) = timedAckermann (2,9)
You should also consider using System.Diagnostics.Stopwatch for timing instead of DateTimes.
As was already suggested, you should use Stopwatch instead of DateTime for this kind of timing analyses.
What wasn't mentioned yet is that if you for some reason need to use DateTime, then always consider using DateTime.UtcNow rather than DateTime.Now. The implementation of DateTime.Now can be paraphrased as "DateTime.UtcNow.ToLocalTime()", and that "ToLocalTime()" part is doing more than you might think it would do. In addition to having less overhead, DateTime.UtcNow also avoids headaches related to daylight saving time. You can find several articles and blog posts on the web about the differences between DateTime.Now and DateTime.UtcNow.
Inspired by how FSharp Interactive does it (see https://github.com/Microsoft/visualfsharp/blob/master/src/fsharp/fsi/fsi.fs#L175), this will time the function plus report how much CPU, allocation, etc.
Example output: Real: 00:00:00.2592820, CPU: 00:00:26.1814902, GC gen0: 30, gen1: 1, gen2: 0
let time f =
    let ptime = System.Diagnostics.Process.GetCurrentProcess()
    let numGC = System.GC.MaxGeneration
    let startTotal = ptime.TotalProcessorTime
    let startGC = [| for i in 0 .. numGC -> System.GC.CollectionCount(i) |]
    let stopwatch = System.Diagnostics.Stopwatch.StartNew()
    let res = f ()
    stopwatch.Stop()
    let total = ptime.TotalProcessorTime - startTotal
    let spanGC = [ for i in 0 .. numGC -> System.GC.CollectionCount(i) - startGC.[i] ]
    let elapsed = stopwatch.Elapsed
    printfn "Real: %A, CPU: %A, GC %s" elapsed total ( spanGC |> List.mapi (sprintf "gen%i: %i") |> String.concat ", ")
    res