I started writing a program to compare two CSV files. After reading the documentation, I found a solution using reflect.DeepEqual, but I can't figure out how to print the differences from the second file, since the function only returns true/false.
package main

import (
    "encoding/csv"
    "fmt"
    "os"
    "reflect"
)

func main() {
    file, err := os.Open("sms_in_max.csv")
    if err != nil {
        fmt.Println(err)
    }
    reader := csv.NewReader(file)
    records, _ := reader.ReadAll()
    fmt.Println(records)

    file2, err := os.Open("sms_out.csv")
    if err != nil {
        fmt.Println(err)
    }
    reader2 := csv.NewReader(file2)
    records2, _ := reader2.ReadAll()
    fmt.Println(records2)

    allrs := reflect.DeepEqual(records, records2)
    fmt.Println(allrs)
}
The csv ReadAll() function returns a slice of rows, where each row is a slice of columns.
We can loop over the rows and, within each row, loop over the columns and compare each column value.
Here is code that prints every line that has a difference, along with its line number.
package main

import (
    "encoding/csv"
    "fmt"
    "os"
)

func main() {
    file, err := os.Open("sms_in_max.csv")
    if err != nil {
        fmt.Println(err)
    }
    reader := csv.NewReader(file)
    records, _ := reader.ReadAll()
    fmt.Println(records)

    file2, err := os.Open("sms_out.csv")
    if err != nil {
        fmt.Println(err)
    }
    reader2 := csv.NewReader(file2)
    records2, _ := reader2.ReadAll()
    fmt.Println(records2)

    // allrs := reflect.DeepEqual(records, records2)
    // fmt.Println(allrs)

    // Print the lines at which there is a difference.
    // Note: this assumes both files have the same number of rows and
    // the same number of columns per row; it panics otherwise.
    for i := range records {
        diff := false
        for j := range records[i] {
            if records[i][j] != records2[i][j] {
                diff = true
                break
            }
        }
        if diff {
            fmt.Printf("Line %d: %v, %v\n", i+1, records[i], records2[i])
        }
    }
}
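If the two files can differ in shape, the indexing above will panic, since records2 is accessed with indexes taken from records. A minimal sketch of a bounds-safe comparison helper (rowsDiffer is a hypothetical name, not part of the original code):

// rowsDiffer reports whether two CSV rows differ, treating a
// length mismatch as a difference rather than panicking.
func rowsDiffer(a, b []string) bool {
    if len(a) != len(b) {
        return true
    }
    for i := range a {
        if a[i] != b[i] {
            return true
        }
    }
    return false
}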
Is there a way to insert a CSV file using the Go library https://github.com/ClickHouse/clickhouse-go in one command (without reading the CSV and iterating through its content)? If there is, can you provide me with an example?
If not, how can we convert this system command to Go using the os/exec library?
cat /home/srijan/employee.csv | clickhouse-client --query="INSERT INTO test1 FORMAT CSV"
It's not possible with that Go library. You can use the HTTP API (https://clickhouse.com/docs/en/interfaces/http/) with any Go HTTP client, for example:
package main

import (
    "compress/gzip"
    "fmt"
    "io"
    "io/ioutil"
    "net/http"
    "net/url"
    "os"
)

// compress gzips data on the fly through an io.Pipe, so the request
// body can be streamed without buffering it all in memory.
func compress(data io.Reader) io.Reader {
    pr, pw := io.Pipe()
    gw, err := gzip.NewWriterLevel(pw, 3)
    if err != nil {
        panic(err)
    }
    go func() {
        _, _ = io.Copy(gw, data)
        gw.Close()
        pw.Close()
    }()
    return pr
}

func main() {
    p, err := url.Parse("http://localhost:8123/")
    if err != nil {
        panic(err)
    }
    q := p.Query()
    q.Set("query", "INSERT INTO test1 FORMAT CSV")
    p.RawQuery = q.Encode()
    queryUrl := p.String()

    req, err := http.NewRequest("POST", queryUrl, compress(os.Stdin))
    if err != nil {
        // Check the error before touching req: calling Header.Add on a
        // nil request would panic.
        panic(err)
    }
    req.Header.Add("Content-Encoding", "gzip")

    client := &http.Client{
        Transport: &http.Transport{DisableKeepAlives: true},
    }
    resp, err := client.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()
    body, _ := ioutil.ReadAll(resp.Body)
    if resp.StatusCode != 200 {
        panic(fmt.Errorf("clickhouse response status %d: %s", resp.StatusCode, string(body)))
    }
}
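For the second part of the question, the shell pipeline can also be reproduced with os/exec by attaching the CSV file to clickhouse-client's stdin. A minimal sketch (the file path and table name are taken from the question; it assumes clickhouse-client is on the PATH):

package main

import (
    "log"
    "os"
    "os/exec"
)

func main() {
    f, err := os.Open("/home/srijan/employee.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    // Equivalent of: cat employee.csv | clickhouse-client --query="INSERT INTO test1 FORMAT CSV"
    cmd := exec.Command("clickhouse-client", "--query=INSERT INTO test1 FORMAT CSV")
    cmd.Stdin = f          // the file replaces the `cat ... |` part of the pipeline
    cmd.Stderr = os.Stderr // surface clickhouse-client's error output
    if err := cmd.Run(); err != nil {
        log.Fatal(err)
    }
}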
How can I create an array of map[string]interface{} to hold multiple JSON files (not all JSON merged into one map, but each file separate)? I wrote code that merges all the JSON files into one, but later I will need to compare a field across the maps, so I think I need a loop. Here is my program code:
var master map[string]interface{}

func main() {
    fileIndex := 3 // three json files. All named test1.json, test2.json and test3.json
    for i := 1; i <= fileIndex; i++ {
        fileName := fmt.Sprintf("%s%d%s", "test", i, ".json")
        // Open jsonFile
        jsonFile, err := os.Open(fileName)
        if err != nil {
            log.Println("Error:", err)
        }
        defer jsonFile.Close()
        byteValue, _ := ioutil.ReadAll(jsonFile)
        json.Unmarshal(byteValue, &master)
        fmt.Println(master)
    }
}
And here are my 3 JSON files.

First:

{
    "name": "Kate",
    "date": "2013-04-23T19:24:59.511Z",
    "data": "is nice"
}

Second:

{
    "name": "Gleison",
    "date": "2012-04-23T19:25:00.511Z",
    "data": "is a good person"
}

Third:

{
    "name": "Rodrigo",
    "date": "2013-04-23T20:24:59.511Z",
    "data": "is kind"
}
I need them divided into separate map[string]interface{} values, without creating a struct.
For an array, I think you are looking for []map[string]interface{}. If I understood your question correctly, you can simply create such a slice and append to it.
Here is a modified example:
package main

import (
    "encoding/json"
    "fmt"
    "io/ioutil"
    "log"
    "os"
)

func main() {
    var allMaster []map[string]interface{}
    fileIndex := 3 // three json files. All named test1.json, test2.json and test3.json
    for i := 1; i <= fileIndex; i++ {
        fileName := fmt.Sprintf("%s%d%s", "test", i, ".json")
        // Open jsonFile
        jsonFile, err := os.Open(fileName)
        if err != nil {
            log.Println("Error:", err)
        }
        byteValue, _ := ioutil.ReadAll(jsonFile)
        jsonFile.Close() // close inside the loop; defer would pile up until main returns

        // Declare master inside the loop: Unmarshal reuses a non-nil
        // map, so a single outer map would make every element of
        // allMaster point at the same underlying data.
        var master map[string]interface{}
        err = json.Unmarshal(byteValue, &master)
        if err != nil {
            log.Fatal(err)
        }
        allMaster = append(allMaster, master)
    }
    fmt.Println(allMaster)
}
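Since the question mentions comparing a field across the files later, here is a short sketch of reading one key back out of the slice (the "name" key comes from the sample JSON; this snippet would go at the end of main above):

// Compare the "name" field across the collected maps.
for i, m := range allMaster {
    if name, ok := m["name"].(string); ok {
        fmt.Printf("file %d name: %s\n", i+1, name)
    }
}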
I am trying to build an API, and to secure it properly I believe I need RSA encryption, with a private key stored on my server and a public key for the client. I have stored the generated private key in a JSON file on my server; to write it to JSON I had to convert it to []byte. Now when I try to retrieve the private key to derive the public key, the compiler will not let me use type []byte as a *PublicKey.
The only other way I can think of to accomplish this is to seed the random number generator, keep the seed secret on my server, and have my private key always generate to the same thing. Any help with this would be great.
package main

import (
    "bytes"
    "crypto/rand"
    "crypto/rsa"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "os"
)

func main() {
    mimicPrivateKey, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        fmt.Println(err)
        os.Exit(1)
    }
    buf := new(bytes.Buffer)
    json.NewEncoder(buf).Encode(mimicPrivateKey)
    secrets, _ := os.OpenFile("secrets.json", os.O_RDWR|os.O_APPEND|os.O_CREATE, 0666)
    // Close the secrets file when the surrounding function exits
    secrets.WriteString(buf.String())
    secrets.Close()

    secrets, _ = os.OpenFile("secrets.json", os.O_RDWR, 0666)
    serverKey, _ := ioutil.ReadAll(secrets)
    if serverKey != nil {
        fmt.Println("can not open key")
    }
    serverKeyPublic := &serverKey.PublicKey // does not compile: serverKey is a []byte, not a *rsa.PrivateKey
}
You need to Unmarshal it:

var data *rsa.PrivateKey
err = json.Unmarshal(serverKey, &data)
if err != nil {
    panic(err)
}

And you may use

err = ioutil.WriteFile("secrets.json", buf.Bytes(), 0666)

and

serverKey, err := ioutil.ReadFile("secrets.json")

See:
package main

import (
    "bytes"
    "crypto/rand"
    "crypto/rsa"
    "encoding/json"
    "fmt"
    "io/ioutil"
)

func main() {
    mimicPrivateKey, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        panic(err)
    }
    var buf bytes.Buffer
    err = json.NewEncoder(&buf).Encode(mimicPrivateKey)
    if err != nil {
        panic(err)
    }
    err = ioutil.WriteFile("secrets.json", buf.Bytes(), 0666)
    if err != nil {
        panic(err)
    }
    serverKey, err := ioutil.ReadFile("secrets.json")
    if err != nil {
        panic(err)
    }
    var data *rsa.PrivateKey
    err = json.Unmarshal(serverKey, &data)
    if err != nil {
        panic(err)
    }
    serverKeyPublic := data.PublicKey
    fmt.Println(serverKeyPublic)
}
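As an aside, JSON is not the conventional on-disk format for RSA keys; the standard library's x509 and pem packages are the usual route. A minimal sketch of the same round trip in PEM form (the secrets.pem file name is illustrative):

package main

import (
    "crypto/rand"
    "crypto/rsa"
    "crypto/x509"
    "encoding/pem"
    "fmt"
    "io/ioutil"
)

func main() {
    key, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        panic(err)
    }
    // Serialize to the standard PEM-wrapped PKCS#1 form.
    pemBytes := pem.EncodeToMemory(&pem.Block{
        Type:  "RSA PRIVATE KEY",
        Bytes: x509.MarshalPKCS1PrivateKey(key),
    })
    // "secrets.pem" is an illustrative path, not from the original post.
    if err := ioutil.WriteFile("secrets.pem", pemBytes, 0600); err != nil {
        panic(err)
    }

    // Read it back and recover the public key.
    data, err := ioutil.ReadFile("secrets.pem")
    if err != nil {
        panic(err)
    }
    block, _ := pem.Decode(data)
    parsed, err := x509.ParsePKCS1PrivateKey(block.Bytes)
    if err != nil {
        panic(err)
    }
    fmt.Println(parsed.PublicKey)
}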
The Go code below reads in a 10,000-record CSV (of timestamps and float values), runs some operations on the data, and then writes the original values to another CSV along with an additional column for score. However, it is terribly slow (hours, though most of that is calculateStuff()), and I'm curious whether there are any inefficiencies in the CSV reading/writing I can take care of.
package main

import (
    "encoding/csv"
    "log"
    "os"
    "strconv"
)

func ReadCSV(filepath string) ([][]string, error) {
    csvfile, err := os.Open(filepath)
    if err != nil {
        return nil, err
    }
    defer csvfile.Close()

    reader := csv.NewReader(csvfile)
    fields, err := reader.ReadAll()
    return fields, err
}

func main() {
    // load data csv
    records, err := ReadCSV("./path/to/datafile.csv")
    if err != nil {
        log.Fatal(err)
    }

    // write results to a new csv
    outfile, err := os.Create("./where/to/write/resultsfile.csv")
    if err != nil {
        log.Fatal("Unable to open output")
    }
    defer outfile.Close()
    writer := csv.NewWriter(outfile)

    for i, record := range records {
        time := record[0]
        value := record[1]

        // skip header row
        if i == 0 {
            writer.Write([]string{time, value, "score"})
            continue
        }

        // get float values
        floatValue, err := strconv.ParseFloat(value, 64)
        if err != nil {
            log.Fatalf("Record: %v, Error: %v", floatValue, err)
        }

        // calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
        score := calculateStuff(floatValue)

        valueString := strconv.FormatFloat(floatValue, 'f', 8, 64)
        scoreString := strconv.FormatFloat(score, 'f', 8, 64)
        //fmt.Printf("Result: %v\n", []string{time, valueString, scoreString})
        writer.Write([]string{time, valueString, scoreString})
    }
    writer.Flush()
}
I'm looking for help making this CSV read/write template code as fast as possible. For the scope of this question we need not worry about the calculateStuff method.
You're loading the whole file into memory and then processing it, which can be slow with a big file.
Instead, loop and call .Read, processing one line at a time.
func processCSV(rc io.Reader) (ch chan []string) {
    ch = make(chan []string, 10)
    go func() {
        r := csv.NewReader(rc)
        if _, err := r.Read(); err != nil { // read and discard the header
            log.Fatal(err)
        }
        defer close(ch)
        for {
            rec, err := r.Read()
            if err != nil {
                if err == io.EOF {
                    break
                }
                log.Fatal(err)
            }
            ch <- rec
        }
    }()
    return
}
Note: it's roughly based on Dave C's comment.
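A quick sketch of consuming that channel (the file path is the one from the question; this assumes the same imports as the surrounding code, plus "os"):

f, err := os.Open("./path/to/datafile.csv")
if err != nil {
    log.Fatal(err)
}
defer f.Close()

for rec := range processCSV(f) {
    // rec is one data row; processCSV has already skipped the header
    _ = rec // process the record here
}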
This is essentially Dave C's answer from the comments section:
package main

import (
    "encoding/csv"
    "io"
    "log"
    "os"
    "strconv"
)

func main() {
    // setup reader
    csvIn, err := os.Open("./path/to/datafile.csv")
    if err != nil {
        log.Fatal(err)
    }
    r := csv.NewReader(csvIn)

    // setup writer
    csvOut, err := os.Create("./where/to/write/resultsfile.csv")
    if err != nil {
        log.Fatal("Unable to open output")
    }
    w := csv.NewWriter(csvOut)
    defer csvOut.Close()

    // handle header
    rec, err := r.Read()
    if err != nil {
        log.Fatal(err)
    }
    rec = append(rec, "score")
    if err = w.Write(rec); err != nil {
        log.Fatal(err)
    }

    for {
        rec, err = r.Read()
        if err != nil {
            if err == io.EOF {
                break
            }
            log.Fatal(err)
        }
        // get float value
        value := rec[1]
        floatValue, err := strconv.ParseFloat(value, 64)
        if err != nil {
            log.Fatalf("Record, error: %v, %v", value, err)
        }
        // calculate scores; THIS EXTERNAL METHOD CANNOT BE CHANGED
        score := calculateStuff(floatValue)
        scoreString := strconv.FormatFloat(score, 'f', 8, 64)
        rec = append(rec, scoreString)
        if err = w.Write(rec); err != nil {
            log.Fatal(err)
        }
    }
    // flush once at the end rather than per record
    w.Flush()
}
Note of course the logic is all jammed into main(); it would be better to split it into several functions, but that's beyond the scope of this question.
encoding/csv is indeed very slow on big files, as it performs a lot of allocations. Since your format is so simple, I recommend using strings.Split instead, which is much faster.
If even that is not fast enough, you can consider implementing the parsing yourself using strings.IndexByte, which is implemented in assembly: http://golang.org/src/strings/strings_decl.go?s=274:310#L1
Having said that, you should also reconsider using ReadAll if the file is larger than your memory.
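A minimal sketch of the strings.Split approach, assuming the same simple two-column file as the question and no quoted or comma-containing fields (the file path is from the question):

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    "strings"
)

func main() {
    f, err := os.Open("./path/to/datafile.csv")
    if err != nil {
        log.Fatal(err)
    }
    defer f.Close()

    scanner := bufio.NewScanner(f)
    for scanner.Scan() {
        // Split is only safe here because the fields contain no
        // embedded commas or quotes; encoding/csv handles those cases.
        cols := strings.Split(scanner.Text(), ",")
        fmt.Println(cols[0], cols[1])
    }
    if err := scanner.Err(); err != nil {
        log.Fatal(err)
    }
}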
Trying to get the result into a JSON string, I have to use MapScan because I have no structs that represent the data, so here is what I did:
import (
    "encoding/json"
    "fmt"
    "log"

    "github.com/jmoiron/sqlx"
    _ "github.com/go-sql-driver/mysql"
)

func main() {
    db, err := sqlx.Connect("mysql", "uname:pwd@/db")
    if err != nil {
        log.Fatal(err)
    }

    m := map[string]interface{}{}
    // Go through rows
    rows, err := db.Queryx("SELECT id,cname FROM items")
    if err != nil {
        log.Fatal(err)
    }
    for rows.Next() {
        err := rows.MapScan(m)
        if err != nil {
            log.Fatal(err)
        }
    }

    // Marshal the map
    b, _ := json.Marshal(m)
    // Print the resulting JSON
    fmt.Printf("Marshalled data: %s\n", b)
}
The output is Marshalled data: {"cname":"c29tZWl0ZW0","id":"MA=="}
but it should be Marshalled data: {"cname":"someitem","id":0}
I am not sure how to get around this, since the values come back base64 encoded. Any ideas?
The MySQL driver hands MapScan the column values as []byte, and encoding/json marshals a []byte as a base64 string — that is where "c29tZWl0ZW0" comes from. Just iterate over your map and convert the byte slices to strings prior to marshaling:

for k, v := range m {
    if b, ok := v.([]byte); ok {
        m[k] = string(b)
    }
}
b, _ := json.Marshal(m)

This prints Marshalled data: {"cname":"someitem","id":"0"}. Note that id is still a JSON string; parse it with strconv if you need a JSON number.