I've been trying to get the go yaml package to parse a file with jsonlines entries.
Below is a simple example with three options of data to be parsed.
Option one is a multi-doc yaml example. Both docs parse ok.
Option two is a two jsonline example. The first line parses ok, but the second is missed.
Option three is a two jsonline example, but I've put yaml doc separators in between, to force the issue. Both of these parse ok.
From reading the yaml and json specs, I believe the second option, multiple jsonlines, ought to be handled by a yaml parser.
My questions are:
Should a YAML parser cope with jsonlines?
Am I using the go yaml package correctly?
package main

import (
	"bytes"
	"fmt"
	"reflect"
	"strings"

	"gopkg.in/yaml.v2"
)

var testData = []string{
	`
---
option_one_first_yaml_doc: ok_here
---
option_one_second_yaml_doc: ok_here
`,
	`
{option_two_first_jsonl: ok_here}
{option_two_second_jsonl: missing}
`,
	`
---
{option_three_first_jsonl: ok_here}
---
{option_three_second_jsonl: ok_here}
`}

func printVal(v interface{}, depth int) {
	typ := reflect.TypeOf(v)
	if typ == nil {
		fmt.Printf(" %v\n", "<null>")
	} else if typ.Kind() == reflect.Int || typ.Kind() == reflect.String {
		fmt.Printf("%s%v\n", strings.Repeat(" ", depth), v)
	} else if typ.Kind() == reflect.Slice {
		fmt.Printf("\n")
		printSlice(v.([]interface{}), depth+1)
	} else if typ.Kind() == reflect.Map {
		fmt.Printf("\n")
		printMap(v.(map[interface{}]interface{}), depth+1)
	}
}

func printMap(m map[interface{}]interface{}, depth int) {
	for k, v := range m {
		fmt.Printf("%sKey: %s Value(s):", strings.Repeat(" ", depth), k.(string))
		printVal(v, depth+1)
	}
}

func printSlice(slc []interface{}, depth int) {
	for _, v := range slc {
		printVal(v, depth+1)
	}
}

func main() {
	m := make(map[interface{}]interface{})
	for _, data := range testData {
		yamlData := bytes.NewReader([]byte(data))
		decoder := yaml.NewDecoder(yamlData)
		for decoder.Decode(&m) == nil {
			printMap(m, 0)
			m = make(map[interface{}]interface{})
		}
	}
}
jsonlines is newline-delimited JSON. That means each individual line is a JSON document, but several lines taken together are not, and certainly not a whole file of multiple lines.
You will need to read the jsonlines input a line at a time. Each of those lines you should be able to process with go yaml, since YAML is a superset of JSON.
Since your test data also contains YAML document separator (---) lines, you need to handle those as well.
I'm trying to solve a task where I must find the one file containing CSV data among other files with similar names and the same size, and print the number in the 5th row, 3rd column (indexes 4 and 2).
So I wrote this code:
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"path/filepath"
)

var s [][]string

func walkfunc(path string, info os.FileInfo, err error) error {
	if err != nil {
		return err
	}
	buf, err1 := os.Open(path)
	if err1 == nil {
		var err2 error
		r := csv.NewReader(buf)
		s, err2 = r.ReadAll()
		if err2 == nil {
			fmt.Printf("found: %v", s[4][2])
		}
	}
	defer buf.Close()
	return nil
}

func main() {
	const root = "./task/"
	if err := filepath.Walk(root, walkfunc); err != nil {
		fmt.Printf("error: %v", err)
	}
}
And I got this output:
GOROOT=/usr/local/go #gosetup
GOPATH=/usr/local/go/bin #gosetup
/usr/local/go/bin/go build -o /private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go /Users/user/Downloads/zadacha/qwasd3.go #gosetup
/private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go
panic: runtime error: index out of range [4] with length 3
goroutine 1 [running]:
main.walkfunc({0x14000018120?, 0x0?}, {0x14000098d88?, 0x10247fe40?}, {0x0?, 0x0?})
/Users/user/Downloads/zadacha/qwasd3.go:23 +0x28c
path/filepath.walk({0x14000018120, 0xe}, {0x1024c9cf8, 0x140000685b0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:433 +0xd0
path/filepath.walk({0x10248d4a8, 0x7}, {0x1024c9cf8, 0x140000684e0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:457 +0x1fc
path/filepath.Walk({0x10248d4a8, 0x7}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:520 +0x6c
main.main()
/Users/user/Downloads/zadacha/qwasd3.go:37 +0x30
Process finished with the exit code 2
What am I doing wrong?
I was running this code on a MacBook.
The file I need contains a table of numbers, and I need to print the number in the 5th row and 3rd column.
As other comments have pointed out, you need to check each CSV to make sure it's actually as big as you expect it to be. You could also add a simple check to try and make sure it's a CSV file before opening it by looking for a ".csv" extension.
Though, to directly address your error... The CSV reader may be able to interpret a plain txt file as CSV and not return an err, like:
buf := strings.NewReader(`A regular text file with 3 lines.
Line2
Line3
`)
r := csv.NewReader(buf)
records, err := r.ReadAll()
if err != nil {
	fmt.Println("could not read all of CSV file!")
	return err
}
fmt.Println(records)
prints:
[[A regular text file with 3 lines.] [Line2] [Line3]]
Just assuming that it's a CSV with the correct number of rows and columns:
fmt.Println("found", records[4][2])
gives the panic message you shared:
panic: runtime error: index out of range [4] with length 3
You at least need to check that your CSV has 5 rows, and if it does, then check if the 5th row has 3 columns before you try to read that field:
if len(records) < 5 {
	fmt.Println(path, "does not have 5 rows")
	return nil
}
if len(records[4]) < 3 {
	fmt.Println(path, "5th row does not have 3 columns")
	return nil
}
fmt.Println("found", records[4][2])
You could also do, inside your walkfunc, a basic check of the file path itself to see if it looks like a CSV:
// filepath.Ext is safe for short paths, where slicing path[len(path)-4:] would panic.
if strings.ToLower(filepath.Ext(path)) != ".csv" {
	fmt.Println(path, "is not a CSV")
	return nil
}
I show all this code, plus a fully worked/integrated example in this Playground.
Using Go, json.Unmarshal fails with an error because of a comma before "]".
import (
	"encoding/json"
	"github.com/c2h5oh/datasize"
	xdsboot "github.com/envoyproxy/go-control-plane/envoy/config/bootstrap/v2"
	"github.com/golang/protobuf/jsonpb"
)

err = json.Unmarshal(content, cfg)
if err != nil {
	log.StartLogger.Fatalf("[config] [default load] json unmarshal config failed, error: %v", err)
}
return cfg
The error returned:
2021-07-14 06:53:39,637 [FATAL] [config] [default load] json unmarshal config failed, error: invalid character ']' looking for beginning of value
I reproduced it with a unit test that reads the input file:
func TestMosnEnvoyMode(t *testing.T) {
	content, _ := ioutil.ReadFile("./test.json") // this file is input file
	cfg := &MOSNConfig{}
	if err := json.Unmarshal([]byte(content), cfg); err != nil {
		t.Fatal(err)
	}
	if cfg.Mode() != Mix {
		t.Fatalf("config mode is %d", cfg.Mode())
	}
}
File content:
{
  "stats_matcher": {
    "inclusion_list": {
      "patterns": [
        {
          "prefix": "cluster.xds-grpc"
        },
        {
          "suffix": "ssl_context_update_by_sds"
        }, // the error is caused by this trailing ","
      ]
    }
  }
},
}
If I delete the comma here:
{
  "suffix": "ssl_context_update_by_sds"
}, // the error is caused by this trailing ","
it works.
Why does this happen, and which JSON library should I use, given that the input file cannot be changed?
It's better to fix the JSON if you can: trailing commas are not valid JSON, which is why encoding/json rejects them.
That being said, some languages support trailing commas natively, notably JavaScript, so you may see them in your data.
If you cannot change your data, switch to a JSON parser that tolerates them, such as HuJSON (aka Human JSON), which supports both trailing commas and comments in JSON. It's a soft fork of encoding/json, and the last 3 commits are from noted Xoogler and ex-Go team member Brad Fitzpatrick.
repo: https://github.com/tailscale/hujson
docs: https://pkg.go.dev/github.com/tailscale/hujson
The Unmarshal syntax is the same as encoding/json, just use:
err := hujson.Unmarshal(data, v)
I've used it and it works as described.
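The standard-library behaviour behind the question can be checked with a small standalone program. This is only a sketch: decodePatterns is a made-up helper name, and the input is a cut-down version of the "patterns" array from the question.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodePatterns unmarshals a JSON array of string-to-string objects
// with encoding/json.
func decodePatterns(data string) ([]map[string]string, error) {
	var patterns []map[string]string
	err := json.Unmarshal([]byte(data), &patterns)
	return patterns, err
}

func main() {
	withComma := `[{"prefix": "cluster.xds-grpc"}, {"suffix": "ssl_context_update_by_sds"},]`
	withoutComma := `[{"prefix": "cluster.xds-grpc"}, {"suffix": "ssl_context_update_by_sds"}]`

	// The trailing comma makes the decoder expect another value before "]".
	_, err := decodePatterns(withComma)
	fmt.Println("trailing comma:", err)

	patterns, err := decodePatterns(withoutComma)
	fmt.Println("no trailing comma:", len(patterns), err)
}
```

The first call fails with a syntax error and the second succeeds with two entries, which matches what the question observed after deleting the comma.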
I'm new to Golang and have been doing alright, but I have a strange issue that I have not encountered before when using fmt. The strange behavior occurs when I'm printing a string. At the end of the string (which has sub-strings) it also prints what appears to be the len() of each string, although the numbers don't add up. Can anyone explain why this is happening and how to stop it?
Any help is greatly appreciated
Here is the code:
package main

import (
	"fmt"
	//"log"
	"strings"
)

var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"

func main() {
	tl := strings.Trim(e, "[{")
	tr := strings.Trim(tl, "}]")
	r := strings.TrimSpace(tr)
	s := strings.Fields(r)
	V_PK := s[0]
	SERVER_ADDR := s[1]
	A_KEY := s[2]
	vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
	pp, _ := fmt.Printf(",\"server_addr\": %q", SERVER_ADDR)
	kk, _ := fmt.Printf(",\"a_key\": %q}] ", A_KEY)
	rstr, _ := fmt.Println(vv, pp, kk)
	stringc := string(rstr)
	fmt.Println(stringc)
}
Expected output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}]
Actual output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}] 82 36 53
Why on earth would it be printing these string lengths on the end? It's probably obvious that I'm trying to build a JSON string so these numbers on the end are problematic when trying to import the string into a JSON interpreter.
Again, any help is appreciated!
Take a look at the documentation for fmt.Printf and its sibling fmt.Println. The documentation reads:
Printf formats according to a format specifier and writes to standard output. It returns the number of bytes written and any write error encountered.
The line in your code
vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
prints the formatted string to standard output, then returns the number of bytes written, which is stored in vv. If you just want to print the formatted string to standard output, call fmt.Printf and ignore its return values:
package main

import (
	"fmt"
	//"log"
	"strings"
)

var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"

func main() {
	tl := strings.Trim(e, "[{")
	tr := strings.Trim(tl, "}]")
	r := strings.TrimSpace(tr)
	s := strings.Fields(r)
	V_PK := s[0]
	SERVER_ADDR := s[1]
	A_KEY := s[2]
	fmt.Printf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]\n", V_PK, SERVER_ADDR, A_KEY)
}
Or, if you want to store the formatted string to a new string variable, call fmt.Sprintf:
stringc := fmt.Sprintf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]", V_PK, SERVER_ADDR, A_KEY)
fmt.Println(stringc)
You can check out a working version at the playground.
You might also want to check out the json package, which can do the parsing and serializing for you with properly defined structs:
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	type Datum struct {
		VPK    string `json:"v_pk"`
		Server string `json:"server_addr"`
		AKey   string `json:"a_key"`
	}
	data := []Datum{
		{VPK: "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888",
			Server: "localhost:3303",
			AKey:   "4d50f447-7c93-42df-a03e-89c09626950a",
		}}
	// Named out rather than json so the encoding/json package is not shadowed.
	out, err := json.MarshalIndent(data, "", " ")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(string(out))
}
Check it out at the go playground.
fmt.Printf returns the number of bytes written. The variables vv, pp, kk are the number of bytes written by those three Printf calls, and the three numbers printed are those numbers.
Although the output format has been set to text in ~/.aws/config:
[default]
output=text
aws-sdk-go still returns JSON-like output. The question is whether the SDK's output could be switched to text.
When:
aws route53 get-hosted-zone --id some-id
is run, the output looks as follows:
NAMESERVERS some-ns
NAMESERVERS some-ns1
NAMESERVERS some-ns2
NAMESERVERS some-ns3
According to this AWS documentation, one can set the configuration like this:
sess, err := session.NewSession(&aws.Config{
	Region: aws.String("us-east-2")},
)
I consulted the Config struct, but it does not seem to have an Output option.
How can the output be set to text?
Note: an issue has been added to the GitHub page of aws-sdk-go as well.
Example
package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/route53"
)

func main() {
	session, err := session.NewSession()
	if err != nil {
		log.Fatal(err)
	}
	r53 := route53.New(session)
	listParams := &route53.ListResourceRecordSetsInput{
		HostedZoneId: aws.String("some-id"),
	}
	records, err := r53.ListResourceRecordSets(listParams)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(records)
}
returns:
{
  IsTruncated: false,
  MaxItems: "100",
  ResourceRecordSets: [
    {
      Name: "some-domain.",
      ResourceRecords: [{
        Value: "some-ip"
      }],
      TTL: 7200,
      Type: "A"
    }
  ]
}
while aws route53 list-resource-record-sets --hosted-zone-id some-id, results in:
RESOURCERECORDSETS some-domain. 7200 A
RESOURCERECORDS some-ip
Problem
While it is possible to set the output format of the aws-cli to text, it does not seem to be possible to do the same for the SDK.
Question
How can aws-sdk-go be made to return text rather than JSON?
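You have all of the information you need; you just have to unravel it from the response (records). The SDK does not return JSON or text: it returns Go structs, and fmt.Println prints their string representation.
To get output similar to the last CLI command:
for _, recordSet := range records.ResourceRecordSets {
	// log.Println separates its operands with spaces. aws.StringValue and
	// aws.Int64Value return zero values for nil pointers instead of panicking.
	log.Println("RESOURCERECORDSETS", aws.StringValue(recordSet.Name), aws.Int64Value(recordSet.TTL), aws.StringValue(recordSet.Type))
	for _, record := range recordSet.ResourceRecords {
		log.Println("RESOURCERECORDS", aws.StringValue(record.Value))
	}
	log.Println("")
}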
I am using the following sample program:
func getEnv(appName string, env string) {
	svc := elasticbeanstalk.New(session.New(), &aws.Config{Region: aws.String("us-east-1")})
	params := &elasticbeanstalk.DescribeConfigurationSettingsInput{
		ApplicationName: aws.String(appName), // Required
		EnvironmentName: aws.String(env),
	}
	resp, err := svc.DescribeConfigurationSettings(params)
	if err != nil {
		fmt.Println(err.Error())
		return
	}
	v := resp.ConfigurationSettings
	fmt.Printf("%s", v)
}
It prints the following response, which looks like valid JSON except for the missing quote marks around the keys, e.g. ApplicationName rather than "ApplicationName".
How do I parse this, or get valid JSON from AWS?
ConfigurationSettings: [{
  ApplicationName: "myApp",
  DateCreated: 2016-01-12 00:10:10 +0000 UTC,
  DateUpdated: 2016-01-12 00:10:10 +0000 UTC,
  DeploymentStatus: "deployed",
  Description: "Environment created from the EB CLI using \"eb create\"",
  EnvironmentName: "stag-myApp-app-s1",
  OptionSettings: [
  ...
resp.ConfigurationSettings is not in JSON format any more; the aws-sdk-go package has already parsed the response for you. After
v := resp.ConfigurationSettings
v holds a []*ConfigurationSettingsDescription that was parsed from the service response, so you don't have to parse it yourself. What you see when you print it is the SDK's string representation of the Go structs. You can just go ahead and use it. The string fields are pointers (*string), so dereference them, for example with aws.StringValue:
if len(v) > 0 {
	log.Println(aws.StringValue(v[0].ApplicationName))
}
This should print out myApp