Go panic: extra delimiter at end of line - csv

I'm reading the MaxMind GeoIP Lite City locations CSV file using Go:
csvFile, err := os.Open("/path/GeoLiteCity_20130702/GeoLiteCity-Location.csv")
if err != nil {
panic(err)
}
defer csvFile.Close()
csvf := csv.NewReader(csvFile)
csvf.Read() // skip header row
for {
fields, err := csvf.Read()
if err == io.EOF {
break
} else if err != nil {
panic(err)
}
// does nothing yet
}
The error I'm getting is:
panic: line 2, column 22: extra delimiter at end of line
goroutine 1 [running]: main.main()
/path/myprogram.go:239
+0x108f
goroutine 2 [runnable]: exit status 2
The file is quite long, but starts with these lines:
locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,O1,,,,0.0000,0.0000,,
2,AP,,,,35.0000,105.0000,,
3,EU,,,,47.0000,8.0000,,
4,AD,,,,42.5000,1.5000,,
5,AE,,,,24.0000,54.0000,,
6,AF,,,,33.0000,65.0000,,
7,AG,,,,17.0500,-61.8000,,
8,AI,,,,18.2500,-63.1667,,
9,AL,,,,41.0000,20.0000,,
It appears to be properly formatted. Each row has 9 fields.
Line 239 is my line invoking the panic, panic(err). As you can see, it's failing on line 2 of the CSV file, which happens in the first iteration of the loop (line 1 is read before the loop, to skip the header row). Column 22 of line 2 is the second-to-last comma.
Am I missing something here? I don't see any trailing comma... (clarification: the commas at the end of each line must be there to indicate empty field values, so they're not trailing, as in, extra.)
UPDATE: The gophers have resolved this issue and the fix ships with Go 1.1.2.

Each line actually ends with two trailing commas.
Try setting TrailingComma = true on your reader (the field is csv.Reader.TrailingComma).
It often helps to look at the source, or at least the package documentation :-)

Here is a complete example for you. The key is csvf.TrailingComma = true.
package main
import (
"bytes"
"encoding/csv"
"fmt"
"io"
)
var csvData = `locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,O1,,,,0.0000,0.0000,,
2,AP,,,,35.0000,105.0000,,
3,EU,,,,47.0000,8.0000,,
4,AD,,,,42.5000,1.5000,,
5,AE,,,,24.0000,54.0000,,
6,AF,,,,33.0000,65.0000,,
7,AG,,,,17.0500,-61.8000,,
8,AI,,,,18.2500,-63.1667,,
9,AL,,,,41.0000,20.0000,,
`
func main() {
csvFile := bytes.NewBufferString(csvData)
csvf := csv.NewReader(csvFile)
csvf.TrailingComma = true
csvf.Read() // skip header row
for {
fields, err := csvf.Read()
if err == io.EOF {
break
} else if err != nil {
panic(err)
}
// does nothing yet
fmt.Println(fields)
}
}
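For readers on a newer toolchain, here is a minimal sketch of the same read without TrailingComma. It assumes a Go release at or after 1.1.2, where (per the update in the question) the fix shipped; in current releases the field is documented as deprecated and no longer used. The helper name parseLine is made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseLine (hypothetical helper) reads a single CSV record from line
// using the default csv.Reader settings, without touching TrailingComma.
func parseLine(line string) ([]string, error) {
	r := csv.NewReader(strings.NewReader(line))
	return r.Read()
}

func main() {
	// Second line of the GeoLite file from the question.
	fields, err := parseLine("1,O1,,,,0.0000,0.0000,,\n")
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	// The empty values at the end of the line count as fields.
	fmt.Println(len(fields), "fields:", fields)
}
```

If this still fails with "extra delimiter at end of line", the toolchain predates the fix and setting TrailingComma as shown above remains the workaround.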

Related

How do I solve Golang filepath.walkfunc problem?

I'm trying to solve a task where I must find the one file containing CSV data among other files with similar names and the same size, and print the number in the 5th row, 3rd column (indexes 4 and 2).
So I wrote this code:
package main
import (
"encoding/csv"
"fmt"
"os"
"path/filepath"
)
var s [][]string
func walkfunc(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
buf, err1 := os.Open(path)
if err1 == nil {
var err2 error
r := csv.NewReader(buf)
s, err2 = r.ReadAll()
if err2 == nil {
fmt.Printf("found: %v", s[4][2])
}
}
defer buf.Close()
return nil
}
func main() {
const root = "./task/"
if err := filepath.Walk(root, walkfunc); err != nil {
fmt.Printf("error: %v", err)
}
}
And I got this output:
GOROOT=/usr/local/go #gosetup
GOPATH=/usr/local/go/bin #gosetup
/usr/local/go/bin/go build -o /private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go /Users/user/Downloads/zadacha/qwasd3.go #gosetup
/private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go
panic: runtime error: index out of range [4] with length 3
goroutine 1 [running]:
main.walkfunc({0x14000018120?, 0x0?}, {0x14000098d88?, 0x10247fe40?}, {0x0?, 0x0?})
/Users/user/Downloads/zadacha/qwasd3.go:23 +0x28c
path/filepath.walk({0x14000018120, 0xe}, {0x1024c9cf8, 0x140000685b0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:433 +0xd0
path/filepath.walk({0x10248d4a8, 0x7}, {0x1024c9cf8, 0x140000684e0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:457 +0x1fc
path/filepath.Walk({0x10248d4a8, 0x7}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:520 +0x6c
main.main()
/Users/user/Downloads/zadacha/qwasd3.go:37 +0x30
Process finished with the exit code 2
What am I doing wrong?
I was running this code on a MacBook.
The file I need contains a table of numbers, and I need to print the number in the 5th row, 3rd column.
As other comments have pointed out, you need to check each CSV to make sure it's actually as big as you expect it to be. You could also add a simple check to try and make sure it's a CSV file before opening it by looking for a ".csv" extension.
Though, to directly address your error... The CSV reader may be able to interpret a plain txt file as CSV and not return an err, like:
buf := strings.NewReader(`A regular text file with 3 lines.
Line2
Line3
`)
r := csv.NewReader(buf)
records, err := r.ReadAll()
if err != nil {
fmt.Println("could not read all of CSV file!")
return err
}
fmt.Println(records)
prints:
[[A regular text file with 3 lines.] [Line2] [Line3]]
Just assuming that it's a CSV with the correct number of rows and columns:
fmt.Println("found", records[4][2])
gives the panic message you shared:
panic: runtime error: index out of range [4] with length 3
You at least need to check that your CSV has 5 rows, and if it does, then check if the 5th row has 3 columns before you try to read that field:
if len(records) < 5 {
fmt.Println(path, "does not have 5 rows")
return nil
}
if len(records[4]) < 3 {
fmt.Println(path, "5th row does not have 3 columns")
return nil
}
fmt.Println("found", records[4][2])
You could also do, inside your walkfunc, a basic check of the file path itself to see if it looks like a CSV:
if strings.ToLower(filepath.Ext(path)) != ".csv" { // filepath.Ext is safe for paths shorter than 4 characters
fmt.Println(path, "is not a CSV")
return nil
}
I show all this code, plus a fully worked/integrated example in this Playground.
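As a rough sketch of how the two length checks above could be folded into one reusable function: the helper name cellAt and the sample data are assumptions for illustration, not part of the original answer.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// cellAt (hypothetical helper) returns the field at the zero-based row and
// col, and false if the table is too small to contain that cell.
func cellAt(records [][]string, row, col int) (string, bool) {
	if row < 0 || col < 0 || row >= len(records) || col >= len(records[row]) {
		return "", false
	}
	return records[row][col], true
}

func main() {
	// A plain text file that the CSV reader still accepts: 3 rows, 1 column.
	r := csv.NewReader(strings.NewReader("A regular text file with 3 lines.\nLine2\nLine3\n"))
	records, err := r.ReadAll()
	if err != nil {
		fmt.Println("could not read all of CSV file:", err)
		return
	}
	if v, ok := cellAt(records, 4, 2); ok {
		fmt.Println("found", v)
	} else {
		fmt.Println("no 5th row / 3rd column, skipping")
	}
}
```

Inside walkfunc, the same call would replace the direct s[4][2] index, so a file that is too small is skipped instead of causing a panic.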

Strange number when using fmt.Println in Golang

I'm new to Golang and have been doing alright, but I have a strange issue with fmt that I have not encountered before. When I print a string built from several sub-strings, what appears to be the len() of each sub-string is also printed at the end, although the numbers don't add up. Can anyone explain why this is happening and how to stop it?
Any help is greatly appreciated
Here is the code:
package main
import (
"fmt"
//"log"
"strings"
)
var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"
func main() {
tl := strings.Trim(e, "[{")
tr := strings.Trim(tl, "}]")
r := strings.TrimSpace(tr)
s := strings.Fields(r)
V_PK := s[0]
SERVER_ADDR := s[1]
A_KEY := s[2]
vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
pp, _ := fmt.Printf(",\"server_addr\": %q", SERVER_ADDR)
kk, _ := fmt.Printf(",\"a_key\": %q}] ", A_KEY)
rstr, _ := fmt.Println(vv, pp, kk)
stringc := string(rstr)
fmt.Println(stringc)
}
Expected output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}]
Actual output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}] 82 36 53
Why on earth would it be printing these string lengths on the end? It's probably obvious that I'm trying to build a JSON string so these numbers on the end are problematic when trying to import the string into a JSON interpreter.
Again, any help is appreciated!
Take a look at the documentation for fmt.Printf and its sibling fmt.Println. The documentation reads:
Printf formats according to a format specifier and writes to standard output. It returns the number of bytes written and any write error encountered.
The line in your code
vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
prints the formatted string to standard output, then returns the number of bytes written, which is stored in vv. If you just want to print the formatted string to standard output, call fmt.Printf and ignore the return values:
package main
import (
"fmt"
//"log"
"strings"
)
var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"
func main() {
tl := strings.Trim(e, "[{")
tr := strings.Trim(tl, "}]")
r := strings.TrimSpace(tr)
s := strings.Fields(r)
V_PK := s[0]
SERVER_ADDR := s[1]
A_KEY := s[2]
fmt.Printf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]\n", V_PK, SERVER_ADDR, A_KEY)
}
Or, if you want to store the formatted string to a new string variable, call fmt.Sprintf:
stringc := fmt.Sprintf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]", V_PK, SERVER_ADDR, A_KEY)
fmt.Println(stringc)
You can check out a working version at the playground.
You might also want to check out the json package, which can do the parsing and serializing for you with properly defined structs:
package main
import (
"encoding/json"
"fmt"
)
func main() {
type Datum struct {
VPK string `json:"v_pk"`
Server string `json:"server_addr"`
AKey string `json:"a_key"`
}
data := []Datum{
{VPK: "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888",
Server: "localhost:3303",
AKey: "4d50f447-7c93-42df-a03e-89c09626950a",
}}
// Use a name other than json so the package is not shadowed.
out, err := json.MarshalIndent(data, "", " ")
if err != nil {
fmt.Println("marshal failed:", err)
return
}
fmt.Println(string(out))
}
Check it out at the go playground.
fmt.Printf returns the number of bytes written. The variables vv, pp, kk are the number of bytes written by those three Printf calls, and the three numbers printed are those numbers.
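A small sketch that makes the difference visible: Printf returns a byte count, while Sprintf returns the text itself. The helper name formatEntry and the short sample values are made up for illustration.

```go
package main

import "fmt"

// formatEntry (hypothetical helper) builds the JSON-like string with
// Sprintf, which returns the formatted text instead of printing it.
func formatEntry(vpk, addr, key string) string {
	return fmt.Sprintf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]", vpk, addr, key)
}

func main() {
	// Printf writes "abc" and a newline, then reports how many bytes it wrote.
	n, err := fmt.Printf("abc\n")
	fmt.Println(n, err) // prints: 4 <nil>

	// Sprintf gives back the string, which can be stored or printed later.
	s := formatEntry("k", "localhost:3303", "a")
	fmt.Println(s)
}
```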

golang yaml support for jsonlines

I've been trying to get the go yaml package to parse a file with jsonlines entries.
Below is a simple example with three options of data to be parsed.
Option one is a multi-doc yaml example. Both docs parse ok.
Option two is a two jsonline example. The first line parses ok, but the second is missed.
Option three is a two jsonline example, but I've put yaml doc separators in between, to force the issue. Both of these parse ok.
From reading the yaml and json specs, I believe the second option, multiple jsonlines, ought to be handled by a yaml parser.
My questions are:
Should a YAML parser cope with jsonlines?
Am I using the go yaml package correctly?
package main
import (
"bytes"
"fmt"
"reflect"
"strings"
"gopkg.in/yaml.v2"
)
var testData = []string{
`
---
option_one_first_yaml_doc: ok_here
---
option_one_second_yaml_doc: ok_here
`,
`
{option_two_first_jsonl: ok_here}
{option_two_second_jsonl: missing}
`,
`
---
{option_three_first_jsonl: ok_here}
---
{option_three_second_jsonl: ok_here}
`}
func printVal(v interface{}, depth int) {
typ := reflect.TypeOf(v)
if typ == nil {
fmt.Printf(" %v\n", "<null>")
} else if typ.Kind() == reflect.Int || typ.Kind() == reflect.String {
fmt.Printf("%s%v\n", strings.Repeat(" ", depth), v)
} else if typ.Kind() == reflect.Slice {
fmt.Printf("\n")
printSlice(v.([]interface{}), depth+1)
} else if typ.Kind() == reflect.Map {
fmt.Printf("\n")
printMap(v.(map[interface{}]interface{}), depth+1)
}
}
func printMap(m map[interface{}]interface{}, depth int) {
for k, v := range m {
fmt.Printf("%sKey: %s Value(s):", strings.Repeat(" ", depth), k.(string))
printVal(v, depth+1)
}
}
func printSlice(slc []interface{}, depth int) {
for _, v := range slc {
printVal(v, depth+1)
}
}
func main() {
m := make(map[interface{}]interface{})
for _, data := range testData {
yamlData := bytes.NewReader([]byte(data))
decoder := yaml.NewDecoder(yamlData)
for decoder.Decode(&m) == nil {
printMap(m, 0)
m = make(map[interface{}]interface{})
}
}
}
jsonlines is newline-delimited JSON. That means each individual line is a JSON document, but several lines together, let alone the whole file, are not a single JSON (or YAML) document.
You will need to read the jsonlines input one line at a time. Each line should then be parseable with go yaml, since YAML is a superset of JSON.
Since your test data also contains YAML document marker (---) lines, you need to handle those as well.
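A minimal sketch of the line-at-a-time approach. To keep it self-contained it uses encoding/json from the standard library in place of the yaml package, so it assumes strict JSON lines with quoted keys. The lines in the question use unquoted keys, which makes them YAML flow mappings rather than strict JSON; for those, yaml.Unmarshal from gopkg.in/yaml.v2 would be called at the marked spot instead. The helper name decodeLines is hypothetical.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// decodeLines (hypothetical helper) decodes every non-empty line of input
// as one JSON object and skips "---" marker lines.
func decodeLines(input string) ([]map[string]interface{}, error) {
	var docs []map[string]interface{}
	sc := bufio.NewScanner(strings.NewReader(input))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || line == "---" {
			continue
		}
		var doc map[string]interface{}
		// Swap in yaml.Unmarshal here for YAML-style lines (assumption, see above).
		if err := json.Unmarshal([]byte(line), &doc); err != nil {
			return nil, fmt.Errorf("line %q: %w", line, err)
		}
		docs = append(docs, doc)
	}
	return docs, sc.Err()
}

func main() {
	input := "{\"first\": \"ok_here\"}\n---\n{\"second\": \"ok_here\"}\n"
	docs, err := decodeLines(input)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	for i, d := range docs {
		fmt.Println(i, d)
	}
}
```

Note that bufio.Scanner has a default per-line size limit, so very long lines would need a larger buffer set with Scanner.Buffer.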

Writing multi line string to CSV file

How can I write a multi line value to a CSV file using the encoding/csv package?
fh, err := os.Create(fileName)
if err != nil {
log.Fatalf("Could not create file: %v", err)
}
defer fh.Close()
w := csv.NewWriter(fh)
normalValue := "I am a single line value"
multiValue := []string{"I am a ", "multi line value"}
w.Write([]string{normalValue, multiValue})
The result I would expect in the resulting CSV file:
I am a single line value,"I am a
multi line value"
How can I realize this, since csv.Write does not accept []string as an argument? Simply appending \n between each element of the multi line value does not achieve anything either.
You need to embed the \n in your multiline value. You can do this, for example, by using strings.Join:
w.Write([]string{
normalValue,
strings.Join(multiValue, "\n"),
})
https://play.golang.org/p/uWJnClpQ1OT
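For reference, a self-contained sketch of the same idea that writes into an in-memory buffer, so the quoting done by csv.Writer can be inspected directly. The helper name writeRecord is an assumption for illustration.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"strings"
)

// writeRecord (hypothetical helper) encodes one CSV record and returns
// the resulting text.
func writeRecord(fields []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush() // the Writer is buffered; nothing reaches buf before Flush
	return buf.String(), w.Error()
}

func main() {
	normalValue := "I am a single line value"
	multiValue := []string{"I am a ", "multi line value"}

	out, err := writeRecord([]string{normalValue, strings.Join(multiValue, "\n")})
	if err != nil {
		fmt.Println("write error:", err)
		return
	}
	// The field containing the newline is wrapped in double quotes.
	fmt.Print(out)
}
```

When writing to a file as in the question, the same Flush call is needed before the file is closed, otherwise the buffered data may never be written.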

Golang file reading only reading last line

So I took some publicly available data that looks like this -
this is the file
http://expirebox.com/download/b149b744768fb11aee9c5e26ad409bcc.html
,,,% of Total Expenditure,,,
Function Code,Type of Activity,Expenditure,Dollars/Student (ADA),"This District (ADA 49,497)",All Unified School Districts,Statewide Average
1000-1999ÊÊ,INSTRUCTIONÊÊ,"$249,397,226","$5,039",42%,62%,62%
1000,Instruction,"$247,472,790ÊÊ","$5,000",42%,48%,49%
1110,Special Education: Separate Classes,"$1,004,074",$20,N/A,N/A,N/A
1120,Special Education: Resource Specialist Instruction,"$781,629",$16,N/A,N/A,N/A
1130,Special Education: Supplemental Aids & Services in Regular Classrooms,"$46,747",$1,N/A,N/A,N/A
1180,Special Education: Nonpublic Agencies/Schools (NPA/S),N/A,N/A,N/A,N/A,N/A
1190,Special Education: Other Specialized Instructional Services,"$91,985",$2,N/A,N/A,N/A
1100-1199,Instruction - Special Education,"$1,924,436ÊÊ",$39,0%,14%,13%
"Subtotal, INSTRUCTION",,"$249,397,226","$5,039",42%,62%,62%
2000-2999ÊÊ,INSTRUCTION-RELATED SERVICESÊÊ,"$132,783,414","$2,683",22%,12%,12%
2100,Instructional Supervision and Administration,"$89,551,041","$1,809",N/A,N/A,N/A
2110,Instructional Supervision,N/A,N/A,N/A,N/A,N/A
2120,Instructional Research,N/A,N/A,N/A,N/A,N/A
2130,Curriculum Development,"$348,369",$7,N/A,N/A,N/A
2140,In-house Instructional Staff Development,"$19,855",$0,N/A,N/A,N/A
2150,Instructional Administration of Special Projects,N/A,N/A,N/A,N/A,N/A
2100-2199,Instructional Supervision and Administration,"$89,919,265ÊÊ","$1,817",15%,4%,4%
2200,Administrative Unit (AU) of a Multidistrict SELPA,$0,$0,0%,0%,0%
2420,"Instructional Library, Media, and Technology","$8,295,033ÊÊ",$168,1%,1%,1%
2490,Other Instructional Resources,"$538,734",$11,N/A,N/A,N/A
2495,Parent Participation,"$97,830",$2,N/A,N/A,N/A
2490-2495,Other Instructional Resources,"$636,565ÊÊ",$13,0%,1%,0%
2700,School Administration,"$33,932,551ÊÊ",$686,6%,7%,7%
"Subtotal, INSTRUCTION-RELATED SERVICES",,"$132,783,414","$2,683",22%,12%,12%
3000-3999ÊÊ,PUPIL SERVICESÊÊ,"$45,325,938",$916,8%,8%,8%
4000-4999ÊÊ,ANCILLARY SERVICESÊÊ,"$2,207,263",$45,0%,1%,1%
5000-5999ÊÊ,COMMUNITY SERVICESÊÊ,$0,$0,0%,0%,0%
6000-6999ÊÊ,ENTERPRISEÊÊ,"$4,264",$0,0%,0%,0%
7000-7999ÊÊ,GENERAL ADMINISTRATIONÊÊ,"$27,916,858",$564,5%,5%,6%
8000-8999ÊÊ,PLANT SERVICESÊÊ,"$55,172,247","$1,115",9%,11%,10%
9000-9999ÊÊ,OTHER OUTGOÊÊ,"$81,981,716",N/A,14%,2%,2%
"Total Expenditures, All Activities",,"$594,788,926","$12,017",100%,100%,100%
It's in a csv.
I have tried this code
file, err := os.Open("expenses.csv")
if err != nil {
log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
and this
content, err := ioutil.ReadFile("expenses.csv")
check(err)
lines := strings.Split(string(content), "\n")
fmt.Println(lines)
dat, err := os.Open("expenses.csv")
check(err)
defer dat.Close()
reader := csv.NewReader(dat)
reader.LazyQuotes = true
reader.FieldsPerRecord = -1
rawCSVData, err := reader.ReadAll()
check(err)
fmt.Println(rawCSVData)
for _, each := range rawCSVData {
fmt.Println(each)
}
where check is
func check(e error) {
if e != nil {
panic(e)
}
}
In both cases I get this result -
"Total Expenditures, All Activities",,"$594,788,926","$12,017",100%,100%,100%,1%15%,4%,4%AA,N/A,N/Anified School Districts,Statewide Average
rather than all the lines.
Why am I only reading the last line?
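The basic problem is that this file has \r line endings. It also isn't valid UTF-8. Together, those are going to cause Scanner a lot of trouble. Because the file contains no \n, everything is read as one long line, and when that line is printed to a terminal each \r moves the cursor back to the start of the line, so every row overwrites the previous one. That is why you see only the last row, followed by leftover characters from longer rows.
First, we can see exactly what's in the file using xxd: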
00000000: 2c2c 2c25 206f 6620 546f 7461 6c20 4578 ,,,% of Total Ex
00000010: 7065 6e64 6974 7572 652c 2c2c 0d46 756e penditure,,,.Fun
If you look, you'll see the line ending is 0d, which is \r. Scanner needs it to be either \r\n or \n.
Next, you may run into trouble because it isn't UTF-8. All those Ê characters are really single 0xCA bytes, which are not valid UTF-8 here (0xCA would have to be followed by a continuation byte). We can see that in xxd again:
000000b0: 3939 39ca ca2c 494e 5354 5255 4354 494f 999..,INSTRUCTIO
000000c0: 4eca ca2c 2224 3234 392c 3339 372c 3232 N..,"$249,397,22
Go will probably just ship it along as bytes (and get Ê), which is what a lot of editors try to do, but it's likely to cause trouble.
If possible, reformat this file to use either Unix or Windows line endings in UTF-8.
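If the file cannot be converted beforehand, one option is to normalize the line endings in memory before parsing. The following is only a sketch under that assumption: the helper names normalizeNewlines and parseCSV are made up, the sample data is a shortened excerpt, and the 0xCA bytes are not addressed at all.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// normalizeNewlines (hypothetical helper) converts \r\n and bare \r line
// endings to \n.
func normalizeNewlines(data []byte) []byte {
	data = bytes.ReplaceAll(data, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(data, []byte("\r"), []byte("\n"))
}

// parseCSV (hypothetical helper) parses data after normalizing its line
// endings, allowing rows with differing numbers of fields.
func parseCSV(data []byte) ([][]string, error) {
	r := csv.NewReader(bytes.NewReader(normalizeNewlines(data)))
	r.FieldsPerRecord = -1
	return r.ReadAll()
}

func main() {
	// Two rows separated by a bare \r, as in the file from the question.
	raw := []byte(",,,% of Total Expenditure,,,\r1000,Instruction,\"$247,472,790\",\"$5,000\",42%,48%,49%\r")
	records, err := parseCSV(raw)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, rec := range records {
		fmt.Println(len(rec), rec)
	}
}
```

In real use, raw would come from reading the whole file, for example with ioutil.ReadFile as in the question.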