Golang file reading only reading last line - csv

I took some publicly available data. This is the file:
http://expirebox.com/download/b149b744768fb11aee9c5e26ad409bcc.html
It looks like this:
,,,% of Total Expenditure,,,
Function Code,Type of Activity,Expenditure,Dollars/Student (ADA),"This District (ADA 49,497)",All Unified School Districts,Statewide Average
1000-1999ÊÊ,INSTRUCTIONÊÊ,"$249,397,226","$5,039",42%,62%,62%
1000,Instruction,"$247,472,790ÊÊ","$5,000",42%,48%,49%
1110,Special Education: Separate Classes,"$1,004,074",$20,N/A,N/A,N/A
1120,Special Education: Resource Specialist Instruction,"$781,629",$16,N/A,N/A,N/A
1130,Special Education: Supplemental Aids & Services in Regular Classrooms,"$46,747",$1,N/A,N/A,N/A
1180,Special Education: Nonpublic Agencies/Schools (NPA/S),N/A,N/A,N/A,N/A,N/A
1190,Special Education: Other Specialized Instructional Services,"$91,985",$2,N/A,N/A,N/A
1100-1199,Instruction - Special Education,"$1,924,436ÊÊ",$39,0%,14%,13%
"Subtotal, INSTRUCTION",,"$249,397,226","$5,039",42%,62%,62%
2000-2999ÊÊ,INSTRUCTION-RELATED SERVICESÊÊ,"$132,783,414","$2,683",22%,12%,12%
2100,Instructional Supervision and Administration,"$89,551,041","$1,809",N/A,N/A,N/A
2110,Instructional Supervision,N/A,N/A,N/A,N/A,N/A
2120,Instructional Research,N/A,N/A,N/A,N/A,N/A
2130,Curriculum Development,"$348,369",$7,N/A,N/A,N/A
2140,In-house Instructional Staff Development,"$19,855",$0,N/A,N/A,N/A
2150,Instructional Administration of Special Projects,N/A,N/A,N/A,N/A,N/A
2100-2199,Instructional Supervision and Administration,"$89,919,265ÊÊ","$1,817",15%,4%,4%
2200,Administrative Unit (AU) of a Multidistrict SELPA,$0,$0,0%,0%,0%
2420,"Instructional Library, Media, and Technology","$8,295,033ÊÊ",$168,1%,1%,1%
2490,Other Instructional Resources,"$538,734",$11,N/A,N/A,N/A
2495,Parent Participation,"$97,830",$2,N/A,N/A,N/A
2490-2495,Other Instructional Resources,"$636,565ÊÊ",$13,0%,1%,0%
2700,School Administration,"$33,932,551ÊÊ",$686,6%,7%,7%
"Subtotal, INSTRUCTION-RELATED SERVICES",,"$132,783,414","$2,683",22%,12%,12%
3000-3999ÊÊ,PUPIL SERVICESÊÊ,"$45,325,938",$916,8%,8%,8%
4000-4999ÊÊ,ANCILLARY SERVICESÊÊ,"$2,207,263",$45,0%,1%,1%
5000-5999ÊÊ,COMMUNITY SERVICESÊÊ,$0,$0,0%,0%,0%
6000-6999ÊÊ,ENTERPRISEÊÊ,"$4,264",$0,0%,0%,0%
7000-7999ÊÊ,GENERAL ADMINISTRATIONÊÊ,"$27,916,858",$564,5%,5%,6%
8000-8999ÊÊ,PLANT SERVICESÊÊ,"$55,172,247","$1,115",9%,11%,10%
9000-9999ÊÊ,OTHER OUTGOÊÊ,"$81,981,716",N/A,14%,2%,2%
"Total Expenditures, All Activities",,"$594,788,926","$12,017",100%,100%,100%
It's a CSV file.
I have tried this code:
file, err := os.Open("expenses.csv")
if err != nil {
log.Fatal(err)
}
defer file.Close()
scanner := bufio.NewScanner(file)
for scanner.Scan() {
fmt.Println(scanner.Text())
}
if err := scanner.Err(); err != nil {
log.Fatal(err)
}
and this
content, err := ioutil.ReadFile("expenses.csv")
check(err)
lines := strings.Split(string(content), "\n")
fmt.Println(lines)
dat, err := os.Open("expenses.csv")
check(err)
defer dat.Close()
reader := csv.NewReader(dat)
reader.LazyQuotes = true
reader.FieldsPerRecord = -1
rawCSVData, err := reader.ReadAll()
check(err)
fmt.Println(rawCSVData)
for _, each := range rawCSVData {
fmt.Println(each)
}
where check is
func check(e error) {
if e != nil {
panic(e)
}
}
In both cases I get this result -
"Total Expenditures, All Activities",,"$594,788,926","$12,017",100%,100%,100%,1%15%,4%,4%AA,N/A,N/Anified School Districts,Statewide Average
Rather than all the lines.
Why am I only reading the last line?

The basic problem is that this file has \r line endings. It also isn't valid UTF-8. Together, those are going to cause Scanner a lot of trouble.
First, we can see exactly what's in the file using xxd
00000000: 2c2c 2c25 206f 6620 546f 7461 6c20 4578 ,,,% of Total Ex
00000010: 7065 6e64 6974 7572 652c 2c2c 0d46 756e penditure,,,.Fun
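If you look, you'll see the line ending is 0d, which is \r. Scanner needs it to be either \r\n or \n. Since neither appears, the whole file is read as one long line. When that line is printed, the terminal obeys each \r by moving the cursor back to the start of the line, so every row overwrites the previous one. What remains on screen is the last row plus leftovers of longer earlier rows.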
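Next, you may run into trouble because it isn't UTF-8. All those Ê in there are really single 0xCA bytes. In UTF-8, 0xCA starts a two-byte sequence and must be followed by a continuation byte (0x80 to 0xBF); here it is followed by another 0xCA or by a comma, so the sequence is invalid. We can see that in xxd again: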
000000b0: 3939 39ca ca2c 494e 5354 5255 4354 494f 999..,INSTRUCTIO
000000c0: 4eca ca2c 2224 3234 392c 3339 372c 3232 N..,"$249,397,22
Go will probably just ship it along as bytes (and get Ê), which is what a lot of editors try to do, but it's likely to cause trouble.
If possible, reformat this file to use either Unix or Windows line endings in UTF-8.
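If the file cannot be re-saved, one workaround is to normalize the line endings in memory before parsing. This is only a minimal sketch, assuming the data fits in memory; normalizeNewlines is a hypothetical helper, and the sample bytes stand in for the real contents of expenses.csv:

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"log"
)

// normalizeNewlines (hypothetical helper) rewrites \r\n and bare \r as \n.
func normalizeNewlines(b []byte) []byte {
	b = bytes.ReplaceAll(b, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(b, []byte("\r"), []byte("\n"))
}

func main() {
	// Stand-in for the bytes returned by os.ReadFile("expenses.csv").
	raw := []byte("Function Code,Type of Activity\r1000,Instruction\r2700,School Administration\r")

	r := csv.NewReader(bytes.NewReader(normalizeNewlines(raw)))
	r.FieldsPerRecord = -1 // rows in the real file do not all have the same shape
	records, err := r.ReadAll()
	if err != nil {
		log.Fatal(err)
	}
	for _, rec := range records {
		fmt.Println(rec)
	}
}
```

This only deals with the line endings. The 0xCA bytes pass through unchanged and would still need to be converted or stripped separately.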

Related

How do I solve Golang filepath.walkfunc problem?

I'm trying to solve a task where I must find one file containing CSV data among other files with similar names and the same size, and print the number in the 5th row, 3rd column (indexes 4 and 2).
So I wrote this code:
package main
import (
"encoding/csv"
"fmt"
"os"
"path/filepath"
)
var s [][]string
func walkfunc(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
buf, err1 := os.Open(path)
if err1 == nil {
var err2 error
r := csv.NewReader(buf)
s, err2 = r.ReadAll()
if err2 == nil {
fmt.Printf("found: %v", s[4][2])
}
}
defer buf.Close()
return nil
}
func main() {
const root = "./task/"
if err := filepath.Walk(root, walkfunc); err != nil {
fmt.Printf("error: %v", err)
}
}
And I got this output:
GOROOT=/usr/local/go #gosetup
GOPATH=/usr/local/go/bin #gosetup
/usr/local/go/bin/go build -o /private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go /Users/user/Downloads/zadacha/qwasd3.go #gosetup
/private/var/folders/j2/ybr0drz13yq31dc67zmvkb1w0000gn/T/GoLand/___go_build_qwasd3_go
panic: runtime error: index out of range [4] with length 3
goroutine 1 [running]:
main.walkfunc({0x14000018120?, 0x0?}, {0x14000098d88?, 0x10247fe40?}, {0x0?, 0x0?})
/Users/user/Downloads/zadacha/qwasd3.go:23 +0x28c
path/filepath.walk({0x14000018120, 0xe}, {0x1024c9cf8, 0x140000685b0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:433 +0xd0
path/filepath.walk({0x10248d4a8, 0x7}, {0x1024c9cf8, 0x140000684e0}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:457 +0x1fc
path/filepath.Walk({0x10248d4a8, 0x7}, 0x1024c9338)
/usr/local/go/src/path/filepath/path.go:520 +0x6c
main.main()
/Users/user/Downloads/zadacha/qwasd3.go:37 +0x30
Process finished with the exit code 2
What am I doing wrong?
I was trying to run this code on MacBook.
The needed file contains table with numbers and I need to print a number on 5th row and 3rd column.
As other comments have pointed out, you need to check each CSV to make sure it's actually as big as you expect it to be. You could also add a simple check to try and make sure it's a CSV file before opening it by looking for a ".csv" extension.
Though, to directly address your error... The CSV reader may be able to interpret a plain txt file as CSV and not return an err, like:
buf := strings.NewReader(`A regular text file with 3 lines.
Line2
Line3
`)
r := csv.NewReader(buf)
records, err := r.ReadAll()
if err != nil {
fmt.Println("could not read all of CSV file!")
return err
}
fmt.Println(records)
prints:
[[A regular text file with 3 lines.] [Line2] [Line3]]
Just assuming that it's a CSV with the correct number of rows and columns:
fmt.Println("found", records[4][2])
gives the panic message you shared:
panic: runtime error: index out of range [4] with length 3
You at least need to check that your CSV has 5 rows, and if it does, then check if the 5th row has 3 columns before you try to read that field:
if len(records) < 5 {
fmt.Println(path, "does not have 5 rows")
return nil
}
if len(records[4]) < 3 {
fmt.Println(path, "5th row does not have 3 columns")
return nil
}
fmt.Println("found", records[4][2])
You could also do, inside your walkfunc, a basic check of the file path itself to see if it looks like a CSV. Using filepath.Ext avoids slicing a path that is shorter than four characters:
if strings.ToLower(filepath.Ext(path)) != ".csv" {
fmt.Println(path, "is not a CSV")
return nil
}
I show all this code, plus a fully worked/integrated example in this Playground.
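For reference, the checks above can be combined into a single walk. This is a sketch rather than the answerer's Playground code: cellAt and findCells are hypothetical helpers, the temporary directory and its files are made up for illustration, and it assumes a Go version that provides filepath.WalkDir and os.MkdirTemp (1.16 or later).

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io/fs"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// cellAt (hypothetical helper) returns records[row][col], or false when out of range.
func cellAt(records [][]string, row, col int) (string, bool) {
	if row < 0 || row >= len(records) || col < 0 || col >= len(records[row]) {
		return "", false
	}
	return records[row][col], true
}

// findCells (hypothetical helper) walks root and collects the (row, col) cell
// of every .csv file that parses and is large enough.
func findCells(root string, row, col int) (map[string]string, error) {
	found := make(map[string]string)
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() || strings.ToLower(filepath.Ext(path)) != ".csv" {
			return nil // skip directories and non-CSV names
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close() // deferred only after Open succeeded
		r := csv.NewReader(f)
		r.FieldsPerRecord = -1 // tolerate ragged rows
		records, err := r.ReadAll()
		if err != nil {
			return nil // not parseable as CSV, move on
		}
		if v, ok := cellAt(records, row, col); ok {
			found[path] = v
		}
		return nil
	})
	return found, err
}

func main() {
	dir, err := os.MkdirTemp("", "task")
	if err != nil {
		log.Fatal(err)
	}
	defer os.RemoveAll(dir)

	// Made-up sample files: only data.csv has a 5th row with a 3rd column.
	files := map[string]string{
		"notes.txt": "A regular text file\nLine2\nLine3\n",
		"short.csv": "1,2,3\n4,5,6\n7,8,9\n",
		"data.csv":  "1,2,3\n4,5,6\n7,8,9\n10,11,12\n13,14,15\n",
	}
	for name, body := range files {
		if err := os.WriteFile(filepath.Join(dir, name), []byte(body), 0o644); err != nil {
			log.Fatal(err)
		}
	}

	found, err := findCells(dir, 4, 2)
	if err != nil {
		log.Fatal(err)
	}
	for path, v := range found {
		fmt.Println("found:", filepath.Base(path), v)
	}
}
```

If the files in the task do not carry a .csv extension, the extension test would have to be dropped and the size checks alone relied upon.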

Strange number when using fmt.Println in Golang

I'm new to Golang and have been doing alright, but I have a strange issue with fmt that I have not encountered before. It happens when I'm printing a string: at the end of the string (which is built from substrings), it also prints what appear to be the len() of each substring, although the numbers don't add up. Can anyone explain why this is happening and how to stop it?
Any help is greatly appreciated
Here is the code:
package main
import (
"fmt"
//"log"
"strings"
)
var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"
func main() {
tl := strings.Trim(e, "[{")
tr := strings.Trim(tl, "}]")
r := strings.TrimSpace(tr)
s := strings.Fields(r)
V_PK := s[0]
SERVER_ADDR := s[1]
A_KEY := s[2]
vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
pp, _ := fmt.Printf(",\"server_addr\": %q", SERVER_ADDR)
kk, _ := fmt.Printf(",\"a_key\": %q}] ", A_KEY)
rstr, _ := fmt.Println(vv, pp, kk)
stringc := string(rstr)
fmt.Println(stringc)
}
Expected output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}]
Actual output:
[{"v_pk": "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888","server_addr": "localhost:3303","a_key": "4d50f447-7c93-42df-a03e-89c09626950a"}] 82 36 53
Why on earth would it be printing these string lengths on the end? It's probably obvious that I'm trying to build a JSON string so these numbers on the end are problematic when trying to import the string into a JSON interpreter.
Again, any help is appreciated!
Take a look at the documentation for fmt.Printf and its friend fmt.Println. The documentation reads:
Printf formats according to a format specifier and writes to standard output. It returns the number of bytes written and any write error encountered.
The line in your code
vv, _ := fmt.Printf("[{\"v_pk\": %q", V_PK)
prints the formatted string to standard output, then returns the number of bytes written and stores that in vv. If you want to print the formatted string to standard output, just call fmt.Printf and ignore the return values:
package main
import (
"fmt"
//"log"
"strings"
)
var e = "[{8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888 localhost:3303 4d50f447-7c93-42df-a03e-89c09626950a}]"
func main() {
tl := strings.Trim(e, "[{")
tr := strings.Trim(tl, "}]")
r := strings.TrimSpace(tr)
s := strings.Fields(r)
V_PK := s[0]
SERVER_ADDR := s[1]
A_KEY := s[2]
fmt.Printf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]\n", V_PK, SERVER_ADDR, A_KEY)
}
Or, if you want to store the formatted string to a new string variable, call fmt.Sprintf:
stringc := fmt.Sprintf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]", V_PK, SERVER_ADDR, A_KEY)
fmt.Println(stringc)
You can check out a working version at the playground.
You might also want to check out the json package, which can do the parsing and serializing for you with properly defined structs:
package main
import (
"encoding/json"
"fmt"
)
func main() {
type Datum struct {
VPK string `json:"v_pk"`
Server string `json:"server_addr"`
AKey string `json:"a_key"`
}
data := []Datum{
{VPK: "8888a8558921d75ec8bc362efbe9a76b82ec002337534e9f06ce92cbf8c27c8888",
Server: "localhost:3303",
AKey: "4d50f447-7c93-42df-a03e-89c09626950a",
}}
out, err := json.MarshalIndent(data, "", " ")
if err != nil {
// deal with error
}
fmt.Println(string(out))
}
Check it out at the go playground.
fmt.Printf returns the number of bytes written. The variables vv, pp, kk are the number of bytes written by those three Printf calls, and the three numbers printed are those numbers.
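The difference between the two calls can be seen side by side in a small sketch. buildJSON is a hypothetical helper and the short argument values are made up for illustration:

```go
package main

import "fmt"

// buildJSON (hypothetical helper) returns the formatted text instead of printing it.
func buildJSON(vpk, addr, key string) string {
	return fmt.Sprintf("[{\"v_pk\": %q, \"server_addr\": %q, \"a_key\": %q}]", vpk, addr, key)
}

func main() {
	// Printf writes to standard output and returns a byte count, not the text.
	n, err := fmt.Printf("%q\n", "abc")
	if err != nil {
		fmt.Println("write failed:", err)
	}
	fmt.Println("bytes written:", n) // counts the five quoted characters plus the newline

	// Sprintf returns the text itself, which can then be stored or printed.
	fmt.Println(buildJSON("k1", "localhost:3303", "k2"))
}
```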

Writing multi line string to CSV file

How can I write a multi line value to a CSV file using the encoding/csv package?
fh, err := os.Create(fileName)
if err != nil {
log.Fatalf("Could not create file: %v", err)
}
defer fh.Close()
w := csv.NewWriter(fh)
normalValue := "I am a single line value"
multiValue := []string{"I am a ", "multi line value"}
w.Write([]string{normalValue, multiValue})
The result I would expect in the resulting CSV file:
I am a single line value,"I am a
multi line value"
How can I achieve this, given that each field passed to csv.Writer.Write must be a string and not a []string? Simply appending \n between each element of the multi line value does not achieve anything either.
You need to embed the \n in your multiline value. You can do this, for example, by using strings.Join:
w.Write([]string{
normalValue,
strings.Join(multiValue, "\n"),
})
https://play.golang.org/p/uWJnClpQ1OT
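A self-contained sketch of the same idea follows. encodeRecord is a hypothetical helper that writes to an in-memory buffer instead of a file. Note the call to Flush: csv.Writer is buffered, so output can stay in memory until it is flushed. Whether a missing Flush is why the asker saw no effect is only a guess, since the rest of the program is not shown.

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
	"log"
	"strings"
)

// encodeRecord (hypothetical helper) writes one record and returns the CSV text.
func encodeRecord(record []string) (string, error) {
	var buf bytes.Buffer
	w := csv.NewWriter(&buf)
	if err := w.Write(record); err != nil {
		return "", err
	}
	w.Flush() // push buffered output to buf
	if err := w.Error(); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	normalValue := "I am a single line value"
	multiValue := []string{"I am a ", "multi line value"}

	// The embedded \n makes the writer quote the second field.
	out, err := encodeRecord([]string{normalValue, strings.Join(multiValue, "\n")})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(out)
}
```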

Go panic: extra delimiter at end of line

I'm reading the MaxMind GeoIP Lite City locations CSV file using Go:
csvFile, err := os.Open("/path/GeoLiteCity_20130702/GeoLiteCity-Location.csv")
defer csvFile.Close()
if err != nil {
panic(err)
}
csvf := csv.NewReader(csvFile)
csvf.Read() // skip header row
for {
fields, err := csvf.Read()
if err == io.EOF {
break
} else if err != nil {
panic(err)
}
// does nothing yet
}
The error I'm getting is:
panic: line 2, column 22: extra delimiter at end of line
goroutine 1 [running]: main.main()
/path/myprogram.go:239
+0x108f
goroutine 2 [runnable]: exit status 2
The file is quite long, but starts with these lines:
locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,O1,,,,0.0000,0.0000,,
2,AP,,,,35.0000,105.0000,,
3,EU,,,,47.0000,8.0000,,
4,AD,,,,42.5000,1.5000,,
5,AE,,,,24.0000,54.0000,,
6,AF,,,,33.0000,65.0000,,
7,AG,,,,17.0500,-61.8000,,
8,AI,,,,18.2500,-63.1667,,
9,AL,,,,41.0000,20.0000,,
It appears to be properly formatted. Each row has 9 fields.
Line 239 is my line invoking the panic, panic(err). As you can see, it's failing on line 2 of the CSV file, which happens in the first iteration of the loop (line 1 is read before the loop, to skip the header row). Column 22 of line 2 is the second-to-last comma.
Am I missing something here? I don't see any trailing comma... (clarification: the commas at the end of each line must be there to indicate empty field values, so they're not trailing, as in, extra.)
UPDATE: The gophers have resolved this issue and the fix ships with Go 1.1.2.
There are even two trailing commas on each line.
Try setting csv.Reader.TrailingComma = true.
It often really helps to take a look at the source, or at least the package documentation :-)
Here is a complete example for you. The key is csvf.TrailingComma = true.
package main
import (
"bytes"
"encoding/csv"
"fmt"
"io"
)
var csvData = `locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,O1,,,,0.0000,0.0000,,
2,AP,,,,35.0000,105.0000,,
3,EU,,,,47.0000,8.0000,,
4,AD,,,,42.5000,1.5000,,
5,AE,,,,24.0000,54.0000,,
6,AF,,,,33.0000,65.0000,,
7,AG,,,,17.0500,-61.8000,,
8,AI,,,,18.2500,-63.1667,,
9,AL,,,,41.0000,20.0000,,
`
func main() {
csvFile := bytes.NewBufferString(csvData)
csvf := csv.NewReader(csvFile)
csvf.TrailingComma = true
csvf.Read() // skip header row
for {
fields, err := csvf.Read()
if err == io.EOF {
break
} else if err != nil {
panic(err)
}
// does nothing yet
fmt.Println(fields)
}
}
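On current Go releases the workaround should no longer be needed: the encoding/csv documentation marks TrailingComma as deprecated and no longer used, and rows ending in empty fields are accepted by default. A minimal sketch, assuming a recent toolchain; parseLocations is a hypothetical helper and the sample is shortened from the question:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"log"
	"strings"
)

const sample = `locId,country,region,city,postalCode,latitude,longitude,metroCode,areaCode
1,O1,,,,0.0000,0.0000,,
2,AP,,,,35.0000,105.0000,,
`

// parseLocations (hypothetical helper) skips the header and returns the data rows.
func parseLocations(data string) ([][]string, error) {
	r := csv.NewReader(strings.NewReader(data))
	if _, err := r.Read(); err != nil { // skip header row
		return nil, err
	}
	return r.ReadAll()
}

func main() {
	rows, err := parseLocations(sample)
	if err != nil {
		log.Fatal(err)
	}
	for _, row := range rows {
		// Each row should have 9 fields, the last two being empty strings.
		fmt.Println(len(row), row)
	}
}
```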

sending JSON with go

I'm trying to send a JSON message with Go.
This is the server code:
func (network *Network) Join(
w http.ResponseWriter,
r *http.Request) {
//the request is not interesting
//the response will be a message with just the clientId value set
log.Println("client wants to join")
message := Message{-1, -1, -1, ClientId(len(network.Clients)), -1, -1}
var buffer bytes.Buffer
enc := json.NewEncoder(&buffer)
err := enc.Encode(message)
if err != nil {
fmt.Println("error encoding the response to a join request")
log.Fatal(err)
}
fmt.Printf("the json: %s\n", buffer.Bytes())
fmt.Fprint(w, buffer.Bytes())
}
Network is a custom struct. In the main function, I'm creating a network object and registering its methods as callbacks with http.HandleFunc(...):
func main() {
runtime.GOMAXPROCS(2)
var network = new(Network)
var clients = make([]Client, 0, 10)
network.Clients = clients
log.Println("starting the server")
http.HandleFunc("/request", network.Request)
http.HandleFunc("/update", network.GetNews)
http.HandleFunc("/join", network.Join)
log.Fatal(http.ListenAndServe("localhost:5000", nil))
}
Message is a struct, too. It has six fields all of a type alias for int.
When a client sends an http GET request to the url "localhost:5000/join", this should happen
The method Join on the network object is called
A new Message object with an Id for the client is created
This Message is encoded as JSON
To check if the encoding is correct, the encoded message is printed on the cmd
The message is written to the ResponseWriter
The client is rather simple. It has the exact same code for the Message struct. In the main function it just sends a GET request to "localhost:5000/join" and tries to decode the response. Here's the code
func main() {
// try to join
var clientId ClientId
start := time.Now()
var message Message
resp, err := http.Get("http://localhost:5000/join")
if err != nil {
log.Fatal(err)
}
fmt.Println(resp.Status)
dec := json.NewDecoder(resp.Body)
err = dec.Decode(&message)
if err != nil {
fmt.Println("error decoding the response to the join request")
log.Fatal(err)
}
fmt.Println(message)
duration := time.Since(start)
fmt.Println("connected after: ", duration)
fmt.Println("with clientId", message.ClientId)
}
I've started the server, waited a few seconds and then ran the client. This is the result
The server prints "client wants to join"
The server prints "the json: {"What":-1,"Tag":-1,"Id":-1,"ClientId":0,"X":-1,"Y":-1}"
The client prints "200 OK"
The client crashes "error decoding the response to the join request"
The error is "invalid character "3" after array element"
This error message really confused me. After all, nowhere in my json, there's the number 3. So I imported io/ioutil on the client and just printed the response with this code
b, _ := ioutil.ReadAll(resp.Body)
fmt.Printf("the json: %s\n", b)
Please note that the print statement is the same as on the server. I expected to see my encoded JSON. Instead I got this
"200 OK"
"the json: [123 34 87 104 97 116 ....]" the list went on for a long time
I'm new to Go and don't know if I did this correctly. But it seems as if the above code just printed the slice of bytes. Strangely, on the server the output was converted to a string.
My guess is that somehow I'm reading the wrong data or that the message was corrupted on the way between server and client. But honestly these are just wild guesses.
In your server, instead of
fmt.Fprint(w, buffer.Bytes())
you need to use:
w.Write(buffer.Bytes())
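The fmt package formats the []byte returned by Bytes() as a human-readable list of integers, like so:
[123 34 87 104 97 116 ... etc
That is also where the confusing error comes from: the client's decoder takes the leading [123 as the start of a JSON array and then fails on the 3 of 34, because it expects a comma or ] after the first element.
You don't want to use fmt.Fprint to write raw bytes to the response. For example: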
package main
import (
"fmt"
"os"
)
func main() {
bs := []byte("Hello, playground")
fmt.Fprint(os.Stdout, bs)
}
Produces
[72 101 108 108 111 44 32 112 108 97 121 103 114 111 117 110 100]
Use the Write() method of the ResponseWriter instead.
You could have found this out by telnetting to your server as an experiment, which is always a good idea when you aren't sure what is going on!
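An alternative to encoding into a buffer first is to encode straight into the ResponseWriter. The sketch below is an assumption-laden illustration, not the asker's code: Message uses plain int fields (the original uses type aliases that are not shown), join and roundTrip are hypothetical names, and httptest.NewRecorder stands in for a real client and server.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"net/http/httptest"
)

// Message is a simplified stand-in for the question's struct; field types are assumed.
type Message struct {
	What, Tag, Id, ClientId, X, Y int
}

// join writes the JSON bytes directly to the response, with no fmt formatting involved.
func join(w http.ResponseWriter, _ *http.Request) {
	msg := Message{What: -1, Tag: -1, Id: -1, ClientId: 0, X: -1, Y: -1}
	w.Header().Set("Content-Type", "application/json")
	if err := json.NewEncoder(w).Encode(msg); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
	}
}

// roundTrip (hypothetical helper) calls the handler in-process and decodes its response.
func roundTrip() (Message, error) {
	rec := httptest.NewRecorder()
	join(rec, httptest.NewRequest(http.MethodGet, "/join", nil))

	var m Message
	err := json.NewDecoder(rec.Body).Decode(&m)
	return m, err
}

func main() {
	m, err := roundTrip()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", m)
}
```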