Error when trying to use DataFrame.to_json method - json

I'm trying to export a pandas dataframe to JSON with no luck. I've tried:
all_data.to_json("spdata.json") and all_data.to_json()
I get the same attribute error on both: 'DataFrame' object has no attribute 'to_json'. Just to make sure nothing is wrong with the DataFrame, I tested writing it with to_csv and that worked.
Is there something I'm missing in my syntax, or a package I need to import? I am running Python 2.7.5 as part of an Enthought Canopy Express package. The imports at the beginning of my code are:
from pandas import Series, DataFrame
import pandas as pd
import numpy as np
from sys import argv
from datetime import datetime, timedelta
from dateutil.parser import parse

The to_json method was introduced in pandas 0.12, so you'll need to upgrade your pandas to be able to use it.
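After upgrading, you can confirm the installed version and the call itself with something like the following (a minimal sketch; the DataFrame contents are placeholders standing in for all_data):
import pandas as pd

print(pd.__version__)  # to_json requires pandas >= 0.12

# placeholder data standing in for all_data
all_data = pd.DataFrame({"ticker": ["AAPL", "MSFT"], "price": [101.2, 35.4]})

all_data.to_json("spdata.json")   # write JSON to a file
json_string = all_data.to_json()  # or get the JSON back as a string
print(json_string)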

Related

How to load a Modin dataframe from pyarrow or pandas

Since Modin does not support loading from multiple parquet files on S3, I am using pyarrow to load the data.
import s3fs
import modin.pandas as pd
from pyarrow import parquet
s3 = s3fs.S3FileSystem(
    key=aws_key,
    secret=aws_secret,
)
table = parquet.ParquetDataset(
    path_or_paths="s3://bucket/path",
    filesystem=s3,
).read(
    columns=["hotelId", "startDate", "endDate"]
)
# to get a pandas df the next step would be table.to_pandas()
What if I now want to put the data in a Modin df for parallel computations, without having to write to and read from a CSV? Is there a way to construct the Modin df directly from a pyarrow.Table, or at least from a pandas dataframe?
Mahesh's answer should work, but I believe it would result in a full data copy (2X memory footprint by default: https://arrow.apache.org/docs/python/pandas.html#memory-usage-and-zero-copy).
At the time of writing, Modin does have a native Arrow integration, so you can convert directly using
from modin.pandas.utils import from_arrow
mdf = from_arrow(pyarrow_table)
You can't construct the Modin dataframe directly out of a pyarrow.Table, because pandas doesn't support that, and Modin only supports a subset of the pandas API. However, the table has a to_pandas() method that converts it to a pandas dataframe, and you can construct the Modin dataframe out of that. Using table from your code:
import modin.pandas as pd
modin_dataframe = pd.DataFrame(table.to_pandas())
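Putting the pieces together, a rough end-to-end sketch (the bucket path, credentials, and column names are the placeholders from the question; whether from_arrow is available depends on your Modin version):
import s3fs
import modin.pandas as pd
from pyarrow import parquet
from modin.pandas.utils import from_arrow

s3 = s3fs.S3FileSystem(key=aws_key, secret=aws_secret)  # placeholder credentials
table = parquet.ParquetDataset(
    path_or_paths="s3://bucket/path",
    filesystem=s3,
).read(columns=["hotelId", "startDate", "endDate"])

# option 1: go through pandas (incurs the extra copy mentioned above)
mdf = pd.DataFrame(table.to_pandas())
# option 2: use Modin's native Arrow conversion
mdf = from_arrow(table)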

Storing data, originally in a dictionary sequence, to a dataframe in the JSON format of a webpage

I'm new to pandas. How do I store data, originally in a dictionary sequence, in a DataFrame using the JSON format of a webpage?
I am interpreting the question assuming you have the URL of the webpage you want to read. Inspect that URL and check whether the data you need is available in JSON format. If it is, a URL will be provided containing all the data. We need that URL in the following code:
First, import the required modules.
import pandas as pd
import requests
import json
URL="url of the webpage having the json file"
r=requests.get(URL)
data= r.json()
Create the dataframe df.
df = pd.io.json.json_normalize(data)
Print the dataframe to check that it contains the data you expect.
print(df)
I hope this answers your question.
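For illustration, this is how json_normalize flattens a sequence of dictionaries of the kind a JSON endpoint typically returns (the records below are made-up placeholders; in newer pandas the same function is also exposed as pd.json_normalize):
import pandas as pd

data = [
    {"id": 1, "name": "alpha", "location": {"city": "Oslo", "country": "NO"}},
    {"id": 2, "name": "beta", "location": {"city": "Lima", "country": "PE"}},
]

# nested keys become dotted column names: location.city, location.country
df = pd.io.json.json_normalize(data)
print(df)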

Read data from thingspeak using python using urllib library

I wanted to read current data from the ThingSpeak website.
I used the urllib library to get the data using the read URL of the channel.
I used this Python code:
import urllib
from bs4 import BeautifulSoup
data=urllib.urlopen("https://api.thingspeak.com/channels/my_channel_no/feeds.json?results=2")
print data.read()
select=repr(data.read())
print select
sel=select[20:]
print sel
https://api.thingspeak.com/channels/my_channel_no/feeds.json?results=2
I get this result from the query:
'{"channel":{"id":my_channel_no,"name":"voltage","latitude":"0.0","longitude":"0.0","field1":"Field Label 1","field2":"Field Label 2","created_at":"2018-04-05T16:33:14Z","updated_at":"2018-04-09T15:39:43Z","last_entry_id":108},"feeds":[{"created_at":"2018-04-09T15:38:42Z","entry_id":107,"field1":"20.00","field2":"40.00"},{"created_at":"2018-04-09T15:39:44Z","entry_id":108,"field1":"20.00","field2":"40.00"}]}'
But when this line was executed
select=repr(data.read())
the result was
"''"
And
sel=select[20:]
Output
''
The slicing is meant to shorten the returned string.
Can anyone give me a sense of direction about what's happening, and a solution? Selecting the field values is the goal.
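For context, the response object returned by urllib behaves like a file: the first data.read() consumes the stream, so the second read() returns an empty string, which is why select ends up as ''. A minimal sketch (the channel number is a placeholder) that reads the response once and parses it as JSON instead of slicing the raw string:
import urllib
import json

url = "https://api.thingspeak.com/channels/my_channel_no/feeds.json?results=2"
data = urllib.urlopen(url)

body = data.read()       # read the stream once and keep the result
feed = json.loads(body)  # parse the JSON instead of slicing the string

for entry in feed["feeds"]:
    print entry["field1"], entry["field2"]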

Is a batch import option available?

Is there any method to import data from MySQL to Elasticsearch batch by batch? If yes, how do I do it?
The bulk import seems to be a problem: when I import 191000 items, only a few are actually imported.
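One common approach, sketched here with placeholder connection details, table, and index names, is to page through the MySQL table and send each page to Elasticsearch with the bulk helper from the official Python client:
import pymysql
from elasticsearch import Elasticsearch, helpers

conn = pymysql.connect(host="localhost", user="user", password="pass", db="mydb")  # placeholders
es = Elasticsearch(["http://localhost:9200"])

BATCH_SIZE = 1000

with conn.cursor(pymysql.cursors.DictCursor) as cur:
    cur.execute("SELECT id, name, price FROM items")  # placeholder query
    while True:
        rows = cur.fetchmany(BATCH_SIZE)
        if not rows:
            break
        actions = [{"_index": "items", "_id": row["id"], "_source": row} for row in rows]
        helpers.bulk(es, actions)  # index one batch at a time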

Can't understand this module/type error

I'm trying to use the Aeson JSON library in Haskell. Right now, I just need to use "decode" to read a JSON dump.
import Data.Aeson
import Data.ByteString as BS
import Control.Applicative
main :: IO ()
main = print $ decode <$> BS.readFile "json"
I got the following error when trying to compile/run it:
Couldn't match type 'ByteString'
          with 'Data.ByteString.Lazy.Internal.ByteString'
NB: 'ByteString' is defined in 'Data.ByteString.Internal'
    'Data.ByteString.Lazy.Internal.ByteString'
      is defined in 'Data.ByteString.Lazy.Internal'
This error doesn't make sense to me. I tried importing the modules GHC mentions, but the import either fails or doesn't solve the problem.
Thanks
There are two variants of ByteString: a strict one (the default), exported by Data.ByteString, and a lazy one, exported by Data.ByteString.Lazy.
Aeson's decode works on lazy ByteStrings, so you should change your second line to
import Data.ByteString.Lazy as BS