Why does extracting NEXRAD Level 3 products using the siphon package return empty results? - python-siphon

I am using the siphon package to extract NEXRAD Level 3 data, following the example at https://unidata.github.io/siphon/latest/examples/Radar_Server_Level_3.html, but the returned datasets appear to be empty. Does anyone know why, and whether there is a different package I need to use to access NEXRAD Level 3 products? Thanks.
from datetime import datetime

import matplotlib.pyplot as plt
import numpy as np
from siphon.cdmr import Dataset
from siphon.radarserver import get_radarserver_datasets, RadarServer

ds = get_radarserver_datasets('http://thredds.ucar.edu/thredds/')
print(list(ds))

url = ds['NEXRAD Level III Radar from IDD'].follow().catalog_url
rs = RadarServer(url)
print(rs.variables)

query = rs.query()
query.stations('FTG').time(datetime.utcnow()).variables('N0Q')
rs.validate_query(query)
catalog = rs.get_catalog(query)
print(catalog.datasets)  # comes back empty

The short answer is that the National Weather Service replaced the N?Q family of products (Digital Base Reflectivity) with "super-resolution" counterparts that have the identifier N?B, where ? refers to the elevation of the product.
So for the example you reference, replace N0Q with N0B and everything should work as before.
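For example, the query from the question should work once the variable name is updated; a minimal sketch against the same THREDDS server (assuming FTG data is available at the requested time):

from datetime import datetime
from siphon.radarserver import get_radarserver_datasets, RadarServer

ds = get_radarserver_datasets('http://thredds.ucar.edu/thredds/')
url = ds['NEXRAD Level III Radar from IDD'].follow().catalog_url
rs = RadarServer(url)

query = rs.query()
# N0B is the super-resolution replacement for N0Q (digital base reflectivity)
query.stations('FTG').time(datetime.utcnow()).variables('N0B')
rs.validate_query(query)
catalog = rs.get_catalog(query)
print(catalog.datasets)  # should no longer be empty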

Related

CNN: I'm trying to generate a confusion matrix and classification report for multiclass classification with a custom model, but the values don't seem correct

# Confusion matrix
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report

# predict, test_set, and class_labels come from the earlier steps (see screenshots)
plt.figure(figsize=(16, 9))
y_pred_labels = [np.argmax(label) for label in predict]
cm = confusion_matrix(test_set.classes, y_pred_labels)
# show cm
sns.heatmap(cm, annot=True, fmt='d', xticklabels=class_labels, yticklabels=class_labels)

cr = classification_report(test_set.classes, y_pred_labels, target_names=class_labels)
print(cr)
[Load Data from directory](https://i.stack.imgur.com/p87gv.png)
[accuracy](https://i.stack.imgur.com/1dSab.png)
[evaluate](https://i.stack.imgur.com/LEV0X.png)
[predict](https://i.stack.imgur.com/Kiim2.png)
[cm and cr](https://i.stack.imgur.com/sQN9P.png)
[cr](https://i.stack.imgur.com/dMAaB.png)
[cm](https://i.stack.imgur.com/LzqcY.png)
The complete flow is shown in the screenshots above.
Can anyone find where the actual problem is? How can I get correct values in the classification report? The predictions themselves look correct when I call model.predict and pass it the data set.
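One common cause worth checking (an assumption here, since the generator setup only appears in the screenshots): if test_set is a Keras directory iterator created with shuffle=True, then test_set.classes will not line up with the row order of model.predict, which scrambles both the confusion matrix and the report. A minimal sketch of the fix:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# shuffle=False keeps the file order fixed, so test_set.classes
# matches the row order of model.predict(test_set)
test_gen = ImageDataGenerator(rescale=1 / 255.)
test_set = test_gen.flow_from_directory(
    'data/test',              # hypothetical test directory
    target_size=(224, 224),   # hypothetical image size
    class_mode='categorical',
    shuffle=False)

predict = model.predict(test_set)  # model: your trained Keras model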

Convert JSON to dataframe with two indices

I'm trying to convert some financial data provided in JSON format into a single row of a dataframe. However, this JSON has the data under two indices, or nested indices? I'm not sure how to describe the structure appropriately.
So below is the code I'm using to pull the financial data.
import requests
import pandas as pd

stock = 'AAPL'
BS = requests.get(f"https://financialmodelingprep.com/api/v3/financials/balance-sheet-statement/{stock}?period=quarter")
data = BS.json()
The output looks like this:
{'symbol': 'AAPL',
 'financials': [{'date': '2019-12-28',
   'Cash and cash equivalents': '39771000000.0',
   'Short-term investments': '67391000000.0',
   'Cash and short-term investments': '1.07162e+11',
   'Receivables': '20970000000.0', ...}
I've tried the following
df = pd.DataFrame.from_dict(data, orient='index')
and
df = pd.DataFrame.from_dict(json_normalize(data), orient='columns')
Neither gets me what I want. Somehow I need to get rid of the 'financials' level so the dataframe looks like my desired layout (screenshot not included here).
How do I do this?
So just use the list of records under 'financials' when creating the dataframe.
import requests
import pandas as pd

stock = 'AAPL'
BS = requests.get(f"https://financialmodelingprep.com/api/v3/financials/balance-sheet-statement/{stock}?period=quarter")
data = BS.json()

# build the frame directly from the list of per-quarter records
df = pd.DataFrame.from_dict(data['financials'])
print(df)
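If you also want to keep the top-level 'symbol' field on every row, pandas can flatten the nested records in one call. A small sketch, assuming a recent pandas version where json_normalize is exposed at the top level:

import pandas as pd

# each element of data['financials'] becomes one row;
# the top-level 'symbol' value is repeated into every row as metadata
df = pd.json_normalize(data, record_path='financials', meta=['symbol'])
print(df.head())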

How to get data from the Australian Bureau of Statistics using pandasdmx

Has anyone managed to get ABS data using the pandasdmx library?
Here is the code to get data from the European Central Bank (ECB), which works.
from pandasdmx import Request
ecb = Request('ECB')
flow_response = ecb.dataflow()
print(flow_response.write().dataflow.head())
exr_flow = ecb.dataflow('EXR')
dsd = exr_flow.dataflow.EXR.structure()
data_response = ecb.data(resource_id='EXR', key={'CURRENCY': ['USD', 'JPY']}, params={'startPeriod': '2016'})
However, when I change Request('ECB') to Request('ABS'), the second line raises the error:
"{ValueError}This agency only supports requests for data, not dataflow."
Is there a way to get data from ABS?
Documentation for pandasdmx: https://pandasdmx.readthedocs.io/en/stable/usage.html#basic-usage
Hope this will help
from pandasdmx import Request

Agency_Code = 'ABS'
Dataset_Id = 'ATSI_BIRTHS_SUMM'
ABS = Request(Agency_Code)
data_response = ABS.data(resource_id=Dataset_Id, params={'startPeriod': '2016'})

# This results in a stacked DataFrame
df = data_response.write(data_response.data.series, parse_time=False)

# A flat DataFrame
df_flat = data_response.write().unstack().reset_index()
The Australian Bureau of Statistics (ABS) only supports SDMX-JSON APIs; it doesn't send SDMX-ML messages like the other providers, which is why it doesn't support the dataflow feature.
Please see for further reference: https://pandasdmx.readthedocs.io/en/stable/agencies.html#pre-configured-data-providers

Type Error: Result Set Is Not Callable - BeautifulSoup

I am having a problem with web scraping. I am trying to learn how to do it, but I can't seem to get past some of the basics: I am getting the error "TypeError: 'ResultSet' object is not callable".
I've tried a number of different things. I originally tried to use the "find" function instead of "find_all", but I had an issue with BeautifulSoup returning a NoneType, and I couldn't write an if check to handle that exception, so I tried "find_all" instead.
import requests
from bs4 import BeautifulSoup

page = requests.get('https://topworkplaces.com/publication/ocregister/')
soup = BeautifulSoup(page.text, 'html.parser')
all_company_list = soup.find_all(class_='sortable-table')
#all_company_list = soup.find(class_='sortable-table')
company_name_list_items = all_company_list('td')  # <-- raises the TypeError
for company_name in company_name_list_items:
    #print(company_name.prettify())
    companies = company_name.content[0]
I'd like this to pull in all the companies in Orange County California that are on this list in a clean manner. As you can see, I've already accomplished pulling them in, but I want the list to be clean.
You've got the right idea. Instead of immediately finding all the <td> tags (which returns one <td> for each of the 140 rows and each of the 4 columns in a row), if you want only the company names it might be easier to find all the rows (<tr> tags), then pull whichever columns you want by iterating the <td>s in each row.
This will get the first column, the company names:
import requests
from bs4 import BeautifulSoup
page = requests.get('https://topworkplaces.com/publication/ocregister/')
soup = BeautifulSoup(page.text,'html.parser')
all_company_list = soup.find_all('tr')
company_list = [c.find('td').text for c in all_company_list[1:]]  # skip the header row
Now company_list contains all 140 company names:
>>> print(company_list)
['Advanced Behavioral Health', 'Advanced Management Company & R³ Construction Services, Inc.',
...
, 'Wes-Tec, Inc', 'Western Resources Title Company', 'Wunderman', 'Ytel, Inc.', 'Zillow Group']
Change c.find('td') to c.find_all('td') and iterate that list to get all the columns for each company.
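For example, a sketch building on the snippet above that collects every column of every row:

# one list per company; the exact columns depend on the table layout
rows = [[td.text for td in tr.find_all('td')] for tr in all_company_list[1:]]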
Pandas:
Pandas is often useful here. The page supports multiple sort orders, including company size and rank; the following shows a sort by rank.
import pandas as pd
table = pd.read_html('https://topworkplaces.com/publication/ocregister/')[0]
table.columns = table.iloc[0]
table = table[1:]
table.Rank = pd.to_numeric(table.Rank)
rank_sort_table = table.sort_values(by='Rank', axis=0, ascending = True)
rank_sort_table.reset_index(inplace=True, drop=True)
rank_sort_table.columns.names = ['Index']
print(rank_sort_table)
Then, depending on your sort, the companies in order:
print(rank_sort_table.Company)
Requests:
Incidentally, you can use nth-of-type to select just the first column (company names), and use the table's id rather than its class name to identify it, since id lookups are faster:
import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://topworkplaces.com/publication/ocregister/')
soup = bs(r.content, 'lxml')
names = [item.text for item in soup.select('#twpRegionalList td:nth-of-type(1)')]
print(names)
Note the default sorting is alphabetical on name column rather than rank.
Reference:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.sort_values.html

Getting html table data other than text content (get "title" tag data)

One cell within a table row of an HTML table I am trying to scrape looks like this:
<td class="top100nation" title="PAK">
<img src="/images/flag/flags_pak.jpg" alt="PAK"></td>
The web page to which this belongs is the following: http://www.relianceiccrankings.com/datespecific/odi/?stattype=bowling&day=01&month=01&year=2014. The entire column to which this belongs in the table has similar table data (i.e. it's a column of images).
I am using lxml in a python script. (Open to using BeautifulSoup instead, if I have to for some reason.) For every other column in the table, I can extract the data I want on the given row by using 'data = entry.text_content()'. Obviously, this doesn't work for this column of images. But I don't want the image data in any case. What I want to get from this table data is the 'PAK' bit - that is, I want the name of the nation. I think this is extremely simple but unfortunately I am a simpleton who doesn't understand the library he is using.
Thanks in advance
Edit: Full script, as per request
import requests
import lxml.html as lh
import csv

url = 'http://www.relianceiccrankings.com/datespecific/odi/?stattype=bowling&day=01&month=01&year=2014'

with open('firstPageCricinfo', 'w') as file:
    writer = csv.writer(file)

page = requests.get(url)
doc = lh.fromstring(page.content)

# rows of the table
tr_elements = doc.xpath('//tr')
data_array = [[] for _ in range(len(tr_elements))]
del tr_elements[0]

for t in tr_elements[0]:
    name = t.text_content()
    if name == "":
        continue
    print(name)
    data_array[0].append(name)

# printing out the first row of the table, to check correctness
print(data_array[0])

for j in range(1, len(tr_elements)):
    T = tr_elements[j]
    i = 0
    for t in T.iterchildren():
        # columns not at issue
        if i != 3:
            data = t.text_content()
        # image-based column
        else:
            # what do I do here???
            data = ...
        data_array[j].append(data)
        i += 1

# printing the last row to check correctness
print(data_array[len(tr_elements) - 1])

with open('list1', 'w') as file:
    writer = csv.writer(file)
    for i in range(0, len(tr_elements)):
        writer.writerow(data_array[i])
Along with the lxml library, you'll also need requests or some other library to fetch the website content.
Without seeing the code you have so far, I can offer a BeautifulSoup solution:
from bs4 import BeautifulSoup
import requests

url = 'http://www.relianceiccrankings.com/datespecific/odi/?stattype=bowling&day=01&month=01&year=2014'
soup = BeautifulSoup(requests.get(url).text, 'lxml')

r = soup.find_all('td', {'class': 'top100cbr'})
for td in r:
    print(td.text.split('v')[1].split(',')[0].strip())
outputs about 522 items:
South Africa
India
Sri Lanka
...
Canada
New Zealand
Australia
England
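If what you want is the title attribute itself (the 'PAK' on the <td class="top100nation" ...> cell from the question), both libraries expose element attributes directly. A short sketch of each approach, reusing names from the question:

# lxml: inside the question's loop over cells, read attributes with .get()
data = t.get('title')  # e.g. 'PAK' for the nation column

# BeautifulSoup: tags support .get() / dict-style attribute access
for td in soup.find_all('td', {'class': 'top100nation'}):
    print(td.get('title'))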