I'm getting Yahoo Finance data as a JSON file (via the YahooFinancials Python API) and I would like to parse the data in a smart way to feed my Google Sheet.
For this example, I'm interested in getting the "cash" variable under the nested date structure. But as you'll see, sometimes there is no "cash" variable under the first date, so I would like the script/formula to go and get the "cash" variable under the second date structure instead.
Here is sample 1 of JSON code:
{ "balanceSheetHistoryQuarterly": {
"ABBV": [
{
"2018-12-31": {
"totalStockholderEquity": -2921000000,
"netTangibleAssets": -45264000000
}
},
{
"2018-09-30": {
"intangibleAssets": 26625000000,
"capitalSurplus": 14680000000,
"totalLiab": 69085000000,
"totalStockholderEquity": -2921000000,
"otherCurrentLiab": 378000000,
"totalAssets": 66164000000,
"commonStock": 18000000,
"otherCurrentAssets": 112000000,
"retainedEarnings": 6789000000,
"otherLiab": 16511000000,
"goodWill": 15718000000,
"treasuryStock": -24408000000,
"otherAssets": 943000000,
"cash": 8015000000,
"totalCurrentLiabilities": 15387000000,
"shortLongTermDebt": 1026000000,
"otherStockholderEquity": -2559000000,
"propertyPlantEquipment": 2950000000,
"totalCurrentAssets": 18465000000,
"longTermInvestments": 1463000000,
"netTangibleAssets": -45264000000,
"shortTermInvestments": 770000000,
"netReceivables": 5780000000,
"longTermDebt": 37187000000,
"inventory": 1786000000,
"accountsPayable": 10981000000
}
},
{
"2018-06-30": {
"intangibleAssets": 26903000000,
"capitalSurplus": 14596000000,
"totalLiab": 65016000000,
"totalStockholderEquity": -3375000000,
"otherCurrentLiab": 350000000,
"totalAssets": 61641000000,
"commonStock": 18000000,
"otherCurrentAssets": 128000000,
"retainedEarnings": 5495000000,
"otherLiab": 16576000000,
"goodWill": 15692000000,
"treasuryStock": -23484000000,
"otherAssets": 909000000,
"cash": 3547000000,
"totalCurrentLiabilities": 17224000000,
"shortLongTermDebt": 3026000000,
"otherStockholderEquity": -2639000000,
"propertyPlantEquipment": 2787000000,
"totalCurrentAssets": 13845000000,
"longTermInvestments": 1505000000,
"netTangibleAssets": -45970000000,
"shortTermInvestments": 196000000,
"netReceivables": 5793000000,
"longTermDebt": 31216000000,
"inventory": 1580000000,
"accountsPayable": 10337000000
}
},
{
"2018-03-31": {
"intangibleAssets": 27230000000,
"capitalSurplus": 14519000000,
"totalLiab": 65789000000,
"totalStockholderEquity": 3553000000,
"otherCurrentLiab": 125000000,
"totalAssets": 69342000000,
"commonStock": 18000000,
"otherCurrentAssets": 17000000,
"retainedEarnings": 4977000000,
"otherLiab": 17250000000,
"goodWill": 15880000000,
"treasuryStock": -15961000000,
"otherAssets": 903000000,
"cash": 9007000000,
"totalCurrentLiabilities": 17058000000,
"shortLongTermDebt": 6024000000,
"otherStockholderEquity": -2630000000,
"propertyPlantEquipment": 2828000000,
"totalCurrentAssets": 20444000000,
"longTermInvestments": 2057000000,
"netTangibleAssets": -39557000000,
"shortTermInvestments": 467000000,
"netReceivables": 5841000000,
"longTermDebt": 31481000000,
"inventory": 1738000000,
"accountsPayable": 10542000000
}
}
]
}
}
The first date structure (under 2018-12-31) doesn't contain the cash variable, so I would like the Google Sheet to search for the same data under 2018-09-30 and, if it's not available there, under 2018-06-30.
OR just scan the nested date structures and fetch the first "cash" occurrence that is found.
Basically, I would like to know how to skip the name of the date variable (i.e. 2018-12-31), as it doesn't really matter, and just make the formula seek the first available "cash" variable.
Main questions recap:
How to skip mentioning an exact nested level name and scan what's inside?
How to keep scanning until you find the desired variable with a value that is not "null" (this can happen)?
What would be the entire formula to achieve the following logic: scan the JSON file until you find the value; if no value is found, fall back to this IMPORTXML function that calls an external API? (See the sketch after this list.)
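To illustrate the logic I'm after, here is a minimal sketch in plain JavaScript / Apps Script (firstAvailable is just a hypothetical helper, not part of ImportJSON):

function firstAvailable(data, ticker, field) {
  // "data" is the parsed JSON; each quarter looks like { "<date>": { ...fields } }
  var quarters = data.balanceSheetHistoryQuarterly[ticker];
  for (var i = 0; i < quarters.length; i++) {
    var dateKey = Object.keys(quarters[i])[0]; // skip the date's name entirely
    var value = quarters[i][dateKey][field];
    if (value !== undefined && value !== null) {
      return value; // first non-null occurrence wins
    }
  }
  return null; // caller can fall back to the IMPORTXML route
}

For the sample above, firstAvailable(parsed, "ABBV", "cash") would return 8015000000 (from 2018-09-30).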
Let me know if you need more context about the issue and thanks in advance for your help :)
EDIT: this is the ImportJSON formula I'm using in the spreadsheet cell right now.
=ImportJSON("https://api.myjson.com/bins/8mxvi", "/financial/balanceSheetHistoryQuarterly/ABBV/2018-12-31/cash", "noHeaders")
Obviously, this one returns an error, as there is nothing under that date. The link is also the actual JSON I'm using right now.
=REGEXEXTRACT(FILTER(
TRANSPOSE(SPLIT(SUBSTITUTE(A1, ","&CHAR(10), "×"), "×")),
ISNUMBER(SEARCH("*"&"cash"&"*",
TRANSPOSE(SPLIT(SUBSTITUTE(A1, ","&CHAR(10), "×"), "×"))))), ": (.+)")
=INDEX(ARRAYFORMULA(SUBSTITUTE(REGEXEXTRACT(FILTER(TRANSPOSE(SPLIT(SUBSTITUTE(
TRANSPOSE(IMPORTDATA("https://api.myjson.com/bins/8mxvi")), ","&CHAR(10), "×"), "×")),
ISNUMBER(SEARCH("*"&"cash"&"*", TRANSPOSE(SPLIT(SUBSTITUTE(
TRANSPOSE(IMPORTDATA("https://api.myjson.com/bins/8mxvi")), ","&CHAR(10), "×"), "×"))))),
":(.+)"), ",", "")), 1, 1)
I have some Couchbase data in the following format
{
"id": "12343",
"transaction": {
"2018-01-11": 10,
"2017-12-01" : 20
},
"_type": "TransactionData"
}
I would like to get the ids whose transaction list contains a key older than a given date (for example, this object would not be retrieved for a value of "2017-11-01", but it would be for "2017-12-12").
I made a view, but I would like to parameterise the date String:
function (doc, meta) {
if (doc._type == 'TransactionData') {
for (var key in doc.transaction) {
// I want to send this String value from Java
if (key < "2018-02-21") {
emit(doc.id, null);
break;
}
}
}
}
I tried writing an N1QL query, but my server doesn't allow that, and I can't change this configuration.
I don't think I can use startKey, because I return a map of (id, null) pairs.
How can I filter the ids that have transactions older than a configurable date?
Thanks.
You can do it like this:
function (doc, meta) {
  if (doc._type == 'TransactionData') {
    for (var key in doc.transaction) {
      emit(key, doc.id); // emit the transaction date as the key so it can be range-queried
    }
  }
}
Use _count as the reduce function; then you can query the range of dates older than the cutoff:
query.range(null, "2018-02-21").reduce(true)
The returned value then tells you how many matching rows there are.
Views are static indexes. Documents are processed once after each change, and any emitted results are put into the index. You can't parameterize your function because it isn't rerun for every query, so you can't solve the problem the way you're approaching it. (You could do that with N1QL.)
Typically you solve this by adding a key-range filter to your query; look at the documentation for querying views, which has examples of selecting by date. You'll have to decide how you want to structure the index (view) you create, as sketched below.
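For example, if the view emits the transaction date itself as the key (as in the answer above), the query side might look roughly like this with the Node.js SDK (a sketch only; the design document and view names are made up, and bucket is assumed to be an opened bucket):

var ViewQuery = require('couchbase').ViewQuery;

// Keys collate lexicographically, so YYYY-MM-DD strings sort by date;
// null as the start key means "from the beginning of the index".
var query = ViewQuery.from('transactions', 'by_date')
                     .range(null, '2018-02-21')
                     .reduce(false);

bucket.query(query, function (err, rows) {
  // one row per matching transaction; collect the distinct doc ids
  // from row.value (one document can emit several rows)
});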
I have the following JSON data in a MySQL JSON field:
{
"Session0":[
{
"r_type":"main",
"r_flag":"other"
},
{
"r_type":"sub",
"r_flag":"kl"
}
],
"Session1":[
{
"r_type":"up",
"r_flag":"p2"
},
{
"r_type":"id",
"r_flag":"mb"
}
],
"Session2":[
{
"r_type":"main",
"r_flag":"p2"
},
{
"r_type":"id",
"r_flag":"mb"
}
]
}
Now, I wish to search ALL sessions where r_type="main". The session number can vary, hence I cannot use an OR query. So, I need something like:
WHERE JSON_EXTRACT(field, "$.Session**[*].r_type") = "main"
But this does not seem to work. I need to be able to use a wildcard in the property's name and then search an array for a property inside it. How do I do that?
The following works, but it limits our ability to have an unlimited number of sessions:
SELECT field->"$.Session1[*].r_type" from table
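If your MySQL version supports the $** wildcard path (5.7+), one possibility is to collect every r_type at any depth and test for the value (a sketch; mytable and field are placeholder names):

-- JSON_EXTRACT with $**.r_type returns an array of all r_type values,
-- whatever the session keys are called; JSON_CONTAINS then tests membership.
SELECT *
FROM mytable
WHERE JSON_CONTAINS(JSON_EXTRACT(field, '$**.r_type'), '"main"');

Alternatively, JSON_SEARCH(field, 'one', 'main', NULL, '$**.r_type') IS NOT NULL locates the value directly and returns NULL when it is absent.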
I'm trying to create a d3.js graph from a Rails database. It takes the following JSON:
{
"nodes":[
{
"name":"Sebo",
"group":4,
"id":1
},
{
"name":"Pierre",
"group":5,
"id":2
},
{
"name":"Bilbo",
"group":2,
"id":3
},
{
"name":"yyyyyyyy",
"group":2,
"id":4
}
],
"links":[
{
"source":3,
"target":2,
"value":null
},
{
"source":3,
"target":1,
"value":null
},
{
"source":4,
"target":2,
"value":null
},
{
"source":4,
"target":1,
"value":null
}
]
}
I have created a button that allows a current user to follow another user. This then gets stored in the database and eventually the graph can be re-visualised.
The problem is that the request to update the database is based on the current user id (from the database), which uses non-zero-based indexing, so the first user is user id 1. However, the JSON uses zero-based indexing. This means that if user_id 1 connects to user_id 4, then when the graph is rendered again the connection is attributed to user_id 2. What would be great is if I could specify that the user_id index starts at zero so that the array and database are in agreement. Is this the correct way to think about this? Can I force the indexing of the user table to start at zero, e.g. in a Rails schema?
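Rather than forcing the table to start at zero, one common workaround is to remap database ids to array positions when building the JSON for d3 (a sketch, assuming nodes and links shaped like the sample above):

// Build a lookup from database id to zero-based array index,
// then rewrite each link to the indices d3's force layout expects.
var indexById = {};
nodes.forEach(function (node, i) { indexById[node.id] = i; });
links.forEach(function (link) {
  link.source = indexById[link.source];
  link.target = indexById[link.target];
});

This keeps the database ids intact while giving d3 the zero-based references it wants.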
I have two documents, one dependent on the other. The first:
{
"doctype": "closed_auctions",
"seller": {
"person": "person11304"
},
"buyer": {
"person": "person0"
},
"itemref": {
"item": "item1"
},
"price": 50.03,
"date": "11/17/2001",
"quantity": 1,
"type": "Featured",
"annotation": {
"author": {
"person": "person8597"
}
}
}
Here you can see that doc.buyer.person refers to another document, like this one:
{
"doctype": "people",
"id": "person0",
"name": "Kasidit Treweek",
"profile": {
"income": 20186.59,
"interest": [
{
"category": "category251"
}
],
"education": "Graduate School",
"business": "No"
},
"watch": [
{
"open_auction": "open_auction8747"
}
]
}
How can I get the buyer's name from these two documents? I mean, doc.buyer.person is connected with the second document's id. It is a join, and from the documentation it's not clear how to do it: http://docs.couchbase.com/couchbase-manual-2.0/#solutions-for-simulating-joins
Well, first off, let me point out that the very first sentence of the documentation section that you referenced says (I added the emphasis):
Joins between data, even when the documents being examined are contained within the same bucket, are *not possible* directly within the view system.
So, the quick answer to your question is that you have lots of options. Here are a few of them:
1. Assume you need only the name, for a rather small subset of people: create a view that outputs the person id as key and the name as value, then query the view for a specific name each time you need it.
2. Assume you need many people joined to many auctions: download the full contents of the basic index from #1 and execute the join using LINQ.
3. Assume you need many properties of the person, not just the name: download the Person document for each auction item.
4. Assume you need a small subset from both Auction and People: index the fields from each that you need, include a type field, and emit all of them under the key of the person. You will then be able to query the view for all items belonging to that person.
The last approach was used in the example you linked to in your question. For performance, you will need to tailor the approach to your usage scenario; a sketch of option 1 follows.
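A minimal sketch of the option-1 view (the names are assumed, not taken from the documentation):

// Map person documents to (id -> name), so a buyer's name can be
// looked up by querying this view with key = the buyer's person id.
function (doc, meta) {
  if (doc.doctype === "people") {
    emit(doc.id, doc.name);
  }
}

Querying it with key="person0" would return "Kasidit Treweek" for the sample documents above.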
Another solution consists of merging the data in a custom reduce function.
// view
function (doc, meta) {
if (doc.doctype === "people") {
emit(doc.id, doc);
}
if (doc.doctype === "closed_auctions") {
emit(doc.buyer.person, doc);
}
}
// custom reduce
function (keys, values, rereduce) {
  // note: rereduce is not handled here, which can bite on large result sets
  var peoples = values.filter(function (doc) {
    return doc.doctype === "people";
  });
  for (var key in peoples) {
    var people = peoples[key];
    // attach every auction whose buyer is this person
    people.closed_auctions = (function (peopleId) {
      return values.filter(function (doc) {
        return doc.doctype === "closed_auctions" && doc.buyer.person === peopleId;
      });
    })(people.id);
  }
  return peoples;
}
And then you can query one user with "key" or multiple users with "keys".
That said, I don't know what the performance implications of this method are.