I have a hospital pricing JSON file that management wants me to parse, but the file is over 4 million rows and, as all of you know, Excel can only handle about 1 million rows. Fortunately, they only want pricing from a certain hospital group. I know how to do a basic parse of JSON files using Excel, but I don't know how to adjust the parse so it only pulls in data matching certain criteria.
I don't see a specific question here, so I'll give a broad answer. I don't think Excel is the right tool for this job. You're better off using a scripting or programming tool to filter the rows you need out of the JSON file. You can then reuse the script when another one of these requests comes in. A simple, easy-to-use contender here would be Python and its json module.
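As a rough sketch of that approach, and not a definitive implementation: the layout of the pricing file isn't given, so this assumes the top level is a JSON array of flat objects and that each record carries a field naming its hospital group. The key name `hospital_group` and the file names in the usage note are hypothetical.

```python
import csv
import json


def filter_rows(rows, key, wanted):
    """Return only the records whose `key` field equals `wanted`."""
    return [row for row in rows if row.get(key) == wanted]


def json_to_filtered_csv(json_path, csv_path, key, wanted):
    """Load a JSON array of objects, keep matching records, write them as CSV."""
    with open(json_path, encoding="utf-8") as f:
        rows = json.load(f)  # assumption: top level is a list of flat objects
    matches = filter_rows(rows, key, wanted)
    if not matches:
        return 0
    # Collect every column name that appears in the matching records
    fieldnames = []
    for row in matches:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(matches)
    return len(matches)
```

You would call it as something like `json_to_filtered_csv("prices.json", "group_prices.csv", "hospital_group", "Some Group")` and open the resulting CSV in Excel. Note that `json.load` reads the whole file into memory, so a very large file may need a streaming parser instead.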
I've been using DAX to help me document my Power BI file. Using DAX queries I've been able to record all the fields that exist in the file, including calculated columns and measures. In my documentation process I am also looking for a way to record the visualizations on the report, namely the charts and graphs. Unfortunately, no DAX query I've read about provides data such as the visualization title, what fields it's using, or what kind of graph it is. Is there any DAX query that provides this information, in whole or in part?
In addition to attempting to document with Dax I have also looked at the raw XML data in the Power BI file (For those who may not know, you can rename your Power BI file from .pbix to .zip and view the raw data). The relevant files within PBI are either XML or JSON. Looking at ../Report/Layout.JSON specifically I have seen JSON-formatted text that includes visualization data. Is there any easy way to extract this data and format it in a more-readable fashion?
For clarity, I do not need the contents of the tables, but I would like a way to record what fields are being used in the visualization, rather than what fields merely exist.
EDIT: I've found a workaround. It isn't efficient, and I would still appreciate any knowledge on this subject.
I mentioned going through the Layout file, renaming it to .JSON and poking at it in Notepad++. I've found that you can ctrl+f for "displayName", "queryRef" and ""title\":show\":true,\"text\":\"". Break these all onto new lines and indent them with a tab (use ctrl+h and replace with \n\t in Notepad++). This indents the JSON-formatted lines for Power BI pages, fields called by visualizations, and the visualization titles (if they have any), respectively.
Save this document as .csv and load it into Excel, delimiting on tabs. Use your preferred process (I prefer the query editor) to remove the other, non-indented rows. There may still be a lot of excess characters on the indented lines, which need to be removed manually. At the end of this process, though, I ended up with 3 columns in Excel listing the aforementioned fields I've been looking for.
On a PBIX file with more than a dozen pages and several hundred dependent fields, this process took about three hours. If there are any faster ways to do this, I would love to hear about them.
As you have noted, DAX doesn't help you in this case because it will tell you about the model rather than the visuals on the report pages. The Layout file works, but you have to parse it for the information you need. You could probably just pull that JSON file into Power BI and process it there to get the info you want. There are also third party tools that can help with this. I just looked at https://app.datavizioner.com/ and it lists the ID of the visual, the type of visual, and each field used in the visual. It is currently free and just requires you to upload a PBIT of your report. It doesn't have the title of the visual that we see, so you would have to find a way to map the IDs you see to the human-friendly title of the visuals if you need that.
See http://radacad.com/power-bi-helper. It can tell you tables and columns in use. It also can export a list of all tables, columns, formulas, and roles in your model.
If you want stuff on the visualizations and how they are configured, Layout.json is the only way I know. The file does open nicely in Power Query if you were so inclined to try to make something of it.
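If you'd rather script it than search and replace by hand, here is a minimal sketch in Python. It rests on assumptions about the Layout structure that you should check against your own file: that pages live under a top-level "sections" list with a "displayName", that each page has "visualContainers" whose "config" value is itself a JSON string, and that the fields sit under "singleVisual" > "projections" as "queryRef" entries. The function name is made up for illustration.

```python
import json


def list_visual_fields(layout):
    """Yield (page name, visual type, field reference) tuples from a parsed Layout."""
    for section in layout.get("sections", []):
        page = section.get("displayName", "")
        for container in section.get("visualContainers", []):
            # Assumption: each container's "config" is a JSON string, not an object
            config = json.loads(container.get("config", "{}"))
            visual = config.get("singleVisual", {})
            visual_type = visual.get("visualType", "")
            # Assumption: projections maps a role name to a list of field entries
            for items in visual.get("projections", {}).values():
                for item in items:
                    yield page, visual_type, item.get("queryRef", "")
```

To use it, load the extracted Layout file with `json.load` (on the files I'd expect, it needs to be opened with a UTF-16 encoding, but treat that as an assumption too) and pass the result to `list_visual_fields`. The rows can then be written out with the csv module and opened in Excel.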
My new Power BI comparer tool documents the whole Power BI file (pbit). The "CompareVisuals"-tab should provide you with all the information necessary.
It is also superfast: just fill in the path to the pbits (you can put the same path into both fields if you don't want to compare but just want to analyze one file).
https://www.thebiccountant.com/2019/09/14/compare-power-bi-files-with-power-bi-comparer-tool/
I am looking for the best way to store a set of data on my server and then, from within an app I am building, retrieve random parts of that data. The app will present the end user with study-related questions. I have 40 subjects, with 50 multiple-choice questions per subject, a few sample questions for each subject, and only 1 correct answer per question. I have been considering using phpMyAdmin and going down the SQL route, but I already have all of my data neatly arranged in an Excel sheet with columns for 'Subject', 'Sample question bank', 'Real question bank' and 'Answer bank', with the respective Excel sheets (containing the actual content) listed under the Sample, Real and Answer question bank columns.
Is manually re-entering and restructuring all of my data really the only/best way for me to move forward? Or is there another method, perhaps storing and accessing Excel files on a server and calling data from a given column? The way my data is arranged, I will never need a specific question; the only pair of data that must match is the proper answer to a question. All of my other calls for data within the application will be random, i.e. I will be populating the app with 20/30/40 random questions from within a particular subject.
I apologize in advance if I am violating any rules or if my etiquette is improper. Thanks very much for anyone's input or suggestions.
Import your existing spreadsheet into MySQL:
Install MySQL, if you haven't.
Set up an admin account.
Create a table with the specified column names.
From Excel, save as a CSV file.
Launch MySQL to import the CSV.
From MySQL, use LOAD DATA INFILE:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
Your whole spreadsheet is now in MySQL.
Then use PHP or Perl to make a web-based interface. Of course, other programming languages are just as good.
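To show the idea of importing the CSV and then pulling random questions for one subject, here is a small sketch. It deliberately swaps in Python's built-in sqlite3 module in place of MySQL so it runs without a server. The table name `questions` and the columns `subject`, `question` and `answer` are assumptions for illustration, not the asker's actual layout.

```python
import csv
import io
import sqlite3


def load_questions(conn, csv_text):
    """Create the questions table and fill it from CSV text with a header row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS questions (subject TEXT, question TEXT, answer TEXT)"
    )
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.executemany(
        "INSERT INTO questions (subject, question, answer) "
        "VALUES (:subject, :question, :answer)",
        rows,
    )
    conn.commit()
    return len(rows)


def random_questions(conn, subject, count):
    """Return up to `count` random (question, answer) pairs for one subject."""
    # SQLite spells this ORDER BY RANDOM(); the MySQL equivalent is ORDER BY RAND()
    cur = conn.execute(
        "SELECT question, answer FROM questions "
        "WHERE subject = ? ORDER BY RANDOM() LIMIT ?",
        (subject, count),
    )
    return cur.fetchall()
```

Because each row keeps the question next to its answer, the only pairing that has to match stays intact while everything else is drawn at random.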
I am designing a system with 30,000 objects or so and can't decide between two options: either have a JSON file pre-computed for each one and get the data by pointing to the URL of the file (I think Twitter does something similar), or have a PHP/Perl/whatever script that produces the JSON object on the fly when requested, from, let's say, a database, and sends it back. Is one better suited than the other? I guess if it takes a long time to generate the JSON data, it is better to have the JSON files already built. What if generating is as quick as accessing a database? Although I suppose one would have a dedicated table in the database specifically for that. The data doesn't change very often, so updating is not a constant thing. In that respect the data is static for all intents and purposes.
Anyway, any thoughts would be much appreciated!
Alex
You might want to try MongoDB, which stores and returns objects as JSON-like documents, is highly scalable, and is easy to set up.
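For comparison, the pre-computed option from the question can be sketched in a few lines of Python. This is only an illustration under assumptions: each object is a dict with an `id` field, and the output directory is whatever your web server exposes as static files. The function name is hypothetical.

```python
import json
from pathlib import Path


def write_json_files(objects, out_dir):
    """Write one <id>.json file per object and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for obj in objects:
        # Assumption: every object has a unique "id" usable as a file name
        path = out / f"{obj['id']}.json"
        path.write_text(json.dumps(obj), encoding="utf-8")
        written.append(path)
    return written
```

Since the data rarely changes, a job like this could be rerun only for the objects that were updated, and the web server then serves the files with no script in the request path.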
I want to store data about the files in a directory (file name, file size, etc.) in a file so that I can reduce search time. My problem now is to find an efficient way to do it. I have considered JSON and XML but can't decide between the two. Also, if there is a better way, let me know.
I'd say it depends on what kind of data you prefer to work with and on how your data is structured (very simple, such as a list of words; less simple, such as a list of words plus the number of times each word was searched, ...).
For a list of words you can use a simple text file with one word per line or comma-separated values (CSV); for a less simple structure, JSON or XML will work fine.
I like to work with JSON as it's lighter and less verbose than XML. If you don't plan to share this data and/or it isn't complex, you don't need the validation (XSD, ...) offered by XML.
And even if you plan to share this data, you can work with JSON.
You'll need some server-side code to write the data to a file, like PHP, Java, Python, Ruby, ...
I would recommend a JSON file if you use it almost like a properties file.
If you plan to load the data from the file into a database, then you can go for XML, where you have the option to use JAXB/JPA in a Java environment.
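A minimal sketch of the JSON route in Python, assuming you only need the name, size and modification time of the regular files directly inside one directory. The function names are made up for illustration.

```python
import json
import os


def build_index(directory):
    """Collect name, size and modification time for each file in a directory."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.is_file():
                info = entry.stat()
                entries.append(
                    {"name": entry.name, "size": info.st_size, "modified": info.st_mtime}
                )
    # Sort by name so the saved index is stable between runs
    return sorted(entries, key=lambda e: e["name"])


def save_index(directory, index_path):
    """Write the directory index to a JSON file and return it."""
    index = build_index(directory)
    with open(index_path, "w", encoding="utf-8") as f:
        json.dump(index, f, indent=2)
    return index
```

A later search can then read the index with `json.load` instead of touching the file system again, as long as the index is rebuilt when the directory changes.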
I'm a real beginner when it comes to this, so I apologize in advance.
The long and short of what I am looking for is a fairly simple concept: I want to pull JSON data off a server, parse it, and load it into Excel, Access, or some other type of table. Basically, I want to be able to store the data so I can filter, sort, and query it.
To make matters a little more complicated, the server will only return truncated results with each JSON, so it will be necessary to make multiple requests to the server.
Are there tools out there or code available which will help me do what I am looking for? I am completely lost, and I have no idea where to start.
(please be gentle)
I'm glad to see this question because I'm doing something very similar! Based on what I've gone through, it has a lot to do with how the tables are designed and linked together in the first place, and then with the mapping between those tables and the JSON objects at different depths or positions in the original JSON file. Once the mapping rules are clear, the code can be written by simply hard-coding the mapping (I mean something like: if you get a JSON object under a certain parent, then you save the data into a certain table or tables), provided you're using a high-level JSON parsing library.
OK, as I have to dash home from the office now:
Assuming that you are going to use Excel to parse the data, you are going to need:
1. A JSON parser, e.g. JSON Parser for VBA
2. Some code to download the JSON
3. A loop of VBA code that goes through each file and parses it into a sheet.
Is this ok for a starter? If you are struggling let me know and I will try and knock something up a little better over the weekend.
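The same download-and-loop idea can also be sketched outside VBA. Below is a rough Python version that keeps requesting pages until the server returns nothing more. The paging scheme is an assumption, since the question doesn't say how the server truncates results: here each request takes a hypothetical `page` query parameter and returns a JSON array, empty once you run past the end.

```python
import json
import urllib.request
from urllib.parse import urlencode


def fetch_page(base_url, page):
    """Download one page of results and return the parsed JSON."""
    # Assumption: the server accepts a "page" query parameter
    url = f"{base_url}?{urlencode({'page': page})}"
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all(get_page, max_pages=1000):
    """Call get_page(1), get_page(2), ... until a page comes back empty."""
    records = []
    for page in range(1, max_pages + 1):
        batch = get_page(page)
        if not batch:
            break
        records.extend(batch)
    return records
```

A call might look like `fetch_all(lambda p: fetch_page("https://example.com/api/items", p))`, where the URL is a placeholder. The combined list can then be written to CSV with the csv module and opened in Excel or imported into Access.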