Modify CSV via Script - csv

I need to modify CSV files via automated scripts, and I need help deciding which direction to look in and what language the script should be written in.
Situation: I have a simple CSV file, but I need an automated script that can edit certain fields and fill in blank ones with whatever I specify. What should my starting point be, and what kind of developer should I look for? Which coding language should he or she be knowledgeable in?
Thank you!!

Maybe you are looking for CSVfix, a tool for manipulating CSV data in the command shell. Take a look here: https://code.google.com/p/csvfix/
With it you can, among other things:
Reorder, remove, split and merge fields
Convert case, trim leading & trailing spaces
Search for specific content using regular expressions
Filter out duplicate data or data on exclusion lists
Enrich with data from other sources
Add sequence numbers and file source information
Split large CSV files into smaller files based on field contents
Perform arithmetic calculations on individual fields
Validate CSV data against a collection of validation rules
Convert between CSV and fixed format, XML, SQL and DSV
I hope this helps you out,
best regards,
Jürgen Jester
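
Should you end up writing the script yourself, Python's built-in csv module covers exactly the task described above (editing certain fields and filling in blanks). A minimal sketch, where the file names, the "status" column, and the "N/A" default are illustrative placeholders for your own:

```python
import csv

# Rewrite input.csv into output.csv, editing one column and filling blanks.
with open("input.csv", newline="") as src, \
     open("output.csv", "w", newline="") as dst:
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        # Edit a certain field (hypothetical column name and values).
        if row["status"] == "old":
            row["status"] = "new"
        # Fill in blank fields with whatever you specify.
        for field, value in row.items():
            if value == "":
                row[field] = "N/A"
        writer.writerow(row)
```

DictReader/DictWriter take care of headers and quoting for you, so the script stays short even as the editing rules grow.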

Related

Kibana Searches Using CSV file

I want to search Kibana against the contents of a CSV file.
I have a large set of logs that I want to search for a specific parameter (destination IP), checking that parameter against a CSV file containing all the IPs.
Can anybody give me some references to relevant documentation that might help me get the required output?
How many IPs do you have in your CSV file? In general this sounds like a job for a terms query, which by default is limited to 65,536 terms.
If it's a long list, it might be simpler to do this programmatically against Elasticsearch directly than to paste large queries with many elements around.
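If you do go the programmatic route, here is a minimal sketch using the official Elasticsearch Python client (8.x-style API); the index name, field name, CSV layout, and cluster URL are all assumptions to adapt:

```python
import csv

from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed local cluster

# Read the IPs from the CSV (assumed: one IP per row, first column).
with open("ips.csv", newline="") as f:
    ips = [row[0] for row in csv.reader(f) if row]

# A terms query takes at most index.max_terms_count terms (65,536 by
# default), so send the list in chunks if it is longer than that.
CHUNK = 60000
for i in range(0, len(ips), CHUNK):
    resp = es.search(
        index="logs",  # assumed index name
        query={"terms": {"destination_ip": ips[i:i + CHUNK]}},
        size=100,
    )
    for hit in resp["hits"]["hits"]:
        print(hit["_source"])
```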

How to document report visualizations in Power BI?

I've been using DAX to help me document my Power BI file. Using DAX queries I've been able to record all the fields that exist in the file, including calculated and measured fields. In my documentation process I am also looking for a way to record the visualizations on the report, namely the charts and graphs. Unfortunately, no DAX query I've read about provides data such as the visualization title, what fields it uses, or what kind of graph it is. Is there any DAX query that provides this information, in whole or in part?
In addition to attempting to document with DAX, I have also looked at the raw data in the Power BI file (for those who may not know, you can rename your Power BI file from .pbix to .zip and view the raw data). The relevant files within the PBIX are either XML or JSON. Looking at ../Report/Layout.JSON specifically, I have seen JSON-formatted text that includes visualization data. Is there an easy way to extract this data and format it in a more readable fashion?
For clarity, I do not need the contents of the tables, but I would like a way to record what fields are being used in the visualizations, rather than what fields merely exist.
EDIT: I've found a workaround. It isn't efficient, and I would still appreciate any knowledge on this subject.
I mentioned going through the Layout file, renaming it to .JSON and poking at it in Notepad++. I've found that you can Ctrl+F for "displayName", "queryRef" and ""title\":show\":true,\"text\":\"". Break these all onto new lines and indent them with a tab (use Ctrl+H and replace with \n\t in Notepad++). These indent the JSON-formatted lines for Power BI pages, the fields called by visualizations, and the visualization titles (if they have any), respectively.
Save this document as .csv and load it into Excel by delimiting on tabs. Use your preferred process (I prefer Query Editor) to remove the other, non-indented rows. There may still be a lot of excess characters on the indented lines that need to be removed manually. At the end of this process, though, I ended up with 3 columns in Excel listing the aforementioned fields I've been looking for.
On a PBIX file with more than a dozen pages and several hundred dependent fields this process took about three hours. If there are any faster ways to do this, I would love to hear about them.
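For what it's worth, the same extraction can be scripted instead of done by hand in Notepad++. A rough Python sketch that reads the Layout file straight out of the .pbix; the key names (sections, visualContainers, config, singleVisual, projections) follow the commonly reported Layout structure and may differ between Power BI versions, so treat this as a starting point, not a spec:

```python
import json
import zipfile

# Read the Layout file straight out of the .pbix (it is a zip archive).
with zipfile.ZipFile("report.pbix") as z:       # assumed file name
    raw = z.read("Report/Layout")               # UTF-16-encoded JSON
layout = json.loads(raw.decode("utf-16"))

for section in layout.get("sections", []):
    page = section.get("displayName", "")
    for vc in section.get("visualContainers", []):
        # Each container's "config" is itself a JSON-encoded string.
        config = json.loads(vc.get("config", "{}"))
        visual = config.get("singleVisual", {})
        vtype = visual.get("visualType", "")
        # Fields used by the visual appear as queryRef projections.
        fields = [
            proj.get("queryRef", "")
            for role in visual.get("projections", {}).values()
            for proj in role
        ]
        print(page, vtype, fields)
```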
As you have noted, DAX doesn't help you in this case because it will tell you about the model rather than the visuals on the report pages. The Layout file works, but you have to parse it for the information you need. You could probably just pull that JSON file into Power BI and process it there to get the info you want. There are also third party tools that can help with this. I just looked at https://app.datavizioner.com/ and it lists the ID of the visual, the type of visual, and each field used in the visual. It is currently free and just requires you to upload a PBIT of your report. It doesn't have the title of the visual that we see, so you would have to find a way to map the IDs you see to the human-friendly title of the visuals if you need that.
See http://radacad.com/power-bi-helper. It can tell you tables and columns in use. It also can export a list of all tables, columns, formulas, and roles in your model.
If you want stuff on the visualizations and how they are configured, Layout.json is the only way I know. The file does open nicely in Power Query if you were so inclined to try to make something of it.
My new Power BI comparer tool documents the whole Power BI file (pbit). The "CompareVisuals" tab should provide you with all the information necessary.
It's also super fast: just fill in the path to the pbits (you can fill in the same path in both fields if you don't want to compare anything, but just want to analyze one file).
https://www.thebiccountant.com/2019/09/14/compare-power-bi-files-with-power-bi-comparer-tool/

Storing data of files in another file

I want to store the file data of a directory (i.e., file name, file size, etc.) in a file so that I can reduce search time. My problem now is finding an efficient way to do it. I have considered JSON and XML but can't decide between the two. If there is a better way, let me know.
I'd say it depends on what kind of data you prefer to work with and on the structure of your data (very simple, like a list of words; less simple, like a list of words plus the number of times each word was searched; ...).
For a list of words you can use a simple text file with one word per line, or comma-separated values (CSV); for a less simple structure, JSON or XML will work fine.
I like to work with JSON as it's lighter and less verbose than XML. If you don't plan to share this data and/or it isn't complex, you don't need the validation (XSD, ...) offered by XML.
And even if you do plan to share this data, you can still work with JSON.
You'll need some server-side code to write the data to a file, in PHP, Java, Python, Ruby, ...
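For example, a minimal Python sketch of the JSON option (the directory path and output file name are placeholders):

```python
import json
import os

# Collect name and size for every file in the directory (assumed path).
entries = []
for entry in os.scandir("some_directory"):
    if entry.is_file():
        entries.append({"name": entry.name, "size": entry.stat().st_size})

# Write the index once; later lookups read this small file instead of
# re-scanning the directory.
with open("file_index.json", "w") as f:
    json.dump(entries, f, indent=2)
```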
I would recommend a JSON file if you use it almost like a properties file.
If you plan to load the file's data into a database, then you could go for XML, where you have the option of using JAXB/JPA in a Java environment.

SSRS export format customizations and what are the limitations?

I have a 3rd-party system that requires the user to manually import new data whenever they choose. I have a view in MS SQL Server that has the fields in the exact order that is wanted.
This 3rd-party system needs the export file in a comma-quote format. For this I want every single field surrounded by quotes, not just the ones that contain the field delimiter (a comma).
I have worked with the configuration files to try to customize how CSV is exported. It seems the available options for the CSV renderer do not allow me to get to this format. Am I making this more difficult than it needs to be? What do I need to do to get a format like this?
Seeing as this report could be run without any parameters every time, I am contemplating setting something up with Python, as I could accomplish exactly what I want in a very small number of lines of code. However, it would be nice if I could use SSRS, as it takes away my need to figure out delivery of the export file and is also a simple enough interface that any user should be able to figure out how to use it.
Thanks.
MSSQL is a data source to get data out of. Since you are simply looking for a way to extract data from the database, a Python script that creates the file exactly as you wish would be the simplest solution. K.I.S.S. :)
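A minimal sketch of that Python route, writing every field quoted via csv.QUOTE_ALL; the connection string, view name, and output path are assumptions to replace with your own:

```python
import csv

import pyodbc  # pip install pyodbc

# Assumed connection string and view name; substitute your own.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;"
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM dbo.MyExportView")

with open("export.csv", "w", newline="") as f:
    # QUOTE_ALL surrounds every single field with quotes, not just the
    # ones containing the delimiter.
    writer = csv.writer(f, quoting=csv.QUOTE_ALL)
    writer.writerow([col[0] for col in cursor.description])  # header row
    writer.writerows(cursor.fetchall())
```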

What would be the best format for multiple data entry?

For example, I would like to input a list of students into a database. Can you suggest a format: a CSV file or Excel? Which would be best? Any suggestions would be appreciated! Thanks! :D
There are far more libraries available for a wider range of languages to work with CSV than Excel files, so if those are the only two choices, I'd strongly suggest CSV.
But if you're putting them into a database, you could probably write a nice tool to either prompt a human to enter all the data into the database, or scrape an existing source of student information to push into it. I mean, the whole point of having the database is so you can write queries, so you might as well write some tools for putting data in too. :)
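As a starting point, a minimal Python sketch that loads a students CSV into a database (SQLite here for brevity; the file name and column names are made up):

```python
import csv
import sqlite3

conn = sqlite3.connect("school.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS students (name TEXT, email TEXT, year INTEGER)"
)

# Assumed CSV header row: name,email,year
with open("students.csv", newline="") as f:
    rows = [(r["name"], r["email"], r["year"]) for r in csv.DictReader(f)]

conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)
conn.commit()
conn.close()
```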