How to work with Unstructured Excel Column/Header names which are spread across multiple rows using R - data-analysis

There is an excel sheet http://www.censusindia.gov.in/2011census/C-series/C08.html
Please refer to the Excel sheet for row "India", column "C-08".
I want to analyse these data in R. However, the Excel headers (column names) are unstructured. Some headers are in the first row; others are in the 2nd, 3rd, or 4th row. Beneath the 4th row is the first subset of data we want to generate graphs from, and there are more subsets further down the sheet, each separated from the next by an empty row. In its current form the sheet can't be analysed directly in R.
Please suggest a solution to this issue.
Thank you so much in advance!!
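
The layout described above (several header rows, then data blocks separated by empty rows) can be untangled in two steps that are the same in any language. The sketch below uses Python rather than R and works on rows already read into memory. The function names and the tiny sample sheet are made up for illustration and are not taken from the census file. It assumes merged header cells arrive with the value in their first cell only and blanks after it.

```python
from typing import List, Optional

Cell = Optional[object]


def is_blank(cell: Cell) -> bool:
    return cell is None or cell == ""


def flatten_headers(header_rows: List[List[Cell]], sep: str = "_") -> List[str]:
    """Combine several header rows into one name per column."""
    filled = []
    for row in header_rows:
        last = None
        new_row = []
        for cell in row:
            # Carry a merged header to the right until the next label appears.
            if not is_blank(cell):
                last = str(cell).strip()
            new_row.append(last)
        filled.append(new_row)

    names = []
    for parts in zip(*filled):
        kept = []
        for part in parts:
            # Skip blanks and immediate repeats of the same label.
            if part is not None and (not kept or kept[-1] != part):
                kept.append(part)
        names.append(sep.join(kept))
    return names


def split_blocks(rows: List[List[Cell]]) -> List[List[List[Cell]]]:
    """Split data rows into blocks wherever a completely empty row occurs."""
    blocks, current = [], []
    for row in rows:
        if all(is_blank(cell) for cell in row):
            if current:
                blocks.append(current)
                current = []
        else:
            current.append(row)
    if current:
        blocks.append(current)
    return blocks


# Hypothetical sample: two header rows, then two blocks split by an empty row.
sheet = [
    ["Area", "Literate", None, "Illiterate", None],
    [None, "Males", "Females", "Males", "Females"],
    ["A", 1, 2, 3, 4],
    ["B", 5, 6, 7, 8],
    [None, None, None, None, None],
    ["C", 9, 10, 11, 12],
]
names = flatten_headers(sheet[:2])
blocks = split_blocks(sheet[2:])
```

For the real sheet the number of header rows would be 4, as described in the question. A header that is merged vertically and sits to the right of a wider merged header would need extra handling that this sketch does not attempt.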

Related

Google Sheets - copy columns, create new columns in another sheet and paste in first position

The spreadsheet has four columns with prices from my stores and updates them daily. I need some automation so that, when I click on it, the data from these columns is copied and pasted into another sheet (same file). The data should be arranged so that the latest 4 columns are always at the beginning of the second sheet.
Does anyone have an idea?

Apps Script copy data into formatted area?

So I'm sending data from an app to a Google Sheet and then trying to format the data. If I pre-format the cells in anticipation of the data, the rows of data end up being appended below the last formatted row. I'm also trying to base other cells on the anticipated data that is being imported, but the rows get appended below the last row with data, even if that data isn't in the way. For example, if I'm always importing 1 row and 3 columns at a time, I would like to prefill column 4 onwards with formulas and have my data slot into the same row. Is this possible with Google Apps Script? Or is the only way to have the row of data appended after the last row in the sheet? Ideally, I'd like it to be the first available row with the first 3 columns empty.
Using appendRow when you want the new row to get the same formatting as the rows above is "tricky". The safe way to go is to have the script set the format of the appended row. One way to do this is by using copyTo with SpreadsheetApp.CopyPasteType.PASTE_FORMAT.
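
The other half of the question, finding "the first available row with the first 3 columns empty", is plain row-scanning logic. Below is a minimal sketch of that logic in Python over an in-memory grid. It is not Apps Script and does not touch a real sheet; `first_free_row` and `insert_data` are hypothetical helper names.

```python
def first_free_row(grid, width=3):
    """Index of the first row whose first `width` cells are all empty."""
    for i, row in enumerate(grid):
        if all(cell in (None, "") for cell in row[:width]):
            return i
    return len(grid)  # no free row: the data would go after the last row


def insert_data(grid, values):
    """Write `values` into the first free row, keeping any prefilled cells to the right."""
    i = first_free_row(grid, len(values))
    if i == len(grid):
        grid.append([None] * len(values))
    row = grid[i]
    if len(row) < len(values):
        row.extend([None] * (len(values) - len(row)))
    row[:len(values)] = values
    return i


# Column 4 is prefilled with formulas; columns 1-3 wait for imported data.
grid = [
    ["a", "b", "c", "=SUM(A1:C1)"],
    [None, None, None, "=SUM(A2:C2)"],
    [None, None, None, "=SUM(A3:C3)"],
]
written_at = insert_data(grid, [1, 2, 3])
```

In a real script the same scan would run over the values read from the sheet, and the write would target that row instead of appending.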

Reference subsets (columns) of a table in Excel

I have a very large table in Excel with 200 columns about some students.
The first 10 columns identify the student by name, age, etc.
The remaining 190 columns can be split into categories.
So my question is whether it is possible to create new sheets that duplicate some columns from the table, so that I can split the table into 10 tables instead of having everything in the same table?
I know I can do this manually, but the problem is that the data set will be updated in the future, so I wonder if it is possible to use references or something like that?
If it is not possible, how would you solve such a problem? Would you load everything into a database (MySQL? Oracle?) and then extract it into the sheets? The problem is also that I have to create some additional columns in each sheet, so I cannot just overwrite all content in a sheet.
Because you're looking for subsets of your data, you could convert the Range into a Table, load it into Excel Power Query and remove the columns you don't need.
Then load the result into another sheet of your choice.
You can repeat this process as often as needed.
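
The column selection behind that suggestion (the identifying columns plus one category's columns per output table) is sketched below in Python. This is not Power Query; `split_table`, the column names and the sample row are all invented for illustration.

```python
def split_table(header, rows, id_cols, groups):
    """Build one table per group: the id columns followed by that group's columns."""
    index = {name: i for i, name in enumerate(header)}
    tables = {}
    for group, cols in groups.items():
        wanted = [index[c] for c in list(id_cols) + list(cols)]
        table = [[header[i] for i in wanted]]
        table += [[row[i] for i in wanted] for row in rows]
        tables[group] = table
    return tables


# Hypothetical student table with two identifying columns and two categories.
header = ["name", "age", "math", "physics", "football"]
rows = [["Ann", 15, 90, 85, "yes"]]
tables = split_table(
    header,
    rows,
    id_cols=["name", "age"],
    groups={"grades": ["math", "physics"], "sports": ["football"]},
)
```

Because each output is rebuilt from the source table, rerunning the split after the source is updated refreshes every subset. Extra columns added by hand to an output sheet would have to be kept outside the regenerated range.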

Select specific cells from multiple spreadsheets into SQL using SSIS

I need to loop through a series of spreadsheets (all in the same folder), pulling data from the same cells within the same named range in each, into an existing SQL database, using SSIS (SQL Server 2008 R2).
I started by using the information in How to loop through Excel files and load them into a database using SSIS package? as a point of reference.
However, because my files don't run in a strict columnar format (i.e. the whole of column C plus the whole of column E, etc.), I am struggling with it.
My sheet is as follows:
Basically, the area outlined in red (A6:E11) will be the named range (done this way to allow for additional rows as we move forward) and the yellow cells are those that I need to import.
Let's assume that the range will be named "My_Range"
I need to import a row into the database for each of the rows in the range (currently rows 6 through 11).
e.g.
DBase: Col1, Col2, Col3, Col4
Row 1 = B3....B4....C6....E6
Row 2 = B3....B4....C7....E7
Row 3 = B3....B4....C8....E8
etc..
Any help would be greatly appreciated as I need to find the most efficient way to do this for up to 100 files per night.
If you can help me to get the correct data in the correct format from just 1 file, I can work on the multiple-file problem next.
Thanks guys.
One of the nifty things you can do with the Excel source in SSIS is define the actual range you want. So instead of saying you want "Sheet1", put in Sheet1$A5:E.
Just ignore the columns you don't want.
Something like this.
EDIT:
You might want to use an Excel script source to grab the first 2 rows if they are always in the same spot.
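
The row layout requested in the question (B3 and B4 repeated on every row, followed by columns C and E of each row in the range) can be sketched independently of SSIS. The Python below works on a grid already read into memory. `cell` and `extract_rows` are hypothetical helpers, and the demo grid simply stores each cell's own address as its value so the mapping is easy to check.

```python
def cell(grid, ref):
    """Value at an A1-style reference such as 'C6' (single-letter columns only)."""
    col = ord(ref[0].upper()) - ord("A")
    row = int(ref[1:]) - 1
    return grid[row][col]


def extract_rows(grid, first_row=6, last_row=11):
    """One output row per range row: (B3, B4, C<r>, E<r>)."""
    fixed = (cell(grid, "B3"), cell(grid, "B4"))
    return [
        fixed + (cell(grid, f"C{r}"), cell(grid, f"E{r}"))
        for r in range(first_row, last_row + 1)
    ]


# Demo grid covering A1:E11 where every cell holds its own address.
grid = [[f"{c}{r}" for c in "ABCDE"] for r in range(1, 12)]
records = extract_rows(grid)
```

The row bounds are hard-coded to 6 through 11 here. With a named range that grows over time, they would come from the range itself.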

Extracting Values from 300 sheets where vlookup doesn't work

I have an excel spreadsheet which has just under 300 sheets in it. The first sheet has a list of items which are all numbered 1.1.1, 1.1.2 etc. The rest of the sheets have some of the items listed on them and not in numerical order. I am trying to extract the quantity and total listed against these items on all the different sheets.
The sheets are complicated by the fact that they are not well structured, so they have section titles that span merged cells.
I could get this information by hand using the search facility in excel and visit each instance of the number and then add up all the quantities and totals by hand. Is there any way I can automate this? i.e. by asking excel to take each unique identifier from sheet 1, find it in the rest of the sheets and return the quantity and/or total?
I tried using vlookup but it only seemed to return one of the values and ignore all the others.
Even a formula in which I had to change the unique identifier by hand would be much quicker!
Thank you for any help you can give. I am not a programmer so constructing the vb by myself would probably take longer than doing it by hand!
If you add a $ before the column and row parts of your lookup range, for example
search - $A$2:$AZ$1000
then you are telling Excel to always look in that fixed range.
Without the $, the range shifts when you copy the formula down, so it starts excluding the rows above it.
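
VLOOKUP returns only the first match, which is why it cannot add up an item that appears on several sheets. The aggregation the question asks for is sketched below in Python, as an illustration of the logic rather than an Excel formula or VBA. It assumes each sheet has already been read into a list of rows and that the identifier, quantity and total sit in fixed columns, which may not hold for the real workbook. `totals_by_item` and the sample data are invented.

```python
import re
from collections import defaultdict

# Matches identifiers such as 1.1.1 or 2.10.3; section titles do not match.
ITEM_ID = re.compile(r"^\d+(\.\d+)+$")


def totals_by_item(sheets, id_col=0, qty_col=1, total_col=2):
    """Sum quantity and total per item identifier across all sheets."""
    sums = defaultdict(lambda: [0.0, 0.0])
    for rows in sheets.values():
        for row in rows:
            if len(row) <= max(id_col, qty_col, total_col):
                continue  # row too short to hold all three columns
            key = row[id_col]
            if not isinstance(key, str) or not ITEM_ID.match(key.strip()):
                continue  # skip section titles and blank rows
            qty, total = row[qty_col], row[total_col]
            if isinstance(qty, (int, float)) and isinstance(total, (int, float)):
                sums[key.strip()][0] += qty
                sums[key.strip()][1] += total
    return {k: tuple(v) for k, v in sums.items()}


# Hypothetical workbook: items appear on several sheets, not in numerical order.
sheets = {
    "Sheet2": [["Section A", None, None], ["1.1.2", 3, 30.0], ["1.1.1", 1, 10.0]],
    "Sheet3": [["1.1.1", 2, 20.0]],
}
result = totals_by_item(sheets)
```

Rows whose first cell is not an item number, such as merged section titles, are skipped, so the poorly structured sheets do not need cleaning first.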