I have a MySQL database. I upload a set of data to a table from a .csv or .xls file. On retrieval, I need to get only the values I most recently uploaded to the table, not all the values already stored in it (only the values recently uploaded should be retrieved). How can I do that using Laravel?
(1) In the table where you store the data from the CSV/XLS files, use a migration to add a flag that marks the records you haven't processed yet, e.g. is_processed (see the sketch below). Give it a default value of 0.
(2) If you have data you have already retrieved (processed), set those rows to 1 so they don't get processed again. You can do this manually with a query, based on the conditions that make sense in your case.
(3) Then, any new data you store will automatically have is_processed = 0 (the default), which means you can identify it afterwards like this:
$unprocessedData = MyModel::where('is_processed', 0)->get();
You can also order them if you need to, but - in general - that's the approach I would take.
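For step (1), a minimal migration sketch could look like this (my_table and the class name are just placeholders for your actual import table):

<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

class AddIsProcessedToMyTable extends Migration
{
    public function up()
    {
        Schema::table('my_table', function (Blueprint $table) {
            // New rows default to "not processed yet"
            $table->boolean('is_processed')->default(0);
        });
    }

    public function down()
    {
        Schema::table('my_table', function (Blueprint $table) {
            $table->dropColumn('is_processed');
        });
    }
}

And for step (2), once a batch has been handled, flip the flag so those rows aren't picked up again:
MyModel::where('is_processed', 0)->update(['is_processed' => 1]);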
Let's say you have a table/model named Users:
$getLatestUsers = Users::orderBy('created_at', 'desc')->take(5)->get();
This will take the 5 most recent records from the database.
Well, it depends on what you consider "recent", but you could tweak a constraint like this to fit your needs:
->where('created_at', '>=', Carbon::now()->subMinutes(5)->toDateTimeString())
This retrieves the elements created in the last 5 minutes, but Carbon offers a wide variety of subtraction helpers (subHours(), subDays(), and so on).
PS: Remember to add use Carbon\Carbon; under your namespace declaration.
Hope this helps you.
$latest = Users::orderBy('created_at', 'desc')->take(10)->get();
This will return the 10 most recent records.
Related
On our Wordpress site, we use a plugin called s2member and it stores the levels (roles) of our clients as well as the times they were assigned a specific level in our database. I would like to create a table that shows when a user was assigned a specific level. I'm having a challenge getting the data I need because of the way the data is stored in the field. It stores all of the levels along with the associated dates and times when a user's level was changed in one field. In addition, it stores all of the times as Unix timestamps. Here's an example of a typical field associated with a client:
a:20:{s:15:"1562695223.0001";s:6:"level0";s:15:"1562695223.0002";s:6:"level1";s:15:"1562695223.0003";s:6:"level2";s:15:"1562695223.0004";s:6:"level3";s:15:"1577906312.0001";s:11:"ccap_prepay";s:15:"1596575898.0001";s:12:"-ccap_prepay";s:15:"1596575898.0002";s:13:"ccap_graduate";s:15:"1596575898.0003";s:11:"ccap_prepay";s:15:"1596575898.0004";s:7:"-level3";s:15:"1597196952.0001";s:14:"-ccap_graduate";s:15:"1597196952.0002";s:12:"-ccap_prepay";s:15:"1597196952.0003";s:13:"ccap_graduate";s:15:"1597196952.0004";s:11:"ccap_prepay";s:15:"1598382433.0001";s:14:"-ccap_graduate";s:15:"1598382433.0002";s:12:"-ccap_prepay";s:15:"1598382433.0003";s:11:"ccap_prepay";s:15:"1598382433.0004";s:6:"level3";s:15:"1605290551.0001";s:12:"-ccap_prepay";s:15:"1605290551.0002";s:11:"ccap_prepay";s:15:"1605290551.0003";s:13:"ccap_graduate";}
There are four columns in this table: umeta_id; user_id; meta_key; meta_value. The data above is stored in the meta_value column.
You'll notice that it also has multiple ccap_* entries. CCAP stands for custom capability, and I would like to be able to chart those assignments and their associated times as well.
Do you have any idea how I can accomplish this?
Thank you for any help you can give.
I talked to an engineer about this, and he told me I would need to learn Python, and I believe he said I would also need to learn Pandas and NumPy to extract the data I need, but he wasn't exactly sure. I started taking a data analyst course on Coursera, but I still haven't learned what I need and it's already been several months. It would be great if someone could provide a solution that I could implement more quickly and use on an ongoing basis.
If there's a way to accomplish my goal by exporting this table to a CSV file and using Microsoft Excel or Google Sheets, I'm open to that too.
Here's an image of the table (if it helps): [screenshot: database table]
Here's an example of my desired output: [screenshot: desired output]
In my desired output, I used Excel and created one column that converts the Unix timestamp to a short date and another column where I used a nested IF statement to convert the CCAP or level code to the meaning we use internally.
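(For reference, the meta_value shown above is a standard PHP-serialized array mapping a timestamp-style key to each level/ccap change, so a short PHP sketch along these lines could unpack it into the date/role rows described; meta_value.txt is just a placeholder for an export of that column's contents.)

<?php
// Rough sketch: unpack one exported s2member meta_value into "date,role" rows.
// meta_value.txt is a placeholder for wherever you export the serialized field.
$raw = trim(file_get_contents('meta_value.txt'));

$events = unserialize($raw); // keys like "1562695223.0001", values like "level3" or "-ccap_prepay"
if ($events === false) {
    exit("Could not unserialize the field.\n");
}

foreach ($events as $key => $role) {
    // The integer part of the key is the Unix timestamp of the change.
    $date = date('Y-m-d H:i:s', (int) $key);
    echo $date . ',' . $role . PHP_EOL;
}

You could redirect that output to a CSV and keep your existing Excel mapping from level/CCAP codes to the names you use internally, or add a lookup array to the script.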
I have multiple archive tables storing similar kinds of data, archived month by month. Now the requirement is to get all the archived data into one table instead of multiple tables.
I am doing this with a Union All in SSIS; however, the rows seem to be inserted into the destination table in random order.
Attached is the route taken for the transformation.
I want to control the priority/order of the inserts. Please suggest!
You can add an extra "Priority" column to each of the OLE DB sources, with the corresponding priority for each source, and then add a Sort component after the Union All that sorts the data by Priority. But if you have a lot of data, that would be really inefficient, because the Sort component waits until all the source data has been read.
I would suggest writing a proper source SQL statement that does the union/prioritization/sort for you, and then inserting into the target.
Also, if the sources are on different servers, you can create a Foreach Loop container that iterates through the source tables and inserts all of them into the target table. You can use this article for reference.
I have a job in Talend that is designed to bring together some data from different databases: one is a MySQL database and the other an MSSQL database.
What I want to do is match a selection of loan numbers from the MySQL database (about 82,000 loan numbers) to the corresponding information we have housed in the MSSQL database.
However, the tables in MSSQL to which I am joining the data from MySQL are much larger (~ 2 million rows), are quite wide, and thus cost much more time to query. Ideally I could perform an inner join between the two tables based on the loan number, but since they are in different databases this is not possible. The inner join that is performed inside a tMap occurs after the Lookup input has already returned its data set, which is quite large (especially since this particular MSSQL query will execute a user-defined function for each loan number).
Is there any way to create a global variable out of the output from the MySQL query (namely, the loan numbers selected by the MySQL query) and use that global variable as an IN clause in the MSSQL query?
This should be possible. I'm not working in MySQL but I have something roughly equivalent here that I think you should be able to adapt to your needs.
I've never actually answered a Stack Overflow question before, and while I was typing this the page started telling me I need at least 10 reputation to post more than 2 pictures/links, and I think I need 4 pictures. So I'm just going to write it out in words here and post the whole thing, complete with illustrations, on my blog in case you need more info (quite likely, I should think!).
As you can see, I've got some data coming out of the table and getting filtered by tFilterRow_1 to only show the rows I'm interested in.
The next step is to limit it to just the field I want to use in the variable. I've used tMap_3 rather than a tFilterColumns because the field I'm using is a string and I wanted to be able to concatenate single quotes around it; if you're using an integer you might not need to do that. And of course, if you have a lot of repetition, you might also want to put a tUniqRow in there to cut out the duplicates.
The next step is the one that does the magic. I've got a list like this:
'A1'
'A2'
'B1'
'B2'
etc, and I want to turn it into 'A1','A2','B1','B2' so I can slot it into my where clause. For this, I've used tAggregateRow_1, selecting "list" as the aggregate function to use.
Next up, we want to take this list and put it into a context variable (I've already created the context variable in the metadata; you know how to do that, right?). Use another tMap component, feeding into a tContextLoad component. tContextLoad always has two columns in its schema, so map the output of the tAggregateRow to the "value" column and enter the name of the variable in the "key" column. In this example, my context variable is called MyList.
Now your list is loaded as a text string and stored in the context variable, ready for retrieval. So open up a new database input component and embed the variable in the SQL like this:
"SELECT DISTINCT MY_COLUMN
FROM MY_SECOND_TABLE
WHERE the_selected_row IN (" + context.MyList + ")"
It should be as easy as that, and when I whipped it up it worked first time, but let me know if you have any trouble and I'll see what I can do.
I have created a CSV from a set of files in a directory that are numbered incrementally:
img1_1.jpg, img1_2.jpg ... img1_1999.jpg, img1_2000.jpg
The CSV output is like so:
filename, datetime
eg:
img1_1.JPG,2011-05-11 09:16:33.000000000
img1_3.jpg,2011-05-11 10:10:55.000000000
img1_4.jpg,2011-05-11 10:17:31.000000000
img1_6.jpg,2011-05-11 10:58:37.000000000
The problem is, there are a number of files missing in the listing, as some of the files don't exist. As a result, when imported, the actual row number does not match the file number.
Can anyone think of a reasonably efficient way to insert the missing rows so that the row number and filename matches up other than manually inserting rows for the missing ones? (There are over 800 missing rows).
Background
A previous programmer developed an uploader script and did not save the creation time of the mysql record in the database. I figured the easiest way to find the creation time for the majority of the records would be to output a directory listing of all the files and combine them in a spreadsheet.
You need to do exactly what you wrote in your comment in reply to @tadman.
Write a text parser script that injects the missing lines with, e.g., a date/time value indicating that the record is an empty one, i.e. that there is no real data behind it (e.g. date it 1950-01-01 00:00:00). When that is done, bulk import the CSV. I think this is the best and most efficient solution.
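A rough sketch of such a parser, in PHP (assuming PHP is handy given the MySQL uploader; the file names, the img1_ prefix and the 2000 upper bound are assumptions you'd adjust):

<?php
// Sketch: rebuild the CSV with one row per expected file, using a sentinel
// datetime for the missing ones.
$existing = [];
foreach (file('images.csv', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
    [$name, $datetime] = str_getcsv($line);
    // Normalise the extension so img1_1.JPG and img1_1.jpg are treated the same.
    $existing[strtolower($name)] = $datetime;
}

$out = fopen('images_filled.csv', 'w');
fputcsv($out, ['filename', 'datetime']);
for ($i = 1; $i <= 2000; $i++) {
    $name = "img1_{$i}.jpg";
    // Real datetime if the file was listed, otherwise the "empty record" sentinel.
    $datetime = $existing[$name] ?? '1950-01-01 00:00:00.000000000';
    fputcsv($out, [$name, $datetime]);
}
fclose($out);

Then bulk import images_filled.csv and the row numbers will line up with the file numbers again.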
Also, think about any future insert/delete/update events that might occur on your data.
Those could break the chain you initially had, so you might prefer instead to introduce a numeric field for the JPEG IDs (and index that field), and leave the PK as is (auto-increment).
In that case you avoid the CSV manipulation, as well as being chained to your auto-increment PK (meaning you will not get into trouble if a new JPEG arrives with an ID that was previously deleted, an existing ID, etc.).
So the solution really depends on how you want to use this table in the future. If you give more details, I am sure the community can come up with even more ideas.
If it's a one-time thing, it might be easiest to open up your CSV in a spreadsheet.
If your table above is in Sheet1, you could put something like the following in Sheet2 (this is OpenOffice, but there are similar functions in Excel):
pre_filename | filename | datetime
img1_1 | = A2&".JPG" | =OFFSET(Sheet1.$B$1;MATCH(B2;Sheet1.$A$2:$A$4;0);0)
You should be able to select the three cells above and drag them down to however many you need.
I am writing an SSIS package to import data from *.csv files into a SQL Server 2008 DB. The problem is that one of the files contains duplicate records, and I want to extract only the distinct values from that source. Please see the image below.
Unfortunately, the generated files are not under my control; they are owned by a third party, and I cannot change the way they are generated.
I did use the Lookup component, but it only checks the incoming data against the existing data; it does not catch duplicate records within the incoming data itself.
I believe the sort component gives an option to remove duplicate rows.
Depends on how serious you want to get about the duplicates. Do you need a record of what was duplicated, or is it enough just to get rid of them? The Sort component will get rid of duplicates on the sort field. However, the duplicates may have different data in the other fields, and then you want a different strategy. Usually I load everything into staging tables and clean up from there. I send the removed duplicates to an exception table (we have to answer a lot of questions from our customers about why things don't match what they sent), and I often use a set of business rules (enforced with either an Execute SQL task or data flow tasks) to determine which record to pick when there are duplicates in one area but not another (say, two business addresses when we can only store one). I also make sure the client is aware of how we determine which of the two to pick.
Use the Sort tool from the Toolbox for that, then double-click it; you will see all the available input columns.
Check the column, set the sort type/direction, and then tick "remove rows with duplicate sort values".
Bring in the data from the CSV file as it is, then dedup it after it's loaded.
It'll be easier to debug, too.
I used the Aggregate component and grouped by both QualificationID and UnitID. If you want, you can also use the Sort component. Perhaps this information will help others.