When performing long-running queries on Aurora Serverless I have seen the following errors after a few minutes:
ERROR 1080 (08S01) Forcing close of thread
ERROR 2013 (HY000) Lost connection to MySQL server during query
The queries use MySQL's LOAD DATA LOCAL INFILE to load large (multi-GB) data files into the database.
How can I avoid these errors?
To solve this, you can change the parameter group item net_write_timeout to a more suitable value. Here are instructions for completing the steps from the console:
1. Go to the RDS Console
2. Click "Parameter Groups" in the left pane
3. Click "Create Parameter Group"
4. On the Parameter Group Details page, for Type, choose DB Cluster Parameter Group; then give it a name and description, and click "Create"
5. Click the name of the parameter group you created in step 4
6. Search for "net_write_timeout"
7. Click the checkbox next to the parameter and click "Edit Parameters"
8. Change the value to an integer between 1 and 31536000 (the number of seconds you want it to wait before timing out), and click "Save Changes"
9. Click "Databases" in the left pane
10. Click the database and click "Modify"
11. Under Additional Configuration > Database Options > DB Cluster Parameter Group, select the parameter group you created in step 4, and click "Continue"
12. Select "Apply Immediately" and click "Modify Cluster"
Break up your large, multi-GB uploads into smaller chunks. Aurora works better (and faster) loading one hundred 10MB files at once rather than one 1GB file. Assuming your data is already in a loadable format:
Split the file into parts using split
split -n l/100 --additional-suffix="_small" big_file.txt
This results in 100 files like xaa_small, xab_small, etc.
Find the files that match the split suffix using find:
files=$(find . -name 'x*_small')
Loop through the files and load each one in parallel:
for file in $files; do
  # pipe a LOAD DATA statement into the mysql client and background it so the loads run in parallel
  echo "load data local infile '$file' into table my_table;" |
    mysql --defaults-file=/home/ubuntu/.my.cnf --defaults-group-suffix=test &
done
wait  # block until all of the background loads have finished
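If you'd rather not launch all 100 mysql clients at once, you can cap the concurrency with xargs instead of backgrounding every load. A sketch, using the same hypothetical my_table and defaults file as above:
# run at most 8 loads at a time; xargs substitutes each file name for {}
find . -name 'x*_small' -print0 |
  xargs -0 -P 8 -I{} sh -c \
    "echo \"load data local infile '{}' into table my_table;\" | mysql --defaults-file=/home/ubuntu/.my.cnf --defaults-group-suffix=test"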
Related
Disclaimer: new to SSIS and Active Directory
I have a need to extract all users within a particular Active Directory (AD) domain and import them into Excel. I have followed this: https://www.itnota.com/query-ldap-in-visual-studio-ssis/ in order to create my SSIS package. My SQL is:
LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=a*));Name,sAMAccountName
As you know, there is a 1,000-row limit when pulling from AD. In my SQL I currently have (name=a*) to test the process, and it works. I need to know how to set up a loop with variables to pull all records and import them into Excel (or whatever you experts recommend). Also, how do I know what other field names are available to pull?
Thanks in advance.
How do I see what's in Active Directory
Tool recommendations are off topic for the site, but a tool you can download, no install required, is AD Explorer. It's a Microsoft tool that allows you to view your domain. I highly recommend that people who need to see what's in AD use something like this, as it shows you your basic structure.
What's my domain controller?
Start -> Command Prompt
Type set | find /i "userdnsdomain", look for USERDNSDOMAIN, and put that value in the connect dialog. I save it because I don't want to enter it every time.
Search/Find and then look yourself up. Here I'm going to find my account by using my sAMAccountName
The search results show only one user, but there could have been multiple since I did a "contains" match.
Double-clicking the value in the bottom results section causes the lower pane to update with the details of the search result.
This is nice because, while the right side shows all the properties associated with my account, it has also updated the left pane to navigate to the CN. In my case it's CN=Users but, again, it could be something else in your specific environment.
You might discover an interesting categorization for your particular domain. At a very large client, I discovered that my target users were all under a particular CN (Common Name), so I could use that in my AD query.
There are things you'll see here that you'd sure like to bring into a data flow but won't be able to, like memberOf: it's a complex type and there's no equivalent for it among the data flow data types. I think Integer8 is also something that didn't work.
Loop the loop
The "trick" here is that we'll need to take advantage of the
The name of the AD provider has changed since I last looked at this. In VS 2017, I see the OLE DB Provider name as "OLE DB Provider for Microsoft Directory Service"
Put in your query and you should get results back. Let that happen so the metadata is set.
An ADO.NET source does not support parameterization the way the OLE DB source does. However, you can apply an Expression on the Data Flow task, which surfaces the component's query, and that's what we'll do.
Click out of the Data Flow and back into the Control Flow, right-click on the Data Flow, and select Properties. In that Properties window, find Expressions and click the ellipsis (...). Up pops the Property Expressions Editor.
Find the ADO.NET source under Property and, in the Expression column, click the ellipsis.
Here, we'll use your same source query just to prove we're doing the right things
"LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + "a" + "*));Name,sAMAccountName"
We're doing string building here so the problem we're left to solve is how we can substitute something for the "a" in the above query.
The laziest route would be to:
Create an SSIS variable of type String called CurrentLetter and initialize it to a
Update the expression we just created to be "LDAP://DC=JOHN,DC=JANE,DC=DOE;(&(objectCategory=person)(objectClass=user)(name=" + @[User::CurrentLetter] + "*));Name,sAMAccountName"
Add a Foreach Loop Container (FELC) to your Control Flow.
Configure the FELC with an enumerator of "Foreach Item Enumerator"
Click the Columns...
Click Add (this results in Column 0 with data type String), then click OK
Fill the collection with each letter of the alphabet
In the Variable Mappings tab, assign Variable User::CurrentLetter to Index 0
Click OK
Old blog posts on the matter because I like clicks
https://billfellows.blogspot.com/2011/04/active-directory-ssis-data-source.html
http://billfellows.blogspot.com/2013/11/biml-active-directory-ssis-data-source.html
Steps
1. Created a user database - DB1
2. Created a table - TB1 with columns - cm1 and cm2
3. Installed the extension - "Database Administration Tool Extension"
4. Relaunched the application
5. Right-clicked on the database - DB1
Observation -> I could see some options like 'Manage' but not Properties
6. Right-clicked on the column 'cm1' under the created table TB1
Observation -> I could see only the Refresh option but not Properties
7. Additional step: tried searching for other relevant extensions but couldn't spot any.
Question -> How can I get the Properties option enabled for databases and columns in Azure Data Studio on a Mac?
My DB has grown very large, to over 10 GB.
I see these tables:
emcxmp_wp_posts
zjrqwg_wp_posts
qtlmkn_wp_posts
shcjpe_wp_posts
stzbcj_wp_posts
tymbkf_wp_posts
ursnzw_wp_posts
vkhjml_wp_posts
oyjfup_wp_posts
voxfcz_wp_posts
xlhpaz_wp_posts
ybazlk_wp_posts
yjmify_wp_posts
ymsaun_wp_posts
yojkzl_wp_posts
yqlfun_wp_posts
wouevx_wp_posts
msyfsp_wp_posts
kqbjhz_wp_posts
kqjsio_wp_posts
lnfjsf_wp_posts
asvpky_wp_posts
bltyyt_wp_posts
cyuhqr_wp_posts
eudjso_wp_posts
And more. What happened, and why were these tables created?
This is the intended behaviour of the plugin WP Reset Pro.
It creates snapshots with table names like yours; the naming template for snapshot tables is {6_char_random_hex}{table_prefix_for_your_site}{original_table_name}.
If you do not need a specific snapshot on your WordPress site, you can quickly delete it with the plugin's own function:
Open Tools -> WP Reset -> Snapshots
Scroll down to the "User Created Snapshots" card
Select a snapshot & open the "Action" menu
Click "Delete snapshot"
Confirm that you want to delete it by clicking the red button
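If you want to see how much space those snapshot tables are taking before you delete them, you can check from the shell. A minimal sketch, assuming your normal prefix is wp_ and your database is named your_db (both placeholders); the LIKE pattern simply follows the naming template above:
# list snapshot tables (6 random characters + your normal prefix) with their size in MB
mysql -u root -p -e "SELECT table_name, ROUND((data_length + index_length)/1024/1024, 1) AS size_mb
  FROM information_schema.tables
  WHERE table_schema = 'your_db' AND table_name LIKE '______\_wp\_%'
  ORDER BY size_mb DESC;"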
I am new to Jmeter and trying to carry out the following flows:
User logs in with username and password
Page 1 is displayed with 10 invoices - the user selects ten invoices
10 AJAX calls are executed (invoice1, invoice2, invoice3, ...; a JSON file is generated with the invoices as the request)
Page 2 is displayed to view the invoices
User logs out
I have recorded the flow with the BlazeMeter plugin in Chrome.
The thread group in Jmeter has the following tasks:
I have 10 users in a file called users.txt and I am using CSV Data Set Config to load them.
For each user I will load only 10 invoices from invoices.txt, using CSV Data Set Config to load them.
Since I have 10 users and each user needs 10 invoices, my invoices.txt has 100 unique invoices.
Please find the CSV config for invoices below:
The problem is that I need each user to be assigned 10 unique invoices, and those 10 invoices cannot be allocated to another user.
Any idea how I can load 10 unique invoices for each user and make sure those invoices are not assigned again to another user?
invoices.txt should contain only unique IDs before the test starts; you can share the IDs using a CSV Data Set Config inside the users' loop with these attributes:
Sharing mode: All Threads - an ID won't be repeated
Recycle on EOF?: False - so you don't get an invalid ID (<EOF>)
Stop thread on EOF?: True - stop the thread when the file of unique IDs runs out
You can also consider using the HTTP Simple Table Server instead of the second CSV Data Set Config.
The HTTP Simple Table Server has a KEEP option; if you set it to FALSE, each used "invoice" will be removed, which guarantees uniqueness even when you run your test in Distributed (Remote) mode.
You can install the HTTP Simple Table Server (as well as any other JMeter plugin) using the JMeter Plugins Manager.
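For reference, this is roughly what the Simple Table Server calls look like once the plugin is running (a sketch, assuming the default STS port 9191 and invoices.txt sitting in the STS dataset directory):
# load the invoice file into the table server once, before the test starts
curl "http://localhost:9191/sts/INITFILE?FILENAME=invoices.txt"
# each call hands out the first remaining invoice; KEEP=FALSE removes it so no other user can receive it again
curl "http://localhost:9191/sts/READ?READ_MODE=FIRST&KEEP=FALSE&FILENAME=invoices.txt"
In the test plan itself you would issue the READ call from an HTTP Request sampler and pull the value out with a Regular Expression Extractor rather than curl.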
I have a task that I am working on that has me stumped; hoping you can help me. I am using a data flow task which is basically inserting a row into a SQLite table. I was doing this using a "SQL Task", but unfortunately the only way to successfully insert a GUID into the SQLite table is to convert it to a byte stream using the data flow task. I do not want to use a source database because my data is not flowing from one table to another. I really just want to take my populated variables, convert them to a byte stream, and insert them into the SQLite database. The issue is, I cannot use a data flow task without a source database.
My work-around so far has been to declare a source database/table with only one column (but never use it in the data flow). This works fine and I am able to insert the row into SQLite using my pre-set variables, but I am left with a somewhat annoying message in my Output log every time I do this:
Warning: 0x80047076 at , SSIS.Pipeline: The output column "" (117) on output "OLE DB Source Output" (11) and component "OLE DB Source" (1) is not subsequently used in the Data Flow task. Removing this unused output column can increase Data Flow task performance.
Anyone know of a good way to get this warning not to show up?
In your dataflow choose a Script Component.
When prompted to choose Source, Destination, or Transformation, choose Source.
Add your pre-populated variables to the CustomProperties.ReadOnlyVariables section of the Script tab.
Go to the Inputs and Outputs section.
Add a column to the default output for each of your variables.
In your script (if using C#), put something similar to the following in the CreateNewOutputRows() method:
// add a row to the default output and fill its columns from the package variables
Output0Buffer.AddRow();
Output0Buffer.ContainerName = Variables.ContainerName;
Output0Buffer.TaskName = Variables.TaskName;
Output0Buffer.TaskStartDate = Variables.ContainerStartTime;
Save your script.
Connect your script component to your destination object.
If this is causing your package execution to fail, you have the option of ignoring these warnings/errors.
Just double-click the Source block in the Data Flow, navigate to the last tab ("Error Output") in the left-side pane, and select the option to ignore the errors. (I don't know exactly which phrase in that option will do it.)