In the Preferences.sublime-settings file there is a setting called ensure_newline_at_eof_on_save, and I'd like to activate it only for non-minified files. It would be nice to pass it a regular expression like /min\.[^\.]+$/ so it skips those files, but that doesn't seem to work. Any suggestions?
EDIT
This is the contents of the trim_trailing_white_space.py file under the Default packages folder:
import sublime, sublime_plugin

class TrimTrailingWhiteSpace(sublime_plugin.EventListener):
    def on_pre_save(self, view):
        if view.settings().get("trim_trailing_white_space_on_save") == True:
            trailing_white_space = view.find_all("[\t ]+$")
            trailing_white_space.reverse()
            edit = view.begin_edit()
            for r in trailing_white_space:
                view.erase(edit, r)
            view.end_edit(edit)

class EnsureNewlineAtEof(sublime_plugin.EventListener):
    def on_pre_save(self, view):
        if view.settings().get("ensure_newline_at_eof_on_save") == True:
            if view.size() > 0 and view.substr(view.size() - 1) != '\n':
                edit = view.begin_edit()
                view.insert(edit, view.size(), "\n")
                view.end_edit(edit)
I have changed the second class to this implementation:
class EnsureNewlineAtEof(sublime_plugin.EventListener):
    def on_pre_save(self, view):
        if view.settings().get("ensure_newline_at_eof_on_save") == True:
            if ".min." not in view.name():
                if view.size() > 0 and view.substr(view.size() - 1) != '\n':
                    edit = view.begin_edit()
                    view.insert(edit, view.size(), "\n")
                    view.end_edit(edit)
But it still does not work.
This is actually pretty straightforward to do, and you don't even need regex to do it. Select Preferences -> Browse Packages... to open up Sublime's Packages folder in your operating system's file manager. Go into the Default folder and open trim_trailing_white_space.py in Sublime. Line 15 starts the third class definition in the file:
class EnsureNewlineAtEofCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        if self.view.size() > 0 and self.view.substr(self.view.size() - 1) != '\n':
            self.view.insert(edit, self.view.size(), "\n")
Delete the entire class (all four lines), then paste in the following:
class EnsureNewlineAtEofCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        if ".min." not in self.view.file_name():
            if self.view.size() > 0 and self.view.substr(self.view.size() - 1) != '\n':
                self.view.insert(edit, self.view.size(), "\n")
What I did was add a test to see if the view's filename contains .min. (as in foobar.min.js or bottom.div.min.css, for example). If the filename does contain the pattern, nothing happens. If it doesn't, a newline is added as normal. Note that the test uses self.view.file_name(); the view.name() call in the question's edit returns the name assigned to the buffer, which is usually empty for files on disk, so that check never matches and the newline is always added.
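If you'd rather keep the regular expression from the question instead of a plain substring test, the same check can be written with Python's re module. This is just a sketch of the test by itself; the helper name is made up for illustration:

import re

# The pattern from the question: matches names like "app.min.js" or "style.min.css"
MINIFIED = re.compile(r"min\.[^.]+$")

def is_minified(file_name):
    return bool(MINIFIED.search(file_name or ""))

assert is_minified("jquery.min.js")
assert not is_minified("app.js")

Inside the command you would then use if not is_minified(self.view.file_name()): in place of the substring check.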
Save the file (do not change the filename!), and restart Sublime for the changes to take effect.
Please note that the above instructions are for Sublime Text 2. ST3 does not have its packages directly available, so you'll need to install PackageResourceViewer from Package Control, select PackageResourceViewer: Open Resource from the Command Palette, then select Default and trim_trailing_white_space.py. Edit the file as above and save it, and the changes should take effect immediately, although you may wish to restart Sublime just for fun.
I'm new to reverse engineering. I'm trying to solve a HackTheBox challenge called RAuth with angr. I understand how to analyze and solve this challenge differently, without angr, but I really want to understand what is wrong with my angr script, or why angr may not be feasible for this (and similar) cases.
The application is simple: after it starts, it requests a password from stdin and exits if the password is incorrect:
>:~/challenges/rauth$ ./rauth
Welcome to secure login portal!
Enter the password to access the system:
wqeqwwqewqewqeqwewqeqweqweqweqweqwewqeqwe
You entered a wrong password!
When I run my script, it works for around 15 minutes and then dies with one of two errors:
"Killed"
"IndexError: tuple index out of range"
My angr script:
#!/usr/bin/env python
#coding: utf-8
import angr
import claripy
import time
import sys

def is_successful(st):
    output = st.posix.dumps(sys.stdout.fileno())
    if b'Successfully Authenticated' in output:
        return True
    return False

def should_abort(state):
    output = state.posix.dumps(sys.stdout.fileno())
    return b'You entered a wrong password!' in output

def main(round):
    print("Checking " + str(round))
    p = angr.Project('rauth', auto_load_libs=False)
    flag_chars = [claripy.BVS('flag_%d' % i, 8) for i in range(round)]
    flag = claripy.Concat(*flag_chars + [claripy.BVV(b'\n')])
    st = p.factory.full_init_state(
        args=['./rauth'],
        add_options=angr.options.unicorn,
        stdin=angr.SimPackets(name='stdin', content=[(flag, 32)])
        #stdin=flag,
    )
    st.options.add(angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY)
    st.options.add(angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS)
    for k in flag_chars:
        st.solver.add(k >= ord(" "))
        st.solver.add(k <= ord("~"))
    sm = p.factory.simulation_manager(st)
    #sm.explore(find=is_successful,avoid=should_abort)
    sm.explore(find=lambda s: b"Successfully" in s.posix.dumps(1),
               avoid=lambda s: b"wrong" in s.posix.dumps(1))
    if len(sm.found) > 0:
        for solution_state in sm.found:
            print("[>>] {!r}".format(solution_state.solver.eval(flag, cast_to=bytes)))
    else:
        print("[>>] no solution found :(")

if __name__ == "__main__":
    print(main(32))
Am I using angr for a case where it's not applicable? What am I missing?
I also tried playing with the options, like removing unicorn, enabling auto_load_libs, switching between the lambdas and the named callback functions, etc. No success.
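One knob not tried above that may be worth a look for the "Killed" error (usually the out-of-memory killer reaping the process) is capping how many states stay active at once. This is only a sketch, not a verified fix for this binary: it replaces the explore call in the script above with one driven by angr's built-in depth-first exploration technique, which keeps a single active state and defers the rest:

# Sketch: explore depth-first to keep memory growth down.
# `p` and `st` are the project and initial state from the script above.
sm = p.factory.simulation_manager(st)
sm.use_technique(angr.exploration_techniques.DFS())
sm.explore(find=lambda s: b"Successfully" in s.posix.dumps(1),
           avoid=lambda s: b"wrong" in s.posix.dumps(1))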
I want to load multiple CSV files matching certain names into a dataframe. Currently I am looping through the whole folder, creating a list of file names, loading those CSVs into a list of dataframes, and then concatenating that list into one dataframe.
The approach I want to use (if possible) is to bypass all that code and read all the files with a one-liner.
I know this can be done easily for single level of subfolders, but my subfolder structure is as follows
Root Folder
├── Subfolder1
│   └── Subfolder 2
│       ├── X01.csv
│       ├── Y01.csv
│       └── Z01.csv
├── Subfolder3
│   └── Subfolder4
│       ├── X01.csv
│       └── Y01.csv
└── Subfolder5
    ├── X01.csv
    └── Y01.csv
I want to read all "X01.csv" files while reading from the Root Folder.
Is there a way I can read all the required files with code something like the below?
filepath = "rootpath" + "/**/X*.csv"
df = spark.read.format("com.databricks.spark.csv").option("recursiveFileLookup","true").option("header","true").load(filepath)
This code works fine for a single level of subfolders; is there an equivalent of it for multi-level folders? I thought the "recursiveFileLookup" option would look across all levels of subfolders, but apparently that is not how it works.
Currently I am getting a
Path not found ... filepath
exception.
Any help, please?
Have you tried using the glob.glob function?
You can use it to search for files that match certain criteria inside a root path, and pass the list of files it finds to the spark.read.csv function.
For example, I recreated the folder structure from your example inside a Google Colab environment.
To get a list of all CSV files matching the criteria you've specified, you can use the following code:
import glob
rootpath = './Root Folder/'
# The following line of code looks through all files
# inside the rootpath recursively, trying to match the
# pattern specified. In this case, it tries to find any
# CSV file that starts with the letters X, Y, or Z,
# and ends with 2 numbers (ranging from 0 to 9).
glob.glob(rootpath + "**/[XYZ][0-9][0-9].csv", recursive=True)
# Returns:
# ['./Root Folder/Subfolder5/Y01.csv',
# './Root Folder/Subfolder5/X01.csv',
# './Root Folder/Subfolder1/Subfolder 2/Y01.csv',
# './Root Folder/Subfolder1/Subfolder 2/Z01.csv',
# './Root Folder/Subfolder1/Subfolder 2/X01.csv']
Now you can combine this with spark.read.csv's ability to read a list of files to get the answer you're looking for:
import glob
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
rootpath = './Root Folder/'
spark.read.csv(glob.glob(rootpath + "**/[XYZ][0-9][0-9].csv", recursive=True), inferSchema=True, header=True)
Note
You can specify more general patterns like:
glob.glob(rootpath + "**/*.csv", recursive=True)
to return a list of all CSV files inside any subdirectory of rootpath.
Additionally, to consider only the files directly inside the root folder, you could use something like:
glob.glob(rootpath + "*.csv")
Edit
Based on your comments to this answer, does something like this work on Databricks?
from notebookutils import mssparkutils as ms
# databricks has a module called dbutils.fs.ls
# that works similarly to mssparkutils.fs, based on
# the following page of its documentation:
# https://docs.databricks.com/dev-tools/databricks-utils.html#ls-command-dbutilsfsls
from py4j.protocol import Py4JJavaError

def scan_dir(
    initial_path: str,
    search_str: str,
    account_name: str = '',
):
    """Scan a directory and subdirectories for a string.

    Parameters
    ----------
    initial_path : str
        The path to start the search. Accepts either a valid container name,
        or the entire connection string.
    search_str : str
        The string to search.
    account_name : str
        The name of the account to access the container folders.
        This value is only used when the `initial_path` doesn't
        conform with the format: "abfss://<initial_path>#<account_name>.dfs.core.windows.net/"

    Raises
    ------
    FileNotFoundError
        If the `initial_path` informed doesn't exist.
    ValueError
        If `initial_path` is not a string.
    """
    if not isinstance(initial_path, str):
        raise ValueError(
            f'`initial_path` needs to be of type string, not {type(initial_path)}'
        )
    elif not initial_path.startswith('abfss'):
        initial_path = f'abfss://{initial_path}#{account_name}.dfs.core.windows.net/'
    try:
        fdirs = ms.fs.ls(initial_path)
    except Py4JJavaError as exc:
        raise FileNotFoundError(
            f'The path you informed \"{initial_path}\" doesn\'t exist'
        ) from exc
    found = []
    for path in fdirs:
        p = path.path
        if path.isDir:
            found = [*found, *scan_dir(p, search_str)]
        if search_str.lower() in path.name.lower():
            # print(p.split('.net')[-1])
            found = [*found, p.replace(path.name, "")]
    return list(set(found))
Example:
# Change .parquet to .csv
spark.read.parquet(*scan_dir("abfss://CONTAINER_NAME#ACCOUNTNAME.dfs.core.windows.net/ROOT/FOLDER/", ".parquet"))
The method above worked for me on Azure Synapse.
I am writing a wrapper over a C library, and this library has a file with almost all of the functions, let's say all_funcs.c. This file in turn requires compilation of lots of other C files.
I have created all_funcs.pyx, where I wrapped all the functions, but I also want to create a submodule that has access to functions from all_funcs.c. What works for now is adding all the C files to both Extensions in setup.py; however, each C file then compiles twice: once for all_funcs.pyx and once for the submodule extension.
Are there any ways to provide common source files to each Extension?
Example of current setup.py:
ext_helpers = Extension(name=SRC_DIR + '.wrapper.utils.helpers',
                        sources=[SRC_DIR + '/wrapper/utils/helpers.pyx'] + source_files_paths,
                        include_dirs=[SRC_DIR + '/include/'])

ext_all_funcs = Extension(name=SRC_DIR + '.wrapper.all_funcs',
                          sources=[SRC_DIR + '/wrapper/all_funcs.pyx'] + source_files_paths,
                          include_dirs=[SRC_DIR + '/include/'])

EXTENSIONS = [
    ext_helpers,
    ext_all_funcs,
]

if __name__ == "__main__":
    setup(
        packages=PACKAGES,
        zip_safe=False,
        name='some_name',
        ext_modules=cythonize(EXTENSIONS, language_level=3)
    )
source_files_paths is the list of the common C source files.
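For reference, source_files_paths is just a Python list of paths to the shared .c files; it might be collected with something like this (the directory layout here is hypothetical):

import glob

# Hypothetical layout: all shared C sources live under src/c_src/.
# Building the list once lets both Extensions reference the same files.
source_files_paths = glob.glob('src/c_src/**/*.c', recursive=True)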
Note: this answer only explains how to avoid multiple compilation of C/C++ files using the libraries argument of the setup function. It doesn't, however, explain how to avoid possible problems due to ODR violation; for that, see this SO post.
Adding the libraries argument to setup will trigger build_clib prior to the building of ext_modules (when running the setup.py build or setup.py install commands); the resulting static library will also be automatically passed to the linker when the extensions are linked.
For your setup.py, this means:
from setuptools import setup, find_packages, Extension

...

# common C files compiled to a static library:
mylib = ('mylib', {'sources': source_files_paths})  # possible further settings

# no common C files (taken care of in mylib):
ext_helpers = Extension(name=SRC_DIR + '.wrapper.utils.helpers',
                        sources=[SRC_DIR + '/wrapper/utils/helpers.pyx'],
                        include_dirs=[SRC_DIR + '/include/'])

# no common C files (taken care of in mylib):
ext_all_funcs = Extension(name=SRC_DIR + '.wrapper.all_funcs',
                          sources=[SRC_DIR + '/wrapper/all_funcs.pyx'],
                          include_dirs=[SRC_DIR + '/include/'])

EXTENSIONS = [
    ext_helpers,
    ext_all_funcs,
]

if __name__ == "__main__":
    setup(
        packages=find_packages(where=SRC_DIR),
        zip_safe=False,
        name='some_name',
        ext_modules=cythonize(EXTENSIONS, language_level=3),
        # will be built as static libraries and automatically passed to the linker:
        libraries=[mylib],
    )
To build the extensions in place, one should invoke:
python setup.py build_clib build_ext --inplace
as build_ext alone is not enough: we need the static libraries to be built before they can be used in the extensions.
Often I find myself doing repetitive find & replace operations in a file. Most often it comes down to fixed find and replace operations: deleting some lines, changing some strings that are always the same, and so on.
In Vim that is a no-brainer,
function! Modify_Strength_Files()
    execute':%s/?/-/'
    execute':%s/Ä/-/'
    "--------------------------------------------------------
    execute':%s/Ä/-/'
    execute':%s///g'
    "--------------------------------------------------------
    execute':g/Version\ of\ Light\ Ship/d'
    execute':g/Version\ of\ Data\ for\ Specific\ Regulations/d'
    "--------------------------------------------------------
    " execute':g/LOADING\ CONDITION/d'
    " execute':g/REGULATION:\ A\.562\ IMO\ Resolution/d'
    " This is to reduce multiple blank lines into one.
    execute ':%s/\s\+$//e'
    execute ':%s/\n\{3,}/\r\r/e'
    " ---------------------
endfunction
copied verbatim.
How could a function like this be defined in Sublime Text editor, if it can be done at all, and then called to act upon the currently opened file?
Here are resources to write Sublime Text 2 plugins:
Sublime Text 2 API Reference
Sublime Text 2 Plugin Examples
How to run Sublime Text 2 commands
Setting up Sublime Text 2 Custom Keyboard Shortcuts
Example: you can write a similar plugin, say a batch_edit command, and bind a hotkey to it. Then you can open a file and execute the command via that hotkey. By the way, in this script I didn't consider the file encoding; you can get the file encoding via self.view.encoding().
# -*- coding: utf-8 -*-
import sublime, sublime_plugin
import re

class BatchEditCommand(sublime_plugin.TextCommand):
    def run(self, edit):
        self._edit = edit
        self._replace_all(r"\?", "-")
        self._replace_all(u"Ä", "-")
        self._delete_line_with(r"Version of Light Ship")
        self._delete_line_with(r"Version of Data for Specific Regulations")
        self._replace_all(r"(\n\s*\n)+", "\n\n")

    def _get_file_content(self):
        return self.view.substr(sublime.Region(0, self.view.size()))

    def _update_file(self, doc):
        self.view.replace(self._edit, sublime.Region(0, self.view.size()), doc)

    def _replace_all(self, regex, replacement):
        doc = self._get_file_content()
        p = re.compile(regex, re.UNICODE)
        doc = re.sub(p, replacement, doc)
        self._update_file(doc)

    def _delete_line_with(self, regex):
        doc = self._get_file_content()
        lines = doc.splitlines()
        result = []
        for line in lines:
            if re.search(regex, line, re.UNICODE):
                continue
            result.append(line)
        line_ending = {
            "Windows": "\r\n",
            "Unix": "\n",
            "CR": "\r"
        }[self.view.line_endings()]
        doc = line_ending.join(result)
        self._update_file(doc)
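To wire this up, save the script as, say, batch_edit.py in your Packages/User folder, then add a key binding for the command; the file name and key combination here are just examples:

{ "keys": ["ctrl+alt+b"], "command": "batch_edit" }

Sublime derives the command name batch_edit from the BatchEditCommand class name.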
I have set up Jenkins, but I would like to find out what files were added/changed between the current build and the previous build. I'd like to run some long running tests depending on whether or not certain parts of the source tree were changed.
Having scoured the Internet, I can find no mention of this ability within Hudson/Jenkins, though suggestions were made to use SVN post-commit hooks. Maybe it's so simple that everyone (except me) knows how to do it!
Is this possible?
I have done it the following way. I am not sure if this is the right way, but it seems to work. You need to have the Jenkins Groovy plugin installed and run the following script:
import hudson.model.*;
import hudson.util.*;
import hudson.scm.*;
import hudson.plugins.accurev.*
def thr = Thread.currentThread();
def build = thr?.executable;
def changeSet= build.getChangeSet();
changeSet.getItems();
changeSet.getItems() gives you the changes. Since I use AccuRev, I did List<AccurevTransaction> accurevTransList = changeSet.getItems();.
Note that the modified list contains duplicate files/names if a file has been committed more than once during the current build window.
The CI server will show you the list of changes, if you are polling for changes and using SVN update. However, you seem to want to be changing the behaviour of the build depending on which files were modified. I don't think there is any out-of-the-box way to do that with Jenkins alone.
A post-commit hook is a reasonable idea. You could parameterize the job, and have your hook script launch the build with the parameter value set according to the changes committed. I'm not sure how difficult that might be for you.
However, you may want to consider splitting this into two separate jobs - one that runs on every commit, and a separate one for the long-running tests that you don't always need. Personally I prefer to keep job behaviour consistent between executions. Otherwise traceability suffers.
echo $SVN_REVISION
svn_last_successful_build_revision=`curl $JOB_URL'lastSuccessfulBuild/api/json' | python -c 'import json,sys;obj=json.loads(sys.stdin.read());print obj["'"changeSet"'"]["'"revisions"'"][0]["'"revision"'"]'`
diff=`svn di -r$SVN_REVISION:$svn_last_successful_build_revision --summarize`
You can use the Jenkins Remote Access API to get a machine-readable description of the current build, including its full change set. The subtlety here is that if you have a 'quiet period' configured, Jenkins may batch multiple commits to the same repository into a single build, so relying on a single revision number is a bit naive.
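Because of that batching, it is safer to read every revision recorded in the build's changeSet rather than a single one. Here is a sketch in the same Python 2 style as the script further below; api_url stands for the build's .../api/json URL, built as described:

import json
import urllib2

# api_url is assumed to be the build's api/json endpoint,
# e.g. os.environ['BUILD_URL'] + 'api/json/'
build = json.loads(urllib2.urlopen(api_url).read())
# every SVN revision batched into this build, not just revisions[0]
revisions = [r['revision'] for r in build['changeSet']['revisions']]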
I like to keep my Subversion post-commit hooks relatively simple and hand things off to the CI server. To do this, I use wget to trigger the build, something like this...
/usr/bin/wget --output-document "-" --timeout=2 \
https://ci.example.com/jenkins/job/JOBID/build?token=MYTOKEN
The job is then configured on the Jenkins side to execute a Python script that leverages the BUILD_URL environment variable and constructs the URL for the API from that. The URL ends up looking like this:
https://ci.example.com/jenkins/job/JOBID/BUILDID/api/json/
Here's some sample Python code that could be run inside the shell script. I've left out any error handling or HTTP authentication stuff to keep things readable here.
import os
import json
import urllib2
# Make the URL
build_url = os.environ['BUILD_URL']
api = build_url + 'api/json/'
# Call the Jenkins server and figure out what changed
f = urllib2.urlopen(api)
build = json.loads(f.read())
change_set = build['changeSet']
items = change_set['items']
touched = []
for item in items:
    touched += item['affectedPaths']
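From there, deciding whether to trigger the long-running tests is a simple membership test; the path prefix here is only an example:

# Hypothetical: only run the long tests when something under src/core/ changed
run_long_tests = any(path.startswith('src/core/') for path in touched)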
Using the Build Flow plugin and Git:
final changeSet = build.getChangeSet()
final changeSetIterator = changeSet.iterator()
while (changeSetIterator.hasNext()) {
    final gitChangeSet = changeSetIterator.next()
    for (final path : gitChangeSet.getPaths()) {
        println path.getPath()
    }
}
With Jenkins pipelines (Pipeline: Supporting APIs plugin 2.2 or above), this solution works for me:
def changeLogSets = currentBuild.changeSets
for (int i = 0; i < changeLogSets.size(); i++) {
    def entries = changeLogSets[i].items
    for (int j = 0; j < entries.length; j++) {
        def entry = entries[j]
        def files = new ArrayList(entry.affectedFiles)
        for (int k = 0; k < files.size(); k++) {
            def file = files[k]
            println file.path
        }
    }
}
See How to access changelogs in a pipeline job.
Through Groovy:
<!-- CHANGE SET -->
<% changeSet = build.changeSet
   if (changeSet != null) {
       hadChanges = false %>
<h2>Changes</h2>
<ul>
    <% changeSet.each { cs ->
           hadChanges = true
           aUser = cs.author %>
        <li>Commit <b>${cs.revision}</b> by <b><%= aUser != null ? aUser.displayName : it.author.displayName %>:</b> (${cs.msg})
            <ul>
                <% cs.affectedFiles.each { %>
                    <li class="change-${it.editType.name}"><b>${it.editType.name}</b>: ${it.path}</li>
                <% } %>
            </ul>
        </li>
    <% }
       if (!hadChanges) { %>
        <li>No Changes !!</li>
    <% } %>
</ul>
<% } %>
#!/bin/bash
set -e

job_name="whatever"
JOB_URL="http://myserver:8080/job/${job_name}/"
FILTER_PATH="path/to/folder/to/monitor"

python_func="import json, sys
obj = json.loads(sys.stdin.read())
ch_list = obj['changeSet']['items']
_list = [ j['affectedPaths'] for j in ch_list ]
for outer in _list:
    for inner in outer:
        print inner
"

_affected_files=`curl --silent ${JOB_URL}${BUILD_NUMBER}'/api/json' | python -c "$python_func"`

if [ -z "`echo \"$_affected_files\" | grep \"${FILTER_PATH}\"`" ]; then
    echo "[INFO] no changes detected in ${FILTER_PATH}"
    exit 0
else
    echo "[INFO] changed files detected: "
    for a_file in `echo "$_affected_files" | grep "${FILTER_PATH}"`; do
        echo "  $a_file"
    done;
fi;
It is slightly different - I needed a script for Git on a particular folder...
So, I wrote a check based on jollychang's answer.
It can be added directly to the job's exec shell script. If no files are detected it will exit 0, i.e. SUCCESS... this way you can always trigger on check-ins to the repository, but only build when files in the folder of interest change.
But... if you want to build on demand (i.e. clicking Build Now) with the changes from the last build, you would change _affected_files to:
_affected_files=`curl --silent $JOB_URL'lastSuccessfulBuild/api/json' | python -c "$python_func"`
Note: You have to use Jenkins' own SVN client to get a change list. Doing it through a shell build step won't list the changes in the build.
It's simple, but this works for me:
$DirectoryA = "D:\Jenkins\jobs\projectName\builds" #### Jenkins build directory
$firstfolder = Get-ChildItem -Path $DirectoryA | Where-Object {$_.PSIsContainer} | Sort-Object LastWriteTime -Descending | Select-Object -First 1
$DirectoryB = $DirectoryA + "\" + $firstfolder
$sVnLoGfIle = $DirectoryB + "\" + "changelog.xml"
write-host $sVnLoGfIle
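To pull the affected paths out of that changelog.xml, something like the following could work in Python. This is only a sketch: the XML layout differs between SCM plugins, so the <path> element name here is an assumption based on the Subversion plugin's format, and the build number in the path is made up:

import xml.etree.ElementTree as ET

# Assumption: SVN-style changelog.xml where each changed file appears
# in a <path> element; adjust the tag for your SCM plugin's format.
tree = ET.parse(r"D:\Jenkins\jobs\projectName\builds\42\changelog.xml")
for path_elem in tree.iter("path"):
    print(path_elem.text)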
I tried to add this as a comment, but code doesn't format well in comments.
I just want to prettify the code from heroin's answer:
def changedFiles = []
def changeLogSets = currentBuild.changeSets
for (entries in changeLogSets) {
    for (entry in entries) {
        for (file in entry.affectedFiles) {
            echo "Found changed file: ${file.path}"
            changedFiles += "${file.path}"
        }
    }
}
Keep in mind that in some cases the Git plugin returns an empty changeSet, for example:
The first run in a newly created branch
A build started via the 'Build now' button
Refer to https://issues.jenkins-ci.org/browse/JENKINS-26354 for more details.