perl-obfus - obfuscate (make more difficult to understand) Perl source code programs
perl-obfus [ -v|--version ] [ --noparsing ] [ --output-line-len N ] [ --jam 0|1 ] [ --end-handling keep|skip|mangle] [ --pod-handling keep|skip ] [ --old-spacing-mode ] [ --keep-spaces ] [ --keep-newlines ] [ --bannerhead filename] [--bannertail filename ] [ --SN name_of_SN_sub] [--SNS name_of_SNS_sub ] [ --excludeidentsfile|-x filename ].. [ --excludeidentsfile-anycase filename ].. [ -X filename ].. [ --quoted-symbol-names filename ].. [ --suffixes-asis-list filename ].. [ -F user-defined-mapping-filename ].. [ -I include-dirs ].. [ -m module ].. [ -M module ].. [ -o destination-filename ] [ -P backend-perl-path ] [ -d map-filename ] [--embed-map] [ -e encode-count ] [ -i idents-mangling-params ] [ -n number-mangling-params ] [ -s string-mangling-params ] [ -c charcode-mangling-params ] [ --alter-only-symbols-from-antiexceptions ] [ -T time-asserter-params ] [ -H hostname-asserter-params ] [ -G generic-asserter-params ] [ -O profile-name ] file-to-obfuscate
This program turns Perl source code files into functionally equivalent Perl source code that is much more difficult to study, analyze and modify, thus helping you protect your code against intellectual property theft. It is not a compiler, so the code it outputs will run on all platforms the original code ran on. It works by accessing the parsed form of the program, which makes it MUCH more reliable than alternatives that don't; it supports all Perl features, including advanced ones such as nested regular expressions, expressions in the substitution part of the s// operator, and Perl formats. It works with multi-module programs and with programs that depend on many third-party modules that are not subject to obfuscation. By default it also encodes the obfuscated version of the file and makes it self-decoding at runtime, so no standalone decoder is required and the file is not readable by a human.
Perl-Obfus can also enforce licensing conditions at runtime through any combination of lifetime expiration control, advanced hostname checking and generic user-defined checks. If the licensing conditions are not met, it can delete the obfuscated file automatically, print a user-defined message and terminate execution, or execute user-implemented code. All licensing checks are additionally encoded to make them very difficult to analyze. The block of code that checks the licensing conditions can't be removed from the obfuscated program, since the rest of the program is made dependent on the initialization performed by that block.
Perl-Obfus also has an auxiliary no-parsing mode in which it doesn't try to obfuscate the code at all; the code is only encoded. This mode is useful only for quick and imperfect source code hiding. It is not the default mode; activate it by passing the --noparsing commandline switch.
This program obfuscates only one Perl source file at a time. By default it writes the obfuscated file to stdout, but it's strongly recommended to use the -o option to write the obfuscated version to the specified file (since a lot of additional operations are required when simply redirecting stdout to a file of your choice). Note that the same file can never be used as both input and output.
All comments except the one on the first line are omitted from the obfuscated file; there is no option to preserve them. POD documentation can be preserved in or omitted from the obfuscated file with the --pod-handling option. The text after the __DATA__ and __END__ markers can be stripped, left as is or mangled, as chosen with the --end-handling option (sometimes people put testsuites for their modules after __END__). Comments with author and copyright information can be added to the top and to the end of the obfuscated file with the --bannerhead and --bannertail options respectively. Of course, these comments and the POD documentation will appear in clear text in the obfuscated file, regardless of whether encoding was applied to it.
The obfuscation typically means
Add to that the fact that the obfuscated code will also be encoded thus making the source code completely unreadable.
The non-encoded obfuscated code is extremely difficult for a human to understand, since the names of variables, subroutines and other symbols are totally meaningless and hard to remember (e.g. @files becomes @zcadaa4fc81). Most aspects of obfuscation can be controlled using the commandline switches of Perl-Obfus.
If the file being obfuscated is a script (i.e. not a module), no modification to the original source file is needed for obfuscation to succeed. If the file being obfuscated is a module that exports some symbols using the standard Exporter package, and these symbols are used by other files that you also wish to obfuscate, then you have to make a minor modification to the file (otherwise, after obfuscation, the @EXPORT variable will contain the names of non-obfuscated symbols, while the symbols themselves will have obfuscated names). To overcome this, perl-obfus supports two special functions named SN and SNS (both names can be changed with --SN and --SNS). The first one accepts a scalar as its argument, the second one a list.
For the SN function, the special support is enabled if its argument is a constant string in single quotes. For the SNS function, the special support is enabled if its argument is a constant list produced by a single qw() operator (with parentheses as delimiters, exactly). The special support consists of treating the arguments as symbol names and mangling them the way all other symbols are mangled. I.e. SN('$a') becomes SN('$MANGLED_a') and SNS(qw($a %b)) becomes SNS(qw($MANGLED_a %MANGLED_b)) (the names of the functions treated as SN and SNS are never obfuscated, so you don't need to include them in the exceptions list).
Passing arguments to these two subroutines in any other way won't enable the special treatment, so use only the supported forms; i.e. SN('$' . "a") or SN("\$a") or SN(q($a)) or SNS('$a','%b') or even SNS(qw[$a %b]) will be the same as before obfuscation (and thus some symbols won't be exported from the module being obfuscated).
SN and SNS should also be used if your code generates strings that are then eval'ed, e.g. instead of eval('$abc = '. "$value;") you should write eval(SN('$abc') . " = $value;"). If you also need to run your code non-obfuscated, cut and paste the following definitions of the subroutines SN and SNS:
sub SN { '';$_[0]; } sub SNS { '';@_; }
Note that sometimes you will have to put these definitions inside a BEGIN{} block so that the subroutines are visible at the point where they are used.
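As a minimal sketch of the supported calling forms described above, the following single-file example can be run non-obfuscated. The variable $abc, the sub report and the value 42 are hypothetical and chosen only for illustration; the SN and SNS definitions are the fallback ones quoted above. The script deliberately omits `use warnings`, because the `'';` statement in those definitions would trigger a harmless "useless use of a constant" warning.

```perl
#!/usr/bin/perl
use strict;

# Fallback definitions copied from the text above, so that the same source
# also runs when it has not been obfuscated. BEGIN makes them visible early.
BEGIN {
    sub SN  { ''; $_[0]; }
    sub SNS { ''; @_; }
}

# Hypothetical package variable, for illustration only.
our $abc = 0;

# Supported form for SNS: a single qw() list with parentheses as delimiters.
our @EXPORT = SNS(qw($abc &report));

# Hypothetical exported sub.
sub report { return "abc is $abc"; }

# Supported form for SN: a constant string in single quotes.
my $value = 42;
eval(SN('$abc') . " = $value;");
die $@ if $@;

print report(), "\n";
```

When the code is not obfuscated, SN and SNS simply return their arguments, so @EXPORT holds the original names and the eval'ed string assigns to $abc directly.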
The script starts a pipe to another (backend) perl process that does part of the processing. Note that a fairly recent version of perl is required for the backend (5.7.2 or above), so in some cases you'll have to install it in parallel with the version of perl you are using. You may therefore need to pass the location, and possibly invocation options, of the perl interpreter used as the backend with the -P switch, e.g. -P '/usr/local/bin/perl'.
You don't need to install all modules used by the code you are obfuscating for the version of perl used for backend.
If the code being obfuscated expects modules in non-standard locations or needs them preloaded, and this is normally specified with perl's usual -I, -m and -M switches, then you have to pass the same set of switches to perl-obfus (they will be passed on to the backend perl so that it can analyze the source code properly).
As was said above, symbols from third-party and standard modules won't be mangled. However, the user needs to gather a list of such symbol names (called exceptions from this point on) using the dedicated utility gen-ident-exceptions.pl, and pass the names of the files with exceptions using the --excludeidentsfile or --excludeidentsfile-anycase options. For convenience, there is a -X switch that can be passed multiple times to specify the names of files that list exceptions to be ignored.
It's possible to request Perl-Obfus to save the mapping between obfuscated symbol names and original symbol names in an external file by passing the filename after the -d switch.
Encoding can be controlled with the -e switch; to turn it off completely, add -e 0 to the perl-obfus command line.
Perl-Obfus has advanced support for enforcing licensing conditions (not available in Lite or Trial editions). Licensing conditions can be enforced by any combination of the following criteria:
Each type of checking criterion is implemented by a subengine called an asserter from here on; the specific asserters are called time asserters, hostname asserters and generic asserters correspondingly. There are several subtypes of asserters of each type, each with different behaviour; only one subtype of asserter of a given type can be enabled (i.e. no more than one time asserter, no more than one hostname asserter, etc).
By default no asserters of any type are enabled.
If any asserter is enabled, the special block of code is prepended to the obfuscated version of original code; if it was requested to additionally encode the obfuscated code then the resultant code (special block and obfuscated version of original code) will be encoded as a whole.
In any case, that special block of code will actually be an encoded version of the code that includes the implementation of all checks, the actions to be performed if licensing conditions are not met, AND special initialization code without which the obfuscated version of the original code will not work correctly (the special initialization code is in fact an initialization of variables used in parts of expressions inside the obfuscated version of the original code). This means that it's impossible to remove the special block containing the licensing checks without making the rest of the code malfunction, even if the user chose to only obfuscate the original source (without applying encoding).
Asserters are configured from a command line in a similar way to token mangling parameters. Use -T option to configure time asserters, -H to configure hostname asserters, -G to configure generic asserters. See description of the individual asserters of each type for more information about their options.
It's possible to store the default commandline options in the globally-visible file $instroot/lib/perl-obfus/perl-obfus-settings.pl (where $instroot is a directory in which the Perl-Obfus package was installed to). See comments in that file for more information.
Note that there is an interactive web-based commandline builder for Perl-Obfus available at http://www.stunnix.com/support/interactive/cmd-builder/.
Note that extra spaces in the lines (whitespaces and tabs) won't correspond to the ones in the original file, but to certain prettyprinted version of it.
Note that newlines won't correspond to the ones in the original file, but to certain prettyprinted version of it.
Note: use --bannertail only for files that don't have __END__ or __DATA__ sections, since otherwise these sections will be corrupted (since banner will be appended after the __END__ or __DATA__ sections).
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it's not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
Most of the exceptions are generated
using gen-ident-exceptions.pl script. In very few cases users will have
to manually extend a set of exceptions using hand-written files - see
the description of the syntax of such files in the
gen-ident-exceptions.pl's manual.
There is no need to add perl special variables like @ARGV and builtin subroutines like open; they are already hardcoded in perl-obfus.
It's possible to remove symbols from lists of exceptions by passing names of files with these symbol names using -X switch.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if the names of these files were specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it's not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
This option is mostly useful in case the set of exceptions created from builtin list and content of files passed with -x switch is too broad.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if the names of these files were specified individually, one by one.
This can be seen as an alternative to using the OBJNAME function. Instead of wrapping symbol names in a call to OBJNAME, you can just list those symbols in a file and pass the name of that file after the --quoted-symbol-names switch.
Note that you can use this functionality to hide some hash keys you internally use in your code for storing data in various structures or even config files. Perl-Obfus will also accept - (dash) in names of symbols loaded from filename, to allow using keys with dashes.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it's not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if the names of these files were specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. The file name specified is first searched for in the current directory (if it's not an absolute path), and then in the subdirectory lib/perl-obfus/exceptions/ of the directory where Perl-Obfus was installed.
This option is mostly useful for protecting code for environments that scan the name of a symbol for some suffix in order to treat the symbol specially.
The filename can be the name of a directory; in this case all files located in this directory and any of its subdirectories (at any depth) are loaded as if the names of these files were specified individually, one by one.
Comments are allowed in such files by placing a hash sign (#) as the first character of the line. Each line in such a file contains two symbols: the name of the original symbol, one or more space characters, and the required resultant symbol.
If some mangling engine decides to assign a symbol that is listed as a resultant symbol, special attempts will be made to guarantee that the symbol chosen by the obfuscation engine won't conflict with it (by adding prefixes until uniqueness is reached).
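To make the file format described above concrete, here is a small sketch that parses text in that format into a hash. The helper parse_mapping and the sample symbol names are hypothetical and not part of Perl-Obfus; skipping blank or incomplete lines is an assumption, and whether symbol names in such files carry sigils is not stated here, so the sample uses bare names.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Perl-Obfus): parse mapping text where each
# line holds an original symbol, whitespace, and the required resultant symbol.
sub parse_mapping {
    my ($text) = @_;
    my %map;
    for my $line (split /\n/, $text) {
        next if $line =~ /^#/;        # comment: hash sign as the first character
        next unless $line =~ /\S/;    # assumption: blank lines are ignored
        my ($orig, $result) = split ' ', $line;
        next unless defined $result;  # assumption: incomplete lines are skipped
        $map{$orig} = $result;
    }
    return \%map;
}

# Hypothetical sample content of a user-defined mapping file.
my $sample = <<'END';
# original   resultant
counter      c1
report       r1
END

my $map = parse_mapping($sample);
print "$_ -> $map->{$_}\n" for sort keys %$map;
```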
If the file specified with this option exists, the accumulated mapping information will be merged with mapping information previously stored in the file - this allows one to have map file for entire project.
By default no filename is specified, and thus mapping information is not saved anywhere.
By default this mapping information is not embedded at all.
If this mode is activated, only the following options are in effect: all those related to encoding (i.e. -e), and --bannerhead, --bannertail, --pod-handling and -o.
obfuscator-title[,option=value]..
Tokens of each type can be mangled using different approaches; each approach corresponds to an obfuscator, identified by obfuscator-title. Each obfuscator can have options that alter its behaviour; to specify them, append comma-separated option=value pairs to obfuscator-title after a comma.
The mangling-specification specifies all details of how to mangle tokens of each type, so if multiple occurrences of the option are specified, the last one takes effect.
For each type of token a special obfuscator with title none is available - it doesn't alter the tokens in any way.
Here is a list of obfuscators for each type of the token, with the options they support.
The replacement symbol depends solely on the index of symbol being seen for the first time in the project, and on the value of seed argument.
The detection of collisions for symbols in the current file is done automatically. It's possible to activate detection of collisions for symbols in entire project by the use of adhere-mapfile option of this symbol obfuscator.
If the adhere-mapping option is specified for this obfuscator with a non-zero value, and a mapfile name is specified via the global option -d, then Perl-Obfus reads the specified mapfile at startup, looks up the original symbol names in it and uses the replacement from that file if one is found. It also ensures that protected symbols produced during that invocation of Perl-Obfus are not already assigned to any symbol listed in the mapfile. If an obfuscated symbol it was going to use as a replacement is already used as the replacement for another symbol (a so-called ``hash collision''), Perl-Obfus aborts with an error message; in that case it's necessary to clear the mapfile, change the seed and/or increase the value of the len option, and protect the entire application again. After processing completes, the mapfile is updated as usual.
Note that the shortest symbol obfuscator can also generate protected symbols using all possible combinations of characters, while at the same time generating the shortest names possible (at the cost of requiring 2 passes over each source file).
Options:
The detection of collisions for symbols in the current file is done automatically. It's possible to activate detection of collisions for symbols in entire project by the use of adhere-mapfile option of this symbol obfuscator.
If the adhere-mapping option is specified for this obfuscator with a non-zero value, and a mapfile name is specified via the global option -d, then Perl-Obfus reads the specified mapfile at startup, looks up the original symbol names in it and uses the replacement from that file if one is found. It also ensures that protected symbols produced during that invocation of Perl-Obfus are not already assigned to any symbol listed in the mapfile. If an obfuscated symbol it was going to use as a replacement is already used as the replacement for another symbol (a so-called ``hash collision''), Perl-Obfus aborts with an error message; in that case it's necessary to clear the mapfile, change the seed and/or use longer names in the spec argument, and then clean and rebuild the entire project.
After processing completes, mapfile will be updated as usual.
Options:
If the number of symbols that need to be replaced in the project is greater than the number of all possible variants that the specified names allow to be generated, obfuscation is aborted with an error message.
It's allowed to use non-letters in the value of this option (e.g. digits and underscore).
It's obvious that in theory it's possible to get md5sum collision - the critical situation when two different symbols will be obfuscated to the same symbol name. When such situation is detected, the obfuscation is aborted. The detection of collisions for symbols in the current file is done automatically. If detection of collisions for symbols in entire project is required, one can use adhere-mapfile option for enforcing uniqueness of protected symbols across all files - please read the description of symbol name obfuscator combs. The only solution in case md5sum collision is detected is to change the value of the seed option or to increase the value of the len option. However, such situations are very rare.
This is the default obfuscator for symbol names.
Options:
Options:
It's perfectly suitable for multimodule projects too. There are two modes of operation this obfuscator works in (controlled by its parameter countupdate) - scanning through the project files for computing the use counts for all symbols (used if parameter countupdate is passed value 1) and saving the counts to a special file hereafter called countsfile (whose name is specified as value of parameter countsfile) or performing the obfuscation itself using the symbol use counts from countsfile gathered during first mode of operation (used if parameter countupdate is passed value 0, or if this parameter is not specified at all). In the obfuscation mode obfuscator maintains its state (a mapping between original symbols and obfuscated ones) in the file whose name specified as a value of parameter statefile (hereafter such file will be called statefile).
Note that file with symbol counts should be up to date - at least it should mention all symbols that are subject to obfuscation - so if you added some code and introduced some new symbols, you'll have to regenerate countsfile. Perl-Obfus aborts execution if it encounters that some symbol was not counted at all, with diagnostics indicating that countsfile needs to be rebuilt. Rebuilding countsfile means deleting (or truncating) the countsfile and statefile and running Perl-Obfus in symbol count gathering mode on all files of the project. If your change to the code didn't introduce new symbols but just increased or decreased the use of already existing ones, it won't abort the execution but there will be a chance that size of resultant obfuscated file won't be the smallest possible.
So the common approach to using this obfuscator for symbol names is: develop and debug the code; delete the files a-file-with-counts and a-file-with-state; run Perl-Obfus with options like -i shortest,countupdate=1,countsfile=a-file-with-counts on all source files in the project to gather symbol counts into the file a-file-with-counts; and then run Perl-Obfus with options like -i shortest,countupdate=0,countsfile=a-file-with-counts,statefile=a-file-with-state on all source files in the project.
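The two passes described above could be driven from a small script along the following lines. This is only a sketch: the helper build_passes, the source file names and the obfuscated/ output directory are hypothetical, and it prints the command lines instead of running them. Whether the counting pass needs an -o destination is not stated here, so none is passed for that pass.

```perl
use strict;
use warnings;

# Hypothetical project files and output directory.
my @sources    = ('lib/Foo.pm', 'bin/app.pl');
my $countsfile = 'a-file-with-counts';
my $statefile  = 'a-file-with-state';

# Hypothetical helper: build argument lists for the counting pass and for
# the obfuscation pass, one command per source file.
sub build_passes {
    my ($counts, $state, @files) = @_;
    my @count_pass = map {
        ['perl-obfus', '-i', "shortest,countupdate=1,countsfile=$counts", $_]
    } @files;
    my @obfus_pass = map {
        ['perl-obfus', '-i',
         "shortest,countupdate=0,countsfile=$counts,statefile=$state",
         '-o', "obfuscated/$_", $_]
    } @files;
    return (\@count_pass, \@obfus_pass);
}

my ($count_pass, $obfus_pass) = build_passes($countsfile, $statefile, @sources);

# The counts file and the state file should be deleted before the first pass.
# Here the commands are only printed; each list could be handed to system().
print join(' ', @$_), "\n" for @$count_pass, @$obfus_pass;
```

Keeping the counting pass separate from the obfuscation pass mirrors the requirement that the counts file mention every symbol before any file is obfuscated.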
By default each symbol name is obfuscated to a unique but random identifier whose length depends on the number of occurrences of the given symbol. That randomness of the identifier can be disabled by passing value 0 for parameter dontshuffle; this will force e.g. the first symbol in the first source file of the project to always be obfuscated to the name c (provided there is no exception with the same name).
It's possible to specify the set of characters that can be used for resultant symbol names with the spec option; e.g. one can make code very hard to analyze without modification by asking to use only the characters I and l for names of symbols, which produces symbols like IllII or IIlIIl that look very similar in most fonts (but of course this won't result in the smallest output). The use of this option makes the shortest obfuscator a reliable version of the combs obfuscator for multi-module projects, since it eliminates the chance of two different symbols in two different modules (each of which uses only one of the symbols) being replaced with the same resultant symbol (which is possible in theory, but has a very small probability).
Options:
Options:
asserter-title[,option=value]..
There are several subtypes of asserters for each type. The subtype is selected by asserter-title. Each asserter can have options that alter its behaviour; to specify them, append comma-separated option=value pairs to asserter-title after a comma.
For each type of asserters a special asserter with title none is available - it doesn't perform any action.
Here is a list of asserter-titles for each type of the asserters, with the options they support.
Options:
For all cases time() is used as a fallback source of information in case primary method is not available. For Unix systems, /bin/date seems to be the most reliable and trusted source of information.
Default value of this option is builtin-time.
All of these asserters support the same set of parameters:
The following sources of information are supported:
There is a plain CGI Perl script in lib/perl-obfus/print-hostname.pl in the directory where Perl-Obfus is installed that prints the value acquired by all sources of information.
All hostname asserters differ only in treatment of the parameter matches. The following hostname asserters are supported:
It's possible to use a fake generic asserter with code ' ' (i.e. a single space character) to make analysis of the program more complex (since whenever any asserter is used, some fraction of numeric expressions is turned into arithmetic expressions involving constant variables initialized in the encoded block). This trick (passing -G from-string,code=' ') makes any custom decompiler that someone would have to write to analyze the code much more complex.
In order to report violation of licensing conditions, user's code should execute the following statements:
exit 0;
The argument is profile-params, that has the following syntax:
profile-name[,option=value]..
There are several profiles available. The profile is selected by profile-name. Each profile can have options that alter its behaviour; to specify them, append comma-separated option=value pairs to profile-name after a comma.
The following values for profile-name are available:
The profile with name default is the default profile.
All profiles have the following options (specified in the way options for manglers and extractors are specified):
In case of an error, the exit code will be non-zero, otherwise the exit code will be zero.
On successful processing of the file, the message 'input-filename syntax OK' is printed to stderr. Processing stops if there is a syntax error in the file being obfuscated or in a file it uses; in that case the location and details of the syntax error are printed to stderr.
The following commandline obfuscates and encodes file blah.pl using default parameters and exceptions from file named ./excepts, writing obfuscated and encoded version to oblah.pl:
perl-obfus blah.pl -o oblah.pl -x ./excepts
The following commandline is the recommended way of obfuscating file blah.pl for shipping, using default parameters and exceptions from the file named ./excepts, and writing the obfuscated and encoded version to oblah.pl (the main difference from the previous example is that it passes a value for the seed parameter of the obfuscator routine for symbol names):
perl-obfus blah.pl -o oblah.pl -x ./excepts -i md5,seed=SomeRandomString
The following commandline is recommended for producing a mildly obfuscated, non-encoded version of blah.pl that is ideal for testing whether the obfuscated code has problems such as the use of undefined symbols (which may arise from an incomplete list of exceptions in the file ./excepts):
perl-obfus blah.pl -e 0 -o oblah.pl -x ./excepts -n none -s none -i prefix,str=ZZZ
The following commandlines are two equivalent ways of passing the same set of values for all options to the md5 obfuscator routine for symbol names. Each obfuscates and encodes file blah.pl, writing the obfuscated and encoded version of the file to oblah.pl:
perl-obfus blah.pl -o oblah.pl -i md5,seed=57823,prefix=p,len=5
perl-obfus blah.pl -o oblah.pl -i 'md5,prefix=p, seed=57823 , len=5'
The following example obfuscates and encodes file blah.pl, writing the obfuscated and encoded version of the file to oblah.pl, and embeds license-checking code that allows the code to be executed until 28 April 2005 (a hostname asserter is configured as well); upon expiration the default message is printed:
perl-obfus blah.pl -o oblah.pl -T 'expire,whenexpires=28 April 2005,onviolated-warn=1' -H hosttails,matches=site.com+.site.com,onviolated-warn=1
It's possible to store the default commandline options in the globally-visible file $instroot/lib/perl-obfus/perl-obfus-settings.pl (where $instroot is the directory in which the Perl-Obfus package was installed), which is a Perl module. This file defines one sub, cmnargs, that should return a list of options to be prepended to the actual perl-obfus commandline, thus allowing ``persistent settings'' to be stored for perl-obfus. It is most useful for specifying the location of the perl used for the backend (which should be perl version 5.7.2 or greater).
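As a rough illustration of the description above, a settings file that only fixes the backend perl location might look like the sketch below. The layout of the real file (for example any package declaration) is not described here, so the sketch defines only the cmnargs sub named in the text; the path is the example path used for -P, and returning the switch and its value as separate list elements is an assumption. The comments in the shipped file remain the authoritative reference.

```perl
use strict;
use warnings;

# Sketch only: return the options to be prepended to every perl-obfus
# commandline. Here just the backend perl location is set via -P.
sub cmnargs {
    return ('-P', '/usr/local/bin/perl');
}

1;    # a Perl module file must end with a true value
```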
Here is a list of mostly innocent caveats.
sub f { { "blah"
, 2}; };
sub g { { "blah", 2}; };
Here f() returns integer 2, while g() returns a reference to an anonymous hash, though the difference is only in the amount of whitespace (whether there is a newline after ``blah''). Since perl-obfus removes extra whitespace (and wraps lines so that they are no longer than the constant you specified), the behaviour of such functions can change. You should not write code that is sensitive to whitespace and perl parser quirks in general, so add an explicit return in f and g if you want them to return a ref to a hash.
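A minimal sketch of the suggested fix: with an explicit return, the braces are parsed as an anonymous hash constructor regardless of where the line breaks fall, so both variants of the function behave the same.

```perl
use strict;
use warnings;

# Explicit return: the braces are an anonymous hash constructor,
# even with a newline after "blah".
sub f {
    return { "blah"
        , 2 };
}

sub g { return { "blah", 2 }; }

print ref(f()), " ", ref(g()), "\n";
```

Writing return +{ ... } is another common way to force the hash interpretation when a leading brace could be taken for a block.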
See section NOTES for troubleshooting instructions.
In most cases, once properly prepared for obfuscation, the obfuscated version of the code should work the same as the non-obfuscated one. It's recommended to check the obfuscated version of the code for the use of undeclared subroutines with the find-undeclared-subs.pl script; this helps to detect an incomplete set of symbol name exceptions. After fixing such issues, it's recommended to check that the obfuscated code behaves exactly the same as the original, either by running a pre-existing testsuite or by checking functionality manually.
If some obfuscated code is syntactically correct but works differently from the original version, obfuscate it without encoding and without string, integer and ident mangling (but with --jam 1), as follows:
perl-obfus -i none -s none -n none --jam 1 -e 0
Then try to run it again. If it still does not work correctly, find the offending source file by replacing the obfuscated files with the original ones, one by one. After you have found the file that contains the problem, append the definitions of all functions from the source file to that target file; by temporarily renaming the functions in the appended part (e.g. by suffixing the names with '1' or 'blah') you will be able to find the offending function. The same process can be applied to the blocks inside that function (just replace obfuscated parts with source parts) to find out which part of the obfuscated function is misbehaving.
Having found the block that misbehaves, modify it so that the obfuscated version has the same functionality as the original code.
gen-ident-exceptions.pl, find-undeclared-subs.pl.