Note: Documentation is currently under construction
PyCrypt is a high-speed package for applying arbitrary ciphers to Unicode text on a large scale. It lets the user write ciphers as snippets of actual Python code while preventing foul play through a robust, configurable security system.
The performance of this package is best observed by reading First-Time Use and then running speedTest.py, where a Vigenère cipher is applied to the entirety of Moby-Dick by Herman Melville. On a low-end 4-core laptop, encryption took 0.176 sec.
The PyCrypt database is created the first time PyCrypt.database is imported or PyCrypt.database.init() is called.
The user will be asked for their PostgreSQL admin password. This password will only be used once and is never saved. See the function init() found within database/init.py for more information.
All future transactions are then conducted through the generated user "pycrypt_default_user"; their privileges are defined in database/initUser.SQL.
The code below assumes that the map alphaLower and the cipher vigenere are found within the PyCrypt database. See database for details on saving maps and ciphers.
from PyCrypt import *
import PyCrypt.database as database
# Get saved data
mapQuery = database.LoadMap("alphaLower")
cipherQuery = database.LoadCipher("vigenere")
# Decompress saved data
transform, mapRange = DecompressTransform(mapQuery[3])
inverse, _ = DecompressInverse(mapQuery[4])
# User input
plaintext = "Attack at dawn"
keyword = "lemon"
options = {"deleteTextOutsideMap": False, "cycleKeywordOutsideMap": False}
# Convert strings to arrays of unicode values
keys = ProcessKeys(transform, keyword)
numRepr = Encode(plaintext)
# Remap unicode values
mappedText, maskedIndices = ApplyTransform(numRepr, transform)
# Encrypt text
encryptedText = ApplyFormula(cipherQuery[3], mappedText, keys, mapRange, maskedIndices, options=options)
# Reverse mapping
cipherOut = ApplyTransform(encryptedText, inverse)[0]
# Extract values
cipherOut = Decode(cipherOut)
print(cipherOut)
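For readers without a database set up yet, the sketch below reproduces what the pipeline above computes using only the standard library. It is not PyCrypt's implementation: `vigenere_sketch`, `transform`, and `inverse` are illustrative names, and the behaviour shown corresponds to both options being False (text outside the map is kept, and the keyword only advances on mapped characters).

```python
# Stand-alone sketch of the two phases; none of these names are PyCrypt APIs.
import string

# Phase 1 data: map both cases of each letter to 0..25 (like "alphaLower").
transform = {}
for value, letter in enumerate(string.ascii_lowercase):
    transform[letter] = value
    transform[letter.upper()] = value
inverse = {value: letter for value, letter in enumerate(string.ascii_lowercase)}


def vigenere_sketch(plaintext, keyword):
    """Shift each mapped character by the next keyword value, modulo 26."""
    key_values = [transform[ch] for ch in keyword]
    out = []
    used = 0  # the keyword advances only on mapped characters
    for ch in plaintext:
        if ch in transform:
            shifted = (transform[ch] + key_values[used % len(key_values)]) % 26
            out.append(inverse[shifted])
            used += 1
        else:
            out.append(ch)  # characters outside the map pass through
    return "".join(out)


print(vigenere_sketch("Attack at dawn", "lemon"))  # lxfopv ef rnhr
```

The per-character loop is only there for readability; PyCrypt itself works on whole integer arrays.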
The encryption process is broken into two distinct phases:
- character mapping
- cipher application
The matter of converting a unicode string into an array of integers suitable for large-scale operations is non-trivial and may yield different outcomes depending on the assumptions made by the user. For instance, do we assign a value of zero to any character? Do we include punctuation? Do capital and lower-case letters map to the same value? To remedy this, we grant the user full control over the process that transforms unicode characters into a usable integer array.
- transform
- The JSON str, typically loaded from PyCrypt's database, detailing which value each listed character maps to.
Returns an array[int] through which the mapping process is performed and a set containing all possible values that may be returned by applying the transform.
jsonTransform = {
"A":0, "a":0, "B":1, "b":1, "C":2, "c":2, "D":3, "d":3, "E":4, "e":4,
"F":5, "f":5, "G":6, "g":6, "H":7, "h":7, "I":8, "i":8, "J":9, "j":9,
"K":10, "k":10, "L":11, "l":11, "M":12, "m":12, "N":13, "n":13, "O":14, "o":14,
"P":15, "p":15, "Q":16, "q":16, "R":17, "r":17, "S":18, "s":18, "T":19, "t":19,
"U":20, "u":20, "V":21, "v":21, "W":22, "w":22, "X":23, "x":23, "Y":24, "y":24,
"Z":25, "z":25}
arrayTransform, mapRange = DecompressTransform(jsonTransform)
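One way to picture the returned pair is a lookup table indexed by code point plus the set of values it can produce. The sketch below is an assumption about the general shape only: `decompress_transform_sketch` is a hypothetical name, and the `-1` sentinel for unmapped code points is chosen purely for illustration.

```python
def decompress_transform_sketch(json_transform):
    """Expand {char: value} into a code-point-indexed lookup list and a value set."""
    size = max(ord(ch) for ch in json_transform) + 1
    lookup = [-1] * size  # -1 marks code points outside the map (an assumption)
    for ch, value in json_transform.items():
        lookup[ord(ch)] = value
    return lookup, set(json_transform.values())


lookup, map_range = decompress_transform_sketch({"A": 0, "a": 0, "B": 1, "b": 1})
print(lookup[ord("b")], sorted(map_range))  # 1 [0, 1]
```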
- inverse
- The compressed JSON str detailing the Unicode value each int maps to. This is typically loaded from the database or generated through GenInverseTransform().
Returns an array[int] through which the mapping process is undone and a set containing all possible values that may be returned by applying the inverse.
jsonInverse = {
'0' : 97, '1' : 98, '2' : 99, '3' : 100, '4' : 101, '5' : 102, '6' : 103,
'7' : 104, '8' : 105, '9': 106, '10': 107, '11': 108, '12': 109, '13': 110,
'14': 111, '15': 112, '16': 113, '17': 114, '18': 115, '19': 116, '20': 117,
'21': 118, '22': 119, '23': 120, '24': 121, '25': 122
}
arrayInverse, inverseMapRange = DecompressInverse(jsonInverse)
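When several characters share a value (here "A" and "a"), an inverse has to pick one of them. The sketch below shows one possible rule, where the last character listed for a value wins; this happens to yield the lowercase letters used in jsonInverse above. `gen_inverse_sketch` is a hypothetical helper, and GenInverseTransform() may well choose differently.

```python
def gen_inverse_sketch(json_transform):
    """Build {value: code point}; later entries win when characters share a value."""
    inverse = {}
    for ch, value in json_transform.items():
        inverse[str(value)] = ord(ch)
    return inverse


print(gen_inverse_sketch({"A": 0, "a": 0, "B": 1, "b": 1}))  # {'0': 97, '1': 98}
```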
- numRepr
- The array[int] of Unicode values produced by Encode().
- transform
- The array[int] (typically returned from either DecompressTransform() or DecompressInverse()) detailing what values a subset of Unicode characters will map to.
- maskedIndices
- An array[int] containing the indices where the transform will not be applied.
Returns array[int] transformedValues and array[int] maskedIndices containing the indices where the transform was not applied. These almost always correspond to characters outside of the transform, such as spaces, punctuation, and non-standard characters.
plaintext = "The FitnessGram™ Pacer Test is a multistage aerobic..."
unicode = Encode(plaintext)
# arrayTransform was returned from DecompressTransform()
mappedText, maskedIndices = ApplyTransform(unicode, arrayTransform)
# mappedText = [19, 7, 4, 32, 5, 8, 19, 13, 4, 18, 18, 6, 17, 0, 12, 8482, ...]
# maskedIndices = [3, 15, 16, 22, 27, 30, 32, 43, 51, 52, 53]
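The values in the comments above can be reproduced with a few lines of plain Python. This is only a sketch of the behaviour described, using a dict keyed by code point instead of PyCrypt's array transform; `apply_transform_sketch` and `mapping` are illustrative names.

```python
def apply_transform_sketch(code_points, mapping):
    """Map each code point through `mapping`; record indices left unchanged."""
    mapped, masked = [], []
    for index, cp in enumerate(code_points):
        if cp in mapping:
            mapped.append(mapping[cp])
        else:
            mapped.append(cp)     # keep the original code point
            masked.append(index)  # and remember where it was
    return mapped, masked


# Both cases of a-z map to 0..25, keyed by code point.
mapping = {}
for value in range(26):
    mapping[ord("a") + value] = value
    mapping[ord("A") + value] = value

text = "The FitnessGram\u2122 Pacer"
mapped, masked = apply_transform_sketch([ord(c) for c in text], mapping)
print(mapped[:4], masked)  # [19, 7, 4, 32] [3, 15, 16]
```

Unmapped characters such as the space (32) and the trademark sign (8482) keep their code points, which is why they show up unchanged in mappedText.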
Formulas are composed of snippets of Python/NumPy code, subject to user-defined restrictions. Before execution, the contents of the snippet are parsed into an abstract syntax tree. The use of banned attributes will raise an exception during a security scan, as determined by a whitelist and a blacklist of permitted/banned functions/modules supplied by config.JSON.
The names __func_name__, __var_name__, and __class_instance__ are reserved and will raise exceptions if used within the formula's text. The value of the local out is returned after a formula's evaluation. The user is free to define functions, lambdas, and local variables within a formula without triggering the security system. Support for classes is currently a work in progress.
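To give a feel for what an AST-based scan looks like, here is a minimal sketch built on the standard library's ast module. It is not PyCrypt's scanner: `scan_formula_sketch` and its default `banned_names` are hypothetical, whereas the real lists come from config.JSON and also include a whitelist.

```python
import ast

RESERVED = {"__func_name__", "__var_name__", "__class_instance__"}


def scan_formula_sketch(source, banned_names=frozenset({"eval", "exec", "open", "__import__"})):
    """Raise ValueError if the snippet uses a reserved or banned identifier."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed in this sketch")
        if isinstance(node, ast.Name):
            name = node.id
        elif isinstance(node, ast.Attribute):
            name = node.attr
        else:
            continue
        if name in RESERVED or name in banned_names:
            raise ValueError(f"use of {name!r} is not permitted")


scan_formula_sketch("out = [value + 1 for value in text]")  # passes silently
try:
    scan_formula_sketch("out = open('secrets.txt').read()")
except ValueError as error:
    print(error)  # use of 'open' is not permitted
```

Because the snippet is only parsed, never run, the scan can reject a formula before any of its code executes.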
The locals array[int] maskedIndices, array[int] mappedIndices, and array[int] mapRange are supplied to the formula during execution, and numpy is automatically imported as np at run time. A list[] keys and a dict[str, object] options contain the parameters a user may feed to the formula during execution. In the example below, keys contains a single value: the array[int] of a keyword. options contains two entries, "cycleKeywordOutsideMap" and "deleteTextOutsideMap", both of which are bools.
assert len(mapRange) == 1 + max(mapRange)
if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]
else:
    offset = np.resize(keys[0], len(mappedIndices))
out = (text[mappedIndices] + offset) % len(mapRange)
if not options["deleteTextOutsideMap"]:
    # np.put() works in place and returns None, so assign into text directly
    text[mappedIndices] = out
    out = text
- formula
- A str containing Python code.
- text
- The array[int] upon which formula is applied.
- keys
- The list[] of keywords or key values passed to the formula at runtime. A cipher may require zero or multiple keys, and the elements within keys can be of any datatype.
- mapRange
- The array[int] collection of all possible characters produced by ApplyTransform() and the mapping process. This value is also returned from database.LoadMap().
- maskedIndices
- The array[int] containing the locations where the mapping process left characters unchanged. This will usually be equal to the second value returned from ApplyTransform(). When the formula is applied, the array[int] mappedIndices is set equal to the collection of indices not in maskedIndices.
- options
- The dict[str, object] containing the names and values of any additional variables needed by the formula at runtime.
# User input
options = {"cycleKeywordOutsideMap": False, "deleteTextOutsideMap": True}
keyword = "lemon"
plaintext = "I sexually Identify as an overused sexual identification copypasta. Ever since I was a boy I dreamed of spamming other users with my unfunny wall of text"
# Apply map
numRepr = Encode(plaintext)
mappedText, maskedIndices = ApplyTransform(numRepr, transform)
keys = ProcessKeys(transform, keyword)
# See introductory blurb
formulaStr = """
assert len(mapRange) == 1 + max(mapRange)
if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]
else:
    offset = np.resize(keys[0], len(mappedIndices))
out = (text[mappedIndices] + offset) % len(mapRange)
if not options["deleteTextOutsideMap"]:
    text[mappedIndices] = out
    out = text
"""
# Apply the formula defined above (mapRange comes from DecompressTransform())
encryptedText = ApplyFormula(formulaStr, mappedText, keys, mapRange, maskedIndices, options=options)
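Conceptually, applying a formula means running the snippet with the documented locals in scope and reading back out. The sketch below illustrates that idea with exec() and plain lists. It is an assumption about the general mechanism rather than PyCrypt's code: `apply_formula_sketch` is a hypothetical name, the security scan is skipped, and numpy is left out to keep the example dependency-free.

```python
def apply_formula_sketch(formula, text, keys, map_range, masked_indices, options=None):
    """Run `formula` with the documented locals and return its `out` value."""
    masked = set(masked_indices)
    namespace = {
        "text": list(text),
        "keys": keys,
        "mapRange": map_range,
        "maskedIndices": list(masked_indices),
        "mappedIndices": [i for i in range(len(text)) if i not in masked],
        "options": options or {},
    }
    exec(formula, namespace)  # a real system would run its security scan first
    return namespace["out"]


# A Caesar-style shift written against the same locals.
formula = """
out = list(text)
for i in mappedIndices:
    out[i] = (text[i] + keys[0]) % len(mapRange)
"""
result = apply_formula_sketch(formula, [0, 1, 32, 25], [3], list(range(26)), [2])
print(result)  # [3, 4, 32, 2]
```

Index 2 is masked, so its value (32, a space) is left alone while the mapped values wrap around modulo the size of mapRange.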
Performs the inverse of DecompressInverse(), converting the passed array[int] into a JSON str more suitable for storage.
See the DecompressInverse() section for details.
Performs the inverse of DecompressTransform(), converting the passed array[int] into a JSON str more suitable for storage.
See the DecompressTransform() section for details.
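The compress/decompress pairs are meant to round-trip. The sketch below illustrates that property for an inverse using the standard json module, assuming the array is indexed by mapped value and the JSON object uses string keys as in jsonInverse earlier. Both function names are hypothetical, and PyCrypt's actual storage format may differ.

```python
import json


def compress_inverse_sketch(inverse_array):
    """Serialise a value-indexed list of code points as a JSON object string."""
    return json.dumps({str(value): cp for value, cp in enumerate(inverse_array)})


def decompress_inverse_sketch(json_text):
    """Rebuild the value-indexed list from the JSON string."""
    pairs = json.loads(json_text)
    return [pairs[str(value)] for value in range(len(pairs))]


packed = compress_inverse_sketch([97, 98, 99])
print(packed)  # {"0": 97, "1": 98, "2": 99}
```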
Converts the passed array[int] to a str using the UTF-32 encoding scheme.
Converts the passed str to an array[int] containing the UTF-32 value of each character in plaintext.
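In UTF-32 every character occupies exactly four bytes, so a string and an integer array of code points convert back and forth without loss. A standard-library sketch of the two directions follows; `encode_sketch` and `decode_sketch` are illustrative names, not PyCrypt's Encode() and Decode().

```python
import struct


def encode_sketch(plaintext):
    """Return the UTF-32 code unit of every character as a list of ints."""
    raw = plaintext.encode("utf-32-le")  # 4 bytes per character, no BOM
    return list(struct.unpack(f"<{len(raw) // 4}I", raw))


def decode_sketch(values):
    """Pack the ints as little-endian 32-bit units and decode them as UTF-32."""
    return struct.pack(f"<{len(values)}I", *values).decode("utf-32-le")


print(encode_sketch("Hi\u2122"))  # [72, 105, 8482]
```

For valid text this gives the same numbers as calling ord() on each character.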
Applies the array[int] transform without a mask to each str in the list[object] keys, then returns the contents of keys.
See the ApplyTransform() section for details.
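As a rough illustration of the description above, the following sketch maps every str in a list of keys and passes other items through untouched. `process_keys_sketch` and the dict-based `mapping` are assumptions made for the example; how ProcessKeys() treats non-string keys is not confirmed here.

```python
def process_keys_sketch(mapping, keys):
    """Replace every str in `keys` with its mapped values; leave other items alone."""
    processed = []
    for key in keys:
        if isinstance(key, str):
            processed.append([mapping[ord(ch)] for ch in key])
        else:
            processed.append(key)
    return processed


mapping = {ord("a") + value: value for value in range(26)}
print(process_keys_sketch(mapping, ["lemon", 7]))  # [[11, 4, 12, 14, 13], 7]
```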
This function is automatically called when importing database and is responsible for establishing a connection to the PyCrypt database. If the necessary database or user does not exist, it will be created.
The pg8000.native.Connection object created by calling database.Init(). This is used to interact with the PyCrypt database. See the pg8000 documentation for more information.
- name
- The str name of the map to be added to the database. Will raise an exception if the passed name is already taken.
- transform
- A dict[str, int] defining what value each Unicode character maps to.
- inverse
- A dict[int, int] used to reverse the mapping process. If inverse is None, one will be generated.
- keywords
- A list[str] containing keywords that users can sort by once the passed map is added to the database.
# NOTE: This only runs if a map with the name "alphaLower" is not already in the system
if len(database.con.run("""SELECT 1 FROM maps WHERE "name"='alphaLower'""")) == 0:
    transform = {
        "A":0, "a":0, "B":1, "b":1, "C":2, "c":2, "D":3, "d":3, "E":4, "e":4,
        "F":5, "f":5, "G":6, "g":6, "H":7, "h":7, "I":8, "i":8, "J":9, "j":9,
        "K":10, "k":10, "L":11, "l":11, "M":12, "m":12, "N":13, "n":13, "O":14, "o":14,
        "P":15, "p":15, "Q":16, "q":16, "R":17, "r":17, "S":18, "s":18, "T":19, "t":19,
        "U":20, "u":20, "V":21, "v":21, "W":22, "w":22, "X":23, "x":23, "Y":24, "y":24,
        "Z":25, "z":25}
    # Build the array transform, derive its inverse, and compress it for storage
    t = DecompressTransform(transform)[0]
    i = GenInverseTransform(t)
    inverse = CompressInverse(i)
    keywords = ["alpha", "lower", "ascii", "alphabet"]
    database.SaveMap("alphaLower", transform, inverse, keywords)
- identifier
- This can be either a str or an int. If identifier is an integer, database.LoadMap() will return the map with that ID. Otherwise, it will return all maps whose name matches the passed identifier.
Returns a list of entries matching the identifier. Each entry will contain 6 items, in order:
- A unique integer ID
- A hash formed from user data and the contents of the map
- A string name
- A compressed JSON transform used to map characters to ints
- The compressed inverse transform used to convert ints back to characters
- A list of keywords (primarily for use within database)
query = database.LoadMap("alphaLower")
print(f"Hash of alphaLower map: '{query[1]}'")
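Positional indices such as [1] or [3] are easy to mix up, so it can help to give the six fields names. The sketch below does this with a namedtuple. `MapEntry` is a hypothetical convenience type and the row is made-up placeholder data in the documented field order, not real database output.

```python
from collections import namedtuple

MapEntry = namedtuple("MapEntry", "id hash name transform inverse keywords")

# Made-up row in the documented field order; real rows come from database.LoadMap().
row = [1, "<hash>", "alphaLower", '{"a": 0}', '{"0": 97}', ["alpha", "lower"]]
entry = MapEntry(*row)
print(entry.name, entry.transform)  # alphaLower {"a": 0}
```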
- name
- The str name of the cipher to be added to the database. Will raise an exception if the passed name is already taken.
- formula
- A str consisting of valid Python code that transforms a mapped plaintext array into ciphertext.
- inverse
- A str containing Python code that transforms ciphertext into plaintext. The rules for the formula parameter apply.
- keywords
- A list[] containing keywords that can be sorted by once the cipher is added to the database.
- options
- A dict[str, object] in which each key/value pair represents the name and default value of extra locals passed to cipher/inverse.
Security checks are automatically performed whenever this function is called. Exceptions will be raised if any check fails.
# NOTE: This only runs if a vigenere cipher is not already in the system
if len(database.con.run("""SELECT 1 FROM ciphers WHERE "name"='vigenere'""")) == 0:
    keywords = ["vigenere","caesar","polyalphabetic"]
    options = {"cycleKeywordOutsideMap":False, "deleteTextOutsideMap":True}
    formulaStr = """
assert len(mapRange) == 1 + max(mapRange)
if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]
else:
    offset = np.resize(keys[0], len(mappedIndices))
out = (text[mappedIndices] + offset) % len(mapRange)
if not options["deleteTextOutsideMap"]:
    # np.put() works in place and returns None, so assign into text directly
    text[mappedIndices] = out
    out = text
"""
    inverseStr = """
assert len(mapRange) == 1 + max(mapRange)
if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]
else:
    offset = np.resize(keys[0], len(mappedIndices))
if options["deleteTextOutsideMap"]:
    out = (text - offset) % len(mapRange)
else:
    text[mappedIndices] = (text[mappedIndices] - offset) % len(mapRange)
    out = text
"""
    database.SaveCipher("vigenere", formulaStr, inverseStr, keywords, options)
- identifier
- This can be either a str or an int. If identifier is an integer, database.LoadCipher() will return the cipher with that ID. Otherwise, it will return all ciphers whose name matches the passed identifier.
Returns a list of entries matching the identifier. Each entry will contain 7 items, in order:
- A unique integer ID
- A hash formed from user data and the contents of the cipher
- A string name
- The formula text used for encryption
- The inverse formula used for decryption
- A list of keywords (primarily for use within database)
- A dict whose keys are the names of parameters used in the formula/inverse and whose values are those parameters' default values.
query = database.LoadCipher("vigenere")
print(f"Hash of vigenere cipher: '{query[1]}'")
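Since the last field stores each option's default value, a caller will typically want to overlay their own choices on those defaults before applying the formula. A small sketch of that merge follows. `resolve_options_sketch` is a hypothetical helper, and rejecting unknown option names is a design choice of this example rather than documented PyCrypt behaviour.

```python
def resolve_options_sketch(defaults, overrides=None):
    """Overlay user-supplied options on a cipher's stored defaults."""
    overrides = overrides or {}
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"unknown option(s): {sorted(unknown)}")
    return {**defaults, **overrides}


defaults = {"cycleKeywordOutsideMap": False, "deleteTextOutsideMap": True}
print(resolve_options_sketch(defaults, {"deleteTextOutsideMap": False}))
# {'cycleKeywordOutsideMap': False, 'deleteTextOutsideMap': False}
```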