JacobRThompson/PyCrypt

PyCrypt

Note: Documentation is currently under construction

PyCrypt is a high-speed package for applying arbitrary ciphers to Unicode text on a large scale. It lets the user write ciphers as snippets of actual Python code while preventing foul play through a robust, configurable security system.

The performance of this package is best observed by following First-Time Use and then running speedTest.py, where a Vigenère cipher is applied to the entirety of Moby-Dick by Herman Melville. On a low-end 4-core laptop, encryption took 0.176 sec.

Table of Contents

  1. First-Time Use
  2. Quick Example
  3. Overview
  4. Database
  5. Security (WIP)

First-Time Use

The PyCrypt database is created the first time PyCrypt.database is imported or PyCrypt.database.init() is called.

The user will be asked for their PostgreSQL admin password. This password will only be used once and is never saved. See the function init() found within database/init.py for more information.

All future transactions are then conducted through the generated user "pycrypt_default_user"; their privileges are defined in database/initUser.SQL.

Quick Example

The code below assumes that the map alphaLower and the cipher vigenere are found within the PyCrypt database. See Database for details on saving maps and ciphers.

from PyCrypt import *
import PyCrypt.database as database

# Get saved data
mapQuery = database.LoadMap("alphaLower")
cipherQuery = database.LoadCipher("vigenere")

# Decompress saved data
transform, mapRange = DecompressTransform(mapQuery[3])
inverse, _ = DecompressInverse(mapQuery[4])

# User input
plaintext = "Attack at dawn"
keyword = "lemon"
options = {"deleteTextOutsideMap": False, "cycleKeywordOutsideMap": False}

# Convert strings to arrays of unicode values
keys = ProcessKeys(transform, keyword)
numRepr = Encode(plaintext)

# Remap unicode values
mappedText, maskedIndices = ApplyTransform(numRepr, transform)

# Encrypt text
encryptedText = ApplyFormula(cipherQuery[3], mappedText, keys, mapRange, maskedIndices, options=options)

# Reverse mapping
cipherOut = ApplyTransform(encryptedText, inverse)[0]

# Extract values
cipherOut = Decode(cipherOut)
print(cipherOut)
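As a cross-check that does not depend on PyCrypt or its database, the sketch below computes a standard Vigenère shift in plain Python. The names TRANSFORM, INVERSE and vigenere_sketch are hypothetical, and the lowercase inverse is an assumption modelled on the alphaLower map. It mirrors the case where both options are False: unmapped characters are kept and do not advance the keyword.

```python
import string

# Hypothetical stand-in for the "alphaLower" map: both cases map to 0..25.
TRANSFORM = {ch: i for i, ch in enumerate(string.ascii_lowercase)}
TRANSFORM.update({ch: i for i, ch in enumerate(string.ascii_uppercase)})
# Assumed inverse: every value maps back to a lowercase letter.
INVERSE = dict(enumerate(string.ascii_lowercase))


def vigenere_sketch(plaintext, keyword):
    """Shift each mapped character by the next keyword value, modulo 26."""
    keys = [TRANSFORM[ch] for ch in keyword]
    result = []
    used = 0  # number of keyword values consumed so far
    for ch in plaintext:
        if ch in TRANSFORM:
            shifted = (TRANSFORM[ch] + keys[used % len(keys)]) % len(INVERSE)
            result.append(INVERSE[shifted])
            used += 1
        else:
            # Unmapped characters pass through and do not advance the keyword.
            result.append(ch)
    return "".join(result)


print(vigenere_sketch("Attack at dawn", "lemon"))  # lxfopv ef rnhr
```

This is only a reference for the arithmetic; the library's own output depends on the stored map, cipher, and options.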

Overview

The encryption process is broken into two distinct phases:

  • character mapping
  • cipher application

Character Mapping

The matter of converting a unicode string into an array of integers suitable for large-scale operations is non-trivial and may yield different outcomes depending on the assumptions made by the user. For instance, do we assign a value of zero to any character? Do we include punctuation? Do capital and lower-case letters map to the same value? To remedy this, we grant the user full control over the process that transforms unicode characters into a usable integer array.

DecompressTransform(transform)

transform
The JSON str, typically loaded from PyCrypt's database, detailing which value each character maps to.

Returns an array[int] through which the mapping process is performed and a set containing all possible values that may be returned by applying the transform.

jsonTransform = {
    "A":0,  "a":0,  "B":1,  "b":1,  "C":2,  "c":2,  "D":3,  "d":3,  "E":4,  "e":4,
    "F":5,  "f":5,  "G":6,  "g":6,  "H":7,  "h":7,  "I":8,  "i":8,  "J":9,  "j":9,
    "K":10, "k":10, "L":11, "l":11, "M":12, "m":12, "N":13, "n":13, "O":14, "o":14,
    "P":15, "p":15, "Q":16, "q":16, "R":17, "r":17, "S":18, "s":18, "T":19, "t":19,
    "U":20, "u":20, "V":21, "v":21, "W":22, "w":22, "X":23, "x":23, "Y":24, "y":24,
    "Z":25, "z":25}

arrayTransform, mapRange = DecompressTransform(jsonTransform)
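One way to picture the returned pair is as a lookup table indexed by code point plus the set of mapped values. The sketch below is an assumption about the general idea, not PyCrypt's implementation; decompress_transform_sketch is a hypothetical name, and the identity default for unmapped code points is inferred from the ApplyTransform() example, where a space stays 32.

```python
def decompress_transform_sketch(json_transform):
    """Expand {char: value} into a code-point lookup table plus its value set."""
    size = max(ord(ch) for ch in json_transform) + 1
    # Identity by default: unmapped code points map to themselves.
    table = list(range(size))
    for ch, value in json_transform.items():
        table[ord(ch)] = value
    map_range = set(json_transform.values())
    return table, map_range


table, map_range = decompress_transform_sketch({"A": 0, "a": 0, "B": 1, "b": 1})
print(table[ord("b")], table[ord("!")], sorted(map_range))  # 1 33 [0, 1]
```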

DecompressInverse(inverse)

inverse
The compressed JSON str detailing the unicode values each int maps to. This is typically loaded from our database or generated through GenInverseTransform().

Returns an array[int] through which the mapping process is undone and a set containing all possible values that may be returned by applying the inverse

jsonInverse = {
    '0' : 97,   '1' : 98,   '2' : 99,   '3' : 100,  '4' : 101,  '5' : 102,  '6' : 103,
    '7' : 104,  '8' : 105,  '9': 106,   '10': 107,  '11': 108,  '12': 109,  '13': 110,
    '14': 111,  '15': 112,  '16': 113,  '17': 114,  '18': 115,  '19': 116,  '20': 117,
    '21': 118,   '22': 119,  '23': 120, '24': 121,  '25': 122
}

arrayInverse, inverseMapRange = DecompressInverse(jsonInverse)
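Because the transform above is many-to-one ("A" and "a" both map to 0), an inverse has to pick one character per value. The sketch below shows one possible rule, where the last key listed wins; gen_inverse_sketch is a hypothetical helper, and whether GenInverseTransform() uses the same rule is not stated here.

```python
def gen_inverse_sketch(json_transform):
    """Build {value: code point}, letting later keys overwrite earlier ones."""
    inverse = {}
    for ch, value in json_transform.items():
        # "a" is listed after "A", so the lowercase code point is kept.
        inverse[value] = ord(ch)
    return inverse


inverse = gen_inverse_sketch({"A": 0, "a": 0, "B": 1, "b": 1})
print(inverse)  # {0: 97, 1: 98}
```

Under such a rule the original casing cannot be recovered after a round trip, which matches the all-lowercase jsonInverse shown above.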

ApplyTransform(numRepr, transform, maskedIndices=None)

numRepr
The array[int] of unicode values produced by Encode()
transform
The array[int] (typically returned from either DecompressTransform() or DecompressInverse()) detailing what values a subset of unicode characters will map to
maskedIndices
An array[int] containing the indices where the transform will not be applied

Returns array[int] transformedValues and array[int] maskedIndices containing the indices where the transform was not applied. These almost always correspond to characters outside of the transform such as spaces, punctuation, and non-standard characters.

plaintext = "The FitnessGram™ Pacer Test is a multistage aerobic..."
unicode = Encode(plaintext)

# arrayTransform was returned from DecompressTransform()
mappedText, maskedIndices = ApplyTransform(unicode, arrayTransform)

# mappedText = [19, 7, 4, 32, 5, 8, 19, 13, 4, 18, 18, 6, 17, 0, 12, 8482, ...]
# maskedIndices = [3, 15, 16, 22, 27, 30, 32, 43, 51, 52, 53]
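The values in the comments above can be reproduced with a plain-Python sketch. apply_transform_sketch and the dict-based alpha map are hypothetical stand-ins (the library works on arrays), but the logic follows the description: mapped characters are replaced, everything else keeps its code point and has its index recorded.

```python
import string


def apply_transform_sketch(num_repr, transform):
    """Map each code point through `transform`; record indices left unchanged."""
    mapped, masked = [], []
    for index, code_point in enumerate(num_repr):
        if code_point in transform:
            mapped.append(transform[code_point])
        else:
            mapped.append(code_point)  # unmapped characters keep their code point
            masked.append(index)
    return mapped, masked


# Hypothetical dict form of the alphaLower map, keyed by code point.
alpha = {ord(c): i for i, c in enumerate(string.ascii_lowercase)}
alpha.update({ord(c): i for i, c in enumerate(string.ascii_uppercase)})

text = "The FitnessGram™ Pacer Test is a multistage aerobic..."
mapped, masked = apply_transform_sketch([ord(c) for c in text], alpha)
print(mapped[:5])  # [19, 7, 4, 32, 5]
print(masked)      # [3, 15, 16, 22, 27, 30, 32, 43, 51, 52, 53]
```

Tracking the masked indices separately matters because a kept code point such as 32 (space) is otherwise indistinguishable from a mapped value of 32 in a larger map.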

Cipher Formulas

Formulas are snippets of Python/NumPy code, subject to user-defined restrictions. Before execution, the snippet is parsed into an abstract syntax tree and scanned. Any use of a banned attribute raises an exception during this security scan; what is permitted or banned is determined by the whitelist and blacklist of functions and modules supplied by config.JSON.
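To give a feel for this kind of scan, here is a minimal sketch built on Python's standard ast module. It is not PyCrypt's security system: BANNED_NAMES, scan_formula_sketch, and the three rules it enforces are assumptions chosen for illustration, and the real lists come from config.JSON.

```python
import ast

BANNED_NAMES = {"eval", "exec", "open", "__import__"}  # hypothetical blacklist


def scan_formula_sketch(source):
    """Raise ValueError if the snippet imports modules or touches banned names."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            raise ValueError("imports are not allowed in this sketch")
        if isinstance(node, ast.Name) and node.id in BANNED_NAMES:
            raise ValueError(f"banned name: {node.id}")
        if isinstance(node, ast.Attribute) and node.attr.startswith("__"):
            raise ValueError(f"banned attribute: {node.attr}")


scan_formula_sketch("out = (text + 3) % 26")  # passes silently
try:
    scan_formula_sketch("out = open('secrets.txt').read()")
except ValueError as error:
    print(error)  # banned name: open
```

Scanning the tree before anything runs means a rejected snippet is never executed at all, which is the main appeal of a static check over a runtime sandbox.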

The names __func_name__, __var_name__, and __class_instance__ are reserved and will raise exceptions if used within the formula's text. The value of the local out is returned after a formula's evaluation. The user is free to define functions, lambdas, and local variables within a formula without triggering the security system. Support for classes is currently a work in progress.

The locals array[int] maskedIndices, array[int] mappedIndices and array[int] mapRange are supplied to the formula during execution, along with text, the array[int] being processed. numpy is automatically imported as np at run time. A list[] keys and a dict[str, object] options contain the parameters a user may pass to the formula during execution. In the example below, keys contains a single value: the array[int] of a keyword. options contains two entries, "cycleKeywordOutsideMap" and "deleteTextOutsideMap", and the values of both are bools.

assert len(mapRange) == 1 + max(mapRange)

if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]

else:
    offset = np.resize(keys[0], len(mappedIndices))

out = (text[mappedIndices] + offset) % len(mapRange)

if not options["deleteTextOutsideMap"]:
    # np.put() modifies text in place and returns None, so assign by index instead
    text[mappedIndices] = out
    out = text
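The way locals go in and out comes back can be sketched with Python's built-in exec(). This is an illustration under assumptions, not the library's code: run_formula_sketch is a hypothetical name, plain lists stand in for NumPy arrays, np is not injected, and no security scan is performed.

```python
def run_formula_sketch(formula, text, keys, map_range, masked_indices, options=None):
    """Execute `formula` with the documented names and return its `out` value."""
    masked = set(masked_indices)
    namespace = {
        "text": text,
        "keys": keys,
        "mapRange": map_range,
        "maskedIndices": masked_indices,
        # mappedIndices is the complement of maskedIndices.
        "mappedIndices": [i for i in range(len(text)) if i not in masked],
        "options": options or {},
    }
    # A single namespace keeps every name visible inside comprehensions and lambdas.
    exec(formula, namespace)  # only run snippets that already passed a security scan
    return namespace["out"]


caesar = "out = [(text[i] + keys[0]) % len(mapRange) for i in mappedIndices]"
print(run_formula_sketch(caesar, [0, 1, 32, 25], [3], range(26), [2]))  # [3, 4, 2]
```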

ApplyFormula(formula, text, keys, mapRange, maskedIndices, options={})

formula
A str containing python code.
text
The array[int] upon which formula is applied.
keys
The list[] of keywords or key values passed to the formula at runtime. A cipher may require zero or multiple keys, and the elements within keys can be of any datatype.
mapRange
The array[int] collection of all possible characters produced by ApplyTransform and the mapping process. This value is also returned from database.LoadMap().
maskedIndices
The array[int] containing the locations where the mapping process left characters unchanged. This will usually be equal to the second value returned from ApplyTransform(). When the formula is applied, the array[int] mappedIndices is set to the collection of indices not in maskedIndices.
options
The dict[str, object] containing the names and values of any additional variables needed by the formula at runtime.

# User input
options = {"cycleKeywordOutsideMap": False, "deleteTextOutsideMap": True}
keyword = "lemon"
plaintext = "I sexually Identify as an overused sexual identification copypasta. Ever since I was a boy I dreamed of spamming other users with my unfunny wall of text"

# Apply map
numRepr = Encode(plaintext)
mappedText, maskedIndices = ApplyTransform(numRepr, transform)
keys = ProcessKeys(transform, keyword)

# See introductory blurb
formulaStr = """
assert len(mapRange) == 1 + max(mapRange)

if options["cycleKeywordOutsideMap"]:
    offset = np.resize(keys[0], len(text))[mappedIndices]

else:
    offset = np.resize(keys[0], len(mappedIndices))

out = (text[mappedIndices] + offset) % len(mapRange)

if not options["deleteTextOutsideMap"]:
    text[mappedIndices] = out
    out = text
"""

encryptedText = ApplyFormula(formulaStr, mappedText, keys, mapRange, maskedIndices, options=options)

Other Core Functions

CompressInverse(inverse)

Performs the inverse of DecompressInverse(), converting the passed array[int] into a JSON str more suitable for storage. See DecompressInverse(inverse) above for details.

CompressTransform(transform)

Performs the inverse of DecompressTransform(), converting the passed array[int] into a JSON str more suitable for storage. See DecompressTransform(transform) above for details.
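A detail worth knowing when storing maps as JSON is that object keys are always strings, which is why the inverse example earlier uses keys like '0' and '25'. The sketch below shows a round trip with the standard json module; compress_sketch and decompress_inverse_sketch are hypothetical helpers operating on dicts, not the library's array-based functions.

```python
import json


def compress_sketch(mapping):
    """Serialize a mapping compactly; int keys become JSON strings."""
    return json.dumps(mapping, separators=(",", ":"))


def decompress_inverse_sketch(text):
    """Parse the stored JSON and convert the string keys back to ints."""
    return {int(key): value for key, value in json.loads(text).items()}


inverse = {0: 97, 1: 98, 2: 99}
stored = compress_sketch(inverse)
print(stored)  # {"0":97,"1":98,"2":99}
print(decompress_inverse_sketch(stored) == inverse)  # True
```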

Decode(numRep)

Converts passed array[int] to a str using the UTF-32 encoding scheme.

Encode(plaintext)

Converts passed str to an array[int] containing the UTF-32 values of each character in plaintext.
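Since UTF-32 stores every character as one 4-byte unit, the conversion in both directions is short. The following sketch uses only the standard library; encode_sketch and decode_sketch are hypothetical names, and little-endian order without a byte-order mark is an assumption.

```python
import struct


def encode_sketch(plaintext):
    """Return the UTF-32 code units of `plaintext` as a list of ints."""
    data = plaintext.encode("utf-32-le")  # 4 bytes per character, no BOM
    return list(struct.unpack(f"<{len(data) // 4}I", data))


def decode_sketch(num_repr):
    """Rebuild a str from a sequence of UTF-32 code units."""
    return struct.pack(f"<{len(num_repr)}I", *num_repr).decode("utf-32-le")


values = encode_sketch("Gram™")
print(values)                 # [71, 114, 97, 109, 8482]
print(decode_sketch(values))  # Gram™
```

Each value equals ord() of the corresponding character, so non-ASCII symbols such as ™ occupy a single array slot.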

ProcessKeys(transform, *keys)

Applies array[int] transform without a mask to each str in list[object] keys, then returns the contents of keys. See ApplyTransform() above for details.

Database

database.Init()

This function is automatically called when importing database and is responsible for establishing a connection to the PyCrypt database. If the necessary database or user does not exist, they will be created.

database.con

The pg8000.native.Connection object created by calling database.Init(). This is used to interact with the PyCrypt database. See the pg8000 documentation for more information.
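When querying through the connection directly, pg8000's native interface accepts named parameters written as :name in the SQL and passed as keyword arguments to run(). The sketch below wraps that in a hypothetical helper, map_exists_sketch; FakeConnection is a made-up stand-in so the example runs without a PostgreSQL server, and the maps table and "name" column are taken from the SaveMap example in this document.

```python
def map_exists_sketch(con, name):
    """Return True if a map called `name` is stored (hypothetical helper)."""
    rows = con.run('SELECT 1 FROM maps WHERE "name" = :name', name=name)
    return len(rows) > 0


class FakeConnection:
    """Stand-in for pg8000.native.Connection so the sketch runs offline."""

    def __init__(self, names):
        self.names = set(names)

    def run(self, sql, **params):
        # Mimic the result shape: a list of rows, empty when nothing matches.
        return [[1]] if params["name"] in self.names else []


con = FakeConnection(["alphaLower"])
print(map_exists_sketch(con, "alphaLower"), map_exists_sketch(con, "rot13"))  # True False
```

Passing the name as a parameter instead of formatting it into the SQL string avoids quoting problems with user-supplied names.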

database.SaveMap(name, transform, inverse, keywords)

name
The str name of the map to be added to the database. Will raise an exception if the passed name is already taken.
transform
A dict[str, int] defining what value each unicode character maps to.
inverse
A dict[int, int] used to reverse the mapping process. If inverse is None, then one will be generated.
keywords
A list[str] containing keywords that users can sort by once the passed map is added to the database.

# NOTE: This only runs if a map with the name "alphaLower" is not already in the system
if len(database.con.run("""SELECT 1 FROM maps WHERE "name"='alphaLower'""")) == 0:
    transform = {
        "A":0,  "a":0,  "B":1,  "b":1,  "C":2,  "c":2,  "D":3,  "d":3,  "E":4,  "e":4,
        "F":5,  "f":5,  "G":6,  "g":6,  "H":7,  "h":7,  "I":8,  "i":8,  "J":9,  "j":9,
        "K":10, "k":10, "L":11, "l":11, "M":12, "m":12, "N":13, "n":13, "O":14, "o":14,
        "P":15, "p":15, "Q":16, "q":16, "R":17, "r":17, "S":18, "s":18, "T":19, "t":19,
        "U":20, "u":20, "V":21, "v":21, "W":22, "w":22, "X":23, "x":23, "Y":24, "y":24,
        "Z":25, "z":25}

    t = DecompressTransform(transform)[0]
    i = GenInverseTransform(t)

    inverse = CompressInverse(i)

    keywords = ["alpha", "lower", "ascii", "alphabet"]

    database.SaveMap("alphaLower", transform, inverse, keywords)

database.LoadMap(identifier)

identifier
This can be either a str or an int. If identifier is an integer, database.LoadMap() will return the map with that ID. Otherwise, it will return all maps whose name matches the passed identifier.

Returns a list of entries matching the identifier. Each entry will contain 6 items, in order:

  1. A unique integer ID
  2. A hash formed from user data and the contents of the map
  3. A string name
  4. A compressed JSON transform used to map characters to ints
  5. The compressed inverse transform used to convert ints back to characters
  6. A list of keywords (primarily for use within database)

query = database.LoadMap("alphaLower")
print(f"Hash of alphaLower map: '{query[1]}'")
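Positional indices such as query[1] or mapQuery[3] are easy to mix up. One optional convenience, sketched below, is to wrap an entry in a named tuple that follows the six-item order listed above. MapEntry is a hypothetical name that is not part of PyCrypt, and the row values are made up for illustration.

```python
from collections import namedtuple

# Field order follows the six items listed for a map entry.
MapEntry = namedtuple(
    "MapEntry", ["id", "hash", "name", "transform", "inverse", "keywords"]
)

# Made-up values standing in for a single entry from the database.
row = [7, "3f2a", "alphaLower", '{"a":0}', '{"0":97}', ["alpha", "lower"]]
entry = MapEntry(*row)
print(entry.name, entry.transform)  # alphaLower {"a":0}
print(entry[3] == entry.transform)  # True
```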

database.SaveCipher(name, formula, inverse, keywords, options)

name
The str name of the cipher to be added to the database. Will raise an exception if the passed name is already taken.
formula
A str consisting of valid Python code that transforms a mapped plaintext array into ciphertext.
inverse
A str containing Python code that transforms ciphertext into plaintext. The rules for the formula parameter apply.
keywords
A list[] containing keywords that can be sorted by once the cipher is added to the database.
options
A dict[str, object] in which each key/value pair represents the name and default value of extra locals passed to cipher/inverse.

Security checks are automatically performed whenever this function is called. Exceptions will be raised if any check fails.

# NOTE: This only runs if a vigenere cipher is not already in the system
if len(database.con.run("""SELECT 1 FROM ciphers WHERE "name"='vigenere'""")) == 0:

    keywords = ["vigenere","caesar","polyalphabetic"]
    options = {"cycleKeywordOutsideMap":False, "deleteTextOutsideMap":True}

    formulaStr = """
    assert len(mapRange) == 1 + max(mapRange)

    if options["cycleKeywordOutsideMap"]:
        offset = np.resize(keys[0], len(text))[mappedIndices]

    else:
        offset = np.resize(keys[0], len(mappedIndices))

    out = (text[mappedIndices] + offset) % len(mapRange)

    if not options["deleteTextOutsideMap"]:
        text[mappedIndices] = out
        out = text
    """

    inverseStr = """
    assert len(mapRange) == 1 + max(mapRange)

    if options["cycleKeywordOutsideMap"]:
        offset = np.resize(keys[0], len(text))[mappedIndices]
    else:
        offset = np.resize(keys[0], len(mappedIndices))

    if options["deleteTextOutsideMap"]:
        out = (text - offset) % len(mapRange)
    else:
        out = (text[mappedIndices] - offset) % len(mapRange)
        text[mappedIndices] = out
        out = text
    """

    database.SaveCipher("vigenere", formulaStr, inverseStr, keywords, options)
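The relationship between a formula and its inverse comes down to adding and then subtracting the same cycled key, modulo the size of the map. The sketch below checks that round trip on plain lists; shift_sketch is a hypothetical helper, and it covers only values that are already mapped, leaving out both options.

```python
def shift_sketch(values, key, modulus, sign):
    """Add (sign=+1) or subtract (sign=-1) the cycled key, modulo `modulus`."""
    return [(v + sign * key[i % len(key)]) % modulus for i, v in enumerate(values)]


plain = [0, 19, 19, 0, 2, 10]  # "attack" under the alphaLower map
key = [11, 4, 12, 14, 13]      # "lemon"
cipher = shift_sketch(plain, key, 26, +1)
print(cipher)                                      # [11, 23, 5, 14, 15, 21]
print(shift_sketch(cipher, key, 26, -1) == plain)  # True
```

Python's % operator returns a non-negative result for a positive modulus, so the subtraction step needs no extra correction.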

database.LoadCipher(identifier)

identifier
This can be either a str or an int. If identifier is an integer, database.LoadCipher() will return the cipher with that ID. Otherwise, it will return all ciphers whose name matches the passed identifier.

Returns a list of entries matching the identifier. Each entry will contain 7 items, in order:

  1. A unique integer ID
  2. A hash formed from user data and the contents of the cipher
  3. A string name
  4. The formula text used for encryption
  5. The inverse formula used for decryption
  6. A list of keywords (primarily for use within database)
  7. A dict whose keys are the names of parameters used in the formula/inverse and whose values are those parameters' default values.

query = database.LoadCipher("vigenere")
print(f"Hash of vigenere cipher: '{query[1]}'")
