Allows up to 11 bits of data per Unicode character, as counted by social media and chat platforms such as Twitter and Discord.
Uses a limited charset within the Basic Multilingual Plane.
Based on the Rust crate rust-base2048, and uses a compatible encoding table.
- No control sequences, punctuation, quotes, or RTL characters
pip install base2048
import base2048
base2048.encode(b'Hello!')
# => 'ϓțƘ໐µ'
base2048.decode('ϓțƘ໐µ')
# => b'Hello!'
import zlib
import base64
import base2048
string = ('🐍 🦀' * 1000 + '🐕' * 1000).encode()
data = zlib.compress(string)
b64_data = base64.b64encode(data)
# => b'eJztxrEJACAQBLBVHNUFBBvr75zvRvgxBEkRSGqvkbozIiIiIiIiIiIiIiIiIiIiIiJf5wAAAABvNbM+EOk='
len(b64_data)
# => 84
b2048_data = base2048.encode(data)
# => 'ը྿Ԧҩ২ŀΏਬйཬΙāಽႩԷ࿋ႬॴŒǔ०яχσǑňॷβǑňॷβǑňॷβǯၰØØÀձӿօĴ༎'
len(b2048_data)
# => 46
unpacked = zlib.decompress(base2048.decode(b2048_data)).decode()
len(unpacked)
# => 4000
unpacked[2000:2002]
# => '🦀🐍'
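The lengths reported above are consistent with simple bit counting. The sketch below is not part of the library: the helper names are hypothetical, and it assumes the encoder emits one character per 11 bits, rounded up. The 62-byte payload size is inferred from the 84-character base64 output, which ends in a single `=` padding character.

```python
import math

BITS_PER_CHAR = 11  # capacity of one base2048 character, as stated in this README


def base2048_length(n_bytes: int) -> int:
    """Estimated character count, assuming ceil(8 * n_bytes / 11) characters."""
    return math.ceil(n_bytes * 8 / BITS_PER_CHAR)


def base64_length(n_bytes: int) -> int:
    """Length of padded base64 output: 4 characters per 3-byte group."""
    return 4 * math.ceil(n_bytes / 3)


print(base2048_length(6))   # 6 bytes, the size of b'Hello!'
print(base64_length(62))    # 62 bytes, the inferred size of the compressed data
print(base2048_length(62))
```

Under these assumptions the estimates come out to 5, 84, and 46, matching the encoded lengths shown in the examples.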
----> base2048.decode('༗ǥԢΝĒϧǰ༎ǥ')
DecodeError: Unexpected character 8: ['ǥ'] after termination sequence 7: ['༎']
- To catch the error, use either base2048.DecodeError or its base exception, ValueError.
import base2048
try:
    base2048.decode('🤔')
except base2048.DecodeError as e:
    print(e)
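Because DecodeError derives from ValueError, code that should not depend on base2048-specific names can catch the base exception instead. The following is a minimal sketch; `decode_or_none` is a hypothetical helper, not part of the library, and it takes the decoder as an argument so that it works with any function that raises ValueError on bad input.

```python
from typing import Callable, Optional


def decode_or_none(text: str, decode: Callable[[str], bytes]) -> Optional[bytes]:
    """Return the decoded bytes, or None if the decoder rejects the input."""
    try:
        return decode(text)
    except ValueError:
        # Subclasses of ValueError, such as base2048.DecodeError, land here too.
        return None


# Demonstrated with a standard-library decoder that also raises ValueError:
print(decode_or_none('48656c6c6f', bytes.fromhex))  # valid hex
print(decode_or_none('not hex', bytes.fromhex))     # invalid hex
```

With the library installed, passing `base2048.decode` as the second argument should behave the same way, returning None for input such as '🤔'.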
The code in this project is released under the MIT License.
JavaScript - base2048
Rust - rust-base2048