A program that implements 3 types of lossless compression algorithms: LZW, Huffman, Arithmetic coding.
1.Download Lossless-compression-Algorithms.zip file
2.Install all the requirements
pip install -r requirements.txt
3.Open Main.py file and run it, you can see a results of 3 encoded messages(stored in input.txt) and their running time.
4.Run Manual.py if you wanna specify the alphabet and the message manually(it reads input2.txt).
5.If you wanna see some analysis of the 3 algorithms run Running_time.py file
its constructor takes nothing.
Method | Params | Return | Description |
---|---|---|---|
LZW_encode | msg(string) | (string) | The function that encode the original message |
LZW_decode | msg(string) | (string) | The function that decode the encoded message |
its constructor takes one parameter freq_table which is a dictionary contains the alphabet of the message and their frequencies.
Method | Params | Return | Description |
---|---|---|---|
__construct_dict | node(Node): root, val(string): '' | none | Recursive function that construct the codeword_dict out of the huffman tree |
__huffman_dict | none | none | The function that construct the huffman tree |
print_codewords | none | none | The function that print the codeword_dict |
huffman_encode | msg(string) | (string) | The function that encode the original message |
huffman_decode | msg(string) | (string) | The function that decode the encoded message |
its constructor takes one parameter prob_table which is a dictionary contains the alphabet of the message and their probabilities.
Method | Params | Return | Description |
---|---|---|---|
__get_c_prob_table | none | c_prob_table(dict) | The function that construct c_prob_table out of prob_table |
arithmetic_encode | msg(string) | (string) | The function that encode the original message |
arithmetic_decode | msg(string), n(int): original msg length | (string) | The function that decode the encoded message |
- Be careful when you specify the frequency table and message in input2.txt, otherwise you might end up with an encoded message longer than the original one! for that reason we used a random generator, check the Main.py file(it safer).
- input.txt is generated randomly from the file Main.py, it contains 3 strings each message has different alphabet and length.
- The implementation of Arithmetic encoding is very simple here and not efficient. It depends on the programming language, in python I used Decimal library to specify the precision of the decimal number, however there more efficient ways using different programming languages.
- Arithmetic is very slow, so try a message length not longer than 1000 characters.
- For more information about the algorithms, check Report.pdf.
Hope you enjoy this simple implementation :)
If you face any problems please contact me
Made by:
Reem Hejazi 219410002
Raneem Balharith 219410305
Sara Al Shagawi 219410319
under @rem2718 account
For any suggestions/comments/questions email me at: rem.e2.718@gmail.com