-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert code to Cython #24
Comments
Experiment I ran annotating parser.pyx off the # parser.pyx
cdef class Parser(object):
cdef bytearray _buffer
cdef int _buffer_pos
cdef object _read_fn
cdef int MAX_BUFSIZE
cdef str CRLF
cdef int CRLF_LEN
def __cinit__(self, read_fn):
self._buffer = bytearray()
self._buffer_pos = 0
self._read_fn = read_fn
self.MAX_BUFSIZE = 1024 * 1024
self.CRLF = '\r\n'
self.CRLF_LEN = 2
cdef bytearray _readline(self):
cdef bytearray result
cdef bytearray buffer
cdef int crlf_position
buffer = self._buffer[self._buffer_pos:]
crlf_position = buffer.find(self.CRLF)
if crlf_position == -1:
raise StopIteration()
result = buffer[:crlf_position]
self._buffer_pos += crlf_position + self.CRLF_LEN
return result
cdef void _read_into_buffer(self):
cdef bytes data
# FIXME: implement a maximum size for the buffer to prevent a crash due to bad clients
data = self._read_fn(self.MAX_BUFSIZE)
self._buffer.extend(data)
cdef bytearray _read(self, int n_bytes):
cdef bytearray result
cdef bytearray buffer
buffer = self._buffer[self._buffer_pos:]
if len(buffer) < n_bytes:
raise StopIteration()
result = buffer[:n_bytes]
# FIXME: ensure self.CRLF is next
self._buffer_pos += n_bytes + self.CRLF_LEN
return result
def get_instructions(self):
cdef bytearray instructions
cdef int array_length
cdef list instruction_set
cdef bytearray line
cdef str instruction
self._read_into_buffer()
while self._buffer:
self._buffer_pos = 0
instructions = self._readline()
# the Redis protocol says that all commands are arrays, however,
# Redis's own tests have commands like PING being sent as a Simple String
if instructions.startswith('+'):
self._buffer = self._buffer[self._buffer_pos:]
yield str(instructions[1:].strip()).split()
elif instructions.startswith('*'):
# array of instructions
array_length = int(instructions[1:]) # skip '*' char
instruction_set = []
for _ in range(array_length):
line = self._readline()
str_len = int(line[1:]) # skip '$' char
instruction = str(self._read(str_len))
instruction_set.append(instruction)
self._buffer = self._buffer[self._buffer_pos:]
yield instruction_set
else:
# inline instructions, saw them in the Redis tests
for line in instructions.split(self.CRLF):
self._buffer = self._buffer[self._buffer_pos:]
yield str(line.strip()).split() # setup.py
from distutils.core import setup
from Cython.Build import cythonize
setup(
ext_modules = cythonize(["dredis/parser.pyx"])
) Before Cython:
After Cython:
Takeaway: ~7% improvement by annotating only the parser file. |
Sorry, I had cProfile enabled and it messed the results. Here's it corrected (without cProfile):
~4-5% improvement |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
I ran a few experiments with Cython and saw a significant performance increase by mostly annotating a few files with static types.
I thought the asyncore code was the problem, but glued the Redis ae.c/anet.c libraries with Python but didn't see the performance improvement I was expecting. There's room for improvement with the way I am creating the parsers and buffers, but I think Cython would be a lot simpler. The C code can be found at https://gist.github.com/hltbra/1f2266b96c50700cc17a10115634e55f
The text was updated successfully, but these errors were encountered: