Hi,
I decided to write down some of my ideas for what I want to change in the PyGerber 3.0.0 release. I didn't try to organize it well, so it is basically a stream of thoughts. I may extend it in the future. The main reason for sharing it here is to make it possible for interested parties to share opinions on planned changes. It is what it is, I guess :D
PyGerber 3.0.0 redesign plan
This document describes the plan for design changes in PyGerber 3.0.0. Hopefully writing
this down will help me better assess the scope of the changes and consider possible
issues before messing up my codebase.
Release 3.0.0 goals
- Further simplify rendering code, making the interface as minimal and abstract as possible.
- Clean up token inheritance and remove all logic from tokens.
- Make Parser a proper, simple visitor, with no custom additions.
- Make Context / State logic simpler by removing immutability and the sick amount of getters/setters.
- Split PyGerber logic into RenderVM, Interpreter, Parser and Tokenizer (the last one likely non-existent).
Changes
Tokens
The thing currently called a Tokenizer is not a tokenizer at all; it's a parser, hence it
should be named that way. Token naming is currently inconsistent: most tokens contain
more than the command name, which is not necessary. Token names should be as short and
simple as possible. They are all searchable in the Gerber specification, so there is no need
to complicate their names.
Docstrings within tokens are just copy-pasted from the Gerber specification, which is not
particularly useful. They should be removed. The problem is that the Gerber specification is
only available in PDF format, so there is no simple way to extract explanations from
it. This causes docstrings to become very generic, but that's sort of expected, as
they should directly map to the Gerber spec.
The question is how to map tokens to locations in the specification. It can be done either
with decorators, with class attributes, or possibly in some other way. Maybe there is no
reason to link them at all: I can have a separate map, completely unrelated to the token
classes. This guarantees separation of concerns and allows for easier changes, as
everything will be in one place.
Currently, unpacking logic (parsing integers etc.) lives in the tokens. It should likely be
moved into the parser class, as this will make it easier to later create a C++ based Parser,
which will simply construct tokens from correct constructor parameters.
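As a hedged sketch of that split (the names here, `D01` and `parse_d01`, are placeholders, not PyGerber's actual API), the parser would do all string unpacking and hand the token ready-made values:

```python
import re
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class D01:
    """Plot operation token: a dumb value object with no parsing logic."""
    x: Optional[int]
    y: Optional[int]

def parse_d01(source: str) -> D01:
    """Parser-side unpacking: extract the integer fields, then build the token."""
    match = re.fullmatch(r"(?:X(?P<x>[+-]?\d+))?(?:Y(?P<y>[+-]?\d+))?D01\*?", source)
    if match is None:
        raise ValueError(f"not a D01 command: {source!r}")
    x, y = match.group("x"), match.group("y")
    return D01(x=int(x) if x is not None else None,
               y=int(y) if y is not None else None)
```

A C++ parser would then only need to call the equivalent of `D01(x=100, y=-200)` with already-parsed arguments.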
Currently the Offset class is used to represent coordinates and stores distances as
millimeters, but it would likely be handier to defer this conversion for later; it could
be done within the Interpreter. Until then, coordinates should be stored as strings in,
let's call it, a RawUnit class. RawUnit should be agnostic of the coordinate type, whether
it is X, Y, I or J. This will also allow us to implement incremental coordinates in the future.
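A minimal sketch of such a RawUnit (the class name comes from the text above; the `resolve()` signature and the leading-zero-omission assumption are mine):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RawUnit:
    """Axis-agnostic raw coordinate, e.g. the '12345' from 'X12345'."""
    raw: str

    def resolve(self, integer_digits: int, decimal_digits: int) -> float:
        """Deferred interpretation under a given FS format (done in the Interpreter).

        Assumes leading zeros are omitted, so the raw digits are padded on
        the left to the full integer+decimal width before splitting.
        """
        sign = -1.0 if self.raw.startswith("-") else 1.0
        digits = self.raw.lstrip("+-").rjust(integer_digits + decimal_digits, "0")
        return sign * float(f"{digits[:integer_digits]}.{digits[integer_digits:]}")
```

Because the string is kept verbatim until `resolve()`, the same value could later be reinterpreted as an incremental coordinate without touching the parser.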
Token classes should probably inherit from pydantic.BaseModel; this will give us
automatic serialization for testing purposes, constructors, validation and probably type
coercion. Since we will be removing the Parser class, we can remove the update_drawing_state()
methods. The old conversion-to-string system should also be removed, hence get_gerber_code()
can be removed too. get_state_based_hover_message() should probably be extracted to a
different visitor class. Since we mentioned visitors, parser2_visit_token() should be
replaced with a __visit__() method taking a single visitor argument, and all visiting
logic should be based on a single TokenVisitor class.
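A minimal sketch of that single-visitor double dispatch (the token names here are placeholders; only `__visit__` and `TokenVisitor` come from the text):

```python
class Token:
    def __visit__(self, visitor: "TokenVisitor") -> None:
        visitor.on_token(self)

class D01(Token):
    def __visit__(self, visitor: "TokenVisitor") -> None:
        visitor.on_d01(self)

class M02(Token):
    def __visit__(self, visitor: "TokenVisitor") -> None:
        visitor.on_m02(self)

class TokenVisitor:
    """Single visitor base class; unhandled tokens fall back to on_token()."""
    def on_token(self, token: Token) -> None:
        pass
    def on_d01(self, token: "D01") -> None:
        self.on_token(token)
    def on_m02(self, token: "M02") -> None:
        self.on_token(token)

class CountingVisitor(TokenVisitor):
    """Example concrete visitor: counts tokens by class name."""
    def __init__(self) -> None:
        self.counts: dict = {}
    def on_token(self, token: Token) -> None:
        name = type(token).__name__
        self.counts[name] = self.counts.get(name, 0) + 1

visitor = CountingVisitor()
for token in (D01(), D01(), M02()):
    token.__visit__(visitor)
```

The Compiler, a hover-message generator and a Gerber re-emitter could then all be plain `TokenVisitor` subclasses with no extra machinery on the tokens.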
The tricky part is dealing with group tokens. Currently all tokens are stored in a single
group token called AST. This one is sort of fine, as it has no logic, but the macro token,
for example, does have custom logic, which is not desirable. This will involve a change in
the tokenizer, since currently macros are tokenized in this annoying way.
Should statements even exist as separate tokens? Most of them contain only a single
command, hence they bring an unnecessary level of indirection. I guess I will prefer to just
keep % signs as tokens with no action associated; the same applies to *. Or maybe we can
even discard them altogether, since every token knows whether it needs a % and/or * sign
after it. This should also work in the case of macros, which are sort of special since they
contain multiple separate statements. Oh, and they don't contain any token indicating the
end of the macro. But since macros can contain only primitives, we can just end the macro
after encountering anything that is not a macro primitive. Since we will have a separate
Interpreter class, adding this logic should be easy.
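That termination rule could be as simple as the sketch below. I'm assuming the AM body line set (primitive codes 0, 1, 4, 5, 7, 20, 21 plus `$n=` variable definitions); this is a simplification for illustration, not PyGerber code:

```python
# AM macro body statements start with a primitive code, a comment (0),
# or a variable definition ($n=...); anything else ends the macro.
_PRIMITIVE_CODES = ("0", "1", "4", "5", "7", "20", "21")

def is_macro_body_statement(statement: str) -> bool:
    if statement.startswith("$"):
        return True
    code = statement.split(",", 1)[0].strip()
    return code in _PRIMITIVE_CODES

def take_macro_body(statements):
    """Collect statements into the macro body until a non-primitive appears."""
    body = []
    for index, statement in enumerate(statements):
        if not is_macro_body_statement(statement):
            return body, list(statements[index:])
        body.append(statement)
    return body, []
```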
I do not think that having token class customization is useful at all. It should be
removed.
One more problem is how to incorporate error reporting. Maybe the new Tokenizer/Parser
should have some interface for collecting Warnings and Errors from the very beginning.
tbc.
Parser2
Parser2 is not a parser, it is a... Compiler? Honestly, that's probably the best way to
describe it, since it basically translates stateful Gerber operations represented by
tokens into stateless operations which are executed later by Renderer2 descendants.
Additionally, I think that Renderer2 should be renamed to either VM or Interpreter. We
already discussed how the visitor interface should be changed to be as generic as the
Refactoring Guru description wants it to be. Additionally, Parser2Hooks should be
simplified (no nested classes) and merged into Parser2 (now named Compiler). This way the
VM/Interpreter implementation can be reused between different formats, e.g. drill files.
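A toy illustration of that stateful-to-stateless translation (all names below are made up for the sketch): the Compiler resolves the current point and selected aperture at compile time, so each emitted command is self-contained.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrawLine:
    """Stateless command: carries everything needed to execute it."""
    start: tuple
    end: tuple
    aperture: str

class Compiler:
    """Resolves Gerber state (current point, aperture) while compiling."""
    def __init__(self) -> None:
        self.current_point = (0.0, 0.0)
        self.current_aperture = "D10"
        self.commands = []

    def select_aperture(self, name: str) -> None:   # Dnn, n >= 10
        self.current_aperture = name

    def move_to(self, point) -> None:               # D02: state change only
        self.current_point = point

    def plot_to(self, point) -> None:               # D01: emit a stateless command
        self.commands.append(
            DrawLine(self.current_point, point, self.current_aperture)
        )
        self.current_point = point

compiler = Compiler()
compiler.select_aperture("D11")
compiler.move_to((0, 0))
compiler.plot_to((5, 0))
compiler.plot_to((5, 5))
```

Because each `DrawLine` is self-contained, the VM/Interpreter can execute the command list without any Gerber-specific state, which is what makes it reusable for drill files.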
Renderer2
Renderer2 should be renamed to VM or Interpreter. The number of draw operations should be
reduced to a minimum. We definitely need arc and line draws and region fills. Since region
fills already have to support boundaries made of arcs and lines, they should be enough
to draw any other primitives. This should also simplify rotations etc., since they have
to be implemented only for these 3 primitives. Apertures are cached anyway, so the
performance penalty should be minimal. Arc-to-point conversion logic should be
implemented in the VM/Interpreter, as it is currently. Arcs should go only one way, let's
say counterclockwise (to be determined). Arcs in the opposite direction should be converted
by the Compiler to counterclockwise arcs; as draws are stateless, there is no penalty for
non-continuous sequences of draws.
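The direction normalization is cheap precisely because the commands are stateless: a clockwise arc from A to B covers the same point set as a counterclockwise arc from B to A over the same center. A sketch (names are assumptions):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DrawArc:
    start: tuple
    end: tuple
    center: tuple
    clockwise: bool

def normalize_arc(arc: DrawArc) -> DrawArc:
    """Return an equivalent counterclockwise arc by swapping the endpoints."""
    if not arc.clockwise:
        return arc
    return DrawArc(start=arc.end, end=arc.start, center=arc.center, clockwise=False)
```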
This last point means we lose semantic information about line continuity, but
this should not have any significant consequences; we were not analyzing
that and we are unlikely to do so in the future.
If we remove most of the high-level logic from Renderer2, then we should do all the
transformations like rotations and scales at the Compiler level. Additionally, things like
block apertures, macros and step-and-repeat expressions should be expanded into simple
commands at the Compiler level. We could create a bytecode-like format for that, so it is
sickly generic.
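For instance, a step-and-repeat block could be unrolled into plain instructions of a flat `(opcode, *operands)` form. This is a deliberately dumbed-down sketch, not a proposed instruction set:

```python
def expand_step_repeat(program, nx, ny, dx, dy):
    """Unroll an SR block: emit one offset copy of the body per grid cell."""
    out = []
    for ix in range(nx):
        for iy in range(ny):
            ox, oy = ix * dx, iy * dy
            for op, *args in program:
                if op == "LINE":  # operands: x0, y0, x1, y1
                    x0, y0, x1, y1 = args
                    out.append(("LINE", x0 + ox, y0 + oy, x1 + ox, y1 + oy))
                else:
                    out.append((op, *args))
    return out
```

After expansion the VM sees only repeated simple commands and needs no concept of step-and-repeat at all.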
Polarization is a kind of annoying problem, since we will have to toggle polarity on and
off. We could implement complex boolean logic at the Compiler level, but that's likely to
cause performance problems, as those algorithms could be O(n^2) or even worse. The
alternative is to add commands for starting and ending a layer and then pasting said layer
into the image. It's questionable whether this can be implemented in a generic way. It should
be simple for raster images, but for SVG images it could be a problem.
Additionally, we have to calculate the size of the image. This should likely be done at the
Compiler level, and there should be a separate command for specifying image properties,
mainly size, but there could be other needs in the future.
Let's consider macro expansion. At the compiler level we will encounter a macro definition
and later a macro instantiation. Draws within a macro can draw and can clear content of the
macro shape, but afterwards the macro is "flattened" and pasted as a stamp; cleared
regions will not affect the image. This requires some kind of stack to create macros in
sub-images. For raster images we can create a separate image and then paste it (that's how
it is done now), and for SVGs we can use the "use" tag (also how it is done now).
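The stamp semantics can be modeled with a tiny stack-based VM over pixel sets: clears inside a layer erase only that layer, and pasting the flattened result cannot re-expose the background. A toy model, not the real renderer:

```python
def run(commands):
    """Toy VM: a stack of 'images' (pixel sets); END_LAYER pastes the top layer."""
    stack = [set()]  # bottom entry is the main image
    for op, *args in commands:
        if op == "BEGIN_LAYER":
            stack.append(set())
        elif op == "END_LAYER":
            layer = stack.pop()
            stack[-1] |= layer          # paste flattened layer as a stamp
        elif op == "DARK":
            stack[-1].add(args[0])
        elif op == "CLEAR":
            stack[-1].discard(args[0])  # erases only within the current layer
    return stack[0]

image = run([
    ("DARK", (0, 0)),       # drawn on the main image
    ("BEGIN_LAYER",),
    ("DARK", (1, 1)),
    ("CLEAR", (1, 1)),      # cleared inside the macro layer only
    ("CLEAR", (0, 0)),      # cannot erase the background from inside a layer
    ("DARK", (2, 2)),
    ("END_LAYER",),
])
```

The same BEGIN/END bracketing would also cover the clear-polarity layer idea from the previous paragraph, which is why a shared stack seems attractive.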
The VM/Interpreter should have its own separate set of unit tests instead of the
end-to-end tests we have now. Without unit tests it's hard to determine in which part of
the code a problem lies.