Discussion:
Genshi Compiler
fviktor
2011-07-17 00:07:10 UTC
Hi,

I think you are the most interested audience for my latest project, a
working, tested template compiler for Genshi:

http://code.google.com/p/genshi-compiler/

License: MIT

I know there's prior art in this area, like Kajiki. There has been
some discussion and some good suggestions about this topic here as
well; I have been trying to read up on them recently.

In short, the story that led me to develop this package:

I had a maintainable but somewhat complex Genshi template which
rendered a rather complicated HTML document from an equally complex
tree of Python objects. I could not make it any nicer; it is all
legacy code. It worked fine for a few objects, but became insanely
slow for 100-200 such objects laid out in a 4-6 level deep tree
structure. It took ~600ms on a pretty fast Core i7 machine just to
render the already loaded template. That was unbearable on a
production server. Yes, it contains recursive template functions;
that was the cleanest way to develop this kind of template.

So I had three choices:

* Rewrite it in pure Python (or Cython) to get better speed, but that
would mean losing most of the maintainability and risking broken
markup. It would also pose a great risk of breaking functionality,
which is quite hard to test, since the generated HTML is part of a
user interface...

* Rewrite the whole thing in a fast text template language that has
an existing mechanism to compile the template to Python source or
bytecode. That would risk generating broken markup as well.

* Compile it to pure Python. The compiler could also be used for my
other templates. There are lots of them...

I don't use many of Genshi's more dynamic features, like py:match.
(I tried to build my template that way in the past, but the rendering
became too slow and it wasn't much more maintainable than before.) I
don't use token filters for anything other than language translation
(Babel), which could be incorporated another way. I do use
xi:include, but it could be replaced by simply importing another
compiled template module and calling a compiled template function
directly.
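To illustrate the idea, here is a hypothetical sketch of what a
compiled recursive template function might look like as plain Python
string building (the names render_item and compiled_templates are
illustrative, not the actual genshi-compiler output):

```python
from xml.sax.saxutils import escape

def render_item(title, children):
    # A recursive template function compiled down to plain string
    # building: each call appends escaped text instead of yielding
    # Genshi stream events.
    parts = ['<li>', escape(title)]
    if children:
        parts.append('<ul>')
        for child_title, child_children in children:
            parts.append(render_item(child_title, child_children))
        parts.append('</ul>')
    parts.append('</li>')
    return ''.join(parts)

# An xi:include'd template compiled from another file would just be
# another generated module, e.g.:
#     from compiled_templates import sidebar
#     html = sidebar.render(context)
```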

So I decided to take the red pill and write such a compiler. The
first version was completed in ~20 hours, but I wasn't satisfied with
the results. It worked, but the code was somewhat fragile and hard to
extend. So I completely refactored and extended it in another ~40
hours, which led to reasonable results. It now has good unit test
coverage, and I hope to add more features in the near future.

i18n:msg and i18n in general will be implemented soon, since I need
that for most of my other templates.

xi:include might be implemented by introducing a mapping from
template file names (as given in href) to the generated template
modules (fully qualified import names).
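A minimal sketch of such a mapping (all names below are invented for
illustration; this is not the actual genshi-compiler API):

```python
# Hypothetical registry: xi:include href values resolve to the fully
# qualified import names of the generated template modules.
INCLUDE_MAP = {
    'master.html': 'compiled_templates.master',
    'sidebar.html': 'compiled_templates.sidebar',
}

def resolve_include(href):
    # The compiler would emit an import of the mapped module plus a
    # direct function call where the xi:include used to be.
    try:
        return INCLUDE_MAP[href]
    except KeyError:
        raise LookupError('no compiled module registered for %r' % href)
```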

Token stream support seems doable as well (i.e. the compiled template
would generate a compatible token stream), but it is currently out of
scope for me. It would lower performance, but would allow using the
existing filters.

To be honest, py:match seems hopeless. I have ideas for implementing
that kind of functionality, but my best bet is that it won't be any
faster that way than running the whole thing through Genshi.

Cython output (decorating the generated code with type declarations)
is also planned, since that could give another 2-3x performance gain,
making it potentially 100-150x faster than Genshi. I know it is close
to a dream, but it might be achieved for the most time-critical
parts.

Output modules for other programming languages, such as standard C++,
Java, or C#, are also possible, making Genshi a more universal
template format. Expressions might not be fully compatible, however,
and loop constructs might need to be translated, etc. But such
support would make it possible to prototype something in Python, then
reuse the template as-is for the final solution in another language.

Please don't keep your ideas in secret. :)

Contributions and donations are also welcome.

Viktor Ferenczi
***@ferenczi.eu
freelance software consultant
http://www.linkedin.com/in/viktorferenczi
http://careers.stackoverflow.com/viktor-ferenczi
--
You received this message because you are subscribed to the Google Groups "Genshi" group.
To post to this group, send email to ***@googlegroups.com.
To unsubscribe from this group, send email to genshi+***@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/genshi?hl=en.
Yuen Ho Wong
2011-08-23 12:21:09 UTC
Whoa, I hadn't come back to this group for two months and now there's
another Genshi compiler. Nice job. I can do away with py:match and
xi:include. I'd like to have a way to post-process/transform a
rendered result, though. I don't mind much whether it's a token
stream or a tree. In a perfect world where Python was fast and had a
decent DOM implementation, I'd just expose a DOM tree, support XPath,
and be done with it. But neither is the case in Python.

Jimmy Yuen Ho Wong
fviktor
2011-08-24 12:41:11 UTC
Post by Yuen Ho Wong
Whoa, I hadn't come back to this group for two months and now there's
another Genshi compiler. Nice job.
Thanks.

I really don't like seeing text template languages take over even
HTML templating, where IMHO they should not be used, just because
Genshi is the second slowest solution according to some benchmarks.
That is why I'm working on my own solution to the problem. I also
have lots of working Genshi templates, and I don't want to rewrite
them in a much more error-prone templating language just because of
performance... It is often possible to move template rendering to
build time or to cache the rendered output, but I consider such
approaches workarounds, not real solutions to the underlying
performance problem. Caching is always tricky and somewhat risky
anyway.

I'm progressing with the i18n part. That's not so easy, however.
i18n:msg is already working, but there is still no support for the
other i18n directives. It is also somewhat tricky to get the
translator object into every function context efficiently while
keeping extensibility and allowing a custom gettext-compatible
translator object. I'm also thinking about supporting compile-time
translation, i.e. embedding the translated text into the generated
code instead of translating at runtime. That would require compiling
each template for every single language, but it seems worth it, even
for a hundred languages, since compilation is done only once.
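The runtime-versus-compile-time trade-off can be sketched like this
(function names are hypothetical; a gettext-style translator stands in
for the real translator object):

```python
import gettext

# Runtime translation: every render pays a translator lookup per message.
def render_greeting_runtime(translator, name):
    return '<p>%s</p>' % (translator.gettext('Hello, %s!') % name)

# Compile-time translation: the compiler would emit one module per
# language, with the translated text already baked into the string
# literals, so rendering is pure string formatting.
def render_greeting_de(name):
    return '<p>Hallo, %s!</p>' % name
```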

I've also fixed a few issues, such as problems with $$ escaping.
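For context: in Genshi, $$ produces a literal dollar sign while $name
interpolates a variable, and a compiler has to distinguish the two. A
minimal sketch of that rule (not the actual genshi-compiler code):

```python
import re

def interpolate(text, context):
    # "$$" escapes to a literal "$"; "$name" looks the name up in the
    # template context.  The alternation tries "$$" first so an escaped
    # dollar is never mistaken for the start of an expression.
    def repl(match):
        token = match.group(0)
        if token == '$$':
            return '$'
        return str(context[token[1:]])
    return re.sub(r'\$\$|\$[A-Za-z_][A-Za-z_0-9]*', repl, text)
```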

The current code is in a private Mercurial repository; it will be
pushed to the public repository once it has stabilized.
Post by Yuen Ho Wong
I can do away with py:match and xi:include.
Same here. I tried to use py:match before, but the performance penalty
was huge. Just take a look at Trac's performance as an example. :(
Post by Yuen Ho Wong
I'd like to have a way to post-process/transform a
rendered result, though. I don't mind much whether it's a token stream or
a tree.
Support for Genshi-compatible (or at least similar) token stream
output is in my plans. It would allow post-processing, ideally using
Genshi's own filters. But that would decrease performance for sure.

Another possibility is to parse the result back using lxml's parser
in streaming mode, filter the output that way, and then serialize it
again. I think it might not be as slow as one would expect, but we
need a few benchmarks to get the actual figures.
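A toy version of that parse-back filtering, using the stdlib
ElementTree here to stay self-contained (lxml.etree offers the same
iterparse interface with a faster C implementation; the
class-stripping filter is just an example transformation):

```python
import io
import xml.etree.ElementTree as ET

def strip_class_attrs(xml_bytes):
    # Parse the rendered markup back in streaming mode, drop every
    # class attribute as an example transformation, then serialize
    # the tree again.
    root = None
    for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=('start',)):
        if root is None:
            root = elem
        elem.attrib.pop('class', None)
    return ET.tostring(root)
```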
Post by Yuen Ho Wong
In a perfect world where Python was fast and had a decent DOM
implementation, I'd just expose a DOM tree, support XPath, and be
done with it. But neither is the case in Python.
Could you please explain this in more detail? We might be able to
improve the situation (there's Cython, for example), but I don't
quite see your point here.

Viktor
Yuen Ho Wong
2011-08-26 11:30:17 UTC
My memory is fuzzy, but I recall from my last investigation into
using lxml as the backend that neither its ElementTree implementation
nor its SAX interface can capture all the information Genshi's stream
requires. Genshi's stream captures each token's row and column, but
lxml doesn't expose that to you (I could be wrong, but you can look
into it). There are other things Genshi does, such as namespace
normalization and whitespace stripping, that you just can't replicate
with an lxml backend. You can't ensure compatibility if you fail to
capture the same kind of information on input. Genshi's stream is
SAX-like, but it does more, or the author would have just settled for
SAX.

I think the notion of using stream tokens in the hope of being fast
in Genshi is pointless, and has been proven so, because Genshi's
value is in its transformability. Do you think transforming a stream
of tokens into 1000 Python objects using a series of generators is
faster than traversing a DOM tree with a couple of hundred nodes in
some multiple of log(n) time?

The way I see it, there are a couple of ways to improve Genshi's
performance without changing a whole lot; most have already been
suggested before:
http://genshi.edgewall.org/wiki/GenshiPerformance#Ideasforimprovingtheperformance.
I totally agree with those findings.

I have a couple of wacky ideas as well:

1. Now that Cython 0.15 supports generators, we could rewrite the
Stream and all the filters in Cython, see how much things improve,
and decide whether it's worth it.
2. Completely replace the Stream and all the filters with the Python
bindings for libxml2 and libxslt. Then we could transform an AST
using the DOM API and XPath directly at C speed and do whatever we
want. Serialization is going to be painful, though, and it will
definitely break all the existing filters.
3. Use pyTenjin. But that isn't really a solution, is it? :)

Jimmy Wong
Joshua J. Kugler
2011-08-26 23:35:18 UTC
Post by Yuen Ho Wong
Genshi's stream captures each token's row and column, but lxml
doesn't expose that to you (I could be wrong, but you can look into
it).
I don't know how much extension it would take, but the SAX parser
written for Suds (https://fedorahosted.org/suds/) will at least
report the line/column of a syntax error during parsing. It seems it
would be a trivial addition to store this information in the
generated element objects. Just an idea; I have no idea whether it
would work.

j
--
Joshua Kugler
Part-Time System Admin/Programmer
http://www.eeinternet.com - Fairbanks, AK
PGP Key: http://pgp.mit.edu/ ID 0x73B13B6A
Yuen Ho Wong
2011-08-27 11:01:51 UTC
I don't think there's any point sticking with a token stream unless
we can somehow make it fast in Python. The advantage of a token
stream is that it saves memory; it's only useful for immediate
output. For almost any other use that demands transformation, you
almost always have to use a tree. Many things that go on in Genshi
demand a linear search of all the nodes in a document. You can
express those nodes as a token stream, with at least 2 or 3 Python
objects per node, or you can arrange them in a tree, with one node
object per node. We know that linearizing a generator expression into
a list is slow, and we know that object creation, function calls, and
loops in Python are slow, so which would you choose? I'm actually
quite pessimistic that we'll be able to speed up Genshi as it is now
to any decent speed (a few orders of magnitude faster) without
breaking a whole lot of things. I'd imagine implementing a
pre-processor and post-processor for pyTenjin would take care of
most, if not all, of the functionality we love in Genshi, and still
be two orders of magnitude faster.

Jimmy Yuen Ho Wong
Kyle Alan Hale
2011-08-31 02:48:54 UTC
Post by fviktor
Post by Yuen Ho Wong
I can do away with py:match and xi:include.
Same here. I tried to use py:match before, but the performance penalty
was huge. Just take a look at Trac's performance as an example. :(
It sounds like I may be in the minority, but sacrificing py:match and py:include would be a deal-breaker for me and a few of my projects. Without those two features, Genshi would have nothing to offer that other engines can't do better.
Lukasz Michalski
2011-08-31 07:53:55 UTC
Post by Kyle Alan Hale
It sounds like I may be in the minority, but sacrificing py:match and py:include would be a deal-breaker for me and a few of my projects. Without those two features, Genshi would have nothing to offer that other engines can't do better.
+1

Regards,
Łukasz
Uwe Schroeder
2011-08-31 07:56:39 UTC
Post by Kyle Alan Hale
It sounds like I may be in the minority, but sacrificing py:match and
py:include would be a deal-breaker for me and a few of my projects.
Without those two features, Genshi would have nothing to offer that
other engines can't do better.
I could sacrifice py:match, as all my projects use it in only one
spot, but losing py:include would definitely be a deal breaker...

Personally, I'd rather throw better hardware at the problem. A few
hundred bucks buys you a really nice dedicated server these days...
Yuen Ho Wong
2011-08-31 08:51:46 UTC
I think the issue is mostly py:match. The way most people use it now,
I think, is to give a site a consistent look and feel by
xi:include'ing a master.html that does py:match and XPath matching.
It's very, very difficult to speed up py:match as it is now. One of
the reasons Genshi is so slow is that all the py:match work is rerun
every time you render a template. That's potentially a lot of work.

xi:include is less bleak. You've got to have some mechanism to
include other pages anyway; xi:include is just one way to do it. I'm
personally not too fond of having many namespaces in a template, but
other than that I have no particular preference between removing and
keeping it. Kajiki seems to have replaced py:match and xi:include
with py:import and py:include, which is totally fine by me. Template
inclusion is definitely one area that needs work now. Things could
speed up quite a bit if the inclusion tree were resolved when the
markup is loaded into the template, instead of during serialization
every time you render the page.

At upwards of 500ms a page, you can spit out only 2-3 pages per
second per process. Hardware is not going to solve your problem here.
Genshi needs one or two orders of magnitude of speedup to be usable
on sites with lots of traffic.

Jimmy Wong