Ever wanted to understand more about unicode in python? This talk is a good explanation on how to deal with it properly
Ever wanted to understand more about unicode in python? This talk is a good explanation on how to deal with it properly
I recently became aware of this native Python package PyMySQL. The package has one important benefit vs. the other solutions to talk to a MySQL server, such as MySQLdb (AKA mysql-python) , namely, it reimplements the MySQL protocol, instead of binding to the MySQL connector library (also known as libmysqlclient). Why is this an issue? Well, because the MySQL connector library is GPL, and you can’t bind against GPL code unless your code is under a GPL-compatible license. This excludes all commercial uses, and makes all derivative works of libmysqlclient GPL as well, including the Python binding MySQLdb. If you thought about circumventing the problem using unixodbc, tough luck: the ODBC MySQL connector is also GPL, thus making unixodbc GPL as well.
Despite the low version number, PyMySQL, seems to be working, has no dependencies, it’s pure python, and it is released under the very liberal MIT license.
In this post we are going to describe and implement non-planar samplers. In the previous post about samplers, we implemented and characterized different planar samplers to make antialiasing possible. The characteristic of these samplers was to produce regular or random points on a plane with x,y between zero and one. To implement effects such as simulation of lenses behavior, reflections and so on, we also need to be able to shoot rays according to geometrical patterns other than the plane. More specifically, we need to be able to map points on a disk (to simulate lenses) or on a hemisphere (to simulate other optical effects such as reflections), while at the same time preserving the good characteristics of the random distributions outlined in the planar case.
To achieve this, the samplers now implement two new methods, BaseSampler.map_samples_to_disk() and BaseSampler.map_sampler_to_hemisphere(). They are in charge of remapping the planar distribution to a disk or to a hemisphere, but with a couple of twists: in the disk remap, the points in the range [0:1] must be remapped to a full circle from [-1:1] in both axes, so to cover the circle completely while preserving the distribution. This is done through a formulation called Shirley’s concentric maps.
In the hemisphere remapping, we also want to introduce a variation in the density so that it changes with the cosine of the polar angle from the top of the hemisphere. In other words, we want an adjustable parameter e to focus the point density closer to the top of the hemisphere.
We will need the characteristics of this distributions later on, when we will have to implement reflections and other optical effects. As you can see from the above plot, higher values of the parameter e produce a higher concentration of the points close to the top of the hemisphere. On the other hand, a low e parameter tend to produce a more uniform distribution over the full hemisphere.
To obtain the points, the sampler object has now three methods to request an iterator. We are no longer iterating on the object itself, because we need to provide three different iteration strategies. Methods BaseSampler.diskiter(), BaseSampler.hemisphereiter() and BaseSampler.squareiter(), each returning a generator over the proper set of points. Note that the hemisphere point generator returns 3D points, differently from the other two returning 2D points.
I think it may be interesting for others to see how to call easily a C routine from python, without implementing a python module. What you need is the ctypes module. Remeber however that apparently the use of this module is generally frowned upon, at least according to a note I found in PEP 399:
Usage of ctypes to provide an API for a C library will continue to be frowned upon as ctypes lacks compiler guarantees that C code typically relies upon to prevent certain errors from occurring (e.g., API changes).
although to be honest, it may be in the context of the PEP itself, and not as a general recommendation.
Nevertheless, suppose you want to call the function
double pow(double, double)
in the standard math library.
The first thing to do is to define the prototype of the function. You achieve this via the following:
prototype=ctypes.CFUNCTYPE(ctypes.c_double, ctypes.c_double, ctypes.c_double)
The call to ctypes.CFUNCTYPE creates a prototype object for a C function that returns a double (the first argument) and accepts two double (the second and third arguments).
Now you have a prototype. This entity is a class
>>> prototype <class 'ctypes.CFunctionType'>
and you can bind this prototype to the actual function in the math library with the following. First you create an object representing the library
>>> dll=ctypes.cdll.LoadLibrary("libm.dylib") # on Mac. Linux use libm.so
and then you bind to the actual function
>>> pow = prototype(("pow", dll))
>>> pow(3,4)
81.0
This just brushes the surface, but I wanted to make a simple introductory post.
One very powerful feature of python is the interactive interpreter: it allows you to test and quickly evaluate snippets of code. Occasionally, I need to rerun the same code, either during the same or another python interpreter session. One quick way to achieve would be to copy and paste the code again, but you quickly realize the prompt makes it hard:
>>> for i in xrange(10): ... print i+10 ... print i-10 ... print i/2 ... 10 -10 and so on
if you directy copy and paste the above snippet, it clearly won’t work due to the presence of the prompts:
>>> >>> for i in xrange(10): File "<stdin>", line 1 >>> for i in xrange(10): ^ SyntaxError: invalid syntax >>> ... print i+10
Frustrated, I decided to solve the problem once and for all: I created a .pythonrc file where I override the normal “>>> ” prompt to a header prompt, and the continuation prompt to the empty string:
import sys sys.ps1='--- [Python] ---\n' sys.ps2=''
Then, I added the PYTHONSTARTUP variable in my .bash_profile to refer to this file:
export PYTHONSTARTUP=$HOME/.pythonrc
Now my interactive session looks like this
Python 2.7.1 (r271:86832, Feb 27 2011, 20:04:04)
[GCC 4.2.1 (Apple Inc. build 5664)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
--- [Python] ---
for i in xrange(2):
print i
0
1
--- [Python] ---
I am now free to copy and paste the above code and replay it in the interpreter, or later on, directly into a vim session (but I will have to change tab spacing).
After having finally obtained a raytracer which produces antialiasing, it is now time to take a look at performance. We already saw some numbers in the last post. Rendering a 200×200 image with 16 samples per pixels (a grand total of 640.000 rays) takes definitely too much. I want to perform some profiling with python, find the hotspots in the code, and eventually devise a strategy to optimize them.
To perform basic profiling, I used the cProfile program provided in the standard library. It appears that the longest processing time is in the hit() function
$ python -m cProfile -s time test3.py
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
2560000 85.329 0.000 108.486 0.000 Sphere.py:12(hit)
1960020 30.157 0.000 30.157 0.000 {numpy.core.multiarray.array}
7680000 23.115 0.000 23.115 0.000 {numpy.core._dotblas.dot}
1 19.589 19.589 195.476 195.476 World.py:25(render)
2560000 7.968 0.000 116.454 0.000 World.py:62(f)
640000 6.710 0.000 133.563 0.000 World.py:61(hit_bare_bones_object)
640025 4.438 0.000 120.902 0.000 {map}
640000 3.347 0.000 136.910 0.000 Tracer.py:5(trace_ray)
640000 3.009 0.000 3.009 0.000 {numpy.core.multiarray.zeros}
640000 2.596 0.000 2.613 0.000 {sorted}
640000 2.502 0.000 3.347 0.000 {filter}
640000 1.835 0.000 16.784 0.000 Ray.py:4(__init__)
This does not surprise me, as the main computation a raytracer performs is to test each ray for intersection on the objects in the scene, in this case multiple Sphere objects.
Understood that most of the time is spent into hit(), I wanted to perform line-by-line profiling. This is not possible with the standard python cProfile module, therefore I searched and found an alternative, line_profiler:
$ easy_install-2.7 --prefix=$HOME line_profiler $ kernprof.py -l test3.py Wrote profile results to test3.py.lprof $ python -m line_profiler test3.py.lprof
Before running the commands above, I added the @profile decorator to the method I am interested in. This decorator is added by line_profiler to the __builtin__ module, so no explicit import statement is needed.
class Sphere(object):
<...>
@profile
def hit(self, ray):
<...>
The results of this profiling are
Line # Hits Time Per Hit % Time Line Contents
==============================================================
12 @profile
13 def hit(self, ray):
14 2560000 27956358 10.9 19.2 temp = ray.origin - self.center
15 2560000 17944912 7.0 12.3 a = numpy.dot(ray.direction, ray.direction)
16 2560000 24132737 9.4 16.5 b = 2.0 * numpy.dot(temp, ray.direction)
17 2560000 37113811 14.5 25.4 c = numpy.dot(temp, temp) \
- self.radius * self.radius
18 2560000 20808930 8.1 14.3 disc = b * b - 4.0 * a * c
19
20 2560000 10963318 4.3 7.5 if (disc < 0.0):
21 2539908 5403624 2.1 3.7 return None
22 else:
23 20092 75076 3.7 0.1 e = math.sqrt(disc)
24 20092 104950 5.2 0.1 denom = 2.0 * a
25 20092 115956 5.8 0.1 t = (-b - e) / denom
26 20092 83382 4.2 0.1 if (t > 1.0e-7):
27 20092 525272 26.1 0.4 normal = (temp + t * ray.direction)\
/ self.radius
28 20092 333879 16.6 0.2 hit_point = ray.origin + t * \
ray.direction
29 20092 299494 14.9 0.2 return ShadeRecord.ShadeRecord(
normal=normal,
hit_point=hit_point,
parameter=t,
color=self.color)
Therefore, it appears that most of the time is spent in this chunk of code:
temp = ray.origin - self.center a = numpy.dot(ray.direction, ray.direction) b = 2.0 * numpy.dot(temp, ray.direction) c = numpy.dot(temp, temp) - self.radius * self.radius disc = b * b - 4.0 * a * c
We cannot really optimize much. We could precompute self.radius * self.radius, but it does not really have an impact. Something we can observe is the huge amount of routine calls. Is the routine call overhead relevant ? Maybe: Python has a relevant call overhead, but a very simple program like this
def main():
def f():
return 0
a=0
for i in xrange(2560000):
if f():
a = a+1
print a
main()
is going to take 0.6 seconds, not small, but definitely not as huge as the numbers we see. Why is that ? And why is the raytracer so slow for the same task ? I think the bottleneck is somewhere else.
I decided to profile World.render() to understand what’s going on: this is the routine in charge of going through the pixels, shooting the rays, then delegating the task of finding intersections to Tracer.trace_ray, which in turns re-delegates the task to World.hit_bare_bone_object. I don’t really like this design, but I stick to the book as much as possible, mostly because I don’t know how things will become later on.
The profiling showed two hot spots in World.render(), in the inner loop:
Line # Hits Time Per Hit % Time Line Contents
==============================================================
41 640000 18786192 29.4 29.2 ray = Ray.Ray(origin = origin,
direction = (0.0,0.0,-1.0))
42
43 640000 22414265 35.0 34.9 color += numpy.array(tracer.trace_ray(ray))
Why is it so slow to perform these two operations? It turns out that numpy is incredibly slow at creating arrays. This may indeed be the reason why it’s so slow to instantiate a Ray object (two numpy.arrays), to add the color (another instantiation) and to perform operations in the Sphere.hit slow lines. At this point I’m not sure I can trust numpy.array, and I decide to remove it completely replacing arrays with tuples. The result is pleasing
$ time python test3.py real 0m31.215s user 0m29.923s sys 0m2.355s
This is an important point: tuples are much faster than small arrays. numpy seems to be optimized for large datasets and performs poorly when handling small ones. This includes not only the creation of the arrays, but also any operation in numpy that may create numpy arrays as a consequence, such as calling numpy.dot on two tuples instead of a trivial implementation such as
def dot(a,b):
return a[0]*b[0]+a[1]*b[1]+a[2]*b[2]
in fact, if I use numpy.dot on tuples in Sphere.hit():
a = numpy.dot(ray.direction, ray.direction) b = 2.0 * numpy.dot(temp, ray.direction) c = numpy.dot(temp, temp) - self.radius * self.radius
the total running time goes from 31 seconds to a staggering 316 seconds (5 minutes). My guess is that they are converted to numpy.arrays internally, followed by the actual vector-vector operation.
I call myself happy with a runtime of 30 seconds for now, and plan to optimize further when more complex operations are performed. You can find the version for this post at github.
In the previous post, we explored a very basic way of plotting images: shooting a ray from the center of every pixel, and plot the color of the object we hit. The result is a rather flat, very jagged image
Border jagging arises from the fact that we are sampling with a discrete grid (our ViewPlane) an object that is smooth due to its functional expression (our spheres). The sampling is by nature approximated, and when mapped to pixels it produces either the color of the object, or the color of the background.
We would like to remove the jagged behavior. To achieve this, we could increase the resolution. That would make the jaggies less appreciable thanks to a higher number of pixels. An alternative is to keep the lower resolution, and shoot many rays per each pixels instead of only one (a technique called supersampling). To compute the final color, weighting is performed according to the number of rays that impact vs. the total number of rays sent from that pixel. This technique is known as anti-aliasing.
The figure details visually what said above: the real description of the sphere is smooth (left hand figure). Shooting one ray per pixel, and coloring the pixel according to hit/no-hit, produces either a fully colored pixel, or a fully background pixel (center). With supersampling, we shoot a higher number of rays, and per each pixel we perform weighting of the color.
As a result, the jaggies in the sphere are replaced with a smoother, more pleasant transition
The pattern used for the supersampled rays is important. The module samplers in the python raytracer implements three common strategies.
It is the most intuitive and easy, and the one used above: a regular grid of rays. It is easy to implement but it can introduce unpleasant artifacts for more complex situations. Plotting the position of the rays in the pixel will produce the following layout (for a 8×8 supersample grid)
As we see, the layout is regular on the grid of subcells (painted yellow and white for better visualization) that define the pixel. On the vertical and horizontal distributions (plotted on the grey bars above and on the left) we also see a regular distribution, but its regularity may introduce artifacts in some cases.
Random sampling positions the rays at random within the pixel. This may sound appealing, but it may instead end up as suboptimal: it can produce random clumping, in particular for a small number of samples. This will unbalance the weighting leading to an incorrect evaluation. Plotting one distribution one may obtain
Note the uneven distribution of the points, leaving large parts not sampled and other parts oversampled. In addition, the vertical and horizontal distribution tend to be uneven.
Jittered sampling takes the best of both worlds: the regularity of the Regular sampler with a degree of randomness from the Random sampler. The idea is to select the center of each subcell and apply randomization, so that each subcell produces only one ray, but without the artifact inducing regularity proper of the Regular sampler.
Unfortunately, performing supersampling makes the creation of the image considerably slower. In the following table you can see the timings (in seconds) for Jittered and Regular sampling, compared against the case with no sampling. The size of the image rendered is 200×200 pixels.
| Sample size/sets | No sampling | Regular | Random | Jittered |
| 1/1 | 23 | |||
| 4/1 | 68 | 63 | 160 | |
| 9/1 | 158 | 152 | 271 | |
| 16/1 | 229 | 235 | 276 | |
| 16/2 | 223 | 247 | 267 | |
| 16/4 | 240 | 277 | 260 |
As we can see, supersampling introduces a considerably higher computational effort. We also see that having multiple sets (a requirement to prevent the same set of subsamples to be reused for adjacent pixels, something that again would introduce artifacts) does not really change the timings. According to the current implementation, I expect this to be verified. On the other hand, I don’t expect timings for Regular and Jittered to be so different, since the creation of the values is performed once and for all at startup. This is worth investigating while looking for performance improvement. In the next post I will perform profiling of the python code and check possible strategies to reduce the timings, eventually revising the current design.
The current implementation of python-raytrace can be found at github. In this release, I added the samplers. Samplers are derived classes of the BaseSampler class, and are hosted in the samplers module. Derived Samplers must reimplement _generate_samples. Points are stored because we want to be able to select sets at random as well as replay the same points. Samplers also reimplements the __iter__() method as a generator of (x,y) tuples, with x and y being in the interval [0.0, 1.0). Once initialized, the Sampler can therefore be iterated over with a simple
for subpixel in sampler:
# use subpixel
The World class can now be configured with different Samplers for the antialiasing. The default Sampler is a Regular 1 subpixel Sampler, which is the original one-ray-per-pixel sampling.
A quick addition needed to the raytracer is providing freedom to add more objects to the rendering scene. In Part 1, the design was such that only one object, a sphere, could be drawn. The new code allows much more flexibility. I added a Plane object, introduced assignment of colors to the objects, divided the source into multiple files, and fixed a bug relative to rendering direction. Let’s see more specifically.
The class World is the main interface to user programming of the raytracer. This small program creates and renders a scene containing four spheres, a white one in the origin, the others along the axes, each with different colors.
import raytrace
from raytrace import objects
w=raytrace.World()
w.add_object(objects.Sphere(center=(0.0,0.0,0.0),
radius=10.0,
color=(1.0,1.0,1.0)
)
)
w.add_object(objects.Sphere(center=(50.0,0.0,0.0),
radius=10.0,
color=(1.0,0.0,0.0)
)
)
w.add_object(objects.Sphere(center=(0.0,50.0,0.0),
radius=10.0,
color=(0.0,1.0,0.0)
)
)
w.add_object(objects.Sphere(center=(0.0,0.0,50.0),
radius=10.0,
color=(0.0,0.0,1.0)
)
)
w.render()
The resulting image is the following
Clearly, the white sphere is not visible, as it’s hidden by the blue sphere. This test allowed me to discover a problem with orientation: the green sphere was on the wrong side. I therefore had to analyze a bit the orientation and the different coordinate systems into play here
A remapping is therefore needed from the pixel coordinate of the rendering and the pixel coordinate of the display (e.g. pygame). The transformation is trivial, of course, but it must be kept into account, otherwise pics will be flipped horizontally.
Another interesting fact is that, according to “Ray Tracing from the ground up” it’s useful to perform the raytracing operation starting from the bottom and working our way up, rendering pixels from left to right. According to them this is for coding symmetry and future convenience, so we stick to it.
In the book, finding the foremost object is made more complex by the language used, C++. In python, you can use functional programming style to obtain the same in a very concise statement. The idea is to cast a ray, then go through all objects in the world to find the intersection point (if any) between the ray and the object. If more than one object is hit, the one with the intersection point closer to the observer commands the pixel color, since it’s in front.
I achieve this with the following code
def hit_bare_bones_object(self,ray):
def f(o):
shadeRec = o.hit(ray)
if shadeRec:
return (shadeRec.parameter, o)
else:
return None
try:
foremost=sorted( \
filter(lambda x: x is not None, \
map(f, self.objects)
), key=lambda x: x[0]
)[0][1]
except IndexError:
return None
return foremost
What does this code do ? I defined a simple internal function f which accepts an object, performs the hit and returns a tuple containing the hit point position (as a parameter of the ray, so it’s a single number, not a xyz coordinate) and the object.
Now, I use this function to map all the objects defined in the world. I will obtain a list with one entry per each object, either a None (not hit) or a 2-tuple containing the parameter and the hit object. I filter out the None entries, leaving only the 2-tuples and then sort according to their first element. The 2-tuple with the lowest parameter is now at index 0, and the [1] element of this tuple is the foremost object. At any time, the list may be empty (such as if you don’t have any object, or no object is hit. In that case, a IndexError will be raised and that will indicate that the ray hit nothing. I may rework on this function later on, but for this second round, it suits my needs.
It’s now time to move on to samplers. Given that the code is growing in size, I created a git repository you can clone from. The release of this post is available here. The code is under BSD license.
Some time ago I visited Sydney, and I made a tragic mistake: I entered the University bookshop. Why a mistake, you say? I am book maniac. As soon as I enter a book shop (live or on web) I end up spending up to a thousands euro every time. This time, it was not the credit card, but the limit of 20 kg on my luggage to put a limit on what I could buy. Needless to say I got really good stuff, in particular this 761 pages of pure awesomeness: “Ray tracing from the ground up”. This book teaches you how to write a raytracer, step by step from a first basic skeleton to incredibly complex optic effects. It also provides code, available for download under GPL. I haven’t downloaded it, but according to the book it’s in C++. Since I want to keep myself fresh with python, I will write my own version in this language.
The first objection I may hear is performance. Native python is much slower than C++. I agree, and I do it on purpose. My objective is in fact not only to write a raytracer for fun (and the world is full of already awesome raytracers), but also to perform some infrequent python exercises: interfacing with C, parallelization and, hopefully, some OpenCL programming. This is my hope, at least. I will try to spend some time on it and see how far I can get.
Raytracing is a technique to produce a photorealistic image. It works by projecting rays from the observer to the scene, and coloring pixels on a viewplane for every ray that intersects an object.
This mechanism resembles how vision works, although in the opposite direction. Light rays from a lamp hit objects and their reflection happens to scatter around. Some of these reflections will enter our eyes and allow us to see the world around us. Doing the opposite, tracing from the observer, is clearly more efficient as we don’t care about the rays not hitting the observer (at least in its basic implementation), only those who do.
Performing raytracing (or to be more accurate, for now just its basic form raycasting) involves the following steps:
That’s it. Basically, it’s an exercise in geometry: finding intersections between lines and 3D objects in 3D space. This first program, ray.py, does exactly that. You will need to install the Python Imaging Library, pygame and numpy. The result is intriguing:
Ok, I have a very loose definition of “intriguing”, but it’s a start.
Hello StumbleUpon users. I am writing two additional posts on the Mandelbrot set, and you may be interested in them. The first one is here. The second one is here. Thank you for your interest!
This code is so fascinating
from PIL import Image
max_iteration = 1000
x_center = -1.0
y_center = 0.0
size = 300
im = Image.new("RGB", (size,size))
for i in xrange(size):
for j in xrange(size):
x,y = ( x_center + 4.0*float(i-size/2)/size,
y_center + 4.0*float(j-size/2)/size
)
a,b = (0.0, 0.0)
iteration = 0
while (a**2 + b**2 <= 4.0 and iteration < max_iteration):
a,b = a**2 - b**2 + x, 2*a*b + y
iteration += 1
if iteration == max_iteration:
color_value = 255
else:
color_value = iteration*10 % 255
im.putpixel( (i,j), (color_value, color_value, color_value))
im.save("mandelbrot.png", "PNG")
If you execute the python code above, the result is this beautiful and mysterious picture:
What you see is a Fractal, a highly fractured geometric picture, and more specifically one of the most famous representatives: the Mandelbrot Set. I am not familiar at all with the mathematics behind fractals, so I won’t talk about them, or I will simply end up saying something wrong.
Such a highly featured and attractive picture is obtained from a really simple algorithm. It can be explained easily: take a grid of points on the plane, like for example a simple sheet of graph paper. Choose a scale for the axes. For every point (x,y) of the plane now do the following:
This, as said, decides the color of one single point (hence, a pixel of the image). Repeat for all the pixels and what you obtain is the Mandelbrot set.
You can also think in terms of complex numbers, plotting on the complex plane. Instead of (a,b) we now have a single complex number z0, and instead of (x,y) we now have a complex number c. The repeated operation now can be described as
What if you change the starting point? What if we choose z0 = 1.0 instead of 0.0, or z0 = 1.0 i ? I tinkered with the program a bit. Here are the results. This image shows how the Mandelbrot Set changes when z0 goes from -3 to 3 (step 0.1). The set “blows away” towards the left, when going either towards -3 or +3.

If you change z0 along the imaginary axis, going from -3.0 i to 3.0 i, what you obtain is instead:

It blows away towards the right. So I said, what happens if I increase both the real part and the imaginary part together? will they “balance out” ? Not really. It burns away

I have absolutely no idea of what all this means from the strict mathematical point of view, except the fact that I am exploring the space of the values of z0 and how the Mandelbrot changes accordingly. I thought it was just cool to try it out.