### Abstract

Fractals demonstrate ‘Self-similarity’ — each set comprises several smaller similar copies of itself. This can be demonstrated using a geometric shape as a template which represents an attractor. Templates describe graphically the functions which govern the movement between points on a plane. Templates can give an approximation of a fractal, but can be drawn on a computer using the Chaos Game. Self-affine fractals allow an increase in the range of fractal possibilities via modifications to template dimensions, whilst statistically self-similar fractals add an element of randomness for a more natural appearance. Fractals can also be used to compress images to more manageable file sizes.

### Self-similar fractals and their templates

The concept of similarity features prominently in classical geometry, notably in the writings of the Greek mathematician Euclid (*c*.300 bc), the ‘Father of Geometry’, who laid the foundations of rigorous geometrical argument. Two figures in the plane are *similar* if they have the same shape, but not necessarily the same size, so that one may be obtained from the other first by scaling and then sliding it around and rotating it, perhaps flipping it over. In modern parlance, two objects in the plane are similar if it is possible to make a reduced or enlarged photocopy of one and position it to coincide exactly with the other, perhaps turning it over first. If this can be done without scaling, in other words, with a same size photocopy, the objects are called *congruent*. Otherwise, the reduction or enlargement factor required to produce the similar copy is called the *scale* or *scaling ratio* of the copy, which may be expressed as either a fraction or percentage. Thus if a figure is a ¼ scale (or 25 per cent scale) copy of another then all the lines in the copy are ¼ of the length of the corresponding lines in the original.

All circles are similar to each other, as are all squares. However, two triangles are similar precisely when the three angles of one triangle are the same as the three angles of the other. Two p. 18↵rectangles are similar just when the length of the longer side divided by that of the shorter side is the same for both rectangles. Figure 8 shows several figures that are all similar to each other.

A *self-similar* set is one that is made up of several smaller similar copies of itself. We’ve seen that the von Koch curve is self-similar since it comprises 4 suitably placed scale ⅓ copies of itself. This may be represented diagrammatically by a *template* consisting of one large rectangle and 4 smaller ones, each a ⅓ scale copy of the large one, see Figure 9. The part of the von Koch curve framed by each of the smaller rectangles is a ⅓ scale copy of the whole curve framed by the large rectangle. The size and position of the p. 19↵rectangles specify the scaling and positioning needed to fit together the four smaller copies to make up the whole von Koch curve.

But much more than this is true. The template is a simple diagram made up of classical geometrical shapes, namely five rectangles—there is nothing fractal about it. Nevertheless, the template completely defines the fractal: the von Koch curve is (essentially) the *only* object that is made up of smaller scaled copies of itself positioned as indicated by the template. It is easy to reconstruct the von Koch curve from the template by repeated substitution of a scaled template into the smaller rectangles. The first couple of stages are indicated in Figure 10 from which it p. 20↵should be clear that the small rectangles form patterns that get progressively closer to the von Koch curve itself.

This is a particular case of a very general and powerful method of describing and constructing fractals. A *template* consists of some simple shape (square, rectangle, triangle, etc.) and a number of smaller similar copies of the shape positioned somehow in the plane. Each of the smaller shapes represents a *similarity transformation* or *scaling transformation* that takes the larger shape and scales and repositions it to coincide with the smaller copy, scaling and positioning any figure drawn inside correspondingly. A fundamental property is that templates define fractals. That is, given a template, there is essentially just one figure, usually a fractal, made up of smaller copies of itself scaled and positioned according to the shapes in the template. (This statement is not quite accurate, but we will be more precise soon.) This figure is sometimes called the *attractor* of the template.

For a template made up of similar shapes, the attractor will be *self-similar*. Figures 11 and 12 show several self-similar fractals with their templates. In each case note that what is seen in each of the smaller regions of the template is a scaled copy of the whole. Figure 11 is known as the *(right-angled) Sierpiński triangle* made up of 3 copies of itself at scale ½. The *snowflake* of Figure 12(a) is made up of 4 smaller copies of itself; 3 are at scale ¼ whilst the larger central copy is at scale ¾ and has been rotated by 180° (or alternatively reflected in a horizontal line). This example illustrates two variants: the scalings of the similarity transformations do not all have to be the same, and some of the copies may be reflected or rotated. The *spiral* in Figure 12(b) is remarkable in that it is defined by just 2 similarity transformations. The first scales down only slightly with a 19/20 ratio but with a 45° rotation, and the second transforms the whole picture to the spiral at the end of one of the arms of the main spiral scaling at 1/5.p. 21↵

### Orientation

In saying that a template completely defines a fractal we have glossed over the possibility that if the shape in the template has some symmetry then there will be more than one similarity that transforms the shape to a smaller copy. For a simple instance, there are 8 different similarities that transform a square to a smaller square. The scaling can be direct or there can be a rotation through 90°, 180°, or 270°, or a ‘mirror’ reflection in a vertical, horizontal, or one of the two diagonal lines, see Figure 13; we refer to these possibilities as the *orientations* of the scalings. Thus for the template used for the Sierpiński triangle in Figure 11 we have p. 22↵p. 23↵a choice of 8 similarity transformations that move the large square onto each of the 3 smaller squares, so there are 8 × 8 × 8 = 512 different ways of choosing the three similarity transformations. Figure 14 shows two of the possible attractors resulting from the same template but with different choices of orientation for the similar copies. Note that the attractor of Figure 14(a) may be obtained from 8 different combinations of orientation of the similarities: the portion of the attractor in each of the three small squares may be obtained from the whole attractor either with a rotation or by a reflection in a diagonal. On the other hand, there is only one choice of orientation for each of the three similarities that gives the attractor of Figure 14(b).

It turns out that there are 456 different attractors defined by this ‘three squares’ template. Of these, 8 have mirror symmetry about the rising diagonal, and all of these result from 8 different combinations of orientations for the three small images. The other 448 result from a unique combination of orientations.

Thus if the shapes in a template shape have some symmetry, the template alone need not define a unique attractor; however, once the orientations of the similarity transformations are specified, p. 24↵p. 25↵there ceases to be any ambiguity. This important fact may be stated as follows:

Given a template, there exists a unique set (called the attractor and usually a fractal) that is made up of similar copies of itself which are scaled and positioned according to the shapes in the template, provided that the orientation (i.e. reflections or rotations) of each copy is specified.

(The perceptive reader may realize that this is still not strictly true—there are ‘pathological’ sets which also fit the templates: for example the ‘empty set’, which has a blank picture, fits any template. But for here this statement is good enough.)

### Templates and functions

Functions were introduced as a rule or formula that takes each point of the plane and ‘moves’ it to a new position. There is another way of thinking of functions which emphasizes their geometrical effect. A (black and white) picture in the plane is made up of a large collection of points (these might be thought of as the dots or pixels on a computer screen that make up the picture). A function moves each individual point to a new point. Thus the *set* of points that make up the picture are moved by the function to a *set* of new positions, which taken together form a new picture. The function *transforms* the original set or picture into a new one, Figure 15. The word *transformation* is used synonymously with ‘function’, particularly when thinking about the geometrical effect on objects. (The words *map* or *mapping* are also used—in the familiar sense of the word, a map is really a function that associates each point on the ground with a point on the page of an atlas, at the same time transforming a coastline, say, to a corresponding wiggly curve on the page.)

We have encountered the function (*x, y*) → (½*x*, ½*y*) which moves each point in the plane to the point midway between it and p. 26↵the origin. Its geometrical effect is to shrink everything down towards the origin by a factor ½. This function will transform any picture to a ½ size copy—thus it represents a similarity transformation of scale ½.

Templates are simply a diagrammatic way of expressing a family of transformations or functions. Each of the smaller shapes codes the similarity transformation which takes the large shape and scales it down to the size and position of the smaller one. Each point inside the large shape is moved to the corresponding position within the smaller one, so any picture inside the large shape is transformed to a similar picture inside the smaller shape.

The Sierpiński triangle of Figure 11 has a template comprising the ‘unit square’, that is the square of side length 1 with corner with coordinates (0, 0), (1, 0), (0, 1), (1, 1), and three smaller squares of side-lengths . The three functions giving the three similarity transformations that transform the unit square to the bottom left, bottom right, and top left squares respectively, have the formulae:

p. 27The first function just shrinks everything down towards the origin by a scale factor ½. The second function does the same thing and then, because of the ‘+ ½; term, shifts everything a distance ½ to the right. The third function scales by a factor ½ and then shifts a distance ½ up. As well as transforming the square, the functions transform the fractal itself. Thus (*x, y*) → (*x, y*) shrinks the entire Sierpiński triangle onto the scale ½ copy inside the bottom left square. Similarly, the other two functions in (A) transform the entire Sierpiński triangle onto the similar copies in the bottom right and top left small squares. Together, these three scale ½ copies make up the entire Sierpiński triangle. The three functions (1) describe the basic self-similarities of the Sierpiński triangle—they are a mathematical way of describing the template. The Sierpiński triangle is completely defined by the template and orientations, or equivalently by the three functions. However, the functions have the advantage that they also encode the orientations of the transformations—these do not have to be noted separately. For example, the three functions that transform the fractal in Figure 14(b) to the similar copies in the three smaller squares are

with the second function differing from that in (1).

We saw that a template (with given orientations) defines a unique fractal. This fact can be expressed in the language of transformations or functions, namely that a family of transformations defines a unique object.

Given a family of contracting transformations, there exists a unique set (called the

attractorand usually a fractal) such that, taken together, the transformed copies of the set make up the entire set.

This ‘theorem’ or ‘mathematical fact’ can be stated and proved in a formal mathematical way. Note that the theorem states two p. 28↵things: that such a set exists, and that it is unique—there is only one figure which satisfies the stated property. This is typical of many major theorems in mathematics which guarantee the existence and uniqueness of entities with certain properties. Notice also that this statement does not require the transformations to be similarities—it holds for any *contracting* transformations, that is transformations that shrink sets in some way, and similarities of scale less than 1 are a special case of this. A family of contracting transformations is called an *iterated function system*, and the theorem is sometimes called the ‘Fundamental Theorem of Iterated Function Systems’. In these terms the theorem simply asserts that an iterated function system has a unique attractor.

### Drawing fractals

Given a template or a family of transformations, how can we obtain good pictures, usually on a computer, of the unique fractal so defined? One widely-applicable method was indicated above for the von Koch curve in Figure 10 where we repeatedly substituted scaled down copies of the template into itself to get increasingly good approximations to the fractal.

An alternative method that is easy and efficient to program, called the *Chaos Game*, was introduced by Michael Barnsley. This produces a sequence of points in the plane (i.e. on a computer screen) that give a very good approximation to the fractal. This is rather like drawing fractals by iterating a function as we did for the Hénon attractor, the difference here is that at each step the function to be applied is chosen at random.

Let's consider this for the Sierpiński triangle with its template expressed by the three functions in (1). Take any initial starting point, the origin (0, 0) will do. Select one of the three functions at random, for example by tossing a die, and if it comes up 1 or 2 choose the first function, for 3 or 4 choose the second, and for p. 29↵5 or 6 choose the third, and apply the chosen function to the initial point to get a second point. Throw the die again to select another function and apply it to this second point to get a third. Continuing in this way, we get a sequence of points, each obtained from the previous one by applying one of the functions chosen at random. These points will not be arbitrarily scattered across the picture but will jump around the Sierpiński triangle and after a time will fill out the whole fractal. Omitting the first 100 points (to allow the procedure to settle down) and plotting the next 10,000 points will give a good computer image of the Sierpiński triangle. This method is effective for computing a picture of the fractal defined by any family of contracting transformations. The chaos game was used to draw the fractals in Figure 12.

### Self-affine fractals

An *affine transformation* is more general than a similarity, in that it scales by different ratios in different directions and so elongates objects, for example affine transformations transform squares into rectangles or parallelograms, and circles into ellipses. A *self-affine* fractal is one that is made up of smaller affine copies of itself. Self-affine fractals can be represented by templates, but now the smaller shapes are affine copies, rather than similar copies of the large shape. Typically the template will consist of a large square and smaller rectangles or even parallelograms. Figure 16 shows a self-affine fractal with its template with each of the three rectangles containing an affine copy of the whole. For self-affine fractals the component parts may be elongated in certain directions, and this allows a much greater range of fractals to be constructed than just self-similar ones. Figure 17 shows a self-affine fern and a self-affine tree, which on looking closely are each made up of 5 smaller affine copies—they each have templates consisting of just 5 rectangles or parallelograms (the templates have not been shown to avoid the pictures becoming too cluttered).p. 30↵

### Statistically self-similar fractals

Another variation on self-similar constructions is to introduce some randomness. To illustrate this, let's return to the von Koch curve construction. As before, start with an interval, split it into three equal parts, and remove the centre part. But at this juncture toss a coin. If it comes up heads then insert the new line segments with the ‘⋀’ shape pointing upward, just as before, but if it is a tail insert the ‘ ⋁ ’ shape pointing downward rather than upwards. Now remove the middle thirds of these four straight segments, and toss a coin for each gap in turn. Insert the ‘ ⋁ ’ pointing upwards for each head, but downwards for each tail. Continue in this way, tossing the coin for each gap to determine the direction of the new part. This produces a *random von Koch curve*, see Figure 18.

p. 31This random curve is not self-similar in the strict sense: 33 per cent photo-reduction does not give exactly the same picture as each of the four component parts. Nevertheless, it is *statistically self-similar* in that the modification inside each segment follows p. 32↵just the same random procedure at every stage of the construction.

Introducing an element of randomness into a fractal construction can often result in a more natural appearance. A ‘real’ coastline has irregularities more like the random von Koch curve than the non-random version. A forest skyline viewed from a distance might be considered a fractal, but there will be a randomness in the skyline resulting from statistical variations in tree heights.

### Fractal image compression

Image compression involves coding pictures by a relatively small amount of information which is nevertheless enough to enable the p. 33↵accurate reconstruction of the pictures. Such techniques are central to internet technology. There is a limit to the rate at which data can be transmitted to a computer via a cable or a wireless connection and sending images pixel by pixel would be very slow indeed. To overcome this, pictures and movies are coded in a very efficient way before being sent to personal computers which contain software to decode the information and display the pictures.

Image compression depends on the fact that there is a considerable amount of redundancy in any picture or photograph. If the 2 million or so pixels or ‘dots’ on a computer screen were coloured completely haphazardly by any of the 16 million different shades in principle available, this information could not be compressed if an identical screen image was to be reproduced. However, in reality, regions of a picture will vary slowly in texture and colour, for example there may a large area of sky of nearly constant shade. Moreover, there may be repetition—one part of a wheat field looks very similar to another. A compression method capitalizes on such redundancy to reduce the amount of information needed to specify the picture. Such methods are lossy in that the picture reconstructed from the compressed data will not be identical to the original—pixels will have slightly different colours and some features may be less clear. Nevertheless, a good compression method will lead to a reconstructed image that, to the eye, is virtually indistinguishable from the original.

We have seen many fractals that are determined by a template of a simple form. In particular, convincing pictures of natural objects such as trees, clouds, ferns, or grasses result from templates with a small number of shapes. These pictures are completely specified by their templates (along with any orientations needed) and these can be represented very concisely, for example by giving the coordinates of the corners of the figures in the template. This provides a very efficient way of coding intricate pictures—the picture of the tree in Figure 17 is effectively determined by about 30 two-digit numbers. In this way complex objects can be p. 34↵specified by a small amount of data that allows the object to be reconstructed using one of the drawing methods indicated above.

As Michael Barnsley realized in the late 1980s, this has a potential for image compression. If, starting with some black and white drawing in the plane, one could find a relatively small number of transformations for which the drawing, or at least a good approximation to it, is the attractor, then the drawing could be compressed into a small amount of data. Ideally, one would like an automatic ‘scanner’ or ‘camera’ which could scan a picture and output a small family of transformations or, equivalently, a template, whose attractor is indistinguishable from the picture.

A great deal of research was done into the 1990s to develop such a process. One method, very roughly, involves partitioning a picture into squares and, for each 2 × 2 block of four squares, scanning the individual squares to find the one in which the picture is as closely as possible a ½* scale version of that within the 2 × 2 block. This gives a similarity transformation from a block of squares to an individual square, and taking these all together yields a family of transformations with an attractor that is hopefully close to the original picture. Essentially, this procedure seeks out approximate self-similarities within the drawing. Such a method may be adapted to greyscale pictures, that is those made up of all shades from black to white, and indeed to colour pictures.

Fractal compression methods can give very good data compression ratios with little loss of resolution when the pictures are reconstructed. Moreover, reconstructing images from the family of transformations is very quick and efficient. The disadvantage is that the process of scanning squares required to encode images is computationally rather slow. In particular, the automatic encoding was not really rapid enough when the challenge became to compress videos in the mid- to late 1990s. Nowadays, methods such as JPEG and, for video, MPEG, together with wavelet methods dominate image compression and fractal compression is less used.