Tuesday, 26. September 2006, 17:37:10
Project V is advancing at quite a rapid pace lately. I'm almost back to 100% and I can again write code at my usual speed. I'm still rewriting my graphics library and I think it's smart that I spend the time to make sure my graphics look good. This is a visual environment and the last thing I want is people being disillusioned because it doesn't look good.
I finished my line drawing and ellipse drawing routine (with alpha channels) last week. This week, I concentrated on font rendering. It is ultra precise. I may have to optimize some of it if it's too slow. Right now, it seems fine. In Windows, there are not many options to render fonts. If you want to alter anything, it's very difficult to do. For example, the font used in this blog's header is Century Gothic, but you can't get it to appear that way with Windows. First, anything below 24pt shows up either too thin (as in the menu bar of this blog which is ok for that) or too thick if in bold. In an image software, I just used a 24pt size and scaled it down by 20%. I also adjusted the inter-character spacing. You can adjust inter-character spacing in Windows, but you can't scale. Well, you can, but there's an irritating side-effect. If the size, after scaling, goes below what it would be around 18pt, it reverts to that size's shape. So it again gets too thin. This is NOT caused by the reduction in size from scaling. Windows will actually fetch the font shape based on the final size and not the original size specified. Also, it completely ignores the weight set for the font. Either it's normal or it's bold. There is no in-between although the API lets you specify it.
Normally, I don't give a damn about fonts, but I found the fact that fonts changed shape seemingly at random really aggravating. Luckily, Windows offers a way to fetch glyphs for each character you wish to render. I still use the Windows API to get the spacing of characters so they appear correct, but I handle all the scaling and conversion myself. This way, the original shape doesn't magically change to that of a lower font size during conversion. I can rotate, shear or do any kind of transformation. I can also apply a filter to the final output. I am now working on enabling an outline of a certain font. It is slightly memory intensive, but the algorithm is rather simple.
Since I want an alpha channel so that all the fonts look very smooth, I need values from 0 to 255. Well, it just so happens that 16*16 is 256. That's very close. We can just reduce all values above 128 down by 1 to get a maximum of 255. So what happens is I create a temporary double buffer for a single glyph and render it at 16 times the size wanted. Then I just count how many pixels are set in each 16x16 grid. That gives me a total of 256 (actually 257 with 0) possible values. I should set an option to specify the grid size. Windows offers grid sizes of 1x1, 2x2, 4x4 and 8x8 only. I offer one size larger. It could be useful in the future if we allow more than 256 shades for each color channel. Although there are many colors that can't be distinguished, one thing that is lacking with the RGB system is that the human eye can see more than the 256 ranges for individual colors. For example, we can see more than the 256 shades of gray, red, green or blue or other solid color. That's why gradients that change colors often look better than gradients that only alter the brightness of a single color.
As for the glyph itself, I thought it would be a lot harder than it was. When you retrieve a glyph, it is upside down with positive values going up. Perhaps it is the display that actually has inverted y axis with values progressing downwards. Either way, the y values must be inverted. Then you can have one of three different shapes to constitute the outline of the glyph. Well, there are really two shapes, but we'll get to that soon enough. The first shape is a list of straight lines. This is simple enough. Just go through them and draw them. The other kinds of shapes are Bézier curves or spline curves. These can show up as a simple third order spline or as two connected third order splines sharing some of the same control points. My code can handle any amount of connected third order splines, so maybe that'll be useful in the future. The storage details aren't important, but the way it is rendered may be of note. With splines, you can't just ask if a pixel is on the spline or not. You can't go "what's the y value for this x value?". Well, you can't do it without a lot of computations.
There are two ways of rendering splines. One popular method is the subdivision algorithm. A spline has three points (A,B,C). A and C are located on the spline itself, one at each end point. B is a control point that usually lies outside of the spline (otherwise it would just be a line and there's no need to use a spline). The subdivision algorithm takes the midpoint of D=mid(A,B) and E=mid(B,C) and then again takes the midpoint between these two points F=mid(D,E). So F lies directly on the spline. For the next subdivision, you repeat the process, but using (A,D,F) and (F,E,C). Continue as necessary. Usually, this requires recursion and I abhor recursion. Yes, it seems clever and I do understand it all too well. I can convert any recursive algorithm into a non-recursive one and I can think of recursive algorithms for many problems usually before the non-recursive solution. But recursion takes an inordinate amount of memory. With non-recursive algorithms, you can set up your memory requirements ahead of time and no matter how much "depth" or iterations the algorithm requires, you can rest assured that your code won't go nuts with memory.
Non-recursive it is then. Although the subdivision algorithm only requires addition and shifting right by one, it requires saving the state for each successive subdivision. This grows at an exponential rate. So in my non-recursive algorithm, I choose to just compute each point along the curve the old fashioned way... with an equation. Sure, it requires more multiplications, but I think it's faster than the overhead of calling, storing and restoring all those stack frames. I don't have proof of this, but I did have just a couple of hand coded subdivisions and I saw a speedup even though I was computing many more points. A spline is easily computed. What you can do is compute a point along the spline depending on a percentage. Now, you won't get exactly where you expect. For example, 50% may not be what appears to be visually at 50% along the path. This is because the control points control where the halfway point is. More points control their share of the spline. So if most control points are near the first end point, then most computed points will be in that area. You'll have to go near 80% just to get to points that appear to be visually over 50% of the spline's length. But this usually isn't a problem. Just plot more points. The number of plot points is adjustable. So now we just have a loop that computes points along the spline. Just draw a line between each set of points until we reach the end point.
The line drawing algorithm is also a little weird. It only computes one point for each y value along the line. In other words, we do scanline conversion. Also, we always draw from top to bottom and never draw the bottom most point. The reason for this is two fold. First, whenever we set a pixel, we actually XOR (flip) it. We want start and end pixels on every row. If two lines cross at the exact same pixel, then we would get into a problem if we only had a start point, but no end point to fill in. So we flip it if it's already set. Since we XOR and successive lines share endpoints, this would mean endpoints would always be written twice and thus always be off. This is not what we want. So we don't draw the end point. The other reason we don't draw the bottom most point is for parallel lines, which we don't draw at all. If we drew the bottom most point, this would create a gap. For example, the letter L is basically two rectangles. Suppose we draw the bottom rectangle and then we draw the right most vertical line from the vertical rectangle. If we draw the bottom most point of this line (that meets the top of the bottom rectangle), it would be inside the bottom rectangle and screw up our scanline rendering algorithm.
The scanline rendering algorithm simply has two states. Whenever it sees a set pixel, it sets every pixel afterwards until it reaches another already set pixel. Then it switches off and repeats the scanning process. If the vertical line as mentioned above juts into a rectangle, then the scanner would turn off before reaching the end of the rectangle. So we don't draw the bottom most pixel in the line drawing algorithm. We could have chosen the top most pixel, just as long as you decide which one to not draw. It has to be consistent throughout. I originally had tried something like not drawing the end point of each successive line no matter if they went up or down like a chain. But this did not work because you get into problem with horizontal lines. With horizontal lines, you don't draw the end point if the sign of the slope of the previous line is the same as the next line, but you do draw it if they are different. Also, the starting point is not drawn if you have successive horizontal lines. Although improbable, it still has to be considered. It plain didn't work. Funny how the very simple algorithm of just not drawing the bottom most pixel on the y axis worked perfectly. The slope doesn't even matter and you can always reorder the endpoints so that you draw from top to bottom.
Small text is still a problem, but I think I'll add an option to re-adjust the intensity if the levels over the entire glyph fall below 255. In other words, if there isn't a single pixel at full intensity, then re-adjust the brightness. This way, it assures that any rendered glyph appears at full intensity. Windows uses something called hinting. I have no idea how that works and don't really want to spend any more time on it. Besides, the regular font routines are still available, so you can use that for most text if you wish. I will still use the system routines for text editing as there's just too much basic and built-in functionality there. System wide copying, editing, pasting is too important I think. I can update it, but I don't think it's worth it at this point.
Now I'm at the point where I have to draw connections. I'm still not sure how to do this. A lot of component drawing tools force the components to stay put. You can't move them unless you're willing to manually retrace the connections. I don't want this to become about tracing. You should click on the source and then the destination and be done with it. Checking for overlapping lines is what's really troubling me. I could limit lines to being vertical or horizontal only for each segment. That way I could have a list of vertical and horizontal segments. Any overlap and I'd need to modify it. How to modify it, that's the question. Moving components is also a problem. Not only is the component moving, but all its connections.
The rest is simple. The compiler is simple. I think I'll just write a plain simulator for now. Later on, writing a compiler would be equally simple. I'll just use sockets for linking to other applications. I like that the components can be anything. Just link them up and you're done.
One thing that's cool is that normally, you have to specify whether or not an argument is a value or a function that can produce those values. You need to know the difference. With components, it knows no such thing. You can link up any kind of component that produces those values. It can be a single value used over and over, or it can be the output from another component. There's no need to differentiate different kinds of sources. What does it matter where the data comes from? Even in Lisp, you have to put an opening parenthesis to call a function. You can't use it as you would a value because then you would get a reference to the function and not the produced values.
I also like that sometimes you have algorithms that produce data where you would like to have it updated in a certain custom way. Usually, the algorithm in question provides a callback. This is tedious. Not only in the definition of the callback, but in getting everything to work right. What I like is that components can provide this same functionality as self connected input and outputs. If you want to change the data, you can break the link and instead make it go through a component of your choosing.
GUI's will work like this. It'll provide a host of functionality where you can just connect to any event. This is much more natural way to handle events. It's no longer based on polling or queuing.
I'm now looking at what information should be stored with each component at design time. Things like author, date of creation, last modified, version, input, output, name, category, icons, etc. I'm thinking of using a database to store components. Would there be any drawback to this?
Anyways, after line drawing, it's all about creating components. Components for basic operations and certain system operations. I'll release an API for C++ if anyone wants to make custom OS components. Then I foresee a standards committee or something to set interface standards for different operations. What I like about this is that you can do things multiple ways. Because conversion of data is nearly automatic, you don't have to stick to ONE way. You can use templates. That's for a later version, but should make things quite interesting. This will involve the ability to create networks that actually execute at design time to modify and alter your application. So templates can actually check what type the inputs are and can produce and alter existing components. For example, an equation component would let you write an equation and when you click "submit" or whatever, the design time network would create the inputs, outputs and runtime components necessary for execution. Then you can link up the inputs and outputs to other components. Components will also be able to respond to events such as linkage and change the look of the component as needed. For example, an addition component can start off with two inputs. When there are no inputs remaining, the design time network can insert another input. Addition could then be done on any amount of inputs. Powerful stuff!
I'm really optimistic lately. Although there's not much on the interface just yet, I think it looks rather good. Once I have a few things added, like a "favourites" sidebar for most used components and the library screen, we should be set to go.
I'll continue this thread once I finish the linking of components. I really wish I could find another name for components. The word "component" comes with pre-existing notions. It'd be nice to leave all that behind and start fresh.
Something else I'm doing is taking screen shots and writing down the progression of this tool. It's always nice to see how things work out. This is not seen too often. I've also got a little animated splash screen in the works. I may just use it for "demo" purposes. By demo, I mean a video, not a prototype. If I can continue to work like this, another few weeks and I should be in the testing phase. So end of October looks promising, but all my predictions have been wrong so far because something unrelated always comes up. Oh well.
Until next time.
Here are a few shots of the scanline algorithm at work from my progress report compilation.
This is scanline converted 'S' with the outline only.
This is the same 'S' filled in.

Although this image is not from those shown here, it is the same font with a heavier weight using the exact same algorithm scaled down to its proper size using the 16x16 algorithm.

Here is the same image zoomed in.

I think maybe a list of scanline coordinates would be faster and take up a lot less memory. I'll leave that for a future revision. Right now, this is for text that I can't render using the regular OS API.