This is an early draft of the full article published in Graphics Gems IV, the citation is:
Haines, Eric, "Point in Polygon Strategies," Graphics Gems IV, ed. Paul Heckbert, Academic Press, p. 24-46, 1994.
(The full article is better, but I cannot find the final text on my machine.) The associated code for this article is available online.
Testing whether a point is inside a polygon is a basic operation in computer graphics. Graphics Gems presents an algorithm for testing points against convex polygons (Badouel 1990). This Gem provides algorithms which are from 1.6 to 9 or more times faster for convex polygons. It also presents algorithms for testing non-convex polygons and discusses the advantages and drawbacks of each. Faster, more memory intensive algorithms are also presented, along with an O(log n) algorithm for convex polygons. Code is included for the algorithms discussed.
If the entire area enclosed by the polygon is to be considered inside, then the winding number is used for testing. This value is the number of times the polygon goes around the point. For example, a point in the center pentagonal area formed by a star has a winding number of two, since the outline goes around it twice. If the point is outside, the polygon does not wind around it and so the winding number is zero. Winding numbers also have a sign, which corresponds to the direction the edges wrap around the point.
Complex polygons can be formed using either polygon definition. A complex polygon is one that has separate outlines (which can overlap). For example, the letter "R" can be defined by two polygons, one consisting of the exterior outline, the other the inside hole in the letter. Most point in polygon algorithms can be easily extended to test multiple outline polygons by simply running each polygon outline through separately and keeping track of the total parity.
While the above method is simple and quick, another projection may be useful when the polygon vertices are not guaranteed to lay on a single plane. While such polygons could be considered ill defined, they do crop up. For example, if a spline surface is tessellated into quadrilaterals instead of triangles, then the quadrilaterals are likely to be ill defined in this way. If a component is simply dropped, cracking can occur between such polygons when they are cast upon different planes.
One solution is to tessellate such polygons into triangles, but this may be impractical for a variety of reasons. Another approach is to cast all polygons tested onto a plane perpendicular to the testing ray's direction. A polygon might have no area on this plane, but the ray would miss this polygon anyway. Casting a polygon onto an arbitrary plane means having to perform a matrix transformation, but this solution does provide a way around potential cracking problems.
In ray tracing, Worley (Worley 1993a) points out that the polygon's 3D bounding box can be treated like a 2D bounding box by throwing away one coordinate, as done above for polygons. By analysis of the operations involved, it can be shown to be generally more profitable to first intersect the polygon's plane then test whether the point is inside the 2D bounding box rather than first testing the 3D bounding box. Other bounding box variants can be found in Woo (Woo 1992).
One way to think about this algorithm is to consider the test point to be at the origin and to check the edges against this point. If the Y components of a polygon edge differ in sign, then the edge can cross the test ray. In this case, if both X components are positive, the edge and ray must intersect and a crossing is recorded. Else, if the X signs differ, then the X intersection of the edge and the ray is computed and if positive a crossing is recorded.
MacMartin (MacMartin 1992) pointed out that for polygons with a large number of edges there are generally runs of edges which have Y components with the same sign. For example, a polygon representing Brazil might have a thousand edges, but only a few of these will straddle any given latitude line and there are long runs of contiguous edges on one side of the line. So a faster strategy is to loop through just the Y components as fast as possible; when they differ then retrieve and check the X components. Compared to the basic crossings test the MacMartin test was up to 1.8 times faster for polygons up to 100 sides, with performance particularly enhanced for polygons with many edges.
Other optimizations can be done for this test. Preprocessing the edge list into separate X and Y component lists, with the first vertex added to the end, makes for particularly tight loops in the testing algorithm. This is not done in the code provided so as to avoid any preprocessing or additional memory costs.
[Addenda: Joseph Samosky and Mark Haigh-Hutchinson submitted (unpublished) articles to Graphics Gems V. One idea these submissions inspired is a somewhat faster crossings test. By turning the division for testing the X axis crossing into a tricky multiplication test this part of the test became faster, which had the additional effect of making the test for "both to left or both to right" a bit slower for triangles than simply computing the intersection each time. The main increase is in triangle testing speed, which was about 15% faster; all other polygon complexities were pretty much the same as before. On machines where division is very expensive (not the case on the HP 9000 series on which I tested) this test should be much faster overall than the old code. Your mileage may (in fact, will) vary, depending on the machine and the test data, but in general I believe this code is both shorter and faster. Related work by Samosky is in: Samosky, Joseph, "SectionView: A system for interactively specifying and visualizing sections through three-dimensional medical image data", M.S. Thesis, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, 1993. The routine online is called the CrossingsMultiplyTest.]
When the number of sides is small, the barycentric test is comparable to the MacMartin test in speed, with the additional bonus of having the barycentric coordinates computed. As the number of sides approached 100, the MacMartin test becomes 3 to 6 times faster than the barycentric method.
A faster triangle fan tester proposed by Green (Green 1993) is to store a set of half-plane equations for each triangle and test each in turn. If the point is outside any of the three edges, it is outside the triangle. The half-plane test is an old idea, but storing the half-planes instead of deriving them on the fly from the vertices gives this scheme its speed at the cost of some additional storage space. For triangles this scheme is the fastest of all of the algorithms discussed so far, being almost twice as fast as the MacMartin crossings test. It is also very simple to code and so lends itself to assembly language translation.
Both the half-plane and barycentric triangle testers can be sped up further by sorting the order of the edge tests. Worley and Haines (Worley 1993b) note that the half-plane triangle test is more efficient if the longer edges are tested first. Larger edges tend to cut off more exterior area of the polygon's bounding box, and so can result in earlier exit from testing a given triangle. Sorting in this way makes the test up to 1.6 times faster, rising quickly with the number of edges in the polygon. However, polygons with a large number of edges tend to bog down the sorted edge triangle algorithm, with the MacMartin test being from 1.6 to 2.2 times faster for 100 edge polygons.
For the general test a better ordering for each triangle's edges is to sort by the area of the polygon's bounding box outside the edge, since we are trying to maximize the amount of area discarded by each edge test. This ordering provides up to another 10% savings in testing time. Unfortunately, for the convex test below, this ordering actually loses about 10% for regular polygons due to a subtle quirk. As such, this ordering is not presented in the statistics section or the code.
The triangle fan tests can exit as soon as any triangle is found to contain the point. This algorithm can be enhanced by both sorting the edges of each triangle by length and also sorting the testing order of triangles by their areas. Larger triangles are more likely to enclose a point and so end testing earlier. Using both of these sorting strategies makes convex testing 1.2 times faster for squares and 2.5 times faster for regular 100 sided polygons.
Another strategy is to test the point against each exterior edge in turn. If the point is outside any edge, then the point must be outside the entire convex polygon. This algorithm uses less additional storage than the triangle fan and is very simple to code.
The order of edges tested affects the speed of the algorithm; testing edges which cut off the most area of the bounding box earliest on is the best ordering. Finding this optimal ordering is non-trivial, but doing the edges in order is often the worst strategy, since each neighboring edge usually cuts off little more area than the previous. Randomizing the order of the edges makes this algorithm up to 10% faster overall for regular polygons. However, even then the triangle fan algorithm with sorting is up to 1.35 times faster for 100 edge regular polygons.
The exterior edge strategy looks for an early exit due to the point being outside the polygon, while the triangle fan convex test looks for one due to the point being inside. For example, for 100 edge polygons if all points tested are inside the polygon the triangle fan is 1.7 times faster; if all are outside the exterior test is more than 11 times faster (but only 3 times faster if the edges are not randomized). So when the polygon/bounding box area is low the exterior edge strategy might be best.
A method with O(log n) performance is discussed by Preparata and Shamos (Preparata 1985). The polygon is preprocessed by adding a central point to it and is then divided into wedges. The angles from an anchor edge to each wedge's edges are computed and saved, along with half-plane equations for each wedge's polygon edge. When a point is tested, the angle from the anchor edge is computed and a binary search is used to determine the wedge it is in, then the corresponding polygon edge is tested against it (Figure 4). This algorithm is slower for polygons with few edges because the startup cost is high, but the binary search makes for a much faster test when the number of edges is high.
Problems occur in triangle fan algorithms when the code assumes that a point that lies on a triangle edge is inside that triangle. Points on the edges between test triangles will be classified as being inside two triangles, and so will be classified as being outside the polygon. This problem does not happen with the convex test. However, another problem is common to this family of algorithms. If a point is on the edge between two polygons, it will be classified as being inside both.
The code presented for these algorithms does not fully address either of these problems. In reality, a random point tested against a polygon using either algorithm has an infinitesimal chance of landing exactly on any edge. For rendering purposes this problem can be ignored, with the result being one mis-shaded pixel once in a great while.
The crossings test does not have these problems when the 2D polygons are in the same plane. By the nature of the test, all points are consistently categorized as being to one side of any given edge or vertex. This means that when a point is somewhere inside a mesh of polygons the point will always be in only one polygon. Points exactly on the unshared edge of a polygon will be classified as arbitrarily inside or outside the polygon by this method, however. Again, this problem is rarely encountered in rendering and so can usually be ignored.
When a point is to be classified, the proper bin is retrieved and if the point is outside the X bounds, it must be outside the polygon. Else, the list is traversed and the edges tested against the point. Essentially, a modified crossings test is done, with additional speed coming from the sorted order and from the storage of the "fully crosses" condition (Figure 5).
To test a point against this structure is extremely quick in most cases. For a reasonable polygon many of the cells are either inside or outside, so testing consists of a simple look-up. If the cell contains edges, then a line segment is formed from the test point to the cell corner and is tested against all edges in the list (Antonio 1992). Since the state of the corner is known, the state of the test point can be found from the number of intersections (Figure 6).
Care must be taken when a polygon edge exactly (or even nearly exactly) crosses a grid corner, as this corner is then unclassifiable. Rather than coping with the topological and numerical problems involved, one simple solution is to just start generating the grid from scratch again, giving slightly different dimensions to the bounding box. When testing the line segment against the edges in a list, exact intersections of an edge endpoint must be counted only once.
One additional speed up is possible. Each grid cell has four sides. If no edges cross a side, then that side will be fully inside or outside the polygon. A perfectly horizontal or vertical test line segment can then be generated and the faster crossings test can be used against the edges in the cell. The only test case where this made a significant difference was with a 20x20 grid imposed on a 1000 edge polygon, where the grid cell side test was 1.3 times faster.
The performance of all the algorithms is practically linear; as such, the ratios of times for the 1000 edge polygons are representative of performance for polygons with a large number of edges.
General Algorithms, Random Polygons:
number of edges per polygon 3 4 10 100 1000 MacMartin 2.9 3.2 5.9 50.6 485 Crossings 3.1 3.4 6.8 60.0 624 Triangle Fan+edge sort 1.1 1.8 6.5 77.6 787 Triangle Fan 1.2 2.1 7.3 85.4 865 Barycentric 2.1 3.8 13.8 160.7 1665 Angle Summation 56.2 70.4 153.6 1403.8 14693 Grid (100x100) 1.5 1.5 1.6 2.1 9.8 Grid (20x20) 1.7 1.7 1.9 5.7 42.2 Bins (100) 1.8 1.9 2.7 15.1 117 Bins (20) 2.1 2.2 3.7 26.3 278General Algorithms, Regular Polygons:
number of edges per polygon 3 4 10 100 1000 MacMartin 2.7 2.8 4.0 23.7 225 Crossings 2.8 3.1 5.3 42.3 444 Triangle Fan+edge sort 1.3 1.9 5.2 53.1 546 Triangle Fan 1.3 2.2 7.5 86.7 894 Barycentric 2.1 3.9 13.0 143.5 1482 Angle Summation 52.9 68.1 158.8 1489.3 15762 Grid (100x100) 1.5 1.5 1.5 1.5 1.5 Grid (20x20) 1.6 1.6 1.6 1.7 2.5 Bins (100) 2.1 2.2 2.6 4.6 3.8 Bins (20) 2.4 2.5 3.4 9.3 55.0Convex Algorithms, Regular Polygons:
number of edges per polygon 3 4 10 100 1000 Inclusion 4.82 5.01 6.21 7.12 8.3 Sorted Triangle Fan 1.11 1.41 3.75 29.36 289.6 Unsorted Triangle Fan 1.25 2.04 6.30 69.18 734.7 Unsorted Barycentric* 1.79 2.80 6.94 65.62 668.7 Random Exterior Edges 1.11 1.61 3.82 33.44 333.6 Ordered Exterior Edges 1.28 1.70 4.15 41.07 408.9 Convex MacMartin 2.44 2.48 3.18 17.31 159.8* The "unsorted barycentric" code is a slightly optimized version of Badouel's code in Graphics Gems (Badoeul 1990).
Of the algorithms with efficiency structures, the trapezoid algorithm is somewhere in between the edge based algorithms and the gridding algorithm in speed. Gridding gives almost constant time performance for most normal polygons, though like the other crossings test it performs a bit slower when entirely random polygons are tested. Interestingly, even for polygons with just a few edges the gridding algorithm outperforms most of the other tests.
Testing times can be noticeably decreased by using an algorithm optimized for convex testing when possible. For example, the convex sorted triangle fan test is up to 2 times faster than its general case counterpart. For convex polygons with many edges the inclusion test is extremely efficient because of its O(log n) behavior. There is a zone from around 8 to 25 or so edges where the convex MacMartin test is slightly faster than the others, though not significantly so.
In summary, the basic crossings test is generally useful, but we can do better. Testing triangles using the sorted half-plane algorithm was more than twice as fast; on the other end of the spectrum, the MacMartin optimization made testing nearly twice as fast for polygons with many edges.
Code for this article is on the web.
(Badouel 1990) Badouel, Didier, "An Efficient Ray-Polygon Intersection," Graphics Gems (Andrew S. Glassner, ed.), Academic Press, pp. 390-393, 1990.
(Glassner 1989) Glassner, Andrew S., ed., An Introduction to Ray Tracing, Academic Press, pp. 53-59, 1989. See Hypergraph for some related information.
(Green 1993) Green, Chris, "Simple, Fast Triangle Intersection," Ray Tracing News 6(1), 1993.
(MacMartin 1992) MacMartin, Stuart, et al, "Fastest Point in Polygon Test," Ray Tracing News 5(3), 1992.
(Schorn 1994) Schorn, Peter, and Fisher, Frederick, "Testing the Convexity of a Polygon," Graphics Gems IV, (ed. Paul Heckbert), p. 7-15, 1994.
(Woo 1992) Woo, Andrew, "Ray Tracing Polygons using Spatial Subdivision,"
Proceedings of Graphics Interface '92, pp. 184-191, 1992.
(Worley 1993a) Worley, Steve and Haines, Eric, "Bounding Areas for Ray/Polygon
Intersection," Ray Tracing News 6(1), 1993.
(Worley 1993b) Worley, Steve and Haines, Eric, "Triangle Intersection Revisited," Ray Tracing News 6(2), 1993.