GPU computational geometry algorithms

Graphics Processing Unit (GPU) is a kind of computer processor originally designed to render 3D graphics for computer games.  It turns out that the kinds of computations that GPUs do can be applied to many other general purpose computing problems, and in some cases GPU algorithms can be vastly faster than traditional CPU algorithms. Computational geometry is a branch of theoretical computer science concerned with algorithms and data structures for geometric objects.

The leading manufacturer of GPUs is NVidia.  OpenCL and CUDA are two popular programming environments for writing GPU code. I have experience with OpenCL but am open to using other languages.

In addition to these low-level languages, there are emerging high-level tools such as PyOpenCLParFunk, and Thrust.

Project: Implement a novel data computational geometry data structure and/or algorithm on the GPU. Measure its performance and confirm that it is faster than comparable CPU implementations.

Some ideas for worthwhile data structures or algorithms:

  1. A doubly connected edge list (DCEL) or comparable structure. This is an important one; many algorithms depend on using a DCEL structure.
  2. Low-dimension linear programming, convex programming, or LP-type programming.
  3. k-dimension convex hulls. 2D and 3D convex hulls have already been done.
  4. Voronoi diagrams.
  5. Binary space partitions.
  6. Range trees.
  7. Quad trees.
  8. Planar graph separators.
  9. Well separated pairs decomposition.
  10. Anything else implemented in CGAL.
  11. Problems on my list of MapReduce topics might apply too.

MapReduce algorithms

MapReduce is a software framework for distributed computing on huge data sets.  It was created by Google and is used internally by Amazon, eBay, Facebook, Google, IBM, Nokia, Yahoo!, and many other organizations.

DiscoHolumbus, and Apache Hadoop are open source implementations of MapReduce, for Python, Haskell, and Java, respectively.  Other implementations are available as well.


  1. Pick a computational problem that can be parallelized.
  2. Design a MapReduce-style algorithm to solve the problem. You will probably be able to reuse ideas from the vast literature on parallel algorithms.
  3. Implement the algorithm and analyze its performance.

Here are some ideas for important problems that might be worth investigating. If you’re interested in a specific problem, you’ll need to do a brief literature review to confirm that the problem is neither impossible, nor trivially easy, to solve with MapReduce.

  1. Matrix multiplication using Strassen’s algorithm
  2. Binary space partition
  3. Minimum spanning trees, possibly with a variant of Boruvka’s algorithm
  4. 2D arrangements
  5. 2D convex hulls or 3D convex hulls
  6. Voronoi diagrams
  7. Factoring using the Quadratic Sieve or General Number Field Sieve
  8. Graph separators
  9. Linear programming

Generating guitar chords algorithmically

A musician playing a guitar, or a related fretted instrument, forms musical chords by arranging up to four fingers on the instrument’s fretboard.  The positions of the fingers dictate the musical pitch that is generated by each string.  Strings can also be muted so that they do not produce any sound.  For every common musical chord, there are several known frettings, or ways of playing that chord, and these frettings are available in databases such as  However, it would be interesting and useful to generate these frettings algorithmically.

This is a difficult computational problem.  It amounts to choosing a subset of finger positions, and the number of subsets grows exponentially in the number of finger positions, of which there are hundreds.  Fender Stratocasters typically have 6 strings, each of which may be left open, muted, or sounded in 21 different frets, for a total of 6*23 = 138 finger positions and 2^138 sets of finger positions.

However, there are also many constraints on which fingerings are valid.  A fingering must only generate musical pitches that are actually in the desired chord, and in the Western musical temperament there are only 12 pitches to worry about.  Also, many conceivable fingerings are impossible to play due to anatomical limitations of the human hand.

The problem of generating guitar chords algorithmically can be posed as a branch-and-bound problem, where each step involves adding one finger to a candidate fingering.  This formulation limits the depth of the search to the number of fingers, which is typically four.  The branching factor is the number of legal places to position each next finger, which, naively, is large.  However musical and anatomical constraints make the effective branching factor much smaller.

Alternatively, using dynamic programming, integer linear programming, a constraint solver, or logic programming, may be feasible. I haven’t thought about these approaches as much.

Monte Carlo Artificial Intelligence for Go

Go is a classical strategic board game.  It has very simple rules, with only two kinds of pieces, yet requires a depth of strategy similar to Chess.  The computational problem of making optimal Go movies is computationally difficult.  However, there is a simple randomized algorithm using Monte Carlo methods that seems to work well in practice:

  1. Enumerate all the liberties (legal moves)
  2. For each liberty, play out a random game where players take turns making completely random, but legal moves.  Try k random trials, and keep track of the number of times w that the computer player wins.
  3. Choose the liberty with the highest win ratio w/k

This is a nice algorithm because it is simple, the player strength and running time can be tuned by adjusting the k parameter, it is embarrasingly parallel, and it seems to work well against human opponents.  The basic algorithm has already been designed, implemented, and documented, but there is room for more work in this area.

Project: Implement a Monte-Carlo Go player. The project can be extended by parallelizing the algorithm, or by developing the player on a resource-constrained mobile device.

Compress Wikipedia

Wikipedia is the online encyclopedia that everyone can edit.  It would be nice to be able to carry around an offline copy of Wikipedia for many reasons, such as reading on mobile devices when network service is unavailable, or as an archive in case of a prolonged Internet outage.  Wikipedia stores a lot of data, so compressing it would be helpful.  There is even a contest for solving this problem well.

General purpose text compression algorithms such as LZW (GZip) and BWT (bzip2) do a decent job, but it is possible to do even better.  Standard text compression algorithms limit themselves to linear-time algorithms for obvious reasons.  However, we might be able to do better with a relatively high-order polynomial-time algorithm. One approach would be to use a polynomial-time approximation scheme for a NP-complete problem that corresponds to text compression.


  • Reduce hypertext compression to a known approximable NP-complete problem.
  • Implement the approximation algorithm.
  • Compare the compression performance to competitors.

Prize Collecting Chinese Postman Problem

The Chinese postman problem (CPP) is an NP-complete problem related to the traveling salesman problem (TSP).  The desired output of TSP is the shortest cycle that visits every vertex exactly once.  By contrast, the desired output of CPP is the shortest cycle that visits every edge at least once.  The TSP corresponds to the problem of a visiting every city (vertex) only once; the CPP corresponds to visiting every city road (edge) at least once.  By extension the CPP corresponds to the problem of traversing and photographing city roads to collect data for a street view service such as Google Street View or Microsoft Bing Maps.

In the prize collecting variant, every edge is assigned a numerical reward, and the object is the collect at least X points of reward.  This corresponds to the (realistic) scenario in which some streets are more important than others, and the algorithm can choose to skip unimportant streets.  The prize collecting CPP is NP-complete, but perhaps it could be approximated efficiently.

Project: design a polynomial time approximation scheme (PTAS) for the prize collecting Chinese postman problem.