Fall 2007 Final Project and Resources
|
Please feel free to look around and find an application group that may be of interest for your final project. If you would like to use your own idea for your final project, please submit a Final Project Application Information sheet (doc) to one of the TAs. The objective of ECE498AL final projects is to provide students with
first-hand experience in accelerating real-world applications. These projects
should allow students to apply their knowledge in writing high-performance
massively parallel applications to applications whose acceleration can
enable significant advancement of a research area. The areas of interest
include science, engineering, medicine, and financing. In order to save
your time in search for a project, we have created a recommended list
of projects that are of interest to world-class researchers. These researchers
will be interested in help mentoring the student teams pursuing their
projects. |
|
| Leaders | Themes |
| Jim Phillips, John Stone |
1. Molecular Dynamics (UIUC Beckman Institute) Many aspects of molecular dynamics simulation can be drastically accelerated by massively parallel processors. The Theoretical and Computational Biophysics Lab at the Beckman Institute is the home of NAMD/VMD, a suite of computational molecular dynamics tools that are used by a very large user community worldwide. The projects include Molecular Surface Computation (pdf), Lattice Cutoff Computation for Multilevel Summation (pdf), Grid Generation and Matching for Small Molecule Docking (pdf), Monte Carlo Sampling of Protein Conformations (pdf). |
|
New! GPU Acceleration of spatial decomposition and pairlist maintenance for Molecular Dynamics (pdf) Molecular dynamics (MD) is a computational technique used to study biopolymers
and |
|
| 2. Magnetic Resonance Imaging (UIUC)
Fast MRI image reconstruction can enable production of a movie of beating heart and allows early detection of cardio-vascular disease. The project here uses a model-based adaptive approach to construct detailed real-time 3D movies that can be used in clinic settings. PARADISE - MRI (pdf) |
|
| Ed
Seidel Fluid Dynamics |
3. Computational Fluid Dynamics and
Ocean simulation (Louisiana State University)
Finite-Element coastal ocean model (pdf). |
| Robert J. Brunner Volodymyr Kindratenko Dave Semeraro NCSA |
4. Computational Astrophysics (UIUC) |
|
Spherical 3D Discrete Elements Method (DEM) (doc)
|
|
| Dennis J Lin Thomas Huang |
5. Computer Vision for HCI (UIUC)
Gestures are an integral part of human-human communication, and automated gesture recognition systems will go a long way to improving human-computer interactions. Vision-based gesture interfaces are useful in virtual-reality and clean-room environments, where carrying an input device can be cumbersome or dangerous. Also, advances in gesture recognition may translate into better whole-body tracking and eventually lead to automated sign-language recognition. However, the human hand is a versatile instrument, and it can take on a wide variety of shapes. Unlike faces, which change shape by deforming smoothly, the joints on the fingers produce articulated motion that is difficult to characterize. Also, the fingers are relatively textureless and prone to self-occlusion, which means that a single view is often too ambiguous to identify the hand pose. Finally, driven by relatively large muscle in the forearm, fingers are capable of incredibly fast motion, posing an additional challenge for interactive applications. Computer Vision for HCI (pdf) |
| Michael Garland | 6. Sparse Matrix Computation (NVIDIA)
The ability to perform efficient computations on linear systems and matrices
is crucial in a host of applications. For linear systems where all (or
most) entries of the matrix are non-zero, we would use efficient dense
matrix routines like those provided by BLAS and LAPACK. However, many
of the linear systems that arise in practice have relatively few non-zero
entries. For instance, a system of equations defined over a triangle mesh
might form a matrix with 1 million rows and columns, but for which on
average only 7 entries per row are non-zero. Such sparse matrices require
rather different routines to achieve efficiency. The goal of this project
is to implement such routines. Sparse Matrix Computation (doc) |
| Stuart Levy, UIUC/NCSA | 7. Computer Animation for Scientific Visualization
(UIUC/NCSA)
Current CPUs are a bit too slow to decompress high quality, very high resolution animations (for example, 1920x1080 stereoscopic, or 4Kx2K monoscopic) at full frame rates (for example, 30 Hz). But this should be well within the capability of GPUs, which can thus replace what would otherwise be quite expensive hardware. The core of the application would be a GPU-accelerated JPEG decompressor, whose output would go to GPU textures or to the framebuffer. Most of the arithmetic needed for JPEG (Discrete Cosine Transform and YCbCr â' RGB conversion, on nice compact 8x8-pixel blocks) should be well-suited for GPU acceleration. It's less clear whether the Huffman decoding phase can be accelerated too, but may be worth a try. JPEG offers some features not needed here: progressive, hierarchical, and lossless extensions. We might also specialize for a particular style of chroma subsampling, e.g. 4:2:0. |
| Wen-mei Hwu, Shane Ryoo |
8. More Projects |
| Russell Hewett
(preferred contact) Mark Butala (preferred contact) Farzad Kamalabadi |
9. State Estimation with the Localized Ensemble Kalman Filter (doc) The goal of state estimation is to make a best estimate of a time evolving object given a statistical model for the state evolution and a set of statistically related measurements. State estimation of large-scale dynamic objects is a computationally enormous problem. Even relatively small state estimation problems require lengthy execution times and large amounts of computational storage. The Localized Ensemble Kalman Filter is method that scales much better in memory usage compared to most state estimation techniques, but significant computational speedup is still needed to make large-scale reconstructions possible. An example of a large scale dynamic state estimation problem is time
dependent tomography where the goal is to reconstruct a time evolving
object from noisy line integral measurements. Fast algorithms such as
filtered back projection exist for large-scale tomography problems, but
such methods assume that the object is unchanging during the measurement
interval. Dynamic tomography problems such as the time dependent reconstruction
of the electron density in the solar atmosphere or functional MRI scanning
of the human brain are of such large-scale that state estimation, even
with the LEnKF, is computationally intractable. |
| Lu Wan | 10. Design Space Exploration for
Implicitly Parallel Programming Models on Massive Processor Array (doc) Implicitly Parallel Programming Models have recently been proposed. To reduce the cost of parallel programming, new compiler modules need be constructed to deal with the complexity automatically. Code-Gen is one important phase in that the hardware constraints and software runtime requirements need be considered in a unified framework. The quality of the generated code largely depends on efficient space exploration during Code-Gen phase. Besides, hardware setting such as granularity of the parallel execution hardware, sram / register size, execution speed, etc, should be modeled accordingly to facilitate space exploration. Similarities exist between Code-Gen problem and circuit-level synthesis. This project will try to exploit circuit-level CAD techniques to deal with problems in the emerging Implicitly Parallel Programming domain. |
| Michael Connor |
11. Machine Learning: SVMs on GPU (doc) A popular and very powerful machine learning algorithm is Support Vector Machine (SVM). While this technique is both theoretically interesting and empirically quite accurate, it is also quite slow. A GPU implementation should be able to speed up training that once took many hours or days (and tuning runs that could take weeks) into something manageable, while retaining global optimization lost by other approximations. |
| Xiaohuang
Huang Changrui Yuan Shanxiang Qi |
12. HDR Image Processing Library
(doc) Our aim is going to implement the HDR image processing library. Because the floating-point calculation is not as fast as integer calculation. Once the HDR image processing in use, the GPU accelerate is required. |
| Resources: | |
| NVIDIA Official CUDA Page | http://developer.nvidia.com/object/cuda.html |
| CUDA Occupancy Calculator (xls) - A programmer's tool that allows you to compute the multiprocessor occupancy of a GPU by a given CUDA kernel. The multiprocessor occupancy is the ratio of active warps to the maximum number of warps supported on a multiprocessor of the GPU, and is helpful in determining how efficient the kernel will be on the GPU. | |
| Archived final project listings from previous semester(s): | ||
| Spring 2007 - First-time course offering by Prof. Hwu (UIUC) and Prof. Kirk (NVIDIA)! | ||