Publications

HPCBugBase: An Experience Base for HPC Defects (Poster)

Taiga Nakamura, "HPCBugBase: An Experience Base for HPC Defects", poster presented at the International Conference for High Performance Computing, Networking, Storage and Analysis (SC06), November 11-17, Tampa, Florida, 2006.
Awarded third place in the ACM Student Research Competition.

Abstract: We present the design and implementation of HPCBugBase, an experience base for high performance computing (HPC) software defects. Our goal is to accumulate empirical knowledge about commonly occurring defects in HPC codes using an incremental approach. This knowledge is structured so that HPC practitioners such as programmers and tool builders can use it to reduce debugging costs, as well as provide feedback which becomes incorporated into the system. By building the experience base, we expect to help the process of making explicit the knowledge about recurring defects that otherwise cannot be shared. The current system is built on a Wiki system, which allows incremental accumulation of data at various levels of abstraction. We implement additional analysis functions that do not exist in a generic Wiki system as custom plug-ins. We have populated the system with data collected from software engineering studies from the DARPA High Productivity Computer Systems Project.

Experiments to Understand HPC Time to Development

Hochstein, L., Nakamura, T., Basili, V. R., Asgari, S., Zelkowitz, M. V., Hollingsworth, J. K., Shull, F., Carver, J., Voelp, M., Zazworka, N., Johnson, P. "Experiments to Understand HPC Time to Development," CTWatch Quarterly, Volume 2, Number 4A, November 2006. [PDF]

http://www.ctwatch.org/quarterly/articles/2006/11/experiments-to-understand-hpc-time-to-development/

Observations about Software Development for High End Computing

Carver, J. C., Hochstein, L. M., Kendall, R. P., Nakamura, T., Zelkowitz, M. V., Basili, V. R., Post, D. E. "Observations about Software Development for High End Computing," CTWatch Quarterly, Volume 2, Number 4A, November 2006. [PDF]

http://www.ctwatch.org/quarterly/articles/2006/11/observations-about-software-development-for-high-end-computing/

What's Working in HPC: Investigating HPC User Behavior and Productivity

Wolter, N., McCracken, M. O., Snavely, A., Hochstein, L., Nakamura, T., Basili, V. "What's Working in HPC: Investigating HPC User Behavior and Productivity," CTWatch Quarterly, Volume 2, Number 4A, November 2006. [PDF]

http://www.ctwatch.org/quarterly/articles/2006/11/whats-working-in-hpc-investigating-hpc-user-behavior-and-productivity/

Identifying Domain-specific Defect Classes Using Inspections and Change History

Taiga Nakamura, Lorin Hochstein, Victor R. Basili, "Identifying Domain-Specific Defect Classes Using Inspections and Change History", Proceedings of the 5th ACM-IEEE International Symposium on Empirical Software Engineering (ISESE'06), September 21-22, 2006, Rio de Janeiro, Brazil.
[PDF]

Abstract: We present an iterative, reading-based methodology for analyzing defects in source code when change history is available. Our bottom-up approach can be applied to build knowledge of recurring defects in a specific domain, even if other sources of defect data such as defect reports and change requests are unavailable, incomplete or at the wrong level of abstraction for the purposes of the defect analysis. After defining the methodology, we present the results of an empirical study where our method was applied to analyze defects in parallel programs which use the MPI (Message Passing Interface) library to express parallelism. This library is often used in the domain of high performance computing, where there is much discussion but little empirical data about the frequency and severity of defect types. Preliminary results indicate the methodology is feasible and can provide insights into the nature of real defects. We present the results, derived hypotheses, and lessons learned.

http://doi.acm.org/10.1145/1159733.1159785

Metrics of Software Architecture Changes Based on Structural Distance

Taiga Nakamura and Victor R. Basili, "Metrics of Software Architecture Changes Based on Structural Distance", METRICS '05: Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS'05), 2005.
[PDF]

Abstract: Software architecture is an important form of abstraction, representing the overall system structure and the relationship among components. When software is modified from one version to another, its architecture may change. Software modification involving architectural change is often difficult when the change goes beyond the original architectural design, involving changes to the connectivity of multiple components. Existing research has looked at architectural change at the level of architecture metrics such as size, complexity, coupling and cohesion, which abstract a particular version of the software in isolation. In this paper, we argue that this level of abstraction is often too high to characterize some interesting aspects of the architectural change process, and propose an approach that takes into account the change in connectivity from version to version of individual components. In this approach, two endpoints of a major change are taken as reference points, and intermediate connectivity changes are examined relative to the endpoints. We define a distance measure between software structures using a graph kernel function, which is quite powerful as it is applicable to any software structure representable as a graph. Using this distance measure, we define a metric which models the architecture change as a transition between two endpoints. In addition to theoretical analysis of the approach, we present empirical results obtained by applying the approach to open-source software projects to evaluate its validity and usefulness.


http://portal.acm.org/citation.cfm?id=1090955.1092148

Measuring Productivity on High Performance Computers

Marvin Zelkowitz, Victor Basili, Sima Asgari, Lorin Hochstein, Jeff Hollingsworth, Taiga Nakamura, "Measuring Productivity on High Performance Computers", METRICS '05: Proceedings of the 11th IEEE International Software Metrics Symposium (METRICS'05), 2005.
[PDF]

Abstract: In the high performance computing domain, the speed of execution of a program has typically been the primary performance metric. But productivity is also of concern to high performance computing developers. In this paper we will discuss the problems of defining and measuring productivity for these machines and we develop a model of productivity that includes both a performance component and a component that measures the development time of the program. We ran several experiments using students in high performance courses at several universities, and we report on those results with respect to our model of productivity.

http://portal.acm.org/citation.cfm?id=1092146

Optimistic Fair Contract Signing for Web Services

Hiroshi Maruyama, Taiga Nakamura, Tony Hsieh, "Optimistic Fair Contract Signing for Web Services", XMLSEC '03: Proceedings of the 2003 ACM workshop on XML security, 2003.

Abstract: Reliable and atomic transactions are a key to successful e-Business interactions. Reliable messaging subsystems, such as IBM's MQ Series, or broker-based techniques have been traditionally used for this purpose. In this paper, we take a radically different approach to address this problem, which is to apply the idea of Optimistic Fair Contract Signing recently proposed by Asokan, Shoup, and Waidner. We show a design of the protocol based on the latest XML and Web Services Security standards and discuss the benefits and limitations of this approach.

http://doi.acm.org/10.1145/968559.968572

An Audio Watermarking Method Using a Two-dimensional Pseudo-random Array

Ryuki Tachibana, Shuichi Shimizu, Seiji Kobayashi, Taiga Nakamura, "An Audio Watermarking Method Using a Two-dimensional Pseudo-random Array", Signal Processing, Vol. 82, No. 10, October 2002.

Abstract: In this paper, we describe a multiple-bit audio watermarking method which is robust against wow-and-flutter, random sample cropping, and pitch shifting. Though these processings are easy to perform, they are difficult for audio watermarks to survive, because they introduce mis-synchronization between the embedded and detection watermarks. Our main ideas against these mis-synchronization attacks are a two-dimensional pseudo-random array (PRA), magnitude modification, and non-linear subbands. The embedding algorithm modifies the magnitudes of segmented areas in the time–frequency plane of the content, according to a two-dimensional pseudo-random array, while the detection algorithm correlates the magnitudes with the PRA. The two-dimensional array makes the watermark robust against cropping because, even when some portions of the content are heavily degraded, other portions of the content can match the PRA and contribute to watermark detection. Secondly, the magnitude modification enables detection even from displaced detection windows. This is because magnitudes are less influenced than phases by fluctuations of the analysis windows caused by random cropping. The last idea, wider bandwidths at higher frequencies, keeps the correspondence of the embedded and detection PRA even for pitch-shifted content. We theoretically and experimentally analyze the robustness of the proposed algorithm against a variety of signal degradations.

http://dx.doi.org/10.1016/S0165-1684(02)00284-0

Automatic Music Monitoring and Boundary Detection for Broadcast using Audio Watermarking

Taiga Nakamura, Ryuki Tachibana, and Seiji Kobayashi, "Automatic Music Monitoring and Boundary Detection for Broadcast using Audio Watermarking", Proceedings of Security and Watermarking of Multimedia Contents IV, SPIE vol. 4675, pp. 170-180, San Jose, USA, January 2002.

Abstract: An application of watermarking for automatic music monitoring of radio broadcasts is discussed. By embedding information into the music as a watermark before broadcasting it, it is possible to keep track of what music has been on the air at what time, and for how long. However, to effectively implement this application, the handling of content transitions is important, because the detection reliability deteriorates at the content boundaries. In this paper, a method of detecting content boundaries using overlapping detection windows is described. The most probable pattern of content transition is selected under the condition that detection results from multiple windows are available. The derived rules are represented using a finite state model, which is useful for detection in real time. Experimental results on FM radio broadcasts are also presented.

http://citeseer.ist.psu.edu/684728.html
http://www.trl.ibm.com/projects/RightsManagement/datahiding/paper/Taiga_EI02Paper.pdf

An Audio Watermarking Method Robust against Time- and Frequency-Fluctuation

Ryuki Tachibana, Shuichi Shimizu, Taiga Nakamura, Seiji Kobayashi, "An Audio Watermarking Method Robust against Time- and Frequency-Fluctuation", Proceedings of Security and Watermarking of Multimedia Contents III, SPIE vol. 4314, pp. 104-115, San Jose, USA, January 2001.

Abstract: In this paper, we describe an audio watermarking algorithm that can embed a multiple-bit message which is robust against wow-and-flutter, cropping, noise-addition, pitch-shift, and audio compressions such as MP3. The embedding algorithm calculates and manipulates the magnitudes of segmented areas in the time-frequency plane of the content using short-term DFTs. The detection algorithm correlates the magnitudes with a pseudo-random array that corresponds to two-dimensional areas in the time-frequency plane. The two-dimensional array makes the watermark robust because, even when some portions of the content are heavily degraded, other portions of the content can match the pseudo-random array and contribute to watermark detection. Another key idea is manipulation of magnitudes. Because magnitudes are less influenced than phases by fluctuations of the analysis windows caused by random cropping, the watermark resists degradation. When signal transformation causes pitch fluctuations in the content, the frequencies of the pseudo-random array embedded in the content shift, and that causes a decrease in the volume of the watermark signal that still correctly overlaps with the corresponding pseudo-random array. To keep the overlapping area wide enough for successful watermark detection, the widths of the frequency subbands used for the detection segments should be increased as frequency increases. We theoretically and experimentally analyze the robustness of the proposed algorithm against a variety of signal degradations.

http://www.trl.ibm.com/projects/RightsManagement/datahiding/paper/Ryuki_EI01Paper.pdf