DYNINST API/DAIS STANDARDS:
ORGANIZATIONAL MEETING
Wednesday, March 18, 1998
Attendees
I. Process Issues
Issue:
What is the form of participation?
Proposal:
Structure the process like the X consortium. Anyone can attend, but
voting will be limited to organizations that provide resources, either
money (e.g. $30K to Wisc. or Maryland) or 1/4 FTE.
But be assured that participation is encouraged regardless.
Comments:
Doug P:
Building the low-level implementation is not the only way to
participate. It is just as important to have people working at the tools
level. The whole intention is to make it possible to create real tools.
We don't want an impression that we're trying to exclude somebody, but we
do want to avoid someone coming in and "scuttling the ship" without being
part of the crew.
Bart:
The meetings will remain open but you only get to vote if you are providing
resources. IBM has committed resources in the form of several FTEs.
Wisconsin/Maryland have both committed resources. It is important to make
sure that the infrastructure does get done. Otherwise the high level
tools aren't any use.
Jeff B:
Would like to see SGI adopt this effort, but are willing to help get
that process going. They are willing to have 1 FTE working on this as
long as Doug P. (IBM) continues to be interested in this.
Robert H:
Looking for tool infrastructure, would expect to be an active
participant, some uncertainty. (can't officially speak for NASA)
SGI:
They are here to observe with hopes to leverage or provide if
appropriate, but don't understand it well enough yet.
Mary Z:
As long as IBM remains interested in this, LLNL will be active.
Shirley B:
UT is involved with DOD mod and is interested in portable tools.
They can get resources to work on projects with specific deliverables that
benefit the PET/DOD users. Anything that they provide resources for
needs to benefit the application developers. For PET to put resources
into this, we would have to show specific deliverables, tools. They
are not interested in infrastructure as a deliverable. The main focus
is not creating tools, but rather porting, survey, and robustifying
existing tools.
Arndt B:
The market of SC users is so small that there is no need to have
disjoint tools. We need to make them more portable. We have our own
approach - and want to see how these evolve. We are interested in seeing
how dyninst and DAIS are evolving. Could contribute to the effort. We
are a long distance away (Munich), so might not have regular contact.
Brian T:
Argonne's interest is currently passive. We are interested in adaptivity,
so might be able to incorporate this into some agent.
Question:
What is the time table for this effort?
Answer (Bart):
Of course we don't have a good estimate right now. It depends
on the resources available, and the scope. Hopefully in three months we will
have a good idea of where we are going.
Question:
As an end user, is it obvious what you want as a minimal set in the API?
Answer (Doug P):
We will use the high level tools (e.g., Paradyn) as drivers. The tool
requirements will drive API level development. Infrastructure in isolation
is not very interesting.
Question:
What is the frequency of the meetings?
Answer (Jeff H):
We anticipate about every 3-6 months, with teleconferences more frequently.
A good time for the next meeting is after the SPDT conference in Oregon
during the first week in August.
Comment (Mary Z):
Regular teleconferences worked well with the OpenMP effort (LLNL). However,
using just email doesn't work out very well.
Comment (Bart M):
If there are other organizations who are not at this meeting, please
feel free to contact them or let us know and we will contact them.
Question (Jeff B):
Is this effort like the National Compiler Infrastructure?
Answer (Jeff H):
They have a funding source (DARPA) for that purpose. However, if there is
sufficient interest and demand, we might consider submitting a proposal for
funding the development of the reference implementation of the API.
Question (Chris K):
How will tool developers know what others are doing?
Answer (Doug P):
The mail reflector is an appropriate forum for experiences in tool
development. Otherwise there is no specific plan for how to deal with
this.
II. Scope
Issue:
What are we trying to standardize?
Proposal:
A multi-layered model, with well-defined interfaces at the dyninst and
DAIS levels. Dyninst is the stuff that has to be done on a single node,
DAIS is the stuff that glues together multiple nodes.
Dyninst:
* Platform independent process instrumentation on a single node.
* Platform independent process control functions on a single node.
DAIS:
* Platform independent extraction of data from processes.
* Multi-node, multi-tool support/RPC architecture, Security.
* Scalability (by offloading work to the nodes)
New Features and pieces outside dyninstAPI/DAIS:
* Source browser
* Expression parser
* Name demangler
* Clock sync package (distributed clocks)
Comment (Bart M):
When it comes to scalability and performance, the lesson from
Paradyn is: You cannot make things asynchronous enough. Synchronous
(blocking) behavior was never the right answer. It is important to design
in from the start a model that is very asynchronous.
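A minimal sketch of the asynchronous style described in this comment, not taken from Paradyn or DAIS. It assumes a hypothetical `query_node` function standing in for a per-node request, and uses Python's standard `concurrent.futures` module to issue every request before waiting on any of them, so that one slow node does not hold up the others.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def query_node(node_id, delay):
    """Hypothetical per-node request; the sleep stands in for daemon and network latency."""
    time.sleep(delay)
    return node_id, "sample-from-node-%d" % node_id


def gather_async(delays):
    """Issue all requests up front, then collect replies in completion order."""
    results = {}
    with ThreadPoolExecutor(max_workers=max(1, len(delays))) as pool:
        futures = [pool.submit(query_node, n, d) for n, d in delays.items()]
        for fut in as_completed(futures):
            node_id, sample = fut.result()
            results[node_id] = sample
    return results
```

Calling `gather_async({0: 0.02, 1: 0.01, 2: 0.0})` returns one sample per node. A blocking design would instead call `query_node` for each node in turn and wait for each reply before sending the next request.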
Question (Arndt B):
What are the target architectures? Where do heterogeneous things fit?
Answer (Bart M):
If designed properly into the RPC, heterogeneity is almost free.
Answer (Jeff H):
We believe that we will be able to continue to hide the architecture
issues under the dyninst API. Because dyninst has machine independent
abstractions, DAIS could work with multiple different platforms
simultaneously.
Comment (Mary Z):
Heterogeneity is important for projects like the Computational Plant
(A cluster system that is constantly evolving - nodes being added and
removed).
Issue:
What are the uses of the API?
Suggestions were placed on the white board. The list included:
debuggers,
performance steering,
performance tools (code, comm, and I/O),
visualization,
load balancing,
RAS,
test coverage,
future systems design/simulation,
Condor like systems: running on idle workstation systems.
This currently requires linking with a special C library. Instead,
we could use dyninst to "hijack" the job and change the C library
to be the condor version and send it off to the condor queue.
memory tools (perf, array bounds, ptr checks)
checkpointing
relative debugging: comparing the output of two different runs/versions of
a program.
Comment (Doug P):
Let me explain RAS applications. For example, an application tries to
save relevant data when it realizes it is going down, or there is a
problem. Within the system, the RAS code can trigger a client application
to help handle this situation.
Question:
Will the API support debuggers and static analysis tools?
Answer (Doug P):
IBM is looking at putting debuggers on top of this API. We are not
restricting DAIS to performance analysis tools. Issues related to source
code get into the fuzzy area between the client and the infrastructure
level. DAIS is not a debugger, but we want to provide infrastructure that
can support a debugger and application steering, etc.
Question:
Why is application steering so interesting? Isn't this really a
relatively small issue, related to moving large volumes of data out of
the process?
Answer:
Comment (Jeff B):
We should focus on the performance tool issues as driver, this seems to
give us about 70% of the required functionality for all of the proposed
applications.
Comment (Mary Z):
Source browsers are an important component. Should they be part of the
DAIS standard, or are hooks that permit building source browsers
sufficient?
Comment (Bart M):
There is a need for a library of useful functionality for tools. For example,
name demanglers are an important part of the picture. We need to be able
to translate from internal to external names (and back) for different
languages, compilers, and platforms.
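As an illustration of translating between internal and external names, here is a minimal sketch of a translation table keyed by language and naming convention. All function and table names are hypothetical. The only real convention shown is the trailing underscore that several Fortran compilers append to lowercased routine names. A C++ demangler would be one more entry in the same table.

```python
def _fortran_to_external(symbol):
    # Strip the single trailing underscore appended by the compiler.
    return symbol[:-1] if symbol.endswith("_") else symbol


def _fortran_to_internal(name):
    # Fortran is case-insensitive; this convention lowercases and appends "_".
    return name.lower() + "_"


def _identity(name):
    return name


# Hypothetical registry: (language, convention) -> (to_external, to_internal).
TRANSLATORS = {
    ("fortran", "trailing-underscore"): (_fortran_to_external, _fortran_to_internal),
    ("c", "plain"): (_identity, _identity),
}


def to_external(symbol, language, convention):
    """Map a linker-level (internal) symbol to the name the user wrote."""
    return TRANSLATORS[(language, convention)][0](symbol)


def to_internal(name, language, convention):
    """Map a source-level (external) name to the linker-level symbol."""
    return TRANSLATORS[(language, convention)][1](name)
```

For example, `to_internal("DAXPY", "fortran", "trailing-underscore")` gives `daxpy_`, and `to_external("daxpy_", "fortran", "trailing-underscore")` gives `daxpy`.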
Comment (Doug P):
Expression parsers are another useful common feature.
Question (Jeff B):
Can we use kernInst to help with checkpoint/restore?
Answer (Bart M):
KernInst is not ready for prime time yet.
A poll was taken of what applications of the API the group felt were important.
Everyone got up to three votes. The results were: performance tools (14),
debuggers (8), memory tools (3), visualization (4), relative debugging (2),
load balancing (1), future systems (1), RAS (1).
Comment:
This is a biased group. Many (maybe most) people here are performance tool
builders or debugger writers. Probably some of the others are subsets of
these.
III. Status of dyninstAPI
A copy of the current draft dyninstAPI document was distributed.
Comments (Jeff H):
There are some features that are in the document and missing from the
current reference implementation. The two most significant are block
and loop level instrumentation and thread support.
Question:
How soon will the instruction level instrumentation described by Ari on
Tuesday be available?
Comment (Bart M):
The prototype for fine-grained instrumentation is very early, and it will
take quite a bit of work to get it ready for distribution.
Question:
How do you start up an application and take control of it?
Answer (Jeff H):
The API provides attach and process create methods.
Question:
How do you access a variable that is in memory on another node (perhaps in
software DSM or that is part of an HPF distributed array)?
Answer (Doug P):
We don't plan to handle these language specific issues in the API, instead
we will provide enough to read/write the local memory on a node and there
will need to be mapping and access functions.
A list of features not in the current API document, but that would be useful
was put on the white board. It contained:
- support for distributed environments
- register state - what registers: perf counters - timing,
pc, sp frame pointer, etc.
- stack trace
- some notions of breakpoint - and step and single step
- symbol table information/source mapping information (anything you can
get out of the symbol table without parsing the source)
- compiler language and vendor (string representation of what
compiler, etc.)
- signal catching
- floating point expressions
- 32/64 bit - both ints and floats
- basic structures / arrays
- extract machine specific info (effective addr)
- address as a base type for snippet expressions
- bulk data transfer, perhaps with a filter function to return all
values that meet a simple test (e.g. not zero, < 0.0001, etc.).
- load code (e.g. dynamic linked library) -- will be implemented soon
- dump what you think the state of the world is now (tools for debugging
tool building)
- simple string to AST Expr tree conversion routine.
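The bulk data transfer item in the list above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes the filter runs on the node holding the data so that only matching values need to be shipped to the tool.

```python
def bulk_read_filtered(values, predicate):
    """Return (index, value) pairs for the elements that pass the predicate."""
    return [(i, v) for i, v in enumerate(values) if predicate(v)]


def nonzero(v):
    # The "not zero" test from the whiteboard list.
    return v != 0


def below(threshold):
    # Build a predicate for "value < threshold".
    return lambda v: v < threshold
```

With `data = [0.0, 2.5, 0.0, -1.0]`, `bulk_read_filtered(data, nonzero)` returns `[(1, 2.5), (3, -1.0)]`. Returning the index along with each value lets the tool place the surviving elements back into the original array.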
Question:
Does dyninst or DAIS need to be just thread aware or specific thread-package
aware?
Answer (Jeff H):
To allow snippets to only be active for a subset of the threads in
an address space, thread-package specific instrumentation is required in
the thread context switch code.
Question:
How are signals handled within this interface?
Answer (Jeff H):
A mutator process can select whether a specific signal will stop the
application process and inform the mutator. If a mutator wishes to change
the signal handling behavior within the application, it can use the oneShot
interface to cause a new signal handler to be installed.
Question:
How can conditional break points that have arbitrary code be inserted
and used?
Answer (Jeff H):
For simple expressions, an "inline" snippet can be generated. For more
complex code, it might be possible to invoke the native compiler and have
it produce a predicate function, which the dynamic linker would load into
the program and which an installed snippet would then call.
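To illustrate the simple-expression case, here is a minimal sketch that turns a condition string into an expression tree and evaluates it against variable values. It is not the dyninstAPI mechanism: it assumes Python's standard `ast` module as the parser, and it interprets the tree in the tool, whereas a real mutator would generate code inside the application. The names `parse_condition` and `evaluate` are hypothetical.

```python
import ast
import operator

_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_COMPARE = {ast.Lt: operator.lt, ast.LtE: operator.le, ast.Gt: operator.gt,
            ast.GtE: operator.ge, ast.Eq: operator.eq, ast.NotEq: operator.ne}


def parse_condition(text):
    """Turn a simple expression string into an expression tree."""
    return ast.parse(text, mode="eval").body


def evaluate(node, variables):
    """Walk the tree; anything outside the small supported subset is rejected."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name):
        return variables[node.id]
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -evaluate(node.operand, variables)
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](evaluate(node.left, variables),
                                      evaluate(node.right, variables))
    if (isinstance(node, ast.Compare) and len(node.ops) == 1
            and type(node.ops[0]) in _COMPARE):
        return _COMPARE[type(node.ops[0])](evaluate(node.left, variables),
                                           evaluate(node.comparators[0], variables))
    raise ValueError("unsupported expression element: " + type(node).__name__)
```

A conditional breakpoint on `count * 2 > limit + 1` would parse the string once and evaluate the tree each time the breakpoint location is reached. Function calls and other constructs outside the supported subset raise `ValueError`, which corresponds to the "more complex code" case that needs a compiler.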
Comment:
That approach assumes that a compiler will be available on the nodes.
In many systems, the compiler is only installed on the front-end node.
Question:
How will instrumentation of individual instructions be handled?
Comment:
ATOM has good support for instrumenting individual instructions.
It also has a nice abstraction for instrumenting instructions, and
computing the effective address of a load or store instruction.
Question:
How do rewriting and dynamic instrumentation fit together?
Answer:
Question:
What is the status of the source code for the reference implementation?
Answer (dyninst):
Currently we make the source code freely available for non-profit uses, which
include internal use by companies. Redistribution is the only thing that
has a substantial restriction. Also, we have avoided using GNU General Public
License (GPL) code so far, although there are hooks in the code that can plug
into some GNU functionality such as the name demangler.
Answer (DAIS):
We intend to make the code available to partners. There are some parts
that use IBM proprietary code, but that code is used for AIX specific
functionality.
Comment (Jeff H):
Many of the features on the list of possible additions to the dyninstAPI
will require a substantial amount of work. Perhaps we are better off
starting by defining the interfaces.
IV. Status of DAIS
Doug is still working on the first public draft of the DAIS document (should
be ready in about 2-3 weeks).
A list of possibly useful features was placed on the white board. A * means
that Doug felt the issue was already addressed in the current DAIS effort.
- security *
- process/thread sub-grouping (and named groups)
maybe hooks for MPI communicators to register?
- scaling to 1000's of nodes
- help with sync clocks (external libraries)
- Can the RPC mechanism be abstracted so that different ones can be used?
- language consistent between DAIS and dyninstAPI
- App language expression {language and mechanism -
compiled, interp, run-time compile}
- moving data from app (DAIS vs dyninstAPI)
- communications between daemons (OMIS does this)
- communications between clients / peers ... apps & DAIS
servers & DAIS clients
- multiple simultaneous clients tools
- interface for serial tools (yes -- degenerate case)
- A dyninst-only tool coexisting with a DAIS-based tool
* This may not be possible, can't attach to the same application
at the same time.
- question about whether a dyninst tool would coexist with a
DAIS-dyninst tool (DP - no).
- NT interface (dyninstAPI has one, DAIS doesn't)
- dump what you think the state of the world is now (tools
for debugging tool building)
- language for API (interface & implementation) {how many
languages are involved here}
- ... discussion of implication of exceptions ...
- work in batch / queued mode
- connecting to a job with or without stopping { DAIS has
both an attach and connect}
- dynamic process / thread spawning ... (might just provide
a registration hook)
- eventually 3rd party data transfers ... e.g. ship a block
of data to a third process ...
Question:
What language is DAIS written in?
Answer (Doug P):
It uses C++ with no templates, but does use data polymorphism and exceptions.
Comment (Jeff H):
The dyninstAPI uses a constructor only for the top level object, then
uses member functions to build up other objects, thus avoiding having to
deal with exceptions.
Question:
Is it possible to abstract out the authentication so that a different
module can be plugged in to provide different authentication? One suggestion
might be to use the GSS API for security.
Answer (Doug P):
I am not familiar with the GSS API, but we might be able to create some layer
that lets users select a security interface.
Comment (Bart M):
Doug, if you could write a thin layer to adapt DCE to conform to GSS
(and I don't know what it looks like exactly, so I don't know how hard
this would be) could you then support the GSS API and still supply the DCE
security you had planned?
Answer (Doug P):
I don't know, I will have to look at GSS, but it might be possible.
Question:
How does it work when multiple tools try to use DAIS at once for the
same application? This is an important feature if DAIS is used for
load leveling, RAS, or condor, and then someone wants to do visualization
or debugging.
Answer (Doug P):
There are two types of modes possible:
attach: exclusive access to process, can change control flow.
connect: access to process, can insert probes (attach without stop, or
"asynchronous attach")
Question (Jeff H):
What issues does dynamic process spawning raise for DAIS?
Answer (Doug P):
DAIS doesn't deal with this currently.
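Returning to the attach and connect modes described above, the following is a minimal sketch of the bookkeeping they imply when several tools share one application. It is not DAIS code, and every name in it is hypothetical. It also rests on an assumption the minutes do not settle: "exclusive" is taken to mean that a second attach is refused, while connected tools may continue to insert probes alongside the attached tool.

```python
class ProcessAccess:
    """Hypothetical bookkeeping for tools sharing one application process."""

    def __init__(self):
        self.attached = None      # at most one tool holds exclusive control
        self.connected = set()    # any number of tools may insert probes

    def attach(self, tool):
        # Assumption: a second attach is refused rather than queued.
        if self.attached is not None and self.attached != tool:
            return False
        self.attached = tool
        return True

    def connect(self, tool):
        # Connect never stops the process, so it is always granted here.
        self.connected.add(tool)
        return True

    def detach(self, tool):
        if self.attached == tool:
            self.attached = None
        self.connected.discard(tool)

    def may_change_control_flow(self, tool):
        return self.attached == tool

    def may_insert_probes(self, tool):
        return tool == self.attached or tool in self.connected
```

Under these assumptions a debugger could attach while a visualizer connects; a second tool asking to attach would be refused until the debugger detaches.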
V. Summary and Action
Main Goal by teleconference:
Make the requirements more concrete and document them.
Action Items:
Suggested by Doug P:
We need to identify tool developers' must-have and like-to-have API
features. We want to try to avoid a laundry list!
Robert Hood volunteered to look through the features used by p2d2 to identify
items missing from the dyninstAPI.
Jeff H will document the interfaces for some of the "easy" extensions and
add them to the API document. This will include simple expression
string to AST translation, breakpoints, and dynamic loading of code.
Doug will give us a new DAIS document in 2-4 weeks.
After Doug's document has circulated we will have a teleconference.
We will try to have the next meeting after SPDT'98 in Oregon on Aug. 3.
Send Doug email if you want to join in on the DAIS end, so he
can get legal things filled out.
A special thanks to Mary Zosel and Aaron Sawdey for taking notes during the
meeting. Credit for capturing what happened goes to them, blame for
inaccuracies should be directed to me - Jeff