The Timeline view shows CPU and GPU activity that occurred while
your application was being profiled. Multiple timelines can be opened in the Visual
Profiler at the same time. Each opened timeline is represented by a different instance
of the view. The name of the session file containing the timeline and related data is
shown in the tab. The following figure shows a Timeline view with
two open sessions, one for session eigenvalues.vp and the other
for session dct8x8.vp.
Along the top of the Timeline view is a horizontal ruler that
shows elapsed time from the start of application profiling. Along the left of the view
is a vertical ruler that describes what is being shown for each horizontal row of the
timeline, and that contains various controls for the timeline. These controls are
described in Timeline Controls.
The types of timeline rows that are displayed in the Timeline view
are:
- Process
- A timeline will contain a Process row for each
application profiled. The process identifier represents the pid of the process.
The timeline row for a process does not contain any intervals of activity.
Threads within the process are shown as children of the process.
- Thread
- A timeline will contain a Thread row for each thread in
the profiled application that performed either a CUDA driver or runtime API
call. The thread identifier is a unique id for that thread. The timeline row for
a thread does not contain any intervals of activity.
- Runtime API
- A timeline will contain a Runtime API row for each thread
that performs a CUDA Runtime API call. Each interval in the row represents the
duration of the call on the CPU.
- Driver API
- A timeline will contain a Driver API row for each thread
that performs a CUDA Driver API call. Each interval in the row represents the
duration of the call on the CPU.
- Device
- A timeline will contain a Device row for each GPU device
utilized by the application being profiled. The name of the timeline row
indicates the device ID in square brackets followed by the name of the device.
The timeline row for a device does not contain any intervals of
activity.
- Context
- A timeline will contain a Context row for each CUDA or
OpenCL context on a GPU device. The name of the timeline row indicates the
context ID and the compute API bound to that context (either CUDA or OpenCL).
The timeline row for a context does not contain any intervals of
activity.
- Memcpy
- A timeline will contain one or more Memcpy rows for each context that performs
memory copies. A context may contain up to three Memcpy rows, one each for
device-to-host, host-to-device, and device-to-device memory copies. Each interval
in a row represents the duration of a memory copy executing on the GPU.
- Compute
- A timeline will contain a Compute row for each context
that performs computation on the GPU. Each interval in a row represents the
duration of a kernel on the GPU device. The Compute row
shows all of the compute activity for the context on a GPU device. The
Kernel rows contained within it show the activity of each individual
application kernel.
- Kernel
- A timeline will contain a Kernel row for each type of
kernel executed by the application. Each interval in a row represents the
duration of execution of an instance of that kernel on the GPU device.
Each row is labeled first with a percentage that gives the total execution time
of all instances of that kernel relative to the total execution time of all
kernels, then with the number of times the kernel was invoked (in square
brackets), and finally with the kernel name.
For each context, the kernels are ordered top to bottom by execution time.
- Stream
- A timeline will contain a Stream row for each stream used
by the application (including both the default stream and any application
created streams). Each interval in a Stream row
represents the duration of a memcpy or kernel execution performed on that
stream.
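The labeling of Kernel rows described above can be illustrated with a short sketch. This is not the Visual Profiler's implementation. It assumes that kernel intervals for one context are available as (name, start, end) tuples, that "ordered by execution time" means longest total time first, and that a label looks like "60.0% [2] kernel_a". The function name kernel_row_labels and the kernel names are hypothetical.

```python
from collections import defaultdict


def kernel_row_labels(intervals):
    """Build illustrative Kernel row labels for one context.

    intervals: iterable of (kernel_name, start, end) tuples, one per
    kernel instance (hypothetical input format, not a profiler API).
    """
    total_by_kernel = defaultdict(float)  # summed duration per kernel
    count_by_kernel = defaultdict(int)    # number of invocations per kernel

    for name, start, end in intervals:
        total_by_kernel[name] += end - start
        count_by_kernel[name] += 1

    grand_total = sum(total_by_kernel.values())

    # Assumption: rows are ordered by total execution time, longest first.
    ordered = sorted(total_by_kernel, key=total_by_kernel.get, reverse=True)

    labels = []
    for name in ordered:
        # Percentage of this kernel's time relative to all kernel time.
        pct = 100.0 * total_by_kernel[name] / grand_total if grand_total else 0.0
        labels.append(f"{pct:.1f}% [{count_by_kernel[name]}] {name}")
    return labels


# Hypothetical intervals in arbitrary time units.
sample = [
    ("kernel_a", 0, 30),
    ("kernel_b", 40, 50),
    ("kernel_a", 60, 90),
    ("kernel_c", 100, 130),
]
for label in kernel_row_labels(sample):
    print(label)
```

With the sample data, kernel_a accounts for 60 of 100 time units across two invocations, so its row would be listed first.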