hprof-0.1.3/.gitignore
target
Cargo.lock

hprof-0.1.3/.travis.yml
language: rust
sudo: required
rust:
  - nightly
  - beta
before_script:
  - pip install 'travis-cargo<0.2' --user && export PATH=$HOME/.local/bin:$PATH
script:
  - |
      travis-cargo build &&
      travis-cargo test &&
      travis-cargo bench &&
      travis-cargo doc
after_success:
  - travis-cargo --only beta doc-upload
  - travis-cargo coveralls
env:
  global:
    - secure: cpewKWIxIogX0DDZZajvlQkXR29Mc2JHLMzNQjO1FsVA67d5fsYlhGCIzF4eb32Yz73IpNFi3yKAdDfNYIVoYwWAYDRwJX5+noV3uQy5yhPRo5c5XYEYXXrYQkqaXArShvSw2Aq+Q94jK3rKT0Q4XaL7jwjDzvzY7dTBWlpBvk0=

hprof-0.1.3/Cargo.toml
# THIS FILE IS AUTOMATICALLY GENERATED BY CARGO
#
# When uploading crates to the registry Cargo will automatically
# "normalize" Cargo.toml files for maximal compatibility
# with all versions of Cargo and also rewrite `path` dependencies
# to registry (e.g., crates.io) dependencies.
#
# If you are reading this file be aware that the original Cargo.toml
# will likely look very different (and much more reasonable).
# See Cargo.toml.orig for the original contents.

[package]
name = "hprof"
version = "0.1.3"
authors = ["Corey Richardson "]
description = "A simple hierarchical profiler"
documentation = "https://cmr.github.io/hprof"
readme = "README.md"
license = "BSL-1.0"
repository = "https://github.com/cmr/hprof"

[dependencies.clock_ticks]
version = "0.1.0"

[dependencies.log]
version = "0.3.4"

[features]
unstable = []

hprof-0.1.3/Cargo.toml.orig
[package]
name = "hprof"
version = "0.1.3"
authors = ["Corey Richardson "]
description = "A simple hierarchical profiler"
documentation = "https://cmr.github.io/hprof"
repository = "https://github.com/cmr/hprof"
readme = "README.md"
license = "BSL-1.0"

[dependencies]
clock_ticks = "0.1.0"
log = "0.3.4"

[features]
unstable = []

hprof-0.1.3/LICENSE_1_0.txt
Permission is hereby granted, free of charge, to any person or organization
obtaining a copy of the software and accompanying documentation covered by
this license (the "Software") to use, reproduce, display, distribute,
execute, and transmit the Software, and to prepare derivative works of the
Software, and to permit third-parties to whom the Software is furnished to
do so, all subject to the following:

The copyright notices in the Software and this entire statement, including
the above license grant, this restriction and the following disclaimer,
must be included in all copies of the Software, in whole or in part, and
all derivative works of the Software, unless such copies or derivative
works are solely in the form of machine-executable object code generated by
a source language processor.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT.
IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE
BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
USE OR OTHER DEALINGS IN THE SOFTWARE.

hprof-0.1.3/README.md
# `hprof`, a real-time hierarchical profiler

[![Travis](https://img.shields.io/travis/cmr/hprof.svg?style=flat-square)](https://travis-ci.org/cmr/hprof)
[![Crates.io](https://img.shields.io/crates/v/hprof.svg?style=flat-square)](https://crates.io/crates/hprof)

[Documentation](https://cmr.github.io/hprof)

`hprof` is suitable only for getting rough measurements of "systems", rather
than fine-tuned profiling data. Consider using `perf`, `SystemTap`, `DTrace`,
`VTune`, etc. for more detailed profiling.

# What is hierarchical profiling?

Hierarchical profiling is based on the observation that games are typically
organized into a "tree" of behavior. You have an AI system that does path
planning, making tactical decisions, etc. You have a physics system that does
collision detection, rigid body dynamics, etc. A tree might look like:

- Physics
    - Collision detection
        - Broad phase
        - Narrow phase
    - Fluid simulation
    - Rigid body simulation
        - Collision resolution
        - Update positions
- AI
    - Path planning
    - Combat tactics
    - Build queue maintenance
- Render
    - Frustum culling
    - Draw call sorting
    - Draw call submission
    - GPU wait

A hierarchical profiler will annotate this tree with how much time each step
took. This is an extension of timer-based profiling, where a timer is used to
measure how long a block of code takes to execute. Rather than coding up a
one-time timer, you merely call `Profiler::enter("description of thing")` and
a new entry will be made in the profile tree.

The idea came from a 2002 article in Game Programming Gems 3, "Real-Time
Hierarchical Profiling" by Greg Hjelstrom and Byon Garrabrant from Westwood Studios.
They report having thousands of profile nodes active at a time.

# License

This software is licensed under the [Boost Software
License](http://www.boost.org/users/license.html). In short, you are free to
use, modify, and redistribute the software; copies distributed solely in
machine-executable (binary) form need no attribution.

# Example Output

```
Timing information for main loop:
  setup - 1133523ns (6.725068%)
  physics - 2258292ns (13.3982%)
    collision - 1140731ns (50.512998%)
    update positions - 1108782ns (49.098257%)
  render - 13446767ns (79.778204%)
    cull - 1134725ns (8.438646%)
    gpu submit - 2197346ns (16.341073%)
    gpu wait - 10088879ns (75.028287%)
```

hprof-0.1.3/examples/explicit.rs
extern crate hprof;

fn main() {
    let p = hprof::Profiler::new("main loop");

    loop {
        p.start_frame();

        {
            let _g = p.enter("setup");
            std::thread::sleep_ms(1);
        }
        {
            let _g = p.enter("physics");

            let _g = p.enter("collision");
            std::thread::sleep_ms(1);
            drop(_g);

            let _g = p.enter("update positions");
            std::thread::sleep_ms(1);
            drop(_g);
        }
        {
            let _g = p.enter("render");

            let _g = p.enter("cull");
            std::thread::sleep_ms(1);
            drop(_g);

            let _g = p.enter("gpu submit");
            std::thread::sleep_ms(2);
            drop(_g);

            let _g = p.enter("gpu wait");
            std::thread::sleep_ms(10);
        }

        p.end_frame();

        // this would usually depend on a debug flag, or use custom functionality for drawing the
        // debug information.
        if true {
            p.print_timing();
        }
        break;
    }
}

hprof-0.1.3/examples/implicit.rs
extern crate hprof;

fn main() {
    loop {
        hprof::start_frame();

        {
            let _g = hprof::enter("setup");
            std::thread::sleep_ms(1);
        }
        {
            let _g = hprof::enter("physics");

            let _g = hprof::enter("collision");
            std::thread::sleep_ms(1);
            drop(_g);

            let _g = hprof::enter("update positions");
            std::thread::sleep_ms(1);
            drop(_g);
        }
        {
            let _g = hprof::enter("render");

            let _g = hprof::enter("cull");
            std::thread::sleep_ms(1);
            drop(_g);

            let _g = hprof::enter("gpu submit");
            std::thread::sleep_ms(2);
            drop(_g);

            let _g = hprof::enter("gpu wait");
            std::thread::sleep_ms(10);
            drop(_g);
        }

        hprof::end_frame();

        // this would usually depend on a debug flag, or use custom functionality for drawing the
        // debug information.
        if true {
            hprof::profiler().print_timing();
        }
        break;
    }
}

hprof-0.1.3/examples/noguard.rs
extern crate hprof;

fn main() {
    let p = hprof::Profiler::new("main loop");

    loop {
        p.start_frame();

        {
            p.enter_noguard("setup");
            std::thread::sleep_ms(1);
            p.leave();
        }
        {
            p.enter_noguard("physics");

            p.enter_noguard("collision");
            std::thread::sleep_ms(1);
            p.leave();

            p.enter_noguard("update positions");
            std::thread::sleep_ms(1);
            p.leave();

            p.leave();
        }
        {
            p.enter_noguard("render");

            p.enter_noguard("cull");
            std::thread::sleep_ms(1);
            p.leave();

            p.enter_noguard("gpu submit");
            std::thread::sleep_ms(2);
            p.leave();

            p.enter_noguard("gpu wait");
            std::thread::sleep_ms(10);
            p.leave();

            p.leave();
        }

        p.end_frame();

        // this would usually depend on a debug flag, or use custom functionality for drawing the
        // debug information.
        if true {
            p.print_timing();
        }
        break;
    }
}

hprof-0.1.3/src/lib.rs
// Copyright Corey Richardson 2015
// Distributed under the Boost Software License, Version 1.0.
// (See accompanying file LICENSE_1_0.txt or copy at
// http://www.boost.org/LICENSE_1_0.txt)

//! A real-time hierarchical profiler.
//!
//! # What is hierarchical profiling?
//!
//! Hierarchical profiling is based on the observation that games are typically
//! organized into a "tree" of behavior. You have an AI system that does path
//! planning, making tactical decisions, etc. You have a physics system that does
//! collision detection, rigid body dynamics, etc. A tree might look like:
//!
//! - Physics
//!     - Collision detection
//!         - Broad phase
//!         - Narrow phase
//!     - Fluid simulation
//!     - Rigid body simulation
//!         - Collision resolution
//!         - Update positions
//! - AI
//!     - Path planning
//!     - Combat tactics
//!     - Build queue maintenance
//! - Render
//!     - Frustum culling
//!     - Draw call sorting
//!     - Draw call submission
//!     - GPU wait
//!
//! A hierarchical profiler will annotate this tree with how much time each step
//! took. This is an extension of timer-based profiling, where a timer is used to
//! measure how long a block of code takes to execute. Rather than coding up a
//! one-time timer, you merely call `Profiler::enter("description of thing")` and
//! a new entry will be made in the profile tree.
//!
//! The idea came from a 2002 article in Game Programming Gems 3, "Real-Time
//! Hierarchical Profiling" by Greg Hjelstrom and Byon Garrabrant from Westwood
//! Studios. They report having thousands of profile nodes active at a time.
//!
//! There are two major ways to use this library: with explicit profilers, and with an implicit
//! profiler.
//!
//! # Implicit (thread-local) profiler
//!
//! To use the implicit profiler, call `hprof::start_frame()`, `hprof::end_frame()`, and
//! `hprof::enter("name")`. Destructors will take care of the rest. You can access the profiler
//! using `hprof::profiler()`.
//!
//! # Explicit profilers
//!
//! Use `Profiler::new()` and pass it around/store it somewhere (for example, using
//! [`current`](https://github.com/PistonDevelopers/current)).

#[macro_use]
extern crate log;
extern crate clock_ticks;

use std::cell::{Cell, RefCell};
use std::rc::Rc;

thread_local!(static HPROF: Profiler = Profiler::new("root profiler"));

/// A single tree of profile data.
pub struct Profiler {
    root: Rc<ProfileNode>,
    current: RefCell<Rc<ProfileNode>>,
    enabled: Cell<bool>,
}

/// A "guard" for calling `Profiler::leave` when it is destroyed.
pub struct ProfileGuard<'a>(&'a Profiler);
impl<'a> Drop for ProfileGuard<'a> {
    fn drop(&mut self) {
        self.0.leave()
    }
}

// Return early from the enclosing method when the profiler is disabled.
macro_rules! early_leave {
    ($slf:ident) => (if !$slf.enabled.get() { return })
}

impl Profiler {
    /// Create a new profiler with the given name for the root node.
    pub fn new(name: &'static str) -> Profiler {
        let root = Rc::new(ProfileNode::new(None, name));
        root.call();
        Profiler { root: root.clone(), current: RefCell::new(root), enabled: Cell::new(true) }
    }

    /// Enter a profile node for `name`, returning a guard object that will `leave` on destruction.
    pub fn enter(&self, name: &'static str) -> ProfileGuard {
        self.enter_noguard(name);
        ProfileGuard(self)
    }

    /// Enter a profile node for `name`.
    pub fn enter_noguard(&self, name: &'static str) {
        early_leave!(self);
        {
            let mut curr = self.current.borrow_mut();
            if curr.name != name {
                *curr = curr.make_child(curr.clone(), name);
            }
        }
        self.current.borrow().call();
    }

    /// Leave the current profile node.
    pub fn leave(&self) {
        early_leave!(self);
        let mut curr = self.current.borrow_mut();
        if curr.ret() {
            if let Some(parent) = curr.parent.clone() {
                *curr = parent;
            }
        }
    }

    /// Print out the current timing information in a very naive way.
    pub fn print_timing(&self) {
        println!("Timing information for {}:", self.root.name);
        for child in &*self.root.children.borrow() {
            child.print(2);
        }
    }

    /// Return the root profile node for inspection.
    ///
    /// This root will always be valid and reflect the current state of the `Profiler`.
    /// It is not advised to inspect the data between calls to `start_frame` and `end_frame`.
    pub fn root(&self) -> Rc<ProfileNode> {
        self.root.clone()
    }

    /// Finish a frame.
    ///
    /// Logs an error if there are pending `leave` calls, and later attempts to
    /// print timing data will be met with sadness in the form of `NaN`s.
    pub fn end_frame(&self) {
        early_leave!(self);
        if &*self.root as *const ProfileNode as usize != &**self.current.borrow() as *const ProfileNode as usize {
            error!("Pending `leave` calls on Profiler::frame");
        } else {
            self.root.ret();
        }
    }

    /// Start a frame.
    ///
    /// Resets timing data. Logs an error if there are pending `leave` calls, but there are
    /// otherwise no ill effects.
    pub fn start_frame(&self) {
        early_leave!(self);
        if &*self.root as *const ProfileNode as usize != &**self.current.borrow() as *const ProfileNode as usize {
            error!("Pending `leave` calls on Profiler::frame");
        }
        *self.current.borrow_mut() = self.root.clone();
        self.root.reset();
        self.root.call();
    }

    /// Disable the profiler.
    ///
    /// All calls until `enable` will do nothing.
    pub fn disable(&self) {
        self.enabled.set(false);
    }

    /// Enable the profiler.
    ///
    /// Calls will take effect until `disable` is called.
    pub fn enable(&self) {
        self.enabled.set(true);
    }

    /// Toggle the profiler enabledness.
    pub fn toggle(&self) {
        self.enabled.set(!self.enabled.get());
    }
}

/// A single node in the profile tree.
///
/// *NOTE*: While the fields are public and are a cell, it is not advisable to modify them.
pub struct ProfileNode {
    pub name: &'static str,
    /// Number of calls made to this node.
    pub calls: Cell<u32>,
    /// Total time in ns used by this node and all of its children.
    ///
    /// Computed after the last pending `ret`.
    pub total_time: Cell<u64>,
    /// Timestamp in ns when the first `call` was made to this node.
    pub start_time: Cell<u64>,
    /// Number of recursive calls made to this node since the first `call`.
    pub recursion: Cell<u32>,
    /// Parent in the profile tree.
    pub parent: Option<Rc<ProfileNode>>,
    // TODO: replace this Vec with an intrusive list.
    // Use containerof?
    /// Child nodes.
    pub children: RefCell<Vec<Rc<ProfileNode>>>,
}

impl ProfileNode {
    pub fn new(parent: Option<Rc<ProfileNode>>, name: &'static str) -> ProfileNode {
        ProfileNode {
            name: name,
            calls: Cell::new(0),
            total_time: Cell::new(0),
            start_time: Cell::new(0),
            recursion: Cell::new(0),
            parent: parent,
            children: RefCell::new(Vec::new())
        }
    }

    /// Reset this node and its children, setting relevant fields to 0.
    pub fn reset(&self) {
        self.calls.set(0);
        self.total_time.set(0);
        self.start_time.set(0);
        self.recursion.set(0);
        for child in &*self.children.borrow() {
            child.reset()
        }
    }

    /// Create a child named `name`, or return the existing child with that name.
    pub fn make_child(&self, me: Rc<ProfileNode>, name: &'static str) -> Rc<ProfileNode> {
        let mut children = self.children.borrow_mut();
        for child in &*children {
            if child.name == name {
                return child.clone()
            }
        }
        let new = Rc::new(ProfileNode::new(Some(me), name));
        children.push(new.clone());
        new
    }

    /// Enter this profile node.
    pub fn call(&self) {
        self.calls.set(self.calls.get() + 1);
        let rec = self.recursion.get();
        if rec == 0 {
            self.start_time.set(clock_ticks::precise_time_ns());
        }
        self.recursion.set(rec + 1);
    }

    /// Return from this profile node, returning true if there are no pending recursive calls.
    pub fn ret(&self) -> bool {
        let rec = self.recursion.get();
        if rec == 1 {
            let time = clock_ticks::precise_time_ns();
            let durr = time - self.start_time.get();
            self.total_time.set(self.total_time.get() + durr);
        }
        self.recursion.set(rec - 1);
        rec == 1
    }

    /// Print out the current timing information in a very naive way.
    ///
    /// Uses `indent` to determine how deep to indent the line.
    pub fn print(&self, indent: u32) {
        for _ in 0..indent {
            print!(" ");
        }
        // Percentages are relative to the parent's total time; a node without a
        // parent is compared against itself.
        let parent_time = self.parent
                              .as_ref()
                              .map(|p| p.total_time.get())
                              .unwrap_or(self.total_time.get()) as f64;
        let percent = 100.0 * (self.total_time.get() as f64 / parent_time);
        if percent.is_infinite() {
            println!("{name} - {calls} * {each} = {total} @ {hz:.1}hz",
                name  = self.name,
                calls = self.calls.get(),
                each  = Nanoseconds((self.total_time.get() as f64 / self.calls.get() as f64) as u64),
                total = Nanoseconds(self.total_time.get()),
                hz    = self.calls.get() as f64 / self.total_time.get() as f64 * 1e9f64
            );
        } else {
            println!("{name} - {calls} * {each} = {total} ({percent:.1}%)",
                name    = self.name,
                calls   = self.calls.get(),
                each    = Nanoseconds((self.total_time.get() as f64 / self.calls.get() as f64) as u64),
                total   = Nanoseconds(self.total_time.get()),
                percent = percent
            );
        }
        for c in &*self.children.borrow() {
            c.print(indent+2);
        }
    }
}

/// Access the implicit (thread-local) profiler.
///
/// The `transmute` extends the thread-local borrow to `'static`, so the returned
/// reference must not be used after the owning thread has exited.
pub fn profiler() -> &'static Profiler {
    HPROF.with(|p| unsafe { std::mem::transmute(p) } )
}

/// Enter a profile node for `name` on the implicit profiler.
pub fn enter(name: &'static str) -> ProfileGuard<'static> {
    HPROF.with(|p| unsafe { std::mem::transmute::<_, &'static Profiler>(p) }.enter(name) )
}

/// Start a frame on the implicit profiler.
pub fn start_frame() {
    HPROF.with(|p| p.start_frame())
}

/// Finish a frame on the implicit profiler.
pub fn end_frame() {
    HPROF.with(|p| p.end_frame())
}

// used to do a pretty printing of time
struct Nanoseconds(u64);

impl std::fmt::Display for Nanoseconds {
    fn fmt(&self, f: &mut std::fmt::Formatter) -> std::fmt::Result {
        if self.0 < 1_000 {
            write!(f, "{}ns", self.0)
        } else if self.0 < 1_000_000 {
            write!(f, "{:.1}us", self.0 as f64 / 1_000.)
        } else if self.0 < 1_000_000_000 {
            write!(f, "{:.1}ms", self.0 as f64 / 1_000_000.)
        } else {
            write!(f, "{:.1}s", self.0 as f64 / 1_000_000_000.)
        }
    }
}
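
The reporting logic in `ProfileNode::print` and `Nanoseconds` can be tried without the crate. What follows is a minimal standalone sketch, not part of `hprof`: the helper names `format_ns`, `share_of_parent`, and `report_line` are hypothetical and exist only for illustration. It mirrors the unit scaling and the percent-of-parent calculation, with one deliberate difference: a zero parent total yields no percentage, rather than the `NaN` or infinite value the crate's arithmetic would produce.

```rust
/// Render a nanosecond count with a coarse unit suffix (ns, us, ms, s),
/// using the same thresholds as the crate's `Nanoseconds` formatter.
fn format_ns(ns: u64) -> String {
    if ns < 1_000 {
        format!("{}ns", ns)
    } else if ns < 1_000_000 {
        format!("{:.1}us", ns as f64 / 1_000.0)
    } else if ns < 1_000_000_000 {
        format!("{:.1}ms", ns as f64 / 1_000_000.0)
    } else {
        format!("{:.1}s", ns as f64 / 1_000_000_000.0)
    }
}

/// Percentage of the parent's total time spent in one child.
/// Returns `None` when the parent total is zero (an assumption of this
/// sketch; the crate itself does not guard this case).
fn share_of_parent(child_ns: u64, parent_ns: u64) -> Option<f64> {
    if parent_ns == 0 {
        None
    } else {
        Some(100.0 * child_ns as f64 / parent_ns as f64)
    }
}

/// Build one report line: name, scaled time, and share of the parent if known.
fn report_line(name: &str, child_ns: u64, parent_ns: u64) -> String {
    match share_of_parent(child_ns, parent_ns) {
        Some(pct) => format!("{} - {} ({:.1}%)", name, format_ns(child_ns), pct),
        None => format!("{} - {}", name, format_ns(child_ns)),
    }
}
```

Calling `report_line("collision", 1_140_731, 2_258_292)` with the `collision` and `physics` totals from the README's example output reproduces the roughly 50.5% share shown there.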