For interested parties, this page describes my working skill set and the domains in which I have prior experience. I support these claims with a detailed list of my projects and publications.
Resume / Academic CV / Blog / LinkedIn
As a developer, writing code is the core of my regular workflow. I have experience building software in languages ranging from systems-level C to Python and Java. For analysis and system administration, I have employed scripting languages like awk and Python to process and clean raw data files for input into databases, or into data analysis environments like Jupyter and RStudio. A detailed list of familiar languages and software tools can be found on my experience timeline.
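As a small illustration of the kind of cleaning step I mean (the data, field names, and rules here are hypothetical, not from any real engagement), a Python sketch that normalizes a messy CSV export before loading it elsewhere:

```python
import csv
import io

# Hypothetical raw export: inconsistent casing, stray whitespace, blank fields.
raw = """name,dept,hours
Alice ,ENG,40
bob,eng,
CAROL,Eng,35
"""

def clean_rows(text):
    """Normalize whitespace/case and drop rows missing required fields."""
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        if not row["hours"]:
            continue  # skip incomplete records
        yield {
            "name": row["name"].strip().title(),
            "dept": row["dept"].strip().lower(),
            "hours": int(row["hours"]),
        }

rows = list(clean_rows(raw))
```

In practice the same shape of script works equally well in awk; the point is a small, auditable transformation between the raw file and the database.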
Looking at the data is a critical step in almost any data analysis. As part of the InfoVis group at UBC, I have discussed visualization techniques and evaluations since 2005. In my own work, I use Tableau, or dplyr and ggplot2, for exploration of static tables, and GNU Octave/MATLAB, RStudio, and IPython notebooks for iterative exploration while developing algorithms.
Statistics and machine learning offer a suite of techniques for building predictive models of data distributions, along with guidance about how much confidence to place in those models. As head of analytics at Coho Data, I have employed time-series analysis and clustering to analyze customer usage patterns. As a doctoral student, I have devised unsupervised learning techniques for data exploration. As a quant, I have used a variety of supervised learning techniques for fitting model parameters, and statistical hypothesis testing for confirmatory analyses.
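To sketch the clustering side of this, here is a generic one-dimensional k-means on made-up usage numbers (an illustration of the technique, not Coho's actual pipeline or data):

```python
import random

def kmeans_1d(values, k, iters=20, seed=0):
    """Minimal 1-D k-means: assign each point to its nearest
    centroid, then recenter each centroid on its cluster mean."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# e.g. daily I/O volumes drawn from two distinct usage regimes
usage = [10, 12, 11, 9, 95, 100, 102, 98]
centers = kmeans_1d(usage, k=2)
```

Real usage data would of course be multivariate and need feature engineering first, but the cluster-then-inspect loop is the same.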
Familiarity with computer graphics APIs like WebGL/OpenGL has yielded at least two benefits for me as a researcher/analyst. First, I have leveraged the graphics pipeline to build novel visualization techniques that scale to large datasets. Second, I have exploited the parallelism of graphics processors (GPUs) to speed up existing analysis techniques.
My team and I designed and developed an analytics system for analyzing storage product performance, events, and alerts. The system is deployed on AWS, is designed to scale easily with demand, and uses a flexible Elasticsearch back end.
I have also developed different methods for analyzing the workload statistics of computer storage systems. I am interested in analyzing block-level storage traces for:
My exposure to finance is on the trading side of things. I have researched the following:
I have been involved in several projects to help researchers navigate unordered collections of documents. These projects are:
My experience with scientific computing has focused on numerical linear algebra and nonlinear optimization; I have taken graduate courses in both topics. In my own research, I have experience with:
I consulted with the BC Cancer Agency to develop software for analyzing DNA copy number alterations. The project involved collaborating with a lead researcher to design a visual console for examining copy numbers and labels across chromosomes.
The Counter Stack is a compressed representation of a request stream. With Counter Stacks, one can calculate cache-usage statistics in sub-linear space for any time interval without storing the entire trace.
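The quantity being estimated is the LRU stack distance of each request. As a point of reference, here is the exact (and decidedly not sub-linear) computation on a toy trace; Counter Stacks replace the distinct-count below with compact probabilistic cardinality counters:

```python
def stack_distances(trace):
    """Exact LRU stack distance per request: the number of distinct
    addresses referenced since this address was last seen (None on a
    first reference). A hit occurs in an LRU cache large enough to
    cover this distance."""
    last_seen = {}  # address -> index of its most recent reference
    dists = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # distinct addresses strictly between the two references
            dists.append(len(set(trace[last_seen[addr] + 1:i])))
        else:
            dists.append(None)
        last_seen[addr] = i
    return dists

trace = ["a", "b", "c", "a", "b", "b"]
```

From these distances one can read off hit rates for every cache size at once, which is what makes the sub-linear approximation valuable on long traces.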
Color Thief is an iOS app for color transfer. It harnesses a fast GPU code I wrote for quickly swapping colors between photos using phone hardware.
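One common formulation of color transfer is per-channel statistics matching in the style of Reinhard et al.; the app's actual GPU method may differ, but the idea can be sketched for a single channel as:

```python
from statistics import mean, pstdev

def transfer_channel(src, tgt):
    """Shift and scale src's values so their mean and spread match
    tgt's (per-channel statistics matching, a la Reinhard et al.)."""
    mu_s, mu_t = mean(src), mean(tgt)
    sd_s = pstdev(src) or 1.0  # guard a flat source channel
    sd_t = pstdev(tgt)
    return [(v - mu_s) * sd_t / sd_s + mu_t for v in src]
```

Applied independently to each channel of a suitable color space, this moves one photo's palette onto another; the per-pixel arithmetic is embarrassingly parallel, which is why it maps so well to phone GPUs.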
Overview is a tool for exploring large, unlabelled document datasets. It is targeted at helping journalists quickly analyze text dumps.
Glimmer is an algorithm for fast multidimensional scaling. Significant speed gains are achieved by leveraging GPU parallelism.
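The core force update behind stochastic MDS can be sketched serially in Python (a generic sketch of the technique; Glimmer's actual algorithm adds a multilevel scheme and runs these updates in parallel on the GPU):

```python
import math
import random

def mds_2d(D, iters=3000, lr=0.05, seed=1):
    """Stochastic force-directed MDS: repeatedly pick a pair of
    points and nudge them so their layout distance moves toward
    the target distance D[i][j]."""
    n = len(D)
    rng = random.Random(seed)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iters):
        i, j = rng.sample(range(n), 2)
        dx = pos[j][0] - pos[i][0]
        dy = pos[j][1] - pos[i][1]
        d = math.hypot(dx, dy) or 1e-9
        f = lr * (d - D[i][j]) / d  # signed step along the pair's axis
        pos[i][0] += f * dx
        pos[i][1] += f * dy
        pos[j][0] -= f * dx
        pos[j][1] -= f * dy
    return pos

# three points with unit pairwise distances: an equilateral triangle
D = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
layout = mds_2d(D)
```

Because each pairwise update is independent, thousands of them can be evaluated simultaneously, which is where the GPU speedup comes from.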
Using the prefuse toolkit, I built a graph visualization tool based on a paper by Jarke van Wijk.
The MetaCombine Project at Emory University assessed using semantic clustering as a means of exploring digital libraries.
This report describes an algorithm for computing fast nonnegative matrix factorizations.
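For context, the standard Lee–Seung multiplicative updates that such fast methods improve upon can be sketched as follows (a generic pure-Python baseline for illustration, not the report's algorithm):

```python
import random

def matmul(A, B):
    """Dense matrix product on nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(V, r, iters=500, seed=0):
    """Lee-Seung multiplicative updates for V ~= W H with W, H >= 0."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(r)]
    eps = 1e-9  # avoid division by zero
    for _ in range(iters):
        WT = transpose(W)
        num, den = matmul(WT, V), matmul(matmul(WT, W), H)
        H = [[H[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(r)]
        HT = transpose(H)
        num, den = matmul(V, HT), matmul(W, matmul(H, HT))
        W = [[W[i][j] * num[i][j] / (den[i][j] + eps) for j in range(r)]
             for i in range(n)]
    return W, H
```

The multiplicative form keeps both factors nonnegative automatically, but converges slowly, which is precisely what motivates faster alternatives.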
In 2004, my PC game, Growbot, was one of the Student Showcase winners in the Independent Games Festival.
I1 is an algorithm for stochastic optimization based on information filtering. It is designed to complement K1, a related optimization algorithm based on the Kalman filter.
I wrote a tutorial on matrix reordering algorithms. Minimum-degree reordering reduces fill-in during sparse matrix factorization, making it more efficient to solve linear systems (and hence to apply the inverse).
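The greedy idea behind minimum degree can be sketched on a graph adjacency structure (a simplified illustration; production orderings such as approximate minimum degree use cheaper degree estimates):

```python
def min_degree_order(adj):
    """Greedy minimum-degree ordering: repeatedly eliminate the vertex
    with the fewest neighbours, connecting its remaining neighbours
    into a clique (the fill edges that elimination would create)."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    order = []
    while adj:
        v = min(adj, key=lambda u: (len(adj[u]), u))  # lowest degree wins
        nbrs = adj.pop(v)
        for u in nbrs:
            adj[u].discard(v)
        for u in nbrs:
            for w in nbrs:
                if u != w:
                    adj[u].add(w)  # fill edge among v's neighbours
        order.append(v)
    return order

# path graph a - b - c - d: eliminating endpoints first creates no fill
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
order = min_degree_order(path)
```

Eliminating low-degree vertices first keeps the cliques, and hence the fill-in, small, which is why the resulting factors stay sparse.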