Data Fundamentals

Unit 1 - Numerical

multidimensional numerical array / nD array Fundamental data type which holds rectangular, multidimensional arrays of numbers of uniform type in a dense compact form, with a large number of efficient operations that can be applied in a vectorised form.
vectorised / vectorised computation code which does computations on many values simultaneously, without explicit iteration over those values
Single Instruction Multiple Data / SIMD A hardware instruction that applies the same operation (e.g. addition) to multiple values in a single operation.
GPU / graphics processing unit Hardware specialised for applying the same numerical operation to many values in parallel; much faster than a CPU for many array computations.
vector A one dimensional array of numbers.
matrix A two dimensional array of numbers.
tensor an array with >2 dimensions.
linear algebra The algebra of matrices (strictly, of linear maps)
linear map A function which satisfies linearity (see below).
vector space A mathematical space which consists of tuples of numbers (usually reals) and two operations: vector addition and scalar multiplication
array functions functions which are applied to nD arrays
- array arithmetic arithmetic operations on nD arrays, like elementwise addition
- indexing and slicing Selecting individual elements, or rectangular blocks of elements from an nD array.
- array generation array functions which create new arrays, like np.zeros
- array rearranging array functions which adjust the apparent layout of arrays, like transpose, reshape and fliplr
- order operations array functions which use the rank ordering of values, like np.sort, np.argsort, np.argmin, etc.
- aggregate functions array functions which summarise an array, like np.sum or np.mean
Vector operations array functions which depend on the mathematical structure of vectors, like the cross product
Matrix operations functions which depend on the mathematical structure of matrices, like the determinant or the inverse
Signal processing operations functions which treat arrays as sampled signals, like convolution and the DFT.
NumPy Library providing nD arrays for Python.
SciPy Library providing scientific operations on nD arrays for Python
Matplotlib Library providing plotting facilities for nD arrays for Python.
dtype Datatype of an nD array, e.g. float32
shape Size of an nD array along each dimension, e.g. shape (4,8) is a 2-dimensional nD array with 4 rows and 8 columns
rank number of dimensions of an nD array (rank 1→vector, rank 2→matrix, rank n→tensor)
mutable being able to be changed after creation
slice a rectangular sub-region of an array (may have “skipped” elements)
elementwise arithmetic arithmetic operation applied to each element of an nD array
scalar arithmetic array/scalar operations, like x+2
array arithmetic array/array operations, like x+x
Boolean array an array of True/False values, the result of a Boolean test on arrays. Can be used directly as an index in NumPy.
broadcasting rules for automatically replicating an array to allow arithmetic between arrays of different shapes, as long as the trailing dimensions either match exactly or are 1 (singleton)
transpose exchange of rows and columns of a 2D array, or more generally reversal of the order of dimensions
tiling repeating an array in a regular pattern
masking selecting non-rectangular elements of an array, e.g. using a Boolean array as an index
cumulative sum a running sum
argsort the set of (1D) array indices such that indexing the array in that order would result in a sorted array
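
A minimal NumPy sketch of several of the terms above (vectorised arithmetic, masking, broadcasting, argsort and cumulative sum). The array values are arbitrary illustrations, not taken from the notes.

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0])      # a vector (rank 1), dtype float64

# Vectorised arithmetic: no explicit loop over the elements
doubled = x * 2                    # scalar arithmetic
summed = x + x                     # array arithmetic (elementwise)

# Boolean array used directly as an index (masking)
mask = x > 1.5
selected = x[mask]

# Broadcasting: a (2, 3) matrix plus a (3,) vector; the vector is
# conceptually replicated along the first dimension
m = np.zeros((2, 3))
shifted = m + x

# Order and aggregate operations
order = np.argsort(x)              # indices that would sort x
running = np.cumsum(x)             # cumulative (running) sum
```

Indexing with `x[order]` returns the sorted array, which is the defining property of argsort.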

Formulae

Notation for a vector

x, \quad x \in R^{d}, \quad x = [x_{1}, x_{2}, \dots, x_{d}]

Notation for a Matrix

A, \quad A \in R^{n \times m}, \quad A = \begin{bmatrix} a_{1,1} & a_{1,2} & \dots & a_{1,m} \\ a_{2,1} & a_{2,2} & \dots & a_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} & a_{n,2} & \dots & a_{n,m} \end{bmatrix}

Unit 2 - Floats And Strided

striding The trick used to treat a flat sequence of numbers as a multidimensional array, by storing the offsets to jump to the next element in any dimension
stride the offset (in bytes, typically) to jump to get to the next value in a given dimension
C ordering / row-major last dimension changes fastest
Fortran ordering / column-major first dimension changes fastest
integers whole numbers
floating point numbers approximations to real numbers with variable absolute precision across their range
numerical issues / numerical precision problems arising from the approximation of real numbers by floating-point numbers
scientific notation numbers written with a single digit before a decimal point, and an explicit exponent, like $5.242 e 4 = 5.242 \times 1 0^{4} = 52420$
float a floating point number
sign bit the bit indicating whether a float is negative or positive
exponent the part of a floating point number that specifies the bitshift to be applied (the base 2 exponent)
mantissa the part of the floating point number specifying the fractional part of a (binary) number, following 1.xxxxx…
IEEE754 standard defining storage formats and standard operations on floating point numbers
binary32 / float32 / single precision IEEE754 32 bit floating point numbers
binary64 / float64 / double precision IEEE754 64 bit floating point numbers
float128 / quadruple precision rare format for very precise computations
float256 / octuple precision even rarer format for super precise computations (like astronomical predictions)
floating point exception / FPE an event triggered by a floating point operation which is problematic.
- Invalid Operation operation without defined value, like 0/0 or inf−inf
- Division by Zero division by zero
- Overflow number exceeds maximum number storable for the float type
- Underflow number is closer to zero than the smallest (absolute) number storable for the float type
- Inexact result cannot be represented exactly and has been rounded
Not A Number / NaN special “placeholder” number generated by invalid operations, which will propagate to all other operations in which NaN is involved.
roundoff error error due to inexact representation of real numbers as floats
relative error magnitude of difference between real number and float approximation, normalised by the magnitude of the real number
endianness the order in which multi-byte words are stored in memory
- little endian least significant byte first
- big endian most significant byte first
row vector a vector stored as a 1xN array
column vector a vector stored as an Nx1 array
singleton dimension an array dimension of 1
promoting increasing the rank of an array by introducing a new singleton dimension, like (3,3) ‐> (3,1,3)
squeezing removing all singleton dimensions, like (1,3,1,4) ‐> (3,4)
Einstein summation notation compact notation to exchange array dimensions, and also compute sums and products
First rule of vectorisation: no for loops
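
A short NumPy sketch of striding, promoting, squeezing and NaN propagation as described above. The shapes are arbitrary examples; the stride values follow from float32 occupying 4 bytes.

```python
import numpy as np

a = np.zeros((4, 8), dtype=np.float32)   # C ordering (row-major) by default
f = np.asfortranarray(a)                 # same values, Fortran (column-major) layout

# In C order the last dimension changes fastest: one step along it moves
# 4 bytes, one step along the first dimension moves a whole row (8 * 4 bytes)
c_strides = a.strides                    # (32, 4)
f_strides = f.strides                    # (4, 16)

# Transposing only swaps the strides; no data is copied
t_strides = a.T.strides                  # (4, 32)

# Promoting and squeezing singleton dimensions
promoted = np.zeros((3, 3))[:, np.newaxis, :]    # (3, 3) -> (3, 1, 3)
squeezed = np.squeeze(np.zeros((1, 3, 1, 4)))    # (1, 3, 1, 4) -> (3, 4)

# NaN propagates through arithmetic and is not equal to itself
nan_sum = np.nan + 1.0
```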

Formulae

Relative precision

Floating point representation

Relative precision of floating point representation

\underbrace{\epsilon}_{\text{Relative error}} = \frac{| \overbrace{\text{float}(x)}^{\text{Closest floating-point number}} - \overbrace{x}^{\text{Real number}} |}{\underbrace{|x|}_{\text{Absolute value of real number}}}
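
As a worked example of the relative error, the sketch below measures how far the float32 closest to 0.1 is from the real number 0.1. It uses Python's `fractions.Fraction` to hold both values exactly; the choice of 0.1 and of float32 is only for illustration.

```python
import numpy as np
from fractions import Fraction

x = Fraction(1, 10)                      # the real number 0.1, held exactly
fx = Fraction(float(np.float32(0.1)))    # the closest float32, as an exact fraction

rel_error = abs(fx - x) / abs(x)         # |float(x) - x| / |x|, computed exactly
bound = Fraction(1, 2**24)               # 2^-(t+1) with t = 23 mantissa bits (float32)
```

Here `float(rel_error)` is roughly 1.5e-8, below the float32 bound of about 6e-8.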

Machine Epsilon

IEEE754 guarantee for relative precision (machine precision $ϵ$ ) for a $t$ bit mantissa floating point number.

\underbrace{\epsilon}_{\text{Relative error}} = \frac{1}{2} \cdot 2^{-\overbrace{t}^{\text{Bits in the mantissa}}} = 2^{-(t + 1)}
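
A quick check of this value for float64, assuming t = 52 stored mantissa bits. Note that NumPy's `np.finfo(...).eps` uses a different convention (the spacing between 1.0 and the next float), which is twice the value defined here.

```python
import numpy as np

t = 52                                   # stored mantissa bits in float64
eps_notes = 2.0 ** -(t + 1)              # relative precision as defined above
eps_numpy = np.finfo(np.float64).eps     # spacing between 1.0 and the next float

# Adding half a spacing to 1.0 rounds back to 1.0 (the tie goes to the even mantissa)
unchanged = (1.0 + eps_notes) == 1.0
# Adding a full spacing gives a distinct number
changed = (1.0 + eps_numpy) != 1.0
```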

Unit 3 - Visualisation

histograms a plot showing the relative frequencies of values falling into specified bins
scatterplots a plot showing disconnected markers at $x, y$ locations
contour plots a plot showing the value of a 2D function $f (x, y)$ as a collection of isocontour lines
Layered Grammar of Graphics A model for discussing, implementing and understanding scientific visualisations
Stat a transformation of data, like taking the mean
Mapping a mapping of data attributes onto visual attributes; e.g. price onto x location
Scale a scaling of a data attribute onto the corresponding visual attribute
Guide a visual element indicating the scaling and mapping applied, such that the transformation can be reversed by the reader to return to the data units from the visual units; for example, tick marks.
Geom a geometric object used to visually represent data, like a line or point.
Coord a special scaling that maps data onto 2D locations on the page/screen
Layer visual representation of data with a single type of geom, on a common set of mappings, coords and scales. Multiple layers may be drawn on the same coords, forming a single facet.
Facet Multiple facets represent multiple views of data, on separate coords.
Figure A collection of facets.
Caption Text which describes a figure to help the reader interpret it.
Axes The visual manifestation of coords
units The measurement quantities for a data attribute, like miles-per-hour
Ticks visual guides to indicate equal-spaced steps in the original data units, visible only on outer axes.
legend a guide used to identify distinct layers rendered on a single facet
title a guide used to label a whole facet or figure
grid visual guides to indicate equal-spaced steps in the original data units, visible across the whole facet.
annotations textual labels and accompanying arrows etc. used to highlight important parts of a figure.
Markers point geoms
Patches polygon/area geoms
independent variable a variable which causes some relationship. Speed of movement causes stopping distance to vary, for example.
dependent variable a variable which is caused by some relationship. Stopping distance is dependent upon speed of movement, for example.
line geoms A geom representing a connection between two data points. May vary in style, thickness, colour and opacity.
point geoms A geom representing a single point in 2D space. May vary in shape, colour, size and opacity
ribbon plot A plot using an area geom which behaves as a line of varying width
dimensional quantities quantities with real-world units, like miles per hour
dimensionless quantities quantities without real-world units, like Mach number
aggregate summary statistics statistics which summarise some larger dataset compactly, like the mean
smoothing and regression finding simple, smooth functions which approximate observed data
binning splitting data sets into discrete “bins”, each of which spans a range of data units; used to form histograms
rank plot a plot of values arranged in rank order
Box plot a plot used to summarise the distribution of a collection of values, showing the median, interquartile range, extrema and outliers
- interquartile range the range between the 25% and 75% percentiles of a dataset. Drawn as a box on Box plot.
- median the value which exactly splits a dataset into one half smaller and one half larger; the 50th percentile. Drawn as a line on Box plot.
- extrema values representing the extreme values of a distribution, for example the 5th and 95th percentiles. Drawn as whiskers on a Box plot
- outliers values outside of the whiskers. Drawn as “fliers”, usually crosses or circles on each data point in a Box plot
violin plot a plot similar to a Box plot which shows an approximate, smooth density instead of fixed nonparametric statistics
linear regression finding a line of best fit to observed data
colour map a mapping from data units to colours
colour bar a guide to indicate the interpretation of a colour map
monotonically varying brightness apparent visual brightness which continuously increases as the data attribute increases
perceptually uniform visual mapping where apparent perceptual change corresponds linearly and uniformly to data unit changes.
false contours apparent contours in data caused by bad colour maps and not by underlying discrete steps in data
dash patterns patterns applied to style line geoms
staircase / step a type of plot where lines are only drawn horizontally and vertically to indicate instantaneous changes of value
bar chart a plot which represents data units as the height of rectangular area geoms.
transparency / opacity / alpha visual appearance of transparency to allow geoms to be partially visible behind other geoms
axis limits range across entire axis, used to define the coord scaling
data units the units of the original data (e.g. miles per hour)
visual units the units displayed (e.g. inches, pixels or RGB values)
aspect ratio the ratio of width of a plot to its height.
projection the model used to map data units to visual units; for example Cartesian or polar
logarithmic axes visual units represent the logarithm of data units
symlog modified logarithmic scale to allow for negative values
polar coordinates visual coordinates in terms of angle and distance
double $y$ axes two different y units overlaid on the same facet
standard deviation a measure of the typical deviation from the mean (the square root of the variance)
nonparametric intervals intervals based on summary statistics of data, like the interquartile range
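
A rough sketch of the statistics behind a Box plot and of binning for a histogram, using NumPy only (no plotting). The data values are made up, and NumPy's default linearly interpolated percentiles are assumed.

```python
import numpy as np

values = np.array([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 4.0, 5.0, 20.0])

# Stats drawn on a Box plot: the median and the 25th/75th percentiles
q25, median, q75 = np.percentile(values, [25, 50, 75])
iqr = q75 - q25                          # interquartile range (extent of the box)

# Binning for a histogram: counts per bin, plus the bin edges
counts, edges = np.histogram(values, bins=4, range=(0.0, 20.0))
```

The value 20.0 sits far from the rest and would typically be drawn as an outlier.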

Unit 4 - Vectors

embedding The representation of objects as points in a vector space (e.g. a word embedding)
orthogonal direction A direction at 90 degrees to (unrelated to) some other direction
vector space A mathematical space which consists of tuples of numbers (usually reals) and two operations: vector addition and scalar multiplication
scalar multiplication multiplying a vector by a scalar value (a single real number)
vector addition adding two vectors, which is just the elementwise sum
norm the length of a vector. Norms must be chosen and there are many choices of norm.
inner product The dot product (sum of elementwise products) of two vectors
topological vector space A vector space equipped with a topology, here one induced by a norm (strictly, a normed vector space)
feature vectors The representation of “features” which define properties as points in a vector space; critical in machine learning.
clustered Partitioned into distinct groups according to spatial affinity
vector quantisation Clustering collections of vectors and replacing the original vectors with cluster centres
weighted sums a sum of values, where each value is scaled by some weight
linearly interpolate find points along the straight line connecting two points
Euclidean norm the “usual” spatial distance, or $L_{2}$ norm
Euclidean space a vector space equipped with the Euclidean norm
direction / length vectors can be considered to have a direction (unit length vector) and a length (a scaling factor applied to the direction)
cosine distance measurement of angle between two vectors, in spaces where inner products are defined
unit vectors a vector normalised to have unit length (norm=1) for some norm
document vectors a way of representing whole text documents as single vectors
outer product the matrix representing every possible product of the elements of two vectors
surface normal the vector pointing “out” from a 3D surface
mean vector the mean of a collection of vectors
geometric centroid the centre of a collection of vectors, which is just the mean vector
zero mean mean at the origin
variance spread of values, defined as the mean of the squared differences from the mean
standard deviation square root of variance, useful because it is in the same units as the original values
correlations linear relationships between variables
ellipse a squashed circle
covariance matrix a matrix representing the spread of vector valued data sets
error ellipse an ellipse which can be computed from the covariance matrix and will (in general) capture some proportion of the data points.
geometric median the vector which minimises the sum of distances to all vectors in a data set
high dimensional vector spaces a vector space with $D > 3$
curse of dimensionality (very important) the problem that increasing dimensionality exponentially increases the volume of a vector space, making generalisation in high dimensions very hard.
linear maps a function which satisfies the conditions of linearity
linearity the property that $f (a + b) = f (a) + f (b); f (c a) = c f (a); f (0) = 0$
linear functions linear maps
parallelotope the generalisation of a parallelogram to higher dimensions. A geometric shape with parallel sides, but not necessarily orthogonal sides.
linear transform A linear map from a vector space onto itself, e.g. $R^{N} \to R^{N}$ .
projection A function which is idempotent, such that $f (f (x)) = f (x)$
real matrices A matrix (2D array) consisting of real valued entries
linear algebra The algebra of matrices
special forms Special kinds of matrices, like diagonal, orthogonal or identity matrices
left multiplication Multiplying a matrix $A$ on the left of an expression $A B$
right multiplication Multiplying a matrix $A$ on the right of an expression $B A$
powers (of a matrix) The effect of repeated multiplication, $A^{4} = AAAA$ for example.
diagonal (of a matrix) The elements $a_{ii}$ of a matrix $A$; runs down a diagonal from top-left to bottom-right.
anti-diagonal (of a matrix) The horizontally flipped diagonal.
real square matrices A matrix that is real, and has equal number of rows and columns
upper triangular A matrix with zeros below the diagonal
lower triangular A matrix with zeros above the diagonal
symmetric A matrix with elements mirrored around the diagonal
skew-symmetric matrix A matrix with elements mirrored around the diagonal, but negated on one side
sparse A matrix that mainly consists of zeros, with a few sparse nonzero elements

Formulae

Vectors

Addition of two vectors

x + y = [x_{1} + y_{1}, x_{2} + y_{2}, \dots, x_{n} + y_{n}]

Scalar multiplication of a vector

c x = [c x_{1}, c x_{2}, \dots, c x_{n}]

Linear interpolation of two vectors (Linear Interpolation)

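\text{lerp}(\underbrace{x}_{\text{vector}}, \underbrace{y}_{\text{vector}}, \underbrace{\alpha}_{\text{scalar}}) = \underbrace{(1 - \alpha) x}_{\text{Proportion of vector } x} + \underbrace{\alpha y}_{\text{Proportion of vector } y}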
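
A minimal sketch of linear interpolation in NumPy. The function name `lerp` follows the formula; the example vectors are arbitrary.

```python
import numpy as np

def lerp(x, y, alpha):
    """Linearly interpolate between x and y; alpha=0 gives x, alpha=1 gives y."""
    return (1 - alpha) * x + alpha * y

x = np.array([0.0, 0.0])
y = np.array([2.0, 4.0])
midpoint = lerp(x, y, 0.5)               # halfway along the line from x to y
```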

Cosine similarity

Cosine of angle between two vectors in terms of normalised dot product

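\underbrace{\cos \theta}_{\text{Angle between } x \text{ and } y} = \frac{\overbrace{x}^{\text{vector}} \cdot \overbrace{y}^{\text{vector}}}{\underbrace{\|x\|}_{\text{Norm of } x} \, \underbrace{\|y\|}_{\text{Norm of } y}}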
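
A sketch of this formula in NumPy, assuming the Euclidean ($L_{2}$) norm; `cosine_similarity` is a hypothetical helper name.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between x and y, using the Euclidean (L2) norm."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

# Vectors pointing the same way give 1; orthogonal vectors give 0
same = cosine_similarity(np.array([1.0, 2.0]), np.array([2.0, 4.0]))
orthogonal = cosine_similarity(np.array([1.0, 0.0]), np.array([0.0, 3.0]))
```

The length of the vectors does not affect the result, only their direction.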

Dot product / inner product

x \cdot y = \sum_{i} x_{i} y_{i}

Mean vector

\underbrace{\text{mean}(x_{1}, x_{2}, \dots, x_{N})}_{\text{Mean vector}} = \underbrace{\frac{1}{N}}_{\text{Number of points}} \sum_{i} \underbrace{x_{i}}_{\text{Vectors}}

Matrix

Identity Matrix

I = \begin{bmatrix} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \end{bmatrix}

diagonal Matrix

D = \begin{bmatrix} d_{1} & 0 & \dots & 0 \\ 0 & d_{2} & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & d_{n} \end{bmatrix}

Non-commutativity and associativity of matrix multiplication

A B \neq B A \text{ (except in special cases)}, \quad (A B) C = A (B C)
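
A quick numerical check of these two properties of matrix multiplication. The matrices are arbitrary small integer examples chosen so the comparison is exact.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [0, 3]])

ab = A @ B                               # swaps the columns of A
ba = B @ A                               # swaps the rows of A
commutes = np.array_equal(ab, ba)        # False for this pair
associates = np.array_equal((A @ B) @ C, A @ (B @ C))
```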

Definition of linearity (for a linear function $f$ and equivalent matrix $A$ )

\begin{align} f(x + y) &= f(x) + f(y) = A(x + y) = A x + A y, \\ f(c x) &= c f(x) = A(c x) = c A x \end{align}
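
The two conditions can be checked numerically for a map defined by a matrix. This is only a spot check on arbitrary example values, not a proof.

```python
import numpy as np

A = np.array([[2.0, 0.0], [1.0, 3.0]])

def f(v):
    """The linear map represented by the matrix A."""
    return A @ v

x = np.array([1.0, 2.0])
y = np.array([-3.0, 0.5])
c = 4.0

additive = np.allclose(f(x + y), f(x) + f(y))      # f(x + y) = f(x) + f(y)
homogeneous = np.allclose(f(c * x), c * f(x))      # f(cx) = c f(x)
```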

Matrix addition

A + B = \begin{bmatrix} a_{1,1} + b_{1,1} & a_{1,2} + b_{1,2} & \dots & a_{1,m} + b_{1,m} \\ a_{2,1} + b_{2,1} & a_{2,2} + b_{2,2} & \dots & a_{2,m} + b_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n,1} + b_{n,1} & a_{n,2} + b_{n,2} & \dots & a_{n,m} + b_{n,m} \end{bmatrix}

Scalar matrix multiplication

c A = \begin{bmatrix} c a_{1,1} & c a_{1,2} & \dots & c a_{1,m} \\ c a_{2,1} & c a_{2,2} & \dots & c a_{2,m} \\ \vdots & \vdots & \ddots & \vdots \\ c a_{n,1} & c a_{n,2} & \dots & c a_{n,m} \end{bmatrix}

Matrix multiplication

\underbrace{C_{ij}}_{\text{Element } ij \text{ of } C} = \sum_{k} \underbrace{a_{ik}}_{\text{Row } i \text{ of } A} \underbrace{b_{kj}}_{\text{Column } j \text{ of } B}
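
The definition can be written out directly. The explicit loops below are only to mirror the formula; in practice NumPy's `A @ B` would be used. `matmul_explicit` is a hypothetical helper name.

```python
import numpy as np

def matmul_explicit(A, B):
    """Compute C[i, j] = sum_k A[i, k] * B[k, j] with explicit loops."""
    n, p = A.shape
    p2, m = B.shape
    assert p == p2, "inner dimensions must match"
    C = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            # row i of A, multiplied elementwise with column j of B, then summed
            C[i, j] = np.sum(A[i, :] * B[:, j])
    return C

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = matmul_explicit(A, B)                # a (2, 3) times a (3, 4) gives a (2, 4)
```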

Outer product (matrix version)

x \otimes y = x^T y

##### Inner product (matrix version)

x \cdot y = x y^T
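
A sketch of both matrix forms, assuming the vectors are stored as row vectors (1xN arrays), which is the convention under which $x^T y$ is the outer product.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0]])          # row vector, shape (1, 3)
y = np.array([[4.0, 5.0, 6.0]])          # row vector, shape (1, 3)

outer = x.T @ y                          # (3, 1) @ (1, 3) -> (3, 3) matrix
inner = x @ y.T                          # (1, 3) @ (3, 1) -> (1, 1) matrix
```

With column vectors the roles of the two expressions swap.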

##### Covariance

\underbrace{C_{ij}}_{\text{Element of covariance matrix } \Sigma} = \frac{1}{N} \sum_{k=1}^{N} \underbrace{(x_{k,i} - \mu_{i})}_{\text{Centered data (i-th component)}} \underbrace{(x_{k,j} - \mu_{j})}_{\text{Centered data (j-th component)}}

##### Covariance Matrix

\underbrace{\Sigma}_{\text{Covariance matrix}} = \frac{1}{N} \underbrace{(X - \mu)}_{\text{Centered data matrix}} \overbrace{(X - \mu)^T}^{\text{Transpose of centered data}}
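
A sketch of the matrix form in NumPy. It assumes the data matrix stores one data point per column (shape components x points), which is the layout under which the product as written gives a square matrix over components. The data is random and only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 500))            # assumed layout: 3 components x 500 points

mu = X.mean(axis=1, keepdims=True)       # mean vector, shape (3, 1)
Xc = X - mu                              # centred data (mu broadcast over columns)
cov = (Xc @ Xc.T) / X.shape[1]           # (1/N) (X - mu)(X - mu)^T, shape (3, 3)
```

This agrees with `np.cov(X, bias=True)`, which also divides by N rather than N - 1.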

P(A), \text{an event is a subset of the sample space, i.e. a set of outcomes}

##### Joint distribution

P(A,B)

##### Conditional from joint distribution

P(A|B) = \frac{P(A,B)}{P(B)}
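
A small sketch for discrete events, using a hypothetical 2x2 joint probability table. The marginal $P(B)$ is obtained by summing the joint over $A$.

```python
import numpy as np

# Hypothetical joint distribution P(A, B): rows index A, columns index B
joint = np.array([[0.10, 0.30],
                  [0.20, 0.40]])

p_b = joint.sum(axis=0)                  # marginal P(B), summing over A
cond_a_given_b = joint / p_b             # P(A|B) = P(A,B) / P(B), broadcast over rows
```

Each column of `cond_a_given_b` is a distribution over $A$ and sums to 1.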

\mathbb{E}[X] = \int_{x} x \, f_{X}(x) \, dx

##### **Expectation of a function of a random variable**

\mathbb{E}[g(X)] = \int_{x} f_{X}(x) g(x) \, dx
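
A numerical sketch of both expectations, approximating the integrals by a sum over a fine grid. It assumes a uniform density on [0, 1] and the function $g(x) = x^2$, both chosen only because the exact answers (1/2 and 1/3) are easy to verify. The first sum uses the standard definition of the mean, the density weighted by $x$.

```python
import numpy as np

n = 100_000
x = (np.arange(n) + 0.5) / n             # midpoints of n equal slices of [0, 1]
dx = 1.0 / n
f_x = np.ones(n)                         # density of the assumed uniform distribution

def g(v):
    return v ** 2

e_x = np.sum(x * f_x * dx)               # approximates E[X] = 1/2
e_gx = np.sum(f_x * g(x) * dx)           # approximates E[g(X)] = 1/3
```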

##### Acceptance probability for Metropolis-Hastings jump from $x$ to $x'$

P(\text{accept}) = \begin{cases} f_{X}(x') / f_{X}(x), & f_{X}(x) > f_{X}(x') \\ 1, & f_{X}(x) \leq f_{X}(x') \end{cases}
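
A minimal sketch of this acceptance rule. The helper names and the Gaussian proposal in `metropolis_step` are assumptions made for illustration; the rule as written applies when the proposal is symmetric.

```python
import numpy as np

def accept_probability(f_x, f_x_new):
    """Acceptance probability for a jump from density value f_x to f_x_new."""
    if f_x_new >= f_x:
        return 1.0                       # always accept moves to equal or higher density
    return f_x_new / f_x                 # otherwise accept with the density ratio

def metropolis_step(x, density, rng, step=1.0):
    """One step using a symmetric Gaussian proposal (a hypothetical choice)."""
    x_new = x + rng.normal(scale=step)
    if rng.uniform() < accept_probability(density(x), density(x_new)):
        return x_new                     # jump accepted
    return x                             # jump rejected: stay where we are
```

For example, `accept_probability(2.0, 1.0)` is 0.5: a jump to a point with half the density is accepted half of the time.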

## Unit 10 - Signals

- **sample** a measurement of a continuous signal at a precise instant or point
- **quantisation** the reduction of a continuous signal to discrete steps
- **sampled signal** the discrete representation of a continuous signal as a set of evenly spaced samples
- **sampling rate** the rate at which samples are taken; $1 / \Delta T$
- **digital signal processing** the name for techniques applied to process sampled signals computationally
- **Nyquist limit** the highest frequency representable for a given sample rate; equal to half the sample rate
- **aliasing** the distortion introduced when frequencies greater than the Nyquist limit are present in a signal when it is sampled
- **wagon wheel effect** temporal aliasing in the video domain
- **interpolation** estimating the value of a function in between known samples
- **gridding** the combination of interpolation with regular sampling to reinterpolate an irregularly sampled signal into a regularly sampled signal
- **interpolation function** a function used to interpolate between measurements
  - **constant** / **nearest-neighbour** assumes function constant between samples
  - **linear** assumes a linear slope between samples
  - **polynomial** fits a polynomial (e.g. cubic) to successive groups of samples
- **piecewise interpolation** interpolation applied separately to different regions of measurements (e.g. every pair of samples), rather than regression over the entire sequence of measurements
- **amplitude quantisation** reduction of amplitude to a fixed number of distinct levels
- **noise** components of a signal unrelated to the signal of interest
- **signal to noise ratio** the ratio of the signal of interest to noise
- **decibels** a logarithmic unit used for ratios, equal to $10\log_{10}(S/N)$ for signal to noise ratios.
- **moving average** a smoothing operation which applies the mean to a sliding window
- **sliding window** an operation which takes successive "windows" of fixed length groups of samples from a sampled signal. Critical in implementing DSP algorithms, as it turns unbounded length signals into a collection of fixed length vectors.
- **feedback filtering** filtering which uses the previous output of the filtering operation as one of the inputs at the next step
- **exponential smoothing** a very simple but effective linear feedback filter, with one step history
- **linear filters** filters which are just linear sums (weighted sums) of neighbouring samples
- **nonlinear filter** any filter which is not a linear filter
- **median filter** a nonlinear filter which applies the median to a sliding window
- **order filters** nonlinear filters which apply any order statistic (median, max, min, percentile) to a sliding window
- **multiply and accumulate** a type of hardware instruction that can efficiently accumulate a weighted sum
- **convolution** the key operation in signal processing, which computes weighted sums of a sliding window
- **convolution kernel** the array of values which defines the convolution to apply
- **algebraic properties of convolution** key properties of convolution which make it efficient and useful:
  - **commutes** $A∗B=B∗A$
  - **associates** $A∗(B∗C)=(A∗B)∗C$
  - **distributes** $A∗(B+C)=A∗B+A∗C$
- **delta function** a (pseudo) function which is zero everywhere, except for a value of $1$ at the origin
- **impulse** a delta function
- **impulse response recovery** a way of extracting the convolution kernel from a system by feeding an impulse in and measuring the response
- **reverberation** the acoustic effect of multiple reflections, as in the characteristic sound of a room
- **the time domain** the representation of signals as amplitudes varying over time (or over space, in an imaging context)
- **the frequency domain** the representation of signals as the superposition of oscillations of different frequencies
- **sine wave** a pure oscillation defined by $A \sin(\omega t + \theta)$
  - **amplitude** $A$, the intensity or strength of the oscillation
  - **phase** $\theta$, a shift of the oscillation in time
  - **frequency** $\omega$, the rate of oscillation
- **Fourier theorem** every periodic signal can be represented precisely as a sum of sine waves
- **correlation** the (linear) similarity between two signals
- **magnitude** amplitude
- **sinusoids** sine waves
- **real signals** signals without imaginary values
- **Fourier transform** function which can transform the time domain (measurements over time) to the frequency domain (sum of oscillations)
- **inverse Fourier transform** transforms frequency domain (exactly) back to the time domain
- **discrete Fourier transform (DFT)** Fourier transform for sampled signals
- **complex numbers** numbers with a real and imaginary component
- **Argand diagram** a way of drawing complex numbers as 2D points on the plane. This can be used to see a **polar** representation of complex numbers, in terms of phase and magnitude.
  - **phase** the angle of a complex number to the x-axis
  - **magnitude** the distance of a complex number from the origin
- **fast Fourier transform (FFT)** a very efficient algorithm for computing the DFT in $O(N \log N)$ time (in some cases), instead of $O(N^2)$ time
- **convolution theorem** convolution in the time domain is identical to elementwise multiplication in the frequency domain, and vice versa
- **smoothing filter** a filter which reduces high frequency components ("smooths them out")
- **lowpass filter** a filter which reduces high frequency components, like noise
- **highpass filter** a filter which reduces low frequency components, like slow trends
- **bandpass filter** a filter which selects a band of frequencies of interest (like tuning into a radio)
- **notch filter** / **bandstop filter** a filter which removes a band of frequencies, e.g. removing 50Hz mains hum
- **filter design** the process of designing a filter in the time domain given frequency domain specifications

### Formulae

#### Signal

##### Definition of Nyquist limit

f_{n} = \frac{f_{s}}{2}
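
A small illustration of what happens above the Nyquist limit, assuming an arbitrary sample rate of 10 Hz. A 9 Hz sine sampled at this rate produces exactly the same samples as a sign-inverted 1 Hz sine, so the two cannot be told apart after sampling (aliasing).

```python
import numpy as np

fs = 10.0                                # sample rate in Hz (arbitrary choice)
nyquist = fs / 2                         # Nyquist limit f_n = f_s / 2
t = np.arange(20) / fs                   # sample times

high = np.sin(2 * np.pi * 9.0 * t)       # 9 Hz: above the 5 Hz Nyquist limit
alias = -np.sin(2 * np.pi * 1.0 * t)     # 1 Hz sine with inverted sign

indistinguishable = np.allclose(high, alias)
```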

##### Signal to noise ratio

\begin{align} SNR &= \frac{S}{N}, \\ SNR_{dB} &= 10\log_{10}\left( \frac{S}{N} \right) \end{align}
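
As a worked example, assuming $S$ and $N$ are signal and noise powers, a ratio of 100 corresponds to 20 dB. `snr_db` is a hypothetical helper name.

```python
import numpy as np

def snr_db(signal_power, noise_power):
    """Signal to noise ratio in decibels: 10 * log10(S / N)."""
    return 10 * np.log10(signal_power / noise_power)

ratio_100 = snr_db(100.0, 1.0)           # signal 100 times stronger than noise
ratio_1 = snr_db(1.0, 1.0)               # signal and noise equally strong
```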

##### Exponential smoothing

y[t] = \alpha y[t-1] + (1-\alpha)x[t]
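
A direct sketch of this recurrence. Because each output depends on the previous output, an explicit loop is used here. Starting the history at the first sample is an assumption; the formula does not specify the initial value.

```python
import numpy as np

def exponential_smooth(x, alpha):
    """Apply y[t] = alpha * y[t-1] + (1 - alpha) * x[t] to a 1D signal."""
    y = np.zeros(len(x))
    prev = x[0]                          # assumption: history starts at the first sample
    for t in range(len(x)):
        prev = alpha * prev + (1 - alpha) * x[t]
        y[t] = prev
    return y

step = np.array([0.0, 0.0, 1.0, 1.0, 1.0])
smoothed = exponential_smooth(step, alpha=0.5)
```

With `alpha=0.5` the output closes half of the remaining gap to the input at every step.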

##### Convolution of sampled signals

(x * y)[n] = \sum_{m=-M}^{M} x[m] y[n-m]
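
The convolution sum, in which each output sample is a weighted sum of $x[m]\,y[n-m]$ over $m$, can be written out directly and compared with NumPy's `np.convolve`. Samples outside either array are treated as zero, and the explicit loops are for illustration only; `convolve_direct` is a hypothetical helper name.

```python
import numpy as np

def convolve_direct(x, y):
    """Full discrete convolution: out[n] = sum_m x[m] * y[n - m]."""
    out = np.zeros(len(x) + len(y) - 1)
    for n in range(len(out)):
        for m in range(len(x)):
            if 0 <= n - m < len(y):      # samples outside y count as zero
                out[n] += x[m] * y[n - m]
    return out

signal = np.array([1.0, 2.0, 3.0])
kernel = np.array([0.5, 0.5])            # a two-sample moving average
result = convolve_direct(signal, kernel)
```

Swapping the arguments gives the same result, since convolution commutes.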

##### Fourier Transform

\overbrace{F(\underbrace{\omega}_{\text{Frequency}})}^{\text{Frequency domain}} = \int_{-\infty}^{\infty} \underbrace{f(t)}_{\text{Time domain}} \cdot \Biggl( \cos(\underbrace{\omega}_{\text{Frequency}} \cdot \underbrace{t}_{\text{Time}}) - i \cdot \sin(\underbrace{\omega}_{\text{Frequency}} \cdot \underbrace{t}_{\text{Time}}) \Biggr) \, d\underbrace{t}_{\text{Time}}
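
For sampled signals the integral becomes a sum. The sketch below builds the discrete Fourier transform from the same cosine minus $i$ times sine terms and compares it with NumPy's FFT. This is the slow $O(N^2)$ form, for illustration only; the test signal is an arbitrary sine wave.

```python
import numpy as np

def dft(x):
    """Discrete Fourier transform via explicit cos - i*sin terms, O(N^2)."""
    n_samples = len(x)
    n = np.arange(n_samples)
    k = n.reshape(-1, 1)                           # one row per frequency bin
    angle = 2 * np.pi * k * n / n_samples
    basis = np.cos(angle) - 1j * np.sin(angle)     # sampled cos(wt) - i sin(wt)
    return basis @ x

t = np.arange(16) / 16
x = np.sin(2 * np.pi * 2 * t)                      # 2 cycles over the window
spectrum = dft(x)
```

The magnitude of `spectrum` peaks at bin 2, matching the two cycles in the window.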
