Redirection and Piping

This post will demonstrate programs which each perform a single very specific task but which can be chained together in such a way that the output of one forms the input of another. Connecting programs like this, or piping to use the correct terminology, enables more complex workflows or processes to be run.

Continue reading

A Stochastic Estimation of Pi

The graphic below shows a square with sides of length 2 containing a circle with a radius of 1. The area of the square is 4 and the area of the circle is 3.141592654 (sound familiar?!). The ratio of the area of the circle to the area of the square is 0.785398163 which is π / 4.

A while ago I wrote an article on estimating pi using various formulae, and in this post I will use the circle-within-a-square ratio illustrated above to run a stochastic simulation in Python to estimate a value of π.

Continue reading

Factorization

This is a simple Python project implementing a function to calculate the factors of a given integer, ie. all of the numbers which divide into that number. The core function returns a list of the factors of the number argument but as a bonus we can use that to write a generator function consisting of just one line of code!

Continue reading

An Introduction to curses

In a few of my posts I have used various ANSI terminal codes to manipulate terminal output, specifically moving the cursor around and changing text colours. These codes are often cryptic, can be cumbersome if used extensively, and cannot be relied upon to work across different platforms. A better solution is to use Python's implementation of the venerable curses library, and in this post I will provide a short introduction to what I consider its core functionalities: moving the cursor around and printing in different colours.

Continue reading

SI Prefixes

Most people are familiar with a few of the more common prefixes used before many units to denote a fraction or a multiple of the unit - kilograms, megabytes, centimetres etc.. As well as these there a number of less well known ones, going right up to yotta and right down to yocto.

As a simple but hopefully enlightening programming exercise I have put together this short Python program to list all the prefixes along with their corresponding powers and multipliers.

Continue reading

The Caesar Shift Cypher

The Caesar Shift Cypher was named after Julius Caesar and is the simplest method of encypherment possible. It consists of shifting letters along by one or more places, so for example if you use a shift of 1 then A becomes B, B becomes C etc.. To decypher the message you just shift letters in the opposite direction. Clearly it lacks sophistication and can easily be cracked, either by trial and error or, if you have a reasonable length of encrypted text, using frequency analysis.

As a little programming exercise I will code the Caesar Shift Cypher in Python, and in a future post will break it with frequency analysis.

Continue reading

Linear Regression

In a previous post I implemented the Pearson Correlation Coefficient, a measure of how much one variable depends on another. The three sets of bivariate data I used for testing and demonstration are shown again below, along with their corresponding scatterplots. As you can see these scatterplots now have lines of best fit added, their gradients and heights being calculated using least-squares regression which is the subject of this article.

Continue reading

Levenshtein Word Distance

A while ago I wrote an implementation of the Soundex Algorithm which attempts to assign the same encoding to words which are pronounced the same but spelled differently. In this post I'll cover the Levenshtein Word Distance algorithm which is a related concept measuring the "cost" of transforming one word into another by totalling the number of letters which need to be inserted, deleted or substituted.

The Levenshtein Word Distance has a fairly obvious use in helping spell checkers decided which words to suggest as alternatives to mis-spelled words: if the distance is low between a mis-spelled word and an actual word then it is likely that word is what the user intended to type. However, it can be used in any situation where strings of characters need to be compared, such as DNA matching.

Continue reading

Estimating Pi

Pi is an irrational number starting off 3.14159 and then carrying on for an infinite number of digits with no pattern which anybody has ever discovered. Therefore it cannot be calculated as such, just estimated to (in principle) any number of digits.

In this project I will code a few of the simpler methods in Python to give a decidedly non-rigorous introduction to what is actually a vast topic. If you want to do something rather more serious than playing with my code you can download the application called y-cruncher used to break the record here.

Continue reading