An example of exploring petrophysical and well log measurements using a number of plots from Seaborn and Matplotlib

Photo by Markus Spiske on Unsplash

Machine learning and artificial intelligence have become increasingly popular within the geoscience and petrophysics domains, especially over the past decade. Machine learning is a subdivision of artificial intelligence and is the process by which computers can learn and make predictions from data without being explicitly programmed to do so. We can use machine learning in a number of ways within petrophysics, including automating outlier detection, property prediction, and facies classification.

This series of articles will look at taking a dataset from basic well log measurements through to petrophysical property prediction. These articles were originally presented at the SPWLA 2021 Conference during…

A short guide on multiple options for renaming columns in a pandas dataframe

Photo by Giulio Gabrieli on Unsplash

Ensuring that dataframe columns are appropriately named is essential to understand what data is contained within, especially when we pass our data on to others. In this short article, we will cover a number of ways to rename columns within a pandas dataframe.
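For a flavour of what renaming can look like, here is a minimal sketch of two common options. The dataframe and its column names (`DEPT`, `GR`, `RHOB`) are made up for illustration and are not taken from the article's dataset.

```python
import pandas as pd

# Hypothetical well log dataframe with terse curve mnemonics as column names.
df = pd.DataFrame(
    {"DEPT": [1000.0, 1000.5], "GR": [45.2, 50.1], "RHOB": [2.45, 2.50]}
)

# Option 1: rename selected columns with a mapping.
# rename() returns a new dataframe and leaves df untouched.
renamed = df.rename(columns={"DEPT": "DEPTH", "GR": "GAMMA_RAY"})

# Option 2: replace every column name at once by assigning a list
# with one entry per column, in order.
relabelled = df.copy()
relabelled.columns = ["DEPTH", "GAMMA_RAY", "BULK_DENSITY"]

print(list(renamed.columns))
print(list(relabelled.columns))
```

The mapping approach is usually the safer of the two, because it does not depend on the order of the columns.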

But first, what is pandas? Pandas is a powerful, fast, and commonly used Python library for carrying out data analytics. The Pandas name itself stands for “Python Data Analysis Library”. According to Wikipedia, the name originates from the term “panel data”. It allows data to be loaded in from a number of file formats (CSV, XLS, XLSX, Pickle…

Understand your data distribution and identify outliers in petrophysics and well log data using boxplots

Multiple boxplots with different y-axis ranges generated using matplotlib in python. Image by author.

Boxplots are a great tool for data visualisation. They can be used to understand the distribution of your data, whether it is skewed or not, and whether any outliers are present. In this article, we will look at what boxplots are and how we can display them using pandas and matplotlib.
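As a rough sketch of the numbers a boxplot summarises, the snippet below computes the quartiles of a small set of values and flags outliers. The gamma ray readings are invented, and the 1.5 × IQR fence is the usual boxplot convention (it matches matplotlib's default whisker setting) rather than something confirmed by the article.

```python
import numpy as np

# Made-up gamma ray readings (API units) with one suspiciously high value.
gr = np.array([35.0, 42.0, 48.0, 51.0, 55.0, 58.0, 63.0, 70.0, 74.0, 250.0])

# Quartiles: 25th percentile, median, and 75th percentile.
q1, median, q3 = np.percentile(gr, [25, 50, 75])
iqr = q3 - q1  # interquartile range, i.e. the height of the box

# Assumed convention: points beyond 1.5 * IQR from the box count as outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = gr[(gr < lower_fence) | (gr > upper_fence)]

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print("Outliers:", outliers)
```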

To accompany this article, I have made the following YouTube Video.

What are Boxplots?

A boxplot is a graphical and standardised way to display the distribution of data based on five key numbers: the “minimum”, 1st Quartile (25th percentile), median (2nd Quartile / 50th percentile), the 3rd Quartile (75th percentile), and the…

Visualising well log data versus depth using the matplotlib library from Python

Well log plot created using the matplotlib Python library. Image by author.


Well log plots are a common visualisation tool within geoscience and petrophysics. They allow easy visualisation of data (for example, Gamma Ray, Neutron Porosity, Bulk Density, etc.) that have been acquired along the length (depth) of a wellbore. On these plots, we display our logging measurements on the x-axis and measured depth or true vertical depth on the y-axis.

In this short article, we will see how to create a simple log plot visualisation from one of the Volve Wells that was released as part of a larger dataset by Equinor in 2018.
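The layout described above (measurement on the x-axis, depth increasing downwards on the y-axis) can be sketched as follows. The curve here is synthetic and is not data from the Volve wells; the axis limits and labels are illustrative assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for a gamma ray curve sampled every 0.5 m.
depth = np.arange(3000.0, 3100.0, 0.5)
gamma_ray = 60 + 30 * np.sin(depth / 7.0)

fig, ax = plt.subplots(figsize=(3, 8))

# Measurement on the x-axis, depth on the y-axis.
ax.plot(gamma_ray, depth, color="green", linewidth=0.8)
ax.set_xlabel("Gamma Ray (API)")
ax.set_ylabel("Depth (m)")
ax.set_xlim(0, 150)

# Flip the y-axis so that depth increases down the page, as on a log plot.
ax.invert_yaxis()
ax.grid(True)
```

In an interactive session, the `matplotlib.use("Agg")` line can be dropped and `plt.show()` called to display the figure.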

I have previously covered different aspects of…

Use scatter plots to visualise the relationship between variables

Neutron density scatter plot / crossplot created with matplotlib in python. Image by the author.


Scatter plots are a commonly used data visualisation tool. They allow us to identify and determine if there is a relationship (correlation) between two variables and the strength of that relationship.
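To put a number on the strength of a relationship, one option is the Pearson correlation coefficient. The sketch below uses synthetic neutron porosity and bulk density values; the linear relationship and noise level are assumptions chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: porosity as a fraction, and a bulk density (g/cc) that
# falls as porosity rises, with a little random noise added.
nphi = rng.uniform(0.05, 0.35, size=200)
rhob = 2.65 - 1.65 * nphi + rng.normal(0.0, 0.02, size=200)

# Pearson correlation coefficient: -1 and +1 indicate a perfect linear
# relationship, while values near 0 indicate little or no linear relationship.
r = np.corrcoef(nphi, rhob)[0, 1]
print(f"Pearson r = {r:.2f}")
```

The same two arrays could be passed to `plt.scatter(nphi, rhob)` to view the relationship as a crossplot.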

Within petrophysics, scatter plots are commonly known as crossplots. They are routinely used as part of the interpretation workflow and can be used for:

  • clay and shale end point identification for our clay or shale volume calculations
  • outlier detection
  • lithology identification
  • hydrocarbon identification
  • rock typing
  • regression analysis
  • and more

In this short tutorial, we will see how to display scatter plots from one of the Volve dataset wells.

The notebook for…

Visualising the distribution of data with histograms

Photo by Marcin Jozwiak on Unsplash


Histograms are a commonly used tool within exploratory data analysis and data science. They are an excellent data visualisation tool and appear similar to bar charts. However, histograms allow us to gain insights about the distribution of the values within a set of data and allow us to display a large range of data in a concise plot. Within the petrophysics and geoscience domains, we can use histograms to identify outliers and also pick key interpretation parameters, such as clay volume or shale volume end points from a gamma ray log.
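As a minimal sketch of the idea, the snippet below bins a synthetic gamma ray curve and picks candidate end points from its distribution. The data, the number of bins, and the use of the 5th and 95th percentiles as end points are all assumptions for illustration, not the article's method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic gamma ray values (API units), clipped to a plausible range.
gr = rng.normal(75.0, 25.0, size=1000).clip(0, 150)

# Count how many samples fall into each of 30 equal-width bins.
counts, bin_edges = np.histogram(gr, bins=30)

# Assumed approach: take low and high percentiles of the distribution
# as candidate clean and shale end points.
gr_clean, gr_shale = np.percentile(gr, [5, 95])

print("Samples per bin:", counts)
print(f"Candidate end points: clean={gr_clean:.1f}, shale={gr_shale:.1f}")
```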

To create a histogram:

  • We first take a logging curve…

Examples of machine learning to enhance your petrophysical workflow

Photo by Dan Meyers on Unsplash

Several decades of hydrocarbon exploration have led to the acquisition and storage of large quantities of well-related measurements, which have been used to characterise the subsurface geology and its hydrocarbon potential. The potential of these large volumes of data has been increasingly exploited over the last couple of decades as computational power and the adoption of new machine learning algorithms have increased. Within the petrophysical domain, machine learning has been used to speed up workflows, characterise the geology into discrete electrofacies, make predictions, and much more.

What is Petrophysics?

Petrophysics is a discipline that studies the physical and chemical properties of rocks and…

A Python library dedicated to loading and exploring well log LAS files

Photo by ali elliott on Unsplash

The welly library was developed by Agile Geoscience to help with loading, processing and analysing well log data from a single well or multiple wells. The library allows exploration of the metadata found within the headers of LAS files and also contains a plotting function to display a typical well log. Additionally, the welly library contains tools for identifying and handling data quality issues.

The welly library can be found on the Agile Geoscience GitHub.

In this short tutorial, we will see how to load a well from the Volve field and explore some of the functionality…

An example using petrophysical well log measurements

Photo by Tim Mossholder on Unsplash

Data exploration and pre-processing are important steps within any data science or machine learning workflow. Tutorial or training datasets have often been engineered so that they are easy to work with and the algorithm being discussed runs successfully. However, in the real world, data is messy! It can have erroneous values, incorrect labels, and parts of it can be missing.
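A quick first check for gaps is to count the missing values in each column. The sketch below assumes the data sits in a pandas dataframe with gaps stored as NaN; the column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical well log data with gaps represented as NaN.
df = pd.DataFrame(
    {
        "DEPTH": [1000.0, 1000.5, 1001.0, 1001.5],
        "GR": [45.2, np.nan, 50.1, 52.3],
        "RHOB": [2.45, 2.47, np.nan, np.nan],
    }
)

# Number of missing values per column.
missing_count = df.isna().sum()

# The same information as a percentage of the rows.
missing_pct = df.isna().mean() * 100

print(pd.DataFrame({"missing": missing_count, "percent": missing_pct}))
```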

Missing data is probably one of the most common issues when working with real datasets. Data can be missing for a multitude of reasons, including sensor…

Subdividing the subsurface using Python

Application of unsupervised cluster analysis on well log data to identify lithofacies (Image by Author)

Understanding the subsurface lithology is an important task in geoscience and petrophysics. Using a variety of electrical measurements generated from well logging technology, we are able to make inferences about the underlying geology, such as the lithology, facies, porosity, and permeability.

Machine Learning algorithms have routinely been adopted to group well log measurements into distinct lithological groupings, known as facies. This process can be achieved using either unsupervised learning or supervised learning algorithms.

Supervised learning is the most common and practical of machine learning tasks and it is designed to learn from example using input data that has been mapped…

Andy McDonald

Petrophysicist, geoscientist, and data scientist with a passion for data analytics, machine learning, and artificial intelligence.
