The value of intellectual play: Mill, machine learning, and a drilling problem I couldn’t stop thinking about

Featured

Posted on January 20, 2026 by matteomycarta

A few years back, I watched a CSEG talk by Lee Hunt (then at Jupiter Resources) called Value thinking: from the classical to the hyper-modern. One case study in particular stuck with me—so much so that I ended up exploring it in a Jupyter Lab notebook, bringing it up in a job interview, and eventually testing whether an AI could reason through it on its own.

This post is about that journey. It’s also about what happens when you let yourself get genuinely curious about someone else’s problem. And—fair warning—it involves a 19th-century philosopher, a seven-well dataset, and a neural network that learned to distrust AVO attributes.

The problem

Jupiter Resources had a history of occasionally encountering drilling trouble in the Wilrich reservoir—specifically, loss of circulation when encountering large systems of open fractures. Mud loss. The kind of problem that can cost you a well.

They had done extensive geophysical work with multiple seismic attributes that, in theory, should correlate with fractures: Curvature, Coherence, AVAz (amplitude variation with azimuth), VVAZ (velocity variation with azimuth), and Diffraction imaging. But they lacked direct calibration data for the drilling problem, and some of the attributes were giving conflicting results.

Lee Hunt, who led the team and the geophysical work, suspected from the start that the AVO-based attributes might be compromised. He had seen evidence as far back as 2014 that AVAz and VVAZ responses in the Wilrich were dominated by an overlying coal, not the fractures themselves—the attributes were measuring a different geological signal entirely. Diffraction imaging was planned early as a complementary measure, precisely because it might not be affected by the coals in the same way (personal communication).

Seven wells. Five attributes. Four of the wells had experienced drilling problems; three had not. Here’s the data:

The question: which attribute—or combination—could reliably predict drilling problems, so that future wells could be flagged ahead of time?

Mill’s Methods: 19th-century philosophy meets drilling risk

Rather than accept uncertainty and provide no geophysical guidance at all, the team at Jupiter tried something different: Mill’s Methods of Induction. Their goal was to find a pattern that could help them advise the operations team—flag high-risk well locations ahead of time so contingency plans could be in place. Mill’s Methods are a set of logical procedures for identifying causal relationships, laid out by philosopher John Stuart Mill in 1843. They’re often illustrated with a food poisoning example (who ate what, who got sick), but they work just as well here.

This approach was characteristic of Lee Hunt’s attitude toward quantitative geophysics—an attitude I had come to admire through his other work. A few years earlier, he had published a CSEG Recorder column called “Many correlation coefficients, null hypotheses, and high value,” a tutorial on statistics for geophysicists that included synthetic production data and an explicit invitation: “You can do it, too. Write in to tell us how.”

I took him up on it. I worked through his examples in Jupyter notebooks, built visualizations, explored prediction intervals, learned a good deal of scientific computing along the way. I reached out to him about the work. I even wrote up some of that exploration in a blog post on distance correlation and variable clustering—the kind of technical deep-dive where you’re learning as much about the tools as about the data. That extended engagement gave me a feel for his way of thinking: understand the statistics, accept the uncertainty, improve your techniques if you can—but don’t just throw up your hands when the data is messy.

Method of Agreement: Look at all the problem wells (A, B, F, G). What do they have in common? Curvature is TRUE for all four. So is Diffraction imaging. The other attributes vary.

Method of Difference: Compare problem wells to non-problem wells (C, D, E). Neither Curvature nor Diffraction alone perfectly discriminates—Well E has Curvature TRUE but no problem; Well D has Diffraction TRUE but no problem.

Joint Method: But here’s the key insight—Curvature AND Diffraction together form a perfect discriminator. Every well where both are TRUE had problems. Every well where at least one is FALSE did not.

This wasn’t a claim about causation. It was a decision rule: when the next well location shows both high curvature and diffraction anomalies, flag it as elevated risk and ensure contingency protocols are in place.

The logic is sound because of asymmetric costs. Preparing for mud loss (having lost circulation material on site, adjusting mud weight plans) is a minor expense. Not preparing when you should have—that’s where you lose time, money, sometimes the well. You don’t need certainty to justify preparation. You need a defensible signal.

What a neural network learned

I wanted to see if a data-driven approach would arrive at the same answer. Looking at the table myself, and spending some time applying Mill’s Methods, I had already seen the pattern—Curvature and Diffraction together were the key predictors. But I was curious: what would a simple neural network learn on its own?

I trained a two-layer network (no hidden layer)—mathematically equivalent to logistic regression—on the same seven wells. (Yes, seven wells. I know. But stay with me.)

The network classified all seven wells correctly. But the real insight came from the weights it learned:

Attribute	Weight
Curvature	+14.6
Diffraction	+9.7
Coherence	~0
AVAz	−4.9
VVAZ	−14.5

Curvature and Diffraction were strongly positive—predictive of problems. Coherence contributed almost nothing. But AVAz and VVAZ were negative—the network learned to suppress them.

A way to think about negative weights: imagine training a network to identify ducks from a set of photos that includes birds, ducks, and people in duck suits. The network will learn to weight “duck features” positively, but also to weight “human features” negatively—to avoid being fooled by the costumes. In the Wilrich case, the AVAz and VVAZ attributes were like duck suits: they looked like fracture indicators, but they were actually measuring something else.

This was interesting. All five attributes have theoretical justification for detecting fractures. Why would the network actively discount two of them?

When I mentioned this result to Lee Hunt, he confirmed what he had long suspected (personal communication): the AVAz and VVAZ responses in the Wilrich were dominated by an overlying coal, not the fractures themselves. He had measured this effect and documented it in a 2014 paper, where multiple attributes—including AVAz—showed statistically significant correlations to coal thickness rather than to reservoir properties. The neural network had learned, from just seven data points, to suppress exactly the attributes that Lee’s domain knowledge had already flagged as problematic.

This is Mill’s Method of Residues in action: if you know something else causes an observation, subtract it out. And it’s a reminder that domain knowledge and data-driven methods can converge on the same answer when both are applied honestly. I found this deeply satisfying.

What the AI got right—and what it missed

More recently, I revisited this problem using ChatGPT with the Wolfram plugin. I wanted to see if an AI, given just the table and a prompt about Mill’s Methods, could reason its way to the same conclusions.

It did—mechanically. It correctly identified Curvature and Diffraction as the consistent factors among problem wells. It noted that neither attribute alone was a perfect discriminator. It even offered to run logistic regression.

But it missed the interpretive leap. It hedged with phrases like “although there are exceptions” when in fact there were no exceptions to the conjunction rule. And it didn’t articulate the pragmatic framing: that the goal wasn’t to find the true cause, but to build a defensible decision rule under uncertainty.

That framing—the shift from epistemology to operations—required domain knowledge and judgment. The AI could apply Mill’s Methods. It couldn’t tell me why that application was useful here.

Drafting this post, I worked with a different AI—Claude—and found the collaboration more useful in a different way: not for solving the problem, but for reflection. Having to explain the context, the history, the why of my interest helped me articulate what I’d been carrying around in my head for years. Sometimes the value of a thinking partner isn’t in the answers, but in the questions that force you to be clearer.

Why this stuck with me

I’ll be honest: I kept thinking about this problem for years. It became part of a longer arc of engagement with Lee’s work—first the statistics tutorial, then the Wilrich case study, each building on the last.

When I interviewed for a geophysics position (Lee was retiring, and I was a candidate for his role), I mentioned this case study. I pulled out a pen and paper and wrote the entire seven-well table from memory. They seemed impressed—not because memorizing a table is hard, but because it signaled that I’d actually enjoyed thinking about it. That kind of retention only happens when curiosity is real.

I didn’t get the job. The other candidate had more operational experience, and that was the right call. But the process was energizing, and I’m sure that enthusiasm carried into my next opportunity, where I landed happily and stayed for over six years.

I tell this not to brag, but to make a point: intellectual play compounds. You don’t always see the payoff immediately. Sometimes you explore a problem just because it’s interesting—because someone like Lee writes “You can do it, too” and you decide to take him seriously—and it pays dividends in ways you didn’t expect.

The convergence

Three very different approaches—19th-century inductive logic, a simple neural network, and (later) an AI assistant—all pointed to the same answer: Curvature and Diffraction predict drilling problems in this dataset. The AVO attributes are noise, or worse, misleading.

When three methods converge, you can trust the signal. And you can make decisions accordingly.

That’s the real lesson here: rigorous reasoning under uncertainty isn’t about finding the One True Cause. It’s about building defensible heuristics, being honest about what you don’t know, and updating as new data comes in. Mill understood this in 1843. A neural network can learn it from seven wells. And sometimes, so can an AI—with a little help.

I hope you enjoyed this as much as I enjoyed putting it together.

The original case study was presented by Lee Hunt in his CSEG talk “Value thinking: from the classical to the hyper-modern.” The neural network analysis is in my Geoscience_ML_notebook_4. Lee documented the coal correlation issue in Hunt et al., “Precise 3D seismic steering and production rates in the Wilrich tight gas sands of West Central Alberta” (SEG Interpretation, May 2014), and later reflected on confirmation bias as an obstacle to recognizing such issues in “Useful Mistakes, Cognitive Biases and Seismic” (CSEG Recorder, April 2021). My thanks to Lee for the original inspiration, for confirming the geological context, and for sharing the original presentation materials.

References and Links

Hunt, L., 2013, Many correlation coefficients, null hypotheses, and high value: CSEG Recorder, December 2013. Link
Hunt, L., S. Hadley, S. Reynolds, R. Gilbert, J. Rule, M. Kinzikeev, 2014, Precise 3D seismic steering and production rates in the Wilrich tight gas sands of West Central Alberta: SEG Interpretation, May 2014.
Hunt, L., 2021, Useful Mistakes, Cognitive Biases and Seismic: CSEG Recorder, April 2021.
My neural network analysis: Geoscience_ML_notebook_4
My earlier exploration of Lee’s production data: Data exploration in Python: distance correlation and variable clustering
ChatGPT + Wolfram session on Mill’s Methods: Gist

AI/HI Transparency Statement Modified from Brewin http://www.theguardian.com/books/2024/apr/04/why-i-wrote-an-ai-transparency-statement-for-my-book-and-think-other-authors-should-too

Has any text been generated using AI?	Yes
Has any text been improved or corrected using HI?	Yes

Additional context: This post emerged from a conversation with Claude AI (Anthropic). I provided the source materials (a ChatGPT + Wolfram session, a Jupyter notebook, personal history with the problem), direction, and editorial judgment throughout. Claude drafted the post based on these inputs and our discussion of structure, voice, and framing. I reviewed multiple draft, revised as needed, rewrote some key sections, and made all final decisions about what went to publication. The core analysis—Mill’s Methods, the neural network, the interpretation—was done by me years before this collaboration; the AI’s role was in helping articulate and structure that work for a blog audience.

Geoscience Machine Learning bits and bobs – data completeness

Posted on September 19, 2020 by matteomycarta

2016 Machine learning contest – Society of Exploration Geophysicists

In a previous post I showed how to use pandas.isnull to find out, for each well individually, if a column has any null values, and sum to get how many, for each column. Here is one of the examples (with more modern, pandaish syntax compared to the example in the previous post:

for well, data in training_data.groupby('Well Name'): 
print(well)
print (data.isnull().values.any())
print (data.isnull().sum(), '\n')

Simple and quick, the output showed met that – for example – the well ALEXANDER D is missing 466 samples from the PE log:

ALEXANDER D
True
Facies         0
Formation      0
Well Name      0
Depth          0
GR             0
ILD_log10      0
DeltaPHI       0
PHIND          0
PE           466
NM_M           0
RELPOS         0
dtype: int64

A more appealing and versatile alternative, which I discovered after the contest, comes with the matrix function form the missingno library. With the code below I can turn each well into a Pandas DataFrame on the fly, then a missingno matrix plot.

for well, data in training_data.groupby('Well Name'): 

msno.matrix(data, color=(0., 0., 0.45)) 
fig = plt.gcf()
fig.set_size_inches(20, np.round(len(data)/100)) # heigth of the plot for each well reflects well length 
axes=fig.get_axes()
axes[0].set_title(well, color=(0., 0.8, 0.), fontsize=14, ha='center')

In each of the following plots, the sparklines at the right summarizes the general shape of the data completeness and points out the rows with the maximum and minimum nullity in the dataset. This to me is a much more compelling and informative way to inspect log data as it shows the data range where data is missing. The well length is also annotated on the bottom left, by which information I learned that Recruit F9 is much shorter than the other wells. And to ensure this is reinforced I introduced a line to modify each plot so that its height reflects the length. I really like it!

2020 Machine Predicted Lithology – FORCE

Since I am taking part in this year’s FORCE Machine Predicted Lithology challenge, I decided to take the above visualization up a notch. Using Ipywidget’s interactive and a similar logic (in this case data['WELL'].unique()) the tool below allows browsing wells using a Select widget and check the chosen well’s curves completeness, on the fly. You can try the tool in this Jupyter notebook.

In a second Jupyter notebook on the other hand, I used missingno matrix to make a quick visual summary plot of the entire dataset completion, log by log (all wells together):

Then, to explore in more depth the data completion, below I also plotted the library’s dendrogram plot. As explained in the library’s documentation, The dendrogram uses a hierarchical clustering algorithm (courtesy of Scipy) to bin variables against one another by their nullity correlation (measured in terms of binary distance). At each step of the tree the variables are split up based on which combination minimizes the distance of the remaining clusters. The more monotone the set of variables, the closer their total distance is to zero, and the closer their average distance (the y-axis) is to zero.

I find that looking at these two plots provides a very compelling and informative way to inspect data completeness, and I am wondering if they couldn’t be used to guide the strategy to deal with missing data, together with domain knowledge from petrophysics.

Interpreting the dendrogram in a top-down fashion, as suggested in the library documentation, my first thoughts are that this may suggest trying to predict missing values in a sequential fashion rather than for all logs at once. For example, looking at the largest cluster on the left, and starting from top right, I am thinking of testing use of GR to first predict missing values in RDEP, then both to predict missing values in RMED, then DTC. Then add CALI and use all logs completed so far to predict RHOB, and so on.

Naturally, this strategy will need to be tested against alternative strategies using lithology prediction accuracy. I would do that in the context of learning curves: I am imagining comparing the training and crossvalidation error first using only non NaN rows, then replace all NANs with mean, then compare separately this sequential log completing strategy with an all-in one strategy.

Geoscience Machine Learning bits and bobs – introduction

Posted on June 17, 2019 by matteomycarta

Bits and what?

After wetting (hopefully) your appetite with the Machine Learning quiz / teaser I am now moving on to a series of posts that I decided to title “Geoscience Machine Learning bits and bobs”.

OK, BUT fist of all, what does ‘bits and bobs‘ mean? It is a (mostly) British English expression that means “a lot of small things”.

Is it a commonly used expression? If you are curious enough you can read this post about it on the Not one-off British-isms blog. Or you can just look at the two Google Ngram plots below: the first is my updated version of the one in the post, comparing the usage of the expression in British vs. US English; the second is a comparison of its British English to that of the more familiar “bits and pieces” (not exactly the same according to the author of the blog, but the Cambridge Dictionary seems to contradict the claim).

I’ve chosen this title because I wanted to distill, in one spot, some of the best collective bits of Machine Learning that came out during, and in the wake of the 2016 SEG Machine Learning contest, including:

The best methods and insights from the submissions, particularly the top 4 teams
Things that I learned myself, during and after the contest
Things that I learned from blog posts and papers published after the contest

I will touch on a lot of topics but I hope that – in spite of the title’s pointing to a random assortment of things – what I will have created in the end is a cohesive blog narrative and a complete, mature Machine Learning pipeline in a Python notebook.

* September 2020 UPDATE *

Although I have more limited time these days, compared to 2016, I am very excited to be participating in the 2020 FORCE Machine Predicted Lithology challenge. Most new work and blog posts will be about this new contest instead of the 2016 one.

***************************

Some background on the 2016 ML contest

The goal of the SEG contest was for teams to train a machine learning algorithm to predict rock facies from well log data. Below is the (slightly modified) description of the data form the original notebook by Brendon Hall:

The data is originally from a class exercise from The University of Kansas on Neural Networks and Fuzzy Systems. This exercise is based on a consortium project to use machine learning techniques to create a reservoir model of the largest gas fields in North America, the Hugoton and Panoma Fields. For more info on the origin of the data, see Bohling and Dubois (2003) and Dubois et al. (2007).

This dataset is from nine wells (with 4149 examples), consisting of a set of seven predictor variables and a rock facies (class) for each example vector and validation (test) data (830 examples from two wells) having the same seven predictor variables in the feature vector. Facies are based on examination of cores from nine wells taken vertically at half-foot intervals. Predictor variables include five from wireline log measurements and two geologic constraining variables that are derived from geologic knowledge. These are essentially continuous variables sampled at a half-foot sample rate.

The seven predictor variables are:

Five wire line log curves include gamma ray (GR), resistivity logging(ILD_log10), photoelectric effect (PE), neutron-density porosity difference and average neutron-density porosity (DeltaPHI and PHIND). Note, some wells do not have PE.
Two geologic constraining variables: nonmarine-marine indicator (NM_M) and relative position (RELPOS)

The nine discrete facies (classes of rocks) are:

Tentative topics for this series

List of previous works (in this post)
Data inspection
Data visualization
Data sufficiency
Data imputation
Feature augmentation
Model training and evaluation
Connecting the bits: a full pipeline

List of previous works (comprehensive, to the best of my knowledge)

In each post I will make a point to explicitly reference whether a particular bit (or a bob) comes from a submitted notebook by a team, a previously unpublished notebook of mine, a blog post, or a paper.

However, I’ve also compiled below a list of all the published works, for those that may be interested.

The contest’s original article published by Brendon Hall on The Leading Edge, and the accompanying notebook

The Github repo with all teams’ submissions.

Two blog posts by Matt Hall of Agile Scientific, here and here

The published summary of the contest by Brendon Hall and Matt Hall on The Leading Edge

An SEG extended abstract on using gradient boosting on the contest dataset

An arXiv e-print paper on using a ConvNet on the contest dataset

Abstract for a talk at the 2019 CSEG / CSPG Calgary Geoconvention

Computer vision in geoscience: recover seismic data from images – part 2

Posted on September 18, 2017 by matteomycarta

In part 1 of this short series I demonstrated how to detect the portion occupied by the seismic section in an image (Figure 1).

Figure 1

The result was a single binary image with the white object representing the pixels occupied by the seismic section (Figure 2).

Figure 2

You can download from GitHub all the tools for the automated workflow (including both part 1 and part 2, and some of the optional features outlined in the introduction) in the module mycarta.py, as well as an example Jupyter Notebook showing how to run it.

Next I want to use this binary object to derive a transformation function to rectify to a rectangle the seismic section in the input image.

The first step is to detect the contour of the object. Notice that because we used morphological operations it is not a perfect quadrilateral: it has rounded corners and some of the sides are bent, therefore the second step will be to approximate the contour with a polygon with enough tolerance to ensure it has 4 sides only(this took some trial and error but 25 turned out to be a good value for the parameter for a whole lot of test images I tried).

In reality, the two steps are performed together using the functions find_contours (there is only one to find, really) and approximate_polygon from the skimage.measure module, as below:

contour = np.squeeze(find_contours(enhanced, 0))
coords = approximate_polygon(contour, tolerance=25)

The variable coords contains the coordinate for the corner points of the polygon (the first point is repeated last to close the polygon), which in Figure 3 I plotted superimposed to the input binary object.

Figure 3 – approximated polygon

A problem with the output of approximate_polygon is that the points are not ordered; to solve it I adapted a function from a Stack Overflow answer to sort them based on the angle from their centroid:

def ordered(points):
  x = points[:,0]
  y = points[:,1]
  cx = np.mean(x)
  cy = np.mean(y)
  a = np.arctan2(y - cy, x - cx)
  order = a.ravel().argsort()
  x = x[order]
  y = y[order]
  return np.vstack([x,y])

I call the function as below to get the corners in the contour without the last one (repetition of the first point).

sortedCoords = ordered(coords[:-1]).T

I can then plot them using colors in a predefined order to convince myself the indeed are sorted:

plt.scatter(sortedCoords[:, 1], sortedCoords[:, 0], s=60, 
 color=['magenta', 'cyan', 'orange', 'green'])

Figure 4 – corners sorted in counter-clockwise order

The next bit of code may seem a bit complicated but it is not. With coordinates of the corners known, and their order as well, I can calculate the largest width and height of the input seismic section, and I use them to define the size of the registered output section, which is to be of rectangular shape:

w1 = np.sqrt(((sortedCoords[0, 1]-sortedCoords[3, 1])**2)
  +((sortedCoords[0, 0]-sortedCoords[3, 0])**2))
w2 = np.sqrt(((sortedCoords[1, 1]-sortedCoords[2, 1])**2)
  +((sortedCoords[1, 0]-sortedCoords[2, 0])**2))

h1 = np.sqrt(((sortedCoords[0, 1]-sortedCoords[1, 1])**2)
  +((sortedCoords[0, 0]-sortedCoords[1, 0])**2))
h2 = np.sqrt(((sortedCoords[3, 1]-sortedCoords[2, 1])**2)
  +((sortedCoords[3, 0]-sortedCoords[2, 0])**2))

w = max(int(w1), int(w2))
h = max(int(h1), int(h2))

and with those I define the coordinates of the output corners used to derive the transformation function:

dst = np.array([
  [0, 0],
  [h-1, 0],
  [h-1, w-1],
  [0, w-1]], dtype = 'float32')

Now I have everything I need to rectify the seismic section in the input image: it is warped using homologous points (the to sets of four corners) and a transformation function.

dst[:,[0,1]] = dst[:,[1,0]]
sortedCoords[:,[0,1]] = sortedCoords[:,[1,0]]
tform = skimage.transform.ProjectiveTransform()
tform.estimate(dst,sortedCoords)
warped =skimage.transform.warp(img, tform, output_shape=(h-1, w-1))

Notice that I had to swap the x and y coordinates to calculate the transformation function. The result is shown in Figure 5: et voilà!

Figure 5 – rectified seismic section

You can download from GitHub the code to try this yourself (both part 1 and part 2, and some of the optional features outlined in the introduction, like removing the rectangle with label inside the section) as well as an example Jupyter Notebook showing how to run it.

Mild or wild: robustness through morphological filtering

Posted on December 10, 2016 by matteomycarta

This guest post (first published here) is by Elwyn Galloway, author of Scibbatical on WordPress. It is the forth in our series of collaborative articles about sketch2model, a project from the 2015 Calgary Geoscience Hackathon organized by Agile Geoscience. Happy reading.

We’re highlighting a key issue that came up in our project, and describing what how we tackled it. Matteo’s post on Morphological Filtering does a great job of explaining what we implemented in sketch2model. I’ll build on his post to explain the why and how. In case you need a refresher on sketch2model, look back at sketch2model, Sketch Image Enhancement, Linking Edges with Geomorphological Filtering.

Morphological Filtering

As Matteo demonstrated by example, sketch2model’s ability to segment a sketch properly depends on the fidelity of a sketch.

An image of a whiteboard sketch (left) divides an area into three sections. Without morphological filtering, sketch2model segments the original image into two sections (identified as orange, purple) (centre). The algorithm correctly segments the area into three sections (orange, purple, green) when morphological filtering is applied (right).

To compensate for sketch imperfections, Matteo suggested morphological filtering on binarized images. Morphological filtering is a set of image modification tools which modify the shape of elements in an image. He suggested using the closing tool for our purposes. Have a look at Matteo’s Post for insight into this and other morphological filters.

One of the best aspects of this approach is that it is simple to apply. There is essentially one parameter to define: a structuring element. Since you’ve already read Matteo’s post, you recall his onion analogy explaining the morphological filtering processes of erosion and dilation – erosion is akin to removing an onion layer, dilation is adding a layer on. You’ll also recall that the size of the structuring element is the thickness of the layer added to, or removed from, the onion. Essentially, the parameterization of this process comes down to choosing the thickness of the onion layers.

Sketch2model uses dilation followed by erosion to fill gaps left between sketch lines (morphological dilation followed by erosion is closing). Matteo created this really great widget to illustrate closing using an interactive animation.

Matteo’s animation was created using this interactive Jupyter notebook. Closing connects the lines of the sketch.

Some is Good, More is Better?

Matteo showed that closing fails if the structural element used is too small. So just make it really big, right? Well, there can be too much of a good thing. Compare what happens when you use an appropriately sized structuring element (mild) to the results from an excessively large structuring element (wild).

Comparing the results of mild and wild structuring elements: if the structuring element is too large, the filter compromises the quality of the reproduction.

Using a morphological filter with a structural element that is too small doesn’t fix the sketches, but using a structural element that is too large compromises the sketch too. We’re left to find an element that’s just right. Since one of the priorities for sketch2model was to robustly handle a variety of sketches with as little user input as possible — marker on whiteboard, pencil on paper, ink on napkin — we were motivated to find a way to do this without requiring the user to select the size of the structuring element.

Is there a universal solution? Consider this: a sketch captured in two images, each with their own resolution. In one image, the lines of the sketch appear to be approximately 16 pixels wide. The same lines appear to be 32 pixels wide in the other image. Since the size of the structuring element is defined in terms of pixels, it becomes apparent the ideal structuring element cannot be “one size fits all”.

High-resolution (left) versus low-resolution (right) image of the same portion of a sketch. Closing the gap between the lines would require a different size structuring element for each image: about 5 pixels for high-resolution or 1 pixel for low-resolution.

Thinking Like a Human

Still motivated to avoid user parameterization for the structuring element, we explored ways to make the algorithm intelligent enough to select an appropriate structuring element on its own. Ultimately, we had to realize a few things before we came up with something that would work:

When capturing an image of a sketch, users compose very similar images (compose in the photographic sense of the word): sketch is centered and nearly fills the captured image.
The image of a sketch is not the same as a user’s perception of a sketch: a camera may record imperfections (gaps) in a sketch that a user does not perceive.
The insignificance of camera resolution: a sketched feature in captured at two different resolutions would have two different lengths (in pixels), but identical lengths when defined as a percentage of image size.

With these insights, we deduced that the gaps we were trying to fill with morphological filtering would be those that escaped the notice of the sketch artist.

Recognizing the importance of accurate sketch reproduction, our solution applies the smallest structuring element possible that will still fill any unintentional gaps in a sketch. It does so in a way that is adaptable.

A discussion about the definition of “unintentional gap” allowed us to create a mandate for the closing portion of our algorithm. Sketch2model should fill gaps the user doesn’t notice. The detail below the limit of the user’s perception should not affect the output model. A quick “literature” (i.e. Google) search revealed that a person’s visual perception is affected by many factors beyond the eye’s optic limits. Without a simple formula to define a limit, we did what any hacker would do… define it empirically. Use a bunch of test images to tweak the structuring element of the closing filter to leave the perceptible gaps and fill in the imperceptible ones. In the sketch2model algorithm, the size of structuring element is defined as a fraction of the image size, so it was the fraction that we tuned empirically.

Producing Usable Results

Implicit in the implementation is sketch2model’s expectation that the user’s sketch, and their image of the sketch are crafted with some care. The expectations are reasonable: connect lines you’d like connected; get a clear image of your sketch. Like so much else in life, better input gives better results.

Input (left) and result (right) of sketch2model.

To produce an adaptable algorithm requiring as little user input as possible, the sketch2model team had to mix a little image processing wizardry with some non-technical insight.

Have you tried it? You can find it at sketch2model.com. Also on GitHub.

Previous posts in the sketch2model series: sketch2model, Sketch Image Enhancement, Linking Edges with Geomorphological Filtering.

Machine Learning in Geoscience with Scikit-learn. Part 2: inferential statistics and domain knowledge to select features for oil prediction

Posted on November 13, 2016 by matteomycarta

In the first post of this series I showed how to use Pandas, Seaborn, and Matplotlib to:

load a dataset
test, clean up, and summarize the data
start looking for relationships between variables using scatterplots and correlation coefficients

In this second post, I will expand on the latter point by introducing some tests and visualizations that will help highlight the possible criteria for choosing some variables, and dropping others. All in Python.

I will use a different dataset than that in the previous post. This one is from the paper “Many correlation coefficients, null hypotheses, and high value“(Lee Hunt, CSEG Recorder, December 2013).

The target to be predicted is oil production from a marine barrier sand. We have measured production (in tens of barrels per day) and 7 unknown (initially) predictors, at 21 wells.

Hang on tight, and read along, because it will be a wild ride!

I will show how to:

1) automatically flag linearly correlated predictors, so we can decide which might be dropped. In the example below (a matrix of pair-wise correlation coefficients between variables) we see that X2, and X7, the second and third best individual predictors of production (shown in the bottom row) are also highly correlated to X1, the best overall predictor.

2) automatically flag predictors that fail a critical r test

3) create a table to assess the probability that a certain correlation is spurious, in other words the probability of getting at least the correlation coefficient we got with our the sample, or even higher, purely by chance.

I will not recommend to run these tests and apply the criteria blindly. Rather, I will suggest how to use them to learn more about the data, and in conjunction with domain knowledge about the problem at hand (in this case oil production), make more informed choices about which variables should, and which should not be used.

And, of course, I will show how to make the prediction.

Have fun reading: get the Jupyter notebook on GitHub.

sketch2model – linking edges with mathematical morphology

Posted on July 22, 2016 by matteomycarta

Introduction

As written by Elwyn in the first post of this series, sketch2model was conceived at the 2015 Calgary Geoscience Hackathon as a web and mobile app that would turn an image of geological sketch into a geological model, and then use Agile Geoscience’s modelr.io to create a synthetic seismic model.

The skech2model concept: modelling at the speed of imagination. Take a sketch (a), turn it into an earth model (b), create a forward seismic model (c). Our hack takes you from a to b.

One of the main tasks in sketch2model is to identify each and every geological body in a sketch as a closed polygon. As Elwyn wrote, “if the sketch were reproduced exactly as imagined, a segmentation function would do a good job. The trouble is that the sketch captured is rarely the same as the one intended – an artist may accidentally leave small gaps between sketch lines, or the sketch medium can cause unintentional effects (for example, whiteboard markers can erase a little when sketch lines cross, see example below). We applied some morphological filtering to compensate for the sketch imperfections.

Morphological filtering can compensate for imperfections in a sketch, as demonstrated in this example. The original sketch (left) was done with a marker on white board. Notice how the vertical stroke erased a small part of the horizontal one. The binarized version of the sketch (middle) shows an unintentional gap between the strokes, but morphological filtering successfully closes the small gap (right).

The cartoon below shows what would be the final output of sketch2model in the two cases in the example above (non closed and closed gap).

My objective with this post is to explain visually how we correct for some of these imperfections within sketch2model. I will focus on the use of morphological closing, which consist in applying in sequence a dilation and an erosion, the two fundamental morphological operations.

Quick mathematical morphology review

All morphological operations result from the interaction of an image with a structuring element (a kernel) smaller than the image and typically in the shape of a square, disk, or diamond. In most cases the image is binary, that is pixels take either value of 1, for the foreground objects, or 0 for the background. The structuring element operates on the foreground objects.

Morphological erosion is used to remove pixels on the foreground objects’ boundaries. How ‘deeply’ the boundaries are eroded depends on the size of the structuring element (and shape, but in this discussion I will ignore the effect of changing the shape). This operation is in my mind analogous to peeling off a layer from an onion; the thickness of the layer is related to the structuring element size.

Twan Maintz in his book Digital and medical image processing describes the interaction of image and structuring element during erosion this way: place the structuring element anywhere in the image: if it is fully contained in the foreground object (or in one of the objects) then the origin (central) pixel of the structuring element (and only that one) is part of the eroded output. The book has a great example on page 129.

Dilation does the opposite of erosion: it expands the object boundaries (adding pixels) by an amount that is again related to the size of the structuring element. This is analogous to me to adding back a layer to the onion.

Again, thanks to Maintz the interaction of image and structuring element in dilation can be intuitively described: place the structuring element anywhere in the image: does it touch any of the foreground objects? If yes then the origin of the structuring element is part of the dilated result. Great example on pages 127-128.

Closing is then for me akin to adding a layer to an onion (dilation) and then peeling it back off (erosion) but with the major caveat that some of the changes produced by the dilation are irreversible: background holes smaller than the structuring element that are filled by the dilation are not restored by the erosion. Similarly, lines in the input image separated by an amount of pixels smaller than the size of the structuring element are linked by the dilation and not disconnected by the erosion, which is exactly what we wanted for sketch2model.

Closing demo

If you still need further explanation on these morphological operations, I’d recommend reading further on the ImageMagik user guide the sections on erosion, dilation, and closing, and the examples on the Scikit-image website.

As discussed in the previous section, when applying closing to a binary image, the external points in any object in the input image will be left unchanged in the output, but holes will be filled, partially or completely, and disconnected objects like edges (or lines in sketches) can become connected.

We will now demonstrate it below with Python-made graphics but without code; however, you can grab the Jupyter notebook with complete Python code on GitHub.

I will use this model binary image containing two 1-pixel wide lines. Think of them as lines in a sketch that should have been connected, but are not.

We will attempt to connect these lines using morphological closing with a disk-shaped structuring element of size 2. The result is plotted in the binary image below, showing that closing was successful.

But what would have happened with a smaller structuring element, or with a larger one? In the case of a disk of size 1, the closing magic did not happen:

Observing this result, one would increase the size of the structuring element. However, as Elwyn will show in the next post, also too big a structuring element would have detrimental effects, causing subsequent operations to introduce significant artifacts in the final results. This has broader implications for our sketch2model app: how do we select automatically (i.e. without hard coding it into the program) the appropriate structuring element size? Again, Elwyn will answer that question; in the last section I want to concentrate on explaining how the closing machinery works in this case.

In the next figure I have broken down the closing operation into its component dilation and erosion, and plotted them step by step to show what happens:

So we see that the edges do get linked by the dilation, but by only one pixel, which the following erosion then removes.

And now let’s break down the closing with disk of size two into its component. This is equivalent to applying two consecutive passes of dilation with disk of size 1, and then two consecutive passes of erosion with disk of size 1, as in the demonstration in the next figure below (by the way, if we observed carefully the second panel above we could predict that the dilation with a disk of size two would result in a link 3-pixel wide instead of 1-pixel wide, which the subsequent erosion will not disconnect).

Below is a GIF animated version of this demo, cycling to the above steps; you can also run it yourself by downloading and running the Jupyter notebook on GitHub.

Additional resources

Closing Jupyter notebook with complete Python code on GitHub

sketch2model Jupyter notebook with complete Python code on GitHub

sketch2model – sketch image enhancements

Posted on May 25, 2016 by matteomycarta

This is the second post of in a series of collaborative articles about sketch2model, a project from the 2015 Calgary Geoscience Hackathon organized by Agile Geoscience.

The first post was written by Elwyn Galloway and published on both his Scibbatical blog and here on MyCarta. In that article Elwyn mentioned the need for an adaptive image conditioning workflow for binarization of photos with geological sketches in images. Binarization is the process of converting a natural image to a binary image (check this simple but awesome interactive demonstration of binarization), which in our case is necessary to separate the sketch from the background.

The following is a demonstration of the preliminary image processing operations applied to the input photo when sketch2model is run. The full code listing of this demonstration is available as a Jupyter notebook on GitHub. Also in GitHub you can find a Jupyter Notebook with the fully documented version of sketch2model.

First we import one of the photos with sketches and convert it to a grayscale image.

im = io.imread('paper_breaks.png')
im = color.rgb2gray(im[0:-1:2,0:-1:2])

Next we enhance the grayscale image with a couple of cascaded processes. But before we do that, let’s graph the intensity values across the image to understand the degree of contrast between sketch edges and background, which ultimately will determine our success in separating them from it. We show this in the figure below, on the left, for one column of pixels (y direction). The black line across the input image on the right shows the location of the column selected. It is fairly obvious from the plot on the left that the intensity of the background is not uniform, due to variable light conditions when the photo was taken, and towards the right (e.g. bottom of the photo) it gets closer to that of the edges. In some images it might even become less than the intensity of the edge. This highlights the need for (preemptively) applying the enhancements illustrated in the remainder of the post.

The first enhancement is called compressor, or limiter. I read many years ago that it is used in electronics to find hard edges in data: the idea is to square each element in the data (image, or other type of data), smooth the result (enough to remove high frequency variations but not so much as to eliminate variability), take the square root, and finally divide each element in the input by the square root result.

I experimented with this method (at the time using Matlab and its Image Processing Toolbox) using the same gravity dataset from my 2015 geophysical tutorial on The Leading Edge (see the post Mapping and validating geophysical lineaments with Python). An example of one such experiments is shown in the figure below where: the top left map is the Bouguer data; the centre top map is the squared data; the top right is the result of a Gaussian blur; the bottom left the result of square root, and centre right is the final output, where the hardest edges in the original data have been enhanced.

The most important parameter in this process is the choice of the smoothing or blur; using a Gaussian kernel of different size more subtle edges are enhanced, as seen in the bottom right map (these are perhaps acquisition-related gridding artifacts).

In our sketch2model implementation the size of the Gaussian kernel is hardcoded; it was chosen following trial and error on multiple photos of sketches and yielded optimal results in the greatest majority of them. We were planning to have the kernel size depend on the size of the input image, but left the implementation to our ‘future work’ list.’

Here’s the compressor code from sketch2model:

# compressor or limiter (electronics): find hard edges in data with long 
# wavelength variations in amplitude
# step 1: square each element in the image to obtain the power function
sqr = im**2
# step 2: gaussian of squared image
flt2 = sp.ndimage.filters.gaussian_filter(sqr,21)
# step 3: divide the intensity of each original pixel by the square root 
# of the smoothed square
cmprs= im/(np.sqrt(flt2))

and a plot of the result (same column of pixels as in the previous one):

From the plot above we see that now the background intensity is uniform and the contrast has been improved. We can maximize it with contrast stretching, as below:

# contrast stretching
p2, p98 = np.percentile(cmprs, (2, 98))
rescale = exposure.rescale_intensity(cmprs, in_range=(p2, p98))

We now have ideal contrast between edges and background, and can get a binary image with the desired sketch edges using a scalar threshold:

# binarize image with scalar threshold
binary = ~(color.rgb2gray(rescale) > 0.5)

Bingo!

MyCarta

A blog about Geoscience, Visualization, Data Science, AI

Tag Archives: geophyiscs

The value of intellectual play: Mill, machine learning, and a drilling problem I couldn’t stop thinking about

Featured

The problem

Mill’s Methods: 19th-century philosophy meets drilling risk

What a neural network learned

What the AI got right—and what it missed

Why this stuck with me

The convergence

References and Links

Geoscience Machine Learning bits and bobs – data completeness

2016 Machine learning contest – Society of Exploration Geophysicists

2020 Machine Predicted Lithology – FORCE

Geoscience Machine Learning bits and bobs – introduction

Bits and what?

* September 2020 UPDATE *

***************************

Some background on the 2016 ML contest

Tentative topics for this series

List of previous works (comprehensive, to the best of my knowledge)

Computer vision in geoscience: recover seismic data from images – part 2

Mild or wild: robustness through morphological filtering

This guest post (first published here) is by Elwyn Galloway, author of Scibbatical on WordPress. It is the forth in our series of collaborative articles about sketch2model, a project from the 2015 Calgary Geoscience Hackathon organized by Agile Geoscience. Happy reading.

Morphological Filtering

Some is Good, More is Better?

Thinking Like a Human

Producing Usable Results

Machine Learning in Geoscience with Scikit-learn. Part 2: inferential statistics and domain knowledge to select features for oil prediction

sketch2model – linking edges with mathematical morphology

Introduction

Quick mathematical morphology review

Closing demo

Additional resources

Related Posts

sketch2model – sketch image enhancements

The problem

Mill’s Methods: 19th-century philosophy meets drilling risk

What a neural network learned

What the AI got right—and what it missed

Why this stuck with me

The convergence

References and Links

Share this:

2016 Machine learning contest – Society of Exploration Geophysicists

2020 Machine Predicted Lithology – FORCE

Share this:

Bits and what?

*** September 2020 UPDATE ***

***************************

Some background on the 2016 ML contest

Tentative topics for this series

List of previous works (comprehensive, to the best of my knowledge)

Share this:

Share this:

This guest post (first published here) is by Elwyn Galloway, author of Scibbatical on WordPress. It is the forth in our series of collaborative articles about sketch2model, a project from the 2015 Calgary Geoscience Hackathon organized by Agile Geoscience. Happy reading.

Morphological Filtering

Some is Good, More is Better?

Thinking Like a Human

Producing Usable Results

Share this:

Share this:

Introduction

Quick mathematical morphology review

Closing demo

Additional resources

Related Posts

Share this:

Share this:

* September 2020 UPDATE *