In a recent post titled Unweaving the rainbow, Matt Hall described our joint attempt (partly successful) to create a Python tool to enable recovery of digital data from any pseudo-colour scientific image (and a seismic section in particular, like the one in Figure 1), without any prior knowledge of the colormap.
Please check our GitHub repository for the code and slides and watch Matt’s talk (very insightful and very entertaining) from the 2017 Calgary Geoconvention below:
In the next two posts, coming up shortly, I will describe in greater detail my contribution to the project, which focused on developing a computer vision pipeline to automatically detect where the seismic section is located in the image, rectify any distortions that might be present, and remove all sorts of annotations and trivia around and inside the section. The full workflow is included below (with sections I-VI developed to date):
- I – Image preparation and enhancement:
  - Convert to gray scale
  - Optional: smooth or blur to remove high-frequency noise
  - Enhance contrast
- II – Find seismic section:
  - Convert to binary with adaptive or other threshold method
  - Find and retain only the largest object in the binary image
  - Fill its holes
  - Apply opening and dilation to remove minutiae (tick marks and labels)
- III – Define rectification transformation:
  - Detect the contour of the largest object found in (II). This should be the seismic section.
  - Approximate the contour with a polygon, with enough tolerance to ensure it has 4 sides only
  - Sort the polygon corners using angle from centroid
  - Define a new rectangular image using the length of the largest long and largest short sides of the initial contour
  - Estimate and output the transformation to warp the polygon to a rectangle
- IV – Warp using transformation
- V – Blanking annotations inside seismic section (if rectangular):
  - Start with the output of (IV)
  - Pre-process and apply Canny filter
  - Find contours in the Canny output that are smaller than the input size
  - Sort contours (by shape and angular relationships or diagonal lengths)
  - Loop over contours:
    - Approximate contour
    - If the approximation has 4 points AND the 4 semi-diagonals are of the same length: fill contour and add to mask
- VI – Use mask to remove text inside rectangle in the input and blank (NaN) the whole rectangle.
- VII – Optional: tools to remove arrows and circles/ellipses:
  - For arrows – among the contours from (IV), find the ones with 7 sides and low convexity (concave); alternatively, use Harris corner detection and count 7 corners, or use template matching
  - For ellipses – template matching or regionprops
- VIII – Optional: FFT filters to remove timing lines and vertical lines
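To give a feel for sections I and II, here is a minimal sketch using NumPy and SciPy's `ndimage` module. It is not the code in `mycarta.py`: the function name `find_section_mask`, its parameters, and the simple global threshold (standing in for the adaptive threshold listed above) are assumptions made for illustration, and it assumes the page background is lighter than the seismic section.

```python
import numpy as np
from scipy import ndimage as ndi


def find_section_mask(rgb, blur_sigma=1.0, threshold=0.9, opening_size=7):
    """Hypothetical helper: boolean mask of the largest dark object in an RGB image."""
    # I - gray scale (standard luminance weights), optional blur, contrast stretch
    gray = rgb[..., :3].astype(float) @ np.array([0.2125, 0.7154, 0.0721])
    if blur_sigma > 0:
        gray = ndi.gaussian_filter(gray, blur_sigma)
    lo, hi = np.percentile(gray, (2, 98))
    gray = np.clip((gray - lo) / max(hi - lo, 1e-12), 0.0, 1.0)

    # II - binarize (assumption: global threshold, section darker than the page)
    binary = gray < threshold

    # Keep only the largest connected object
    labels, n_objects = ndi.label(binary)
    if n_objects == 0:
        return np.zeros(binary.shape, dtype=bool)
    sizes = np.bincount(labels.ravel())[1:]  # skip the background label 0
    largest = labels == (np.argmax(sizes) + 1)

    # Fill holes, then open to strip thin attachments such as tick marks
    filled = ndi.binary_fill_holes(largest)
    structure = np.ones((opening_size, opening_size), dtype=bool)
    opened = ndi.binary_opening(filled, structure=structure)

    # Small dilation, following the "opening and dilation" step in the list
    return ndi.binary_dilation(opened, iterations=1)
```

The opening size controls how thick a tick mark or label can be before it survives; the value used here is only a starting point and would need tuning on real figures.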
You can download from GitHub all the tools for the automated workflow (parts I-VI) in the module mycarta.py, as well as an example Jupyter Notebook showing how to run it.
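Independently of that module, the rectification steps (sections III and IV) can be sketched in a few lines of NumPy and SciPy. All function names below are hypothetical, the homography is estimated with a plain direct linear transform rather than any particular library routine, and the corner ordering assumes a roughly upright quadrilateral in image coordinates (x to the right, y downward). Treat it as an illustration of the idea, not as the project's implementation.

```python
import numpy as np
from scipy import ndimage as ndi


def sort_corners(corners):
    """Order 4 (x, y) corners by angle around their centroid: TL, TR, BR, BL."""
    pts = np.asarray(corners, dtype=float)
    d = pts - pts.mean(axis=0)
    return pts[np.argsort(np.arctan2(d[:, 1], d[:, 0]))]


def target_rectangle(ordered):
    """Rectangle sized from the longest horizontal and vertical polygon sides."""
    tl, tr, br, bl = ordered
    width = max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl))
    height = max(np.linalg.norm(br - tr), np.linalg.norm(bl - tl))
    return np.array([[0, 0], [width, 0], [width, height], [0, height]], dtype=float)


def estimate_homography(src, dst):
    """Direct linear transform: 3x3 matrix mapping src points onto dst points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.array(rows, dtype=float))
    h = vt[-1].reshape(3, 3)  # null-space vector of the 8x9 system
    return h / h[2, 2]


def apply_homography(h, pts):
    """Map (x, y) points through the homography h."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ h.T
    return homog[:, :2] / homog[:, 2:3]


def warp_to_rectangle(image, h, out_shape):
    """Resample a 2D image onto the rectified grid (apply per channel for RGB)."""
    n_rows, n_cols = out_shape
    yy, xx = np.mgrid[0:n_rows, 0:n_cols]
    # For every output pixel, look up where it came from in the input image
    src = apply_homography(np.linalg.inv(h), np.column_stack([xx.ravel(), yy.ravel()]))
    coords = [src[:, 1].reshape(n_rows, n_cols), src[:, 0].reshape(n_rows, n_cols)]
    return ndi.map_coordinates(image, coords, order=1, mode="nearest")
```

A typical call sequence would be `ordered = sort_corners(corners)`, `rect = target_rectangle(ordered)`, `h = estimate_homography(ordered, rect)`, and then `warp_to_rectangle(gray, h, (rows, cols))` with `rows` and `cols` rounded from the rectangle's height and width.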
The first post focuses on the image pre-processing and enhancement, and the detection of the seismic line (sections I and II, in green); the second one deals with the rectification of the seismic (sections IV to V, in blue). They are not meant as full tutorials but rather as a pictorial road map to (partial) success; key Python code snippets will be included and discussed.
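As a small taste of the kind of snippet involved, the rectangle test from section V (4 points with 4 semi-diagonals of the same length) can be written as below. The function name `is_rectangle` and the relative tolerance are assumptions for this sketch; the semi-diagonals are taken as the distances from the centroid of the 4 corners to each corner.

```python
import numpy as np


def is_rectangle(corners, rel_tol=0.05):
    """Hypothetical check: 4 corners whose semi-diagonals have (nearly) equal length."""
    pts = np.asarray(corners, dtype=float)
    if pts.shape != (4, 2):
        return False
    # Distance from the centroid of the corners to each corner
    semi = np.linalg.norm(pts - pts.mean(axis=0), axis=1)
    if semi.mean() == 0:
        return False  # all corners coincide
    return bool(semi.max() - semi.min() <= rel_tol * semi.mean())
```

The test works because four points equidistant from their own centroid must form two opposite pairs, which is exactly a rectangle; the tolerance absorbs the small errors introduced by polygon approximation.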