I started writing this blog 10 years ago, and it is to date one of the things I am most proud of.
This is a big milestone for me, so I would like to begin by thanking all the people who encouraged me at the beginning, and in particular, for their valiant support and feedback: Matt Hall, Evan Bianco, Oliver Kuhn, Steve Lynch, and last but not least my life partner Rebecca.
A lot of the work I did in the first couple of years went into furthering my understanding of the use of colours in scientific visualization, and sharing it with others: how to use better colormaps, for example Perceptual rainbow palette, part a and part b.
I am grateful I achieved those knowledge-sharing goals:
The blog has often been used as a reference in talks and other publications on colormaps, beginning with this classic matplotlib talk given by Kristen Thyng at SciPy 2014
I was thrilled to receive positive feedback on my work from Bernice Rogowitz, someone I hold in very high esteem
Some of that work on the blog led to an invitation from Matt Hall to write a tutorial for The Leading Edge (SEG). The tutorial came with a Jupyter notebook demonstrating how to evaluate default colour maps and favour more perceptual alternatives
I am particularly proud to see that the article still ranks among the top 20 most downloaded papers from The Leading Edge (2010–2020)
Ultimately, I am very happy to have created a space for sharing and exchanging ideas freely.
So, to celebrate these 10 years of MyCarta, I treated it to a new domain, mycartablog.com (the old domain and links still work), and a brand new look (it took me a while to get there, but I like it a lot) with a theme that should now be responsive on all devices (welcome to the new era, Matteo!).
I will also soon publish a series of short but sweet new posts on colormaps and visualization (and republish them on LinkedIn).
As of yesterday, I no longer have a full-time day job.
I am looking for opportunities.
I’d love to hear about projects in geophysics, computational geoscience, data science, machine learning. Feel free to get in touch with me at matteo@mycarta.ca.
This guest post (first published here) is by Elwyn Galloway, author of Scibbatical on WordPress. It is the first in our series of collaborative articles about sketch2model, a project from the 2015 Calgary Geoscience Hackathon organized by Agile Geoscience. Happy reading.
Collaboration in action. Evan, Matteo, and Elwyn (foreground, L to R) work on sketch2model at the 2015 Calgary Geoscience Hackathon. Photo courtesy of Penny Colton.
Welcome to an epic blog crossover event. Two authors collaborating to tell a single story over the course of several articles.
We’ve each mentioned the sketch2model project on our respective blogs, MyCarta and scibbatical, without giving much detail about it. Apologies if you’ve been waiting anxiously for more. Through the next while, you’ll get to know sketch2model as well as we do.
The sketch2model team came together at the 2015 Geoscience Hackathon (Calgary), hosted by Agile Geoscience. Elwyn and Evan Saltman (epsalt on twitter and GitHub) knew each other from a previous employer, but neither had met Matteo before. All were intrigued by the project idea, and the individual skill sets were diverse enough to combine into a well-rounded group. Ben Bougher, part of the Agile Geoscience team, assisted with the original web interface at the hackathon. Agile’s take on this hackathon can be found on their blog.
Conception
The idea behind sketch2model is that a user should be able to easily create forward seismic models. Modelling at the speed of imagination, allowing seamless transition from idea to synthetic seismic section. It should happen quickly enough to be incorporated into a conversation. It should happen where collaboration happens.
The sketch2model concept: modelling at the speed of imagination. Take a sketch (a), turn it into an earth model (b), create a forward seismic model (c). Our hack takes you from a to b.
Geophysicists like to model wedges, and for good reason. However, wedge logic can get lost on colleagues. It may not effectively demonstrate the capability of seismic data in a given situation. The idea is not to supplant that kind of modelling, but to enable a new, lighter kind: modelling that can easily produce results for twelve different depositional scenarios as quickly as they can be sketched on a whiteboard.
The Hack
Building something mobile to turn a sketch into a synthetic seismic section is a pretty tall order for a weekend. We decided to take a shortcut by leveraging an existing project: Agile’s online seismic modelling package, modelr. The fact that modelr works through any web browser (including a smartphone) kept things mobile. In addition, modelr’s existing functionality allows a user to upload a png image and use it as a rock property model. We chose to use a web API to interface our code with the web application (as a bonus, our approach conveniently fit with the hackathon’s theme of Web). Using modelr’s capabilities, our hack was left with the task of turning a photo of a sketched geologic section into a png image where each geologic body is identified as a different color. An image processing project!
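To give a flavour of the web API part, here is a minimal Python sketch of uploading an image to a web service. The URL and form field name below are made-up placeholders for illustration, not modelr's actual API:

import requests

# Post a color-coded earth model image to a web service.
# NOTE: the endpoint and field name are hypothetical placeholders,
# not modelr's real API.
with open('earth_model.png', 'rb') as f:
    response = requests.post('https://example.com/api/models',
                             files={'image': f})
response.raise_for_status()  # raise an error if the upload failed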
Agile is a strong proponent of Python in geophysics (for reasons nicely articulated in their blog post), and the team was familiar with the language to one extent or another. There was no question that it was the language of choice for this project. And no regrets!
We aimed to create an algorithm robust enough to handle any image of anything a user might sketch while accurately reproducing their intent. Marker on whiteboard presents different challenges than pencil on paper. Light conditions can be highly variable. Sketches can be simple or complex, tidy or messy. When a user leaves a small gap between two lines of the sketch, should the algorithm take the sketch as-is and interpret a single body? Or fill the small gap and interpret two separate bodies?
Our algorithm needs to be robust enough to handle a variety of source images: simple, complex, pencil, marker, paper, white board (check out the glare on the bottom left image). These are some of the test images we used.
Matteo has used image processing for geoscience before, so he landed on an approach for our hack almost instantly: binarize the image to distinguish sketch from background (turn the color image into a binary image via thresholding); identify and segregate geobodies; create an output image with each body colored uniquely.
Taking the image of the original sketch (left) and creating a binary image (right) is an integral part of the sketch2model process.
Python has functions to binarize a color image, but for our applications, the results were very inconsistent. We needed a tool that would work for a variety of media in various lighting conditions. Fortunately, Matteo had some tricks up his sleeve to precondition the images before binarization. We landed on a robust flow that can binarize whatever we throw at it. Matteo will be crafting a blog post on this topic to explain what we’ve implemented.
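To give a rough idea of what such a flow can look like, here is a minimal sketch using scikit-image. The preconditioning steps below are illustrative assumptions on our part, not necessarily the flow we ended up with (Matteo's post will cover that):

from skimage import io, color, filters, morphology

image = io.imread('sketch.jpg')      # photo of the sketch (placeholder file name)
gray = color.rgb2gray(image)         # collapse to a single channel, floats in [0, 1]

# Precondition: estimate the slowly varying background (paper or whiteboard
# plus uneven lighting) with a large grayscale closing, which removes the
# thin dark sketch lines, then divide it out to flatten the illumination.
background = morphology.closing(gray, morphology.disk(25))
flattened = gray / (background + 1e-6)

# A global Otsu threshold now separates dark sketch lines from background.
threshold = filters.threshold_otsu(flattened)
binary = flattened < threshold       # True where the sketch lines are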
Once the image is binarized, each geological body must be automatically identified as a closed polygon. If the sketch were reproduced exactly as imagined, a segmentation function would do a good job. The trouble is that the sketch captured is rarely the same as the one intended — an artist may accidentally leave small gaps between sketch lines, or the sketch medium can cause unintentional effects (for example, whiteboard markers can erase a little when sketch lines cross, see example below). We applied some morphological filtering to compensate for the sketch imperfections. If applied too liberally, this type of filtering causes unwanted side effects. Elwyn will explore how we struck a balance between filling unintentional gaps and accurate sketch reproduction in an upcoming blog post.
Morphological filtering can compensate for imperfections in a sketch, as demonstrated in this example. The original sketch (left) was done with a marker on white board. Notice how the vertical stroke erased a small part of the horizontal one. The binarized version of the sketch (middle) shows an unintentional gap between the strokes, but morphological filtering successfully closes the small gap (right).
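In code, the gap-closing step can be as simple as a binary closing. Here is a toy scikit-image example; the structuring element size is arbitrary, and tuning it is exactly the balancing act mentioned above:

import numpy as np
from skimage import morphology

# Toy 'sketch': a one-pixel-thick line with a one-pixel unintentional gap.
binary = np.zeros((7, 7), dtype=bool)
binary[3, :] = True
binary[3, 3] = False                 # the gap

# Binary closing (dilation followed by erosion) bridges gaps smaller than
# the structuring element; its size trades gap-filling against faithful
# reproduction of the sketch.
closed = morphology.binary_closing(binary, morphology.square(3))
assert closed[3, 3]                  # the gap is now filled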
Compared to the binarization and segmentation, generating the output is a snap. With this final step, we’ve transformed a sketch into a png image where each geologic body is a different color. It’s ready to become a synthetic seismic section in modelr.
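A minimal sketch of this last step, again with scikit-image on a toy input (not the exact sketch2model code):

import numpy as np
from skimage import color, io, measure
from skimage.util import img_as_ubyte

# Toy cleaned-up binary sketch: one horizontal line splitting the image
# into two geobodies (in practice this comes from the previous steps).
closed = np.zeros((7, 7), dtype=bool)
closed[3, :] = True

# Label the connected regions that are not sketch lines: each is a body.
bodies = measure.label(~closed, connectivity=1)

# Map each body to its own color and save a png ready for modelr.
rgb = color.label2rgb(bodies, bg_label=0)
io.imsave('earth_model.png', img_as_ubyte(rgb))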
Into the Wild
“This is so cool. Draw something on a whiteboard and have a synthetic seismogram right on your iPad five seconds later. I mean, that’s magical.”
Sketch2model was a working prototype by the end of the hackathon. It wasn’t the most robust algorithm, but it worked on a good proportion of our test images. The results were promising enough to continue development after the hackathon. Evidently, we weren’t the only ones interested in further development because sketch2model came up on the February 17th episode of Undersampled Radio. Host Matt Hall: “This is so cool. Draw something on a whiteboard and have a synthetic seismogram right on your iPad five seconds later. I mean, that’s magical.”
Since the hackathon, the algorithm and web interface have progressed to the point that you can use it on your own images at sketch2model.com. To integrate this functionality directly into the forward modelling process, sketch2model will become an option in modelr. The team has made this an open-source project, so you’ll also find it on GitHub. Check out the sketch2model repository if you’re interested in the nuts and bolts of the algorithm. Information posted on these sites is scant right now, but we are working to add more information and documentation.
Sketch2model is designed to enable a new kind of collaboration and creativity in subsurface modelling. By applying image processing techniques, our team built a path to an unconventional kind of forward seismic modelling. Development has progressed to the point that we’ve released it into the wild to see how you’ll use it.
I started this blog in 2012; in these 3 1/2 years it has been a wonderful way to channel some of my interests in image processing, geophysics, and visualization (in particular colour), and more recently Python.
During this time, among other things, I learned how to build and maintain a blog, I packaged a popular Matlab function, wrote an essay for Agile Geoscience’s first book on Geophysics, presented at the 2012 CSEG Geoconvention, and wrote two tutorials for The Leading Edge. Last, but not least, I made many new friends and professional connections.
Starting in 2016, I would like to concentrate my efforts on building useful (hopefully) and fun (for me at least) open source (this one is for sure) tools in Python. This is going to be my modus operandi:
do some work, get to some milestones
upload the relevant IPython/Jupyter Notebooks to GitHub
The implementation of the finished app uses morphological filtering and other image processing methods to enhance the sketch image and convert it into a model with discrete bodies, which is then passed on to Agile's modelr.io to create a synthetic.
I think openness in geoscience is very important, and I feel we all have a duty to be open with our work, data, and ideas when possible and practical. I certainly believe in sharing a good deal of the work I do in my spare time. So much so that when I started this blog there was no doubt in my mind that I would include an agreement for people to use and modify freely what I published. Indeed, I venture to say I conceived the blog primarily as a vehicle for sharing.
Some of the reasons for sharing are also selfish (in the best sense): doing so gives me a sense of fulfillment and pleasure, and, as Matt Hall writes in Five things I wish I’d known (one of the essays in 52 Things You Should Know About Geophysics), you can find incredible opportunities for growth in writing, talking, and teaching. There is also the professional advantage of maintaining visibility in the marketplace, or, as Sven Treitel puts it, Publish or perish, industrial style (again in 52 Things You Should Know About Geophysics).
How I used to share
At the beginning I chose an Attribution-NonCommercial-ShareAlike license (CC BY-NC-SA), but soon removed the non-commercial limitation in favour of an Attribution-ShareAlike license (CC BY-SA).
A (very) cold shower
Unfortunately, one day last year I ‘woke up’ to an unpleasant surprise: in the space of two days, an online magazine had reposted all my content – literally, A to Z! I found this out easily because I received pingback approval requests for each post (thank you, WP!). Quite shocked, I confess, the first thing I did was to check the site: indeed, all my posts were there. The publisher included an attribution with my name at the top of each post, but I was not convinced this was fair use. Quite the contrary: to me this was a clear example of content scraping, and the reason I say that is that they republished even my Welcome post and my On a short blogging sabbatical post – in the science category! Please see the two screen captures of the pingbacks below (I removed their information):
If this were a legitimate endeavour, I reasoned, a magazine with thoughtful editing, I was sure those two posts would not have been republished. Also, I saw that posts from many other blogs were being republished en masse daily.
Limitations of Creative Commons licenses
I asked for advice and help from my Twitter followers and on the WordPress forums, while at the same time doing some research of my own. That is when I learned that this is very common; however, being in good company (Google returned about 9,310,000 results when searching for ‘blog scraping’) did not feel like much consolation. I read that sites may get away with scraping content, or at least try. I will quote directly from the Plagiarism Today article Creative Commons: License to Splog?: “They can scrape an entire feed, offer token attribution to each full post lifted (often linking just to the original post) and rest comfortably knowing that they are within the bounds of the law. After all, they had permission …Though clearly there is a difference between taking and reposting a single work and reposting an entire site, the license offers a blanket protection that covers both behaviors”.
It is possible to switch to a more restrictive Creative Commons license like Attribution-NonCommercial-NoDerivs (perhaps modified as a CC+), but that only allows you to cut your losses, not to fight the abuse, as it applies only on a going-forward basis (I read this in an article and jotted down a note, but unfortunately I cannot track down the source – you may be luckier, or cleverer).
Then I was contacted by the site administrator through my blog contact form (again I removed their information), who had read my question on the WordPress forum:
Your Name: ______
Your Email Address: ______
Your Website: ______
Message: Hello.
Your site is under a CC license. What’s the trouble in republishing your content?
Regards.
Subject: Your license
Time: Thursday July 26, 2012 at 12:26 am
IP Address: ________
Contact Form URL: http://mycartablog.com/contact/
Sent by an unverified visitor to your site.
I responded with a polite letter, as suggested by @punkish on twitter. I explained why I thought they were exceeding what was warranted under the Creative Commons license, that republishing the About page and Sabbatical posts was to me proof of scraping, and I threatened to pursue legal recourse, starting with a DMCA Notice of Copyright Infringement. Following my email, they removed all my posts from their site and notified me.
Two alternatives
I think I was fortunate in this case, and decided to take matters into my own hands to prevent it from happening again. Following my research, I saw two good, viable ways to better protect my blog from whole-content scraping while continuing to share my work. The first one involved switching to WordPress.org. This would allow more customization of the blog, and the use of tools such as the WP RSS footer plugin, which lets you get credit for scraped posts, and WP DMCA website protection. Another benefit of switching to WordPress.org is that – if you are of belligerent inclination – you can try to actively fight content scraping with cloaking. Although it is one of my goals for the future of this blog, I am currently not prepared to switch from WordPress.com due to time constraints.
I customized my statement to reduce as much as possible the need for readers to ask for permission, by allowing WordPress reblogging and by allowing completely open use of my published code and media. Below is a screen capture of my statement, which is located in the blog footer:
I hope this will be helpful for those that may have the same problem. Let me know what you think.
The WordPress.com stats helper monkeys prepared a 2012 annual report for this blog.
Here’s an excerpt:
4,329 films were submitted to the 2012 Cannes Film Festival. This blog had 19,000 views in 2012. If each view were a film, this blog would power 4 Film Festivals
I recently updated the blog look by switching to the Oxygen WP theme. This is a very good looking theme and I am thrilled by the results.
I made a few changes to the default look using CSS code. Mind you, nothing extraordinary; in fact, I am an absolute beginner. But with a good tip from WP staff member philiparthurmoore, a bit of reading, and a lot of (failed) experiments, I got exactly the look I was after.
These are the changes I made:
First, I removed one of the three columns (the right one, indicated by an arrow in the Oxygen demo below) so as to have more room available for the posts’ content.
Second and third, I wanted both the header and the category menu (indicated by arrows below) fixed in place at the top of the page while everything else scrolls.
Notice that to accomplish this for the black category menu, I had to add an image widget underneath it with a white image the same size as the menu, to make the other widgets start below it. A bit of a pedestrian solution, but it works.
By the way, the black category menu is shown in the demo, but it does not come up by default when you first use the theme. You have to create it in Menus (see below), add your categories to it, and then you can shuffle them around by dragging and dropping (very cool).
That’s it, Bob’s your uncle!
And here’s my lightly commented code. Let me know if you like the blog’s look and if you have any suggestions.
/* To create fixed position secondary menu */
.menu-secondary {
position: fixed;
width: 180px;
margin-top: 185px;
margin-bottom: 190px;
padding: 0 0 34px;
}
/* Added blank Image Widget with white box 180px x 330px */
/* to keep other widgets below secondary menu */
.widget_image {
margin-top: 205px;
}
/* To display nothing in the third column */
#tertiary {
display: none;
}
/* To allow main content to extend into third column */
#content {
width: 750px;
margin-left: 215px;
}
/* To create fixed position header */
#masthead {
clear: both;
padding: 70px 0 0;
width: 100%;
position: fixed;
z-index: 200;
background-color: #fff;
margin-bottom: 10px;
}
#content {
margin-top: 185px;
}
UPDATE
Reader Ron DeSpain wrote to let me know that, when not in full-window mode, the fixed secondary menu and fixed header overlapped the content. Not only that, but some of the text disappeared off the right edge of the page. I tried a couple of fixes, but those in turn caused issues when viewing in full-window mode.
So I decided to revert for now to the default Oxygen scrolling menu and header, until my CSS programming skills allow me to get this sorted (taking courses, see below).