When Apple announced their new file system, APFS, in June, I hustled to be in the front row of the WWDC presentation, then for questions with the presenters, and then for the open Q&A session. I took a week to write up my notes, which turned into a 12-page behemoth of a blog post, longer than my college thesis.
I had been procrastinating making the family holiday card. It was a combination of having a lot on my plate and dreading the formulation of our annual note recapping the year; there were some great moments, but I’m glad I don’t have to do 2016 again. It was just before midnight and either I’d make the card that night or leave an empty space on our friends’ refrigerators. Adobe Illustrator had other ideas:
Since Noms dropped last week the dev community has seemed into it. “Git for data” — it simultaneously evokes something very familiar and yet unconstrained. Something that hasn’t been well-noted is how much care the team has taken to make Noms fun to build with, and it is.
I liked Go right away. It was close enough to C and Java to be instantly familiar, the examples and tutorials were straightforward, and I was quickly writing real code. I’ve wanted to learn Go since its popularity was surging a few years ago. In no danger of being judged an early adopter, I happily found a great project that, as it happened, had to be in Go (more in a future post).
This series of posts covers APFS, Apple’s new filesystem announced at WWDC 2016. See the first post for the table of contents.
I’m not sure Apple absolutely had to replace HFS+, but likely they had passed an inflection point where continuing to maintain and evolve the 30+ year old software was untenable. APFS is a product born of that necessity.
After nearly nine years at Sun and then six at Delphix I’m looking for the next technology, team, and market to dive into. I’ve had the extremely good fortune of working with three groups (the DTrace team, Fishworks at Sun, and Delphix) that featured top-tier technologists working on differentiated products, each of them a wonderful place and time. Most recently, as CTO of Delphix, I grew the engineering team from a tiny seed, and was fortunate enough to be joined by so many people from my past, including some of my best friends and long-term colleagues.
Walk around almost any software development shop or university CS department and you’ll be struck by the underrepresentation of women. At least you would be were this not an expected norm of our industry. And of course much has been written about this recently hot topic in Silicon Valley. What do companies and organizations do about it? At Delphix our culture is one of focus and purpose; our approach to diversity follows in that spirit.
Like many programmers I like to try out new languages. After lunch with Alex Crichton, one of the Rust contributors, I started writing my favorite program in Rust. Rust is a “safe” systems language that introduces concepts of data ownership and mutability to semantically prevent whole categories of problems.
We built DTrace to solve problems; at the start, the problems we understood best were our own. In the Solaris Kernel Group we started by instrumenting the kernel and system calls, the user/kernel boundary. Early use required detailed knowledge of kernel internals. As DTrace use grew—within the team, in Sun and then beyond—we extended DTrace to turn every function and every instruction in user programs into probes. We added stable points of instrumentation both in the kernel and in user-land so that no deep knowledge of program or kernel internals would be required.
In the frenzied, insular world of a Silicon Valley startup it can be easy to lose perspective on the broader community in which we live and work. Among the great hackathon projects to come from our bi-annual engineering event was the idea of “Angel Sharks”, a group of volunteers at Delphix who provide opportunities for volunteering and community giving.
I started my blog June 17, 2004, tempted by the opportunity of Sun’s blogging policy, and cajoled by Bryan Cantrill’s presentation to the Solaris Kernel Team “Guerrilla Marketing” (net: Sun has forgotten about Solaris so let’s get the word out). I was a skeptical blogger.
Delphix customers include top companies across a wide range of industries, most of them executing around the clock. Should a problem arise they require support from Delphix around the clock as well. To serve our customers’ needs we’ve drawn from industry best-practices while recently mixing in an unconventional approach to providing the best possible customer service regardless of when a customer encounters a problem.
Data breaches make headlines at a regular cadence. Each is a surprise, but they are not, as a whole, surprising. While the extensive and sophisticated Target breach has retailers jumping, the lesser-known theft of personal information for 20 million subscribers from three South Korean credit card companies had a less arcane cause.
In my last blog post, I wrote about the ZFS write throttle, and how we saw it lead to pathological latency variability on customer systems. Matt Ahrens, the co-founder of ZFS, and I set about to fix it in OpenZFS. While the solution we came to may seem obvious, we arrived at it only through a bit of wandering in a wide open solution space.
It’s no small feat to build a stable, modern filesystem. The more I work with ZFS, the more impressed I am with how much it got right, and how malleable it’s proved. It has evolved to fix shortcomings and accommodate underlying technological shifts. It’s not surprising though that even while its underpinnings have withstood the test of production use, ZFS occasionally still shows the immaturity of the tween that it is.
I’ve been watching ZFS nearly from its moment of inception, so it’s exciting to see it enter its newest phase of development in OpenZFS. While ZFS has long been regarded as the hottest filesystem on 128 bits, and has shipped in many different products, what’s been most impressive to me about ZFS development has been the constant iteration and reinvention.
Today marks my third anniversary of joining Delphix. Joining a startup, I knew there would be lots to learn — indeed there’s been a lesson nearly once-a-day. Here are my top three lessons from my first three years at a startup. Even if the points themselves should have been obvious to me, the degree of their impact certainly wasn’t.
A couple of weeks ago, Joyent hosted A Midsummer Night’s Systems meetup, a fun event with talks ranging from Node.js fatwas to big data for Mario Kart 64. My colleague Jeremy Jones had recently done some amazing work, perfect for the meetup, but with his first child less than a day old, Jeremy allowed me to present in his stead.
A prospective new college hire recently related an odd comment from his professor: systems programming is dead. I was nonplussed; what could the professor have meant? Systems is clearly very much alive. Interesting and important projects march under the banner of systems. But as I tried to construct a less emotional rebuttal, I realized I lacked a crisp definition of what systems programming is.
The idea of the holistic engineer embodies the point of view that an engineer needs to consider the whole system, the whole body of work that makes a product successful. It bears no relation to holistic health — and it’s not some even newer age quackery. There are many specialist roles in the software industry — marketing, product management, project management, documentation, education, support, etc.
I’ve continued to explore ZFS as I try to understand performance pathologies, and improve performance. A particular point of interest has been the ZFS write throttle, the mechanism ZFS uses to avoid filling all of system memory with modified data. I’m eager to write about the strides we’re making in that regard at Delphix, but it’s hard to appreciate without an understanding of how ZFS batches data. Unfortunately that explanation is literally nowhere to be found.
Back in October I was pleased to attend — and my employer, Delphix, was pleased to sponsor — illumos day and ZFS day, run concurrently with Oracle Open World. Inspired by the success of dtrace.conf(12) in the Spring, the goal was to assemble developers, practitioners, and users of ZFS and illumos-derived distributions to educate, share information, and discuss the future.
Lately, I’ve been rooting around in the bowels of ZFS as we’ve explored some long-standing performance pathologies. To that end I’ve been fortunate to learn at the feet of Matt Ahrens who was half of the ZFS founding team and George Wilson who has forgotten more about ZFS than most people will ever know. I wanted to start sharing some of the interesting details I’ve unearthed.
At the illumos hackathon last week, Robert Mustacchi and I prototyped better support for manipulating user-land structures. As anyone who’s used it knows, DTrace is currently very kernel-centric — this both reflects the reality of how operating systems and DTrace are constructed, and the origins of DTrace itself in the Solaris Kernel Group. Discussions at dtrace.conf(12) this spring prompted me to chart a path to better user-land support.
I wish that none of our customers encountered problems with our product, but they do, and when they do our means for remotely accessing their systems is often via a Webex shared screen. We remotely control their Delphix server to collect data (often using DTrace). While investigating a customer issue recently I developed a couple of techniques to work around common problems; I thought I’d share them in case others have similar problems, and as a note to my future self who will certainly forget the specifics next time.
The mantra as we initially developed DTrace was to make impossible things possible, not easy things easier. Since codifying that, the tendency toward the latter in developer tools has been apparent. Our focus on the former however has left certain usability burrs that stymie newbies, and annoy vets. Much of the DTrace development of late has focused on a middle category: simplifying hard things that should be simple.
The print() action
DTrace first peered into Java in early 2005 thanks to an early prototype by Jarod Jenson that led eventually to the inclusion of USDT probes in the HotSpot JVM. If you want to see where, say, the java.net.SocketOutputStream.write() method is called, you can simply run this DTrace script: hotspot$target:::method-entry /copyinstr(arg1, arg2) == "java/net/SocketOutputStream" && copyinstr(arg3, [...]
For the second time in as many quadrennial dtrace.confs, I was impressed at how well the unconference format worked out. Sharing coffee with the DTrace community, it was great to see some of the oldest friends of DTrace — Jarod Jenson, Stephen O’Grady, Jonathan Adams to name a few — and to put faces to [...]
A few months ago I took DTrace on OEL for a spin after Oracle announced it. The results were ugly; as one of the authors of DTrace, I admit to being shocked by the shoddiness of the effort. Yesterday, Oracle dropped an updated beta so I wanted to see how far they’ve come in the 4+ months [...]
Tonight, my Delphix colleague Zubair Khan and I presented the integration we’ve done with git at the SF Bay Area Large-Scale Production Engineering meetup. When I started at Delphix, we were using Subversion — my ire for which the margins of this blog are too narrow to contain. We switched to git, and in the [...]
Back at Fishworks, my colleagues had a nickname for me: Adam Leventhal, Hardware Engineer. I wasn’t designing hardware; I wasn’t even particularly more involved with hardware specs. The name referred to my preternatural ability to fit round pegs into square holes, to know when parts would bend but not break (or if they broke how [...]
ZFS recently celebrated its informal 10th anniversary; to mark the occasion, Delphix hosted a ZFS-themed meetup for the illumos community (sponsored generously by Joyent). Many thanks to Deirdre Straughan, the new illumos community manager, for helping to organize and for filming the event. Three of my colleagues at Delphix presented work they’ve been doing in [...]
Every once in a rare while our development machines encounter a fatal error during boot because we couldn’t unmount tmpfs. This weekend I cracked the case, so I thought I’d share my uses of boot-time DTrace, and the musty corners of the operating systems that I encountered along the way. First I should explain a [...]
Exactly 10 years ago today, Jeff Bonwick and Matt Ahrens got their first ZFS prototype working in user-land. Jeff had scrapped his previous attempt at reinventing filesystems, working through the established filesystem management and engineering channels at Sun, and this time started with a clean sheet of paper. Matt had joined Sun that June shortly [...]
On Monday, the Delphix systems crew is rolling down the 101 to the illumos hackathon in San Jose. Anyone who’s working on illumos, developing illumos-derived technologies like ZFS or DTrace, or who wants to cut some OS code, should drop by. Here’s the sign up. What’s a hackathon? Not exactly sure, but we’re hoping to [...]
It’s my pleasure to welcome Matt Amdur to Delphix, to the world of DTrace, and, just today, to the blogosphere. Matt joined Delphix about two months ago after 10 years of software engineering, most recently at VMware. Matt and I met at Brown University in 1997 where we worked together closely for all [...]
After writing about Oracle’s port of DTrace to OEL, I wanted to take it for a spin. Following the directions that Wim Coekaerts spelled out, I installed and configured a VM to run OEL with Oracle’s nascent DTrace port. Setting up the system was relatively painless. Here’s my first DTrace invocation on OEL: [root@screven ~]# [...]
Yesterday (October 4, 2011) Oracle made the surprising announcement that they would be porting some key Solaris features, DTrace and Zones, to Oracle Enterprise Linux. As one of the original authors, the news about DTrace was particularly interesting to me, so I started digging. I should note that this isn’t the first time I’ve written [...]