
WebApp in 5 minutes: Deploying a Flask Python App to Pivotal Web Services/CF


As part of a recent project demo, I wanted to use Pivotal CF / Pivotal Web Services to deploy and run a Python application.  While there are other options on the market, they aren't Federation solutions, and so aren't my preference.
So what was the problem?  Well, by default CF only supports Ruby, Java and Node.js applications.  While those are all great languages, they aren't Python, and so they aren't for me.

Specifically, I’m a fan of the Flask microframework for creating web apps, so I wanted to make that work in CloudFoundry.

Fortunately, CloudFoundry supports the concept of a 'buildpack', or a custom execution environment.  As a result, as long as it can run on Linux, you can probably get it to run on CloudFoundry.

The first task was to write a very simple Flask application:

import os
import pprint
import logging
from flask import Flask

app = Flask(__name__)

logging.basicConfig(level=logging.DEBUG)

@app.route('/')
def hello():
    return 'Hello World!\n' + pprint.pformat(str(os.environ))

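# CF tells the app which port to listen on via VCAP_APP_PORT; fall back to 5000 for local runs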
port = os.getenv('VCAP_APP_PORT', '5000')
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=int(port))

This is the equivalent of “Hello World”.  The only ‘strangeness’ here is the following:

port = os.getenv('VCAP_APP_PORT', '5000')

which uses the VCAP environment variables that CF sets in order to listen on the correct port.  If that value is not found, it defaults to 5000.  Note that your application gets routed through some network address translation and load balancing when deployed to the cloud, so users still access it via port 80 – this is just internal housekeeping.  Also note that you'll need to install the Flask module before this will work – using pip and virtualenv is probably the best route.

Now we want to deploy our application.  First, sign up for a free Pivotal CF account at run.pivotal.io:

[Screenshot: the Pivotal Web Services sign-up page]

Then you can login to the console:

[Screenshot: the Pivotal Web Services console]

Overall that's pretty simple.  The last thing we need to do is install the cf command line tools, which are also pretty easy to install.

I then login and get a bit of feedback:

(cf-mcowger1)[mcowger@Vox][0] Projects/hello-python:
→ cf login -a https://api.run.pivotal.io
API endpoint: https://api.run.pivotal.io

Username> <redacted>

Password> 
Authenticating...
OK

Targeted org mkc

Targeted space development

API endpoint: https://api.run.pivotal.io (API version: 2.2.0)
User:         <redacted>
Org:          mkc
Space:        development

The next step is to collect all the requirements into a file.  pip makes this easy by enabling us to output a requirements.txt file in the correct format:

(cf-mcowger1)[mcowger@Vox][0] Projects/hello-python:
→ pip freeze > requirements.txt
(cf-mcowger1)[mcowger@Vox][0] Projects/hello-python:
→ car requirements.txt
zsh: command not found: car
(cf-mcowger1)[mcowger@Vox][127] Projects/hello-python:
→ cat requirements.txt
Flask==0.10.1
Jinja2==2.7.2
MarkupSafe==0.19
Werkzeug==0.9.4
itsdangerous==0.24
wsgiref==0.1.2

Lastly, we need to tell CF how to run this application (e.g. what command line to use).  This is done with a 'Procfile', and it's quite simple:

→ cat Procfile
web: python hello.py

Easy!

The last step is to deploy, or 'push', the app to Pivotal CF.  This is easily done with the following command:

cf push mcowger-1 -b https://github.com/ephoning/heroku-buildpack-python -m 128m -i 1

Let’s break that down:

  • push: deploy a web application
  • mcowger-1: what to call this web application.  By default, this will also be in the URL (although it can be changed)
  • -b https://github.com/ephoning/heroku-buildpack-python: Here we specify that custom buildpack I mentioned, using an open-source one from GitHub.  Fortunately, CloudFoundry can use most Heroku buildpacks (they use the same concept) with minimal, if any, modifications.
  • -m 128m: How much memory to allocate to your application.  A minimal application like this needs minimal memory.
  • -i 1: How many instances of this should we run.  It's entirely possible to scale your application (assuming it's designed for it) by changing this value.

Out of that, CF gives us a bunch of status output, and if all goes well, after a minute or two, you'll get:

requested state: started
instances: 1/1
usage: 128M x 1 instances
urls: mcowger-1.cfapps.io

And would you look at that, you can hit mcowger-1.cfapps.io and see your application running live for the world.  Now, your next application should be somewhat more complex, but you get the idea.  As you add more modules, don’t forget to use the ‘pip freeze’ command to add them to requirements.txt!

In a future post I’ll discuss using marketplace apps to back your application with things like memcached, redis or PostgreSQL.

A skeleton for this project is available on my github.

On Generating Unique Data – ‘dd’ can do it


IDC semi-recently released their guidelines for performance testing on all-flash arrays.  It's a good read, and makes solid suggestions for how to handle the unique aspects of all-flash arrays, especially ones with dedupe like XtremIO, Pure, etc.

One of their suggestions concerns preconditioning the array (to expose any issues an array or its devices may have with garbage collection after overwrite).  Specifically, they suggest:

Our research revealed the best way to precondition is to create a significant workload of 4 KB unique, not deduplicatable, or uncompressible exclusive writes and let the test run for 24 hours

This seemed pretty obvious to me…you can't simply write terabytes of zeroes to an array with dedupe/compression and expect to actually occupy terabytes on physical media.  Aaron Skogsberg of Pure suggested this was non-trivial, to which I replied:

To which Aaron replied:

although I will note that he never provided anything to back that up.  A friend of mine, Pat Donahue, also suggested that /dev/urandom wouldn't be a good choice (his timeline is private) because it's not as entropic as /dev/random.
In light of those suggestions, I decided to actually back it up.  So, I wrote some quick code that performs the following:

  1. Given a file, opens the file and reads it in 4K blocks (to match the dedupe size of most modern dedupe algorithms, including that of XtremIO and others)
  2. For each chunk, calculate its SHA-256 hash, and increment that count in a hashtable of all identified unique chunks.
  3. Report the results

In short – do what most dedupe algorithms do.  Granted, this is a simple implementation, but it's pretty reasonable to a first approximation.  Further enhancements could use techniques like sliding windows to potentially enhance duplicate identification.  Here's a snippet of the most important function:

import hashlib
import random
from progress.bar import Bar

BLOCK_SIZE = 4096  # 4K chunks, to match the dedupe granularity discussed above

# filename and filesize_bytes are set elsewhere in the script (from the command line arguments)

def shamethod(percent_of_file=100):
    fh = open(filename, 'rb')

    blocks_in_file = filesize_bytes / BLOCK_SIZE
    blockcount_to_process = int(blocks_in_file * (percent_of_file / 100.0))
    # Pick a random sample of block locations covering the requested percentage of the file
    blockmap = sorted(random.sample(range(0, blocks_in_file), blockcount_to_process))

    blocks_processed = 0
    sha256_hashes = {}

    bar = Bar('Processing File (SHA 256)', suffix='ETA: %(eta_td)s', max=len(blockmap))

    for blocklocation in blockmap:
        fh.seek(blocklocation * BLOCK_SIZE)
        chunk = fh.read(BLOCK_SIZE)
        blocks_processed += 1
        if chunk:
            # Hash the 4K chunk and count how many times each unique hash appears
            s = hashlib.sha256()
            s.update(chunk)
            sha256_hashes[s.hexdigest()] = sha256_hashes.get(s.hexdigest(), 0) + 1
        bar.next()
    bar.finish()

    print "SHA-256 Results (Baseline)"
    print "File Blocks: %s" % blocks_in_file
    print "Evaled Blocks: %s" % blockcount_to_process
    print "Total Unique: %s" % len(sha256_hashes)
    print "Dedupability: %s:1" % round(float(blocks_processed) / len(sha256_hashes), 2)

From there, I tested this against known dedupe rates from production arrays, and got numbers within 1% of actual values, so I was confident in the implementation.

The next step was to generate my sample data.  I built three sets of data:

  • All zeros: dd if=/dev/zero of=zeros  count=1048576
  • Random from urandom: dd if=/dev/urandom of=urandom  count=1048576
  • Random from /dev/random: dd if=/dev/random of=random  count=1048576

And ran it through the code:

→ python fullchecker.py ~/Downloads/zeros 100
Processing File (SHA 256) |################################| ETA: 0:00:00
SHA-256 Results (Baseline)
Percent    :    100
File Blocks:    131072
Evaled Blocks:  131072
Total Unique:   1
Dedupability:   131072.0:1
Hashtable (MB): 0.0
Time For SHA256: 7.46s

→ python fullchecker.py ~/Downloads/urandom 100
Processing File (SHA 256) |################################| ETA: 0:00:00
SHA-256 Results (Baseline)
Percent    :    100
File Blocks:    131072
Evaled Blocks:  131072
Total Unique:   131072
Dedupability:   1.0:1
Hashtable (MB): 6.0
Time For SHA256: 7.71s

→ python fullchecker.py ~/Downloads/random 100
Processing File (SHA 256) |################################| ETA: 0:00:00
SHA-256 Results (Baseline)
Percent    :    100
File Blocks:    131072
Evaled Blocks:  131072
Total Unique:   131072
Dedupability:   1.0:1
Hashtable (MB): 6.0
Time For SHA256: 7.51s

I’ve bolded the important parts.  As you can see, in the case of all zeroes, we achieved a 100% dedupe rate (as expected) because there is only a single detected unique block.  In the case of /dev/random, there were 131072 unique blocks detected out of 131072 total blocks, so the 'best available' random data from /dev/random achieved a 0% dedupe rate (again, as expected).

So, how did /dev/urandom do? It contained 131072 unique blocks out of 131072 total blocks – the EXACT SAME as /dev/random, and demonstrably non-dedupeable.

So – I’m very comfortable in asserting that my original suggestion of using /dev/urandom does meet the requirements in the IDC paper of unique data.

A note about comments: I’m open to constructive comments demonstrating that my work above isn’t valid or needs changes, but I will delete comments that don’t make a technical case for it or that turn into ad hominem or ad-comitatus suggestions.

My Loyalties….


This is a modified version of a post I made on VMware’s internal Socialcast, expressing my feelings about how important it is to work together.  A reminder: this is *my* blog, not an EMC, VMware or Pivotal blog.

My paycheck comes from David Goulden.  I have an emc.com email address.  I give briefings at the EMC EBC on the EMC strategy.  I train EMC teams on the integration points between EMC and VMware and Pivotal.  In that sense, you could say I’m an EMCer.  I’d agree with you.

I sit in PromD almost every day.  I have a vmware.com email address.  I have a VMware badge.  I train EMCers on how to use/sell VMware products.  I write joint messaging documents.  I’m a VCDX and VCDX panelist.  I’m a vExpert for every year it’s existed.  I give presentations at the VBC alongside VMware teams at least monthly.  In that sense, you could say I’m a VMware-ite (er?).  I’d agree with you.

My badge works in the Pivotal and Greenplum buildings.  I write apps in Python, Java using things like RabbitMQ, Redis and Gemfire against PivotalHD.  I believe that PaaS is the future of this industry.  In that sense, you could say I’m a Pivoter.  I’d agree with you.

Really, I believe I am an evangelist for the Federation.  I don’t believe that EMC alone has what it takes to be the best in this market – we lack some very critical things around agility and understanding of applications and the mobile world.  These are things that VMware and Pivotal have down pat.  In concert, I think that EMC brings value in a strong depth and breadth of physical infrastructure that Pivotal and VMware don’t have.  There’s a reason these three companies exist together…

…because, if we work together, we can be the best choice for a customer.

As a result, I push for us to improve everywhere I can.  Sometimes that means telling an internal EMC team to use VSAN rather than a VNX (which I did yesterday), sometimes it means being honest about cost models so we all know where we stand on things like $/GB, sometimes it means suggesting an EMC product to fill a gap for a customer where VMware doesn’t have a play, and sometimes it means admitting when an EMC product isn’t the best choice to use at VMware (in my role as their global architect) and suggesting something else.

I’m not emotional about specific products – but I am absolutely invested in this.  I know beyond a shadow of a doubt that my success (personally, professionally, monetarily) is strongly tied to the success of the Federation as a whole.  I *need* products like VSAN, vC Ops, ScaleIO, ViPR, etc. to win in the market so I can feed my family, and I will work as hard as I can to make that happen.  Ultimately, I think that this will benefit the customers.  As Chad wrote in his pre-sales manifesto: “We put the customer first, company second, and ourselves third.”

We will all win together: customers, partners, VMware, EMC and Pivotal.

I’m with you.

VCP Recertification


VMware announced recently that they would start requiring re-certification of VCPs.  I’m not sure I feel like this is a good call.

Their reasoning is twofold (as far as I can tell):

  1. Ensuring that VCP holders’ knowledge is current.  “But staying up to date in the expertise gained and proven by your certification is equally vital. If your skills are not current, your certification loses value.”
  2. Most other industry certifications require this as well. “…and is on par with other technical certifications like ones offered by HP, Cisco and CompTia (A+)” – Christian Mohn

I think both of these arguments are specious.

  1. VCPs are tied to a specific version of vSphere…they aren’t ‘version agnostic’.  Check out my transcript below:

    [Screenshot: my VMware certification transcript]

    Every certification I’ve received is version specific, meaning there’s nothing to ‘keep up to date’.  vSphere 3.5 hasn’t changed, therefore my VCP3 shouldn’t need to be updated.  Clearly, if I don’t hold a VCP4 or VCP5 (or some other, higher certification), I can’t show that I’ve been keeping up with the technology, and my knowledge of the current product line may be outdated, but that doesn’t impact the understanding behind my VCP3.  There’s no reason to remove my ability to use the VCP3 logo…and perhaps VMware should drop the use of unversioned logos:

    [Image: the unversioned VCP logo]

    in favor of versioned logos like this:

    [Image: the versioned VCP5 logo]

    so that it’s more clear how current a given person’s certification is.

  2. Other vendors do expire certifications, but the majority of vendors also don’t specify a version number on the certification itself.  The CCNA and CCIE, from everything I understand, are not version specific, and therefore having to ‘recertify’ is entirely reasonable, because the technology they refer to has changed.

So there you have it – I think the re-certification requirement is silly.

Quick Post: VNX Snapshots Performance


I’ve recently been working on a design for VNX behind VPLEX, and wanted to do some pseudo-continuous data protection.  Now, normally one would use RecoverPoint on VPLEX for that, but there are some current limitations that made that not work for me – specifically, the fact that RecoverPoint CDP can only be used with one side of a VPLEX MetroCluster at a time.  If that side of the cluster goes down your data stays up (good), but you lose your CDP with it (bad).

So, the next option would have been to do RecoverPoint directly on the arrays (VNX) using the built-in splitter…this is also a reasonable idea, but has a downside – RecoverPoint requires LUNs (called copy LUNs) that are the same size as the production devices.  So for a single CDP copy at either side of a 100GB production device, we are storing 300GB of data (+ another 50 or so for journals).  Because space is at a premium in this environment, I wanted to look at something a bit more ‘thin.’

So, I turned to the pool-based snapshots in VNX that were released with the Inyo codebase (and still available, of course, in Rockies on the VNX2 series).  I like these because they consume space from the same pool the production VM is in (no need to strand space in dedicated snap devices), and consume only as much space as was written to the LUN.  Lastly, they use a Redirect-on-Write technique to avoid the performance hit of Copy-on-First-Write that the older SnapView snapshots had.  As an FYI, these are sometimes called ‘Advanced Snapshots’.

But – how impactful are the snapshots?  How do they affect performance?  I decided to test this, to see if it would be a reasonable thing to propose to my customer.


I set up a quick test.  I used a VNX5500 in my lab (so not the current VNX2 series, and also a much lower end model than the customer’s VNX7600), and used 25 x 300GB 10K SAS drives along with 4 x 100GB EFDs in a RAID5 pool.  I carved out a 2TB LUN and allocated it to a host.

I started off with just a streaming test of 100% write traffic, and was able to achieve about 550MB/sec sustained (about 5.5GBit).  Next, I wrote a loop to write data, and create snapshots as I went:

[vmax] /mnt/A: for i in {1..96}; do
for> naviseccli snap -create -res 4 -keepFor 1d 
for> sudo dd if=/dev/zero of=/mnt/A/10G bs=1M count=10240
for> echo $i
for> done

Every iteration of the loop therefore had 10GB of new data, as far as the array was concerned. I let this run 96 times, to simulate hourly snapshots over 4 days.  During this process, I kept track of the write performance.  Here’s a snippet:

10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 18.988 s, 565 MB/s
6
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 17.2495 s, 622 MB/s
7
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 17.4732 s, 615 MB/s
8
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 19.1517 s, 561 MB/s
9
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 18.7212 s, 574 MB/s
10
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 18.8609 s, 569 MB/s
11

As you can see, it’s very consistent, and there’s no real degradation in performance while taking the snapshots.

Lastly, I ran a similar test while deleting all those snapshots in the background, to make sure that the customer wouldn’t experience any degradation as the snapshots aged out and were deleted as time rolled on.  Another snippet:

10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 19.1024 s, 562 MB/s
13
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 20.2217 s, 531 MB/s
14
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 21.1915 s, 507 MB/s
15
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 20.8978 s, 514 MB/s
16
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 19.8404 s, 541 MB/s
17
10240+0 records in
10240+0 records out
10737418240 bytes (11 GB) copied, 19.5746 s, 549 MB/s
18

Again, no notable performance difference.  That’s good news, as it means I can suggest this idea to a customer without concern.

ScaleIO @ Scale – Update


If you saw my recent post about pushing the limits of ScaleIO on AWS EC2, you’ll notice that I had a few more plans.  I wanted to push the node count even higher, and run on some heavier duty instances.

Well, the ScaleIO development team noticed what I had done, and decided to take my code and push it to the next level.  Using the methods I developed, they hit the numbers I had been hoping to get.  1000 nodes, across 5 protection domains, all on a single MDM and single cluster.

[Screenshot: 995 SDS nodes, 400 SDC clients, 100 volumes, 923,773 IOPS]

As you can see, the team was able to get 100 volumes built and 400(!) clients.  Most impressively, using minimal nodes (I believe these to be m1.medium instances), they achieved 3.5 Gbytes/s (yes, bytes!) – 28 Gigabits worth of performance across the AWS cloud.  Also, very nearly 1 million IO/s.  Needless to say, I was floored when I saw these results.

Special thanks to the ScaleIO team – Saar, Alexei, Alex, Lior, Eran and Dvir – who ran this test (with no help from me, no mean feat in itself given how undocumented my code was!) and produced these results.

Lastly, I also got my hands on 10 AWS hi1.4xlarge instances, which have local SSDs…Unfortunately, I managed to delete most of the screen shots from my test, but I was able to achieve 3.5-4.0 Gbytes/sec using 10 nodes on the same 10GBit switch.  Truly impressive.  And, as a number of people have asked about latency….average latencies in that test were ~650 µsec!  The one screen shot I was able to grab was during a rebuild after I had removed and replaced a couple nodes.

[Screenshot: the ScaleIO dashboard during the rebuild]

Rebuilding at 2.3GB/s is something you rarely see :).

I’m really happy to be able to share these cool updates from the team.  Feel free to ask questions.


ScaleIO @ Scale – 200 Nodes and Beyond!


Ever since my last post a couple weeks ago about ScaleIO, I’ve been wanting to push its limits.  Boaz and Erez (the founders of ScaleIO) are certainly smart guys, but I’m an engineer, and whenever anyone says ‘it can handle hundreds of nodes’, I tend to want to test that for myself.

So, I decided to do exactly that.  Now, my home lab doesn’t have room for more than a half dozen VMs.  My EMC lab could probably support about 50-60.  I was going for more – WAY more.  I wanted hundreds, maybe thousands.  Even EMC’s internal cloud didn’t really have the scale I wanted, as it’s geared for longer-lived workloads.

So, I ended up running against Amazon Web Services, simply because I could spin up cheap ($.02/hr) t1.micro instances very rapidly without worrying about cost too much (it still ain’t free).  They have an excellent Python API (boto) that is easy to use.  Combine that with the paramiko SSH library and you have a pretty decent platform to deploy a bunch of software on.
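
For the curious, launching a pile of t1.micro instances with boto looks roughly like this.  Note this is a simplified sketch rather than my actual script, and the AMI ID, key pair and security group names are placeholders:

import time
import boto.ec2

AMI_ID = 'ami-xxxxxxxx'        # placeholder - any Linux AMI you like
KEY_NAME = 'scaleio-test'      # placeholder key pair name
SECURITY_GROUP = 'scaleio-sg'  # placeholder security group
NODE_COUNT = 200

conn = boto.ec2.connect_to_region('us-east-1')

# Ask EC2 for the whole batch of cheap t1.micro instances in a single call
reservation = conn.run_instances(
    AMI_ID,
    min_count=NODE_COUNT,
    max_count=NODE_COUNT,
    key_name=KEY_NAME,
    security_groups=[SECURITY_GROUP],
    instance_type='t1.micro')

# Poll until every instance is running and has a public DNS name we can SSH to
pending = list(reservation.instances)
while pending:
    time.sleep(15)
    for instance in list(pending):
        instance.update()
        if instance.state == 'running':
            print instance.id, instance.public_dns_name
            pending.remove(instance)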

Some have asked why I didn’t use the Fabric project – I didn’t feel that its handling of SSH keys was quite up to par, nor was its threading model.  So rather than deal with it, I used my own thread pool implementation.
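
The pattern is simple enough – here’s a minimal sketch of it (not my actual code; the username, key path and install command are made up for illustration):

import Queue
import threading
import paramiko

def install_worker(host_queue, key_filename):
    # Each worker drains hosts off the queue and runs the install over SSH
    while True:
        try:
            host = host_queue.get_nowait()
        except Queue.Empty:
            return
        ssh = paramiko.SSHClient()
        # New EC2 instances have unknown host keys, so accept them automatically
        ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
        ssh.connect(host, username='ec2-user', key_filename=key_filename)
        stdin, stdout, stderr = ssh.exec_command('sudo rpm -i /tmp/scaleio-sds.rpm')
        print host, stdout.read()
        ssh.close()

def run_pool(hosts, key_filename, workers=20):
    q = Queue.Queue()
    for h in hosts:
        q.put(h)
    threads = [threading.Thread(target=install_worker, args=(q, key_filename))
               for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()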

Anyways – where did I end up?  Well, I found that about 5% of the deployed systems (when deploying hundreds) would simply fail to initialize properly.  Rather than investigate, I just treated them as cattle: shot them in the head and replaced them.  After all the nodes were built and joined to the cluster, I created 2 x 200GB volumes and exported them back out to all the nodes.  Lastly, I ran a workload generator on them to drive some decent IO.
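
The workload generator isn’t anything exotic – conceptually it’s along these lines (a simplified sketch, not the actual generator; the device path and size in the example are hypothetical):

import os
import random
import time

BLOCK_SIZE = 4096
RUN_SECONDS = 300

def random_write_load(device, device_size_bytes):
    # Hammer the volume with random, aligned 4K writes of non-repeating data
    blocks = device_size_bytes / BLOCK_SIZE
    fd = os.open(device, os.O_WRONLY)
    deadline = time.time() + RUN_SECONDS
    written = 0
    while time.time() < deadline:
        offset = random.randrange(blocks) * BLOCK_SIZE
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, os.urandom(BLOCK_SIZE))
        written += 1
    os.close(fd)
    print "Wrote %d random 4K blocks to %s" % (written, device)

# Hypothetical example: a 200GB volume exposed to the client
# random_write_load('/dev/scinia', 200 * 1024 * 1024 * 1024)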

I ended up being able to shove 200 hosts into a cluster before the MDM on ScaleIO 1.1 refused to let me add any more.  I haven’t yet identified whether that is actually the limit, nor have I tried with ScaleIO 1.2 yet.  But – you can bet it’s next on my list!

What does it all look like?

Here are the nodes in the Amazon Web Services Console…

[Screenshot: the instances in the AWS console]

And then they’ve all been added to the cluster:

[Screenshot: all of the nodes added to the ScaleIO cluster]

Then, I ran some heavy workload against it.  Caveat: Amazon t1.micro instances are VERY small, and limited to less than 7MB/s of throughput each, along with about half a CPU and only about 600MB RAM.  As a result, they do not reasonably represent the performance of a modern machine.  So don’t take these numbers as what ScaleIO is capable of – I’ll have a post in the next couple weeks demonstrating what it can do on some high powered instances.

[Screenshot: the ScaleIO dashboard during the workload test]

Pushing over 1.1GB/s of throughput (and yes, that’s gigabytes/sec), so over 10Gbits of total throughput across almost 200 instances.

[Screenshot: the individual host view in the monitoring GUI]

The individual host view also shows some interesting info, although I did notice a bug where if you have more than a couple dozen hosts, they won’t all show individually in the monitoring GUI.  Oh well – that’s why we do tests like this.

Lastly, when I terminated all the instances simultaneously (with one command, even!), I caught a pic of the very unhappy MDM status:

[Screenshot: the very unhappy MDM status]

How much did this cost?  Well, excluding the development time and associated test instance costs…running the test alone required 200 t1.micro instances @ $0.02/hr, 1 m1.small instance @ $0.06/hr, and 201 x 10GB EBS volumes @ $0.10/GB-month.  In total?  About $7.41 :).  Although if I add in the last couple weeks’ worth of development instances, I’m at about $41.


Maybe @sakacc (Chad Sakac) will comp me $50 worth of drinks at EMCworld?

Lastly, you can find all the code I used to drive this test at my GitHub page.  Note – it’s not very clean, has little documentation and very little error handling.  Nonetheless, it’s helpful if you want some examples of using EC2, S3, thread pooling, etc.

I’ll have 3-5 more posts over the next week or two describing in more depth each of the stages (building the MDM, building the Nodes, adding to the cluster and running the workload generator) for the huge nerds, but for now – enjoy!


Quick ScaleIO Tests


I managed to get my hands on the latest ScaleIO 1.2 beta bits this week, and wanted to share some of the testing results.  I’ve been pretty impressed.

I installed a cluster consisting of 4 total nodes, 3 of which store data, and one of which provides iSCSI services and acts as a ‘tie breaker’ in case of management cluster partitions.

Each of the 3 nodes with data (ScaleIO calls these SDS nodes) was a single VM with 2 vCPUs and 1GB of allocated memory.  Very small, and I suspect I can knock them down to 1 vCPU given the CPU usage I saw during the tests.  Each one also had a single pRDM to the host’s local disk (varying sizes, but all 7200 RPM SATA).  Building the cluster was fairly simple – I just used the OVA that ScaleIO provides (although a CentOS or SuSE VM works too) and used their installer script.  The script asks for a bunch of information, and then simply installs the requisite packages on the VMs and builds the cluster based on your answers.  Of course, this can all be done manually, but the handy script to do it is nice.

Once it was installed, the cluster was up and running and ready to use.  I built a volume and exported it to a relevant client device (the one serving iSCSI).  From there, I decided to run some tests.

The basic IO patterns were the first ones I tried, and I did pretty well:

  1. 125 MB/s sustained read
  2. 45 MB/s sustained write
  3. 385 IO/s for a 50:50 R:W 8K workload (very database like).

These are pretty great numbers for just 3 slow consumer class drives.  Normally, we’d rate a set of 3 drives like this at about 60% of those numbers.  Check out the dashboard during the write test:

[Screenshot: the ScaleIO dashboard during the write test]

After that basic test, I decided to get more creative.  I tried removing one of the nodes from the cluster (in a controlled manner) on the fly.  There was about 56GB of data on the cluster at that point, and the total time to remove?  6 mins, 44 sec.  Not bad for shuffling around that much data.  I then added that system back (as a clean system), and the rebalance took only 9 mins, 38 sec – again averaging about 48MB/s (about the peak performance that a SATA drive can sustain).

The last set of tests I decided to run were some uncontrolled failure tests, where I simply hard shut down one of the SDS VMs to see how the system would react.  I was impressed that the cluster noted the failure within about 5 seconds of the event and instantly began moving data around to reprotect it (again, peaking around 54 MB/s).  It took about 7 minutes to rebuild…not bad!  I’ve included a little screen cast of that below.

I then powered that host back on to see how the rebalance procedure looks (remember, it’s not a rebuild anymore, because that data has been reprotected already – it’s pretty much the same as adding a net-new host).  I have another screencast for that too.

All told, I’m pretty impressed.  Can’t wait to get some heavier duty hardware (Chad Sakac, are you listening?) to really push the limits.

Barcelona Bound


Just a quick post this week (I have one coming next week about business relationships)…

A couple people have asked me what I’m doing for telecom in Barcelona for the VMworld show.  It has certainly been quite a few years since I was super into cell phone technology (at one point I went through like 4 phones a year!), so I had to do some research.  I knew I wanted to use my iPhone, and I wanted to use data over there, without paying the stupid fees that my carrier (Verizon) charges.

So first I had to do the research to figure out whether:

  1. my phone (a Verizon iPhone 5) uses the same frequencies as Spain
  2. my phone is GSM compatible
  3. my phone is unlocked

Some quick research showed me that all Verizon iPhone 5s are unlocked by Verizon (what a surprise!), that the frequency bands are compatible for 3G data (no significant LTE in Europe, apparently) and that I can indeed use it on GSM.  Sweet!
Now I just needed to find a SIM that would work over there.  A bunch of friends recommended just getting a PAYG SIM once I arrived, but my Spanish is rusty, and I’d like to be able to call my family as soon as I land without paying exorbitant fees.  Also, I’m a planner, so I wanted this out of the way.  I found a bunch of options, but I settled on HolidayPhone for a couple reasons:

  • They provide a pre-cut SIM for my phone
  • They provide the SIM ahead of time and mail it to my house (from Sweden, no less!)
  • They provide a pretty good rate plan ($0.06/min to the US, and at least an hour of free incoming calls)
  • They have a clever call forwarding system so that I remain reachable on my US number, but also via a Spanish phone number.
  • They have a good data plan (550MB for 30 days for about $7).

I feel like this combo will let me call my family as much as I like (and they can call me), call my colleagues, and use Twitter/blogs/tethering etc. as much as I like (550MB is plenty for me given I will only be there for a week and limited to 3G performance).
Worth a shot, at least, for a pretty cheap experiment of about $60.

See everyone soon – I’m excited to see Barcelona for the first time.

The VCDX Brain Drain (or VMware Should Be A Net VCDX Exporter)


I have a growing concern about the future of the VCDX certification track.  It’s certainly a vested interest; as a current VCDX, I want the certification to be hard to achieve, well marketed and valuable within the industry.  Why?  Because I like money, of course.

So my concern is this: it appears to me that VMware is aggressively hiring VCDXs, which I think is wrong, and dangerous to the future of the certification.

Currently, VMware employs ~45 VCDXs, when I count through the VCDX page and apply some recent knowledge of moves.  In the past year, VMware has directly hired at least 4 VCDXs from partners.  I believe there may be more.  I won’t name them directly, because this article isn’t about them (I applaud their personal choices), it’s about VMware.

VMware should not be hiring VCDXs from partners.  In fact, I might go so far as to suggest VMware probably should avoid hiring existing VCDXs at all.  I believe this presents two dangers to the program itself.

  1. By draining the partner pool of VCDXs, VMware is effectively telling partners, “sure, go ahead and spend many thousands training this person up to VCDX level, paying for their hotels, defenses, etc – when you are done, we will go ahead, swoop in with a sweet offer you can’t match and take them.”  This is hardly the way to engender loyalty among partners.  The natural outgrowth of this tactic is that partners will no longer be interested in supporting the candidacy of a VCDX, simply because it wouldn’t provide them any value.
  2. The size of the VCDX pool is of crucial importance.  Too small, and customers don’t know about the certification (and therefore don’t recognize the value a partner with the certification brings).  Too large, and it becomes nearly routine (think A+ certifications), and therefore of low value.  The pool needs to be large enough to be known, small enough to be a little rare, but again large enough that there is a reasonable chance a customer can find a partner with a VCDX or two (or three) on staff.  By hiring and employing so much of the VCDX pool (nearly 40%, by my count), VMware artificially limits the number of partners that will create or employ a VCDX, thus reducing the visibility and value of the certification itself.

Of all the players (partners, VMware itself, vendors, people), the one with the most opportunity to fix this is VMware themselves.  With the VERY solid braintrust they have, their existing large VCDX pool, and their extensive resources and PSO-style options, VMware should be a VCDX-production machine.  It should be trivial (and a goal) for them to hire good people, train them up to VCDX level, get them certified internally and then (eventually, after a couple years paying their dues in PSO or what-have-you) send them out into the partner community.  This would have a number of effects:

  • The VCDX population grows to a larger size (which it needs to).
  • The partners are no longer afraid of supporting a VCDX candidacy
  • VMware gets all the VCDX power it needs

This has some parallels to other top level certifications that some colleagues have brought up.  Specifically, they mentioned the Cisco CCIE, and asked whether I also believe that Cisco should not hire CCIEs.  I’d argue that, given there are currently thousands of CCIEs, the pool size is no longer a problem, and Cisco is in a different position.

With VMware out of the business of hiring partner VCDXs, partners & vendors can go back to supporting VCDX candidates without fear and VMware can produce those that it needs internally.

Thoughts?