By Phil Whelan on January 10, 2012
In this blog post I’ll go through an example, in repeatable steps, of how to get up and running with the Tornado chat demo on ActiveState’s public sandbox for Stackato.
ActiveState’s Stackato PaaS (Platform-as-a-Service) is based on VMware’s open-source PaaS, Cloud Foundry, and offers an enterprise PaaS solution that will run on any public cloud, private cloud, laptop or desktop.
Posted in Infrastructure, PaaS, System Administration | Tagged activestate, clojure, cloudfoundry, java, mongodb, paas, perl, python, ruby, sandbox, stackato, tornado, vmware |
By Phil Whelan on December 15, 2011
Geohashing is a simple way to encode latitude and longitude and to group nearby points on the globe at varying resolutions. It was created by Gustavo Niemeyer. This blog post looks at how it’s implemented and why it is such an elegant solution for encoding and managing location-based data.
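To make the idea concrete, here is a minimal sketch of the standard public geohash encoding in Python. The function name `geohash_encode` and its `precision` parameter are my own choices for illustration, not taken from the post or from any particular library. It halves the longitude and latitude ranges alternately, emits one bit per halving, and packs every five bits into a base-32 character.

```python
# Geohash base-32 alphabet (digits plus lowercase letters, without a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(latitude, longitude, precision=9):
    """Encode a latitude/longitude pair as a geohash of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0          # bits collected for the current character
    bit_count = 0     # how many bits are in `bits` so far
    use_longitude = True  # geohash interleaves bits, starting with longitude

    while len(chars) < precision:
        rng, value = (lon_range, longitude) if use_longitude else (lat_range, latitude)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid   # keep the upper half
        else:
            bits = bits << 1
            rng[1] = mid   # keep the lower half
        use_longitude = not use_longitude

        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0

    return "".join(chars)
```

For example, `geohash_encode(42.6, -5.6, precision=5)` returns `"ezs42"`. Because each extra character only narrows the same cell further, a shorter hash is always a prefix of a longer hash for the same point, which is what makes grouping nearby points by prefix possible.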
Posted in Geospatial | Tagged clojure, geo, geographic, geohash, geospatial, java, javascript, key-value, location, maps, memcached, nearest neighbour, perl, python, ruby, scala, search |
By Phil Whelan on December 13, 2011
Last week I attended a presentation by James Golick, CTO of fetlife.com. It was intriguingly advertised as “Scaling a Website to 300 Million Page Views Per Month with Scala”, so I thought it was worth a break from building cooq.com to attend. Fetlife have been heavy users of Ruby, but are transitioning to Scala and have been using [...]
Posted in System Administration, Web Development | Tagged a-b-testing, averages, capacity planning, degrade, fetlife, ganglia, metrics, monitoring, percentiles, redis, rollout, ruby, scala |
By Phil Whelan on March 7, 2011
In this post I will look at the technology infrastructure behind Summify.com, a website that strives to make our lives easier and helps us deal with the information overload we all experience every time we sit down at our computers. Summify has aggregated over 200 million stories from the web and serves them up on-demand through a series of different mediums. The website uses Tornado to push real-time updates out to the users and they have developed over a dozen backend systems, some of which I will cover in this blog post.
Posted in Start-Ups, Web Development | Tagged algorithms, amazon ec2, django, gunicorn, guy kawasaki, iphone api, jquery template, json api, mongodb, mysql, nginx, python, redis, slow clients, summify, the long tail, tornado, tornado swirl |
By Phil Whelan on February 14, 2011
In this blog post Bradford Stephens, Drawn To Scale’s founder, answers a series of technical, business and personal questions to give an overview of what Drawn To Scale is and where it is going. Who are the founders? What is their background, technology and business model? How are they going to manage other people’s big data? Can one tool fit the demands from a broad range of data challenges that different businesses are seeing?
Posted in Interview, NoSQL, Start-Ups | Tagged big data, bradford stephens, business intelligence, cloud-computing, data management, data processing, drawn to scale, ec2, gaming, hadoop, hbase, high scalability, iaas, media, paas, rackspace, social networks, spire, startup |
By Phil Whelan on January 31, 2011
In this blog post I will delve into the snippets of information available on Quora and look at Quora from a technical perspective. What technical decisions have they made? What does their architecture look like? What languages and frameworks do they use? How do they make that search bar respond so quickly?
Posted in Start-Ups, Web Development | Tagged adam d'angelo, amazon ec2, aws, charlie cheever, comet, git, haproxy, livenode, long-polling, memcached, mysql, nginx, nosql, paste, pylons, quora, quora.com, search-box, steve souders, technology, thrift, tornado, ubuntu linux, webnode2, webscale |
By Phil Whelan on January 24, 2011
In a few of my recent posts I have covered the ease of deploying clusters of Hadoop and Cassandra using Whirr. With Whirr you can simply write a configuration file specifying which cloud provider you are using, your credentials and the definition of the cluster you desire and it will build it for you. In [...]
Posted in Hadoop, Web Development | Tagged amazon ec2, cloud-computing, hadoop, hbase, maven, nosql, ruby, whirr, zookeeper |
By Phil Whelan on January 21, 2011
If you have read my previous post, “Map-Reduce With Ruby Using Hadoop”, then you will know that firing up a Hadoop cluster is really simple when you use Whirr. Without even ssh’ing into the machines in the cloud you can start up your cluster and interact with it. In this post I’ll show you that it [...]
Posted in Cassandra, Web Development | Tagged amazon ec2, cassandra, cluster, homebrew, mac os x, nosql, whirr |
By Phil Whelan on January 21, 2011
If you have used Memcached and have gotten carried away with how it can make everything faster, you may have felt some pain when that volatile memory disappeared into the ether due to a restart or a failure. Your house of cards comes tumbling down as suddenly nothing is in cache! Your application is scrambling around trying to fully serve your users and the database is screaming at the application, “what the *%$@ is going on up there?”. All your lack-of-performance implementation sins are exposed in one fell swoop and your boss stops inviting you to play golf. It’s tough, but a few hours later Memcached is singing again with a full belly of cache and your application layer is putting the kettle on to make your exhausted database a nice cuppa tea. You start to wonder… how can I persist this “caching” data?
Posted in Membase, NoSQL, Web Development | Tagged caching, distributed, key-value store, membase, memcached, moxi, nosql, review, web-development |
By Phil Whelan on January 18, 2011
In this blog post I will introduce SQLShell and demonstrate, step-by-step, how to install it and start using it with MySQL. I will also reflect on the possibilities of using this with NoSQL technologies, such as HBase, MongoDB, Hive, CouchDB, Redis and Google BigQuery.
SQLShell is a cross-platform, cross-database command-line tool for SQL, much like psql for PostgreSQL or the mysql command-line tool for MySQL.
Posted in Databases | Tagged clapper.org, couchdb, google storage, hbase, hive, homebrew, jdbc, mongodb, mysql, nosql, oracle, postgresql, psql, rdbms, redis, scala, sqlcmd, sqlite, sqlshell |
By Phil Whelan on January 14, 2011
Today I received my invite from Google to “Google Storage for Developers”. Yes, like most of us, my life is wrapped in layers of googliness. In this post I’m going to review briefly what Google Storage is and upload the image files, for the images you see in this post, using the Google Storage Manager.
Posted in Web Development | Tagged dfs, distributed file storage, file uploader, file-server, gfs, google, google i/o, google storage, google storage manager, hosting, images, storage, upload |
By Phil Whelan on January 14, 2011
In this post I will explain the concept behind “zero-copy”, which is a feature of Linux allowing for faster transfer of data between pipes, file-descriptors and sockets. I will demonstrate how you can use this functionality in your Ruby projects using a code example. This functionality has been implemented in C, Java, Ruby, Perl and countless other languages, but in this blog post I will focus on the Ruby usage.
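As a rough illustration of the same idea outside Ruby, here is a small Python sketch. Note that it is not the `io_splice` approach the post covers: it uses Python's `socket.sendfile()`, which asks the kernel to move file data to a socket via the related `sendfile` system call where available, and falls back to ordinary sends otherwise. The helper names `send_file_zero_copy` and `demo` are hypothetical and exist only for this example.

```python
import os
import socket
import tempfile


def send_file_zero_copy(path, sock):
    """Send the file at `path` over `sock`, letting the kernel move the bytes.

    Returns the number of bytes sent.
    """
    with open(path, "rb") as f:
        return sock.sendfile(f)


def demo():
    """Write a small temp file, send it over a socket pair, and read it back."""
    payload = b"hello zero-copy\n" * 64
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name

    sender, receiver = socket.socketpair()
    try:
        sent = send_file_zero_copy(path, sender)
        sender.close()  # signal end-of-stream to the receiver

        chunks = []
        while True:
            chunk = receiver.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
        return sent, b"".join(chunks)
    finally:
        sender.close()
        receiver.close()
        os.unlink(path)
```

The point of the design is that the file contents never pass through a user-space buffer on the sending side; the application only hands the kernel a file and a socket.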
Posted in Ruby, Web Development | Tagged file-server, gem, io, io_splice, java, linux, ruby, splice, zero-copy |
By Phil Whelan on January 11, 2011
In this post I will describe what I believe to be the most important Apache projects for building scalable web sites and generally managing large volumes of data.
Posted in Data processing, Web Development | Tagged activemq, apache projects, apache software foundation, asf, cassandra, hadoop, hbase, high scalability, lucene, mahout, rabbitmq, solr, zeromq, zookeeper |
By Phil Whelan on January 5, 2011
In this post I will show, in repeatable steps, how to install PostGIS, load in geospatial data found in a KML file and run queries against that data. The focus of this geospatial data will be landslides and our resulting database will allow us to query, using longitude and latitude co-ordinates, the landslide status of a specific geographical point.
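A point-in-polygon lookup of this kind might be sketched as below. Everything specific here is an assumption for illustration: the table name `landslides`, the columns `name`, `status` and `geom`, the SRID 4326, and the `%s` placeholder style (as used by drivers such as psycopg2). Only the PostGIS functions `ST_Contains`, `ST_SetSRID` and `ST_MakePoint` are real; the schema in the actual post may well differ.

```python
def landslide_query(longitude, latitude):
    """Build a parameterized PostGIS query for the landslide area containing a point.

    Returns (sql, params). The table and column names are hypothetical.
    """
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90")

    # ST_MakePoint takes x (longitude) first, then y (latitude).
    sql = (
        "SELECT name, status FROM landslides "
        "WHERE ST_Contains(geom, ST_SetSRID(ST_MakePoint(%s, %s), 4326))"
    )
    return sql, (longitude, latitude)
```

The returned pair can be passed to a database cursor's `execute(sql, params)`, which keeps the coordinates out of the SQL string itself.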
Posted in Geospatial, PostGIS, Software Development | Tagged esri shapefile, gdal, geo, geographic, geospatial, gis, homebrew, kml, kmz, landslides, latitude, libkml, longitude, macosx, ogr2ogr, postgis, postgresql, shapefile, shp, sql |
By Phil Whelan on January 3, 2011
Here is how you can embed an image in HTML inline. This is similar to how you embed an image in an HTML email message.
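One common way to do this is a base64 `data:` URI in the `src` attribute of an `img` tag. The Python sketch below builds such a tag from raw image bytes; the function name `inline_img_tag` and its parameters are hypothetical, chosen just for this example.

```python
import base64
import html


def inline_img_tag(image_bytes, mime_type="image/png", alt=""):
    """Return an <img> tag whose src is a base64 data URI for `image_bytes`."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return '<img src="data:{};base64,{}" alt="{}">'.format(
        mime_type, encoded, html.escape(alt, quote=True)
    )
```

In practice you would read the bytes from a file opened in binary mode and pass the matching MIME type, such as `image/jpeg` for a JPEG. Base64 makes the data roughly a third larger than the original file, so this suits small images best.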
Posted in Web Development | Tagged base64, embedded image, html, html5, img, inline-image, web-development |